SQL-PROTOCOL is a SQL Relationnal Database Management System (RDBMS) entirely written in REBOL with JOIN and SORT capability. This allow you having an easy to use lightweight database engine embeded in your REBOL application.
Today, sql-protocol execute only these kind of query :
Query can be submited either as a standard SQL query string or as a SQL like query dialect block.
This quick example illustrates how to load the protocol, open a database, select some rows from two tables, probe the result and close the database.
Using standard SQL query string :
do %sql-protocol.r db: open sql:my-db insert db { SELECT * FROM a, b WHERE a.c2 = b.c1 AND a.c1 = 1 ORDER BY 1, 2 DESC } foreach item copy db [probe item] close db
The same using the SQL dialect :
do %sql-protocol.r db: open sql:my-db insert db [ SELECT * FROM a b WHERE a.c2 = b.c1 AND a.c1 = 1 ORDER BY 1 [2 DESC] ] foreach item copy db [probe item] close db
Moreover, sql-protocol provide a basic compabibility with the ODBC text driver {Microsoft text driver (*.csv,*.txt)} in order to provide a quick and simple way to share data between REBOL application and any ODBC application,for example, MS Excel to produce table or chart,or MS Word to produce letters or mailing.
sql-protocol provide also a set of file protocol which can be used directly in your script :
DATA | text file containing a REBOL block for each row |
HEAP | same as DATA but for transient table (in memory table) |
CSV | delimited file by any caracter except doublequote ("), newline (^/) or linefeed (^M). |
FIXED | fixed length file (<u>not implemented up now</u>) |
NOTA BENE |
This documentation include what is already realized but also on what I intend to develop. I indicate each time when something in not implemented if it will bo done soon or not. |
Query on the database can be done :
The engine carries out only the following requests:
SQL-PROTOCOL support 4 table types (format) :
DATA | format which is text file containing a REBOL block for each table row |
HEAP | format which is for in memory table (transient table) containing a REBOL block for each row |
CSV | format which is delimited file (CSVDelimited, TABDelimited or Delimited by any caracter except doublequote ("), newline (^/) or linefeed (^M). |
FIXED | format which is fixed length file (Fixed) (<u>not implemented up now</u>) |
By default :
Installing sql-protocol is very simple :
Or
Opening a database is very simple:
open sql:my-database
In this case, the protocol open the database contained in the directory my-database.
You can specify the standard /new refinement in order to reinitialize the database (<u>this is not implemented up to now but I will make it soon</u>):
open/new sql:my-database
In this case, all the file contained in the directory are erased and a new schema.ctl file is created.
Other refinement will cause an error (<u>error are not thrown up to now but I will change it soon</u>).
When you open a database, you can also precise the file name of the schema :
open sql:my-database/my-schema.txt
In this case, if the schema file does'nt exist, it is created. It's the same with the directory and the /new refinement is also authorized.
SQL-PROTOCOL implements the following queries :
SELECT | allows you to select rows from table(s). |
INSERT | allows you to insert row(s) in a table. |
UPDATE | allows you to update row(s) in a table. |
DELETE | allows you to delete row(s) from a table. |
SQL-PROTOCOL implements the following WHERE clauses :
DELETE FROM tbl_name [WHERE where_condition]
The DELETE statement deletes rows from tbl_name and returns the number of rows deleted. The WHERE clause, if given, specifies the conditions that identify which rows to delete. With no WHERE clause, all rows are deleted.
where_condition is an expression that evaluates to true for each row to be deleted.
INSERT [INTO] tbl_name [(col_name,...)] VALUES (expr,...),(...),...
Or:
INSERT [INTO] tbl_name SET col_name=expr, ...
(<u>not implemented up now</u>)
Or:
INSERT [INTO] tbl_name [(col_name,...)] SELECT ...
(<u>not implemented up now</u>)
INSERT inserts new rows into an existing table. The INSERT ... VALUES and INSERT ... SET forms of the statement insert rows based on explicitly specified values. The INSERT ... SELECT form inserts rows selected from another table or tables.
tbl_name is the table into which rows should be inserted. The columns for which the statement provides values can be specified as follows:
Column values can be given in several ways:
For example, you can do this because the value for col2 refers to col1, which has previously been assigned:
INSERT INTO tbl_name (col1,col2) VALUES(15,col1*2);
But the following is not legal, because the value for col1 refers to col2, which is assigned after col1:
INSERT INTO tbl_name (col1,col2) VALUES(col2*2,15);
INSERT statements that use VALUES syntax can insert multiple rows. To do this, include multiple lists of column values, each enclosed within parentheses and separated by commas. Example:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
The values list for each row must be enclosed within parentheses. The following statement is illegal because the number of values in the list does not match the number of column names:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3,4,5,6,7,8,9);
SELECT [DISTINCT] select_expr, ... FROM table_references [WHERE where_condition] [ORDER BY {col_name | expr | position} [ASC | DESC], ...]
SELECT is used to retrieve rows selected from one or more tables.
To query the database, you have to insert the query string into the port and next to get the result either by a copy, a first or a pick.
Quick example:
db: open sql:my-db insert db {SELECT * FROM my-table} foreach item copy db [probe item] close db
The most commonly used clauses of SELECT statements are these:
Clauses used must be given in exactly the order shown in the syntax description.
SELECT t1.name, t2.salary FROM employee AS t1, info AS t2 WHERE t1.name = t2.name; SELECT t1.name, t2.salary FROM employee t1, info t2 WHERE t1.name = t2.name;
SELECT college, region, seed FROM tournament ORDER BY region, seed; SELECT college, region AS r, seed AS s FROM tournament ORDER BY r, s; SELECT college, region, seed FROM tournament ORDER BY 2, 3;
UPDATE [LOW_PRIORITY] [IGNORE] tbl_name SET col_name1=expr1 [, col_name2=expr2 ...] [WHERE where_condition]
The UPDATE statement updates columns of existing rows in tbl_name with new values.
WHERE condition is an expression that evaluates to true for each row to be updated.
UPDATE persondata SET age=age+1;
UPDATE persondata SET age=age*2, age=age+1;
NOTA BENE |
Remember, it's a REBOL dialect so you have to respect REBOL syntax : * don't forget to place a space before an after the = or all other operator (+, -*, /, >, <, ).
|
The SQL dialect for SELECT is very similar to the SQL To query the database, you have to insert the query block into the port and next to get the result either by a copy, a first or a pick.
Quick example:
db: open sql:my-db insert db [SELECT * FROM my-table] foreach item copy db [probe item] close db
To select all columns from a table you can use the * :
insert db [SELECT * FROM a]
If you want to select some specific columns,
insert db [SELECT c1 c2 FROM a]
You can also use the dot notation alias.column:
insert db [SELECT a.c1 a.c2 FROM a]
You can also use the path notation alias/column:
insert db [SELECT a/c1 a/c2 FROM a]
Same rules apply to the WHERE clause :
insert db [SELECT * FROM a WHERE c1 = 2] insert db [SELECT * FROM a WHERE a.c1 = 2] insert db [SELECT * FROM a WHERE a/c1 = 2]
You can also give an alias name to a table. In this case, you have to set the table / alias pair into a block. The AS word is optional :
insert db [SELECT t1.c1 t1.c2 FROM [a t1]] insert db [SELECT t1.c1 t1.c2 FROM [a AS t1]]
Nota Bene |
Remember, it's a REBOL dialect so you have to respect REBOL syntax and don't forget to place a space before an after the =or all other operator (+, -*, /, >, <, ). |
To do a join, you do simply :
insert db [SELECT * from a b]
If you want select some columns from a table and all from another, you can use the alias.* notation :
insert db [SELECT a.c1 b.* FROM a b]
When you join a table with itself, you must give an alias for each table :
insert db [SELECT t1.c1 t2.c1 FORM [a AS t1] [a AS t2] WHERE t1.c1 = t2.c2]
Nota Bene |
Remember, it's a REBOL dialect so you have to respect REBOL syntax and don't forget to place a space before an after the =or all other operator (+, -*, /, >, <, ). |
You can use the LIKE clause in your query :
insert db [SELECT * FROM a WHERE c1 LIKE "my*"]
LIKE clause use * and ? as wild card and not the standard SQL % and _
You can give an ORDER BY clause in order to sort the result.
You can either give the column(s) number
insert db [SELECT * FROM a ORDER BY 1 2 3]
Or you can give the columns name by its alias.columns name :
insert db [SELECT * FROM a ORDER BY a.c1 a.c2 a.c3]
By default, the engine sort the result in ascending order, but you can sort it in descending order by giving the pair alias.columns ASC or DESC in a block :
insert db [SELECT * FROM a ORDER BY [a.c1 ASC] [a.c2 DESC]]
ASC or DESC can also be given with columns index :
insert db [SELECT * FROM a ORDER BY [1 ASC] 2 [DESC]]
Limitation |
Remember that GROUP BY, HAVING and statistical function (COUNT(...), MAX(...), etc...) are not implemented. It's the same for BETWEEN and IN word. |
For INSERT, the dialect is quiet simple.
Quick example:
insert db [INSERT INTO a VALUES [1 2 3]]
Inserting one row without specifying the column list :
insert db [INSERT INTO a VALUES [1 2 3]]
Inserting one row with a column list :
insert db [INSERT [c b a] INTO a VALUES [3 2 1]]
Inserting many rows at a time :
insert db [INSERT INTO a VALUES [1 2 3] [2 3 4] [3 4 5]]
Inserting rows selected from another query (<u>this is not implemented but it will be soon because it's easy to make</u>) :
insert db [INSERT INTO temp SELECT * FORM a b WHERE a.c2 = b.c1 ORDER BY a.c1 b.c1]
For UPDATE, the dialect is also very similar to SQL.
Quick example :
insert db [UPDATE a SET c2: 3 WHERE c1 = 1]
Nota Bene |
You must use the simple column notation, not the alias.columnnotation |
If you want, you can set many columns at a time :
insert db [UPDATE a SET c2: 3 c3: 4 WHERE c1: 1]
Nota Bene |
Remember, it's a dialect so you have to respect REBOL syntax and don't forget to place a space before an after the =or all other operator (+, -*, /, >, <, ). |
For DELETE, it's very simple.
Quick example :
insert db [DELETE FROM a WHERE c1 = 1]
Nota Bene |
You must use the simple column notation, not the alias.columnnotation |
Deleting all rows in a table :
insert db [ DELETE FROM a ]
SQL-PROTOCOLimplements the following DDL (Data Definition Language) queries:
CREATE TABLE | which allows you to create a new table. |
DROP TABLE | which allows you to delete a table. |
For CREATE TABLE the dialect is extended in order to have a better control on the file creation.
Quick example .
insert db [CREATE TABLE a [c1 c2 c3]]
Creating a table without column type :
insert db [ CREATE-TABLE table-name [C1 C2 C3] ]
As REBOL is an untyped programming language, it's more simple to ommit the columns type for DATA or HEAP protocol.
Creating a DATA or HEAP table with full columns description (<u>this is not implement up to now and I will make later</u>) :
insert db [ CREATE TABLE a [ [F01 binary!] [F02 bitset!] [F03 block!] [F04 char!] [F05 datatype!] [F06 date!] [F07 decimal!] [F08 email!] [F10 file!] [F11 get-word!] [F12 hash!] [F13 hex!] [F14 image!] [F15 integer!] [F16 issue!] [F17 list!] [F18 lit-path!] [F19 lit-word!] [F20 logic!] [F21 money!] [F22 pair!] [F23 paren!] [F24 path!] [F25 refinement] [F26 set-path] [F27 set-word] [F28 string] [F29 tag] [F30 time] [F31 tuple] [F32 url] [F33 word] ] TYPE = HEAP ]
You can create a CSV table without giving the columns type and length. In this case, you can put what you want in each row but this is not supported by {Microsoft text driver).
insert db [ CREATE TABLE a [C1 C2 C3] type = CSV]
By default CSV table include the header, the format is Delimited and use semicolumn delimiter (;).
Creating a CSV table with full columns description. By default CSV table include the header, the format is Delimited and use semicolumn delimiter (;).
insert db [ CREATE TABLE a [ [F01 Bit] [F02 Byte] [F03 Char Width 255] [F04 Currency] [F05 Date] [F06 Float] [F07 Integer] [F08 LongChar] [F09 Short] [F10 Single] ] TYPE = CSV ]
Creating a CSV table with full columns description and specific table type
insert db [ CREATE TABLE a [ [C1 Integer] [C2 Char 255] [C3 Date] ] TYPE = [ CSV ColNameHeader: false format: CSVDelimited ] ]
Nota Bene |
TYPE [....] is still not implemented, only TYPE CSV is allowed |
Creating a FIXED table with full columns description
insert db [ CREATE TABLE a [ [F01 Bit 1] [F02 Byte 3] [F03 Char 255] [F04 Currency 15] [F05 Date 6] [F06 Float 21] [F07 Integer 11] [F08 LongChar 32] [F09 Short 5] [F10 Single 255] ] TYPE = FIXED ]
You can also precise if the header is saved in the file or not:
insert db [ CREATE TABLE a [ [C1 Integer 11] [C2 Char 255] [C3 Date 6] ] TYPE = [ FIXED ColnameHeader: true ] ]
To drop a table, the dialect is very simple:
insert db [ DROP TABLE table-name ]
If the table doesn't exist, an error is thrown.
You can if needed use directly the DATA, HEAP, CSV and FIXED protocol. These protocols are named file protocol.
DATA, HEAP, CSV and FIXED protocol are usefull to store and access REBOL datatype block thru protocol.
DATA | to store REBOL values as block of values into a file. |
HEAP | to store REBOL values as block of values into memory. |
CSV | to store REBOL value as character separated records. |
FIXED | to store REBOL value as fixed length record. |
The advantage is to have with a standard input, output and interface access to many file format. These normalized access simplify file read and write and reduce the debuggind time. It's allows you to insert, update or delete records as simple as managing blocks.
The usage of file protocol differs form one protocol to the other, only when you open the file. All other access are standized.
To open a DATA or HEAP file, you use the standard open function :
my-file: open DATA:my-datafile.data
To open a CSV or FIXED file, you use the standard open function :
my-file: open CSV:my-datafile.data
If you want to precise the file structure, you have to make the port before opening it and to give the file schema in a block :
my-file: open make port! [scheme: 'csv schema: [ ColNameHeader: true format: "Delimited" ; or CSVDelimited or TABDelimited delimiter: ";" ; or any character except doublequote ("), line feed (^M) and newline (^/) cols: [ [C1 Char width 10] [C2 Integer] ] ]
To insert data, use the standard insert function :
insert my-file ["data 1" "data 2" 3 4]
If you want to insert many records, you can insert a block containing one or more blocks :
insert my-file [ ["Record 1" 1 2 3] ["Record 2" 2 3 4] ]
If you want to insert one record containing blocks, you can une the /only refinement:
insert/only my-file [ ["Value 1"] ["Value 2"] ]
To insert data at end of the file, use the standard append function :
insert my-file ["data 1" "data 2" 3 4]
To insert data for example at the third position, use the standard at function :
my-file: at my-file 3 insert my-file ["data 1" "data 2" 3 4]
To change data, use the standard change function :
my-file: open DATA:my-data-file.data my-file: at my-file 3 change my-file ["data 1" "data 2" 3 4] close my-file
This example change the third record.
If you want to change many records, you can change a block containing one or more blocks :
change my-file [ ["Record 1" 1 2 3] ["Record 2" 2 3 4] ]
If you want to change one record containing blocks, you can une the /only refinement:
change/only my-file [ ["Value 1"] ["Value 2"] ]
If you want to change for example 1 by 3 record, you can use the /part refinement
change/part my-file [ ["Record 1" 1 2 3] ["Record 2" 2 3 4] ["Record 3" 4 5 6] ] 1
To remove record (s), use the standard remove function :
remove my-file
If you want to remove many record, use the /part refinement. For example, to to remove 3 records, do this :
remove/part my-file 3
To get data, you can use either the standard copy or pick function
probe pick my-file 1 probe copy my-file
With the copy function you can use the /part refinement:
probe copy/part my-file 5
With file protocol, you can it as for series
You can sort the records in the file. It's very usefull and can be done like this
my-file: open data:my-file.data sort my-file close my-file
With file protocol, you can manage file and directory. This can done in the same way as standard file.
You can use the following function :
query | to obtain file or directory information |
exists | to verify if a file or directory exists |
make-dir | to create a directory |
delete | to delete a file or directory |
query data:my-file.data exsits data:my-file.data make-dir data:my-directory/ delete/any data:my-directory/*.data
For each database, the table file structure is stored in a file.
schema.ctl | is used for default database type. It's a standard REBOL script file which is simple loaded when the database is opened. |
schema.ini | is used for {Microsoft text driver} database. It's a standard .ini file |
All the table files are stored in the same directory as the schema file.
In schema.ctl file, SQL-PROTOCOL store the schema structure in the following manner :
table-name: [ scheme: table-file-schema target: table-file-name schema: [ cols: [ col-specs-1 ... col-specs-n ] property-1: value-1 ... property-n: value-n ] ] col-specs = col-name col-specs = [col-name col-type col-length]
<u>The schema.ctl is not implemented like this now, but this will be done soon. Actualy, the schema is stored in a database.ctl file with a different format</u>
Example schema for DATA or HEAP protocol :
data-table: [ scheme: 'DATA target: %data-table.data schema: [ cols: [ col-1 ; untyped column [col-2 date!] ; a date ] ] ]
Example schema for CSV or FIXED protocol :
csv-table: [ scheme: 'CSV target: %csv-table.csv schema: [ ColnameHeader: false format: 'Delimited delimiter: ";" cols: [ col-1 ; untyped column [col-2 char 32] ; 32 char ] ] ]
The schema.ini file is compatible with the {Microsoft text driver (*.csv *.txt)} You can find the specification here :
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetschema_ini_file.asp
Here is an example of schema.ini with all possible columns type :
[csvdelimited.csv] ColNameHeader=True Format=CSVDelimited MaxScanRows=0 CharacterSet=OEM Col1=F01 Bit Col2=F02 Byte Col3=F03 Char Width 255 Col4=F04 Currency Col5=F05 Date Col6=F06 Float Col7=F07 Integer Col8=F08 LongChar Col9=F09 Short Col10=F10 Single [fixedlength.txt] ColNameHeader=True Format=FixedLength MaxScanRows=0 CharacterSet=OEM Col1=F01 Bit Width 1 Col2=F02 Byte Width 3 Col3=F03 Char Width 10 Col4=F04 Currency Width 15 Col5=F05 Date Width 6 Col6=F06 Float Width 21 Col7=F07 Integer Width 11 Col8=F08 LongChar Width 32 Col9=F09 Short Width 5 Col10=F10 Single Width 255
I decided to use file protocol to acces to table data. This choice is motivated by two reason :
One important point was to implemement the directory capability so I can reused the standard function for the creation and deletion of table file.
Also, by implementing the HEAP protocol, I could use always the same interface even if data could be stored in a simple block in memory.
I use a unique SELECT strategy, it's a table scan / nested loop algorithm.
The algorithm is quiet simple, for each table in the SELECT, the script scan all the row of the table, and for each row, scan all the rows of the next table, and when it's the last table, the where clause is simply applied.
The speed performance is not verw good but that's enough with small table and few join. In fact, what is most time consuming, it's the insert in the resulting block when thousand of rows are inserted.
Incomplete list :-(
0.0.1 [15-Sep-2004 {Initial alpha version} marco@adyreb.org] 0.0.2 [16-Sep-2004 {Documentation & order bug correction & tests} marco@adyreb.org] 0.0.3 [17-Sep-2004 {Change path notation to dot notation for columns} marco@adyreb.org] 0.0.4 [20-Sep-2004 {Add table alias and where normalization} marco@adyreb.org] 0.0.5 [21-Sep-2004 {Interface stabilization, sort improvement} marco@adyreb.org] 0.0.6 [22-Sep-2004 {First optimization and select * or select table.* implementation} marco@adyreb.org] 0.0.7 [27-Sep-2004 {RAPID Documentation, to-rebol-cols debugging and order interface change} marco@adyreb.org] 0.1.0 [28-Sep-2004 {Change in provision of SQl protocol - DATA: protocol & database object} marco@adyreb.org] 0.1.1 [29-Sep-2004 {Change of test case & documentation & SQL: protocol} marco@adyreb.org] 0.1.2 [30-Sep-2004 {Dismiss of RAPID documentation & various changes} marco@adyreb.org] 0.1.3 [04-Oct-2004 {Transient debug & developpment of sql INSERT, UPDATE & DELETE} marco@adyreb.org] 0.1.4 [06-Oct-2004 {Change in UPDATE syntaxe and DELETE debug} marco@adyreb.org] 0.1.5 [08-Oct-2004 {Major change to resolve the path problem} marco@adyreb.org] 0.1.6 [11-Oct-2004 {Implements HEAP protocol for transient table} marco@adyreb.org] 0.1.7 [12-Oct-2004 {Implements CREATE/DROP TABLE and HEAP & DATA optimization} marco@adyreb.org] 0.1.8 [13-Oct-2004 {Improve path an target management & add get-url function} marco@adyreb.org] 0.2.0 [13-Oct-2004 {First alpha published on www.rebol.org} marco@adyreb.org] 0.2.1 [15-Oct-2004 {Change on the data protocol - implement directory management} marco@adyreb.org] 0.2.2 [18-Oct-2004 {Try to improve select performance} marco@adyreb.org] 0.2.3 [27-Oct-2004 {Change on the data and heap protocol - improve directory management} marco@adyreb.org] 0.3.0 [11-Nov-2004 {Add CSV protocol, change on DATA protocol and preparation to FIXED protocol} marco@adyreb.org] 0.3.1 [19-Nov-2004 {Various bug correction and header extension} marco@adyreb.org] 0.3.2 [14-Nov-2004 {Add SORT on DATA and CSV protocol plus bug correction (thank Shad)} marco@adyreb.org] 0.4.0 [14-Dec-2004 {Implement new schema.ctl and extend TYPE = clause} marco@adyreb.org] 0.4.1 [20-Dec-2004 {Implement schema.ini and extend test case} marco@adyreb.org] 0.4.2 [10-Jan-2005 {Test, debug and correction on port/local/table and schema.ctl} marco@adyreb.org] 0.4.3 [13-Jan-2005 {Lot of test, debug and correction on compatibility with Microsoft Text Driver} marco@adyreb.org] 0.5.0 [17-Jan-2005 {Alpha version published on www.rebol.org} marco@adyreb.org] 0.6.0 [17-Jan-2005 {More flexible SQL dialect (FROM clause)} marco@adyreb.org] 0.6.1 [16-Mar-2005 {More flexible SQL dialect (columns, WHERE and ORDER BY clause)} marco@adyreb.org] 0.6.2 [29-Mar-2005 {Extends test case} marco@adyreb.org] 0.6.3 [11-May-2005 {End of extended test and publication to library} marco@adyreb.org] 0.6.4 [02-Feb-2006 {Add LIKE clause + some bug correction} marco@adyreb.org] 0.6.5 [05-Feb-2006 {First attempt of SQL string parsing + some bug correction} marco@adyreb.org] 0.6.6 [07-Feb-2006 {Correction of a bug when using word in the SQL dialect} marco@adyreb.org] 0.6.7 [08-Feb-2006 {Improvement of word handling in dialect} marco@adyreb.org] 0.6.8 [04-Mar-2006 {Implement SQL parsing for INSERT, UPDATE & DELETE clauses} marco@adyreb.org]
You can find the sql-protocol.zip archive in the french framework here :
http://www.agora-dev.org/forums/view.php?site=rebol&bn=rebol_protocols&key=1095276031
The archive contains :
sql-protocol.r | The REBOL sql database protocol |
launch.r | A simple script to launch REBOL, load the protocol and open the database |
RUn.r | The Rebol Unit tool, many thanks to Christophe (christophe.coussement@mil.be) |
test-unit.r | To run one or many of the test case |
test-suite.r | To run all the test cases |
/test-suite | contains all the test case. |
/my-db | is the database for the test case |
/my-text-db | is the database for the Microsoft text driver compatibility test |
/my-db-save | contains the initial database for the test case |
/my-text-db-save | contains the initial database for the Microsoft text driver compatibility test |
bench-suite.r | To run all the bench cases |
/bench-suite | Contains all the bench case. |
/bench-db | is the database used for bench case |
/tutorial | contains the documentation source |
Text File Format.doc | The microsoft CSV file specification |