REBOL.org Script Library

Documention for: sql-protocol.r
Created by: coccinelle
on: 20-Nov-2004
Last updated by: coccinelle on: 3-Jan-2011
Format: html
Downloaded on: 30-Apr-2025

The SQL-PROTOCOL Documentation

SQL-PROTOCOL is a SQL Relationnal Database Management System (RDBMS) entirely written in REBOL with JOIN and SORT capability. This allow you having an easy to use lightweight database engine embeded in your REBOL application.

Today, sql-protocol execute only these kind of query :

SELECT ... FROM ... WHERE ... ORDER BY ...
INSERT ... INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
CREATE TABLE ...
DROP TABLE ...

Query can be submited either as a standard SQL query string or as a SQL like query dialect block.

by using SQL query string you will have a better compatibility with other database system like MySQL, Oracle or DB2.
by using SQL query dialect you will get advantage of REBOL scripting facility.

This quick example illustrates how to load the protocol, open a database, select some rows from two tables, probe the result and close the database.

Using standard SQL query string :

    do %sql-protocol.r
    db: open sql:my-db
    insert db {
        SELECT * FROM a, b
        WHERE a.c2 = b.c1 AND a.c1 = 1
        ORDER BY 1, 2 DESC
    }
    foreach item copy db [probe item]
    close db

The same using the SQL dialect :

    do %sql-protocol.r
    db: open sql:my-db
    insert db [
        SELECT * FROM a b
        WHERE a.c2 = b.c1 AND a.c1 = 1
        ORDER BY 1 [2 DESC]
    ]
    foreach item copy db [probe item]
    close db

Moreover, sql-protocol provide a basic compabibility with the ODBC text driver {Microsoft text driver (*.csv,*.txt)} in order to provide a quick and simple way to share data between REBOL application and any ODBC application,for example, MS Excel to produce table or chart,or MS Word to produce letters or mailing.

sql-protocol provide also a set of file protocol which can be used directly in your script :

DATA
text file containing a REBOL block for each row
HEAP
same as DATA but for transient table (in memory table)
CSV
delimited file by any caracter except doublequote ("), newline (^/) or linefeed (^M).
FIXED
fixed length file (not implemented up now)

NOTA BENE

This documentation include what is already realized but also on what I intend to develop. I indicate each time when something in not implemented if it will bo done soon or not.

1. Preface

1.1 SQL-PROTOCOL capability

2. Installing sql-protocol
3. Creating or Opening a database
4. Data Definition Statement

4.1 CREATE TABLE Syntax
4.2 DROP TABLE Syntax

5. Data Manipulation Statement

5.1 SQL Query String

5.1.1 DELETE Syntax
5.1.2 INSERT Syntax
5.1.3 SELECT Syntax
5.1.4 UPDATE Syntax
5.1.5 WHERE Condition

5.2 SQL Query Dialect

5.2.1 SQL dialect constraint and limitation
5.2.2 Selecting rows in a database

5.2.2.1 Simple SELECT
5.2.2.2 SELECT with JOIN
5.2.2.3 SELECT with LIKE
5.2.2.4 SELECT with ORDER BY

5.2.3 Inserting rows

5.2.3.1 INSERT example

5.2.4 Updating rows
5.2.5 Deleting rows

6. Database Definition

6.1 DDL dialect constraint and Limitation
6.2 Creating table

6.2.1 Creating DATA or HEAP tables
6.2.2 Creating CSV tables
6.2.3 Creating FIXED (FixedLength) table

6.3 Dropping a table

7. Using file protocol (DATA, HEAP, CSV and FIXED)

7.1 Using file protocol
7.2 Opening file protocol

7.2.1 Opening DATA or HEAP file
7.2.2 Opening CSV or FIXED file

7.3 Common usage of file protocol

7.3.1 Inserting record(s)
7.3.2 Changing record(s)
7.3.3 Removing record(s)
7.3.4 Getting record(s)
7.3.5 First, next, at, tail, tail?, etc...
7.3.6 Sorting record(s)
7.3.7 File and directory management.

8. Technical implementation

8.1 Database file and directory
8.2 Database shema

8.2.1 schema.ctl file
8.2.2 schema.ini file

8.3 File protocol
8.4 Select algorithm

9. Developpement status and plan

9.1 Test status
9.2 In the pipe

10 Full script history
11 sql-protocol.zip archive. NO MORE AVAILABLE

1. Preface

1.1 SQL-PROTOCOL capability

Query on the database can be done :

either with standard SQL string
or in a REBOL dialect very near to standard SQL query.

The engine carries out only the following requests:

SELECT ... FROM ... WHERE ... ORDER BY ...
INSERT ... INTO ... VALUES ...
INSERT ... INTO ... SET ... (not implemented up now)
INSERT ... INTO ... SELECT ... (not implemented up now)
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
CREATE TABLE ... [...] TYPE ...
DROP TABLE ...

SQL-PROTOCOL support 4 table types (format) :

DATA
format which is text file containing a REBOL block for each table row
HEAP
format which is for in memory table (transient table) containing a REBOL block for each row
CSV
format which is delimited file (CSVDelimited, TABDelimited or Delimited by any caracter except doublequote ("), newline (^/) or linefeed (^M).
FIXED
format which is fixed length file (Fixed) (not implemented up now)

By default :

DATA format is used when the type is not specified in the create table query.
CSV table use ";" as delimiter and header is on (for Excel compatibility)

2. Installing sql-protocol

Installing sql-protocol is very simple :

Download sql-protocol.r from http://www.rebol.org/library/scripts/sql-protocol.r in your prefered directory
Load the sql-protocol.r in you script with a simple do %your-directory/sql-protocol.r

Load the sql-protocol.r directly in your script from the www.rebol.org with a do-thru http://www.rebol.org/library/scripts/sql-protocol.r

3. Creating or Opening a database

Opening a database is very simple:

    open sql:my-database

In this case, the protocol open the database contained in the directory my-database.

If this directory does not exist, it is created.
If a schema.ini or schema.ctl file is found in this directory, this file is used
If the schema file does not exists, it is created.

You can specify the standard /new refinement in order to reinitialize the database (this is not implemented up to now but I will make it soon):

    open/new sql:my-database

In this case, all the file contained in the directory are erased and a new schema.ctl file is created.

Other refinement will cause an error (error are not thrown up to now but I will change it soon).

When you open a database, you can also precise the file name of the schema :

    open sql:my-database/my-schema.txt

In this case, if the schema file does'nt exist, it is created. It's the same with the directory and the /new refinement is also authorized.

4. Data Definition Statement

4.1 CREATE TABLE Syntax

4.2 DROP TABLE Syntax

5. Data Manipulation Statement

SQL-PROTOCOL implements the following queries :

SELECT
allows you to select rows from table(s).
INSERT
allows you to insert row(s) in a table.
UPDATE
allows you to update row(s) in a table.
DELETE
allows you to delete row(s) from a table.

SQL-PROTOCOL implements the following WHERE clauses :

... AND ...
... OR ...
... LIKE ...
any logical operation (, >, <, NOT, ...)

5.1 SQL Query String

5.1.1 DELETE Syntax

    DELETE FROM tbl_name
        [WHERE where_condition]

The DELETE statement deletes rows from tbl_name and returns the number of rows deleted. The WHERE clause, if given, specifies the conditions that identify which rows to delete. With no WHERE clause, all rows are deleted.

where_condition is an expression that evaluates to true for each row to be deleted.

5.1.2 INSERT Syntax

    INSERT [INTO] tbl_name [(col_name,...)]
        VALUES (expr,...),(...),...

Or:

    INSERT [INTO] tbl_name
        SET col_name=expr, ...

(not implemented up now)

Or:

    INSERT [INTO] tbl_name [(col_name,...)]
        SELECT ...

(not implemented up now)

INSERT inserts new rows into an existing table. The INSERT ... VALUES and INSERT ... SET forms of the statement insert rows based on explicitly specified values. The INSERT ... SELECT form inserts rows selected from another table or tables.

tbl_name is the table into which rows should be inserted. The columns for which the statement provides values can be specified as follows:

You can provide a comma-separated list of column names following the table name. In this case, a value for each named column must be provided by the VALUES list or the SELECT statement.
If you do not specify a list of column names for INSERT ... VALUES or INSERT ... SELECT, values for every column in the table must be provided by the VALUES list or the SELECT statement.
The SET clause indicates the column names explicitly.

Column values can be given in several ways:

Any column not explicitly given a value is set to its default value.
You can specify an expression expr to provide a column value.
An expression expr can refer to any column that was set earlier in a value list.

For example, you can do this because the value for col2 refers to col1, which has previously been assigned:

    INSERT INTO tbl_name (col1,col2) VALUES(15,col1*2);

But the following is not legal, because the value for col1 refers to col2, which is assigned after col1:

    INSERT INTO tbl_name (col1,col2) VALUES(col2*2,15);

INSERT statements that use VALUES syntax can insert multiple rows. To do this, include multiple lists of column values, each enclosed within parentheses and separated by commas. Example:

    INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);

The values list for each row must be enclosed within parentheses. The following statement is illegal because the number of values in the list does not match the number of column names:

    INSERT INTO tbl_name (a,b,c) VALUES(1,2,3,4,5,6,7,8,9);

5.1.3 SELECT Syntax

    SELECT
        [DISTINCT]
        select_expr, ...
        FROM table_references
        [WHERE where_condition]
        [ORDER BY {col_name | expr | position}
          [ASC | DESC], ...]

SELECT is used to retrieve rows selected from one or more tables.

To query the database, you have to insert the query string into the port and next to get the result either by a copy, a first or a pick.

Quick example:

    db: open sql:my-db
    insert db {SELECT * FROM my-table}
    foreach item copy db [probe item]
    close db

The most commonly used clauses of SELECT statements are these:

Each select_expr indicates a column that you want to retrieve. There must be at least one select_expr.
table_references indicates the table or tables from which to retrieve rows.
The WHERE clause, if given, indicates the condition or conditions that rows must satisfy to be selected.

Clauses used must be given in exactly the order shown in the syntax description.

The DISTINCT option specify whether duplicate rows should be returned. If this options is not given, all matching rows are returned. DISTINCT specify removal of duplicate rows from the result set.
A table reference can be aliased using tbl_name AS alias_name or tbl_name alias_name:

    SELECT t1.name, t2.salary FROM employee AS t1, info AS t2
      WHERE t1.name = t2.name;

    SELECT t1.name, t2.salary FROM employee t1, info t2
      WHERE t1.name = t2.name;

WHERE condition is an expression that evaluates to true for each row to be selected. The statement selects all rows if there is no WHERE clause. In the WHERE clause, you can use any of the functions and operators sql-protocol supports.
It is not allowable to use a column alias in a WHERE clause, because the column value might not yet be determined when the WHERE clause is executed.
To sort in reverse order, add the DESC (descending) keyword to the name of the column in the ORDER BY clause that you are sorting by. The default is ascending order; this can be specified explicitly using the ASC keyword.
Columns selected for output can be referred to in ORDER BY clauses using column names, column aliases, or column positions. Column positions are integers and begin with 1:

    SELECT college, region, seed FROM tournament
      ORDER BY region, seed;

    SELECT college, region AS r, seed AS s FROM tournament
      ORDER BY r, s;

    SELECT college, region, seed FROM tournament
      ORDER BY 2, 3;

5.1.4 UPDATE Syntax

    UPDATE [LOW_PRIORITY] [IGNORE] tbl_name
        SET col_name1=expr1 [, col_name2=expr2 ...]
        [WHERE where_condition]

The UPDATE statement updates columns of existing rows in tbl_name with new values.

The SET clause indicates which columns to modify and the values they should be given.
The WHERE clause, if given, specifies the conditions that identify which rows to update. With no WHERE clause, all rows are updated.

WHERE condition is an expression that evaluates to true for each row to be updated.

If you access a column from tbl_name in an expression, UPDATE uses the current value of the column. For example, the following statement sets the age column to one more than its current value:

    UPDATE persondata SET age=age+1;

UPDATE assignments are evaluated from left to right. For example, the following statement doubles the age column, and then increments it:

    UPDATE persondata SET age=age*2, age=age+1;

5.1.5 WHERE Condition

5.2 SQL Query Dialect

5.2.1 SQL dialect constraint and limitation

AS clause for columns is not implemented (needed and will be done probably soon)
BETWEEN and IN are not implemented (will be done later certainly)
GROUP BY and HAVING clause, and statistical function (COUNT, MAX, ...) are not implemented (will be done later certainly)
only one select strategy is implemented (table scan / nested loop) so poor response time can occurs

NOTA BENE

Remember, it's a REBOL dialect so you have to respect REBOL syntax :

* don't forget to place a space before an after the = or all other operator (+, -*, /, >, <, ).

string must be placed between double quote ou brackets pair ("..." or {...})

5.2.2 Selecting rows in a database

The SQL dialect for SELECT is very similar to the SQL To query the database, you have to insert the query block into the port and next to get the result either by a copy, a first or a pick.

Quick example:

    db: open sql:my-db
    insert db [SELECT * FROM my-table]
    foreach item copy db [probe item]
    close db

5.2.2.1 Simple SELECT

To select all columns from a table you can use the * :

    insert db [SELECT * FROM a]

If you want to select some specific columns,

    insert db [SELECT c1 c2 FROM a]

You can also use the dot notation alias.column:

    insert db [SELECT a.c1 a.c2 FROM a]

You can also use the path notation alias/column:

    insert db [SELECT a/c1 a/c2 FROM a]

Same rules apply to the WHERE clause :

    insert db [SELECT * FROM a WHERE c1 = 2]
    insert db [SELECT * FROM a WHERE a.c1 = 2]
    insert db [SELECT * FROM a WHERE a/c1 = 2]

You can also give an alias name to a table. In this case, you have to set the table / alias pair into a block. The AS word is optional :

    insert db [SELECT t1.c1 t1.c2 FROM [a t1]]
    insert db [SELECT t1.c1 t1.c2 FROM [a AS t1]]

Nota Bene

Remember, it's a REBOL dialect so you have to respect REBOL syntax and don't forget to place a space before an after the =or all other operator (+, -*, /, >, <, ).

5.2.2.2 SELECT with JOIN

To do a join, you do simply :

    insert db [SELECT * from a b]

If you want select some columns from a table and all from another, you can use the alias.* notation :

    insert db [SELECT a.c1 b.* FROM a b]

When you join a table with itself, you must give an alias for each table :

    insert db [SELECT t1.c1 t2.c1 FORM [a AS t1] [a AS t2] WHERE t1.c1 = t2.c2]

Nota Bene

Remember, it's a REBOL dialect so you have to respect REBOL syntax and don't forget to place a space before an after the =or all other operator (+, -*, /, >, <, ).

5.2.2.3 SELECT with LIKE

You can use the LIKE clause in your query :

    insert db [SELECT * FROM a WHERE c1 LIKE "my*"]

LIKE clause use * and ? as wild card and not the standard SQL % and _

5.2.2.4 SELECT with ORDER BY

You can give an ORDER BY clause in order to sort the result.

You can either give the column(s) number

    insert db [SELECT * FROM a ORDER BY 1 2 3]

Or you can give the columns name by its alias.columns name :

    insert db [SELECT * FROM a ORDER BY a.c1 a.c2 a.c3]

By default, the engine sort the result in ascending order, but you can sort it in descending order by giving the pair alias.columns ASC or DESC in a block :

    insert db [SELECT * FROM a ORDER BY [a.c1 ASC] [a.c2 DESC]]

ASC or DESC can also be given with columns index :

    insert db [SELECT * FROM a ORDER BY [1 ASC] 2 [DESC]]

Limitation

Remember that GROUP BY, HAVING and statistical function (COUNT(...), MAX(...), etc...) are not implemented. It's the same for BETWEEN and IN word.

5.2.3 Inserting rows

For INSERT, the dialect is quiet simple.

Quick example:

    insert db [INSERT INTO a VALUES [1 2 3]]

When you insert a row in a table, you must give a value for all columns otherwise they are set to none (NULL).
When you give the columns list, you must give all the columns otherwise the mussing columns are set to none (NULL).
The value are inserted in the table columns order even if the columns are in another order in the columns list.
You can give many rows after the VALUES clause just by giving many rows (one block of value for each rows)
You can also insert rows with values selected from another query (not implemented now)

5.2.3.1 INSERT example

Inserting one row without specifying the column list :

    insert db [INSERT INTO a VALUES [1 2 3]]

Inserting one row with a column list :

    insert db [INSERT [c b a] INTO a VALUES [3 2 1]]

Inserting many rows at a time :

    insert db [INSERT INTO a VALUES [1 2 3] [2 3 4] [3 4 5]]

Inserting rows selected from another query (this is not implemented but it will be soon because it's easy to make) :

    insert db [INSERT INTO temp SELECT * FORM a b WHERE a.c2 = b.c1 ORDER BY a.c1 b.c1]

5.2.4 Updating rows

For UPDATE, the dialect is also very similar to SQL.

Quick example :

    insert db [UPDATE a SET c2: 3 WHERE c1 = 1]

You can update only one table at a time.
If the WHERE clause correspond to many rows, all the corresponding rows are updated
The where clause is optional. If WHERE is ommited, all the row of the table are updated.

Nota Bene

You must use the simple column notation, not the alias.columnnotation

If you want, you can set many columns at a time :

    insert db [UPDATE a SET c2: 3 c3: 4 WHERE c1: 1]

Nota Bene

Remember, it's a dialect so you have to respect REBOL syntax and don't forget to place a space before an after the =or all other operator (+, -*, /, >, <, ).

5.2.5 Deleting rows

For DELETE, it's very simple.

Quick example :

    insert db [DELETE FROM a WHERE c1 = 1]

If you ommit the WHERE clause, all the rows are deleted.
If the WHERE clause correspond to many rows, all the corresponding rows are deletes

Nota Bene

You must use the simple column notation, not the alias.columnnotation

Deleting all rows in a table :

    insert db [ DELETE FROM a ]

6. Database Definition

SQL-PROTOCOLimplements the following DDL (Data Definition Language) queries:

CREATE TABLE
which allows you to create a new table.
DROP TABLE
which allows you to delete a table.

6.1 DDL dialect constraint and Limitation

FIXED table is not implemented (up now, only DATA, HEAP and CSV table can be used)
columns type and length (CHAR, NUMBER, etc...) is not implemented for DATA and HEAP table(probably will be never done)
sql-protocol load all the data in memory, so table with many thousand of rows can be very slow or fail (I am waiting for the rif protocol which will be integrated as soon as possible)

6.2 Creating table

For CREATE TABLE the dialect is extended in order to have a better control on the file creation.

Quick example .

    insert db [CREATE TABLE a [c1 c2 c3]]

If you ommit the table type, DATA format is used
You can specify the type of the table (DATA, HEAP, CSV or FIXED)
If you ommit the columns type and length, untyped columns is create (can contain any REBOL type of data)
Columns type is depending of table type.
If you specify the IF NOT EXIST clause, the table is created only if it doesn't exist, otherwise an error is thrown

6.2.1 Creating DATA or HEAP tables

Creating a table without column type :

    insert db [ CREATE-TABLE table-name [C1 C2 C3] ]

As REBOL is an untyped programming language, it's more simple to ommit the columns type for DATA or HEAP protocol.

Creating a DATA or HEAP table with full columns description (this is not implement up to now and I will make later) :

    insert db [ CREATE TABLE a [
        [F01 binary!]
        [F02 bitset!]
        [F03 block!]
        [F04 char!]
        [F05 datatype!]
        [F06 date!]
        [F07 decimal!]
        [F08 email!]
        [F10 file!]
        [F11 get-word!]
        [F12 hash!]
        [F13 hex!]
        [F14 image!]
        [F15 integer!]
        [F16 issue!]
        [F17 list!]
        [F18 lit-path!]
        [F19 lit-word!]
        [F20 logic!]
        [F21 money!]
        [F22 pair!]
        [F23 paren!]
        [F24 path!]
        [F25 refinement]
        [F26 set-path]
        [F27 set-word]
        [F28 string]
        [F29 tag]
        [F30 time]
        [F31 tuple]
        [F32 url]
        [F33 word]
    ] TYPE = HEAP ]

6.2.2 Creating CSV tables

You can create a CSV table without giving the columns type and length. In this case, you can put what you want in each row but this is not supported by {Microsoft text driver).

    insert db [ CREATE TABLE a [C1 C2 C3] type = CSV]

By default CSV table include the header, the format is Delimited and use semicolumn delimiter (;).

Creating a CSV table with full columns description. By default CSV table include the header, the format is Delimited and use semicolumn delimiter (;).

    insert db [ CREATE TABLE a [
        [F01 Bit]
        [F02 Byte]
        [F03 Char Width 255]
        [F04 Currency]
        [F05 Date]
        [F06 Float]
        [F07 Integer]
        [F08 LongChar]
        [F09 Short]
        [F10 Single]
    ] TYPE = CSV ]

Creating a CSV table with full columns description and specific table type

    insert db [ CREATE TABLE a [
        [C1 Integer]
        [C2 Char 255]
        [C3 Date]
    ] TYPE = [ CSV ColNameHeader: false format: CSVDelimited ] ]

header can be true or false
format can be Delimited, CSVDelimited, TABDelimited
delimiter can be any char or string

except doublequote (") line-feed (^M) or newline (^/)
is mandatory for Delimited format
must not be specified for CSV or TAB Delimited

Nota Bene

TYPE [....] is still not implemented, only TYPE CSV is allowed

6.2.3 Creating FIXED (FixedLength) table

Creating a FIXED table with full columns description

    insert db [ CREATE TABLE a [
        [F01 Bit 1]
        [F02 Byte 3]
        [F03 Char 255]
        [F04 Currency 15]
        [F05 Date 6]
        [F06 Float 21]
        [F07 Integer 11]
        [F08 LongChar 32]
        [F09 Short 5]
        [F10 Single 255]
    ] TYPE = FIXED ]

With FixedLength table, if you don't give the column type, CHAR 255 is used.
format can be Delimited, CSVDelimited, TABDelimited

You can also precise if the header is saved in the file or not:

    insert db [ CREATE TABLE a [
        [C1 Integer 11]
        [C2 Char 255]
        [C3 Date 6]
    ] TYPE = [ FIXED ColnameHeader: true ] ]

ColNameHeader can be true or false

6.3 Dropping a table

To drop a table, the dialect is very simple:

    insert db [ DROP TABLE table-name ]

If the table doesn't exist, an error is thrown.

7. Using file protocol (DATA, HEAP, CSV and FIXED)

You can if needed use directly the DATA, HEAP, CSV and FIXED protocol. These protocols are named file protocol.

7.1 Using file protocol

DATA, HEAP, CSV and FIXED protocol are usefull to store and access REBOL datatype block thru protocol.

DATA
to store REBOL values as block of values into a file.
HEAP
to store REBOL values as block of values into memory.
CSV
to store REBOL value as character separated records.
FIXED
to store REBOL value as fixed length record.

The advantage is to have with a standard input, output and interface access to many file format. These normalized access simplify file read and write and reduce the debuggind time. It's allows you to insert, update or delete records as simple as managing blocks.

The usage of file protocol differs form one protocol to the other, only when you open the file. All other access are standized.

7.2 Opening file protocol

7.2.1 Opening DATA or HEAP file

To open a DATA or HEAP file, you use the standard open function :

    my-file: open DATA:my-datafile.data

If the extension is ommited, the .data exetension is automaticaly added.
If the file doesn't exist, it is created.
If you use the /new extension, the file is cleared during the open.
When /line and /direct refinement are used an error is thrown.

7.2.2 Opening CSV or FIXED file

To open a CSV or FIXED file, you use the standard open function :

    my-file: open CSV:my-datafile.data

If the file doesn't exist, it is created.
If you use the /new extension, the file is cleared during the open.
When /line and /direct refinement are used an error is thrown.

If you want to precise the file structure, you have to make the port before opening it and to give the file schema in a block :

    my-file: open make port! [scheme: 'csv schema: [
        ColNameHeader: true
        format: "Delimited" ; or CSVDelimited or TABDelimited
        delimiter: ";"      ; or any character except doublequote ("), line feed (^M) and newline (^/)
        cols: [
            [C1 Char width 10]
            [C2 Integer]
        ]
    ]

If you don't give the schema, the CSV protocol detect automaticaly the fileds and their type. By default, char field have a length of 255.
For FIXED (FixedLength) file, the schema and field length are mandatory.

7.3 Common usage of file protocol

7.3.1 Inserting record(s)

To insert data, use the standard insert function :

    insert my-file ["data 1" "data 2" 3 4]

If you want to insert many records, you can insert a block containing one or more blocks :

    insert my-file [
        ["Record 1" 1 2 3]
        ["Record 2" 2 3 4]
    ]

If you want to insert one record containing blocks, you can une the /only refinement:

    insert/only my-file [
        ["Value 1"]
        ["Value 2"]
    ]

To insert data at end of the file, use the standard append function :

    insert my-file ["data 1" "data 2" 3 4]

To insert data for example at the third position, use the standard at function :

    my-file: at my-file 3
    insert my-file ["data 1" "data 2" 3 4]

7.3.2 Changing record(s)

To change data, use the standard change function :

    my-file: open DATA:my-data-file.data
    my-file: at my-file 3
    change my-file ["data 1" "data 2" 3 4]
    close my-file

This example change the third record.

If you want to change many records, you can change a block containing one or more blocks :

    change my-file [
        ["Record 1" 1 2 3]
        ["Record 2" 2 3 4]
    ]

If you want to change one record containing blocks, you can une the /only refinement:

    change/only my-file [
        ["Value 1"]
        ["Value 2"]
    ]

If you want to change for example 1 by 3 record, you can use the /part refinement

    change/part my-file [
        ["Record 1" 1 2 3]
        ["Record 2" 2 3 4]
        ["Record 3" 4 5 6]
    ] 1

7.3.3 Removing record(s)

To remove record (s), use the standard remove function :

    remove my-file

If you want to remove many record, use the /part refinement. For example, to to remove 3 records, do this :

    remove/part my-file 3

7.3.4 Getting record(s)

To get data, you can use either the standard copy or pick function

    probe pick my-file 1
    probe copy my-file

With the copy function you can use the /part refinement:

    probe copy/part my-file 5

7.3.5 First, next, at, tail, tail?, etc...

With file protocol, you can it as for series

7.3.6 Sorting record(s)

You can sort the records in the file. It's very usefull and can be done like this

    my-file: open data:my-file.data
    sort my-file
    close my-file

7.3.7 File and directory management.

With file protocol, you can manage file and directory. This can done in the same way as standard file.

You can use the following function :

query
to obtain file or directory information
exists
to verify if a file or directory exists
make-dir
to create a directory
delete
to delete a file or directory

    query data:my-file.data
    exsits data:my-file.data
    make-dir data:my-directory/
    delete/any data:my-directory/*.data

8. Technical implementation

8.1 Database file and directory

For each database, the table file structure is stored in a file.

schema.ctl
is used for default database type. It's a standard REBOL script file which is simple loaded when the database is opened.
schema.ini
is used for {Microsoft text driver} database. It's a standard .ini file

All the table files are stored in the same directory as the schema file.

8.2 Database shema

8.2.1 schema.ctl file

In schema.ctl file, SQL-PROTOCOL store the schema structure in the following manner :

    table-name: [
        scheme: table-file-schema
        target: table-file-name
        schema: [
            cols: [
                col-specs-1
                ...
                col-specs-n
            ]
            property-1: value-1
            ...
            property-n: value-n
        ]
    ]

    col-specs = col-name
    col-specs = [col-name col-type col-length]

The schema.ctl is not implemented like this now, but this will be done soon. Actualy, the schema is stored in a database.ctl file with a different format

Example schema for DATA or HEAP protocol :

    data-table: [
        scheme: 'DATA
        target: %data-table.data 
        schema: [
            cols: [
                col-1            ; untyped column
                [col-2 date!]    ; a date
            ]
        ]
    ]

Example schema for CSV or FIXED protocol :

    csv-table: [
        scheme: 'CSV
        target: %csv-table.csv
        schema: [
            ColnameHeader: false
            format: 'Delimited
            delimiter: ";"
            cols: [
                col-1            ; untyped column
                [col-2 char 32]  ; 32 char
            ]
        ]
    ]

8.2.2 schema.ini file

The schema.ini file is compatible with the {Microsoft text driver (*.csv *.txt)} You can find the specification here :

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetschema_ini_file.asp

Here is an example of schema.ini with all possible columns type :

    [csvdelimited.csv]
        ColNameHeader=True
        Format=CSVDelimited
        MaxScanRows=0
        CharacterSet=OEM
        Col1=F01 Bit
        Col2=F02 Byte
        Col3=F03 Char Width 255
        Col4=F04 Currency
        Col5=F05 Date
        Col6=F06 Float
        Col7=F07 Integer
        Col8=F08 LongChar
        Col9=F09 Short
        Col10=F10 Single
    [fixedlength.txt]
        ColNameHeader=True
        Format=FixedLength
        MaxScanRows=0
        CharacterSet=OEM
        Col1=F01 Bit Width 1
        Col2=F02 Byte Width 3
        Col3=F03 Char Width 10
        Col4=F04 Currency Width 15
        Col5=F05 Date Width 6
        Col6=F06 Float Width 21
        Col7=F07 Integer Width 11
        Col8=F08 LongChar Width 32
        Col9=F09 Short Width 5
        Col10=F10 Single Width 255

8.3 File protocol

I decided to use file protocol to acces to table data. This choice is motivated by two reason :

The interface of all protocol is standardised, so I didn't need to reinvente the wheel
These protocol can be use also outside the database so it can be usefull for other application.

One important point was to implemement the directory capability so I can reused the standard function for the creation and deletion of table file.

Also, by implementing the HEAP protocol, I could use always the same interface even if data could be stored in a simple block in memory.

8.4 Select algorithm

I use a unique SELECT strategy, it's a table scan / nested loop algorithm.

The algorithm is quiet simple, for each table in the SELECT, the script scan all the row of the table, and for each row, scan all the rows of the next table, and when it's the last table, the where clause is simply applied.

The speed performance is not verw good but that's enough with small table and few join. In fact, what is most time consuming, it's the insert in the resulting block when thousand of rows are inserted.

9. Developpement status and plan

9.1 Test status

DATA, HEAP and CSV protocols are quiet well tested.
SQL protocol is quiet well tested with DATA, HEAP and CSV table.
SQL SELECT, UPDATE and DELETE are quiet well tested
SQL CREATE and DROP are tested but they need more tests.

9.2 In the pipe

Incomplete list :-(

Implementation of IN (perhaps)
Implementation of GROUP BY, HAVING ans statistic function (COUNT, MAX, ...)
Implementation of column AS clause in the SELECT
Implementation of SET instead of VALUES clause in INSERT query (INSERT INTO ... SET ...)
Implementation of SELECT instead of VALUES clause in INSERT query (INSERT INTO ... SELECT ...)
Implementation of the open/NEW refinement for SQL:..
Raise an error when other refinement are used with open sql:...
Implementation of column type and length.
Improvement of CSV protocol for an automatic detection of columns type when schema is not given.
Implementation of RIF protocol. Perhaps, I will make a small protocol which handle only the RIF table format but certainly with a completely database organisation in order to store all the tables and also the schema in one unique file.
Implementation of FixedLength file (FIXED protocol).
Implementation of columns without alias.column notation and AS clause for columns in SELECT query
Implementation of SQL string request instead of SQL dialect (or keep both)
More and more, improve performance and simplify the script

10 Full script history

        0.0.1 [15-Sep-2004 {Initial alpha version} marco@adyreb.org]
        0.0.2 [16-Sep-2004 {Documentation & order bug correction & tests} marco@adyreb.org]
        0.0.3 [17-Sep-2004 {Change path notation to dot notation for columns} marco@adyreb.org]
        0.0.4 [20-Sep-2004 {Add table alias and where normalization} marco@adyreb.org]
        0.0.5 [21-Sep-2004 {Interface stabilization, sort improvement} marco@adyreb.org]
        0.0.6 [22-Sep-2004 {First optimization and select * or select table.* implementation} marco@adyreb.org]
        0.0.7 [27-Sep-2004 {RAPID Documentation, to-rebol-cols debugging and order interface change} marco@adyreb.org]
        0.1.0 [28-Sep-2004 {Change in provision of SQl protocol - DATA: protocol & database object} marco@adyreb.org]
        0.1.1 [29-Sep-2004 {Change of test case & documentation & SQL: protocol} marco@adyreb.org]
        0.1.2 [30-Sep-2004 {Dismiss of RAPID documentation & various changes} marco@adyreb.org]
        0.1.3 [04-Oct-2004 {Transient debug & developpment of sql INSERT, UPDATE & DELETE} marco@adyreb.org]
        0.1.4 [06-Oct-2004 {Change in UPDATE syntaxe and DELETE debug} marco@adyreb.org]
        0.1.5 [08-Oct-2004 {Major change to resolve the path problem} marco@adyreb.org]
        0.1.6 [11-Oct-2004 {Implements HEAP protocol for transient table} marco@adyreb.org]
        0.1.7 [12-Oct-2004 {Implements CREATE/DROP TABLE and HEAP & DATA optimization} marco@adyreb.org]
        0.1.8 [13-Oct-2004 {Improve path an target management & add get-url function} marco@adyreb.org]
        0.2.0 [13-Oct-2004 {First alpha published on www.rebol.org} marco@adyreb.org]
        0.2.1 [15-Oct-2004 {Change on the data protocol - implement directory management} marco@adyreb.org]
        0.2.2 [18-Oct-2004 {Try to improve select performance} marco@adyreb.org]
        0.2.3 [27-Oct-2004 {Change on the data and heap protocol - improve directory management} marco@adyreb.org]
        0.3.0 [11-Nov-2004 {Add CSV protocol, change on DATA protocol and preparation to FIXED protocol} marco@adyreb.org]
        0.3.1 [19-Nov-2004 {Various bug correction and header extension} marco@adyreb.org]
        0.3.2 [14-Nov-2004 {Add SORT on DATA and CSV protocol plus bug correction (thank Shad)} marco@adyreb.org]
        0.4.0 [14-Dec-2004 {Implement new schema.ctl and extend TYPE = clause} marco@adyreb.org]
        0.4.1 [20-Dec-2004 {Implement schema.ini and extend test case} marco@adyreb.org]
        0.4.2 [10-Jan-2005 {Test, debug and correction on port/local/table and schema.ctl} marco@adyreb.org]
        0.4.3 [13-Jan-2005 {Lot of test, debug and correction on compatibility with Microsoft Text Driver} marco@adyreb.org]
        0.5.0 [17-Jan-2005 {Alpha version published on www.rebol.org} marco@adyreb.org]
        0.6.0 [17-Jan-2005 {More flexible SQL dialect (FROM clause)} marco@adyreb.org]
        0.6.1 [16-Mar-2005 {More flexible SQL dialect (columns, WHERE and ORDER BY clause)} marco@adyreb.org]
        0.6.2 [29-Mar-2005 {Extends test case} marco@adyreb.org]
        0.6.3 [11-May-2005 {End of extended test and publication to library} marco@adyreb.org]
        0.6.4 [02-Feb-2006 {Add LIKE clause + some bug correction} marco@adyreb.org]
        0.6.5 [05-Feb-2006 {First attempt of SQL string parsing + some bug correction} marco@adyreb.org]
        0.6.6 [07-Feb-2006 {Correction of a bug when using word in the SQL dialect} marco@adyreb.org]
        0.6.7 [08-Feb-2006 {Improvement of word handling in dialect} marco@adyreb.org]
        0.6.8 [04-Mar-2006 {Implement SQL parsing for INSERT, UPDATE & DELETE clauses} marco@adyreb.org]

11 sql-protocol.zip archive. NO MORE AVAILABLE

You can find the sql-protocol.zip archive in the french framework here :

http://www.agora-dev.org/forums/view.php?site=rebol&bn=rebol_protocols&key=1095276031

The archive contains :

sql-protocol.r
The REBOL sql database protocol

launch.r
A simple script to launch REBOL, load the protocol and open the database

RUn.r
The Rebol Unit tool, many thanks to Christophe (christophe.coussement@mil.be)
test-unit.r
To run one or many of the test case
test-suite.r
To run all the test cases
/test-suite
contains all the test case.

/my-db
is the database for the test case
/my-text-db
is the database for the Microsoft text driver compatibility test

/my-db-save
contains the initial database for the test case
/my-text-db-save
contains the initial database for the Microsoft text driver compatibility test

bench-suite.r
To run all the bench cases
/bench-suite
Contains all the bench case.
/bench-db
is the database used for bench case

/tutorial
contains the documentation source

Text File Format.doc
The microsoft CSV file specification

DATA	text file containing a REBOL block for each row
HEAP	same as DATA but for transient table (in memory table)
CSV	delimited file by any caracter except doublequote ("), newline (^/) or linefeed (^M).
FIXED	fixed length file (<u>not implemented up now</u>)

DATA	format which is text file containing a REBOL block for each table row
HEAP	format which is for in memory table (transient table) containing a REBOL block for each row
CSV	format which is delimited file (CSVDelimited, TABDelimited or Delimited by any caracter except doublequote ("), newline (^/) or linefeed (^M).
FIXED	format which is fixed length file (Fixed) (<u>not implemented up now</u>)

SELECT	allows you to select rows from table(s).
INSERT	allows you to insert row(s) in a table.
UPDATE	allows you to update row(s) in a table.
DELETE	allows you to delete row(s) from a table.

CREATE TABLE	which allows you to create a new table.
DROP TABLE	which allows you to delete a table.

DATA	to store REBOL values as block of values into a file.
HEAP	to store REBOL values as block of values into memory.
CSV	to store REBOL value as character separated records.
FIXED	to store REBOL value as fixed length record.

query	to obtain file or directory information
exists	to verify if a file or directory exists
make-dir	to create a directory
delete	to delete a file or directory

schema.ctl	is used for default database type. It's a standard REBOL script file which is simple loaded when the database is opened.
schema.ini	is used for {Microsoft text driver} database. It's a standard .ini file

RUn.r	The Rebol Unit tool, many thanks to Christophe (christophe.coussement@mil.be)
test-unit.r	To run one or many of the test case
test-suite.r	To run all the test cases
/test-suite	contains all the test case.

/my-db	is the database for the test case
/my-text-db	is the database for the Microsoft text driver compatibility test

/my-db-save	contains the initial database for the test case
/my-text-db-save	contains the initial database for the Microsoft text driver compatibility test

bench-suite.r	To run all the bench cases
/bench-suite	Contains all the bench case.
/bench-db	is the database used for bench case

The SQL-PROTOCOL Documentation

Contents