Documentation for: dict-scheme.r


Usage document for %dict-scheme.r

1. Introduction to %dict-scheme.r

dict-scheme.r is a sample port scheme for the dict:// protocol, which is defined in RFC 2229 and hosted at http://dict.org.

1.1. Credits to Jeff Kreis

The original REBOL dict:// port handler was found in the dict.org resources.

  • scheme: A REBOL term for a port protocol
  • protocol: A set of formal rules describing how to transmit data, especially across a network
  • port: A REBOL I/O interface to external series

2. dict-scheme At a Glance

No setup is required; just do the script.

>> do %dict-scheme.r
 

2.1. A small demo application

For a demo and a few console shortcuts, run the demo script straight from the library:
 >> do read http://www.rebol.org/library/scripts/dict-demo.r
 

3. DICT protocol defined URLs

 dict://<user>;<auth>@<host>:<port>/d:<word>:<database>
 dict://<user>;<auth>@<host>:<port>/m:<word>:<database>:<strat>
 

The REBOL dict scheme will allow for

 dict:///<word>
 

as a shortcut that uses the default host and port, with /define: (an alias for /d:) as the default action, and

 dict://
 

to read the databases included with the default host.

This port handler extends /d: and /m: to include

  • /match:word:database:strat
  • /define:word:database
  • /help:
  • /strat:
  • /server:
  • /info:database
  • /status:

Without the terminating colon, these words are treated as ordinary definition lookups; for example, dict:///help looks up the definition of "help".
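
The aliases above can be sketched as follows. This is only an illustration, assuming %dict-scheme.r is in the current directory and the default host is reachable; the word "protocol" and the foldoc database are arbitrary choices.

```rebol
do %dict-scheme.r

; Short RFC 2229 form and the long alias added by this scheme.
; Both ask the default host to define "protocol" using foldoc.
print read dict:///d:protocol:foldoc
print read dict:///define:protocol:foldoc

; No action and no trailing colon: a plain definition lookup.
print read dict:///protocol
```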

4. Using %dict-scheme.r

As with other REBOL network interfaces, you can simply

>> print read dict://
 

Here dict:// is the scheme, and the %dict-scheme.r file includes a default host of all.dict.org, so this is equivalent to print read dict://all.dict.org/

4.1. Different functions of the dict protocol

The dict protocol allows for definition lookups and spell-checker style matches.

Some examples

 read dict:///
   Returns the default list of databases from the default host, all.dict.org.
 read dict:///server:
   Returns a server summary of the default host. Note the three slashes.
 read dict://vocabulary.aioe.org/server:
   Returns a server summary of another dict protocol host, a very complete one.
 read dict://dict.org/d:protocol:foldoc
   Reads the definition of "protocol" from the Free On-line Dictionary of Computing.
 read dict://dict.org/match:japan:soundex
   Does a soundex spell-checker match for "japan". match: is an extension alias for m:
 read dict:///strat:
   Returns a list of match strategies, "exact" being the usual default.
 read dict:///define:Canada:world95
   Returns the complete entry for Canada from the CIA World Factbook, 1995 edition.

5. Return formats

This port handler returns a block! for results. For all queries other than define, the block contains a single string!. For /d: (define) queries, the block contains one inner block! per definition. So read dict://dict.org/define:test returns a structure such as

 [[{First def}] [{Second def}]]
 

that is, a block of blocks, each holding one definition string.
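
A minimal sketch of walking that structure, assuming %dict-scheme.r is loaded and dict.org answers the query. The "----" separator is purely cosmetic.

```rebol
do %dict-scheme.r

defs: read dict://dict.org/define:test

; One inner block per definition, each holding a single string.
print ["Definitions found:" length? defs]
foreach def defs [
    print first def     ; the definition text
    print "----"
]
```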

6. Sophisticated URLs

Some of the spell-check and word lookups may require characters that confuse the built-in REBOL URL parser.

6.1. Spaces in <words>

RFC 2229 allows either double or single quotes around words that contain spaces. REBOL does not allow a URL to contain double quotes ("), so use single quotes (') instead.

Some regular expression characters also confuse the URL parser. Luckily, REBOL allows a port to be opened using a control block.

6.2. Block port specs

The REBOL dict scheme can be accessed using the standard REBOL block spec for ports.

 read [scheme: 'dict  target: {define:'caveat emptor':bouvier}]
 

will allow you to look up definitions for words with spaces, and

 read [scheme: 'dict  host: "vocabulary.aioe.org" target: {match:[a|b|c]$:gcide:re}]
 

will allow you to take full advantage of the re strategy when doing word matches.
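
When the phrase comes from a variable, the block spec can be built with compose. The sketch below assumes %dict-scheme.r is loaded; define-phrase is a hypothetical helper name, not something the script provides.

```rebol
do %dict-scheme.r

; define-phrase is a hypothetical helper, not part of %dict-scheme.r.
; It wraps the phrase in single quotes and builds the block spec.
define-phrase: func [phrase [string!] database [string!]] [
    read compose [
        scheme: 'dict
        target: (rejoin ["define:'" phrase "':" database])
    ]
]

print define-phrase "caveat emptor" "bouvier"
```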

6.3. Open ports

You can use open to access a dict port, and then use copy to read the data. Using insert on the port will usually cause an error, but reading with copy works fine.
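
A short sketch of the open, copy, close sequence, assuming %dict-scheme.r is loaded and the default host is reachable. The word and database are arbitrary.

```rebol
do %dict-scheme.r

port: open dict:///define:protocol:foldoc
defs: copy port     ; read the results through the open port
close port

print defs
```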

7. The dict protocol hosts

There are quite a few active dict protocol servers. The main list is kept at the DICT protocol Server List.

8. Strategies for Match

These are subject to change and may be different for each host, but all.dict.org supports

  • exact "Match headwords exactly"
  • prefix "Match prefixes"
  • substring "Match substring occurring anywhere in a headword"
  • suffix "Match suffixes"
  • re "POSIX 1003.2 (modern) regular expressions"
  • regexp "Old (basic) regular expressions"
  • soundex "Match using SOUNDEX algorithm"
  • lev "Match headwords within Levenshtein distance one"
  • word "Match separate words within headwords"

Using . as the strategy lets the host pick its default strategy, which should be the best one it has available.
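
To compare strategies, the same word can be matched several times. This is a sketch only, assuming %dict-scheme.r is loaded and that the default host still offers these strategies and the gcide database. It uses the block spec form so the target string is passed through without URL parsing.

```rebol
do %dict-scheme.r

; Same word and database, three different strategies.
foreach strat ["exact" "prefix" "soundex"] [
    print ["Strategy:" strat]
    print read compose [
        scheme: 'dict
        target: (rejoin ["match:japan:gcide:" strat])
    ]
]
```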

9. Databases

Once again, subject to change, but all.dict.org has access to

  • gcide {The Collaborative International Dictionary of English v.0.48}
  • wn "WordNet (r) 2.0"
  • moby-thes "Moby Thesaurus II by Grady Ward, 1.0"
  • elements "Elements database 20001107"
  • vera {Virtual Entity of Relevant Acronyms (Version 1.9, June 2002)}
  • jargon "Jargon File (4.3.1, 29 Jun 2001)"
  • foldoc {The Free On-line Dictionary of Computing (27 SEP 03)}
  • easton "Easton's 1897 Bible Dictionary"
  • hitchcock "Hitchcock's Bible Names Dictionary (late 1800's)"
  • bouvier "Bouvier's Law Dictionary, Revised 6th Ed (1856)"
  • devils {THE DEVIL'S DICTIONARY ((C)1911 Released April 15 1993)}
  • world02 "CIA World Factbook 2002"
  • gazetteer "U.S. Gazetteer (1990)"
  • gaz-county "U.S. Gazetteer Counties (2000)"
  • gaz-place "U.S. Gazetteer Places (2000)"
  • gaz-zip "U.S. Gazetteer Zip Code Tabulation Areas (2000)"
  • --exit-- "Stop default search here."
  • afr-deu "Africaan-German Freedict dictionary"
  • afr-eng "Africaan-English Freedict Dictionary"
  • ara-eng "English-Arabic Freedict Dictionary"
  • cro-eng "Croatian-English Freedict Dictionary"
  • cze-eng "Czech-English Freedict dictionary"
  • dan-eng "Danish-English Freedict dictionary"
  • deu-eng "German-English Freedict dictionary"
  • deu-fra "German-French Freedict dictionary"
  • deu-ita "German-Italian Freedict dictionary"
  • deu-nld "German-Nederland Freedict dictionary"
  • deu-por "German-Portugese Freedict dictionary"
  • eng-afr "English-Africaan Freedict Dictionary"
  • eng-ara "English-Arabic FreeDict Dictionary"
  • eng-cro "English-Croatian Freedict Dictionary"
  • eng-cze "English-Czech fdicts/FreeDict Dictionary"
  • eng-deu "English-German Freedict dictionary"
  • eng-fra "English-French Freedict Dictionary"
  • eng-hin "English-Hindi Freedict Dictionary"
  • eng-hun "English-Hungarian Freedict Dictionary"
  • eng-iri "English-Irish Freedict dictionary"
  • eng-ita "English-Italian Freedict dictionary"
  • eng-lat "English-Latin Freedict dictionary"
  • eng-nld "English-Netherlands Freedict dictionary"
  • eng-por "English-Portugese Freedict dictionary"
  • eng-rom "English-Romanian FreeDict dictionary"
  • eng-rus "English-Russian Freedict dictionary"
  • eng-spa "English-Spanish Freedict dictionary"
  • eng-swa "English-Swahili xFried/FreeDict Dictionary"
  • eng-swe "English-Swedish Freedict dictionary"
  • eng-tur "English-Turkish FreeDict Dictionary"
  • eng-wel "English-Welsh Freedict dictionary"
  • fra-deu "French-German Freedict dictionary"
  • fra-eng "French-English Freedict dictionary"
  • fra-nld "French-Nederlands Freedict dictionary"
  • hin-eng "English-Hindi Freedict Dictionary [reverse index]"
  • hun-eng "Hungarian-English FreeDict Dictionary"
  • iri-eng "Irish-English Freedict dictionary"
  • ita-deu "Italian-German Freedict dictionary"
  • jpn-deu "Japanese-German Freedict dictionary"
  • kha-deu "Khasi-German FreeDict Dictionary"
  • lat-deu "Latin-German Freedict dictionary"
  • lat-eng "Latin-English Freedict dictionary"
  • nld-deu "Nederlands-German Freedict dictionary"
  • nld-eng "Nederlands-English Freedict dictionary"
  • nld-fra "Nederlands-French Freedict dictionary"
  • por-deu "Portugese-German Freedict dictionary"
  • por-eng "Portugese-English Freedict dictionary"
  • sco-deu "Scottish-German Freedict dictionary"
  • scr-eng "Serbo-Croat-English Freedict dictionary"
  • slo-eng "Slovenian-English Freedict dictionary"
  • spa-eng "Spanish-English Freedict dictionary"
  • swa-eng "Swahili-English xFried/FreeDict Dictionary"
  • swe-eng "Swedish-English Freedict dictionary"
  • tur-deu "Turkish-German Freedict dictionary"
  • tur-eng "Turkish-English Freedict dictionary"
  • english "English Monolingual Dictionaries"
  • trans "Translating Dictionaries"
  • all "All Dictionaries (English-Only and Translating)"
  • web1913 "Webster's Revised Unabridged Dictionary (1913)"
  • world95 "The CIA World Factbook (1995)"

In the full URL you can specify one database, use ! to let the server choose (it stops at the first database that has a match), or use * to force the server to search all the databases.
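
The difference between ! and * can be seen by counting the definitions returned. This sketch assumes %dict-scheme.r is loaded and that the scheme passes ! and * through to the server unchanged. The block spec form is used here only to avoid any question of how the URL parser treats those characters.

```rebol
do %dict-scheme.r

; ! stops at the first database that has a match
first-hit: read [scheme: 'dict  target: {define:protocol:!}]

; * collects definitions from every database
all-hits: read [scheme: 'dict  target: {define:protocol:*}]

print [length? first-hit "versus" length? all-hits "definitions"]
```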

9.1. Info

read dict:///info:lat-eng will return an information summary for a database; in this example, the Latin-English dictionary.

9.2. Running %dict-scheme.r

From the library with:

 >> do http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=dict-scheme.r
 
or locally with:
 >> do %dict-scheme.r
 
or from the REBOL/View sandbox with:
 >> do-thru http://www.rebol.org/library/scripts/dict-scheme.r
 

10. What you can learn

As of July 2007, writing REBOL R2 port schemes is somewhat arcane. More documentation is promised, but the release of REBOL R3 may change the entire landscape, so unless you are very curious, it may be best to wait for the R3 documentation and its easier-to-manage port schemes.

One note: Jeff's original dict.rebol code used system/standard/port-flags/direct and redefined the Root-Protocol read handler. This version uses system/standard/port-flags/pass-thru and defines the copy function for the handler, which allows a more standard REBOL way of accessing a dict:// port.

10.1. What you can change

Currently this scheme does not implement authentication. If you need access to a proprietary dictionary, you may need to change this handler to support those features of the protocol.

The scheme could be changed to throw fewer errors by adding response handlers for status codes like "420", but it is sometimes better to let a high-level error handler do its thing instead.

10.2. Default DICT protocol host

This may be worth changing, depending on expected use. The current default, all.dict.org, is a DNS round-robin of dict?.us.dict.org and will try different primary servers for each request. There are other servers that you may wish to make the default; see the server list mentioned above for options, or search the Internet for "dict protocol" to find others.

11. What can break

Quite a few things could break here. Errors are left to be thrown on response codes that are not (but could be) recognized. An application that relies on this scheme will need judicious use of error? try and attempt blocks.
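
Both guards might look like the sketch below. It assumes %dict-scheme.r is loaded; the messages are illustrative, and which conditions actually raise an error depends on the scheme and the server.

```rebol
do %dict-scheme.r

; attempt returns none instead of raising the error
defs: attempt [read dict:///define:protocol:foldoc]
either defs [
    foreach def defs [print first def]
][
    print "Lookup failed."
]

; error? try keeps the error so it can be inspected
if error? result: try [read dict:///define:protocol:foldoc] [
    err: disarm result
    print ["dict error:" err/type err/id]
]
```

attempt is the shorter form when the cause does not matter; error? try is the one to use when the application needs to report or branch on the error.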

There are no guarantees of availability; these servers may be offline at any given moment, although that is unlikely as of July 2007.

12. Credits

  • %dict-scheme.r Author: Brian Tiffin
  • %dict.rebol Author: Jeff Kreis
  • The rebol.org Library Team
  • Usage document by Brian Tiffin, Library Team Apprentice, Last updated: 18-Jul-2007