
Documentation for: dict-demo.r


Usage document for %dict-demo.r

1. Introduction to %dict-demo.r

dict-scheme.r is a sample port scheme for the dict:// protocol, which is defined in RFC 2229 and hosted at http://dict.org.

dict-demo.r is a how-to and sample of usage for the dict protocol.

1.1. Credits to Jeff Kreis

The original REBOL dict:// port handler was found in the dict.org resources 

  • scheme: a REBOL term for a port protocol
  • protocol: a set of formal rules describing how to transmit data, especially across a network
  • port: a REBOL I/O interface to external series

2. dict-demo At a Glance

No setup is required, just do it. Use one of the following:

 >> do %dict-demo.r
 or
 >> do-thru http://www.rebol.org/library/scripts/dict-demo.r
 or
 >> do read http://www.rebol.org/library/scripts/dict-demo.r
 or
 >> do http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=dict-demo.r
 

2.1. rebol.org tip

For quick and easy access to rebol.org scripts, I have the following in my %user.r file.

 rebol.org: http://www.rebol.org/library/scripts
 

When I want to execute one of the stand-alone scripts I can use

 >> do-thru rebol.org/dict-demo.r
 Script: "using dict protocol from dict.org" (17-Jul-2007)
 Script: "dict protocol from dict.org" (17-Jul-2007)
 dict protocol loaded
 You can now try: demo, def, thes, spell, soundex, regexp, trans, cia
 For demo and how-to...not thoroughly tested, maybe untrue
 >>
 

or, for a version of REBOL that does not have the do-thru function,

 >> do read rebol.org/dict-demo.r
 connecting to: www.rebol.org
 connecting to: www.rebol.org
 dict protocol loaded
 You can now try: demo, def, thes, spell, soundex, regexp, trans, cia
 For demo and how-to...not thoroughly tested, maybe untrue
 >>
 

Nice and short and sweet. And for REBOL/View, it will rarely have to go to the Internet to get the code. It will usually be cached on the local disk. If there may have been an update to the rebol.org code, just use do-thru/update.

PLUS, if the script is one that needed modification for proper functioning, and you make a copy of those edits into the local sandbox (not the originals, too great a risk of overwrite) the do-thru rebol.org/script will pick up the local edit copy and hum away.

2.1.1. For those that like stats, and hey who doesn't

Sunanda keeps track of library access so that authors know how many times a script has been accessed. If you'd like to be nice and offer incentives to script writers by helping them feel the love, use a slightly different trick involving the rebol.org CGI scripts. Put

 dorg: func [script] [
     do to url! rejoin [
         "http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name="
         script
     ]
 ]
 

in your %user.r and then

 >> dorg %dict-demo.r
 

will make me happy as I scan the library and see that people are enjoying the demo. You don't have to name the function dorg; it could be do-library or xyzzy if you are so inclined. Note: Anyone who doesn't know about xyzzy has to check out ADVENT, if for nothing else than a history lesson.

End of side-trip, back to DICT.

3. Inside %dict-demo.r

The DICT demo is just that, a demo. Don't expect the spell or thes functions to be world-class, production-ready functions. They will break in weird and wonderful ways, and are not optimized or error checked in any way whatsoever. They are only samples of what can be done. Having mentioned that disclaimer, the sample functions are pretty handy little command line tools.

  • demo: a simple point-and-click demo for REBOL/View, or a blast of output for REBOL/Core
  • def: fetches a definition
  • thes: does a thesaurus database lookup
  • spell: a quick and dirty spell-checker style list of word match options
  • soundex: word match using soundex
  • regexp: word match using regular expressions, more common to GNU/Linux users, see man 7 regex
  • endsinvowels: looks up all the words in the database that end with four vowels
  • trans: translates the word from whatever to whatever. Scans all the translation databases
  • cia: looks up the country in the CIA World Factbook, 1995 edition

4. These are low quality

The demo words work, but not well. They are more for fun and how-to than anything. These are just starting points for what a full-fledged DICT client could be in REBOL. You need to look at the code to get the most out of this demo.

 A lot of the rest of this material is also in the %dict-scheme.r usage document
 

5. DICT protocol defined URLs

 dict://<user>;<auth>@<host>:<port>/d:<word>:<database>
 dict://<user>;<auth>@<host>:<port>/m:<word>:<database>:<strat>
 

The REBOL dict scheme will allow for

 dict:///<word>
 

as a shortcut that uses the default host and port, with /define: (an alias for /d:) as the default action, and

 dict://
 

to read the databases included with the default host.

This port handler extends /d: and /m: to include

  • /match:word:database:strat
  • /define:word:database
  • /help:
  • /strat:
  • /server:
  • /info:database
  • /status:

Without the terminating colons, these words will be treated as default definition lookups.
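
The URL forms above can be assembled mechanically. The following is a minimal sketch for REBOL 2, not part of %dict-scheme.r: dict-url is a hypothetical helper name, and it only builds the url! value without contacting any server.

```rebol
REBOL [Title: "dict URL builder (sketch)"]

; dict-url is a hypothetical helper, not part of %dict-scheme.r.
; It joins the command and its arguments with colons, leaving the
; host empty unless the /host refinement supplies one.
dict-url: func [
    "Build a dict:// URL from a command name and its arguments."
    command [string!] "Command such as define, match, info or strat"
    args [block!] "Strings for word, database, strategy (may be empty)"
    /host server [string!] "Explicit host; omitted means the scheme default"
    /local url
][
    url: rejoin ["dict://" any [server ""] "/" command ":"]
    foreach arg args [append append url arg ":"]
    ; drop the colon added after the final argument
    if not empty? args [remove back tail url]
    to url! url
]

probe dict-url "define" ["protocol" "foldoc"]
probe dict-url "strat" []
probe dict-url/host "match" ["japan" "soundex"] "dict.org"
```

With no arguments the terminating colon is kept, so the command word is not mistaken for a word to define.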

6. Using %dict-scheme.r

Just as simply and easily as other REBOL network interfaces, you can

>> print read dict://
 

The dict:// is the scheme, and the %dict-scheme.r file includes a default host of all.dict.org, so this is really print read dict://all.dict.org/

6.1. Different functions of the dict protocol

The dict protocol allows for definition lookups and spell-checker style matches.

Some examples

 read dict:///
     Returns the default list of databases from the default host, all.dict.org.
 read dict:///server:
     Returns a server summary of the default host. Note the three slashes.
 read dict://vocabulary.aioe.org/server:
     Returns a server summary of another dict protocol host, a very complete one.
 read dict://dict.org/d:protocol:foldoc
     Reads the definition of protocol from the Free On-Line Dictionary Of Computing.
 read dict://dict.org/match:japan:soundex
     Does a soundex spell-checker match for japan. match: is an extension alias for m:
 read dict:///strat:
     Returns a list of match strategies, "exact" being the usual default.
 read dict:///define:Canada:world95
     Returns the complete entry for Canada from the CIA Book of World Facts, 1995 edition.

7. Return formats

This port handler returns a block! for results. For all queries other than define, the block will contain a single string!. For /d: queries, the results will be one inner block! for each definition. So read dict://dict.org/define:test will return a block structure of

[{First def}][{Second def}]
 

being a block of blocks holding the strings.
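
A small sketch of walking that structure follows. It assumes the outer block wraps one inner block per definition, as described above, and uses a literal stand-in for the result so that it runs without a network connection; the word result is only an illustrative name.

```rebol
REBOL [Title: "walking a define result (sketch)"]

; Stand-in for: read dict://dict.org/define:test
; (assumed shape: one inner block per definition, each holding a string)
result: [[{First def}] [{Second def}]]

count: 0
foreach entry result [
    count: count + 1
    print [count ":" first entry]
]
```

For the other query types, where the block holds a single string, print first result would be enough.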

8. Sophisticated URLs

Some of the spell-check and word lookups may require use of character symbols that can cause confusion to the builtin REBOL url parser.

8.1. Spaces in <words>

RFC 2229 allows double quotes or single quotes around words that contain spaces. REBOL will not allow a URL to contain " double-quotes. Use ' single-quotes instead.

Some Regular Expression characters will also cause confusion for URLs. Luckily REBOL allows a port to be opened using a control block.

8.2. Block port specs

The REBOL dict scheme can be accessed using the standard REBOL block spec for ports.

 read [scheme: 'dict  target: {define:'caveat emptor':bouvier}]
 

will allow you to lookup definitions for words with spaces, and

 read [scheme: 'dict  host: "vocabulary.aioe.org" target: {match:[a|b|c]$:gcide:re}]
 

will allow you to take full advantage of the re strategy when doing word matches.

8.3. Open ports

You can use open to access a dict port, and then use copy to read the data. Calling insert on the port will usually cause an error, but reading with copy works fine.
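
As a rough sketch of that sequence, assuming %dict-scheme.r has already been loaded and that dict.org is reachable (neither is checked here):

```rebol
REBOL [Title: "open / copy / close on a dict port (sketch)"]

; Assumes the dict scheme is installed, e.g. via: do %dict-scheme.r
port: open dict://dict.org/define:protocol:foldoc
result: copy port       ; copy performs the actual read
close port

; a define query yields one inner block per definition
foreach entry result [print first entry]
```

The word names port and result are illustrative only.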

9. The dict protocol hosts

There are quite a few active dict protocol servers. The main list is kept at the DICT protocol Server List 

10. Strategies for Match

These are subject to change and may be different for each host, but all.dict.org supports

  • exact "Match headwords exactly"
  • prefix "Match prefixes"
  • substring "Match substring occurring anywhere in a headword"
  • suffix "Match suffixes"
  • re "POSIX 1003.2 (modern) regular expressions"
  • regexp "Old (basic) regular expressions"
  • soundex "Match using SOUNDEX algorithm"
  • lev "Match headwords within Levenshtein distance one"
  • word "Match separate words within headwords"

Using "." as a strategy allows the host to pick the best one.
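
To compare strategies for the same word, the block port spec can be generated rather than typed out. This is only a sketch: match-spec is a hypothetical helper, and it builds the spec without performing a lookup.

```rebol
REBOL [Title: "match specs per strategy (sketch)"]

; match-spec is a hypothetical helper, not part of %dict-scheme.r.
; It returns a block spec in the match:word:database:strat form.
match-spec: func [
    "Build a block port spec for a match query."
    word [string!] database [string!] strategy [string!]
][
    compose [
        scheme: 'dict
        target: (rejoin ["match:" word ":" database ":" strategy])
    ]
]

foreach strategy ["exact" "prefix" "soundex"] [
    probe match-spec "japan" "gcide" strategy
    ; with %dict-scheme.r loaded, the lookup itself would be:
    ; print read match-spec "japan" "gcide" strategy
]
```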

11. Databases

Once again, subject to change, but all.dict.org has access to

  • gcide {The Collaborative International Dictionary of English v.0.48}
  • wn "WordNet (r) 2.0"
  • moby-thes "Moby Thesaurus II by Grady Ward, 1.0"
  • elements "Elements database 20001107"
  • vera {Virtual Entity of Relevant Acronyms (Version 1.9, June 2002)}
  • jargon "Jargon File (4.3.1, 29 Jun 2001)"
  • foldoc {The Free On-line Dictionary of Computing (27 SEP 03)}
  • easton "Easton's 1897 Bible Dictionary"
  • hitchcock "Hitchcock's Bible Names Dictionary (late 1800's)"
  • bouvier "Bouvier's Law Dictionary, Revised 6th Ed (1856)"
  • devils {THE DEVIL'S DICTIONARY ((C)1911 Released April 15 1993)}
  • world02 "CIA World Factbook 2002"
  • gazetteer "U.S. Gazetteer (1990)"
  • gaz-county "U.S. Gazetteer Counties (2000)"
  • gaz-place "U.S. Gazetteer Places (2000)"
  • gaz-zip "U.S. Gazetteer Zip Code Tabulation Areas (2000)"
  • --exit-- "Stop default search here."
  • afr-deu "Africaan-German Freedict dictionary"
  • afr-eng "Africaan-English Freedict Dictionary"
  • ara-eng "English-Arabic Freedict Dictionary"
  • cro-eng "Croatian-English Freedict Dictionary"
  • cze-eng "Czech-English Freedict dictionary"
  • dan-eng "Danish-English Freedict dictionary"
  • deu-eng "German-English Freedict dictionary"
  • deu-fra "German-French Freedict dictionary"
  • deu-ita "German-Italian Freedict dictionary"
  • deu-nld "German-Nederland Freedict dictionary"
  • deu-por "German-Portugese Freedict dictionary"
  • eng-afr "English-Africaan Freedict Dictionary"
  • eng-ara "English-Arabic FreeDict Dictionary"
  • eng-cro "English-Croatian Freedict Dictionary"
  • eng-cze "English-Czech fdicts/FreeDict Dictionary"
  • eng-deu "English-German Freedict dictionary"
  • eng-fra "English-French Freedict Dictionary"
  • eng-hin "English-Hindi Freedict Dictionary"
  • eng-hun "English-Hungarian Freedict Dictionary"
  • eng-iri "English-Irish Freedict dictionary"
  • eng-ita "English-Italian Freedict dictionary"
  • eng-lat "English-Latin Freedict dictionary"
  • eng-nld "English-Netherlands Freedict dictionary"
  • eng-por "English-Portugese Freedict dictionary"
  • eng-rom "English-Romanian FreeDict dictionary"
  • eng-rus "English-Russian Freedict dictionary"
  • eng-spa "English-Spanish Freedict dictionary"
  • eng-swa "English-Swahili xFried/FreeDict Dictionary"
  • eng-swe "English-Swedish Freedict dictionary"
  • eng-tur "English-Turkish FreeDict Dictionary"
  • eng-wel "English-Welsh Freedict dictionary"
  • fra-deu "French-German Freedict dictionary"
  • fra-eng "French-English Freedict dictionary"
  • fra-nld "French-Nederlands Freedict dictionary"
  • hin-eng "English-Hindi Freedict Dictionary [reverse index]"
  • hun-eng "Hungarian-English FreeDict Dictionary"
  • iri-eng "Irish-English Freedict dictionary"
  • ita-deu "Italian-German Freedict dictionary"
  • jpn-deu "Japanese-German Freedict dictionary"
  • kha-deu "Khasi-German FreeDict Dictionary"
  • lat-deu "Latin-German Freedict dictionary"
  • lat-eng "Latin-English Freedict dictionary"
  • nld-deu "Nederlands-German Freedict dictionary"
  • nld-eng "Nederlands-English Freedict dictionary"
  • nld-fra "Nederlands-French Freedict dictionary"
  • por-deu "Portugese-German Freedict dictionary"
  • por-eng "Portugese-English Freedict dictionary"
  • sco-deu "Scottish-German Freedict dictionary"
  • scr-eng "Serbo-Croat-English Freedict dictionary"
  • slo-eng "Slovenian-English Freedict dictionary"
  • spa-eng "Spanish-English Freedict dictionary"
  • swa-eng "Swahili-English xFried/FreeDict Dictionary"
  • swe-eng "Swedish-English Freedict dictionary"
  • tur-deu "Turkish-German Freedict dictionary"
  • tur-eng "Turkish-English Freedict dictionary"
  • english "English Monolingual Dictionaries"
  • trans "Translating Dictionaries"
  • all "All Dictionaries (English-Only and Translating)"
  • web1913 "Webster's Revised Unabridged Dictionary (1913)"
  • world95 "The CIA World Factbook (1995)"

In the full URL you can specify one database, use "!" to let the server choose the best path, or use "*" to force the server to use all the databases.
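
The two wildcards can be tried side by side with block port specs, which avoids any doubt about how the URL parser treats "!" and "*". This is a sketch only: the word protocol is an arbitrary example, and the lookups are wrapped in attempt so the code reports a failure rather than stopping when the scheme is not loaded or the host is unreachable.

```rebol
REBOL [Title: "database wildcards (sketch)"]

first-match: [scheme: 'dict target: {define:protocol:!}]
all-matches: [scheme: 'dict target: {define:protocol:*}]

foreach spec reduce [first-match all-matches] [
    ; attempt returns none if the read throws an error
    result: attempt [read spec]
    print [
        last spec "->"
        either result [length? result] ["lookup failed"]
    ]
]
```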

11.1. Info

read dict:///info:lat-eng will return an information summary for a database, in this example the Latin-English dictionary.

11.2. Running %dict-scheme.r

From the library with:

 >> do http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=dict-scheme.r
 
or locally with:
 >> do %dict-scheme.r
 
or from the REBOL/View sandbox with
 >> do-thru http://www.rebol.org/library/scripts/dict-scheme.r
 

12. What you can learn

As of July 2007, writing REBOL R2 port schemes is a little bit technically arcane. More documentation is promised, but the release of REBOL R3 may change the entire landscape so unless you are very curious, wait for the R3 documentation and the easier to manage port schemes.

One note. If you look at Jeff's original dict.rebol code, he used system/standard/port-flags/direct, and redefined the Root-Protocol read handler. This version uses system/standard/port-flags/pass-thru and defines the copy function for the handler. This version allows for a more REBOL standard way of accessing a dict:// port.

12.1. What you can change.

Currently this scheme does not implement authentication. If you need access to a proprietary dictionary, you may need to change this handler to support those features of the protocol.

The scheme could be changed to throw fewer errors by adding response handlers for status codes like "420", but it's sometimes better to let a high level error handler do its thing instead.

12.2. Default DICT protocol host

This may be something to change depending on expected use. The current default of "all.dict.org" is a DNS round-robin of dict?.us.dict.org, and will try different primary servers for each request. But there are some other servers that you may wish to make the default. See the list mentioned above for the options, and perhaps a search of dict protocol on the Internet could highlight others.

13. What can break

Quite a few things could break here. Errors are left to be thrown on response codes that are not (but could be) recognized. An application that relies on this scheme will need judicious use of error? try and attempt blocks.
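
One way to wrap lookups is sketched below for REBOL 2. The name try-lookup is hypothetical, and what it reports depends on whichever error the scheme or the network layer happens to throw.

```rebol
REBOL [Title: "guarding dict lookups (sketch)"]

; try-lookup is a hypothetical helper: it runs a block and returns
; its result, or none (after printing the error id) if it throws.
try-lookup: func [
    "Evaluate a lookup block, returning none on error."
    lookup [block!]
    /local result
][
    either error? result: try lookup [
        result: disarm result
        print ["DICT lookup failed:" result/id]
        none
    ][
        result
    ]
]

; Works the same whether or not the dict scheme is loaded:
defs: try-lookup [read dict:///define:rebol]
either defs [print [length? defs "definition(s)"]] [print "no result"]

; attempt is the shorter form when the error details are not needed
defs: attempt [read dict:///define:rebol]
```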

There are no guarantees on availability, and these servers may be off-line at any given moment, but that is unlikely as of July 2007.

14. Credits

%dict-demo.r Author: Brian Tiffin
%dict-scheme.r Author: Brian Tiffin
%dict-rebol.r Author: Jeff Kreis
  • The rebol.org Library Team
  • Usage document by Brian Tiffin, Library Team, Last updated: 19-Jul-2007