Mailing List Archive: 49091 messages

Full text indexing

 [1/8] from: gchiu::compkarori::co::nz at: 1-Feb-2004 23:54


Has anyone done any full-text indexing in Rebol, or implemented the Burrows-Wheeler transform for searching?
-- Graham Chiu http://www.compkarori.com/cerebrus

 [2/8] from: gchiu:compkarori at: 1-Feb-2004 23:54


Graham Chiu wrote.. apparently on 2-Feb-2004/9:53:07+13:00
>Has anyone done any full text indexing in Rebol, or implemented the Burrows Wheeler transform for searching?
Missed including the URL: http://www.mfn.unipmn.it/~manzini/fmindex/
-- Graham Chiu http://www.compkarori.com/cerebrus
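
For readers unfamiliar with the transform asked about above, here is a minimal sketch of the Burrows-Wheeler transform and the backward-search counting idea that FM-index style searching builds on. It is written in Python rather than Rebol purely for illustration, and it is not the code from the URL above. The function names, the "\0" end marker, and the naive sort-all-rotations approach are assumptions made for the example; a practical index would use suffix arrays and precomputed rank tables.

```python
def bwt(text, end="\0"):
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    assert end not in text, "end marker must not occur in the text"
    s = text + end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last, end="\0"):
    """Invert the transform by repeatedly prepending the last column and re-sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    original = next(row for row in table if row.endswith(end))
    return original[:-1]


def count_occurrences(last, pattern):
    """Count matches of pattern using backward search over the BWT string."""
    first = sorted(last)
    # starts[c] = index of the first sorted rotation that begins with c
    starts = {}
    for i, c in enumerate(first):
        starts.setdefault(c, i)

    def occ(c, i):
        # Number of times c appears in last[:i] (linear scan, for clarity only)
        return last[:i].count(c)

    lo, hi = 0, len(last)
    for c in reversed(pattern):
        if c not in starts:
            return 0
        lo = starts[c] + occ(c, lo)
        hi = starts[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("banana")
print(repr(transformed))
print(inverse_bwt(transformed))
print(count_occurrences(transformed, "ana"))
```

The point of the sketch is that searching works on the transformed string alone: each pattern character narrows a range of sorted rotations, so the original text is never scanned.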

 [3/8] from: hallvard:ystad:oops-as:no at: 1-Feb-2004 23:07


Dixit Graham Chiu (21.54 01.02.2004):
>Graham Chiu wrote.. apparently on 2-Feb-2004/9:53:07+13:00
>>Has anyone done any full text indexing in Rebol, or implemented the Burrows Wheeler transform for searching?
>Missed including the url:
>http://www.mfn.unipmn.it/~manzini/fmindex/
That's all very impressive, until I tried the demo (http://roquefort.di.unipi.it/~ferrax/fmindex/ricerca2.html), clicked search, and got the Perl script source...

Anyway: I have full-text indexing in my Rebol search engine but, alas, I use DocKimbel's excellent MySQL protocol driver...

HY

 [4/8] from: doug:vos:eds at: 2-Feb-2004 13:07


Hello Graham and rebolers,

I did full text indexing in rebol back in 1999 or 2000 as one of my first rebol learning projects. (This is an old bunch of rebol code and I have not updated it since then.)

You can see the result here: http://vvn.net/bible/ (Works with rebol 2.x.x and up, I think.)

If you want the source, I can dig it up and ship it to you or re-package the basic functions for use by rebol.org.

This full text indexer parses all the words in the body of text first and produces the index file (one time) and then compares query requests to the index before displaying the actual text.

- Doug
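
The approach described above (parse every word once, write an index, then answer queries from the index before touching the text) can be sketched as a small inverted index. This is only an illustration in Python, not Doug's Rebol code, which is not shown in this thread. The function names, the word-splitting rule, and the sample passages are all assumptions made for the example.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")  # assumed tokenization rule, for illustration


def build_index(documents):
    """One-time pass: map each word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in WORD.findall(text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data standing in for the indexed text
documents = {
    "gen-1": "In the beginning God created the heaven and the earth",
    "john-1": "In the beginning was the Word",
}
index = build_index(documents)
print(sorted(search(index, "beginning")))
print(sorted(search(index, "beginning word")))
```

Only the documents returned by the index lookup need to be loaded and displayed, which is what keeps queries cheap once the one-time indexing pass is done.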

 [5/8] from: gerardcote:sympatico:ca at: 2-Feb-2004 14:02


Hello Doug,

You wrote:
=======
> I did full text indexing in rebol back in 1999 or 2000 as one of my first
> rebol learning projects.
<<quoted lines omitted: 3>>
> produces the index file (one time) and then compares query requests to index
> before displaying the actual text.
I think it would be a welcome addition to have the sources put into rebol.org, as you suggested. Many people would probably learn a lot by comparing the current search algorithms with existing code that accomplishes this task.

Thanks,
Gerard

 [6/8] from: SunandaDH:aol at: 2-Feb-2004 14:10


REBOL.org does full word indexing on all contributed scripts. All the code is in the download, if you want a look: http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-librarian.r

The indexing programs are:
librarian/support/make-header-idx.r
librarian/support/make-words-idx.r

This part of the site is due for an overhaul, though: it has some limitations now that there are over 500 scripts. I'd be interested in looking at alternatives.

Sunanda.

 [7/8] from: gchiu:compkarori at: 2-Feb-2004 22:48


Vos, Doug wrote.. apparently on 2-Feb-2004/13:07:44-5:00
>If you want the source, I can dig it up and ship it to you or re-package the
>basic functions for use by rebol.org.
Hi Doug, that would be great if you could do this.
>This full text indexer parses all the words in the body of text first and
>produces the index file (one time) and then compares query requests to index
>before displaying the actual text.
Can the index file be updated? I know the Bible is immutable ...
-- Graham Chiu http://www.compkarori.com/cerebrus

 [8/8] from: doug:vos:eds at: 2-Feb-2004 16:00


The rebol source for this project is on another archive right now, so I will have to upload it in the next 24 hours.

> Can the index file be updated? I know the Bible is immutable ...

This system works best when the complete length of text is known at the beginning. In the case of the Biblical text, it started as 6 megs and compressed to 1.5 megs. (It also uses rebol compress/decompress, compressing a chapter at a time for fastest decompression.)
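
The chapter-at-a-time compression mentioned above can be sketched as follows. This is an illustration in Python using the standard zlib module, not the Rebol compress/decompress calls from the actual project. The function names and the sample chapter data are assumptions made for the example.

```python
import zlib


def compress_chapters(chapters):
    """Compress each chapter separately so any one can be decompressed alone."""
    return {
        name: zlib.compress(text.encode("utf-8"))
        for name, text in chapters.items()
    }


def read_chapter(store, name):
    """Decompress only the requested chapter, leaving the rest untouched."""
    return zlib.decompress(store[name]).decode("utf-8")


# Hypothetical sample data: two repetitive "chapters"
chapters = {
    "chapter-1": "in the beginning " * 200,
    "chapter-2": "and it was so " * 200,
}
store = compress_chapters(chapters)
raw_size = sum(len(text.encode("utf-8")) for text in chapters.values())
packed_size = sum(len(blob) for blob in store.values())
print(raw_size, packed_size)
print(read_chapter(store, "chapter-2")[:14])
```

Compressing per chapter costs a little in overall ratio compared with compressing the whole text at once, but a query only pays to decompress the chapters the index says are relevant.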

Notes
  • Quoted lines have been omitted from some messages.