REBOL TEXTBASE

[1/8] from: louisaturk:eudoramail at: 25-Oct-2001 16:06

Rebol experts,

Has anyone written a textbase program using rebol? Requirements:

1. A file containing a list of directories and masks for files containing text: notes, books, articles, source code, e-mail, etc.

2. An exceptions file, containing words not to be indexed (like a, the, an, and so forth).

3. Upon starting the program the first time, it should automatically make an index of all the words found in all the files found in the file mentioned in paragraph 1, except for the words found in the exceptions file.

4. Upon subsequent starts it should automatically check to see if any files have been modified or new files added, and re-index if so.

5. Once all text files have been indexed, it should pop up a search window to allow easy search for words or phrases, etc., and display the paragraph containing it and that paragraph's context. The right arrow key should allow one to continue on to the next find. The left arrow key should allow one to return to the previous find.

6. One keystroke (say the HOME and/or 5 key) should load the paragraph containing that verse into memory. The UP arrow key should load the paragraph above it into memory (another paragraph each time it is pressed). The DOWN arrow key should load the paragraph(s) below the one containing the search word. Hit PgDn to go to the next paragraph to load more paragraphs into memory.

7. Switch back to your word processor or text editor. Hit paste. Voilà! It is there.

It seems to me that this should be very easy to do with rebol, and would be extremely useful for both writers and programmers. Has anyone done anything like this?

Louis
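Requirements 2 and 3 above amount to building an inverted index with a stop-word list. The following is only a rough sketch of that idea, written in Python rather than REBOL for illustration. The names `build_index`, `split_paragraphs`, and `WORD_RE`, the sample documents, and the choice to split paragraphs on blank lines are all assumptions, not anything specified in the thread.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of letters, digits, or apostrophes.
WORD_RE = re.compile(r"[A-Za-z0-9']+")

def split_paragraphs(text):
    """Split text into paragraphs, assuming blank lines separate them."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def build_index(documents, stop_words):
    """Map each non-excluded word to the set of (file name, paragraph number) pairs."""
    index = defaultdict(set)
    for name, text in documents.items():
        for number, paragraph in enumerate(split_paragraphs(text)):
            for word in WORD_RE.findall(paragraph.lower()):
                if word not in stop_words:
                    index[word].add((name, number))
    return index

# Hypothetical in-memory stand-ins for the files and the exceptions file.
documents = {
    "notes.txt": "The cat sat on the mat.\n\nA dog chased the cat.",
    "book.txt": "An index maps words to places.",
}
stop_words = {"a", "an", "the", "on", "to"}
index = build_index(documents, stop_words)
```

Looking up `index["cat"]` then gives the paragraphs to display; a phrase search could intersect the sets for each word and then check the candidate paragraphs directly.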

[2/8] from: louisaturk:coxinet at: 25-Oct-2001 17:22

Rebol experts,

Has anyone written a textbase program using rebol? Requirements:

1. A file containing a list of directories and masks for files containing text: notes, books, articles, source code, e-mail, etc.

2. An exceptions file, containing words not to be indexed (like a, the, an, and so forth).

3. Upon starting the program the first time, it should automatically make an index of all the words found in all the files found in the file mentioned in paragraph 1, except for the words found in the exceptions file.

4. Upon subsequent starts it should automatically check to see if any files have been modified or new files added, and re-index if so.

5. If (or once) all text files have been indexed, it should pop up a search entry window to allow easy search for words or phrases, etc., and display the paragraph containing it and that paragraph's context. The right arrow key should allow one to continue on to the next find. The left arrow key should allow one to return to the previous find.

6. One keystroke (say the HOME and/or 5 key) should load the paragraph containing that verse into memory. The UP arrow key should load the paragraph above it into memory (another paragraph each time it is pressed). The DOWN arrow key should load the paragraph(s) below the one containing the search word. Hit the right arrow key to go to the next find(s) to load more paragraphs into memory.

7. Switch back to your word processor or text editor. Hit paste. Voilà! It is there.

It seems to me that this should be very easy to do with rebol, and would be extremely useful for both writers and programmers. Has anyone done anything like this?

Louis
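Requirement 4 (re-index only when files are new or modified) could be approached by remembering each file's modification time between runs. This is a minimal Python sketch of that approach, not REBOL and not a finished design. The function names `expand_masks`, `files_to_reindex`, and `record_state` are hypothetical, and comparing modification times is an assumption; a checksum comparison would be an alternative.

```python
import glob
import os
import tempfile

def expand_masks(masks):
    """Return the sorted list of files matching any directory/mask pattern."""
    files = set()
    for mask in masks:
        files.update(p for p in glob.glob(mask) if os.path.isfile(p))
    return sorted(files)

def files_to_reindex(masks, seen):
    """Return files that are new, or whose modification time differs from `seen`."""
    return [p for p in expand_masks(masks) if seen.get(p) != os.path.getmtime(p)]

def record_state(paths, seen):
    """Remember the current modification time of each indexed file."""
    for p in paths:
        seen[p] = os.path.getmtime(p)

# Small demonstration in a temporary directory (all paths are made up).
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")
    with open(path, "w") as handle:
        handle.write("some text")
    masks = [os.path.join(folder, "*.txt")]
    seen = {}

    first_pass = files_to_reindex(masks, seen)    # file is new
    record_state(first_pass, seen)
    second_pass = files_to_reindex(masks, seen)   # nothing changed

    later = seen[path] + 100                      # simulate a later edit
    os.utime(path, (later, later))
    third_pass = files_to_reindex(masks, seen)    # file was modified
    demo_path = path
```

In a real program the `seen` mapping would be saved to disk alongside the index so that it survives between starts.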

[3/8] from: dness:home at: 25-Oct-2001 21:34

Dr. Louis A. Turk wrote:

> Rebol experts,
>
> Has anyone written a textbase program using rebol? Requirements:
>
> 1. A file containing a list of directories and masks for files containing
> text: notes, books, articles, source code, e-mail, etc.

[Snip]

> It seems to me that this should be very easy to do with rebol, and would be
> extremely useful for both writers and programmers. Has anyone done
> anything like this?

A bit naive, perhaps. Many (if not most) of the files on my home machines (about 500,000 files at last count) contain `text' that is `encoded' by the programs that use it. Simple examples are .DOC (Microsoft Word) or .PS/PDF (Adobe PostScript/Acrobat), Mail Files, Help Files, MySQL databases, Outlook Contact Lists, etc. Even text that isn't `deeply encoded' is often surrounded by a disconcerting quantity of HTML markup or some such, and increasingly, these days, basic text is often stored in DBMS systems such as MySQL or Oracle.

Sadly, recovering `sensible' text from these files is not an easy task without running the processors that interpret it (Word, Acrobat, GhostScript, Communicator, ...) in each specific case. This is one of the reasons that I am not much of a fan of the programs that store things in these idiosyncratic ways---but since the programs are heavily used one generally has to `go along' with customary usage.

Rather than taking the approach suggested in this note, I guess I'd suggest starting with an implementation of something like the much-beloved `HyperCard' that was such a popular part of the Mac World for such a long time. Since I never liked Macs much, and since most of the HyperCard-like Windows systems were _very_ expensive, I never used this much, but it had a real following among a broad set of users, and it strikes me---perhaps naively---as something that might be easy to do, at least in a preliminary way, in REBOL.

If such a thing were doable, then it might be possible, in some cases, to invoke processes that would convert other forms of files into `HyperCards' that could then be accessed in lieu of the information in the documents themselves, thus accomplishing much of what you seem to want to be able to do.

[4/8] from: louisaturk:coxinet at: 25-Oct-2001 21:59

At 09:34 PM 10/25/2001 -0400, you wrote:

>"Dr. Louis A. Turk" wrote:

<<quoted lines omitted: 41>>

>to do.
>--

I personally would not want everything on my computer indexed, but only documents that I might use in my writings. Anything else would be undesirable clutter.

Documents such as web pages can easily be stripped of their tags using parse.

Source code for rebol would certainly work as is.

Almost all programs have an export to ascii option.

Eudora email files are already in ascii format.

Louis
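As a rough illustration of the tag-stripping step mentioned above, here is a sketch in Python using the standard library's `html.parser` (not REBOL's `parse`, which the message refers to). The class name `TextExtractor`, the function `strip_tags`, and the sample page are hypothetical. Skipping the contents of `script` and `style` elements is an assumption about what an indexer would want, and it only covers simple cases of embedded scripting.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text content, ignoring tags and the bodies of script/style elements."""
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self.skip_depth == 0:
            self.parts.append(data)

def strip_tags(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())

sample = (
    "<html><head><script>var x = '<b>';</script></head>"
    "<body><p>Hello &amp; <b>welcome</b></p></body></html>"
)
plain = strip_tags(sample)
```

Joining the text pieces with spaces keeps words from adjacent elements apart, at the cost of occasionally splitting a word that has markup in the middle of it.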

[5/8] from: dness:home at: 26-Oct-2001 3:25

Dr. Louis A. Turk wrote:

> I personally would not want everything on my computer indexed, but only
> documents that I might use in my writings. Anything else would be
> undesirable clutter.
>
> Documents such as web pages can easily be stripped of their tags using parse.

Actually that's not true. Complex HTML markup, particularly with in-line java, can be quite difficult to parse.

> Source code for rebol would certainly work as is.

I suppose most computer code would, but it isn't something that I'd normally use in `my writings'.

> Almost all programs have an export to ascii option.

Some do. Far, far from `almost all'. And in any case this sort of misses the point, doesn't it? If you have to remember to write out an `ascii version' for the purposes of your indexing, then all of the advantages of `automatic' indexing pretty much disappear.

> Eudora email files are already in ascii format.

Yes, but Eudora only handles a small fraction of Net mail. NetScape Communicator and Outlook, which handle far more, I think, both store mail in complex structured files.

[6/8] from: nitsch-lists:netcologne at: 26-Oct-2001 23:42

RE: [REBOL] Re: REBOL TEXTBASE

Hi Louis,

My old multi-file searcher is at http://jove.prohosting.com/~screbol/reb/sucher/. It is hacked together and the code is in German. Maybe I'll do a rewrite if you think it's a bit like what you want.

-Volker

[7/8] from: louisaturk:coxinet at: 26-Oct-2001 19:36

Hi Volker,

At 11:42 PM 10/26/2001 +0200, you wrote:

>RE: [REBOL] Re: REBOL TEXTBASE
>Hi Louis

<<quoted lines omitted: 3>>

>maybe i do a rewrite if you think its a bit what you want.
>-Volker

Thanks! I would like to see your code. Your web page won't let me in, however ("Forbidden: You don't have permission to access /~screbol/reb/sucher/ on this server."), so could you just email it to me off list, please?

Louis [louisaturk--eudoramail--com]

[8/8] from: nitsch-lists:netcologne at: 27-Oct-2001 17:29

RE: [REBOL] Re: REBOL TEXTBASE

Hi Louis

[louisaturk--coxinet--net] wrote:

> Hi Volker,
> At 11:42 PM 10/26/2001 +0200, you wrote:

<<quoted lines omitted: 11>>

> You don't have permission to access /~screbol/reb/sucher/ on this server.)
> so could you just email it to me off list, please?

Oops, I forgot: you have to go there with the Goto button on the REBOL desktop. There is only an %index.r, no %index.html yet.

Notes

Quoted lines have been omitted from some messages.