[REBOL] Challenging script idea Re:
From: sterling:rebol at: 29-Aug-2000 14:29
Well, it sounded fun, so here's what I've got. The output from running it
on the two files you mentioned is at the bottom. The diff is shown as a
list of blocks, each holding a token and the number of times that token
appeared in the file. The same token may be listed for both files if its
count differs between them.
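To make the counting idea concrete, here is a minimal sketch that is separate from the script below. It assumes the tokens are already available as a block; count-tokens is a hypothetical helper name, and it keeps a flat token/count block rather than the nested [token count] blocks the script builds.

```rebol
REBOL [Title: "Token count sketch"]

; count-tokens is a hypothetical helper for illustration only.
; It returns a flat block of token/count pairs.
count-tokens: func [tokens [block!] /local counts pos] [
    counts: copy []
    foreach token tokens [
        ; /skip 2 only matches at the start of a token/count pair
        either pos: find/skip counts token 2 [
            ; token seen before: bump the count stored after it
            change next pos 1 + second pos
        ] [
            ; first sighting: store the token with a count of 1
            repend counts [token 1]
        ]
    ]
    counts
]

probe count-tokens [a b a]   ; [a 2 b 1]
probe count-tokens [a b]     ; [a 1 b 1]
```

Here the token a has a count of 2 in the first block and 1 in the second, which is the case where the same token shows up in the diff for both files.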
Well, enjoy!
Sterling
REBOL [
Title: "Simple token diff"
Purpose: {
I don't know, really. It just tries to
figure out how many REBOL tokens are different
between two files. Somebody thought it would
be neat. ;) Maybe they'll make it complete and
fix whatever lurking bugs there are in this code.}
Author: "Sterling Newton"
]
a: ask "File or URL #1? "
b: ask "File or URL #2? "
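get-type: func [item [string!]] [
    ; "://" marks a URL; "%" marks a REBOL-style file name
    ; (assumed to be the first character); anything else is
    ; treated as a plain file name
    switch/default true reduce [
        found? find item "://" [item: to-url item]
        found? find item "%" [item: to-file next item]
    ] [item: to-file item]
    item
]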
a: get-type a
b: get-type b
; the unique tokens and totals blocks
foreach item [a-tokens b-tokens a-totals b-totals] [
    set item copy []
]
file1: load/next a
file2: load/next b
tokenize-block: func [
    blk [block!] tokens [block!] totals [block!]
    /local tmp idx
] [
    while [not empty? blk] [
        either block? blk/1 [
            tokenize-block load/next form blk/1 tokens totals
        ] [
            either tmp: find tokens blk/1 [
                idx: index? tmp
                totals/:idx/2: totals/:idx/2 + 1
            ] [
                append tokens blk/1
                repend/only totals [blk/1 1]
            ]
        ]
        blk: load/next blk/2
    ]
]
tokenize-block load/next file1 a-tokens a-totals
tokenize-block load/next file2 b-tokens b-totals
print ["The two files differ by:" length? difference a-tokens b-tokens "tokens."]
print ["----- Tokens in" a "not in" b "-----"]
foreach item intersect diff: difference a-totals b-totals a-totals [
    probe item
]
print ["----- Tokens in" b "not in" a "-----"]
foreach item intersect diff b-totals [
    probe item
]
> Don't laugh, but...
>
> I was noticing in the script library (web section)
> that mailpage.r and websend.r are identical. So
> here's the challenge: as powerful as parse (and other
> language processing features) is, can someone come up
> with a script that would analyze the tokens in a pair
> of scripts and determine when they are essentially the same?
========== results from the two web page emailing scripts ==========
>> do %/home/moses/temp/diff.r
File or URL #1? http://www.rebol.com/library/html/mailpage.html
File or URL #2? http://www.rebol.com/library/html/websend.html
The two files differ by: 14 tokens.
----- Tokens in http://www.rebol.com/library/html/mailpage.html not in http://www.rebol.com/library/html/websend.html -----
[Email 2]
[a 2]
[Page 1]
[mailpage.r 1]
[10-Sep-1999 1]
[page. 1]
[(simple) 1]
[http://www.rebol.com/releases.html</font> 1]
----- Tokens in http://www.rebol.com/library/html/websend.html not in http://www.rebol.com/library/html/mailpage.html -----
[Page 2]
[Emailer 1]
[websend.r 1]
[20-May-1999 1]
[Fetch 1]
[a 1]
[and 1]
[it 1]
[as 1]
[email. 1]
[email 1]
[http://www.rebol.com</font> 1]