Mailing List Archive: 49091 messages

[REBOL] Challenging script idea Re:

From: sterling:rebol at: 29-Aug-2000 14:29

Well, it sounded fun so here's what I've got. The output from running it on the two files you talked about is at the bottom. The diff shows a list of blocks, each holding a token and a number, which is how many times that token appeared in the file. You may see the same token listed in the diff for each file if the number of appearances is different.

Well, enjoy!

Sterling

REBOL [
    Title: "Simple token diff"
    Purpose: {
        I don't know, really.  It just tries to figure out how many
        REBOL tokens are different between two files.  Somebody
        thought it would be neat. ;)  Maybe they'll make it complete
        and fix whatever lurking bugs there are in this code.}
    Author: "Sterling Newton"
]

a: ask "File or URL #1? "
b: ask "File or URL #2? "

; turn the typed string into a url! or file! value
get-type: func [item [string!]] [
    switch/default true reduce [
        found? find item "://" [item: to-url item]
        found? find item "%" [item: to-file next item]
    ] [item: to-file item]
    item
]

a: get-type a
b: get-type b

; the unique tokens and totals blocks
foreach item [a-tokens b-tokens a-totals b-totals] [
    set item copy []
]

file1: load/next a
file2: load/next b

; walk the source one token at a time, recursing into nested blocks;
; tokens collects each unique token, totals collects [token count] pairs
tokenize-block: func [
    blk [block!] tokens [block!] totals [block!] /local tmp idx
] [
    while [not empty? blk] [
        either block? blk/1 [
            tokenize-block load/next form blk/1 tokens totals
        ] [
            either tmp: find tokens blk/1 [
                idx: index? tmp
                totals/:idx/2: totals/:idx/2 + 1
            ] [
                append tokens blk/1
                repend/only totals [blk/1 1]
            ]
        ]
        blk: load/next blk/2
    ]
]

tokenize-block load/next file1 a-tokens a-totals
tokenize-block load/next file2 b-tokens b-totals

print ["The two files differ by:" length? difference a-tokens b-tokens "tokens."]

print ["----- Tokens in" a "not in" b "-----"]
foreach item intersect diff: difference a-totals b-totals a-totals [
    probe item
]

print ["----- Tokens in" b "not in" a "-----"]
foreach item intersect diff b-totals [
    probe item
]

> Don't laugh, but...
>
> I was noticing in the script library (web section)
> that mailpage.r and websend.r are identical. So
> here's the challenge: as powerful as parse (and other
> language processing features) is, can someone come up
> with a script that would analyze the tokens in a pair
> of scripts and determine when they are essentially the same?

========== results from the two web page emailing scripts ==========
>> do %/home/moses/temp/diff.r
File or URL #1? File or URL #2? The two files differ by: 14 tokens.
----- Tokens in not in -----
[Email 2]
[a 2]
[Page 1]
[mailpage.r 1]
[10-Sep-1999 1]
[page. 1]
[(simple) 1]
[</font> 1]
----- Tokens in not in -----
[Page 2]
[Emailer 1]
[websend.r 1]
[20-May-1999 1]
[Fetch 1]
[a 1]
[and 1]
[it 1]
[as 1]
[email. 1]
[email 1]
[</font> 1]
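
The per-file listings above come from one set-operation idiom in the script: take the difference of the two totals blocks, then intersect that with each file's own totals. Here is a minimal sketch of just that step, assuming the same REBOL version the script ran on and that its set functions compare the [token count] sub-blocks by value, as the results above suggest. The sample data is made up for illustration and is not taken from mailpage.r or websend.r.

```rebol
REBOL [Title: "Totals diff sketch"]

; Hypothetical sample data: each entry is [token count],
; the same shape the script builds in a-totals and b-totals.
a-totals: [[Email 2] [a 2] [Page 1] [to 3]]
b-totals: [[Page 2] [a 1] [to 3]]

; Entries that appear in only one of the two blocks.  An entry such
; as [to 3], same token and same count on both sides, should drop out.
diff: difference a-totals b-totals

print "----- only in A, or counted differently in A -----"
foreach item intersect diff a-totals [probe item]

print "----- only in B, or counted differently in B -----"
foreach item intersect diff b-totals [probe item]
```

Because the comparison is on the whole [token count] pair, a token that occurs in both files with different counts shows up once under each heading, which is why "a" and "Page" appear in both lists in the results.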