Mailing List Archive: Re: diff... anyone?

[REBOL] Re: diff... anyone?

From: carl::cybercraft::co::nz at: 24-Dec-2003 22:21


On 20-Jun-03, Maxim Olivier-Adlhoch wrote:
> Are there any rebol diff functions or objects available, that:

> given two strings:

> return a block with text strings id pairs which identify if the text
> was in one, the other or both strings... something like:

>>> a: "my very insane momy likes to cook her boots before washing
>>> them" b: "my not so sane momy really likes to cook boots"

>>> blk: diff a b

> ;which would result in a block like (or something similar)

>>> probe blk
> [
>  both "my "
>  first "very in"
>  second "not so "
>  both "sane momy "
>  second "really "
>  both "likes to cook boots"
>  first " before washing them"
> ]

There's no function I know of that would work directly on a string
like that (other than writing a specific parse rule), but you could
do something like this...

First convert your strings to blocks containing your words like so...

>> aa: parse a none
== ["my" "very" "insane" "momy" "likes" "to" "cook" "her" "boots"
"before" "washing" "them"]
>> bb: parse b none
== ["my" "not" "so" "sane" "momy" "really" "likes" "to" "cook"
"boots"]

Then use intersect to get the words that are in both blocks...

>> both: intersect aa bb
== ["my" "momy" "likes" "to" "cook" "boots"]

Then use difference to compare the "both" block with each of the
original blocks to get the words that are only in one block...

>> a-only: difference both aa
== ["very" "insane" "her" "before" "washing" "them"]
>> b-only: difference both bb
== ["not" "so" "sane" "really"]

Which may or may not be quite the type of results you want (the set
functions drop repeated words and don't record where in the string
each word was), but shows one approach you might find useful.  (Along
with intersect and difference, unique, union and exclude can also be
used to work on data sets.)
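
For what it's worth, here is a small sketch of those other set
functions on the same kind of word blocks. The third string is made up
purely to show unique, and I've left the printed results out rather
than guess at their exact layout.

```rebol
REBOL [Title: "Other set functions sketch"]

aa: parse "my very insane momy likes to cook her boots before washing them" none
bb: parse "my not so sane momy really likes to cook boots" none

; exclude: words in aa that do not appear in bb
probe exclude aa bb

; union: every word found in either block, with repeats removed
probe union aa bb

; unique: each distinct word of a single block, once
probe unique parse "to be or not to be" none
```

Since "both" is a subset of each original block, exclude aa bb should
give the same words as the a-only result from difference both aa.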

Hope that's of some help.

--
Carl Read