Mailing List Archive: 49091 messages
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

[algo] Intervals and parsing

 [1/7] from: didec:tiscali at: 11-Mar-2004 18:24


[algo] Intervals and parsing Hi all, For one of my apps, I have to manage a very long list of integer! I need to know if an integer! is in the list, and i need to add it to the list. To avoid an infinite block of integer! that will overflow the memory for nothing, and considering the fact that most of the values will follow themselve, I use an intervals dialect and some funcs. The values are stored in a block, but consecutives values are replaced by first and last values separate by a minus sign "-". I.e. : [0 1 2 3 4 5 10 11 12 13 15 19 20 21 22] become [0 - 5 10 - 13 15 19 - 22] I use 2 functions, one to check if a value exists in list, and one to add the value to the list. I post the funcs here for 2 reasons : 1) Maybe someone would need it 2) Maybe someone can do better or simpler Anyway, comments are welcome DideC ----- SCRIPT ----- REBOL [ title: "Manage list of integer intervals" author: "Didier Cadieu" date: 11-march-2004 version: 1.0 ] intervals: context [ where: none included?: func [ "Return true if Val is inside an intervals block." itv [block!] "Block of integers / intervals (x - y)" val [integer!] "Value to search" /local pos brk ] [ res: where: brk: none parse itv [ any [ pos: set i1 integer! '- set i2 integer! ( if any [ all [i1 <= val val <= i2 res: true] all [any [i1 - val = 1 val - i2 = 1] where: pos] all [val < i1 not brk where: pos] ] [brk: true] ) | pos: set i1 integer! ( if any [ all [i1 = val res: true] all [1 = abs i1 - val where: pos] all [val < i1 not brk where: pos] ] [brk: true] ) ] pos: end (any [brk res where: pos]) ] res ] include: func [ "Change intervals block to include Val." itv [block!] "Block of integers / intervals (x - y)" val [integer!] "Value to include" /local pos brk ] [ brk: none parse itv [ any [ pos: set i1 integer! '- set i2 integer! ( any [ brk all [i1 <= val val <= i2 brk: true] all [i1 - val = 1 change pos val brk: true] all [val - i2 = 1 change next next pos val brk: true] all [val < i1 insert pos val brk: true] ] ) | pos: set i1 integer! ( any [ brk all [i1 = val brk: true] all [i1 - val = 1 insert pos reduce [val '-] brk: true] all [val - i1 = 1 insert next pos reduce ['- val] brk: true] all [val < i1 insert pos val brk: true] ] ) ] pos: end (any [brk insert pos val]) ] regroup itv ] regroup: func [ "Simplify intervals block if possible" itv [block!] "Block of integers / intervals (x - y)" /local pos brk ] [ brk: none parse itv [ any [ pos: set i1 integer! set i2 integer! ( any [ brk all [i2 - i1 = 1 ( remove pos if all [not tail? next pos '- = second pos] [remove/part pos 2] brk: true) ] ] ) | skip ] ] ] ] ;***** some manual tests ; Type "ii num" to test a num in IT interval ; Type "ato num" to test inclusion of num in IT. IT is never changed. it: [-1 1 4 - 5 7 - 10 12 20 - 22 30 ] ii: func [val] [ intervals/included? it val prin res prin " " probe intervals/where ] ato: func [val /local it2] [ intervals/include it2: copy it val probe head it2 ] ;***** Some automatic Tests random/seed now/time/precise i: copy [] test1: does [ print "--- Test1 : includes values ---" loop 50 [intervals/include i v: random 40 print [v "->" mold i]] ask "Press enter to continue" print "^/" ] test2: does [ print "--- Test2 : check if values are included?---" probe i loop 50 [print [v: (random 50) - 5 "?" either intervals/included? i v ["yes"][" no"]]] ask "Press enter to continue" print "^/" ] test1 test2 print {Type "test1" or "test2" to redo a test} halt

 [2/7] from: tomc:darkwing:uoregon at: 12-Mar-2004 0:10


On Thu, 11 Mar 2004, Didec wrote:
> [algo] Intervals and parsing > Hi all,
<<quoted lines omitted: 11>>
> Anyway, comments are welcome > DideC
Hi DideC, as much as I do like parse I am unconvinced it is the ideal solution for managing a list of integers. I changed your interval datastructure to be a list of pairs!, each representing an interval, and used binary search instead of parse to locate solutions/insertion-points. it has not been tested beyond a few runs of your automatic testing and it is incomplete as it does not merge adjecent intervals when they meet. but I would be curious as to how it fairs in comparison to the parse version on your large data sets. --- script --- REBOL [ Title: "binary interval search on pairs" Author: "Tom Conlin" Date: 2004-Mar-11 Version: 0.0.1 File: %binary-interval-search.r Purpose: {DideC: I have to manage a very long list of integer! I need to know if an integer! is in the list, and i need to add it to the list. To avoid an infinite block of integer! that will overflow the memory for nothing, and considering the fact that most of the values will follow, I use an intervals dialect and some funcs. The values are stored in a block, but consecutives values are replaced by first and last values separate by a minus sign "-". I.e. : [0 1 2 3 4 5 10 11 12 13 15 19 20 21 22] become [0 - 5 10 - 13 15 19 - 22] I use 2 functions, one to check if a value exists in list, and one to add the value to the list. } Comment: {} History: [] ] ;;; adapted from ;;; http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=binary-search.r binary-interval-search: func [ interval[block!] {smaller to bigger sequence of disjoint pairs} key [integer!] "integer to find in the intervals" /local lo mid hi tmp ][ lo: 1 hi: length? interval if hi <= lo [return 0] mid: to-integer (hi / 2) while[hi >= lo][ tmp: pick interval mid either (key >= tmp/1) and (key <= tmp/2) [return mid] [either key > tmp/2 [lo: mid + 1][hi: mid - 1]] mid: to-integer (lo + ((hi - lo) / 2)) ] mid ] intervals: context[ where: 1 included?: func[interval [block!] value [integer!]][ where: binary-interval-search interval value ;true all[not zero? where value >= interval/:where/1 value <interval/:where/2] ] include: func[interval [block!] value [integer!] /local here][ if included? interval value[return] here: at interval where either tail? here[insert here 1x1 * value] [;print [value 'found first here 'at where] either value > here/1/2 [either 1 + here/1/2 = value [here/1/2: value] [here: next here either all [not tail? here -1 + here/1/1 value] [here/1/1: value] [insert here 1x1 * value] ] ] [either -1 + here/1/1 = value [here/1/1: value] [here: back here either 1 + here/1/2 = value [here/1/2: value] [insert here 1x1 * value] ] ] ] ;;; mearging adjecent pairs is left as an exercise ] ] ;;;;;;;;;;;;;;;;;;;;;;;; it: [-1x-1 1x1 4x5 7x10 12x12 20x22 30x30 ] ii: func [val] [ prin intervals/included? it val prin " " probe intervals/interval/where ] ato: func [val /local it2] [ intervals/include it2: copy it val probe head it2 ] ;***** Some automatic Tests random/seed now/time/precise i: copy [] test1: does [ print "--- Test1 : includes values ---" loop 50 [intervals/include i v: random 40 print [v "->" mold i]] ask "Press enter to continue" print "^/" ] test2: does [ print "--- Test2 : check if values are included?---" probe i loop 50 [print [v: (random 50) - 5 "?" either intervals/included? i v ["yes"][" no"]]] ask "Press enter to continue" print "^/" ] test1 test2 print {Type "test1" or "test2" to redo a test} halt

 [3/7] from: didec:tiscali at: 12-Mar-2004 14:07


Re: [algo] Intervals and parsing See at bottom
> On Thu, 11 Mar 2004, Didec wrote: > > [algo] Intervals and parsing
<<quoted lines omitted: 31>>
> but I would be curious as to how it fairs in comparison to the parse > version on your large data sets.
Thank you very much Tom. I have not make benchmark to compare, but I like the pair! idea and the dichotomical search. It will be surely faster as the most search occurs on the high value (end of the series. Sure, for performance, the best is to reverse the order of the series or to search from the end (in my case) But performance is not the main problem, as this system reduce the series from thousand of values to about 15 intervals. But I keep your idea in mind. FOR PARSE GURUS ! How can I stop the parsing if I find the correct value (and considering the test is done in the code next to the rule, not by the rule himself) ? DideC

 [4/7] from: maarten:vrijheid at: 12-Mar-2004 14:56


> FOR PARSE GURUS ! > > How can I stop the parsing if I find the correct value (and considering the test is done in the code next to the rule, not by the rule himself) ? >
use the 'break keyword from the parse dialect, core 2.5.3 and up http://www.rebol.com/docs/changes.html#section-3.6 --Maarten

 [5/7] from: nitsch-lists:netcologne at: 12-Mar-2004 16:02


On Freitag, 12. M=E4rz 2004 14:07, Didec wrote:
> Re: [algo] Intervals and parsing > > > FOR PARSE GURUS ! > > How can I stop the parsing if I find the correct value (and considering the > test is done in the code next to the rule, not by the rule himself) ?
That means parse s[ copy something (if xy = something [i-want-to-break]) ] and now you don't know how to tell the parser to stop? I use this trick: parse s[ copy something ( break-or-not: either xy = something [ 'break ][ [] ] ) break-or-not ] so setting a dummy-rule following the paren to 'break. Looks ugly but works.
> DideC
-Volker

 [6/7] from: pwawood::mango::net::my at: 13-Mar-2004 8:27


Hi DideC As your list of integers seems to be very densely populated, have you considered storing those present rather than those missing ? Peter On Friday, Mar 12, 2004, at 21:07 Asia/Kuala_Lumpur, Didec wrote:

 [7/7] from: pwawood:mango:my at: 13-Mar-2004 8:40


Oops, I meant storing those missing rather than those presenting. Peter On Saturday, Mar 13, 2004, at 08:27 Asia/Kuala_Lumpur, Peter WA Wood wrote:

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted