Mailing List Archive: 49091 messages
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search

[REBOL] Re: Coffee break problem anyone?

From: joel:neely:fedex at: 4-Nov-2003 11:32

Hi, Sunanda, Nice little problem! (And, I admit to being more interested in the algorithm design aspect than the REBOL/PARSE aspect...) [SunandaDH--aol--com] wrote:
> The problem is this......You have a block containing integers, like: > > sparse-array: [ 12 13 14 15 16 17
> ] > > There are runs of consecutive, ascending integers: 12--17, 7--8 etc. > We want to find the location in the block of the first longest, > such sequence. > > The answer for this dataset is 20--26. 101--107 is as long, but > isn't the first. 59--52 is longer, but it is descending. >
This is a great little problem for comparing two approaches that handle state differently: classic stateful structured programming massages variables to maintain state, while the Jackson Structured Methodology maps data structure to algorithm structure as much as possible. Incidentally, I'm extending the spec to include the requirement that we should get a zero-length run if the array is empty... The "classic" stateful approach says "We have an array; process it one element at a time, and keep track of whether each new element extends a run or begins a new one." This normally gets a little tricky when we have to address boundary conditions (e.g. what do we compare the first element against, as it has no predecessor?) Here's a standard design based on that strategy: find-run-stateful: func [ data [block!] /local maxpos maxlen seqpos seqlen prev curr ][ maxpos: maxlen: 0 ; initially no runs seqpos: seqlen: 0 ; imaginary first run prev: data/1 ; pretend first is... ; ... its own previous for pos 1 length? data 1 [ ; count across array curr: data/:pos ; get new current value either curr - 1 = prev [ ; if current extends run... seqlen: seqlen + 1 ; ... just bump length ][ ; ... otherwise if seqlen > maxlen [ ; if last run was new best... maxpos: seqpos ; ... remember position maxlen: seqlen ; ... and length ] seqpos: pos ; start new run here... seqlen: 1 ; ... with length of 1 ] prev: curr ; remember for comparison ] ; we can have a run underway ; when falling off the end if seqlen > maxlen [ ; so we must check its length maxpos: seqpos ; to see if it was the best maxlen: seqlen ; run in the array ] reduce [maxpos maxlen] ] The tricky bits in this design are: - initializing the imaginary "previous run" before the first element of the block, in such a way that it won't be extended by the first element, and it won't look like a better run than any that we actually find; - remembering to check the run which goes off the end of the block. Notice that the above design just assumes that the block is an iteration of numbers, and so has to figure out what to do with each number (including the first!). It maintains two invariants: - SEQPOS and SEQLEN represent the run in progress which each element either extends or terminates (by starting a new one); - MAXPOS and MAXLEN represent the longest *completed* run yet seen. The Jackson method says instead, "The argument block is an iteration of runs, and a run is an iteration of consecutive numbers" and so has two loops: the outer one iterates across runs in the block, and the inner one iterates across numbers in a single run. find-run-Jackson: func [ data [block!] /local maxpos maxlen seqpos seqlen prev curr ][ maxpos: maxlen: 0 ; initially no runs while [not empty? data] [ ; process another run... seqpos: index? data ; ... starting here seqlen: 0 ; ... initially empty prev: none ; ... no previous value while [ ; same run as long as... all [ not empty? data ; ... more data and any [ (curr: data/1) - 1 ; ... current value = prev ; ... extends this run, or zero? seqlen ; ... this is a new run ] ] ][ ; same run, so... prev: curr ; ... save value to compare seqlen: seqlen + 1 ; ... extend length data: next data ; ... shift the data ] if seqlen > maxlen [ ; if that run is new best... maxpos: seqpos ; ... remember position maxlen: seqlen ; ... and length ] ] reduce [maxpos maxlen] ] Notice that now there's a single, obvious place for the test for new best run -- right after the inner loop across a run's numbers! The condition on the inner loop has to satisfy the outer loops condition first, and then add its own criteria (which uses ZERO? SEQLEN to mean that we have a new run, and therefore don't care whether the previous value was one less than the current one). To see the value of this structural change, imagine that we now need to return a triplet of numbers; in addition to the position and length of the longest consecutive run, also return the count of how many runs of that length existed. Where would that code go in the Jackson version? Where would it have to appear in the other one? -jn- -- ---------------------------------------------------------------------- Joel Neely joelDOTneelyATfedexDOTcom 901-263-4446 Enron Accountingg in a Nutshell: 1c=$0.01=($0.10)**2=(10c)**2=100c=$1