## [REBOL] Re: Coffee break problem anyone?

### From: joel:neely:fedex at: 5-Nov-2003 13:37

Hi, Sunanda,

First, let me offer an improvement and then address some more of the good discussion points you raise. (There's a little challenge at the end for the interested who wade through to that point! ;-)

Minor issue first: instead of initializing the run length to zero and incrementing it for each added element, just calculate the length directly once the run is finished.

Now for the more interesting stuff... Instead of saying that:

- a block is an iteration of runs, and
- a run is an iteration of numbers;

```
block --*-- run --*-- number
```

let's say that:

- a block is an iteration of runs, and
- a run is a first number, followed by any consecutive numbers.

```
           +-- first number
           |
block --*--+
           |
           +--*-- consecutive number
```

This (again) addresses a boundary issue: the first value in a run plays a different role than the other numbers in the run. Tricky setup of initial conditions so that the first value can be treated just like all the others adds more complexity to the code. That said, here's a version that implements that view:

```rebol
where-j: func [
    block [block!]
    /local maxpos maxlen runpos runlen prev curr
][
    maxpos: maxlen: 0                    ; default empty run
    while [not empty? block] [           ; more data = more runs
        runpos: index? block             ; start new run here
        prev: block/1                    ; remember first value
        block: next block                ; done with first
        while [
            prev + 1 = curr: block/1     ; extending the run
        ][
            prev: curr                   ; save comparison value
            block: next block            ; move on
        ]
        runlen: (index? block) - runpos  ; now compute length
        if runlen > maxlen [             ; update best run?
            maxpos: runpos
            maxlen: runlen
        ]
    ]
    reduce [maxpos maxlen]               ; return best run
]
```

[SunandaDH--aol--com] wrote:

> ... the insight that (in effect) the procedures can fall naturally out of an
> analysis of the data structures can lead to some very elegant solutions.
>
> Of course there are situations where neither approach works, and then you
> need other approaches too. Object-orientation is one claim to the next step
> as it merges both approaches.
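(As a quick sanity check -- a test call of my own, not part of the original post -- here is WHERE-J applied to the sample data discussed below:)

```rebol
>> where-j [10 8 9 6 7 4 5 0 1 2]
== [8 3]    ; the longest run (0 1 2) starts at index 8, length 3
```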
> I never quite saw the point there, either, but it does work in some areas.

I'm not clear on what you mean by "both approaches"... I view JSP (Jackson Structured Programming) simply as a specific type of structured programming which has strong heuristics about which structure(s) should drive the design.

> And then there are situations that seem completely off the
> deterministic axis.

??? Do you mean non-deterministic programming, or just programming where the criteria are only vaguely stated?

> Consider this progression:
>
> -- Find the sum of the numbers (simply almost any way: write a loop;
>    use a map function, etc)

Easy. Agreed.

> -- Find the longest run with the highest end value (101--107 in the
>    original data)

An easy extension of the above program:

- add a local MAXTOP, initialized to anything;
- change the condition at the end from `runlen > maxlen` to
  `any [runlen > maxlen all [runlen = maxlen prev > maxtop]]`;
- change the "new best run" block to include `maxtop: prev`.

(For why this was easy, see below...)

> -- Find the run (as defined in the original problem) whose run length
>    is the modal length of runs

First, we have to change "the run" to "a run", since the answer is no longer unique. For the data [10 8 9 6 7 4 5 0 1 2] there are runs of length 1, 2, 2, 2, and 3, so the modal length is 2, and there are three runs that have that length.

However, something more significant has happened. The previous change to the problem (longest run with highest upper value) actually had THE SAME STRUCTURE as the original problem: find the "best" run, where we have a simple test for determining whether one run is "better" than another. In other words, we can generate runs one at a time, and test each one against the current "best" to see which one wins. But in this last change, we are now asking for a run based on some property/ies of the entire collection of runs, so we must generate (and retain) all of them.
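(That "generate and retain all of them" step can be sketched by reshaping WHERE-J; ALL-RUNS is my own name for this illustration, not something from the original code:)

```rebol
all-runs: func [
    block [block!]
    /local runs runpos prev curr
][
    runs: copy []                        ; collected [position length] pairs
    while [not empty? block] [
        runpos: index? block             ; start of this run
        prev: block/1
        block: next block
        while [prev + 1 = curr: block/1] [
            prev: curr
            block: next block
        ]
        append runs reduce [runpos (index? block) - runpos]
    ]
    runs
]
```

For [10 8 9 6 7 4 5 0 1 2] this returns [1 1 2 2 4 2 6 2 8 3] -- five position/length pairs -- and step 2 can then scan that collection for the modal length.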
We can (trivially) change the above function to return all runs, then write a separate "pick the mode" function to operate on that collection (or we build the additional stuff into the end of the modified function). In either case we now have a two-step approach:

1) transform the offered block into a collection of runs;
2) do something with the collection of runs.

That's why the first change was easy and the second one was harder.

> -- Find the two runs that are most nearly similar (contain mostly the
>    same numbers)

I think I can paraphrase that as "find two runs whose intersection is at least as long as any other intersection of distinct runs", which is easy to solve, still using the two-step approach above. This one is also not uniquely determined without more constraints; to see why, consider [1 2 3 4 5 2 3 4 5 6 3 4 5 6 7].

> -- The numbers are RGB values from a JPG. Find the average color tone
>    of the object the female nearest the camera is looking at. Break out
>    the neural networks for this one!

Now we've jumped from a programming problem to a specification problem! The other tasks were well defined (there could be no argument about whether an answer is correct or not), but not so with this last one. (I recently saw a documentary on TV in which a critical question was whether a certain figure in a Renaissance painting was male or female. The "experts" couldn't agree. ;-)

> In my actual case, the patterns I was looking for grew more complex,
> so I reasoned that the requirement was becoming unreasonable as the
> preceding data structures weren't designed for easy sifting. Luckily,
> I could rewrite history (not always a luxury in this industry) and
> simplified the problem almost out of existence.

My compliments!!!
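(To make that paraphrase concrete, here is a small sketch of my own -- SIMILARITY is an invented name -- that scores a pair of runs, each materialized as a block of its numbers, using REBOL's INTERSECT:)

```rebol
similarity: func [a [block!] b [block!]] [
    length? intersect a b        ; how many values the two runs share
]

similarity [1 2 3 4 5] [2 3 4 5 6]   ; == 4
similarity [2 3 4 5 6] [3 4 5 6 7]   ; == 4, so the "best" pair isn't unique
```

Running the pairwise comparison over the whole collection of runs is then just step 2 of the same two-step approach.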
Sometimes the best way to solve a problem is to redefine the terms of discourse so that the problem simply disappears!

> Finally, this is getting a bit long, but just to stick up for the
> "finite machine state" approach I took in the original code...

I certainly use FSM-inspired design at times myself, but the nasty part (IMHO) is always the transition between states. Not implementing the transition (usually just updating a variable), but making the pattern of transitions that will actually occur clear to the eye/mind of the reader.

In that regard, I suggest that the (improved) JSP-inspired version above can be regarded as an FSM as well. It's just that each "state" corresponds to a region of code, and the state transitions are simply the standard flow control! In the updated FSM version, State 0 is *always* followed by State 1 (modulo falling off of the end), and a series of occurrences of State 1 is likewise followed by a State 0 (same caveat). The JSP version makes that organization much more obvious (IMHO).

Finally, there is still considerable redundancy in the FSM version -- it's just in time rather than in codespace! Notice that the "update largest" test is done *for each element* of each run, and the sets for a maximum-length run will be done for each element after the length of its nearest competitor has been passed. The FSM code contains these expressions only once, but evaluates them *MANY* more times than the equivalent ones in the JSP version.

I'm not criticizing here, but pointing out the trade-offs (which just happen to be particularly dramatic in this case! ;-) It is possible to optimize for one measure (in this case, minimizing source-code redundancy) in such a way that it pessimizes another measure (the number of times the code is -- redundantly -- evaluated).

Note that the JSP design heuristics don't minimize *ALL* aspects of code redundancy; the NOT EMPTY? test is really in there twice!!! Brownie points to anyone who can show why! ;-)

-jn-