World: r3wp
[Rebol School] Rebol School
Sunanda 6-Jul-2007 [545] | Not sure this is a case for parse... You seem to have four types of line:
-- those with "page" in a specific location on the line
-- those with "name" in a specific location on the line
-- those with "member" in a specific location on the line
-- others which are to be ignored, e.g. your original line 6: "Line 6 600 Desc 1 text 12/23/03"
What I would do is:
* use read/lines to get a block
* for each line in the block, identify what record type it is by the fixed literal, something like: if "page" = copy/part skip line 25 4 [....]
* perhaps use parse to extract the items I need, once I know the line type
*** If you just use parse in the way you propose, you run the risk of misidentifying lines when there is a member called "page" or "name" |
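A minimal sketch of the fixed-position check Sunanda describes. The offset of 25 comes from the example expression above; the sample lines, the filler padding, and the helper name `classify` are hypothetical and only serve to make the sketch self-contained. A real script would get its block from read/lines and use the report's actual column positions.

```rebol
REBOL [Title: "Classify report lines by a fixed-position literal (sketch)"]

; Assumption: the keyword starts 25 characters into the line, as in the
; "skip line 25" expression above. Real reports will use other offsets.
pad: head insert/dup copy "" "." 25      ; 25 filler characters

; Hypothetical sample lines standing in for the result of read/lines.
lines: reduce [
    join pad "page 1"
    join pad "name ALPHA"
    join pad "member M1"
    "page appears here, but in the wrong column"
]

; Hypothetical helper: returns a word naming the record type.
classify: func [line [string!]] [
    case [
        "page"   = copy/part skip line 25 4 ['page]
        "name"   = copy/part skip line 25 4 ['name]
        "member" = copy/part skip line 25 6 ['member]
        true ['other]
    ]
]

foreach line lines [print [classify line tab line]]
```

The last sample line contains the text "page" but not at the expected column, so it is not treated as a page line. That is the misidentification risk the message warns about.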
PatrickP61 6-Jul-2007 [546x4] | Thank you Sunanda -- I will give that a try. Just to let you know -- my goal is to convert a printable report that is in a file into a spreadsheet. Some fields will only appear once per page, like PAGE. Some fields could appear in a new section of the page multiple times, like NAME in my example. And some fields could appear many times per section, like MEMBER:
_______________________
Page header      PAGE 1
Section header   NAME1.1
Detail lines     MEMBER1.1.1
Detail lines     MEMBER1.1.2
Section header   NAME1.2
Detail lines     MEMBER1.2.1
Detail lines     MEMBER1.2.2
Page header      PAGE 2
(repeat of above)
_______________________
I want to create a spreadsheet that takes the different capturable fields and places them on the same line as the detail lines, like so:
______________________
Page  Name     Member
1     NAME1.1  MEMBER1.1.1
1     NAME1.1  MEMBER1.1.2
1     NAME1.2  MEMBER1.2.1
1     NAME1.2  MEMBER1.2.2
2     NAME2.1  MEMBER2.1.1
...
(The version numbers are simply a way to show which captured field I am referring to: Page, Name, Member.)
Anyway -- that is my goal. I have figured out how to do the looping, and can identify the record types, but you are right about the possibility of misidentifying lines. |
This is my pseudocode approach:
A new page is identified by a page header text that is the same on each page and the word PAGE at the end of the line.
A new section is identified by a section header text that is the same within the page and the text "NAME . . . . :".
Member lines do not have an identifying mark on the line but are always preceded by the NAME line. Member lines continue until a new page is found, or the words "END OF NAME" are found (which I didn't show in my example above).

Initialize capture fields like PAGE, NAME to -null-
Initialize OUTPUT-FLAG to OFF.
Loop through each line of the input file until end of file (EOF).
/|\  If at a New-page line
 |   or at end of Name section
 |       Set OUTPUT-FLAG OFF
 |   If OUTPUT-FLAG ON
 |       Format output record from captured fields and current line (MEMBER)
 |       Write output record
 |   IF at New Name line
 |       Set OUTPUT-FLAG ON
 |   IF OUTPUT-FLAG OFF
 |       Get capture fields like PAGE-NUMBER when at a PAGE line
 |       Get NAME when at a NAME line.
 |____ Next line in the file. | |
Note to all -- Please realize this is a simplified version of the real report -- There are many more fields and other things to code for, but they are all similar items to the example PAGE, NAME, and MEMBER fields. | |
Oops -- I should put the IF at New Name line at the end of the loop, or put the capture of the name in that part. | |
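One way the pseudocode above might look in REBOL, with the name capture done in the same branch that switches output on. This is a sketch under stated assumptions, not the actual report: the header text "ACME REPORT", the sample lines, and the words `page-header`, `name-mark`, `rows`, and `output?` are all made up for illustration.

```rebol
REBOL [Title: "Report to tab-separated rows (sketch)"]

page-header: "ACME REPORT"          ; assumed fixed text at the start of each page line
name-mark:   "NAME . . . . :"       ; section marker quoted in the pseudocode

; Hypothetical report lines standing in for read/lines on the real file.
report: [
    "ACME REPORT          PAGE 1"
    "NAME . . . . : ALPHA"
    "M1"
    "M2"
    "END OF NAME"
    "NAME . . . . : BETA"
    "M3"
    "ACME REPORT          PAGE 2"
    "NAME . . . . : GAMMA"
    "M4"
    "END OF NAME"
]

page: name: none
output?: false                      ; the OUTPUT-FLAG of the pseudocode
rows: copy []

foreach line report [
    case [
        ; New page: line starts with the header and has PAGE after it.
        all [
            rest: find/match/case line page-header
            rest: find/tail/case rest "PAGE"
        ] [
            page: trim copy rest
            output?: false
        ]
        ; End of a name section.
        find/case line "END OF NAME" [output?: false]
        ; New section: capture the name and switch output on.
        rest: find/tail/case line name-mark [
            name: trim copy rest
            output?: true
        ]
        ; Anything else inside a section is a member (detail) line.
        output? [append rows rejoin [page tab name tab trim copy line]]
    ]
]

foreach row rows [print row]
```

Because the page test requires the header text at the start of the line, a member that happens to be called PAGE is not mistaken for a page header in this sketch. A real report would need similarly strict tests for every record type.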
Tomc 7-Jul-2007 [550] | Yes Patrick, you have it right. The rules I gave would fail since you have multiple names/members. I would try to get away from the line-by-line mentality and try to break it into your conceptual record groupings: file, pages, sections, and details... One trick I use is to replace a string delimiter for a record with a single char so parse returns a block of that record type. This is good because when you then work on each item in the block in turn, you know any fields you find do belong to this record and that you have not accidentally skipped to a similar field in a later record. Something like this:
pages: read %file
replace/all/case pages "PAGE" "^L"      ; mark each page with a single char
pages: parse/all pages "^L"             ; one string per page
foreach page pages [
    p: first page
    page: find page newline
    replace/all/case page "NAME" "^L"   ; mark each section the same way
    sections: parse/all page "^L"
    foreach sec sections [
        s: first sec
        sec: find sec newline
        parse sec [
            any [thru "Member" copy detail to newline newline (print [p tab s tab detail])]
        ]
    ]
] |
PatrickP61 18-Jul-2007 [551x4] | I am a little confused about PORTS. I want to control how much information is loaded into a block but I am not sure how to determine if data remains in a port. Example: |
In-port: open/lines In-file
while [not tail? In-port] [
    print In-port
    In-port: next In-port
]
close In-port | |
This is not doing what I want. I want it to continue to run through all lines of a file and print it | |
My goal is to be able to control how much of a file is loaded into a block then process the block and then go after the next set of data. That is why I am using PORT to do this function instead of reading everything into memory etc. | |
Geomol 18-Jul-2007 [555x2] | Change the print line to: print first In-port |
I think your code prints the port specs and everything. | |
PatrickP61 19-Jul-2007 [557] | Yes, it does dump a lot of stuff that I don't know about!!! |
btiffin 19-Jul-2007 [558x2] | Ports are nifty little objects. :) And if you just type >> In-port you get back nothing, just another prompt. >> The interpreter does not display the internals of objects, but print does, so what you are seeing is the object! that is In-port. Well, I'm lying... In-port is a port! not an object! Close but not the same. Ports emulate a series! wrapped in object! wrapped in enigma. Or is it an object! wrapped in a series! disguised as a sphinx? :) first In-port is a REBOL reflective property feature that when you get it, you'll go "Ahhhh" as you step closer to the Zen of REBOL. For fun with a really big object! try >> print system |
Oh, by the way...we added to the %form-date.r script in the library. See I'm New for details. | |
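A small sketch of the corrected loop, extended to collect lines into fixed-size batches, which is what Patrick says he is after. The file name, the sample lines, and the words `batch-size`, `batch`, and `batches` are assumptions for illustration; only `first`, `next`, and `tail?` on the port come from the discussion above.

```rebol
REBOL [Title: "Read a line port in batches (sketch)"]

in-file: %port-demo.txt                 ; hypothetical file name
write/lines in-file ["line 1" "line 2" "line 3" "line 4" "line 5"]

batch-size: 2                           ; assumed number of lines per batch
batch: copy []
batches: 0

in-port: open/lines in-file
while [not tail? in-port] [
    append batch first in-port          ; FIRST yields the current line, not the port
    in-port: next in-port
    ; Process when the batch is full or the file is exhausted.
    if any [tail? in-port  batch-size = length? batch] [
        batches: batches + 1
        print ["batch" batches ":" mold batch]
        clear batch
    ]
]
close in-port
delete in-file
```

A port opened without /direct is buffered, so this sketch shows the looping pattern rather than a memory saving. For very large files, opening with open/direct/lines is probably the variant to look at; it is not shown here.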
PatrickP61 20-Jul-2007 [560] | Another question -- I know to use escape to insert things like a tab as in ^(tab) into a string. What can I use to insert a newline? ^(newline) doesn't work. |
Geomol 20-Jul-2007 [561x3] | str: "" insert str newline |
If you just write NEWLINE in the prompt, you'll see how it's defined. You can specify a newline in a string as str: "a string with a newline: ^/" | |
A tab can also be specified as: "^-" | |
PatrickP61 20-Jul-2007 [564] | Perfect! ^/ works just great! |
Geomol 20-Jul-2007 [565] | It's a bit strange, that ^(newline) doesn't work, now that ^(tab) does. Maybe it was just forgotten. |
PatrickP61 20-Jul-2007 [566] | As a newbie, it seemed natural to try ^(newline), but the shortcut ^/ works for me too. |
Geomol 20-Jul-2007 [567] | Unfortunately there are some strange things in the corners of REBOL, but you'll learn to live with it. |
PatrickP61 20-Jul-2007 [568] | :-) |
Geomol 20-Jul-2007 [569] | Now you're at it, check http://www.rebol.com/docs/core23/rebolcore-16.html#section-2.11 and http://www.rebol.com/docs/core23/rebolcore-8.html for info about strings in REBOL. |
PatrickP61 20-Jul-2007 [570] | Just what I needed!!! |
Geomol 20-Jul-2007 [571] | Ah, there's the explanation, a newline can be specified as ^(line) (for some reason) |
PatrickP61 20-Jul-2007 [572] | Ahhhhh |
Gregg 20-Jul-2007 [573] | More reference info here: http://www.rebol.com/docs/core23/rebolcore-16.html#section-3.1 And you also have the words CR, LF, and CRLF available. |
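The escapes and words mentioned in this exchange can be checked side by side at the console. A short sketch, using only the spellings named above:

```rebol
REBOL [Title: "Newline and tab escapes (sketch)"]

print "one^/two"                    ; ^/ is the newline escape
print "one^(line)two"               ; ^(line) is the long form of ^/
print "a^-b"                        ; ^- is the tab escape
print "a^(tab)b"                    ; ^(tab) is the long form of ^-

print equal? "^/" "^(line)"         ; both spellings give the same string
print equal? "^-" "^(tab)"
print equal? newline first "^/"     ; NEWLINE is the newline character
print equal? tab first "^-"         ; TAB is the tab character
print equal? crlf "^M^/"            ; CRLF is a two-character string

; Building a string with the NEWLINE word instead of an escape.
str: copy "first"
append str newline
append str "second"
print str
```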
PatrickP61 26-Jul-2007 [574] | My teachers, I have an array ( block (of "lines") within a block (of values) ) that I would like to convert to a block (of "lines") with all values joined with an embedded tab. What is the best way to achieve this? See example:
In-array: [                         <-- I have this
    [ {Col A1} {Col B1} ]
    [ {2} {3} ]
    [ {line "3"} {col "b"} ]
]
Out-block: [                        <-- I want this
    {Col A1^(tab)Col B1}
    {2^(tab)3}
    {line "3"^(tab)col "b"}
] |
Rebolek 26-Jul-2007 [575] |
>> out-block: copy []
== []
>> foreach line in-array [append out-block rejoin [line/1 "^-" line/2]]
== ["Col A1^-Col B1" "2^-3" {line "3"^-col "b"}] |
Anton 26-Jul-2007 [576x2] |
in-array: [["Col A1" "Col B1"]["2" "3"][{line "3"} {col "b"}]]
out-block: copy []
foreach blk in-array [
    line: copy ""
    repeat n -1 + length? blk [append append line blk/:n tab]
    if not empty? blk [append line last blk]
    append out-block line
] |
new-line/all out-block on
== [
    "Col A1^-Col B1"
    "2^-3"
    {line "3"^-col "b"}
] | |
PatrickP61 26-Jul-2007 [578] | Anton, what does the new-line/all do. I gather it inserts newlines after each value. Is that right? |
Volker 26-Jul-2007 [579] | it cleans up rebol-listings |
PatrickP61 26-Jul-2007 [580] | Forgive me, how does it do that? |
Volker 26-Jul-2007 [581x2] | else all strings would be on one line. only interesting for probing rebol-code, does not change the strings itself |
there is a hidden marker in values, for newline | |
PatrickP61 26-Jul-2007 [583] | So if I read you right, then if I didn't do new-line/all, and tried to probe Out-block, it would show the entire contents as one large string, whereas new-line/all will allow probe to show each value as a separate line. Right? |
Volker 26-Jul-2007 [584x2] | as lots of strings in one line |
and with the new-line all strings in own lines | |
PatrickP61 26-Jul-2007 [586x4] | I see how it works now -- Thank you Anton and Volker!! |
Thank you Rebolek -- didn't see your answer at first! | |
My teachers, Anton and Rebolek have submitted two answers. The difference between them is that Anton's answer will insert a tab between varying numbers of values per line, where Rebolek will insert a tab in-between col 1 and col2 (assuming only 2 columns in the array). Is that a correct interpretation? | |
Anton, I understand Rebolek's answer, but I want to understand your answer too. I'm wondering about the line:
repeat N -1 + length? Blk [append append Line Blk/:N tab]
Does Rebol do the inner append first (as in math expressions), like this:
[append ( append Line Blk/:N ) tab]
and then do this for the number of "lines" in the array?
N  Out-block
0  []
1  "Col A1^-Col B1"
2  "Col A1^-Col B1" "2^-3"
3  "Col A1^-Col B1" "2^-3" {line "3"^-col "b"}
I think I see the above progression, but I am not sure about Blk [append Line last Blk]. Is this advancing the starting position within In-array? | |
Gregg 27-Jul-2007 [590] | ...insert a tab between varying numbers of values per line <versus> ... insert a tab in-between col 1 and col2 -- Correct. On new-line, it's kind of advanced because it doesn't insert a newline (CR/LF), but rather a hidden marker between values that REBOL uses when molding blocks. |
PatrickP61 27-Jul-2007 [591] | Hi Gregg -- Is that primarily for display purposes, or could it be used for other things? |
Gregg 27-Jul-2007 [592x2] | On "append append", yes. You could also do it like this: "append line join blk/:n tab", the difference being that APPEND modifies its series argument, and JOIN does not. REPEAT is 1-based, not zero, Anton is using "-1 + length? blk" rather than "(length? blk) - 1" or "subtract length? blk 1". The first of those cases requires the paren because "-" is an op! which will be evaluated before the length? func, so REBOL would see it like this "length? (blk - 1)", which doesn't work. |
For display or formatted output. It's *very* useful when generating code for example. | |
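A quick way to see that new-line only changes how a block is molded, not what it contains. The block contents here are made up for illustration:

```rebol
REBOL [Title: "new-line markers (sketch)"]

blk: copy ["a" "b" "c"]

print mold blk              ; all values on one line
new-line/all blk on         ; set the hidden marker on every value
print mold blk              ; each value now molds on its own line

print length? blk           ; still 3: no values were added
print mold first blk        ; the strings themselves are unchanged
print new-line? blk         ; true: the marker is set at this position

new-line/all blk off        ; clear the markers again
print mold blk              ; back to a single line
```

This is why it is handy when generating code or data files: the values stay the same, only the layout of the saved or probed block changes.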
PatrickP61 27-Jul-2007 [594] | Sounds like more advanced stuff than I'm understanding right now. I'll read up on the terms. When I get REBOL code solution, I'd like to understand how Rebol is processing the code. What it does logically first, and logically second... I think I get confused about when Rebol does the evaluations. |
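The evaluation-order points from Gregg's explanation can be tried one expression at a time. A short sketch; the block and string contents are made up, and the failing form is left as a comment so the script runs to the end:

```rebol
REBOL [Title: "Evaluation order with ops and nested calls (sketch)"]

blk: ["a" "b" "c"]

print -1 + length? blk          ; 2: LENGTH? BLK is the right-hand side of +
print (length? blk) - 1         ; 2: the paren makes LENGTH? run first
print subtract length? blk 1    ; 2: prefix form, no paren needed
; print length? blk - 1         ; fails: REBOL tries BLK - 1 first

; Nested APPEND: the inner call runs first and returns LINE,
; which then becomes the series argument of the outer call.
line: copy ""
append append line "x" tab
append line "y"
print mold line                 ; "x^-y"

; The JOIN variant: JOIN builds a new string and leaves blk/1 untouched.
line2: copy ""
append line2 join blk/1 tab
print mold line2                ; "a^-"
print mold blk/1                ; "a"
```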