AltME groups: search


world-name: r3wp

Group: Parse ... Discussion of PARSE dialect [web-public]
btiffin: 16-Mar-2008	context [ ] is just a shortcut for make object! [ ] and it's great. The more we hide in objects the easier it will be to share, or at the least, easier to use code from a variety of developer sources. Programming in the Many is important in our context as there are relatively few of us in the "many" - so far. So when even our small stuff is shareable we all win.
BrianH: 17-Mar-2008	Does a bind/copy on its code block every time it is used.
Oldes: 17-Mar-2008	I should probably not use code evaluation so much directly in the parse rule block, and rather call a function if I need a lot of temp variables to process the action.
Henrik: 28-Apr-2008	(note this block can only be made without a space at the end in rebol 2.7)
Henrik: 11-May-2008	if I have a rule-block that does not exist in the same context as the main parse block, is there a simple way to rebind it without composing it into the main parse block? my current solution is to bind it to a temp block and use the temp block as a rule in the main parse block, which is less than optimal, I think.
Chris: 11-May-2008	Assuming you want to assign values to function locals from the external parse rules, you can a) bind as you are doing, b) create a larger context for the function encompassing your rules or c) compile the parse rule, either on creation of the function or for each instance. a) rule: [set tag tag!] test: func [data /local tag][bind rule 'data parse data rule tag] b) test: use [tag][ rule: [set tag tag!] func [data][parse data rule tag] ] c) rule: [set tag tag!] test: func [data /local tag] compose/only [parse data (rule) tag] Also, note that when you bind, it alters the original block -- no need to reassign to a new word.
Chris: 11-May-2008	When it comes to complex rules, I opt for b). At that, I'd go for context [] where there are a lot of associated words...
Henrik: 12-May-2008	the function is recursive, so that may put a twist on b). I forgot that detail with BIND on a) so thanks for that. c) seems to work best.
amacleod: 15-May-2008	I'm trying to parse a text document that I've formatted into lines of text with blank lines between, similar to make doc format
amacleod: 15-May-2008	Most lines begin with a section number (2.), or a sub-section (2.3) or a sub-sub-section (2.3.5).
BrianH: 16-May-2008	If the section numbers always end with a period, you can do this: some [some digits "."] If the section numbers don't end with period you can do this: some digits any ["." some digits]
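A quick way to compare the two rules BrianH gives is to run them against a few sample section numbers. This is only a minimal REBOL 2 sketch: the word digits is assumed to be a charset of 0 to 9 (the log does not show its definition), and the names with-period and without-period and the test strings are made up for illustration.

```rebol
digits: charset "0123456789"    ; assumed definition of DIGITS

; every number is followed by a period, e.g. "2." or "2.3.5."
with-period: [some [some digits "."]]

; periods appear only between numbers, e.g. "2" or "2.3.5"
without-period: [some digits any ["." some digits]]

print parse/all "2." with-period          ; true
print parse/all "2.3.5." with-period      ; true
print parse/all "2.3" with-period         ; false: the final 3 has no period
print parse/all "2.3.5" without-period    ; true
print parse/all "2." without-period       ; false: the trailing period is left over
```

Neither rule covers both "2." and "2.3" on its own, which is why the mixed numbering in the sample document needs a combined rule.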
BrianH: 16-May-2008	Look up recursive descent parsing, and take note of the difference between left recursion and right recursion.
Chris: 16-May-2008	Don't want to add too much, but with parse you can really build up a vocabulary based on the patterns you know: section: [integer! ["." | 1 4 ["." integer!]]] ; -- or whatever rule covers all permutations chars-sp: charset " " space: [some chars-sp] parse/all [copy sn section space [to newline | to end]] Vocabularies are easy to wrap in their own context too. Note also that [integer!] is a shorthand for [some digit] -- very useful : )
amacleod: 16-May-2008	Oldes, thanks for your suggestion. It works when I do a simple one line rule as you suggested but when I try to use multiple rules it fails. Example of what I'm trying to do: Example of the text document:
amacleod: 16-May-2008	3. CONSTRUCTION OF PORTABLE ALUMINUM LADDERS 3.1 Aluminum ladders are divided into two basic types of construction, viz:, solid beam and truss. 3.1.1 Solid Beam Aluminum Construction- This type of ladder has a solid side rail construction with aluminum rungs connecting with the side rails at fourteen inch intervals. The connection is generally either by a welded joint between rung and side rails, or by an expansion plug pinching the rung tightly to the side rails and internal backup plates. (Figure 2 A) 3.1.2 Aluminum Truss Construction- In the aluminum truss design, the top and bottom rails are connected to rung assemblies or rung blocks by rivets. The rungs are either welded or expansion plugged to the rung plate assemblies, which are supported by the top and bottom rails. (Figure 2B) 3.2 The base of the portable aluminum ladder is provided with either steel spikes or swiveling rubber safety shoes and aluminum spikes. For ladders equipped with the swiveling device, the rubber pads should be utilized when the ladder is to be raised and used on hard surfaces. (Figure 2A, 2B) 3. CONSTRUCTION OF PORTABLE ALUMINUM LADDERS
BrianH: 16-May-2008	Any reason that the headings with one number have a trailing period and the rest don't?
amacleod: 16-May-2008	BrianH, sorry Brian, the text above is just from a random and simpler section of the document. If I copied from the beginning, the first line would not have a number at all.
BrianH: 16-May-2008	But I made a mistake.
amacleod: 16-May-2008	This will give me a hit on any section or sub or sub sub? I may want to do something different depending on each. does this allow me to ?
BrianH: 16-May-2008	If you are making your decisions on a per-line basis, you might consider doing a read/lines and parsing each line individually, maintaining your own state to tell you where you are in the greater document. It's the only way to parse documents greater than memory in size.
Anton: 17-May-2008	BrianH, eh? read/lines would still try to read the whole document wouldn't it ? Or are you just suggesting that as a way which is then easily modified to allow larger than memory documents?
Chris: 17-May-2008	That would suck -- I use it. Seems like a common enough scenario....
BrianH: 19-May-2008	I mean you can do open/lines/direct and stream - then you would only need the memory for one line and a state machine.
Josh: 3-Jun-2008	I'm finally digging into parse now, but I have a question about HTML. Big idea: pulling the data out of an HTML table (made in Word--ugh!). Where I am stuck: Is there a way to create a rule for opening tags such as <tr> that include a lot of formatting: i.e. <tr style="mso........> ? I want to pull the info in between the opening and closing tags.
Josh: 3-Jun-2008	I came up with a rule: [some [thru "<td" thru ">" y: to "</td>" (a: remove-each tag load/markup y [tag? tag])]] but it seems to not be as efficient as it could be.
Geomol: 3-Jun-2008	Josh, if you do a load/markup on the whole string, you get a block with tags and strings. You can then pick the string from the block, maybe doing TRIM on them to sort out newlines and spaces. Like: blk: load/markup your-data foreach f blk [if all [string? f "" <> trim f] [print f]]
Chris: 3-Jun-2008	I've been toying with this to obtain a very parsable "dialect" -- my goal being to scrape live game updates from a certain sports web site (for personal use, natch). It's reliant on 'parse-xml though, so ymmv.... do http://www.ross-gill.com/r/scrape.r probe load-xml some-xml
Chris: 3-Jun-2008	Result is a little like: from -- <tag attr="attribute">Content</tag> to -- <tag> /attr attribute "Content"
Anton: 4-Jun-2008	Josh, using the REMOVE-EACH very often is what makes your parse slow. A remove operation in the middle of a large string is slow, and you are doing many removes. That's why the others suggested using copy.
Josh: 6-Jun-2008	Thanks for the input. I will have to play around with those later as I am trying to get this finished up and then I can go back and clean up the code. The data is minimal enough for the script to finish in under a second anyway. Parse is pretty sweet. Makes this much neater than the alternative
amacleod: 30-Jun-2008	I'm trying to copy some text from the position found while parsing a document. I'm using something like: rule: [some digit copy text to newline] (-- where "digit" has been defined as all digits 0 to 9) This copies everything after the digit. How would I copy the digit itself as well?
amacleod: 30-Jun-2008	Is there a difference between using "to" and "thru"
amacleod: 30-Jun-2008	No, I have a text document with section numbers in front: 2. Hello 2.1 Hello Again 2.1.1 Hello already 3. Goodbye I want the section number included in the copy
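One way to get the section number into the copy is to place COPY in front of a sub-rule that covers both the digits and the rest of the line. The following is a minimal REBOL 2 sketch, not necessarily the answer given in the group at the time; the sample strings and the words text, before, rest and after are made up for illustration. The last two lines also show the difference between TO (stop before the match) and THRU (stop after it).

```rebol
digit: charset "0123456789"    ; all digits 0 to 9, as in the question

; COPY applied to a block rule captures everything that block matched,
; so the leading section number is part of TEXT
parse/all "2.1.1 Hello already" [copy text [some digit to end]]
probe text      ; "2.1.1 Hello already"

; TO stops in front of the match, THRU moves past it
parse/all "a=b" [copy before to "=" copy rest to end]
probe before    ; "a"
probe rest      ; "=b"

parse/all "a=b" [thru "=" copy after to end]
probe after     ; "b"
```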
amacleod: 30-Jun-2008	Well, it gets a little more complicated. Some parts of the document will be multilined.
amacleod: 30-Jun-2008	I thought it would be a simple thing that I was missing. I may need to re-think the formatting of the document.
[unknown: 5]: 30-Jun-2008	Or do you mean a multiline might look something like this: 2.1 Hello Goodbye Where the second line doesn't have the preceding number?
[unknown: 5]: 30-Jun-2008	Ahhh yes that gets a bit more complicated.
amacleod: 30-Jun-2008	Let me briefly explain where I'm going, to see if you think it's workable or perhaps there is a better solution
amacleod: 30-Jun-2008	I'm trying to put a set of Fire department related materials online. They are now in pdf
amacleod: 30-Jun-2008	I want to hold each section in a separate database record
[unknown: 5]: 30-Jun-2008	Well, TRETBASE 1.0 is the only finished product right now. So the only available TRETBASE app is 1.0 which is really not a multi-user solution.
amacleod: 30-Jun-2008	I'm using mysql for the online component but I need a local storage method too for offline use
amacleod: 30-Jun-2008	What I would need is a simple method to sync them
amacleod: 18-Jul-2008	Is there a difference between a "space" and a "tab"? Can you parse for tab and not space?
Graham: 18-Jul-2008	I would think you would have to parse/all .. and a space is #" " and a tab is #"^-"
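Graham's point can be tried out with a tab-separated line whose fields contain spaces. This is a minimal REBOL 2 sketch; the sample line and the words key and value are made up for illustration. With the /all refinement, PARSE treats spaces as ordinary characters, so only the tab acts as the separator here.

```rebol
line: "first name^-Anna Smith"    ; "^-" is the tab escape inside a string

; copy up to the tab, step over it, then copy the rest
parse/all line [copy key to #"^-" skip copy value to end]

probe key      ; "first name"
probe value    ; "Anna Smith"
```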
btiffin: 21-Aug-2008	A long time ago, I offered to try a lecture. Don't feel worthy. So I thought I'd throw out a few (mis)understandings and have them corrected to build up a level of comfort that I wouldn't be leading a group of high potential rebols down a garden path. So; one of the critical mistakes in PARSE can be remembered as "so many", or a butchery of some [ any [ , so many. some asks for a truth among alternatives and any says "yep, got zero of the thing I was looking for", but doesn't consume anything. SOME says, great and then asks for a truth. ANY says "yep, got zero of the thing I was looking for", and still doesn't move, ready to answer yes to every question SOME can ask. An infinite PARSE loop. Aside: to protect against infinite loops always start a fresh PARSE block with [(); the "immediate block" of the paren! will allow for a keyboard escape, and not the more drastic Ctrl-C. So, I'd like to ask the audience; what other PARSE command sequences can cause infinite loops? end? and is it only "end", "to end" but "thru end" will alleviate that one? end end end end being true? >> parse "" [some [() end end end]] (escape) >> parse "" [some [() thru end end end]] == false >> parse "" [some [() to end end end]] (escape) >> Ok, but thru end is false. Is there an idiom to avoid looping on end, but still being true on the first hit? Other trip ups?
Henrik: 28-Sep-2008	parse [a] ['a] ;== true parse ['a] reduce [to-lit-word 'a] ; == false (why?)
Henrik: 28-Sep-2008	forget it. I was confused for a second, but is there a way to parse that 'a correctly? The same goes for get-word! and set-word!.
Henrik: 28-Sep-2008	I should clarify: I would like to parse a specific get-word!, lit-word! or set-word! as opposed to parsing on the type and then checking the value in some kind of action afterwards: parse ['a 'b 'c] ['a 'b 'c] ;== true (I know this is the wrong parser block, but it's something to that effect I would like to see)
Anton: 28-Sep-2008	If I remember correctly, this was a problem of parse (and may still be)...
Anton: 28-Sep-2008	You may have to use a workaround.
Geomol: 28-Sep-2008	If you can go with a reduced block, this can work: parse reduce ['a 'b 'c] ['a 'b 'c]
Henrik: 28-Sep-2008	what if there are set-words in it? I wanted to parse the content of an object, which can be a mixture of word types.
BrianH: 28-Sep-2008	In general that restriction of parse is part of an overall pattern in REBOL of encouraging you to use lit-words as lit-words rather than some other kind of datatype. Lit-words in REBOL are generally used to express literal expressions of words, rather than being used as a distinct datatype. In general you convert them to words before use.
BrianH: 28-Sep-2008	It's usually a bad idea to use lit-words as keywords - they make better values. If you are comparing to a particular lit-word value, that is using it as a keyword. If any lit-word value would do and their meaning is semantic rather than syntactic, that works. In general, PARSE is better for determining syntactic stuff - use the DO dialect code in the parens for semantic stuff.
BrianH: 28-Sep-2008	Not that I don't want a LIT or LITERAL directive in PARSE that would turn off the PARSE-dialect treatment of the next value in the spec.
Anton: 10-Oct-2008	term: [word! | into term] parse [a b [c]] [some term] ;== true parse [a b [c d]] [some term] ;== false
Anton: 10-Oct-2008	I'm a bit confused by that. I need to parse recursively.
Anton: 10-Oct-2008	terms: [some [word! | into terms]] parse [a b [c d]] terms ;== true
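Anton's results suggest that INTO only succeeds when its sub-rule consumes the whole nested block, so the repetition (SOME) has to sit inside the recursive rule rather than outside it. The following minimal REBOL 2 sketch extends his second version to collect the words it visits; the words list and the sample block are made up for illustration.

```rebol
words: copy []

; TERMS accepts any mix of words and nested blocks, at any depth,
; and records each word as it is matched
terms: [some [set w word! (append words w) | into terms]]

print parse [a b [c d [e]]] terms    ; true
probe words                           ; [a b c d e]
```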
Terry: 12-Oct-2008	blk: [aa "test" bb "two" cc "#block"] rules: [some [cc set cc string! ]] parse blk rules no go? I have a more complicated rule set that chokes on the "#block" string.. does it think it's an issue! ?
sqlab: 30-Oct-2008	Yes, this is an old bug. It does not work, if " is next to your delimiter. Insert a blank, and it works again.
Graham: 3-Nov-2008	This is a result of using parse-xml and some cleanup [document [soapenv:Envelope [soapenv:Body [ns1:getSpellingSuggestionsResponse [getSpellingSuggestionsReturn [getSpellingSuggestionsReturn "Penicillin G"] [getSpellingSuggestionsReturn "Penicillin V"] [getSpellingSuggestionsReturn "Penicillamine"] [getSpellingSuggestionsReturn "Polycillin"] ] ] ] ] ]
Graham: 3-Nov-2008	drugs: [set drugblock into [ 'getSpellingSuggestionsReturn set drugname string! ( print drugname) ]] parse a [ 'document set envelope into [ 'soapEnv:envelope set body into [ 'soapEnv:body set response into [ 'ns1:GetSpellingsuggestionsresponse set returns into ['getspellingsuggestionsreturn some drugs to end ]]]]] works but is very long winded
Gregg: 4-Nov-2008	It's not so bad Graham. And whether you can shorten things depends on how exact you need to be. rule: [ 'getspellingsuggestionsreturn some drugs | url! into rule ] parse a ['document into rule]
PeterWood: 4-Nov-2008	This is a bit shorter but recursive: pr: [any [ [set b block! (parse b pr)] | ['getSpellingSuggestionsReturn set s string! ( insert drug-names s ) | skip ] ] ]
Graham: 4-Nov-2008	the output I presented looks so close to being a rebol object .. and then I can use paths to access the data
PeterWood: 4-Nov-2008	Sorry about the formatting ... can't cut and paste in AltME on a Mac without reformatting.
PeterWood: 4-Nov-2008	If it's not fast enough you can speed it up by adding a rule to consume the unwanted parts.
PeterWood: 4-Nov-2008	gxs is a string of your xml listed above.
BrianH: 5-Nov-2008	So far we have been accepting proposals in these categories: - Recognition: LIT, NOT, OF, TO and THRU extensions - Modification: CHANGE, INSERT, REMOVE - Structural and control flow: FAIL (may not be the final name), USE, CHECK (still debate here), REVERSE There is still some debate even within these proposals (name of FAIL for example) and some of them might not make it. Some of the old PARSE REPs have been definitively rejected or changed, and some are still under debate and won't make it in without a lot more thought.
BrianH: 5-Nov-2008	These changes to PARSE are another example of changes to the R3 core happening as a side effect of the new GUI work :)
BrianH: 5-Nov-2008	Yup. We've been working on the Parse Project article a lot today. The last 2 things from the REP that might make it are the THROW and INTO-STRING proposals, though both will need some changes first. The rest are covered or rejected.
BrianH: 5-Nov-2008	Peter Wood's RETURN proposal is really interesting. I have been thinking about how to make a variant of it work.
Anton: 5-Nov-2008	I'd like to understand Peter Wood's START command a bit better. It's not clear to me from the example why it's needed. (or even how the example works..)
Anton: 5-Nov-2008	Peter's example, from the blog: parse [a b c d] [ any [ start (acc: 0) | set inc integer! (acc: acc + inc) | end ] ]
BrianH: 5-Nov-2008	Here's a working version of that example: parse [1 2 3 4] [ (acc: 0) any [set inc integer! (acc: acc + inc)] ]
BrianH: 5-Nov-2008	Perhaps he thought a paren could only follow a rule.
BrianH: 5-Nov-2008	I like the RETURN proposal as this: RETURN rule Match the rule and return a copy of the value from the PARSE function. Like COPY then BREAK, but without the temporary variable.
Anton: 5-Nov-2008	I vaguely remember suggesting PARSE dialect be extended into parens with a few commands. Parens are executed as normal rebol dialect (not parse dialected in any way). If I remember correctly, it was thought better to keep the parens 'pure' rebol. If that is to be maintained, then I think Peter's RETURN command ought to be morphed into a parse command, as you suggest above, Brian.
Anton: 5-Nov-2008	-- ie. that's a good idea.
BrianH: 5-Nov-2008	More importantly it will override the meaning of the RETURN function at a point where you would expect it to work.
PeterWood: 5-Nov-2008	Clearly my proposal for START is based on my ignorance and inability to search the documents properly :-) It wouldn't hurt as a form of self-documenting code, though.
BrianH: 5-Nov-2008	Actually, I think it would hurt (no offence). The word start is a common name for parse rules and every keyword we add can't be used as a parse rule name. Something to consider when making proposals.
Anton: 5-Nov-2008	Perhaps, Peter, you could post a withdrawal for START on the blog.
Chris: 5-Nov-2008	Other side of the coin, if 'end is a keyword, 'start is an intuitive companion.
BrianH: 5-Nov-2008	HEAD would be a better name for a directive to reset the position to the beginning of the data. That behavior would be more consistent with the series accessors :)
BrianH: 5-Nov-2008	It was an initialization proposal. Nonetheless, your HEAD? proposal sounds interesting. What problem are you solving that would need such a directive?
Pekr: 5-Nov-2008	Anton - but there is some point in time we should start to make rebol bigger by adding unnecessary things, or we will never reach 100MB executable size and the outer world might not consider us a relevant alternative :-)
Anton: 5-Nov-2008	One NOP keyword at a time :)
BrianH: 5-Nov-2008	In particular, it would return a copy, like the COPY directive, not the SET directive.
Chris: 5-Nov-2008	Like! Would that work for values? -- [to "<" copy a thru ">" "<" return a] ; - returns a if there is a < next?
Anton: 5-Nov-2008	What would you do when you need to process the data a bit first ? eg. You return tags from different places in a rule, and to distinguish them you need to also return something extra, by prepending a code to the beginning, for example.
BrianH: 5-Nov-2008	Carl was kinda weirded out by the modifying operations, but I pointed out that people do this anyway and get it wrong a lot.
BrianH: 5-Nov-2008	Everything in that Parse Proposals page has already been discussed with Carl and could go in, barring insurmountable problems with implementation. I stopped putting stuff in when he stopped working for the day. There will likely be a couple going in tonight, but Carl is actively involved in this process.
BrianH: 5-Nov-2008	The main thing that Carl is concerned about now is that some of the proposals make use of the value calculated in a paren on occasion. I don't know why this would be a problem, but I'm sure it will be worked out or around.
Chris: 5-Nov-2008	Using 'remove -- a) removing a bracket only at the end of a string (as per Graham's example): parse "[this]" [remove "[" to "]" remove ["]" end]] b) where you go down a false path: parse "abcdef123" [remove "abc" "123" | remove "abcd" "ef123"]
Chris: 5-Nov-2008	Would a) work? Would b) reset the string as the first rule didn't match?
BrianH: 5-Nov-2008	a) would work. b) would not likely reset the string, just like code blocks don't undo.
BrianH: 6-Nov-2008	You might be able to do b) like this: parse "abcdef123" [use [a] [remove ["abc" a: "123" :a] | remove ["abcd" a: "ef123" :a] to end]] or like this: parse "abcdef123" [use [a] [remove ["abc" a: "123" :a | "abcd" a: "ef123" :a] to end]]
Chris: 6-Nov-2008	How about this? parse "abc" ["a" to end reverse "bc"]
