AltME groups: search

world-name: r3wp

Group: Parse ... Discussion of PARSE dialect [web-public]
Gregg: 28-Apr-2006	I think that's where string parsing comes in, and where having rules for REBOL datatypes would ease the pain.
Volker: 1-May-2006	How about another way: integrate datatypes in string-parser. Basically a load/next and check for type. Then we can write (note i parse a string): parse "1 a , #2" [ integer! word! "," issue! ]
Ashley: 24-May-2006	Quick question for the parse experts. If I have a parse rule that looks like this: parse spec [ any [ set arg string! (...) | set arg tuple! (...) | ... ] ] How would I add a rule like: set arg paren! (reduce arg) that always occurred prior to the any rule and evaluated parenthesized expressions (i.e. I want parenthesized expressions to be reduced to a REBOL value that can be handled by the remainder of the parse rule).
Graham: 27-Jun-2006	My brain is still asleep. How to go thru a document and add <strong> </strong> around every word that is in capitals and is more than a few characters long?
Graham: 27-Jun-2006	pattern search on capitals, mark, copy to space, mark, count length of copy, if long, insert at mark2, and then at mark1, continue ??
Graham: 27-Jun-2006	that won't work because file is just text and not a block.
Graham: 27-Jun-2006	search for a series of capitalised words and strong them
Graham: 27-Jun-2006	Actually I would like to add a parse problem to the weeklyblog and get people to submit answers :)
Graham: 27-Jun-2006	And give a prize for the shortest answer
Graham: 27-Jun-2006	shortest .. I mean the least number of words, and operators - not in length
BrianH: 27-Jun-2006	Seriously though, three charsets and two temporary variables, there's got to be a more efficient way.
Volker: 27-Jun-2006	and "g" is none
Volker: 27-Jun-2006	because " a: 5 capitals any capitals b:" stops at "g" and friends.
BrianH: 27-Jun-2006	More importantly, it fails at "g" and friends, backtracks and proceeds to the next alternate action, some alpha.
BrianH: 27-Jun-2006	No, that would break out of the enclosing all loop. The end skip will always fail and proceed to the next alternate.
BrianH: 28-Jun-2006	Of course mine doesn't handle words with apostrophes or hyphens in them either. Easy fix though, just add ' and - to the capitals charset.
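For readers following Graham's problem without the original PARSE rules at hand, here is a minimal Python sketch of the same task. It is an illustration only, not the solution posted in the thread: the function name `strong_caps` is made up, and the threshold of five capitals is an assumption taken from the "5 capitals any capitals" fragment quoted above.

```python
import re

# Assumption: "more than a few characters" means at least 5 capitals,
# mirroring the "5 capitals any capitals" rule quoted in the thread.
# To allow apostrophes or hyphens, the character class could be widened
# (for example [A-Z'-]), as BrianH suggests for his charset.
CAPS_WORD = re.compile(r"(?<![A-Za-z])[A-Z]{5,}(?![A-Za-z])")


def strong_caps(text):
    """Wrap every all-capitals word of 5+ letters in <strong> tags."""
    return CAPS_WORD.sub(lambda m: "<strong>" + m.group(0) + "</strong>", text)


print(strong_caps("see the WARNING and the NOTE here"))
```

The lookbehind and lookahead keep a run of capitals inside a mixed-case word (such as "HEADINGs") from being wrapped.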
BrianH: 28-Jun-2006	And do what?
Graham: 28-Jun-2006	A person is writing a text file. It has headings which are denoted by caps, and terminating in ":".
BrianH: 29-Jun-2006	To use the simpler of the CS terms: Parse is a rule-based, recursive-descent string and structure parser with backtracking. It is not a parser generator (like Lex/Yacc) or compiler (like most regex engines) - the engine follows the rules directly. Since Parse is recursive-descent it can handle patterns that regular expressions wouldn't be able to. Since Parse backtracks it can handle patterns that ordinary recursive-descent parsers can't. Basically, it puts the text and structure processing abilities of Perl 5 to shame, let alone those of the lesser regex engines. In theory, Perl 6 has caught up with REBOL, but Perl 6 only exists in theory for now. By the time it becomes actual REBOL should surpass it (especially if I have anything to say about it).
BrianH: 29-Jun-2006	It's pretty easy to demonstrate patterns that regular expressions can't handle. It's only somewhat difficult to demonstrate patterns that can't be handled by a recursive descent parser without backtracking or unlimited lookahead. I have never run into a pattern that can't be handled by Parse in theory - its only limits are in implementation (available memory and recursion depth). I am not qualified to describe its limits. Still, you have to be careful about how you write the rules or they will trip you up.
BrianH: 29-Jun-2006	Volker, it's more like it can do what a compiler-compiler can do without needing to compile :) And backtracking is about the same as unlimited lookahead, but more powerful.
[unknown: 9]: 29-Jun-2006	Thanks Brian, but as is the theme with questions I ask, I don't ask for myself, but rather so that the "world" can learn what "we" know. So perhaps you should add your 2 cents to Henrik's and Tom's in a public forum of the Wikibook.
Volker: 29-Jun-2006	the compiling is no big argument as compiler-compilers are for compiled languages anyway ;) the point is, you can mix a grammar and actions for semantics easy.
BrianH: 29-Jun-2006	Reichart, I figured as much (hence the "dry" comment). I'll look over the Wikibook and see if I can help.
Volker: 29-Jun-2006	i guess that depends on the coco. the point is, a bnf by default, and code inside the rules, instead of putting things in vars and process later. IMHO.
BrianH: 29-Jun-2006	Volker, I've used a lot of compiler-compilers before and reviewed many more, and unlimited lookahead or backtracking are rare.
Volker: 29-Jun-2006	then the advantages of parse are being like a compiler-compiler and having unlimited lookahead etc?
Volker: 29-Jun-2006	and you can use parse to tokenize first?
BrianH: 29-Jun-2006	Two rounds of parsing, one for tokenizing and one to parse? Interesting. That would work if you don't have control over the source syntax - otherwise load works pretty well for simple languages.
Volker: 29-Jun-2006	Thats where i got the idea: tokenize first and use block-parser :)
BrianH: 29-Jun-2006	My next personal project is to go through the XML/XSL/REST specs and create exactly that. I already have an efficient structure, I just need to fill out the semantics to support the complete logical model of XML.
BrianH: 29-Jun-2006	Still, "run away" is a common and sensible reaction to XML.
Gordon: 29-Jun-2006	I'm a bit stuck because this parse stops after the first iteration. Can anyone give me a hint as to why it stops after one line? Here is some code: data: read to-file Readfile print length? data 224921 d: parse/all data [thru QuoteStr copy Note to QuoteStr thru QuoteStr thru quotestr copy Category to QuoteStr thru QuoteStr thru quotestr copy Flag to QuoteStr thru newline (print index? data)] 1 == false Data contains hundreds of "memos" in a csv file with three fields: Memo, Category and Flag ("0"|"1"); all fields are enclosed in quotes and separated by commas. It would be real simple if the Memo field didn't contain double quoted words; then parse data none would even work; but alas many memos contain other "words". It would even be simple if the memos didn't contain commas, then parse data "," or parse/all data "," would work; but alas many memos contain commas in the body.
Gordon: 29-Jun-2006	I'm pretty sure that you are right in that I have to loop through the "Data". That was my big stumbling block and the rest is just logic to figure out. Thanks a bunch.
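The record layout Gordon describes (three quoted fields per line, with stray quotes and commas allowed only in the first field) can be sketched in Python as a loop over lines. This is an illustration under stated assumptions, not the rule worked out in the thread: it assumes one record per line, that Category never contains a double quote, and that Flag is exactly "0" or "1". The names `RECORD` and `parse_memos` are hypothetical.

```python
import re

# Anchor on the two simple trailing fields and let the memo take
# whatever is left, including embedded quotes and commas.
RECORD = re.compile(r'^"(?P<memo>.*)","(?P<category>[^"]*)","(?P<flag>[01])"$')


def parse_memos(data):
    """Return (memo, category, flag) tuples, one per well-formed line."""
    rows = []
    for line in data.splitlines():  # the explicit loop the single rule lacked
        m = RECORD.match(line)
        if m:
            rows.append((m.group("memo"), m.group("category"), m.group("flag")))
    return rows


sample = '"Call "Bob", then mail","work","1"\n"plain","home","0"'
print(parse_memos(sample))
```

Lines that do not fit the pattern are skipped silently here; a real importer would probably want to report them.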
Izkata: 29-Jun-2006	Not sure - I remember seeing it in others' parse rules, so I just put it there and it worked '^^ Take it out and see what happens lol
Gordon: 29-Jun-2006	Hi BrianH; Yes I did try that and the problem was that even though I specified the "," as the delimiter, it came across an embedded quote #"^"" and split the input at the quote. REBOL shouldn't have split it up that way, to my understanding. I will post some simple data to test.
Gordon: 29-Jun-2006	Tomc: Do I understand that :word would be like "get word" and needed in a parse sentence but you can just use the shortcut 'word' most everywhere else?
BrianH: 29-Jun-2006	The colon before the word prevents the interpreter from evaluating active values like functions and parens. It's a safety thing.
Tomc: 29-Jun-2006	and that would be get 'word not get word
Gordon: 29-Jun-2006	Thanks Tomc and BrianH. I'll chew on it for a while. Meanwhile I'm working on building some test data for the first problem.
BrianH: 30-Jun-2006	That's interesting. Parens and paths used to be active - oh yeah, that was changed a while ago. Still, there are some value types that are active (function types, lit-path, lit-word) and if you think you will get one of these you should disable their evaluation by referencing them with a get-word, unless you intend them to be evaluated.
Anton: 30-Jun-2006	both parens and paths changed between View 1.2.1 and 1.2.5, actually.
Gordon: 30-Jun-2006	DideC: Thanks. I've copied and pasted it for review and added it to my local public library. This script should be useful especially with the html help page. Documentation on a script is very rare and much appreciated. Graham: Did a search using "librarian" and search term of "sql cvs" and didn't come up with anything. Although, I think we've got it covered now anyhow.
Graham: 1-Jul-2006	What I was trying to do above is to look for the macro text preceded by a space or newline, and ending in a space or newline.
Graham: 1-Jul-2006	and then replace in situ.
Tomc: 1-Jul-2006	are the macros and expansions single tokens?
BrianH: 1-Jul-2006	HTML/XML entities begin with & and end with ; for just this reason. What kind of text? can you give us an example?
Graham: 1-Jul-2006	Heart: Heart regular rate and rhythm, no rubs, murmurs, or gallops noted. A: Abdomen: soft, nontender, no mass, no hernia, no guarding, no rebound tenderness, no ascites, non obese Hbp Hypertension (high blood pressure) #401.9. Ii Type II Diabetes #250.00
Graham: 1-Jul-2006	I would have to intercept the keyboard handler to do this .. so I want to try and just do the replacement after he's finished typing.
Tomc: 1-Jul-2006	hmm I am in the business of sharing biological information and I got to say please strongly consider creating an ontology if one does not exist already
Graham: 1-Jul-2006	and there's the proprietary MEDCIN
Tomc: 1-Jul-2006	and the macros should also be part of that ontology
Graham: 1-Jul-2006	actually the file will be saved in a database and loaded when the program starts
Graham: 1-Jul-2006	and the ontologically controlled programs are very expensive due to licensing fees
Graham: 1-Jul-2006	The AMA charge to use their codes, the College of American Pathologists charge to use their SNOMED-CT codes .. and so it goes on.
Tomc: 1-Jul-2006	that the macro-expansion file needs to self check for incidental occurrences of a "macro" in an "expansion" and protect against
BrianH: 1-Jul-2006	And it won't have the problem you mention Tomc.
Graham: 1-Jul-2006	so, basically you created a single parse rule from the macro list and then parsed the text in one go.
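The approach Graham summarises here (one combined rule built from the macro list, applied in a single pass) can be sketched in Python as follows. This is an illustration, not BrianH's PARSE code: the function name `expand_macros` is made up, and treating the start and end of the text as delimiters alongside spaces and newlines is an assumption. The two sample macros are taken from Graham's example text above.

```python
import re


def expand_macros(text, macros):
    """Expand whitespace-delimited macro names in one pass over the text."""
    if not macros:
        return text
    # Longest names first, so a longer macro wins over one that is its prefix.
    names = sorted(macros, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # (?<!\S) and (?!\S): the macro must be bounded by whitespace or text edges.
    pattern = re.compile(r"(?<!\S)(" + alternatives + r")(?!\S)")
    return pattern.sub(lambda m: macros[m.group(1)], text)


macros = {
    "Hbp": "Hypertension (high blood pressure) #401.9",
    "Ii": "Type II Diabetes #250.00",
}
print(expand_macros("Dx: Hbp and Ii today", macros))
```

Because the substitution runs once and never rescans its own output, a macro name that happens to appear inside an expansion is left alone, which is the problem Tomc raised.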
Tomc: 1-Jul-2006	I am glad to see someone else using here and there ;)
BrianH: 1-Jul-2006	Gabriele and I have worked extensively on such submissions.
Graham: 1-Jul-2006	Yes. So, somehow I need to force the area field to recognise them as newlines and reformat the screen.
Henrik: 9-Jul-2006	how "local" are variables that are set during a parse? I was looking at Geomol's postscript.r and looked at: coords: [err: (pos: none) [set pos pair! | set x number! set y number!] ( either pos [ append output compose [(pos/x) " " (pos/y) " lineto^/"] ][ append output compose [(x) " " (y) " lineto^/"] ] ) ]
Henrik: 9-Jul-2006	yes, I try to print the variable and it just returns none.
Henrik: 9-Jul-2006	actually, there is a difference between my code and this, which may be causing it: I need to loop the block with 'any. I suspect the contents is lost after the first run.
Anton: 9-Jul-2006	And to answer your question, the variables are just regular rebol words, so they are as local as you make them.
Oldes: 9-Jul-2006	and how looks the code you parse?
Oldes: 9-Jul-2006	if the parse is inside function and you set pos in the function as a local - it will be local
Henrik: 9-Jul-2006	where 'image is always first and the remaining items may come in random order
Oldes: 9-Jul-2006	there is no pair and no numbers - the pos must be none
Oldes: 9-Jul-2006	and what exactly do you want?
Henrik: 9-Jul-2006	it works the exact opposite :-) Only the outer 'txt is set, and I can't reach the variable inside the block
Henrik: 9-Jul-2006	the brackets would make it a "real" rule, wouldn't it? it would be possible to replace the rule with a variable and have the rule block placed elsewhere in your code
Henrik: 9-Jul-2006	and it makes the parse scalable, so I can add options later
Pekr: 19-Jul-2006	I tried doing myself small template "engine", which will simply look-up for some marks, and replace values accordingly. I decided to look for the end of the marks and my friend suggested me, that I should name even ending marks, to be clear there is not an error. My parse looks like this:
Pekr: 19-Jul-2006	I now can create simply a func, which will accept mark name, and do some code-block accordingly - sql query, simple replace of value, whatever (well, it will not work for cases like img tags, so it is not as flexible as full html parser in temple for e.g., but hey, it is meant being simple)
Pekr: 19-Jul-2006	... but should not be simpler, so I wonder - so far, as you can see, mark-x is not finished, so it is ignored. How to catch this case properly and eventually generate error, send email, write to log, whatever?
Pekr: 19-Jul-2006	Maarten - now looking into build-markup - sorry, it is just a strange way of doing things .... no one will place rebol code into template, that will not work ... btw - the code is 'done? What happens if someone uploads template with its own code? I want presentation and code separation.
Pekr: 19-Jul-2006	I looked into rsp some time ago, and I liked it, especially as it was complete, with session support etc., but later on I found shlik.org being unavailable ...
BrianH: 31-Aug-2006	Hey, locals and arguments (practically the same thing in REBOL) are the most important difference between closures and plain blocks. The difference is significant but Peters' background with Smalltalk made him miss it - Smalltalk "blocks" look like REBOL blocks but act like functions.
Volker: 31-Aug-2006	No, the main point is, easy definitions of code and referencing the original context. Rebol-blocks do that.
Volker: 31-Aug-2006	The highlights he mentions are: lexically-scoped, code and data, freely mix computations in
Volker: 31-Aug-2006	That scoping is the difference between a closure and doing a "string" here.
BrianH: 31-Aug-2006	REBOL blocks don't reference a context, but they may contain words that reference a context. Still, this distinction makes no difference to the argument that Peters was making - REBOL text processing is more powerful than regex and easier to use. It would be easier to replicate REBOL-style parsing in Python using closures and generators anyway (Peters' real subject), since that is the closest Python gets to Icon-style backtracking.
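BrianH's remark about closures and generators can be made concrete with a small Python sketch. Each matcher below is a closure that takes a text and a position and yields every position at which it could stop; asking a generator for its next value is what backtracking amounts to. The combinator names (`lit`, `seq`, `alt`, `many`, `matches`) are invented for this illustration, and the sketch shows Icon-style backtracking in general rather than the exact semantics of REBOL's PARSE.

```python
def lit(s):
    """Match the literal string s."""
    def parser(text, pos):
        if text.startswith(s, pos):
            yield pos + len(s)
    return parser


def seq(*parsers):
    """Match each parser in turn, retrying earlier ones when a later one fails."""
    def parser(text, pos):
        if not parsers:
            yield pos
            return
        first, rest = parsers[0], seq(*parsers[1:])
        for mid in first(text, pos):
            yield from rest(text, mid)
    return parser


def alt(*parsers):
    """Try each alternative in order."""
    def parser(text, pos):
        for p in parsers:
            yield from p(text, pos)
    return parser


def many(p):
    """Zero or more repetitions: longest match first, shorter ones on demand."""
    def parser(text, pos):
        for mid in p(text, pos):
            if mid > pos:  # guard against looping on empty matches
                yield from parser(text, mid)
        yield pos
    return parser


def matches(parser, text):
    """True if some way of applying parser consumes the whole text."""
    return any(end == len(text) for end in parser(text, 0))


rule = seq(many(lit("a")), lit("ab"))
print(matches(rule, "aaab"))
```

In the usage line, `many(lit("a"))` first consumes all three "a" characters, the following `lit("ab")` fails, and the generator then offers the two-character match so the rule can succeed.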
Geomol: 25-Sep-2006	I would like the functionality, when parsing things like TeX. There the greek letter gamma is called gamma, and the same in capital is called Gamma. Now I have to invent the word capgamma or something.
Gregg: 25-Sep-2006	If it were a safe and easy thing to change, I can see some value in it as an option but, since words--and REBOL--are case insensitive, I'm inclined to live with things as they are, and use string parsing if case sensitivity is needed. I think it's Oldes or Rebolek that sometimes requests the ability to parse non-loadable strings, using percentage values as an example. I think loading percentages would be awesome, but then there are other values we might want to load as well; where do you draw the line? I'm waiting to see what R3 holds with custom datatypes and such.
Gregg: 25-Sep-2006	And didn't you suggest that values throwing errors could be coerced to string! or another type? e.g. add an /any refinement to load, and any value in the string that can't be loaded would become a string (or maybe you could say you want them to be tags for easy identification).
Oldes: 25-Sep-2006	I think, load/next can be used to handle invalid datatypes now: >> b: {1 2 3 'x' ,} == "1 2 3 'x' ," >> while [v: load/next b not empty? second v][probe v b: v/2] [1 " 2 3 'x' ,"] [2 " 3 'x' ,"] [3 " 'x' ,"] ['x' " ,"] Syntax Error: Invalid word -- , Near: (line 1) , Just add some handler to convert the invalid datatype to something else that is loadable and then parse as a block
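The fallback Gregg and Oldes are discussing (load what can be loaded, and turn anything else into a string) looks roughly like this in Python. It is only an analogy to `load/next` with an error handler: the function name `load_any` is hypothetical, tokens are split on whitespace, and integers stand in as the only "loadable" type.

```python
def load_any(text):
    """Load each whitespace-separated token, keeping unloadable ones as strings."""
    values = []
    for token in text.split():
        try:
            values.append(int(token))  # assumption: integers are the only loadable type
        except ValueError:
            values.append(token)  # the "coerce to string!" fallback
    return values


print(load_any("1 2 3 'x' ,"))
```

The resulting list can then be processed as structured values, which corresponds to the "parse as a block" step in Oldes' message.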
Geomol: 25-Sep-2006	Gabriele, yes it works with strings. But I have words! Thing is, I parse the string input from the user and produce words in an internal format. Then I parse those words for the final output, which can be different formats. I would expect parse/case to be case-sensitive, when parsing words, but parse/case is only for strings, therefore my suggestion.
Oldes: 26-Sep-2006	And is there some parse example of how to deal with recursion while parsing strings? If you parse a block, it's easy to detect what is a string! and what is another type, but if you need to parse a string, it's not so easy to detect, for example, strings like {some text {other "text"}}
Anton: 27-Sep-2006	Here's an idea to toss into the mix: I am thinking of a new notation for strings using underscore (eg. _"hello"_ ) in a parse block, which allows to specify whether they are delimited by whitespace or not. This would allow you to enable/disable the necessity for delimiters per-string. eg: parse input [ _"house"_ ; a complete word surrounded both sides by whitespace _"hous" ; this would match "house", "housing", "housed" or even "housopoly" etc.. but left side must be whitespace "ad"_ ; this would match "ad", "fad", "glad" and right side must be whitespace ] But this would need string datatype to change. On the other hand, I could just set underscore _ to a charset of whitespace, then use that with parse/all eg: _: charset " ^-^/" parse/all input [ [ _ "house" _ ] ] though that wouldn't be as comfortable. Maybe I can create parse rules from a simpler dialect which understands the underscore _. Just an idea...
MikeL: 27-Sep-2006	Anton, Andrew had defined white space patterns in his patterns.r script which seems usable; then you can use [ ws* "house" ws] or other combinations as needed without underscore. Andrew's solution for this and a lot of other things has given me some good mileage over the past few years. WS*: [some WS] and WS?: [any WS]. It makes for clean, clear parse scripts once you adopt it.
Gregg: 27-Sep-2006	I think either approach above can work well. I like the "look" of the underscore, and have done similar things with standard function names. For SOME, ANY, and OPT, the tag chars I prefer are +, *, and ? respectively; which are EBNF standard.
Anton: 27-Sep-2006	Oh yes, I've seen Andrew's patterns.r. I was just musing how to make it more concise without even using a short word like WS. Actually the use case which sparked this idea was more of a "regex-level" pattern matcher, just a simple pattern matcher where the user writes the pattern to match filenames and to match strings appearing in file contents.
Gregg: 28-Sep-2006	I also have a naming convention I've been playing with for a while, where parse rule words have an "=" at the end (e.g. date=) and parse variables--values set during the parse process--have it at the beginning (e.g. =date). The idea is that it's sort of a cross between BNF syntax for production rules and set-word/get-word syntax; the goal being to easily distinguish parse-related words. By using the same word for a rule and an associated variable, with the equal sign at the tail or head, respectively, it also makes it easier to keep track of what gets set where, when you have a lot of rules.
Maxim: 28-Sep-2006	simple and clean, good idea!
Maxim: 28-Sep-2006	so many years of reboling (since core 1.2) , and still parse remains largely untamed by myself.
Izkata: 3-Oct-2006	That's a ~very~ good example, Oldes... it should be put in the docs somewhere (if it isn't already.) I didn't understand how get-words and set-words worked in parse, either, before..
Rebolek: 4-Oct-2006	I've got the following PARSE problem: I've got a string - "<good tag><bad tag><other tag><good tag>" and I want to keep "good tag" and change "<>" in other tags to, let's say, "X" (I need to change it to HTML entities but that doesn't matter now). So the result will look like: "<good tag>Xbad tagXXother tagX<good tag>" I've been working on it for the last few hours but still haven't found a solution. Is there any?
Rebolek: 4-Oct-2006	I'll probably replace everything and then just revert the "good tag" back. It's not very elegant, but...
Anton: 4-Oct-2006	&lt;, and &gt; ?
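Rebolek's problem can also be solved in one pass, without the replace-everything-then-revert fallback. The following Python sketch is an illustration, not an answer from the thread: the function name `neutralize_tags` and its parameters are made up, and it assumes tags never nest and that a tag counts as "good" only when its full inner text is in the allowed set.

```python
import re

# One tag: "<", any run of characters that are not angle brackets, ">".
TAG = re.compile(r"<([^<>]*)>")


def neutralize_tags(text, good_tags, open_repl="X", close_repl="X"):
    """Keep allowed tags intact; replace the brackets of every other tag."""
    def fix(match):
        inner = match.group(1)
        if inner in good_tags:
            return match.group(0)  # good tag: leave untouched
        return open_repl + inner + close_repl

    return TAG.sub(fix, text)


source = "<good tag><bad tag><other tag><good tag>"
print(neutralize_tags(source, {"good tag"}))
```

Passing `open_repl="&lt;"` and `close_repl="&gt;"` gives the HTML-entity version Rebolek says he ultimately needs.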
