World: r3wp

[Parse] Discussion of PARSE dialect

Gregg 28-Apr-2006 [935]	space/spc whitespace/wsp alpha digit(s) alpha-num ; should digit be num? ctl/control non-US-ASCII/high-ASCII quoted-string escaped-char ; what is the escape though; REBOL ^, C \, etc.? What other standard sets would we want?
Sunanda 28-Apr-2006 [936]	(I was sure I'd posted this just after Oldes' message.....But it ain't there now.....Maybe it's in the wrong group) Andrew has a nice starter set: http://www.rebol.org/cgi-bin/cgiwrap/rebol/view-script.r?script=common-parse-values.r And I know he has extended that list considerably to include things like email address and URL.
Gregg 28-Apr-2006 [937x2]	It would be great (again, IMO), if we had parse rules for REBOL datatypes. For those that want the power of block parsing, with the ability to load strings that aren't valid REBOL, it would be very handy.
Gregg 28-Apr-2006 [937x2]	Good starter set! I forgot about that. Thanks Sunanda.
Graham 28-Apr-2006 [939x2]	the problem I find with block parsing is the rigid interpretation of datatypes.
Graham 28-Apr-2006 [939x2]	So, if Rebol gets the datatype wrong ( and real world data is dirty ), you're screwed.
Gregg 28-Apr-2006 [941]	That's the tradeoff. :\
Graham 28-Apr-2006 [942x3]	real world data is dirty ..
	Maybe there should be no invalid datatypes .... everything can be converted to a datatype
	if the parser thinks a datatype is invalid, well, let's call it an invalid! datatype!!
Gregg 28-Apr-2006 [945]	I think that's where string parsing comes in, and where having rules for REBOL datatypes would ease the pain.
Graham 28-Apr-2006 [946x3]	I do screen validation by datatypes ( for data input ). If the user enters an invalid datatype ... ..
	anyway, I think rebol should recognise all data ..
	have a catchall for stuff it thinks is wrong
Oldes 30-Apr-2006 [949x2]	I agree with you Graham, I have mentioned this many times, that there could be something to handle datatype exceptions
Oldes 30-Apr-2006 [949x2]	About the spaces charset - most people do not know that we have one more space char - the non-breaking space: >> to-char 160 <<
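A whitespace charset that also covers the non-breaking space could look roughly like this. It is only a sketch assuming REBOL 2; the names `nbsp`, `wsp` and `sample` are made up here and are not part of any agreed standard set.

```rebol
REBOL []

nbsp: to-char 160    ; the non-breaking space mentioned above

; wsp is a hypothetical name: space, tab, newline, carriage return, nbsp
wsp: charset reduce [#" " #"^-" #"^/" #"^M" nbsp]

; A string with a non-breaking space between the two words
sample: rejoin ["foo" nbsp "bar"]

; parse/all so that parse does not skip whitespace on its own
print parse/all sample ["foo" some wsp "bar"]
```

With a charset that leaves out `nbsp`, the same rule would fail on `sample`, which is the kind of surprise the message warns about.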
Volker 1-May-2006 [951x2]	How about another way: integrate datatypes in string-parser. Basically a load/next and check for type. Then we can write (note I parse a string): parse "1 a , #2" [ integer! word! "," issue! ]
Volker 1-May-2006 [951x2]	'invalid! has a problem: it's easy to recognize where the wrong part starts, but harder to recognize where the wrong part ends.
Oldes 1-May-2006 [953x2]	Is there any RTF (Rich Text Format) parser for Rebol?
Oldes 1-May-2006 [953x2]	hm, maybe this one: http://www.codeconscious.com/rebol/scripts/rtf-tools.r :-)
Ashley 24-May-2006 [955]	Quick question for the parse experts. If I have a parse rule that looks like this: parse spec [ any [ set arg string! (...) | set arg tuple! (...) | ... ] ] How would I add a rule like: set arg paren! (reduce arg) that always occurred prior to the any rule and evaluated parenthesized expressions (i.e. I want parenthesized expressions to be reduced to a REBOL value that can be handled by the remainder of the parse rule).
Tomc 25-May-2006 [956]	I only parse strings not blocks so this may be completely off but I would try parse spec [ any [ opt [here: set arg paren! (change :here reduce arg) :here] [ set arg string! (...) | set arg tuple! (...) | ... ] ] ]
Anton 25-May-2006 [957]	(here/1: do arg)
Ashley 25-May-2006 [958]	Thanks both, works a treat.
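Put together, the two suggestions above come out roughly as follows. This is a sketch assuming REBOL 2 block parsing; the sample `spec` block and the `print` actions are invented for illustration and stand in for the elided `(...)` actions.

```rebol
REBOL []

; Hypothetical input: the paren! should be evaluated before the
; type rules get to see it.
spec: ["label" (10 + 5) 1.2.3]

parse spec [
    any [
        ; If the next value is a paren!, overwrite it in place with its
        ; evaluated result, then back up so the rules below see the
        ; new value instead of the paren!.
        opt [here: set arg paren! (here/1: do arg) :here]
        [
              set arg string!  (print ["string:" arg])
            | set arg integer! (print ["integer:" arg])
            | set arg tuple!   (print ["tuple:" arg])
        ]
    ]
]
```

Note that this modifies `spec` itself: after the parse the paren! has been replaced by its value, so work on a copy if the original block needs to stay intact.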
Graham 27-Jun-2006 [959]	My brain is still asleep. How to go thru a document and add <strong> </strong> around every word that is in capitals and is more than a few characters long?
Pekr 27-Jun-2006 [960x3]	hmm, quite a challenge ...
	somehow to look-up words, mark: before, find its end (another space), check for if first is capital or not, change at position, :mark at end ...
	but don't ask me for code, it would take a few hours to get somewhere, if at all :-)
Graham 27-Jun-2006 [963]	pattern search on capitals, mark, copy to space, mark, count length of copy, if long, insert at mark2, and then at mark1, continue ??
Gordon 27-Jun-2006 [964]	I agree - a bit much to ask. A more specific question would get a more specific answer :) Something like:
	file: read filename2parse
	newfile: ""
	foreach word file [
	    if Is-Capitals word [
	        newfile: join newfile ["<strong> " word " </strong> "]
	    ]
	]
	The Is-Capitals function would have to be defined:
	Is-Capitals: func [Word2Check] [ some code here ]
Graham 27-Jun-2006 [965x2]	that won't work because file is just text and not a block.
Graham 27-Jun-2006 [965x2]	but my brain is gradually waking up now ... all I need to do is get dressed!
Pekr 27-Jun-2006 [967]	:-)
Volker 27-Jun-2006 [968]	; thinking aloud:
	capitals: charset [#"A" - #"Z"]
	capital: [5 capitals any capitals]
Henrik 27-Jun-2006 [969]	can you do this in one pass?
Gordon 27-Jun-2006 [970]	Yes. Newfile would have to be "parsed" into words, something like: Newfile: parse file none or file: parse file {separator character}
Graham 27-Jun-2006 [971x3]	trouble is, parse doesn't only just parse on " " if specified ...
	so, you might lose other characters.
	I think this can be done in one pass.
Pekr 27-Jun-2006 [974]	I would not rely on parse helpers such as parse string delimiter, but use full parse/all if you need a precise result ...
BrianH 27-Jun-2006 [975]	Yes, give me a minute...
JaimeVargas 27-Jun-2006 [976x2]	capitalize-word: func [
	    s [string!] /local len
	][
	    either 5 < len: length? s [
	        s: rejoin ["<strong>" uppercase s/1 next s </strong>]
	    ][
	        s
	    ]
	]
	capitalize-text: func [
	    s [string!] /local result word-rule alpha non-alpha w c
	][
	    result: copy {}
	    alpha: charset [#"A" - #"Z" #"a" - #"z"]
	    non-alpha: complement alpha
	    word-rule: [copy w [some alpha] (insert tail result capitalize-word w)]
	    other-rule: [copy c non-alpha (insert tail result c)]
	    parse/all s [some [word-rule | other-rule] end]
	    result
	]
JaimeVargas 27-Jun-2006 [976x2]	>> capitalize-text {The result changes according to formating.}
	; == {The <strong>Result</strong> <strong>Changes</strong> <strong>According</strong> to <strong>Formating</strong>.}
Graham 27-Jun-2006 [978x2]	Not quite the problem I was stating!
Graham 27-Jun-2006 [978x2]	search for a series of capitalised words and strong them
JaimeVargas 27-Jun-2006 [980]	Ah. Very easy modification.
Graham 27-Jun-2006 [981x2]	bolden-word: func [
	    s [string!] /local len
	][
	    either 5 < len: length? s [
	        s: rejoin ["<strong>" s </strong>]
	    ][
	        s
	    ]
	]
	enhance-text: func [
	    s [string!] /local result word-rule alpha non-alpha w c
	][
	    result: copy {}
	    alpha: charset [#"A" - #"Z"]
	    non-alpha: complement alpha
	    word-rule: [copy w [some alpha] (insert tail result bolden-word w)]
	    other-rule: [copy c non-alpha (insert tail result c)]
	    parse/all s [some [word-rule | other-rule] end]
	    result
	]
Graham 27-Jun-2006 [981x2]	Thanks Jaime.
BrianH 27-Jun-2006 [983x2]	capitals: charset [#"A" - #"Z"]
	alpha: charset [#"A" - #"Z" #"a" - #"z"]
	non-alpha: complement alpha
	; text is the string being processed
	parse/all/case text [
	    any non-alpha
	    any [
	        a: 5 capitals any capitals b: non-alpha (
	            b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b
	        ) :b
	        | some alpha any non-alpha
	    ]
	    to end
	]
BrianH 27-Jun-2006 [983x2]	This is the Parse group after all.
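For anyone who wants to try the one-pass idea, here is a self-contained variant. It is a sketch assuming REBOL 2, and the sample `text` string is invented. Two details are adjustments rather than part of the thread: `[non-alpha | end]`, so that a capitalised word at the very end of the input also counts, and `any non-alpha` after `:b`, because without it the loop appears to stall on the separator that follows the first replacement.

```rebol
REBOL []

; Hypothetical sample input; copy it because the parse edits it in place.
text: copy "see the MANUAL and the APPENDIX now"

capitals:  charset [#"A" - #"Z"]
alpha:     charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha

parse/all/case text [
    any non-alpha
    any [
        ; A whole word made of 5 or more capitals: remember its start (a)
        ; and end (b), wrap it, then resume right after the inserted tags.
        a: 5 capitals any capitals b: [non-alpha | end] (
            b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b
        ) :b any non-alpha
        ; Any other word is skipped together with its trailing separators.
        | some alpha any non-alpha
    ]
    to end
]

print text
; see the <strong>MANUAL</strong> and the <strong>APPENDIX</strong> now
```

Unlike the two-function approach earlier in the thread, this edits the string in place and never builds a second result string; the minimum length of 5 is just the threshold used in the rule above.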