World: r3wp


[!REBOL3-OLD1]

Anton 24-Nov-2006 [1625x3]	Functions like these are very useful to have. I could have used them recently while doing file searching. However, I wouldn't like to see these functions included as is. - Not very efficient. That's ok for searching small strings or the contents of short files, but bad when searching large files for many strings. - Not generic. The name suggests many datatypes are supported. Better names might be find-any-string, find-all-strings - The above FINDALL does not keep FINDIT as a local. - The argument names are too short, so they are not distinct or descriptive enough. - The return values are not defined clearly in the function doc strings. The above issues are fixable, but it will take some time.
	(Actually, the efficiency issue will take the most time to resolve.)
	(... but most important is defining the user interface and functionality clearly, as well as eliminating undesirable side-effects.)
Louis 24-Nov-2006 [1628]	Who can make these functions the most efficient, and display them in a benchmark program to prove it? And correct all the other problems mentioned by Anton.
Anton 24-Nov-2006 [1629x2]	Yes.... "who ?".... :-)
Anton 24-Nov-2006 [1629x2]	The above find-any suffers from this problem, which needs at least to be documented in the function description. >> pos: findany "hello cat dog license" ["dog" "cat"] == "dog license" ("cat" appears before "dog" in the input string, but because "dog" was searched first, it was returned first.)
Maxim 24-Nov-2006 [1631x2]	this is the same limitation as in parse which is not optimal, IMHO.
Maxim 24-Nov-2006 [1631x2]	the 'ANY in the name implies any of the options are equivalent, so it's the first in the input which is the desired return value, as Anton points out.
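The ordering issue Anton and Maxim describe can be illustrated with a small sketch. It is written in Python rather than REBOL so it runs without a REBOL interpreter, and both function names are hypothetical, not the FINDANY from this thread. The first function returns the match for whichever substring is listed first; the second returns the match that occurs earliest in the input.

```python
def find_any_list_order(text, substrings):
    """Return the tail of text at the first substring (in list order) that matches."""
    for s in substrings:
        i = text.find(s)
        if i != -1:
            return text[i:]
    return None


def find_any_earliest(text, substrings):
    """Return the tail of text at the match that occurs earliest in the input."""
    best = None
    for s in substrings:
        i = text.find(s)
        if i != -1 and (best is None or i < best):
            best = i
    return None if best is None else text[best:]


text = "hello cat dog license"
print(find_any_list_order(text, ["dog", "cat"]))  # dog license
print(find_any_earliest(text, ["dog", "cat"]))    # cat dog license
```

The second version still searches the input once per substring, so it addresses the semantics but not the efficiency concern raised earlier.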
[unknown: 5] 24-Nov-2006 [1633]	Louis, the way I benchmark in REBOL is to do a trace count. In other words if the execution of the trace generates more output than another method then I assume that method is less efficient.
Anton 25-Nov-2006 [1634x4]	Stayed up all night, and succeeded in making a parse rule generator, so if we want to search a string for any substrings: string: {Hello there Anton. Arrow in the box. What nice antlers you have.} substrings: ["ant" "antler" "anton" "arrow" "bar" "box"] rule: [start: [["a" [["nt" action ["ler" action | "on" action]] | "rrow" action]] | ["b" ["ar" action | "ox" action]]] | skip] Found at: 13 Substring: "Ant" Found at: 13 Substring: "Anton" Found at: 20 Substring: "Arrow" Found at: 33 Substring: "box" Found at: 48 Substring: "ant" Found at: 48 Substring: "antler" true
	So you can see the rule is built from the substrings.
	Thus we are able to search large files for any number of substrings in a single pass parse. :)
	(very happy about this..) I'll clean it up and publish it probably later tonight.
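The generated rule above groups the substrings by shared prefix ("a" then "nt" then "ler" or "on", and so on), which is the shape of a prefix tree. The following Python sketch shows that idea under stated assumptions: it is not Anton's generator, the names `build_trie` and `find_all_substrings` are hypothetical, and matching is made case-insensitive by lowercasing, which is safe for ASCII input like the example string.

```python
def build_trie(substrings):
    """Build a nested-dict prefix tree; the key "" marks the end of a substring."""
    root = {}
    for s in substrings:
        node = root
        for ch in s.lower():
            node = node.setdefault(ch, {})
        node[""] = True
    return root


def find_all_substrings(text, substrings):
    """Return (1-based position, matched text) pairs, scanning left to right."""
    trie = build_trie(substrings)
    lowered = text.lower()
    matches = []
    for start in range(len(lowered)):
        node = trie
        pos = start
        # Walk the trie for as long as the input keeps matching some substring.
        while pos < len(lowered) and lowered[pos] in node:
            node = node[lowered[pos]]
            pos += 1
            if "" in node:
                matches.append((start + 1, text[start:pos]))
    return matches


string = "Hello there Anton. Arrow in the box. What nice antlers you have."
substrings = ["ant", "antler", "anton", "arrow", "bar", "box"]
for position, found in find_all_substrings(string, substrings):
    print("Found at:", position, "Substring:", repr(found))
```

All substrings are tested together at each start position, so the cost of a scan does not grow with one full search per substring.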
Jerry 25-Nov-2006 [1638]	Nice job. Anton.
Anton 25-Nov-2006 [1639]	Thankyou, Jerry. I wonder if anyone else made a parse generator like that ?
Louis 25-Nov-2006 [1640x3]	Maxim and Anton, what difference does it make which value is returned? It is the true or false that I am looking for. If any of the strings are found, why look any farther? I'm sure you guys have a reason, but I want to know what it is.
	is returned = is returned first
	If you keep looking after already having an answer, how can that be more efficient?
Anton 25-Nov-2006 [1643x4]	Well, that functionality works perfectly for your case, but there are many other cases where the position of the match(es) is also wanted.
	.. and a name like FINDALL suggests that it returns those matches.
	Your functions might better be named: ANY-SUBSTRINGS? and ALL-SUBSTRINGS? FINDANY and FINDALL might be fine for personal use, but to get acceptance out in the community, the names should be more accurate.
	For the single-pass parse, the action can be defined by the user to either continue or break the parse. (So FINDANY would break, whereas FINDALL would continue.)
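The break-or-continue idea can be sketched as a scanner that hands each match to a user-supplied action and stops when the action asks it to. This is a Python illustration only; `scan`, `any_substring` and `all_substrings` are hypothetical names, and the simple per-position `startswith` check stands in for the generated parse rule.

```python
def scan(text, substrings, action):
    """Call action(position, substring) per match, left to right; stop if it returns True."""
    lowered = text.lower()
    for start in range(len(lowered)):
        for s in substrings:
            if lowered.startswith(s.lower(), start):
                if action(start + 1, s):
                    return


def any_substring(text, substrings):
    """Stop at the first match (the 'break' case); return it or None."""
    found = []

    def stop_at_first(position, s):
        found.append((position, s))
        return True

    scan(text, substrings, stop_at_first)
    return found[0] if found else None


def all_substrings(text, substrings):
    """Collect every match (the 'continue' case)."""
    found = []

    def keep_going(position, s):
        found.append((position, s))
        return False

    scan(text, substrings, keep_going)
    return found


print(any_substring("hello cat dog license", ["dog", "cat"]))   # (7, 'cat')
print(all_substrings("hello cat dog license", ["dog", "cat"]))  # [(7, 'cat'), (11, 'dog')]
```

With this split, a true/false test like the one Louis wants stops as soon as it has an answer, while callers who need positions keep scanning.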
Louis 25-Nov-2006 [1647x2]	OK, it really doesn't make any difference to me what the functions are named, as long as the names are easy to remember.
Louis 25-Nov-2006 [1647x2]	But I would really like to see functions that find-any-substring and that find-all-substrings included in REBOL3, as they make programming a lot easier---at least for me.
Maxim 25-Nov-2006 [1649x3]	Louis, hehe you'll eventually realize that semantics in rebol are pretty important... simply because Carl puts soooo much effort I guess it all makes us anal about it ;-)
	in parse, a good example of where the index creates many problems is when you use the 'TO or 'THRU words.
	they jump exactly like the above... and well it makes them much less useful within the context of trying to get to the next value of "equivalent" values.
Louis 26-Nov-2006 [1652x2]	Maxim, I see. Thanks for the explanation. It will be interesting to see Anton's function.
Louis 26-Nov-2006 [1652x2]	You guys are way more advanced than me. That is why I hang out here---I get help when I get stuck. And by the way, thanks to all of you guys for the help.
Gregg 26-Nov-2006 [1654]	Anton, my LIKE? function generates parse rules, but I doubt it's as advanced as yours, since it's just meant for simple pattern matching and doesn't deal with multiple search targets.
Anton 26-Nov-2006 [1655]	Gregg, I think I recently made a function probably similar to your LIKE?. Have you published that somewhere ? But yes, multiple search terms are the next level up. To get the full range of matching rules with multiple search terms will take some work, however the basis is there.
Anton 27-Nov-2006 [1656]	Ok, here it is: http://anton.wildit.net.au/rebol/library/string-search-functions.r do http://anton.wildit.net.au/rebol/library/demo-string-search-functions.r
Gregg 27-Nov-2006 [1657]	It's on REBOL.org. I finally decided to publish it (it's old) when I published my file-list script, which uses it.
Louis 27-Nov-2006 [1658x2]	Anton, I think line 437 should be: find-every-string: func [
Louis 27-Nov-2006 [1658x2]	func [ must have been accidentally deleted right before you sent the file.
Anton 27-Nov-2006 [1660x2]	Ah.. thankyou Louis.
Anton 27-Nov-2006 [1660x2]	Fixed and republished (with same version number).
Henrik 28-Nov-2006 [1662]	http://www.rebol.net/r3blogs/0052.html <--- Change the Hash Datatype in 3.0?
Henrik 29-Nov-2006 [1663]	http://www.rebol.net/r3blogs/0053.html <--- Vector datatype
CharlesS 5-Dec-2006 [1664]	So anyone know of a rough date for 3.0 ?
BrianH 5-Dec-2006 [1665]	No, not even RT.
Izkata 5-Dec-2006 [1666]	Let's assume December 2025, then we shouldn't be disappointed.... =^_^= (I hope....)
Rebolek 6-Dec-2006 [1667]	I hope it will be out before Duke Nukem Forever >;-)
Pekr 6-Dec-2006 [1668x3]	My estimation is, that R3 (in some form) will be released for DevCon 2007
	... including View, without some parts as Unicode etc., just new architecture ....
	hmm, pages for DevCon are down, so actually it is difficult to say when it will happen :-)
Pekr 11-Dec-2006 [1671]	So now we know the target date for R3 release :-) DevCon 2007 holds first conference session - By Carl Sassenrath - "Introducing REBOL 3.0"
Rebolek 11-Dec-2006 [1672]	Pekr: "introducing R3" may mean just technology overview ;)
CharlesS 11-Dec-2006 [1673]	who here is going to devcon2007 ?
Pekr 11-Dec-2006 [1674]	probably me. It depends on whether I opt for a new job or not, and whether I am successful :-)