World: r3wp

[Power Mezz] Discussions of the Power Mezz

Gabriele 21-Dec-2010 [127]	The HTML to text module has still not been rewritten to use LOAD-HTML instead of the older approach of the HTML normalizer followed by a FSM.
Kaj 21-Dec-2010 [128x2]	Thanks for the clarification
Kaj 21-Dec-2010 [128x2]	So the parser in 5.10 is the newest one? But where does the parser in 7.4 fit in?
Gabriele 22-Dec-2010 [130x4]	7.4 parses a string into a sequence of tags and text (etc.). (it also has a load-markup function that is similar to load/markup but also parses tag attributes and so on). 5.10 uses 7.4 and builds a tree from that sequence of tags and text.
	(i never got around to changing wetan to show module dependencies. if you look at the script header though, you'll see that load-html.r depends on ml-parser.r)
	http://www.rebol.it/power-mezz/mezz/load-html.r
	the code is using a number of tricks to be "fast" (esp. expand-macros.r), so it's not as clean as it could be.
Kaj 22-Dec-2010 [134]	Thanks
Janko 29-Apr-2011 [135x2]	Hi, first thanks for making and open sourcing power-mezz. I am trying to use load-html and am getting some strange results: it seems to produce, for example, a recursive [ html [ html [ html ..... ]]] on my simple html input (and on a real one that I tried). I prepared two examples to make the point as clear as possible. http://paste.factorcode.org/paste?id=2263 (notice the stack overflow error)
Janko 29-Apr-2011 [135x2]	I added 2 more cases to the paste (2 annotations). load-html seems quite complex since it uses many other modules (that I don't understand either).. so I'd rather see if you find something obvious in my approach or the bug in power-mezz
Gabriele 30-Apr-2011 [137x11]	First: you only need to import %mezz/load-html.r in your examples. You're not using the other modules; they will be loaded automatically by load-html.r - you never need to worry about dependencies.
	Second: your problem is that you are trying to mold the result, which is a tree where each node has a reference to the parent node. (much like faces in R2). That's why you see the "loop".
	there is a mold-tree function in %mezz/trees.r if you want to mold the tree. Or, you could simply use form-html to pretty print the tree for you.
	Eg. for your first example:
	    t: load-html p
	    print mold-tree t
	    [root [] [html [] [head [] [title [] [text [value "t"]]]] [body [] [h2 [] [text [value "HEADING"]]] [p [] [text [value "first para"]]] [p [] [text [value "second para"]]]]]]
	    print form-html/with t [pretty?: yes]
	    <html>
	    <head>
	    <title>t</title>
	    </head>
	    <body>
	    <h2>HEADING</h2>
	    <p>first para</p>
	    <p>second para</p>
	    </body>
	    </html>
	(the pretty? option to form-html is something i only use for debugging, so it's not as pretty as it should be i guess)
	You can also do things like:
	    >> mold-tree get-node t/childs/html/childs/head/childs/title
	    == {[title [] [text [value "t"]]]}
	get-node and set-node are also from %mezz/trees.r ; most likely you don't want to mess around with %mezz/macros/trees.r , that is deep voodoo i use to make the html filter fast.
	(if you have performance problems, we'll talk about it :)
	other examples:
	    >> get-node t/childs/html/childs/head/childs/title/childs/text/prop/value
	    == "t"
	    >> get-node t/childs/html/childs/body/childs/h2/childs/text/prop/value
	    == "HEADING"
	Also note that:
	    >> print form-html/with load-html "<p>A paragraph!" [pretty?: yes]
	    <html>
	    <head>
	    <title></title>
	    </head>
	    <body>
	    <p>A paragraph!</p>
	    </body>
	    </html>
	ie. load-html tries to cope with malformed input as much as possible.
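[Editorial illustration] The reason a plain mold appears to "loop" can be sketched outside REBOL. The Python below is only an analogy, not Power Mezz code: the `Node` class, `naive_dump`, and `tree_dump` are hypothetical names invented here, and the real node layout in %mezz/trees.r may differ. It shows that a serializer which follows every field, including the parent back-reference, never terminates on such a tree, while one that walks only the children (the role mold-tree plays above) produces the nested form.

```python
class Node:
    """Hypothetical tree node with a parent back-reference (not the Power Mezz layout)."""

    def __init__(self, name, parent=None, value=None):
        self.name = name
        self.value = value
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def naive_dump(node):
    # Follows every field, including the parent link, so it bounces
    # between child and parent until the interpreter's stack limit is hit.
    return [
        node.name,
        naive_dump(node.parent) if node.parent else None,
        [naive_dump(child) for child in node.children],
    ]


def tree_dump(node):
    # Walks only downward: name, properties, then children. The parent is skipped.
    props = ["value", node.value] if node.value is not None else []
    return [node.name, props] + [tree_dump(child) for child in node.children]


# Build a small tree: root > html > body > p > text
root = Node("root")
html = Node("html", root)
body = Node("body", html)
p = Node("p", body)
Node("text", p, value="first para")

try:
    naive_dump(root)
    overflowed = False
except RecursionError:
    overflowed = True

print(overflowed)
print(tree_dump(root))
```

The design point is the same one made in the thread: the back-reference is useful for navigation, so it stays in the data, and the cycle is handled by using a tree-aware printer rather than a generic one.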
Janko 30-Apr-2011 [148x3]	wow, thank you a lot! I knew this was too obvious a "bug" to be real and that I was probably doing something wrong. GREAT! I initially imported only the needed modules but got errors .. ( I will try and report ) the errors went away as I manually imported them. Just a second
	very good that you cope with bad html .. I will need that functionality because no html is perfect.
	I was planning to use BeautifulSoup if you didn't, but since you do, that is even better
Janko 1-May-2011 [151x4]	I tried now, the problem with import was that I didn't set the absolute path to load-module/from before.
	It all works now according to your example.. and I tested and it handles improper html very well! Thanks!
	I looked at html-rules in load-html and I am stunned by how well the code / dialect is
	s/is/looks/
Gabriele 2-May-2011 [155x2]	there are a few things that it still can't do (the dialect i mean), but it's very powerful on the things it can do :) The documentation is here: http://www.rebol.it/power-mezz/mezz/niwashi.html
Gabriele 2-May-2011 [155x2]	it's one of the parts that i think is best documented, so it's worth reading.
Pekr 2-May-2011 [157]	Gabriele - what is that? New templating system? Reminds me of Temple :-) (sorry, not following the discussion)
Gabriele 3-May-2011 [158:last]	No, though it could easily be used to reimplement Temple, this time with the ability to load any html.