World: r3wp

[XML] xml related conversations

CharlesW 1-Aug-2009 [828]	Is there a more suitable language for parsing the XML?
Graham 2-Aug-2009 [829]	parse-xml and xml-to-object seem to work okay on this file.
CharlesW 2-Aug-2009 [830]	It reads and parses ok. I get the block object, but my problem is trying to access the individual elements. When venturing into some of the nested attributes, I just can't seem to get it to return a result. Can you post an example of how you would retrieve the "hits" for <player id="b.11965"> or get the league info from sports-content-code code-type="league"?
Graham 2-Aug-2009 [831x4]	>> do %xml-parse.r
	Script: "A more XML 1.0 compliant set of XML parsing tools." (4-Dec-2001)
	>> do %xml-object.r
	Script: {Convert an XML-derived block structure into objects.} (29-Sep-2001)
	>> obj: first reduce xml-to-object parse-xml+ read %test.xml
	>> data: second obj
	== [make object! [
	xts:sports-content-set: make object! [
	sports-content: make object! [
	sports...
	>> type? data
	== block!
	>> probe data/2/sports-content/sports-event/team/1/player/1
	make object! [
	    player-metadata: make object! [
	        name: make object! [
	            value?: ""
	            first: "Matt"
	            last: "Kemp"
	        ]
	        position-event: "8"
	        player-key: "l.mlb.com-p.11965"
	        status: "starter"
	    ]
	    player-stats: make object! [
	        player-stats-baseball: make object! [
	            stats-baseball-offensive: make object! [
	                value?: ""
	                runs-scored: "0"
	                at-bats: "4"
	                hits: "1"
	                rbi: "2"
	                bases-on-balls: "0"
	                strikeouts: "0"
	                singles: "1"
	                doubles: "0"
	                triples: "0"
	                home-runs: "0"
	                grand-slams: "0"
	                sac-flies: "0"
	                sacrifices: "0"
	                grounded-into-double-play: "0"
	                stolen-bases: "0"
	                stolen-bases-caught: "1"
	                hit-by-pitch: "0"
	                average: ".271"
	            ]
	            stats-baseball-defensive: make object! [
	                value?: ""
	                errors: "0"
	                errors-passed-ball: "0"
	            ]
	        ]
	    ]
	    id: "b.11965"
	]
	If you download anamonitor300.r, you can browse the object like this: mon data (mon obj won't work because of the ":" in the object name).
	It should be easy enough to write a recursive function that descends through the object looking for some text, and then prints out the full path for you, as in the recursive examples I posted above.
	make that "straight forward" vs "easy"
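A minimal sketch of the kind of recursive search Graham describes, assuming REBOL 2 and a structure made only of objects, blocks and strings (as in the probe output above). find-paths is a hypothetical helper, not part of xml-parse.r or xml-object.r, and sample is made-up data rather than the real feed.

```rebol
REBOL [Title: "Sketch: find the path to a string inside nested objects"]

; find-paths is a hypothetical helper (an assumption, not from any script
; mentioned in this thread). It walks objects and blocks and returns a block
; of path! values, one for every string equal to target.
find-paths: func [
    node   "object!, block! or leaf value to search"
    target [string!]
    path   [block!] "path segments collected so far"
    /local found word i
][
    found: copy []
    case [
        object? :node [
            ; first of an object is its word list; skip the leading 'self (REBOL 2)
            foreach word next first node [
                append found find-paths get in node word target append copy path word
            ]
        ]
        block? :node [
            ; repeated elements are held in a block, so record the index
            repeat i length? node [
                append found find-paths pick node i target append copy path i
            ]
        ]
        all [string? :node node = target] [
            append/only found to path! path
        ]
    ]
    found
]

; Made-up sample data shaped like a small part of the structure above
sample: make object! [
    team: make object! [
        player: reduce [
            make object! [id: "b.1" hits: "0"]
            make object! [id: "b.11965" hits: "1"]
        ]
    ]
]

probe find-paths sample "b.11965" []
```

Each returned path can then be pasted after the name of the root object at the console to reach the matching value. Note that string comparison with = ignores case in REBOL.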
CharlesW 2-Aug-2009 [835]	Thank you so much Graham, I will give it a try as you have indicated and will also download anamonitor300.r
Graham 2-Aug-2009 [836]	Good luck with this stuff .. it's pretty tedious navigating thru large objects like this ...
Chris 12-Aug-2009 [837]	>> do http://www.ross-gill.com/r/altxml.r
	connecting to: www.ross-gill.com
	Script: "AltXML" (7-Jun-2009)
	>> all-stats: load-xml/dom your-xml-data
	>> player: stats/get-by-id "b.11965"
	>> his-stats: first player/get-by-tag <stats-baseball-offensive>
	>> his-stats/get #hits
	== "1"
	>> remove-each code codes: all-stats/get-by-tag <sports-content-code> ["league" <> code/get #code-type]
	== [make object! [
	name: <sports-content-code>
	space: none
	value: [
	#code-type "league" ...
	>> foreach code codes [probe code/get #code-name]
	Major ^/ League Baseball
	== "Major ^/ League Baseball"
Graham 12-Aug-2009 [838]	How is your parser getting on these days?
Chris 13-Aug-2009 [839]	Still seems to work as advertised - good for extraction; still missing the 'flatten function. Not much time for development in the schema direction though : (
Graham 13-Aug-2009 [840]	I'm guessing that it should be "all-stats/get-by-id" and not "stats/get-by-id"
Graham 14-Aug-2009 [841]	Chris, do you have documentation yet for your parser?
Chris 14-Aug-2009 [842]	Doesn't go into much detail, but: http://www.ross-gill.com/page/XML+and+REBOL
Graham 14-Aug-2009 [843x2]	Given some xml like this which is a list of documents http://code.google.com/apis/documents/docs/2.0/developers_guide_protocol.html#ListDocs how would your parser extract the <gd:resourceid> and text associated with these tags?
Graham 14-Aug-2009 [843x2]	again .. how would you use your parser to extract the <gd:resourceid> tags and associated text?
Chris 14-Aug-2009 [845x3]	>> google-xml: load-xml/dom clipboard:// ; copied from page
	>> entries: google-xml/get-by-tag <entry>
	== [make object! [
	name: <entry>
	space: none
	value: [
	#etag {"BxAUSh5RAyp7ImBq"}
	<...
	>> foreach entry entries [probe entry/get <resourceId>]
	spreadsheet:key
	== "spreadsheet:key"
	Note that while it appears namespaces have been stripped, they are in fact still there:
	>> entry: first entries
	>> id: first entry/get-by-tag <resourceId>
	>> id/space
	== <gd>
Graham 14-Aug-2009 [848]	So, looks feasible to use your parser to create an api for googledocs ...
Graham 15-Aug-2009 [849x2]	There's a rebzip script which has recently been updated on rebol.org which I guess can be used to open up docx
Graham 15-Aug-2009 [849x2]	Sounds like too much overhead ... unzip the docx, make changes to the xml portion and then rezip.
Janko 2-Jan-2010 [851]	I will need an xml parser .. I was thinking of something fast and quick, like sax style .. I found this one http://www.rebol.org/view-script.r?script=xml-parse.r but from the look of it, it seems to offer a lot of things I don't need. Has anyone used it for "serious" xml parsing? I am thinking of making my own simple, minimal event-based xml parser.
Graham 2-Jan-2010 [852x2]	Yes, I have used it to parse large XML files
Graham 2-Jan-2010 [852x2]	You can turn the xml file into a rebol object with it
Janko 2-Jan-2010 [854]	I imagine that is too costly .. I prefer the callback model to just extract the relevant data out
Graham 2-Jan-2010 [855]	Mine is a desktop application .. your needs for a web service differ ..
Janko 2-Jan-2010 [856]	yes, I get a big xml made by "official" BLOATED standard for invoices .. I want to parse it as quick as possible and that's all
Geomol 2-Jan-2010 [857]	Janko, http://www.fys.ku.dk/~niclasen/rebxml/rebxml-spec.html http://www.rebol.org/view-script.r?script=xml2rebxml.r http://www.rebol.org/view-script.r?script=rebxml2xml.r
Janko 2-Jan-2010 [858]	thanks Geomol, I will study the links .. xml2rebxml seems short which is nice, but I haven't yet figured out what exactly rebxml is .. I am reading the first link you gave me
Robert 2-Jan-2010 [859]	Wouldn't it make a lot more sense to use a C based XML parser, construct a Rebol data-structure/string and return that to Rebol?
Geomol 2-Jan-2010 [860]	Janko, rebxml is a rebol version of xml. It can do the same things, but without the bad implementation that xml suffers from. The idea behind xml is ok, it's just not implemented well. Much of that is solved with the rebxml format.
Gregg 2-Jan-2010 [861]	I believe Maarten has done a SAX style parser. I've used parse-xml in the past, sometimes post-processing the output to a different REBOL form, but my needs were simple. Janko, have you tested any of the existing solutions, with test input on target hardware, and found them to be too slow? If so, what were the results, and how fast do you need it to be?
BrianH 2-Jan-2010 [862]	SAX pull parsing would work well with the port model.
Janko 3-Jan-2010 [863x2]	Robert: it's a good idea but not for my case. I don't want the data structure from the whole xml, I want to stream it through the parser and collect out the data. Geomol: I will look at it, but it's probably not what I want in this particular case for the reason above. Gregg: I haven't tested any yet. I googled and found that xml-parse.r above, which works in sax style but seems huge. I only care to support a simplified subset of xml; xml with all the variants is a total bloat, so I can believe it needs to be that complex (and it doesn't support 100% of it either). That's why I am considering writing a simple sax-like parser. I wrote one in c once and it was small (but it parsed an even smaller subset of xml)
Janko 3-Jan-2010 [863x2]	BrianH: What does that mean "port model"?
BrianH 3-Jan-2010 [865]	The semantic model of REBOL protocol schemes, implemented with the port! type, would fit well with the semantic model of SAX pull. SAX pull generates the same SAX events, except they are not propagated through callbacks - instead they are returned from function calls. SAX pull is sort of like a generator (in the Icon or Python sense) of SAX events. That is very similar in model to the behavior of command ports (like database ports).
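A rough sketch of the pull style described above, assuming REBOL 2: rather than the scanner calling back into your code, the caller asks for one event at a time and decides when to stop. next-event is a hypothetical helper, not part of any script mentioned in this thread and not a port scheme. It recognizes only bare tags and text, with no attributes, comments, CDATA or entities.

```rebol
REBOL [Title: "Sketch: pull-style XML events"]

; next-event is a hypothetical helper (an assumption for illustration).
; Given a position in the XML text it returns [type value next-position],
; or none when the input is exhausted or a tag is left unterminated.
next-event: func [
    pos [string!] "current position in the XML text"
    /local end-pos
][
    if tail? pos [return none]
    either #"<" = first pos [
        end-pos: find pos ">"
        if none? end-pos [return none]      ; unterminated tag: stop
        end-pos: next end-pos               ; step past the ">"
        reduce ['tag copy/part pos end-pos end-pos]
    ][
        ; text runs up to the next tag, or to the end of the input
        end-pos: any [find pos "<" tail pos]
        reduce ['text copy/part pos end-pos end-pos]
    ]
]

; The caller drives the loop and pulls events until none is returned
pos: "<name>Matt</name>"
while [event: next-event pos] [
    print [event/1 mold event/2]
    pos: event/3
]
```

Because the caller owns the loop, it can stop as soon as it has the data it wants, which is the property that makes this model resemble reading from a port.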
Pekr 4-Jan-2010 [866]	I like the SAX model, because IIRC it allows you to work on things in a "streamed" way, whereas DOM requires you to load everything in memory? Sorry if I oversimplified it :-) IIRC Doc used such an approach in his PostgreSQL driver, as opposed to his MySQL one ...
Dockimbel 4-Jan-2010 [867]	It's a matter of tradeoffs: if you only need fast XML document reading, SAX is the winner. If you need to modify the document, you need DOM (with or without SAX).
james_nak 11-Oct-2010 [868]	Does anyone know if there is a rebol object to xml script? I've got xml to rebol objects but now I want to change it back to xml. (and I'm lazy)
GrahamC 11-Oct-2010 [869]	Lazy evaluation is useful.. a lazy programmer not so!
Maxim 12-Oct-2010 [870]	I use blocks. Although a bit slower to access, they are faster for big loads because they require less ram and do not require binding, which is a big issue on large XML blocks.
james_nak 12-Oct-2010 [871]	Yeah, what I am trying to do is convert back to XML after I've done my thing.
Maxim 12-Oct-2010 [872x2]	well, going to xml is easy no?
Maxim 12-Oct-2010 [872x2]	how are your objects structured?
james_nak 12-Oct-2010 [874]	Sorry for the delay. They are nested objects that represent the tags they were created from. I think the answer is that I will just have to create the routines to do what I wanted. I thought that perhaps there was something already out there. Thanks.
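A minimal sketch of such a routine, assuming REBOL 2 and a deliberately simple layout: each word names a tag, string values are element text, nested objects are child elements, and a block holds repeated elements of the same name. object-to-xml is a hypothetical helper. It does not handle attributes, namespaces, or escaping of < > and &, so the objects produced by xml-object.r (with their value? fields and attributes stored as plain words) would need extra handling.

```rebol
REBOL [Title: "Sketch: nested objects back to XML"]

; object-to-xml is a hypothetical helper (an assumption for illustration).
; It returns a string of XML for value, wrapped in a tag called name.
object-to-xml: func [
    name  [word!] "tag name"
    value "object!, block! of values, or a leaf value"
    /local out word item
][
    out: copy ""
    case [
        object? :value [
            append out rejoin ["<" name ">"]
            ; skip the leading 'self in the object's word list (REBOL 2)
            foreach word next first value [
                append out object-to-xml word get in value word
            ]
            append out rejoin ["</" name ">"]
        ]
        block? :value [
            ; repeated elements: emit one tag per item under the same name
            foreach item value [append out object-to-xml name :item]
        ]
        true [
            append out rejoin ["<" name ">" form :value "</" name ">"]
        ]
    ]
    out
]

; Made-up sample data
player: make object! [
    name: "Matt"
    stats: make object! [hits: "1"]
]
print object-to-xml 'player player
```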
Maxim 12-Oct-2010 [875]	might find some inspiration in the JSON converters ?
james_nak 12-Oct-2010 [876]	Yes, if I run into any problems I will look into those. Thanks Maxim. That renote app is really cool, btw.
Maxim 12-Oct-2010 [877]	thx it will improve about once a week.