World: r3wp

[XML] xml related conversations

CharlesW
1-Aug-2009
[828]
Is there a more suitable language for parsing the XML?
Graham
2-Aug-2009
[829]
parse-xml and xml-to-object seem to work okay on this file.
CharlesW
2-Aug-2009
[830]
It reads and parses OK. I get the block object, but my problem is 
trying to access the individual elements. When venturing into some 
of the nested attributes, I just can't seem to get it to return a 
result. Can you post an example of how you would retrieve the "hits" 
for <player id="b.11965"> or get the league info from sports-content-code 
code-type="league"?
Graham
2-Aug-2009
[831x4]
>> do %xml-parse.r

Script: "A more XML 1.0 compliant set of XML parsing tools." (4-Dec-2001)
>> do %xml-object.r

Script: {Convert an XML-derived block structure into objects.} (29-Sep-2001)
>> obj: first reduce xml-to-object parse-xml+ read %test.xml
>> data: second obj
== [make object! [
        xts:sports-content-set: make object! [
            sports-content: make object! [
                sports...
>> type? data
== block!
>> probe data/2/sports-content/sports-event/team/1/player/1
make object! [
    player-metadata: make object! [
        name: make object! [
            value?: ""
            first: "Matt"
            last: "Kemp"
        ]
        position-event: "8"
        player-key: "l.mlb.com-p.11965"
        status: "starter"
    ]
    player-stats: make object! [
        player-stats-baseball: make object! [
            stats-baseball-offensive: make object! [
                value?: ""
                runs-scored: "0"
                at-bats: "4"
                hits: "1"
                rbi: "2"
                bases-on-balls: "0"
                strikeouts: "0"
                singles: "1"
                doubles: "0"
                triples: "0"
                home-runs: "0"
                grand-slams: "0"
                sac-flies: "0"
                sacrifices: "0"
                grounded-into-double-play: "0"
                stolen-bases: "0"
                stolen-bases-caught: "1"
                hit-by-pitch: "0"
                average: ".271"
            ]
            stats-baseball-defensive: make object! [
                value?: ""
                errors: "0"
                errors-passed-ball: "0"
            ]
        ]
    ]
    id: "b.11965"
]
If you download anamonitor300.r, then you can browse the object 
like this:

mon data

mon obj won't work because of the ":" in the object name
It should be easy enough to write a recursive function that descends 
thru the object looking for some text, and then prints out the 
full path for you, as in the recursive examples I posted above.
Make that "straightforward" vs "easy".
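A minimal sketch of the kind of recursive descent described above, assuming REBOL 2 (where FIRST and SECOND of an object return its words and values) and nested objects and blocks like the ones xml-to-object produced here. The name find-paths and its /from refinement are hypothetical; they are not part of xml-object.r or anamonitor300.r.

```rebol
REBOL [Title: "find-paths (sketch)"]

find-paths: func [
    "Return a block of paths to every string equal to TARGET (hypothetical helper)."
    data [object! block!] "Nested structure, e.g. the DATA block above"
    target [string!] "Text to look for (= compares strings ignoring case)"
    /from path [block!] "Path walked so far (used by the recursion)"
    /local found keys values value here
][
    found: copy []
    path: any [path copy []]
    either object? data [
        ; In REBOL 2 both blocks start with SELF, which NEXT skips.
        keys: next first data
        values: next second data
    ][
        ; For a block the keys are simply the positions 1..n.
        keys: copy []
        repeat i length? data [append keys i]
        values: data
    ]
    while [not tail? keys][
        value: first values
        here: append copy path first keys
        ; Descend into nested objects and blocks.
        if any [object? :value block? :value][
            append found find-paths/from value target here
        ]
        ; Record the path when a matching string is found.
        if all [string? :value value = target][append/only found here]
        keys: next keys
        values: next values
    ]
    found
]

; Usage with a small hand-made structure (not the real feed):
sample: reduce [
    make object! [
        name: make object! [first: "Matt" last: "Kemp"]
        hits: "1"
    ]
]
probe find-paths sample "Kemp"
```

Each result is a block of words and indexes, so it can be turned into a path with `to path!` and prefixed with the variable name when exploring by hand.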
CharlesW
2-Aug-2009
[835]
Thank you so much Graham, I will give it a try as you have indicated 
and will also download anamonitor300.r
Graham
2-Aug-2009
[836]
Good luck with this stuff .. it's pretty tedious navigating thru 
large objects like this ...
Chris
12-Aug-2009
[837]
>> do http://www.ross-gill.com/r/altxml.r
connecting to: www.ross-gill.com
Script: "AltXML" (7-Jun-2009)
>> all-stats: load-xml/dom your-xml-data 
>> player: stats/get-by-id "b.11965"                        
>> his-stats: first player/get-by-tag <stats-baseball-offensive>
>> his-stats/get #hits                                          
== "1"


>> remove-each code codes: all-stats/get-by-tag <sports-content-code> 
["league" <> code/get #code-type]
== [make object! [
        name: <sports-content-code>
        space: none
        value: [
            #code-type "league" 
      ...
>> foreach code codes [probe code/get #code-name]
Major ^/      League Baseball
== "Major ^/      League Baseball"
Graham
12-Aug-2009
[838]
How is your parser getting on these days?
Chris
13-Aug-2009
[839]
Still seems to work as advertised - good for extraction; still missing 
the 'flatten function. Not much time for development in the schema 
direction though : (
Graham
13-Aug-2009
[840]
I'm guessing that it should be "all-stats/get-by-id" and not "stats/get-by-id"
Graham
14-Aug-2009
[841]
Chris, do you have documentation yet for your parser?
Chris
14-Aug-2009
[842]
Doesn't go into much detail, but: http://www.ross-gill.com/page/XML+and+REBOL
Graham
14-Aug-2009
[843x2]
Given some xml like this which is a list of documents http://code.google.com/apis/documents/docs/2.0/developers_guide_protocol.html#ListDocs

how would your parser extract the <gd:resourceid> and text associated 
with these tags?
again .. how would you use your parser to extract the <gd:resourceid> 
tags and associated text?
Chris
14-Aug-2009
[845x3]
>> google-xml: load-xml/dom clipboard:// ; copied from page
>> entries: google-xml/get-by-tag <entry>
== [make object! [
        name: <entry>
        space: none
        value: [
            #etag {"BxAUSh5RAyp7ImBq"} 
            <...
>> foreach entry entries [probe entry/get <resourceId>]
spreadsheet:key
== "spreadsheet:key"
Note that while it appears namespaces have been stripped, they are 
in fact still there:
>> entry: first entries                   
>> id: first entry/get-by-tag <resourceId>             
>> id/space                               
== <gd>
Graham
14-Aug-2009
[848]
So, looks feasible to use your parser to create an api for googledocs 
...
Graham
15-Aug-2009
[849x2]
There's a rebzip script which has recently been updated on rebol.org 
which I guess can be used to open up docx
Sounds like too much overhead ... unzip the docx, make changes to 
the xml portion and then rezip.
Janko
2-Jan-2010
[851]
I will need an XML parser .. I was thinking of something fast and quick 
like SAX style .. I found this one http://www.rebol.org/view-script.r?script=xml-parse.r
but from the look of it, it seems to offer a lot of things I don't need. 
Has anyone used it for "serious" XML parsing? I am thinking 
of making my own simple, minimal event-based XML parser.
Graham
2-Jan-2010
[852x2]
Yes, I have used it to parse large XML files
You can turn the xml file into a rebol object with it
Janko
2-Jan-2010
[854]
I imagine that is too costly .. I prefer the callback model to just 
extract the relevant data out
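A rough sketch of what such a callback model could look like with PARSE in REBOL 2. Everything here is an assumption for illustration: scan-xml is a hypothetical name, it is not taken from xml-parse.r, and it only covers a small subset of XML (no CDATA, no entity decoding; the tag string is handed over raw, attributes included).

```rebol
REBOL [Title: "Callback-style XML scan (sketch)"]

text-char: complement charset "<"

scan-xml: func [
    "Walk XML once, calling HANDLER with an event word and its data."
    xml [string!]
    handler [any-function!] "Called as: handler event data"
    /local tag text
][
    parse/all xml [
        any [
            "<?" thru "?>"          ; skip declarations and processing instructions
          | "<!--" thru "-->"       ; skip comments
          | "</" copy tag to ">" skip (handler 'close tag)
          | "<" copy tag to ">" skip (handler 'open tag)
          | copy text some text-char (handler 'text text)
        ]
    ]
]

; Usage: print every event without building any tree in memory.
scan-xml {<player id="b.11965"><hits>1</hits></player>} func [event data][
    print [event mold data]
]
```

The function returns what PARSE returns, so a false result hints that the scan stopped early on input outside this subset.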
Graham
2-Jan-2010
[855]
Mine is a desktop application .. your needs for a web service differ 
..
Janko
2-Jan-2010
[856]
yes, I get a big xml made by "official" BLOATED standard for invoices 
.. I want to parse it as quick as possible and that's all
Geomol
2-Jan-2010
[857]
Janko,
http://www.fys.ku.dk/~niclasen/rebxml/rebxml-spec.html
http://www.rebol.org/view-script.r?script=xml2rebxml.r
http://www.rebol.org/view-script.r?script=rebxml2xml.r
Janko
2-Jan-2010
[858]
thanks Geomol, I will study  the links .. xml2rebxml seems short 
which is nice, but I haven't yet figured out what exactly rebxml 
is .. I am reading the first link you gave me
Robert
2-Jan-2010
[859]
Wouldn't it make a lot more sense to use a C based XML parser, construct 
a Rebol data-structure/string and return that to Rebol?
Geomol
2-Jan-2010
[860]
Janko, rebxml is a rebol version of xml. It can do the same things, 
but without the bad implementation that xml suffers from. The idea behind 
xml is ok, it's just not implemented well. Much of that is solved 
with the rebxml format.
Gregg
2-Jan-2010
[861]
I believe Maarten has done a SAX style parser.  I've used parse-xml 
in the past, sometimes post-processing the output to a different 
REBOL form, but my needs were simple.


Janko, have you tested any of the existing solutions, with test 
input on target hardware, and found them to be too slow? If so, what 
were the results, and how fast do you need it to be?
BrianH
2-Jan-2010
[862]
SAX pull parsing would work well with the port model.
Janko
3-Jan-2010
[863x2]
Robert: it's a good idea but not for my case. I don't want the data 
structure from the whole xml, I want to stream it through the parser and 
collect out the data. 

Geomol: I will look at it, but it's probably not what I want in this particular 
case for the reason above.

Gregg: I haven't tested any yet. I googled and found the xml-parse.r 
above, which works SAX style but seems huge. I only care to 
support a simplified subset of xml. Xml with all the variants is 
total bloat, so I can believe the script needs to be that complex (and it doesn't 
support 100% of it either). That's why I am considering writing a simple 
SAX-like parser. I wrote one in C once and it was small (but it parsed 
an even smaller subset of xml).
BrianH: What does that mean "port model"?
BrianH
3-Jan-2010
[865]
The semantic model of REBOL protocol schemes, implemented with the 
port! type, would fit well with the semantic model of SAX pull. SAX 
pull generates the same SAX events, except they are not propagated 
through callbacks - instead they are returned from function calls. 
SAX pull is sort of like a generator (in the Icon or Python sense) 
of SAX events. That is very similar in model to the behavior of command 
ports (like database ports).
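To make the pull idea concrete, here is a small sketch in REBOL 2 where the consumer asks for one event at a time instead of registering callbacks. It does not use the port! type or any real scheme; next-event is a hypothetical function that takes a position in the string and returns the event plus the position to continue from. Only plain tags and text are handled.

```rebol
REBOL [Title: "Pull-style XML events (sketch)"]

text-char: complement charset "<"

next-event: func [
    "Return [type data rest] for the event at POS, or NONE when nothing matches."
    pos [string!]
    /local data event rest
][
    event: none
    parse/all pos [
        [
            "</" copy data to ">" skip (event: 'close)
          | "<" copy data to ">" skip (event: 'open)
          | copy data some text-char (event: 'text)
        ]
        rest: to end    ; remember where the next call should resume
    ]
    if event [reduce [event data rest]]
]

; Usage: the caller drives the loop and can stop as soon as it has what it needs.
pos: {<player id="b.11965"><hits>1</hits></player>}
while [event: next-event pos][
    print [first event mold second event]
    pos: third event
]
```

The events are the same as in a callback design; the difference is only who owns the loop, which is the property that maps onto reading from a port.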
Pekr
4-Jan-2010
[866]
I like the SAX model, because IIRC it allows you to work on things in a "streamed" 
way, whereas DOM requires you to load everything in memory? Sorry if 
I oversimplified it :-) IIRC Doc used such an approach in his PostgreSQL 
driver, as opposed to his MySQL one ...
Dockimbel
4-Jan-2010
[867]
It's a matter of tradeoff: if you only need fast XML document reading, 
SAX is the winner. If you need to modify the document, you need DOM 
(with or without SAX).
james_nak
11-Oct-2010
[868]
Does anyone know if there is a rebol object to xml script? I've got 
xml to rebol objects but now I want to change it back to xml. (and 
I'm lazy)
GrahamC
11-Oct-2010
[869]
Lazy evaluation is useful.. a lazy programmer not so!
Maxim
12-Oct-2010
[870]
I use blocks. Although a bit slower to access, they are faster for 
big loads because they require less ram and do not require binding, 
which is a big issue on large XML blocks.
james_nak
12-Oct-2010
[871]
Yeah, what I am trying to do is convert back to XML after I've done 
my thing.
Maxim
12-Oct-2010
[872x2]
well, going to xml is easy no?
how are your objects structured?
james_nak
12-Oct-2010
[874]
Sorry for the delay. They are nested objects that represent the tags 
they were created from. I think the answer is that I will just have 
to create the routines to do what I wanted. I thought that perhaps 
there was something already out there. Thanks.
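A sketch of what such a routine might look like in REBOL 2, under loudly stated assumptions: string fields are written as attributes, a value? field holds the element text, nested objects become child elements, and a block of objects becomes repeated elements of the same name. That mirrors the object shape shown earlier in this group, but it is a guess about the structure in use here; object-to-xml and escape-xml are hypothetical names, not existing library functions.

```rebol
REBOL [Title: "object-to-xml (sketch)"]

escape-xml: func [text [string!]][
    text: copy text
    replace/all text "&" "&amp;"
    replace/all text "<" "&lt;"
    replace/all text {"} "&quot;"
    text
]

object-to-xml: func [
    "Serialize NODE as an element called NAME (assumed conventions, see above)."
    name [word!]
    node [object!]
    /local attrs kids text keys values key value items
][
    attrs: copy ""
    kids: copy ""
    text: copy ""
    keys: next first node       ; skip SELF (REBOL 2)
    values: next second node
    while [not tail? keys][
        key: first keys
        value: first values
        case [
            key = 'value? [text: escape-xml form value]
            string? :value [append attrs rejoin [" " key {="} escape-xml value {"}]]
            object? :value [append kids object-to-xml key value]
            block? :value [
                items: value
                while [not tail? items][
                    if object? first items [append kids object-to-xml key first items]
                    items: next items
                ]
            ]
        ]
        keys: next keys
        values: next values
    ]
    either all [empty? kids empty? text][
        rejoin ["<" name attrs "/>"]
    ][
        rejoin ["<" name attrs ">" text kids "</" name ">"]
    ]
]

; Usage with a hand-made object:
player: make object! [
    name: make object! [value?: "" first: "Matt" last: "Kemp"]
    id: "b.11965"
]
print object-to-xml 'player player
```

Attribute order follows the order of the fields in the object, and namespaces are not treated specially, so a round trip is not guaranteed to reproduce the original file byte for byte.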
Maxim
12-Oct-2010
[875]
might find some inspiration in the JSON converters ?
james_nak
12-Oct-2010
[876]
Yes, if I run into any problems I will look into those.
Thanks Maxim. That renote app is really cool, btw.
Maxim
12-Oct-2010
[877]
thx it will improve about once a week.