[REBOL] Re: parse-xml and build-tag
From: joel:neely:fedex at: 8-Oct-2001 7:52
Hi, again, Hallvard,
Hallvard Ystad wrote:
> Thanks for the explanation, Joel. Question 1 is now out of the way.
> But as for Q2, I am still facing some problems.
>
> I am actually parsing HTML, not XML, so I need a method that will
> understand certain things that are illegal in XML. For example:
>
> <table width="100%" noborder height=75%>
>
> Any suggestions (or code), anyone?
>
Well, let's steal as much as possible from XML-LANGUAGE...
Is this what you're after?
>> html-tag-parser/parse-html-tag <table width="100%" noborder height=75%>
== ["table" ["width" "100%" "noborder" "true" "height" "75%"] none]
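The result is a three-item block: the tag name, a flat block of attribute name/value pairs, and NONE. Here is a minimal sketch of walking those pairs. The word RESULT is only an illustrative name, and its value is the console output above pasted in as a literal, so the snippet runs on its own:

```rebol
REBOL []

; Console output from above, pasted in as a literal:
; [tag-name attributes none]
result: ["table" ["width" "100%" "noborder" "true" "height" "75%"] none]

print ["tag:" first result]

; The attributes come back as a flat block of name/value pairs,
; so step through it two items at a time.
foreach [name value] second result [
    print [name "=" value]
]
```

Note that a bare attribute such as noborder shows up with the string "true" as its value.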
If so, have a look at this:
8<------------------------------------------------------------
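REBOL []

html-tag-parser: make object! [
    ; state filled in by the parse rules below
    tag-name: ""
    attr-name: ""
    attr-data: ""
    attr-string: ""
    attributes: []

    ; whitespace: tab, LF, CR, space
    space: make bitset! #{
        0026000001000000
        0000000000000000
        0000000000000000
        0000000000000000
    }
    sp: [some space]    ; required whitespace
    sp?: [any space]    ; optional whitespace
    eq: [sp? #"=" sp?]
    qt1: "'"
    qt2: {"}

    ; unquoted values: tab, LF, CR and every char from space up,
    ; except < and > (so whitespace is accepted here)
    data-chars-gt: make bitset! #{
        00260000FFFFFFAF
        FFFFFFFFFFFFFFFF
        FFFFFFFFFFFFFFFF
        FFFFFFFFFFFFFFFF
    }
    ; single-quoted values: the same range, except ' and <
    data-chars-qt1: make bitset! #{
        002600007FFFFFEF
        FFFFFFFFFFFFFFFF
        FFFFFFFFFFFFFFFF
        FFFFFFFFFFFFFFFF
    }
    ; double-quoted values: the same range, except " and <
    data-chars-qt2: make bitset! #{
        00260000FBFFFFEF
        FFFFFFFFFFFFFFFF
        FFFFFFFFFFFFFFFF
        FFFFFFFFFFFFFFFF
    }
    ; characters that may start a name (ASCII letters, _ and :, among others)
    name-first: make bitset! #{
        0100000000000004
        FEFFFF87FEFFFF07
        0000000000000000
        FFFF7FFFFFFF7F01
    }
    ; characters allowed after the first (adds digits, - and .)
    name-chars: make bitset! #{
        010000000060FF07
        FEFFFF87FEFFFF07
        0000000000000000
        FFFF7FFFFFFF7F01
    }
    name: [name-first any name-chars]

    ; a value is single-quoted, double-quoted, or unquoted
    attr-value: [
        [qt1 copy attr-data any data-chars-qt1 qt1]
        |
        [qt2 copy attr-data any data-chars-qt2 qt2]
        |
        copy attr-data any data-chars-gt
    ]
    ; a bare name with no =value (e.g. noborder) is recorded as "true"
    attribute: [
        copy attr-name name
        [
            eq attr-value
            |
            none (attr-data: copy "true")
        ]
        (append attributes reduce [attr-name attr-data])
    ]
    tag: [copy tag-name name]

    parse-html-tag: function [
        html-tag [tag! string!]
    ][
        ; no locals
    ][
        ; a tag! value carries no angle brackets once converted, so put them back
        if tag? html-tag [
            html-tag: rejoin [#"<" to-string html-tag #">"]
        ]
        ; reset the shared state before each parse
        tag-name: copy ""
        attributes: copy []
        either parse/all html-tag [
            #"<"
            tag
            any [sp attribute]
            sp?
            #">"
        ][
            ; success: [tag-name attributes none]
            copy/deep reduce [tag-name attributes none]
        ][
            ; failure: empty block
            copy []
        ]
    ]
]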
8<------------------------------------------------------------
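One thing to watch, as far as can be read off the hex: DATA-CHARS-GT accepts whitespace, so an unquoted value runs on until the closing `>`. That is fine when the unquoted attribute comes last, as in the example above, but in something like `<table height=75% noborder>` the value would presumably come back as "75% noborder". A minimal sketch of a tighter character set follows. The word UNQUOTED-CHARS is a hypothetical name, not part of the code above, and the set is built with CHARSET and COMPLEMENT instead of a hex literal:

```rebol
REBOL []

; Hypothetical replacement for data-chars-gt (an assumption, not the
; original code): every char except space, tab, LF, CR, < and >,
; so an unquoted value stops at the first whitespace.
unquoted-chars: complement charset " ^-^/^M<>"

; Stand-alone check of the idea, outside html-tag-parser:
; copy the leading run of allowed chars, then skip the rest.
value: none
parse/all "75% noborder>" [copy value some unquoted-chars to end]
print value
```

To try it in the object, add the UNQUOTED-CHARS definition next to the other bitsets and use it in place of DATA-CHARS-GT in the last alternative of ATTR-VALUE. This set is looser than the original in one respect (it also lets control characters and quote marks through), so treat it as a starting point only.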
HTH!
-jn-
--
; Joel Neely [joel--neely--fedex--com] 901-263-4460 38017/HKA/9677
REBOL [] foreach [order string] sort/skip reduce [ true "!"
false head reverse "rekcah" none "REBOL " prin "Just " "another "
] 2 [prin string] print ""