World: r3wp

[Parse] Discussion of PARSE dialect

BrianH
4-Nov-2005
[698]
So if you copy your test data to the clipboard, you would assign 
it to a variable like this:
obx: read clipboard://

If you are reading it from a file with the read or open functions, 
there is no escaping.
Volker
4-Nov-2005
[699]
Then it should be no problem. Escaping only matters when load sees it in string literals. 
If you read it, no escaping is needed, and mold auto-escapes.
Graham
4-Nov-2005
[700]
Ok, I am testing on the command line. Maybe that is the problem.
BrianH
4-Nov-2005
[701x3]
Try reading from the clipboard directly like I suggested for that.
Note: The form native doesn't escape on output like mold does.
Oh wait, the error in his parse statement isn't because of escaping. 
It's the thru bitset that isn't supported. Try this:
caret: charset "^^"
non-caret: complement caret

parse obx ["OBX" "|" digits "|" "ST" "|" any non-caret caret to end]
Graham
4-Nov-2005
[704]
Does that return true?
Volker
4-Nov-2005
[705]
It should work like thru. "any non-caret" is like "to caret", and then 
skipping the caret makes it like "thru caret". Without a caret in the input 
it would go to the end, and then the final caret match would fail.
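A minimal sketch of the equivalence described above, assuming REBOL 2 and a made-up sample string (the name sample is hypothetical, not from the discussion). It compares the bitset rule with the plain-string thru form, and shows the failure case when the input has no caret.

```rebol
REBOL [Title: "any non-caret vs. thru (sketch)"]

caret: charset "^^"              ; "^^" in source is one literal caret
non-caret: complement caret

sample: "Hb^^ Hb:^^L"            ; hypothetical data containing two real carets

; Bitset form: consume non-carets, then exactly one caret.
print parse/all sample [any non-caret caret to end]

; String form: thru accepts a string, so a one-character delimiter works directly.
print parse/all sample [thru "^^" to end]

; No caret in the input: any runs to the tail, then the caret match fails.
print parse/all "no caret here" [any non-caret caret to end]
```

The first two rules should succeed and the third should fail, since there is nothing left for caret to match once any has consumed the whole string.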
Graham
4-Nov-2005
[706]
>> obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/}
== "OBX|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/"
>> caret: charset "^^"
== make bitset! #{
0000000000000000000000400000000000000000000000000000000000000000
}
>> non-caret: complement caret
== make bitset! #{
FFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
>> parse obx [ "OBX" copy test some non-caret caret to end ]
== false
>> test
== "|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/"
>>
Volker
4-Nov-2005
[707x3]
Yes, so it works
Remember that the "^" characters in your string are not literal carets, 
or they would be molded as "^^".
i would really get the data from outside. put them in a file and 
use obx: read %data.txt, or use clipboard.
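A small sketch of the escaping point, assuming REBOL 2. The names typed and escaped are hypothetical; the two strings are shortened pieces of the OBX line, written in source once without and once with doubled carets.

```rebol
REBOL [Title: "Caret escaping (sketch)"]

; Unescaped in source: "^L" is read as a form feed, and the console
; output earlier in the thread shows "^ " turning into a plain space.
typed: {Hb^ Hb:^L}

; Escaped in source: each "^^" becomes one literal caret in the string.
escaped: {Hb^^ Hb:^^L}

print found? find typed "^^"      ; is there a real caret in the unescaped version?
print found? find escaped "^^"    ; and in the escaped version?

print mold escaped                ; mold re-escapes carets on output
print form escaped                ; form prints the characters as they are
```

Data read from a file or the clipboard never goes through this source-level escaping, which is why reading it in avoids the problem.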
Graham
4-Nov-2005
[710]
ok, I'll give that a go later on.
Volker
4-Nov-2005
[711x2]
obx: {OBX|1|ST|Hb^^ Hb:^^L||135|g/L|120 - 155|N|||F^^/}
(or copy from altme ;)
Graham
4-Nov-2005
[713]
obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/}
Volker
4-Nov-2005
[714]
Huh? Does AltME escape too? There should be double "^^" everywhere.
Graham
4-Nov-2005
[715x2]
>> parse read clipboard:// [ copy test some non-caret caret to end ]
== true
>> test
== "obx: {OBX|1|ST|Hb"
ok so that works.
BrianH
4-Nov-2005
[717]
Of course since you are looking for one character, you can use to 
or thru like this:
parse read clipboard:// [ copy test thru "^^" to end ]
Volker
4-Nov-2005
[718]
Meoww, that's too simple for me.. good catch :)
Graham
4-Nov-2005
[719x3]
well, if RT want to use Rebol in medical applications, they should 
look at making it easier to work with medical data !
OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/
Just using Altme as a clipboard.
BrianH
4-Nov-2005
[722]
Is there a standard we can read to get the syntax?
Graham
4-Nov-2005
[723]
HL7 - yeah, hundreds of pages of specs.
BrianH
4-Nov-2005
[724]
Aaak.
Graham
4-Nov-2005
[725]
http://www.hl7.org/
BrianH
4-Nov-2005
[726]
For instance, how many fields does that data you posted have? Are 
they separated by | or is it a length thing?
Graham
4-Nov-2005
[727]
I don't know .. I am just looking at sample data and trying to reverse 
engineer the format as I don't have time to read 100s of pages of 
specs.
BrianH
4-Nov-2005
[728]
What does the ^ mean in context?
Graham
4-Nov-2005
[729x2]
but each OBX record is one blood result
It seems to be a delimiter to divide a record into parts
BrianH
4-Nov-2005
[731]
Really? By the format it looks like they are using | for that.
Graham
4-Nov-2005
[732x3]
so, | separates fields, and ^ subdivides a field
OBR|1|3CHI|05-556701-MHA-0^VDL|MHA^MASTER HAEM PANEL^L|R|200511021006|200511021006|""|""|||||200511021006||10761^CHIU&G|||10761^CHIU&G|10761^CHIU&G|3CHI^chiu|200511021152|||F
OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F
OBX|2|ST|pV^ PCV:^L||0.397||0.340 - 0.470|N|||F
OBX|3|ST|mV^ MCV:^L||95|fL|81 - 97|N|||F
OBX|4|ST|mh^ MCH:^L||32.3|pg|26.5 - 33.0|N|||F
OBX|5|ST|pl^ Platelets:^L||224|x 10*9/L|150 - 450|N|||F
OBX|6|ST|es^ ESR:^L||23|mm/hr|1 - 27|N|||F
OBX|7|ST|wc^ WCC:^L||7.7|x 10*9/L|3.8 - 10.0|N|||F
OBX|8|ST|Nt^ Neutrophils:^L||4.5|x10*9/L|1.9 - 7.1|N|||F
OBX|9|ST|Ly^ Lymphocytes:^L||2.6|x10*9/L|0.6 - 3.6|N|||F
OBX|10|ST|Mo^ Monocytes:^L||0.5|x10*9/L|0.2 - 1.0|N|||F
OBX|11|ST|Eo^ Eosinophils:^L||0.1|x10*9/L|< 0.6|N|||F
OBX|12|ST|Ba^ Basophils:^L||0.05|x10*9/L|0.00 - 0.10|N|||F

OBX|13|FT|bf^Comments^L||COMMENT: RBC parameters normochromic normocytic.|||N|||F
NTE|1|L|CC Drs: MALIK, CHIU.
I've omitted the MSH MSA and PID lines which identify the patient.
BrianH
4-Nov-2005
[735x2]
So the first field specifies the record format, the number of fields 
and such. The other fields are data.
Thanks for that by the way, I'd rather not know.
Graham
4-Nov-2005
[737x2]
no, I think the numbers indicate a sequence
so, there are 13 OBX records for the OBR result.
BrianH
4-Nov-2005
[739]
I mean OBX is the record type, and OBX records have 11 additional 
fields to them.
Graham
4-Nov-2005
[740x3]
Yes, I think so.
I presume ST stands for sub test.
The HL7 org want to move to using XML instead ...
BrianH
4-Nov-2005
[743]
Do you want to do a full rule-based parse here, or will simple parse 
do?
data: read/lines %data
foreach rec data [
    rec: parse/all rec "|"
    switch rec/1 [
        "OBX" [ ... do stuff... ]
    ]
]
sqlab
4-Nov-2005
[744]
OBX is the segment type.
Segments are separated by #"^M".
An OBX segment can have up to 24 fields according to version 2.4; 
empty fields at the end of a segment need not be transmitted.

Fields are normally delimited by #"|", but all delimiters except 
the segment delimiter can be defined for each message. 

Fields can be divided by #"^^" into components, components can be 
divided into subcomponents, etc.
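A rough sketch of that layout using simple parse, assuming REBOL 2 and the default delimiters (| for fields, ^ for components). The segment is one OBX line from the sample data with its carets escaped for source; a real message could declare different delimiters, which this sketch ignores.

```rebol
REBOL [Title: "HL7 field and component split (sketch)"]

; One OBX segment from the sample data, carets doubled for REBOL source.
segment: "OBX|1|ST|Hb^^ Hb:^^L||135|g/L|120 - 155|N|||F"

; Split the segment into fields on the default field delimiter.
fields: parse/all segment "|"
probe fields/1                        ; the segment type
probe fields                          ; all fields, to check how empty ones come out

; Split the fourth field into components on the default component delimiter.
components: parse/all fields/4 "^^"
probe components
```

Simple parse also gives double-quoted text special treatment, so fields such as the "" entries in the OBR line may need a rule-based parse instead.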
Graham
4-Nov-2005
[745x3]
I'm working on a full rule based parse.
sqlab, you've been doing this stuff for years.
Ok, my parser is able to get all the data out of all the records 
now in the test result above.