r3wp [groups: 83 posts: 189283]

World: r3wp

[I'm new] Ask any question, and a helpful person will try to answer.

PatrickP61
20-Jul-2007
[627x2]
My end goal is to be able to take some formatted text of some kind, 
something that is generated by a utility of some kind, and generate 
a spreadsheet from it.  The formatted text can be of any type including 
" and the like.


I'm working in reverse, by creating a spreadsheet in MS Excel with
various kinds of data that I've shown above.  Some data with just
alpha, just numbers, combinations, leading quotes, trailing quotes,
embedded quotes, embedded commas, spaces etc.  Then I saved the spreadsheet
as CSV and another version as Tab delimited.


Then by looking at those files via Notepad or another editor, I can
see how the data must be in order for MS Excel to accept it.  I initially
had problems with the CSV model because embedded quotes need other
quotes added to that "cell" if you will.  The Tab delimited model
has fewer restrictions on it.  The only thing that needs attention
is when a "cell" starts with a quote, which needs additional quotes
added to it.  Embedded quotes or trailing quotes don't need any modification.


Long story short -- I'm going with Tab delimited model and figuring 
out a rebol script to take data from an IBM utility dump (with rules 
on what data to capture), and model that info into an excel spreadsheet 
via Tab delimited file.
Hi Gregg -- The cookbook recipe is a good one for reading and processing 
CSV's as input.  My main issue is NOT the CSV part itself.  It is 
pretty simple really.  But as usual MS has some additional formatting 
rules whenever certain characters are embedded, and that is the part 
I'm having trouble with in order for a CSV file to be loaded as a 
spreadsheet. 


You don't happen to have one that lets you write CSV files as output 
for excel (with all the special rules etc)???   :-)
Gregg
21-Jul-2007
[629x5]
I don't, but let's see how close this gets us. First, here is a
support func for building delimited series. In addition, you'll need
my COLLECT func from REBOL.org, or something similar.
delimit: func [
        "Insert a delimiter between series values."
        series [series!] "Series to delimit. Will be modified."
        value            "The delimiter to insert between items."
        /skip   ;<-- be sure to use system/words/skip in this func
            size [integer!] "The number of items between delimiters. Default is 1."
    ][
        ; By default, delimiters go between each item.
        ; MAX catches zero and negative sizes.
        size: max 1 any [size 1]
        ; If we aren't going to insert any delimiters, just return the series.
        ; This check means FORSKIP should always give us a series result,
        ; rather than NONE, so we can safely inline HEAD with it.
        if size + 1 > length? series [return series]
        ; We don't want a delimiter at the beginning.
        series: system/words/skip series size
        ; Use size+n because we're inserting a delimiter on each pass,
        ; and need to skip over that as well. If we're inserting a
        ; series into a string, we have to skip the length of that
        ; series. i.e. the delimiter value is more than a single item
        ; we need to skip.
        size: size + any [
            all [list? series  0] ; lists behave differently; no need to skip dlm.
            all [any-string? series  series? value  length? value]
            all [any-string? series  length? form value]
            1
        ]
        head forskip series size [insert/only series value]
    ]
fmt-for-Excel: func [blk dlm] [
	rejoin delimit collect fld [
		foreach val blk [
			val: form val
			replace/all val {"} {""}
			if find val #"," [val: rejoin [{"} val {"}]]
			fld: val
		]
	] dlm
]

fmt-for-Excel [{"A"} "B" "C,C" {"4,4"}] #","
If this output looks correct, then you just need to test it some 
more, with other possible scenarios.

>> fmt-for-Excel [{"A"} "B" "C,C" {"4,4"}] #","
== {""A"",B,"C,C","""4,4"""}
The case I'm not sure about is the last one there, where you end 
up with triple quotes, because they're doubled first, then an outer 
pair is added because of the embedded delimiter.
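For comparison, here is a minimal sketch of the usual CSV convention (the one described in RFC 4180): a field is wrapped in quotes whenever it contains the delimiter, a quote, or a line break, and embedded quotes are doubled. The names csv-field and csv-line are made up for illustration and are not part of the code above; whether Excel accepts every case is an assumption that still needs testing.

```rebol
REBOL [Title: "CSV field quoting sketch (hypothetical helpers)"]

; Hypothetical helper: form one value as a CSV field.
csv-field: func [
    "Return VAL formed as a CSV field, quoted only when needed."
    val
    dlm [char!] "Delimiter character"
    /local str
][
    str: form val
    either any [
        find str dlm        ; embedded delimiter
        find str #"^""      ; embedded double quote
        find str newline    ; embedded line break
    ][
        ; Double the embedded quotes, then wrap the field in an outer pair.
        rejoin [{"} replace/all copy str {"} {""} {"}]
    ][
        str
    ]
]

; Hypothetical helper: join a block of values into one delimited line.
csv-line: func [blk [block!] dlm [char!] /local out][
    out: copy ""
    foreach val blk [
        append out csv-field val dlm
        append out dlm
    ]
    ; Drop the trailing delimiter, if any.
    if not empty? out [remove back tail out]
    out
]

print csv-line [{"A"} "B" "C,C" {"4,4"}] #","
; expected: """A""",B,"C,C","""4,4"""
```

Under this convention the triple quotes on the last field are expected, and the first field gets an outer pair of quotes as well, since it contains quote characters.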
PatrickP61
23-Jul-2007
[634]
Hey Gregg,  Thanks for the code.  I tried it out and while there
are a few hiccups, I am planning on using that code when I create an
Excel CSV version in addition to the Tab delimited version.   Thank
you!
Gregg
23-Jul-2007
[635]
Let me know what the hiccups are, if you can, so we can get a solid 
version out there for people to use. Thanks.
RobertS
1-Aug-2007
[636x7]
can you tell me why to-file does not care if a word holding the value 
of a file name is presented as
to-file :filename
or  to-file filename
source to-file is very simple  to-file: func [value] [ to file! :value 
]  ; which seems to apply : to a get-word! in the first case
Put another way, what is the rule for when a word must explicitly
bear the sigil prefix of  :  ? I.e., when is a get-word! required
and when does any word suffice?  ::word is an error but   to file!
:myString  in a func is no different from  to file! myString   and
how do you put carriage-returns into this message-post box!  ;-)
ed: func [/file filename [string! file!] "a file name" /local fn]
	[either file 
		[either exists? fn: to file! :filename
			[editor fn]
			[fn: ask "file name:  " editor to file! :fn] ]
		[editor {}]
	]
comment { this works the same

ed: func [/file filename [string! file!] "a file name" /local fn]
	[either file 
		[either exists? fn: to file! filename
			[editor fn]
			[fn: ask "file name:  " editor to file! fn] ]
		[editor {}]
	]}
btiffin
1-Aug-2007
[643x3]
Robert.  get-words are "unevaluated", so no code will execute getting 
to the value.  Most datatypes will return the same value for get 
as for evaluate.  But getting functions will return the function, 
not the result of evaluating the function.  Umm, that's probably 
not a Ladislav level answer, but it's how I think about it.
I'm not completely clued in, but I think get-words can be faster
as well, as the lexical scanner can skip the evaluation.  In your
case, evaluating a filename returns a filename, and (I only assume)
is an extra (nearly empty?) step compared to just getting the filename.
For instance,  a: now  gives you a date-time value that won't change
whenever you reference a or get a, and its type is date!   a: :now
gives you an a that will be the current time whenever it is evaluated,
but if you get-word a with :a or get 'a you get back the native,
not the datetime, so a's type reports as native!  It's funky and
fun.
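A small console-style sketch of that difference, assuming REBOL 2, where NOW is a native. The word b is used here only to keep the two cases apart.

```rebol
a: now              ; NOW is evaluated once; A holds a date! value
probe type? a       ; date!

b: :now             ; get-word: B now refers to the NOW function itself
probe type? b       ; evaluating B calls NOW, so this is date!
probe type? :b      ; getting B returns the function: native!
probe type? get 'b  ; same as above: native!
```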
Gregg
1-Aug-2007
[646x3]
Carriage returns - click the pencil icon to change to ctrl+s as the 
send key.
>> logname: does [rejoin [now/date ".log"]]
>> to-file logname
== %1-Aug-2007.log
>> to-file :logname
== %?function?
Brian's explanation is good; it's something to play around with in 
the console, to get a feel for things.
RobertS
1-Aug-2007
[649]
what seems a little spooky is the way the behavior Gregg illustrates 
disappears when I define to-file as
to-file: func [value] [to file! value] ; cool - or spooky
btiffin
1-Aug-2007
[650x2]
REBOL is both and more.  :)   But in that last example, although 
you may have passed the "unevaluated" logname, the function by using 
 the  value  reference, evaluates it.  :)
Try func [:val]  and func ['val]  for even more fun
RobertS
1-Aug-2007
[652]
wilco ;-)
btiffin
1-Aug-2007
[653x5]
The lexical scanner will pass the "uneval" value in the first case 
and the literal value in the second.  It is tricky, but it slowly 
starts to make sense.
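A minimal sketch of the three argument styles, assuming REBOL 2 behavior. The function names f-std, f-get and f-lit are made up for illustration; each one just reports the type of what it received.

```rebol
f-std: func [val]  [type? :val]   ; normal argument: evaluated before the call
f-get: func [:val] [type? :val]   ; get-word argument: passed without evaluation
f-lit: func ['val] [type? :val]   ; lit-word argument: a plain word is passed literally

probe f-std now   ; date!  (NOW was evaluated)
probe f-get now   ; word!  (the word NOW itself was passed)
probe f-lit now   ; word!
```

The body uses :val in every case so that the function only looks at what it was given and never evaluates it a second time.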
But I recommend practise, and let yourself be confused.  :)
It's a weird time in Altme as well.  The gurus that are usually around 
to give very detailed and exact wording to these issues are all busy 
with R3, so show up here on a less regular basis.  You may have to 
filter through some tier B rebol explanations for the next few days. 
 :)
Now having said that...Gregg, Geomol, many many others dish out good 
help...but watch for Gabriele, Ladislav and some others as they seem 
to have a gift for explaining things so a computer would understand 
without ambiguity.
I'm in a catch-22 now...Gregg and John are gurus, they speak human 
and computer...Gabriele and Ladislav are gurus, they speak computer 
and human...Gregg and John are gurus, they speak computer and human...Gabriele 
and Ladislav are gurus, they speak human and computer...  Just like 
REBOL, I know what I want to say but it's deeper than I can express. 
 :)
RobertS
1-Aug-2007
[658]
Thanks - I don't discourage easily...  The Rebol for Dummies is not 
too very helpful on some points - I now like the 'Official Guide' 
a lot for the path it takes, but some of the typos/misprints must 
have irked Carl ;-)
btiffin
1-Aug-2007
[659]
:)  I've never actually read the books...everything I know is from 
the Core manual, experiments and the guys here.
btiffin
2-Aug-2007
[660x2]
Robert and all;  I just bumped into this one again...so I thought 
I'd mention it.


none is a weird value, in that it usually looks like none, but a
lot of the time it is  'none  the word!, not the value none of type none!

My suggestion...when starting out, get used to typing  #[none]


a: [none none]   type? first a  is word!  and none? will test false.

a: [#[none] #[none]]  nice and safe...  type? first a  is none!  
and none? will test true.
Umm, get used to typing #[none] if you are putting none into a block
that is.  a: none  does what you'd expect.
Gregg
2-Aug-2007
[662x3]
The online Core guide is still the best overall language reference 
IMO.
Core manual rather; agreeing with Brian.
Sometimes MOLD can be very helpful, along with TYPE?, to see if things 
are what you think they are. NONE, datatypes, etc. can be tricky 
at times, to know if they are a word or the value.
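As a rough illustration, assuming a REBOL 2 release that supports the /all refinement of MOLD and the #[none] syntax, MOLD/ALL shows the difference that plain MOLD hides:

```rebol
blk: [none]                      ; holds the word NONE
probe type? first blk            ; word!
print mold/all blk               ; [none]

blk: reduce [none]               ; holds the value of type none!
probe type? first blk            ; none!
print mold blk                   ; [none]    looks just like the word
print mold/all blk               ; [#[none]] the serialized form shows the value
```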
Geomol
2-Aug-2007
[665]
Is there a list anywhere of what values are actually their expected 
values, and what values are seen as words, when inside a block? As 
in:


blk: [none 1 1.2 integer! [email-:-somewhere-:-net] #an-issue 127.0.0.1]
etc. etc.

If not, someone should make one such list!
Gregg
2-Aug-2007
[666]
All words are seen as words, unless reduced.


>> blk: [none true :word word: 'word integer! 1 1.0 1.0.0 $1 1x1 
"string" <tag> [name-:-host] #issue]

== [none true :word word: 'word integer! 1 1.0 1.0.0 $1.00 1x1 "string" 
<tag> [name-:-host] #issue]
>> head forall blk [change blk type? first blk]

== [word! word! get-word! set-word! lit-word! word! integer! decimal! 
tuple! money! pair! string! tag! email! issue!]


>> blk: reduce [none true 'word integer! 1 1.0 1.0.0 $1 1x1 "string" 
<tag> [name-:-host] #issue]

== [none true word integer! 1 1.0 1.0.0 $1.00 1x1 "string" <tag> 
[name-:-host] #issue]
>> head forall blk [change blk type? first blk]

== [none! logic! word! datatype! integer! decimal! tuple! money! 
pair! string! tag! email! issue!]
Anton
3-Aug-2007
[667]
Geomol, such a list cannot be made, if I understand you correctly.
Geomol
3-Aug-2007
[668]
Ok, why not? I was thinking about something like Gregg just did, 
but with all the datatypes. Putting the different datatypes in a 
block with a simple assignment and then check, how they're seen by 
REBOL.
btiffin
3-Aug-2007
[669]
Carl wants such a list for   form, mold, to string!,  format (but 
that's R3), add the serial form, score some points and help the beginners 
in one grand pdf-maker datatype file.  Not much to ask, is it John? 
 :)
Geomol
3-Aug-2007
[670]
I would be cool, if we could kick some of the new ones to make such 
a list. It would be a good learning experience.
btiffin
3-Aug-2007
[671]
I think it'll be a very useful and worthy entry for DocBase, but 
it would also be nice to use Gabriele's PDF-MAKER or other page layout 
for a nice reference sheet printout.
Geomol
3-Aug-2007
[672]
*It would be cool*
Henrik
3-Aug-2007
[673]
hmm... that's a good idea. :-)
btiffin
3-Aug-2007
[674]
So...get RT to open DocBase and I'll start right now, oh yeah and
R3 beta for access to format  :)  Kidding... I was going to be playing
with PDF-MAKER today, to see if I can add print functionality to
the Desktop Librarian...maybe I'll use this as a good way to learn the
dialect.  So unless someone else steps up...I'll take a kick at it,
for R2 datatypes anyway.
Gabriele
3-Aug-2007
[675]
hmm, not sure i understand here. what do you want to add to docbase?
btiffin
3-Aug-2007
[676]
I was just kidding...I'm about halfway through using your pdf-maker
(v2) to document the R2 datatypes regarding  to string! form mold.
 Blog http://www.rebol.net/r3blogs/0092.html but like everything
I do nowadays, it comes pre-expired  :)