World: r3wp
[I'm new] Ask any question, and a helpful person will try to answer.
older newer | first last |
PatrickP61 19-Jul-2007 [590] | Hi all -- I didn't mean to cause additional entries to other libraries -- just tried to format it in my own routine. But since there is interest, I double checked the official IBM SQL reference manual for timestamps. So for the record and for your information: Although there are many many different forms (Japanese, Europe etc), I only focused on two elements, the ANSI ISO timestamp, and IBM SQL timestamp standard. So in the IBM book "DB2 UDB for iSeries SQL Reference V5R370" under "Data Types, Datetime values, table 10 page 70": _________________________________________________________________________________________________ Table 10. Formats for String Representations of Timestamps Format Name Time Format Example ANSI/ISO SQL standard TIMESTAMP ’yyyy-mm-dd hh:mm:ss.nnnnnn’ TIMESTAMP ’1990-03-02 08:30:00.010000’ IBM SQL ’yyyy-mm-dd-hh.mm.ss.nnnnnn’ ’1990-03-02-08.30.00.010000’ 14–character form ’yyyymmddhhmmss’ ’19900302083000’ _________________________________________________________________________________________________ For the record, I think I confused the two types. Notice the embedded space between the date and time as well as : separators for ANSI/ISO, while the IBM SQL standard contains it all (no embedded space) and uses periods to separate the time elements, which I will easily fix in my version. -- You may wish to do the same for any new form of date you have. |
btiffin 19-Jul-2007 [591] | Patrick; No problem about library entries. :) When REBOL grows we all benefit. Nothing in rebol.org is 'official'. It's a user maintained repository of 'stuff'. Chris' form-date just happens to be one of the beauties. It is close to but not the same as the C strftime function. With form-date you can make up pretty much any date time output you'd like. No one has to use form-date, I was just cheerleading. So I'll cheerlead a liitle bit. I you haven't yet, check out http://www.rebol.org.Sunanda and team have created a a world class repository of information and functionality that is all REBOL user community generated. |
PatrickP61 19-Jul-2007 [592] | Thanks btiffin -- Yes I agree that having a library of common routines is the way to go. I'm new and just learning how to play with Rebol with the approach of -- If I do it this way, what will rebol do, rather than, use the library to find common routines, and use it, which of course I would do to solve a specific problem. Rebol.org is a great resource! |
btiffin 19-Jul-2007 [593] | Nice approach. REBOL really does allow for exploration, and when puzzles are seen from an angle that was not even considered before is when the magic can happen. :) |
PatrickP61 19-Jul-2007 [594] | Hi all, As you may have guessed from my above posts, I'm trying to write a script that will convert a formatted report into a table or CSV. I'm new and just playing around to understand the process. In any event, I did search rebol.org on CSV and found the CSV.r script which seems to a part of what I would like to do. But here is my concern. The Mold-CSV function does not handle all the different kinds of strings that can occur. I'm talking about embedded " or {. I would like a function that can handle all strings properly into CSV. Take this example: In-block: [ [ "C A1" "C B1" ] [ 2 3 ] [ "2" "3" ] [ "4a" "4b" ] [ "5a x" "5b y" ] [ ""7a,"" ""7b,"" ] [ ""a8""" ""b8""" ] ] Mold-CSV In-block doesn't handle 7a or a8 lines properly since the "" terminates a string. You could replace the first and last " with a brace {} but that does some funny things too. |
Henrik 19-Jul-2007 [595x2] | those should be converted to ^" and ^{ ^}, I believe |
>> "^"" == {"} | |
PatrickP61 19-Jul-2007 [597] | Working in reverse, If you go into Microsoft Excel and type in your desired contents, you will see how they do it: - If a cell has embedded commas, then " are put around the entire cell contents. - If a cell has embedded " like abc"def, then the " is repeated as "abc""def". just food for thought |
btiffin 19-Jul-2007 [598] | Patrick; This is pretty common with a lot programming, escaping quotes inside strings. And Henrik, just beat me to it and offered up a starting point... :) |
PatrickP61 19-Jul-2007 [599x2] | Henrik, are you saying that I should change any embedded " to ^"? Like [ "^"7a,^"" ] |
. | |
Henrik 19-Jul-2007 [601] | that would do it, yes |
PatrickP61 19-Jul-2007 [602x3] | I'll try it! |
Getting better, but still no cigar. Here is my test code for Mold-CSV function in CSV.r script: ( I hope this formats correctly on Altme) In-block: [ ; (want in csv file) ; (want in excel) ["Col A1" "Col B1" ] ; Col A1,Col B1 ; Col A1 Col B1 [2 3 ] ; 2,3 ; 2 3 ["'3" "'4" ] ; '3,'4 ; '3 '4 ["4a" "4b" ] ; 4a,4b ; 4a 4b ["^"5a^"^"^"^"" "^"^"^"5b^"^"" ] ; "5a""""","""5b""" ; 5a"" "5b" ["6a x" "6b y" ] ; 6a x,6b y ; 6a x 6b y ["7a, x" "7b,^"^" y" ] ; "7a, x","7b,"" y""" ; 7a, x 7b," y" ["^"8a ^"^",x" "^"8b ^"^"^"^"y^"" ] ; "8a "",x","8b """" y" ; 8a ",x 8b "" y ["^"^"^"9a^"^" x" "^"9b ^"^"y^"^"^"" ] ; """9a"" x","9b ""y""" ; "9a" x 9b "y" ] Out-block: Mold-CSV In-block write %Book2.csv Out-block ____________ In the above, I have 3 "views" if you will of what I am after. The first view is the In-block that I would like Mold-CSV to run against. The second commented view is what I need Mold-CSV to generate to put into the csv file The third commented view is what Microsoft Excel will generate when I open the CSV file. Mold-CSV works fine for the first 6 lines, then it gives me this for lines 7,8 and 9: 7a, x ,{7b,"" y} <-- Where did the braces come for for 7b? {"8a "",x},"8b """"y" <-- Same quest for 8a? 9a x ,"9b ""y""" ok Any ideas on how to solve this? | |
Correction: Where did the braces come from for 7b? | |
Anton 19-Jul-2007 [605x6] | Rebol molds long strings, or strings which contain a quote " inside { } instead of " ". |
Watch this: >> "{^"" == {^{"} | |
The string I typed above contains an open brace and a double-quote (which I've escaped). Rebol probably saw that the string contains an escaped quote and decided to mold it with { } to avoid having to escape it. However, now the opening brace { needs to be escaped. | |
In short, Rebol's molding of strings can be in two different formats depending on the input string, and depending on where you get your input string from, it can be hard to guess which one it is. I would suggest to make your own Excel/CSV-string-molding function to ensure you have double-quotes as expected. Other people have come to this exact same problem before, by the way. | |
I don't understand the exact logic of the CSV file quote formatting, but you could use a function similar to this: enquote: func [string][rejoin [{"} string {"}]] >> print enquote "hel^"lo" hel lo" and you could take it further by replacing single instances of " in the string with two instances, etc. | |
Hmm.. http://www.rebol.org/cgi-bin/cgiwrap/rebol/view-script.r?script=csv.r Mold-CSV just uses MOLD, so you should replace it with your own MOLD-CSV-STRING function, similar to the above enquote. | |
PatrickP61 19-Jul-2007 [611] | Anton, Is MOLD-CSV-STRING different than the MOLD-CSV in the above link? I don't see it inside of that script. |
Anton 19-Jul-2007 [612x5] | mold-csv-string: func [string][rejoin [{"} replace/all copy string {"} {""} {"}]] |
With the link I was just verifying that we were talking about the same script in rebol.org. | |
MOLD-CSV uses rebol's built-in MOLD to mold the string. I am proposing to replace MOLD with the above one-liner MOLD-CSV-STRING function. | |
Just replace the two instances of MOLD with MOLD-CSV-STRING. | |
Wait a minute ! MOLD handles any value, while MOLD-CSV-STRING only handles series at the moment... hang on. | |
PatrickP61 19-Jul-2007 [617] | Anton, What do you think of this approach -- I'm just thinking it through and am not sure if I have covered all the bases. Since my input can contain any number of symbols, commas, single and double quotes, and rarely, but possibly braces, what if I attack the problem a different way. Whenever an embedded comma, or double quote or something like that occurs within an spreadsheet cell, it will require some kind of "extra" formatting like two quotes or the like. It may even be that there are unique combinations of such symbols, rare as that would be, to have complex formatting of an input block for rebol to convert it properly for CSV. What if I shift gears and look at a TAB delimited file instead. I know that I will never have TAB embedded in my cells, and that I deal with the entire block as a series instead. I could embed TAB wherever needed to separate the columns and leave the remaining string the way it is. Would that work, or would I still need to do some formatting to handle it. I think I'll open an excel spreadsheet, and work in reverse to see what it needs for TAB delimited file. Any comments? |
Anton 19-Jul-2007 [618x2] | That may alleviate the string quoting problem. Worth a try. |
I think I got it. mold-csv-string: func [value][append insert replace/all either any-string? :value [copy :value][mold :value] {"} {""} {"} {"}] | |
PatrickP61 20-Jul-2007 [620x4] | Anton, I tried the mold-csv-string and got the following: [ [ Col A1 Col B1 ] [2 3] [ '3 '4 ] [ 4a 4b ] [{ 5a } { 5b }] [{ 6a x } 6b y ] [ 7a, x {7b, y }] [{ 8a ,x} { 8b y }] [{ 9a x } { 9b y }] ] Which is not usable for excel -- Maybe I mis-understood how to use it... |
I also checked out tab delimited and the rules for that is much much simpler. If a quote or comma is embedded in the string and does not start with a quote, then I can leave the value as it is. The only thing I need to handle is if a value starts with a quote, then I need to add additional " around it, which I should be able to do! | |
For those of you monitoring and want to see what I mean: Out-string: [ ; (want in tab file) ; (want in excel) {Col A1^-Col B1^/} ; Col A1 Col B1 ; Col A1 Col B1 {2^(tab)3^(line)} ; 2 3 ; 2 3 {'3^-'4^/} ; '3 '4 ; '3 '4 {4a^-4b^/} ; 4a 4b ; 4a 4b {5a""^-"""""5b"^/} ; 5a"" """""5b" ; 5a"" ""5b {"""6a x"""^-6b y^/} ; """6a x""" 6b y ; "6a x" 6b y {7a, x^-7b," y"^/} ; 7a, x 7b," y" ; 7a, x 7b," y" {8a ",x^-8b "" y^/} ; 8a ",x 8b "" y ; 8a ",x 8b "" y {"""9a"" x"^-9b "y"^/} ; """9a"" x" 9b "y" ; "9a" x 9b "y" ] write %Book2.txt Out-string Aside from just need to insert a ^(tab) or ^- at the appropriate places to separate a cell from each other, and a ^(line) or ^/ in a string, I will also need to check the first character of each "cell". If it starts with a ", then I need to add another set around it. See 5b and 9a above to see what I mean. | |
-- (I wish AltMe would show the same spacings / tabs that I type into the New Message box as it would appear on the postings ) | |
Henrik 20-Jul-2007 [624x2] | I'm not really sure what you should use there, because it seems to me that rebol always reduces ^" to ", unless you enter them by hand. |
also I don't know if I understand your problem correctly. it's a bit hard to follow. :-) | |
Gregg 20-Jul-2007 [626] | Just skimmed here, so no new advice to add, except to say that this is one of those cases where it's impossible for REBOL to get it "right", because everybody has different rules about how it should work. I agree that the Excel model is a good one, and I would like to see it support that. REBOL also treats things differently based on whether there are spaces adjacent to the quotes or not, making it even more fun. http://www.rebol.net/cookbook/recipes/0018.html |
PatrickP61 20-Jul-2007 [627x2] | My end goal is to be able to take some formatted text of some kind, something that is generated by a utility of some kind, and generate a spreadsheet from it. The formatted text can be of any type including " and the like. I'm working in reverse, by creating a spreadsheet in MS excel with various kinds of data that I've shown above. Some data with just alpha, just numbers, combinatins, leading quotes, trailing quotes, embedded quotes, embedded commas, spaces etc. Then I saved the spreadsheet as CSV and another version as Tab delimited. Then by looking at those files via notepad or other editor, I can see how the data must be in order for MS excel to accept it. I initially had problems with the CSV model because embedded qutoes needs other qutoes added to that "cell" if you will. The Tab delimited model has less restrictions on it. The only thing that needs attention is when a "cell" starts with a quote, which needs additional quotes added to it. Embedded qutoes or trailing qutoes don't need any modification. Long story short -- I'm going with Tab delimited model and figuring out a rebol script to take data from an IBM utility dump (with rules on what data to capture), and model that info into an excel spreadsheet via Tab delimited file. |
Hi Gregg -- The cookbook recipe is a good one for reading and processing CSV's as input. My main issue is NOT the CSV part itself. It is pretty simple really. But as usual MS has some additional formatting rules whenever certain characters are embedded, and that is the part I'm having trouble with in order for a CSV file to be loaded as a spreadsheet. You don't happen to have one that lets you write CSV files as output for excel (with all the special rules etc)??? :-) | |
Gregg 21-Jul-2007 [629x5] | I don't , but let's see how close this gets us. First, here is a support func for building delimited series. In addition, you'll need my COLLECT func from REBOL.org, or something similar. |
delimit: func [ "Insert a delimiter between series values." series [series!] "Series to delimit. Will be modified." value "The delimiter to insert between items." /skip ;<-- be sure to use system/words/skip in this func size [integer!] "The number of items between delimiters. Default is 1." ][ ; By default, delimiters go between each item. ; MAX catches zero and negative sizes. size: max 1 any [size 1] ; If we aren't going to insert any delimiters, just return the series. ; This check means FORSKIP should always give us a series result, ; rather than NONE, so we can safely inline HEAD with it. if size + 1 > length? series [return series] ; We don't want a delimiter at the beginning. series: system/words/skip series size ; Use size+n because we're inserting a delimiter on each pass, ; and need to skip over that as well. If we're inserting a ; series into a string, we have to skip the length of that ; series. i.e. the delimiter value is more than a single item ; we need to skip. size: size + any [ all [list? series 0] ; lists behave differently; no need to skip dlm. all [any-string? series series? value length? value] all [any-string? series length? form value] 1 ] head forskip series size [insert/only series value] ] | |
fmt-for-Excel: func [blk dlm] [ rejoin delimit collect fld [ foreach val blk [ val: form val replace/all val {"} {""} if find val #"," [val: rejoin [{"} val {"}]] fld: val ] ] dlm ] fmt-for-Excel [{"A"} "B" "C,C" {"4,4"}] #"," | |
If this output looks correct, then you just need to test it some more, with other possible scenarios. >> fmt-for-Excel [{"A"} "B" "C,C" {"4,4"}] #"," == {""A"",B,"C,C","""4,4"""} | |
The case I'm not sure about is the last one there, where you end up with triple quotes, because they're doubled first, then an outer pair is added because of the embedded delimiter. | |
PatrickP61 23-Jul-2007 [634] | Hey Gregg, Thanks for the code. I tried it out and while there is a few hicups, I am planning on using that code when I create an Excel CSV version in addition to the Tab delimited version. Thank you! |
Gregg 23-Jul-2007 [635] | Let me know what the hiccups are, if you can, so we can get a solid version out there for people to use. Thanks. |
RobertS 1-Aug-2007 [636x4] | can you tell me why to-file does not care if a word holding the value of a file name is presented as |
to-file :filename | |
or to-file filename | |
source to-file is very simple to-file: func [value] [ to file! :value ] ; which seems to apply : to a get-word! in the first case | |
older newer | first last |