Mailing List Archive: 49091 messages

[REBOL] Re: areas, text, and read/binary

From: brett:codeconscious at: 12-Feb-2002 11:24

> I am reading file contents with 'read/binary, because the file may be text
> or an image.
> If the file is text then I display those contents in an 'area with
> 'to-string contents.
> I can edit those contents, and then would like to save the changed
> contents (my-area/text).
> Here is the problem: the line terminators in the saved text seem to
> multiply.
> i.e. with each new save of the text an additional line terminator is added
> to the end of each line.

I couldn't duplicate your problem. What operating system are you using - what is the normal line termination for your text files?

> Keep in mind that for reasons not mentioned here I would always like to
> 'read/binary to get the file contents, whether text file or image file. (I
> know this problem disappears if I just 'read text files, but it makes the
> script very much more complicated in several places).

READ without the /binary refinement is doing real work for you. If you choose to bypass this work using the /binary refinement, you'll just have to do it yourself elsewhere. The Core user guide says:

"When a file is read as text, all line terminators are converted to newline (line feed) characters. Line feeds (used as line terminators on Amiga, Linux, and UNIX systems), carriage returns (used as line terminators on Macintosh), and the CR/LF combination (PC and Internet) are all converted to the equivalent newline characters."

"Using a standard line terminator within scripts allows them to operate in a machine-independent fashion."

So REBOL defines newline as the line terminator. Let's say your OS uses CR/LF as the line terminator. What then does AREA get to work on? The code in REBOL is expecting just an LF, so how will it treat the CR? I doubt that it will treat it as part of the line termination; it will be treated as just another ordinary text character. In that case you will end up editing some of the CRs out when using AREA.

So now your text is inconsistent when you go to save it, whether via /binary or not. If you use /binary when you write, you will have broken line termination (missing CRs). If you don't use /binary, you will have extra CRs (the ones you didn't edit out). Worse, your application will not work across platforms even if you get it working on one.

My understanding is that you are doing this to avoid the work of treating text and binary files differently when calling READ. I had the same problem when uploading my site to the web using FTP. My solution was to define a function that decided what was appropriate, /binary or not, depending ultimately on the extension of the file.
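
To make the two options above concrete, here is a rough, untested sketch rather than the code I actually used. The names normalize-newlines, text-suffixes and read-by-suffix are made up for illustration, and the suffix list is only an assumption about which files count as text. The first function does by hand what plain READ would otherwise do for you; the second picks READ or READ/BINARY from the file suffix.

```rebol
REBOL [Title: "Line terminator sketch (illustrative only)"]

; Hypothetical helper: turn CR/LF pairs and lone CRs into newline (LF).
; CR/LF is handled first, otherwise each pair would become two newlines.
; Note that REPLACE modifies the string in place.
normalize-newlines: func [text [string!]] [
    replace/all text crlf newline
    replace/all text cr newline
    text
]

; Hypothetical list of suffixes to treat as text (an assumption, extend as needed).
text-suffixes: [%.txt %.htm %.html]

; Hypothetical helper: plain READ for known text suffixes, READ/BINARY otherwise.
; SUFFIX is the part of the file name from the last "." onward, or NONE.
read-by-suffix: func [file [file!] /local suffix] [
    suffix: find/last file "."
    either all [suffix find text-suffixes suffix] [
        read file
    ] [
        read/binary file
    ]
]
```

If you stay with READ/BINARY everywhere, something like `normalize-newlines to-string read/binary %notes.txt` would give AREA text with newline-only terminators, and saving my-area/text with a plain WRITE (no /binary) should then put back the terminator your platform expects.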

I actually defined my own scheme for mapping file extensions to mime types:

    txt  --> text/plain
    htm  --> text/html
    html --> text/html
    jpg  --> image/jpeg
    jpeg --> image/jpeg
    etc.

It would be nice if this mapping facility came with REBOL.

Brett.