[REBOL] Re: areas, text, and read/binary
From: brett:codeconscious at: 12-Feb-2002 11:24
> I am reading file contents with 'read/binary, because the file may be
> text or an image.
> If the file is text then I display those contents in an 'area with
> 'to-string contents.
> I can edit those contents, and then would like to save the changed
> contents (my-area/text).
> Here is the problem: the line terminators in the saved text seem to
> multiply.
> i.e. with each new save of the text an additional line terminator is
> added to the end of each line.
I couldn't duplicate your problem. What operating system are you using -
what is the normal line termination for your text files?
> Keep in mind that for reasons not mentioned here I would always like to
> 'read/binary to get the file contents, whether text file or image file.
> (I know this problem disappears if I just 'read text files, but it
> makes the script very much more complicated in several places).
READ without the binary refinement is doing real work for you. If you choose
to bypass this work using the /binary refinement you'll just have to do it
yourself elsewhere. The Core user guide says:
"When a file is read as text, all line terminators are converted to
newline (line feed) characters. Line feeds (used as line terminators on
Amiga, Linux, and UNIX systems), carriage returns (used as line
terminators on Macintosh), and the CR/LF combination (PC and Internet)
are all converted to the equivalent newline characters."
"Using a standard line terminator within scripts allows them to operate
in a machine-independent fashion."
So REBOL defines newline (LF) as the line terminator. Let's say your OS
uses CR/LF as the line terminator. What then does AREA get to work on?
The code in REBOL is expecting just a LF, so how will it treat the CR? I
doubt that it will treat it as part of the line termination; it will be
treated as another ordinary text character. In this case you will end up
editing it out when using AREA. So now your text is inconsistent when
you go to save it, via /binary or not.
If you use /binary when you write, you will have broken line termination
(missing CRs). If you don't use /binary, you will have extra CRs (the
ones you didn't edit out), because WRITE in text mode converts each
newline to the platform's line terminator.
Worse, your application will not work across platforms even if you get
it working on one.
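If you really must keep READ/BINARY, the conversion described above can
be done by hand before the text reaches AREA. This is only a minimal
sketch: NORMALIZE-NEWLINES is a name made up for this example (it is not
part of REBOL), and %sample.txt is a placeholder file.

```rebol
REBOL [Title: "Newline normalization sketch"]

; Hypothetical helper, not part of REBOL: turn CR/LF pairs and lone CRs
; into the single newline (LF) that REBOL text handling expects.
normalize-newlines: func [data [binary! string!] /local text] [
    text: to-string data
    replace/all text "^M^/" "^/"    ; CR/LF -> LF (run this pass first)
    replace/all text "^M" "^/"      ; any remaining lone CR -> LF
    text
]

; Create a small CR/LF file so the example is self-contained.
write/binary %sample.txt to-binary "one^M^/two^M^/"

contents: read/binary %sample.txt
text: normalize-newlines contents   ; this is what would go into my-area/text

; Save WITHOUT /binary so WRITE restores the platform's line terminator.
write %sample.txt text
```

The point of the design is that the text only ever contains plain
newlines while it is inside REBOL, so saving it repeatedly should not
accumulate extra CRs.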
My understanding is that you are doing this to avoid the work of treating
text and binary files differently when calling READ.
I had the same problem when uploading my site to the web using FTP. My
solution was to define a function that decided what was appropriate, /binary
or not, depending ultimately on the extension of the file. I actually
defined my own scheme for mapping file extensions to mime types:
txt --> text/plain
htm --> text/html
html --> text/html
jpg --> image/jpeg
jpeg --> image/jpeg
etc.
It would be nice if this mapping facility came with Rebol.
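A rough sketch of the idea, not my actual code: the word names
(MIME-TYPES, MIME-TYPE-OF, TEXT-FILE?, READ-FILE, WRITE-FILE) are
invented for this example, and treating unknown extensions as binary is
an assumption you may want to change.

```rebol
REBOL [Title: "Choose /binary by file extension (sketch)"]

; Extension to MIME type table, using the pairs listed above.
mime-types: [
    %.txt  "text/plain"
    %.htm  "text/html"
    %.html "text/html"
    %.jpg  "image/jpeg"
    %.jpeg "image/jpeg"
]

; MIME type for a file name, or none if the extension is unknown.
mime-type-of: func [file [file!] /local ext] [
    all [
        ext: find/last file "."         ; e.g. %.txt for %notes.txt
        select mime-types copy ext
    ]
]

; True only when the MIME type starts with "text/".
text-file?: func [file [file!] /local type] [
    type: mime-type-of file
    found? all [type find/match type "text/"]
]

; Text files get line terminator conversion, everything else is raw.
read-file: func [file [file!]] [
    either text-file? file [read file] [read/binary file]
]

write-file: func [file [file!] data] [
    either text-file? file [write file data] [write/binary file data]
]
```

With something like this the rest of the script can call READ-FILE and
WRITE-FILE everywhere and the decision about /binary lives in one place.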
Brett.