[REBOL] Re: Mold, load and Core 2.6

From: nitsch-lists:netcologne at: 7-Mar-2002 18:17

RE: [REBOL] Mold, load and Core 2.6 [holger--rebol--net] wrote:
> Here are some comments regarding the recent discussion on mold and load,
> explanations of which of those observations represent bugs and which do not,
> and what is going to change in the next upcoming Core 2.6 release:
>
> First of all, the *intended* use of load and mold for data is in the following
> way:
>
> stored data -> load -> data in memory -> mold -> stored data.
>
> Technically, instead of "load" you should use "first load/all", because
> load/all is more "transparent" in that it does not try to interpret the header
> or remove an outermost block.
>
> If used in this way, i.e. always starting off with load, there are, as far as
> we are aware of, no bugs in mold or load.
>
> Sometimes people try to use load and mold in a different way:
>
> data in memory -> mold -> stored data -> load -> data in memory
>
> i.e. to serialize data. If used for serialization, mold and load have a number
> of limitations, just like any other serialization system in any language. Some
> of the limitations are unavoidable (how do you serialize an open socket
> connection ?), others could be removed by improvements in the implementation.
>
> We are currently aware of the following issues when using load and mold for
> serialization:
>
> 1. Serialization means creating literal representations of data. Unfortunately
> not all datatypes *have* literal representations. This leads to two types of
> problems:
>
> 1.a) Some values, when molded, become words, and are thus indistinguishable
> from regular words. For instance "mold 'none" and "mold none" both result in
> "none", which, when loaded back, becomes the word none, not the value none.
>
> Comment: Will be addressed in Core 2.6.
>
> 1.b) Some values, when molded, become a sequence of values that represent
> instructions how to recreate the value. For instance molding a hash! results in
> something like "make hash! [1 2 3]". The problem with that is that it requires
> the loading script to evaluate the resulting block (which it ordinarily is not
> supposed to do, because other items in the block, e.g. words, cannot be
> evaluated, plus evaluation is a security risk if the data is untrusted).
>
> Comment: Will be addressed in Core 2.6.
>
> 2. Series indices are not included in the molded data. For instance a string
> series "abc" with an index of 2 becomes "bc" after mold and load. Molding drops
> the data before the index.
>
> Comment: Will be addressed in Core 2.6.
>
> 3. It is not possible to create an object! without performing an evaluation of
> the value fields, i.e. molding and loading an object! represents a security
> risk, even if the loader is careful and explicitly checks for the words "make"
> and "object!", because the spec block may contain expressions with side
> effects.
>
> 3.a) A related issue: Objects have to be molded in such a way that the
> resulting object spec block can be evaluated. This causes some problems
> and ambiguities, e.g. object containing lit-words or set-words as values
> are not correctly molded and loaded, because during evaluation such values
> do not behave as regular values, but have side effects.
>
> Comment: Will be addressed in Core 2.6.
>
> 4. In memory it is possible to create values of certain datatypes that include
> characters or expressions that are usually not valid for that particular
> datatype. For instance it is possible to create an email! value which does not
> contain an "@" sign, or an issue which contains a semicolon. mold and load do not
> handle this correctly, because the scanner requires certain hints to identify
> datatypes.
>
> Comment: Not a bug, will not be addressed. Creating datatypes with invalid
> contents is simply an invalid operation. REBOL allows you to do it (because
> type restrictions are only checked when scanning), but that does not mean that
> it is a valid thing to do. If you want to be able to process arbitrary data
> then you need to use string! and binary! only. Other string series have
> limitations on their structure which have to be complied with for mold and load
> to work.
>
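
(Illustration, not part of the original exchange: a minimal sketch of issues 1.a and 2 as described in the quoted text, using only plain mold and load. The word s is just an illustrative name, and the comments state what the quoted description predicts.)

```rebol
REBOL [Title: "Plain mold/load round trip (sketch)"]

; 1.a) The value none and the word 'none mold to the same text,
; so loading it back yields a word, not the none value.
print mold none                 ; none
print mold 'none                ; none
print type? load mold none      ; word

; 2) Data before the series index is dropped by mold.
s: next "abc"                   ; string "abc" positioned at index 2
print mold s                    ; "bc"
print index? load mold s        ; 1
```
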
haha. then change import-email. or use it, meet some mad spammer, and destroy your archive the next time you save, since it "to-emails" whatever this guy thinks could be a nice broken address. i think this is a general problem: one will not check everywhere for proper formatting. to-email does the job 99.99% of the time, then some crazy data destroys everything. remembering {{} . i expect a fix in a year or so? #[email! "/badguy-hahaha"] would be so easy..
> 5. Various issues regarding references (circular or otherwise). This is always
> difficult to handle in the context of serialization. There are three cases:
>
> 5.a) The data represents a tree, i.e. each item is referenced no more than
> once, and there are no cycles. For this type of data mold and load should work
> without problems, and this is the type of data organization recommended if data
> needs to be serialized.
>
> 5.b) The data represents a directed, acyclic graph, i.e. there are no cycles,
> but data items can be referenced more than once. For this type of data
> mold and load should still work, but the referenced items may be included in the
> molded data separately for each reference, i.e. after loading the data back the
> references point to separate items.
>
> 5.c) The data represents a general, directed graph, with cycles. This is
> strongly discouraged :-). mold and load will not work at all with this kind of
> data.
>
> Comment: No changes are planned regarding this, and the current behavior is not
> considered to be a bug. Serialization of data structures with non-tree-based
> references requires special serialization functions. mold and load are not
> suitable for this.
>
> 6. Word bindings are not preserved by mold.
>
> Comment: Not a bug. No changes are planned regarding this. It would be pretty
> much impossible to correctly preserve word bindings without saving the complete
> REBOL machine state :-). Load always binds words into the global context.
>
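
(Illustration, not part of the original exchange: a small sketch of case 5.b. The words shared, data and restored are hypothetical names chosen for the example. One block is referenced twice; after mold and load the two references point to separate copies.)

```rebol
REBOL [Title: "Shared references after mold/load (sketch)"]

shared: [1 2]
data: reduce [shared shared]              ; two references to the same block

print same? first data second data        ; true

; The shared block is written out once per reference ...
print mold data                           ; [[1 2] [1 2]]

; ... so after loading, the two items are separate blocks.
restored: load mold data
print same? first restored second restored   ; false
```
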
why bind globally? we get kicked whenever somebody cleverly inserts a paren!. having load/unbound would be more secure, as would binding to a restricted set of words in a fresh context, like [make object! true false none]. (to-block sometimes hangs here, and is not exactly the same.)
> As far as mold and save are concerned, the major change for Core 2.6 is the
> /all refinement, which makes mold and save more suitable for serialization.
> The /all refinement has the following effects:
>
> - (Almost) all data types are molded in a literal form. Datatypes which already
>   have a natural literal form continue to use this form (integer, words etc.)
>   Datatypes which so far have not had a literal form will use a new notation
>   that acts as a pseudo-literal. This notation is "#[type! description]" or,
>   for some datatypes, "#[value]" (without the quotes). For instance:
>
>   Value-oriented pseudo-literals:
>
>     true   -> #[true]
>     false  -> #[false]
>     none   -> #[none]
>     unset! -> #[unset!]
>
>   Datatype-oriented pseudo-literals:
>
>     object -> #[object! [a: 1 b: 2 ...]]
>     list   -> #[list! [a b c ...]]
>
>   etc.
>
>   When loading pseudo-literals back no special refinements have to be used
>   with 'load. The 'load function recognizes pseudo-literals just like all
>   other literals.
>
> - When a series with an index different than the head of the series is molded
>   then the complete series is molded while preserving the index. To do this,
>   the series is molded in its pseudo-literal form, with the index following the
>   content. For instance the string "abc" with an index at position 2 is molded
>   as "#[string! "abc" 2]". Loading the string back results in a string "abc"
>   with an index at position 2.
>
> - When object pseudo-literals are loaded the spec block is not treated as a
>   block to be executed under the object's context, but strictly as a name/value
>   pair block. This allows objects containing set-words, lit-words etc. to be
>   loaded correctly. For instance #[object! [a: val: b: 1]] results in the
>   object [
>       a: val: (a containing the set-word val)
>       b: 1
>   ]
>   instead of
>   object [
>       a: 1
>       val: 1
>       b: 1
>   ] as you would get from make object! [a: val: b: 1].
>
i would think [a: #[val:] b: 1] is more obvious?
>   Also, the value items are not evaluated before storing them in the object.
>   They are treated as literals. These changes should make it completely safe
>   to send molded objects across untrusted communication lines and load them
>   back at the receiver.
>
> Please note that the normal output format of mold and load is not affected
> at all. The changes only affect the output of mold/all and load/all. The
> intended use is:
>
> - If you start with a string representation or a file, load that file,
>   manipulate the resulting block, and then write the block back into a
>   file, then use mold or save without the /all refinement.
>
> - If you start with some data structure in memory, mold it for
>   serialization purposes, store it on disk or send it across a network,
>   and then load it back, then use mold or save with the /all refinement.
>
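
(Illustration, not part of the original exchange: a sketch of the serialization round trip with the /all refinement. It assumes an interpreter in which mold/all behaves as announced in the quoted message, i.e. Core 2.6 or later; the comments repeat what that description predicts and have not been checked against a particular release. The word s is an illustrative name.)

```rebol
REBOL [Title: "mold/all round trip (sketch, assumes Core 2.6 or later)"]

; Value-oriented pseudo-literal: none survives as a value.
print mold/all none                  ; #[none]
print type? load mold/all none       ; none

; Series index is preserved in the pseudo-literal form.
s: next "abc"                        ; string "abc" positioned at index 2
print mold/all s                     ; #[string! "abc" 2]
print index? load mold/all s         ; 2
```
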
all in all sounds great. makes load/mold for serialisation pretty usable.
drawbacks are:
- unparsable data breaks all -> no use of handy parsings like import-email. at least some check while molding would be nice, instead of something like [equal? mold data mold load mold data] as today..
- global binding -> paren! kills security (or use :this :that everywhere..)
- crazy molded set-words
- oh yes, and if the newline-tag could be set by program.. i don't like having 4K-lines after reduce, unable to fix it in block-form. having to mold everything by hand and reload isn't the best solution..
> --
> Holger Kruse
> [kruse--nordicglobal--com]
> --
-volker