AltME groups: search
world-name: r3wp
Group: !REBOL3 ... [web-public]
BrianH: 28-May-2011 | It's really not any more of a problem for /local than it is for any other function option or argument, since the real problem is that the techniques for code injection have been revealed. Fortunately, so have the methods for avoiding or counteracting it: APPLY, type checking, get-words, or wrapping expressions in parens or putting them at the end of blocks to make sure that they can't get access to modifiable values later on in the block. | |
BrianH: 6-Jun-2011 | In R2, indexes are constrained to the bounds of the series they reference, so if you shorten the series you lose your position. In R3 your position is preserved for most of the standard operations, just in case you add back to the series later. The operations that can't possibly work for those out-of-bounds references trigger errors (mostly modifying operations), and the ones that could work if you consider series bounds to be an implementation detail and out-of-bounds values to be just not there return none (data access functions). SKIP is an exception, afaik, to allow you to sync up with the bounds - useful, but I don't remember whether it was intentional or a happy accident.
>> a: "abc"
== "abc"
>> c: skip a 2
== "c"
>> remove/part a 2
== "c"
>> index? skip c 0
== 2
>> index? c
== 3
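A rough way to probe the behaviour described above, assuming an R3 build that preserves out-of-bounds positions as in the quoted session. `past-tail?` is a name made up for this sketch, not something taken from the thread:

```rebol
REBOL [Title: "Past-tail probe (sketch)"]

; PAST-TAIL? is a hypothetical helper: it compares the index stored in the
; reference with the index of the real tail of the underlying series.
past-tail?: func [
    "True if the reference's index lies beyond the current tail"
    s [series!]
][
    (index? s) > (index? tail s)
]

a: "abc"
c: skip a 2          ; c remembers index 3
remove/part a 2      ; the underlying string is now one character long

print index? c       ; the session above reports 3 in R3
print past-tail? c   ; compares that stored index with the real tail index
```

Given the indexes quoted in the session, the last line would be expected to print true in R3. In R2, where indexes are constrained as described, the same check should never succeed.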
onetom: 6-Jun-2011 | BrianH: that's a beautiful description. it should be part of the R3/Concepts document. can we just dump it to there? (i don't have write privileges yet) | |
BrianH: 6-Jun-2011 | It would be a victory if we can get to the point where if R3 developers see an error triggered in R3 they are thankful for the information provided, because it will make it easier for them to find the mistake in their code that caused the uncaught error to be triggered in the first place. This is why error locality is so important, as is triggering better errors, and not assuming that something is an error unless the code says it is. If developers can think of errors as being their friends, all the better. | |
onetom: 6-Jun-2011 | these are the kind of explanations which i miss in case of other languages when i'm wondering over design decisions like: "what kind of animal could have come up with such a fucked-in-the-nose idea?"
BrianH: 6-Jun-2011 | It's a little easier with R3, where a lot of the design decisions were made for good reasons, and by consensus. And because we decided early that R3's answer to backwards compatibility with R2 is to keep maintaining R2 for such code. That frees us to fix aspects of R2's design that turned out to not be that good an idea in retrospect. | |
Geomol: 6-Jun-2011 | Deciding what to do with block indexes out of range is a tough call, I think. I understand the argument not to cause errors, if it can be handled somehow, but I'm not sure handling out-of-range problems is always good. What if it's a user bug in the code that made the index get out of range? Then the user won't easily find that bug, as it causes no error. It's not possible to index a block lower than 1 (first element). It's only possible to index out of range at the other end of the block, getting past the tail. And that can only be done by having an index there, and then removing something from earlier in the block. When the index is beyond the tail, then it has to be decided what to do with insert, remove, skip, next, back, pick, select, append used on that index (and maybe more, like TAIL?, INDEX?, ...). What do other languages do?
Geomol: 6-Jun-2011 | Let's say, my index is way beyond the tail, and I insert a new element there. It may then just be appended to the series, which is at an index way before my pointer. What if I then e.g. say: remove/part my-index -1 | |
Ladislav: 6-Jun-2011 | Geomol: "I'm not sure, if this is known or desired behaviour" - Brian used a long description, but the fact is, that the best part of it is "accident". I bet, that it has not been checked, and it is not clear, whether the difference is desirable or not. | |
Ladislav: 6-Jun-2011 | Let's say, my index is way beyond the tail, and I insert a new element there. - you are out of luck, such an operation is not supported | |
Ladislav: 6-Jun-2011 | BTW, "Let's say, my index is way beyond the tail, and I insert a new element there." - in fact, I was right saying that the operation was not supported, since it did not insert a new element there
BrianH: 6-Jun-2011 | Only the part with the behavior of SKIP might be an accident. The rest was the result of a blog discussion over two years ago, which iirc happened while Ladislav was taking a break from REBOL. | |
BrianH: 6-Jun-2011 | I was simplifying to avoid having to write a bunch of test code when my time is limited. Any proofs are welcome, though maybe the R2 proofs should go in Core. | |
BrianH: 6-Jun-2011 | The word "constrained" was a simplification of the real process. | |
BrianH: 6-Jun-2011 | It would be possible and in keeping with the metaphor to have an out-of-bounds INSERT pad blocks with none values, but since strings and binaries don't have a way to have inline nones, that would make the behavior of blocks inconsistent. That is why INSERT behaves the way it does. If you want INSERT to trigger an error in that case, like POKE and set-path modification, that would make sense too. | |
Ladislav: 6-Jun-2011 | APPEND is clearly a different case than INSERT, since APPEND always uses the tail | |
BrianH: 6-Jun-2011 | I think that the incompatibility of NEXT and BACK is more important, and definitely an error, and an accident.
>> a: [1 2 3]
== [1 2 3]
>> b: skip a 2
== [3]
>> remove/part a 3
== []
>> index? b
== 3
>> index? next b
== 3 ; constrained to the argument index, not even the series bounds
>> index? back b
== 2 ; not constrained to the series bounds
>> index? back a
== 1 ; constrained to the head
BrianH: 6-Jun-2011 | Option 4: NEXT and BACK will be constrained to the series tail, even if that makes the (index? a) <= (index? next a) truism false. | |
BrianH: 6-Jun-2011 | Also note that REMOVE/part with a negative part acts like REMOVE/part from the other point with a positive part. This is why it's impossible to create a before-the-head reference. | |
BrianH: 6-Jun-2011 | With Option 3 that would change, since BACK from the head of the series would go out of bounds. REMOVE out of bounds would continue to be a noop. | |
Geomol: 7-Jun-2011 | Would a programmer expect this to be true always?
    (index? series) + (length? series) = (index? tail series)
That seems to define some basic rules for indexes and the functions involved. But it fails sometimes, as it is now:
>> s: [1 2]
== [1 2]
>> t: tail s
== []
>> clear s
== []
>> (index? s) + (length? s) = (index? tail s)
== true
>> (index? t) + (length? t) = (index? tail t)
== false
Problem here is, that LENGTH? t returns 0. It should return -2, if the result should be true.
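The invariant in this message can be wrapped in a small predicate, which makes it easier to try against other references. This is only a sketch: `consistent?` is a made-up name, and the results noted in the comments are the ones reported in the session above, not independently verified.

```rebol
REBOL [Title: "Index/length invariant check (sketch)"]

; CONSISTENT? is a hypothetical helper that restates the invariant:
; index + length should land exactly on the index of the tail.
consistent?: func [
    "True if (index? s) + (length? s) equals (index? tail s)"
    s [series!]
][
    (index? s) + (length? s) = (index? tail s)
]

s: [1 2]
t: tail s
clear s

print consistent? s   ; the session reports true for s
print consistent? t   ; and false for t, which is left past the tail
```

REBOL evaluates operators left to right, so the expression inside the function groups as (index + length) = tail-index, the same way as the console expression in the message.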
Geomol: 7-Jun-2011 | I noticed a funny thing, when inserting a series in the same series with the /only refinement.
>> s: [a b c]
== [a b c]
>> length? skip s 2
== 1 ; makes sense
>> insert/only s skip s 2
== [a b c]
>> s
== [[...] a b c] ; reference to the same series is shown this way, ok
>> length? s/1
== 2 ; unexpected
Wouldn't it be more logical, if that last result were 1? It's the same in R2.
Geomol: 7-Jun-2011 | It's possible to create infinitely deep series with this:
>> s: [a b c]
== [a b c]
>> insert/only tail s s
== []
>> s
== [a b c [...]]
>> s/4
== [a b c [...]]
>> s/4/4
== [a b c [...]]
>> s/4/4/4
== [a b c [...]]
and so on.
Endo: 7-Jun-2011 | No, I think it is not unexpected. Because when you insert new values into a series, its internal positions change:
Endo: 7-Jun-2011 | In your example s/1 is actually a pointer to s itself (the second value in s), and when you insert something into s, the pointer (which is s/1) also moves forward.
>> same? head s/1 head s
== true
>> s/1
== [b c] ; which means it still points to the second value in the series.
Endo: 7-Jun-2011 | That is because, c is not a series pointing to another (in this case same) series. | |
Geomol: 7-Jun-2011 | Right. For blocks, inserting doesn't change the position of other indexes to the same series. What I expect is that the insert is from the index given, before the insert is carried out. Like:
>> s: [a b c]
>> insert/only s skip s 2
should result in [[c] a b c], where the first element is a pointer to one before the end of s, not two before the end of s as we have today.
Endo: 7-Jun-2011 | Here is the trace log (R2)
>> s: [a b c]
== [a b c]
>> trace on
Result: (unset)
>> insert/only s skip s 2
Trace: insert/only (path)
Trace: s (word)
Trace: skip (word)
Trace: s (word)
Trace: 2 (integer)
Result: [c] (block)
Endo: 7-Jun-2011 | But then, the internal pointer changes because of the insert.
>> head s/1
Trace: head (word)
Trace: s/1 (path)
Result: [[...] a b c] (block)
Geomol: 7-Jun-2011 | I understand what happens: it's a position in a series that I try to insert earlier in the same series. I just find it a bit confusing that it works as it does. Wouldn't it be more logical, if it's the position + 1 that is inserted in such cases? I think this looks strange:
>> s: [a b c d e f g h]
== [a b c d e f g h]
>> insert/only s find s 'd
== [a b c d e f g h]
>> s/1
== [c d e f g h]
It seems more logical to me, if it does this:
>> s: [a b c d e f g h]
== [a b c d e f g h]
>> insert/only s next find s 'd
== [a b c d e f g h]
>> s/1
== [d e f g h]
Geomol: 7-Jun-2011 | It's also interesting that if /only isn't used, the result is what I first would expect:
>> s: [a b c d e f g h]
== [a b c d e f g h]
>> insert s find s 'd
== [a b c d e f g h]
>> s
== [d e f g h a b c d e f g h]
Endo: 7-Jun-2011 | without /only is more general use of course, but it is completely different. In your last example (Geomol) you get the values of a series and insert them into another series. | |
Endo: 7-Jun-2011 | with /only you get a "pointer" to a series and put that "pointer" into another series. (I know "pointer" is not a correct word for this but you got the idea) | |
Geomol: 7-Jun-2011 | What if I want to insert the tail position of a series earlier in the same series?
>> s: [1 2]
== [1 2]
>> insert/only s tail s
== [1 2]
>> s/1
== [2]
>> insert/only s next tail s
== [[...] 1 2]
>> s/1
== [2]
So that can't be done.
Endo: 7-Jun-2011 | So you could have a series which holds "pointers" to some other positions in another series, even a position in itself.
Geomol: 7-Jun-2011 | It can't? Isn't that just a design decision?
Geomol: 7-Jun-2011 | I think, this is only a problem with blocks, if you insert a later position earlier in the same block. | |
Ladislav: 7-Jun-2011 | index is a number | |
Ladislav: 7-Jun-2011 | INSERT cannot do such a "harakiri" | |
Ladislav: 7-Jun-2011 | just realize, that index is a number, and every block remembers it
Ladislav: 7-Jun-2011 | You need to realize, that "insert/only a block into a block with the same head" has to be compatible with "insert/only a block into a block with distinct head" | |
Andreas: 7-Jun-2011 | Geomol, referring to your example above:
>> s: [a b c d e f g h]
== [a b c d e f g h]
>> p: find s 'd
== [d e f g h]
>> insert s 42
== [a b c d e f g h]
>> p
== [c d e f g h]
Geomol: 7-Jun-2011 | I see no problem with your example. I'm surprised, you find my suggestion "awful". What's awful about this:
>> s: [a b c d e f g h]
== [a b c d e f g h]
>> p: find s 'd
== [d e f g h]
>> insert/only s p
== ....
>> s/1
== [d e f g h]
>> p
== [c d e f g h]
Kaj: 7-Jun-2011 | John, remember our previous discussion, that indexes are not properties of series, but properties of series references. Therefore, it makes no sense to try to make a vector behave like a list, because any other references to the vector are unknown and thus it's impossible to make the behaviour for those references consistent | |
Ladislav: 7-Jun-2011 | The awful property is as follows:
s: [a b c d e f g h]
t: [a b c d e f g h]
p: find s 'd
insert/only s p
insert/only t p
same? first s p ; == false !
same? first t p ; == true
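For contrast with the hypothetical results above, here is the same script as it should behave under the existing rule that a block reference is just "series plus index number". It is a sketch based on the sessions earlier in the thread: INSERT/ONLY stores the reference unchanged, so both comparisons are expected to agree.

```rebol
REBOL [Title: "insert/only keeps the stored index (sketch)"]

s: [a b c d e f g h]
t: [a b c d e f g h]
p: find s 'd            ; p is "series s, index 4"

insert/only s p         ; shifts the data of s, not the index held by p
insert/only t p         ; t is a distinct block, nothing in s moves

print mold first s      ; [c d e f g h], as in the sessions above
print same? first s p   ; true: same series, same index
print same? first t p   ; true: the stored value is again p itself
```

The price of that consistency is what Geomol noticed: after the insert into s, both p and the stored copy show [c d e f g h] rather than [d e f g h], because index 4 now sits one element earlier in the shifted data.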
Geomol: 7-Jun-2011 | That's a strong point! | |
Kaj: 10-Jun-2011 | Probably trying to build a big antenna to phone home | |
GrahamC: 10-Jun-2011 | Doesn't he have a phone? Or did he give away his iphone? | |
Pekr: 13-Jun-2011 | This is by far the longest period of Carl's disappearance, IIRC. We can only speculate about the reasons: some personal/family difficulties, burn-out on the REBOL topic, a new day job which does not leave you with much energy and free time for other, REBOL-related stuff. We tried to get Carl's answer on R3 Chat, with only a sporadic answer about Carl's recent job, but no answers to "what's next" for REBOL. The reasons might be various again: Carl is willing to proceed, he just doesn't have the time/energy. He is most probably not willing to open-source the project either, which gets us into a kind of schizophrenic situation: no open source, no progress either. If Carl knew the answer to the "what's next for REBOL" question, he would have already shared it with us. But he has not done so yet, which is a bit disappointing of course. As for a phone call, I don't know. I would not call him, as it could give him the feeling that we are pushing him to give us an answer which he might not have right now. But I think that a call from some friend, e.g. Reichart, would be accepted differently. But then, we can't push anyone to do anything, and from the discussion a few weeks ago I think that Reichart does not necessarily feel the urge to do such a thing, which is OK too. So ... we wait ... and RED progresses at least, so hopefully there is still a chance that we will see the light at the end of the tunnel :-)
Henrik: 14-Jun-2011 | Got a note this morning: They are fine... but that is the news that can be publicised for now. :-) | |
Ladislav: 14-Jun-2011 | Wow, what a generous dose of speculations from Pekr! | |
Kaj: 14-Jun-2011 | There are very easy ways for business owners to prevent a number of classes of speculations | |
Brock: 15-Jun-2011 | I didn't hear much about this day job that he has, but I can only hope that it was a contract to finish R3 for a big company and once that project was completed, release it. | |
Jerry: 21-Jun-2011 | i've just got a "segmentation fault" while my R3 script was running. :-( | |
Robert: 2-Jul-2011 | We are going to drive the priorities by the things RMA needs. This shouldn't be a problem for anyone, because if you are going to use R3 for serious development then you will need them as well. The thing is that we are not going to jump-start for every requested feature etc. We know there is a lot to do and we will work through it step-by-step. And this is not because we don't care what you all state here. Definitely not. I want to move R3 forward as fast as possible. This needs concentration, focus and pushing to finally get it done. At the end of the day the only thing that counts is whether we manage to make R3 stable enough for prime time development.
Robert: 2-Jul-2011 | So, stay tuned and I must say that this is the most promising development for a long time. | |
Endo: 2-Jul-2011 | Thanks a lot for the news Robert. Currently the most important thing is to continue the development of R3, so it is definitely not a problem to drive the priorities by the things RMA needs. Please keep us informed when you have more news from Carl.
Pekr: 3-Jul-2011 | If so, then there is a question, if Carl should not consider fully open-sourcing R3. But we most probably know the answer. I would be really interested in Carl officially telling the REBOL community his motives to (dis)continue R3 development. It all feels really dishonest, and I wonder, for any eventual future project, how Carl wants to win the trust of people.
Andreas: 3-Jul-2011 | I think Oldes was just referring to speculations made in some channel here on AltME a little while ago. I think the speculation was that one reason for Carl disappearing publicly could be that he is prohibited from talking about REBOL by an NDA-like contract at his current occupation. | |
Robert: 4-Jul-2011 | Some notes: - We will have an NDA - Carl didn't lose interest, he is fully committed to R3. His contract work is (still) / was very intense. Things get a bit more relaxed there now.
PeterWood: 4-Jul-2011 | Much kudos to Robert for taking such a powerful initiative from which all of us will benefit. I wish you, RMA and the RMA team every success; you all really deserve it. | |
Robert: 4-Jul-2011 | - I / RMA will be the main communication channel. I have access to Rebol-3 twitter and there exists a RMA twitter. - We will continue to work on the R3-GUI and release it as we did before (sometimes there might be longer periods of no-release, if we are doing massive changes) - The main focus will be: fixing bugs, defining and writing down how datatypes are handled WRT conversion, priority, sorting etc. | |
shadwolf: 6-Jul-2011 | LOL AFTER 111 versions there will be an NDA (non-disclosure agreement), but a non-disclosure agreement to disclose what, the emptiness of nothingness? go ahead please :)
shadwolf: 6-Jul-2011 | wasn't it carl, some months ago, who was saying that rebol was a commercial failure and that he needed to get a real job to feed his family?
Rebolek: 7-Jul-2011 | Where he said, that it's a commercial failure? | |
Geomol: 15-Jul-2011 | Can R3 load and use shared libraries like R2 with load/library ? I see a group named "!REBOL3 /library". Is that about such libraries, or are extensions for that? (Group "!REBOL3 Extensions".) | |
Pekr: 15-Jul-2011 | There was a bounty, and attempt from Max (not finished IIRC), to bring R2 like DLL interface to R3, to simplify it for users not being able to utilise full extension interface. | |
Geomol: 15-Jul-2011 | Has it been tried to get the source for R2's load/library, make routine! and then the calling from Carl? It seems to me a lot easier to start with that code, as it does work.
Geomol: 15-Jul-2011 | I know, it has been tried many times to get sources, but maybe he would agree on such a specific case, as it would be needed in R3 anyway. | |
Robert: 15-Jul-2011 | Converting a R2 DLL into a R3 one is really simple. I have done our DLLs in a way that I can compile them as R2 or R3 version. The only change is a .DEF file for the linker. Everything else is the same. | |
Robert: 15-Jul-2011 | To make them into a R3 extension is pretty simple. The R3 external interface is way simpler than in R2. So, yes, via IMPORT and R3 extension. | |
Geomol: 15-Jul-2011 | Ok, I get this error:
>> opengl: import %/System/Library/Frameworks/OpenGL.framework/OpenGL
** syntax error: script is missing a REBOL header: %/System/Library/Frameworks/OpenGL.framework/OpenGL
So such DLLs need to be wrapped in some REBOL code, or?
Pekr: 15-Jul-2011 | Geomol - yes, you need to write a wrapper for each DLL you are about to utilise ... | |
Henrik: 16-Jul-2011 | http://curecode.org/rebol3/ticket.rsp?id=1888&cursor=1 This doesn't look like a bug to me. Anyone? | |
Henrik: 16-Jul-2011 | http://curecode.org/rebol3/ticket.rsp?id=1886&cursor=3 This one looks fixable, as it's a mezzanine. | |
BrianH: 16-Jul-2011 | #1888 is definitely not a bug. #1886 should be looked at by the person who knows what SPLIT is supposed to do. It wasn't one of mine, and there was never really any consensus about its behavior. SPLIT isn't finished yet. | |
Gregg: 17-Jul-2011 | I don't know where the test suite for SPLIT is, but the rule in effect for that changed from the old source that Gabriele and I originally created. The final rule, for string/char/bitset delimiters was originally this:
    [any [mk1: some [mk2: dlm break | skip] (emit copy/part mk1 mk2)]]
but is now this:
    [any [mk1: [to dlm mk2: dlm | to end mk2:] (keep copy/part mk1 mk2)]]
It looks like that changed due to http://issue.cc/r3/573, but obviously wasn't run through a test suite. I don't know what caused the issue with the above bug, as that parse rule returns a correct result.
Gregg: 17-Jul-2011 | Found a small test suite. | |
Gregg: 17-Jul-2011 | The original was written before MAP-EACH and the new COLLECT. Here is the source I have, updated to use those as the current version does, but with the last rule reverted to the original. Related cc reports: http://issue.cc/r3/1096 http://issue.cc/r3/690
split: func [
    "Split a series into pieces; fixed or variable size, fixed number, or at delimiters"
    series [series!] "The series to split"
    dlm [block! integer! char! bitset! any-string!] "Split size, delimiter(s), or rule(s)."
    /into "If dlm is an integer, split into n pieces, rather than pieces of length n."
    /local size count mk1 mk2
][
    either all [block? dlm parse dlm [some integer!]] [
        map-each len dlm [
            either positive? len [
                copy/part series series: skip series len
            ] [
                series: skip series negate len
                ; return unset so that nothing is added to output
                ()
            ]
        ]
    ][
        size: dlm   ; alias for readability
        collect [
            parse/all series case [
                all [integer? size into] [
                    if size < 1 [cause-error 'Script 'invalid-arg size]
                    count: size - 1
                    size: round/down divide length? series size
                    [
                        count [copy series size skip (keep/only series)]
                        copy series to end (keep/only series)
                    ]
                ]
                integer? dlm [
                    if size < 1 [cause-error 'Script 'invalid-arg size]
                    [any [copy series 1 size skip (keep/only series)]]
                ]
                'else [ ; = any [bitset? dlm any-string? dlm char? dlm]
                    [any [mk1: some [mk2: dlm break | skip] (keep copy/part mk1 mk2)]]
                ]
            ]
        ]
    ]
]
Gregg: 17-Jul-2011 |
>> split "a.b.c" "."
== ["a" "b" "c"]
>> split "c c" " "
== ["c" "c"]
>> split "1," " "
== ["1,"]
>> split "1,2" " "
== ["1,2"]
>> split "c,c" ","
== ["c" "c"]
>> split/into "" 1
== [""]
>> split/into "" 2
== ["" ""]
>> split "This! is a. test? to see " charset "!?."
== ["This" " is a" " test" " to see "]
Gregg: 17-Jul-2011 |
split: func [
    "Split a series into pieces; fixed or variable size, fixed number, or at delimiters"
    series [series!] "The series to split"
    dlm [block! integer! char! bitset! any-string!] "Split size, delimiter(s), or rule(s)."
    /into "If dlm is an integer, split into n pieces, rather than pieces of length n."
    /local size piece-size count mk1 mk2 res fill-val add-fill-val
][
    either all [block? dlm parse dlm [some integer!]] [
        map-each len dlm [
            either positive? len [
                copy/part series series: skip series len
            ] [
                series: skip series len
                ; return unset so that nothing is added to output
                ()
            ]
        ]
    ][
        size: dlm   ; alias for readability
        res: collect [
            parse/all series case [
                all [integer? size into] [
                    if size < 1 [cause-error 'Script 'invalid-arg size]
                    count: size - 1
                    piece-size: to integer! round/down divide length? series size
                    if zero? piece-size [piece-size: 1]
                    [
                        count [copy series piece-size skip (keep/only series)]
                        copy series to end (keep/only series)
                    ]
                ]
                integer? dlm [
                    if size < 1 [cause-error 'Script 'invalid-arg size]
                    [any [copy series 1 size skip (keep/only series)]]
                ]
                'else [ ; = any [bitset? dlm any-string? dlm char? dlm]
                    [any [mk1: some [mk2: dlm break | skip] (keep/only copy/part mk1 mk2)]]
                ]
            ]
        ]
        ;-- Special processing, to handle cases where the spec'd more items in
        ;   /into than the series contains (so we want to append empty items),
        ;   or where the dlm was a char/string/charset and it was the last char
        ;   (so we want to append an empty field that the above rule misses).
        fill-val: does [copy either any-block? series [[]] [""]]
        add-fill-val: does [append/only res fill-val]
        case [
            all [integer? size into] [
                ; If the result is too short, i.e., less items than 'size, add
                ; empty items to fill it to 'size.
                ; We loop here, because insert/dup doesn't copy the value inserted.
                if size > length? res [
                    loop (size - length? res) [add-fill-val]
                ]
            ]
            ; integer? dlm [
            ; ]
            'else [ ; = any [bitset? dlm any-string? dlm char? dlm]
                ; If the last thing in the series is a delimiter, there is an
                ; implied empty field after it, which we add here.
                case [
                    bitset? dlm [
                        ; ATTEMPT is here because LAST will return NONE for an
                        ; empty series, and finding none in a bitset is not allowed.
                        if attempt [find dlm last series] [add-fill-val]
                    ]
                    char? dlm [
                        if dlm = last series [add-fill-val]
                    ]
                    string? dlm [
                        if all [
                            find series dlm
                            empty? find/last/tail series dlm
                        ] [add-fill-val]
                    ]
                ]
            ]
        ]
        res
    ]
]
Gregg: 17-Jul-2011 |
test [split "1234567812345678" 4] ["1234" "5678" "1234" "5678"]
test [split "1234567812345678" 3] ["123" "456" "781" "234" "567" "8"]
test [split "1234567812345678" 5] ["12345" "67812" "34567" "8"]
test [split/into [1 2 3 4 5 6] 2] [[1 2 3] [4 5 6]]
test [split/into "1234567812345678" 2] ["12345678" "12345678"]
test [split/into "1234567812345678" 3] ["12345" "67812" "345678"]
test [split/into "1234567812345678" 5] ["123" "456" "781" "234" "5678"]
test [split/into "123" 6] ["1" "2" "3" "" "" ""]
test [split/into [1 2 3] 6] [[1] [2] [3] [] [] []]
test [split [1 2 3 4 5 6] [2 1 3]] [[1 2] [3] [4 5 6]]
test [split "1234567812345678" [4 4 2 2 1 1 1 1]] ["1234" "5678" "12" "34" "5" "6" "7" "8"]
test [split first [(1 2 3 4 5 6 7 8 9)] 3] [(1 2 3) (4 5 6) (7 8 9)]
test [split #{0102030405060708090A} [4 3 1 2]] [#{01020304} #{050607} #{08} #{090A}]
test [split [1 2 3 4 5 6] [2 1]] [[1 2] [3]]
test [split [1 2 3 4 5 6] [2 1 3 5]] [[1 2] [3] [4 5 6] []]
test [split [1 2 3 4 5 6] [2 1 6]] [[1 2] [3] [4 5 6]]
test [split [1 2 3 4 5 6] [3 2 2 -2 2 -4 3]] [[1 2 3] [4 5] [6] [5 6] [3 4 5]]
test [split "abc,de,fghi,jk" #","] ["abc" "de" "fghi" "jk"]
test [split "abc<br>de<br>fghi<br>jk" <br>] ["abc" "de" "fghi" "jk"]
test [split "a.b.c" "."] ["a" "b" "c"]
test [split "c c" " "] ["c" "c"]
test [split "1,2,3" " "] ["1,2,3"]
test [split "1,2,3" ","] ["1" "2" "3"]
test [split "1,2,3," ","] ["1" "2" "3" ""]
test [split "1,2,3," charset ",."] ["1" "2" "3" ""]
test [split "1.2,3." charset ",."] ["1" "2" "3" ""]
test [split "abc|de/fghi:jk" charset "|/:"] ["abc" "de" "fghi" "jk"]
test [split "abc^M^Jde^Mfghi^Jjk" [crlf | #"^M" | newline]] ["abc" "de" "fghi" "jk"]
test [split "abc de fghi jk" [some #" "]] ["abc" "de" "fghi" "jk"]
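The `test` helper used in these cases is not shown in the thread. A minimal sketch of what such a helper could look like, assuming it takes a code block and the expected result; the signature is inferred from the calls above and the body is hypothetical, written for R3:

```rebol
REBOL [Title: "Hypothetical TEST helper (sketch)"]

; Assumed shape: TEST evaluates CODE and compares the result with EXPECTED.
; This is not Gregg's actual harness, which does not appear in the thread.
test: func [
    "Evaluate code and report whether it returns the expected value"
    code [block!] "Expression to evaluate"
    expected [block!] "Result the expression should produce"
    /local result
][
    result: do code
    either equal? result expected [
        print ["ok:    " mold code]
        true
    ][
        print ["FAILED:" mold code "expected" mold expected "got" mold result]
        false
    ]
]

; Usage, assuming one of the SPLIT versions from the thread is already defined:
; test [split "a.b.c" "."] ["a" "b" "c"]
```

The sketch does not trap errors, so a case that triggers an error inside SPLIT stops the run at that case instead of being counted as a failure.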
Gregg: 17-Jul-2011 | A quick scan of the docs showed that negative skip val usage changed from the original design. I will revert the negate on those to match the doc'd behavior. | |
Gregg: 17-Jul-2011 |
split: func [
    "Split a series into pieces; fixed or variable size, fixed number, or at delimiters"
    series [series!] "The series to split"
    dlm [block! integer! char! bitset! any-string!] "Split size, delimiter(s), or rule(s)."
    /into "If dlm is an integer, split into n pieces, rather than pieces of length n."
    /local size piece-size count mk1 mk2 res fill-val add-fill-val
][
    either all [block? dlm parse dlm [some integer!]] [
        map-each len dlm [
            either positive? len [
                copy/part series series: skip series len
            ] [
                series: skip series negate len
                ; return unset so that nothing is added to output
                ()
            ]
        ]
    ][
        size: dlm   ; alias for readability
        res: collect [
            parse/all series case [
                all [integer? size into] [
                    if size < 1 [cause-error 'Script 'invalid-arg size]
                    count: size - 1
                    piece-size: to integer! round/down divide length? series size
                    if zero? piece-size [piece-size: 1]
                    [
                        count [copy series piece-size skip (keep/only series)]
                        copy series to end (keep/only series)
                    ]
                ]
                integer? dlm [
                    if size < 1 [cause-error 'Script 'invalid-arg size]
                    [any [copy series 1 size skip (keep/only series)]]
                ]
                'else [ ; = any [bitset? dlm any-string? dlm char? dlm]
                    [any [mk1: some [mk2: dlm break | skip] (keep/only copy/part mk1 mk2)]]
                ]
            ]
        ]
        ;-- Special processing, to handle cases where the spec'd more items in
        ;   /into than the series contains (so we want to append empty items),
        ;   or where the dlm was a char/string/charset and it was the last char
        ;   (so we want to append an empty field that the above rule misses).
        fill-val: does [copy either any-block? series [[]] [""]]
        add-fill-val: does [append/only res fill-val]
        case [
            all [integer? size into] [
                ; If the result is too short, i.e., less items than 'size, add
                ; empty items to fill it to 'size.
                ; We loop here, because insert/dup doesn't copy the value inserted.
                if size > length? res [
                    loop (size - length? res) [add-fill-val]
                ]
            ]
            ; integer? dlm [
            ; ]
            'else [ ; = any [bitset? dlm any-string? dlm char? dlm]
                ; If the last thing in the series is a delimiter, there is an
                ; implied empty field after it, which we add here.
                case [
                    bitset? dlm [
                        ; ATTEMPT is here because LAST will return NONE for an
                        ; empty series, and finding none in a bitset is not allowed.
                        if attempt [find dlm last series] [add-fill-val]
                    ]
                    char? dlm [
                        if dlm = last series [add-fill-val]
                    ]
                    string? dlm [
                        if all [
                            find series dlm
                            empty? find/last/tail series dlm
                        ] [add-fill-val]
                    ]
                ]
            ]
        ]
        res
    ]
]
Steeve: 18-Jul-2011 | Seems you wrecked the behavior when a parse rule is fulfilled. [split] should keep the matched parts, you do the contrary (exclusion), why this change ?. | |
Gregg: 18-Jul-2011 | OK. I'm not vested in the implementation, just the results. Feel free to improve things and make it more elegant. As long as the tests all pass, or we agree on behavior changes, I don't have a problem. | |
Gregg: 18-Jul-2011 | If you could use a test case to explain, I might get it. I'm slow today. | |
Gregg: 18-Jul-2011 | And what behavior did I change (as a test case)? | |
Steeve: 18-Jul-2011 | oK wait a little, I will do my best ;-) | |
Steeve: 18-Jul-2011 | current behavior:
split "-a-a'" ["a"]
>> ["a" "a"]
yours:
split "-a-a" ["a"]
>> ["-" "-"]
Gregg: 18-Jul-2011 | So, you're saying that you want to specify a delimiter, and have it keep that? In any case, that's not the current behavior:
>> split "-a-a'" ["a"]
== ["-" "-" "'" ""]
Here's mine:
>> split "-a-a'" ["a"]
== ["-" "-" "'"]
Gregg: 18-Jul-2011 | Thanks for taking a look at it Steeve. Pills or not. | |
sqlab: 19-Jul-2011 | It seems the parse behaviour changed somewhere a few times.. Nevertheless I think split is overloaded and overcomplicated. The parse rules should better go in a PARSE Common Patterns library, that is included with Rebol, not unlike http://www.rebol.com/article/0508.html | |
BrianH: 19-Jul-2011 | Well, for one thing, complex, flexible mezzanine functions tend to be slowed down by the conditional code that determines the actual behavior desired in a particular case. There are real advantages to separating a complex function into multiple smaller, simpler functions. This makes it so the choices about which set of behavior to use are made by the programmer at development time instead of by the interpreter at runtime. SPLIT is a really large function that does many different things, so it's a good candidate for such a function split.
BrianH: 19-Jul-2011 | I liked Carl's idea of a SPLIT function that takes a series and returns the series from the head to the offset, and from the offset to the end. Like this:
    split: func [series [series!]] [reduce [copy/part head series series copy series]]
At most, add an option to control the copying. Then have a separate function split on a delimiter, another split into a number of parts, etc.
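To see what the two-part split described above returns, here is the same one-liner under a different name, so that it doesn't replace the existing SPLIT mezzanine. `split-here` is a made-up name for this sketch:

```rebol
REBOL [Title: "Two-part split at the current position (sketch)"]

; SPLIT-HERE is a hypothetical name; the body is the one-liner from the message.
split-here: func [
    "Return copies of the part before and the part from the current position"
    series [series!]
][
    reduce [copy/part head series series  copy series]
]

probe split-here skip "abcdef" 2   ; ["ab" "cdef"]
probe split-here "abcdef"          ; ["" "abcdef"]
probe split-here tail [1 2 3]      ; [[1 2 3] []]
```

The position carried by the argument is the only parameter, which is what keeps this variant so small: the caller finds the split point with SKIP, FIND or AT and passes the resulting reference.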
BrianH: 19-Jul-2011 | There are some mezzanine functions that have to be large and complex for other reasons. For instance, a couple of the LOAD subfunctions need to have functionality bundled together for security purposes. This doesn't seem to be the case with SPLIT. | |
BrianH: 19-Jul-2011 | It's one of the ironies of R3 that for a language that touts its ability to create user-designed dialects, inside the R3 mezzanine code, dialected functions are often too slow to be efficient enough for inclusion. This is why most of the builtin dialects are implemented in native code, through natives or commands. A dialect needs to be efficient enough to merit its use as opposed to the procedural equivalent, and easy enough to comprehend that the users of the dialect are likely to use it, rather than a simpler alternative. Developers' minds have overhead too. | |
Gregg: 19-Jul-2011 | If optimization is the goal, we can certainly write specialized funcs. I have a lot of them myself. How much too slow is the current SPLIT and in what contexts? This SPLIT is intended to be general, like ROUND. If you need to round something, HELP ROUND gives you all the options, rather than having CEIL, FLOOR, TRUNC, etc. There was a long discussion about that when it was designed. The goal is to reduce the cognitive overhead. If people think this functionality is not helpful, and all we need is SPLIT = [first rest], then that's all we need. If so, please give it a more precise name. | |
BrianH: 19-Jul-2011 | SPLIT has enough cognitive overhead that I've never understood what it was supposed to do, and thus never used it. A sign? | |
Gregg: 19-Jul-2011 | How could this be made clearer then? "Split a series into pieces; fixed or variable size, fixed number, or at delimiters" Did you ever look at the docs for it? http://www.rebol.com/r3/docs/functions/split.html