World: r4wp
[!REBOL3] General discussion about REBOL 3
Andreas 1-Apr-2013 [2221x2] | In effect, that is :) |
dirname/basename does a clean-path before splitting. | |
Gregg 1-Apr-2013 [2223x2] | :-) Let me play with an idea here for a bit. |
Let's widen the discussion a bit. Splitting a string at a delimiter. Easy enough to define clear behavior if the series contains the delimiter, but what if it doesn't? Most split funcs return an array, splitting at each dlm. If no dlm, return the original series as the only element in that array. What if we always want to return two elements? e.g., we have a SPLIT-AT func that can split a series into two parts, given either an integer index or value to match. Let's also give it a /LAST refinement, so it can split at the last matching value found, like FIND/LAST works. Given that, what do you expect in the case where the dlm (e.g. "=") is not in the series? SPLIT-AT "abcdef" "=" == [? ?] SPLIT-AT/LAST "abcdef" "=" == [? ?] | |
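To make the SPLIT-AT question concrete, here is a minimal sketch in Python (not REBOL). The name `split_at` and the `last` flag are hypothetical stand-ins for SPLIT-AT and /LAST. The result for a missing delimiter is only one candidate answer, chosen here as an assumption; the thread does not settle it.

```python
def split_at(series, dlm, last=False):
    """Hypothetical two-part split; always returns a list of two elements."""
    # rfind mirrors the /LAST refinement, find mirrors the default.
    pos = series.rfind(dlm) if last else series.find(dlm)
    if pos < 0:
        # ASSUMPTION: with no delimiter, everything goes in the first part
        # and the second part is empty. Other answers are equally defensible.
        return [series, series[:0]]
    # The delimiter itself is dropped from both parts.
    return [series[:pos], series[pos + len(dlm):]]


print(split_at("a=b=c", "="))             # split at the first "="
print(split_at("a=b=c", "=", last=True))  # split at the last "="
print(split_at("abcdef", "="))            # delimiter not present
```

Under this assumption the default and `last` variants give the same result when the delimiter is absent. That is exactly the point in question, since one could argue the `last` variant should put the whole series in the second part instead.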
Maxim 1-Apr-2013 [2225] | I haven't had the time to follow all the discussion in detail, but to me, the second part of split-path should NEVER return a directory path. when doing set [dir file] I should be able to count on the fact that the second part is either a file or none. The same for the first part which should always be none or a dir. I have my own implementation in R2 which makes this strict and it simplifies a lot of code. so we can do with absolute certainty: if second set [dir file] split path [ ] IIRC some of the versions of my split perform a clean-path to simplify and add robustness to the result. |
Gregg 1-Apr-2013 [2226] | Thanks for posting Max. With 5 of us talking about it, we have 5 opinions so far. :-) The one thing we all seem to agree on is that we want consistent behavior, which we don't have right now. |
sqlab 2-Apr-2013 [2227] | Maxim's method sounds reasonable |
Ladislav 2-Apr-2013 [2228x6] | Re: One test missing in your collection: %foo [%./ %foo] - this test violates the "invariant" |
Re: %/c/test/test2/ [%/c/test/ %test2/] - this test does not violate anything but it does not split the "pathfile" to "path" and "file" parts | |
hmm, I may be wrong in "it does not violate anything" - in fact, it contradicts the help string of the function | |
(I do not object against adjustment, but would expect the help string to be changed as well to be compatible with this) | |
(if it is the preferred behaviour) | |
Regarding the split-path behaviour in the %foo case. I strongly object against the proposal to obtain [%. %foo], since for example INCLUDE when obtaining %foo with empty path uses INCLUDE-CTX/PATH to find %foo (which may even exclude the %./ directory if it is not in INCLUDE-CTX/PATH), while when obtaining %./foo it just finds the file in the current directory (which is not equivalent) | |
Maxim 2-Apr-2013 [2234x2] | IMHO %foo should return [none %foo] |
split-path shouldn't invent information which isn't given to it | |
Ladislav 2-Apr-2013 [2236] | As said, I prefer [%"" %foo] to have the invariant that file = rejoin split-path file |
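The REJOIN invariant (file = rejoin split-path file) can be illustrated with a small Python sketch. The split rule used here (cut right after the last slash) is an assumption chosen because it satisfies the invariant trivially. It is not the REBOL implementation, and it is not necessarily the exact rule Ladislav has in mind for directory paths.

```python
def split_path(path):
    """Hypothetical string-mode split: cut right after the last '/'."""
    cut = path.rfind("/") + 1  # rfind returns -1 when there is no slash, so cut is 0
    return [path[:cut], path[cut:]]


# The invariant: joining the two parts gives back the input unchanged.
for p in ["foo", "./foo", "/c/test/file.txt", "/c/test/test2/", ""]:
    d, f = split_path(p)
    assert d + f == p

print(split_path("foo"))    # empty path part, so "foo" and "./foo" stay distinct
print(split_path("./foo"))
```

A bare filename yields an empty path part here, which corresponds to the [%"" %foo] result and keeps %foo distinguishable from %./foo.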
Maxim 2-Apr-2013 [2237] | question is, is that invariant useful? really, I like consistency almost above all else, but I prefer when it is not just neutral. getting empty file specs is very awkward to use and doesn't work well with all the none transparency which makes a lot of the conditional code so easy to read. one reason this is so readable in REBOL is the limited use of equality operations, when doing complex decision making. |
Ladislav 2-Apr-2013 [2238] | question is, is that invariant useful? - for me it is, but, of course, it is a matter of preference |
Maxim 2-Apr-2013 [2239] | I agree. |
Bo 2-Apr-2013 [2240] | I prefer split-path %foo == [%./ %foo] The reason is because I believe split-path shouldn't require an extra check if all you want to do is read the base directory that a file is in. I think this is a common use of split-path. |
Ladislav 2-Apr-2013 [2241x2] | hmm, but Bo, that would make %foo equivalent to %./foo , which is not good IMO |
(just because they *are not* equivalent) | |
Gregg 2-Apr-2013 [2243] | Do our preferences come from the basic difference of whether we want SPLIT-PATH to be "smart" about file specs, or whether it should assume nothing (the REJOIN invariant case)? For example, Andreas's path invariant (p/:t) makes a lot of sense, but some of his examples' results look wrong when just viewed as results. e.g.: ; %/ [%/ %/] ; %// [%/ %/] ; %./ [%./ %./] |
Ladislav 2-Apr-2013 [2244] | In my opinion, the behaviour is neither simple nor useful. |
Gregg 2-Apr-2013 [2245x2] | And the reason I posted the SPLIT-AT question was to see if we could find a solution for both. |
Ladislav, you mean the examples I just posted? | |
Ladislav 2-Apr-2013 [2247] | Yes, the most recently posted behaviour examples. |
Gregg 2-Apr-2013 [2248x4] | Got it. As you might all guess, since my proposal is most like Ladislav's, that's my current preference. As I posted, it only misses a couple edge cases to also meet the path invariant. For Max, I understand the value of NONE. So much so that I have a NONE-OR-EMPTY? mezz. |
For reference, those cases are: Path quality failed: %. %/. Path quality failed: %.. %/.. | |
And while I understand that a file with no path implies the current directory, we lose information by assuming it. For example, if I let a user specify a path or filename, and I split it, now I can't tell if they gave me just a filename, or if they gave me %./<file>. | |
Can we resolve our differences with a refinement? | |
Ladislav 2-Apr-2013 [2252x2] | we have got quite a few combinations to consider: missing path: * yielding %. * yielding %"" * yielding #[none] missing file: * yielding the last directory in the path * yielding %"" * yielding #[none] In total that is 6 variants but only some combinations make sense, I think |
sorry, I mean 9 | |
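The corrected count follows from pairing each of the three "missing path" results with each of the three "missing file" results. A short Python sketch of that enumeration (the option labels are copied from the message above; the variable names are just for illustration):

```python
from itertools import product

# The three candidate results for each missing part, as listed above.
missing_path = ['%.', '%""', '#[none]']
missing_file = ['last directory in the path', '%""', '#[none]']

# Every pairing of one "missing path" choice with one "missing file" choice.
combos = list(product(missing_path, missing_file))

for path_choice, file_choice in combos:
    print(path_choice, "/", file_choice)

print(len(combos))  # 3 * 3 pairings
```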
Gregg 2-Apr-2013 [2254x2] | The NONE case, while potentially useful, is only in Max's custom version. The current version in REBOL only returns NONE in a few edge cases, which I think we all agree is wrong. |
And it doesn't satisfy either the string (REJOIN) or path invariant. If we care about either of those, it's a problem. | |
Maxim 2-Apr-2013 [2256] | but what is useful in the rejoin invariant? we know the path before the split... |
Andreas 2-Apr-2013 [2257x4] | The plain "path-invariant" (just d/:b) I posted earlier was too simple, only the later refined one, including clean-path caters for the corner cases I posted as well. |
Maxim, the problem with requiring that SPLIT-PATH should _never_ return a directory as second component, is that SPLIT-PATH cannot decide that based on a file! alone. | |
It could only do that in relation to a file system or with the simple heuristic used by DIR? as well: based on the presence or absence of a trailing slash. | |
I'm fine with a purely REJOIN-based invariant as well. (Even though I personally find a path-based invariant more useful.) | |
Cyphre 2-Apr-2013 [2261] | For those interested in the "alpha-channel change": Here is the branch with first round of related code changes to the image! and image codecs: https://github.com/cyphre/r3/commit/472c106a0f177ead82a6f29be1ae98b4cd33b9ad Note: This code doesn't contain any graphics related changes...just the image! datatype + image codecs so you can MAKE images and load BMP, GIF, PNG and JPG files. But it should be enough to test the change. (I have this code already integrated with changed AGG graphics and it works well but I haven't published it as this part is not compatible with the 'official' git source at the moment.) Note2: the code was not tested on big-endian systems so it is possible there can be some quirks. Use at your own risk and let me know about any problems. The RGBA tuples on IMAGE! now work so if the fourth(alpha) value is not defined it is assumed the RGB tuple is opaque (ie. alpha = 255) so 0.0.0 = 0.0.0.255 etc. This way color values in old code that doesn't explicitly define alpha values are still compatible. If you are interested, try to compile and test a bit. Let me know if you see any issues. Thanks. |
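The tuple rule described above (a missing fourth value means fully opaque) can be sketched in Python. This only illustrates the stated rule; `to_rgba` is a hypothetical helper and has nothing to do with the actual code in the linked branch.

```python
def to_rgba(color):
    """Hypothetical helper: parse 'r.g.b' or 'r.g.b.a' into a 4-tuple."""
    parts = tuple(int(p) for p in color.split("."))
    if len(parts) == 3:
        # Rule from the message above: no alpha given means opaque (255).
        parts += (255,)
    if len(parts) != 4:
        raise ValueError("expected 3 or 4 components: " + color)
    return parts


# Old-style RGB values and explicit RGBA values compare equal after normalizing.
print(to_rgba("0.0.0") == to_rgba("0.0.0.255"))
print(to_rgba("255.128.0"))
```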
Bo 2-Apr-2013 [2262] | Great job, Cyphre! |
Izkata 2-Apr-2013 [2263] | Isn't a convention that %foo/ is a directory while %foo is not? That's one way to tell if a given file! is a directory or not... It's what I generally expect, and I agree with Maxim that it makes the most sense for split-path to return #[none] if there is no file. |
Maxim 2-Apr-2013 [2264] | Andreas, the trailing slash is the separator. Are you proposing that split-path do a system check to verify if it's a file? like dir? does in R2? |
Gregg 2-Apr-2013 [2265x3] | It can't do a check, because the file may not exist. |
Max said: "split-path shouldn't invent information which isn't given to it" I agree, if we consider split-path to be operating in string mode (the rejoin invariant). If we want to have a file-system aware option, what would we call the refinement? Or should it be a separate function? As far as returning none for either part, it strikes me as inconsistent (if convenient, which it may be). That is, if you split a series into two parts, splitting at the head or tail should just give you an empty series for that part, shouldn't it? This comes back to my SPLIT-AT question. | |
e.g., what would you expect for each of these cases? [split-at "123456378" #"/"] [split-at/tail "123456378" #"/"] [split-at/last "123456378" #"/"] [split-at/last/tail "123456378" #"/"] | |
Maxim 2-Apr-2013 [2268x2] | if it were a generic string handling function I'd agree with you... but it's not... it has added meaning, it splits filesystem paths. it's not just a string. if it were, I'd use parse or some tokenize func. I see absolutely no merit in trying to make split-path act like a generic string handling func. the point of the func is to separate folder and file into two parts. to me it comes down to either you decide that when there is no data you invent a default, or use the internal one which is none, which works well with soooo many other funcs. if there is no directory part in the path, do not try to find a suitable value for it... there is none... funny, even when trying to explain my point of view, the actual sentence reads almost like a line of rebol source. :-) |
if you give me split-path %"" I'd return [none none] if you want a meaningful default for the dir, then just use clean-path before supplying the value to split-path, then you'll be assured of always getting a directory. | |
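For comparison, a Python sketch of the strict behaviour Maxim describes, with `None` standing in for REBOL's none. This is one reading of his description, assuming the trailing-slash convention decides what counts as a directory. It is not his actual R2 implementation, which is not shown in the thread.

```python
def split_path_strict(path):
    """Hypothetical strict split: the first part is a dir or None,
    the second part is a file or None, never a directory."""
    cut = path.rfind("/") + 1      # 0 when the path has no slash
    dir_part = path[:cut] or None  # an empty string becomes None
    file_part = path[cut:] or None
    return [dir_part, file_part]


print(split_path_strict(""))                # nothing given, nothing invented
print(split_path_strict("foo"))             # a file with no directory
print(split_path_strict("/c/test/test2/"))  # a directory with no file
print(split_path_strict("/c/test/file"))    # the normal case
```

With this shape, a caller can test the second element directly for truthiness, which is the "none transparency" argument made earlier in the thread. The trade-off is that the two parts can no longer be rejoined blindly.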
Gregg 2-Apr-2013 [2270] | I understand your view Max, but that's not what I asked. It doesn't work the way you want today, but maybe there's a way to provide a solution that is better than what we have now. I'd love to see your custom version, so we can compare its results. And I'm asking about SPLIT-AT for a reason, separate from SPLIT-PATH. I'd love to get everyone's thoughts. The funny thing is how much we can all care about the details of this func we (at least I) use a lot, and yet which none of us seem to like all that much. I think it points out that the normal case is the most important, where there is both a path and a file component. And maybe now is the time that we can make it just a little bit better, a little more consistent. |