World: r3wp

[!REBOL3-OLD1]

Anton
6-Sep-2006
[1324]
; Anton's enhanced version:
; - /quote is applied to first value, if a string
; - reorders PAD and DATA arguments so PAD is first (being likely always short)
; - distinguishes /only and /pad-only
; - renames /quoted -> /quote
conjoin: func [
	"Join the values in a block together with a delimiting PAD value."
	pad "The value to put into the series"
	data [any-block!] "The series to join"
	/only "Inserts a series value in DATA as a series."
	/pad-only "Inserts a series PAD as a series." ; <-- this might not be used much in practice (easy to add extra brackets around PAD)
	/quote "Puts string values in quotes."
	/local ; <- used to track tail of the result as we build it
] [
	if empty? data [return make data 0]
	local: tail either series? local: first data [copy local] [form :local]
	if all [quote any-string? local][local: insert tail insert head local {"} {"}] ; quote the first value
	; <- (local should be at its tail at this point)
	while [not empty? data: next data] either any-string? local [
		either quote [
			[local: insert insert insert insert local pad {"} first data {"}]
		][
			[local: insert insert local pad first data]
		]
	] [
		either only [
			either pad-only [
				[local: insert/only insert/only local pad first data]
			][
				[local: insert/only insert local pad first data]
			]
		][
			either pad-only [
				[local: insert insert/only local pad first data]
			][
				[local: insert insert local pad first data]
			]
		]
	]
	head local
]

; test
conjoin "" []
conjoin "," []
conjoin '| [1 2 [3]]
conjoin '| [[1] 2 [3]]
conjoin ", " [{one} 2 [3]]
conjoin '| [["one"] 2 [3]]
conjoin/only '| [["one"] 2 [3]]

conjoin/only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/only [pad] [[1] 2 [3]]

conjoin/pad-only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/pad-only [pad] [[1] 2 [3]]

conjoin/only/pad-only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/only/pad-only [pad] [[1] 2 [3]]

conjoin/quote "" []
conjoin/quote "," []
conjoin/quote '| [1 2 [3]]
conjoin/quote '| [[1] 2 [3]] ; QUOTE doesn't work in block mode
conjoin/quote ", " [{one} 2 [3]]
conjoin/quote '| [["one"] 2 [3]]
conjoin/quote/only '| [["one"] 2 [3]]

conjoin/quote/only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/quote/only [pad] [[1] 2 [3]]

conjoin/quote/pad-only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/quote/pad-only [pad] [[1] 2 [3]]

conjoin/quote/only/pad-only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/quote/only/pad-only [pad] [[1] 2 [3]]
BrianH
6-Sep-2006
[1325x2]
Anton, I put the data argument first on purpose, to make the function 
fit in with the standard function argument order of series functions 
in REBOL.
I'll look at the rest later. Good catch on the pick.
Anton
7-Sep-2006
[1327x2]
Years ago, I successfully argued to Carl that SWITCH's VALUE argument 
should go before the CASES argument. My reasoning today is the same 
- it is easier to parse visually when the smaller or less frequently 
changing parts of an expression go together. As you can see above, 
all the conjoins with the same PAD argument are easy to see, and 
the more likely to vary DATA blocks begin sometimes at the same horizontal 
position (thus, easier to compare). Just scroll up and compare with 
the tests for your version; look at each line and try to see what 
the differences between them are.

The reasoning that a standard argument order is a good memory guide 
isn't strong enough for me; there is always HELP, and I think the 
particularities of each function are more important when determining 
the order of arguments.
Anyway, I knew I would encounter some resistance to the argument 
order in this version. The argument order is less important than 
all the other features, even though I feel strongly about it. If 
I have to reverse argument order to get it through, I will. (But 
I will try to make you rebut my argument first.) I keenly await your 
analysis of both functions. Maybe there are some cases I haven't 
considered?
Ladislav
7-Sep-2006
[1329]
What are your preferences for: abs -9223372036854775808 (64-bit integer) 
or abs -2147483648x-2147483648 (32-bit integers)? As far as I could find 
out, it looks as though Python and Java both follow the C "standard", 
which yields the negative number.
Anton
7-Sep-2006
[1330]
I think it should be an overflow error.
Ladislav
7-Sep-2006
[1331]
Could somebody confirm my guess about the Python and Java behaviour?
Volker
7-Sep-2006
[1332x2]
Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
>>> abs( -9223372036854775808)
9223372036854775808L
whatever "L" is..
Pekr
7-Sep-2006
[1334]
:-)
Volker
7-Sep-2006
[1335x7]
public class TestAbs {
	public static void main(String[] args) {
		System.out.println(Math.abs(-9223372036854775808L)); // L suffix needed: the value only fits in a long
	}
}
-9223372036854775808
Without a range check, this C-like behaviour is typical of two's 
complement. 0 counts as "positive", so there is one fewer positive 
number. Interesting that Python can handle it.
Seems Python uses bignums for long:
>>> abs( -9223372036854775808 ** 2)
85070591730234615865843651857942052864L
http://docs.python.org/lib/typesnumeric.html
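For reference, the three behaviours discussed here (wrapping, an overflow error, automatic promotion) can be sketched in Python. This is only an illustration: `wrap_int64`, `wrapping_abs` and `checked_abs` are hypothetical helper names, and the wrap is simulated with modular arithmetic rather than taken from any real fixed-width type.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def wrap_int64(n):
    """Reduce n to the signed 64-bit two's-complement range."""
    return (n + 2**63) % 2**64 - 2**63


def wrapping_abs(n):
    """abs() as a fixed-width 64-bit machine computes it (no range check)."""
    return wrap_int64(abs(n))


def checked_abs(n):
    """abs() that signals overflow instead of wrapping."""
    result = abs(n)
    if result > INT64_MAX:
        raise OverflowError("abs result does not fit in 64 bits")
    return result


print(abs(INT64_MIN))           # 9223372036854775808 (arbitrary precision)
print(wrapping_abs(INT64_MIN))  # -9223372036854775808 (wraps back to the minimum)
```

`checked_abs(INT64_MIN)` raises `OverflowError`, which corresponds to the overflow-error option suggested earlier in the thread.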
BTW I would rethink the name for 'decimal!. To me it's base-10; float 
or the like would be better for floats, IMHO. That is, if it does not 
break too much; it should just be a global replace.
Re argument order: To me, a big inline block comes last, vars first. 
Otherwise the standard: the important thing first. With conjoin I am 
unsure; it looks to me as if it rarely has inline data. If I pad things 
together, I usually have a list, 
  conjoin list-of-things  ","

It's not like 'reduce or 'rejoin, where I mix inline data with variables, 
which can span several lines of code.
If I am wrong and it's used like
  conjoin "," ["I" "who writes this" "has more to think about it"]
I am with Anton, small thing first.
Ladislav
7-Sep-2006
[1342]
aha, so Python looks like "going up" automatically, so no overflow
Volker
7-Sep-2006
[1343]
Yes, auto-bignums.
Ladislav
7-Sep-2006
[1344]
that is consistent and comfortable
Volker
7-Sep-2006
[1345]
i agree :)
Anton
7-Sep-2006
[1346x4]
Yes, overflow either results in an overflow error or a conversion.
(should result)
Volker, conjoin will be used very often, just like rejoin.
Thanks for your comments.
Ladislav
7-Sep-2006
[1350x3]
I remember that once there was an issue with a value that MOLD was 
unable to process. Sunanda gave an example of a MONEY! value of this 
kind, but I guess there was something else. What was it? A past-tail 
block/string?
ah, yes, that's it
any other value of this kind?
Volker
7-Sep-2006
[1353]
Anton, I think that conjoin will be used often, but will the argument 
be an inline block, or a block in a variable? 'rejoin is used as 
a template, 
rejoin ["Its" now/time "o'clock"]

In that case the block should be last. 'append is used with the block 
in a var, 
'append this-block something

With conjoin I expect it to be less like a template and more like 'append.
Anton
7-Sep-2006
[1354x2]
I try to avoid using extra variables. They can be a real pain when 
it comes to optimization and make things look messier. Of course, 
when using a variable the argument order becomes less important. 
It's only important when no variables are used and the values are specified directly.
... but I'm still passionate about these cases, because they happen 
often.
BrianH
7-Sep-2006
[1356]
The series function standard is
    function data-to-be-operated-on modifier-arguments

That's what I used with conjoin. It was also intentional that the 
data block not be reduced by conjoin. I see conjoin as an operation 
that you pipe data through, like utilities on Unix. If you want the 
data reduced, go ahead and do so - if not, don't.
Graham
7-Sep-2006
[1357x2]
reconjoin
lol
BrianH
7-Sep-2006
[1359x8]
Looking at your conjoin with the /only and /pad-only refinements, 
it seems that with the /only you are trying to recreate the delimit 
function, but not as usefully. I thought of using pad as a variable 
name, but "delimiter" was more appropriate since padding functions 
usually pad outside the data, not within it. Let me try to add your 
fixes to my version and see what I get.
delimit: func [
    "Put a value between the values in a series."
    data [series!] "The series to delimit"
    delimiter "The value to put into the series"
    /only "Inserts a series delimiter as a series."
    /copy "Change a copy of the series instead."
    /local
] [
    while either copy [
        if empty? data [return make data 0]
        local: make data 2 * length? data
        [
            local: insert/only local first data
            not empty? data: next data
        ]
    ] [
        local: data
        [not empty? local: next local]
    ] pick [
        [local: insert local delimiter]
        [local: insert/only local delimiter]
    ] none? only
    head local
]

conjoin: func [
    "Join the values in a block together with a delimiter."
    data [any-block!] "The series to join"
    delimiter "The value to put into the series"
    /only "Inserts a series delimiter as a series."
    /quoted "Puts string values in quotes."
    /local
] [
    if empty? data [return make data 0]

    local: tail either series? local: first data [copy local] [form :local]
    while [not empty? data: next data] either any-string? local [
        either quoted [
            local: insert tail insert head local {"} {"}
            [local: insert insert insert insert local delimiter {"} first data {"}]
        ] [[local: insert insert local delimiter first data]]
    ] [pick [
        [local: insert insert local delimiter first data]
        [local: insert insert/only local delimiter first data]
    ] none? only]
    head local
]
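For readers who do not know REBOL, the behaviour of this conjoin can be approximated in Python. This is a loose sketch under stated assumptions, not a translation: the argument order follows the `data delimiter` version above, the keyword arguments `only` and `quoted` stand in for the refinements, lists stand in for blocks, and list values met in string mode are simply passed through `str()`, which REBOL does not do.

```python
def conjoin(data, delimiter, only=False, quoted=False):
    """Join the values in `data` with `delimiter` between them.

    The first value decides the mode: a list gives a list result
    ("block mode"), anything else gives a string ("string mode").
    """
    if not data:
        return []
    first, rest = data[0], data[1:]
    if isinstance(first, list):
        # Block mode: start from a copy of the first list.
        result = list(first)
        for value in rest:
            if only or not isinstance(delimiter, list):
                result.append(delimiter)   # delimiter kept as one item
            else:
                result.extend(delimiter)   # list delimiter is spliced in
            if isinstance(value, list):
                result.extend(value)       # list values are spliced in
            else:
                result.append(value)
        return result
    # String mode: every value is formed; `quoted` wraps each in quotes.
    formed = ['"' + str(v) + '"' if quoted else str(v) for v in data]
    return str(delimiter).join(formed)


print(conjoin([[1], 2, [3]], "|"))           # [1, '|', 2, '|', 3]
print(conjoin(["one", 2, 3], ", "))          # one, 2, 3
print(conjoin(["one", 2], ", ", quoted=True))  # "one", "2"
```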
In theory, the
    pick [false-val true-val] none? option

pattern should be faster for straight value return than the pattern
    either option [true-val] [false-val]
when no evaluation is required.


I used while on purpose instead of foreach since it doesn't rebind 
the block passed to it. That should speed things up too.
The pick pattern is also good for nested options, like this

    pick pick [[no-a-no-b no-a-yes-b] [yes-a-no-b yes-a-yes-b]] none? a none? b
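The same table-lookup idea can be sketched in Python for comparison. The names `TABLE` and `choose` are hypothetical, and the test here is "is set" rather than REBOL's `none?`, because Python's `False` indexes the first element where REBOL's `pick` takes the first element on `true`.

```python
# Hypothetical names, for illustration only.
TABLE = [
    ["no-a-no-b", "no-a-yes-b"],
    ["yes-a-no-b", "yes-a-yes-b"],
]


def choose(a, b):
    """Select a constant from two optional flags by indexing, without branching."""
    # bool is a subclass of int in Python, so False/True index as 0/1.
    return TABLE[a is not None][b is not None]


print(choose(None, None))  # no-a-no-b
print(choose(1, None))     # yes-a-no-b
print(choose(1, 1))        # yes-a-yes-b
```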
The copy refinement on the delimit function should be faster than 
pre-copying the data because the copy is preallocated to size.
If you want to try something really fun, pass a no-argument function 
value as the delimiter argument. You can use this for all sorts of 
tricks, though if you are doing that the references to delimiter 
in the conjoin function should be put in parentheses for safety. 
Like this:
conjoin: func [
    "Join the values in a block together with a delimiter."
    data [any-block!] "The series to join"
    delimiter "The value to put into the series"
    /only "Inserts a series delimiter as a series."
    /quoted "Puts string values in quotes."
    /local
] [
    if empty? data [return make data 0]

    local: tail either series? local: first data [copy local] [form :local]
    while [not empty? data: next data] either any-string? local [
        either quoted [
            local: insert tail insert head local {"} {"}
            [local: insert insert insert insert local (delimiter) {"} first data {"}]
        ] [[local: insert insert local (delimiter) first data]]
    ] [pick [
        [local: insert insert local (delimiter) first data]
        [local: insert insert/only local (delimiter) first data]
    ] none? only]
    head local
]
I'm reasonably certain that these functions will fail awkwardly if 
the data contains unset! values. Should this be fixed?
Volker
7-Sep-2006
[1367x2]
Anton, "I try to avoid using extra variables". In my scenario it's 
not an extra variable. I have a block of data from elsewhere, so 
it is in a variable. I want that nicely formed, with delimiters. 
If I want the data inline, I need no conjoin; I can join it by putting 
it all in one string.
I have trouble finding a case where using it for inline blocks makes 
much sense. Maybe I have a blackout here; can you give a non-artificial 
case?
Anton
8-Sep-2006
[1369x5]
Brian:

The series function standard is
    function data-to-be-operated-on modifier-arguments

Yes I understand, but this standard doesn't make sense to me. For 
comparison my preferred pattern is:
    function smaller-argument larger-argument
I've pretty much abandoned the DO/NEXT idea for now.
Actually, from my tests it looks like EITHER is faster than PICK 
.. NONE? :
>> time-it: func [iterations code /local t0 t1][t0: now/precise loop iterations code t1: now/precise print difference t1 t0]
>> time-it 4000000 [pick [[a][b]] none? true]
0:00:04.306
>> time-it 4000000 [either true [[a]][[b]]]
0:00:03.555
>> time-it 4000000 [pick [[a][b]] none? none]
0:00:04.266
>> time-it 4000000 [either none [[a]][[b]]]
0:00:03.525
... which surprised me a little bit. I think maybe the reason pick 
is favoured in general use is because there's less typing.