World: r3wp
[Core] Discuss core issues
Anton 20-Oct-2006 [5774x2] | Jerry, if you're not aware, just a word of caution about line by line entry in the console; the console will attempt to mold the result. That means rebol will use at least the same amount of memory again just to create the molded string. |
I would avoid molding by putting length? on the front, and I'd also avoid line conversions done in non-binary mode: length? lines3: read/binary %/c/reg2.reg | |
Maxim 20-Oct-2006 [5776x2] | is it normal that join attempts to evaluate its second argument? ex:
>> join [1 2 3] [bogus-word]
** Script Error: bogus-word has no value
** Where: repend
** Near: bogus-word
where append does not give the error:
>> append copy [1 2 3] [bogus-word]
== [1 2 3 bogus-word] |
the help only talks about concatenation, no details about reducing the second argument , :-/ | |
Graham 20-Oct-2006 [5778] | try repend .. that will give you the error you seek! |
Maxim 20-Oct-2006 [5779] | exactly, I would have expected the error with rejoin. not join. |
Graham 20-Oct-2006 [5780] | except join normally converts the second argument to the datatype of the first |
Maxim 20-Oct-2006 [5781x2] | rejoin converts all internal values to the value of its first item in the block... similar... |
but if both arguments are blocks... it should not complain. | |
Graham 20-Oct-2006 [5783] | rambo it |
Maxim 20-Oct-2006 [5784x3] | even if I do a to-block on [bogus-word] I get no errors. |
I will :-) | |
sourcing join I see it uses repend instead of append. any gurus care to comment if they think this should be changed? | |
Anton 20-Oct-2006 [5787] | I don't think the function should be changed now. The doc string could be more descriptive, but it's pretty easy to read that short code. |
Maxim 20-Oct-2006 [5788x2] | I did a RAMBO on it... I understand the effects on current code, maybe it should be revised for R3? |
and yes, in the next R2 version, the doc string should be more explicit, | |
Gabriele 20-Oct-2006 [5790x2] | Max, the reason behind repend there is that join "a" something and join "a" [something something-else] should produce similar results. |
if you don't want the reduce, just use append copy "a" [something]. | |
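Gabriele's point can be seen in a short console sketch. It assumes REBOL 2 behaviour, and the word `something` and its value "x" are made up for illustration:

```rebol
>> something: "x"
== "x"
>> join "a" something                    ; single value: appended as-is
== "ax"
>> join "a" [something something]        ; block: reduced first, then appended
== "axx"
>> append copy "a" [something something] ; no reduce: the words themselves are formed
== "asomethingsomething"
```

With join, the single-value and block forms give consistent results, which is the reason given for using repend; append copy keeps the block contents unevaluated.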
Maxim 20-Oct-2006 [5792] | yeah I know, but join is just much cleaner in the code... and now I realise that its use is quite limited, since most uses of block with words have unbound words. |
Gabriele 20-Oct-2006 [5793] | hmm, well, it really depends. no one complained so far :-) |
Rebolek 20-Oct-2006 [5794] | Maxim, 'join is much cleaner in code? That's a matter of opinion, I use 'rejoin almost everywhere and 'join only rarely :) |
Maxim 20-Oct-2006 [5795x4] | I use rejoin a lot too. |
but not having to wrap everything in a block when you really just want to append a value to a copy is easy to read. | |
so I guess I'll just use rejoin for blocks and join for strings :-) | |
thanks Gabriele | |
Jerry 20-Oct-2006 [5799x2] | The following code:
unicode-to-ascii: func [ from to /local fs ts sz] [
    fs: open/binary/direct/read from
    ts: open/binary/direct/write to
    sz: size? from
    fs/1 fs/1 ; discard the first two bytes, FFFE
    for i 3 sz 2 [
        append ts to-char fs/1
        fs: skip fs 1 ; SKIP is the problem
    ]
    close fs
    close ts
]
unicode-to-ascii %/c/Unicode.txt %/c/Ascii.txt
In REBOL/View 1.2.7.3.1 12-Sep-2006 Core 2.6.0
** CRASH (Should not happen) - Expand series overflow
In REBOL/View 1.3.2.3.1 5-Dec-2005 Core 2.6.3
** Script Error: Not enough memory
** Where: do-body
** Near: fs: skip fs 1 |
Anton, thanks for the tip on avoiding molding. | |
Rebolek 20-Oct-2006 [5801] | Jerry: For conversion from/to UTF/UCS... you can use Oldes' unicode tools, it handles it very well (unfortunately you have to look around AltMe for some link, because Oldes does not upload to rebol.org and has his files all around the web - shame on you, Oldes! ;) |
Jerry 20-Oct-2006 [5802] | Thank you, Rebolek. |
Gregg 20-Oct-2006 [5803] | Depending on how you're going to do the actual diff, there are other ways you could work around this. You could try using the /seek refinement to read just parts of the files, you could split the files into chunks, or you could split them by top-level key (HKLM, HKCU, etc.); assuming you can read one full file into memory in order to do that. |
Jerry 20-Oct-2006 [5804] | To Gregg, I tried what you said. But there was a weird situation for the Windows Registry in my computer. If I export these 5 HKEY_??? into 5 files, respectively, the sum of their sizes is 568 MB. If I export all of them into 1 file, the file size is 316 MB, which is much smaller than 568 MB. I don't know why. So the 5-file version of Registry-Diff in REBOL might use more memory if the GC doesn't work well. |
Gregg 20-Oct-2006 [5805x2] | Hmmm. How are you diffing the data? |
i.e. do you expect things to be in the same order; using parse, matching keys after loading, etc. | |
Jerry 20-Oct-2006 [5807x2] | To Gregg, the diff algorithm I am using ...
2 blocks, one for reg-data-old (block1), the other for reg-data-new (block2). Data in these blocks are in the following format:
[ key1 value1 key2 value2 key3 value3 ... ]
where keyX and valueX are both strings. Example:
[ "HKEY_LOCAL_MACHINE_SOFTWARE_ABC" {"sid"=dword:00000001^/"tid"=dword:000000FF} ... ]
I use "SORT/SKIP 2" to sort the 2 blocks. It's very fast; I guess that's because the original data are in order already. After sorting, I can compare these two blocks with the "race" algorithm. The "race" algorithm is very simple ...
loop [
    if ... the key in block1 is equal to the key in block2
    then ... check their values (different values mean modified)
    if ... the key in block1 is less than the key in block2
    then ... the key in block1 is deleted-key. Move the key in block 1 to the next key.
    if ... the key in block1 is greater than the key in block2
    then ... the key in block2 is added-key. Move the key in block 2 to the next key.
]
Well, my English is not very good. I hope you understand what I am saying here. |
I would like to know ...
1. How to use the OPEN function with the /SEEK refinement to replace the 1,000,000th byte with the 2,000,000th byte in a file.
2. How to truncate a huge file to half its size, and keep the first half only.
Thanks. | |
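The "race" comparison described above could be sketched roughly as follows. This is not Jerry's actual code: the function name `race-diff`, the three result blocks and the sample data are all hypothetical. It assumes both blocks are flat [key value ...] pairs already sorted by key, and that CASE is available (REBOL/Core 2.6 or later).

```rebol
REBOL []

; Hypothetical helper: walks two sorted [key value key value ...] blocks
; in step and collects the keys that were added, deleted or modified.
race-diff: func [
    old [block!] new [block!]
    /local added deleted modified
][
    added: copy []
    deleted: copy []
    modified: copy []
    while [not all [tail? old tail? new]] [
        case [
            ; one side is exhausted: everything left on the other side
            ; is deleted (old only) or added (new only)
            tail? new [append deleted old/1  old: skip old 2]
            tail? old [append added new/1  new: skip new 2]
            ; same key on both sides: compare the values, advance both
            old/1 = new/1 [
                if old/2 <> new/2 [append modified old/1]
                old: skip old 2
                new: skip new 2
            ]
            ; key only in the old block: deleted
            old/1 < new/1 [append deleted old/1  old: skip old 2]
            ; otherwise the key is only in the new block: added
            true [append added new/1  new: skip new 2]
        ]
    ]
    reduce [added deleted modified]
]

probe race-diff ["a" "1" "b" "2" "c" "3"] ["a" "1" "b" "9" "d" "4"]
; prints [["d"] ["c"] ["b"]]  (added, deleted, modified)
```

Each key is visited once, so after the sort the comparison itself is linear in the total number of keys. Note that REBOL's default string comparison is case-insensitive, which matches the default behaviour of SORT.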
PeterWood 20-Oct-2006 [5809x3] | My answer to your first question (from reading http://www.rebol.net/article/0199.html )
>> write %testdata "123456789A123456789B" ;; A simple test file
>> fp: open/seek %testdata ;; Open the file in seek mode
>> fp: skip fp 19 ;; Move to 20th character
>> newval: copy fp ;; Copy the 20th character
== "B"
>> fp: head fp ;; Position at start of file
>> change at fp 10 newval ;; Overwrite 10th character
>> copy head fp ;; Check change made
== "123456789B123456789B"
>> close fp
>> fp: open %testdata
>> copy fp ;; Check file was changed
== "123456789B123456789B" |
Some of the statements can, of course, be consolidated and there are, no doubt, better ways of doing this. | |
Second question:
>> fp: open/direct %testdata
>> write %testdata2 copy/part fp 10
>> read %testdata2
== "123456789B" | |
Graham 20-Oct-2006 [5812] | Peter, the second answer would not work well with large files. I think Carl gives an example of how to copy large files and one should use that method. |
PeterWood 20-Oct-2006 [5813x2] | Thanks Graham |
Was it this example? http://www.rebol.net/article/0281.html | |
Graham 20-Oct-2006 [5815x2] | Yes. |
T'was | |
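For large files, the truncation could be done chunk by chunk through direct ports, in the spirit of the approach Graham mentions. The sketch below is not the code from the article: the function name `copy-head`, the 64 KB chunk size and the file names in the usage line are assumptions, and it has not been run against a large registry dump.

```rebol
REBOL []

; Hypothetical helper: copies only the first KEEP bytes of SRC into DEST,
; reading in chunks so the whole file never has to fit in memory.
copy-head: func [
    src [file!] dest [file!] keep [integer!]
    /local inp out buf
][
    inp: open/binary/direct/read src
    out: open/binary/direct/write/new dest
    while [
        all [
            keep > 0                              ; bytes still wanted
            buf: copy/part inp min keep 65536     ; next chunk (none at end of file)
            not empty? buf                        ; guard against an empty read
        ]
    ][
        append out buf                            ; write the chunk to the output port
        keep: keep - length? buf
    ]
    close inp
    close out
]

; Usage (file names are placeholders): keep the first half of a file
; copy-head %big.reg %big-half.reg to-integer (size? %big.reg) / 2
```

This writes a new, shorter file rather than truncating in place; the original can be deleted or replaced afterwards.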
Jerry 21-Oct-2006 [5817] | Thanks. The #0281 Article in the "Vivi la REBOLution" Blog is very helpful. Maybe I should read these articles all over again. BTW, I really hope that the web page at http://www.rebol.com/docs/changes.html will be updated to reflect the changes after 2.5.6. |
Gregg 21-Oct-2006 [5818] | OK, so it's not just the volume of raw registry data we're dealing with, but loading it into blocks as well. Do you have any idea how many keys there are that you're dealing with (i.e. how many block entries)? (Your English is very good BTW :-) |
Jerry 21-Oct-2006 [5819x2] | Gregg. There are 166,659 keys in my registry system. |
And loading the registry into a REBOL block is not a problem so far. I simply READ the registry-dump file as a string, then PARSE the long string into a block:
parse-reg: func [ file /local str blk ] [
    str: read file
    blk: copy []
    parse str [
        to "[HKEY_"
        some [
            copy txt thru "]" (append blk txt)
            skip
            copy txt [ to "[HKEY_" | to end ] (append blk trim txt)
        ]
    ]
    str: none
    blk
] | |
Graham 22-Oct-2006 [5821x3] | anyone got a function that converts rebol colours to html colour codes ? |
sint-to-hex: func [ smallint [integer!]][
    copy/part skip tail form to-hex smallint -2 2
]
rgb-to-hex: func [ c [tuple!] /local ][
    rejoin [ "#" sint-to-hex c/1 sint-to-hex c/2 sint-to-hex c/3 ]
] | |
something like that I guess | |
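The same idea can be folded into a single function. This is only a sketch under REBOL 2 assumptions (TO-HEX returning an 8-digit issue!); the name `color-to-html` is made up, and only the first three components of the tuple are used:

```rebol
REBOL []

; Hypothetical single-function variant: takes the last two hex digits
; of each of the first three tuple components (red, green, blue).
color-to-html: func [c [tuple!] /local out][
    out: copy "#"
    repeat i 3 [
        append out copy skip tail form to-hex pick c i -2
    ]
    out
]

print color-to-html 255.128.0   ; prints #FF8000
```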