World: r3wp
[rebcode] Rebcode discussion
Geomol 12-Feb-2008 [2332x2] | Hm, probably a bad idea to use arrows to navigate ram, because it makes them not work in the text area. |
A performance test program:
    lda #0
    sta &1001
.l1
    lda #0
    sta &1002
.l2
    lda #0
    sta &1003
.l3
    lda &1003
    adc #1
    sta &1003
    lda &1003
    bne l3
    lda &1002
    adc #1
    sta &1002
    lda &1002
    bne l2
    lda &1001
    adc #1
    sta &1001
    lda &1001
    bne l1
It takes 40s to run on a BBC emulator emulating a 1MHz 6502. It took around 14s using the rebcode emulator on my 1.2 GHz G4, and it took 9.5s using the rebcode emulator on my 2.4GHz Pentium 4. | |
Geomol 13-Feb-2008 [2334x3] | A similar rebcode performance test program might look like:
    ram: make binary! 3
    insert/dup ram #"^(00)" 3
    looptest: rebcode [/local a] [
        set a 0
        pokez ram 0 a
        label l1
        set a 0
        pokez ram 1 a
        label l2
        set a 0
        pokez ram 2 a
        label l3
        pickz a ram 2
        add a 1
        pokez ram 2 a
        eq a 256
        braf l3
        pickz a ram 1
        add a 1
        pokez ram 1 a
        eq a 256
        braf l2
        pickz a ram 0
        add a 1
        pokez ram 0 a
        eq a 256
        braf l1
    ]
It does 16'777'216 loops and takes less than 3 seconds on my 1.2 GHz G4. |
To sum it up: A 1MHz 6502 takes 40 sec to do 16'777'216 loops of this kind. Emulating the 6502 using rebcode can do the same thing in 14 sec (on a 1.2 GHz G4) and in 9.5 sec (on a 2.4 GHz P4). A pure rebcode program (no emulation) can do the 'same' 16'777'216 loops in around 2.7 sec on a 1.2 GHz G4. So a conclusion might be, that programming in rebcode is like having a 40 / 2.7 = 15 MHz cpu (if run on a 1.2 GHz G4). Is this a correct conclusion? | |
Is it known how many CPU clocks each rebcode instruction uses on average? | |
Henrik 13-Feb-2008 [2337] | sounds pretty slow? |
Geomol 13-Feb-2008 [2338x2] | I'm not sure. |
This is just one single test using only a few of the available instructions. To have a better view, more tests are needed. I made a similar loop in C, compiled it with gcc, and it runs around 6 times faster than the pure rebcode version. For now I wouldn't call rebcode slow, but it's not blazing fast either. | |
Pekr 13-Feb-2008 [2340] | and R3 rebcode is going to be even slower .... |
Geomol 13-Feb-2008 [2341] | There's something wrong with my compare with a 1MHz 6502. I counted the number of cycles in the inner loop and found 17 cycles. A 1MHz 6502 can then do 1'000'000 / 17 * 40 = 2'352'941 loops in 40 seconds. But the BeebEm emulator made 16.7 mio. loops in that time. It should have taken 285 sec. So programming in rebcode is more like a 107 MHz cpu in this test. (It's probably not correct to measure it this way.) |
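The two estimates in this exchange can be rechecked with a few lines of arithmetic. The sketch below is in Python rather than rebcode so it runs anywhere. All numbers are the ones quoted in the thread, the function names are made up for illustration, and the cycle-based estimate counts only the innermost loop, as Geomol does.

```python
# Figures quoted in the thread; nothing here was measured independently.
LOOPS = 256 ** 3               # three nested 8-bit counters = 16'777'216 inner iterations
CYCLES_PER_INNER_LOOP = 17     # Geomol's cycle count for the innermost loop
CLOCK_HZ = 1_000_000           # 1 MHz 6502
REBCODE_SECONDS = 2.7          # pure rebcode run on the 1.2 GHz G4
EMULATOR_SECONDS = 40          # wall-clock time reported for the BBC emulator


def real_6502_seconds(loops=LOOPS, cycles=CYCLES_PER_INNER_LOOP, clock_hz=CLOCK_HZ):
    """Seconds a real 1 MHz 6502 would need, counting only the innermost loop."""
    return loops * cycles / clock_hz


def equivalent_mhz(reference_seconds, rebcode_seconds=REBCODE_SECONDS):
    """How many times faster than the 1 MHz reference the rebcode run is, read as MHz."""
    return reference_seconds / rebcode_seconds


if __name__ == "__main__":
    print(f"naive estimate:       {equivalent_mhz(EMULATOR_SECONDS):.1f} MHz")
    print(f"real 6502 run time:   {real_6502_seconds():.0f} s")
    print(f"cycle-based estimate: {equivalent_mhz(real_6502_seconds()):.0f} MHz")
```

With the rounded 2.7 s figure the cycle-based estimate comes out near 106 MHz; the 107 MHz quoted above would correspond to a run time of roughly 2.67 s.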
BrianH 13-Feb-2008 [2342] | Rebcode is a higher-level language than 6502 assembler. Perhaps a peephole optimizer can rewrite your generated rebcode into better equivalent rebcode. |
Steeve 13-Feb-2008 [2343] | Geomol, I had a look at your emulator code; I think performance could be improved if you delay updating the flags until they are actually used. |
Geomol 13-Feb-2008 [2344] | Good idea! Do you have previous experience with emulators like this, because I have none. |
Steeve 13-Feb-2008 [2345x2] | In fact the engine is very similar to the Z80 one; I think we could make a meta-emulator using external data sheets (one for 6502, one for Z80) |
I made a Z80 emulator using rebcode (not complete); you can see it in galaga.r on rebol.org | |
Geomol 13-Feb-2008 [2347] | Ah, that was you. Someone mentioned that one lately. |
Steeve 13-Feb-2008 [2348x3] | ah BrianH, I remember that you made the same proposal for my Z80 emu (peephole optimization) |
hard to do | |
interesting to do on ROMs (static analysis before launching the code) but not valuable in RAM because the code can be modified | |
Geomol 13-Feb-2008 [2351] | Steeve, about flags: e.g. the zero flag Z (bit 1 of P). Instead of setting it each time A, X or Y becomes zero, I could save whichever of those (A, X or Y) changed in a variable, and then test that variable and set the flag correctly if and when the flag is actually used. Is that what you mean? |
Steeve 13-Feb-2008 [2352x3] | and limited because on the 6502, for example, many branches are computed (not static) |
yes Geomol, it's that | |
Flags are calculated from the last accumulator value, if I'm not mistaken | |
Geomol 13-Feb-2008 [2355] | ok. One optimization, I consider, is to cross-compile 6502 opcodes to rebcode, instead of emulating the 6502. That won't work with self-modifying code and branches will be a problem. So it's hard, but I think, it might work. |
Steeve 13-Feb-2008 [2356x2] | in theory |
I'll give you an example with the TAX opcode:
    ; updating flags in real time
    label TAX
    seti X A
    eq X 0
    either [or P 2] [and P 253]
    seti i X
    and i 128
    eq i 128
    either [or P 128] [and P 127]
    bra continue

    ; delay the calculation of flags
    label TAX
    seti X A
    or maskA (2 + 128) ; remember that we have to recalculate zero and negative flags using A, but don't do it now
    bra continue | |
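The two TAX variants differ only in when the zero and negative flags get worked out. A minimal Python sketch of the same idea follows; it is not taken from either emulator. `EagerCPU`, `LazyCPU`, `nz_source` and `status()` are hypothetical names, and only TAX is modelled. Instead of Steeve's `maskA` bitmask it remembers the value itself, which is closer to the variant Geomol describes.

```python
Z_FLAG = 0x02   # zero flag, bit 1 of P
N_FLAG = 0x80   # negative flag, bit 7 of P


def apply_nz(p, value):
    """Return P with Z and N recomputed from an 8-bit value."""
    p &= ~(Z_FLAG | N_FLAG) & 0xFF
    if value == 0:
        p |= Z_FLAG
    if value & 0x80:
        p |= N_FLAG
    return p


class EagerCPU:
    """Updates the flags on every TAX, like the first rebcode variant."""

    def __init__(self):
        self.a = self.x = self.p = 0

    def tax(self):
        self.x = self.a
        self.p = apply_nz(self.p, self.x)

    def status(self):
        return self.p


class LazyCPU:
    """Only remembers which value the flags depend on; resolves them on demand."""

    def __init__(self):
        self.a = self.x = self.p = 0
        self.nz_source = None       # value whose Z/N flags are still pending

    def tax(self):
        self.x = self.a
        self.nz_source = self.x     # no flag work here

    def status(self):
        if self.nz_source is not None:
            self.p = apply_nz(self.p, self.nz_source)
            self.nz_source = None
        return self.p


if __name__ == "__main__":
    for value in (0x00, 0x01, 0x7F, 0x80, 0xFF):
        eager, lazy = EagerCPU(), LazyCPU()
        eager.a = lazy.a = value
        eager.tax()
        lazy.tax()
        print(hex(value), eager.status() == lazy.status())
```

Every instruction that reads P (branches, PHP and so on) would have to go through something like `status()` first, so the saving depends on how often flags are written without being read.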
Geomol 13-Feb-2008 [2358] | Thanks! |
Steeve 13-Feb-2008 [2359] | you got the idea ? ;-) |
Geomol 13-Feb-2008 [2360] | Yup! :) |
Steeve 13-Feb-2008 [2361x2] | Did you consider that using PC as an offset (integer) instead of as a series could be faster? |
I should do a test before saying that | |
Geomol 13-Feb-2008 [2363] | I didn't consider it in much depth, actually. It can be improved, I'm sure. :) |
Andreas 7-Jan-2010 [2364] | anyone happen to still have a rebol/core binary with rebcode functionality archived somewhere? |
Steeve 7-Jan-2010 [2365] | it's not in the download section of rebol.com anymore ? |
Andreas 7-Jan-2010 [2366x2:last] | ah, got it: http://www.rebol.net/builds/042/rebview1350042.tar.gz |
resp. http://www.rebol.net/builds/031/rebview1350031.exe for windows | |