Multithreading with Rebol

[1/13] from: parki:whatevernot at: 14-Oct-2003 10:52

I took a quick look at Rugby, which is really pretty nifty. One thing though, which Rugby made clear - there seems to be no language support for threads in Rebol. If I am reading the Rugby docs correctly, the server side has to be implemented as cooperative multi-tasking, and there is a polling loop (no wait/notify semantics). This seems to be a bit of a hole, imho (unless there is a Rebol way?). I'm used to the threading support from Java, which is pretty powerful. Any thoughts/help appreciated.

parki...

[2/13] from: greggirwin:mindspring at: 14-Oct-2003 15:44

Hi Brian,

BP> ...there seems to be no language support for threads in Rebol.

Correct.

BP> This seems to be a bit of a hole, imho (unless there is a Rebol way?). I'm
BP> used to the threading support from Java, which is pretty powerful.

IIRC Carl has strong feelings against multi-threading. It can be a powerful tool, yes, but do we *need* it, and is it worth the complexity that comes with it?

-- Gregg

[3/13] from: parki:whatevernot at: 14-Oct-2003 19:20

Well, here's my take. I want to set up (say) a Rugby server, which will accept client connections and process them. I'd like each connection to be handled in its own thread (let's keep it simple and leave thread pools, resource utilization and the like out of this) as I need each client's response to be in reasonable time (pretend the processing involves I/O or something). I'm kinda astonished that I can't do this in Rebol proper.

Unless I'm missing something real obvious, the only way that I can do this is to rely on code that implements a cooperative multitasking queue (a la Rugby) or roll my own (!).

One way out is to rely on cgi-bin as the multi-threaded server, but then every client invocation causes a new Rebol process to be launched, which is resource intensive (perhaps there is a way to jigger the apache.conf file to keep one or more processes running, but that's beyond me).

I hate to bring in language wars (as I am quite taken with Rebol, and have been able to do some really cool stuff with this) but a multi-threaded server like this is bread 'n butter in Java (and others). Just my $0.02 (i.e. I'm not trying to incite a flame war).

parki...

[4/13] from: greggirwin::mindspring::com at: 14-Oct-2003 18:35

Hi Brian,

BP> I want to set up (say) a Rugby server, which will accept client
BP> connections and process them. I'd like each connection to be handled in
BP> its own thread (let's keep it simple and leave thread pools, resource
BP> utilization and the like out of this) as I need each client's response
BP> to be in reasonable time (pretend the processing involves I/O or
BP> something). I'm kinda astonished that I can't do this in Rebol proper.

I can't tell you not to *want* multi-threading, and I can't even say that you won't need it in some specific instances. But how valuable is it for 98% of the things we're doing, and does that outweigh the burden it puts on us as developers? Not to mention RT having to build a portable threading solution. :)

My experience is going to be a factor here. I've only done very limited multi-threaded development, though I've read a *lot*. I try to weigh the pros and cons objectively, but my preference is naturally based on my own subjective view and experiences.

Now, before condemning (strong word, I know) a cooperative threading approach, have you tried it and found it to be inadequate in some respect?

While I don't feel a need to have a pre-emptive threading system in REBOL, the topic does come up from time to time. Maybe someone should write an article comparing and contrasting things--though that's been done outside of a REBOL context.

BP> One way out is to rely on cgi-bin as the multi-threaded server, but
BP> then every client invocation causes a new Rebol process to be launched,
BP> which is resource intensive (perhaps there is a way to jigger the
BP> apache.conf file to keep one or more processes running, but that's
BP> beyond me).

I think someone (Robert or Maarten?) cooked up an LRWP module for REBOL to hook up to Xitami, not sure about Apache though. I use this approach (spawning processes, not LRWP) in a project where the hit rate is low and processes run a long time; it works well in that environment.

-- Gregg

[5/13] from: g:santilli:tiscalinet:it at: 15-Oct-2003 10:50

Hi Brian,

On Wednesday, October 15, 2003, 1:18:36 AM, you wrote:

BP> to be in reasonable time (pretend the processing involves I/O or
BP> something).

If it's just because of I/O, then you have async I/O (well, partially, but this is expected to be completed soon...), which is far simpler than multithreading. If you have a lot of calculations, instead, some form of multitasking is required. Uniserve uses a pool of REBOL processes to achieve this.

Multithreading seems like a nice thing to have, but if you consider all the things that are involved, you realize that most of the time it does more harm than good.

This said, one of the reasons why Nenad started the R# project (not to be confused with Andrew's R# :), is to create a multithreaded REBOL clone. We'll see how much trouble this will mean. ;-)

BP> Unless I'm missing something real obvious, the only way that I can do
BP> this is to rely on code that implements a cooperative multitasking
BP> queue (a la Rugby) or roll my own (!).

Rolling your own might be much easier than you'd expect.

BP> I hate to bring in language wars (as I am quite taken with Rebol, and
BP> have been able to do some really cool stuff with this) but a
BP> multi-threaded server like this is bread 'n butter in Java (and others).

Well, yes, if you pretend to ignore all the synchronization issues. As long as your threads do not share data, there's no difference in using multithreading instead of just launching other processes.

Regards, Gabriele.

-- Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amiga Group Italia sez. L'Aquila --- SOON: http://www.rebol.it/

[6/13] from: andrew:martin:colenso:school at: 24-Dec-2003 22:40

Gabriele wrote:

> ...one of the reasons why Nenad started the R# project (not to be
> confused with Andrew's R# :), is to create a multithreaded REBOL clone.

Perhaps R# was a r(H)ash choice... :)

Andrew J Martin
Attendance Officer & Grail Jedi.
Colenso High School
Arnold Street, Napier.
Tel: 64-6-8310180 ext 826
Fax: 64-6-8336759
http://colenso.net/scripts/Wiki.r?AJM
http://www.colenso.school.nz/

[7/13] from: robert:muench:robertmuench at: 17-Oct-2003 14:11

On Tue, 14 Oct 2003 19:18:36 -0400, Brian Parkinson <[parki--whatevernot--com]> wrote:

> I want to set up (say) a Rugby server, which will accept client
> connections and process them. I'd like each connection to be handled in

<<quoted lines omitted: 5>>

> this is to rely on code that implements a cooperative multitasking
> queue (a la Rugby) or roll my own (!).

Hi, don't be fooled by all this multi-threading hype. For example, have a look at www.xitami.com and their LRWP protocol. This is done using cooperative multi-tasking. Very cool and fast, and it can be coupled with Rebol quite easily. Remember: multi threading won't solve performance problems just because it's multi threaded...

-- Robert M. Münch
Management & IT Freelancer
Mobile: +49 (177) 245 2802
http://www.robertmuench.de

[8/13] from: andreas:bolka:gmx at: 17-Oct-2003 14:40

Friday, October 17, 2003, 2:11:47 PM, Robert wrote:

> On Tue, 14 Oct 2003 19:18:36 -0400, Brian Parkinson
> <[parki--whatevernot--com]> wrote:

<<quoted lines omitted: 11>>

> using a cooperative multi-tasking. Very cool and fast, and it can be
> coupled with Rebol quite easy.

The problem I see w/ coop multi-tasking (and I've implemented two coop mt server frameworks in REBOL) is that it's a viral, an all-or-nothing approach. If I have e.g. disk IO, I've to impl a simple file IO in a coop way, i.e. decomposing it into a state machine w/ small tasks that preferably don't block. If I've some computing intensive algorithm, once more, I've to state it in a coop way. And I (my very personal opinion based on my experiences with that kind of stuff in REBOL) find that decomposition rather boring and the result quite hard to maintain. Imho, first class continuations would help in this situation, but that's another topic.

So, when you control the whole application, it's at least possible to completely adhere to a coop oriented programming approach. Once I want to cleanly integrate with external apps or other libraries, yeah, problems may arise - who guarantees that the library of my choice does not use blocking IO, for example?

I'm in no way an expert in the theoretical issues surrounding this topic, nor am I a REBOL guru. So it may well be that I've missed some nifty ways to write something that won't require complete changes in all my coding habits. So - questions, corrections and suggestions are welcome :)

-- Best regards, Andreas
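A minimal sketch of the decomposition described above, assuming REBOL 2 semantics: a compute-heavy job (summing a block) is split into slices so that a round-robin loop can interleave it with other jobs. The names make-task, step-task, ready and cursor are hypothetical and are not taken from any framework mentioned in this thread.

```rebol
REBOL [Title: "Cooperative task sketch (hypothetical names)"]

; A task is only data: where we are in the input and the running total.
make-task: func [data [block!]] [
    make object! [pos: data  total: 0]
]

; Do at most SLICE additions, then give control back.
; Returns TRUE once the task has consumed all of its input.
step-task: func [task [object!] slice [integer!]] [
    loop slice [
        if tail? task/pos [break]
        task/total: task/total + first task/pos
        task/pos: next task/pos
    ]
    tail? task/pos
]

a: make-task [1 2 3 4 5]
b: make-task [10 20 30]
ready: reduce [a b]

; Round-robin: give every unfinished task one slice per pass.
while [not empty? ready] [
    cursor: ready
    while [not tail? cursor] [
        either step-task first cursor 2 [
            remove cursor            ; finished, drop it from the list
        ][
            cursor: next cursor      ; not finished, move to the next task
        ]
    ]
]

print [a/total b/total]              ; prints: 15 60
```

Every task has to be written so that step-task returns quickly. A single blocking call inside a slice stalls all the other tasks, which is the all-or-nothing property described in the message above.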

[9/13] from: maarten:vrijheid at: 17-Oct-2003 15:28

Hi Andreas,

> The problem I see w/ coop multi-tasking (and I've implemented two coop
> mt server frameworks in REBOL) is that it's a viral, an all-or-nothing

<<quoted lines omitted: 6>>

> to maintain. Imho, first class continuations would help in this
> situation, but that's another topic.

So let's do first-class continuations on the mezzanine level, re-implementing reduce as a block interpreter that recursively evaluates the values supplied. And then redo 'do ;-) , retrofit load etc. Anybody interested? And when we have done that, why not make 'em first class distributed in our full reflexive meta-language? Now that's GRID computing on steroids!

> I'm in no way an expert in the theoretical issues surrounding this
> topic, nor am I a REBOL guru. So it may well be that I've missed some
> nifty ways to write something that won't require complete changes in
> all my coding habits. So - questions, corrections and suggestions are
> welcome :)

Same here, but if you really want to, you'll learn what you need to know ;-) --Maarten

[10/13] from: petr:krenzelok:trz:cz at: 17-Oct-2003 16:01

Maarten Koopmans wrote:

>Hi Andreas,
>>The problem I see w/ coop multi-tasking (and I've implemented two coop

<<quoted lines omitted: 16>>

>distributed in our full reflexive meta-language? Now that's GRID
>computing on steroids!

How fast can it be? Well, anyway - I am not an expert on such things, but if your approach proves itself, why not take them down to the native level? And if we follow such thinking, why did the Rebol 2.0 family step away from the 1.0 model, which supported continuations? I remember e.g. Holger stating that if Rebol were not done the way it is, something like View would not be easily possible (or would be slow?) ...

... just curious, as R# plans on such a thing as continuations IIRC ...

-pekr-

[11/13] from: rotenca:telvia:it at: 17-Oct-2003 16:50

Hi Andreas,

> The problem I see w/ coop multi-tasking (and I've implemented two coop
> mt server frameworks in REBOL) is that it's a viral, an all-or-nothing

<<quoted lines omitted: 6>>

> to maintain. Imho, first class continuations would help in this
> situation, but that's another topic.

I agree completely. Writing cooperative code without continuations is very hard. A dialect can be used to handle particular cases of continuations in Rebol (I should say "cooperative continuations"). Otherwise the resulting code is a very big FSM which, among other things, should have a clear idea of the time costs of all operations, and this is hard to know for a script which should run without any change on every system and with different data input.

I'm not an expert, but surely a preemptive multitasking environment can be emulated with do/next. The result is probably a very slow program. I think that the Rebol language is not the best level at which to handle multitasking. I have also seen at least one program which implements what seems to be a multitasking dispatcher/scheduler. But the code for this stuff is 1) very long 2) hard to read 3) hard to change 4) sometimes slow (I think). So some of the best reasons to use Rebol are totally lost with even simple multitasking emulation. The lack of async I/O on files makes this kind of script harder to write.

--- Ciao Romano
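The do/next idea mentioned above can be sketched roughly as follows, assuming REBOL 2, where do/next evaluates a single expression and returns a block holding the result and the remaining block. The task contents and the names order, tasks, cursor and rest are hypothetical. This only interleaves at expression boundaries: one long-running expression (a loop, a blocking read) still holds up every other task, so it is an emulation rather than real preemption.

```rebol
REBOL [Title: "do/next interleaving sketch (REBOL 2 assumed)"]

order: copy []    ; records the order in which expressions ran

; Two "tasks", each just a block of expressions (hypothetical workload).
tasks: reduce [
    [append order 'a1  append order 'a2  append order 'a3]
    [append order 'b1  append order 'b2]
]

while [not empty? tasks] [
    cursor: tasks
    while [not tail? cursor] [
        ; REBOL 2: do/next returns [result remaining-block];
        ; only the remainder is needed here.
        rest: second do/next first cursor
        either tail? rest [
            remove cursor                ; task ran to completion
        ][
            change/only cursor rest      ; remember where to resume
            cursor: next cursor
        ]
    ]
]

probe order    ; [a1 b1 a2 b2 a3]
```

Each pass pays for a do/next call, a block allocation and some list bookkeeping per expression, which is consistent with the expectation above that such a scheduler would be slow.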

[12/13] from: joel:neely:fedex at: 17-Oct-2003 10:08

Hi, Robert,

A couple of thoughts on the other side...

Robert M. Münch wrote:

> Hi, don't be fooled by all this multi-threading hype. For example,
> have a look at www.xitami.com and their LRWP protocol. This is done
> using a cooperative multi-tasking. Very cool and fast, and it can
> be coupled with Rebol quite easy.

I don't know what you mean by "hype", nor what that has to do with speed. I've never thought of parallelism in terms of the speed of a single task, but as a design/expressiveness issue.

Remember: it's possible to express *any* computation in terms of only sequence, alternation, and iteration (e.g. block, IF, and WHILE) but few of us choose to restrict ourselves to only those mechanisms. For that matter, most of us would prefer to write (e.g.):

    foo: func [b [block!] n [integer!] ...] [
        ... expressions with B and N ...
        ... final expression with B and N
    ]
    ...
    blort: foo someblock 23

instead of

    foo-b: foo-n: foo-result: none
    foo-exprs: [
        ... expressions with B and N ...
        foo-result: ... final expression with B and N
    ]
    ...
    foo-b: someblock
    foo-n: 23
    do foo-exprs
    blort: foo-result

The gain in expressiveness from having functions renders irrelevant the contention that we could find other ways to get the job done without them.

There are some problems whose solution can be most naturally expressed in recursive terms. Likewise, there are some problems which can be expressed most clearly as a collection of distinct processes with well-defined collaboration patterns. Consider the popularity of "|" as a means of structuring computations via the *nix shell. Of course, anything that can be done with | can also be done in a single-threaded program, but then the programmer has to concern herself/himself with implementation/algorithm details that are simply irrelevant at the level of the original problem (e.g. buffering, distinguishing push-driven and pull-driven variations of the same algorithm, etc.)

Finally, there are cases in which one of a set of collaborating activities should be allowed to "stall" for a time without forcing all others to wait. The fact that we can (sometimes!) deconstruct our code in a scheme to allow this to be managed in a single thread only means that we now have to add the issues of that scheme to the things to consider in doing our design, instead of being able to keep our focus on the top problem.

I've recently been involved in several (very hard-core practical) projects where parallelism made significant contribution to the simplicity of the solution, overall performance, or both.

> Remember: multi threading won't solve performance problems just
> because it's multi threaded...

Agreed, but... Remember: single-threading won't solve design problems just because only one thing is happening at a given instant! ;-)

-jn-

--
Joel Neely joelDOTneelyATfedexDOTcom 901-263-4446
Counting lines of code is to software development as counting bricks is to urban development.

[13/13] from: greggirwin:mindspring at: 17-Oct-2003 11:24

Hi Andreas,

AB> The problem I see w/ coop multi-tasking (and I've implemented two coop
AB> mt server frameworks in REBOL) is that it's a viral, an all-or-nothing
AB> approach.

I agree. One of the biggest issues--at least for me--was getting over the initial hurdle of realizing that. The other difficulty is that you often won't realize any benefits in simple applications, so it's easier just to do things directly in response to UI events and such.

AB> If I have e.g. disk IO, I've to impl a simple file IO in a
AB> coop way, i.e. decomposing it into a state machine w/ small tasks
AB> that preferably don't block. If I've some computing intensive
AB> algorithm, once more, I've to state it in a coop way.

For me, it's largely an issue of "can I mentally walk through a sequence of events and visualize how the code is going to behave?" For me, the cooperative approach, especially combined with FSMs, tends to make things very clear (and it's portable too! :).

For pre-emptive threading, it's very hard to be sure that your code is really correct, i.e. that it will work correctly in all cases. One reason I find it hard is that you can't duplicate exact timing or interrupt scenarios. With a cooperative system, it's predictable and reproducible. That is, my I/O may block, and cause a delay, but it won't work "most of the time" and do something potentially *very* bad when it doesn't work--depending on exactly what you're doing of course.

AB> I'm in no way an expert in the theoretical issues surrounding this
AB> topic, nor am I a REBOL guru. So it may well be that I've missed some
AB> nifty ways to write something that won't require complete changes in
AB> all my coding habits.

I'm not an expert on this either. My bias comes--I think--from how effectively I've used each technique without having to become an expert.

For me, the FSM-based stuff I've done for this kind of thing *does* require a different mindset, and I don't use it for very simple things (which is most of what we do I'd guess). For more complex things, I really think it helps me.

I have a project with a dynamic UI, that can request various types of remote data (project keys, etc. via Rugby) and display incoming MJPG data. At its core is an FSM. These various event sources feed a queue much like a coop system would work. It may not be the best solution, but it's easy to understand, reliable, and performs well. For testing, I can "stuff" events into the queue; even create entire test scenarios of event sequences.

Of course, what works for me may not work for you. :)

-- Gregg
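The queue-fed FSM described in the last message might look something like the sketch below, assuming REBOL 2. It is not Gregg's code: the states (idle, waiting, showing), the events (request, data, timeout, done) and the names post, handle, run-queue and visited are all hypothetical, chosen only to show how a scripted event sequence can be "stuffed" into the queue for testing.

```rebol
REBOL [Title: "Queue-driven FSM sketch (hypothetical states and events)"]

queue:   copy []    ; pending events, oldest first
visited: copy []    ; state after each event, kept for inspection
state:   'idle

; Any event source (UI, network, timer) would call POST.
post: func [event [word!]] [append queue event]

; One transition per event; unknown events leave the state unchanged.
handle: func [event [word!]] [
    switch state [
        idle    [if event = 'request [state: 'waiting]]
        waiting [
            if event = 'data    [state: 'showing]
            if event = 'timeout [state: 'idle]
        ]
        showing [if event = 'done [state: 'idle]]
    ]
]

; Drain the queue in arrival order.
run-queue: func [/local event] [
    while [not empty? queue] [
        event: first queue
        remove queue
        handle event
        append visited state
    ]
]

; A test scenario: stuff a scripted event sequence into the queue.
foreach event [request data done request timeout] [post event]
run-queue
probe visited    ; [waiting showing idle waiting idle]
```

Because events are handled strictly one at a time and in order, the same scripted sequence yields the same state history on every run, which is the reproducibility argument made in the message above.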

Notes

Quoted lines have been omitted from some messages.