Mailing List Archive: 49091 messages


From: holger:rebol at: 3-Oct-2001 8:29

On Wed, Oct 03, 2001 at 09:03:51AM +0200, Maarten Koopmans wrote:
> It works well but you are switching between threads which is overhead
> you can save if you do it yourself within one process. Unless you have
> native threads, SMP, and multiple processors, because you are not
> switching then.
Actually, you are, if the threads are tightly coupled, and you may end up with less performance than if you just forgot about SMP support in your threading layer altogether. Sun learned that the hard way in their attempt to support Solaris SMP in Java.

There are basically three types of threading support:

1. Single CPU, non-preemptive, switching completely under application control (in the case of an interpreter, "application" is the interpreter, not the script).
2. Single CPU, preemptive (at least partially, e.g. by having no control over switching during OS calls, or by having Unix signals trigger switching).
3. Multiple CPUs. Whether switching is preemptive on each CPU or not does not matter for the purpose of this analysis.

Intuitively 3 appears to perform better than 2, and 2 better than 1. The rule is: if your threads are loosely coupled (do not have many resource dependencies) then the intuition is correct, and 3 is preferable. However, if threads are tightly coupled then it is the other way around: 1 performs best and 3 worst. Here is why:

With tightly coupled threads only one thread can physically run at any time anyway, because it needs to lock all (low-level) resources. For interpreters you would have to, e.g., lock symbol tables, name-value lookup tables (for REBOL: contexts) and a lot of other things. Almost every statement in a script makes changes in some way, so you need to lock exclusively. Locking at a smaller granularity than "everything" is usually impossible, because there is no natural hierarchy to data structures, and multiple locks would thus inevitably either lead to deadlocks, or require such an enormous number of lock layers and individual locks to avoid deadlocks that overall performance gets even worse than with one single global lock. Knowing that, you can pretty much forget about SMP benefits, because only one thread, and thus one CPU, can run at any time anyway.
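A minimal sketch of that "one single global lock" strategy, in Python rather than REBOL's actual internals (the symbol table and thread counts here are made up for illustration): every "statement" must take the global lock exclusively, so the threads serialize and additional CPUs would sit idle.

```python
import threading

symbol_table = {}               # hypothetical shared interpreter state
global_lock = threading.Lock()  # one coarse lock instead of many fine-grained ones

def interpret(thread_id, steps):
    # almost every statement mutates shared state, so each one locks exclusively
    for i in range(steps):
        with global_lock:
            symbol_table[f"t{thread_id}-{i}"] = i

threads = [threading.Thread(target=interpret, args=(t, 1000)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 4000 entries: the result is correct, but execution was fully serialized,
# so running the four threads on four CPUs would buy nothing.
print(len(symbol_table))
```

CPython itself ended up with exactly this design (its global interpreter lock), which is one data point in favor of the argument above.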
That means cases 2 and 3 become very similar in performance. 3 is actually worse, because when switching from one thread to another in case 2 you only have ONE THREAD switch. In case 3 you have TWO TASK switches (because each CPU switches between one thread in your overall task and some OTHER task): MUCH more overhead. By going from 2 to 3 you are trading light-weight switching for heavy-weight switching, without any net benefit from parallelism: bad idea.

And 1 performs better than 2 and 3 because it allows implicit locking. If the runtime system controls thread switching then it does not need to lock resources that are shared among threads at all: the active thread owns them by convention while it executes. Of course this only applies to low-level resources within the interpreter, which are accessed "below" the switching layer. It says nothing about high-level resources which are accessed "above" the switching layer, e.g. non-atomic data structures within your script. You will still have to provide semaphore locks for those, but that is true for all three cases.

It is interesting to see that the natural reaction of a lot of people is to assume that preemptive switching is always superior to non-preemptive switching. This is actually not the case. The intuition probably comes from experience with Windows, where the change to preemptive switching provided so many advantages. However, that is only because a lot of Windows applications were derived from MS-DOS or modeled after existing MS-DOS applications, and were therefore not written in a multitasking-friendly manner to start with. With non-preemptive switching a run-away application could lock up the whole system. With preemptive switching it would only result in a task "not responding", keeping the system alive. If all tasks "played by the rules" then you would not see any difference between preemptive and non-preemptive switching at all.
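The implicit locking of case 1 can be sketched with Python generators standing in for non-preemptive green threads (a toy scheduler, not any real threading engine): switches happen only where a thread explicitly yields, so the running thread owns all low-level shared state by convention and no locks are needed.

```python
from collections import deque

shared = {"counter": 0}  # shared low-level state, accessed with no locks at all

def worker(n):
    for _ in range(n):
        # this read-modify-write is safe without a lock, because a switch
        # can only happen at the explicit yield below, never mid-statement
        shared["counter"] += 1
        yield  # the only point where the scheduler may switch threads

ready = deque(worker(100) for _ in range(3))
while ready:                 # round-robin, non-preemptive scheduler
    thread = ready.popleft()
    try:
        next(thread)         # run the thread until its next yield
        ready.append(thread)
    except StopIteration:    # thread finished; drop it
        pass

print(shared["counter"])  # 300, correct, with zero locking overhead
```

With preemptive threads the same `+= 1` would need a lock (or an atomic), which is exactly the extra overhead the text attributes to cases 2 and 3.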
Of course for an operating system you would still want preemptive switching at the task level, but that's a different matter...

Things are a lot different if each thread runs within an interpreter (regardless of whether it is a "real" interpreter like REBOL or a bytecode interpreter like Java). In that situation the interpreter itself can "play nicely" with the threading engine, within the interpreter loop, making run-away threads impossible, i.e. even with something like "forever []" in a non-preemptive threading environment you would still not lock up other threads, because the interpreter itself would preempt threads, even if the underlying threading engine is non-preemptive.

This means in this kind of environment preemptive threading has no advantages over non-preemptive threading, but it does have one disadvantage: higher overhead, and thus worse performance, because of locking requirements. The same is true for SMP threading (preemptive or non-preemptive). It is no surprise that, in my experience, most people running Java in Solaris SMP environments choose not to enable Java's native SMP support, but stick with Java's single-CPU "green threads" layer instead, which is basically a non-preemptive single-CPU threading engine with wrappers around system functions to allow "controlled preemption".

Same thing for REBOL: if/when we do support multi-threading it is very likely going to be a solution that falls into category 1, supporting single CPUs only (per REBOL process), with implicit internal locking and some primitives at the REBOL language layer to manipulate threads and priorities, and provide mutual exclusion. At the script layer the engine would appear preemptive, but under the hood it would not be, thus no SMP.
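How an interpreter can preempt a runaway "forever []" on top of a non-preemptive engine can be sketched like this (a toy interpreter loop; the quantum size and thread names are invented for the example): the interpreter counts executed instructions and forcibly reschedules after each quantum, so an infinite loop cannot starve other threads.

```python
from collections import deque
import itertools

QUANTUM = 10  # instructions per time slice (hypothetical value)

def forever():
    # a runaway script, like REBOL's `forever []`: it never terminates
    for _ in itertools.count():
        yield  # each yield stands for one interpreted instruction

def finite(results, n):
    for i in range(n):
        results.append(i)
        yield

results = []
ready = deque([forever(), finite(results, 25)])
slices = 0
while ready and slices < 20:       # run 20 time slices, then stop the demo
    thread = ready.popleft()
    try:
        for _ in range(QUANTUM):   # the interpreter, not the engine, preempts
            next(thread)           # the thread after QUANTUM instructions
        ready.append(thread)
    except StopIteration:          # thread finished; drop it
        pass
    slices += 1

print(results)  # the finite thread completed despite the infinite one
```

At the script layer this looks preemptive, exactly as described for the planned REBOL design, even though the underlying switching is cooperative.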
> > We need multithread to make really serious apps with Rebol, especially
> > for serious server apps ! IIRC, /Serve use multithreading internally.
> >
> > [...]
>
> Async I/O I thought ;-) Jeff, Holger?
Yes, async i/o. No real multithreading. Of course single-threading and async i/o, combined with a state machine per pending task, leads to pretty much exactly the same performance (sometimes even better) as multi-threading with blocking i/o per thread.

One advantage of the single-threading solution is that load balancing, fairness, resource management, sequencing etc. usually become much easier, because the information required to manage that is explicitly available in the data structures, not hidden within the threading environment. The main drawback is that writing programs that way takes some getting used to.

Sometimes people refer to the first technique as "multi-threading" as well, simply because the end result is the same, and to benefit from the "buzz" that the term multi-threading creates among the less technical crowd :-). For instance a lot of so-called "multi-threaded filesystems" or "multi-threaded TCP/IP stacks" don't really use multi-threading at all, but actually use single-threading with async i/o and state machines. The user never knows the difference, neither do benchmarks. You only recognize it once you see the source code.

-- Holger Kruse [holger--rebol--com]
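[Editorial sketch of the single-threaded async-i/o-plus-state-machine pattern described above, using Python's `selectors` module and local socket pairs in place of real network connections; the connection names and buffers are invented for the demo. One thread multiplexes all connections, and each connection's progress lives in an explicit state object rather than in a per-connection thread's stack.]

```python
import selectors
import socket

sel = selectors.DefaultSelector()
results = {}

def handle(conn, state):
    # the per-connection state machine: accumulate bytes until EOF
    data = conn.recv(1024)
    if data:
        state["buf"] += data            # progress is stored explicitly,
    else:                               # not hidden in a thread's stack
        results[state["name"]] = state["buf"]
        sel.unregister(conn)
        conn.close()

pairs = []
for name in ("a", "b"):
    client, server = socket.socketpair()
    server.setblocking(False)           # async i/o: never block on one peer
    sel.register(server, selectors.EVENT_READ, {"name": name, "buf": b""})
    pairs.append((name, client))

for name, client in pairs:              # simulate two clients sending data
    client.sendall(f"hello from {name}".encode())
    client.close()                      # close -> recv() eventually returns b""

while len(results) < 2:                 # the single-threaded event loop
    for key, _ in sel.select():
        handle(key.fileobj, key.data)

print(results)  # both connections served by one thread, with no blocking
```

Because all scheduling state is in plain data structures (`results`, the per-connection dicts), fairness and resource accounting are directly inspectable, which is the advantage the text describes.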