Mailing List Archive: 49091 messages

[REBOL.org] Outage

 [1/7] from: SunandaDH:aol at: 24-Jun-2004 15:02


Just about the only "official" REBOL site not to crash in the past few weeks was REBOL.org. But now it has joined the others.

REBOL.org just got taken down by the ISP because there were 100s of unterminated CGI processes running.

It's been a busy couple of days with triple the usual messages, mainly due to spiders being very active. But, even so, all processes should terminate at a quit.

I know this sort of problem has been raised before (by Graham?). If anyone has any ideas, I'd be very grateful to hear them.

Versions:
   OS: FreeBSD
   Server: Apache 1.3
   REBOL: Core 2.5.6.4.1

The site is back up now, but the ISP techies are keeping a close eye on it.

Thanks,
Sunanda.

 [2/7] from: g:santilli:tiscalinet:it at: 25-Jun-2004 13:59


Hi SunandaDH,

On Thursday, June 24, 2004, 9:02:28 PM, you wrote:

Sac> But, even so, all processes should terminate at a quit.

I think the real problem is that only one of the processes actually terminates. Wild guess --- this could be related to the sysport bug, one of the processes hangs after getting a signal. (REBOL starts two processes on Unix.)

Regards,
   Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amiga Group Italia sez. L'Aquila --- SOON: http://www.rebol.it/

 [3/7] from: SunandaDH::aol::com at: 26-Jun-2004 18:14


Gabriele:
> I think the real problem is that only one of the processes
> actually terminates. Wild guess --- this could be related to the
> sysport bug, one of the processes hangs after getting a signal.
> (REBOL starts two processes on Unix.)

Thanks. I didn't know about the two processes... It certainly looks like one of them (at least) doesn't terminate under some conditions. But most days, it's fine.

Theories vary about why, but a common theme is that failure is triggered when the server is under a heavier than usual load. On our failure day, we had twice as many messages as on an average day.

So, maybe, the problem is (also) related to having two or more independent REBOL CGIs running at once -- maybe they deadlock on trying to access some common resource. If so, it puts an uncomfortably low ceiling on the number of messages we can handle in one day -- at least until the problem is fixed.

Our ISP tech support say that the unterminated processes were doing *something* -- not just sitting there hogging memory and resources, but also eating lots of CPU cycles.

I've attempted to cure the symptom by banning bots that have been too aggressive.

One problem any site may be having right now is the overly-aggressive msnbot -- it can drain a month's worth of bandwidth from an innocent site in a single day by compulsive repeated spidering of the same page. And all to no useful effect -- there is no publicly available search engine for msnbot results as yet.

msnbot supports the non-standard robots.txt command, crawl-delay. I'd recommend anyone with a website to add something like this to their robots.txt, lest msnbot comes and over-spiders you:

   User-Agent: msnbot
   Crawl-Delay: 20

The REBOL.org robots.txt bans many time-wasting bots. You can see a copy here:
http://www.rebol.org/robots.txt

Sunanda.
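
A quick way to confirm that a crawl-delay rule like the one above is parsed as intended is to feed it to a robots.txt parser before deploying it. The following is a minimal sketch in Python using the standard library's `urllib.robotparser`; the helper name `crawl_delay_for` and the second agent name `somebot` are illustrative assumptions, not anything used by REBOL.org.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted in the message above.
ROBOTS_TXT = """\
User-Agent: msnbot
Crawl-Delay: 20
"""


def crawl_delay_for(agent, robots_txt=ROBOTS_TXT):
    """Return the Crawl-Delay (in seconds) that applies to `agent`, or None."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


print(crawl_delay_for("msnbot"))   # delay requested of msnbot
print(crawl_delay_for("somebot"))  # hypothetical bot with no matching rule
```

Note that this only shows what a well-behaved crawler should read from the file; whether a given bot honours the non-standard directive is up to the bot.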

 [4/7] from: amicom:sonic at: 2-Jul-2004 22:43


The ISP of one of my clients went so far as to write an "unrebol" program that kills Rebol CGI processes that don't terminate themselves. While Carl and I were visiting the ISP's owner one day, he showed Carl the problem and Carl said the problem was fixed with the latest version. From what I've seen since then, the problem persists.

I hope Carl can find the time to track and kill this bug, although with the unrebol program in place, it isn't as high of a priority for me.

Bohdan "Bo" Lechnowsky
Lechnowsky Technical Consulting

At 03:02 PM 6/24/04 -0400, you wrote:

 [5/7] from: hallvard:ystad:oops-as:no at: 3-Jul-2004 21:56


I too had to write a script to (re-)kill dead rebol processes. I had it run nightly as a cron job. But then with /core 2.5.6, these zombie processes disappeared. This is on Mac OS X 10.2.

Regards,
HY

Dixit Bohdan or Rosemary Lechnowsky (07.43 03.07.2004):

 [6/7] from: gchiu:compkarori at: 4-Jul-2004 16:17


Hallvard Ystad wrote.. apparently on 3-Jul-2004/21:56:04+2:00
>I too had to write a script to (re-)kill dead rebol processes. I had it run nightly as a cron job. But then with /core 2.5.6, these zombie processes disappeared. This is on a mac osx 10.2.

Can you post this script?

--
Graham Chiu
http://www.compkarori.com/cerebrus
http://www.compkarori.com/rebolml
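
Neither cleanup script appears in this thread. As a rough sketch only, and not the actual "unrebol" program or the cron script mentioned above, a nightly cleanup along those lines might look like this in Python. The process name `rebol`, the 300-second threshold, and all function names are assumptions. It also assumes a `ps` that understands the `etimes` (elapsed seconds) output keyword, which should be checked on the target system.

```python
import os
import signal
import subprocess

# Assumed limit: no CGI request should legitimately run this long.
MAX_AGE_SECONDS = 300


def find_stale_pids(ps_output, name="rebol", max_age=MAX_AGE_SECONDS):
    """Return PIDs of processes named `name` that are older than `max_age`.

    `ps_output` is text with one process per line:
    PID, elapsed seconds, command name.
    """
    stale = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # blank or malformed line
        pid_text, age_text, command = parts
        if not (pid_text.isdigit() and age_text.isdigit()):
            continue  # header line or unexpected format
        if os.path.basename(command.strip()) == name and int(age_text) > max_age:
            stale.append(int(pid_text))
    return stale


def kill_stale(pids):
    """Send SIGKILL to each PID, ignoring processes that have already exited."""
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the process ended between listing and killing


def main():
    """List all processes and kill the stale ones; intended to be run from cron."""
    listing = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "etimes=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    kill_stale(find_stale_pids(listing))
```

Calling `main()` from a small script scheduled by cron would reproduce the nightly arrangement described above. Matching on age as well as name is a deliberate choice here, so that CGI requests still being served normally are left alone.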

 [7/7] from: andreas:bolka:gmx at: 4-Jul-2004 17:18


Thursday, June 24, 2004, 9:02:28 PM, SunandaDH wrote:
> REBOL.org just got taken down by the ISP because there were 100s of
> unterminated CGI processes running.

> Versions:
> OS: FreeBSD
> Server: Apache 1.3
> REBOL: Core 2.5.6.4.1.

we had similar problems with vanilla and older REBOL/Core versions. at least on my machine [1], the zombies went away with REBOL/Core 2.5.6.

i _think_ i remember that the problem was related to dns:// timeouts, but i'm not sure.

[1] Linux 2.4.20, Apache 1.3.26, REBOL/Core 2.5.6.4.2

--
Best regards,
 Andreas