[REBOL.org] Outage
[1/7] from: SunandaDH:aol at: 24-Jun-2004 15:02
Just about the only "official" REBOL site not to crash in the past few weeks
was REBOL.org.
But now it has joined the others.
REBOL.org just got taken down by the ISP because there were 100s of
unterminated CGI processes running.
It's been a busy couple of days with triple the usual messages, mainly due to
spiders being very active.
But, even so, all processes should terminate at a quit.
I know this sort of problem has been raised before (by Graham?).
If anyone has any ideas, I'd be very grateful to hear them.
Versions:
OS: FreeBSD
Server: Apache 1.3
REBOL: Core 2.5.6.4.1.
The site is back up now, but the ISP techies are keeping a close eye on it.
Thanks,
Sunanda.
[2/7] from: g:santilli:tiscalinet:it at: 25-Jun-2004 13:59
Hi SunandaDH,
On Thursday, June 24, 2004, 9:02:28 PM, you wrote:
Sac> But, even so, all processes should terminate at a quit.
I think the real problem is that only one of the processes
actually terminates. Wild guess --- this could be related to the
sysport bug, one of the processes hangs after getting a signal.
(REBOL starts two processes on Unix.)
Regards,
Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amiga Group Italia sez. L'Aquila --- SOON: http://www.rebol.it/
[3/7] from: SunandaDH::aol::com at: 26-Jun-2004 18:14
Gabriele:
> I think the real problem is that only one of the processes
> actually terminates. Wild guess --- this could be related to the
> sysport bug, one of the processes hangs after getting a signal.
> (REBOL starts two processes on Unix.)
Thanks. I didn't know about the two processes... It certainly looks like one
of them (at least) doesn't terminate under some conditions. But most days,
it's fine.
Theories vary about why, but a common theme is that failure is triggered when
the server is under a heavier than usual load.
On our failure day, we had twice as many messages as on an average day.
So, maybe, the problem is (also) related to having two or more independent
REBOL CGIs running at once -- maybe they deadlock on trying to access some
common resource.
If so, it puts an uncomfortably low ceiling on the number of messages we can
handle in one day -- at least until the problem is fixed.
Our ISP tech support say that the unterminated processes were doing
*something* -- not just sitting there hogging memory and resources, but also eating
lots of CPU cycles.
I've attempted to cure the symptom by banning bots that have been too
aggressive.
One problem any site may be having right now is the overly-aggressive msnbot
-- it can drain a month's worth of bandwidth from an innocent site in a single
day by compulsive repeated spidering of the same page.
And all to no useful effect -- there is no publicly available search engine
for msnbot results as yet.
msnbot supports the non-standard robots.txt directive Crawl-delay. I'd
recommend that anyone with a website add something like this to their robots.txt, lest
msnbot come and over-spider you:
User-Agent: msnbot
Crawl-Delay: 20
The REBOL.org robots.txt bans many time-wasting bots. You can see a copy here:
http://www.rebol.org/robots.txt
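
One way to check that a Crawl-Delay rule like the one above is being read as intended is to run it through a robots.txt parser. The sketch below uses Python's standard urllib.robotparser on the two-line rule quoted in this message. The second bot name is a made-up example, and how msnbot itself interprets the value is not something this check can confirm.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted above, as it would appear in a robots.txt file.
ROBOTS_TXT = """\
User-Agent: msnbot
Crawl-Delay: 20
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay (in seconds) that the parser associates with msnbot.
msnbot_delay = parser.crawl_delay("msnbot")

# "SomeOtherBot" is a hypothetical user agent with no matching rule,
# so the parser reports no delay for it.
other_delay = parser.crawl_delay("SomeOtherBot")

print(msnbot_delay, other_delay)
```

Because the rule names only msnbot, other crawlers are unaffected by it; they would each need their own entry or a wildcard entry.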
Sunanda.
[4/7] from: amicom:sonic at: 2-Jul-2004 22:43
The ISP of one of my clients went so far as to write an "unrebol" program
that kills Rebol CGI processes that don't terminate themselves. While Carl
and I were visiting the ISP's owner one day, he showed Carl the problem and
Carl said the problem was fixed with the latest version. From what I've
seen since then, the problem persists.
I hope Carl can find the time to track and kill this bug, although with the
unrebol program in place, it isn't as high of a priority for me.
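
The source of the "unrebol" program is not part of this thread, so the following is only a rough sketch of the general idea: look at a process listing and pick out REBOL processes that have been running far longer than a CGI request should. It assumes the listing comes from `ps -eo pid,etime,comm`, that the processes are named "rebol", and that five minutes is a reasonable cut-off. All three are assumptions, as are the function names.

```python
import re

# Elapsed-time column as printed by `ps -o etime`: [[dd-]hh:]mm:ss
ETIME_RE = re.compile(r"^(?:(?:(\d+)-)?(\d+):)?(\d+):(\d+)$")


def etime_to_seconds(etime):
    """Convert a ps elapsed-time string such as '2-03:04:05' to seconds."""
    match = ETIME_RE.match(etime)
    if match is None:
        raise ValueError(f"unrecognised etime: {etime!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def find_stale(ps_output, name="rebol", max_age_seconds=300):
    """Return PIDs of processes called `name` that are older than the limit.

    `name` and `max_age_seconds` are illustrative defaults, not values
    taken from the thread.
    """
    stale = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3 or not fields[0].isdigit():
            continue  # skips the header row and blank lines
        pid, etime, command = fields
        if command.strip() == name and etime_to_seconds(etime) > max_age_seconds:
            stale.append(int(pid))
    return stale


# Made-up sample listing, used here instead of running ps.
SAMPLE = """\
  PID     ELAPSED COMMAND
  101       00:04 rebol
  102       12:30 rebol
  103    01:02:03 httpd
  104  2-03:04:05 rebol
"""

print(find_stale(SAMPLE))
```

The sketch only reports candidate PIDs. Actually terminating them (for example with os.kill from a nightly cron job) is left out on purpose, since the right signal and threshold depend on the server.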
Bohdan "Bo" Lechnowsky
Lechnowsky Technical Consulting
At 03:02 PM 6/24/04 -0400, you wrote:
[5/7] from: hallvard:ystad:oops-as:no at: 3-Jul-2004 21:56
I too had to write a script to (re-)kill dead rebol processes. I had it run nightly as
a cron job. But then with /core 2.5.6, these zombie processes disappeared. This is on
a mac osx 10.2.
Regards,
HY
Dixit Bohdan or Rosemary Lechnowsky (07.43 03.07.2004):
[6/7] from: gchiu:compkarori at: 4-Jul-2004 16:17
Hallvard Ystad wrote.. apparently on 3-Jul-2004/21:56:04+2:00
>I too had to write a script to (re-)kill dead rebol processes. I had it run nightly
>as a cron job. But then with /core 2.5.6, these zombie processes disappeared. This is
>on a mac osx 10.2.
Can you post this script?
--
Graham Chiu
http://www.compkarori.com/cerebrus
http://www.compkarori.com/rebolml
[7/7] from: andreas:bolka:gmx at: 4-Jul-2004 17:18
Thursday, June 24, 2004, 9:02:28 PM, SunandaDH wrote:
> REBOL.org just got taken down by the ISP because there were 100s of
> unterminated CGI processes running.
> Versions:
> OS: FreeBSD
> Server: Apache 1.3
> REBOL: Core 2.5.6.4.1.
we had similar problems with vanilla and older REBOL/Core versions. at
least on my machine [1], the zombies went away with REBOL/Core 2.5.6.
i _think_ i remember that the problem was related to dns:// timeouts,
but i'm not sure.
[1] Linux 2.4.20, Apache 1.3.26, REBOL/Core 2.5.6.4.2
--
Best regards,
Andreas