[CGI] web server issues
My uplink speed kinda (no, it pretty much completely) sucks, but I offer free hosting to any REBOLer who wants it at peoplecards.ca. I just ask for patience if a new service needs to be installed while I work out the kinks, and the user needs to know that it's home-based with a not-so-speedy delivery pipe. I offer little in the way of frills, meaning it's sftp or ssh CLI, not cPanel or another GUI.
We fly the flag of peace and truth .. hummed to the tune of "God bless America"
Don't forget the great big smiley...
Is that the tune that sounds like "God Save The King"? If there's a sugar maple blight, 'The Maple Leaf Forever' will sound lame ... and they'll never see the Eastern Townships annexed by Vermont. 'CGI' does stand for 'Chat Gateway Interfarce', doesn't it?
Anybody noticed CGI is back as a programming model?
Let me explain.... (the PITL in user.r reminded me to post this)....
First - do not virtualize OSes
1) Think multicore. 2) Think memory is cheap (2 GB per core). 3) Typically, /Core consumes 8 MB of memory. 4) Don't encap; use a module management system like my 'require or Ladislav's 'include. 5) Re 3 and 4: the OS starts using its disk cache etc., so after a few hits these operations will be cheap. 6) Do all session management etc. in a database => scales up as well; no state, share nothing.
Now, what happens? The OS will start distributing the CGI processes over the multiple cores, using the disk cache etc. to speed loading times, with enough memory per core on the processor. An 8 GB RAM quad-core should be able to run roughly 1000 procs/sec (rough estimate). That's just one box, and with that load it should be profitable. And as you obey rule 6, you can scale up and load balance pretty easily.
Winding down: Apache CGI with RSP is pretty good these days. If you combine module management, logging, error handling, session management, a database protocol (mysql://), and CGI params handling, you can "just work".
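A minimal sketch of a CGI script along those lines. DECODE-CGI and system/options/cgi are standard /Core; %common.r and the require-style loading are placeholders for whatever module system you use (rule 4: no encap, let the disk cache make repeated loads cheap):

```rebol
#!/usr/local/bin/rebol -cs
REBOL [Title: "Minimal CGI sketch"]

;; rule 4: no encap -- load shared code from plain source files;
;; %common.r is a placeholder for your own require/include system
do %common.r

;; decode the query string into an object (DECODE-CGI is built in)
args: construct decode-cgi any [system/options/cgi/query-string ""]

;; rule 6: keep all session state in a database (e.g. via mysql://),
;; so every process is stateless and the box scales by adding procs

print "Content-Type: text/html^/"
print rejoin ["Hello, " any [attempt [args/name] "world"]]
```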
Not virtualising the OS these days is imo a mistake, no? :-)
depends on what you gain
Are you sure OS distributes CGI processes to different Cores? Is e.g. Apache working that way?
If they are separate processes, the OS should balance over cores.
Gregg: really? I thought the reason R3 will use threads for tasking instead of separate processes is that the OS can better balance threads? Anyway, those questions are for gurus, I can only wonder :-)
Threads are much lighter, but not as separate. I don't know details though. On a dual core with hyper-threading on, spawning multiple processes, I can see the load is spread.
petr, the processes are managed by the OS too. *obviously* the OS will distribute processes among processors (unless the OS has no multiprocessor support, that is). Distributing threads is more difficult (because of the shared memory); however, all good threading implementations should do it, and if you program the threads correctly you can get the performance boost.
Can anybody give me an example setup + explanation of lighttpd with FastCGI, also on the REBOL side? I know Francois mentioned it was easy, but I don't get how you can do adaptive spawning on the same listening port (e.g., 10 REBOL FastCGI processes listening on port 1026 or so).
I used FastCGI in the past, with Apache, under Linux. All modes worked fine IIRC. However, under Windows the implementation was crippled, and only external mode worked.
Also, there was a problem that the same client could be served by a different process, so the FastCGI guys implemented some kind of "affinity patch", a kind of proxy, which then always connected the same client to the same process.
I don't think those 10 processes would listen on port 1026? In fact I don't know, how it is being done.
Hope you read http://www.rebol.com/docs/fastcgi.html
That REBOL doc should really answer your question. Simply put, in External mode you request something like /path/to/script (which does not need to exist) and direct it to a certain, already running REBOL process. But REBOL has no tasking, so you have to handle accepting connections and multiplexing yourself. It is like with Ruby: until you are finished, you are not available to other requests ...
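To illustrate that last point, here is a rough sketch of a lone REBOL process serving a socket (plain TCP shown for simplicity, not the actual FastCGI wire protocol; HANDLE is a hypothetical page-building function). While it is busy, it simply cannot accept anyone else:

```rebol
listen: open tcp://:1026        ;; the port the web server forwards to

forever [
    conn: first listen          ;; blocks until a connection arrives
    request: copy conn          ;; read the request (still blocking)
    insert conn handle request  ;; HANDLE builds the response (placeholder)
    close conn                  ;; only now is the process free again
]
```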
Read that, but that is why the adaptive scaling with lighttpd is interesting, if you set the number of requests per fcgi process to 1. Then the daemon scales for you.
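A hedged sketch of what that adaptive spawning looks like in lighttpd's mod_fastcgi (option names per the lighttpd docs; the launcher path, socket, and URL prefix are placeholders) — lighttpd spawns between min-procs and max-procs backends itself and balances requests over them:

```
fastcgi.server = ( "/app" =>
  (( "bin-path"     => "/usr/local/bin/rebol -cs /var/www/fcgi.r",
     "socket"       => "/tmp/rebol-fcgi.sock",
     "min-procs"    => 1,
     "max-procs"    => 10,
     "idle-timeout" => 20
  ))
)
```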
Maarten, I agree with your observation and you can even scale it more. If you see a web-server as just a request dispatcher to CGIs and a fast-answering-machine for user-feedback (pages, forms etc.) you just need a small and "simple" one like Cheyenne. The CGIs can be distributed to different cores (through the OS) or even to different machines (via TCP/IP).
As dispatching requests is most likely much faster than processing a request, a single web-server should serve a lot of users and a bunch of machines do the processing. This is the coarse grained multi-process approach.
With Cheyenne you can already have the main httpd process on one machine and task-handlers (RSP or whatever) on other machines 8)
I am close to autogenerating fastcgi processes, linked with Lighttpd configs and generating automagical includes that match the web server config for encap/Pro.
I am also adding a routing scheme (dialect), so basically you redirect all traffic except static stuff to a FastCGI process; it comes into the router, which then checks the extension or path (REST) to decide what to do.
The generating part will ask you a few questions and then generate matching config files and binaries that you can copy to an empty Linux box with only lighttpd (or nginx) installed.
Session management is database-backed, so scaling up is hiring VPSes and putting Pound in between.
With the average memory use of REBOL < 10 MB, you can compute how many users I can concurrently serve for complex operations (100-200 minimum), so every machine I hire can host 500 customers. That means that I should earn €0.50 per customer to get a decent margin (roughly).
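The back-of-envelope behind those numbers (all figures are the rough estimates from above, the VPS size is an assumption):

```rebol
proc-mem: 10                ;; MB per REBOL process, upper bound
box-ram: 2048               ;; MB on a small VPS (assumed)
print box-ram / proc-mem    ;; ~200 concurrent worker processes

customers: 500              ;; customers hosted per box (from above)
price: 0.50                 ;; EUR per customer
print customers * price     ;; EUR 250 gross per box
```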
earn = ask
I finally learned the trick: 1) create a good interface (but we knew that) 2) use back-end technology that lowers the cost per user (i.e. enlarges the # of users per box)
Maarten, this sounds very cool. So the goal is to have a scalable web-service framework based on FastCGI and simple tools?
That's how I should have described it in one sentence.
But yes: load-balancer -> webserver(*) -> FastCGI(*) -> MySQL
FastCGI is a REBOL process with core enhancements, session mgt, RSP etc. I am also integrating autodoc from Gabriele so the files will be more "literate", and I have a module management system in place that handles everything from interactive use to encap.
I am using assembla.com for SVN and Trac; the actual application I am building is for personal life management.
As a REBOL process is only 10 MB.... I can serve lots of users on cheap VPSes, load balance them, back up data in S3. No others invited until I get things stable enough. Need to get things going.
Sounds very cool. Go for it!!
I don't quite understand it. I can understand the webserver, FastCGI, MySQL part, but what is that load-balancer part? Client side?
or some special server part?
No, before the webserver, so you scale transparently to multiple webservers (in my scenario each webserver effectively is the load balancer for X FastCGI REBOL processes; it's how nginx and lighty work).
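For example, an nginx front end can fan one listen port out over several REBOL FastCGI backends (ports and paths below are placeholders); lighttpd's mod_fastcgi does the equivalent with multiple host/port entries:

```
# sketch: nginx as the "load balancer" for several REBOL FastCGI workers
upstream rebol_workers {
    server 127.0.0.1:1026;
    server 127.0.0.1:1027;
    server 127.0.0.1:1028;
}
server {
    listen 80;
    location /static/ { root /var/www; }   # static files served directly
    location / {
        include fastcgi_params;
        fastcgi_pass rebol_workers;        # round-robins over the workers
    }
}
```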
What are the best ways to protect source code from view in a cgi script? When a script is made world-viewable, isn't that compromising the source pretty badly?
You want to set the file permissions for each script globally executable, but not globally readable. Also protect the cgi-bin folder from being read by the whole world. If you are using Apache, look at the IndexIgnore directive: http://httpd.apache.org/docs/1.3/mod/mod_autoindex.html#indexignore
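A small demo of that permission scheme in a scratch directory (GNU stat assumed; in real life the paths are your cgi-bin, and note that without suexec or similar the web server user still needs to be able to read an interpreted script, so the owner matters):

```shell
dir=$(mktemp -d)                # scratch dir standing in for the docroot
mkdir "$dir/cgi-bin"
touch "$dir/cgi-bin/app.r"
chmod 711 "$dir/cgi-bin"        # enterable, but not listable by others
chmod 700 "$dir/cgi-bin/app.r"  # readable/executable by owner only
stat -c '%a' "$dir/cgi-bin" "$dir/cgi-bin/app.r"
```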
One of my clients updates his site via some tool, which always seems to add some space between the lines. After some time, the page is something like 13K rows instead of 400; the size goes from circa 25 KB to 100 KB. So I wrote a CGI script which reads index.html and removes blank lines. Everything is OK when I run the script from the console. But when I run it via a browser as a CGI script call, it can't write the file. Dunno why - the CGI script is being run using the -cs switch, I even put secure none in there, and the CGI script has the same owner and group set as index.html, but I can't write it ....
Maybe I should try the trick to connect to the FTP account and upload it there, instead of rewriting? :-)
Check the permissions of the file to be written. Are you also sure the CGI script is being executed? Check its permissions too.
I am sure, as I get cgi script output to the browser window. It just fails on the last line - write/lines %index.htm data
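Two things worth checking when WRITE fails only under CGI (a sketch, with a placeholder path): the working directory of a CGI process is often not the script's directory, so a relative %index.htm may point somewhere unwritable; and since WRITE replaces the file, the web-server user needs write permission on the *directory*, not just the file:

```rebol
;; use an absolute path -- under CGI the current dir may be elsewhere
file: %/home/site/www/index.htm   ;; placeholder path

data: read/lines file
remove-each line data [empty? trim copy line]  ;; drop blank lines
write/lines file data   ;; needs write access on the directory too
```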