Need some URL help

 [1/14] from: hallvard::ystad::gmail::com at: 23-Jan-2006 13:27


Hi list,

I have a problem with downloading this URL:
http://www.linuxtelephony.org/article.cgi?i=400&r=0

It seems the linuxtelephony.org returns something in the HTTP headers that I cannot successfully read with rebol. Here's my code (copy&paste into your rebol shell, but watch out for line breaks):

REBOL []
; This is the URL: http://www.linuxtelephony.org/article.cgi?i=400&r=0
insert-this: "GET /article.cgi?i=400&r=0 HTTP/1.0^/Host: www.linuxtelephony.org^/"
port: open/lines [
    scheme: 'tcp
    host: "www.linuxtelephony.org"
    port-id: 80
]
print "[We send:]"
print form insert-this
insert port insert-this
header: make block! 10
while [ not empty? reply: first port ] [
    if none? reply [print "Break!" break]
    parse reply [
        [copy name [thru #":"] (name: load name) | copy name [to end] (print form name)]
        [copy value to newline | copy value to end] (if value [append header reduce [name value]])
    ]
]
either [ not empty? content: first port ] [content: copy port] [print EMPTY! ]
close port
print reduce [header content]
halt
===== End code

From other servers, this code is OK, but this linuxtelephony.org URL always crashes on me. Does anyone have a clue?

Thanks,
HY

PS. The server seems to run Apache with PHP4:

    Server: Apache/1.3.33 (Unix) PHP/4.3.11-dev
    X-Accelerated-By: PHPA/1.3.3r2
    X-Powered-By: PHP/4.3.11-dev

Could this be a PHP bug?

 [2/14] from: SunandaDH:aol at: 23-Jan-2006 7:51


Hallvard:
> I have a problem with downloading this URL:
> http://www.linuxtelephony.org/article.cgi?i=400&r=0
Your code works fine for me -- I can read the page with it.
>> do read clipboard://
[We send:]
GET /article.cgi?i=400&r=0 HTTP/1.0
Host:www.linuxtelephony.org
HTTP/1.1 200 OK
Mon, 23 Jan 2006 12:48:45 GMT Apache/1.3.33 (Unix) PHP/4.3.11-dev PHP/4.3.11-dev PHPA/1.3.3r2 close text/html <!doctype html public "-//w3c//dtd html 4.0 transitional//en"> <head> <title>Linuxtelephony.org - Linuxtelephony.org</title> <meta http-equiv="Content-Type" content="text/...........

The headers look okay to me:

    HTTP/1.1 200 OK
    Date: Mon, 23 Jan 2006 12:45:50 GMT
    Server: Apache/1.3.33 (Unix) PHP/4.3.11-dev
    X-Powered-By: PHP/4.3.11-dev
    X-Accelerated-By: PHPA/1.3.3r2
    Connection: close
    Content-Type: tex

Could it be a firewall at your end?

Sunanda.

 [3/14] from: hallvard:ystad:gm:ail at: 23-Jan-2006 14:47


Argh! I just hate it when errors aren't repeatable.

But the behaviour is consistent on my side: I get a 302 redirect to some URL with a session ID in it, which will (probably set a cookie and then) redirect me back. This works with a browser, but not with rebol... Darn.

Actually, I know that rebol will halt in the line

    while [ not empty? reply: first port ] [

and claim there is an out-of-index or past-end error. When I probe 'port, I cannot see anything irregular at that point.

Thanks for trying anyway.

HY
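
A minimal sketch of how the target of a 302 redirect like the one described above could be pulled out of a raw header string, so that the redirect can be followed by hand. The helper name find-location and the sample response are hypothetical and not from the thread; the thread never shows the actual Location header the server sent. The sketch also assumes the whole header block is already held in one string.

```rebol
REBOL []

; Hypothetical helper (not from the thread): return the value of the
; Location header in a raw header string, or none if it is absent.
; Assumption: the header name appears as "Location:" (parse matches it
; case-insensitively); a header such as Content-Location would also match.
find-location: func [headers [string!] /local target] [
    parse/all headers [
        thru "Location:" any " "
        ; Stop at CR, at a bare LF, or at the end of the string.
        copy target [to "^M" | to "^/" | to end]
        to end
    ]
    target
]

; Made-up 302 response; the redirect target is a placeholder URL.
sample: {HTTP/1.1 302 Moved Temporarily^M^/Location: http://example.com/next^M^/Connection: close^M^/^M^/}
probe find-location sample
```

Accepting either CR or a bare LF as the end of the value is a deliberate choice here, since the server's line endings are themselves in question in this thread.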

 [4/14] from: SunandaDH::aol::com at: 24-Jan-2006 6:16


Hallvard:
> But the behaviour is consistent on my side: I get a 302 redirect to some URL
> with a session ID in it, which will (probably set a cookie and then)
> redirect me back. This works with a browser, but not with rebol...
Weird.....I got a 302 redirect one time I tried, but not any of the other times.

It's possible the site is sniffing user agents, and redirecting ones it doesn't like to a different (simpler?) version of the site. So you could try playing round with:

    system/schemes/http/user-agent: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"

to see if different user agent strings make a difference.

Another possibility to explain the randomness is that they are doing some sort of load balancing/redirecting based on IP address. If so, you could try coming at them via an anonymizing website.

Either way, good luck!

Sunanda.

 [5/14] from: hallvard:ystad:gma:il at: 24-Jan-2006 14:42


Thanks for the tip, but I have now tried with different headers (I didn't even include the user-agent header on my first attempts). I have experimented with both HTTP/1.1 and HTTP/1.0, kept-alive and closed connections, with and without specifying the host, with different user-agents and even from different IPs... And nothing seems to make any difference. I now get the 302 error on all attempts. The rebol console looks like this:

[We send:]
GET /article.cgi?i=400&r=0 HTTP/1.1
Host: www.linuxtelephony.org
connection: close
user-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)

HTTP/1.1 302 Moved Temporarily
** Script Error: Out of range or past end
** Near: not empty? reply: first port
>>
When I probe the port, I see this:

    state: make object! [
        flags: 791107
        misc: [144 [] 0]
        tail: 0
        num: 1
        with: "^M^/"
        custom: none
        index: 0
        func: 3
        fpos: 0
        inBuffer: "^/"
        ...

Could it be that the server says its newlines are "^M^/", but the last newline it actually sends is only "^/"? I put everything in a file, http://babelserver.org/strange.txt, if anyone would like to have a look.

HY
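
One way to check the line-terminator question raised above is to count CRLF and bare LF terminators in a raw copy of the response. The following is only a sketch: the helper name count-terminators and the sample string are hypothetical, and it assumes the raw bytes have already been captured into a string by some other means (for example from the saved file), which the thread does not show.

```rebol
REBOL []

; Hypothetical helper (not from the thread): count CRLF pairs and bare LF
; characters in a raw response string.
count-terminators: func [raw [string!] /local crlf-count lf-count] [
    crlf-count: 0
    lf-count: 0
    parse/all raw [
        any [
            ; Try the two-character CRLF first so its LF is not counted twice.
            "^M^/" (crlf-count: crlf-count + 1)
            | #"^/" (lf-count: lf-count + 1)
            | skip
        ]
    ]
    reduce ['crlf crlf-count 'lf lf-count]
]

; Made-up sample: two CRLF-terminated lines followed by a bare LF.
sample: "HTTP/1.1 302 Moved Temporarily^M^/Connection: close^M^/^/"
probe count-terminators sample
```

A non-zero bare-LF count alongside CRLF-terminated lines would be consistent with the mixed-terminator hypothesis, though it would not by itself show that this is what upsets the port.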

 [6/14] from: hallvard::ystad::gmail::com at: 26-Jan-2006 13:30


OK, I'm a little bit closer to the problem now. I tried to download the same URL using 'read, and rebol hung. With trace on, I got an endless loop in the parsing of the HTTP headers, so there definitely is something spooky going on there. I put the lot (trace produces kilometres of output) on this URL:
http://www.babelserver.org/strange.html

I believe this is a bug. I'd like to look closer into it myself, but since 'read is native, I cannot peek into it, can I? And from the 'port object (open/lines), I don't know how to get more information than what I have on the above-mentioned URL.

What should I do? Should I post it directly to Rambo?

HY

 [7/14] from: compkarori:gm:ail at: 27-Jan-2006 20:37


read works for me on this page:
http://www.linuxtelephony.org/article.cgi?i=400&r=0

On 1/24/06, Hallvard Ystad <hallvard.ystad-gmail.com> wrote:
> Hi list,
>
> I have a problem with downloading this URL:
> http://www.linuxtelephony.org/article.cgi?i=400&r=0
>
-- Graham Chiu http://www.compkarori.com/emr/

 [8/14] from: hallvard:ystad::gmail at: 27-Jan-2006 21:40


Sure. The server behaves as it pleases, it seems. The error appears when you get a 302 redirect HTTP response header. Which one occasionally does; Sunanda seems to have gotten it on his third attempt. I get it *most* of the time, and when I get a 200 OK header, everything works just fine for me too.

The server that responds to http://www.viahardware.com/ exhibits the same behaviour. (Exhibits? Is that an English way of speaking? Or does it smell like new-mowed grass?)

HY

 [9/14] from: compkarori::gmail::com at: 28-Jan-2006 0:52


If they redirect only sometimes, it's probably some load balancing they're doing. On 1/28/06, Hallvard Ystad <hallvard.ystad-gmail.com> wrote:
> Sure. The server behaves as it pleases, it seems. The error appears when
> you
<<quoted lines omitted: 35>>
-- Graham Chiu http://www.compkarori.com/emr/

 [10/14] from: hallvard:ystad:gmai:l at: 27-Jan-2006 21:58


Still, the problem isn't the forwarding, but that rebol isn't able to read the response. It hangs, or actually, as the trace log shows, goes into an infinite loop. With read/lines on an open 'port, it crashes.

HY

 [11/14] from: anton::wilddsl::net::au at: 29-Jan-2006 1:38


I got it straight away:
>> trace/net on forever [print "trying..." read http://www.linuxtelephony.org/article.cgi?i=400&r=0]
trying...
URL Parse: none none www.linuxtelephony.org none none article.cgi?i=400&r=0
Net-log: ["Opening" "tcp" "for" "HTTP"]
connecting to: www.linuxtelephony.org
Net-log: {GET /article.cgi?i=400&r=0 HTTP/1.0
Accept: */*
Connection: close
User-Agent: REBOL View 1.3.2.3.1
Host: www.linuxtelephony.org

}
Net-log: "HTTP/1.1 302 Moved Temporarily"

Anton.
> Sure. The server behaves as it pleases, it seems. The error
> appears when you
<<quoted lines omitted: 7>>
> like new-mowed grass?)
> HY
That use of "exhibits" is fine, Hallvard. :)

 [12/14] from: anton::wilddsl::net::au at: 29-Jan-2006 10:54


Hi Hallvard,

I forgot to mention that after the final Net-log line below, rebol hung for a long time, until this was printed:

** Script Error: Not enough memory
** Where: append
** Near: insert tail series :value
>>
Anton.

 [13/14] from: hallvard::ystad::gmail::com at: 29-Jan-2006 20:41


On 29/01/06, Anton Rolls <anton-wilddsl.net.au> wrote:
> Hi Hallvard,
> I forgot to mention, that after the final Net-log line below,
<<quoted lines omitted: 3>>
> ** Near: insert tail series :value
> >>
I believe you must have gone through the same endless loop as I, except I didn't have the patience to wait for the memory error. I'll submit to rambo.

HY

 [14/14] from: hallvard::ystad::gmail::com at: 30-Jan-2006 20:50


On 29/01/06, Hallvard Ystad <hallvard.ystad-gmail.com> wrote:
> I'll submit to rambo.
>
Done.

HY

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted