Mailing List Archive: 49091 messages

Help with HTTP protocol

 [1/16] from: carlos::lorenz::bol::com::br at: 4-Mar-2004 14:06


Hi list,

I used to parse some info from the Weather Channel home site with a small REBOL script. Last week the script just stopped working, and while investigating why, I discovered that the following command

read http://br.weather.com/weather/tenday/BRXX2888

no longer works as expected. It now seems my script is being redirected to another page by the guys at Weather.com.

My question is: is there a way to ask REBOL to present itself as a web browser to a certain host? I ask because the URL above (http://br.weather.com/weather/tenday/BRXX2888) still works in a browser.

TIA
Carlos Lorenz

 [2/16] from: greg:brondo:algx at: 4-Mar-2004 11:10


I'm doing the same thing but getting my info from weather.yahoo.com. I can send you the code if you like. Also, if you want to keep using your code, grab curl and run it like this:

curl -D head http://br.weather.com/weather/tenday/BRXX2888

You can then view the 'head' file to see what is in the HTTP headers. At least that's how I troubleshoot these things...

Greg B.

On Thursday 04 March 2004 11:06 am, Carlos Lorenz wrote:

 [3/16] from: carlos:lorenz:bol at: 4-Mar-2004 14:27


Hi Greg,

I would appreciate studying your code, if you don't mind. I have one that scans Yahoo Weather, not completely tested and written in PHP; I can send it to you if you want.

Sorry for my ignorance, but what's curl anyway?

Carlos

On Thursday, 04 March 2004 14:10, you wrote:

 [4/16] from: joel:neely:fedex at: 4-Mar-2004 11:36


Hi, Carlos,

Carlos Lorenz wrote:
> "read http://br.weather.com/weather/tenday/BRXX2888" does not work anymore
> as expected.

What does it do? At first glance (using NS7.1) it appears that the returned page is making heavy use of JavaScript to manage content. If that's correct (whether through stupidity or hostility on the part of weather.com ;-) you may have a much harder time getting the content you want without a browser that implements JS (assuming that you don't want to write your own JS interpreter... ;-)

-jn-

-- 
Joel Neely com dot fedex at neely dot joel

I had proved the hypothesis with a lovely Gedankenexperiment, but my brain was too small to contain it. -- Language Hat

 [5/16] from: greg:brondo:algx at: 4-Mar-2004 13:37


Now this is getting weird, as I implemented my first agent for this in PHP as well (cue Twilight Zone music here)....

Curl is a command-line tool and library for getting data from network systems (HTTP, TELNET, FTP, GOPHER, etc.): http://curl.haxx.se/

It's quite handy to have for quick scripts, although Rebol makes it somewhat unneeded for me now ;-)

Anyway, here's the code (just a test case):

    REBOL []

    data: read http://weather.yahoo.com/search/weather2?p=75056
    results: []
    parse/all data [
        thru {TEXT FORECAST-->}
        any [
            thru <b> copy field to </b> (append results field append results newline)
            thru </b> copy field to <p> (append results field append results newline)
        ]
        to {<!--ENDTEXT}
    ]
    print results

Hope that helps!

On Thursday 04 March 2004 11:27 am, Carlos Lorenz wrote:

 [6/16] from: SunandaDH:aol at: 4-Mar-2004 14:57


Carlos:
> Now It seems my script is beeing redirected to another page by the guys at
> Weather.Com.
>
> My question is: is there a way to ask REBOL to present himself as a
> webbrowser to a certain host? I ask you this because the url above
> (http://br.weather.com/weather/tenday/BRXX2888) still works with the
> browser

They may be sniffing you as being non-IE and deliberately victimising you.

Try this:

system/schemes/http/user-agent: "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"

Sunanda.
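
For comparison outside REBOL, the same user-agent idea can be sketched in Python with only the standard library. This is a minimal illustration, not the approach used in the thread: the helper names `build_request` and `fetch` are hypothetical, and sending a browser-like User-Agent does not guarantee that a site will serve the same page it serves to a real browser.

```python
import urllib.request

# The browser identification string suggested in the message above.
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA, timeout=10):
    """Fetch url with the custom User-Agent and return the raw body bytes."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.read()
```

Calling `fetch("http://br.weather.com/weather/tenday/BRXX2888")` would send the request with that header; whether the server then behaves differently depends entirely on the server.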

 [7/16] from: carlos:lorenz:bol at: 5-Mar-2004 10:06


Hi Sunanda,

Thanks for your help, but it does not work for me. Is there anything else that could be done to let the host "think" I am using a browser and not REBOL? The thing is that with a browser I can read the URL with no problem at all. It's the first time that REBOL cannot read a URL for me :(

Carlos

On Thursday, 04 March 2004 16:57, you wrote:

 [8/16] from: carlos:lorenz:bol at: 5-Mar-2004 10:13


Hi Joel,
> What does it do? At first glance (using NS7.1) that the
> returned page is making heavy use of JavaScript to manage
> content. If that's correct (whether through stupidity or

My REBOL script should be able to read the entire page and parse it, so that I have the temperatures by day in a block that I can use to write my own HTML page without so many features. As Sunanda said, maybe the guys at Weather.com are aware that I am not using a browser to read their page and redirect me somewhere else :(

Carlos

 [9/16] from: bry:itnisk at: 5-Mar-2004 15:23


Just looking at it quickly, I think the redirection you're going through is because the page is trying to set a cookie. I don't know if anyone has done anything to get REBOL to read and accept cookies; probably not. If I remember correctly cURL does this, and someone suggested you try cURL recently. Then again, as was also pointed out, the site uses a lot of JavaScript, a lot, and it ain't pretty either. I don't think you're gonna get this site with REBOL because, as was pointed out, REBOL doesn't have a JavaScript interpreter.
> Hi Sunanda,
>
> Thanks for you help but it does not work for me.
> Is there anything else that could be done in order
> to let the host "think" I am using a browser and not REBOL?
> The thing is that with a browser I can read the URL with no problem at all.
> It's the first time that REBOL cannot read an URL for me :(
>
> CArlos
>
> On Thursday, 04 March 2004 16:57, you wrote:
> > Carlos:
> > > Now It seems my script is beeing redirected to another page by the guys
> > > at Weather.Com.
> > >
> > > My question is: is there a way to ask REBOL to present himself as a
> > > webbrowser
> > > to a certain host? I ask you this because the url above
> > > (http://br.weather.com/weather/tenday/BRXX2888) still works with the
> > > browser
> >
> > They may be sniffing you as being non-IE and deliberately victimising you.
> >
> > Try this:
> > system/schemes/http/user-agent: "Mozilla/4.0 (compatible; MSIE 5.5; Windows

 [10/16] from: bry:itnisk at: 5-Mar-2004 15:35


I think if what you want is to screen-scrape weather info, you would probably do better by going to NOAA, for example http://weather.noaa.gov/weather/current/SBCF.html for Belo Horizonte.

From looking at the site you were scraping before, I think they buy their data from some place like this: http://www.views.com.au/weathersystem/index.cfm which is no doubt getting the data free from some government service somewhere.

 [11/16] from: carlos:lorenz:bol at: 5-Mar-2004 11:46


Thanks for your help

Carlos

On Friday, 05 March 2004 15:35, you wrote:

 [12/16] from: brunobord:tele2 at: 5-Mar-2004 16:04


[bry--itnisk--com] wrote:
> I think if what you want is to screen-scrape
> weather info you would probably do better by
> going to NOAA, for example
> http://weather.noaa.gov/weather/current/SBCF.html
> for belo horizonte.

Hey! That's "human readable" METAR data!

Once upon a time, I made a little REBOL app to parse METAR and TAF data and transform them into readable text. It's still available (if the free.fr web servers can be accessed ;o) at:
http://brunobord.free.fr/ressources/fairebo.zip
User manual, in French:
http://brunobord.free.fr/fairebo_u.php
or in English:
http://brunobord.free.fr/fairebo_u.php?lang=en

Since I built the 1.1.1 version, I have left the program asleep (t'was more than a year ago!). But inside the code, you'll find all the direct HTTP addresses to access the METAR and TAF files for your nearest airport.

Best regards,
-- 
No', piéton

 [13/16] from: warp:reboot:ch at: 5-Mar-2004 16:06


Hi Carlos,

this one works here (Rio Branco):

print read http://www.w3.weather.com/outlook/travel/local/BRXX0199

Will;)

 [14/16] from: carlos:lorenz:bol at: 5-Mar-2004 13:41


Hi Will,

This is it! You have helped me understand the problem. Actually they are redirecting me to the following URL

http://br.w3.weather.com/weather/tendayBRXX0199

whenever I try to reach

http://br.weather.com/weather/tendayBRXX0199

as I used to do before. I had not noticed that REBOL was not able to follow the redirect to the new page until your message showed me this new URL.

Thanks :))

Carlos

On Friday, 05 March 2004 12:06, you wrote:
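
A redirect like the one described above can be made visible by asking for the page without following it. The sketch below does this in Python rather than REBOL, using only the standard library; `NoRedirect` and `probe_redirect` are hypothetical names chosen for illustration, and the code only reports what the server answers, it does not reproduce what REBOL did in 2004.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any 3xx response."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to build a follow-up request,
        # so the 3xx response surfaces as an HTTPError instead.
        return None


def probe_redirect(url, timeout=10):
    """Return (status, location) for url without following redirects.

    location is the Location header of an error/redirect response,
    or None when the request succeeded or no such header was sent.
    """
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")
```

If `probe_redirect` returns a 3xx status together with a new address, reading that new address directly is the equivalent of the fix found in this thread.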

 [15/16] from: bry:itnisk at: 5-Mar-2004 21:37


> Hey ! That's "human readable" METAR data !
> Once upon a time, I made a little REBOL app to parse METAR and TAF data
> and transform them into readable text.
> It's still available at : (if the free.fr web servers can be accessed ;o)
> http://brunobord.free.fr/ressources/fairebo.zip
> User manual, in french :
> http://brunobord.free.fr/fairebo_u.php
> or in english :
> http://brunobord.free.fr/fairebo_u.php?lang=en

Hey, thanks. I was actually thinking about building a METAR parser; now it looks like I won't need to.

 [16/16] from: antonr:iinet:au at: 7-Mar-2004 22:03


Actually, REBOL can handle cookies; it's just not built in. Go to http://www.rebol.org and search for "cookie"; you will find several examples.

Anton.
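
The cookie handling discussed in this thread (a page that sets a cookie and then redirects) can also be sketched in Python, for readers who want to see the mechanism outside REBOL. This is an assumption-laden illustration using only the standard library: `make_cookie_opener` and `fetch_with_cookies` are hypothetical helper names, and nothing here is taken from the rebol.org scripts Anton mentions.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return (opener, jar): an opener that stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_with_cookies(url, timeout=10):
    """Fetch url, accepting cookies and following redirects.

    Returns (body_bytes, cookie_names). Cookies set on a redirect
    response are sent back on the follow-up request automatically.
    """
    opener, jar = make_cookie_opener()
    with opener.open(url, timeout=timeout) as resp:
        body = resp.read()
    return body, [cookie.name for cookie in jar]
```

A client without a cookie jar would keep being redirected by a server that insists on the cookie; with the jar, the second request carries it and the page is returned.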