[REBOL] Re: does modified? GET or HEAD?
From: andreas:bolka:gmx at: 14-Apr-2003 18:42
Monday, April 14, 2003, 1:53:15 PM, Hallvard wrote:
> Does anyone know if 'modified? reads the whole URL with a http GET
> command, or does it simply use a HEAD command?
Short version: it uses HTTP GET. You can see that using REBOL's own
'trace function:
>> trace/net on
>> modified? http://www.rebol.com
URL Parse: none none www.rebol.com none none none
Net-log: ["Opening" "tcp" "for" "HTTP"]
connecting to: www.rebol.com
Net-log: {GET / HTTP/1.0
Accept: */*
Connection: close
User-Agent: REBOL Core 2.5.5.3.1
Host: www.rebol.com
}
Net-log: "HTTP/1.1 200 OK"
== 7-Apr-2003/23:11:33
'modified? actually calls 'info? (try source modified?) which in turn
calls 'query (try source info?).
> Does anyone have some code for getting the date out of a port!
> object?
>> p: open http://rebol.com
>> probe p/locals/headers
make object! [
Date: "Mon, 14 Apr 2003 15:58:05 GMT"
Server: "Apache/1.3.26 (Unix) FrontPage/5.0.2.2623"
Last-Modified: "Mon, 07/Apr/2003/23:11:33/+GMT"
Accept-Ranges: "bytes"
Content-Encoding: none
Content-Type: "text/html"
Content-Length: "11998"
Location: none
Expires: none
Referer: none
Connection: "close"
Authorization: none
ETag: {"1259b8-2ede-3e9205a5"}
content: ""
]
So use 'open instead of 'read and you can access the Last-Modified
date and the ETag, e.g. via p/locals/headers/Last-Modified.
> Say I'd like to fetch a URL only if it's been modified since the
> last time I was there. Is there a way to check without having to
> read the URL twice? E.g. this would read the URL twice, if the site
> has changed:
The best way to accomplish that is probably HTTP's built-in
mechanism, known as "conditional GET". Preferably, conditional GETs
are done using ETags. An example:
; first we need the current ETag (it takes the place of a "lastvisit" date)
p: open http://www.rebol.com/
etag: copy p/locals/headers/ETag
; == {"1259b8-2ede-3e9205a5"}
close p
; then, subsequently, we do conditional GETs:
p: open/custom http://www.rebol.com/ compose/deep [
    header [ If-None-Match: (etag) ]
]
The problem is that REBOL doesn't give you the HTTP response code (you
would usually check for a 304 response code to know whether the
resource has changed or not). A good workaround is to simply check the
ETag value again:
print either etag = p/locals/headers/ETag [ "not changed" ] [ "changed" ]
; not changed
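The check above could be wrapped in a small helper. This is only a
sketch: etag-changed? is a made-up name, not something REBOL ships,
and it assumes the server repeats the ETag header in its reply to the
conditional GET. If the header is missing, the helper reports
"changed".

```rebol
; Hypothetical helper (not part of REBOL): true when the ETag sent back
; by the server differs from the one saved on an earlier visit.
etag-changed?: func [
    url  [url!]    "Resource to check"
    etag [string!] "ETag saved from an earlier request"
    /local p h new changed
][
    p: open/custom url compose/deep [header [If-None-Match: (etag)]]
    h: p/locals/headers
    ; the ETag field may be absent, so test for it before reading it
    new: all [in h 'ETag  get in h 'ETag]
    changed: etag <> new
    close p
    changed
]

; usage, with 'etag saved as in the example above
if etag-changed? http://www.rebol.com/ etag [print "changed"]
```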
A conditional GET can be done using the
Last-Modified/If-Modified-Since headers as well. But as REBOL mangles
the Last-Modified value (see the headers output above), you'd have to
clean the value first.
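A rough sketch of such a cleanup, assuming the mangling always looks
like the single value shown in the headers output above (spaces
turned into slashes, a "+" put in front of the zone name).
clean-http-date is a made-up name, and I have only checked it against
that one sample string:

```rebol
; Hypothetical helper (not part of REBOL): undo the mangling of the
; Last-Modified string, under the assumption stated above.
clean-http-date: func [mangled [string!] /local s] [
    s: replace/all copy mangled "/" " "   ; slashes back to spaces
    replace s "+" ""                      ; drop the "+" before the zone
]

last-mod: "Mon, 07/Apr/2003/23:11:33/+GMT"   ; sample from the headers above
probe clean-http-date last-mod
; prints "Mon, 07 Apr 2003 23:11:33 GMT"

; the cleaned value can then be sent like the ETag was
p: open/custom http://www.rebol.com/ compose/deep [
    header [ If-Modified-Since: (clean-http-date last-mod) ]
]
close p
```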
To see a bit more about what happens at the HTTP level, try the above
code with trace/net turned on, use an HTTP/TCP sniffer, or see the
examples in Simon Fell's "BDG to Etags"
- http://www.pocketsoap.com/weblog/stories/2002/05/0015.html
--
Best regards,
Andreas mailto:[andreas--bolka--gmx--net]