Documentation for: webtitle.r
Usage document for %webtitle.r
1. Introduction to %webtitle.r
webtitle.r is an introduction to the ease and simplicity of accessing the internet and parsing HTML.
2. webtitle At a Glance
No setup is required; just run it.
>> do %webtitle.r
>> do http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=webtitle.r
connecting to: www.rebol.org
Script: "Web Page Title Extractor" (20-May-1999)
connecting to: www.rebol.com
REBOL Technologies
3. Using %webtitle.r
Requires REBOL/Core or REBOL/View console mode.
3.1. Running %webtitle.r
From the library with:
>> do http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=webtitle.r
or locally with:
>> do %webtitle.r
4. See also
5. What you can learn
5.1. Powerful builtin Internet Access
REBOL has fantastically simple builtin procedures for accessing the internet.
http://www.rebol.com is actually a value with a special datatype. In REBOL this is a url!. No quotes are needed: REBOL recognizes the value as a url! from the way it is written, so it can be passed straight to functions such as read.
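As a small illustration of the point above, the following sketch checks the datatype of a URL written with and without quotes. The word site is a hypothetical name chosen for this example; it is not part of webtitle.r.

```rebol
REBOL [Title: "url! datatype (sketch)"]

site: http://www.rebol.com              ; no quotes: REBOL loads this as a url! value
print url? site                         ; true
print url? "http://www.rebol.com"       ; false: with quotes it is a string!
print url? to-url "http://www.rebol.com" ; true: a string can be converted to a url!
```

Nothing here touches the network; the values are only inspected, not read.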
5.3. Web Server defaults
Web servers return a default file when a URL does not name a specific page. A request for http://www.rebol.com is actually answered with http://www.rebol.com/index.html. This is not always the case. Some sites return default.htm, or index.php, or index.cgi. No need to worry: the REBOL read function and the web server work that out between them. If REBOL Technologies ever changes its web server setup, a different file may be returned and this script will still work.
REBOL comes with a very powerful parse function. It can parse strings or blocks. In this example it is used to parse a web page as a string, and it uses another powerful feature of REBOL, the tag! datatype.
One of the many REBOL built-in datatypes is the tag! datatype. A tag is anything enclosed between the "<" less-than and ">" greater-than symbols. HTML is based on tags, as are XML and a few other markup languages such as SGML.
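The difference between a tag! value and a string that merely looks like a tag can be checked directly. This is a minimal sketch for illustration only and is not taken from webtitle.r.

```rebol
REBOL [Title: "tag! datatype (sketch)"]

print tag? <title>      ; true: written without quotes, this is a tag! value
print tag? </title>     ; true: closing tags are tag! values as well
print tag? "<title>"    ; false: inside quotes it is a string!
```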
webtitle uses parse to scan through the web page looking for the <title> tag, stopping just past it. That is what the thru rule does: it scans through the matched string. Next comes a copy rule, which tells parse to copy the data matched by the rule that follows into a variable, in this case title. The rule that follows is to. It is very similar to thru, but it scans up to the string, not through it. So the parser scans past the HTML <title> tag and copies everything from there right up to the closing </title> tag.
Then it prints the title variable.
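The thru, copy and to steps described above can be tried without a network connection by parsing a string held in memory. This is a sketch in REBOL 2 syntax; the contents of page are made up for the example and stand in for what read would return.

```rebol
REBOL [Title: "thru / copy / to on a local string (sketch)"]

; A made-up page, standing in for the result of: read http://www.rebol.com
page: {<html><head><title>REBOL Technologies</title></head><body>Hello</body></html>}

; thru <title>   skips to just past the opening tag
; copy title     stores what the next rule matches in the word title
; to </title>    scans up to, but not past, the closing tag
parse page [thru <title> copy title to </title>]

print title    ; REBOL Technologies
```

parse itself returns false here, because the rule stops before the end of the string. The script ignores that return value and only uses the copied title.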
5.5. Changing the page that is scanned
Changing the website or page that the title is extracted from is as easy as changing the text after the http:// part following the read command.
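One possible way to make that change easier is to keep the address in a word and pass the word to read. In this sketch, target is a hypothetical name and http://www.rebol.org is only an example address; neither is dictated by webtitle.r.

```rebol
REBOL [Title: "webtitle for another page (sketch)"]

target: http://www.rebol.org    ; example address; any http:// URL can go here
title: none                     ; stays none if the page has no <title> tag

parse read target [thru <title> copy title to </title>]

print either title [title] ["No title found"]
```

Setting title to none first is a design choice for this sketch: it avoids printing an unset or stale value when a page lacks a title.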
5.6. Getting REBOL
Both REBOL/Core and REBOL/View are available
6. What can break
You will need access to the internet, and rebol.com will have to be up and running for this script to work. Don't worry, http://www.rebol.com is normally up and running.
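If the failure cases above are a concern, the read can be wrapped in try so that a network problem produces a message, not a script error. This is a hedged sketch in REBOL 2 style and not part of the original script; the message texts are invented for the example.

```rebol
REBOL [Title: "webtitle with an error check (sketch)"]

title: none    ; stays none if no <title> tag is found

either error? try [page: read http://www.rebol.com] [
    ; read failed: no connection, or the site is not responding
    print "Could not read the page; check the network connection."
] [
    parse page [thru <title> copy title to </title>]
    print either title [title] ["No <title> tag found."]
]
```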