Documentation for: webtitle.r
Created by: btiffin
on: 21-Jun-2007
Format: text/editable
Downloaded on: 30-Apr-2025

[h1 Usage document for %webtitle.r
[contents
[numbering-on
[h2 Introduction to %webtitle.r
[p webtitle.r is an introduction to how easily REBOL can access the internet and parse HTML.
[table/att/border="1px"/att/cellpadding="4px"
[row
[cell **HTML**
[cell Hyper Text Markup Language or Hypertext Markup Language
table]
[h2 webtitle At a Glance
[p No setup is required, just **do** it.
[asis/style/font-size:75%
>> do %webtitle.r
>> do http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=webtitle.r
connecting to: www.rebol.org
Script: "Web Page Title Extractor" (20-May-1999)
connecting to: www.rebol.com
REBOL Technologies
asis]
[h2 Using %webtitle.r
[p Requires REBOL/Core, or REBOL/View in console mode.
[h3 Running %webtitle.r
[p From the library with:
[asis/style/font-size:75%
>> do http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=webtitle.r
asis]
or locally with:
[asis
>> do %webtitle.r
asis]
[h2 See also
[p Other rebol.org scripts also use **read** on an http URL together with **parse**. There is more explanation of **parse** with weblinks.r and websplit.r.
[br
[link http://www.rebol.org/cgi-bin/cgiwrap/rebol/view-script.r?script=weblinks.r "%weblinks.r"
[br
[link http://www.rebol.org/cgi-bin/cgiwrap/rebol/documentation.r?script=websplit.r "websplit usage doc"
[h2 What you can learn
[h3 Powerful built-in Internet Access
[p REBOL has fantastically simple built-in functions for accessing the internet.
[br
**read http://www.rebol.com** fetches the page and returns its HTML as text. How cool is that?
[h3 URLs
[p **http://www.rebol.com** is actually a value with its own datatype. In REBOL this is a **url!**. Very powerful. No quotes needed. REBOL just **knows**.
[h3 Web Server defaults
[p Web servers return a default file when a URL does not name a specific page. A request for http://www.rebol.com actually returns http://www.rebol.com/index.html.
This is not always the case. Some sites return default.htm, index.php, or index.cgi. No need to worry: the REBOL **read** function and the web server work all of that out for you. If REBOL Technologies ever changes its web server setup, a different file may be returned and this script will still work.
[h3 parse
[p REBOL comes with a very powerful **parse** function. It can parse strings or blocks. In this example it parses a web page as a string, using another powerful feature of REBOL, the **tag!** datatype.
[h4 tag!
[p One of the many REBOL built-in datatypes is the **tag!** datatype. A tag is anything enclosed between the ""<"" less-than and "">"" greater-than symbols. HTML is based on tags, as are XML and a few other markup languages such as SGML.
[p **webtitle** uses parse to scan through the web page looking for the <title> tag, stopping just past it. That is what the **thru** rule does: it scans through the matched string. Next comes a **copy** rule, which tells parse to copy whatever the following rule matches into a variable, in this case **title**. The following rule is **to**. It is very similar to **thru**, but it scans up to the string, not through it. So the parser scans past the opening <title> tag, then copies everything after it, right up to the closing </title> tag.
[p Then it prints the **title** variable.
[h3 Changing the page that is scanned
[p Changing the website or page that the title is extracted from is as easy as changing the text after the http:// part of the URL that follows the **read** function.
[h3 Getting REBOL
[p Both REBOL/Core and REBOL/View are available
[br
**free of charge** from
[link http://www.rebol.com "www.rebol.com"
[h2 What can break
[p You will need access to the internet, and rebol.com will have to be up and running, for this script to work. Don't worry, **http://www.rebol.com** is normally up and running.
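[p To try the **parse** rule without depending on a network connection, here is a minimal sketch of the **thru** / **copy** / **to** sequence described above. It is an illustration, not necessarily the exact source of webtitle.r. The sample HTML string and the word **page** are assumptions made up for this example.
[asis
; Hypothetical sample page held in a string, so no internet access is needed.
; To scan a live page instead, use:  page: read http://www.rebol.com
page: {<html><head><title>Sample Page</title></head><body>Hello</body></html>}

; thru <title>  skips forward to just past the opening tag
; copy title    stores whatever the next rule matches in the word title
; to </title>   advances up to, but not past, the closing tag
parse page [thru <title> copy title to </title>]

print title
asis]
[p With the sample string above, **print title** shows the text between the two title tags. Swapping the string for a **read** of any http URL applies the same rule to a live page.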
[h2 Credits
[table/att/border="1px"/att/cellpadding="4px"
[row
[cell %webtitle.r
[cell Author: Unknown
[row
[cell REBOL/Core
[cell Carl Sassenrath, REBOL Technologies
table]
[list
[li The rebol.org Library Team
[li Usage document by Brian Tiffin, Library Team Apprentice, [date
list]