Read Finnish teletext with REBOL.
[1/1] from: jhagman::infa::abo::fi at: 11-Apr-2001 15:45
I have made another fetch'n show program with REBOL. This time it gets an
HTML page from the web and shows it.
The program is quite elementary, but it is a nice example of how
caching affects performance. The program caches every page it has
received and also prefetches the page after the current one.
Hopefully someone gets some fun out of this. It is of course not of
much use if you do not speak (well, at least read) Finnish or
Swedish.
Then some things that popped into my head while programming.
1. The program contains a decode-HTML function with a
corresponding table. Its place is not in the "end programmer's
code"; it should be placed out of sight in a module of some
kind. I know how to split up the code technically,
but I would appreciate a standard way of [a]
making the module so that it does not interfere with the
namespace and [b] placing the modules in the filesystem so that
they are always on REBOL's path. Are there standard practices for
this?
OPINION: REBOL needs modules! (I remember seeing something about
modules by Carl on the ALLY list a long time ago, but have not heard
about it since.) Just look at Python: there are lots of standard
and user-contributed modules. There surely would be with REBOL too if
there were a way of doing it and it were promoted well. (There
might be... I have not looked that much in the docs lately.)
2. I was going to say that dictionaries (associative arrays) would
be nice, but actually I got the cache done really easily with a
block, so I will not complain... Another thing on the datatypes:
there should be more documentation comparing the series
datatypes. Should I use list!, hash!, block! or something else?
One should not have to find out by testing it oneself.
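On point 1, one way to keep the table out of the global namespace is to wrap it in an object. This is only a sketch under that assumption, not an established REBOL convention; the names html-utils, table and decode are made up for the illustration, and the entity list is shortened to a few ASCII cases.

```rebol
REBOL [Title: "Namespace sketch"]

; Hypothetical object: only the word html-utils becomes global.
html-utils: make object! [
    ; Flat list of entity/replacement pairs, reachable only through the object.
    ; "&amp;" comes last so that "&amp;lt;" is not decoded twice.
    table: [
        "&lt;"  "<"
        "&gt;"  ">"
        "&amp;" "&"
    ]

    decode: func [
        "Replaces the entities listed in table, modifying the string in place"
        string [string!]
        /local entity char
    ][
        foreach [entity char] table [
            replace/all string entity char
        ]
        string
    ]
]

print html-utils/decode "a &lt; b &amp;&amp; c &gt; d"
```

If the object were saved in its own file, say a hypothetical %html-utils.r, a script could load it with do %html-utils.r. That still leaves the question of a standard search path open.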
The code follows:
#!/usr/local/bin/rebol -q
TODO:
o Subpages
o Port to /View
o Parse the address of the next page from
  <a href=...>[Seuraava sivu]</a>
o Cursor or vim keys. A better UI, that is.
o Perhaps one should cache just the text pages... (i.e. after dehtml)
o Time-stamp the cache entries
REBOL [
    Name: "Tekstitv"
    Author: "Jussi Hagman"
    Version: 0.4
    History: [
        0.1 2000-7-6 "Initial version"
        0.2 2000-7-8 "Uses now load/markup, small fixes"
        0.3 2001-4-11 "Small fixes"
        0.4 2001-4-11 "Caching, Small fixes"
    ]
]
base: [http://www.yle.fi/cgi-bin/tekstitv/ttv.cgi/ n "/txt/"]
cls: func[][print "^(page)"]
decode-table: [
    ["&Egrave;" "E"] ["&egrave;" "e"] ; !!!! Wrong!
    ["&Auml;" "Ä"] ["&auml;" "ä"]
    ["&Ouml;" "Ö"] ["&ouml;" "ö"]
    ["&Aring;" "Å"] ["&aring;" "å"]
]
decode-HTML: func [
    "Decodes HTML-style chars to their equivalents"
    string [string!]
    /local item x y
][
    ; Each table item is a pair: [entity replacement]
    foreach item decode-table [
        set [x y] item
        replace/case/all string x y
    ]
]
print-page: func [
    "Prints out one teletext page"
    page [block!]
    /local mode_pre item
][
    mode_pre: 0
    cls
    foreach item page [
        either string? item [
            ; When "[Edel" is found there is nothing interesting left on the page
            if find item "[Edel" [return]
            ; Outside <PRE>, skip strings that are only a newline
            if any [(mode_pre > 0) (not item = "^/")] [
                prin decode-HTML item
            ]
        ][
            ; Tags: track <PRE> nesting and turn <BR> into a line break
            switch item [
                <PRE> [mode_pre: mode_pre + 1]
                </PRE> [mode_pre: mode_pre - 1]
                <BR> [print []]
            ]
        ]
    ]
]
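; Flat cache of url/page pairs: [url1 page1 url2 page2 ...]
page-cache: make block! 20
get-page: func [
    "Returns the parsed page, loading it only if it is not cached yet"
    page [url!]
    /local l p
][
    if l: find/skip page-cache page 2 [
        return second l
    ]
    p: load/markup page
    append page-cache page
    append/only page-cache p
    return p
]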
; Main program
; Start from the page number given as argument, or from page 100
if error? try [
    n: to-integer trim system/script/args
][
    n: 100
]
page: get-page rejoin base
forever [
    print-page page
    ; cache the next page
    n: n + 1
    page: get-page rejoin base
    command: ask "page? "
    if command = "q" [quit]
    ; A number jumps to that page; anything else shows the next page
    if not error? try [
        n: to-integer command
    ][
        page: get-page rejoin base
    ]
]
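
On point 2, the block-based cache in get-page could also be tried with a hash!, the series type meant for keyed lookups. The following is a minimal sketch assuming the same flat url/page layout as page-cache; the names cache and cached-load are hypothetical. It does not show whether the difference is measurable for a handful of pages.

```rebol
REBOL [Title: "Cache sketch"]

; Same flat layout as page-cache: [url1 page1 url2 page2 ...]
cache: make hash! 40

cached-load: func [
    "Returns the cached page for url, loading and storing it on a miss"
    url [url!]
    /local data
][
    ; select returns the value stored right after the key, or none
    if data: select cache url [return data]
    data: load/markup url
    append cache url
    append/only cache data
    data
]
```

Swapping make hash! for make block! leaves the rest of the function unchanged, which makes it easy to time both variants on the same set of pages.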
--
Jussi Hagman CS in Åbo Akademi University
Studentbyn 4 D 33 [juhagman--abo--fi]
20540 Åbo [jhagman--infa--abo--fi]
Finland