Read Finnish teletext with REBOL.
[1/1] from: jhagman::infa::abo::fi at: 11-Apr-2001 15:45
I have made another fetch'n'show program with REBOL. This time it gets an HTML page from the web and shows it. The program is quite elementary, but it is a nice example of how caching affects performance: it caches every page it has received and also prefetches the page following the current one.

Hopefully someone gets some fun out of this, although it is of course not of much use if you do not speak (well, at least read) Finnish or Swedish.

Then some things that popped into my head while programming.

1. In the program there is a decode-html function with a corresponding table. Its place is not in "end programmer's code"; it should be placed out of sight in a module of some kind. I know how to split up the code technically, but I would appreciate a standard way of [a] making the module so that it does not interfere with the namespace and [b] placing the modules in the filesystem so that they are always on REBOL's path. Are there standard practices for this?

OPINION: REBOL needs modules! (I remember seeing something about modules by Carl on the ALLY list a long time ago, but have not heard about it since.) Just look at Python: there are lots of standard and user-contributed modules. There surely would be for REBOL too if there were a way of doing it and it were promoted well. (There might be... I have not looked that much in the docs lately.)

2. I was going to say that dictionaries (associative arrays) would be nice, but actually I got the cache done really easily with a block, so I will not complain... Another thing on datatypes: there should be more documentation comparing them. Should I use list!, hash!, block! or something else? One should not have to find out by testing.

The code follows:

#!/usr/local/bin/rebol -q

TODO:
 o Subpages
 o port to /View
 o Parse addr of the next page from <a href=...>[Seuraava sivu]</a>
 o cursor or vim keys. Better ui that is.
 o perhaps should one cache just the text pages...
   (ie after dehtml)
 o time-stamp to cache

REBOL [
    Name: "Tekstitv"
    Author: "Jussi Hagman"
    Version: 0.4
    History: [
        0.1 2000-7-6  "Initial version"
        0.2 2000-7-8  "Uses now load/markup, small fixes"
        0.3 2001-4-11 "Small fixes"
        0.4 2001-4-11 "Caching, Small fixes"
    ]
]

; URL template; REJOIN fills in the current page number n
base: [http://www.yle.fi/cgi-bin/tekstitv/ttv.cgi/ n "/txt/"]

; Clear the screen by printing a form feed
cls: func [] [print "^(page)"]

decode-table: [
    ["&Egrave;" "E"] ["&egrave;" "e"] ; !!!! Wrong!
    ["&Auml;" "Ä"] ["&auml;" "ä"]
    ["&Ouml;" "Ö"] ["&ouml;" "ö"]
    ["&Aring;" "Å"] ["&aring;" "å"]
]

decode-HTML: func [
    "Decodes HTML-style chars to their equivalents"
    string [string!]
    /local item x y
][
    foreach item decode-table [
        set [x y] item
        replace/case/all string x y
    ]
]

print-page: func [
    "Prints out one teletext page"
    page [block!]
    /local mode_pre item
][
    mode_pre: 0   ; nesting depth of <PRE> tags
    cls
    foreach item page [
        either string? item [
            ; When "[Edel" is found there is nothing interesting left on the page
            if find item "[Edel" [exit]
            if any [(mode_pre > 0) (not item = "^/")] [prin decode-HTML item]
        ][
            switch item [
                <PRE>  [mode_pre: mode_pre + 1]
                </PRE> [mode_pre: mode_pre - 1]
                <BR>   [print ""]
            ]
        ]
    ]
]

; The cache is a flat block of url/page pairs
page-cache: make block! 20

get-page: func [page /local l p][
    ; /skip 2 makes FIND look only at the urls, not at the cached pages
    if l: find/skip page-cache page 2 [
        return second l
    ]
    p: load/markup page
    append page-cache page
    append/only page-cache p
    return p
]

; Main program
if error? try [
    n: to-integer trim system/script/args
][
    n: 100
]

page: get-page rejoin base

forever [
    print-page page

    ;cache the next page
    n: n + 1
    page: get-page rejoin base

    command: ask "page? "
    if command = "q" [quit]
    ; a number jumps to that page; anything else moves on to the next page
    if not error? try [
        n: to-integer command
    ][
        page: get-page rejoin base
    ]
]

--
Jussi Hagman                CS in Åbo Akademi University
Studentbyn 4 D 33           [juhagman--abo--fi]
20540 Åbo                   [jhagman--infa--abo--fi]
Finland
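
The block-as-dictionary idea from point 2 can be tried without any network access. What follows is only a minimal sketch, not part of the program above: fake-fetch, fetch-count and cached-get are hypothetical names made up for illustration, and fake-fetch merely stands in for the slow load/markup call.

```rebol
REBOL [Title: "Block cache sketch"]

; Hypothetical stand-in for the slow fetch (load/markup in the real program).
; It counts how many times it is called so that cache hits can be observed.
fetch-count: 0
fake-fetch: func [key] [
    fetch-count: fetch-count + 1
    reduce ["contents of" key]
]

; Flat block of key/value pairs: [key1 value1 key2 value2 ...]
cache: make block! 20

cached-get: func [key /local pos value] [
    ; /skip 2 treats the block as records of two, so only keys are compared
    if pos: find/skip cache key 2 [return second pos]
    value: fake-fetch key
    append cache key
    append/only cache value   ; /only stores the value block as one element
    value
]

cached-get "100"
cached-get "101"
cached-get "100"    ; already in the cache, so fake-fetch is not called again
print fetch-count
```

If the cache behaves as intended, the counter printed at the end is 2 and not 3, since the repeated request for "100" is answered from the block.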