Little parsing problem just got bigger
[1/3] from: nigelb:anix at: 20-Nov-2000 16:30
Thanks to everyone who helped solve and explain the parsing problem I needed
help with. I managed to get it all working fine.
However (there's always a however, isn't there?)...
The problem runs a little deeper than I thought.
I'm writing a small add-on to a BBS written in Perl to produce a 'Slash
style' front page. The Perl BBS software lets the poster add custom
tags, e.g. [color=xxx], [quote] and [code], and it also turns bare www and
http addresses into HTML links. I thought there were only a few tags I had
to worry about, but on looking at the Perl code I find there are nearly 50
cases to handle.
I've included the Perl code below (apologies for posting so much non-Rebol
code, but it helps explain the problem). Basically, what I need to do is
mimic the Perl code below in Rebol. That way I can read the post with Rebol
and display it correctly.
So I guess what I need is a general purpose function to replicate the Perl
regular expression search and replace function.
sub ikoncode {
my $post = shift;
$post =~ s/\<p>/<br><br>/isg;
$post =~ s|\[\[|\{\{|g;
$post =~ s|\]\]|\}\}|g;
$post =~ s|\n\[|\[|g;
$post =~ s|\]\n|\]|g;
$post =~ s|<br>| <br>|g;
$post =~ s|\[hr\]\n|\<hr width=40\% align=left>|g;
$post =~ s|\[hr\]|\<hr width=40\% align=left>|g;
$post =~ s/\[quote\](.*)\[quote\](.*)\[\/quote](.*)\[\/quote\]/<blockquote><hr><font size=\"1\" face=\"verdana, helvetica\">$1<\/font><blockquote><hr><font size=\"1\" face=\"verdana, helvetica\">$2<\/font><hr><\/blockquote><font size=\"1\" face=\"verdana, helvetica\">$3<\/font><hr><\/blockquote>/isg;
$post =~ s/\[quote\]\s*(.*?)\s*\[\/quote\]/<font face=arial size=1><blockquote><hr noshade size=1>$1<hr noshade size=1><\/blockquote><\/font>/isg;
$post =~ s/\[url\](\S+?)\[\/url\]/<a href=\"$1\"\ target=\"_blank\">$1<\/a>/isg;
$post =~ s/\[url=http:\/\/(\S+?)\]/<a href=\"http:\/\/$1\"\ target=\"_blank\">/isg;
$post =~ s/\[url=(\S+?)\]/<a href=\"http:\/\/$1\"\ target=\"_blank\">/isg;
$post =~ s/\[\/url\]/<\/a>/isg;
$post =~ s/\ http:\/\/(\S+?)\ / <a href=\"http:\/\/$1\"\ target=\"_blank\">http\:\/\/$1<\/a> /isg;
$post =~ s/<br>http:\/\/(\S+?)\ /<br><a href=\"http:\/\/$1\"\ target=\"_blank\">http\:\/\/$1<\/a> /isg;
$post =~ s/^http:\/\/(\S+?)\ /<a href=\"http:\/\/$1\"\ target=\"_blank\">http\:\/\/$1<\/a> /isg;
$post =~ s/\ www.(\S+?)\ / <a href=\"http:\/\/www.$1\"\ target=\"_blank\">http\:\/\/www.$1<\/a> /isg;
$post =~ s/<br>www.(\S+?)\ /<br><a href=\"http:\/\/www.$1\"\ target=\"_blank\">http\:\/\/www.$1<\/a> /isg;
$post =~ s/^www.(\S+?)\ /<a href=\"http:\/\/www.$1\"\ target=\"_blank\">http\:\/\/www.$1<\/a> /isg;
$post =~ s/\[b\]/<b>/isg;
$post =~ s/\[\/b\]/<\/b>/isg;
$post =~ s/\[i\]/<i>/isg;
$post =~ s/\[\/i\]/<\/i>/isg;
$post =~ s/\[size=\s*(.*?)\s*\]\s*(.*?)\s*\[\/size\]/<font size=\"$1\">$2<\/font>/isg;
$post =~ s/\[font=\s*(.*?)\s*\]\s*(.*?)\s*\[\/font\]/<font face=\"$1\">$2<\/font>/isg;
$post =~ s/\[u\]/<u>/isg;
$post =~ s/\[br\]/<br>/isg;
$post =~ s/\[\/u\]/<\/u>/isg;
$post =~ s/\[img\](.+?)\[\/img\]/<img src=\"$1\">/isg;
$post =~ s/\[color=(\S+?)\]/<font color=\"$1\">/isg;
$post =~ s/\[\/color\]/<\/font>/isg;
$post =~ s/\\http:\/\/(\S+)/<a href=\"http:\/\/$1\"\ target=\"_blank\">http:\/\/$1<\/a>/isg;
$post =~ s/\[list\]/<ul>/isg;
$post =~ s/\[\*\]/<li>/isg;
$post =~ s/\[\/list\]/<\/ul>/isg;
$post =~ s/\[code\](.+?)\[\/code\]/<blockquote><font size=\"1\" face=\"Courier New\">code:<\/font><hr><font face=\"Courier New\"><pre>$1<\/pre><\/font><hr><\/blockquote>/isg;
$post =~ s/\\(\S+?)\@(\S+)/<a href=\"mailto:$1\@$2\"\>$1\@$2<\/a>/ig;
$post =~ s/\[email=(\S+?)\]/<a href=\"mailto:$1\">/isg;
$post =~ s/\[\/email\]/<\/a>/isg;
$post =~ s|\{\{|\[|g;
$post =~ s|\}\}|\]|g;
return $post;
} # end routine
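
One way to approach the general-purpose function asked for above is an ordered table of pattern/replacement pairs applied one after another. The sketch below is in Python rather than Rebol, purely to pin down the behaviour. It hand-translates only a subset of the Perl rules, and the names RULES, FLAGS and the Python ikoncode are illustrative assumptions, not part of the BBS software.

```python
import re

# Ordered (pattern, replacement) pairs: a hand-translated SUBSET of the
# Perl ikoncode rules. Order matters: the [[ and ]] escapes are hidden
# first and restored last, so they never look like tags in between.
RULES = [
    (r"\[\[", "{{"),
    (r"\]\]", "}}"),
    (r"\[quote\]\s*(.*?)\s*\[/quote\]",
     r"<font face=arial size=1><blockquote><hr noshade size=1>\1"
     r"<hr noshade size=1></blockquote></font>"),
    (r"\[url\](\S+?)\[/url\]", r'<a href="\1" target="_blank">\1</a>'),
    (r"\[b\]", "<b>"),
    (r"\[/b\]", "</b>"),
    (r"\[i\]", "<i>"),
    (r"\[/i\]", "</i>"),
    (r"\[img\](.+?)\[/img\]", r'<img src="\1">'),
    (r"\[color=(\S+?)\]", r'<font color="\1">'),
    (r"\[/color\]", "</font>"),
    (r"\{\{", "["),
    (r"\}\}", "]"),
]

# Perl's /i and /s flags. Perl's /g needs no flag here because
# re.sub replaces every match by default.
FLAGS = re.IGNORECASE | re.DOTALL


def ikoncode(post):
    """Apply each rule in order and return the converted post."""
    for pattern, replacement in RULES:
        post = re.sub(pattern, replacement, post, flags=FLAGS)
    return post
```

For example, ikoncode("[b]Hi[/b] [[b]]") gives "<b>Hi</b> [b]": the doubled brackets survive because they are hidden as {{ }} before the tag rules run and restored afterwards. Keeping the rules as data means the remaining cases can be added to the table without touching the loop.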
[2/3] from: chaz:innocent at: 21-Nov-2000 21:05
Except for the fact that that code contains $'s, []'s, ()'s, and {}'s, I'd
say put all of ikoncode into a block and parse the whole block.
Something like
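substituteblock: copy []
; assumes ikoncode is a string holding the Perl source shown above
parse/all ikoncode [
    any [
        thru "$post =~ s/" copy original to "/" thru "/"
        copy substitute to "/"
        ( comment {replacement code} )
    ]
]
; naive: stops at the first "/", so escaped "\/" and the s|...| forms
; would need extra handling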
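
The idea of pulling the pattern/replacement pairs out of the Perl source can be sketched as follows. This is in Python rather than Rebol, as an illustration only. The names SUBST and extract_rules are hypothetical, and it assumes every rule has the form `$post =~ s<d>...<d>...<d>flags;` with `/` or `|` as the delimiter, as in the posted routine.

```python
import re

# Matches: $post =~ s<d>PATTERN<d>REPLACEMENT<d>FLAGS;  where <d> is / or |.
# (?:\\.|(?!\1).) consumes either a backslash-escaped character or any
# character that is not the delimiter, so "\/" inside a pattern does not
# end it. DOTALL lets a rule that was wrapped over several lines match.
SUBST = re.compile(
    r"\$post\s*=~\s*s([/|])"
    r"((?:\\.|(?!\1).)*)\1"
    r"((?:\\.|(?!\1).)*)\1"
    r"([a-z]*);",
    re.DOTALL,
)


def extract_rules(perl_source):
    """Return a list of (pattern, replacement, flags) tuples."""
    return [
        (m.group(2), m.group(3), m.group(4))
        for m in SUBST.finditer(perl_source)
    ]
```

The extracted strings are still Perl syntax ($1 in replacements, Perl escapes in patterns), so they would need translating before use in another language.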
[3/3] from: al:bri:xtra at: 22-Nov-2000 22:36
> Except for the fact that that code contains $'s, []'s, ()'s, and {}'s, I'd
> say put all of ikoncode into a block and parse the whole block.
Hmmm, a Perl to Rebol translator...
Andrew Martin
Who has way too little time to work on it.
ICQ: 26227169
http://members.nbci.com/AndrewMartin/