Mailing List Archive: 49091 messages

Little parsing problem just got bigger

 [1/3] from: nigelb:anix at: 20-Nov-2000 16:30

Thanks to all those who helped solve/explain the parsing problem I needed help with. I managed to get that all working fine. However (there's always a however, isn't there?)... The problem runs a little deeper than I thought.

I'm writing a small add-on to a bbs written in Perl to produce a 'Slash style' front page. The Perl bbs software allows the poster to add custom tags, e.g. [color = xxx] [quote] [code], as well as interpreting www as an HTML link and adding a link etc. I thought there were only a few tags I had to worry about, but on looking at the Perl code I find there are nearly 50 cases to worry about.

I've included the Perl code below (apologies for posting so much non-Rebol code, but it helps explain the problem). Basically what I need to do is mimic the Perl code below in Rebol. That way I can read the post with Rebol and display the post correctly. So I guess what I need is a general purpose function to replicate the Perl regular expression search and replace function.

    sub ikoncode {
        my $post = shift;
        $post =~ s/\<p>/<br><br>/isg;
        $post =~ s|\[\[|\{\{|g;
        $post =~ s|\]\]|\}\}|g;
        $post =~ s|\n\[|\[|g;
        $post =~ s|\]\n|\]|g;
        $post =~ s|<br>| <br>|g;
        $post =~ s|\[hr\]\n|\<hr width=40\% align=left>|g;
        $post =~ s|\[hr\]|\<hr width=40\% align=left>|g;
        $post =~ s/\[quote\](.*)\[quote\](.*)\[\/quote](.*)\[\/quote\]/<blockquote><hr><font size=\"1\" face=\"verdana, helvetica\">$1<\/font><blockquote><hr><font size=\"1\" face=\"verdana, helvetica\">$2<\/font><hr><\/blockquote><font size=\"1\" face=\"verdana, helvetica\">$3<\/font><hr><\/blockquote>/isg;
        $post =~ s/\[quote\]\s*(.*?)\s*\[\/quote\]/<font face=arial size=1><blockquote><hr noshade size=1>$1<hr noshade size=1><\/blockquote><\/font>/isg;
        $post =~ s/\[url\](\S+?)\[\/url\]/<a href=\"$1\"\ target=\"_blank\">$1<\/a>/isg;
        $post =~ s/\[url=http:\/\/(\S+?)\]/<a href=\"http:\/\/$1\"\ target=\"_blank\">/isg;
        $post =~ s/\[url=(\S+?)\]/<a href=\"http:\/\/$1\"\ target=\"_blank\">/isg;
        $post =~ s/\[\/url\]/<\/a>/isg;
        $post =~ s/\ http:\/\/(\S+?)\ / <a href=\"http:\/\/$1\"\ target=\"_blank\">http\:\/\/$1<\/a> /isg;
        $post =~ s/<br>http:\/\/(\S+?)\ /<br><a href=\"http:\/\/$1\"\ target=\"_blank\">http\:\/\/$1<\/a> /isg;
        $post =~ s/^http:\/\/(\S+?)\ /<a href=\"http:\/\/$1\"\ target=\"_blank\">http\:\/\/$1<\/a> /isg;
        $post =~ s/\ www.(\S+?)\ / <a href=\"http:\/\/www.$1\"\ target=\"_blank\">http\:\/\/www.$1<\/a> /isg;
        $post =~ s/<br>www.(\S+?)\ /<br><a href=\"http:\/\/www.$1\"\ target=\"_blank\">http\:\/\/www.$1<\/a> /isg;
        $post =~ s/^www.(\S+?)\ /<a href=\"http:\/\/www.$1\"\ target=\"_blank\">http\:\/\/www.$1<\/a> /isg;
        $post =~ s/\[b\]/<b>/isg;
        $post =~ s/\[\/b\]/<\/b>/isg;
        $post =~ s/\[i\]/<i>/isg;
        $post =~ s/\[\/i\]/<\/i>/isg;
        $post =~ s/\[size=\s*(.*?)\s*\]\s*(.*?)\s*\[\/size\]/<font size=\"$1\">$2<\/font>/isg;
        $post =~ s/\[font=\s*(.*?)\s*\]\s*(.*?)\s*\[\/font\]/<font face=\"$1\">$2<\/font>/isg;
        $post =~ s/\[u\]/<u>/isg;
        $post =~ s/\[br\]/<br>/isg;
        $post =~ s/\[\/u\]/<\/u>/isg;
        $post =~ s/\[img\](.+?)\[\/img\]/<img src=\"$1\">/isg;
        $post =~ s/\[color=(\S+?)\]/<font color=\"$1\">/isg;
        $post =~ s/\[\/color\]/<\/font>/isg;
        $post =~ s/\\http:\/\/(\S+)/<a href=\"http:\/\/$1\"\ target=\"_blank\">http:\/\/$1<\/a>/isg;
        $post =~ s/\[list\]/<ul>/isg;
        $post =~ s/\[\*\]/<li>/isg;
        $post =~ s/\[\/list\]/<\/ul>/isg;
        $post =~ s/\[code\](.+?)\[\/code\]/<blockquote><font size=\"1\" face=\"Courier New\">code:<\/font><hr><font face=\"Courier New\"><pre>$1<\/pre><\/font><hr><\/blockquote>/isg;
        $post =~ s/\\(\S+?)\@(\S+)/<a href=\"mailto:$1\@$2\"\>$1\@$2<\/a>/ig;
        $post =~ s/\[email=(\S+?)\]/<a href=\"mailto:$1\">/isg;
        $post =~ s/\[\/email\]/<\/a>/isg;
        $post =~ s|\{\{|\[|g;
        $post =~ s|\}\}|\]|g;
        return $post;
    } # end routine
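The "general purpose function" asked for above can be pictured as an ordered table of (pattern, replacement) pairs applied one after another. The following is only a rough sketch of that idea, written in Python rather than Rebol so it can be run and checked; the names `RULES`, `FLAGS`, and the Python `ikoncode` are hypothetical, and only a handful of the roughly 50 Perl rules are carried over.

```python
import re

# Each rule is (pattern, replacement), applied in order like the Perl s/// lines.
# Perl's $1 becomes \1 here. Only a few of the original rules are shown.
RULES = [
    (r"\[\[", "{{"),    # protect a literal "[[" before any tag handling
    (r"\]\]", "}}"),
    (r"\[b\]", "<b>"),
    (r"\[/b\]", "</b>"),
    (r"\[url\](\S+?)\[/url\]", r'<a href="\1" target="_blank">\1</a>'),
    (r"\[color=(\S+?)\]", r'<font color="\1">'),
    (r"\[/color\]", "</font>"),
    (r"\{\{", "["),     # restore the protected brackets last
    (r"\}\}", "]"),
]

# Rough equivalent of the Perl /is modifiers; re.sub replaces every match,
# which corresponds to /g.
FLAGS = re.IGNORECASE | re.DOTALL


def ikoncode(post: str) -> str:
    """Apply every rule in table order and return the converted post."""
    for pattern, replacement in RULES:
        post = re.sub(pattern, replacement, post, flags=FLAGS)
    return post
```

Order matters in the same way it does in the Perl: the `[[` and `]]` protection has to run first and be undone last, otherwise escaped brackets would be treated as tags. Adding a rule is then a matter of adding one row to the table.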

 [2/3] from: chaz:innocent at: 21-Nov-2000 21:05

Except for the fact that that code contains $'s, []'s, ()'s, and {}'s, I'd say put all of ikoncode into a block and parse the whole block. Something like

    substituteblock: copy []
    parse ikoncode [
        any [thru "^$post =~ s/" original to "/" thru "/" copy substitute to "/"]
        ( comment{replacement code} )
    ]
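One wrinkle with pulling the pattern and replacement straight out of the Perl source is that many of the patterns contain escaped delimiters (`\/`), and some lines use `|` instead of `/`. As a hedged illustration of the extraction step only, here is a Python sketch (not Rebol) that treats a backslash plus the following character as one unit, so an escaped delimiter does not end the field. The names `SUBST` and `extract_rules` are made up for this example, and it assumes every statement has the form `$post =~ s...;` as in the posted routine.

```python
import re

# Matches one statement of the form:  $post =~ s<d>pattern<d>replacement<d>flags;
# where <d> is "/" or "|". A backslash and the character after it are consumed
# together, so an escaped delimiter such as \/ stays inside the field.
SUBST = re.compile(
    r"""\$post\s*=~\s*s
        (?P<d>[/|])
        (?P<pattern>(?:\\.|(?!(?P=d))[^\\])*)
        (?P=d)
        (?P<replacement>(?:\\.|(?!(?P=d))[^\\])*)
        (?P=d)
        (?P<flags>[a-z]*);""",
    re.VERBOSE | re.DOTALL,
)


def extract_rules(perl_source):
    """Return a list of (pattern, replacement, flags) tuples, in source order."""
    return [
        (m.group("pattern"), m.group("replacement"), m.group("flags"))
        for m in SUBST.finditer(perl_source)
    ]
```

The extracted strings are still in Perl syntax (`$1`, `\/` and so on), so a further translation step would be needed before they could drive a Rebol parse rule or any other engine.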

 [3/3] from: al:bri:xtra at: 22-Nov-2000 22:36

> Except for the fact that that code contains $'s, []'s, ()'s, and {}'s, I'd
> say put all of ikoncode into a block and parse the whole block.

Hmmm, a Perl to Rebol translator...

Andrew Martin
Who's way too little time to work on it.
ICQ: 26227169