Mailing List Archive: 49091 messages

[REBOL] Re: Cleaning up swear words

From: btiffin:rogers at: 8-Jul-2007 14:09

Hello again,

Hope everyone got 'lucky' for the big 07.

Anyway, thanks to Tom, the code has changed so that it won't clean brass anymore.

Now a new quandary: what to do with two bad words run together with no spacing? Is there a fairly fast way of checking for badword [terms | end | ... any other words], or should it just be left to clean brass (perhaps removing that badword as potentially not so bad, since it is a common subword)? Hmm... this gets a little six of one, half a dozen of the other... these word games.

Who invented English? Did they not know anything about computers?

Cheers,
Brian

On Saturday 07 July 2007 04:05, Tom wrote:
> using parse instead of parse/all will get word boundaries
>
> > For instance :) ... Just noticed a bug. The code cleans brass but
> > leaves associate alone. Need a parse trick to advance to word boundaries
> > and not just skip through the text.
> >
> > Cheers again.
REBOL [
    Title: "Momify bad words"
    Author: "Brian Tiffin"
    Date: 06-Jul-2007
    File: %momify.r
    Purpose: "Translate bad words to cartoon speak"
    Version: 0.9.0
    Comment: {Usage: do %momify.r clean "..."}
]

;; protect the global name space
;; do it an object or use global: do %momify
context [
    ;; It's a fairly incomplete list of bad words. 244 entries.
    ;;   found at http://www.mattfacer.com/swear-filter/
    ;; If you want to add to the list:
    ;;   evaluate unhidelist, make your edits, evaluate hidelist,
    ;;   cut'n' paste badwords.bin to badwords definition
    ;;   then delete badwords.bin and badwords.raw
    ;;   or for more protection use shred -zu on at least badwords.raw files
    unhidelist: does [
        if any [not exists? %badwords.raw confirm "Overwrite badwords? "] [
            write/lines %badwords.raw mold to string! first badwords
            foreach word next badwords [
                write/lines/append %badwords.raw mold to string! word
            ]
        ]
        print "cover your eyes and edit %badwords.raw"
    ]

    ;; hidelist creates a list of words in binary so if REBOL
    ;;   errors out no one will be exposed to the bad words.
    ;; Again executing hidelist, you'll need to cut'n'paste include the
    ;;   badwords.bin in the badwords definition
    hidelist: has [base words] [
        base: system/options/binary-base
        system/options/binary-base: 64
        words: load %badwords.raw
        while [not tail? words] [
            change words to binary! first words
            words: next words
        ]
        save %badwords.bin compress mold head words
        system/options/binary-base: base
        print "Now cut'n'paste %badwords.bin to badwords: in source code"
    ]

    badwords: load decompress 64#{
eJx9l9tuqzgUhu/nKUaaFwDTVOWiFztqgaQNUqNdCB7NBYatuAmmKM2JjObd9zIB
r2VSzUUkf7/Py+tA/r6/++vfOJq5/O3x8b8/O0rX5/IHkmyQsp/vbt9KJ1KgvqIr
ZKt5W7B3pAvtiy/Yek25m7F9VaxpP6+TTc6S1lJ2OXsmY/LQ/7JXLZk8itDs+hE4
GVsPtPlxyuqFoWeP9lkjVfAlwrnkp0eqRAuLYpmxL1QuZRi0HK2hEll4yyZLz00Z
bY263dF9KmvX81Gw5QSpZM0xQ6sovxHsLAtyKv9IbgQk2PxCyf+wR4NyM+NmXD2v
wJYOVZINXy2loYOot+vRC9WJU4auS+5zmW5yXEOTF49fmgUeGcNkI5SxKatOGb4u
OxGPzNi5+V/Ct2ITl/b59EyaWvdSpvGOr2ZUZRxe991SZJWlb5biukLFn7fKaK1v
bn1ViTcbZS+NssOWWCUHsI69xq7w9AuQNe76lpfsTetAbq8pPDej03iVFGHV+w1/
CoiFeFiNybIDKF88fECCmH82dLbm+vVtpGuVs+pgE6ezDuBjk0JVHxBRrVGPhfEz
HoEVTMzztJG5l3xlq2XVKx9TdxYu1nk7rWbR8DJ8s9RePJAKamzBGRy02lXxHUP2
PKBKiTA4YbwMqu1XWv21IjMdzINcwduT+doTzG07sqyiEmpZReOQQ0zTPiDsq615
t4RvZ8X2lax37xTzbj3x0WqgePFnvlruhbeEqJiPLaJHeSWMeLMVEs0662PLmruz
TrizzrODSjBapYRXNXSHrQX6BAsmIvRJFod3l2UU956Uh8kFPQ2qjywUIeLXuV7F
5K5cQUwxX3HsV9VFYL+VBXM2IVm/I6gleKZe2VKib3NVoHKw/cRWdCRdFfEUNy9s
yF2aytDQz6lamN00iRPSm1uGSQv1zsH7ifT5qm7WY2U0TlfJpH1N57I0dhxUOrtX
YPbLejxbq4vTd2qe+ofv1cIeH92e7GHP1YMhn1oAiFgg9R3wg9aOEFsdcjhkbVCH
mLpSoLgVA4OqI4Os16ljGu250XnJrN6R6VPBfY63U1XNL3FLaflJCFfpKLD6gotF
95RwP/8DW7PQ8jVQ4PTEgztF6Yq3tRWwi1HqZJ8pqN2mkoPiFIxWp+Jp2grjH0UY
eJSSQ75aWBSP8k8RShe/4YoQbmRZGJR9YSqjbqEfdcQ41Jmp6ffpaaJ5Q9aO5kfa
l1zw2/hKW0rGLlD39pA9PCSR0jxbtJPjq5r1Xl9AREP+ov0sbgULHELeUBd7cjEP
QU2VLydKmEGBzpSkpLsAsZLkv15pxK2iClQqiEbyBQ9KQ/enGfwbCpfUwp3ybtGQ
I3uaUOLueC7NoZ0SLa0bnV26v2vRRGL+LLypzMwtyqfAgV8Vv1lKWFXoS3C2Cv1c
06816YNMVm2y9DRS8H8OkGNXc60UFuFpO4L9yYl6xfb+QR2tE12/ayzFg3+CVGHB
gZwlKujuUSlHZFm+U6LhXxCQK2rSB0Rfpdw8O9yby4GCulBDLitVJUnf5ccJY06f
EP13dF5NaAVNJh71/8vM5A9N+AUA3xfwr9Gyl1b0f78//vkN4gHQ8mQPAAA}

    ;; Replacement characters
    comicbook: "!-#$%^&*.~*&^%$#-!.~!-#$%^&*.~*&^%$#-!"

    ;; ensure words are shorter to longer
    sort/compare badwords func [a b] [sign? subtract length? a length? b]

    ;; non terminators
    nont: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
    ;; terminators
    terms: complement nont

    ;; big block of rules
    rules: copy []
    reprules: copy []

    ;; build a replacement rule for each badword
    foreach word badwords [
        insert tail reprules compose/deep [
            mark: [
                (word) [terms | end]
                (to paren! compose [
                    change mark copy/part random comicbook (length? word)
                ])
                :mark] |
        ]
    ]
    ;; then last alternate is to advance past non badword
    ;;   then a rule to skip any terminators
    insert tail reprules [some nont any terms]

    ;; Generate the parse rule
    insert tail rules compose/only [any terms some (reprules)]

    ;;
    ;; Get rid of badwords
    ;;
    momify: func [
        "Replace bad words with comic book text"
        instr [string!] "String to clean - Modified"
    ][
        parse/all instr [some [rules]]
        instr
    ]

    ;; uncomment to expose momify as clean,
    ;; or comment to hide all the words and then the usage is
    ;;   a: do %momify.r a/momify "..."
    ;;   that is the only way to get at unhidelist and hidelist
    set 'clean :momify
]
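
One possible way to approach the "two bad words with no spacing" question raised in the message is to let the boundary test accept a third case: another listed word that is itself followed by a valid boundary. The sketch below is not part of momify.r and has not been run against the real list. It assumes REBOL 2 parse semantics, uses a two-entry stand-in word list, and replaces with a fixed "*" instead of the random comicbook text so the result is predictable. The names words, bounded, run, reps, stars, scrub, and here are hypothetical and chosen for illustration only.

```rebol
REBOL [Title: "Run-together boundary sketch (hypothetical, not from momify.r)"]

;; Stand-in list for illustration; the real badwords block is not reproduced
words: ["darn" "heck"]

;; Same character classes as momify.r
nont: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
terms: complement nont

;; A word is bounded by a terminator, the end of input, or by a further
;; listed word that is itself bounded (bounded and run refer to each other)
bounded: [terms | end | run]

;; run: any listed word followed by a valid boundary
run: copy []
foreach w words [append run compose [(w) bounded |]]
remove back tail run    ; drop the trailing |

;; Hypothetical helper: a string of n asterisks
stars: func [n [integer!]] [head insert/dup copy "" "*" n]

;; One replacement rule per word. The boundary is only looked at
;; (here: ... :here), so the next word in a run is still available
;; to be matched and replaced on the following pass of the loop.
reps: copy []
foreach w words [
    append reps compose [
        mark: (w) here: bounded
        (to paren! compose [change mark (stars length? w)])
        :here |
    ]
]
;; Last alternate: skip a whole word that is not on the list
append reps [some nont]

scrub: func [
    "Replace listed words, including run-together ones (sketch)"
    instr [string!] "String to clean - Modified"
][
    parse/all instr [any [terms | reps]]
    instr
]

print scrub "darnheck that darning needle"
```

With this shape a run such as "darnheck" is meant to be replaced word by word, while "darning" falls through to the final alternate and is skipped whole, so the brass case stays fixed. The cost is that every boundary test may try the whole word list again, which could be slow with 244 entries; whether that counts as "fairly fast" would need measuring.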