
[REBOL] Re: fighting spam paper & links / naive bayes / anybody ?

From: gchiu:compkarori at: 15-Sep-2002 23:19

On Mon, 26 Aug 2002 11:30:30 +1000 "Brett Handley" <[brett--codeconscious--com]> wrote:
> I've uploaded my prototype script on to my site at the address below, be
> warned it is not thoroughly tested and I'm certainly not letting it be final
> arbiter of my email just yet:
>
> http://www.codeconscious.com/rebol/mlscripts/spam-filter.r

I've taken Brett's code from the IOS server (I'm not sure it's the same as the one above) and created a "web service" out of it, just so that you can see what it does:

http://207.8.27.211/spam/index.html

Just paste a complete email, with all the headers, into the box and "test" it to see whether it is considered spam or not.

The database I'm using is built from 2597 good emails and 876 spam. At the moment it does not update itself, i.e. it does not learn, as I have to consider the issue of file locking etc.

What I would like to do is to tokenise the email locally and just send the tokens to the web service (perhaps SOAP or a Rugby service).

Trouble is, I don't know whether what I consider spam is what others consider spam. I would be interested to see what results people get.

--
Graham Chiu
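
For anyone wondering what "tokenise locally, then score against the good/spam counts" might involve, here is a rough Python sketch. It is a generic naive Bayes classifier with add-one smoothing, not Brett's script and not what the web service above actually runs. The token pattern, the function names (`tokenise`, `train`, `spam_probability`) and the smoothing are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Assumed token pattern: runs of letters, digits, "$", "'" and "-".
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")

def tokenise(message):
    """Split a raw email (headers and body) into lower-cased tokens."""
    return [t.lower() for t in TOKEN_RE.findall(message)]

def train(messages):
    """Count, for each token, the number of messages it appears in."""
    counts = Counter()
    for m in messages:
        counts.update(set(tokenise(m)))
    return counts

def spam_probability(tokens, good_counts, spam_counts, n_good, n_spam):
    """Naive Bayes over the tokens present, with add-one smoothing.

    Works in log space so that long emails do not underflow.
    """
    total = n_good + n_spam
    log_spam = math.log(n_spam / total)
    log_good = math.log(n_good / total)
    for tok in set(tokens):
        log_spam += math.log((spam_counts[tok] + 1) / (n_spam + 2))
        log_good += math.log((good_counts[tok] + 1) / (n_good + 2))
    diff = log_good - log_spam
    if diff > 700:  # avoid overflow in exp(); the message is clearly not spam
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))
```

With something like this, only the output of `tokenise` would need to leave the local machine; the service would hold the two count tables (2597 good and 876 spam messages in the case above) and return the probability.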