Mailing List Archive: 49091 messages

[REBOL] Re: Searching news articles for a string

From: David_A_Brown:vanguard at: 5-Jun-2003 9:12

Mat,

Thanks a lot. That is a handy function.

Dave

Mat Bettinson <[mat--plothatching].com>
Sent by: [rebol-bounce--rebo]l.com
06/03/2003 11:06 AM
Please respond to rebol-list
To: "[David_A_Brown--vanguard--com]" <[rebol-list--rebol--com]>
cc: (bcc: David A Brown/IT/VGI)
Subject: [REBOL] Re: Searching news articles for a string

Hello David,

Dvc> New York Times
Dvc> Boston Globe
Dvc> CNN Money

OK, I'm not sure exactly what sort of thing you wanted to see, but I've included here a version of the 'Google News watching' function from my IRC bot. It searches all the sub-sections of Google News. It looks at the pages themselves rather than using Google's news searcher, so the results are very recent.

You give 'GoogleNewsSearch' two strings. The first is the type of search: match "all" or "any" of the words in the second string. The words are space delimited. Here are some examples:
>> GoogleNewsSearch "any" "france nuclear"
1. A conflict of views sharpens in Korea - http://www.iht.com/articles/98268.htm
2. Euro states let France off deficit hook - http://www.expatica.com/francemain.asp?pad=278,313,&item_id=31706
3. Putin Wants IAEA Checks on Iran Nuclear Program - http://reuters.com/newsArticle.jhtml?type=worldNews&storyID=2868319
>> GoogleNewsSearch "all" "france nuclear"
1. Putin Wants IAEA Checks on Iran Nuclear Program - http://reuters.com/newsArticle.jhtml?type=worldNews&storyID=2868319

Maybe this might be a useful sort of thing for what you wanted to do?

GoogleNewsSearch: func [
    typehit [string!]
    searchterms [string!]
    /local Searchwords GoogleNewsURLs fgurl fgtitle fgstory i
][
    AnyHits: func [
        targetdata [string!]
        searchblock [block!]
    ][
        foreach sbit searchblock [
            if found? find targetdata sbit [return true]
        ]
        return false
    ]
    AllHits: func [
        targetdata [string!]
        searchblock [block!]
    ][
        foreach sbit searchblock [
            if not found? find targetdata sbit [return false]
        ]
        return true
    ]
    Searchwords: parse searchterms none
    GoogleNewsURLs: [
        http://news.google.com/news/gnworldleftnav.html
        http://news.google.com/news/gnusaleftnav.html
        http://news.google.com/news/gnbusinessleftnav.html
        http://news.google.com/news/gntechnologyleftnav.html
        http://news.google.com/news/gnsportsleftnav.html
        http://news.google.com/news/gnenterleftnav.html
        http://news.google.com/news/gnhealthleftnav.html
    ]
    GoogleHits: make block! []
    foreach GoogleURL GoogleNewsURLs [
        either error? try [Googlepage: read GoogleURL][
            return false
        ][
            parse Googlepage [
                any [
                    thru "<td width=80 align=center valign=top>"
                    thru {<a class=y href="} copy fgurl to {"}
                    thru {>} copy fgtitle to {</a>}
                    thru "<font size=-1>" thru "<br>" copy fgstory to "<br>"
                    (
                        if typehit = "all" [
                            if ((AllHits fgstory Searchwords) or (AllHits fgtitle Searchwords)) [
                                append GoogleHits reduce [fgurl fgtitle fgstory]
                            ]
                        ]
                        if typehit = "any" [
                            if ((AnyHits fgstory Searchwords) or (AnyHits fgtitle Searchwords)) [
                                append GoogleHits reduce [fgurl fgtitle fgstory]
                            ]
                        ]
                    )
                ]
            ]
        ]
    ]
    i: 0
    foreach [fgurl fgtitle fgstory] GoogleHits [
        i: i + 1
        print rejoin [i ". " fgtitle " - " fgurl]
    ]
]

Regards,
Mat Bettinson - +44-(0)20-83401514.
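
For readers who want to try the "any" versus "all" matching without fetching any pages, here is a minimal sketch of just that step. It assumes REBOL 2 behaviour, where `parse` with a `none` rule splits a string on whitespace and `find` on strings ignores case unless `/case` is given. The names `any-hit?`, `all-hit?`, `terms` and `headline` are made up for illustration; the function in the post calls its helpers AnyHits and AllHits.

```rebol
REBOL [Title: "Any/all word matching sketch"]

; Split the search string into a block of words, as the post's function does
terms: parse "france nuclear" none

; True as soon as one search word occurs in the text
any-hit?: func [text [string!] terms [block!]] [
    foreach term terms [if find text term [return true]]
    false
]

; False as soon as one search word is missing from the text
all-hit?: func [text [string!] terms [block!]] [
    foreach term terms [if not find text term [return false]]
    true
]

headline: "Putin Wants IAEA Checks on Iran Nuclear Program"
print any-hit? headline terms   ; "nuclear" occurs in the headline
print all-hit? headline terms   ; "france" does not occur in the headline
```

Note that the original applies each test to the story text and to the title separately and combines the two results with `or`, so an "all" search succeeds only when a single one of those strings contains every word.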