[REBOL] Re: Searching news articles for a string
From: David_A_Brown:vanguard at: 5-Jun-2003 9:12
Mat,
Thanks a lot. That is a handy function.
Dave
From: Mat Bettinson <[mat--plothatching].com>
Sent by: [rebol-bounce--rebo]l.com
Date: 06/03/2003 11:06 AM
Please respond to: rebol-list
To: "[David_A_Brown--vanguard--com]" <[rebol-list--rebol--com]>
cc: (bcc: David A Brown/IT/VGI)
Subject: [REBOL] Re: Searching news articles for a string
Hello David,
Dvc> New York Times
Dvc> Boston Globe
Dvc> CNN Money
OK, I'm not sure exactly what sort of thing you wanted to see, but here
is a version of a kind of 'google news watching' function that I included
in my IRC bot. It searches all the sub-sections of Google News. It reads
the section pages itself rather than using Google's news search, so the
results are very recent.
You give 'GoogleNewsSearch' two strings. The first is the type of
search: "all" to match all of the words in the second string, or "any"
to match any of them. The words in the second string are space
delimited. Here are some examples:
>> GoogleNewsSearch "any" "france nuclear"
1. A conflict of views sharpens in Korea -
http://www.iht.com/articles/98268.htm
2. Euro states let France off deficit hook -
http://www.expatica.com/francemain.asp?pad=278,313,&item_id=31706
3. Putin Wants IAEA Checks on Iran Nuclear Program -
http://reuters.com/newsArticle.jhtml?type=worldNews&storyID=2868319
>> GoogleNewsSearch "all" "france nuclear"
1. Putin Wants IAEA Checks on Iran Nuclear Program -
http://reuters.com/newsArticle.jhtml?type=worldNews&storyID=2868319
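The difference between "any" and "all" can be tried without touching the
network. This is only a rough sketch, assuming REBOL 2 (where 'parse'
with 'none' splits a string on whitespace); 'match-words?' and its
argument names are made up for illustration and are not part of the
function below. It checks one headline from the output above:

```rebol
; Hypothetical helper for illustration only.
match-words?: func [
    mode [string!]      ; "all" or "any"
    target [string!]    ; text to look in, e.g. a headline
    terms [string!]     ; space-delimited search words
    /local wordlist hits
][
    wordlist: parse terms none    ; "france nuclear" -> ["france" "nuclear"]
    hits: 0
    foreach word wordlist [
        ; 'find' ignores case by default, so "france" matches "France"
        if find target word [hits: hits + 1]
    ]
    either mode = "all" [hits = length? wordlist] [hits > 0]
]

headline: "Euro states let France off deficit hook"
print match-words? "any" headline "france nuclear"    ; true
print match-words? "all" headline "france nuclear"    ; false
```

The headline mentions France but not nuclear, so it passes the "any"
test and fails the "all" test, which fits the two result lists above.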
Maybe this is the sort of thing that would be useful for what you wanted to do?
GoogleNewsSearch: func [
    typehit [string!]        ; "all" or "any"
    searchterms [string!]    ; space-delimited search words
    /local
    AnyHits AllHits Searchwords GoogleNewsURLs GoogleHits Googlepage
    fgurl fgtitle fgstory i
][
    ; True if targetdata contains at least one of the search words.
    AnyHits: func [
        targetdata [string!]
        searchblock [block!]
    ][
        foreach sbit searchblock [
            if found? find targetdata sbit [return true]
        ]
        return false
    ]
    ; True only if targetdata contains every one of the search words.
    AllHits: func [
        targetdata [string!]
        searchblock [block!]
    ][
        foreach sbit searchblock [
            if not found? find targetdata sbit [return false]
        ]
        return true
    ]
    ; Split the search terms into a block of words.
    Searchwords: parse searchterms none
    ; One page per Google News section.
    GoogleNewsURLs: [
        http://news.google.com/news/gnworldleftnav.html
        http://news.google.com/news/gnusaleftnav.html
        http://news.google.com/news/gnbusinessleftnav.html
        http://news.google.com/news/gntechnologyleftnav.html
        http://news.google.com/news/gnsportsleftnav.html
        http://news.google.com/news/gnenterleftnav.html
        http://news.google.com/news/gnhealthleftnav.html
    ]
    ; Collected hits, stored as url / title / story triples.
    GoogleHits: make block! []
    foreach GoogleURL GoogleNewsURLs [
        either error? try [Googlepage: read GoogleURL][
            ; Give up if any section page cannot be read.
            return false
        ][
            ; Pull the link, headline and story text out of each item.
            parse Googlepage [
                any [
                    thru "<td width=80 align=center valign=top>"
                    thru {<a class=y href="} copy fgurl to {"}
                    thru {>} copy fgtitle to {</a>}
                    thru "<font size=-1>" thru "<br>" copy fgstory to "<br>"
                    (
                        if typehit = "all" [
                            if ((AllHits fgstory Searchwords) or
                                (AllHits fgtitle Searchwords)) [
                                append GoogleHits reduce [fgurl fgtitle fgstory]
                            ]
                        ]
                        if typehit = "any" [
                            if ((AnyHits fgstory Searchwords) or
                                (AnyHits fgtitle Searchwords)) [
                                append GoogleHits reduce [fgurl fgtitle fgstory]
                            ]
                        ]
                    )
                ]
            ]
        ]
    ]
    ; Print the hits as a numbered list.
    i: 0
    foreach [fgurl fgtitle fgstory] GoogleHits [
        i: i + 1
        print rejoin [i ". " fgtitle " - " fgurl]
    ]
]
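To see what the parse rule in the function picks out, it can be run on
a string instead of a live page. This is only a sketch, assuming REBOL 2:
the 'sample' markup is invented to fit the rule (the real pages may look
different or may have changed), and the 'hit-' names are made up for
illustration.

```rebol
; Invented markup shaped to fit the rule; not copied from a real page.
sample: {<td width=80 align=center valign=top><a class=y href="http://example.com/a">Sample headline</a><font size=-1>Source<br>Sample story text<br>}

parse sample [
    any [
        thru "<td width=80 align=center valign=top>"     ; start of an item
        thru {<a class=y href="} copy hit-url to {"}     ; link target
        thru {>} copy hit-title to {</a>}                ; headline text
        thru "<font size=-1>" thru "<br>"                ; skip the source line
        copy hit-story to "<br>"                         ; story snippet
        (print [hit-title "-" hit-url] print hit-story)
    ]
]
```

Each pass through the 'any' block consumes one item, so a page with
several items would run the paren once per item.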
Regards,
Mat Bettinson - +44-(0)20-83401514.