Mailing List Archive: 49091 messages

[REBOL] Re: The "It's Mine Now and I'll Do What I Want With It" Project Proposal

From: koolauscott::yahoo::com at: 10-Mar-2001 11:27

Yes, you can do this. I'm working on a similar project right now. It's easy to do for any given page, but it is difficult to write generic code.

My approach is to parse a web page and then run each result through a series of simple functions, each of which tests for something desired or not desired. After I have taken what I want, I run it through an HTML preprocess function, and the result can then be used in generating a web page. To make dead links live I generally use split-path on the main URL, but there are some exceptions. Some sites require a function just to find the correct URL for the day.

I tie all these functions together with a master function, so I only need one function call to process any given web page. Because this function call can be complicated, I'm working on a tool that makes it easy to examine any web page and then create the parameters needed to build the function call.

A good example of this is the website [URL missing]. They extract headlines from news pages and they produce excellent results. To cover thousands of sites they must have good generic code.

My first website doing this is at [URL missing]. It has a couple of bugs but it has been working fairly well. The problem is that I have to write a new page of code for each web page I want to extract headlines from, and that's time-consuming. That's why I'm in the process of rewriting the code to be more generic. I think the key idea is to write simple functions that each do only one thing but, when combined, have the power to extract and reformat just about anything from a web page.

When and if I get further along in this I will make my scripts available.

--- Terry Brownell <[depotcity--home--com]> wrote:
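
For illustration only, here is a rough sketch of the pipeline described in the reply above: parse the page, run each link through small single-purpose tests, make the links absolute, and emit HTML, all behind one master function. It is written in Python rather than REBOL and is not the poster's code. Every name in it (LinkCollector, has_text, min_words, not_containing, extract_headlines) is hypothetical, and urljoin stands in for the split-path approach on the assumption that "dead links" means relative hrefs.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element on a page."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


# Each test does exactly one thing and returns True to keep a link.
def has_text(link):
    return bool(link[1])


def min_words(n):
    def test(link):
        return len(link[1].split()) >= n
    return test


def not_containing(word):
    def test(link):
        return word.lower() not in link[1].lower()
    return test


def extract_headlines(html, base_url, tests):
    """Master function: parse, filter, resolve links, emit HTML anchors."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    kept = [link for link in collector.links if all(t(link) for t in tests)]
    # urljoin resolves relative hrefs against the URL of the source page.
    return [
        '<a href="%s">%s</a>' % (escape(urljoin(base_url, href), quote=True), escape(text))
        for href, text in kept
    ]
```

With this shape, supporting a new page would mostly mean choosing a different list of tests, for example extract_headlines(page, url, [has_text, min_words(3), not_containing("buy")]), rather than writing a new page of code.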