[REBOL] Re: Preventing Automated Website Registrations
From: pwoodward:cncdsl at: 8-May-2002 17:38
Joel -
Not a bad idea to eliminate the overhead of image generation. However
(there's always a however), having used web automation tools like the ones
from Orsus (http://www.orsus.com/), I'd say the mechanism for easy image
generation you describe is "easy prey".
For a previous employer I built a cross-registration system for site
members. Essentially their business model was to partner with several
e-commerce sites, and then get their members cross-registered at them. They
wanted it so that their members would only have to enter their user data
once, and it would be stored as a superset of the data needed to
cross-register them at any partner site on demand. We used Orsus's tools to do
this. It took about a day per site.
For an example of a site that uses this type of web automation, check out
www.dealtime.com. They use the Orsus product to perform aggregate searching
across a whole bunch of e-commerce sites. The "buy it" button at dealtime
actually automates the whole checkout process. In a sense the Orsus product
is a "screen scraper" in that it actually browses to the target web site for
the user of your site. In reality it's a pretty sophisticated piece of
work.
As it retrieves HTML pages from sites, it converts them (on the fly, if
needed) to XHTML. You can then run XQL (XML Query Language) queries against
the page data and, in turn, apply regular expressions to the results. If you
are familiar with ASP (VB) or JSP (Java), the results could be handed back
to your calling script as recordset or resultset objects, respectively.
Parsing the numbers out of the names of a set of generated images would be the
work of about 15 minutes with a tool like this. It might take longer with some
of the free screen scraping Perl libraries - but not much.
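
To give a feel for how little work that is, here is a rough sketch in
Python using only the standard library. It is not how the Orsus tools do it.
The "digit_N.png" naming scheme and the sample page are assumptions made up
for illustration; any scheme where the file name reveals the digit falls the
same way.

```python
import re
from html.parser import HTMLParser


class ImageNameScraper(HTMLParser):
    """Collect the src attribute of every <img> tag on a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def read_code_from_names(html):
    """Recover the code by reading one digit out of each image file name."""
    scraper = ImageNameScraper()
    scraper.feed(html)
    digits = []
    for src in scraper.sources:
        # Hypothetical naming scheme: digit_0.png ... digit_9.png
        match = re.search(r"digit_(\d)\.png$", src)
        if match:
            digits.append(match.group(1))
    return "".join(digits)


# Made-up registration page, for illustration only.
SAMPLE_PAGE = """
<form action="/register" method="post">
  <img src="/img/logo.png">
  <img src="/img/digit_4.png"><img src="/img/digit_0.png">
  <img src="/img/digit_7.png"><img src="/img/digit_3.png">
  <input type="text" name="code">
</form>
"""

if __name__ == "__main__":
    print(read_code_from_names(SAMPLE_PAGE))
```

No image processing is involved at all; the page markup alone gives the
answer away.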
In short - always generating an image named "imagecode.png" or something
similar would be better - especially if the contents of that image are
generated on the fly. That way, the image name stays the same and gives no
clue as to the content of the image. While generating that image may be
expensive, the effort and computation required to automate interpretation
of that image is more expensive still.
A possible extension to security might be to have the user save that image
to their own hard disk. Instead of using a standard username and password
to log in - maybe use a multipart form, with an upload field... Every time
they want to log in, they upload their image... Again, there would be a
computing and bandwidth cost, but that has to be weighed against the cost
of breached security.
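
One way the server side of that upload check might look, as a sketch only.
The function names and the in-memory store are hypothetical, and random
bytes stand in for a real image file. The server keeps a SHA-256 digest of
the image it issued and compares it with a digest of whatever gets uploaded.

```python
import hashlib
import hmac
import secrets

# Username -> SHA-256 digest of the image issued to that user.
# A real system would persist this; a dict keeps the sketch short.
_image_digests = {}


def issue_login_image(username):
    """Create the image a user saves to disk, remembering only its digest."""
    # Stand-in: random bytes in place of a real generated image file.
    image_bytes = secrets.token_bytes(2048)
    _image_digests[username] = hashlib.sha256(image_bytes).digest()
    return image_bytes


def login_with_image(username, uploaded_bytes):
    """Accept the login only if the upload matches the issued image exactly."""
    expected = _image_digests.get(username)
    if expected is None:
        return False
    actual = hashlib.sha256(uploaded_bytes).digest()
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(expected, actual)
```

Note that an exact-match check like this fails if the user's software
re-saves or recompresses the file, so the image would have to be uploaded
byte for byte as it was issued.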
- Porter