Can REBOL change HTML of a web page?
[1/14] from: oliveirard:y:ahoo at: 29-Mar-2005 15:57
Hello guys! I want to build a program that could run on the client side (IE, Netscape,
Opera, etc.), get the web page returned by the server and make some adjustments to it,
let's say, adding some footnotes or summarizing the text. Can I do that using REBOL?
Thanks to you all!
Rodrigo
[2/14] from: SunandaDH:aol at: 29-Mar-2005 14:10
Rodrigo:
> Hello guys! I want to build a program that could run on the client side
> (IE, Netscape, Opera, etc.), get the web page returned by the server and make
> some adjustments to it, let's say, adding some footnotes or summarizing the text.
> Can I do that using REBOL?
Yes.
You need to insert your script as a proxy between the client and the
webserver. Then you can do what you like to the incoming html stream.
For inspiration:
http://www.rebol.org/cgi-bin/cgiwrap/rebol/view-script.r?color=yes&script=proxy.r
Sunanda.
[3/14] from: oliveirard::yahoo::com::br at: 29-Mar-2005 16:29
Hi Sunanda and all you guys,
Thanks so much for your attention! I went to that page and got the feeling
that this way I will have to set all the browsers' configuration to go
through this proxy. Even if I could make this a transparent proxy, it would
be a problem because I don't want a local solution. The traffic through it
will be very heavy, am I right? That's why I was wondering about building
something like an add-on or plug-in for a browser that users could install,
like the plug-ins for Acrobat Reader, Flash, QuickTime, etc. The traffic in the
HTTP requests/responses would be the same and the plug-in would make the
adjustments needed. Do you know of anything I can use? Any book? Any tutorial?
Any web page?
Thanks a lot.
Rodrigo
[4/14] from: carl::cybercraft::co::nz at: 30-Mar-2005 8:21
On Tuesday, 29-March-2005 at 15:57:40 Rodrigo de Oliveira wrote,
>Hello guys! I want to build a program that could run on the client side
>(IE, Netscape, Opera, etc.), get the web page returned by the server and make
>some adjustments to it, let's say, adding some footnotes or summarizing the text.
>Can I do that using REBOL?
>
>Thanks to you all!
Perhaps you could set up a local server that you'd initially point your browser to, and
from then on it'd handle all requests to the Net, changing all the links in webpages
to point back to itself before giving them to the browser. I.e., you'd switch the links
from http://www.xyz.not/ to http://localhost:8080/?http://www.xyz.not/ or whatever.
There's a tiny-server example in the main Core manual here...
http://www.rebol.com/docs/core23/rebolcore-13.html#section-14
Hmmm. This would be a way to do local translations of webpages. Well, the easy part,
anyway!
-- Carl Read.
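As a rough illustration of the link switching described above, here is a minimal sketch in JavaScript (not REBOL, and not code from this thread). The function name rewriteLinks and the constant LOCAL_PREFIX are made up for the example; the prefix is simply the localhost:8080 address from the message. It only handles absolute http:// links in href attributes, so relative links, forms and images would need more work.

```javascript
// Hypothetical helper: make every absolute http:// link point back at a
// local server, as in http://localhost:8080/?http://www.xyz.not/
const LOCAL_PREFIX = "http://localhost:8080/?"; // assumed local server address

function rewriteLinks(html, prefix = LOCAL_PREFIX) {
  // Capture href=" (or href=') and the absolute URL that follows it.
  return html.replace(/(href\s*=\s*["'])(http:\/\/[^"']*)/gi, (match, attr, url) => {
    // Leave links alone if they already go through the local server.
    if (url.startsWith(prefix)) return match;
    return attr + prefix + url;
  });
}

console.log(rewriteLinks('<a href="http://www.xyz.not/">xyz</a>'));
```

The local server would then have to strip the prefix again from each incoming request to find the real address to fetch.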
[5/14] from: antonr::lexicon::net at: 31-Mar-2005 15:31
Hi Rodrigo,
This is an interesting question and one that
I would like to have as an example myself some time.
A friend of mine I was speaking to recently said
he had seen it done (using javascript I think)
to update sections of a page.
I'll have to ask him to send me more details next
time I see him but for now at least you know it
is possible.
I think sticking rebol in as a proxy client-side is making
things more complicated than it needs to be, since I think
the javascript can do this "asynchronous partial updating".
Of course, rebol can be used at the server end.
Anton.
[6/14] from: antonr::lexicon::net at: 1-Apr-2005 0:17
This looks like it - AJAX ("Asynchronous JavaScript + XML"):
http://www.adaptivepath.com/publications/essays/archives/000385.php
(link found by Graham Chiu, thanks)
Anton.
[7/14] from: oliveirard::yahoo::com::br at: 31-Mar-2005 19:08
Hi Anton,
Thanks a lot! I found this AJAX and Chris' dissection of Google Suggest very
interesting. I don't know yet if that's going to solve my problem because I
need to intercept the server responses and make the adjustments locally
WITHOUT changing the original web pages. But I will certainly give this a
try. If that works, I'll tell you.
Thanks.
Rodrigo
[8/14] from: inetw3::mindspring::com at: 31-Mar-2005 21:06
Hi Rodrigo,
I don't know exactly what you need, but the
Rebol/plug-in can be used to retrieve data from
the web page, and your rebol script can then
process it as needed.
There are many ways to do this, and without CGI
scripts. But what the Rebol/plug-in brings is the
ability to act as a full-blown webpage server.
Remember the plug-in is limited to IE, Mozilla and Firefox,
and there is no limit to what this plug-in can achieve
or do to or from a single webpage (a security issue).
The trick is you must use javascript with XML-DOM (DHTML)
or the .innerHTML (easier) functions to get the page's data.
Look at the Rebol/Plugin webpage for an idea how to do this.
Put this javascript in your rebol script and the rebol script
in the rebol/plug-in code. When the plug-in loads, have the
javascript part inserted into the webpage.
How?
Give each webpage a "<script>" tag with an ID name, i.e.
<script id="update"></script>, and this is where you want
the rebol/plug-in to put the javascript.
You can read or write to the page or anywhere else that you
need the returned data to go to.
How do I get Rebol to respond and get new data from the page?
Just like a form has a submit button, you can make the
Rebol/plug-in viewable as a button with text that says something like
"update/retrieve data". Just an idea.
Another way is to just have regular javascript code in your page
write all your needed info into the DOM (Document Object Model),
and then use a Rebol script (maybe quickparser.r at Rebol.org?)
in the plug-in to parse the needed data right out of the page and do
with it as you please.
[9/14] from: oliveirard::yahoo::com::br at: 1-Apr-2005 11:22
Hi Anton and all you guys,
I took a look at this proposal and I think it won't work. AJAX is
basically DHTML + XMLHttpRequest and I got the feeling that it demands
commitment to this approach from the beginning of the design. I mean,
the original web pages need to be constructed in this way, like Google
Suggest and Google Maps. That's not my case. Let me try to state it clearly:
*"I want something on the client side (between browser and web server)
that can intercept the HTTP response (the web page returned by the server),
make some adjustments to the page (for example, inserting footnotes,
summarizing content, etc.) and only then deliver the changed page to
the browser for presentation. Just a reminder: it has to be client-side
programming, which means I CAN'T change the original web pages of ANY
web server."*
Can Rebol do that? Another plug-in? Or a browser extension? Any Java
solution in client-programming?
Hope you guys help me on that. Thanks for your attention!
Rodrigo
[10/14] from: inetw3::mindspring::com at: 1-Apr-2005 20:48
Hi Rodrigo,
Yes, what you want is possible without having to change
the page on the web server. I don't know if you've seen
the email I posted 3/31/05 @ 9:09 pm, but to reiterate,
javascript and scripting the DOM is all you need.
No matter how you intercept the html data, something must
be written in the html to start this process; cgi, asp, php, perl,
javascript, etc.
With any plug-in, you have to have an <object> tag in the
webpage. This is where the Rebol/plug-in comes in.
If this is someone else's page you're browsing (and I
think it is) then you're still going to need the plug-in <object>
coded into the page.
If I understand you correctly,
I don't believe any web-app or plug-in software can
intercept a page and add info to it before the page is
seen unless it's a proxy/socket app/server that's running,
or the details are written in the page first, or it's coming
from the same server.
All web-app or plug-in software presents its info in its
own window, embedded into the page or off the server.
And this is what the Rebol/plug-in and DOM scripting must do.
----------------------------------------------------------------------------
Using javascript and a css style, you hide the page, i.e.
<body id="thepage" style="visibility:hidden;">.
If it's a browsed page from the web, it still needs the rebol
<object> plug-in in the page.
When the page loads, have the rebol/plug-in parse the page
and/or add your needed info to the heading and footing of the
page.
Finally, from the plug-in script, change the
<body> style to visible, i.e.
document.getElementById("thepage").style.visibility = "visible"
Look at my other email for other ideas.
If I'm giving you incorrect info it's because I don't
understand you correctly, sorry. But I'd still like to
know what you want because I have begun to write
this type of software myself (mdlparser.r @ rebol.org).
This is where rebol will have to shine if it's going to do
anything within the "internet or web" arena. Get rid of
the heavy-wear and let the client side do all (some) of
the tricks. You know, it's funny, that's what Rebol/Core
was all about.
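To make the hide, adjust, then show sequence above a bit more concrete, here is a small JavaScript sketch. It is only an illustration under assumptions: the function names are invented, the "thepage" id is the one used in the message, and the footnote markup is made up. The function takes the document as a parameter, so it can be tried outside a browser with a stand-in object.

```javascript
// Escape text so it can be placed inside HTML safely.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical helper: append a footnote to the hidden <body id="thepage">
// and then make the page visible. `doc` is the browser's `document`
// (or any object with a compatible getElementById).
function addFootnoteAndReveal(doc, footnoteText) {
  const body = doc.getElementById("thepage");
  if (!body) return false; // page was not prepared with the expected id
  body.innerHTML += '<p class="footnote">' + escapeHtml(footnoteText) + "</p>";
  body.style.visibility = "visible";
  return true;
}

// In a browser this might be called once the page has loaded, e.g.
//   window.onload = function () { addFootnoteAndReveal(document, "My note"); };
```

Note that this still depends on the page itself carrying the id and the script, which is exactly the limitation discussed in this message.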
[11/14] from: antonr::lexicon::net at: 2-Apr-2005 23:28
Ok,
Now it sounds (as inetw3 suggested) like you want a proxy.
If that's true then you don't need the rebol plugin.
The rebol plugin is just going to make things more complicated,
I think, since it's not very stable. It's easier to
find a stable Rebol/Core and program it to act as a proxy.
The program will listen on a port (say 8080) for requests
from the client, pass the requests out on port 80,
get the results, process the results in some way, then
hand the result back to the client.
I did this a few years ago when I was mad at webpages
returning stark white backgrounds all the time.
So check out:
http://www.lexicon.net/antonr/rebol/web/no-white-web-proxy.r
based on webserver.r from the script library.
Also check out the latest incarnation of webserver.r at rebol.org.
That probably has some interesting differences from
the version of webserver.r I used, which is more than three years old.
Regards,
Anton.
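The listen, forward, process, hand-back loop described above can be sketched roughly as follows. This is JavaScript for Node.js, not REBOL, and it is not no-white-web-proxy.r; the names processPage and createProxy and the footnote text are invented for the example. It assumes plain HTTP GET requests, a browser configured to use the proxy (so the request line carries the full URL), and UTF-8 pages.

```javascript
const http = require("http");

// Hypothetical adjustment step: insert a footnote just before </body>,
// or at the end if the page has no closing body tag.
function processPage(html) {
  const footnote = '<p class="footnote">Adjusted by the local proxy.</p>';
  const match = /<\/body\s*>/i.exec(html);
  if (!match) return html + footnote;
  return html.slice(0, match.index) + footnote + html.slice(match.index);
}

// Build (but do not start) a very small forward proxy.
function createProxy() {
  return http.createServer((clientReq, clientRes) => {
    // Only absolute http:// URLs are handled in this sketch.
    if (!/^http:\/\//i.test(clientReq.url)) {
      clientRes.writeHead(400);
      clientRes.end("Expected an absolute http:// URL");
      return;
    }
    // Pass the request out to the real web server.
    http.get(clientReq.url, (upstream) => {
      const type = upstream.headers["content-type"] || "";
      if (!type.startsWith("text/html")) {
        // Images, stylesheets, etc. go back to the client untouched.
        clientRes.writeHead(upstream.statusCode, upstream.headers);
        upstream.pipe(clientRes);
        return;
      }
      // Collect the whole page, adjust it, then hand it to the client.
      const chunks = [];
      upstream.on("data", (chunk) => chunks.push(chunk));
      upstream.on("end", () => {
        const page = processPage(Buffer.concat(chunks).toString("utf8"));
        clientRes.writeHead(upstream.statusCode, { "Content-Type": type });
        clientRes.end(page);
      });
    }).on("error", () => {
      clientRes.writeHead(502);
      clientRes.end("Upstream request failed");
    });
  });
}

// To run it on port 8080: createProxy().listen(8080);
```

All the page-specific work lives in processPage, so swapping the footnote for summarizing or any other adjustment would not touch the networking part.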
[12/14] from: oliveirard::yahoo::com::br at: 2-Apr-2005 14:03
Well... First I would like to thank you all for the attention you're
giving me. Since my question maybe isn't quite a REBOL issue, you're
being very kind. Other forums wouldn't even give me an answer. So, I
will try to answer Anton, inetw3 and Ingo in this e-mail, ok?
Anton, at first I was looking for a proxy to do this. I chose Squid
(http://www.squid-cache.org/) because it's free and can act as a
transparent proxy, which saves me from having to update each browser's
configuration to send the requests through it. I was wondering if I could
program something like a module for Squid that could make the
adjustments to the pages. But unfortunately, it can't be done. As stated at
http://www.squidguard.org/intro/, "neither squidGuard nor Squid can be
used to filter/censor/edit text inside documents or embeded scripting
languages". Although I could try to build a proxy that does that, as you
suggest, it would be extremely error-prone. I'd like to reuse things, as
I was trying to do with Squid. Anyway, my plans for this program are broad
enough to congest the proxy server. That's why I thought it should be done
on the client side.
inetw3, as you said, "No matter how you intercept the html data,
something must be written in the html to start this process; cgi, asp,
php, perl, javascript, etc.". But that's the problem: I don't want to
change the original HTML pages on the web servers, like inserting
<object> tags. I hope there's a way.
Ingo, my last hope of doing this on the client side was building an
extension for Firefox. I know there will be a loss of compatibility
(since MANY still use IE), but I'm almost accepting that as it's free.
And your suggestion about Greasemonkey was great! I don't know why yet,
but the "options" for the extension aren't working. Everything there is
disabled. Anyway, I will take a good look at its code.
Thank you all guys! If any other suggestion comes, it's VERY WELCOME.
Best wishes,
Rodrigo
[13/14] from: antonr::lexicon::net at: 3-Apr-2005 20:21
Hi Rodrigo,
First, thanks for your thanks.
It's good to be thanked, so thank you.
Second, I would think that modifying no-white-web-proxy.r
would be easier than writing a mozilla extension,
because the code is so short. Depending on your
requirements, you would probably cut out some
of the basic ad filtering I do in there and make it
even shorter. As I remember, it was also fairly
stable.
However, it would also be good if there was
another person around with experience in building
mozilla extensions, so I'm not complaining!
Anton.
[14/14] from: inetw3::mindspring::com at: 9-Apr-2005 20:05
Hey Rodrigo,
I wanted to shine some more light on the subject
of retrieving info from the http request, creating
footnotes, headers, summaries, etc., and then
presenting the web page.
Another idea (if you haven't found one already):
Have a general page anyone can browse to that
has the rebol/plug-in tagged in it already.
Make the plug-in viewable as an address bar and
go button. Type in the page you want to go to,
and let the plug-in retrieve it, not the browser, i.e.
myinfo: read http://www.getmyinfo.com
Then parse out the info you need and add or change it with
your plug-in script, i.e.
append myinfo {<footer>tadah</footer>}
Save this updated page to your "c:/program files/temp":
getinfo: to-file %"/c/program files/temp/cool.html"
write getinfo myinfo
Then have the rebol/plug-in load this page into the browser.
I believe you can make it load into the opened webpage, i.e.
browse getinfo
A suggestion: also put your rebol/plug-in object into each
newly created page. That way you don't have to go back to
the original page just to get new or more info.