
World: r4wp

[Rebol School] REBOL School

Sujoy
10-Oct-2012
[1219x2]
unfortunately, i dont have permissions to send files over altme. 
:(
how else can i send? email?
have sent you a mail at nr at red-lang dot org
DocKimbel
10-Oct-2012
[1221]
Ok, from the UniServe folder, this code works:

    uniserve-path: %./
    do %uni-engine.r
    uniserve/boot

I had to change your absolute path in %reminder.r to:

- line 11:   do uniserve-path/libs/scheduler.r
- line 24:   feeds: load uniserve-path/docs/feeds.r
Sujoy
10-Oct-2012
[1222x2]
trying this right now...
damn! no luck.

>> ls
BSD-License.txt  change-log.txt   clients/         docs/
handlers/        libs/            protocols/       services/
uni-engine.r
>> uniserve-path: %./
== %./
>> do %uni-engine.r
Script: "UniServe kernel" (17-Jan-2010)
Script: "Encap virtual filesystem" (21-Sep-2009)
== true
>> uniserve/boot
booya
.

http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/business/rss.xml
** Script Error: Cannot use path on none! value
** Where: process-task
** Near: if any [
    zero? shared/pool-max
    shared/pool-max > shared/pool-count
] [fork]
either
DocKimbel
10-Oct-2012
[1224]
Ah, I stopped at "booya". :-)
Sujoy
10-Oct-2012
[1225x2]
:)
is this because pool-list is empty?

i put in a debug "print" cmd in the on-new-client function of task-master.r, 
which is the only place i could see pool-list being appended to...but 
it seems the function is not called
on-new-client: has [job][
  ;added this line
  print client/remote-ip
  if client/remote-ip <> 127.0.0.1 [close-client exit]
  set-modes client [keep-alive: on]
  client/timeout: 15
  client/user-data: make task []
  ;only place where pool-list is appended to...
  append pool-list :client
DocKimbel
10-Oct-2012
[1227]
No, the issue is with 'shared being reset to 'none in %task-master. It looks 
like a regression in Uniserve when running standalone. I'm looking 
into it.
Sujoy
10-Oct-2012
[1228]
thanks doc!
DocKimbel
10-Oct-2012
[1229]
In %reminder.r, you shouldn't use: scheduler/wait. Uniserve is already 
providing an event loop. You need to remove that line.
Sujoy
10-Oct-2012
[1230x2]
ok...
removed the scheduler/wait line...now:

>> uniserve-path: %./
== %./
>> do %uni-engine.r
Script: "UniServe kernel" (17-Jan-2010)
Script: "Encap virtual filesystem" (21-Sep-2009)
== true
>> uniserve/boot
booya
** Script Error: Invalid path value: server-ports
** Where: reform
** Near: mold any [uniserve/shared/server-ports port-id]
>>
DocKimbel
10-Oct-2012
[1232]
I've just pushed a fix for that to Cheyenne SVN repo on Google code.
Sujoy
10-Oct-2012
[1233]
thanks doc...downloading now...
DocKimbel
10-Oct-2012
[1234]
From that, it seems to work until the job event is raised, then the 
server crashes (not sure if it's your code, scheduler or Uniserve 
that causes that).
Sujoy
10-Oct-2012
[1235x3]
:(
i'm actually trying to do something really simple
i have a bunch of feeds i want to download

i can do that sequentially (foreach feed feeds [...]), but thought 
it best to use background worker processes via task-master to download 
instead
is there an alternative?
or a better way of writing this using uniserve?
this is what i get with the latest from googlecode:

>> uniserve-path: %./
== %./
>> do %uni-engine.r
Script: "UniServe kernel" (17-Jan-2010)
Script: "Encap virtual filesystem" (21-Sep-2009)
== true
>> uniserve/boot
booya

10/10-18:37:48.883-## Error in [uniserve] : Cannot open server reminder 
on port 9000 !

10/10-18:37:48.884-## Error in [uniserve] : Cannot open server task-master 
on port 9799 !
== none
>>
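
The sequential approach mentioned above (foreach feed feeds [...]) could look roughly like the sketch below. It assumes `feeds` is a plain block of URLs; the real layout of docs/feeds.r is not shown in this log, so the block here is a stand-in that reuses the BBC feed URL from the earlier session. `results` is a hypothetical name. Wrapping `read` in `try` keeps one failed download from stopping the loop.

```rebol
REBOL [Title: "Sequential feed download sketch"]

; Assumption: feeds is a block of url! values (format of feeds.r unknown)
feeds: [
    http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/business/rss.xml
]

results: copy []    ; hypothetical store of url/content pairs

foreach feed feeds [
    either error? try [data: read feed] [
        ; network or HTTP error: report it and move on to the next feed
        print ["download failed:" feed]
    ][
        append results reduce [feed data]
    ]
]

print [(length? results) / 2 "feed(s) downloaded"]
```

This blocks on each feed in turn, which is the cost that an async HTTP client or worker processes would remove.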
DocKimbel
10-Oct-2012
[1238x2]
Uniserve task-master is mainly meant for server-side parallel request 
processing. For your need, you should use an async HTTP client instead, 
which would be a much simpler solution.
"Cannot open...": you need to close any previous Uniserve session.
Sujoy
10-Oct-2012
[1240]
sorry - just killed all previous Uniserve sessions. now get:

>> uniserve-path: %./
== %./
>> do %uni-engine.r
Script: "UniServe kernel" (17-Jan-2010)
Script: "Encap virtual filesystem" (21-Sep-2009)
== true
>> uniserve/boot
booya
** Script Error: Invalid path value: conf-file
** Where: on-started
** Near: if all [
    uniserve/shared
    file: uniserve/shared/conf-file
] [
    append worker-args reform [" -cf" mold file]
]
>>
DocKimbel
10-Oct-2012
[1241]
Are you running from SVN repo, or a copy of Uniserve folder?
Sujoy
10-Oct-2012
[1242]
a copy of the Uniserve folder...
DocKimbel
10-Oct-2012
[1243x2]
This looks like Cheyenne-dependent code...
But you should *really* use an async HTTP client, that's the best 
solution for your need (multiple HTTP downloads at the same time).
Sujoy
10-Oct-2012
[1245x2]
hmmm. ok...will work on this and get back to you
thanks for the time Doc
(cant wait to see Cheyenne on Red ;)
DocKimbel
10-Oct-2012
[1247]
Well, you might see some micro-Cheyenne before Christmas. ;-)
Sujoy
10-Oct-2012
[1248x4]
best christmas ever!
just to persist with using uniserve...i think i may be 
getting there

>> uniserve-path: %./
== %./
>> do %uni-engine.r
Script: "UniServe kernel" (17-Jan-2010)
Script: "Encap virtual filesystem" (21-Sep-2009)
== true
>> uniserve/boot
booya
127.0.0.1
127.0.0.1
== none
>>

i commented out the lines from on-started:

on-started: has [file][
	worker-args: reform [
		"-worker" mold any [in uniserve/shared 'server-ports port-id]		;TBD: fix shared object issues
	]
	if not encap? [
		append worker-args reform [" -up" mold uniserve-path]
		if value? 'modules-path [
			append worker-args reform [" -mp" mold modules-path]
		]
		if all [
			uniserve/shared
			;file: uniserve/shared/conf-file
		][
			;append worker-args reform [" -cf" mold file]
		]
	]
	if integer? shared/pool-start [loop shared/pool-start [fork]]
]

...since conf-file is cheyenne specific


i think maybe the scheduler is killing UniServe - it exits while 
returning none...
nope - the scheduler is just fine...

i'm now thinking it may have to do with using the shared/do-task 
in the on-load function...
nope
will take doc's advice and do something simpler
Kaj
10-Oct-2012
[1252]
If you're using R3 or Red/System, you could use the cURL binding 
in multi-mode
DocKimbel
10-Oct-2012
[1253]
Sujoy: have a look at this description of one of the async HTTP clients 
available: http://stackoverflow.com/questions/1653969/rebol-multitasking-with-async-why-do-i-get-invalid-port-spec
Endo
10-Oct-2012
[1254x2]
Doc, I reported that problem before, remember? We agreed on the 
fix:

in task-master.r

line 135: if all [uniserve/shared in uniserve/shared 'conf-file file: uniserve/shared/conf-file][
	append worker-args reform [" -cf" mold file]
]

and on line 123: all [in uniserve/shared 'server-ports uniserve/shared/server-ports]


Endo: "without these patches latest UniServe cannot be used alone. 
because it fails to start task-master. Ofcourse I need to remove 
logger, MTA etc. services." - 19-Dec-2011 2:50:29
Dockimbel: "I agree about your changes." - 19-Dec-2011 2:50:56
I think it is the same problem for Sujoy. (better to move to the Cheyenne 
group)
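
The guard in that fix can be tried in isolation. The sketch below uses a hypothetical `shared` object as a stand-in for uniserve/shared when no Cheyenne configuration is present; the field names inside it are assumptions for illustration. It shows why the unguarded path raises "Invalid path value" and why adding `in shared 'conf-file` to the `all` block avoids it: `in` returns none for a missing field, so `all` stops before the path is evaluated.

```rebol
REBOL [Title: "Guarding optional object fields with IN"]

; Hypothetical stand-in for uniserve/shared without Cheyenne:
; pool settings are present, conf-file is not.
shared: make object! [pool-start: 2 pool-max: 4]

; Unguarded access fails, as in the session log above
print error? try [shared/conf-file]     ; prints true

; Guarded access: IN yields none for the missing field, so ALL
; short-circuits and shared/conf-file is never evaluated
worker-args: copy "-worker"
if all [
    shared
    in shared 'conf-file
    file: shared/conf-file
][
    append worker-args reform [" -cf" mold file]
]
print worker-args                       ; prints -worker
```

The same pattern applies to the `server-ports` check on line 123.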
Sujoy
10-Oct-2012
[1256x5]
Thanks Endo...I am still keen on using uniserve - will get there 
eventually!
i have another issue - and need help from a parse guru
i'm trying to extract article text from an awfully written series 
of html pages - one sample:


http://www.business-standard.com/india/news/vadra-/a-little-helpmy-friends//489109/
there are 160 </table> tags!!
worse, article contents are scattered throughout the html mess
using beautifulsoup in python however, i can do the following:

from bs4 import BeautifulSoup as bs
import urllib2


uri = "http://www.business-standard.com/india/news/vadra-/a-little-helpmy-friends//489109/"
soup = bs(urllib2.urlopen(uri).read())

p = soup.find_all('p')
# drop the tables, scripts and styles nested inside the <p> element
[s.extract() for s in soup.p.find_all('table')]
[s.extract() for s in soup.p.find_all('script')]
[s.extract() for s in soup.p.find_all('style')]

text = bs(''.join(str(p))).get_text()

...and this gives me exactly what is required...

just want to do this in Rebol! ;)
Endo
10-Oct-2012
[1261x2]
just a quick answer, to give you an idea, I've used following to 
extract something from a web page:
b: [] parse/all mypage [
    any [
        thru {<span class="dblClickSpan"} thru ">" copy t to </span>
        (append b trim/lines t) 7 skip
    ]
]
"7 skip" is to skip the </span> tag.
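
A self-contained variant of that parse rule, run against a small sample string, might look like the sketch below. The sample markup and the names `page`, `items` and `text` are made up for illustration. It uses `thru "</span>"` in place of `7 skip`, which does the same job without counting characters, and it guards against empty spans, where `copy` sets the word to none in Rebol 2.

```rebol
REBOL [Title: "Extracting span contents with parse"]

; Made-up sample page: two matching spans, one with an extra attribute
page: {<p><span class="dblClickSpan" id="a">first item</span> filler <span class="dblClickSpan">second</span></p>}

items: copy []

parse/all page [
    any [
        thru {<span class="dblClickSpan"} thru ">"   ; skip the rest of the opening tag
        copy text to "</span>"                       ; grab everything up to the close tag
        (if text [append items trim/lines text])     ; text is none for an empty span
        thru "</span>"                               ; step over the close tag
    ]
]

probe items    ; prints ["first item" "second"]
```

`trim/lines` collapses line breaks and runs of spaces, which helps when the captured text is spread over several lines of markup.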
Sujoy
10-Oct-2012
[1263]
yeah - thanks Endo

that works great for well formed html docs - but this site is an 
absolute nightmare!
Kaj
10-Oct-2012
[1264]
I've used the HTML parser from PowerMezz to parse complex web pages 
like that
Sujoy
10-Oct-2012
[1265x2]
note from the python code that there are styles and javascript specified 
inside the <p> element!
i was wondering about Gabriele's HTML niwashi tree
never used the niwashi - Kaj, do you have a quick example for me 
to use?

i've got the docs open, but am maybe being obtuse - it is 230am here!
Kaj
10-Oct-2012
[1267]
It's a bit confusing to set up. I'll have a look
Sujoy
10-Oct-2012
[1268]
thanks Kaj
actually - thanks everyone for all your help on Rebol School