Shortage of network connections
[1/4] from: sqlab::gmx::net at: 15-Mar-2002 13:21
Hello,

I have been using Rebol for a few years to transfer messages between PCs and less common systems, where I wrote the servers on the non-PC side. Only recently have I also started using it to transfer messages via TCP/IP between PCs. There I observe connection errors in more than 10% of all attempts. Of course I handle that and retry. But I can see a whole stack of orphaned socket connections, and I am therefore worried that I will run into a buffer or socket shortage or similar problems sooner or later.

Has anyone seen the same effect, and what can you do under Windows (NT, 2000, and so on) to overcome the problem? Should I use an open mode other than open/binary with Rebol, or are there some tweaks that force Windows to clean up socket connections in the CLOSE_WAIT state?

TIA
AR
[2/4] from: holger:rebol at: 15-Mar-2002 6:13
On Fri, Mar 15, 2002 at 01:21:05PM +0100, [sqlab--gmx--net] wrote:
> Hello
> I am using Rebol since a few years for transferring messages between pcs and
<<quoted lines omitted: 9>>
> some tweaks forcing Windows to do a housekeeping on socket connections in
> CLOSE_WAIT state?
Having a socket in CLOSE_WAIT for an extended period of time is the result of bugs in the kernel, either on your end or on the other end. There is nothing you can do about it, but it should not be harmful either. Kernels can handle tens of thousands of socket connections, and connections in CLOSE_WAIT do not count towards the per-application socket limit, so applications should not be affected. (Contrast CLOSE_WAIT with TIME_WAIT, which is completely normal, necessary, not the result of bugs, and only lasts for a minute or so.)

All of this is unrelated to REBOL or any modes/options set in REBOL.

--
Holger Kruse [kruse--nordicglobal--com]
[3/4] from: sqlab:gmx at: 18-Mar-2002 8:57
Holger, thanks for your reply. Unfortunately, I am not totally convinced.
> On Fri, Mar 15, 2002 at 01:21:05PM +0100, [sqlab--gmx--net] wrote:
> > Hello
<<quoted lines omitted: 23>>
> bugs in the kernel, either on your end or on the other end. There is
> nothing
After closing Rebol, all orphaned sockets in CLOSE_WAIT are gone. It seems that Rebol frees all the TCP/IP resources it used, even though it does not report all sockets to the requesting application.
> you can do about it, but it should not be harmful either. Kernels can
> handle
> tens of thousands of socket connections, and connections in CLOSE_WAIT do
> not
I expect my applications to run for months without manual intervention, and yes, a few hundred thousand sockets will easily be opened and closed during the lifetime of the application.
> count towards the per-application socket limit, so applications should not
> be
> affected. (Contrast CLOSE_WAIT to TIME_WAIT, which is completely normal,
CLOSE_WAIT should mean that the other side has closed the connection (it is in FIN_WAIT_2) and is waiting for me to close my end. In the past, and on other systems, I ran into problems when closing a listening socket while some derived FIN_WAIT_2 sockets still existed: I could not immediately reopen a listening socket on the same port. So even though I am not in trouble now, I also don't want to block peers.
> necessary, not the result of bugs, and only last for a minute or so).
>
> All of this is unrelated to REBOL or any modes/options set in REBOL.
Rebol reports a connection error to the application, either because the previous close was unsuccessful and the error was delayed, or because the TCP/IP stack completed the connection successfully but Rebol did not wait long enough to get all the connection information. Of course, there could also be an error in the peer software. As soon as I get my hands on it, I will check this too. Maybe it accepts a connection and cancels it immediately. But why would someone do that? And if so, I would expect a successful open and a failure later. Anyway, I think I have to clear this up before using Rebol for the intended purpose.

AR
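A minimal sketch of the CLOSE_WAIT situation discussed in this thread, written in Python rather than REBOL and using a hypothetical local listener on an ephemeral port. It makes no claim about how REBOL handles ports internally. It only shows the general TCP behaviour: when the peer closes first, the local socket sits in CLOSE_WAIT until the application notices the end-of-stream read and closes its own end.

```python
import socket


def serve_one(listener):
    """Accept one connection, read until the peer closes, then close our end."""
    conn, _ = listener.accept()
    received = b""
    while True:
        chunk = conn.recv(4096)
        if not chunk:
            # Empty read: the peer has sent its FIN. From here until we call
            # close(), this socket is in CLOSE_WAIT on our side.
            break
        received += chunk
    conn.close()  # closing our end lets the socket leave CLOSE_WAIT
    return received


def close_wait_demo():
    """Hypothetical demo: the client closes first, the server then closes too."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"hello")
    client.close()  # peer closes first

    data = serve_one(listener)
    listener.close()
    return data
```

If the `conn.close()` call were skipped, the accepted socket would remain in CLOSE_WAIT for as long as the process keeps it open, which matches the observation that the orphaned sockets vanish once the process exits.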
[4/4] from: sqlab:gmx at: 12-Apr-2002 9:56
A few weeks ago I complained about problems with unsuccessful TCP/IP connections. I said I would check the problem in greater detail on the other side too. After gaining access to the host, I could see that it was mainly a problem with Rebol alone.

I therefore wrote a client test program that does nothing but open a socket, transfer some bytes, read some bytes and close the socket again. Doing this in a loop very soon leads to connection errors. But if I place a wait before the next open, I can open, exchange messages and close connections as often as I like, even if I use "wait 0". Apparently wait clears or renews some conditions needed for successful TCP/IP connections under Win/NT.

If I remember correctly, other members of the mailing list have reported similar observations. They only succeeded after adding an otherwise useless wait 0.

So what does "wait 0" do, and should clearing the network status not be part of normal port handling?

AR
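The test loop described in this message can be sketched as follows. This is Python, not the original REBOL program, and the echo server, function names and port choice are all hypothetical stand-ins for the real peer. The optional pause is only a rough analogue of REBOL's "wait 0" and is not claimed to reproduce its effect.

```python
import socket
import threading
import time


def echo_server(listener, stop):
    """Hypothetical stand-in for the peer: echo one message per connection."""
    listener.settimeout(0.2)
    while not stop.is_set():
        try:
            conn, _ = listener.accept()
        except socket.timeout:
            continue  # no client yet; check the stop flag again
        with conn:
            data = conn.recv(1024)
            if data:
                conn.sendall(data)


def run_trials(host, port, count, pause=None):
    """Open, send, read and close `count` times; return the number of failures."""
    failures = 0
    for _ in range(count):
        try:
            with socket.create_connection((host, port), timeout=2) as sock:
                sock.sendall(b"ping")
                if sock.recv(1024) != b"ping":
                    failures += 1
        except OSError:
            failures += 1  # connection or transfer error
        if pause is not None:
            time.sleep(pause)  # rough analogue of "wait 0", not equivalent
    return failures


def wait_loop_demo(count=20):
    """Run the loop without and with a zero-length pause against a local server."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    stop = threading.Event()
    worker = threading.Thread(target=echo_server, args=(listener, stop), daemon=True)
    worker.start()
    try:
        port = listener.getsockname()[1]
        without_pause = run_trials("127.0.0.1", port, count)
        with_pause = run_trials("127.0.0.1", port, count, pause=0)
        return without_pause, with_pause
    finally:
        stop.set()
        worker.join()
        listener.close()
```

Comparing the two failure counts against the real host, rather than a local echo server, would be the way to check whether a pause between connections makes any difference on a given platform.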