[REBOL] Re: The dark side of P2P column ...
From: holger:rebol at: 14-May-2001 16:39
On Mon, May 14, 2001 at 08:25:29PM +0200, Petr Krenzelok wrote:
> RT is advertising rebol as lightweight distributed system, where each client
> system is able to communicate. Now you can read what Jim Seymour thinks
> about P2P ...
Lightweight Distributed Computing is not the same as what most people these
days mean by "P2P", nor does it need P2P (meaning P2P in its usual sense, i.e.
using direct connections between clients, which is also how Jim Seymour
uses the term). "Being able to communicate" does not necessarily mean
being able to accept TCP connections and handle requests.
It is possible to set up LDC environments with lightweight servers, which
is exactly what REBOL/Express does, btw. In some environments it is also possible
to use completely different types of solutions, e.g. based on broadcasting
or multicasting, which do not really distinguish between clients and
servers in the usual way.
Jim Seymour is right on many points, and he does not even address all the
issues of P2P. There are some more. When we started designing REBOL/Express
and some other LDC features in REBOL we looked at transport-layer P2P very closely,
and decided not to use P2P (in its strict sense) as the primary transport and
session mechanism, because of its many problems.
P2P (at the transport and session level) is probably one of the most overhyped
concepts in recent years. It's funny though: that type of P2P communication has been
around for decades. Windows Network Neighborhood, for example, is P2P (and it is
actually REAL P2P, without the need for ANY server, unlike Napster, Gnutella,
Groove etc.). Still, P2P people don't seem to like it. I wonder why :-).
Every few years someone seems to come up with that "really new idea", P2P,
which is "so much better than server-based"; it stays around for a while
and then dies or is made obsolete by better server-based solutions. The only
P2P-based solutions which stay around for long are those designed exclusively
for LANs, using broadcasting or multicasting.
People don't seem to learn though, and after a while the cycle repeats, and
again someone somewhere tries to change the way transports and sessions work.
Somehow with P2P people always seem to look at the promised advantages but
completely ignore the overwhelming flaws and problems. Here are some:
1) P2P in the REALLY strict sense (i.e. no servers whatsoever) is impossible
on the Internet, because you need some way for one peer to "find" other
peers. That requires known, central addresses, i.e. servers. At the very least
you need "registration servers" where peers providing a service need to
register. Depending on the type of environment this can be a central server
(Napster), many servers which also act as clients (Gnutella) or something
based on existing central registration networks, e.g. DNS. The ONLY situation
where you can get away with not having any central servers at all is if you can
use broadcasting or multicasting, but this is not possible on the whole Internet
(yet). This means almost all so-called P2P solutions which span more than a
single LAN actually cheat. They are not really P2P at all, and in real life
they don't give you the advantages P2P promises in theory.
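The registration-server role described above can be sketched in a few lines. This is a hypothetical illustration (the `PeerRegistry` class and its methods are made up for this example, not taken from any real P2P product), but it shows the minimum a "serverless" network still needs: a well-known place where peers announce themselves so other peers can find them.

```python
# Minimal sketch of the "registration server" role: even a "serverless"
# P2P network needs a well-known address where peers announce themselves.
# (PeerRegistry is a hypothetical illustration, not a real protocol.)

class PeerRegistry:
    def __init__(self):
        # service name -> list of (host, port) addresses of peers offering it
        self._peers = {}

    def register(self, service, host, port):
        """A peer announces that it provides `service` at host:port."""
        self._peers.setdefault(service, []).append((host, port))

    def unregister(self, service, host, port):
        """A peer leaves the network (or simply goes offline)."""
        self._peers.get(service, []).remove((host, port))

    def lookup(self, service):
        """Another peer asks: who currently provides `service`?"""
        return list(self._peers.get(service, []))

registry = PeerRegistry()
registry.register("file-share", "10.0.0.5", 8001)
registry.register("file-share", "10.0.0.9", 8002)
print(registry.lookup("file-share"))
# Without this central lookup step, peers have no way to find each other.
```

Whether this registry is one central machine (Napster), many cooperating nodes (Gnutella) or an existing network like DNS, it is still a server in the relevant sense.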
2) If you need central servers anyway then the often cited argument that P2P
provides additional robustness or redundancy becomes untrue. Actually in P2P
both robustness and redundancy are usually worse than in server-based
solutions: you have the same robustness/redundancy bottlenecks at the
registration servers, plus the added problem that your "serverized clients"
are not run like "real" servers: they tend to have more outages, in particular
if they are really just "serverized clients" behind a DSL line, and outages
tend to last longer (no RAID etc.).
3) Bandwidth: there is not only the obvious problem that most DSL, Cable and
modem lines are asymmetric, but there is also the bigger problem that with P2P
more connections into the "outlying areas" of the Internet are needed to
transmit the same amount of data to the same number of people. Example:
transmitting a 10 MB file to 100 users using a well-connected server means you need
10 MB * (1 + 100) = 1010 MB of bandwidth on relatively slow ("client") lines:
one upload and 100 downloads. Transmitting the same file by P2P means you
need 10 MB * (100 + 100) = 2000 MB, almost twice the bandwidth. So the effect is
not just that the ratio of uploads to downloads changes, but you also increase
the total amount of bandwidth across client lines that you need. Yes, you save the
server bandwidth, but server bandwidth is cheap, easily available, easy to optimize,
tune, replicate, scale based on demand etc. Client bandwidth is not, and likely won't
be for a long time.
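The back-of-the-envelope arithmetic above can be checked in a few lines (assuming, as in the example, that we count only traffic crossing the slow "client" lines, and that every transfer of the file costs its full size):

```python
# Reproduce the bandwidth figures from the example: distributing a
# 10 MB file to 100 users, counting only traffic on "client" lines.

file_mb = 10
users = 100

# Server-based: the publisher uploads once to the well-connected server,
# then each user downloads once. Server-side bandwidth is not counted.
server_based = file_mb * (1 + users)   # 1 upload + 100 downloads

# P2P: every copy travels client-to-client, so each of the 100 copies
# crosses a client line twice (once up at the sender, once down at the
# receiver).
p2p = file_mb * (users + users)        # 100 uploads + 100 downloads

print(server_based)          # 1010 MB
print(p2p)                   # 2000 MB
print(p2p / server_based)    # ~1.98: almost twice the client-line bandwidth
```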
4) Bandwidth (again): a LOT of systems these days are limited to 28800 (or 33600)
bps in the upload direction, and this is not likely to change soon. Yes, DSL and
Cable are catching on, but not quickly enough, and not in the very quickly
growing mobile market. For the majority of connected devices high-speed Internet
won't become a reality for many years.
5) Mobile environments, uptime: In order to transfer something by P2P BOTH sides
need to be up at the same time, a big problem in the mobile market. Handheld
devices make very bad P2P service providers because they get disconnected too
quickly (drive through a tunnel, ride a subway or plane, battery empty etc.).
With server-based solutions only one side has to be online at a time, whereas
with P2P a time slot has to be found where both are online.
6) Firewalls, proxy servers, Intranets: server-based solutions are compatible
with the majority of security systems out there (most compatible: HTTP tunnelling
from client to server, which is what REBOL/Express uses as its default
transport scheme). P2P solutions that require reverse connections usually
cannot traverse firewalls or proxy servers. That is a huge problem for some
companies out there with products that are exclusively based on P2P transports.
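Why client-to-server HTTP tunnelling gets through where reverse connections do not can be shown with a generic sketch (this illustrates the general idea only, not REBOL/Express's actual protocol): every connection is opened outbound by the client, which firewalls and proxies typically permit, whereas P2P needs inbound connections to the client, which they typically block.

```python
# Generic illustration of client-to-server HTTP tunnelling: the client
# polls the server with ordinary outbound HTTP requests, so to any
# firewall in between the traffic looks like normal web browsing.
# The server never has to connect back to the client.
# (Illustration only; not REBOL/Express's actual protocol.)

import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class PollHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The server only ever answers requests it receives.
        body = b"pending-messages-for-client"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

# Stand-in for the well-connected server (port 0 = pick a free port).
server = HTTPServer(("127.0.0.1", 0), PollHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client initiates the connection; nothing inbound is required.
reply = urlopen(f"http://127.0.0.1:{server.server_port}/").read()
print(reply.decode())
server.shutdown()
```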
7) Accounting, monitoring: usually someone in your company wants to know how a
service is used, i.e. how many transfers take place, what is transferred, who
transfers data, etc., to get a sense of customer demand, to improve marketing
etc. In server setups you just look at the server logs; in P2P setups you,
well, can't: there are no central logs to look at.
8) Security: securing the machine providing the service, as Jim described, is
only part of the problem. Another part is access rights: How do you do that
in distributed systems? How do you revoke the privileges of, say, an employee
who has left a company? Another problem is certification: P2P environments
make it significantly more difficult to deal with object or source certificates.
Instead of trusting server certificates you would have to start using object-based
certificate chains, and both Java and PGP show what a pain that tends to be,
compared to the (transparent and seamless) server certificates SSL uses.
9) Content management: how do you manage your content? How do you handle
versioning? Updates? Revocations? Quality control? For server-based setups you
can establish administrative procedures for this. For P2P you have no control
over who downloads what version of what product when. It is already a problem
with FTP servers which have automatic replication, but in P2P setups it becomes
much worse.
To me it seems extremely foolish to blindly jump onto this P2P bandwagon, like
some companies are doing, and this is not what LDC is about.
P2P people are concentrating on the wrong thing: the technical aspects of the
physical connection. Those aspects were solved 30 years ago, and the resulting
solution (TCP, server-based) is just fine and usually superior to P2P. Reinventing
the wheel, but worse, seems pointless to me.
The problem these days is above the transport level. It is extremely difficult
to design network applications that are small, functional, secure, compatible among
platforms and versions, easy to debug, easy to upgrade etc. This is where LDC comes in.
LDC is not about the technical aspects of physical connections. It is about
how to exchange data, how to format data, how to secure data, how to parse
data, how to interpret data; how to quickly and easily design software that
allows multiple machines to communicate, and to do so on a large number of
platforms; how to visualize received data and interact with the user; how to
add networking capabilities to existing applications in an easy way, with a
single implementation that works on all platforms, without spending months
designing ".NET" spec files, ASN.1 descriptions, Java class hierarchies etc.;
and how to send data that is smart, in real time, instead of having to design,
implement and debug applications that are smart and predict what data they
will be used with years before the first byte of data is ever exchanged. Long
sentence :-).
REBOL provides a single, consistent environment that can be used in ALL parts of
the networked application, including GUI creation, user input, data generation,
information exchange, communication with back-end servers, response generation,
encryption, parsing, data visualization, data storage etc. Keeping data and
application models consistent throughout the application and across platforms
saves time, reduces opportunities for bugs, and simplifies and encourages the
development of networked applications of all kinds. THAT is what LDC is about.
As for LDC and P2P: we believe in peer-to-peer as something that has to happen
at a much higher level: the concept should exist all the way up at the application
layer, not down at the transport layer, as with most so-called P2P environments.
Which transport mode is used for any individual application should be
irrelevant, and should certainly not spawn religious discussions. It is easy
to write applications which actually support more than one transport mode,
e.g. server-based, peer-to-peer, broadcast/multicast etc. all in one application.
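The claim that one application can support several transport modes can be sketched as a small abstraction (the `Transport` classes below are hypothetical illustrations, not REBOL APIs): the application-level code stays identical no matter which delivery mechanism sits underneath.

```python
# Sketch of transport-independence: hide the delivery mechanism behind
# one small interface, and the application code no longer cares whether
# a message travels via a server, a direct peer connection, or broadcast.
# (These classes are hypothetical illustrations, not a REBOL API.)

class ServerTransport:
    def send(self, recipient, data):
        return f"via server -> {recipient}: {data}"

class DirectTransport:
    def send(self, recipient, data):
        return f"direct p2p -> {recipient}: {data}"

class BroadcastTransport:
    def send(self, recipient, data):
        return f"broadcast (all peers see): {data}"

def deliver(transport, recipient, data):
    # Application-level code: identical regardless of transport mode.
    return transport.send(recipient, data)

for t in (ServerTransport(), DirectTransport(), BroadcastTransport()):
    print(deliver(t, "john-in-la", "hello"))
```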
We like to think that LDC is a "better P2P" in the sense that it provides P2P
functionality where it counts: at the point where applications interact. REBOL
simplifies the development of networked applications. LDC allows the
development of applications that allow Jim in Houston to easily communicate
with John in Los Angeles: peer-to-peer communication at the application level,
without worrying how this actually works "under the hood", i.e. whether servers
are involved or not.