
[REBOL] Re: The dark side of P2P column ...

From: holger:rebol at: 14-May-2001 16:39

On Mon, May 14, 2001 at 08:25:29PM +0200, Petr Krenzelok wrote:
> Hi,
>
> RT is advertising rebol as lightweight distributed system, where each client
> system is able to communicate. Now you can read what Jim Seymour thinks
> about P2P ...
>
> http://www.zdnet.com/pcmag/stories/opinions/0,7802,2711706,00.html
Lightweight Distributed Computing is not the same as what most people these days mean by "P2P", nor does it need P2P (meaning P2P in its usual sense, i.e. using direct connections between clients, which is also how Jim Seymour uses the term). "Being able to communicate" does not necessarily mean being able to accept TCP connections and handle requests. It is possible to set up LDC environments with lightweight servers, which is exactly what REBOL/Express does, btw. In some environments it is also possible to use completely different types of solutions, e.g. based on broadcasting or multicasting, which do not really distinguish between clients and servers in the usual way.

Jim Seymour is right on many points, and he does not even address all the issues of P2P. There are more. When we started designing REBOL/Express and some other LDC features in REBOL, we looked at transport-layer P2P very closely, and decided not to use P2P (in its strict sense) as the primary transport and session mechanism, because of its many problems.

P2P (at the transport and session level) is probably one of the most overhyped concepts of recent years. It's funny, though: that type of P2P communication has been around for decades. Windows Network Neighborhood is P2P, for example (and it is actually even REAL P2P, without the need for ANY server, unlike Napster, Gnutella, Groove etc.). Still, P2P people don't seem to like it. I wonder why :-). Every few years someone comes up with that "really new idea" P2P which is "so much better than server-based"; it stays around for a while and then dies or is obsoleted by better server-based solutions. The only P2P-based solutions which stay around for long are those designed exclusively for LANs, using broadcasting or multicasting. People don't seem to learn, though, and after a while the cycle repeats, and again someone somewhere tries to change the way transports and sessions work.
Somehow with P2P people always seem to look at the promised advantages but completely ignore the overwhelming flaws and problems. Here are some:

1) P2P in the REALLY strict sense (i.e. no servers whatsoever) is impossible on the Internet, because you need some way for one peer to "find" other peers. That requires known, central addresses, i.e. servers. At the very least you need "registration servers" where peers providing a service need to register. Depending on the type of environment this can be a central server (Napster), many servers which also act as clients (Gnutella), or something based on existing central registration networks, e.g. DNS. The ONLY situation where you can get away with not having any central servers at all is if you can use broadcasting or multicasting, but this is not possible on the whole Internet (yet). This means almost all so-called P2P solutions which span more than a single LAN actually cheat. They are not really P2P at all, and in real life they don't give you the advantages P2P promises in theory.

2) If you need central servers anyway, then the often-cited argument that P2P provides additional robustness or redundancy becomes untrue. Actually in P2P both robustness and redundancy are usually worse than in server-based solutions: you have the same robustness/redundancy bottlenecks at the registration servers, plus the added problem that your "serverized clients" are not run and maintained like "real" servers, i.e. they tend to have more outages, in particular if they are really just "serverized clients" behind a DSL line, and outages tend to last longer (no RAID etc.).

3) Bandwidth: there is not only the obvious problem that most DSL, cable and modem lines are asymmetric, but also the bigger problem that with P2P more connections into the "outlying areas" of the Internet are needed to transmit the same amount of data to the same number of people.
Example: transmitting a 10 MB file to 100 users via a well-connected server means you need 10 MB * (100 + 1) = 1010 MB of bandwidth on relatively slow ("client") lines: one upload and 100 downloads. Transmitting the same file by P2P means you need 10 MB * (100 + 100) = 2000 MB, almost twice the bandwidth, because every transfer now has both its upload and its download on a client line. So the effect is not just that the ratio of uploads to downloads changes; you also increase the total amount of bandwidth across client lines that you need. Yes, you save the server bandwidth, but server bandwidth is cheap, easily available, and easy to optimize, tune, replicate and scale based on demand. Client bandwidth is not, and likely won't be for a long time.

4) Bandwidth (again): a LOT of systems these days are limited to 28800 (or 33600) bps in the upload direction, and this is not likely to change soon. Yes, DSL and cable are catching on, but not quickly enough, and not in the very quickly growing mobile market. For the majority of connected devices high-speed Internet won't become a reality for many years.

5) Mobile environments, uptime: in order to transfer something by P2P, BOTH sides need to be up at the same time, a big problem in the mobile market. Handheld devices make very bad P2P service providers because they get disconnected too easily (drive through a tunnel, ride a subway or plane, battery empty etc.). With server-based solutions only one side has to be online at a time, whereas with P2P a time slot has to be found where both are online.

6) Firewalls, proxy servers, intranets: server-based solutions are compatible with the majority of security systems out there (most compatible: HTTP tunnelling from client to server, which is what REBOL/Express uses as its default transport scheme). P2P solutions that require reverse connections usually cannot traverse firewalls or proxy servers, a huge problem for some companies out there with products that are exclusively based on P2P transports.
7) Accounting, monitoring: usually someone in your company wants to know how a service is used, i.e. how many transfers take place, what is transferred, who transfers data etc., to get a sense of customer demand, to improve marketing and so on. In server setups you just look at the server logs; in P2P setups you, well, guess :-).

8) Security: securing the machine providing the service, as Jim described, is only part of the problem. Another part is access rights: how do you handle those in distributed systems? How do you revoke the privileges of, say, an employee who has left the company? Another problem is certification: P2P environments make it significantly more difficult to deal with object or source certificates. Instead of trusting server certificates you would have to start using object-based certificate chains, and both Java and PGP show what a pain that tends to be, compared to the (transparent and seamless) server certificates SSL uses.

9) Content management: how do you manage your content, how do you handle versioning? Updates? Revocations? Quality control? For server-based setups you can establish administrative procedures for this. For P2P you have no control over who downloads what version of what product when. It is already a problem with FTP servers that have automatic replication, but in P2P setups it becomes much worse.

To me it seems extremely foolish to blindly jump onto this P2P bandwagon, like some companies are doing, and this is not what LDC is about. P2P people are concentrating on the wrong thing: the technical aspects of the physical connection. Those aspects were solved 30 years ago, and the resulting solution (TCP, server-based) is just fine and usually superior to P2P. Reinventing the wheel, but worse, seems pointless to me. The problem these days is above the transport level: it is extremely difficult to design network applications that are small, functional, secure, compatible among platforms and versions, easy to debug, easy to upgrade etc.
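The bandwidth arithmetic in point 3 above can be checked with a few lines of Python (a minimal sketch; the function name and the simplifying assumption that every P2P transfer crosses two client lines, while each server-based transfer crosses one, are mine, following the example in the text):

```python
def client_line_traffic_mb(file_mb: int, recipients: int, mode: str) -> int:
    """Total traffic crossing slow 'client' lines to deliver one file.

    Server-based: one upload to the server, then one download per
    recipient -- each transfer touches a client line exactly once.
    P2P: every transfer has both its upload AND its download on
    client lines, so each of the `recipients` transfers counts twice.
    """
    if mode == "server":
        transfers = 1 + recipients      # 1 upload + N downloads
    elif mode == "p2p":
        transfers = 2 * recipients      # N uploads + N downloads
    else:
        raise ValueError(f"unknown mode: {mode}")
    return file_mb * transfers

# The 10 MB file sent to 100 users, as in the example above:
print(client_line_traffic_mb(10, 100, "server"))  # 1010
print(client_line_traffic_mb(10, 100, "p2p"))     # 2000
```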
This is where LDC comes in. LDC is not about the technical aspects of physical connections. It is about how to exchange data, how to format data, how to secure data, how to parse data, how to interpret data; how to quickly and easily design software that allows multiple machines to communicate, and to do so on a large number of platforms; how to visualize received data and interact with the user; how to add networking capabilities to existing applications in an easy way, with a single implementation that works on all platforms, without spending months designing ".NET" spec files, ASN.1 descriptions, Java class hierarchies etc.; and how to send data that is smart, in real time, instead of having to design, implement and debug applications that are smart and predict what data they will be used with years before the first byte of data is ever exchanged. Long sentence :-).

REBOL provides a single, consistent environment that can be used in ALL parts of a networked application, including GUI creation, user input, data generation, information exchange, communication with back-end servers, response generation, encryption, parsing, data visualization, data storage etc. Keeping data and application models consistent throughout the application and across platforms saves time, reduces opportunities for bugs, and simplifies and encourages the development of networked applications of all kinds. THAT is what LDC is about.

As for LDC and P2P: we believe peer-to-peer is something that has to happen at a much higher level. The concept should exist all the way up at the application layer, not down at the transport layer, as with most so-called P2P environments. Which transport mode is used for any individual application should be irrelevant, and should certainly not spawn religious discussions. It is easy to write applications which actually support more than one transport mode, e.g. server-based, peer-to-peer, broadcast/multicast etc., all in one application.
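The "more than one transport mode in one application" idea can be sketched roughly as follows (all class and function names here are hypothetical illustrations of the design, not REBOL/Express APIs; each transport just returns a description string instead of moving real bytes):

```python
from abc import ABC, abstractmethod

class Transport(ABC):
    """Common interface: the application decides WHO to talk to;
    which wire mechanism carries the bytes is an exchangeable detail."""
    @abstractmethod
    def send(self, recipient: str, payload: bytes) -> str: ...

class ServerRelayTransport(Transport):
    """Client connects out to a central server (e.g. HTTP tunnelling)."""
    def send(self, recipient, payload):
        return f"server relay -> {recipient} ({len(payload)} bytes)"

class DirectPeerTransport(Transport):
    """Classic transport-level P2P: a direct client-to-client connection."""
    def send(self, recipient, payload):
        return f"direct connection -> {recipient} ({len(payload)} bytes)"

class MulticastTransport(Transport):
    """LAN-style broadcast/multicast: no client/server distinction."""
    def send(self, recipient, payload):
        return f"multicast to group {recipient} ({len(payload)} bytes)"

def greet_peer(transport: Transport, peer: str) -> str:
    # Application-level peer-to-peer: Jim talks to John without caring
    # which of the three transports actually moves the bytes.
    return transport.send(peer, b"hello from Houston")

for t in (ServerRelayTransport(), DirectPeerTransport(), MulticastTransport()):
    print(greet_peer(t, "john-in-LA"))
```

The point of the sketch is only the shape: the application code (`greet_peer`) is written once against the abstract interface, so server-based, direct-peer and multicast delivery become interchangeable configuration rather than architecture.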
We like to think that LDC is a "better P2P" in the sense that it provides P2P functionality where it counts: at the point where applications interact. REBOL simplifies the development of networked applications. LDC allows the development of applications that let Jim in Houston easily communicate with John in Los Angeles: peer-to-peer communication at the application level, without worrying how this actually works "under the hood", i.e. whether servers are involved or not.

-- 
Holger Kruse
[holger--rebol--com]