
William James Yeager and Rita Yu Chen

Second Generation Peer-to-Peer Engineering


Copyright © 2004, 2008

William James Yeager and Rita Yu Chen


We are sure that many of you wonder why a book completed in 2004 is finally being published in 2009. To clarify this, what follows is a short history of the writing of “Second Generation P2P Engineering.”

An editor from Addison-Wesley contacted us in the spring of 2002 after having read a few of our publications on the Internet. As the editor roughly described it at the time, “I was mining the Internet for possible authors in the P2P field and found you two.” We then completed the standard Addison-Wesley book proposal form, which was accepted after both Addison-Wesley internal and external peer reviews. We began to write the book that summer. During the writing process, which was finished in early 2004, all the completed chapters underwent peer review. These reviews were without exception excellent. So, why wasn’t the book published?

The final reason given to us by our third Addison-Wesley editor was that with respect to the technical book area, it was becoming more and more difficult to capture sales and that they did not believe our book would sell 30,000 copies. We consequently formally terminated our contract, thus reverting all rights granted to Addison-Wesley for the manuscript and its publication back to us.

As one can imagine, we were very disappointed given the effort it took to complete “Second Generation P2P Engineering.” But, it is important to note that we do not fault Addison-Wesley, and understood their decision. The business climate was difficult at that time, and for us it was a case of bad timing.

Our initial editor also felt bad about this and put us in contact with a few other editors. They all had the same take on the technical book market, and as a result, our book has remained in a digital archive since that time. We remain friends with our initial editor.

Finally and happily, Dr. Rita Chen discovered a web-based publication option earlier this year. It took us a few months of review to decide whether a web-based publication was appropriate for our book after a five-year hiatus on a digital bookshelf, and we have concluded that it is suitable for publication.

It’s always interesting to reread what one has written and to be pleasantly surprised, first, by how well the book is written and its approachability, and second, that a great deal of it is still quite relevant today. The book is written so that the introductory chapters are readable by non-technical persons, and the introductions to the more technical chapters are similarly approachable.

We hope that those of you who scan it, read parts of it, or attempt to absorb most or all of the technical design details we present feel the same way.

William Yeager, Rita Yu Chen

November 2009

To the memory of Gene Kan, a dear friend and colleague.


Chapter 1 Introduction

1.1 The .COM Supernova


1.2 The Changing Internet Marketplace


1.3 What Is a P2P Network?


1.4 Why P2P Now?


1.5 Basic P2P Rights


1.6 Contents of This Book and P2P Pseudo-Programming Language


Chapter 2 P2P Is the End-Game of Moore’s Law

2.1 The 1980’s


2.2 The 1990’s - The Decade of the Information Highway


2.3 The New Millennium


Chapter 3 Components of the P2P Model

3.1 The P2P Document Space


3.2 Peer Identity


3.3 The Virtual P2P Network


3.4 Scope and Search - With Whom Do I Wish to Communicate?


3.5 How to Connect


Chapter 4 Basic Behavior of Peers on a P2P System

4.1 The P2P Overlay Network Protocols


4.2 P2P Mediator Protocols


4.3 P2P PeerNode-to-PeerNode Protocols


4.4 P2P Connected Community Protocols


4.5 P2P Hashing Algorithms in the Context of Our P2P Overlay Network


4.6 More 4PL Examples


Chapter 5 Security in a P2P System


5.1 Internet Security


5.2 Reputation-Based Trust in P2P Systems


5.3 More Building Blocks for P2P Security


5.4 Digital Rights Management on P2P Overlay Networks


Chapter 6 Wireless and P2P


6.1 Why P2P in the Wireless Device Space?


6.2 Introduction to the Wireless Infrastructures


6.3 The Telco Operator/Carrier


6.4 Fixed Wireless Networks


6.5 Mobile Ad-hoc


6.6 P2P for Wireless


Chapter 7 Applications, Administration, and Monitoring

7.1 Java Mobile Agent Technology


7.2 Implementation of Java Mobile Agents on the P2P Overlay Network


7.3 The Management of the P2P Overlay Network with Java Mobile Agents


7.4 Email on the P2P Overlay Network





“Peer-to-Peer Engineering” has as one of its goals to provide the necessary technical understanding to enable software architects and engineers to build P2P software products that will have marketplace longevity. One cannot begin to list the number of P2P applications and stay current for one month. Each application may use one of several P2P “techniques,” and these applications for the most part do not communicate with one another. At the same time, the number of academic research papers on P2P is rapidly increasing. Unlike other publications on related topics, this book does not just summarize and list the existing P2P implementations and academic research results, but also gives a complete and practical design of a P2P Overlay Network.

This approach leads to a much deeper look at the underlying fundamentals and research. What are the computer-scientific “fundamentals” of P2P? This book answers this question with engineering solutions. From P2P network components to P2P communication protocols, from security to wireless, this book covers each P2P building block with detailed design specifications, as well as down-to-earth implementation suggestions. Most of these are the authors’ original ideas, intended to solve some of the difficulties faced by P2P software engineers. This book also covers the field with enough breadth to enable readers to make comparisons of the existing P2P implementations and to show their strengths and weaknesses.

To these ends, this book will first give the reader an historical perspective so that (s)he can understand that P2P is not new, and has its technical roots in the 1980’s. The reader will see that it is the current market forces, along with technological advances, that have caused this latent technology to resurface. In this perspective there will be case studies and examples of algorithms nearly twenty years old that have “sat on the shelf” waiting for such events to motivate their reevaluation.

Then, to make the P2P Overlay Network behavior understandable, this book defines first the set of components that comprise a P2P network, along with the XML documents that describe each component, and then the protocols that control the interaction of the components and their associated documents. To express the behavior of the components and their communication protocols, this book also introduces a P2P Pseudo Programming Language (4PL).

To make a P2P network design a success, a very important building block is security. Unlike other books on the subject that discuss or allude to security protocols and the various algorithm names like RSA, RC4, SHA-1, etc., but do not give the technical analysis necessary to implement a security solution, this book includes an in-depth analysis of the security requirements for P2P networks, and provides a solution for each requirement. Furthermore, this book integrates these security solutions into the other building blocks, and this makes the P2P Overlay Network design complete and secure.

Also, the Internet is expanding to include small devices such as light switches, automobile GPS systems, mobile telephones, home appliances, etc. These devices are and will be an integral part of P2P computing, and we have particular expertise in this area and will cover them in breadth and depth. It is important to note that it is now claimed that 30% of the Chinese will have Internet access by 2005, and this will be wireless. P2P will play an extremely important role in bringing services to these 300+ million people. The book will prepare software engineers for this eventuality and will describe how to write a P2P application for mobile devices.

Finally, on top of the P2P Overlay Network, this book discusses P2P applications across a wide range of concepts. It takes advantage of Java Mobile Agent technology to build P2P applications, as well as P2P network management tools. Code samples are given on these topics.

In summary, this book is not going to be another P2P encyclopedia. Rather, it will help readers to

[1] Acquire an historical understanding of the evolution of P2P and why now is the right time for the technology to become mainstream.

[2] Improve her/his technical understanding of the components and behavior of P2P networks.

[3] Prepare and start her/his first or next P2P Overlay Network design and implementation, and make it a success.

[4] Avoid security holes during design and/or implementation.

[5] Understand how to write P2P applications on the top of the P2P Overlay Network.

[6] See the importance of writing software that can address the needs of the expanding P2P device space.

[7] Learn how wireless networks work, interplay with wired networks, and will affect the future of P2P.

The audience for this book is software architects and engineers designing and writing P2P applications; managers who wish to evaluate the resource requirements for P2P product development; and high-level industry leaders wanting to make sound P2P investment decisions. This book is also appropriate for any computer-network-related course, from general networking classes to distributed networking classes. One of the features of this book which makes it ideal for class use is that it covers not only industry trends but also the state of academic research, including algorithms and projects.

Rita Y Chen, William J. Yeager

Chapter 1 Introduction

Although this book is technical in nature, its arrival is not independent of existing economic trends and technical market forces. Thus, up front, before diving headlong into the engineering chapters, let’s take a closer look at the significant events that have motivated the necessity to write a book on P2P engineering at this time.

1.1 The .COM Supernova

The year 2000 was amazing. The beginning of the new millennium began with fireworks displays viewed worldwide on television as successive capital cities greeted the stroke of midnight. The high-tech economy responded likewise, it too expanding worldwide to reach new limits, as a .COM frenzy took hold and got to the point where a clever new domain name could generate a few million dollars in startup venture capital. This was clearly economic madness and it had to end sooner or later. It did, and sooner, when the stock market bubble burst like a sudden .COM supernova that sent hundreds of these Internet companies into bankruptcy. Since that time most of those .COMs that drove the NASDAQ to new heights, above 4000 points in the middle of 2000, have practically all vanished from the planet. Here it is historically important to note that the .COM stock market “bubble” bursting is not a new phenomenon. The same bubble effect occurred concurrently with the inventions of electricity in 1902, the radio in 1929, and the promise of the mainframe computer in 1966. Initial expectations drove stocks’ P/E ratios to unrealistic heights only to see them abruptly come tumbling down again. The new technology in each case was sound and produced long-term revenue; the investors’ enthusiasm and the market’s response were not.

One can ask first, why did the .COMs come into being, were they and/or their supporting technologies legitimate, and from this point of view, try to understand the above events; and second, given the void that their disappearance has left, what, if anything, can be done to put in their place the products and services that will result in long-term, stable economic growth in the high-tech sector? The recent collapse has shown this sector to be the heart of the cardiovascular system of the new millennium economy. There are those who deny the latter statement, and look to the old gold-standard-like industries in times of economic decline, but the new economy is as dependent on high-tech Internet technologies as the automobile industry is on our highway system.

The birthrate of .COMs was directly tied to the ease with which one can deploy Internet, client/server, web-based architectures, and this drove the .COM expansion further than it should have gone over the period from the beginning of 1999 through the Summer of 2000. While this ease of deployment is certainly a big plus for the underlying technologies, the resulting digital marketplace could not provide the revenue streams necessary to either support the already existing .COMs or sustain the growth being experienced. One wonders about their business plans, since office space was rented, employees hired, software written, and hardware purchased and connected to the Internet Information Highway, all with speculative, venture capital investments. Venture capital support was supposed to be based on a sound business plan as well as new ideas based on Internet technology. Here it is extremely important to note that the technology is not at fault. The technology is fantastic. Rather, it was the desire to make a quick buck today without worrying about tomorrow’s paycheck that brought the economy to its knees. One senses that a sound business plan was irrelevant, and that the hope for a quick, and highly inflated, IPO was the goal. When the .COMs went supernova, and their hardware capital was reabsorbed into the economy with bankruptcy sales, this brutalized companies that depended on hardware revenue. Similarly, tens of thousands of very talented employees are now jobless. This can only be viewed as a course in “business 101,” as “lessons learned” for the young entrepreneurs and their technical counterparts. For those who profited from this bubble implosion-explosion, and there are many, the young entrepreneurs will be back, in force with their technical teams, but wiser and smarter, back to make changes in the system that exploited their talent, energy and dreams.

On the technical side of things, again, the ease of deployment of the hundreds of .COMs is a proof of concept of not only the web-based, distributed computing model but also the high-tech design, manufacturing, and delivery cycle. High-tech is and will continue to be responsive across the board, and clearly, the Internet will not go away either as a way of doing e-business, or as a means of enhancing both personal communication and one’s day-to-day life. With respect to personal communication, a beautiful thing happened during this time period. Strong partnerships and alliances were created between nearly all aspects of the high-tech and telecommunications sectors because personal communication in all of its forms can be enhanced with Internet connectivity. Yes, there was a rush to add features to mobile phones before the technology was mature, but the i-Mode, JavaTM experiment alone proved the business and technical models are sound, while the WAP experiment proved that without uniform standards across this space, and responsive data networks, these services will be abandoned. The former partnerships continue to flourish, and the latter problem was realized mid-stream at the WAP Forum and has been corrected. These are discussed in detail in Chapter 6.

As a consequence, business opportunities with personal communication, life-enhancing applications will be a large part of the P2P economic model. We will point out throughout this book how P2P can better address areas like these, and more easily solve the multiple-device problems for which client/server, web-based solutions have not been adequate. In some cases a marriage of these latter technologies with P2P networks will be suitable, e.g., collaborative P2P applications in the enterprise, where the documents produced are proprietary and need to be highly secured, regularly checkpointed, with all transactions audited; and in others, pure, ad-hoc P2P networks will be the best solution. Here, one might have the exchange of content, like family trip photos, between neighbors connected either on a shared 802.11a/b network, or with DSL.

As the .COM rollout proceeded, limitations of the underlying network, its administration and services were also revealed: SPAM continues to be an uncontrolled problem; bandwidth is not sufficient in the “last mile;” Internet service to China, India and Africa is minimal to non-existent [Allen]; and denial-of-service attacks and security problems are a constant concern. These are discussed in section, and P2P engineering solutions, where applicable, are found throughout the book. For example, security, denial-of-service attacks and SPAM are covered in Chapter 5.

In the final analysis, the .COM supernova was not technology gone wrong, but rather a business failure. Yes, we do have those unemployed, creative engineers, managers, and marketeers. Happily, creative beings tend not to be idle beings. Creative energy is restless energy. From this perspective, the “supernova” is a positive event: just like its astronomical counterpart, which must obey the laws of thermodynamics, where the released energy and matter self-organizes to form new stars, one can view the laid-off, post-.COM-supernova employees as a huge pool of intellect, of creative energy that will not remain dormant like a no-longer-active volcano, but rather will regather in small meetings in bars, cafes, homes and garages, to create business plans based on surprising and disruptive technologies, some of which will appear to come out of “left field,” and others from a direct analysis of what is required to advance Internet technology. And, these post-.COM entrepreneurs will be not much older but will be much wiser. As a result, the technologies they will introduce will be based on sound computer science, an examination of the underlying reasons why failure of such huge proportions happened so abruptly, and thus, yield products with long-term revenue as a goal rather than a short-term doubling of one’s stock portfolio. Life continues for these individuals, the dream is in place, is here to stay, reshaping itself as necessary, and the .COM supernova is in the natural progression of things, a necessary reorganization of a huge amount of energy that cannot be destroyed.

1.2 The Changing Internet Marketplace

Why does a marketplace change? Why are we still not all shopping in large farmer-like markets? Fundamental to these two questions is access to, and the distribution and aggregation of, the commodities being sold. The Internet, digital or e-Commerce marketplace is no different. As we accelerated through the events discussed in the previous section, rules for accessing, distributing and delivering digitally purchased items were put in place. And, most of what was purchased on the Internet could not be delivered in the same manner. Many examples come to mind. Two typical ones are EBay and WebVan. Also, many items of e-Commerce value that were not purchased were delivered extremely well on the Internet. Here one is referring to Napster. What we are looking for is the right combination.

Access to these items was for the most part through a web-based, browser interface. If one needs to search for a hotel at a particular price in a given region, then Internet access to this digital information can be extremely tedious and time consuming. While the results are often gratifying, the means of access can certainly be improved. If one wishes to purchase an orange, then it should not be necessary to go to Florida. For the http GET mode of access, one almost always returns to the home site for fresh data.

Napster showed us that one can much more effectively deliver digital content using hybrid P2P technology. Yes, Napster no longer exists, but the technology it used is only getting better. There are legal issues with respect to copyright protection, and so digital rights management software is being written to protect digital ownership. Why? Because those who stopped Napster, i.e., the recording industry, realize the huge potential of unloading the infrastructure cost for delivering mpeg to the users’ own systems and networks. P2P is sure to become an essential part of the new digital marketplace because there will be safeguards put in place to both guarantee the payment for purchased, copyrighted material as well as its P2P distribution. In [Saroiu02] it is pointed out that about 40-60% of the peers shared only 5-20% of the total of shared Napster files, i.e., Napster was used as a highly centralized, client/server system with very little incentive for sharing the wealth of music in the system. One can speculate that the resistance to sharing was directly correlated with the fear of getting caught for copyright theft, and that a system of sharing incentives, such as digital coupons leading towards free or low-cost content, will be easy to put in place in a marketplace that discourages digital content theft.

Digital content extends far beyond mpeg and jpeg. There is an enormous marketplace for digital games, and all kinds of software applications. P2P is the perfect way to distribute this content, and some form of legal sanity must prevail to permit it to become an accepted part of the e-Commerce marketplace. Similarly, individuals can begin to market their home computing power to computing grids. There are millions of people willing to contribute home computing cycles to SETI@Home [SETI] to aid in the search for extra-terrestrial life, and that search in itself is sufficient incentive. This is a noble cause and should be supported in this way. On the other hand, a business venture, PopularPower [PopularPower], which tried to make a business out of selling these computer cycles, failed. This implies that those companies that spend millions of dollars to purchase supercomputers were or are not willing to spend less money for perhaps a more powerful, home-based, computational grid. A firmly in place, secured, P2P infrastructure can help provide the necessary motivation to make idle CPU cycles an Internet commodity. This in turn can create a boom in personal computer sales, and this can make even more CPU cycles available for the worthwhile non-profit grids like SETI@Home. The possibilities are endless. One of the most important keys is to develop the technology necessary to assure that individuals’ and companies’ property rights are protected. The engineering side of this issue is discussed in Chapter 4. The solution is neither to wage denial-of-service attacks against P2P networks, nor to launch a legal attack against P2P startups hoping that the legal costs will drive them out of business before the legal battles have finished. These approaches might give short-term financial protection to multimedia giants but will stifle the technical innovation required to bootstrap multimedia e-Commerce. They are motivated by a near-sighted vision that should be stopped in its tracks.

In order for P2P to become a fundamental building block in the new digital marketplace, the digital content exchanged needs to be aggregated closer to home. While powerful centralized servers are essential for the financial management and initial distribution of this data, always “going to Florida” is not a complete solution. Just like the freeways are filled with semi-trucks during off hours to deliver the oranges to local markets, the same use of the information highway makes good “bandwidth sense.” The consumer experience will be much improved if the data access is closer to home. Ultimately, the equivalent of digital delis is needed. This is thoroughly discussed in Chapter 2.

Finally, as mentioned above, the web-based process of finding what you wish to purchase is tedious, time-consuming, and must be streamlined. One would like to have a digital aide with a template describing a desired purchase that does the shopping. This, together with digital yellow pages for locally available purchases, will create a very attractive e-Commerce marketplace. We discuss this thoroughly in Chapter 7 under the topic of Java Mobile Agents on P2P networks.

1.3 What Is a P2P Network?

What is P2P? That is the real question. Most people believe that P2P means the ability to violate copyright and exchange what amounts to billions of copyrighted mpeg or jpeg files free of charge. The music and motion picture industries are strongly against P2P for this reason. We will discuss the historical origins of P2P in Chapter 2, and this history makes no reference to the “Napster-like” definitions of P2P. Rather, it will discuss the foundations of distributed, decentralized computing. One finds also that the non-technically-initiated have begun to equate decentralized computing with some kind of dark force of the computing psyche. It almost comes down to the old battle of totally centralized versus decentralized governments. And, amusingly enough, for example, the United States and European Union are organized somewhat like hybrid P2P networks in the definition we will propose below. And, capitalism was born out of such an organization of political and legal balance of power. Where does that leave the cabalistic opponents of P2P? We leave that to the reader to decide. But, this opposition is at least partially neutralized by a lack of understanding of P2P, and thus a goal of this book is to help shed some light on this P2P issue. Decentralized, distributed computing is a powerful tool for organizing large collections of nodes on a network into cooperative communities, which in turn can cooperate with one another. Yes, one possible instance of this paradigm is anarchy, where each member node of such an organization of computational resources can do what it wishes without concern for others. At the opposite extreme is a dictatorship. An organization of network nodes that leads to either anarchy or a dictatorship does not have a place in P2P networks as viewed from this book’s perspective. Nearly everything in between does. And, certainly, we need to establish some rules to prevent the violation of others’ rights, whether they are members of society or nodes in a network. Napster, from this point of view, was not P2P. Rather, it was a centralized system that encouraged non-cooperation among the member nodes, a subtle form of anarchy, which is almost a self-contradiction. Centralized, because all mpeg distribution began with centralized registration and access to copyright-protected mpeg files from about 160 servers [Saroiu02]. And anarchy-like behavior among the nodes, because once a node possessed a copy of a mpeg file the tendency was not to redistribute it, and thus ignore the underlying “share the content” model which is at the roots of P2P.

Clay Shirky gives us a litmus test for P2P in [Shirky00]:

“1) Does it treat variable connectivity and temporary network addresses as the norm, and 2) does it give the nodes at the edges of the network significant autonomy?”

While this is a test for P2P, there will be instances of P2P networks from this book’s point of view that will treat fixed network addresses and 24x7 connectivity as the norm. Our goal is not to have a purist litmus test that excludes a major segment of the computational network, but rather a test that is less exclusive. To this end, a collection of nodes forms a P2P overlay network, or P2P network, if the following hold:

1) A preponderance of the nodes can communicate with one another, can run app-services enabling them to each play the role of both a client and a server, and exhibit a willingness to participate in the sharing of resources,

2) Nodes may be completely ad-hoc and autonomous, or use traditional, centralized, client/server technology as necessary.
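Condition 1 can be made concrete with a small sketch. The following Python fragment is our illustration only; the book expresses its actual designs in 4PL, and the names `PeerNode`, `shared`, and `request` are our own assumptions. It shows a node that simultaneously plays the server role, answering requests for the resources it is willing to share, and the client role, requesting resources from other peers:

```python
import socket
import threading

class PeerNode:
    """Illustrative peer: simultaneously a server for its shared
    resources and a client able to request resources from other peers."""

    def __init__(self, host="127.0.0.1"):
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._sock.bind((host, 0))          # port 0: let the OS pick one
        self._sock.listen(5)
        self.host, self.port = self._sock.getsockname()
        self.shared = {}                    # resources this peer will share
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        # Server role: answer each request with the named shared resource.
        while True:
            conn, _ = self._sock.accept()
            with conn:
                key = conn.recv(1024).decode()
                conn.sendall(self.shared.get(key, "").encode())

    def request(self, host, port, key):
        # Client role: ask another peer for one of its shared resources.
        with socket.create_connection((host, port)) as conn:
            conn.sendall(key.encode())
            return conn.recv(65536).decode()
```

Two such nodes are symmetric: either one can serve the other’s requests, which is precisely what distinguishes a peer from a pure client or a pure server.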

Here one notes the term overlay network. From this book’s point of view, P2P networks are overlay networks on top of the real network transports and physical networks themselves, as shown in Figure 1-1.



Figure 1-1. P2P Overlay Network
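The layering in Figure 1-1 can also be sketched in code: peers address one another by overlay identity, while each overlay edge merely records which real transport (TCP/IP, HTTP, TCP/IP over 802.11b, ...) carries it underneath. This is a minimal Python illustration under our own naming assumptions, not the book’s design:

```python
class OverlayNode:
    """Illustrative overlay peer: addressed by overlay identity, while each
    edge records the real transport that carries it underneath."""

    def __init__(self, peer_id):
        self.peer_id = peer_id
        self.links = {}   # neighbor peer_id -> (transport name, deliver fn)
        self.inbox = []   # (sender peer_id, payload) pairs received

    def connect(self, other, transport):
        # Record a bidirectional overlay edge tagged with its real transport.
        self.links[other.peer_id] = (transport, other.deliver)
        other.links[self.peer_id] = (transport, self.deliver)

    def deliver(self, sender_id, payload):
        self.inbox.append((sender_id, payload))

    def send(self, dest_id, payload):
        # The overlay does not care which physical network does the carrying;
        # it simply hands the payload to whatever transport the edge uses.
        transport, deliver = self.links[dest_id]
        deliver(self.peer_id, payload)
        return transport  # returned only so callers can see what was used
```

The point of the sketch is that `send` names only the overlay peer, never an IP address or a radio link; the transport is an attribute of the edge, invisible to the application above it.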

P2P means mutually supportive nodes on the one hand, and being able to use as much of the available technology as is necessary on the other, to arrive at a network that behaves optimally. A P2P network in an enterprise will be different from a P2P network in a neighborhood, and the two may or may not communicate with one another. The former will in all probability be stable, and the latter most likely ad-hoc and unstable.

The lifetimes of network addresses and connectivity, as well as an autonomous node’s symbolic “Edge” position in the Internet topology, lie at the far end of a very broad P2P spectrum of possibilities offered by the above definition. If one wishes P2P to be a success, then the engineering principles to which it adheres, its domain, must be able to encompass, and find ways to both interact with and improve, current Internet, centralized, client/server-based app-services. In fact, the appropriate vision is to view the ultimate Internet as a single network of nodes for which P2P provides an underlying fabric to help assure optimal, and thus maximum, service to each device, limited only by the device’s inherent shortcomings, and not by its symbolic position in the network. Yes, an ad-hoc, autonomous, self-organizing network of unreliable nodes is inherently P2P. Yet, a highly available cluster of database systems supporting a brokerage firm can also be configured as a P2P network, as can these systems’ clients. The members of such a cluster can be peers in a small P2P network using P2P communication for the exchange of availability and fail-over information; the clients can also be members of the same network to help both mediate network-wide load balancing and data checkpointing, as well as members of a client P2P network to share content, and suspend and resume tasks on one another’s systems.

In such a configured P2P network there may be centralized client/server relationships to, for example, ensure authenticated, authorized access, and this P2P network as well as the pure, ad-hoc P2P network both satisfy the definition, both being points in the P2P spectrum. The application of the fundamentals in this book will enable one to create such networks. But, standard, distributed client/server email and database systems are not P2P, even if the clients may keep data locally and can act as auto-servers either to improve performance or at times of disconnection. These latter client/server systems do not communicate with one another as peers and adhere strictly to their roles as clients and servers. This book does not discuss the fundamentals of these latter systems but will describe methods for morphing them towards the P2P paradigm for better performance. Such morphed systems will certainly be hybrid rather than pure P2P, and an extremely important step in the evolution of P2P technology.




Figure 1-2. The P2P Spectrum
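As a minimal illustration of the brokerage-cluster example above, the exchange of availability and fail-over information among cluster peers can be reduced to simple heartbeat bookkeeping. The sketch below is ours, under assumed names (`ClusterPeer`, `heartbeat_from`, `failover_target`); it is not the book’s design:

```python
import time

class ClusterPeer:
    """Illustrative cluster member: tracks peers' heartbeats so that the
    survivors can agree on who is available and who takes over on failure."""

    def __init__(self, name, timeout=3.0):
        self.name = name
        self.timeout = timeout   # seconds of silence before a peer is "down"
        self.last_seen = {}      # peer name -> time of last heartbeat

    def heartbeat_from(self, peer, now=None):
        # Each availability message a peer sends refreshes its timestamp.
        self.last_seen[peer] = time.time() if now is None else now

    def available_peers(self, now=None):
        now = time.time() if now is None else now
        return sorted(p for p, t in self.last_seen.items()
                      if now - t <= self.timeout)

    def failover_target(self, failed, now=None):
        # Fail-over: hand the failed member's work to any still-live peer.
        candidates = [p for p in self.available_peers(now) if p != failed]
        return candidates[0] if candidates else None
```

Every member runs the same code and exchanges the same messages: there is no distinguished master, which is what makes the cluster a small P2P network rather than a client/server one.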

The symbolic “Edge” of the network is really better viewed as pre-Columbian network terminology, in the sense that before Columbus, the western world believed the world was flat and had an edge. When Columbus looked for the edge of the world, he never found it; this fictional point of view was dropped, and the possibilities for travel have been limitless ever since that time. If one is at any node in the Internet, then there is not a sense of “I am at the network’s Edge.” One’s services may be limited because of a slow or poor connection, and this can happen anywhere in the Internet. It is much better to view each node as located at the “Center” of the network, and then do what is possible to make all such “Centers” equal. This is closer to the task P2P has set out for itself in this book.

1.4 Why P2P Now?

Internet e-Commerce will be as successful as the acceptance of, and thus the willingness to both use on a regular basis and pay for, the applications and services (app-services) that the digital marketplace offers. One of the reasons we had a .COM supernova was that consumers did not accept the app-services offered by the .COM marketplace in the above sense. Some of the app-services were used some of the time, and very few were used all of the time. Those very few are those that survived. The acceptance one envisions here means much more than, “Hmm, this is an interesting URL, maybe I'll try it someday.” Acceptance is expressed by users saying things like, “This app is so cool I can't get along without it,” “This service is so compelling that I feel like I am undercharged for its use,” and “This app is as necessary as my car, my roller blades, and my hiking boots, and, hey, this is fun!” The app-services must offer a break from the tedium of day-to-day living. People spend enough time waiting in lines, sitting in traffic, and being overloaded with junk postal mail, spam, and those obnoxious popup advertisements. Each of the above produces revenue, but why perpetuate pain when it is not necessary? Right, in the last three cases the advertisers are neither thinking of, nor care about, the recipients of the advertisements; rather, they use any technique they can to increase sales revenue. How many times have you, the reader, cursed these advertising methods? As its primary goal, the kind of acceptance envisioned here requires maximal service with minimal user hassles. Furthermore, optimal app-service response time is essential to enable the creation of real sources of revenue rather than using bogus nuisances for this purpose. These nuisances can be eliminated if they can be obsoleted.

In order to achieve maximal service with minimal user hassles we must look beyond the current client/server mode of distributed computing that drives the Internet. We are looking towards a near future where billions of devices will be interconnected. Although the client-server structured network is straightforward, and has served us well up to now, even looking up the right server is a painful job for both clients and servers, as our everyday directory service, the Domain Name Service (DNS), becomes one of the fatal bottlenecks with the sustained growth of the Internet. Moreover, with the appearance of various directory services, such as Novell Directory Service (NDS), Network Information Service (NIS) and Windows Active Directory, the difficulty of communication among those services has triggered the adoption of a standard directory protocol, the Lightweight Directory Access Protocol (LDAP). This adds another bottleneck for information access.

On the road toward distributed computing on top of the same client-server systems, several architectures were established to locate applications and allow applications to communicate. These architectures include Remote Procedure Call (RPC), Remote Method Invocation (RMI), and the Common Object Request Broker Architecture (CORBA). They all need a centralized registry directory for clients to locate the distributed objects. This centralized service always requires high reliability, accessibility and stability. Again, we have another bottleneck.

For authentication and authorization, centralized, server-based services such as Kerberos are in use. Internet security protocols like the Secure Socket Layer (SSL) and Transport Layer Security (TLS) currently require centralized Public Key Infrastructures (PKI) and well-known, centralized Certificate Authorities (CA) to manage X509 certificate creation and distribution. These systems are also facing severe bottlenecks, and are required to do secure Internet financial transactions.

Beyond these computational, infrastructure limitations in the client/server model, we are also faced with a new paradigm: mobility. One travels with her/his laptop and would like to communicate with another such client system. Even if name/address lookup is possible, this will no longer be sufficient to locate the systems given their network address. There is, in fact, no notion of a “home network.” One solution is Mobile IP or a variant. But Mobile IP still requires the device to have a home address. This does not handle the problem of ad-hoc mobility, where a node appears in a network, joins and begins to communicate with other nodes. We are heading towards a collection of mobile devices with disposable IP addresses and no home address identifier, e.g., a wristwatch. One can imagine two wristwatches wishing to synchronize calendars. We need solutions for these mobile devices to discover and communicate with one another.

To build a reliable computing powerhouse to serve billions of clients and applications, during the past few decades companies, institutes and governments have viewed Moore's Law as a monarch to follow, as well as a limit to challenge. Sooner or later the limit will be reached at the level of each individual machine, and scientists have already begun to investigate the possibility of building more powerful computing engines by using more advanced technologies, from optical to quantum, that will no longer be subjects of this monarch. We are excited about the future; at the same time, we are worried about the present: idled computers, Internet services wedged like the 5 p.m. traffic on a Los Angeles freeway, devices no longer able to communicate with one another, the impossibility of secure communication between any two devices, wasted man-power and energy outages. Are we solving the right problem? Are there better solutions already available?

We need P2P now because, with the duplication of information at various locations closer to the systems accessing it, even when one or more of these sites are disabled the information can be retrieved more efficiently, since both the load on, and access bandwidth to, a centralized resource are minimized; with the direct connection between any two devices identified by unique IDs virtually achievable, the centralized directory lookup will no longer be either a “must-have” component or a source of bottlenecks in the network, and ad-hoc mobility can be achieved; and with mobile agent services, objects, applications and clients can communicate and share information with each other in such a way as to minimize the users' involvement in tedious search tasks, and thus make systems that are more user responsive. There are many more possibilities brought by P2P technology, and any one of them can lead us toward the next wave. With respect to timing and the state of current technology, these possibilities are much closer to realization, and preferable to us sitting here and waiting for the next revolution from physics or bioinformatics.

We need P2P now because P2P will not only help to optimize a customer's access to the Internet, but will also provide a new, unique set of Internet commodities in the form of app-services compelling enough to attract new users, by sufficiently improving both the user experience and these users' lives to keep them coming back for more.

The kind of new app-services one envisions are those for which P2P can play a significant role. They must be multidimensional: applications will be required to have a mixture of text, audio and video at a minimum; provide many-way communication paths between individuals; be responsive to the users' location; be pervasive across the device space; and create the opportunity for not-before-possible business ventures. An example of such a venture is in-the-home digital music recording studios using P2P for distribution, digital rights management software for copyright protection, and guaranteed payment software such that a secure, parallel P2P payment structure can be put in place as a separate and supporting business. With a few thousand dollars, several adventurous artists can create a digital music marketplace where every customer is a potential seller. The recording giants like Columbia Records will have zero influence, will not be able to sue the artists nor their methods of distribution out of business, and will no longer monopolize the revenue. The growth potential of this kind of digital marketplace is tremendous with P2P in place.

So, why will P2P now help optimize Internet access and blow away this illusion of the user isolated at its edge? A short answer to this question is that the current Internet technology without P2P cannot support the sustained, optimized growth of multidimensional app-services, and the network topology which P2P implicitly creates will be location independent, and hot with activity everywhere. Let's look at why this is true.

As mentioned above, one of the first requirements is app-services that fully support multimedia data. This means music and video must be delivered on demand. The evidence is already here that centralized services cannot support the current demand for domain name lookup [Cheriton88], and the massive exchange of multimedia content is a huge problem in comparison. The bandwidth is not there, and the centralized, web-services based solution is always to add more bandwidth capacity and larger servers. This is historically viewed as keeping up with the demand by providing the same or poorer quality of service. This is neither acceptable nor successful at providing user satisfaction. The analogy is adding more lanes to the freeways to “improve” traffic flow, rather than seriously looking at alternative solutions that will be either convenient or compelling enough for drivers to consider using them.

How can P2P help now? Napster's short-lived success proved that hybrid P2P networks can efficiently deliver billions of copies of MPEG files by taking advantage of peers in a P2P network in such a way as to encourage the independent redistribution of content. As we mentioned above, software is being written to protect the digital rights of the owners of the content. It is foolish to ignore the content distribution power of P2P networks if one desires to have optimal, revenue-sustaining, digital marketplaces.

The build-out of Wireless LANs (WLANs) based on 802.11a/b networks will arrive sooner rather than later. In 2002 the revenue for WLANs in South Korea was expected to be $100,000,000 [WLANREVENUE]. As will be discussed in Chapter 6, P2P is a natural fit for optimal content distribution in WLANs. In Section 3.4 it is pointed out how P2P will encourage an evolution of the organization of a mixture of network devices, again leading to an optimal use of bandwidth and services to eliminate the centralized bottlenecks reinforced by the pre-Columbian illusion of where the center of the Internet is located.

A second way P2P can optimize the Internet now is by taking advantage of the processing power of quiescent CPUs. It was projected in 1966 that mainframe computers would revolutionize the world. Neither the arrival of the now-extinct mini-computer nor the microprocessor was anticipated. A mobile phone's processor is more powerful than a typical 1966 mainframe's! Mobile devices included, there are several hundred million computers out there all connected to the Internet, and most of the world's population is not yet connected. The existing available processing power is massive. Using P2P one can create coordinated, loosely-coupled, distributed computational networks of hundreds of thousands of nodes. SETI@Home is successful as an administratively centralized computing grid where the responsibility for decisions is held by software at SETI@Home's University of California laboratory. With the addition of P2P capabilities, SETI@Home will be able to offload some of these administrative tasks by adding server capabilities to each SETI@Home client. This will help both to lessen the bandwidth used to and from their laboratory, and to speed up the overall grid computationally by, for example, permitting checkpointed jobs to be offloaded to another client directly.

In the very near future one's home can be a fully connected P2P network. This network in turn can be connected to a laptop, PDA, mobile phone, automobile, or workstation in one's office behind a firewall, giving each family their personal peer community. This is possible now with existing P2P technology [JXTA]. These latter networks are not as refined as they can and will be, and the time has arrived to begin the engineering refinement that is necessary. This book presents the fundamentals sufficient to initiate the process.

1.5 Basic P2P Rights

P2P networks are organized overlays on top of, and independent of, the underlying physical networks. We have a definition that is general enough to permit almost any device with network connectivity to be a node in some P2P network. And our definition permits these networks to be both ad-hoc and autonomous, and their nodes to be anonymous. It also admits the possibility of strongly administrated, well-authenticated networks. And, in either case, both openness and secrecy can and will exist. This paradigm is frightening to some because at one extreme it is antagonistic to George Orwell's state in the book 1984: Big Brother will not know that you exist and therefore cannot be “watching you.” It is frightening to others because it also permits Orwellian P2P networks where there is no secrecy, all communication is both monitored and audited, and all data is in clear text and logged. What's important is the freedom of choice P2P offers. This choice has concomitant responsibilities, a P2P credo if you like: respect one another's rights, data and CPU usage, and share the bandwidth fairly; spread neither viruses nor worms; be tolerant of one another's resource differences; be a good network neighbor; do no harm to others. The nature of credos is to be violated. That is understood, and part of human nature. The goal is to minimize non-altruistic P2P behavior by either localizing it to those P2P networks where it is acceptable, or appropriately punishing it when it leaks into networks where it is unwanted.

Rightly enough, in the sanctity of one's home there can be a full-blown P2P network where everything is connected, everything is private, and only search warrants will permit entry. The United States Bill of Rights can be viewed as a P2P supporting document, since freedom of assembly, freedom of speech and the avoidance of unreasonable search and seizure are at the heart of P2P. And certainly one can imagine a well-organized “network militia” bearing its software arms to protect the network and the data resident therein. Freedom of access for network devices and their associated data is at the heart of P2P networks. The rules for network and data use are decided by the member nodes; they are the member nodes' policies. In the P2P world the voice of the minority is equal to that of the majority. Purchase several devices, create a P2P network and attach them. Then choose your P2P applications carefully.

1.5.1 “Freedom of Assembly” for the Right to Connect Everything

The first decade of the new millennium will see an exponential growth of network-aware devices capable of sending and receiving data. The list is long and the combinatorics defy one's imagination. Along with computers we will have: PDAs, mobile phones, automobiles, TVs, light switches and light bulb receptors, fans, refrigerators, alarm systems, wristwatches, stoves, dishwashers, ovens, stereos and all components, electricity and gas meters, pet licenses, eye glasses, rings, necklaces, bracelets, etc. Any combination of these can be inter-connected to form ad-hoc P2P networks. One might ask, “To what end?”


Imagine the following: having dinner in London with several friends and receiving a mobile phone call from your home, not someone at your home, but your home itself telling you that the alarm system had been triggered. This is a real story told to one of the authors: in 2000 his friend from Sydney immediately used his mobile phone to scan the alarm log files and detected that an alarm on the back bedroom windows had been triggered. Since this bedroom was supported by stilts and the alarm in question was on a window overlooking a canyon, he concluded that a bird had flown into the window. Right, the friend lives in an experimental home. But the experimental possibility can and will be a reality during this decade.

Similarly, it is easy to place oneself in a scenario of having just left home and worrying whether the oven or stove was left on. Rather than turn back, a simple control panel on these devices, which are peers in a home P2P network, with this home P2P network accessible from either a wireless device in one's automobile or a mobile phone, both peers in one's private home network, is sufficient to make a quick check. In fact, one could launch a mobile agent to do a full integrity check of the home and report back. Ten seconds later one will receive an “all is well,” or a “you left the stove on, shall I turn it off?”


A final scenario is ad-hoc networks of people in coffee houses, railroad stations, sitting in the park, or at an outdoor cafe. Here, jeweled bracelets or necklaces might light up in similar colors for all of those who are members of the same ad-hoc P2P network. In the evening, when viewed from a distance, one can imagine small islands of similar colored lights moving about, towards and away from one another, in a beautifully illustrated, ad-hoc social contract as “Smart Mobs” [Rheingold].

These scenarios are endless, practical, and part of our future. P2P engineering for wireless devices and networks is discussed in Chapter 6.

1.5.2 “Freedom of Assembly” for Survival of the Species

Eco-systems are self-organizing, self-maintaining and, in case of reasonable injury, self-healing. There is life and death, and the eco-system evolves using natural selection. This process is continuing, and new life forms arrive over time as the result of mutation. Eco-systems are great for trial-and-error testing. The same must be said for P2P overlay networks. Peers come and go, crash during data transfers, lose their visibility, and are rediscovered. New devices are accepted on face value, and permitted to participate in the network's activities. P2P networks are spawning grounds, playgrounds for creative thinkers. In this manner, a P2P network can continue to gather new members, make them as known as policy permits, and behave much like eco-systems where diversity leads to survival of the fittest. Peers are free to assemble with others for the interchange of content. Peers, like mobile agents, are free to traverse the P2P network, visit those peers where entry is permitted, gather information, and return it to their originators.

As such, “Freedom of Assembly” is the ultimate P2P right. As our definition of P2P states, although each single device is part of a cooperative whole, it is a node in a P2P network and makes its own decisions and acts independently. A device's presence is neither required nor denied. Hence, the failure of a device should not be harmful to other devices. If two devices are connected and one abruptly crashes, this should be a little hiccup in the P2P network, and there ought to be a redundant device to take its place. Still, everything has two sides; this independence also means that there might not initially be anyone who will help this poor, temporarily stranded guy. As for a highly-available client-server system, there always are servers behind each client, and back-up servers configured for each server, but they are subject to bottlenecks, resource depletion and denial-of-service attacks. So these self-maintaining, self-healing and self-adaptive features cannot always reduce the burdens on client/server, centralized systems. On the other hand, for a device in a P2P network they are not only essential, they are inherent in the network ecology. Thus, the “poor guy” who was sharing content and abruptly lost the connection can expect to resume the operation with another node, although this recovery may not be instantaneous. During its apparent isolation it might receive the welcome visit of a wandering mobile agent that is validating the network topology and can redirect the peer to a new point of contact. Similarly, denial-of-service attacks are difficult because, like an eco-system, there is no center to attack, because of the built-in redundancy.

From a software engineer's perspective, ideally, P2P software must be written to reside in a self-healing network topology. Typically, any device uses software to monitor its tasks, schedule its resources to maximize its performance, and set pointers and re-flush memory for resuming operations efficiently after a failure. At a higher level, the P2P software should be able to adjust to the new environment for both recovery and better performance. For example, a device might have dedicated caches or tables to store its favorite peer neighbors, to be used for fast-track connections or network topology sanity checks. When either a neighboring device fails or one of its buddies is not so “trustful” for an intolerable period, the P2P software on the device should be able to dynamically modify its records. In this way, at least the same connectivity can be guaranteed. This is just one of the most straightforward cases showing that P2P software needs to be self-healing and self-adaptive if the network is to behave in the same manner, since the P2P network is the “sum of its nodes.” The engineering dynamics of these scenarios are discussed in detail in later chapters.
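The neighbor-cache behavior just described can be sketched in a few lines of Java. The class and method names, and the five-minute silence tolerance, are our own illustrative assumptions, not part of any particular P2P implementation:

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical self-healing neighbor cache: a peer records each contact
 * with a favorite neighbor, and silently evicts any neighbor that has
 * not been heard from within a tolerance window.
 */
public class NeighborCache {
    private static final long MAX_SILENCE_SECONDS = 300; // assumed policy
    private final Map<String, Instant> lastSeen = new HashMap<>();

    /** Record a successful contact with a neighboring peer. */
    public void recordContact(String peerId) {
        lastSeen.put(peerId, Instant.now());
    }

    /** Drop neighbors not heard from within the tolerance window. */
    public void evictStale() {
        Instant cutoff = Instant.now().minusSeconds(MAX_SILENCE_SECONDS);
        lastSeen.values().removeIf(t -> t.isBefore(cutoff));
    }

    /** The neighbors currently believed to be alive. */
    public Set<String> liveNeighbors() {
        evictStale();
        return new HashSet<>(lastSeen.keySet());
    }

    public static void main(String[] args) {
        NeighborCache cache = new NeighborCache();
        cache.recordContact("peer-7f2a");
        cache.recordContact("peer-01bc");
        System.out.println(cache.liveNeighbors()); // both, freshly seen
    }
}
```

A real implementation would of course also probe silent neighbors before evicting them, and refill the table from discovery; the point here is only that the records adjust dynamically rather than being configured once.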

Unfortunately, not all devices are capable of self-management, for example, handheld wireless devices. Such small devices don't have enough computing power and memory to have such sophisticated software or hardware embedded. So they must rely on other machines for the above purposes. Although P2P purists hate to use the “server” word, it is true that small devices require such “server-like” or surrogate machines, and this fits perfectly with the definition of P2P overlay networks given above.

As mentioned just above, “Freedom of Assembly” in P2P networks is supportive of a multiplicity of devices and software “organisms.” They arrive, try to succeed, to find a niche, and either continue to flourish or die out. Since the early 1990's mobile-agent technology has been looking for the appropriate execution environment. Mobile agents can be invasive, pervasive, informative, or directed, and come in all shapes and sizes. They work best when implemented in Java. Mobile agents can be written to adapt to self-healing, ad-hoc network behavior and, in fact, thrive in such an environment. The very fact that they are mobile by nature, can have self-adapting itineraries and agendas, can be signed and thus secured, and are opportunistic as they travel, has always required a network eco-system for their survival and evolution in mainstream technology. The authors of this book are advocates of mobile-agent technology as applied to P2P overlay networks. The engineering details are discussed in Chapter 4.

1.5.3 “Freedom of Speech” for the Right to Publish Data and Meta-data

As previously mentioned, the data or information published and transferred on the Internet is multidimensional, and enormous in volume.¹ Thus, brute-force pattern matching techniques for searching for data are no longer viable, and become dinosaur-like relics from the era when only textual data was available. A file sharing system which depends on such simple techniques can be easily hacked, since it only requires data corruption by a virus to destroy data. Now a description of data, that is, data about data, or meta-data, is an essential component of the organization of data on the Internet to make tasks like search achievable. With meta-data, for example, one can keep signed hashes of the data that permit one to detect modification attacks. Similar techniques can be used to detect data that has been modified while being transferred on the Internet. Nodes on a P2P overlay network have the absolute right to exchange data or meta-data or both.
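The signed-hash idea can be illustrated with the standard Java security APIs. The following is a minimal sketch, not the book's 4PL machinery: the class name is ours, and RSA over a SHA-256 digest is just one reasonable algorithm choice:

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

/**
 * Sketch: meta-data carries a signed SHA-256 digest of the content, so
 * any peer holding the publisher's public key can detect whether the
 * data was modified on disk or in transit.
 */
public class SignedMetaData {
    /** SHA-256 digest of the content, stored in the meta-data. */
    public static byte[] digest(byte[] content) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(content);
    }

    /** Publisher signs the digest with its private key. */
    public static byte[] sign(byte[] digest, PrivateKey key) throws Exception {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initSign(key);
        s.update(digest);
        return s.sign();
    }

    /** Receiver recomputes the digest and checks the signature. */
    public static boolean verify(byte[] content, byte[] sig, PublicKey key)
            throws Exception {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initVerify(key);
        s.update(digest(content));
        return s.verify(sig);
    }

    public static void main(String[] args) throws Exception {
        KeyPair kp = KeyPairGenerator.getInstance("RSA").generateKeyPair();
        byte[] data = "some shared file".getBytes(StandardCharsets.UTF_8);
        byte[] sig = sign(digest(data), kp.getPrivate());
        System.out.println(verify(data, sig, kp.getPublic()));     // true
        byte[] tampered = "SOME shared file".getBytes(StandardCharsets.UTF_8);
        System.out.println(verify(tampered, sig, kp.getPublic())); // false
    }
}
```

Any modification to the content changes the digest, so the signature check fails, which is exactly the modification-attack detection described above.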

This meta-data can be public or private, freely published and subscribed to by everyone, or absolutely secret and viewable by a select few. Meta-data can be stored as clear text and posted to a public domain site for wide distribution of the data described by this meta-data. One of the immediate uses of these sites is to share research publications among institutes. On the other hand, P2P applications have the choice of hiding or not hiding meta-data. They can use strong encryption or secure IP (IPsec) so that data or meta-data that is being exchanged can be impossible to monitor, because well-written security systems can assure the end-to-end privacy of these “conversations.” Thus, encrypted meta-data can and will be impossible to detect on peers' systems. Also, access to a system's data directories, i.e., the meta-data describing the files on that system, can be password protected, and this meta-data can be transmitted as clear text or encrypted descriptions of these directories. Thus, again, it may only be visible to the possessor of the decryption key, so that detection, in this case, is again impossible. Processing speed is so fast that encrypting or decrypting a megabyte of data takes only a second or two. Thus, the processing time required to keep both local and remote user data and meta-data secret is almost not noticeable in human time. “Freedom” of Internet privacy protection has almost no obstacles, because the cryptography code which implements the required algorithms is freely available on the Internet [CRYPTIX, BOUNCYCASTLE, PURETLS, OPENSSL].
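As a small illustration of how cheaply meta-data can be kept secret with off-the-shelf cryptography, the following uses the JDK's built-in AES support. The class name, the CBC mode, and the key handling are illustrative assumptions only; a real system must also solve key distribution between the peers:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

/**
 * Sketch: a directory listing (meta-data) is sealed with AES before it
 * leaves the peer, so only holders of the key can read or even usefully
 * detect it.
 */
public class PrivateMetaData {
    public static byte[] seal(byte[] plain, SecretKey key, IvParameterSpec iv)
            throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, key, iv);
        return c.doFinal(plain);
    }

    public static byte[] open(byte[] sealed, SecretKey key, IvParameterSpec iv)
            throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.DECRYPT_MODE, key, iv);
        return c.doFinal(sealed);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);      // fresh IV per message
        byte[] listing = "shared/music/track01.mp3".getBytes(StandardCharsets.UTF_8);
        byte[] sealed = seal(listing, key, new IvParameterSpec(iv));
        // Only the holder of the key (and IV) recovers the listing.
        System.out.println(Arrays.equals(open(sealed, key, new IvParameterSpec(iv)), listing));
    }
}
```

On any modern processor this round trip is measured in microseconds per kilobyte, which is the point made above: privacy costs essentially nothing in human time.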

Noting that thirty-three percent of all Internet traffic is directed towards pornographic sites [TechNewsWorld], will P2P networks be any different with respect to the data and meta-data that are published? The answer is probably not. “Freedom of Speech” gives one the right to publish data and meta-data within certain limitations, and the Internet knows fewer and fewer boundaries to data exchange. The first amendment to the United States Constitution is being applied world-wide in spite of resistance from governments that wish to control the information that crosses their borders.

When P2P networks are pervasive, the publication of content will reside more and more on individuals' systems. These systems will be much more difficult to locate because their network addresses will be temporary and often mobile. Still, the permitted and accepted private exchange of data and meta-data is no different than a telephone conversation. The problem does not reside with the system that is used to enable the conversation to take place, but rather with the endpoints of the conversation, the

1. Google searches more than 2 billion publicly accessible web pages as of July, 2002.

individuals using the system. New technology forges new pathways across all systems, including legal ones. This always has been and always will be the side-effect of crossing new frontiers. It's exciting for some and frightening for others, and one generation's laws may become the next generation's blue laws, i.e., outdated relics of historical interest with ever-diminishing support. Undeniably, all users will be subject to the current “laws of the land,” and open to arrest and prosecution for their violation. But P2P technology will create new markets for honest digital commerce with enormous economic incentives, and will also permit private network conversations between consenting adults, along with the expected criminal activity. The latter is well-understood human behavior that P2P neither encourages nor discourages, much like a highway neither encourages nor discourages a driver to exceed the posted speed limit. The solution to these problems is neither to abolish driving nor to stop innovative technological progress because it can be misused. Clearly such reactions are neither well thought out nor well founded, and will not lead to long-term, fair and meaningful solutions. The latter will come from the technologists themselves working with law makers and media firms, and always respecting an individual's Basic P2P Rights.

The engineering aspects of data and meta-data are discussed in Chapter 3.

1.6 Contents of This Book and P2P PSEUDO-Programming Language

This book is organized as follows:

Chapter 2 gives an historical perspective on both the evolution of the Internet in two phases, pre-WWW and post-WWW, and the roots of P2P.

Chapter 3 covers the essential, engineering components of the generic P2P model. These include the descriptive language and resulting document space, unique peer identity, the overlay network, and communication components. The chapter concludes by showing how these components can be assembled to permit communication on the overlay network given the limitations of the real, underlying network and physical layers.

Chapter 4 gives life to the components, and describes protocols used to create an active P2P network. Here connecting, routing, load balancing, and querying are discussed.

Given an operational P2P network which is an instance of the documents, components, and protocols presented in the previous three chapters, and thus a model of a P2P system, in Chapter 5 we present the details of how one can implement standards-based security in such a system. We conclude this chapter by applying these security fundamentals to demonstrate how to create secure Java mobile agent P2P services.

Chapter 6 is a thorough discussion of wireless networks followed by showing how P2P can enable exciting new applications that are device and bearer network independent, and thus be a long needed, unifying force between wired and wireless networks. We also describe what is required to build a Java P2P application for mobile handsets.

Chapter 7 explores some possible P2P applications, starting with the familiar email and chat programs, and ending up with less familiar and innovative, almost science-fiction-like possibilities.

In order to explicitly express the engineering principles in this book, a P2P Pseudo-Programming Language, 4PL, has been devised. The syntax and semantics of 4PL are defined in Appendix I. 4PL permits one both to programmatically define nodes on the P2P overlay network and to describe their interaction, by defining each P2P component we introduce in Chapter 3 as a 4PL data type, and creating a set of associated 4PL commands to which one can pass and return typed variables.

As mentioned above, in Chapter 4 we define several overlay network communication protocols. We will use 4PL here to create flow charts that describe the precise protocol behavior. 4PL thus gives a solid logical foundation to the engineering and protocol definitions, and eliminates the possibility of inconsistent behavior, barring any 4PL programming bugs. It is suggested that the reader use Appendix I as a reference when reading Chapters 3 and 4.

Chapter 2 P2P Is the End-Game of Moore’s Law

“I recall quite vividly having been awestruck by the processing power and memory size that was available to me when I took it upon myself in 1981 to rewrite my PDP11-05 mini-computer-based, multiple-protocol router code to take advantage of a master's degree student's (Andy Becholstein's) mc68000 micro-processor board. The latter had a clock speed of 8 megahertz and 256K bytes of chip RAM, while the former had a clock speed of 2MHz and 56K bytes of core memory available to run software. Andy's board was one of the first in a long line that would bring us to where we are at this time.¹”

Today, systems with clock speeds in excess of 2 gigahertz, several gigabytes of high-speed RAM, and in excess of 100 gigabytes of disk storage are readily available on desktops and laptops. At the same time, better mobile phones have 40 megahertz clock speeds, a megabyte of RAM, and 64 megabytes of flash memory. Thus, twenty-two years or 264 months from 1981, we see the predictive power of Moore's law: Gordon Moore stated in 1965 that transistor density would double every 12 months. Yes, this slowed to doubling every 18 months over the years. Still, along with the doubling of transistor density we have had a concomitant increase in processor speeds. This is because, first, there are more transistors in a given space, and second, the delay on the transistors and wires is reduced.² Indeed, since 1981 we have had 8 such doublings, or

1. Bill Yeager’s recollections of his days at Stanford University, December, 2003.

roughly one doubling every 33 months, yielding processor clock speeds that are now 256 times as fast as they were in 1981: 2 8 = 256, and 256 x 8 mega- hertz = 2048 megahertz or 2 gigahertz. It is assumed that by the end of this decade Moore’s law will no longer apply to silicon based transistors. Similarly, given the current computing resources in the hands of the average user, and the Internet, it is no coincidence that the new millennium was greeted with a rebirth of P2P technology. The potential “computational energy” available bog- gles one’s mind. Indeed, we view the emergence of P2P as a means to tap this energy source, and is as such, the final moves, the logical conclusion to the evolution brought about by Moore’s Law , that is to say, “P2P is the End

Game of Moores Law. Decentralize and conquer is the end-game’s winning strategy. The inevitability of harnessing the latent CPU power of personal sys- tems into communities of users with common interests is the logical conclu- sion of Moore’s Law. There is just too much CPU power to be ignored.
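The doubling arithmetic above can be checked in a few lines of Python, using the 1981 and 2003 figures quoted in the text:

```python
# Back-of-the-envelope check of the Moore's-law arithmetic quoted above.
base_clock_mhz = 8        # mc68000 board, 1981
years = 22                # 1981 to 2003
months = years * 12       # 264 months
doublings = 8             # observed clock-speed doublings in that span

months_per_doubling = months / doublings   # 33.0
speedup = 2 ** doublings                   # 256
projected_clock_mhz = base_clock_mhz * speedup

print(months_per_doubling)   # 33.0 months per doubling
print(speedup)               # 256x
print(projected_clock_mhz)   # 2048 MHz, i.e. ~2 GHz
```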

2.1 The 1980’s

As the increase in processor speeds began to diligently follow the almost mystical curve of Moore's law, the initial beneficiaries were the servers in the client/server model of distributed computing. Still, there were those who, even in the 1980's, dreamed of harnessing the untapped computational resources of the workstations that began to dominate the desktops of researchers, those who viewed these systems as peerNodes capable of sharing content and computational services. The embryonic beginnings of the P2P technology that would surface at the debut of the new millennium were already in place twenty years before its birth. And during this twenty-year gestation period, several events put in place exactly the requirements necessary for the successful rebirth and growth of P2P computing during the first ten years of the new millennium. Owen Densmore, formerly of Sun Labs, and now working for Complexity Workshop [ComplexityWorkshop] in Santa Fe, predicts that 2000-2010 will be the “Decade of The Peer,” and we believe, as do many others, that Owen is correct. In this chapter we look at the history of P2P, its initial appearance in the 1980’s, and the historical highlights of the twenty-year gestation period leading to its inevitable and logical reappearance in 2000.

One imagines that most of the software architects and engineers who designed Napster, Gnutella, and FreeNet were about thirty years old in 2000. That puts them at the beginning of their teenage years at the time the Arpanet switched from NCP, the Arpanet protocols, to IP in 1983, when, from our point of view, the internet became The Internet with the addition of end-to-end IP connectivity. During the decade of the 1980’s, IP quickly obsoleted the other network protocols that were in use at that time; some examples are XNS, DECNET, AppleTalk, LocalTalk, Chaosnet, and LAT. By 1984 IP and its

2. Delay is proportional to resistance times the capacitance, and both resistance and capacitance are reduced as a result of Moore’s law.

accompanying protocol suites were already well on their way to becoming the global network standards. So, what does this have to do with P2P? Clearly, the rapid growth of networking technology that took place in the 1980’s was the major impetus, the force that pushed us, from the perspective of applications, to where we are today. Behind these advances, behind the scenes if you like, we also find first the effects of Moore’s law: smaller is faster; smaller is more memory; more memory implies network hardware and software with more functionality; better networks imply higher productivity and new and creative ways to communicate. Second, the IETF as a forum for open Internet standards was then, and still is, a major factor. From 1980 until 1990, three hundred and two rfc’s were approved. The 1980’s, indeed, set the stage and put in place the sets and the scenery for as-yet-unwritten dramas. They would be all about the evolution of the Internet, and P2P would play a major role in the act that began in the year 2000.

2.1.1 LAN’s, WAN’s and the Internet

While some Local Area Networks (LAN’s) did exist in the 1970’s at places like Xerox PARC, where ethernet was invented, the real upsurge occurred in the 1980’s, and in particular after 1983. In this context it is important not to forget that the 3mbps ethernet, ethernet “version 1,” using the Parc Universal Packet (PUP) specifications, was officially documented in 1976 in a paper by Bob Metcalfe and David Boggs entitled “Ethernet: Distributed Packet Switching for Local Computer Networks.” We include in Appendix III a copy of the first page of a PARC Inter-Office Memorandum by Ed Taft and Bob Metcalfe, written in June of 1978, which describes the PUP specifications [PUP]. We can certainly assume that the hardware and software discussed in this memo existed well before that date. PUP was used to link together Altos, Lisp Machines, Xerox Star Systems, and other servers at PARC. Bob Metcalfe left Xerox in 1979 to form 3COM and promote ethernet. The ethernet version 2 standard, or 10mbps ethernet, is specified in IEEE 802.3. The standardization was the result of a concerted effort by Xerox, DEC, and Intel from 1979 to 1983 that was motivated by Bob Metcalfe. Today ethernet is the world’s standard, comprising 85% of LAN’s.

We now give a brief description of the emergence of the Stanford University LAN. It is by no means a unique event, but it is one we can describe from Bill Yeager’s deep personal involvement in the effort. And certainly, as we will see, what happened at Stanford did spearhead the growth of networking on a world-wide scale.

By the means of a grant of hardware and software received in December of 1979 by Stanford University from Xerox PARC, PUP became the original 3mbps ethernet LAN protocol at Stanford University, and was primarily used to link D-Machines, Altos, Sun workstations, VAX’s, and TENEX/TOPS20 systems across the University. The first three subnets linked the medical center with the departments of computer science and electrical engineering by the means of the first incarnation of Bill’s router, which ran in a PDP11-05 and routed the PUP protocol. These three subnets were the basis for the original Stanford University LAN. IP showed up at Stanford in late 1981, and to support IP Bill invented the multiple-protocol, packet-switched ethernet router that routed both PUP and IP. He continued to develop the code over the next 5 years. By 1985 the code routed PUP, IP, XNS, and CHAOSNET. It was officially licensed by Cisco Systems in 1987, and was the basis for the Cisco Systems router technology. As of late 1981 the hardware for these multiple-protocol routers was known as the “bluebox.” The first had a multibus backplane outfitted with a power supply, a motherboard with a mc68000 CPU and 256Kbytes of chip memory, and held up to 4 ethernet interfaces. The bluebox was invented in the Stanford department of computer science. The motherboard was Andy Bechtolsheim’s invention [CAREY]. The first cisco routers used the identical hardware. They ultimately became the Internet router of choice.

10mbps ethernet subnets appeared in early 1982 and, along with IP, began to dominate the LAN, which blossomed from a research LAN with three subnets to one that began to connect all of Stanford’s academic and non-academic buildings. The Stanford LAN had an IP class A internet address, and was first connected to the Internet in 1983 by the means of a BBN router called the Golden Gateway that was maintained by a graduate student named Jeff Mogul. Jeff has been very active in network research since his graduate student days in the early 1980’s. As a graduate student he co-authored rfc903, on the Reverse Address Resolution Protocol, in 1984, and since that time he has joined with others to write an additional fifteen rfc’s. His most notable effort may be rfc2068, which specifies HTTP/1.1; one of his co-authors was Tim Berners-Lee.

By 1990 the Stanford LAN had more than 100 subnets. This growth was accomplished by formally acknowledging the necessity of the Stanford LAN for doing day-to-day business, and by forming, in 1985, a department under the management of Bill Yundt to support its further growth. Stanford continued to support the PUP, IP, XNS, and Chaosnet protocols into the late 1980’s, since the ongoing research required it. The Stanford LAN service was superb and permitted seminal work to be done in distributed systems, work that is clearly a forerunner of P2P. This research is discussed in section 2.1.3.

In a similar context, MIT had Chaosnet, which was originally used to link its Lisp machines, and later a large selection of machines at MIT. This was documented by David Moon in 1981 [Moon81]. By the mid-1980’s, LAN’s like these were commonplace in universities, and businesses began to follow suit.

In 1985 golden was retired, and Bill Yundt’s department provided connections to the Internet’s NSF backbone. These were T1, 1.544mbps, networks formally called the NSFNET. Similarly, a T1 Bay Area network was created to link up universities, research institutions, and companies in the general Bay Area. BARRNet extended to U.C. Davis near Sacramento. Bill Yundt played a major role in, and was an impetus to, the formation of BARRNet, which was one of the first Wide Area Networks (WAN’s). We believe there were no restrictions with respect to who might connect to BARRNet; it was a pay-as-you-go network. From 1985 onward, LAN’s and WAN’s popped up everywhere, with the NSFNET providing the Internet connectivity. Networking was the rage.

The Internet grew dramatically, with NSFNET as the major motivating force. “NSF had spent approximately $30 million on NSFNET, complemented by in-kind and other investments by IBM and MCI. As a result, 1995 saw about 100,000 networks—both public and private—in operation around the country. On April 30 of that year, NSF decommissioned the NSF backbone. The efforts to privatize the backbone functions had been successful, announced Paul Young, then head of NSF's CISE Directorate, and the existing backbone was no longer necessary. [NSF]”

Before we move on, it is worth reflecting on just why networking was the rage. What was behind the rapid adoption of the latest technology? Clearly, one did not spend millions of dollars on a technology because it was something “cool” to do. Rather, those who financed the research and development ultimately required a return on investment (ROI). It is because networks stirred the imagination of visionaries, researchers, software designers, systems analysts; of CEO’s and CTO’s; of thousands of students in universities; and above all of the users of the network applications. Thus, a job market was created to implement what would become the next industrial revolution. The ability to network applications streamlined business processes, opened the doors to new forms of interactive entertainment, and provided avenues for long-distance collaboration in real time. Expertise became location independent, as did one’s geographical location with respect to her or his office. The effective network virtual office was born in this decade because of the network infrastructure and the early network applications.

2.1.2 Early Network Applications

The first, foremost, and most popular network application has always been email. It existed in the 1970’s on the ARPANET and became standardized in the 1980’s on the Internet with SMTP, POP, and IMAP. The rapid information exchange that email provides has always made it a great tool for communication, be it for research, business, or personal use. It also points out that, above all, applications that promote person-to-person communication will always yield a decent return on investment.

The IMAP email service of the late 1980’s was a harbinger of how effective protocols can be when they are specifically targeted at both the network infrastructure and computer resources. IMAP had significant advantages over POP at that time, since it permits one to access the properties of messages as well as the messages themselves, and the parsing of messages was done entirely on mail servers. This greatly simplified the writing of client UI code and maximized the use of network bandwidth. It is important to recall that the NSFNET was T1 based at this time, and clients were very limited with respect to computational and storage resources. Also, many clients ran on computers that used serial lines to connect to the network. Bill Yeager recalls demonstrating macMM and IMAP at the Apple corporate building in Cupertino in 1989, reading his email on the Stanford LAN via BARRNet, and he was not surprised to see no humanly recognizable difference between reading email in Cupertino from the SUMEX-AIM IMAP server some fifteen miles away at Stanford and doing the same thing on his desktop at Stanford. A great deal of this performance was implicit in the original design of the protocol, provided that the clients were well written, as macMM and MM-D both were. While mail is not P2P, it gives the user a sense that it is P2P: servers in the middle tend to be invisible servants for most of today’s email users. And just as IMAP was born out of the requirements of the 1980’s, we see today’s network infrastructures and computer resources ready for new email protocols. To this end, for a discussion of P2P email see Chapter 7.

Similarly, one cannot discuss early network applications without mentioning telnet and ftp. Both can be viewed as early P2P applications for those users who had Unix workstations or Lisp machines as desktops. Each system was both a telnet/ftp client and server. Users regularly ran applications on one another’s systems, and exchanged data with ftp. This is discussed further in the next section.

It is also amusing how today’s users believe Instant Messaging and chat rooms are a phenomenon of the new millennium. Chat room applications were available on mainframes before we had networks. One’s buddy list was extracted from a “systat” command, which showed who was logged in at the time, and the chat rooms were supported by dumb terminals like Heathkit Z29’s or Datamedia’s. Mike Achenbach wrote a chat application exactly like this that ran on TENEX mainframes. The terminal screen was broken into rectangles to support each member of the chat room. These chat rooms were small, but the functionality was identical. When networks arrived, we had PUP-based chat on Lisp machines; the UI’s were graphics based. Finally, the Unix talk command was always networked and was used in conjunction with rwho for presence detection. The protocols have evolved over the years, but the ideas came from the 1980’s. Nothing really new has been invented in this arena since that time. Chat and talk were both P2P applications.

Also, LAN and Internet router software used routing protocols to communicate routing information in a P2P manner. Here we are thinking of the Routing Information Protocol (RIP) and inter-domain routing based on the Border Gateway Protocol (BGP). Whether broadcasting routing information (RIP) or supplying it over a reliable connection (BGP), the service was symmetric, with each such system behaving as both a client and a server.
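The symmetric exchange that RIP peers perform can be sketched in a few lines of Python. This is a simplified, illustrative distance-vector update under our own assumptions (single-hop link cost, toy network names); real RIP adds split horizon, timers, and triggered updates:

```python
# Minimal distance-vector merge in the spirit of RIP (illustrative only).
INFINITY = 16  # RIP treats a metric of 16 hops as unreachable

def update_routes(routes: dict, neighbor: str, neighbor_routes: dict, cost: int = 1) -> dict:
    """Merge a neighbor's advertised routes into ours, keeping the cheaper path.

    routes maps destination -> (metric, next_hop); neighbor_routes maps
    destination -> metric as advertised by that neighbor.
    """
    updated = dict(routes)
    for dest, metric in neighbor_routes.items():
        new_metric = min(metric + cost, INFINITY)
        if dest not in updated or new_metric < updated[dest][0]:
            updated[dest] = (new_metric, neighbor)
    return updated

# Example: we know only our own network; neighbor B advertises what it can reach.
mine = {"netA": (0, "local")}
from_b = {"netB": 0, "netC": 1}
mine = update_routes(mine, "B", from_b)
print(mine)  # netB via B at cost 1, netC via B at cost 2
```

Because every router both advertises and consumes these updates, each node acts as client and server at once, which is the symmetry the text points out.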

Another application that was born in the 1980’s is the network bulletin board. They arrived in many forms. While some were simple digests with moderators, others were fully distributed client/server systems with interest lists to which a user could subscribe, and both post and read messages. Good examples of the former are the SF-Lovers digest and info-mac. The SF-Lovers digest was ongoing email messages where the daily threads were mailed out as a moderated digest to the subscribers. The subject was science fiction and fantasy, and SF-Lovers became extremely popular in the late 1970’s with the release in 1977 of the first Star Wars film, “A New Hope.” Info-mac was all you wanted to know about the Macintosh and was hosted by SUMEX-AIM for more than a decade. What was admirable about such digests was the dedication of the moderators. Keeping such a digest active was a full-time job, and those who moderated these digests did it out of a passion for the subject. It was voluntary. The Network News Transfer Protocol (NNTP) was specified in rfc977 in 1986. “NNTP specifies a protocol for the distribution, inquiry, retrieval, and posting of news articles using a reliable stream (such as TCP) server-client model.” USENET servers on top of NNTP were P2P systems in the sense that they were clients of one another in order to update their news databases. The actual net news clients could run on any system that had TCP/IP as the transport. The client-side protocol is simple and elegant, and the USENET client/server system provided a powerful mechanism for the massive exchange of opinions on almost any topic imaginable.

One might pause and ask, where is the ROI: “show me the money.” Even applications as simple as ftp on TCP/IP encouraged digital data repositories to be created, and thus the rapid exchange of information. People caught on quickly in the 1980’s, and soon other content query, storage, and exchange protocols were placed on top of TCP/IP. Among these were networked SQL and distributed database technology; digital libraries of medical information at NIH; remote medical analysis; networked print services from companies like IMAGEN; and laboratory-to-laboratory research as exemplified by national resources like the Stanford University Medical Experimentation in AI and Medicine (SUMEX-AIM) facility. All of these networked technologies led to huge cost savings and streamlined both research and business processes, thus yielding more profit and ROI.

Finally, “During the late 1980s the first Internet Service Provider companies were formed. Companies like PSINet, UUNET, Netcom, and Portal were formed to provide service to the regional research networks and provide alternate network access (like UUCP-based email and Usenet News) to the public. [HISTINTERNET]”

2.1.3 Workstations and Distributed File Systems

The 1980’s also marked the birth of systems such as the personal Lisp machine, the Unix workstation desktop, the Macintosh, and the PC. These machines, for the first time, gave users their own systems for running applications and network services, and broke away from the approach of “all of your eggs in one basket,” that is to say, a dependency on a serial-line tether to a time-shared mainframe to run applications and store data. As already discussed, routers on the other hand inspired the “let’s connect everything to everything” attitude. They provided the means to this inter-connectivity, be it on a LAN, a WAN, or the Internet. An important feature of routers that is often overlooked is that they also form barriers that isolate local subnet traffic to that subnet. Consequently, they permit a great deal of experimentation to take place within a LAN without having it disrupt the day-to-day business that is conducted through the interaction of the many hosts connected to the LAN. Thus, in particular, the 1980’s found users and researchers alike in the ideal network environment, where co-habitation was the accepted policy, and routers effectively administered the policy. We were at this time clearly on the path towards both centralized client/server and decentralized, distributed computational services. And as seen below, although not called P2P, the freedom this environment provided encouraged both distributed file sharing and computation.

Since many of these systems (Unix desktops and Lisp machines in particular) had client as well as server capabilities, telneting or ftping between them was the norm. Also, mutual discovery was done with DNS. Every host on the Internet could have a fixed IPv4 address, and it was easy to keep track of the unique host names of interest that were bound to those addresses. In this sense, users having symmetric ftp access to one another’s systems is P2P in its generic form. Noting that this was as easily done across the Internet as on one’s local subnet or LAN, since each such system had a unique IP address, the true end-to-end connectivity that existed at that time yielded P2P in its purest state.

The early 1980’s featured the rise of Unix servers. These servers ran the rdist software that permitted them to share binary updates automatically and nightly. They were peers from the perspective of rdist. Similarly, Lisp machines such as Symbolics Systems, and Texas Instruments Explorers were extremely popular as research workstations, and they too behaved as both clients and servers, as peers using their own file sharing applications as well as ftp.

The Network File System (NFS) was introduced by Sun Microsystems in 1984, and later documented in rfc1094. It was quickly followed by the Andrew File System (AFS) from Project Andrew at Carnegie Mellon University. While NFS was restricted to the LAN, AFS was Internet-wide. These file systems run on both clients and servers, and permit users to view a distributed file system as a collection of files virtually on their own systems. The Unix “ls” command was location independent. Therefore, to access a file one used the usual local command-line interfaces, since drag-and-drop user interfaces did not yet exist. As long as the connectivity was there, any file for which the user had access privileges could be simultaneously shared as if it were on the local system. This is again an example of P2P file sharing. A major difference between NFS and AFS file sharing and what has become known as file sharing in the current decade is that the latter is done by retrieving a copy and storing it locally, while the distributed file systems worked, and still work, perfectly well as virtual file systems. The file itself need not reside on the local system even if it appears to do so. Thus, a file can be read or written with simultaneous access, with appropriate locking mechanisms to prohibit simultaneous writes. One other difference is the nature of the content. During the 1980’s, for the most part, shared content was either text or application binaries, and thus the impetus for massive file sharing did not exist as it does now. The user communities in the 1980’s were initially technical and research based, and evolved to include businesses towards the end of the decade. Still, it is easy to imagine what might have happened if even digital music had been available for distribution during that epoch.

We are quite sure that speakers would have appeared on workstations, and distributed virtual file systems like NFS and AFS would have been one of the communication layers beneath the Napster equivalents of the 1980’s. Sure, the audiences would have been smaller, but the technology was there to do what was required for LAN/WAN-wide distribution of digital content, and the Internet connected the LAN’s and WAN’s. You get the picture.

Using these distributed file systems for P2P was a natural for read-only file sharing of multimedia content. Recall that disk drives were limited in size, and that many of the early workstations were often diskless; they booted off of the network and ran programs using NFS. Still, peerNodes could have auto-mounted the file systems containing content of interest, and then searched, listed, and viewed it as appropriate for the media type. The meta-data for each file could have been cached throughout the P2P network on small servers behaving much like mediators, and could have carried with it the file-system location of where the file resided. The meta-data and content might have migrated with access to be close to those users with whom it was most popular. Noting that scatter-gather techniques are a variation on the themes used in the 1980’s both for the interleaving of memory and for storing files across multiple disk-drive platters for simultaneous access with several disk-drive read heads to improve performance, coming up with a similar scheme for distributing files in a more efficient way is, and was, an obvious next step. A file may in fact have existed in several chunks that were co-located on the thus-constructed P2P network. The demand would have motivated the innovation. Finally, since the content never needed to be stored on the system that was accessing it, if necessary, digital rights management software could have been integrated as part of the authentication for access privileges. Thus, P2P content sharing existed in a seminal, pure form in the 1980’s, and the technological and engineering innovations in place today that give us global content sharing on P2P networks are really a tuning and reworking of old ideas, accompanied by the expansion of the Internet, the performance-enhancing corollaries associated with Moore’s law, and drastically increased local disk storage. The authors sincerely believe that careful research for prior art would uncover sufficient examples from the 1980’s to invalidate huge numbers of current software patents.
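The chunked, scatter-gather style of distribution can be sketched in a few lines. This is a minimal illustration under our own assumptions (the tiny chunk size, the peer names, and the hash-based placement rule are ours, not a description of any 1980's system):

```python
import hashlib

CHUNK_SIZE = 4  # bytes; tiny for illustration -- a real system would use KB/MB chunks

def split_into_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Scatter: cut a file into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def place_chunk(chunk: bytes, nodes: list) -> str:
    """Assign a chunk to a peerNode by hashing its content (deterministic placement)."""
    digest = hashlib.sha256(chunk).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]

def gather(chunks: list) -> bytes:
    """Gather: reassemble the original file from its ordered chunks."""
    return b"".join(chunks)

nodes = ["peerA", "peerB", "peerC"]            # hypothetical peerNodes
data = b"P2P content sharing, 1980s style"
chunks = split_into_chunks(data)
placement = {i: place_chunk(c, nodes) for i, c in enumerate(chunks)}
assert gather(chunks) == data                  # the round trip is lossless
```

Because placement is a pure function of the chunk's content, any peer can recompute where a chunk lives without a central index, which is the efficiency argument the paragraph makes.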

Just as distributed file systems were developed in the 1980’s, so were distributed operating systems. The latter bear a very strong resemblance to P2P systems. In this spirit we next describe the V-System, which was developed at Stanford University.

2.1.4 The V-System

One thing that can be said about the 1980’s is that all efforts were made to support heterogeneous computing environments. We’ve already mentioned the multiple network protocols that were present. Along with these protocols one also found a large assortment of computers. At Stanford University, for example, research was done on systems such as Sun workstations, VAX’s, DEC-20’s, and Lisp machines.3 These systems also supported student users. Appropriately enough, one would expect distributed-systems research to look at ways to use these machines in a co-operative fashion, and this is exactly what the V-System did under the guidance of computer science professors David Cheriton and Keith Lantz. It brought together the computing power of the above collection of machines in a way that was really P2P. The major goal of the V-System was to distribute processing and resources, and to do so with protocols and API’s that were system independent. Why discuss the V-System in detail? As you will see, the organization of the V-System, its model, and the approach that was taken towards development were carefully thought out and implemented to satisfy the needs of the user; to separate each system component with API’s and protocols that were machine independent; to yield a system that had user satisfaction and performance as primary goals rather than afterthoughts; and to address its network protocols to the IETF. The software development practices adhered to by the graduate students were way ahead of their time. All protocols and API’s were carefully documented, and rules for writing consistent C code were strictly followed. And, last but not least, it exhibited many features of P2P systems.

The V-System begins with its user model. Each user had a workstation, and state-of-the-art user interface support was a first principle. “The workstation should function as front end to all available resources, whether local to the workstation or remote. To do so the V-System adheres to three fundamental principles:

1. The interface to the application programs is independent of particular physical devices or intervening networks.

2. The user is allowed to perform multiple tasks simultaneously.

3. Response to user interaction is fast [V-SYSTEM].”

3. Bill Yeager wrote an Interlisp version of the VGTS in 1985. The VGTS is a V-System component explained in this section. The Interlisp VGTS was used to demonstrate the power of remote virtual graphics by communicating with a V-System running on a Sun workstation, where the graphics were displayed. The graphics were generated on a Xerox D-machine.

It is refreshing to see the user placed first and foremost in a research project. All processing was parallel, and a “reasonably sophisticated” window system was employed. Applications ran either locally or remotely, and when user interaction was required they were associated with one or more virtual terminals.

“The V-System adheres to a server model [V-SYSTEM].” In the V-System, resources are managed by servers and accessible by clients. One can view a server as an API that hides the resource it represents; thus it is by the means of the API that the resource can be accessed and manipulated. The API’s are well defined across servers, thus yielding consistent access. In a sense, the V-System has many properties of today’s application servers, with perhaps the following exception: a server can act as a client when it accesses the resources managed by another server. “Thus, client and server are merely roles played by a process [V-SYSTEM].” And here we see the P2P aspect of the V-System.

It is easy to imagine the collection of workstations running the V-System all sharing resources in a symmetric way. The resources can be CPU cycles or content or both. This is again pure P2P. Let’s look a little more closely to see what else can be revealed.

The system is a collection of clients and servers that can be distributed throughout the Internet, and that can access and manipulate resources by the means of servers. The access to a resource is identical whether the resource is local or remote, since there are API’s and protocols that are used for this access. This access is said to be “network transparent.” There is a principle of P2P that resources will tend to migrate closer to the peerNodes that show interest in them. The V-System has a similar feature: V-System clients may influence or determine the location of a resource.

In order to support the server processes, the V-System has a distributed kernel, which is the collection of V-Kernels that run on each machine or host in the distributed system. “Each host kernel provides process management, interprocess communication, and low-level device management facilities.” Furthermore, there is an Inter-Kernel Protocol (IKP) that permits transparent, inter-process communication between processes running in V-Kernels. Let’s take a quick look at a few of the typical V-Servers:

1. Virtual Graphics Terminal Server: Handles all terminal management functions. There is one per workstation. An application may manipulate multiple virtual terminals, and the Virtual Graphics Terminal Protocol (VGTP) is used for this purpose. The VGTP is an object-oriented protocol in which graphic objects can be recursively defined by other graphic objects; thus the VGTS supports structured display files, which are highly efficient with respect to both the frequency of communication and the amount of data communicated.

2. Internet Server: Provides network and transport level support.

3. Pipe Server: Standard asynchronous, buffered communication.

4. Team Server: Where a team is a collection of processes on a host, the team server provides team management. Noting that applications can migrate between hosts, this migration and remote execution is managed by the team server.

5. Exception Server: Catches process exceptions and manages them appropriately.

6. Storage Server: Manages file storage.

7. Device Server: Interfaces to standard physical devices like terminals, mice, serial lines, and disks.

It is therefore simple to visualize a typical workstation running the V-System, with users running applications communicating with processes which form teams, all managed by the distributed V-Kernel's servers. The symmetry of client/server roles is clear, and symmetry is at the heart of P2P.

Now, suppose that the distributed V-Kernel is active across a LAN on multiple hosts, and that there are team processes on several of the hosts that have a common goal, or at least a need to share resources. What is the most efficient way for this communication to take place? First, we need to organize the teams. In the V-System the teams are organized into host groups. A host group is a collection of servers on one or more hosts. And, certainly, there can be many host groups active at the same time in the V-System. They are similar to our connected communities as well as Jxta peer groups. In fact, a host group can be implemented as a connected community. Again, the computer science roots of P2P reach back at least to the 1980's.

In order to efficiently communicate between the distributed host groups the V-System uses multicast, first described in rfc966 and ultimately obsoleted by rfc1112. The authors of rfc966 are David R. Cheriton and Steve Deering. Steve was a computer science graduate student in David's distributed systems group. The author of rfc1112, entitled "Host Extensions for IP Multicasting," is Steve Deering. Rfc1112 is an Internet standard. What follows is an excerpt from rfc1112:

IP multicasting is the transmission of an IP datagram to a "host group", a set of zero or more hosts identified by a single IP destination address. A multicast datagram is delivered to all members of its destination host group with the same "best-efforts" reliability as regular unicast IP datagrams, i.e., the datagram is not guaranteed to arrive intact at all members of the destination group or in the same order relative to other datagrams. The membership of a host group is dynamic; that is, hosts may join and leave groups at any time. There is no restriction on the location or number of members in a host group. A host may be a member of more than one group at a time. A host need not be a member of a group to send datagrams to it. A host group may be permanent or transient.

Indeed, host groups are the forerunners of connected communities. To accommodate host groups in IPv6 there are dedicated group multicast addresses.

It would have been quite simple to implement P2P chat rooms in the V-System given the VGTS. The implementation would have been quite efficient with the use of IP Multicasting as it is implemented. This is because IP Multicast datagrams were directed to the subnets on which the host groups reside. On each such subnet a single IP datagram is multicast to all of the host group members, yielding a huge savings in bandwidth. This is very much like the MBone that is used for multicasting video on the Internet.

Content sharing would also be straightforward with the storage server and VGTS. The V-System could also have been used for grid computing, where host groups partition the grid for targeted, host-group-based calculations.

Finally, we are sure that the V-System is not the only example from the 1980's of a distributed system that is very P2P-like in its behavior. P2P is really a nascent capability that a couple of decades of development has brought to the mainstream. We next look at the decade of the 1990's, a decade of maturation of the ideas from the 1980's, with a lot of help from excellent hardware engineering taking advantage of Moore's Law.

2.2 The 1990’s - The Decade of the Information Highway

Recall from section 2.1.1 that the Internet had been so successful that on April 30, 1995 NSF abandoned the NSFNET backbone in favor of a fully privatized backbone, having achieved a growth to about 100,000 networks in the United States. During the same time the Internet[4] was on a global growth path. While universities, research laboratories, governments and companies were discovering a better, more streamlined way of doing business using the Internet, it is clear that the invention of the world wide web by Tim Berners-Lee in 1991 was the real force behind bringing the Internet from where it was then to where it is now, in 2004.

Tim Berners-Lee writes, "Given the go-ahead to experiment by my boss, Mike Sendall, I wrote in 1990 a program called "WorldWideWeb", a point and click hypertext editor which ran on the "NeXT" machine. This, together with the first Web server, I released to the High Energy Physics community at first, and to the hypertext and NeXT communities in the summer of 1991. Also available was a "line mode" browser by student Nicola Pellow, which could be run on almost any computer. The specifications of UDIs (now URIs), HyperText Markup Language (HTML) and HyperText Transfer Protocol (HTTP) were published on the first server in order to promote wide adoption and discussion. [Berners-Lee]" The first web server was put on-line in 1991, and access grew by an order of magnitude each year up until 1994.

By 1994 the interest in the web was so large in both business and academia that Tim decided to form the World Wide Web Consortium (w3c). At the same time a series of rfc’s specified the protocols and definitions in the IETF:

1. rfc1630 Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web. T. Berners-Lee. June 1994.

2. rfc1738 Uniform Resource Locators (URL). T. Berners-Lee, L. Masinter, M. McCahill. December 1994.

3. rfc1866 Hypertext Markup Language - 2.0. T. Berners-Lee, D. Connolly. November 1995.

[4] The term "Internet" as we use it includes both the private and public networks. Purists may find this objectionable, but in hindsight that is what the Internet became in the mid-90's.

4. rfc1945 Hypertext Transfer Protocol -- HTTP/1.0. T. Berners-Lee, R. Fielding, H. Frystyk. May 1996.

The 1990’s also launched the commercial use of the Internet. There was resistance from academics to the commercialization. “Many university users were outraged at the idea of non-educational use of their networks. Ironically it was the commercial Internet service providers who brought prices low enough that junior colleges and other schools could afford to participate in the new arenas of education and research[HISTINTERNET].”

With the end of commercial restrictions in 1994 the Internet experienced unprecedented growth. ISP's flourished and began to offer both web access and email service. Fibre optic cables were pulled to globally connect major industrial areas, satellite service was added, and the web went mobile-wireless with the introduction of the Wireless Access Protocol (WAP) in the late 1990's, bringing web access to mobile phones and PDA's. As we are all aware, by the end of the decade web sites flourished, and the .COM era arrived. Perhaps the best measure of this growth is the number of web pages that are indexed: "The first search engine, Lycos, was created in 1993 as a university project. At the end of 1993, Lycos indexed a total of 800,000 web pages [HISTINTERNET]." Google currently indexes 4,285,199,774 web pages. In a little over ten years the increase is 5,000-fold!

With respect to standards, during the decade of the 1990's the IETF was extremely active: 1,679 rfc's were published, versus 380 in the previous decade. The intellectual contribution to the IETF was escalating, and this is a standards body that became extremely formal during the 1990's. Processes were put in place to create application areas, working groups and an overall governing body for the IETF. The WAP specification 1.0 was released in 1999, thus giving a standards foundation for proxied Internet access by mobile phones. As is discussed in chapter 6, Java standards for these same devices were put into place with MIDP 1.0 in the Fall of 1999. These standards, along with the increase of Internet bandwidth, brought to the users of the Internet the protocols to access a diversity of content types on a variety of devices. In the 1980's we had for the most part text-based content. The innovations of the 1990's provided multi-media content to the masses: text, images, sound, and video. And, the masses loved it!

Finally, we have the issue of security, and in particular ciphers and public key algorithms. With respect to the former, the patent for the DES cipher expired in 1993. Although 56-bit DES can be cracked by a brute force attack, which makes it obsolete, 3DES was introduced to make up for this shortcoming. Also, Bruce Schneier introduced the Blowfish cipher in 1993 as a public domain DES alternative. With respect to the latter, the Diffie-Hellman patent expired in 1997, and on September 6, 2000, RSA Security made the RSA algorithm publicly available and waived its rights to enforce the RSA patent. Thus, by the end of the 1990's developers were able to secure their software without a concern for royalties, which gave a large boost to e-commerce on the Internet.

As we reflect upon the last few paragraphs, the one salient thing beyond the global connectivity provided by the Internet, beyond the hardware, that motivates this growth is the driving force for people to communicate and to be entertained. Behind it all is a hunger for social interaction, for access to information, for education, and for entertainment. People will pay dollars for the applications that fulfill these needs. And, here the stage is set for P2P to enter the scene and play a major role. The Information Highway is in place and ready to handle the traffic!

2.3 The New Millennium

P2P exploded into the public's eye in 2000 with the flurry of lawsuits against Napster for contributing to the infringement of copyright by its users or peers. By that time billions of MP3 music files had been exchanged by users of the Napster network and client. The network was a collection of servers that indexed music found on users' systems that ran the Napster client. The Napster client used the servers to find the indexed music and the peers on which it resided, and the network provided the mechanisms necessary to permit the music to be shared between peers. Napster was created by Shawn Fanning in May of 1999 as a content sharing application. It was not P2P in the purest sense: at its peak there were 160 Napster servers at the heart of the Napster network. The lawsuits had the ironic effect of popularizing Napster. In March of 2001 a ruling by a U.S. Court of Appeals upheld an injunction against Napster, thus requiring it to block copyrighted songs.

In June of 1999 Ian Clarke brought us Freenet. Freenet, unlike Napster, is a pure P2P system. Ian is very interested in personal privacy and the freedom of speech. In an article on "The Philosophy of Freenet," Ian states: "Freenet is free software which lets you publish and obtain information on the Internet without fear of censorship. To achieve this freedom, the network is entirely decentralized and publishers and consumers of information are anonymous. Without anonymity there can never be true freedom of speech, and without decentralization the network will be vulnerable to attack [FREENET]." Communications by Freenet nodes are encrypted and are "routed-through" other nodes to make it extremely difficult to determine who is requesting the information and what its content is.

Ian's most recent P2P system is Locutus. Locutus emphasizes security, runs on .NET, and is targeted at the Enterprise.

A third generic P2P system of this decade is Gene Kan's Gnutella. Gnutella is an elegant protocol for distributed search with five commands. It too is pure P2P, where each peer plays the role of both a client and a server. A brief description is the following:

Gnutella is a protocol for distributed search. Although the Gnutella protocol supports a traditional client/centralized server search paradigm, Gnutella's distinction is its peer-to-peer, decentralized model. In this model, every client is a server, and vice versa. These so-called Gnutella servants perform tasks normally associated with both clients and servers. They provide client-side interfaces through which users can issue queries and view search results, while at the same time they also accept queries from other servants, check for matches against their local data set, and respond with applicable results. Due to its distributed nature, a network of servants that implements the Gnutella protocol is highly fault-tolerant, as operation of the network will not be interrupted if a subset of servants goes offline [GNUTELLA].

Gnutella has undergone a huge amount of analysis since it was launched. It had weaknesses, and these weaknesses were part of its strength: they encouraged, and yielded, excellent research in P2P and, as a consequence, improved algorithms. The P2P world is really grateful to Gene for his vision of P2P, and for his energy as an evangelist of the P2P technology.

Finally, we close our discussion of the history by noting the launch of Sun Microsystems' Project Jxta on April 25, 2001. Jxta is open source and has been under continuous development since that time. Its specifications define a P2P infrastructure that includes both peer nodes and super-peers called rendezvous. The Jxta infrastructure is fully realized with a Java implementation. From the beginning one of the goals of Jxta has been to create a P2P standard. Currently, P2P networks are not interoperable, their differing protocols creating P2P networks that are isolated islands. As a direct consequence of this desire to standardize P2P by members of the Jxta community, there is now an Internet Research Task Force Research Group on P2P. Jxta is used world-wide by P2P enthusiasts for creating P2P applications. The web site is

We have made a conscious choice in writing this book not to be encyclopedic, and thus not to list the remaining P2P applications and networks that now exist. No such list will ever be current until a consensus is reached on a P2P standard for the Internet. What we have now are cul-de-sac protocols that cannot possibly do justice to the possibilities of P2P imagined by its visionaries. Understandably, these dead-end alleyways are driven by the desire to capitalize, to profit on P2P. While not bad in itself, since capital is necessary to support research and development, we really want to see the history of P2P come to the place where agreement on a standard is reached.

We hope that this brief history of P2P has given the reader an idea of its roots, some of which are not apparently P2P on the surface. So much of technical history is filled with side-effects. One cannot always guess what a new idea will bring along with it. The original Internet of the 1980's had very few worries about security until the end of the decade, when it became global and surprise hackers arrived to cause serious problems. The Internet's age of innocence was short-lived. Still, the energy and creativity of those who continued to build out this amazing infrastructure could not be stopped. Security problems are an impedance that this creative energy continues to overwhelm. Most of the history of P2P is in front of us. Let's get to work to realize its possibilities.

Chapter 3 Components of the P2P Model

From thirty thousand feet a P2P overlay network appears as a collection of peer-nodes that manage to communicate with one another. At this altitude it is sufficient to discuss content sharing, its pros and cons, and how it will create a new Internet digital economy, as was done in Chapter 1. In order to come down out of the clouds and discuss the ground level engineering concepts, it is necessary to define the real engineering parts of these peer-nodes, the components that comprise the P2P network, as well as the protocols used for peer-node to peer-node communication. This is not unlike assembling a detailed plastic model of a classic automobile, or futuristic spacecraft. Each part is important, and the rules for assembling them correctly, i.e., the blueprints, are indispensable to the process. To this end, in this chapter we first have a discussion of the P2P document language, which is a universal, descriptive meta-component (a component for describing components). Just like the final blueprint of a home is always a collection of blueprints, one describing the plumbing, several for the multiple views, others for each room, the exterior walls, etc., our final peer-node document will also be a collection of documents. Thus, as we define our P2P model component by component in this chapter, by starting with those that are fundamental and using these as building blocks for more complex components, we will be defining a set of 4PL types and combinations thereof. These types will then be the grammar of our document language and permit us to create the multiple blueprints that will be the engineer's guide to constructing a peer-node. To help the reader build a conceptual understanding of the final, assembled model, each section explains the motivations and behaviors of the components it defines. It is from these explanations and 4PL that we derive the semantics of the document language.
3.1 The P2P Document Space

3.1.1 XML as a Document Language

In any society, to establish communication between people, either everyone needs to speak a common language, or they must be able to translate their language into one which can be understood by both parties. Peer-node to peer-node P2P network communication is not an exception to this rule. For example, whether to transfer content or to initialize connections, one peer-node will send messages to a target peer-node, and the target peer-node will send responses. The "language" in which the messages are expressed must be understood by all peer-nodes participating in the communication. In the networking world, the meaning of such a "language" is not the same as that of a programming language in the computing world. Instead, the former permits us to define the structure (or format, or syntax) in which messages are written, and unlike programming languages, this structure is independent of the message's semantics, or meaning. This structure should allow messages not only to say "hello", but also to permit peer-nodes to negotiate a secure communication channel, and to transfer multi-dimensional data along that channel. The semantics of such a negotiation or data transfer will be defined by the associated protocols' behavior and not by the underlying document language. A required feature of the document language is to permit the creation of structured documents with flexibility of descriptors or names. It is difficult to describe almost arbitrary peer-node components in a document language whose descriptor set or namespace is fixed. The language of choice must also be an accepted standard in the Internet community, and off-the-shelf, open source parsers must be available. It should also be simple to write parsers for minimal, application-defined namespaces so that it can be used across the device space.

Extensible Markup Language (XML) [XML] naturally meets the above requirements, is widely deployed on the Internet, is a World Wide Web Consortium (w3c) standard, and for us, is an ideal markup language to create structured documents that describe the various engineering components we require in order to communicate their properties amongst the peer-nodes on a P2P network. With its structured format, XML parsers are easily implemented. Also, the tags used by XML are not fixed as they are in HTML, and therefore, the elements can be defined based on any application's needs. Each application can have its particular XML namespace that defines both the tags and the text that appears between these tags that the application uses. Because of these properties, HTML can be expressed in XML as XHTML [XHTML]. Given the freedom to choose tags suitable for specific applications or devices, XHTML basic [XHTMLbasic], which is a subset of XHTML, is used for mobile-phone-like devices. There are XHTML basic browsers with a footprint of about 60K bytes, which shows the programmability and power of XML. The parsers are a small percentage of this overall browser footprint. In other words, XML is a versatile markup language that is able to represent the nature of both computer and human behavior. For example, one can imagine a small "talk" application expressed in XML. Here at the highest level John talks to Sam:

<?xml version="1.0"?>
<talk-behavior>
  <run> talk.jar </run>
  <from> John </from>
  <to> Sam </to>
  <text> Hi Sam! </text>
</talk-behavior>

John’s local system analyzes the document, starts the talk program, connects to Sam’s system, and sends the document. Upon receipt, Sam will see something like:

Message from John: Hi Sam!

The structure is in the document, the behavior is in the talk program. This is admittedly an over-simplification, but it does express the power of XML. In this case, for example, the talk application might only need to have four or five tags, and XML is used to describe several programmatic actions:

The "run" tag is interpreted by an application to mean to run a program named talk.jar. Here, the .jar extension implies that Java will be invoked to accomplish this task.

The “from” and “to” tags implicitly describe the local and remote ends of a communication and are explicit parameters for talk.jar.

Finally, the "text" tag is also a parameter for talk.jar, and the program's code sends this as a text message to Sam. Notice that the data is included between the <text> and </text> tags.


Again, it is important to emphasize that the meanings of the tags are not implied by the XML document. Our brains have associations with respect to the tag names, and they are named in this manner because humans have come to understand "run program." We could have just as well used <u>, <v>, <w>, <x>, and <y> as tags:

<?xml version="1.0"?>
<u>
  <v> talk.jar </v>
  <w> John </w>
  <x> Sam </x>
  <y> Hi Sam! </y>
</u>

The talk program's code doesn't care, and can be written to interpret any text string to mean "send the message to the name bound to the tag pair defined by this text string." After all, this is just a string match.
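To make the "just a string match" point concrete, here is a minimal sketch (our own, not code from any particular P2P system) using Python's standard-library XML parser. The same function handles both the <talk-behavior> document and the <u>/<v>/<w>/<x>/<y> variant, simply by being handed a different list of tag names:

```python
import xml.etree.ElementTree as ET

def parse_message(doc: str, tags) -> dict:
    """Extract the text bound to each tag name. The names themselves carry
    no meaning to the code; matching them is purely string comparison."""
    root = ET.fromstring(doc)
    return {tag: (root.findtext(tag) or "").strip() for tag in tags}

TALK_DOC = """<?xml version="1.0"?>
<talk-behavior>
  <run> talk.jar </run>
  <from> John </from>
  <to> Sam </to>
  <text> Hi Sam! </text>
</talk-behavior>"""

fields = parse_message(TALK_DOC, ("run", "from", "to", "text"))
# For the <u> document, the call would be:
#   parse_message(other_doc, ("v", "w", "x", "y"))
```

The application then binds behavior to whichever tag names it was configured with; the XML itself remains pure structure.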

Now, let’s generalize the above example to explain what is meant by Meta-data, or data about data.

<?xml version="1.0"?>
<behavior>
  <from> tcp://John </from>
  <to> tcp://Sam </to>
  <metadata xmlns:dc="">
    <meta-application>
      <dc:name> talk.jar </dc:name>
      <dc:type> java </dc:type>
      <dc:version> 1.0 </dc:version>
      <dc:size> 27691 </dc:size>
    </meta-application>
    <dc:access-control>
      <dc:access> read-only </dc:access>
      <dc:path> file:///home/John/friends-only </dc:path>
    </dc:access-control>
    <dc:greeting>
      <dc:filename> hello </dc:filename>
      <dc:filetype> txt </dc:filetype>
    </dc:greeting>
    <dc:attachment>
      <dc:content-type> image/gif </dc:content-type>
      <dc:filename> John.gif </dc:filename>
    </dc:attachment>
    <dc:attachment>
      <dc:content-type> video/jpeg </dc:content-type>
      <dc:filename> Hawaii.jpeg </dc:filename>
    </dc:attachment>
  </metadata>
</behavior>

In the above example "xmlns:dc" identifies the namespace with a Uniform Resource Identifier (URI) [RFC2396]. This latter URI name need only have uniqueness and persistence, and is not intended to reference a document. There are several examples of meta-data: the application version, size, type, the access control fields, and the attachments' content types. Because meta-data is such a powerful tool, many efforts have been made to standardize its format, such as the open forum Dublin Core Metadata Initiative (DCMI), the w3c standardization community and the knowledge representation community. Out of the w3c efforts we have the Resource Description Framework (RDF) [RDF]. The goal of RDF is not only to specify what kind of tags are needed, but also to enable the creation of relationships between these tags, i.e., RDF is explicitly about semantics, and uses XML syntax to specify the semantics of structured documents. For example, here's a relationship: Holycat is the creator of the resource. RDF will indicate this relationship as the mapping to the proper tags: Holycat as the creator in an RDF "Description about" the resource. The relevant part of the metadata is below:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=""
         xmlns:dc="">
  <rdf:Description rdf:about="">
    <dc:title> Holy Cat .com </dc:title>
    <dc:creator> holycat </dc:creator>
  </rdf:Description>
</rdf:RDF>

In the sections and chapters that follow we require a markup language for the structured documents we define to describe the P2P components of our overlay network. We are selecting XML for this purpose. As mentioned above, it is a w3c standard with wide and growing support in the high-tech industry; it permits us to create our own namespace to clearly express the concepts of each component; and the engineers who wish to implement a system modeled on these components will have the tools available to parse the documents [XMLparser], or can write their own application-namespace-specific XML parsers, as has been done for many small devices and existing P2P systems.

3.1.2 Publish and Subscribe: Administrated vs. Ad-hoc

Our P2P document language, XML, will provide structured information describing our components, their network behavior, the content exchanged between peer-nodes, cryptographic data, and much more. When this information is published either locally or remotely, it is crucial to efficiently administer this large document space, and this administration may or may not be centralized. Recall the "P2P Spectrum" introduced in Chapter 1. A P2P network may be configured as hybrid or pure ad-hoc, and each point in the spectrum needs various methods to distribute, publish and subscribe to these documents, as well as the policies that control both the publication and subscription. Inside a hybrid P2P network, any document can be stored and managed on centralized nodes. The more centralized the network becomes, the more server-like these peer-nodes become, and consequently, the P2P network has much more control imposed upon the peer-nodes' behavior. For example, the initial copies of the most popular music, digital certificates, the status of critical peer-nodes, and the registration of peer-node names can be published on content, enterprise key escrow, system status, and naming servers, and access is controlled by means such as passwords and firewalls. On the other hand, in a pure, ad-hoc P2P network, the information is not required to be centrally stored, and the administrative policies that control its access are set between the peer-nodes themselves. In fact, in this latter case, all peer-nodes may have unrestricted read access to all documents.

Administrated Registries

There are already existing, well administrated registries in use on the Internet. In 1983 the concept of domain names was introduced by Jon Postel [RFC811]. This was accompanied by the full specifications [RFC882] by Paul Mockapetris as well as implementation specifications [RFC883] and an implementation schedule [RFC897, RFC891]. Soon afterwards, the Domain Naming Service (DNS) was in place and was exclusively used for domain name to IP address lookups, or conversely. With time the DNS implementation has become a general, administered, Internet database. Thus, we can manage XML documents in such a fashion. While it is possible to store an entire XML document in DNS servers, this is impractical given the billions of possible peers and their associated documents. On the other hand, several fields of these documents must be unique, and in some cases, their creation controlled by those who administer the P2P network in question. Collisions of some fields that require uniqueness, given the right algorithms for their generation, will be probabilistically zero. Other such fields will be text strings, for example, a peer's name, and as such, have a high probability of collision. These will certainly be administered and controlled in enterprise deployments of P2P systems, and DNS may be appropriate for these names. The problem with DNS is that this service is already overloaded, and adding even millions of additional entries is not a good idea.

The Lightweight Directory Access Protocol (LDAP) [RFC1777] provides naming and directory services to read and write an X.500 Directory in a very efficient way. Its operations can be grouped into three basic categories: binding/unbinding, which starts/terminates a protocol session between a client and server; reading the directory, including searching and comparing directory entries; and writing to the directory, including modifying, adding to and deleting entries from the directory. Such a simple protocol can be used to store and access our P2P documents. LDAP has the advantage over DNS that the administration and scope of its directories are more flexible. A small company, or even a neighborhood in a community, can decide to put in place and administer its own LDAP directory. For a highly centralized P2P network it is appropriate to store entire XML documents, or selected fields from these documents, in LDAP directories. Tagged fields can have URI's referencing LDAP directory entries. In certain cases, it will be necessary to authenticate peers so that they can access, for example, private data, and their login name and password can be validated through an LDAP search.

Here is a hypothetical example of using LDAP to access XML fields from an LDAP directory. Assume an LDAP server, P2PLDAPServer, exists. Furthermore, assume that the organization supports shopping in department stores. Now, a client will make a query to search for Anne-Sophie Boureau, who works for a department store in Paris, France. The "Grands Magasins" in Paris are well organized, and have XML documents for each employee.

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf=""
         xmlns:dc="">
  <rdf:Description rdf:about="">
    <dc:title> Holy Chat .com </dc:title>
    <dc:creator> holychat </dc:creator>
    <dc:dn>
      <dc:uid> Anne-Sophie </dc:uid>
      <dc:company> </dc:company>
    </dc:dn>
    <dc:cn> Anne-Sophie Boureau </dc:cn>
    <dc:gn> Anne-Sophie </dc:gn>
    <dc:sn> Boureau </dc:sn>
    <dc:email> </dc:email>
  </rdf:Description>
</rdf:RDF>
So, in the following code the client creates a query, then makes a connection to the Nth server. After the connection succeeds, the client can perform the search and save the result:


LDAP_SERVER[N] = ""; LDAP_ROOT_DN[N] = "ou=DepartmentStores,"; LDAP_QUERY = "L=Paris,C=France,CN=Anne-Sophie Boureau"; connect_id = ldap_connect(LDAP_SERVER[N]); search_id = ldap_search(connect_id, LDAP_ROOT_DN[N], LDAP_QUERY);

result = ldap_get_entries(connect_id, search_id);

= "P2PLDAPServer";

In the result, we will have several items: common name (cn), distinguished name (dn), first name (gn), last name (sn), and email address (email).

result["cn"] = "Anne-Sophie Boureau"
result["dn"] = "uid=Anne-Sophie,"
result["gn"] = "Anne-Sophie"
result["sn"] = "Boureau"
result["email"] = ""

Ad-hoc Registries

Peers will be required to have a set of unique XML documents to describe the components of which they are comprised. Since the overlay network is really the join of these components, the network itself will be consistent and there will be no conflicts from duplicate documents. The details of each of these XML documents are described later in this chapter and in Chapter 4, and are not important for this discussion. When a collection of nodes on a network decides to form an ad-hoc, structured P2P overlay network, the associated XML document management cannot rely on centralized "super" machines; instead, in our model for P2P behavior, in a purely ad-hoc network, each peer manages its own document space, as well as conflicts due to duplication, to assure consistency. Suitable algorithms are described in the following sections to avoid duplication, i.e., the probability of duplication is extremely small, and when duplication arises the code will be present to deal with it. Other degrees of ad-hoc behavior are possible. For example, an ad-hoc P2P network may have a peer-node, or several peer-nodes, whose systems are reliable. In this case one can off-load the real-time registration of these documents to these peer-nodes. Registration of documents is purely ad-hoc without any administration except for the knowledge of how to connect to these peer-nodes. One advantage of this pseudo-centralization is the presence of systems whose role is real-time guardians of document consistency. The system that volunteers for this role is making the commitment to provide solid or near-solid services: being up constantly, and having a predictable degree of reliability, so that the problem of duplicated XML documents is non-existent.

We create unique documents for components by including unique identifiers (see section 3.2). An enterprise P2P overlay network behind firewalls can guarantee consistency and eliminate duplicates by helping to form unique identifiers and by managing documents on enterprise servers. But when these peer-nodes connect to a non-enterprise P2P overlay network, this guarantee is difficult to maintain because there is no global monitoring of unique identifiers, and the algorithms used to generate unique identifiers may permit the duplication of some of the components' XML documents, consequently yielding an inconsistent join of these different overlay networks. Thus, even if both of these P2P networks are strictly administered with independent, centralized control to guarantee each one's consistency, we run into this problem. The same problem exists for the pure ad-hoc network discussed just above: because there is no global registration, joins might be inconsistent. Figure 3-1 shows the problem in both situations.

Figure 3-1. The P2P Join Problem (Sub-network_1, ad-hoc; Sub-network_2, with a document administrator)

While it is clear that for security reasons enterprises may not want to permit joins of this nature, the join problem can be solved with global, ad-hoc registries. We can argue that prohibiting a global P2P address space, which is what we are describing, is the wrong way to solve such security problems, and that such an address space is a good idea to promote P2P technology and e-Commerce. What would the Internet be like today if the IP address space had been treated in the same manner? Such security problems are best solved with security algorithms. How a global P2P overlay network with globally unique identifiers, and therefore a globally consistent document space, is accomplished is discussed in section 3.2.2 of this chapter.

3.2 Peer Identity

3.2.1 One Peer among Billions

As the Internet evolves during this decade and billions of devices become Internet enabled, each of these networked devices will be capable of being a peer-node on a P2P overlay network, and each will require a unique identity that is recognizable independently of the device's user and location within this network. While we do not expect most appliances in a home network to be mobile, unique peer identities will be necessary for a large majority of these devices, so that one peer among billions, like one human being among billions, has its own DNA-like identity. This identity accompanies the node, for example a laptop, PDA, mobile phone, or Bluetooth-enabled necklace, as the former move about the Internet and the latter just meanders in a crowd looking for contact. To permit individual peers to have anonymity, a peer should be capable of changing this identity at will, and this new identity must maintain both its uniqueness and mobility properties on the P2P network to which it belongs. You are stuck with your DNA but not with your peer identity. And, as always, policies that permit change of identity are set by the appropriate members of the peer network. It is clear that most enterprises, and probably all governments, will want to carefully administer the peer identities that they legally control. But the large majority of these identities may not be registered, and registration is neither necessary nor always desirable in a P2P network. A peer should be able to generate a unique identity when it is initially configured, and the engineering fundamentals for building the P2P model described in this book will permit such a peer to become a member of a P2P network, to be discovered by other peer-nodes on this network, and to communicate with those peer-nodes. Registration of unique identities is also within the scope of this model. The model is, in fact, indifferent to administrative policies, which are decided and implemented by the P2P network owners.

So, what can one use for such a universal peer identity? Certainly, IP version 4 (IPv4) addresses are out of the question given that their 32 bit address space is nearly exhausted. IPv4 addresses are the current Internet host address default, and the difficulties this raises for P2P networks are discussed in section 3.5 of this chapter. We can consider using IP version 6 (IPv6) addresses, which provide a 128 bit address space, but at this time and for the near future, IPv6 will not be universally deployed. Still, in anticipation of this deployment, the IPv6 option must be considered from the point of view of format. If we believe that IPv6 addresses are the inevitable future for the Internet, then we can argue for at least using the IPv6 format, both to minimize the formats the software is required to support, and to stay as close to the Internet standards as possible with a long term goal of interoperable P2P software. Another choice is a unique random identifier of 128 or more bits generated by a secure random number generator. We can also "spice" any of these identifiers with cryptographic information to create unique, secure cryptographically based identities (CBID) [SUCV, CBID]. These and other possibilities are discussed in the next section.

3.2.2 Unique Identifiers for Peers

As mentioned above, it was recognized by the early 1990's that the 32 bit IPv4 address space would soon be exhausted. Most of the early innovators of the Internet thought that 4,294,967,296 addresses were sufficient, and thus the initial IP specification, RFC760, which was authored by Jon Postel and published by the Department of Defense (DOD) as a DOD standard in January of 1980, allocated two 32 bit fields in the IP packet header for the source and destination IP addresses. As an historical note, interestingly enough, the XEROX Palo Alto Research Center's XNS network protocol arrived at the same time as IP, and had 80 bits of address space, 48 bits for the host and 32 for the network; the 48 bit host address was usually the 48 bit MAC address. As we will see below, using the MAC address as part of the IPv6 address is one possible format option. It is always amusing from an historical perspective that so many cool things were done in the 1980's, and that they are not abandoned but rather, as was pointed out in Chapter 2, sit on the shelf until rediscovered. The problem with XNS was that it was XEROX-proprietary at the time, while IP has never been proprietary. They came from two different visions, and it is clear which vision won, i.e., open standards, and this is the vision for the future. In any case, in December of 1995, the first specifications for IPv6 were submitted to the Internet Engineering Task Force (IETF) in RFC1883 and RFC1884 1 . In order to more fully understand the appropriateness of IPv6 addresses for unique identifiers, a very careful reading of RFC2373 and RFC2460, which obsolete the previous two RFC's, is necessary. We give the reader a good enough overview of these RFC's in the subsection below. Again, it is important to keep in mind that the immediate goal with respect to IPv6 is, when appropriate, to use its address format to store and publish generated, unique identities.

As mentioned in the introduction to this section, IPv6 is not the only possible choice for the format of what is called a Universally Unique Identifier (UUID). There are many good algorithms that can generate thousands of UUID's per second, the IPv6 format may not be suitable in some cases, and in this book's model multiple UUID's will be necessary for its component set. These UUID's are discussed in a later subsection. Keeping this in mind, let's first move on to the primer on IPv6 addresses.

IPv6 Addresses

IPv6 solves several shortcomings of IPv4. IPv6 is designed to improve upon IPv4's scalability, security, ease-of-configuration, and network management [King99]. The scalability improvements reflect both increasing the address space size as well as providing the mechanism for scalable Internet routing. We've adequately discussed the 32 bit address space limitation in this chapter, which IPv6 eliminates with 128 bit addresses. This is an almost unimaginable number. If we start now and use one billion addresses per second without recycling, then we have enough addresses to last 10^22 years. Noting that our sun will go supernova in 10^9 years, and that if the universe is closed, its calculated lifetime is about 10^11 years, clearly IPv6 addresses solve the address space size problem for the foreseeable future. Since IPv4 addresses do not provide a mechanism for hierarchical routing, like, for example, the telephone exchange does for phone calls with country and area codes, IP routers' routing table size has become problematic as the Internet has grown in a way that was not anticipated by its founders. With the original IPv4 address space formats, the class A, B, and C networks provided no mechanism for hierarchical routing. The classic IPv4 address format, as defined in RFC796, permits 127 class A networks, 16,383 class B networks, and 2,097,151 class C networks. Since this is a flat address space, to route to every network using this scheme an entry for each network is required in each router's routing table. With the advent of Classless Inter-Domain Routing (CIDR) [RFC1519] in 1993, a hierarchical means of creating 32 bit IPv4 addresses was devised as a near-term solution to this problem. CIDR is backward compatible with the older IPv4 addresses, but does not eliminate the already existing legacy networks. A route to each one still must be maintained in the routing tables of all routers that provide a path to such a network, but these routes can coexist with the CIDR network addresses. Thus, in spite of the CIDR near term solution, a true hierarchical addressing scheme is required, and IPv6 provides such a mechanism.
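The classful layout described above can be illustrated with a short sketch (ours, not from any RFC) that determines the class of a "classic" IPv4 address from the leading bits of its first octet:

```python
# Determine the class of a classic IPv4 address from its leading bits:
# class A addresses begin with 0, class B with 10, class C with 110.

def ipv4_class(first_octet: int) -> str:
    if first_octet & 0x80 == 0x00:   # leading bit 0  -> 7-bit network number
        return "A"
    if first_octet & 0xC0 == 0x80:   # leading bits 10 -> 14-bit network number
        return "B"
    if first_octet & 0xE0 == 0xC0:   # leading bits 110 -> 21-bit network number
        return "C"
    return "other"                   # class D (multicast) or class E (reserved)

print(ipv4_class(10))    # a class A network
print(ipv4_class(172))   # a class B network
print(ipv4_class(192))   # a class C network
```

The flat routing consequence follows directly: nothing in the remaining bits relates one network number to another, so each network needs its own routing table entry.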

1. For a discussion of the Internet standards process see Appendix II.

Figure 3-2. IPv4 Address Format (class A: leading bit 0; class B: leading bits 10; class C: leading bits 110; the remaining bits are split between the network number and the Local Address)

IPv6 offers multiple ways to format its 128 bit addresses, and there are three types of addresses: unicast, anycast, and multicast. Given that a node on an IP network may have more than one interface attached to that network, a unicast address is an identifier for a single interface; an anycast address is an identifier for a collection of interfaces for which an anycast packet destined for this collection is delivered to one and only one of the interface members; and a multicast address is an identifier for a collection of interfaces for which a multicast packet destined for this collection is delivered to all of them. Because an anycast address is syntactically indistinguishable from a unicast address, nodes sending packets to anycast addresses are generally not aware that an anycast address is being used. We will concentrate our explanations on those addresses which are most useful for our P2P UUID purposes. In particular, the IPv6 aggregatable global unicast address is salient here [RFC2374], since it solves the scalable routing problem and also provides a method to generate globally unique IP addresses when used in conjunction with IPv6 Neighbor Discovery (ND) [RFC2461] and IP stateless address autoconfiguration [RFC2462]. As we see in Figure 3-3, the aggregatable global unicast address permits aggregation in a three level hierarchy.
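The syntactic point above can be demonstrated with a small sketch of ours using Python's standard ipaddress module: multicast addresses occupy a recognizable prefix (ff00::/8), while an anycast address cannot be told apart from a unicast address by inspection.

```python
import ipaddress

# Classify an IPv6 address by inspection. Multicast is recognizable by its
# prefix; anycast is syntactically identical to unicast, so the best we can
# say for a non-multicast address is "unicast (or anycast)".
def address_type(addr: str) -> str:
    a = ipaddress.IPv6Address(addr)
    return "multicast" if a.is_multicast else "unicast (or anycast)"

print(address_type("ff02::1"))        # all-nodes multicast address
print(address_type("2001:db8::1"))    # documentation-range unicast address
```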







Figure 3-3. Aggregatable Global Unicast Address Structure (the public topology: a 3 bit Format Prefix of 001, the Top-Level Aggregation Identifier, bits reserved for future use, and the Next-Level Aggregation Identifier; the Site-Level Aggregation Identifier; and the low order 64 bit Interface Identifier)

The Top-Level Aggregator (TLA) identifiers are at the top node in the Internet routing hierarchy, and must be present in the default-free routing tables of all of the top level routers in the Internet. The TLA ID is 13 bits and thus permits 8,191 such ID's. This will keep these routing tables within reasonable size limits, and the number of routes per routing update that a router must process to a minimum. It is worth noting that in the spring of 1998 the IPv4 default-free routing table contained approximately 50,000 prefixes. The technical requirement was to pick a TLA ID size that was below, with a reasonable margin, what was being done with IPv4 [RFC2374].

The Next-Level Aggregator (NLA) identifier is for organizations below the TLA nodes and is 24 bits. This permits 16,777,215 flat ID's, or can give a hierarchical arrangement of addresses similar to that of IPv4. One could, for example, do something similar to CIDR here. Next we have the Site-Level Aggregator (SLA) for the individual site subnets. This ID is 16 bits, which permits 65,535 subnets at a given site. The low order 64 bits are for the interface identifier on the local-link to which a host with an IPv6 address belongs. This is usually the real MAC address of a host's interface. It is certainly possible that during certain time windows two hosts may end up with the same such address, and there are means available to resolve these conflicts and to guarantee global uniqueness. These are discussed just below.
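The field widths above can be checked by assembling an address. The following sketch of ours packs illustrative (not real) field values into a 128 bit aggregatable global unicast address per the layout in Figure 3-3; note that 3 + 13 + 8 + 24 + 16 + 64 = 128.

```python
# Assemble a 128-bit aggregatable global unicast address from its fields.
# Field widths: Format Prefix = 3 (001), TLA = 13, Reserved = 8, NLA = 24,
# SLA = 16, Interface ID = 64. All field values below are illustrative.

def aggregatable_unicast(tla: int, nla: int, sla: int, iface_id: int) -> int:
    assert tla < 2**13 and nla < 2**24 and sla < 2**16 and iface_id < 2**64
    fp, reserved = 0b001, 0
    addr = fp
    for value, width in ((tla, 13), (reserved, 8), (nla, 24), (sla, 16), (iface_id, 64)):
        addr = (addr << width) | value
    return addr  # the full 128-bit address as an integer

addr = aggregatable_unicast(tla=0x1, nla=0x2, sla=0x3, iface_id=0x0123456789ABCDEF)
print(f"{addr:032x}")
```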

The authors of the IPv6 RFC's understood clearly the burden IPv4 imposed on network administrators. The seemingly simple task of assigning a new IP address is, in fact, not that simple. The address must be unique. Yet, often enough there are unregistered IP addresses on a subnet, and in most cases the perpetrator is innocent, the original intent usually being a temporary address for a test, and the temporary address was never unassigned. The unfortunate side effect is that two systems will receive IP Address Resolution Protocol (ARP) requests from, for example, a router, and both will reply. Which system will receive the packet that initiated the ARP request is arbitrary. There is also the assignment of a default router and DNS servers. While most of this is now solved with the Dynamic Host Configuration Protocol (DHCP) [RFC1531], it is still a source of administrative difficulty when hosts change subnets or IP addresses must be renumbered. Also, mobility adds a new twist to the equation (Mobile-IP). Most large organizations have a dedicated staff to deal with these administrative issues, which are more often than not a source of painful and costly network bugs. An illustrative example, as recalled by one of the authors, William Yeager, is sufficient here: In the late 1980's Stanford University had networks that supported multiple protocols, and an organization was put in place to administer the university's rapidly growing local area network. One afternoon all XEROX Interlisp machines in the Knowledge Systems Laboratory (KSL) went into hard garbage collection loops. These systems were used as desktops as well as for research, and so about one hundred and twenty five people lost the use of their systems. Rebooting did not solve the problem. William Yeager always watched the network traffic, keeping a network sniffer continuously active in his office, and he noticed a huge, constant upsurge in Xerox Network Services (XNS) routing table updates; all of the routes being advertised were illegal, constantly changing, and non-repeating. The lisp machines in question cached XNS routing table entries, and thus were madly updating internal data structures and freeing up entries, resulting in a hard garbage collection loop. At that time, when a router received a new route it always immediately advertised it. These routes were originating on the backbone network from a previously unknown pair of routers. Fortunately, the KSL managed its own routers and the code they ran. William generated an immediate patch, which was added to the appropriate router to firewall the errant routing table advertisements and keep them on the backbone. A phone call to a Stanford network administrator alerted them to the problem. It was their own. They had installed two XNS routers to support some administrative software, and assumed they worked fine. They did on small networks, but when the number of XNS networks exceeded 17 all hell broke loose. The KSL had 17 such networks, and triggered this bug. The routers were shut down until the problem was resolved. Such scenarios are not atypical. They arrive out of nowhere on a daily basis. Anything that can be done to ease the burden on network administrators is important.

To simplify the task of assigning IPv6 addresses, IPv6 autoconfiguration capabilities have also been defined. Both stateful and stateless autoconfiguration are possible. Either one or the other or both can be used, and this information is flagged, and thus automated, in periodic IPv6 router updates. If stateful autoconfiguration is used, then a stateful configuration server is contacted, which assigns an IPv6 address from a known, administered list. Even in this case ND, as described below, is used to assure that the server-supplied address is unique. If it isn't, the address is not assigned and the appropriate error message is logged.

Stateless autoconfiguration begins with the assignment of a link-local address [RFC2462] using the 64-bit interface ID. This is usually the MAC address, but any unique token will do. Next, the host uses the ND Neighbor Solicitation (NS) message to see if this identifier is unique. If no system complains (a response would be received from a neighbor with a matching token), then the identifier is assumed to be unique. If it is found not to be unique, then an administrator is required to assign an alternative link-local address. This may appear to be heavy handed, but is not: it is important to verify whether there are in fact two identical MAC addresses on the local-link. The authors believe that it is sufficient to log the problem, and to use a secure random number generator to create the 64-bit tokens used here in conjunction with ND. These can be created in such a way as not to be in MAC address format. Such a system will at least permit a host to autoconfigure and get on-line; a later administrative action can fix the address if necessary. Next, given a unique link-local address, periodic router advertisements contain the necessary prefix information to form a complete IPv6 address of 128 bits. A node can request such an advertisement with an ND router solicitation. IPv6 addresses have preferred and valid lifetimes, where the valid lifetime is longer than the preferred lifetime. An address is preferred if its preferred lifetime has not expired. An address becomes deprecated when its preferred lifetime expires, and invalid when its valid lifetime expires. A preferred address can be used as the source and destination address in any IPv6 communication. A deprecated address must not be used as the source address in new communications, but can be used in communications that were in progress when the preferred lifetime expired. An address is valid if it is preferred or deprecated; thus, deprecation gives a grace period for an address to pass from preferred to invalid. The interested reader can explore the complete details of autoconfiguration in the RFC's mentioned in this section. A full list of the IPv6 RFC's can be found in the bibliography.
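Our suggestion above, a securely random 64-bit token that cannot be confused with a MAC-derived identifier, can be sketched as follows. This is our own illustration, not code from any RFC: EUI-64 identifiers built from a 48 bit MAC insert 0xFFFE in the two middle bytes, so we regenerate until those bytes differ, and we clear the universal/local bit so the token is marked as locally assigned.

```python
import secrets

# Generate a random 64-bit interface ID with a secure RNG, shaped so that it
# is not in MAC-derived (modified EUI-64) format.
def random_interface_id() -> bytes:
    while True:
        token = bytearray(secrets.token_bytes(8))
        token[0] &= 0xFD                     # clear the u-bit: locally assigned
        if token[3:5] != b"\xff\xfe":        # MAC-derived EUI-64 has FF FE here
            return bytes(token)

print(random_interface_id().hex())
```

If ND still reports this token as a duplicate several times in a row, something other than chance is at work, which is exactly the attack-detection heuristic discussed later in this section.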

Finally, IPv6 provides a header extension for the Encapsulating Security Payload (ESP) [RFC2406]. This can permit authentication of the data's origin (anti-source-spoofing), integrity checks, confidentiality, and the prevention of replay attacks. The well tested MD5 and SHA-1 hash algorithms are used; authentication is done with Message Authentication Codes (MACs) (symmetrically encrypted hashes), and confidentiality with symmetric encryption algorithms like 3DES, AES, and Camellia. Sequence numbers are mandatory in the ESP. They are monotonically increasing and must never wrap, to prevent replay attacks.

The duration or lifetime of an IPv6 address poses a problem for their use as UUID's on a P2P overlay network, which is independent of the underlying real (in this case IPv6) network. While the 64-bit interface ID can be assumed to have an infinitely unique lifetime, even if periodic ND checks must be made to assure that this is the case, the router prefixes can expire, and do arrive with preferred and valid lifetimes bound to them. Periodic router updates must be monitored to assure that an address is not deprecated, and if it is, then appropriate actions must be taken. These actions are discussed in detail in Chapter 4. As mentioned in the introduction to this section, UUID's are used to give a unique identity to each peer on the overlay network. These peers are also mobile. Thus, if one takes a laptop from the office to the home, or vice-versa, the IPv6 prefix will most likely change, and thus a new UUID will be required. Why? If the prefixes are different, which can be discovered from router updates, then there is no way to use ND at the new location to verify the lifetime of the UUID. It could well be that while one is at home, another system at the office has acquired this IPv6 address, because the system at home cannot respond to ND Neighbor Solicitation messages. This same system can join the P2P overlay network using the IPv6 address as a UUID, and therefore create a conflict. This implies that when a system becomes mobile, it must abandon its old IPv6 address and acquire another for use on the local-link as well as for a UUID on the overlay network. This again does not impose a management problem on the P2P overlay network, given the mechanisms described in Chapter 4. One thing is clear: if IPv6 addresses as described above are used as UUID's, then before a system disconnects from the overlay network, if it intends to be mobile, it must be able to flush any knowledge of itself on the overlay network, or the overlay network must have time-to-live values associated with dynamic information that permit this information to be expunged at expiration time.

It is important to understand that the IPv6 stateless autoconfiguration protocols are attackable. There are obvious attacks, like a malicious host replying to all ND NS messages, thus denying any new node the ability to autoconfigure. This kind of attack is detectable with a reasonableness heuristic: generate up to five 64 bit interface ID's using a good pseudo random number generator. If each of these five is denied as a duplicate, then there is an attack, and measures can be taken to find the attacker. Another equally obvious form of this attack is a node with a duplicate interface address not responding to ND. In this case, a duplicate IPv6 address will be created on the same local-link. Also, a node masquerading as a router and generating bogus prefixes, or valid prefixes with incorrect lifetimes, is possible.

It is important to understand here that even with these possible attacks, IPv6 is a major step forward, and can be deployed while solutions to these attacks are in progress. The IETF does not stand still, and its members are pursuing solutions. Also, IPv4 can be similarly attacked, is being attacked as we write, and many of the IPv4 attacks are not possible with IPv6. In spite of these security problems, IPv4 has been tremendously successful, and IPv6 will be even more so.

Finally, there are alternatives to using IPv6 addresses as UUID's, and they are discussed in the next section.

Universal Unique Identifiers (UUID)

In the previous sections we have given a good overview of IPv6 addresses and their appropriateness as UUID's on the P2P overlay network. The major problem faced with the IPv6 alternative is deployment. The attacks on IPv6 described in our chapter on security should not slow down its deployment for general use, and are less menacing for the IPv6 address format for P2P UUID's. The most serious attack in the P2P space would be theft of peer identity. As dangerous as this sounds, recall that someone attached to the Internet can use almost any IPv4 address they wish if they are clever enough and have a non-Internet Service Provider (ISP) connection. ISP's can refuse to route some addresses, for example. It is all too easy to change one's IPv4 address with most home systems. IPv6 can be made more difficult to attack if stateless autoconfiguration is used. There is a computational and personal cost, the user must beware and take the right precautionary measures, and it is that cost that must be weighed against the probability of being hacked, which is minuscule. In any case, we feel that IPv6 gives us a good future solution for UUID's for several reasons:

1) Built-in global registration.

2) Barring attacks and administrative errors, the possibility of globally unique addresses, and therefore UUID's.

3) IPv6 addresses can be used as UUID's when a system is mobile, permitting it to reattach and acquire a new UUID; here the interface identifier is almost always reusable.

4) The attacks and related security problems are being addressed as we write.

5) Global uniqueness also permits disjoint overlay networks to join, as mentioned in the discussion of the join problem above.

Until IPv6 is sufficiently deployed, we can implement a P2P UUID generation strategy that is quite similar to ND. The interested reader can read the section in Chapter 4 on mediator-prefixed UUID's.

There are other methods that can be used to generate UUID's that have a high probability of uniqueness given enough bits, and that are essentially impossible to spoof. One can use a good pseudo random number generator, or better yet a secure random number generator, to generate enough random bits per ID to make the probability of duplication essentially zero. If one uses 128 bit UUID's generated in this way, the probability of a collision is less than that of winning the lottery 9 times in a row. We can never fill up the UUID space. Yes, there will be cheaters who will attempt to create peers with duplicate UUID's, since these values are public. This problem is currently resolvable with several emerging identifier generation techniques.
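Generating such an identifier is a one-liner in most environments. The following sketch of ours uses a cryptographically strong random source to draw 128 bits:

```python
import secrets

# A 128-bit UUID from a secure random source. With 2**128 possible values,
# the chance that two independently generated IDs collide is negligible for
# any realistic population of peers.
def random_uuid() -> str:
    return secrets.token_bytes(16).hex()

print(random_uuid())  # 32 hex digits
```

Note that randomness alone gives statistical uniqueness but no proof of ownership; the cryptographic identifiers discussed next add that property.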

There are Statistically Unique and Cryptographically Verifiable (SUCV) Identifiers [SUCV] and Crypto-Based ID's (CBID) [CBID], which are referred to as Cryptographically Generated Addresses (CBA) in [Arkko02]. While the security issues discussed in these papers will be covered in Chapter 5, the basic common ideas that play a role in UUID generation will be reviewed here. Where H-x is the high order x bits of the hash algorithm H, a host generating a UUID can do the following:

1) Create a public/private key pair using, say, RSA or Diffie-Hellman.

2) Using a hash function H, like SHA-1, generate H(Public Key), the 160-bit SHA-1 hash.

3) For a CBA, use H-64(Public Key) as the CBID IPv6 interface identifier along with the high order 64-bit prefix. This can be an IPv6 based UUID.

4) For a UUID one can also use the H-128(Public Key) CBID.
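Steps 2 through 4 reduce to taking the high order bits of a hash. The sketch below is ours; the public key bytes are an illustrative stand-in for a serialized RSA or Diffie-Hellman public key produced in step 1.

```python
import hashlib

# H-x: the high-order x bits of the SHA-1 hash of the public key.
def h_x(public_key: bytes, x_bits: int) -> bytes:
    digest = hashlib.sha1(public_key).digest()   # the full 160-bit hash
    return digest[: x_bits // 8]                 # keep the high-order x bits

public_key = b"---illustrative public key bytes---"
iface_id = h_x(public_key, 64)    # step 3: 64-bit CBID interface identifier
uuid     = h_x(public_key, 128)   # step 4: 128-bit CBID-style UUID
print(iface_id.hex(), uuid.hex())
```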

Given such a UUID, a challenge can be used to verify that the owner of the UUID possesses the private key associated with the public key. When peer1 receives a document containing the UUID from peer2, peer1 requests a private key-signed message from peer2 containing peer2's public key and a random session identifier, SID, generated by peer1. The SID prevents peer1 from later spoofing peer2's identity in a communication with peer3: without the SID, peer1 could save the signed message from peer2 and replay it, thus faking ownership of peer2's private key. Continuing, peer1 can calculate H-128(Public Key), and if the hash is correct, then verify the signature of the message. The signature can be a straightforward private-key-signed SHA-1 hash of the message. If the signature is correct, then the document indeed belongs to peer2, and peer2's identity has been established.
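The challenge can be traced end-to-end with a toy sketch of ours. The RSA parameters below are tiny, hardcoded textbook values (p = 61, q = 53), chosen only so the arithmetic is visible; real code would use RSA1536 or larger and a proper signature scheme.

```python
import hashlib
import secrets

N, E, D = 3233, 17, 2753   # toy modulus, public exponent, private exponent

def h128(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()[:16]   # high-order 128 bits of SHA-1

def sign(message: bytes) -> int:              # peer2 signs with its private key
    h = int.from_bytes(hashlib.sha1(message).digest(), "big") % N
    return pow(h, D, N)

def verify(message: bytes, signature: int) -> bool:   # peer1 uses the public key
    h = int.from_bytes(hashlib.sha1(message).digest(), "big") % N
    return pow(signature, E, N) == h

public_key = f"{N},{E}".encode()
peer2_uuid = h128(public_key)                 # the UUID peer2 claims

# peer1's challenge: peer2 returns (public key, SID) signed with its private key
sid = secrets.token_bytes(8)                  # fresh SID generated by peer1
message = public_key + sid
signature = sign(message)

assert h128(public_key) == peer2_uuid         # hash check: UUID matches the key
assert verify(message, signature)             # signature check: peer2 holds the key
print("peer2's identity established")
```

Because the SID is fresh for each challenge, a captured signed message is useless for replay, which is exactly the property described above.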

How can this be attacked? There are those that worry that the H-64(Public Key) interface identifier can be attacked with brute force. Here, a successful attacker would need to find a different public/private key pair where the public key hashes to the exact H-64(Public Key) value, i.e., find another public key that collides with the original one. Let's assume RSA1536 is used. First, to generate a table with 2^64 values, let's make the assumption that a disk drive with a 1 inch radius can hold 10 gigabytes of data. We will need 2^64 64-bit, or 8 byte, values. A back-of-the-envelope calculation says the total disk surface required to store the collision table is about 105,000 square miles. Now, if one just wants to compute until a collision is found, and it is generous to assume that an RSA1536 public/private key pair can be computed in 1 millisecond, then let's assume that some time in the future the calculation will take 1 microsecond, or that multiple systems are used in parallel to achieve the 1 microsecond per public/private key pair calculation. In this case, an exhaustive search for a collision will take 3 million years. Assuming that only half of the possible values are required to achieve this single collision, this reduces to 1.5 million years. That's a lot of CPU cycles. Even with Moore's law, we should not lose sleep over this attack succeeding in the near future. All this implies that 128-bit UUID's are impossible to attack by brute force. Other attacks are possible if care is not taken to prevent them. These are typically "man-in-the-middle" (MITM) attacks.

There are several ways to prevent MITM attacks. One can use a secure channel like TLS to exchange CBID's; one can use infrared communication with eyeball contact between the individuals exchanging the CBID's; out-of-band verification is also possible, where upon the receipt of a 16-byte CBID, the source is contacted and asked to verify the value; and a trusted 3rd party can act as a CBID escrow. Finally, MITM attacks are not always a threat. For example, if one is exchanging mpeg or jpeg content in an ad-hoc P2P network where CBID's are used as UUID's, then as long as the content satisfies the recipient, there is no real security threat. And a great deal of P2P activity will be the ad-hoc exchange of content. When financial information like credit card numbers is exchanged, then it is necessary to use strong security and verifiable CBID's. This, along with other security details, is covered in Chapter 5.

The BestPeer Example

BestPeer [BESTPEER] is a self-configurable, mobile agent based P2P system. It is highly centralized, and relies on the Location Independent Global Names Lookup (LIGLO) server to identify peers with dynamic IPv4 addresses. When a peer node joins the BestPeer P2P system, it registers with a LIGLO server. The server gives the peer node a unique global ID (BestPeerID). This ID is a combination of the LIGLO server's IPv4 address and a random number which the server assigned to the peer node. The LIGLO server saves the (BestPeerID, peer node IP address) pair. The LIGLO server also sends the new peer node a list of such (BestPeerID, IP) pairs to which it can connect. When a node has a new IP address, it should update its LIGLO server with this information. These ID's can be easily spoofed, thus permitting identity theft, because any MITM can watch the network activity to obtain BestPeerID's and then notify the LIGLO server of a change in the IP address associated with these ID's.

Microsoft's Pastry Example

Pastry [PASTRY] is a P2P overlay network performing application-level routing and object location. Each peer node in the Pastry network is randomly assigned a 128-bit numeric nodeID. When a peer node joins the Pastry network, its ID can be generated through a cryptographic hash of the node's public key or of its IP address. The value of the ID plays a crucial role in the scalable application-level routing. Applications hash the file name and owner to generate a fileID, and replicas of the file are stored on the nodes whose ID's are numerically closest to the fileID. Given a numeric fileID, a request can be routed to the node whose ID is numerically closest to that fileID. Although there is no mention of CBID's in [PASTRY], if the hash of the public key is used, then CBID techniques could be used to secure Pastry routes.

The JXTA Example

Project JXTA is another P2P overlay network, and assigns a UUID, the node's peerID, to each peer. The peerID is implemented as a 256-bit UUID, is unique to the peer node, and is independent of the IP address of the node. JXTA permits peers to form groups which are called peer groups [JXTA]. The groups make the overlay network more scalable since all peer activities are restricted to the current peer group of which the peer is a member. Also, all communication between peers on the overlay network is done through pipes. Besides peer nodes, there are UUID's for peer groups, for data, and for communication pipes. In JXTA a UUID is a URI string, for example:


The peer and its current peer group's UUID's, along with an ID type, are encoded into the above, yielding the 256-bit peerID. CBID's are also implemented for JXTA peerID's. In this case the 128 bits of the peer's UUID are the SHA-1 hash of its X509.v3 root certificate. If peers use X509.v3 certificates for peer group membership authentication, then the peer group's UUID part of the peerID is also a SHA-1 hash of the peer group root certificate.

3.2.3 Component 1 - The Peer-UUID

We require that every peer node on the overlay network have a UUID which we call the Peer-UUID. From the above discussion it is clear that we have many options for generating these UUID's. The feature we desire given any of the options is global uniqueness. An absolute requirement is uniqueness within one's peer overlay network. If an enterprise decides to form an enterprise-wide overlay network, then registration techniques can be used to administrate uniqueness. One might consider the SHA-1 hash of each system's IP address or MAC address. But this can lead to problems if an enterprise decides to renumber its IP addresses, uses IPv6 where IP addresses have a definite lifetime, or if one inadvertently programmatically creates two identical MAC addresses. In ad-hoc networks other techniques are required. In this latter case the best choice is using a sufficient number of bits, x, from the hash H-x(public key or X509.v3 certificate). If one uses, for example, RSA1536, then public/private key pairs are unique. Thus if x equals 120, then the probability of a hash collision is sufficiently close to zero to guarantee global uniqueness, and as discussed in section, one can get by with even fewer bits from the hash.
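As a sketch of this approach (in Python, with a stand-in byte string where a real DER-encoded public key or certificate would be hashed; the padding scheme here is illustrative, not from any specification):

```python
import hashlib

def peer_uuid_from_key(public_key: bytes, x_bits: int = 120) -> str:
    """Derive a CBID-style Peer-UUID: the first x bits of SHA-1(public key),
    padded to 128 bits so that it fits a standard UUID field."""
    digest = hashlib.sha1(public_key).digest()    # 160-bit hash
    truncated = digest[: x_bits // 8]             # 120 bits -> 15 bytes
    padded = truncated + b"\x00" * (16 - len(truncated))
    return "uuid-" + padded.hex().upper()

# Stand-in bytes; a real peer would hash its RSA1536 public key or its
# X509.v3 root certificate.
key = bytes(range(194))
print(peer_uuid_from_key(key))
```

Because the key pair is unique and the hash is deterministic, any peer holding the public key can recompute and check the Peer-UUID.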

Therefore, while the choice of UUID is up to the designer of the P2P software, in our discussions we will assume unique UUID's within the overlay network, and when security is an issue, CBID-based UUID's will be used. If one is behind a firewall, and all communication is secure, this may not be necessary. Still, we cannot overlook the implicit advantage of cryptographic information being embedded in the peer-UUID.

Towards a Standard UUID for Peers

Why do we need a standard? The answer is straightforward. We want to have a single, world-wide peer-to-peer network for all devices. And, when and if the Internet becomes Interplanetary or even Intergalactic, we want this to be true. Standards drive a world-wide Internet economy.

What should the standard look like? We did not intend to waste the reader's time reading about IPv6: it is clearly the correct approach for standardized Peer-UUID's. As will be explained in Chapter 4, we introduce the mediator component. Mediators behave like Internet routers on the overlay network. Therefore, we can introduce protocols similar to neighborhood discovery and mediator prefix assignment to yield Peer-UUID's in IPv6 format.

Then, when IPv6 is fully deployed, we can use IPv6 addresses as long as we use CBID's for the 64 bits of interface identifier. The reasons for this latter requirement are discussed in section
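A minimal sketch of such an IPv6-format Peer-UUID, assuming a hypothetical 64-bit mediator-assigned routing prefix combined with a CBID interface identifier (the prefix value below is the IPv6 documentation prefix, used purely for illustration):

```python
import hashlib
import ipaddress

def ipv6_style_peer_uuid(mediator_prefix: bytes, public_key: bytes) -> str:
    """Combine a 64-bit routing prefix (assumed assigned by a mediator) with
    a 64-bit CBID interface identifier taken from SHA-1(public key)."""
    assert len(mediator_prefix) == 8
    interface_id = hashlib.sha1(public_key).digest()[:8]
    return str(ipaddress.IPv6Address(mediator_prefix + interface_id))

# Example: the 2001:db8::/64 documentation prefix plus a CBID interface ID.
prefix = bytes.fromhex("20010db8000000" + "00")
print(ipv6_style_peer_uuid(prefix, b"example public key bytes"))
```

The resulting string is a syntactically valid IPv6 address whose low 64 bits remain cryptographically bound to the peer's key.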

Open source cryptographic software is available for the generation of public/private keys and SHA-1 hash algorithms [BOUNCYCASTLE, OPENSSL]. Similar code can be found in versions of JDK 1.2 and higher [JDK].

The PeerIdentity document

It is not an accident that a great deal of what will be described here is an outgrowth of the JXTA peer advertisement. Both of us have worked on project JXTA, helped define the specifications, and, after all, there are generic requirements that cannot be avoided. At the highest level each peer needs an XML document description of several basic properties which are common to all P2P systems. First, a peer will have a human-readable peerName, and a peer UUID. The name is usually assigned by the person using the peer node during a configuration phase. Because we wish to use this peerName for applications like P2P Email (see Chapter 7), the allowable characters are restricted by both XML and MIME. For the actual details we refer the reader to the applicable XML and MIME specifications in the references. The MIME Header Extensions specify how the XML characters in the name must be formatted to satisfy email address constraints.

We do expect some peers to have machine-generated peer names. Certainly, the peer name may not be unique. In cases where uniqueness is an absolute requirement, some kind of registration is required as discussed in section. If one were to ask for an elaboration of all of the peers on the overlay network, then a list of (peer name, peer UUID) pairs would be given. In most ad-hoc networks users of peer nodes will be familiar with the names of peers with whom they regularly communicate, and registration will not be necessary. The names may be unique in the personal peer community in which they are being used. In this case, a peer can restrict its searches to this community and not worry too much about unrecognized peer names. Still, it is possible to have name collisions in such a community. To help with this situation we add an optional peer description field to the document. The description is usually created by the user when the peer is initially configured, and is there to help differentiate peer names in the case of duplication. The description will usually be a simple text string, but almost any digital data is permitted, for example, a photo.gif of the user. Note that it is always important to consider performance, and PeerIdentity documents will be frequently accessed. Consequently, text is a good choice for this field, and in the case of a gif file, or even executable code, a URN should be provided so that it can be accessed only when necessary to disambiguate name collisions. The details of personal peer communities are discussed in section 3.4 of this chapter. A peer's PeerIdentity document is required to communicate with that peer. One needs a unique identity as well as the other information discussed just below to communicate.

When two peerNodes communicate with one another directly or by means of a mediator, each peerNode must provide the other, or the mediators, with the possible ways it can communicate. For example, a peerNode may prefer to always use a secure communication channel like TLS when it is available, or may be behind a firewall where a means of traversal such as http or SOCKS is necessary. To this end the PeerIdentity document will contain a list of available protocols for communication.

Communication protocols can be on the real network or on the overlay network. For example, TCP/IP is a communication protocol on the real network, and requires an IP address as well as a TCP port in its specification. On the other hand, TLS is between peers on the overlay network and only requires the peer-UUID in its specification, since all communication on this network is between peers, and independent of the real network and underlying physical bearer networks. Thus,





are URI’s describing real and overlay network communication protocols that can be used to contact the peer that includes them in the special document described just below.

Finally, a physical layer may or may not permit multicast communication. If it does, and the peerNode is configured to take advantage of this functionality, the multicast field is marked as TRUE; otherwise it is marked as FALSE.

Given this introduction we define the PeerIdentity document as follows:

Document type = PEERIDENTITY

Content tags and field descriptions:

<peername> Restricted legal XML character string [XML][MIME] </peername>
<peerUUID> uuid-Legal UUID in hexadecimal ascii string </peerUUID>
<description>
    <text> Legal XML character string [XML] </text>
    <URN> Legal Universal Resource Name </URN>
</description>
<comprotocols>
    <real> real protocol URI </real>
    <overlay> overlay network URI </overlay>
</comprotocols>
<multicast> TRUE | FALSE </multicast>

There may be multiple protocols specified on both the real and overlay network.

Below is an example of a PeerIdentity document:

<?xml version="1.0"?>
<!DOCTYPE 4PL:PeerIdentity>
<4PL:PeerIdentity xmlns:4PL="">
  <peername> LucBoureau </peername>
  <peerUUID> uuid-AACDEF689321121288877EEFZ9615731 </peerUUID>
  <description>
    <text> Je suis le mec francais </text>
    <URN> </URN>
  </description>
  <comprotocols>
    <real> tcp:// </real>
    <real> </real>
    <overlay>
    </overlay>
  </comprotocols>
  <multicast> FALSE </multicast>
</4PL:PeerIdentity>


Using 4PL we create the above example as follows:

Document pi = new Document(PEERIDENTITY, "LucBoureau");

The other PeerIdentity document fields will be known to the system as part of its boot-time configuration data. These details will vary from implementation to implementation. In some implementations the creation of a PeerIdentity document will automatically publish it. Document publication on the overlay network is described in detail in Chapter 4. To functionally publish a document in 4PL we use the publish command:

publish(pi);
In the next section we discuss the Virtual P2P Network. For two peers to communicate with one another on this network they must possess one another’s PeerIdentity document. This, of course, enables communication on the real, underlying networks, and a P2P system using what we describe must implement the code necessary to create, update, and analyze the above document as well as establish communication on these real networks.
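To make the document concrete, here is a sketch of how an implementation might extract the fields it needs before attempting real-network communication. It is written in Python with a simplified copy of the example document (the 4PL namespace prefix is omitted for brevity; the field values are taken from the example above):

```python
import xml.etree.ElementTree as ET

# A simplified PeerIdentity document, mirroring the example above.
doc = """<PeerIdentity>
  <peername>LucBoureau</peername>
  <peerUUID>uuid-AACDEF689321121288877EEFZ9615731</peerUUID>
  <comprotocols>
    <real>tcp://</real>
  </comprotocols>
  <multicast>FALSE</multicast>
</PeerIdentity>"""

root = ET.fromstring(doc)
peername = root.findtext("peername")
peer_uuid = root.findtext("peerUUID")
protocols = [p.text for p in root.find("comprotocols")]
multicast = root.findtext("multicast") == "TRUE"

print(peername, peer_uuid, protocols, multicast)
```

An implementation would walk the protocol list in preference order when choosing how to contact the peer.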

3.3 The Virtual P2P Network

Up to this point we have generally discussed the notion of an overlay network. The reader has in mind a possibly ad-hoc collection of peer nodes with certain rights, and policies for communication. Also, the requirement that each such peer has a UUID as a means of identification is well understood at this point. A single UUID is necessary to facilitate communication on this network. It does in fact give a single point of entry, but lacks a means to organize the information that might be communicated. One might argue that the parameters required to organize data triage can be included in the initial part of the data. While this is true, first, there may be several underlying transports on the real network, and we want an end-to-end delivery mechanism. Second, including addressing information as part of the data does not yield a very satisfactory network communication stack and hides what is really going on. Such a stack has always been a part of networking protocols, and we will build an overlay network stack in the same spirit.

One can ask, why not just use the IP stack and be done with it? Why go to all of the trouble of inventing yet another network stack on top of a network stack? As we have carefully examined in the sections on IPv6 and UUID's, in order to reestablish end-to-end network communication, UUID's are required that are independent of the underlying real networks, which may not in fact use IP. In the case of IPv4, the reader now understands the address space problem, and that IPv4 addresses cannot for this reason be used as UUID's. Also, with the imminent arrival of literally billions of devices, the ability to create ad-hoc UUID's is necessary. We have mentioned that IPv6 addresses are possible candidates for UUID's, but we still have a deployment issue here. Finally, we have the Network Address Translator (NAT), firewall, and other underlying real network barriers, as for example the prohibition, for good reasons, of propagated multicast, which in turn makes long-range, ad-hoc discovery impossible without UUID's. Thus, in order to establish a viable P2P network topology, a simple network stack where the UUID layer is at the bottom is necessary.

Before we give the details of the overlay network stack, let’s briefly examine the IP network stack.

3.3.1 Hosts on the Internet

There are many network stacks. The most general is probably the Open Systems Interconnection (OSI) stack, which has seven layers ranging from the physical layer on the bottom to the application layer on the top. A few examples of physical layers are ethernet, 802.11a/b, GSM, Bluetooth, and wideband CDMA. An IP stack has five layers, and level 1 is the physical layer. The IP stack is used on the Internet, and is seen in Figure 3-4 with OSI layers:






[Figure: the five layers of the IP stack (Physical, Data Link, IP, Transport, Application) shown alongside the corresponding OSI layers]
Figure 3-4. The IP Network Stack

Level 2 is the link or network layer, and this is where the device drivers do their work. On ethernet, an IP packet carries a unique type number, 0x0800, that identifies it to the device driver, and this is used to dispatch the packet to the next level. IP is at level 3. There are other protocols like the Address Resolution Protocol (ARP) and the Reverse Address Resolution Protocol (RARP) at this level. The transport is at level 4. Here we have, for example, TCP, UDP, and ICMP. Finally, the application is at level 5.

There are a multitude of network applications that run at level 5. A few examples are telnet, ftp, imap, pop3, http, smtp, and snmp. The IP ports are well defined and registered through the Internet Assigned Numbers Authority (IANA). For those interested in the complete list, look at the latest assigned numbers published on the IANA web site [IANA]. As a consequence, in order to organize these applications, as discussed in the next section, the transport protocols at level 4 that dispatch data to these applications will in this way have well defined port numbers.

Addresses and Ports

Given that each host on the IP network has an IP address by which it can be contacted, or at least, if the address is not registered, from which it can respond, these addresses give hosts end-to-end communication during the lifetime of the hosts' IP addresses. At the transport layer, to permit application triage, network application port numbers are associated with each transport layer protocol. Looking at the above short list we have:

Application Protocol        TCP Port
--------------------        --------
ftp                         21
telnet                      23
smtp                        25
http                        80
pop3                        110
imap                        143
snmp                        161
Thus, a telnet daemon listens on port 23 for incoming TCP/IP telnet connections at a host IP address. The listening IP-address.port pair can be viewed as a socket which will accept incoming connection requests to run the application associated with the port number. Not all port numbers are used, and this leaves room for experimentation as well as the assignment of a random port number to the host requesting a connection for a particular Internet application’s service. That is to say, if a host with IP address A1 wishes IMAP service on the host with IP address A2, then the initiating host uses as a source port, a unique, unassigned port number, PN, to be associated with A2.143, and creates the source socket, A1.PN. The combination of A1.PN and A2.143 is a unique connection on the Internet.
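The (A1.PN, A2.143) pairing can be observed directly with a loopback socket pair; in this sketch the operating system plays the role of assigning the unique, unassigned source port PN:

```python
import socket

# A loopback sketch of the unique (A1.PN, A2.port) pairing described above.
server = socket.socket()
server.bind(("127.0.0.1", 0))        # port 0: the OS picks an unused port
server.listen(1)
service_addr = server.getsockname()  # (A2, service port)

client = socket.socket()
client.connect(service_addr)         # the OS assigns the ephemeral port PN
conn, client_addr = server.accept()

source_socket = client.getsockname() # (A1, PN)
dest_socket = client.getpeername()   # (A2, service port)
print(source_socket, "->", dest_socket)

conn.close()
client.close()
server.close()
```

The four values (A1, PN, A2, service port) together identify this one connection; a second connect from the same host would receive a different PN.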

3.3.2 Peers on The Overlay P2P Network

As is the case with the IP network, we also define a stack on the overlay network. This stack has three layers because there is neither a physical nor a link layer. At level 1 is the Overlay Network Protocol (ONP), which is analogous to IP in the IP stack; thus, peer-UUID's play the role of IP addresses. There is a transport layer at level 2. Here there are two protocols, which are discussed in detail later. Where ONP messages are our IP packet equivalent, for transports we have the Application Communication Protocol (ACP), which is a reliable message protocol, and the Universal Message Protocol (UMP), which like UDP is not reliable. Hence, for UMP, when required, reliability is application dependent. At level 3 applications reside. As with IP, we also require virtual ports for the triage of incoming application data.


[Figure: the three-layer overlay network stack, with applications at level 3, the Application Communication Protocol and Universal Message Protocol transports at level 2, and the Overlay Network Protocol at level 1]

Figure 3-5. The Overlay Network Stack

The Virtual Port

Like the peer-UUID, a virtual port is also a UUID that defines a point of contact for a given application. Just like the Peer-UUID, the virtual-port-UUID can be a random number of enough bits, say 128, to guarantee uniqueness, or, as we prefer, a CBID so that cryptographic challenges can be made to verify the application source of the virtual port information. A virtual port can be ad-hoc or registered, and has an associated name that usually identifies the application. So, for example, for instant messaging one might have the name, virtual-port-UUID pair, IMApp.UUID. Again, the names are not necessarily unique, as with IP ports, unless some kind of registration is used. This will certainly be the case in more formal, enterprise P2P networks. Continuing with the IP analogy for TCP, we might have, either on an ad-hoc or registered network:

[Table: example (application protocol, ACP virtual port) pairs, analogous to the TCP port table above]
In the case of ad-hoc networks we will thoroughly describe how peers discover such ports within the context of their personal peer communities.

Level 2 Communication Channel Virtual Port

Once a peer possesses another peer's PeerIdentity document, it has enough information to communicate with that peer at level 2 on the overlay network. This "operating system" to "operating system" communication is required in order to publish system documents to which other peers subscribe. These system documents enable level 3, or application and services, communication. In a sense, one is distributing the operating system communication primitives across the overlay network; that is to say, we are really dynamically bootstrapping an ad-hoc distributed operating system. We use the Level 2 communication virtual port (L2CVP), and UMP, for this purpose. This communication is established using the reserved virtual port 128-bit UUID whose value is all 1's.

Unicast, Unicast Secure, Multicast and Multicast Secure Virtual Ports

There are two basic types of virtual ports on the overlay network. These are unicast and multicast. A unicast port permits two peers to establish a unique bi-directional, overlay network connection. Similarly, a multicast port accepts uni-directional input from multiple peers. Each of these ports has a secure counterpart that can ensure the authenticity of the communicating parties, and always guarantees the privacy and integrity of the data that is exchanged. The actual protocols that can secure the overlay network communication in this manner are discussed in Chapter 5. As previously mentioned, a virtual port is identified by a UUID.

Component 2: The Name.Virtual-Port-UUID

The Name.virtual-port-UUID is our second component. As with the peer-UUID, the name must be a legal XML character string. It is used by services and applications to both publish and establish communication channels on the overlay network. The publication of this component is by the means of a VirtualPort document.

The VirtualPort document

Two documents are required to be published by a peer to establish level 3, application-based communication on the overlay network. The first is the PeerIdentity document, as discussed above, and the second is the VirtualPort document. The VirtualPort document is created by applications and services at level 3 and, like the PeerIdentity document, is published and subscribed to at level 2 using the L2CVP and UMP. See section 3.5 for an overview of publication, subscription, and how communication is established.
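The reserved L2CVP port used for this level 2 publication traffic can be written down directly; a small sketch (the string form is illustrative, matching the uuid- notation used in this chapter):

```python
# The reserved Level 2 Communication Virtual Port (L2CVP): a 128-bit
# virtual-port UUID with all bits set to 1.
L2CVP_UUID = "uuid-" + "FF" * 16

def is_l2cvp(vport_uuid: str) -> bool:
    """True when a virtual-port UUID names the reserved L2CVP port."""
    return vport_uuid.lower() == L2CVP_UUID.lower()

print(L2CVP_UUID)
```

Because the value is reserved, no application may publish a VirtualPort document claiming this UUID.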

Given this introduction we define our VirtualPort document as follows:

Document type = VIRTUALPORT

Content tags and field descriptions:

<vportname> Legal XML character string [XML] </vportname>
<vportUUID> uuid-Legal UUID in hexadecimal ascii string </vportUUID>
<vportType> unicast | unicastSecure | multicast | multicastSecure </vportType>
<multicastGroup> uuid-Legal UUID in hexadecimal ascii string </multicastGroup>
<expirationDate> MMM DD YYYY HH:MM:SS +/-HHMM </expirationDate>
<sourceExclusive> Right to publish ID - Hexadecimal string </sourceExclusive>
<commHints>
    <owner> peer UUID of publisher </owner>
</commHints>

The <multicastGroup> tag's field is the UUID of the multicast group bound to the vportUUID. This tag is exclusively for a virtualPort of type multicast, and its functionality is described in chapter 4, section 4.1.4.

The <expirationDate> is the date after which this port is no longer accessible. If this field is missing, then the virtualPort has no expiration date. Examples of such a date are:

Aug 03 2032 05:33:33 +1200

Jun 16 2040 05:20:14 -0800

The format is standard. The +/-HHMM is the hours and minutes offset of the time zone from GMT.
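The MMM DD YYYY HH:MM:SS +/-HHMM format maps directly onto standard date-parsing facilities; a sketch in Python using one of the example dates above:

```python
from datetime import datetime

# Parse the <expirationDate> format: MMM DD YYYY HH:MM:SS +/-HHMM.
FMT = "%b %d %Y %H:%M:%S %z"
expiry = datetime.strptime("Aug 03 2032 05:33:33 +1200", FMT)

def is_expired(expiration: datetime, now: datetime) -> bool:
    """A missing <expirationDate> (no expiry) must be handled by the caller."""
    return now > expiration

print(expiry.isoformat())  # 2032-08-03T05:33:33+12:00
```

The %z directive consumes the +/-HHMM offset, so comparisons between dates in different time zones are handled correctly.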

The <sourceExclusive> field provides a mechanism to prove that the source for this document is the creator. Other peerNodes should not publish this document. Mechanisms for detecting false publication are discussed in chapter 5. If this field is not present, then the default is FALSE.

The <commHints> tag identifies to a subscriber to this document a communication hint to aid in contacting its publisher. Thus, by including the peer-UUID, and acquiring its associated PeerIdentity document, a peer may have all that is required to communicate. We say "may" because of possible real network barriers which can prohibit true end-to-end communication on the real network. In this case additional communication hints will be necessary in this document. See section 3.5 and Chapter 4 for the overlay network solutions to this latter problem.

Below is an example of a VirtualPort document:

<?xml version="1.0"?>
<!DOCTYPE 4PL:VirtualPort>
<4PL:VirtualPort xmlns:4PL="">
  <vportname> MobileAgent </vportname>
  <vportUUID> uuid-AACDEF689321121288877EEFZ9615731 </vportUUID>
  <vportType> unicastSecure </vportType>
  <commHints>
    <owner> uuid-61AAC8932DE212F169717E15731EFZ96 </owner>
  </commHints>
</4PL:VirtualPort>


The following 4PL commands create and publish the above VirtualPort document:

Document pi = new Document(PEERIDENTITY, "LucBoureau");
Document vp = new Document(VIRTUALPORT, pi, "MobileAgent", unicastSecure);
publish(pi);
publish(vp);

Again, we include the creation of the PeerIdentity document for clarity of the requirements to build a VirtualPort document. In most implementations the system will be able to get these values from a peer's global context. In the next section we discuss the virtual socket, which is used to establish a connection between two peers.

Component 3: The Virtual Socket

To enable application-to-application communication on the overlay network we require peer-unique network identifiers on each peer that are analogous to the IP sockets mentioned just above. We call the unique peer-UUID.virtual-port-UUID pair a virtual socket on the overlay network. On a system there are two kinds of virtual sockets. The first kind is well known and published by the means of the VirtualPort document, and its publication implies that the publishing peer will accept incoming connection requests for this virtual socket. The second is generated by the peer that is trying to establish a connection on the overlay network. While the peer UUID part uniquely defines that peer, the virtual port number must be unique only to that peer, and can be generated in any way that peer desires as long as it preserves peer-local uniqueness. When we are discussing published virtual sockets, that is to say, published PeerIdentity documents and their accompanying VirtualPort documents, we will refer to them as listening virtual sockets.
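A minimal sketch of the two kinds of virtual sockets (the class and helper names are illustrative, not from any real implementation; the example UUID values are made up):

```python
import secrets
from dataclasses import dataclass

@dataclass(frozen=True)
class VirtualSocket:
    """A peer-UUID.virtual-port-UUID pair on the overlay network."""
    peer_uuid: str
    vport_uuid: str

def listening_socket(peer_uuid: str, published_vport_uuid: str) -> VirtualSocket:
    """A well-known socket whose port is published in a VirtualPort document."""
    return VirtualSocket(peer_uuid, published_vport_uuid)

def outgoing_socket(peer_uuid: str) -> VirtualSocket:
    """An ad-hoc socket; its port need only be unique on the creating peer."""
    return VirtualSocket(peer_uuid, "uuid-" + secrets.token_hex(16).upper())

bill_listen = listening_socket("uuid-" + "AB" * 16, "uuid-" + "01" * 16)
rita_out = outgoing_socket("uuid-" + "CD" * 16)
print(rita_out.vport_uuid, "->", bill_listen.vport_uuid)
```

Here a random 128-bit port satisfies peer-local uniqueness; any other peer-local scheme (a counter, for instance) would do as well.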

Note that on each peer a human-readable representation of the listening virtual sockets is given by peer-name.virtual-port-name. This permits a socket-like application programming model, which in turn hides the underlying complexity of the real network's behavior.

In 4PL, to create and publish a listening virtual socket as well as an outgoing socket on the MobileAgent virtual port, we do the following:

// First we create the PeerIdentity document
Document pi = new Document(PEERIDENTITY, "LucBoureau");

// Second we create a unicast VirtualPort document
Document vp = new Document(VIRTUALPORT, pi, "MobileAgent", unicast);

// We now create the listening virtual socket using the VirtualPort document
VirtualSocket vs = new VirtualSocket(vp);
listen(vs);

// We next publish the VirtualPort document so that incoming connections
// can be established
publish(pi);
publish(vp);

// Create a virtual socket used to establish outgoing connections.
// This virtual socket can then be used to connect to any listening socket.
// Note: The P2P system has the PeerIdentity document stored locally. Thus,
// a call to createSocket without any parameters will generate a virtual
// socket with a random source virtual port UUID. There is no virtual port
// document associated with the virtual port that is used. Rather, it only
// appears in the outgoing messages. Responses can be sent to this socket,
// which is registered in the system until it is closed.
VirtualSocket local_out = new VirtualSocket();

// Imagine we have "discovered" the mobileAgent listening virtual socket,
// remoteMobileAgent. We then open a connection as follows (see chapter 4
// for a definition of TYPE):
VirtualChannel out = new VirtualChannel(local_out, remoteMobileAgent, TYPE);

Given these basic fundamentals we can describe the peer communication channel, which is used for data transfer. It is important to note here that we are giving an overview of the fundamentals required to understand the detailed specifications in Chapter 4.

3.3.3. Putting it all together: Communication on the Virtual P2P Network

Each peer can now create a PeerIdentity document, VirtualPort documents, and virtual sockets. These are the components that are necessary for establishing end-to-end communication between peers on the overlay network. The medium for end-to-end communication on the overlay network will be called a channel. Let's imagine that peer1 and peer2 wish to establish a channel to permit a mobile agent to migrate between them. Peer1 and peer2 are named rita and bill, and they each have personal data repositories that contain restaurant reviews that they wish to share with others. Both therefore need listening virtual sockets that can accept incoming connection requests to establish either unicast or unicastSecure channels. In any system there must be a service or application that created the initial listening socket, and which will send and receive data using either UMP/ONP or ACP/ONP. Whether or not these applications can accept more than one connection request is implementation dependent, and not important to our discussion.

Thus, we can assume without loss of generality that rita and bill have listening sockets named rita.mobileAgent and bill.mobileAgent, and that their VirtualPort documents have been both published and received by one another. Furthermore, let us assume that rita is contacting bill, and vice-versa, that they have created the virtual outgoing sockets rita.338888881 and bill.338888881, and that two channels have been established. We will then have on each of the peers:

channel 1: rita.338888881 <-> bill.mobileAgent (rita the client, bill the server)
channel 2: bill.338888881 <-> rita.mobileAgent (bill the client, rita the server)
Note that we intentionally use the same virtual output port number on each peer because the only requirement is that the number 338888881 is unique on the systems where it was created, so that the socket pairs define a unique channel. The reason that two channels are required, even if they are bi-directional, is that each peer is playing the role of both a client and a server. The listening socket is the server side of any protocol they establish between themselves. Certainly, it is possible that a single channel can be used to send migrating mobile agents in both directions, and in this case the listening servers would have dual personalities. This does lead to code complexity that is best avoided by adhering to the strict separation of client/server roles in applications.

So, what do we have? The fundamental mechanisms that permit peers to connect with one another. But, there are some protocols missing at this point. The above scenario is high level and postpones the engineering details until we have established a more complete description of the components that comprise a P2P overlay network, as well as the protocols which manage communication. That is to say, we need to describe the publication and subscription mechanisms in detail, as well as how two peers can discover one another given the complexities of the underlying real networks. The interested reader can skip to Chapter 4 where the engineering description of how to connect is given.

3.4 Scope and Search - With Whom Do I Wish to Communicate?

Imagine a global P2P overlay network with hundreds of millions of devices, which can range from light switches to massively parallel supercomputers. The global network will be a union of home networks, manufacturing assembly line control processors, enterprise-wide networks, military and government networks, etc. This gives us a massive collection of PeerIdentity documents and their associated virtual port documents. P2P communication given such a search space is unmanageable without some means to scope or limit search to those peers that a given peer desires to contact. Why should a light switch wish to contact a tank on military maneuvers? Searching the entire space given the best of algorithms is not only too time consuming, but also ridiculous. It would render this P2P topology useful to researchers in search and routing algorithms but would have no practical applications. To attack this problem we need to organize the search space in a way which has a certain logic to it, and reflects how humans and machines might really want to group themselves.

3.4.1 The Virtual Address Space

The above global overlay network at this point can be described as a collection of PeerIdentity and Vir- tualPort documents. So, given an overlay network with n peers, and with the PeerIdenity documents,

peerIdentity(i), i = 1, 2,

+ m n total docu-

ments. If peer1 and peer2 wish to communicate with one another on a well known socket, the problem of discovery can be both bandwidth and computationally intensive. To minimize the discovery prob- lem we need to minimize the document search space. To this end we use a common sense, real world approach, and recognize that communication is usually based on shared interests, friendship, and other human ways of social interaction. There are lonely hearts clubs, gambling clubs, baseball clubs, fami- lies, instant messaging buddy lists, assembly lines, military maneuver squadrons, mission control, for- est fire fighting teams, political groups, chess clubs, etc., etc. The list is endless.

tion of virtual ports for peer i . Thus, in this virtual address space we have n + m 1 +

, m i , is the complete collec-

, n, then for each such i, virualPort(i, j), j = 1,

Therefore, we define a connected community, CC, to be a subset of the collection, {peerIdentity(i) | i = 1, ..., n}, where the peer members have a common interest. A peer can belong to multiple connected communities, i.e., given CC(1) and CC(2), CC(1) ∩ CC(2) may be non-empty. Any peer can create a CC, and a peer must always be a member of at least one CC. The default CC should be set when a peer is initially configured. Given our overlay network with its collection of connected communities, CC(i), 1 ≤ i ≤ N, let CC(j) be any member of this collection. Then CCS(j) = { peer nodes p | p is a member of CC(j) } forms an overlay subnetwork with a very special property: given m ≠ n, if we have p1 and p2 as nodes on CCS(m) and CCS(n), respectively, then p1 cannot connect to p2 and vice-versa. Here we are saying that the virtual ports and virtual sockets on CCS(m) are visible only to the nodes on CCS(m). As we will see later, the act of creating and publishing virtual ports and sockets is always restricted to a CCS. Thus, connected communities become secure "walled gardens" in the larger overlay network with inter-CCS connectivity not permitted. Hence, the set of all CCS's is pair-wise communication disjoint. Certainly, p1 can be an active member of multiple CC's and concurrently communicate with members of each such CC on the associated CCS. With this definition the larger overlay network is the union of pair-wise, communication-disjoint CCS's.
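The membership rule just defined is easy to model. The following Python sketch (our illustration, not the book's 4PL; the peer names and community labels are made up) treats each CC as a set of peer identifiers and permits a connection only when both endpoints are members of the same CC:

```python
# Illustrative sketch: connected communities as sets of peer identifiers.
# A connection on CCS(j) is legal only if both peers are members of CC(j);
# distinct CCSs are therefore pairwise communication-disjoint.
communities = {
    "CC1": {"Bill", "Rita", "Luc"},
    "CC2": {"Bill", "Rita"},
    "CC3": {"Eve"},
}

def can_connect(peer_a, peer_b, cc):
    # Virtual ports published in cc are visible only to cc's members.
    members = communities.get(cc, set())
    return peer_a in members and peer_b in members

assert can_connect("Bill", "Rita", "CC2")     # same CC: visible
assert not can_connect("Bill", "Eve", "CC3")  # Bill is outside CC3
```

Note that Bill may be active in CC1 and CC2 concurrently; membership in one CC never grants visibility into another.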

In Figure 3-6 we illustrate a simple overlay network with four CC's, each determining a CCS. The elliptical boundaries indicate the pair-wise, communication-disjoint attribute of the CCS's. Note that Peer 1 and Peer 2 are in multiple CC's.


Figure 3-6. Connected Community on Overlay Network

We see a CC isolated to its CCS in this manner as a very powerful concept that will, first, speed up the discovery of the virtual sockets on ad-hoc P2P networks by limiting the search to a single CCS; second, minimize the time required to establish connections and route data between its peer node members; and third, simplify implementing CC-based policies to guarantee strong authentication, as well as the privacy and integrity of data that is local, being transmitted on the CCS, or remotely stored.

A CCS does raise one problem: how can information that is publicly readable be accessed by any peer? An example of this occurs in the following section, where some of the documents describing connected communities must be publicly available. That is to say, a connected community document for CC0 must be accessible outside of the explicit access control of CC0 if non-CC members are to be able to find it, and thus join CC0. To solve this access problem we define the Public Connected Community (PubCC) for this purpose. All peers are members of this community, and as members can both publish documents and access all documents published by the PubCC's members. As a general rule, documents describing CC's, and meta-data describing data that is publicly accessible, can be published in the PubCC. The former are like CC bootstrap documents; certainly, other CC documents will be published in particular CC's and their access will be restricted to the CC members' scope. The latter might be meta-data containing URL's that can be used to access information about connected communities, for example, images and other publicity. Thus, the PubCC permits a peerNode global context for publishing some CC documents and meta-data. As is seen in chapter 4, the PubCC restricts data access to CC documents.

To bind a virtual port to the CC in which it is created, we require another tag in the VirtualPort document. This will be the CC UUID, which is generated by the overlay network's UUID algorithm described in the next section.

Let’s now revisit the VirtualPort document:

Document type = VIRTUALPORT

Content tags and field descriptions:

<vportname> Legal XML character string [XML] </vportname>
<vportUUID> uuid-Legal UUID in hexadecimal ascii string </vportUUID>
<vportType> unicast | unicastSecure | multicast | multicastSecure </vportType>
<multicastGroup> uuid-Legal UUID in hexadecimal ascii string </multicastGroup>
<commHints>
  <owner> peer UUID of publisher </owner>
  <connCom> connected community UUID </connCom>
</commHints>

Below is the revised virtual port example:

<?xml version="1.0"?>
<!DOCTYPE 4PL:VirtualPort>
<4PL:VirtualPort xmlns:4PL="">
  <vportname> MobileAgent </vportname>
  <vportUUID> uuid-AACDEF689321121288877EEFZ9615731 </vportUUID>
  <vportType> unicastSecure </vportType>
  <commHints>
    <owner> uuid-DDEDEF689321121269717EEFZ9615659 </owner>
    <connCom> uuid-FBCAEF689321121269717EEFZ9617854 </connCom>
  </commHints>
</4PL:VirtualPort>
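A document like the example above can be generated programmatically. Here is an illustrative Python sketch using the standard library's ElementTree; the helper name make_virtual_port is ours, namespace and DOCTYPE handling are omitted, and the UUID values are taken from the example:

```python
import xml.etree.ElementTree as ET

def make_virtual_port(name, vport_uuid, vtype, owner_uuid, cc_uuid):
    # Build a VirtualPort document, including the connected community
    # binding (<connCom>) introduced in this section.
    root = ET.Element("VirtualPort")
    ET.SubElement(root, "vportname").text = name
    ET.SubElement(root, "vportUUID").text = vport_uuid
    ET.SubElement(root, "vportType").text = vtype
    hints = ET.SubElement(root, "commHints")
    ET.SubElement(hints, "owner").text = owner_uuid
    ET.SubElement(hints, "connCom").text = cc_uuid
    return ET.tostring(root, encoding="unicode")

doc = make_virtual_port("MobileAgent",
                        "uuid-AACDEF689321121288877EEFZ9615731",
                        "unicastSecure",
                        "uuid-DDEDEF689321121269717EEFZ9615659",
                        "uuid-FBCAEF689321121269717EEFZ9617854")
```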


Given this additional tag, the virtual socket also reflects its CC identity. Let’s assume Bill and Rita are both members of CC1 and CC2, have created mobileAgent ports in each CC, and established connec- tions on the associated CCS’s. Then the connection tables on Bill and Rita could appear as follows:

Table 3-1. Local and Remote Socket with CC Identity.

When a CC is created, its description needs to be published in a connected community document so that its existence can be recognized by other peers. This document is discussed in the next section.

3.4.2 Component 4: The Connected Community Document

When a connected community is created, it is given a human readable text name to render it recognizable, and a UUID so that it has a unique identity on the overlay network. Admittedly, as we've frequently discussed, it is possible to have connected community name collisions in purely ad-hoc networks. Because of this problem, the connected community document contains a description field provided by the creator which can give detailed information about the community to aid in distinguishing it from others with the same name. What this field contains is digital data whose format and use is up to the creator. It could be a plain text string, or a gif, mpeg, or jpeg file. If a non-text description is chosen, we suggest using a URN [URN] to reference the data for better performance. Certainly, a URN list can be used to point to multiple locations, such as peers, or even websites. This data must be publicly available, and accessible via the Public Connected Community. Because of virus problems, we caution against but do not prohibit the use of executable code. It is certainly possible to create execution environments that are virus safe, and Java is such an example. The fourth field is for policies that moderate community behavior. Examples are authentication, digital rights, content restrictions, data security, etc. These policies may also refer to executable code, and the same cautions apply. We define three membership policy types. They are ANONYMOUS, REGISTERED, and RESTRICTED:

1. ANONYMOUS - Any peerNode can join a CC, and access its associated content

2. REGISTERED - A peerNode must register with the CC creator. This is a "good faith" registration and is not intended to strictly control access to content. Registration does result in returning a registration validation stamp that must be presented to other members for content access. On the other hand, no particular attempt is made to prevent forgery of the registration validation stamp. There is no deep security here.

3. RESTRICTED - Here a secure credential is returned when membership is granted. The credential is used to authenticate the new member, and without the credential, access to content is denied. Such a credential may use strong security or "Webs-of-Trust"-like mechanisms. Descriptions of how to implement this security are discussed in chapter 5.
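One plausible way to model the three policy types in code is sketched below. This is our illustration, not the book's implementation; the registration stamp and the credential_check hook are stand-ins for the mechanisms described in chapter 5:

```python
import hashlib

ANONYMOUS, REGISTERED, RESTRICTED = "ANONYMOUS", "REGISTERED", "RESTRICTED"

def join(cc_policy, peer_uuid, credential_check=None):
    # Returns a token the peer presents for content access, or None if denied.
    if cc_policy == ANONYMOUS:
        return "open"  # any peerNode can join and access content
    if cc_policy == REGISTERED:
        # "Good faith" registration stamp: easily forged, by design.
        return hashlib.sha1(peer_uuid.encode()).hexdigest()[:8]
    if cc_policy == RESTRICTED:
        # Membership granted only if a secure credential verifies.
        if credential_check is not None and credential_check(peer_uuid):
            return "member-credential"
        return None

assert join(ANONYMOUS, "uuid-1") == "open"
assert join(RESTRICTED, "uuid-1") is None  # no verifiable credential: denied
assert join(RESTRICTED, "uuid-1", lambda p: True) == "member-credential"
```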

The fifth field is the optional Email VirtualPort UUID for this connected community. For details see chapter 7, section 7.4.

The following is the connected community document:

Document type = CONNECTEDCOMMUNITY

Content tags and field descriptions:

<ccName> Restricted Legal XML character string [XML][MIME] </ccName>
<ccUUID> uuid-Legal UUID in hexadecimal ascii string </ccUUID>
<description>
  <text> Legal XML character string [XML] </text>
  <URN> Legal Universal Resource Name </URN>
</description>
<policy>
  <type> ANONYMOUS | REGISTERED | RESTRICTED </type>
  <text> Legal XML character string [XML] </text>
  <URN> Legal Universal Resource Name </URN>
</policy>
<emailVportUUID> uuid-Legal UUID in hexadecimal ascii string </emailVportUUID>

Below is the connected community example:

<?xml version="1.0"?>
<!DOCTYPE 4PL:ConnectedCommunity>
<4PL:ConnectedCommunity xmlns:4PL="">
  <ccName> China green tea club </ccName>
  <ccUUID> uuid-AACDEF689321121288877EEFZ9615731 </ccUUID>
  <description>
    <text> We love green tea from China </text>
    <URN> </URN>
  </description>
  <policy>
    <type> ANONYMOUS </type>
    <text> No membership restrictions </text>
    <text> GREEN tea related content only </text>
  </policy>
  <emailVportUUID> uuid-AACDEF689321121288877EEFZ9615731 </emailVportUUID>
</4PL:ConnectedCommunity>


We now have complete descriptions of four components: The peerIdentity Document, the virtualPort Document, the virtual socket, and the connectedCommunity Document. We have also introduced the concepts and definitions of the overlay network, and connected community subnetworks. Along with this we have given high level descriptions of how connections are established on the overlay network given the restriction to connected community subnetworks. Then we also noted the requirement for the Public Connected Community for data that must be available to all peer nodes on the overlay network when they are active members of this community.

Virtual connectivity exists to avoid the underlying issues of real network connectivity, and to provide, by means of peer identities, both unique and true end-to-end visibility across all possible nodes on the overlay network in spite of those issues. Yet, a system that provides overlay network functionality must have code that addresses the real underlying issues to permit this simplified form of network access. The implementation of the required software is where the real engineering takes place. In the next section we begin to discuss these real network issues and their concomitant programmable solutions to make them transparent to the application programmers using the overlay network.

3.5 How to Connect

3.5.1 Real Internet Transport

Imagine yourself in front of your familiar web browser connecting to some website anywhere in the world after just having done a search. What really happens? What Internet protocols are involved when you click on the URL? Let's take a typical URL, http://www.ietf.org. First of all, the browser code knows that the protocol that will be used to connect to www.ietf.org is http, and that the default TCP port is 80. Before a connection can be made, the real IP address of www.ietf.org must be found. How does that happen? In most cases one has a Domain Name Service (DNS) server which will return the IP address of a valid, registered domain name like www.ietf.org. Therefore, the DNS protocol is used to do the name-to-address translation before the connect is even attempted. But, to use the DNS protocol one must be able to locate DNS servers. How is that done (see section )? Let's assume we know a DNS server and the appropriate translation takes place. Then your system can connect to the IP address of www.ietf.org using the http protocol and TCP/IP. But, your system is on the Internet and most likely not on the same subnetwork as www.ietf.org. Therefore, your system requires a route from your IP network to the destination IP network where www.ietf.org is hosted. How are routers discovered? Sometimes they are part of your system's initial configuration, and otherwise there are protocols to find them (see section ). Let's assume your system knows the IP address of its router so that an attempt to connect to www.ietf.org can be made. Neither your system's network interface nor the router's understand IP since they are physical layer devices. They each require the MAC addresses that work on the physical layer, e.g., ethernet. Since your system and the router must be on the same subnet, your system uses the Address Resolution Protocol (ARP) [RFC826], broadcasts an ARP packet which includes its own IP and MAC address, and asks the system with the router's IP address, i.e., the router, to respond with its MAC address. We assume the router is up, and its MAC address is found.
Given that the network is in good form, IP packets from your system destined for www.ietf.org are sent to the router and forwarded on to the final destination. In this manner the connection will succeed, and http running on top of TCP/IP will retrieve the site's home page. Needless to say, we have glossed over many details here, and have left out other possible barriers that might have been traversed to connect to the IETF's website. But we have pointed out the key steps required to connect to a host somewhere on the Internet. In the following sections we fill in the missing details that must be known by your P2P software in order to permit any two peers to connect to one another.
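The first two of these steps (resolve the name, then connect, with routing and ARP handled below the socket API) can be sketched in Python. This is our illustration; the resolver parameter is a hypothetical hook we add only to make the DNS step explicit:

```python
import socket

def connect_to_site(host="www.ietf.org", port=80, resolver=socket.getaddrinfo):
    # Step 1: DNS translates the registered domain name into an IP address.
    family, socktype, proto, _, addr = resolver(host, port,
                                                type=socket.SOCK_STREAM)[0]
    # Step 2: the TCP connect. IP routing to the destination network and
    # ARP resolution of the router's MAC address happen below this call.
    s = socket.socket(family, socktype, proto)
    s.connect(addr)
    return s  # http would now run on top of this TCP/IP connection
```

The default arguments mirror the walkthrough above; in practice, a caller would hand the returned socket to an HTTP client layer.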






Figure 3-7. Connecting on the Internet

3.5.2 Issues

The most basic IP component your system possesses is its IP address. We have previously discussed the IPv4 address limitations and the as yet not fully deployed IPv6 solution, and why it is highly probable that your IP address is not fixed. Here we mean that each time your system boots, or after an appropriate lease expiration discussed just below, this address may change.

Dynamic Host Configuration Protocol (DHCP)

ISP's and enterprises with more clients than can be accommodated by their assigned IP address space require a mechanism to fairly distribute IP addresses to these clients. Here we mean that an address is reusable when not yet allocated. The assumption is made that every client is not up all of the time. DHCP provides a method to allocate shared IP addresses from pools of reusable, administratively assigned addresses. DHCP includes a lease time for address use as well as a way for a client to relinquish the address when it no longer needs it. DHCP in fact can provide all that is required to auto-configure a client under most situations. The DHCP service must be reachable by broadcast. Below is a short list of the more than one hundred DHCP options for providing information to a host about the network to which it is connected. A complete list can be found at the Internet Assigned Numbers Authority [IANA].






Tag  Name             Length  Meaning
---  ----             ------  -------
  1  Subnet Mask      4       Subnet Mask Value
  2  Time Offset      4       Time Offset in Seconds from UTC
  3  Router           N       N/4 Router addresses
  4  Time Server      N       N/4 Timeserver addresses
  5  Name Server      N       N/4 IEN-116 Server addresses
  6  Domain Server    N       N/4 DNS Server addresses
  7  Log Server       N       N/4 Logging Server addresses
  8  Quotes Server    N       N/4 Quotes Server addresses
  9  LPR Server       N       N/4 Printer Server addresses
 10  Impress Server   N       N/4 Impress Server addresses
 11  RLP Server       N       N/4 RLP Server addresses
 12  Hostname         N       Hostname string
 13  Boot File Size   2       Size of boot file in 512 byte chunks
 14  Merit Dump File  N       Client to dump and name the file to dump it to
 15  Domain Name      N       The DNS domain name of the client

Since, on the Internet in general, there is no guarantee that a system's IP address will remain fixed, end-to-end connectivity is broken on the real network. Because end-to-end connectivity is guaranteed on the overlay network with a unique peer-UUID, this guarantee must be reflected in the connectivity on the real network. To do this, the underlying P2P software must recognize the change of IP address and appropriately update the PeerIdentity document when this occurs. Let's assume that the peer, LucBoureau, relinquished its old IP address, and that DHCP then assigned a new one. This peer's PeerIdentity document would be updated as follows using 4PL:

String oldStr = "tcp://";
String newStr = "tcp://";
replaceField(peerIdentity, "<comprotocols>", oldStr, newStr);

oldStr = "";
newStr = "";
replaceField(peerIdentity, "<comprotocols>", oldStr, newStr);

This results in the new PeerIdentity document shown just below. How these changed documents are redistributed on the overlay network is discussed in the next chapter.

<?xml version="1.0"?>
<!DOCTYPE 4PL:PeerIdentity>
<4PL:PeerIdentity xmlns:4PL="">
  <peername> LucBoureau </peername>
  <peerUUID> uuid-AACDEF689321121288877EEFZ9615731 </peerUUID>
  <comprotocols>
    <real> tcp:// </real>
    <real> </real>
    <overlay> </overlay>
  </comprotocols>
</4PL:PeerIdentity>


This situation becomes much more complicated with the introduction of Network Address Translators (NAT).

Network Address Translator (NAT)

Again because of the shortage of IPv4 addresses, Network Address Translators [RFC1631, RFC3022] provide a simple scheme to permit a home user, or an enterprise, to use internal, Intranet-private addresses and map them to globally unique addresses on the exterior Internet. NAT boxes, as they are called, shield the external Internet from the IP addresses used on what is called the "stub network" for internal communication, and thus permit the massive duplication of internal IP addresses on these disjoint stub networks. This is shown in Figure 3-8, where a stub network host is communicating with a host on the Internet; the NAT box rewrites the source IP address (src) of outgoing packets and the destination IP address (dst) of incoming ones.

Figure 3-8. NAT Network Setup

Imagine that a small business assigns ten class A addresses to the ten systems it uses, one of which is the stub network address of the NAT box. Also, assume that externally, the NAT box has a single, globally unique IP address. Furthermore, suppose that one of the stub network systems wishes to telnet to an external system. The telnet application will generate a random local TCP port, say 2077, and try to connect to TCP port 23 on the external host. The NAT box does several things on the reception of IP packets from the system on the stub network. First, it replaces the source IP address with its globally unique IP address. Second, it assigns a TCP port to the outgoing connection, say 1024, and changes the source TCP port accordingly. It then maintains a port map, so that packets received for port 1024 are mapped back to port 2077 on the originating stub network system. In this manner, if there are simultaneous outgoing TCP connections from multiple systems on the stub network, then every connection will have its unique port mapping. Because the IP address is changed, the IP checksum must be recomputed for each outgoing packet. This change of IP address also forces a recomputation of the TCP checksum because the TCP pseudo-header must be updated to reflect the changed IP address. The pseudo-header and source port are part of the checksum calculation. Since the checksum calculation also includes all of the TCP data in the current IP packet, NATs can negatively affect TCP performance for large data transfers. UDP/IP packets also have a port and can be similarly mapped. Finally, NATs cause another restriction: one cannot fragment either TCP or UDP packets on the stub network side of a NAT. Packet fragmentation uses the following IP header fields: each IP packet has an identification field, a more-fragments flag, and a fragment offset. Thus, in the first fragment the more-fragments flag is true, and in the last fragment it is false. If two systems on the same stub network happen to be fragmenting outgoing TCP/IP (UDP/IP) packets, and the packets happen to have the same identification field and the same destination, then on the global side of the NAT the IP header information that is used to reassemble the packets at the ultimate destination will be identical. Since the TCP and UDP header information is only in the first fragment, all bets are off. Consequently, if one is behind a NAT, fragmentation cannot take place for those packets that have global destinations.
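The port-mapping bookkeeping just described can be sketched as a small translation table. This is our illustration (a real NAT also rewrites checksums, expires mappings, and handles UDP); the IP addresses are placeholders:

```python
class Nat:
    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self.next_port = first_port
        self.map = {}  # external port -> (stub ip, stub port)

    def outbound(self, stub_ip, stub_port):
        # Rewrite the source address/port of an outgoing connection and
        # remember the mapping so replies can be translated back.
        ext_port = self.next_port
        self.next_port += 1
        self.map[ext_port] = (stub_ip, stub_port)
        return (self.public_ip, ext_port)

    def inbound(self, ext_port):
        # Reverse translation for packets arriving from the Internet.
        # Unsolicited packets (no mapping) are dropped; this is exactly why
        # connections cannot be initiated from the exterior side of the NAT.
        return self.map.get(ext_port)

nat = Nat("")                     # hypothetical public address
assert nat.outbound("", 2077) == ("", 1024)
assert nat.inbound(1024) == ("", 2077)
assert nat.inbound(9999) is None           # no mapping: packet dropped
```

The port numbers 2077 and 1024 echo the telnet example above.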

Now, since NAT boxes can respond to DHCP, the fixed stub net addresses may be reusable and dynamically assigned. We can add the extra complication that the globally unique IP address on the Internet side of the NAT is acquired with DHCP, and thus changes over time as discussed in the above section. Furthermore, the stub network systems do not know the globally unique external NAT address. Consequently, the PeerIdentity document will contain real IP addresses that do not reflect the addresses really seen by other peers outside of the stub network, and these peers themselves may be on their own stub networks. The death blow to end-to-end connectivity is that one cannot reliably initiate connections from the exterior side of the NAT box. How this is resolved so that one has end-to-end connectivity on the overlay network is discussed in section 3.5.3.

Multiple Devices and Physical Layer Characteristics

Besides the headaches coming from firewall and NAT traversal, the growth of the scope and dimension of the Internet is causing more pain: wireless LANs, Bluetooth, sensors, etc. Will each small device have an IP address? Even without these wireless devices, we are running out of IPv4 addresses. How do different species talk to one another? Even a USA-based GSM cellular phone does not work on the CDMA networks used in Japan and China. The two wireless standards, Bluetooth and 802.11a/b, are fighting one another for deployment, and clearly, both are being deployed. Is it too much to ask whether it is possible to form end-to-end communication in this sea of devices? To build a P2P overlay network including every single device already sounds like a dream given this reality. However, when we look at the fundamentals of the problems, we foresee the opportunities:

1. The problems caused by the growing number of devices are already covered by our careful design of the PeerIdentity document. Not every device has to have an IP address; instead, a peer-UUID can be assigned for identification purposes independent of the underlying physical layer.

2. To solve the problems caused by the growing types of devices and their associated communication protocols, we need to find the common transport point between different networks. Then, the end-to-end P2P overlay network communication can be established on top of the transport chain using the common transport point, which in our case is the mediator.

Changing Identities - How do I Know the Color of a Chameleon?

All of the above issues are from a technological point of view, but the most unpredictable element which contributes to the "chaos" of the Internet is human users. Yes, the carefully designed PeerIdentity documents are perfect to identify people, and the systems that they use. Yet, look back to eBay's rating engine. A badly behaved user can discard an old identity, apply for a new user Id, and start a new "life" with a new "face". The same thing can happen on a P2P overlay network.

Although this sounds too idealistic, personally, we think one should be given a second chance. Internet life should be more ideal than real life in a reasonable sense. Going back to the changing identity issue, eBay's problem can be solved by giving the new account Id the smallest rating value. Hence, for a badly behaved user, even if he/she starts over, there is not much benefit. This is just one of the simplest examples of how to deal with such problems. As will be pointed out in chapter 5, there are many possible ways to engineer a nearly-fair overlay network where the evil can be caught and the damage can be limited.
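The suggested remedy, starting every new identity at the minimum rating so that discarding a bad reputation buys nothing, is simple to express. This is our sketch, not eBay's actual engine:

```python
MIN_RATING = 0
ratings = {}

def new_account(account_id):
    # A fresh identity always begins at the minimum rating, so shedding
    # a bad reputation confers no advantage over keeping it.
    ratings[account_id] = MIN_RATING

def rate(account_id, delta):
    # Ratings accumulate but can never sink below the starting floor.
    ratings[account_id] = max(MIN_RATING, ratings[account_id] + delta)

new_account("old-face")
rate("old-face", +5)     # some good behavior
rate("old-face", -7)     # then bad behavior; clamped at the floor
new_account("new-face")  # starting over lands at the same floor
assert ratings["old-face"] == ratings["new-face"] == MIN_RATING
```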

3.5.3 Solutions

Let's briefly summarize what we have learned about our P2P overlay network up to this point, and also what obstacles we must overcome to permit P2P communication on this network. The P2P overlay network is a collection of peer-nodes, each having a name, unique identity, and description. And, each peer must belong to a connected community in order to communicate with other peers in these communities. Also, since inter-connected-community communication is not possible, and we have overlay network information that must be communicated independently of these communities, there is a Public Connected Community to which any peer can belong to make access to this information universal. In particular, a peer must be able to enumerate all publicly available connected communities. It is important to note that "stealth" connected communities are not prohibited and that their existence would never be publicly available. Other out-of-band means are necessary to discover and join these communities.

For peer-nodes to discover and communicate with one another, we have defined three documents: the PeerIdentity document, the VirtualPort document, and the Connected Community document. As noted just above, the PeerIdentity document identifies peers. The VirtualPort document permits a peer to establish both listening and outgoing virtual sockets, and the Connected Community document creates safe "walled gardens" to both minimize the requirements for discovery, content exchange, and routing, and provide a framework upon which privacy and security can more easily be added.

First, as we began to look under the network hood, we found that there are many obstacles that prevent end-to-end communication on the real network. As a means to overcome these obstacles we mentioned the requirement for systems which we call mediators. Most importantly, mediators make these barriers invisible to the P2P application programmer. Second, in the PeerIdentity document is a description of a peer-node's real transports. While the overlay network is an abstraction that simplifies communication as mentioned above, this network must be bound to the real network transports by the underlying P2P software for the abstraction to be realized. In this section we describe in some detail exactly what a mediator does as well as how the above binding happens.

Transport Mediator

Reviewing the issues with the real transports, and their current protocols:

1. IP Multicast, limited to the local subnet, prevents peer-nodes from discovering one another if they are not on the same subnet,

2. Multicast is either non-existent, as in mobile phone networks, or device discovery is limited to a physical network with a small radius, such as Bluetooth,

3. Non-fixed IP addresses in the IPv4 address space,

4. NAT limitations which are like (3) but even worse,

5. Small devices with limited memory and processor capabilities requiring both storage and computational aid,

6. Routing our overlay network protocols to give us end-to-end communication,

7. The 100,000,000,000 devices requiring robust search algorithms for the discovery of peer-nodes, their associated documents, and content. This is currently a serious problem even with more than a million peer-nodes,

8. Ad-hoc document registration for these same peer-nodes, i.e., administered registration is no longer feasible.

Our solutions to all of these problems begin with the P2P Mediator. Mediators host peer-nodes, and any peer can be a mediator. For small, ad-hoc P2P overlay networks, a single mediator is sufficient. Certainly, this depends on the processing power, memory size, and disk storage of the mediator. If one imagines a neighborhood network with a maximum of 100 peer-nodes, then any fully equipped home system available today is fine. On the other hand, as we move to networks with thousands to millions of peer-nodes, the ratio of mediators to peer-nodes on a single P2P overlay network should decrease, because the systems hosting peer-nodes must be more powerful, and we also wish to maximize response time and stability so that the technology is compelling for users. In particular, there are some subtle computational problems related to routing. We are all familiar with the structure of telephone numbers: local calls are easily differentiated from national and international calls. Our mediators will be organized similarly to simplify routing and minimize routing table memory usage. The primary requirement is that each mediator have a fixed network address that is not NAT/firewall limited, so that it is always reachable from anywhere on its overlay network. Note that a mediator may be firewall limited, but in this case firewall traversal is not possible using this mediator. This is appropriate for an enterprise P2P overlay network that prohibits firewall traversal. A second requirement is that a mediator must be available within the context of the needs of the peer-nodes it supports. For example, a neighborhood mediator may only be up from 6pm until midnight during the week and 24 hours on Saturday and Sunday, while enterprise mediators will surely be 24x7 highly available.

Now, let's take a closer look at the problem of two peers discovering one another. If both peers are not NAT/firewall bound from each other, then even if they have dynamic IP addresses via DHCP, or they are on the same NAT stub network, discovery is possible. They may be on the same local network, and then multicast can be used to announce a peer's presence and send the documents previously described. Again, this is short lived because both NAT and DHCP acquired addresses can be leased and reused. Once two peers are not on the same local network or NAT stub network, the story changes. If these same peers are not on the same local network and are using DHCP services, then an out-of-band communication, e.g., email, a telephone call, or an ad-hoc name/address registry like an LDAP server, can be used to exchange network addresses so that communication is established and the documents sent to one another. This is cumbersome, but it works. Finally, if either of the peers is NAT bound, whether or not the NAT uses DHCP, then the external IP socket is hidden from the peer's software since port mapping is used, and there is no reliable way that this peer can give the information necessary for an externally located peer to contact it. Simply stated: because of NAT port mapping, this is impossible in general. Clearly, we have a real problem with discovery. So, how do our mediators solve the problem of discovery?

Mediators are always visible to the peers they host, and to the other mediators that are necessary to support the P2P overlay network. That is to say (see chapter 4 for the protocols, and algorithm details):

1. They must have fixed IP addresses,

2. Mediators must be external to all NAT stub networks they support,

3. If the P2P overlay network supports multiple physical layers whose level-2 transports are incompatible, e.g., Bluetooth and IP, then the mediator must support all such transports. In this case note that the overlay network as we have defined it is independent of these incompatibilities,

4. If one wishes to traverse firewalls, then the mediator must be outside of the firewall, and some protocol must be permitted to traverse the firewall. Examples are http and SOCKS Version 5,

5. The mediators must maintain a map of all known mediators, and this map is called the mediator map. This is maintained using the mediator discovery protocol.

Peers must have either a preconfigured contact or some other means to acquire the information needed to initially contact a mediator. For IP-based mediators/peers the contact information will be an IP address/socket pair. This may be in the software when it is booted, the software may have the address of an external website where mediators are registered, or, for a neighborhood P2P overlay network, it can be acquired out-of-band using, for example, email, a telephone call, or a casual conversation. Let's assume a mediator-hosted peer has the required contact information. What does the peer do to make itself known? It registers its peerIdentity, virtualPort, and connected community documents on that mediator. Since mediators will know about other mediators by means of the mediator-to-mediator communication protocol, this information is distributed among themselves. A mediator may have a limit to the number of peers it can host and may redirect a peer to a different mediator using the mediator redirect protocol. In Figure 3-9, each mediator has a complete map which includes all four mediators.

Figure 3-9. Simple Mediator Map

As the number of peers on a P2P overlay network grows, discovery becomes more and more difficult because the amount of data that is distributed among the mediators is enormous, and even with the best search algorithms, the task is still very difficult. One can imagine using a distributed hash table (DHT), and hashing references to all documents across the mediator map. In this case, a peer searching for a string will contact its mediator with the query, and the query's search string will be hashed and sent, along with the query, to the mediator which has the information in its hash table. If we are clever, we append enough information to the hashed data so that the mediator having the information in its hash table can forward the query to the peer from which the original document hash originated. This glosses over some hash algorithm details and routing issues, but nonetheless it is a reasonable approach. This is illustrated in Figure 3-10 where two document descriptions are sent to mediator M3 along with the routing information. Then M3 hashes the peerIdentity and virtualPort document descriptions to M1 and M4, respectively.


Figure 3-10. Hashing Scheme

In Figure 3-11, assume peer P1 queries mediator M1 for P2's virtualPort document. This query includes the routing information necessary for the responding peer to reach P1. M1 hashes the query and finds that M3 has the required information, then sends the query to M3. M3 has the routing information to forward the query to P2 via M2. Finally, since the query contains the route back to P1, P2 sends the response to P1 through M1 if M1 is reachable directly from P2. Otherwise, P2 has to use its mediator, M2, to communicate with M1 and P1.
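The essential ideas of this flow can be condensed into a short Python sketch: the query carries the route back to the requester, the search string selects the mediator holding the entry, and the stored entry names the publishing peer. This is illustrative only; the dictionary shapes and the collapsing of the P2-side hop into a direct lookup are assumptions, not the book's protocol.

```python
import hashlib

def route_query(mediators, source_route, search_string):
    """Condensed sketch of the Figure 3-11 flow. `mediators` is a list of
    dicts, each with a 'hash_table'; `source_route` is the route back to
    the querying peer, carried inside the query itself."""
    # Hash the search string to find the mediator holding the entry.
    j = int.from_bytes(hashlib.sha1(search_string.encode()).digest(), "big") % len(mediators)
    entry = mediators[j]["hash_table"].get(search_string)
    if entry is None:
        return None  # no mediator holds a matching document description
    # The publishing peer answers along the route carried in the query.
    return {"to": source_route, "from": entry["publisher"], "document": entry["document"]}
```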


Figure 3-11. Query and Response

Now, suppose a mediator crashes. In this case, a great deal needs to be done to stabilize the remaining mediators' hash tables. Let's assume that a mediator hosting several peers discovers that a mediator to which it has hashed data has crashed. What does this imply? There are multiple possibilities; recovery is difficult and can use a great deal of bandwidth and CPU time. Imagine we have a simple DHT algorithm (note that there are many possible DHT algorithms) where given a (string, object) pair, we do the SHA-1 hash of the string mod the number of mediators and store the (string, object) as well as the originating peer on the resulting mediator:

Given mediators M0, M1, ..., MN:

    j = SHA-1(string) MOD (N+1),  0 ≤ j ≤ N,

and the data will be in mediator Mj's hash table.
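The placement rule above translates directly into code. This is an illustrative Python sketch (the book deliberately does not prescribe an implementation); the function name and the byte-to-integer conversion are our assumptions.

```python
import hashlib

def mediator_index(string, n_mediators):
    """The simple DHT placement rule: j = SHA-1(string) MOD (N+1).
    The (string, object) pair is stored on mediator M_j."""
    digest = hashlib.sha1(string.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % n_mediators
```

Note that the function is deterministic: every mediator that evaluates it for the same string and the same mediator map size arrives at the same index, which is what makes lookups possible without any central coordinator.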

A mediator, Mk, crashes. First, all data hashed to that mediator must be rehashed. This implies that when data is hashed, a reference to the mediator to which it is hashed must be kept; in particular, if mediator Mj hashes data to mediator Mk, then Mk's mediator map entry on Mj should maintain the reference for the sake of computational efficiency during crash recovery. Then Mj need not search all of its hashed data for Mk; rather, it goes directly to Mk's mediator map entry. We can decide to keep the same modulus, N+1, and any data that was stored on the crashed mediator's hash table would then be stored on its successor, Mk+1 mod (N+1). This is OK, and all mediators need to do the same. If we used N rather than N+1 as the modulus, then all of the hashed data on all of the mediators must be rehashed, since a new hash algorithm modulus is being used. One could resort to a brute-force search instead, but this is not good for performance. When a mediator discovers that another mediator is down, it must notify all other mediators to keep the DHT consistent. Because there is a race condition here, i. e., a mediator may be hashing data to a crashed mediator and not discover it is down until it tries to store the data, we would use the simple rule: in this kind of failure, the mediator will wait until there is again a consistent mediator map, i. e., back off and wait for a notification, and if none arrives, then apply the rule that is used to maintain a consistent map. That might be a simple ping of each member of the map that has not been recently heard from. The more unstable mediators one has, the more complicated this maintenance problem becomes. Here we must emphasize that mediators are supposed to be stable systems, and it is important to try to minimize the impact of such disaster recovery.
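The successor rule is worth seeing concretely: because the modulus N+1 is kept, only the crashed mediator's own entries move, and everything else stays put. A minimal Python sketch, with `tables` standing in for the per-mediator hash tables (an assumed representation):

```python
def rehash_after_crash(tables, k):
    """Mediator M_k has crashed. Keeping the original modulus N+1, every
    (string, object) pair in M_k's table moves to its successor
    M_(k+1) mod (N+1); no other mediator's entries are touched.
    `tables` is a list of dicts indexed by mediator number."""
    n = len(tables)
    successor = (k + 1) % n
    tables[successor].update(tables[k])  # migrate the orphaned entries
    tables[k] = {}                       # M_k's slot is now retired
    return successor
```

Had we shrunk the modulus to N instead, nearly every key on every mediator would land on a different index, forcing a network-wide rehash; the successor rule confines the damage to one table.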

In Figure 3-12, mediators M0, M1 and M2 have discovered that M3 has crashed, have updated their mediator maps, and have rehashed the data that was hashed to M3 onto M0, M3's successor mod 4. Thus, in Figure 3-13, P2's peerIdentity document is rehashed to M0. Finally, P1's query is directed to M0 instead of M3.

Figure 3-12. Hashing Recovery Scheme

Figure 3-13. Query and Response after Hashing Recovery

Recall that connected communities are pair-wise communication disjoint. Using this feature and the mediator-to-mediator protocol we do the following:

1. Mediators maintain a mediator map for each CC they host, i. e., for those connected communities in which the peerNodes they host are active (see chapter 4),

2. Mediators communicate the existence of each CC they host to all other mediators using the mediator-to-mediator protocol. If another mediator hosts this CC, it adds the contacting mediator to its CC mediator map, and notifies that mediator that it also hosts peers belonging to the CC so that both have the same CC mediator map.

It is not necessary for CC mediator maps to be complete, that is to say, to contain an entry for every mediator that hosts a peer that is a member of a given CC. In this case it may be impossible to find all members of a CC, and this is permissible in P2P networks, i. e., discovery need not be complete. But CC mediator maps must be consistent, i. e., every mediator in a CC mediator map hosts at least one peer that is a member of the CC. So, why do all of this? It simplifies disaster recovery because when a mediator has crashed, and this is discovered, recovery is limited to those CC mediator maps to which the crashed mediator belongs. This is cool! Figure 3-14 shows that mediator M0 supports two connected communities, CC1 and CC2, and M1 only supports CC1. Similarly, M2 and M3 only support CC2, and M4 is the sole supporter of CC3.

Figure 3-14. CC Mediator Map

Figure 3-15 shows that mediator M0 has crashed, its peers have discovered alternative mediators, and CC mediator maps have been updated to reflect this.


Figure 3-15. CC Mediator Hash Recovery Scheme

Finally, mediators can proxy storage and computation for their hosted peers which are device constrained. Given this capability and the above discussion, the eight issues previously mentioned can each be resolved with the addition of mediators to the P2P overlay network.

Putting the Components Together: Mapping P2P Overlay Network to the Real Transport

We now have the components and documents that define the P2P overlay network, and mediators. We've also given an overview of discovery. What is missing is how the overlay network is mapped to the underlying real network by means of the real transports in the peerIdentity document and mediators. Looking back at Figures 3-4 and 3-5 we have the IP stack and the overlay network stack. The code for the implementation of the overlay network is at the application layer in the IP stack. We have a stack bound to a stack at the IP application layer. Here we formalize this binding.

P2P overlay network applications create virtual sockets. At this point it might be a good idea for the reader to review section 3.3. In the following, the first line is extracted from Table 3-1, and the second line represents a real transport connection:












Here, the peers bill and rita are members of connected community "1" and rita has a listening mobileAgent virtual socket active. The above table shows an open connection on the P2P overlay network, and we describe below exactly the steps necessary to establish this connection, which in turn requires a real connection on the chosen real transport. Every step from discovery to communication on the P2P overlay network requires a real transport mapping. This requires a case-by-case analysis:

1. Both peers are on the same local network,

2. The two peers are not on the same local network, and thus a mediator is required at some point, and perhaps throughout the communication.

In case 1 bill and rita would have discovered one another using IP multicast. They will also have the following in their peerIdentity documents:

On peer node bill we will have

<comprotocols> <real> tcp:// </real> </comprotocols>

and on rita,

<comprotocols> <real> tcp:// </real> </comprotocols>

Having discovered one another on the same subnet, the software that implements ONP will establish a real tcp connection for communication between bill and rita. The above table now showing both the ONP sockets and the IP sockets appears as follows:













In this way, the TCP data that is exchanged between bill and rita is appropriately dispatched by the ONP software to the mobile agent applications using the channel that is open on the P2P overlay network. There is a minor variation in this case where both peers have TCP/IP connectivity but cannot discover one another for many possible reasons, e. g., they are not on the same subnet. Here, after receiving each other's peerIdentity documents from a mediator, the ONP software attempts an initial TCP/IP connection to the transport addresses in the peerIdentity documents. This will be successful, and all further communication will proceed as above and the above table will be identical.
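The "try direct first, then fall back" behavior just described can be sketched as follows. This is an illustrative Python fragment, not the book's ONP software; the function name, the (host, port) tuple form, and the timeout value are assumptions.

```python
import socket

def connect_direct_or_mediated(peer_identity_addr, mediator_addr, timeout=3.0):
    """After obtaining a peerIdentity document from a mediator, first
    attempt a direct TCP connection to the advertised transport address;
    if that fails, fall back to relaying through the mediator. Both
    addresses are (host, port) tuples."""
    try:
        sock = socket.create_connection(peer_identity_addr, timeout=timeout)
        return ("direct", sock)
    except OSError:
        # Direct route unreachable (different subnet, NAT, etc.):
        # open the connection to the mediator instead.
        sock = socket.create_connection(mediator_addr, timeout=timeout)
        return ("mediated", sock)
```

In the direct case the rest of the exchange proceeds exactly as if the peers had discovered each other by multicast; only the bootstrap differed.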

To describe case 2 we assume that bill is on a NAT stub network, and rita is behind a firewall. Thus, a mediator is required for all communications. This begins with discovery and then applies to every communication between the two peers. bill is a member of CC1 and wishes to communicate with rita who is also a member of CC1. We assume that both bill and rita have one another's peerIdentity and virtualPort documents by the means described in the above discussion on mediators. As mentioned several times in this section, the details of document retrieval are thoroughly covered in chapter 4. bill's ONP software already knows its peer is behind a NAT. How? When bill initially contacted its mediator, say M0, the mediator requested bill's peerIdentity document and noted that the source IP address of the TCP/IP connection bill made is different from the IP address in the peerIdentity comprotocols fields. In this case three things will happen:

1. The mediator creates an INBOX for bill. The INBOX will be for all messages from other peers destined to bill. Recall, the mediator cannot contact bill because bill is behind a NAT, and so, bill must poll the mediator at a reasonable frequency to retrieve its messages. A mediator should let bill remain connected as long as the system resources permit, and a fairness algorithm for disconnecting hosted peers must be implemented,

2. The mediator notifies bill that it is behind a NAT, and sends bill its mediator document which contains the routing information necessary to reach this mediator. It may be that bill will communicate with rita via a different mediator, and need to append to all communications with rita the route back to bill, i. e., via M0,

3. bill will update its peerIdentity and virtualPort documents to reflect the routing information in the mediator document it received. That way, any further requests for either of these documents provide a highly probable route. Note that peers usually first ask for virtualPort documents. If they contain successful routes, then the peerIdentity document is not required. If a route fails, then in this case, the virtualPort document also contains the peerUUID which can be used to recover the most recent peerIdentity document with a viable route.
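The mediator-side NAT check and INBOX creation described above can be sketched compactly. This is an illustrative Python fragment; the dictionary representation of the mediator and its documents is an assumption, not the book's data model.

```python
def on_peer_contact(mediator, peer_uuid, advertised_addr, observed_addr):
    """When a peer first contacts a mediator: compare the address in the
    peer's peerIdentity comprotocols field (advertised_addr) with the
    source address of its TCP connection (observed_addr). If they
    differ, the peer is behind a NAT, so the mediator creates an INBOX
    for it and returns its mediator document (routing information) for
    the peer to fold into its own documents."""
    behind_nat = advertised_addr != observed_addr
    if behind_nat:
        # Messages destined to this peer queue here until the peer polls.
        mediator.setdefault("inboxes", {})[peer_uuid] = []
        return {"behind_nat": True, "mediator_document": mediator["document"]}
    return {"behind_nat": False}
```

Since the NAT blocks inbound connections, the INBOX plus polling is what stands in for the direct delivery a publicly addressable peer would get.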

1 and 2 above apply similarly to rita. Let's assume, without loss of generality, that rita is using M1. bill has an INBOX on M0 and rita an INBOX on M1. bill and rita both have their mediators' documents. bill sends a request to connect to rita's mobileAgent port to rita's INBOX on M1 using ACP/ONP. Recall bill has already communicated with rita, received the requisite peerIdentity and virtualPort documents, and thus, also has received from rita M1's routing information. This request includes the routing information necessary for rita to respond to bill. rita is polling the INBOX on M1 and receives this message. rita then completes the handshake by acknowledging in a similar manner to bill that the request to connect has been received. Recall that ACP/ONP is a reliable channel protocol and these messages are guaranteed to be delivered barring a catastrophic partitioning of the real network. Now we have the following connection tables on bill and rita describing the mappings between the overlay network and the real network.















In the above table the overlay network connectivity is always the same. This is what makes P2P a simplifying technology for applications. On the other hand, bill is behind a NAT, has received its IP address from the NAT using DHCP, and the NAT has a hidden but different external IP address. bill's remote address is the IP address of M0. rita is behind a firewall and is using an http proxy address to contact M1, and rita's remote address is the address of this proxy.

This completes the discussion of the functional mapping that takes place so that two peers can communicate on the P2P overlay network. Notice that all of the components and documents we have described up to here are required. Also, the description of the new fields containing routing information that are added to the peerIdentity and virtualPort documents will be completed in the next chapter where we describe the protocols that moderate the behavior of peers and mediators as well as the mediator document.

Chapter 4 Basic Behavior of Peers on a P2P System

We now have a good understanding of the definition of a P2P overlay network, its peer nodes, and how this network maps onto the underlying real transports. On the other hand, we have not yet provided the details necessary to ensure that peer nodes can dependably communicate with one another, where "dependability" must be taken in the context of the particular P2P network's place on the P2P spectrum as discussed in chapter 1, section 3. There is no P2P engineering way around the inherent ad-hoc behavior of some P2P overlay networks. In any case, these details are called protocols, or the rules that peer nodes must follow so that they can predictably and meaningfully interact. P2P overlay network protocols have both syntax and semantics. The syntax will define the form of the protocol data sent and/or received, and the semantics the required behavior and interpretation of this data. These protocols are programming language independent, and, as long as the protocols are well defined, correct implementations will interoperate. Network protocols are no different in spirit than the familiar rules of the road we must all know and obey to drive on the highways of our respective countries. And, although there are no international standards for such rules, there is a reasonable familiarity, a core behavior, that is common to them so that a driver in unfamiliar territory can be well behaved and probably not receive a traffic fine. What we are saying here is that we are all familiar with the use and need for network protocols even if we have never read the specifications, for example, the RFCs for IP and TCP. This chapter is all about P2P overlay network protocols. Where the P2P overlay network provides the arteries and veins for overlay network transport, the protocols permit them to be filled with their life's blood and to regulate its flow.

4.1 The P2P Overlay Network Protocols

Recall from chapter 3, section 3, the Overlay Network Stack. Below the application level is the transport level, and there we have two protocols: the Universal Message Protocol (UMP) and the Application Communication Protocol (ACP). At the bottom is the Overlay Network Protocol (ONP). The ONP specifies both the syntax of our message format and the semantics associated with each field this format defines. It is the IP equivalent on the overlay network.

4.1.1 The Overlay Network Protocol

We assume that the real transports bound to the overlay network will use protocols with the strength of IPv4 or IPv6 to manage the packet traffic on the underlying infrastructure, and thus, real transport issues are of no concern in this section. Rather, the ONP has as its goal the successful delivery of a message between two peers on the overlay network. This delivery requires a destination overlay network address. And, just like for IP, we must also supply the source overlay network address because communication is ultimately between two virtual sockets, and the information defining the two sockets must be included in the message. There are many reasons for always including the source address even if the communication does not require a response. For example, one needs to know the source of a message to discourage denial-of-service attacks, to maintain audit trails, to do billing, and to authenticate the source of a message for security reasons. Recall that a peerIdentity may contain cryptographically based information that permits source authentication by a receiver. Moreover, to simplify both routing to the destination, and the destination peer node's task of responding to the source, optional routing information can be added. Please note that we are not specifying an implementation, and thus not giving real values to any of the fields. A message implementation can have many forms, e. g., XML, binary, (name, value) pairs, etc. But, for reasons of performance we do suggest that the message be in a binary format. While the current fashion of expressing everything, including the "kitchen sink," in XML often has its merits, there is a performance hit one must take to analyze an XML document. Given that ONP messages contain information that may be used, modified and possibly extended on a message's route to the destination peerNode, it is imperative that the ONP fields be readily accessible without invoking an XML parse. On the other hand, the data payload is certainly a candidate for XML, or any other format the application writer may prefer.
The ONP message is comprised of following fields:

ONP Header:

1. Version - ONP Version number. This assures a controlled evolution of the protocol.