Peer-to-peer

From Wikipedia, the free encyclopedia

For other uses of the term, see Peer-to-peer (disambiguation).
For peer-to-peer networks used for file sharing, see File sharing.
[Figure: A peer-to-peer based network.]
[Figure: A server based network (i.e. not peer-to-peer).]

A peer-to-peer (or P2P) computer network relies on the diverse connectivity between its participants and on their cumulative bandwidth, rather than on conventional centralized resources in which a relatively small number of servers provide the core value of a service or application. P2P networks typically connect nodes via largely ad hoc connections. Such networks are useful for many purposes. Sharing content files (see file sharing) containing audio, video, data or anything else in digital format is very common, and real-time data, such as telephony traffic, is also passed using P2P technology.

A pure P2P network has no notion of clients or servers, only equal peer nodes that simultaneously function as both "clients" and "servers" to the other nodes on the network. This arrangement differs from the client-server model, in which communication is usually to and from a central server. A typical example of a file transfer that is not P2P is an FTP server, where the client and server programs are quite distinct: the clients initiate the downloads and uploads, and the servers react to and satisfy these requests.

In contrast to a pure P2P network, the Usenet news server system is a distributed discussion system that also adopts a client-server model. News servers communicated with one another to propagate Usenet news articles over the entire Usenet network, and, particularly in the earlier days of Usenet, UUCP was used to extend the network even beyond the Internet. However, the news server system acted in a client-server form when individual users accessed a local news server to read and post articles. The same consideration applies to SMTP email, in the sense that the core email relaying network of mail transfer agents follows a P2P model while the periphery of mail user agents and their direct connections is client-server. Tim Berners-Lee's vision for the World Wide Web, as evidenced by his WorldWideWeb editor/browser, was close to a P2P network in that it assumed each user of the web would be an active editor and contributor, creating and linking content to form an interlinked "web" of links. This contrasts with the more broadcasting-like structure of the web as it has developed over the years.

Some networks and channels such as Napster, OpenNAP and IRC server channels use a client-server structure for some tasks (e.g. searching) and a P2P structure for others. Networks such as Gnutella or Freenet use a P2P structure for all purposes, and are sometimes referred to as true P2P networks, although Gnutella is greatly facilitated by directory servers that inform peers of the network addresses of other peers.

P2P architecture embodies one of the key technical concepts of the Internet, described in the first Internet Request for Comments, RFC 1, "Host Software" dated 7 April 1969. More recently, the concept has achieved recognition in the general public in the context of the absence of central indexing servers in architectures used for exchanging multimedia files.

The concept of P2P is increasingly being extended to describe the relational dynamic active in distributed networks, i.e. not just computer to computer, but human to human. Yochai Benkler has coined the term "commons-based peer production" to denote collaborative projects such as free software. Associated with peer production are the concepts of peer governance (referring to the manner in which peer production projects are managed) and peer property (referring to the new type of licenses which recognize individual authorship but not exclusive property rights, such as the GNU General Public License and the Creative Commons licenses).

Classifications of P2P networks

P2P networks can be classified by what they can be used for:

  • file sharing
  • telephony
  • media streaming (audio, video)
  • discussion forums

Another classification of P2P networks is according to their degree of centralization.

In 'pure' P2P networks:

  • Peers act as equals, merging the roles of client and server
  • There is no central server managing the network
  • There is no central router

Some examples of pure P2P application-layer networks designed for file sharing are Gnutella and Freenet.

There are also many hybrid P2P systems, in which:

  • A central server keeps information on peers and responds to requests for that information.
  • Peers are responsible for hosting available resources (as the central server does not have them), for letting the central server know what resources they want to share, and for making their shareable resources available to peers that request them.
  • Route terminals are used as addresses, which are referenced by a set of indices to obtain an absolute address.

Examples of P2P networks by type:

  • Centralized P2P network such as Napster
  • Decentralized P2P network such as KaZaA
  • Structured P2P network such as CAN
  • Unstructured P2P network such as Gnutella
  • Hybrid P2P network (Centralized and Decentralized) such as JXTA (an open source P2P protocol specification)
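
The division of labour in a hybrid system can be illustrated with a minimal sketch. It assumes a single in-memory index and made-up peer addresses and file names; the class and method names (IndexServer, Peer, register, lookup, serve) are hypothetical and do not describe the protocol of Napster or any other real network. The point is only that the central server answers "who has this file?" while the file itself moves directly between peers.

```python
from collections import defaultdict


class IndexServer:
    """Hypothetical central index: records who shares what, never the files."""

    def __init__(self):
        self._holders = defaultdict(set)  # file name -> set of peer addresses

    def register(self, peer_addr, file_names):
        # A peer tells the server which resources it is willing to share.
        for name in file_names:
            self._holders[name].add(peer_addr)

    def unregister(self, peer_addr):
        # A departing peer is dropped from every entry.
        for holders in self._holders.values():
            holders.discard(peer_addr)

    def lookup(self, file_name):
        # The server only answers with addresses of peers holding the file.
        return sorted(self._holders.get(file_name, ()))


class Peer:
    def __init__(self, addr, files):
        self.addr = addr
        self.files = dict(files)  # file name -> content hosted by this peer

    def announce(self, index):
        index.register(self.addr, self.files)

    def serve(self, file_name):
        # Direct peer-to-peer transfer; the index server is not involved.
        return self.files[file_name]


index = IndexServer()
alice = Peer("10.0.0.1", {"song.ogg": b"la la"})
bob = Peer("10.0.0.2", {})
alice.announce(index)

peers = {p.addr: p for p in (alice, bob)}
holders = index.lookup("song.ogg")          # step 1: ask the central index
data = peers[holders[0]].serve("song.ogg")  # step 2: fetch from the peer
```

If the index server disappears, the peers still hold every file but can no longer find each other, which is the single point of failure that pure P2P designs try to avoid.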

Advantages of P2P networks

An important goal in P2P networks is that all clients provide resources, including bandwidth, storage space, and computing power. Thus, as nodes arrive and demand on the system increases, the total capacity of the system also increases. This is not true of a client-server architecture with a fixed set of servers, in which adding more clients could mean slower data transfer for all users.
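
The scaling argument can be made concrete with a deliberately idealized calculation. The model below is an assumption made for illustration only: upload capacity is the sole bottleneck, it is shared evenly, and every peer contributes the same amount. The function names and the figures used are hypothetical.

```python
def client_server_rate(server_upload, clients):
    """Per-client download rate when a fixed server capacity is split evenly."""
    return server_upload / clients


def p2p_rate(seed_upload, peer_upload, peers):
    """Per-peer rate when every peer adds its own upload capacity to the pool."""
    return (seed_upload + peer_upload * peers) / peers


# Compare both models as the number of participants grows
# (capacities are in arbitrary units).
for n in (10, 100, 1000):
    print(n, client_server_rate(100.0, n), p2p_rate(100.0, 1.0, n))
```

Under these assumptions the client-server rate keeps shrinking as clients are added, while the P2P rate levels off near the upload capacity of a single peer, because each new arrival brings capacity along with demand.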

The distributed nature of P2P networks also increases robustness in case of failures by replicating data over multiple peers and, in pure P2P systems, by enabling peers to find the data without relying on a centralized index server. In the latter case, there is no single point of failure in the system.[1]

Unstructured and structured P2P networks

The P2P overlay network consists of all the participating peers as network nodes. There are links between any two nodes that know each other: i.e. if a participating peer knows the location of another peer in the P2P network, then there is a directed edge from the former node to the latter in the overlay network. Based on how the nodes in the overlay network are linked to each other, we can classify the P2P networks as unstructured or structured.

An unstructured P2P network is formed when the overlay links are established arbitrarily. Such networks can be easily constructed as a new peer that wants to join the network can copy existing links of another node and then form its own links over time. In an unstructured P2P network, if a peer wants to find a desired piece of data in the network, the query has to be flooded through the network to find as many peers as possible that share the data. The main disadvantage with such networks is that the queries may not always be resolved. Popular content is likely to be available at several peers and any peer searching for it is likely to find the same thing. But if a peer is looking for rare data shared by only a few other peers, then it is highly unlikely that search will be successful. Since there is no correlation between a peer and the content managed by it, there is no guarantee that flooding will find a peer that has the desired data. Flooding also causes a high amount of signaling traffic in the network and hence such networks typically have very poor search efficiency. Most of the popular P2P networks such as Gnutella and FastTrack are unstructured.
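
The behaviour of flooding can be sketched on a toy overlay. The example below is an illustration under stated assumptions, not the algorithm of any particular network: the hop limit (ttl), the overlay graph and the file names are all invented for the example, and every forwarded query is counted as one message.

```python
from collections import deque


def flood_search(links, files, start, wanted, ttl):
    """Flood a query outwards from `start` for at most `ttl` hops.

    Returns (peers found holding `wanted`, number of query messages sent).
    """
    seen = {start}
    queue = deque([(start, ttl)])
    found, messages = [], 0
    while queue:
        node, hops_left = queue.popleft()
        if wanted in files.get(node, ()):
            found.append(node)
        if hops_left == 0:
            continue  # hop limit reached: this peer does not forward further
        for neighbour in links.get(node, ()):
            messages += 1  # each forwarded copy of the query is one message
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops_left - 1))
    return found, messages


# Hypothetical overlay: arbitrary links, with no relation to who holds what.
links = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"B": {"popular.mp3"}, "C": {"popular.mp3"}, "E": {"rare.mp3"}}

popular = flood_search(links, files, "A", "popular.mp3", ttl=2)
rare_short = flood_search(links, files, "A", "rare.mp3", ttl=2)
rare_long = flood_search(links, files, "A", "rare.mp3", ttl=3)
```

With a limit of two hops the popular file is found at two peers, but the rare file held only by E is missed even though the same number of messages is spent; raising the limit to three hops finds it at the cost of more traffic.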

Structured P2P networks employ a globally consistent protocol to ensure that any node can efficiently route a search to some peer that has the desired file, even if the file is extremely rare. Such a guarantee necessitates a more structured pattern of overlay links. By far the most common type of structured P2P network is the distributed hash table (DHT), in which a variant of consistent hashing is used to assign ownership of each file to a particular peer, in a way analogous to a traditional hash table's assignment of each key to a particular array slot. Some well-known DHTs are Chord, Pastry, Tapestry, CAN, and Tulip. HyperCuP is an example of a structured P2P network that does not use the DHT approach.

The Chord Protocol

The Chord protocol is one solution for connecting the peers of a P2P network. Chord consistently maps a key onto a node. Both keys and nodes are assigned an m-bit identifier: a node's identifier is a hash of the node's IP address, and a key's identifier is a hash of the key. There are many other algorithms in use by P2P networks, but this is a simple and common approach.

Identifiers are arranged on a ring with positions numbered 0 to 2^m - 1. Key k is assigned to node successor(k), which is the node whose identifier is equal to or follows the identifier of k on the ring. If there are N nodes and K keys, then each node is responsible for roughly K / N keys.
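
The assignment of keys to nodes can be sketched in a few lines. This is a minimal illustration, assuming identifiers live on a ring of 2^m positions (numbered 0 to 2^m - 1), that SHA-1 truncated to m bits serves as the hash, and that the full sorted list of node identifiers is available in one place, which a real deployment would not have. The names chord_id and successor and the sample node identifiers are chosen for this example.

```python
import hashlib
from bisect import bisect_left

M = 6  # identifier bits, kept small for illustration (ring of 64 positions)


def chord_id(text, m=M):
    """Map a string (an IP address or a key) to an m-bit identifier.

    Using SHA-1 here is an assumption made for the sketch.
    """
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids, k):
    """Return the first node identifier equal to or following k on the ring.

    `node_ids` must be sorted; the search wraps around past the top of the ring.
    """
    i = bisect_left(node_ids, k)
    return node_ids[i % len(node_ids)]


# Hypothetical ring with ten nodes.
nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

owner_of_10 = successor(nodes, 10)  # 14 follows 10
owner_of_8 = successor(nodes, 8)    # an identifier equal to a node maps to it
owner_of_60 = successor(nodes, 60)  # wraps around the ring to node 1
key_id = chord_id("song.ogg")       # where a named key lands on the ring
```

Because only the keys between a node and its predecessor belong to that node, adding or removing one node moves only the keys in a single arc of the ring.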

When the (N+1)st node joins or leaves the network, responsibility for O(K/N) keys changes hands. Each node knows the IP address of its successor, so a particular key can be found by a linear search around the ring, passing the query from successor to successor. This is a naive method for searching the network.

A faster search requires each node to keep a "finger table" containing up to m entries. The i-th entry of node n contains the address of successor((n + 2^(i-1)) mod 2^m), for i = 1, ..., m.

The number of nodes that must be contacted to find a successor in an N-node network is O(log N). Proof sketch: Assume node n wants to resolve a query for key k. Let p be the node that contains k. We will analyze the number of steps to reach p. Let i be such that p is in [n+2^(i-1), n+2^i). Node n will contact the smallest node in this interval; call this node f. Because f lies at least 2^(i-1) past n while p lies less than 2^i past n, f is closer to p than to n. Therefore, in one step, the distance to p decreases by at least half, so at most m such steps are needed; when node identifiers are spread evenly around the ring, about log N steps suffice.
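
Finger-table routing can be simulated on a small ring to watch the hop count stay logarithmic. The sketch below makes several assumptions: m = 6, a fixed set of ten hypothetical node identifiers, finger i of node n pointing at successor(n + 2^(i-1)), and a global node list used only to build the tables. It follows the usual "forward to the closest preceding finger" rule and counts one hop each time the query is forwarded; it is not a networked implementation.

```python
from bisect import bisect_left

M = 6
RING = 2 ** M
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # hypothetical node identifiers


def successor(k):
    """First node identifier equal to or following k, wrapping around."""
    i = bisect_left(NODES, k % RING)
    return NODES[i % len(NODES)]


def fingers(n):
    """Finger table of node n: entry i points at successor(n + 2^(i-1))."""
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]


def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise on the ring."""
    if a < b:
        return a < x < b
    return x > a or x < b


def in_half_open(x, a, b):
    """True if x lies in (a, b] going clockwise on the ring."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def lookup(start, key):
    """Return (node responsible for key, number of times the query is forwarded)."""
    if key == start:
        return start, 0  # the starting node already owns this identifier
    n, hops = start, 0
    while True:
        table = fingers(n)
        if in_half_open(key, n, table[0]):
            return table[0], hops  # the key belongs to n's immediate successor
        # Otherwise forward to the farthest finger that does not overshoot the key.
        n = next(f for f in reversed(table) if in_open(f, n, key))
        hops += 1


result = lookup(8, 54)  # the query travels 8 -> 42 -> 51, which answers with 56
worst = max(lookup(s, k)[1] for s in NODES for k in range(RING))
```

Each forwarding step jumps at least halfway towards the target, so on this 64-position ring no query is forwarded more than m = 6 times, whereas following successors alone could visit every node.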

US legal controversy

See also: File sharing and the law

In Sony Corp. v. Universal Studios, 464 U.S. 417 (1984), the Supreme Court found that Sony's new product, the Betamax, did not subject Sony to secondary copyright liability because it was capable of substantial non-infringing uses. Decades later, this case became the jumping-off point for all peer-to-peer copyright infringement litigation.

The first peer-to-peer case was A&M Records v. Napster, 239 F.3d 1004 (9th Cir. 2001). In the Napster case, the 9th Circuit considered whether Napster was liable as a secondary infringer. First, the court considered whether Napster was contributorily liable for copyright infringement. To be found contributorily liable, Napster must have engaged in "personal conduct that encourages or assists the infringement."[2] The court found that Napster was contributorily liable for the copyright infringement of its end-users because it "knowingly encourages and assists the infringement of plaintiffs' copyrights."[3] The court went on to analyze whether Napster was vicariously liable for copyright infringement. The standard applied by the court was whether Napster "has the right and ability to supervise the infringing activity and also has a direct financial interest in such activities."[4] The court found that Napster did receive a financial benefit, and had the right and ability to supervise the activity, meaning that the plaintiffs demonstrated a likelihood of success on the merits of their claim of vicarious infringement.[5] The court rejected all of Napster's defenses, including its claim of fair use.

The next major peer-to-peer case was MGM v. Grokster, 545 U.S. 913 (2005). In this case, the Supreme Court found that even if Grokster was capable of substantial non-infringing uses, which the Sony Court found was enough to relieve one of secondary copyright liability, Grokster was still secondarily liable because it induced its users to infringe.[6]

It is important to note where blame falls in cases such as these. In a pure P2P network there is no host, but in practice most P2P networks are hybrids (see "Computer science perspective" below). This has led groups such as the RIAA to file suit against individuals, young and old, rather than against companies. Napster was held liable and ultimately lost in court because it was not a pure P2P network: it maintained a central server, which kept an index of the files currently available on the network.

Around the world in 2006, an estimated five billion songs, equating to 38,000 years of music, were swapped on peer-to-peer websites, while 509 million were purchased online.[7]

Computer science perspective

Technically, a completely pure P2P application must implement only peering protocols that do not recognize the concepts of "server" and "client". Such pure peer applications and networks are rare. Most networks and applications described as P2P actually contain or rely on some non-peer elements, such as DNS. Also, real world applications often use multiple protocols and act as client, server, and peer simultaneously, or over time. Completely decentralized networks of peers have been in use for many years: two examples are Usenet (1979) and FidoNet (1984).

Many P2P systems use stronger peers (super-peers or super-nodes) as servers, with client peers connected in a star-like fashion to a single super-peer.

In the late 1990s, Sun added classes to the Java technology to speed the development of P2P applications, so that developers could build decentralized real-time chat applets and applications before instant messaging networks were popular. This effort is now being continued with the JXTA project.

P2P systems and applications have attracted a great deal of attention from computer science research; some prominent research projects include the Chord project, the PAST storage utility, P-Grid (a self-organized and emerging overlay network) and the CoopNet content distribution system.


Distributed hash table (DHT) networks have been widely used for efficient resource discovery [8] [9] in Grid computing systems, where they aid resource management and the scheduling of applications. Resource discovery involves searching for the resource types that match the user's application requirements. Recent advances in decentralized resource discovery have been based on extending existing DHTs with the capability of multi-dimensional data organization and query routing. Most of the efforts have looked at embedding spatial database indices, such as space-filling curves (SFCs, including Hilbert curves and Z-curves), k-d trees, MX-CIF quadtrees and R*-trees, for managing, routing, and indexing complex Grid resource query objects over DHT networks. Spatial indices are well suited for handling the complexity of Grid resource queries. Although some spatial indices can have routing load-balance issues with a skewed data set, all the spatial indices are more scalable in terms of the number of hops traversed and messages generated while searching and routing Grid resource queries.
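
One way to see how a space-filling curve turns a multi-dimensional resource description into a single DHT key is the Z-curve, which interleaves the bits of the coordinates. The sketch below is an illustration only: the two attributes (CPU cores and memory in GB), the 4-bit width and the machine names are assumptions made for the example, not taken from the systems cited above.

```python
def z_order(x, y, bits=4):
    """Interleave the low `bits` bits of x and y into one Z-curve key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return key


# Hypothetical resources described by (cpu_cores, memory_gb).
resources = {"node-a": (3, 5), "node-b": (2, 5), "node-c": (12, 1)}
keys = {name: z_order(cpu, mem) for name, (cpu, mem) in resources.items()}
```

The resulting one-dimensional keys can be stored in an ordinary DHT, and resources with similar attributes tend to receive nearby keys, which is what allows range-style queries to be routed to a limited part of the identifier space.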

Applications of P2P networks besides file sharing

  • Bioinformatics: P2P networks have also begun to attract attention from scientists in other disciplines, especially those that deal with large datasets, such as bioinformatics. P2P networks can be used to run large programs designed to carry out tests to identify drug candidates. The first such program was begun in 2001 by the Centre for Computational Drug Discovery at Oxford University in cooperation with the National Foundation for Cancer Research. There are now several similar programs running under the auspices of the United Devices Cancer Research Project.
  • Academic search engine: The sciencenet P2P search engine provides a free and open search engine for scientific knowledge. sciencenet is based on YaCy technology. Universities and research institutes can download the free Java software and contribute their own peer(s) to the global network (Liebel-Lab, Karlsruhe Institute of Technology, KIT).
  • Education and academia: Because P2P networks offer fast distribution and large storage space, many organizations are trying to apply them for educational and academic purposes. For instance, Pennsylvania State University, MIT and Simon Fraser University are carrying out a project called LionShare, designed to facilitate file sharing among educational institutions globally.
  • Military: The U.S. Department of Defense has already started research on P2P networks as part of its modern network warfare strategy. In November 2001, Colonel Robert Wardell from the Pentagon told a group of P2P software engineers at a tech conference in Washington, DC: "You have to empower the fringes if you are going to... be able to make decisions faster than the bad guy".[10] Wardell indicated he was looking for P2P experts to join his engineering effort. In May 2003, Dr. Tether, Director of the Defense Advanced Research Projects Agency, testified that the U.S. military is using P2P networks. For security reasons, details are kept classified.
  • Business: P2P networks have already been used in business, but adoption is still in its early stages. Studies by Kato et al. indicate that over 200 companies, with approximately US$400 million, are investing in P2P networks. Besides file sharing, companies are also interested in distributed computing, content distribution, e-marketplaces, distributed search engines, groupware and office automation via P2P networks. Companies sometimes prefer P2P for several reasons: real-time collaboration, for which a server cannot scale well with an increasing volume of content; processes that require strong computing power; processes that need high-speed communications; and so on. At the same time, P2P is not fully used as it still faces many security issues.
  • TV: Quite a few applications are available to deliver TV content over a P2P network (P2PTV).
  • Telecommunication: People are no longer satisfied with merely being able to hear a person on the other side of the earth; demand for clearer voice in real time is increasing globally. As with the TV network, cables are already in place and companies are unlikely to replace them all, so many turn to the Internet, and more specifically to P2P networks. For instance, Skype, one of the most widely used Internet phone applications, uses P2P technology. Furthermore, many research organizations are trying to apply P2P networks to cellular networks.

Security

Anonymity

Main article: Anonymous P2P

Some P2P protocols (such as Freenet) attempt to hide the identity of network users by passing all traffic through intermediate nodes.

Encryption

Some P2P networks encrypt the traffic flows between peers.

This may help to:

  • make it harder for an ISP to detect that peer-to-peer technology is being used (as some artificially limit bandwidth)
  • hide the contents of the file from eavesdroppers
  • impede efforts towards law enforcement or censorship of certain kinds of material
  • authenticate users and prevent 'man in the middle' attacks on protocols
  • aid in maintaining anonymity

Networks, protocols and applications

Networks and protocols

Network or Protocol | Use | Applications
ANts P2P | File sharing/Software distribution/Media distribution | ANts P2P
Ares | File sharing | Ares Galaxy, Warez P2P, KCeasy
BitTorrent | File sharing/Software distribution/Media distribution | ABC, AllPeers, Vuze (formerly Azureus), BitComet, BitLord, BitTornado, BitTorrent, Burst!, Deluge, FlashGet, G3 Torrent, Halite, KTorrent, LimeWire, MLDonkey, Opera, Panthera, QTorrent, rTorrent, Shareaza, TorrentFlux, Transmission, Tribler, µTorrent, Xunlei
Direct Connect | File sharing, chat | DC++, NeoModus Direct Connect, SababaDC, BCDC++, RevConnect, fulDC, LDC++, CzDC, McDC++, DCDM++, DDC++, iDC++, IceDC++, Zion++, R2++, rmDC++, LinuxDC++, LanDC++, ApexDC++, StrongDC++
Domain Name System | Internet information retrieval | See Comparison of DNS server software
eDonkey | File sharing | aMule, eDonkey2000 (discontinued), eMule, eMule Plus, FlashGet, iMesh, Jubster, lMule, MLDonkey, Morpheus, Panthera, Pruna, Shareaza, xMule
FastTrack | File sharing | giFT, Grokster, iMesh (and its variants stripped of adware including iMesh Light), Kazaa (and its variants stripped of adware such as Kazaa Lite), KCeasy, Mammoth, MLDonkey, Poisoned
Freenet | Distributed data store | Entropy (on its own network), Freenet
GNUnet | File sharing, chat | GNUnet, (GNUnet-gtk)
Gnutella | File sharing | Acquisition, BearShare, Cabos, FilesWire, FrostWire, Gnucleus, Grokster, gtk-gnutella, iMesh, Kiwi Alpha, LimeWire, MLDonkey, Morpheus, MP3 Rocket, Panthera, Poisoned, Shareaza, Swapper, XoloX
Gnutella2 | File sharing | Adagio, Gnucleus, Kiwi Alpha, MLDonkey, Morpheus, Panthera, Shareaza, TrustyFiles
JXTA | Peer applications | Collanos Workplace (Teamwork software), Sixearch
Kad Network | File sharing | aMule, eMule, MLDonkey
KDP and SDDP | File Distribution | Kontiki
Krawler | Social network | Krawler[x]
MANOLITO/MP2P | File sharing | Blubster, Piolet
MFPnet | File sharing | amiciPhone (no longer available)
Napster | File sharing | Napigator, Napster
OpenNap | File sharing | WinMX, Utatane, XNap, Napster
P2PTV | Video stream or file sharing | TVUPlayer, Joost, CoolStreaming, Cybersky-TV, TVants, PPLive, LiveStation
PDTP | Streaming media or file sharing | PDTP
Peercasting | Multicasting streams | PeerCast, IceShare, FreeCast, Rawflow
Pichat | Chat, Collaboration | a peer-to-peer chat platform
Usenet | Distributed discussion | See list of news clients
WPNP | File sharing | WinMX
Windows Peer-to-Peer | Distributed peer application development, collaboration [11] | Shipped with Advanced Networking Pack for Windows XP [12], Windows XP SP2, Windows Vista. This is a Windows component that runs only over IPv6 and provides a 'meta' peer-to-peer network that applications can utilize. It does not have file sharing support but third parties can develop one. [11] It also includes the Peer Name Resolution Protocol that allows dynamic domain name publication and resolution of names to endpoints. Windows Meeting Space and the People Near Me feature of Windows Vista use this protocol. It can be used to set up a Windows Internet Computer Name (WICN) using netsh p2p. [13]

An earlier generation of peer-to-peer systems was called "metacomputing" or classed as "middleware". These include Legion and Globus.

Multi-network applications

Applications | Network or Protocol | Operating systems | License
aMule | eDonkey network, Kad network | Cross-platform | GPL
DC++ | BCDC++ | Windows | GPL
eMule | eDonkey network, Kad network | Windows | GPL
FilesWire | Gnutella, G3 | Cross-platform | Proprietary
giFT | eDonkey network, FastTrack, Gnutella | Cross-platform | GPL
Gnucleus | Gnutella, Gnutella2 | Windows | GPL
gtk-gnutella | Gnutella | Linux | GPL
iMesh | FastTrack, eDonkey network, Gnutella, Gnutella2 (all prior to version 6.0 only) | Windows | Proprietary
KCeasy | Ares, FastTrack, Gnutella, OpenFT | Windows | GPL
Kiwi Alpha | Gnutella, Gnutella2 | Windows | Proprietary
MLDonkey | BitTorrent, Direct Connect, eDonkey network, FastTrack, Kad Network, OpenNap, SoulSeek, HTTP/FTP | Cross-platform | GPL
Morpheus | Gnutella, Gnutella2, BitTorrent | Windows | Proprietary
Shareaza | Gnutella, Gnutella2, eDonkey, BitTorrent, HTTP/FTP | Windows | GPL
Vagaa | BitTorrent, eDonkey, Kad | Windows | Proprietary
WinMX | WPNP, OpenNap | Windows | Proprietary
Zultrax | Gnutella, ZEPP | Windows | Proprietary

Social and economic issues

Some researchers have explored the benefits of enabling virtual communities to self-organize and to introduce incentives for resource sharing and cooperation, arguing that what is missing from today's peer-to-peer systems should be seen both as a goal and as a means for self-organized virtual communities to be built and fostered.[14] Ongoing research efforts to design effective incentive mechanisms in P2P systems, based on principles from game theory, are beginning to take on a more psychological and information-processing direction.

References

  1. ^ Advantages of peer-to-peer networks
  2. ^ A&M Records v. Napster, Inc., 239 F.3d 1004, 1019 (9th Cir. 2001) citing Matthew Bender & Co. v. West Publ'g Co., 158 F.3d 693, 706 (2d Cir. 1998)
  3. ^ Napster, at 1020.
  4. ^ Napster, at 1022, citing Gershwin Publ'g Corp. v. Columbia Artists Mgmt., Inc., 443 F.2d 1159, 1162 (2d Cir. 1971).
  5. ^ Napster, at 1024.
  6. ^ MGM v. Grokster, 545 U.S. 913, 940 (2005).
  7. ^ June 2008, The Tables Have Turned: Rock Stars – Not Record Labels – Cashing In On Digital Revolution, IBISWorld
  8. ^ Rajiv Ranjan, Aaron Harwood, and Rajkumar Buyya, Peer-to-Peer Based Discovery of Grid Resource, http://www.cs.mu.oz.au/%7Erranjan/pgrid.pdf
  9. ^ Rajiv Ranjan, Lipo Chan, Aaron Harwood, Shanika Karunasekera, and Rajkumar Buyya, Decentralised Resource Discovery Service for Large Scale Federated Grids
  10. ^ Walker, Leslie. Uncle Sam Wants Napster! The Washington Post, November 8, 2001
  11. ^ a b Windows Peer-to-peer SDK FAQ
  12. ^ Overview of the Advanced Networking Pack for Windows XP
  13. ^ Windows Peer-to-Peer Networking
  14. ^ Antoniadis, P. & Le Grand, B. (2007). Incentives for resource sharing in self-organized communities: From economics to social psychology. Digital Information Management, 2007. ICDIM '07
