Advanced Internet
Technology

Mayur Patel**
Pratik Gandhi**
Abhishek Chandan*

We have tried our best to compile all the topics of AIT. The remaining topics will be added soon.
**Major Contribution, * Medium Contribution
Module 1: Advanced Internet Protocols
Domain Name System
DNS:

The domain name system (DNS) is the way that Internet domain names are located and
translated into Internet Protocol addresses. A domain name is a meaningful and easy-to-remember
"handle" for an Internet address. Because maintaining a central list of domain name/IP address
correspondences would be impractical, the lists of domain names and IP addresses are distributed
throughout the Internet in a hierarchy of authority. There is probably a DNS server within close
geographic proximity to your access provider that maps the domain names in your Internet requests
or forwards them to other servers in the Internet.

Name servers:

The Domain Name System is maintained by a distributed database system, which uses the
client-server model. The nodes of this database are the name servers. Each domain has at least one
authoritative DNS server that publishes information about that domain and the name servers of any
domains subordinate to it. The top of the hierarchy is served by the root nameservers, the servers to
query when looking up (resolving) a top-level domain name (TLD).

Domain Name Space:

The domain name space consists of a tree of domain names. Each node or leaf in the tree has
zero or more resource records, which hold information associated with the domain name. The tree
sub-divides into zones beginning at the root zone. A DNS zone may consist of only one domain, or
may comprise many domains and sub-domains, depending on the administrative authority delegated
to the manager.

Advanced Internet Technologies


Notes compiled by Mayur Patel, Pratik Gandhi and Abhishek Chandan
Administrative responsibility over any zone may be divided by creating additional zones.
Authority is said to be delegated for a portion of the old space, usually in the form of sub-domains, to
another name server and administrative entity. The old zone ceases to be authoritative for the new
zone.
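
The tree structure described above can be illustrated with a minimal Python sketch. It only lists the names that lie on the path from the root down to a given domain name; whether each of those names is actually a separately delegated zone depends on how administrators have divided authority, which this sketch does not know. The function name delegation_chain is hypothetical.

```python
def delegation_chain(name: str) -> list[str]:
    """Return the names on the path from the root of the tree down to `name`."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]  # the root of the domain name space
    # Walk from the rightmost label (the TLD) towards the leftmost one.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(delegation_chain("www.example.com"))
# ['.', 'com.', 'example.com.', 'www.example.com.']
```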

Authoritative name server:

An authoritative name server is a name server that gives answers that have been configured
by an original source, for example, the domain administrator or by dynamic DNS methods, in
contrast to answers that were obtained via a regular DNS query to another name server. An
authoritative-only name server only returns answers to queries about domain names that have been
specifically configured by the administrator.

An authoritative name server can either be a master server or a slave server. A master server
is a server that stores the original (master) copies of all zone records. A slave server uses an
automatic updating mechanism of the DNS protocol in communication with its master to maintain an
identical copy of the master records.

Recursive and caching name server:

In principle, authoritative name servers are sufficient for the operation of the Internet.
However, with only authoritative name servers operating, every DNS query must start with recursive
queries at the root zone of the Domain Name System and each user system must implement resolver
software capable of recursive operation.

To improve efficiency, reduce DNS traffic across the Internet, and increase performance in
end-user applications, the Domain Name System supports DNS cache servers which store DNS
query results for a period of time determined in the configuration (time-to-live) of the domain name
record in question. Typically, such caching DNS servers, also called DNS caches, also implement
the recursive algorithm necessary to resolve a given name starting with the DNS root through to the
authoritative name servers of the queried domain. With this function implemented in the name
server, user applications gain efficiency in design and operation.

The combination of DNS caching and recursive functions in a name server is not mandatory;
the functions can be implemented independently in servers for special purposes.

Internet service providers typically provide recursive and caching name servers for their
customers. In addition, many home networking routers implement DNS caches and recursors to
improve efficiency in the local network.
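
The caching behaviour described in this section can be sketched as a small in-memory cache that honours the time-to-live of each record. This is a toy model, not how any particular name server is implemented: the class name DnsCache, its methods and the injectable clock are all assumptions made for illustration.

```python
import time


class DnsCache:
    """Toy cache mapping (name, record type) to (value, expiry time)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the example is deterministic
        self._entries = {}

    def put(self, name, rtype, value, ttl):
        # Domain names are case-insensitive, so normalise the key.
        self._entries[(name.lower(), rtype)] = (value, self._clock() + ttl)

    def get(self, name, rtype):
        key = (name.lower(), rtype)
        entry = self._entries.get(key)
        if entry is None:
            return None          # never cached: a real server would now resolve it
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # the record outlived its TTL
            return None
        return value


now = [0.0]                          # fake clock, in seconds
cache = DnsCache(clock=lambda: now[0])
cache.put("example.com", "A", "192.0.2.10", ttl=300)
print(cache.get("EXAMPLE.com", "A"))   # 192.0.2.10
now[0] = 301.0
print(cache.get("example.com", "A"))   # None
```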

DNS resolvers:

The client-side of the DNS is called a DNS resolver. It is responsible for initiating and
sequencing the queries that ultimately lead to a full resolution (translation) of the resource sought,
e.g., translation of a domain name into an IP address.

A DNS query may be either a non-recursive query or a recursive query:

• A non-recursive query is one in which the DNS server provides a record for a domain for
which it is authoritative itself, or it provides a partial result without querying other servers.
• A recursive query is one for which the DNS server will fully answer the query (or give an
error) by querying other name servers as needed. DNS servers are not required to support
recursive queries.

The resolver, or another DNS server acting recursively on behalf of the resolver, negotiates
use of recursive service using bits in the query headers.
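
As a rough illustration of the header bits mentioned above, the following Python sketch builds the bytes of a DNS query and sets the "recursion desired" (RD) bit in the flags word; a server that offers recursion answers with the "recursion available" (RA) bit set. The helper names encode_name and build_query and the fixed query ID are assumptions for the example, and nothing is sent over the network.

```python
import struct

RD_FLAG = 0x0100  # "recursion desired" bit in the 16-bit header flags word


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero octet."""
    out = b""
    for label in name.rstrip(".").split("."):
        if not label:
            continue
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = 1, recursive: bool = True,
                query_id: int = 0x1234) -> bytes:
    flags = RD_FLAG if recursive else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Question: QNAME, QTYPE (1 = A), QCLASS (1 = IN)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)
    return header + question


query = build_query("example.com")
print(len(query), hex(struct.unpack("!H", query[2:4])[0]))  # 29 0x100
```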

Resolving usually entails iterating through several name servers to find the needed
information. However, some resolvers function simplistically and can communicate only with a
single name server. These simple resolvers (called "stub resolvers") rely on a recursive name server
to perform the work of finding information for them.

Reverse lookup:

A reverse lookup is a query of the DNS for domain names when the IP address is known.
Multiple domain names may be associated with an IP address. The DNS stores IP addresses in the
form of specially formatted domain names in pointer (PTR) records within the
infrastructure top-level domain arpa. For IPv4, the domain is in-addr.arpa. For IPv6, the reverse
lookup domain is ip6.arpa. The IP address is represented as a name in reverse-ordered octet
representation for IPv4, and reverse-ordered nibble representation for IPv6.

When performing a reverse lookup, the DNS client converts the address into these formats,
and then queries the name for a PTR record following the delegation chain as for any DNS query.
For example, the IPv4 address 208.80.152.2 is represented as the DNS name
2.152.80.208.in-addr.arpa. The DNS resolver begins by querying the root servers, which point to
ARIN's servers for the 208.in-addr.arpa zone. From there the Wikimedia servers are assigned for
152.80.208.in-addr.arpa, and the PTR lookup completes by querying the Wikimedia name server for
2.152.80.208.in-addr.arpa, which results in an authoritative response.
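
The address-to-name conversion can be sketched in Python. The first function spells out the reverse-ordered octet rule for IPv4 by hand; the second uses the reverse_pointer attribute of the standard ipaddress module, which also handles the nibble form for IPv6. The function names are chosen for this example only, and no DNS query is performed.

```python
import ipaddress


def reverse_name_ipv4(address: str) -> str:
    """Build the in-addr.arpa name by reversing the four octets."""
    octets = address.split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


def reverse_name(address: str) -> str:
    """Same idea for IPv4 or IPv6, using the standard library."""
    return ipaddress.ip_address(address).reverse_pointer


print(reverse_name_ipv4("208.80.152.2"))  # 2.152.80.208.in-addr.arpa
print(reverse_name("208.80.152.2"))       # 2.152.80.208.in-addr.arpa
print(reverse_name("2001:db8::1"))        # one label per nibble, ending in ip6.arpa
```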

Protocol details:

DNS primarily uses User Datagram Protocol (UDP) on port number 53 to serve requests.
DNS queries consist of a single UDP request from the client followed by a single UDP reply from
the server. The Transmission Control Protocol (TCP) is used when the response data size exceeds
512 bytes, or for tasks such as zone transfers. Some operating systems, such as HP-UX, are known
to have resolver implementations that use TCP for all queries, even when UDP would suffice.

DNS resource records:

A Resource Record (RR) is the basic data element in the domain name system. Each record
has a type (A, MX, etc.), an expiration time limit, a class, and some type-specific data. Resource
records of the same type define a resource record set. The order of resource records in a set, returned
by a resolver to an application, is undefined, but often servers implement round-robin ordering to
achieve load balancing. DNSSEC, however, works on complete resource record sets in a canonical
order.

RR (Resource record) fields


Field     Description                                          Length (octets)
NAME      Name of the node to which this record pertains       (variable)
TYPE      Type of RR in numeric form (e.g. 15 for MX RRs)      2
CLASS     Class code                                           2
TTL       Count of seconds that the RR stays valid (the        4
          maximum is 2^31-1, which is about 68 years)
RDLENGTH  Length of RDATA field                                2
RDATA     Additional RR-specific data                          (variable)

NAME is the fully qualified domain name of the node in the tree. On the wire, the name may be
shortened using label compression where ends of domain names mentioned earlier in the packet can
be substituted for the end of the current domain name.

TYPE is the record type. It indicates the format of the data and it gives a hint of its intended use. For
example, the A record is used to translate from a domain name to an IPv4 address, the NS record
lists which name servers can answer lookups on a DNS zone, and the MX record specifies the mail
server used to handle mail for a domain specified in an e-mail address.

RDATA is data of type-specific relevance, such as the IP address for address records, or the priority
and hostname for MX records. Well known record types may use label compression in the RDATA
field, but "unknown" record types must not.

The CLASS of a record is set to IN (for Internet) for common DNS records involving Internet
hostnames, servers, or IP addresses. In addition, the classes Chaos (CH) and Hesiod (HS) exist. Each
class is an independent name space with potentially different delegations of DNS zones.

In addition to resource records defined in a zone file, the domain name system also defines several
request types that are used only in communication with other DNS nodes (on the wire), such as when
performing zone transfers (AXFR/IXFR) or for EDNS (OPT).
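
To make the field layout concrete, here is a minimal Python sketch that packs an A record into the wire format described by the RR fields table and reads the fixed-length fields back. It assumes no label compression, and the helper names (encode_name, pack_a_record, unpack_fixed_fields) as well as the sample name and address are made up for illustration.

```python
import socket
import struct

TYPE_A = 1    # A record
CLASS_IN = 1  # Internet class


def encode_name(name: str) -> bytes:
    """Uncompressed NAME: length-prefixed labels ending in a zero octet."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def pack_a_record(name: str, ttl: int, ipv4: str) -> bytes:
    rdata = socket.inet_aton(ipv4)  # 4 octets of RDATA for an A record
    # TYPE (2 octets), CLASS (2), TTL (4), RDLENGTH (2)
    fixed = struct.pack("!HHIH", TYPE_A, CLASS_IN, ttl, len(rdata))
    return encode_name(name) + fixed + rdata


def unpack_fixed_fields(record: bytes, name_length: int):
    rtype, rclass, ttl, rdlength = struct.unpack_from("!HHIH", record, name_length)
    start = name_length + 10        # the fixed fields occupy 10 octets
    return rtype, rclass, ttl, record[start:start + rdlength]


record = pack_a_record("example.com", 3600, "192.0.2.1")
name_len = len(encode_name("example.com"))
rtype, rclass, ttl, rdata = unpack_fixed_fields(record, name_len)
print(len(record), rtype, rclass, ttl, socket.inet_ntoa(rdata))
# 27 1 1 3600 192.0.2.1
```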

DDNS
Dynamic DNS providers offer a software client program that automates the discovery and
registration of the client's public IP addresses. The client program is executed on a computer or device in
the private network. It connects to the service provider's systems and causes those systems to link the
discovered public IP address of the home network with a hostname in the domain name system.
Depending on the provider, the hostname is registered within a domain owned by the provider or the
customer's own domain name. These services can function by a number of mechanisms. Often they
use an HTTP service request since even restrictive environments usually allow HTTP service. This
group of services is commonly also referred to by the term Dynamic DNS, although it is not the
standards-based DNS Update method. However, the latter might be involved in the provider's
systems.

Most home networking routers today have this feature already built into their firmware. One
of the early routers to support Dynamic DNS was the UMAX UGate-3000 in 1999, which supported
the TZO.COM dynamic DNS service.

An example is residential users who wish to access their personal computer at home while
traveling. If the home computer has a fixed static IP address, the user can connect directly using this
address, but many provider networks force frequent changes of the IP address configured in their
customers' equipment. With dynamic DNS, the home computer can automatically associate its
current IP address with a domain name. As a result, the remote user can resolve the host name used
for the dynamic DNS service entry to the current address of the home computer with a DNS query.
If a remote control program such as a VNC server is kept running on a host in the private
network, the user can connect to the home network with a VNC client program.

In Microsoft Windows networks, dynamic DNS is an integral part of Active Directory,
because domain controllers register their network service types in DNS so that other computers in
the Domain (or Forest) can access them.

Increasing efforts to secure Internet communications today involve encryption of all dynamic
updates via the public Internet, as these public dynamic DNS services have been abused increasingly
to design security breaches. Standards-based methods within the DNSSEC protocol suite, such as
TSIG, have been developed to secure DNS updates, but are not widely in use. Microsoft developed
alternative technology (GSS-TSIG) based on Kerberos authentication.

DHCP
The Dynamic Host Configuration Protocol (DHCP) is an automatic configuration protocol
used on IP networks. Computers that are connected to IP networks must be configured before they
can communicate with other computers on the network. DHCP allows a computer to be configured
automatically, eliminating the need for intervention by a network administrator. It also provides a
central database for keeping track of computers that have been connected to the network. This
prevents two computers from accidentally being configured with the same IP address.

In the absence of DHCP, hosts may be manually configured with an IP address. Alternatively,
IPv6 hosts may use stateless address autoconfiguration to generate an IP address. IPv4 hosts may
use link-local addressing to achieve limited local connectivity.

In addition to IP addresses, DHCP also provides other configuration information, particularly
the IP addresses of local caching DNS resolvers. Hosts that do not use DHCP for address
configuration may still use it to obtain other configuration information.

There are two versions of DHCP, one for IPv4 and one for IPv6. While both versions bear
the same name and serve much the same purpose, the details of the protocol for IPv4 and IPv6 are
sufficiently different that they can be considered separate protocols.

BOOTP Interaction:

DHCP Interaction:

DHCP Header :

• OpCode: 1 (Request), 2 (Reply). Note: the DHCP message type is sent in an option.
• Hardware type: 1 (for Ethernet)
• Hardware address length: 6 (for Ethernet)
• Hop count: set to 0 by the client
• Transaction ID: integer (used to match a reply to its request)
• Seconds: number of seconds since the client started to boot
• Client IP address, your IP address, server IP address, gateway IP address, client
hardware address, server host name, boot file name: the client fills in the information that it
has and leaves the rest blank

DHCP Message Type

Value  Message Type
1      DHCPDISCOVER
2      DHCPOFFER
3      DHCPREQUEST
4      DHCPDECLINE
5      DHCPACK
6      DHCPNAK
7      DHCPRELEASE
8      DHCPINFORM
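
The header fields and message types listed above can be tied together with a short Python sketch that assembles the bytes of a DHCPDISCOVER message. Two details come from RFC 2131 rather than from these notes: the 2-octet flags field that follows the seconds field, and the 4-octet magic cookie (99, 130, 83, 99) that precedes the options. The names DhcpMessageType and build_discover, the MAC address and the transaction ID are assumptions for the example; nothing is sent on the network.

```python
import struct
from enum import IntEnum


class DhcpMessageType(IntEnum):
    DISCOVER = 1
    OFFER = 2
    REQUEST = 3
    DECLINE = 4
    ACK = 5
    NAK = 6
    RELEASE = 7
    INFORM = 8


MAGIC_COOKIE = bytes([99, 130, 83, 99])  # marks the start of the options (RFC 2131)


def build_discover(mac: bytes, xid: int, secs: int = 0) -> bytes:
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,      # OpCode: 1 = request
        1,      # hardware type: 1 = Ethernet
        6,      # hardware address length
        0,      # hop count, set to 0 by the client
        xid,    # transaction ID
        secs,   # seconds since the client started to boot
        0,      # flags (field defined in RFC 2131, not listed in the notes)
        b"",    # client IP address (unknown, left as zeros)
        b"",    # your IP address
        b"",    # server IP address
        b"",    # gateway IP address
        mac,    # client hardware address, zero-padded to 16 octets
        b"",    # server host name
        b"",    # boot file name
    )
    # Option 53 (message type), length 1, value DISCOVER; option 255 ends the list.
    options = MAGIC_COOKIE + bytes([53, 1, DhcpMessageType.DISCOVER, 255])
    return header + options


packet = build_discover(bytes.fromhex("001122334455"), xid=0xDEADBEEF)
print(len(packet))  # 244
```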

DHCP Transition Diagram:

Reliability

The DHCP protocol provides reliability in several ways: periodic renewal, rebinding, and
failover. DHCP clients are allocated leases that last for some period of time. Clients begin to attempt
to renew their leases once half the lease interval has expired. They do this by sending a unicast
DHCPREQUEST message to the DHCP server that granted the original lease. If that server is down
or unreachable, it will fail to respond to the DHCPREQUEST. However, the DHCPREQUEST will
be repeated by the client from time to time, so when the DHCP server comes back up or becomes
reachable again, the DHCP client will succeed in contacting it, and renew its lease.

If the DHCP server is unreachable for an extended period of time, the DHCP client will
attempt to rebind, by broadcasting its DHCPREQUEST rather than unicasting it. Because it is
broadcast, the DHCPREQUEST message will reach all available DHCP servers. If some other
DHCP server is able to renew the lease, it will do so at this time.

In order for rebinding to work, when the client successfully contacts a backup DHCP server,
that server must have accurate information about the client's binding. Maintaining accurate binding
information between two servers is a complicated problem; if both servers are able to update the
same lease database, there must be a mechanism to avoid conflicts between updates on the
independent servers. A standard for implementing fault-tolerant DHCP servers was developed at the
Internet Engineering Task Force.

If rebinding fails, the lease will eventually expire. When the lease expires, the client must
stop using the IP address granted to it in its lease. At that time, it will restart the DHCP process from
the beginning by broadcasting a DHCPDISCOVER message. Since its lease has expired, it will
accept any IP address offered to it. Once it has a new IP address, presumably from a different DHCP
server, it will once again be able to use the network. However, since its IP address has changed, any
ongoing connections will be broken.
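
The renewal, rebinding and expiry sequence can be summarised as a function of how much of the lease has elapsed. The sketch below assumes renewal starts at half the lease, as stated above, and that rebinding starts at 87.5% of the lease, which is the default in RFC 2131 and is not stated in these notes. The function name lease_state and the "EXPIRED" label are made up for the example.

```python
def lease_state(elapsed: float, lease: float,
                t1_fraction: float = 0.5, t2_fraction: float = 0.875) -> str:
    """Classify a lease by elapsed time (both arguments in seconds)."""
    if elapsed >= lease:
        # Stop using the address and start over with a broadcast DHCPDISCOVER.
        return "EXPIRED"
    if elapsed >= lease * t2_fraction:
        # Broadcast DHCPREQUEST so that any available server can answer.
        return "REBINDING"
    if elapsed >= lease * t1_fraction:
        # Unicast DHCPREQUEST to the server that granted the lease.
        return "RENEWING"
    return "BOUND"


for seconds in (0, 1800, 3150, 3600):
    print(seconds, lease_state(seconds, lease=3600))
# 0 BOUND
# 1800 RENEWING
# 3150 REBINDING
# 3600 EXPIRED
```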

Security

The base DHCP protocol does not include any mechanism for authentication. Because of
this, it is vulnerable to a variety of attacks. These attacks fall into three main categories:

• Unauthorized DHCP servers providing false information to clients.
• Unauthorized clients gaining access to resources.
• Resource exhaustion attacks from malicious DHCP clients.

Because the client has no way to validate the identity of a DHCP server, unauthorized DHCP
servers can be operated on networks, providing incorrect information to DHCP clients. This can
serve either as a denial-of-service attack, preventing the client from gaining access to network
connectivity, or as a man-in-the-middle attack. Because the DHCP server provides the DHCP client
with server IP addresses, such as the IP address of one or more DNS servers, an attacker can
convince a DHCP client to do its DNS lookups through its own DNS server, and can therefore
provide its own answers to DNS queries from the client. This in turn allows the attacker to redirect
network traffic through itself, allowing it to eavesdrop on connections between the client and
network servers it contacts, or to simply replace those network servers with its own.

Because the DHCP server has no secure mechanism for authenticating the client, clients can
gain unauthorized access to IP addresses by presenting credentials, such as client identifiers, that
belong to other DHCP clients. This also allows DHCP clients to exhaust the DHCP server's store of
IP addresses—by presenting new credentials each time it asks for an address, the client can consume
all the available IP addresses on a particular network link, preventing other DHCP clients from
getting service.

DHCP does provide some mechanisms for mitigating these problems. The Relay Agent
Information Option protocol extension (RFC 3046) allows network operators to attach tags to DHCP
messages as these messages arrive on the network operator's trusted network. This tag is then used as
an authorization token to control the client's access to network resources. Because the client has no
access to the network upstream of the relay agent, the lack of authentication does not prevent the
DHCP server operator from relying on the authorization token.

Another extension, Authentication for DHCP Messages (RFC 3118), provides a mechanism
for authenticating DHCP messages.

FTP
File Transfer Protocol (FTP) is a standard network protocol used to copy a file from one host
to another over a TCP-based network, such as the Internet. FTP is built on a client-server
architecture and utilizes separate control and data connections between the client and server. FTP
users may authenticate themselves using a clear-text sign-in protocol but can connect anonymously
if the server is configured to allow it.

The first FTP client applications were interactive command-line tools, implementing
standard commands and syntax. Graphical user interface clients have since been developed for many
of the popular desktop operating systems in use today.

2 Types of control connections:

Creating data connection:

Access commands

File Management Commands:

Data Formatting commands:

Port Defining Commands:

File Transfer Commands:

What is transferred in control information?

What is transferred in data connection?

Example of using FTP to retrieve a list of items in a directory:

Anonymous FTP:

A host that provides an FTP service may additionally provide anonymous FTP access. Users
typically log into the service with an 'anonymous' account when prompted for user name. Although
users are commonly asked to send their email address in lieu of a password, no verification is
actually performed on the supplied data.

GUI:

% ftp challenger.atc.fhda.edu
Connected to challenger.atc.fhda.edu
220 Server ready
Name:forouzan
Password:xxxxxxx
ftp>ls /usr/user/report
200 OK
150 Opening ASCII mode
...........
226 transfer complete
ftp>close
221 Goodbye
ftp>quit
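
Each server line in the session above begins with a three-digit reply code sent over the control connection. As a small sketch, the Python below splits such lines into code and text and names the reply class from the first digit, following the classes defined in RFC 959. It handles only single-line replies, and the function names are assumptions for the example.

```python
REPLY_CLASSES = {
    1: "positive preliminary",
    2: "positive completion",
    3: "positive intermediate",
    4: "transient negative",
    5: "permanent negative",
}


def parse_reply(line: str) -> tuple[int, str]:
    """Split '220 Server ready' into (220, 'Server ready')."""
    code, _, text = line.partition(" ")
    return int(code), text


def classify(line: str) -> tuple[int, str, str]:
    code, text = parse_reply(line)
    return code, REPLY_CLASSES[code // 100], text


session = ["220 Server ready", "200 OK", "150 Opening ASCII mode",
           "226 transfer complete", "221 Goodbye"]
for reply in session:
    print(classify(reply))
```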

IPv6:

Abbreviated address

Abbreviated address with consecutive zeros

CIDR address

Address structure

Provider-based address

Loopback address

IPv6 datagram

Format of an IPv6 datagram

Extension header format

Network Layer comparison:

Categories of ICMPv6 messages:

Classless Inter-Domain Routing
Classless Inter-Domain Routing (CIDR) is a method for allocating IP addresses and routing
Internet Protocol packets. The Internet Engineering Task Force introduced CIDR in 1993 to replace
the previous addressing architecture of classful network design in the Internet. Their goal was to
slow the growth of routing tables on routers across the Internet, and to help slow the rapid
exhaustion of IPv4 addresses.

IP addresses are described as consisting of two groups of bits in the address: the most
significant part is the network address which identifies a whole network or subnet and the least
significant portion is the host identifier, which specifies a particular host interface on that network.
This division is used as the basis of traffic routing between IP networks and for address allocation
policies. Classful network design for IPv4 sized the network address as one or more 8-bit groups,
resulting in the blocks of Class A, B, or C addresses. Classless Inter-Domain Routing allocates
address space to Internet service providers and end users on any address bit boundary, instead of on
8-bit segments. In IPv6, however, the interface identifier has a fixed size of 64 bits by convention,
and smaller subnets are never allocated to end users.

CIDR notation is a syntax for specifying IP addresses and their associated routing prefix. It
appends to the address a slash character and the decimal number of leading bits of the routing prefix,
e.g., 192.168.0.0/16 for IPv4, and 2001:db8::/32 for IPv6.

CIDR Blocks:

CIDR is principally a bitwise, prefix-based standard for the interpretation of IP addresses. It
facilitates routing by allowing blocks of addresses to be grouped into single routing table entries.
These groups, commonly called CIDR blocks, share an initial sequence of bits in the binary
representation of their IP addresses. IPv4 CIDR blocks are identified using a syntax similar to that of
IPv4 addresses: a four-part dotted-decimal address, followed by a slash, then a number from 0 to 32:
A.B.C.D/N. The dotted decimal portion is interpreted, like an IPv4 address, as a 32-bit binary
number that has been broken into four octets. The number following the slash is the prefix length,
the number of shared initial bits, counting from the most-significant bit of the address. When
emphasizing only the size of a network, the address portion of the notation is usually omitted. Thus,
a /20 is a CIDR block with an unspecified 20-bit prefix.
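
Grouping adjacent blocks into a single routing table entry can be tried out with the standard ipaddress module. In this sketch four contiguous /24 blocks share their first 22 bits and therefore collapse into one /22, while two blocks that are not adjacent stay separate. The function name aggregate and the example prefixes are chosen for illustration.

```python
import ipaddress


def aggregate(prefixes: list[str]) -> list[str]:
    """Merge CIDR blocks into the smallest equivalent list of blocks."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


print(aggregate(["192.168.0.0/24", "192.168.1.0/24",
                 "192.168.2.0/24", "192.168.3.0/24"]))
# ['192.168.0.0/22']
print(aggregate(["10.0.0.0/24", "10.0.2.0/24"]))
# ['10.0.0.0/24', '10.0.2.0/24']
```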

An IP address is part of a CIDR block, and is said to match the CIDR prefix, if the initial N
bits of the address and the CIDR prefix are the same. Thus, understanding CIDR requires that IP
addresses be visualized in binary. Since an IPv4 address has 32 bits, an N-bit CIDR
prefix leaves 32-N bits unmatched, meaning that 2^(32-N) IPv4 addresses match a given N-bit CIDR
prefix. Shorter CIDR prefixes match more addresses, while longer prefixes match fewer. An address
can match multiple CIDR prefixes of different lengths.
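
The matching rule can be written directly as a comparison of the leading N bits. The Python sketch below does this with integer shifts for IPv4 and also computes the number of addresses covered by a prefix of a given length. The function names matches_prefix and block_size are assumptions for the example.

```python
import ipaddress


def matches_prefix(address: str, prefix: str) -> bool:
    """True if the first N bits of `address` equal those of the A.B.C.D/N prefix."""
    network, length = prefix.split("/")
    shift = 32 - int(length)          # number of unmatched low-order bits
    addr_bits = int(ipaddress.IPv4Address(address))
    net_bits = int(ipaddress.IPv4Address(network))
    return (addr_bits >> shift) == (net_bits >> shift)


def block_size(prefix_length: int) -> int:
    """Number of IPv4 addresses that match a prefix of this length."""
    return 2 ** (32 - prefix_length)


print(matches_prefix("192.168.5.7", "192.168.0.0/16"))  # True
print(matches_prefix("192.168.5.7", "192.168.5.0/24"))  # True
print(matches_prefix("192.169.0.1", "192.168.0.0/16"))  # False
print(block_size(20))                                   # 4096
```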

CIDR is also used for IPv6 addresses, and the syntax and semantics are identical. A prefix length
can range from 0 to 128 because of the larger number of bits in the address. By convention, however, a
subnet on broadcast MAC layer networks always has 64-bit host identifiers. Larger prefixes are
rarely used even on point-to-point links.

Hierarchical routing
Hierarchical routing is a method of routing in networks that is based on hierarchical addressing.

Hierarchical routing is the procedure of arranging routers in a hierarchical manner. A good
example would be to consider a corporate intranet. Most corporate intranets consist of a high speed
backbone network. Connected to this backbone are routers which are in turn connected to a
particular workgroup. Each workgroup occupies its own LAN. This is a good
arrangement because, even though there might be dozens of different workgroups, the span of this
network is only 2. The span of a network is the maximum hop count to get from one
host to any other host on the network. Even if the workgroups divided their LAN network into
smaller partitions, the span could only increase to 4 in this particular example.

Considering alternative solutions with every router connected to every other router, or if
every router was connected to 2 routers, shows the convenience of hierarchical routing. It decreases
the complexity of network topology, increases routing efficiency, and causes much less congestion
because of fewer routing advertisements. With hierarchical routing, only core routers connected to
the backbone are aware of all routes. Routers that lie within a LAN only know about routes in the
LAN. Unrecognized destinations are passed to the default route.
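
A router inside a LAN therefore needs only a very small table: its local routes plus a default route. The following Python sketch picks the next hop by longest matching prefix, so that any destination the router does not recognise falls through to the default route 0.0.0.0/0. The addresses, the hop labels and the function name next_hop are hypothetical.

```python
import ipaddress


def next_hop(routes: dict[str, str], destination: str) -> str:
    """Return the hop of the most specific route that contains `destination`."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, hop in routes.items():
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    if best is None:
        raise LookupError("no route to " + destination)
    return best[1]


lan_router = {
    "10.1.2.0/24": "deliver on local LAN",
    "0.0.0.0/0": "forward to core router",  # default route
}

print(next_hop(lan_router, "10.1.2.7"))   # deliver on local LAN
print(next_hop(lan_router, "10.9.9.9"))   # forward to core router
```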

Voice over IP
Voice over Internet Protocol (Voice over IP, VoIP) is one of a family of internet
technologies, communication protocols, and transmission technologies for delivery of voice
communications and multimedia sessions over Internet Protocol (IP) networks, such as the Internet.
Other terms frequently encountered and often used synonymously with VoIP are IP telephony,
Internet telephony, voice over broadband (VoBB), broadband telephony, and broadband phone.

Internet telephony refers to communications services (voice, fax, SMS, and/or voice-messaging
applications) that are transported via the Internet, rather than the public switched
telephone network (PSTN). The steps involved in originating a VoIP telephone call are signaling and
media channel setup, digitization of the analog voice signal, encoding, packetization, and
transmission as Internet Protocol (IP) packets over a packet-switched network. On the receiving side,
similar steps (usually in the reverse order) such as reception of the IP packets, decoding of the
packets and digital-to-analog conversion reproduce the original voice stream.
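
The packetization step can be sketched in Python as follows. The example assumes an already encoded narrowband stream with one octet per sample at 8000 samples per second (as with G.711), cut into 20 ms frames, and it prefixes each frame with the 12-octet fixed RTP header. Sequence numbers and timestamps start at zero here to keep the output predictable, whereas real senders pick random starting values. The function name packetize and the SSRC value are assumptions.

```python
import struct

SAMPLE_RATE = 8000                                   # samples per second
FRAME_MS = 20                                        # audio carried per packet
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000   # 160 samples


def packetize(encoded: bytes, ssrc: int, payload_type: int = 0) -> list[bytes]:
    """Split encoded audio (1 octet per sample) into RTP packets."""
    packets = []
    for seq, start in enumerate(range(0, len(encoded), SAMPLES_PER_FRAME)):
        payload = encoded[start:start + SAMPLES_PER_FRAME]
        header = struct.pack(
            "!BBHII",
            0x80,                                      # version 2, no padding/extension/CSRC
            payload_type,                              # 0 = PCMU (G.711 mu-law)
            seq & 0xFFFF,                              # sequence number
            (seq * SAMPLES_PER_FRAME) & 0xFFFFFFFF,    # timestamp in sample units
            ssrc,                                      # stream identifier
        )
        packets.append(header + payload)
    return packets


packets = packetize(bytes(400), ssrc=0x1234ABCD)
print([len(p) for p in packets])  # [172, 172, 92]
```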

VoIP systems employ session control protocols to control the set-up and tear-down of calls as
well as audio codecs which encode speech allowing transmission over an IP network as digital audio
via an audio stream. The codec used varies between different implementations of VoIP (and often
a range of codecs is used); some implementations rely on narrowband and compressed speech,
while others support high fidelity stereo codecs.

There are three types of VoIP tools that are commonly used: IP phones, software VoIP, and
mobile and integrated VoIP. IP phones are the most institutionally established but still the least
obvious of the VoIP tools. Of all the software VoIP tools that exist, Skype is probably the most
easily identifiable. The use of software VoIP has increased during the global recession as many
people looking for ways to cut costs have turned to these tools for free or inexpensive calling or
video conferencing applications. Software VoIP can be further broken down into three
subcategories: web calling, voice and video instant messaging, and web conferencing. Mobile and
integrated VoIP is another example of the adaptability of VoIP. VoIP is available on many
smartphones and Internet devices, so even users of portable devices that are not phones can
make calls or send SMS text messages over 3G or Wi-Fi.

Protocols:

Voice over IP has been implemented in various ways using both proprietary and open
protocols and standards. Examples of technologies used to implement Voice over IP include:

• H.323
• IP Multimedia Subsystem (IMS)
• Media Gateway Control Protocol (MGCP)
• Session Initiation Protocol (SIP)
• Real-time Transport Protocol (RTP)
• Session Description Protocol (SDP)

The H.323 protocol was one of the first VoIP protocols that found widespread
implementation for long-distance traffic, as well as local area network services. However, since the
development of newer, less complex protocols, such as MGCP and SIP, H.323 deployments are
increasingly limited to carrying existing long-haul network traffic. In particular, the Session
Initiation Protocol (SIP) has gained widespread VoIP market penetration.

A notable proprietary implementation is the Skype protocol, which is in part based on the
principles of Peer-to-Peer (P2P) networking.

Adoption

Consumer market

A major development that started in 2004 was the introduction of mass-market VoIP services
that utilize existing broadband Internet access, by which subscribers place and receive telephone
calls in much the same manner as they would via the public switched telephone network (PSTN).
Full-service VoIP phone companies provide inbound and outbound service with Direct Inbound
Dialing. Many offer unlimited domestic calling for a flat monthly subscription fee. This sometimes
includes international calls to certain countries. Phone calls between subscribers of the same
provider are usually free when flat-fee service is not available.

A VoIP phone is necessary to connect to a VoIP service provider. This can be implemented
in several ways:

• Dedicated VoIP phones connect directly to the IP network using technologies such as wired
Ethernet or wireless Wi-Fi. They are typically designed in the style of traditional digital
business telephones.
• An analog telephone adapter is a device that connects to the network and implements the
electronics and firmware to operate a conventional analog telephone attached through a
modular phone jack. Some residential Internet gateways and cable modems have this
function built in.
• A softphone is application software installed on a networked computer that is equipped with
a microphone and speaker, or headset. The application typically presents a dial pad and
display field to the user to operate the application by mouse clicks or keyboard input.

PSTN and mobile network providers

It is becoming increasingly common for telecommunications providers to use VoIP
telephony over dedicated and public IP networks to connect switching stations and to interconnect
with other telephony network providers; this is often referred to as "IP backhaul".

Smartphones and Wi-Fi enabled mobile phones may have SIP clients built into the firmware
or available as an application download. Such clients operate independently of the mobile
telephone network and use either the cellular data connection or Wi-Fi to make and receive phone calls.

Corporate use

Because of the bandwidth efficiency and low costs that VoIP technology can provide,
businesses are gradually beginning to migrate from traditional copper-wire telephone systems to
VoIP systems to reduce their monthly phone costs.

VoIP solutions aimed at businesses have evolved into "unified communications" services that
treat all communications—phone calls, faxes, voice mail, e-mail, Web conferences and more—as
discrete units that can all be delivered via any means and to any handset, including cellphones. Two
kinds of competitors are competing in this space: one set is focused on VoIP for medium to large
enterprises, while another is targeting the small-to-medium business (SMB) market.

VoIP allows both voice and data communications to be run over a single network, which can
significantly reduce infrastructure costs.

The prices of extensions on VoIP are lower than for PBX and key systems. VoIP switches
may run on commodity hardware, such as PCs or Linux systems. Rather than closed architectures,
these devices rely on standard interfaces.

VoIP devices have simple, intuitive user interfaces, so users can often make simple system
configuration changes. Dual-mode cellphones enable users to continue their conversations as they
move between an outside cellular service and an internal Wi-Fi network, so that it is no longer
necessary to carry both a desktop phone and a cellphone. Maintenance becomes simpler as there are
fewer devices to oversee.

Skype, which originally marketed itself as a service among friends, has begun to cater to
businesses, providing free-of-charge connections between any users on the Skype network and
connecting to and from ordinary PSTN telephones for a charge.

In the United States the Social Security Administration (SSA) is converting its field offices
of 63,000 workers from traditional phone installations to a VoIP infrastructure carried over its
existing data network.

Benefits

Operational cost

VoIP can be a benefit for reducing communication and infrastructure costs. Examples include:

 Routing phone calls over existing data networks to avoid the need for separate voice and data
networks.
 Conference calling, IVR, call forwarding, automatic redial, and caller ID features that
traditional telecommunication companies (telcos) normally charge extra for, are available
free of charge from open source VoIP implementations.

Flexibility

VoIP can facilitate tasks and provide services that may be more difficult to implement using the
PSTN. Examples include:

 The ability to transmit more than one telephone call over a single broadband connection.
 Secure calls using standardized protocols (such as Secure Real-time Transport Protocol).
Most of the prerequisites that make securing a traditional phone line difficult, such as
digitizing and digital transmission, are already in place with VoIP. It is only
necessary to encrypt and authenticate the existing data stream.
 Location independence. Only a sufficiently fast and stable Internet connection is needed to
get a connection from anywhere to a VoIP provider.
 Integration with other services available over the Internet, including video conversation,
message or data file exchange during the conversation, audio conferencing, managing
address books, and passing information about whether other people are available to interested
parties.
 Unified Communications, the integration of VoIP with other business systems including E-
mail, Customer Relationship Management (CRM), and Web systems.
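
As a rough illustration of the first point in the list above (several calls over one broadband connection), the following sketch estimates the bandwidth of a single call and how many calls fit on a link. The codec and packet figures are assumptions, not taken from these notes: G.711 audio at 64 kbit/s, 20 ms of audio per packet, and 40 bytes of IPv4, UDP and RTP headers per packet. Link-layer overhead is ignored.

```python
# Assumed figures (not from the notes): G.711 at 64 kbit/s, 20 ms packets,
# and 20 + 8 + 12 bytes of IPv4 + UDP + RTP headers per packet.
CODEC_BITRATE_BPS = 64_000
PACKET_INTERVAL_MS = 20
HEADER_BYTES = 20 + 8 + 12


def per_call_bandwidth_bps(codec_bps=CODEC_BITRATE_BPS,
                           interval_ms=PACKET_INTERVAL_MS,
                           header_bytes=HEADER_BYTES):
    """One-way bandwidth of a single call, in bits per second."""
    payload_bytes = codec_bps * interval_ms // 8000   # audio bytes per packet
    packets_per_second = 1000 // interval_ms
    return (payload_bytes + header_bytes) * 8 * packets_per_second


def max_calls(link_bps):
    """How many simultaneous calls fit in one direction of the link."""
    return link_bps // per_call_bandwidth_bps()


if __name__ == "__main__":
    print(per_call_bandwidth_bps())   # bits per second for one call
    print(max_calls(1_000_000))       # calls on a 1 Mbit/s uplink
```

Under these assumptions the packet headers add a quarter on top of the raw codec rate, which is why lower-bitrate codecs or longer packet intervals are often used on slow links.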

Virtual Private Network
A virtual private network (VPN) is a computer network that uses a public
telecommunication infrastructure such as the Internet to provide remote offices or individual users
secure access to their organization's network. It aims to avoid an expensive system of owned or
leased lines that can be used by only one organization.

It encapsulates data transfers using a secure cryptographic method between two or more
networked devices which are not on the same private network so as to keep the transferred data
private from other devices on one or more intervening local or wide area networks. There are many
different classifications, implementations, and uses for VPNs.

The Internet:

Leased Line :

A leased line is a service contract between a provider and a customer, whereby the provider
agrees to deliver a symmetric telecommunications line connecting two or more locations in exchange
for a monthly rent (hence the term lease). It is sometimes known as a 'Private Circuit' or 'Data Line'
in the UK or as CDN (Circuito Diretto Numerico) in Italy. Unlike traditional PSTN lines it does not
have a telephone number, each side of the line being permanently connected to the other. Leased
lines can be used for telephone, data or Internet services. Some are ringdown services, and some
connect two PBXes.

Typically, leased lines are used by businesses to connect geographically distant offices.
Unlike dial-up connections, a leased line is always active. The fee for the connection is a fixed
monthly rate. The primary factors affecting the monthly fee are distance between end points and the
speed of the circuit. Because the connection doesn't carry anybody else's communications, the carrier
can assure a given level of quality.

For example, a T-1 channel can be leased, and provides a maximum transmission speed of
1.544 Mbps. The user can divide the connection into different lines for multiplexing data and voice
communication, or use the channel for one high speed data circuit.
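
The 1.544 Mbps figure can be reproduced with a little arithmetic. The sketch below assumes the standard T-1 framing, which is not spelled out in these notes: 24 channels of 64 kbit/s each plus 8 kbit/s of framing overhead. The function names are illustrative only.

```python
CHANNELS = 24              # assumed: 24 time slots per T-1 frame
CHANNEL_RATE_BPS = 64_000  # assumed: 8 bits x 8000 samples/s per channel
FRAMING_BPS = 8_000        # assumed: 1 framing bit x 8000 frames/s


def t1_line_rate_bps():
    """Total line rate of a T-1 circuit in bits per second."""
    return CHANNELS * CHANNEL_RATE_BPS + FRAMING_BPS


def split_channels(voice_channels):
    """Divide the circuit into voice and data capacity (bits per second)."""
    if not 0 <= voice_channels <= CHANNELS:
        raise ValueError("voice_channels must be between 0 and 24")
    data_channels = CHANNELS - voice_channels
    return voice_channels * CHANNEL_RATE_BPS, data_channels * CHANNEL_RATE_BPS


if __name__ == "__main__":
    print(t1_line_rate_bps())   # line rate in bit/s
    print(split_channels(8))    # (voice capacity, data capacity) in bit/s
```

Reserving 8 channels for voice, for instance, leaves 16 channels for a single data circuit.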

Three main VPN protocols:

1. PPTP:

The Point-to-Point Tunneling Protocol (PPTP) is a method for implementing virtual private
networks. PPTP uses a control channel over TCP and a GRE tunnel operating to encapsulate PPP
packets.

The PPTP specification does not describe encryption or authentication features and relies on
the PPP protocol being tunneled to implement security functionality. However the most common
PPTP implementation, shipping with the Microsoft Windows product families, implements various
levels of authentication and encryption natively as standard features of the Windows PPTP stack.
The intended use of this protocol is to provide similar levels of security and remote access as typical
VPN products.

2. L2TP

In computer networking, Layer 2 Tunneling Protocol (L2TP) is a tunneling protocol used to
support virtual private networks (VPNs). It does not provide any encryption or confidentiality by
itself; it relies on an encryption protocol that it passes within the tunnel to provide privacy.

Although L2TP acts like a Data Link Layer protocol in the OSI model, L2TP is in fact a
Session Layer protocol, and uses the registered UDP port 1701.

The entire L2TP packet, including payload and L2TP header, is sent within a UDP datagram.
It is common to carry Point-to-Point Protocol (PPP) sessions within an L2TP tunnel. L2TP does not
provide confidentiality or strong authentication by itself. IPsec is often used to secure L2TP packets
by providing confidentiality, authentication and integrity. The combination of these two protocols is
generally known as L2TP/IPsec (discussed below).

The two endpoints of an L2TP tunnel are called the LAC (L2TP Access Concentrator) and
the LNS (L2TP Network Server). The LAC is the initiator of the tunnel while the LNS is the server,
which waits for new tunnels. Once a tunnel is established, the network traffic between the peers is
bidirectional. To be useful for networking, higher-level protocols are then run through the L2TP
tunnel. To facilitate this, an L2TP session (or call) is established within the tunnel for each higher-
level protocol such as PPP. Either the LAC or LNS may initiate sessions. The traffic for each session
is isolated by L2TP, so it is possible to set up multiple virtual networks across a single tunnel. MTU
should be considered when implementing L2TP.

The packets exchanged within an L2TP tunnel are categorized as either control packets or
data packets. L2TP provides reliability features for the control packets, but no reliability for data
packets. Reliability, if desired, must be provided by the nested protocols running within each session
of the L2TP tunnel.
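
To make the control packet format more concrete, here is a minimal sketch that packs and parses the 12-byte header of an L2TPv2 control message. The bit positions follow the RFC 2661 layout as an assumption (the notes do not give the header format), and the function names are hypothetical. Nothing is sent on the network; in practice the resulting bytes would travel inside a UDP datagram to port 1701.

```python
import struct

L2TP_UDP_PORT = 1701        # registered port named in the text

# Flag bits in the first 16-bit word (assumed from the RFC 2661 layout).
FLAG_TYPE_CONTROL = 0x8000  # T: this is a control message
FLAG_LENGTH = 0x4000        # L: Length field is present
FLAG_SEQUENCE = 0x0800      # S: Ns and Nr fields are present
VERSION = 0x0002            # Ver: 2 for L2TPv2

HEADER_FORMAT = "!HHHHHH"   # flags, length, tunnel id, session id, Ns, Nr
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def build_control_packet(tunnel_id, session_id, ns, nr, body=b""):
    """Return a control packet: 12-byte header followed by the body."""
    flags = FLAG_TYPE_CONTROL | FLAG_LENGTH | FLAG_SEQUENCE | VERSION
    length = HEADER_SIZE + len(body)
    header = struct.pack(HEADER_FORMAT, flags, length,
                         tunnel_id, session_id, ns, nr)
    return header + body


def parse_control_header(packet):
    """Parse a header produced by build_control_packet (L and S bits set)."""
    flags, length, tunnel_id, session_id, ns, nr = struct.unpack(
        HEADER_FORMAT, packet[:HEADER_SIZE])
    return {
        "is_control": bool(flags & FLAG_TYPE_CONTROL),
        "version": flags & 0x000F,
        "length": length,
        "tunnel_id": tunnel_id,
        "session_id": session_id,
        "ns": ns,   # sequence number of this message
        "nr": nr,   # next sequence number expected from the peer
    }
```

The Ns and Nr counters are what give control packets their reliability; data packets may omit them, which matches the statement that data delivery is unreliable.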

3. IPsec:

Internet Protocol Security (IPsec) is a protocol suite for securing Internet Protocol (IP)
communications by authenticating and encrypting each IP packet of a communication session. IPsec
also includes protocols for establishing mutual authentication between agents at the beginning of the
session and negotiation of cryptographic keys to be used during the session.

IPsec is an end-to-end security scheme operating in the Internet Layer of the Internet
Protocol Suite. It can be used in protecting data flows between a pair of hosts (host-to-host), between
a pair of security gateways (network-to-network), or between a security gateway and a host
(network-to-host).

Some other Internet security systems in widespread use, such as Secure Sockets Layer (SSL),
Transport Layer Security (TLS) and Secure Shell (SSH), operate in the upper layers of the TCP/IP
model. Hence, IPsec protects any application traffic across an IP network. Applications do not need
to be specifically designed to use IPsec. The use of TLS/SSL, on the other hand, must be designed
into an application to protect the application protocols.

IPsec is a successor of the ISO standard Network Layer Security Protocol (NLSP). NLSP
was based on the SP3 protocol that was published by NIST, but designed by the Secure Data
Network System project of the National Security Agency (NSA).

IPsec is officially specified by the Internet Engineering Task Force (IETF) in a series of
Request for Comment documents addressing various components and extensions. It specifies the
spelling of the protocol name to be IPsec.

The IPsec suite is an open standard. IPsec uses the following protocols to perform various
functions:

Authentication Headers (AH) : AH is designed for authenticating the source host and to ensure the
integrity of the payload carried by the IP packet. The protocol calculates a message digest using a
hashing function and a symmetric key and inserts the digest in the authentication header.

The AH header contains the following fields:

 Next header - identifies the type of the next payload after the Authentication Header.
 Payload Length - specifies the length of AH in 32-bit words (4-byte units), minus "2".
 SPI - an arbitrary 32-bit value that, in combination with the destination IP address and
security protocol (AH), uniquely identifies the Security Association for this datagram.
 Sequence Number - contains a monotonically increasing counter value. The field is mandatory
and is always present, even if the receiver does not elect to enable the anti-replay service
for a specific SA.
 Authentication Data - a variable-length field containing an Integrity Check Value (ICV)
computed over the immutable fields of the IP header, the AH header itself (with this field set
to zero) and the payload.
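
The fields listed above can be sketched in code as follows. This is a simplified illustration, not a working IPsec implementation: it assumes HMAC-SHA1 truncated to 96 bits as the integrity algorithm, and it computes the ICV only over the AH header and the payload, leaving out the IP header fields that a real implementation would also cover. The function names are hypothetical.

```python
import hashlib
import hmac
import struct

ICV_LEN = 12       # assumed algorithm: HMAC-SHA1 truncated to 96 bits
FIXED_LEN = 12     # next header, payload length, reserved, SPI, sequence number


def _icv(key, fixed_fields, payload):
    # The Authentication Data field is treated as zero while computing the ICV.
    data = fixed_fields + bytes(ICV_LEN) + payload
    return hmac.new(key, data, hashlib.sha1).digest()[:ICV_LEN]


def build_ah(next_header, spi, sequence_number, key, payload):
    """Return the bytes of a simplified Authentication Header."""
    # Payload Length = AH length in 32-bit words, minus 2.
    payload_len = (FIXED_LEN + ICV_LEN) // 4 - 2
    fixed = struct.pack("!BBHII", next_header, payload_len, 0,
                        spi, sequence_number)
    return fixed + _icv(key, fixed, payload)


def verify_ah(ah, key, payload):
    """Recompute the ICV and compare it with the one carried in the header."""
    fixed, received_icv = ah[:FIXED_LEN], ah[FIXED_LEN:FIXED_LEN + ICV_LEN]
    return hmac.compare_digest(received_icv, _icv(key, fixed, payload))
```

Because the digest depends on a symmetric key shared by both ends, a receiver that recomputes the same ICV knows both that the payload is intact and that it came from a holder of the key.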

Encapsulating Security Payloads (ESP) : provide confidentiality, data origin authentication,
connectionless integrity, an anti-replay service (a form of partial sequence integrity), and limited
traffic flow confidentiality.
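
The anti-replay service mentioned above is commonly built on a sliding window over the sequence number. The sketch below shows one way to do this; the window size of 64 and the class name are assumptions chosen for illustration, and a real receiver would only update the window after the packet's ICV has been verified.

```python
class ReplayWindow:
    """Sliding-window check for duplicate or stale sequence numbers."""

    def __init__(self, size=64):
        self.size = size
        self.highest = 0   # highest sequence number accepted so far
        self.bitmap = 0    # bit i set => sequence number (highest - i) was seen

    def accept(self, seq):
        """Return True if seq is new and inside the window, and record it."""
        if seq <= 0:
            return False
        if seq > self.highest:
            # Packet advances the window: shift the history and mark bit 0.
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False               # too old, fell out of the window
        if self.bitmap & (1 << offset):
            return False               # already seen: a replay
        self.bitmap |= 1 << offset     # late but new: record it
        return True
```

Packets that arrive slightly out of order are still accepted once, while an attacker who captures and resends a packet is rejected.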

Security associations (SA) supply the bundle of algorithms and data that provide the parameters
necessary to operate the AH and/or ESP operations. The Internet Security Association and Key
Management Protocol (ISAKMP) provides a framework for authentication and key exchange, with
actual authenticated keying material provided either by manual configuration with pre-shared keys,
Internet Key Exchange (IKE and IKEv2), Kerberized Internet Negotiation of Keys (KINK), or
IPSECKEY DNS records.
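
As a minimal sketch of how a receiver might find the right security association, the following keeps SAs in a table keyed by the SPI, destination address and security protocol, the triple described in the AH field list. The class names, fields and example values are hypothetical; key exchange (IKE and the like) is out of scope here and the key is simply assumed to be pre-shared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityAssociation:
    """Parameters needed to process packets for one direction of traffic."""
    protocol: str    # "AH" or "ESP"
    algorithm: str   # illustrative algorithm name
    key: bytes       # assumed to be a manually configured pre-shared key


class SecurityAssociationDatabase:
    """Lookup table keyed by (SPI, destination IP, protocol)."""

    def __init__(self):
        self._entries = {}

    def add(self, spi, dst_ip, sa):
        self._entries[(spi, dst_ip, sa.protocol)] = sa

    def lookup(self, spi, dst_ip, protocol):
        # Returns None when no SA matches, in which case the packet is dropped.
        return self._entries.get((spi, dst_ip, protocol))


sad = SecurityAssociationDatabase()
sad.add(0x1000, "192.0.2.10",
        SecurityAssociation("AH", "hmac-sha1-96", b"pre-shared-key"))
```

A call such as `sad.lookup(0x1000, "192.0.2.10", "AH")` returns the stored association, while any other triple returns `None`.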

Two Modes of IPsec :

1. Transport Mode :
Transport mode is the default mode for IPsec, and it is used for end-to-end communications (for example,
for communications between a client and a server). When transport mode is used, IPsec encrypts only the IP
payload. Transport mode provides the protection of an IP payload through an AH or ESP header. Typical IP
payloads are TCP segments (containing a TCP header and TCP segment data), a UDP message (containing a
UDP header and UDP message data), or an ICMP message (containing an ICMP header and ICMP message
data).

2. Tunnel Mode :

When IPsec tunnel mode is used, IPsec encrypts the IP header and the payload, whereas transport
mode only encrypts the IP payload. Tunnel mode provides the protection of an entire IP packet by
treating it as an AH or ESP payload. With tunnel mode, an entire IP packet is encapsulated with an
AH or ESP header and an additional IP header. The IP addresses of the outer IP header are the tunnel
endpoints, and the IP addresses of the encapsulated IP header are the ultimate source and destination
addresses.
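
The difference between the two modes is easiest to see as a packet layout. The sketch below models a packet as a list of labelled parts rather than real bytes; the addresses and function names are made up for illustration, and no encryption is performed.

```python
IPSEC_HEADER = "AH or ESP header"


def transport_mode(original_packet):
    """Keep the original IP header and protect only the payload."""
    ip_header, *payload = original_packet
    return [ip_header, IPSEC_HEADER, *payload]


def tunnel_mode(original_packet, gateway_src, gateway_dst):
    """Wrap the whole original packet behind a new outer IP header."""
    outer_header = f"IP {gateway_src} -> {gateway_dst}"
    return [outer_header, IPSEC_HEADER, *original_packet]


# Hypothetical packet from a host in one office to a host in another.
packet = ["IP 10.0.0.5 -> 10.1.0.9", "TCP segment"]

if __name__ == "__main__":
    print(transport_mode(packet))
    print(tunnel_mode(packet, "198.51.100.1", "203.0.113.1"))
```

In the tunnel-mode result the inner addresses (the ultimate source and destination) sit behind the IPsec header, so an observer on the intermediate network sees only the two tunnel endpoints.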

IPsec tunnel mode is useful for protecting traffic between different networks, when traffic must pass
through an intermediate, untrusted network. Tunnel mode is primarily used for interoperability with
gateways, or end-systems that do not support L2TP/IPsec or PPTP connections. You can use tunnel
mode in the following configurations:

 Gateway-to-gateway
 Server-to-gateway
 Server-to-server

Module 2: Internet as a Distributed computing platform
Web Services technology
Today, the principal use of the World Wide Web is for interactive access to documents and
applications. In almost all cases, such access is by human users, typically working through Web
browsers, audio players, or other interactive front-end systems. The Web can grow significantly in
power and scope if it is extended to support communication between applications, from one program
to another.

- From the W3C XML Protocol Working Group Charter

Welcome to the world of web services. This chapter will ground you in the basics of web service
terminology and architecture. It does so by answering the most common questions, including:

 What exactly is a web service?
 What is the web service protocol stack?
 What is XML messaging? Service description? Service discovery?
 What are XML-RPC, SOAP, WSDL, and UDDI? How do these technologies complement each
other and work together?
 What security issues are unique to web services?
 What standards currently exist?

1.1 Introduction to Web Services

A web service is any service that is available over the Internet, uses a standardized XML
messaging system, and is not tied to any one operating system or programming language.

There are several alternatives for XML messaging. For example, you could use XML
Remote Procedure Calls (XML-RPC) or SOAP, both of which are described later in this chapter.
Alternatively, you could just use HTTP GET/POST and pass arbitrary XML documents. Any of
these options can work.
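
To show what an XML message of this kind looks like, the following sketch uses Python's standard `xmlrpc.client` module to build an XML-RPC request and parse it back. The method name `getOrderStatus` and the order number are hypothetical; no network connection is made.

```python
import xmlrpc.client

# Build the XML body of a request calling a hypothetical remote method
# getOrderStatus with one integer argument.
request_xml = xmlrpc.client.dumps((12345,), methodname="getOrderStatus")

# A receiving server would parse the same document back into
# a tuple of parameters and the method name.
params, method = xmlrpc.client.loads(request_xml)

if __name__ == "__main__":
    print(request_xml)
    print(method, params)
```

The request is plain XML text, so any language with an XML parser and an HTTP library can produce or consume it, which is the platform independence the definition above asks for.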

Although they are not required, a web service may also have two additional (and desirable)
properties:

 A web service should be self-describing. If you publish a new web service, you should also
publish a public interface to the service. At a minimum, your service should include human-
readable documentation so that other developers can more easily integrate your service. If you
have created a SOAP service, you should also ideally include a public interface written in a
common XML grammar. The XML grammar can be used to identify all public methods, method
arguments, and return values.
 A web service should be discoverable. If you create a web service, there should be a relatively
simple mechanism for you to publish this fact. Likewise, there should be some simple
mechanism whereby interested parties can find the service and locate its public interface. The
exact mechanism could be via a completely decentralized system or a more logically centralized
registry system.

To summarize, a complete web service is, therefore, any service that:

 Is available over the Internet or private (intranet) networks
 Uses a standardized XML messaging system
 Is not tied to any one operating system or programming language
 Is self-describing via a common XML grammar
 Is discoverable via a simple find mechanism

1.1.1 The Web Today: The Human-Centric Web

To make web services more concrete, consider basic e-commerce functionality. For example,
Widgets, Inc. sells parts through its web site, enabling customers to submit purchase orders and
check on order status. To check on the order status, a customer logs into the company web site via a
web browser and receives the results as an HTML page.

This basic model illustrates a human-centric Web, where humans are the primary actors
initiating most web requests. It also represents the primary model on which most of the Web
operates today.

1.1.2 Web Services: The Application-Centric Web

With web services, we move from a human-centric Web to an application-centric Web. This
does not mean that humans are entirely out of the picture! It just means that conversations can take
place directly between applications as easily as between web browsers and servers. For example, we
can turn the order status application into a web service. Applications and agents can then connect to
the service and utilize its functionality directly. For example, an inventory application can query
Widgets, Inc. on the status of all orders. The inventory system can then process the data, manipulate
it, and integrate it into its overall supply chain management software.
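
The order status scenario can be sketched with Python's standard XML-RPC modules. Everything specific here is an assumption for illustration: the order data, the method name `getOrderStatus`, and the choice of XML-RPC rather than SOAP. The server listens on a free local port and runs in a background thread so that the client call can be shown in the same script.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical order data held by the Widgets, Inc. service.
ORDERS = {1001: "shipped", 1002: "processing"}


def get_order_status(order_id):
    """Service method: return the status of one order."""
    return ORDERS.get(order_id, "unknown")


# Port 0 asks the operating system for any free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(get_order_status, "getOrderStatus")
threading.Thread(target=server.serve_forever, daemon=True).start()

# The "inventory application": a program, not a person with a browser.
host, port = server.server_address
inventory_client = ServerProxy(f"http://{host}:{port}/")
status = inventory_client.getOrderStatus(1001)
missing = inventory_client.getOrderStatus(9999)

server.shutdown()
server.server_close()
```

The client receives a structured value it can process directly, instead of an HTML page meant for a human reader.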

There are numerous areas where an application-centric Web could prove extremely helpful.
Examples include credit card verification, package tracking, portfolio tracking, shopping bots,
currency conversion, and language translation. Other options include centralized repositories for
personal information, such as Microsoft's proposed .NET MyServices project. .NET MyServices
aims to centralize calendar, email, and credit card information and to provide web services for
sharing that data.

1.1.3 The Web Services Vision: The Automated Web

An application-centric Web is not a new notion. For years, developers have created CGI
programs and Java servlets designed primarily for use by other applications. For example,
companies have developed credit card services, search systems, and news retrieval systems. The
crucial difference is that most of these systems consisted of ad hoc solutions. With web services, we
have the promise of some standardization, which should hopefully lower the barrier to application
integration.

1.1.4 The Industry Landscape

There are currently many competing frameworks and proposals for web services. The three
main contenders are Microsoft's .NET, IBM Web Services, and Sun Open Net Environment (ONE).
While each of these frameworks has its own particular niche and spin, they all share the basic web
service definition and vision put forth here. Furthermore, all of the frameworks share a common set
of technologies, mainly SOAP, WSDL, and UDDI.

Rather than focusing on one particular implementation or framework, this book focuses on
common definitions and technologies. Hopefully, this will better equip you to cut through the
marketing hype and understand and evaluate the current contenders.

Styles of use
Web services are a set of tools that can be used in a number of ways. The three most common
styles of use are RPC, SOA and REST.

Remote procedure calls

Figure: Architectural elements involved in XML-RPC.

RPC Web services present a distributed function (or method) call interface that is familiar to
many developers. Typically, the basic unit of RPC Web services is the WSDL operation.

The first Web services tools were focused on RPC, and as a result this style is widely
deployed and supported. However, it is sometimes criticized for not being loosely coupled, because
it was often implemented by mapping services directly to language-specific functions or method
calls. Many vendors felt this approach to be a dead end, and pushed for RPC to be disallowed in the
WS-I Basic Profile.

Other approaches with nearly the same functionality as RPC are Object Management Group's
(OMG) Common Object Request Broker Architecture (CORBA), Microsoft's Distributed
Component Object Model (DCOM) or Sun Microsystems's Java/Remote Method Invocation (RMI).

Service-Oriented Architecture

Web services can also be used to implement an architecture according to service-oriented
architecture (SOA) concepts, where the basic unit of communication is a message, rather than an
operation. This is often referred to as "message-oriented" services.

SOA Web services are supported by most major software vendors and industry analysts.
Unlike RPC Web services, loose coupling is more likely, because the focus is on the "contract" that
WSDL provides, rather than the underlying implementation details.

Middleware analysts use enterprise service buses that combine message-oriented processing
and Web services to create an event-driven SOA. One example of an open-source ESB is Mule;
another is Open ESB.

Representational State Transfer (REST)

REST attempts to describe architectures that use HTTP or similar protocols by constraining
the interface to a set of well-known, standard operations (like GET, POST, PUT, DELETE for
HTTP). Here, the focus is on interacting with stateful resources, rather than messages or operations.
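
A minimal sketch of this idea follows: an in-memory store whose only interface is the four standard operations applied to resource paths. It is not a real HTTP server; the class name, the `/orders` path and the status codes chosen for each case are assumptions made for illustration.

```python
class ResourceStore:
    """In-memory resources addressed by path, driven by standard methods."""

    def __init__(self):
        self._resources = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Return a (status code, result) pair for one request."""
        if method == "POST":
            # Create a new resource under the given collection path.
            new_path = f"{path.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._resources[new_path] = body
            return 201, new_path
        if method == "PUT":
            # Create or replace the resource at exactly this path.
            self._resources[path] = body
            return 200, body
        if method == "GET":
            if path in self._resources:
                return 200, self._resources[path]
            return 404, None
        if method == "DELETE":
            if path in self._resources:
                del self._resources[path]
                return 204, None
            return 404, None
        return 405, None   # any other operation is not part of the interface


store = ResourceStore()
```

Note that there is no `getOrderStatus` operation as in the RPC style: a client reads the state of an order with `store.handle("GET", "/orders/1")` and changes it with `PUT`, using the same small set of verbs for every kind of resource.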

An architecture based on REST (one that is 'RESTful') can use WSDL to describe SOAP
messaging over HTTP, can be implemented as an abstraction purely on top of SOAP (e.g., WS-
Transfer), or can be created without using SOAP at all.

WSDL version 2.0 offers support for binding to all the HTTP request methods (not only GET
and POST as in version 1.1) so it enables a better implementation of RESTful Web services.
However, support for this specification is still poor in software development kits, which often offer
tools only for WSDL 1.1.

CloudFront will likely prove to be a critical component to the fast, worldwide distribution of static content.

Google App Engine:

What Is Google App Engine?

Google App Engine lets you run your web applications on Google's infrastructure. App Engine
applications are easy to build, easy to maintain, and easy to scale as your traffic and data storage
needs grow. With App Engine, there are no servers to maintain: You just upload your application,
and it's ready to serve your users.

You can serve your app from your own domain name (such as http://www.example.com/) using
Google Apps. Or, you can serve your app using a free name on the appspot.com domain. You can
share your application with the world, or limit access to members of your organization.

Google App Engine supports apps written in several programming languages. With App Engine's
Java runtime environment, you can build your app using standard Java technologies, including the
JVM, Java servlets, and the Java programming language—or any other language using a JVM-based
interpreter or compiler, such as JavaScript or Ruby. App Engine also features a dedicated Python
runtime environment, which includes a fast Python interpreter and the Python standard library. The
Java and Python runtime environments are built to ensure that your application runs quickly,
securely, and without interference from other apps on the system.

With App Engine, you only pay for what you use. There are no set-up costs and no recurring fees.
The resources your application uses, such as storage and bandwidth, are measured by the gigabyte,
and billed at competitive rates. You control the maximum amounts of resources your app can
consume, so it always stays within your budget.

App Engine costs nothing to get started. All applications can use up to 500 MB of storage and
enough CPU and bandwidth to support an efficient app serving around 5 million page views a
month, absolutely free. When you enable billing for your application, your free limits are raised, and
you only pay for resources you use above the free levels.

The Application Environment

Google App Engine makes it easy to build an application that runs reliably, even under heavy load
and with large amounts of data. App Engine includes the following features:

 dynamic web serving, with full support for common web technologies
 persistent storage with queries, sorting and transactions
 automatic scaling and load balancing
 APIs for authenticating users and sending email using Google Accounts
 a fully featured local development environment that simulates Google App Engine on your computer
 task queues for performing work outside of the scope of a web request
 scheduled tasks for triggering events at specified times and regular intervals

Your application can run in one of two runtime environments: the Java environment, and the Python
environment. Each environment provides standard protocols and common technologies for web
application development.

The Sandbox

Applications run in a secure environment that provides limited access to the underlying operating
system. These limitations allow App Engine to distribute web requests for the application across
multiple servers, and start and stop servers to meet traffic demands. The sandbox isolates your
application in its own secure, reliable environment that is independent of the hardware, operating
system and physical location of the web server.

Examples of the limitations of the secure sandbox environment include:

 An application can only access other computers on the Internet through the provided URL fetch and
email services. Other computers can only connect to the application by making HTTP (or HTTPS)
requests on the standard ports.
 An application cannot write to the file system. An app can read files, but only files uploaded with the
application code. The app must use the App Engine datastore, memcache or other services for all data
that persists between requests.
 Application code only runs in response to a web request, a queued task, or a scheduled task, and must
return response data within 30 seconds in any case. A request handler cannot spawn a sub-process or
execute code after the response has been sent.

The Java Runtime Environment

You can develop your application for the Java runtime environment using common Java web
development tools and API standards. Your app interacts with the environment using the Java
Servlet standard, and can use common web application technologies such as JavaServer Pages
(JSPs).

The Java runtime environment uses Java 6. The App Engine Java SDK supports developing apps
using either Java 5 or 6.

The environment includes the Java SE Runtime Environment (JRE) 6 platform and libraries. The
restrictions of the sandbox environment are implemented in the JVM. An app can use any JVM
bytecode or library feature, as long as it does not exceed the sandbox restrictions. For instance,
bytecode that attempts to open a socket or write to a file will throw a runtime exception.

Your app accesses most App Engine services using Java standard APIs. For the App Engine
datastore, the Java SDK includes implementations of the Java Data Objects (JDO) and Java
Persistence API (JPA) interfaces. Your app can use the JavaMail API to send email messages with
the App Engine Mail service. The java.net HTTP APIs access the App Engine URL fetch service.
App Engine also includes low-level APIs for its services to implement additional adapters, or to use
directly from the application. See the documentation for the datastore, memcache, URL fetch, mail,
images and Google Accounts APIs.

Typically, Java developers use the Java programming language and APIs to implement web
applications for the JVM. With the use of JVM-compatible compilers or interpreters, you can also
use other languages to develop web applications, such as JavaScript, Ruby, or Scala.

For more information about the Java runtime environment, see The Java Runtime Environment.

The Python Runtime Environment

With App Engine's Python runtime environment, you can implement your app using the Python
programming language, and run it on an optimized Python interpreter. App Engine includes rich
APIs and tools for Python web application development, including a feature rich data modeling API,
an easy-to-use web application framework, and tools for managing and accessing your app's data.
You can also take advantage of a wide variety of mature libraries and frameworks for Python web
application development, such as Django.

The Python runtime environment uses Python version 2.5.2. Additional support for Python 3 is being
considered for a future release.

The Python environment includes the Python standard library. Of course, not all of the library's
features can run in the sandbox environment. For instance, a call to a method that attempts to open a
socket or write to a file will raise an exception. For convenience, several modules in the standard
library whose core features are not supported by the runtime environment have been disabled, and
code that imports them will raise an error.

Application code written for the Python environment must be written exclusively in Python.
Extensions written in the C language are not supported.

The Python environment provides rich Python APIs for the datastore, Google Accounts, URL fetch,
and email services. App Engine also provides a simple Python web application framework called
webapp to make it easy to start building applications.

You can upload other third-party libraries with your application, as long as they are implemented in
pure Python and do not require any unsupported standard library modules.

For more information about the Python runtime environment, see The Python Runtime Environment.

The Datastore

App Engine provides a distributed data storage service that features a query engine and transactions.
Just as the distributed web server grows with your traffic, the distributed datastore grows with your
data. You have the choice between two different data storage options differentiated by their
availability and consistency guarantees.

The App Engine datastore is not like a traditional relational database. Data objects, or "entities,"
have a kind and a set of properties. Queries can retrieve entities of a given kind filtered and sorted by
the values of the properties. Property values can be of any of the supported property value types.
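
As a rough illustration of this model, the following Python sketch represents each entity as a kind plus a dictionary of properties, and runs a filtered, sorted query over them in memory. The `Entity` class and the `query` function are hypothetical names invented for this example; they are not the App Engine datastore API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A toy stand-in for a datastore entity: a kind plus free-form properties."""
    kind: str
    properties: dict = field(default_factory=dict)


def query(entities, kind, filters=None, order_by=None):
    """Return entities of one kind that match all filters, sorted by one property."""
    filters = filters or {}
    matches = [
        e for e in entities
        if e.kind == kind
        and all(e.properties.get(name) == value for name, value in filters.items())
    ]
    if order_by is not None:
        matches.sort(key=lambda e: e.properties[order_by])
    return matches


# Entities of different kinds live side by side and need not share properties.
store = [
    Entity("Book", {"title": "Dune", "year": 1965, "in_stock": True}),
    Entity("Book", {"title": "Emma", "year": 1815, "in_stock": True}),
    Entity("Book", {"title": "Ulysses", "year": 1922, "in_stock": False}),
    Entity("Author", {"name": "Jane Austen"}),
]

in_stock = query(store, "Book", filters={"in_stock": True}, order_by="year")
titles = [e.properties["title"] for e in in_stock]
print(titles)  # ['Emma', 'Dune']
```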

Datastore entities are "schemaless." The structure of data entities is provided by and enforced by
your application code. The Java JDO/JPA interfaces and the Python datastore interface include
features for applying and enforcing structure within your app. Your app can also access the datastore
directly to apply as much or as little structure as it needs.

The datastore is strongly consistent and uses optimistic concurrency control. An update of an entity
occurs in a transaction that is retried a fixed number of times if other processes are trying to update
the same entity simultaneously. Your application can execute multiple datastore operations in a
single transaction which either all succeed or all fail, ensuring the integrity of your data.
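
Optimistic concurrency control with a fixed number of retries can be sketched as follows. This is a minimal model that assumes each entity carries a version number checked at commit time; `VersionedStore`, `ConflictError` and `update_with_retries` are hypothetical names for illustration, and the datastore's real mechanism may differ.

```python
class ConflictError(Exception):
    """Raised when a commit finds that the entity changed since it was read."""


class VersionedStore:
    """Toy key-value store that tracks a version number per key."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        return self._data.get(key, (0, None))

    def commit(self, key, expected_version, new_value):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise ConflictError(key)
        self._data[key] = (current_version + 1, new_value)


def update_with_retries(store, key, update_fn, max_retries=3):
    """Read, compute, and commit; retry a fixed number of times on conflict."""
    for _ in range(max_retries):
        version, value = store.read(key)
        try:
            store.commit(key, version, update_fn(value))
            return True
        except ConflictError:
            continue  # another writer got there first; re-read and try again
    return False


store = VersionedStore()
store.commit("counter", 0, 10)

calls = {"n": 0}


def increment_with_interference(value):
    # Simulate a competing writer sneaking in during the first attempt only.
    calls["n"] += 1
    if calls["n"] == 1:
        store.commit("counter", 1, 100)
    return value + 1


ok = update_with_retries(store, "counter", increment_with_interference)
print(ok, store.read("counter"))  # True (3, 101)
```

The first attempt fails because the competing write bumped the version; the second attempt re-reads the new value and succeeds.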

The datastore implements transactions across its distributed network using "entity groups." A
transaction manipulates entities within a single group. Entities of the same group are stored together
for efficient execution of transactions. Your application can assign entities to groups when the
entities are created.

Google Accounts

App Engine supports integrating an app with Google Accounts for user authentication. Your
application can allow a user to sign in with a Google account, and access the email address and
displayable name associated with the account. Using Google Accounts lets the user start using your
application faster, because the user may not need to create a new account. It also saves you the effort
of implementing a user account system just for your application.

If your application is running under Google Apps, it can use the same features with members of your
organization and Google Apps accounts.

The Users API can also tell the application whether the current user is a registered administrator for
the application. This makes it easy to implement admin-only areas of your site.

For more information about integrating with Google Accounts, see the Users API reference.

App Engine Services

App Engine provides a variety of services that enable you to perform common operations when
managing your application. The following APIs are provided to access these services:

URL Fetch

Applications can access resources on the Internet, such as web services or other data, using App
Engine's URL fetch service. The URL fetch service retrieves web resources using the same high-
speed Google infrastructure that retrieves web pages for many other Google products.

Mail

Applications can send email messages using App Engine's mail service. The mail service uses
Google infrastructure to send email messages.

Memcache

The Memcache service provides your application with a high performance in-memory key-value
cache that is accessible by multiple instances of your application. Memcache is useful for data that
does not need the persistence and transactional features of the datastore, such as temporary data or
data copied from the datastore to the cache for high speed access.
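
The "copy from the datastore to the cache" pattern can be sketched like this. `SimpleCache` is a toy in-memory stand-in written for this example, not the App Engine Memcache API, and `load_profile_from_datastore` is a hypothetical slow lookup.

```python
import time


class SimpleCache:
    """Toy in-memory key-value cache with optional expiry."""

    def __init__(self):
        self._items = {}  # key -> (value, expires_at or None)

    def get(self, key):
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._items[key]  # entry is stale; treat it as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._items[key] = (value, expires_at)


datastore_reads = []


def load_profile_from_datastore(user_id):
    """Hypothetical slow lookup; records each call so the cache effect is visible."""
    datastore_reads.append(user_id)
    return {"id": user_id, "name": "user-%d" % user_id}


def get_profile(cache, user_id):
    key = "profile:%d" % user_id
    profile = cache.get(key)
    if profile is None:  # cache miss: fall back to the datastore
        profile = load_profile_from_datastore(user_id)
        cache.set(key, profile, ttl_seconds=60)
    return profile


cache = SimpleCache()
first = get_profile(cache, 7)
second = get_profile(cache, 7)  # served from the cache
print(first == second, len(datastore_reads))  # True 1
```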

Image Manipulation

The Image service lets your application manipulate images. With this API, you can resize, crop,
rotate and flip images in JPEG and PNG formats.

Scheduled Tasks and Task Queues

An application can perform tasks outside of responding to web requests. Your application can
perform these tasks on a schedule that you configure, such as on a daily or hourly basis. Or, the
application can perform tasks added to a queue by the application itself, such as a background task
created while handling a request.
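
The idea of deferring work from a request to a queue can be sketched in a few lines. This is only an in-process model using a `deque`; the function names and the task format are assumptions made for illustration and do not reflect the App Engine task queue API.

```python
from collections import deque

task_queue = deque()  # tasks waiting to run outside the request
results = []


def handle_request(user_email):
    """Respond quickly and defer the slow work by queueing it."""
    task_queue.append(("send_welcome_email", user_email))
    return "200 OK"


def run_pending_tasks():
    """Drain the queue; in a real service a background worker would do this."""
    while task_queue:
        name, payload = task_queue.popleft()
        results.append("%s -> %s" % (name, payload))


status = handle_request("alice@example.com")
run_pending_tasks()
print(status, results)
```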

Scheduled tasks are also known as "cron jobs," handled by the Cron service. For more information
on using the Cron service, see the Python or Java cron documentation.

Task queues are currently released as an experimental feature. At this time, only the Python runtime
environment can use task queues.

AMAZON CLOUD

Amazon Elastic Compute Cloud (Amazon EC2)

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute
capacity in the cloud. It is designed to make web-scale computing easier for developers.

Amazon EC2’s simple web service interface allows you to obtain and configure capacity with
minimal friction. It provides you with complete control of your computing resources and lets you run
on Amazon’s proven computing environment. Amazon EC2 reduces the time required to obtain and
boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as
your computing requirements change. Amazon EC2 changes the economics of computing by
allowing you to pay only for capacity that you actually use. Amazon EC2 provides developers the
tools to build failure resilient applications and isolate themselves from common failure scenarios.

Amazon EC2 Functionality

Amazon EC2 presents a true virtual computing environment, allowing you to use web service
interfaces to launch instances with a variety of operating systems, load them with your custom
application environment, manage your network’s access permissions, and run your image using as
many or few systems as you desire.

To use Amazon EC2, you simply:

• Select a pre-configured, templated image to get up and running immediately. Or create an Amazon
Machine Image (AMI) containing your applications, libraries, data, and associated configuration
settings.
• Configure security and network access on your Amazon EC2 instance.
• Choose which instance type(s) and operating system you want, then start, terminate, and monitor as
many instances of your AMI as needed, using the web service APIs or the variety of management
tools provided.
• Determine whether you want to run in multiple locations, utilize static IP endpoints, or attach
persistent block storage to your instances.
• Pay only for the resources that you actually consume, like instance-hours or data transfer.

Service Highlights

Elastic – Amazon EC2 enables you to increase or decrease capacity within minutes, not hours or
days. You can commission one, hundreds or even thousands of server instances simultaneously. Of
course, because this is all controlled with web service APIs, your application can automatically scale
itself up and down depending on its needs.

Completely Controlled – You have complete control of your instances. You have root access to each
one, and you can interact with them as you would any machine. You can stop your instance while
retaining the data on your boot partition and then subsequently restart the same instance using web
service APIs. Instances can be rebooted remotely using web service APIs. You also have access to
console output of your instances.

Flexible – You have the choice of multiple instance types, operating systems, and software
packages. Amazon EC2 allows you to select a configuration of memory, CPU, instance storage, and
the boot partition size that is optimal for your choice of operating system and application. For
example, your choice of operating systems includes numerous Linux distributions, Microsoft
Windows Server and OpenSolaris.

Designed for use with other Amazon Web Services – Amazon EC2 works in conjunction with
Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon
RDS), Amazon SimpleDB and Amazon Simple Queue Service (Amazon SQS) to provide a complete
solution for computing, query processing and storage across a wide range of applications.

Reliable – Amazon EC2 offers a highly reliable environment where replacement instances can be
rapidly and predictably commissioned. The service runs within Amazon’s proven network
infrastructure and datacenters. The Amazon EC2 Service Level Agreement commitment is 99.95%
availability for each Amazon EC2 Region.

Secure – Amazon EC2 provides numerous mechanisms for securing your compute resources.

• Amazon EC2 includes web service interfaces to configure firewall settings that control
network access to and between groups of instances.
• When launching Amazon EC2 resources within Amazon Virtual Private Cloud (Amazon
VPC), you can isolate your compute instances by specifying the IP range you wish to use,
and connect to your existing IT infrastructure using industry-standard encrypted IPsec VPN.
You can also choose to launch Dedicated Instances into your VPC. Dedicated Instances are
Amazon EC2 Instances that run on hardware dedicated to a single customer for additional
isolation.

Inexpensive – Amazon EC2 passes on to you the financial benefits of Amazon’s scale. You pay a
very low rate for the compute capacity you actually consume.

• On-Demand Instances – On-Demand Instances let you pay for compute capacity by the hour
with no long-term commitments. This frees you from the costs and complexities of planning,
purchasing, and maintaining hardware and transforms what are commonly large fixed costs
into much smaller variable costs. On-Demand Instances also remove the need to buy “safety
net” capacity to handle periodic traffic spikes.
• Reserved Instances – Reserved Instances give you the option to make a low, one-time
payment for each instance you want to reserve and in turn receive a significant discount on
the hourly usage charge for that instance. After the one-time payment for an instance, that
instance is reserved for you, and you have no further obligation; you may choose to run that
instance for the discounted usage rate for the duration of your term, or when you do not use
the instance, you will not pay usage charges on it.
• Spot Instances – Spot Instances allow customers to bid on unused Amazon EC2 capacity and
run those instances for as long as their bid exceeds the current Spot Price. The Spot Price
changes periodically based on supply and demand, and customers whose bids meet or exceed
it gain access to the available Spot Instances. If you have flexibility in when your
applications can run, Spot Instances can significantly lower your Amazon EC2 costs. See the
Amazon EC2 Spot Instances documentation for more details.
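
The three pricing models can be compared with some simple arithmetic. The sketch below uses made-up prices in cents (they are not Amazon's rates) and assumes that a Spot Instance runs in every hour where the bid meets or exceeds the Spot Price; all function names are invented for this example.

```python
def on_demand_cost(hours, hourly_rate):
    """Pay-as-you-go: cost is hours used times the hourly rate."""
    return hours * hourly_rate


def reserved_cost(hours, upfront_fee, discounted_rate):
    """One-time payment plus a lower hourly charge for the hours actually used."""
    return upfront_fee + hours * discounted_rate


def break_even_hours(hourly_rate, upfront_fee, discounted_rate):
    """Hours of use beyond which the reserved option is cheaper than on-demand."""
    return upfront_fee / (hourly_rate - discounted_rate)


def spot_hours_run(bid, spot_prices):
    """Count the hours in which the bid meets or exceeds the Spot Price."""
    return sum(1 for price in spot_prices if bid >= price)


# Hypothetical prices in cents, chosen only to keep the arithmetic easy.
RATE, UPFRONT, DISCOUNTED = 10, 24000, 4
HOURS_PER_YEAR = 8760

print(on_demand_cost(HOURS_PER_YEAR, RATE))                # 87600
print(reserved_cost(HOURS_PER_YEAR, UPFRONT, DISCOUNTED))  # 59040
print(break_even_hours(RATE, UPFRONT, DISCOUNTED))         # 4000.0
print(spot_hours_run(5, [3, 4, 6, 5, 7, 4]))               # 4
```

With these numbers, an instance that runs all year is cheaper when reserved, while one used for fewer than 4000 hours is cheaper on demand.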

Features

Amazon EC2 provides a number of powerful features for building scalable, failure resilient,
enterprise class applications, including:

• Amazon Elastic Block Store – Amazon Elastic Block Store (EBS) offers persistent storage
for Amazon EC2 instances. Amazon EBS volumes provide off-instance storage that persists
independently from the life of an instance. Amazon EBS volumes are highly available,
highly reliable volumes that can be leveraged as an Amazon EC2 instance’s boot partition or
attached to a running Amazon EC2 instance as a standard block device. When used as a boot
partition, Amazon EC2 instances can be stopped and subsequently restarted, enabling you to
only pay for the storage resources used while maintaining your instance’s state. Amazon EBS
volumes offer greatly improved durability over local Amazon EC2 instance stores, as
Amazon EBS volumes are automatically replicated on the backend (in a single Availability
Zone). For those wanting even more durability, Amazon EBS provides the ability to create
point-in-time consistent snapshots of your volumes that are then stored in Amazon S3, and
automatically replicated across multiple Availability Zones. These snapshots can be used as
the starting point for new Amazon EBS volumes, and can protect your data for long term
durability. You can also easily share these snapshots with co-workers and other AWS
developers. See Amazon Elastic Block Store for more details on this feature.

• Multiple Locations – Amazon EC2 provides the ability to place instances in multiple
locations. Amazon EC2 locations are composed of Regions and Availability Zones.
Availability Zones are distinct locations that are engineered to be insulated from failures in
other Availability Zones and provide inexpensive, low latency network connectivity to other
Availability Zones in the same Region. By launching instances in separate Availability
Zones, you can protect your applications from failure of a single location. Regions consist of
one or more Availability Zones, are geographically dispersed, and will be in separate
geographic areas or countries. The Amazon EC2 Service Level Agreement commitment is
99.95% availability for each Amazon EC2 Region. Amazon EC2 is currently available in
five regions: US East (Northern Virginia), US West (Northern California), EU (Ireland), Asia
Pacific (Singapore), and Asia Pacific (Tokyo).

• Elastic IP Addresses – Elastic IP addresses are static IP addresses designed for dynamic
cloud computing. An Elastic IP address is associated with your account, not a particular
instance, and you control that address until you choose to explicitly release it. Unlike
traditional static IP addresses, however, Elastic IP addresses allow you to mask instance or
Availability Zone failures by programmatically remapping your public IP addresses to any
instance in your account. Rather than waiting on a data technician to reconfigure or replace
your host, or waiting for DNS to propagate to all of your customers, Amazon EC2 enables
you to engineer around problems with your instance or software by quickly remapping your
Elastic IP address to a replacement instance. In addition, you can optionally configure the
reverse DNS record of any of your Elastic IP addresses by filling out a request form.

• Amazon Virtual Private Cloud – Amazon VPC is a secure and seamless bridge between a
company’s existing IT infrastructure and the AWS cloud. Amazon VPC enables enterprises
to connect their existing infrastructure to a set of isolated AWS compute resources via a
Virtual Private Network (VPN) connection, and to extend their existing management
capabilities such as security services, firewalls, and intrusion detection systems to include
their AWS resources. See Amazon Virtual Private Cloud for more details.

• Amazon CloudWatch – Amazon CloudWatch is a web service that provides monitoring for
AWS cloud resources, starting with Amazon EC2. It provides you with visibility into
resource utilization, operational performance, and overall demand patterns—including
metrics such as CPU utilization, disk reads and writes, and network traffic. You can get
statistics, view graphs, and set alarms for your metric data. To use Amazon CloudWatch,
simply select the Amazon EC2 instances that you’d like to monitor; within minutes, Amazon
CloudWatch will begin aggregating and storing monitoring data that can be accessed using
web service APIs or Command Line Tools. See Amazon CloudWatch for more details.

• Auto Scaling – Auto Scaling allows you to automatically scale your Amazon EC2 capacity
up or down according to conditions you define. With Auto Scaling, you can ensure that the
number of Amazon EC2 instances you’re using scales up seamlessly during demand spikes
to maintain performance, and scales down automatically during demand lulls to minimize
costs. Auto Scaling is particularly well suited for applications that experience hourly, daily,
or weekly variability in usage. Auto Scaling is enabled by Amazon CloudWatch and
available at no additional charge beyond Amazon CloudWatch fees. See Auto Scaling for
more details.

• Elastic Load Balancing – Elastic Load Balancing automatically distributes incoming
application traffic across multiple Amazon EC2 instances. It enables you to achieve even
greater fault tolerance in your applications, seamlessly providing the amount of load
balancing capacity needed in response to incoming application traffic. Elastic Load
Balancing detects unhealthy instances within a pool and automatically reroutes traffic to
healthy instances until the unhealthy instances have been restored. You can enable Elastic
Load Balancing within a single Availability Zone or across multiple zones for even more
consistent application performance. Amazon CloudWatch can be used to capture a specific
Elastic Load Balancer’s operational metrics, such as request count and request latency, at no
additional cost beyond Elastic Load Balancing fees. See Elastic Load Balancing for more
details.

• High Performance Computing (HPC) Clusters – Customers with complex computational
workloads such as tightly coupled parallel processes, or with applications sensitive to
network performance, can achieve the same high compute and network performance
provided by custom-built infrastructure while benefiting from the elasticity, flexibility and
cost advantages of Amazon EC2. Cluster Compute and Cluster GPU Instances have been
specifically engineered to provide high-performance network capability and can be
programmatically launched into clusters – allowing applications to get the low-latency
network performance required for tightly coupled, node-to-node communication. Cluster
Compute and Cluster GPU Instances also provide significantly increased network throughput
making them well suited for customer applications that need to perform network-intensive
operations. Learn more about Cluster Compute and Cluster GPU Instances as well as other
AWS services that can be used for HPC Applications.

• VM Import – VM Import enables you to easily import virtual machine images from your
existing environment to Amazon EC2 instances. VM Import allows you to leverage your
existing investments in the virtual machines that you have built to meet your IT security,
configuration management, and compliance requirements by seamlessly bringing those
virtual machines into Amazon EC2 as ready-to-use instances. This offering is available at no
additional charge beyond standard usage charges for Amazon EC2 and Amazon S3. Learn
more about VM Import.
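
Two of the features above, Auto Scaling and Elastic Load Balancing, can be modelled very roughly as a scaling rule and a health-aware router. The sketch below is a simplification for illustration only: the CPU thresholds, the instance names and both function names are assumptions, and the real services are configured through AWS rather than written this way.

```python
from itertools import cycle


def desired_instance_count(current, avg_cpu_percent, scale_up_at=70,
                           scale_down_at=30, minimum=1, maximum=10):
    """Add one instance under heavy load, remove one when idle, otherwise hold."""
    if avg_cpu_percent > scale_up_at:
        return min(current + 1, maximum)
    if avg_cpu_percent < scale_down_at:
        return max(current - 1, minimum)
    return current


def route_requests(instances, health, request_count):
    """Round-robin requests across instances that currently pass a health check."""
    healthy = [name for name in instances if health[name]]
    if not healthy:
        raise RuntimeError("no healthy instances available")
    picker = cycle(healthy)
    return [next(picker) for _ in range(request_count)]


instances = ["i-a", "i-b", "i-c"]
health = {"i-a": True, "i-b": False, "i-c": True}

print(route_requests(instances, health, 4))            # ['i-a', 'i-c', 'i-a', 'i-c']
print(desired_instance_count(3, avg_cpu_percent=85))   # 4
print(desired_instance_count(3, avg_cpu_percent=10))   # 2
```

The unhealthy instance `i-b` receives no traffic until its health flag is restored.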

How BitTorrent Works

BitTorrent is a protocol that enables fast downloading of large files using minimum Internet
bandwidth. It costs nothing to use and includes no spyware or pop-up advertising.

Unlike other download methods, BitTorrent maximizes transfer speed by gathering pieces of the file
you want and downloading these pieces simultaneously from people who already have them. This
process makes popular and very large files, such as videos and television programs, download much
faster than is possible with other protocols.

To see why BitTorrent downloading differs from regular downloading, it helps to first look at how
traditional client-server downloading works:

• You open a Web page and click a link to download a file to your computer.

• The Web browser software on your computer (the client) tells the server (a central computer that holds the
Web page and the file you want to download) to transfer a copy of the file to your computer.

• The transfer is handled by a protocol (a set of rules), such as FTP (File Transfer Protocol) or HTTP
(HyperText Transfer Protocol).

The transfer speed is affected by a number of variables, including the type of protocol, the amount
of traffic on the server and the
number of other computers that are downloading the file. If the file is both large and popular, the
demands on the server are great, and the download will be slow.
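
A back-of-the-envelope calculation shows why a popular file slows down. The sketch below assumes the server's upload bandwidth is split evenly among all concurrent downloads and that the client's own connection is never the bottleneck; the file size and bandwidth figures are invented for illustration.

```python
def download_seconds(file_size_mb, server_upload_mbps, concurrent_downloads):
    """Estimate download time when server bandwidth is shared evenly among clients."""
    per_client_mbps = server_upload_mbps / concurrent_downloads
    file_size_megabits = file_size_mb * 8
    return file_size_megabits / per_client_mbps


alone = download_seconds(700, server_upload_mbps=100, concurrent_downloads=1)
crowded = download_seconds(700, server_upload_mbps=100, concurrent_downloads=50)
print(alone, crowded)  # 56.0 2800.0
```

Under these assumptions, the same 700 MB file takes under a minute for a single client but about 47 minutes when 50 clients download it at once.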

BitTorrent, by contrast, uses a different method of sharing files, known as peer-to-peer sharing.

Peer-to-peer file sharing is different from traditional file downloading. In peer-to-peer sharing, you
use a software program (rather than your Web browser) to locate computers that have the file you
want. Because these are ordinary computers like yours, as opposed to servers, they are called peers.
The process works like this:

• You run peer-to-peer file-sharing software (for example, a Gnutella program) on your
computer and send out a request for the file you want to download.
• To locate the file, the software queries other computers that are connected to the Internet and
running the file-sharing software.
• When the software finds a computer that has the file you want on its hard drive, the
download begins.
• Others using the file-sharing software can obtain files they want from your computer’s hard
drive.

The file-transfer load is distributed between the computers exchanging files, but file searches and
transfers from your computer to others can cause bottlenecks. Some people download files and
immediately disconnect without allowing others to obtain files from their system, which is
called leeching. This limits the number of computers the software can search for the requested file.

Unlike some other peer-to-peer downloading methods, BitTorrent is a protocol that offloads some of
the file tracking work to a central server (called a tracker). Another difference is that it uses a
principle called tit-for-tat. This means that in order to receive files, you have to give them. This
solves the problem of leeching, one of developer Bram Cohen's primary goals. With BitTorrent,
the more files you share with others, the faster your downloads are. Finally, to make better use of
available Internet bandwidth (the pipeline for data transmission), BitTorrent downloads different
pieces of the file you want simultaneously from multiple computers.
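
The tit-for-tat idea can be sketched as a ranking: a client gives its limited upload slots to the peers that have sent it the most data. This is a deliberately simplified model; the function name, the slot count and the byte counts are assumptions for illustration, and real BitTorrent clients apply additional rules not shown here.

```python
def choose_peers_to_serve(bytes_received_from, slots=2):
    """Pick the peers that have uploaded the most to us; they get our upload slots."""
    ranked = sorted(bytes_received_from.items(),
                    key=lambda item: item[1], reverse=True)
    # Peers that have sent nothing are never rewarded in this simplified model.
    return [peer for peer, received in ranked[:slots] if received > 0]


received = {"peer-a": 4096, "peer-b": 0, "peer-c": 16384, "peer-d": 1024}
print(choose_peers_to_serve(received))  # ['peer-c', 'peer-a']
```

A peer that only downloads (`peer-b` here) is never selected, which is how the scheme discourages leeching.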

With BitTorrent, the process works like this:

• You open a Web page and click on a link for the file you want.
• BitTorrent client software communicates with a tracker to find other computers running
BitTorrent that have the complete file (seed computers) and those with a portion of the file
(peers that are usually in the process of downloading the file).
• The tracker identifies the swarm, which is the connected computers that have all of or a
portion of the file and are in the process of sending or receiving it.
• The tracker helps the client software trade pieces of the file you want with other computers in
the swarm. Your computer receives multiple pieces of the file simultaneously.
• If you continue to run the BitTorrent client software after your download is complete, others
can receive pieces of the file from your computer; your future download rates improve because
you are ranked higher in the “tit-for-tat” system.

Downloading pieces of the file at the same time helps solve a common problem with other peer-to-
peer download methods: Peers upload at a much slower rate than they download. By downloading
multiple pieces at the same time, the overall speed is greatly improved. The more computers
involved in the swarm, the faster the file transfer occurs because there are more sources of each
piece of the file. For this reason, BitTorrent is especially useful for large, popular files.
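
The piece-based download can be sketched as follows. The example splits a file into fixed-size pieces, publishes a SHA-1 hash for each piece, and then assembles the file from whichever peer can supply a valid copy of each piece. It fetches pieces one after another rather than simultaneously, uses an unrealistically small piece size, and all names (`split_into_pieces`, `download`, the peer labels) are assumptions made for illustration.

```python
import hashlib

PIECE_SIZE = 4  # tiny piece size so the example is easy to follow


def split_into_pieces(data, piece_size=PIECE_SIZE):
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces):
    """Hashes known up front so each piece can be verified independently."""
    return [hashlib.sha1(p).hexdigest() for p in pieces]


def download(expected_hashes, peers):
    """Take each piece from the first peer that has a copy matching its hash."""
    assembled = [None] * len(expected_hashes)
    sources = {}
    for index, expected in enumerate(expected_hashes):
        for peer_name, peer_pieces in peers.items():
            piece = peer_pieces.get(index)
            if piece is not None and hashlib.sha1(piece).hexdigest() == expected:
                assembled[index] = piece
                sources[index] = peer_name
                break
    if any(p is None for p in assembled):
        raise ValueError("some pieces are not available from any peer")
    return b"".join(assembled), sources


original = b"hello swarm, hello peers"
pieces = split_into_pieces(original)
hashes = piece_hashes(pieces)

# One seed has every piece; two peers hold only part of the file.
# The corrupted piece 0 offered by peer-1 fails the hash check and is skipped.
peers = {
    "peer-1": {0: b"XXXX", 1: pieces[1]},
    "peer-2": {2: pieces[2], 3: pieces[3]},
    "seed": {i: p for i, p in enumerate(pieces)},
}

data, sources = download(hashes, peers)
print(data == original)  # True
print(sources)
```

Because every piece is verified on its own, pieces can come from any mix of seeds and partial peers, which is what lets a larger swarm offer more sources for each piece.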

The many benefits of BitTorrent

With BitTorrent, the original file remains intact, the download speeds are amazing, and of course,
everything is available at your fingertips for free. Yes, these are among the myriad of reasons why it
is popular today. Instead of having to purchase a CD or DVD, downloads on BitTorrent are available
for free.

One of the main advantages of BitTorrent is that you can sample content prior to purchasing it. This
benefits both artists and users, since users may end up buying albums they like. In the same
way, movies and software can be tried first and then bought if they live up to expectations.

Many television shows, movies, and rare music may not be available in the market. However, you
are likely to find them on BitTorrent. These may be TV shows that are not yet available in your
country, or songs that you cannot buy easily online.

BitTorrent is becoming popular, and many software publishers today include torrents in their
downloads section. This is because they have realized the convenience with which files can be
downloaded quickly with BitTorrent. This is of course in addition to the increasing number of
dedicated sites. You can expect to download a game much faster with BitTorrent instead of HTTP.
