
Basics

The Web is a collection of millions of files stored on thousands of servers all over the world. The components required to make a web are:

Structural Components
- Web Server
- Web Client
- Internet

Semantic Components
- A well-defined set of Languages
- Protocols
- Uniform Resource Locators (URLs)

Languages
- HTML (HyperText Markup Language)
- XML (eXtensible Markup Language)
- Java
- JavaScript
- VRML (Virtual Reality Modeling Language)

Protocols
- HyperText Transfer Protocol (HTTP)

URLs
- Naming of Web Pages

WWW Background
- 1989-1990: Tim Berners-Lee invented the World Wide Web at CERN (the European Organization for Nuclear Research)
- 1993: Marc Andreessen and colleagues developed Mosaic at the National Center for Supercomputing Applications (NCSA), the first widely used graphical browser and the Internet's first killer app; Andreessen went on to co-found Netscape
- 1994: W3C (World Wide Web Consortium) founded
- Recent browsers (Internet Explorer, Mozilla Firefox, Google Chrome) retain many characteristics of Mosaic

Quick Aside: Web server use
[Figure: Web server usage share. Source: Netcraft Server Survey, November 2013]

Usage Share of Web Browsers
[Figure: usage share of Web browsers]

Web and HTTP


First some jargon:
- A Web page consists of objects
- An object can be an HTML file, JPEG image, Java applet, audio file, etc.
- A Web page consists of a base HTML file which includes several referenced objects
- Each object is addressable by a URL
- Example URL: http://www.someschool.edu/someDept/pic.gif
- In this URL the protocol is http, the host name is www.someschool.edu, and the path name is /someDept/pic.gif
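As a small illustration of the protocol / host name / path name split, the sketch below uses Python's standard `urllib.parse.urlsplit`. The helper name `url_parts` is made up for this example and is not part of the slides.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Return (protocol, host name, path name) for an absolute URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path

# The example URL from the slide.
protocol, host, path = url_parts("http://www.someschool.edu/someDept/pic.gif")
print(protocol, host, path)
```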

WWW Structure
- Clients use a browser to send URLs via HTTP to servers, requesting a Web page
- Web pages are constructed using HTML (or another markup language) and consist of text, graphics, sounds, etc.
- Servers respond to the clients with the requested Web page or with an error message
- The client's browser renders the Web page returned by the server
- The entire system runs over standard networking protocols (TCP/IP, DNS, etc.)

[Figure: a web client (browser) and a web server (stores pages) communicate with HTTP]
1. User clicks a link
2. Browser sends request: GET http://www.sitams.org/index.html HTTP/1.1
3. Server finds the page
4. Server sends the page back
5. Browser displays it

HTTP overview
HTTP: hypertext transfer protocol
- The Web's application-layer protocol
- Client/server model
- client: browser that requests, receives, and displays Web objects
- server: Web server that sends objects in response to requests

[Figure: a PC running Explorer and a Mac running Navigator as clients; a server running the Apache Web server]

HTTP overview (continued)


Uses TCP:
- client initiates TCP connection (creates socket) to server, port 80
- server accepts TCP connection from client
- HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
- TCP connection closed

HTTP is "stateless":
- server maintains no information about past client requests

Aside: protocols that maintain state are complex!
- past history (state) must be maintained
- if server/client crashes, their views of state may be inconsistent and must be reconciled

Client Side
Steps when a link is selected, e.g. http://www.sitams.org/btech/ece.html
- The browser determines the URL
- The browser asks DNS for the IP address of www.sitams.org
- DNS replies with the IP address
- The browser makes a TCP connection to port 80 at that IP address
- It then sends over a request asking for the file /btech/ece.html
- The www.sitams.org server sends the file /btech/ece.html
- The TCP connection is released
- The browser displays all the text in the file, then fetches and displays the images it references
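The client-side steps can be sketched with Python's standard `socket` module. This is an illustration under assumptions, not how any real browser is written: the function names `build_get` and `fetch` are invented here, the request uses HTTP/1.0 so the server closes the connection after one file, and the `Host` header is an addition most servers expect. `fetch` needs a live network, so it is defined but not called.

```python
import socket
from urllib.parse import urlsplit

def build_get(url):
    """Return (host, port, request_bytes) for a simple HTTP/1.0 GET."""
    parts = urlsplit(url)                # the browser determines the URL
    host = parts.hostname
    port = parts.port or 80              # default HTTP port
    path = parts.path or "/"
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
    return host, port, request.encode("ascii")

def fetch(url, timeout=5.0):
    """Resolve the host, connect over TCP, send the request, read the reply."""
    host, port, request = build_get(url)
    ip = socket.gethostbyname(host)      # ask DNS for the IP address
    with socket.create_connection((ip, port), timeout=timeout) as conn:
        conn.sendall(request)            # ask for the file
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:                 # server closed its side
                break
            chunks.append(data)
    # Leaving the with-block releases the TCP connection.
    return b"".join(chunks)
```

Calling `fetch("http://www.sitams.org/btech/ece.html")` would return the raw response bytes (status line, headers, and body) if the host is reachable.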

Server Side
Steps the server performs when a link is selected
- Accept the TCP connection from the client
- Get the name of the file requested
- Get the file from the disk
- Return the file to the client
- Release the TCP connection
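A minimal sketch of these server-side steps follows. To keep it self-contained, the "disk" is a plain dictionary mapping paths to bytes, which is an assumption made for illustration; `handle_request` and `serve_once` are hypothetical names, and a real server would also handle large requests, many clients, and more methods.

```python
def handle_request(request_text, files):
    """Map one HTTP request to a response, using an in-memory 'disk'."""
    request_line = request_text.split("\r\n", 1)[0]
    parts = request_line.split()
    if len(parts) != 3 or parts[0] != "GET":
        return b"HTTP/1.0 400 Bad Request\r\n\r\n"
    path = parts[1]                       # name of the file requested
    body = files.get(path)                # get the file from the "disk"
    if body is None:
        return b"HTTP/1.0 404 Not Found\r\n\r\n"
    header = f"HTTP/1.0 200 OK\r\nContent-Length: {len(body)}\r\n\r\n"
    return header.encode("ascii") + body  # return the file to the client

def serve_once(listener, files):
    """Serve a single client on an already-listening TCP socket."""
    conn, _addr = listener.accept()       # accept the TCP connection
    with conn:
        request_text = conn.recv(4096).decode("iso-8859-1")
        conn.sendall(handle_request(request_text, files))
    # Leaving the with-block releases the TCP connection.
```

Separating `handle_request` from the socket code keeps the request-to-response logic easy to check without opening a port.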

Protocol - HTTP
Protocol for client/server communication
- Very simple request/response protocol
- Client sends request message, server replies with response message

Three versions have been used
- HTTP/0.9 and HTTP/1.0: RFC 1945
- HTTP/1.1: RFC 2068 (later revised as RFC 2616)
- HTTP/1.0 dominated early Web traffic; HTTP/1.1 has since become the norm


HTTP/0.9
- Transfer of raw data across the Internet

HTTP/1.0
- It is a stop-and-wait protocol
- Provides messages in Multipurpose Internet Mail Extensions (MIME) types
- Separate TCP connection for each file

HTTP/1.1 focused on performance enhancements
- Persistent connections
- Pipelining
- Enhanced caching options

HTTP connections
Nonpersistent HTTP
- At most one object is sent over a TCP connection.

Persistent HTTP
- Multiple objects can be sent over a single TCP connection between client and server.

Nonpersistent HTTP
Suppose user enters URL www.sitams.org/ece/home.index (contains text, references to 10 jpeg images)

1a. HTTP client initiates TCP connection to HTTP server (process) at www.sitams.org on port 80
1b. HTTP server at host www.sitams.org, waiting for TCP connection at port 80, accepts connection, notifying client
2. HTTP client sends HTTP request message (containing URL) into TCP connection socket. Message indicates that client wants object /ece/home.index
3. HTTP server receives request message, forms response message containing requested object, and sends message into its socket
4. HTTP server closes TCP connection.
5. HTTP client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg objects

Non-Persistent HTTP: Response time


Definition of RTT: time for a small packet to travel from client to server and back.

Response time:
- one RTT to initiate TCP connection
- one RTT for HTTP request and first few bytes of HTTP response to return
- file transmission time

total = 2RTT + transmit time

[Figure: timeline showing one RTT to initiate the TCP connection, one RTT to request the file, then the time to transmit the file until the file is received]

Persistent HTTP
Nonpersistent HTTP issues:
- requires 2 RTTs per object
- OS overhead for each TCP connection
- browsers often open parallel TCP connections to fetch referenced objects

Persistent HTTP:
- server leaves connection open after sending response
- subsequent HTTP messages between same client/server sent over open connection
- client sends requests as soon as it encounters a referenced object
- as little as one RTT for all the referenced objects
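To make the comparison concrete, here is a rough calculation for a base page with 10 referenced objects. The model is a simplification chosen for this sketch: nonpersistent connections are fetched one after another (no parallel connections), the persistent case is pipelined and pays about one extra RTT for all referenced objects, and every object has the same transmit time. The RTT and transmit values are illustrative, not measurements.

```python
def nonpersistent_time(rtt, transmit, n_objects):
    """Base page plus n_objects, each on its own TCP connection, in series."""
    per_object = 2 * rtt + transmit          # total = 2RTT + transmit time
    return (1 + n_objects) * per_object

def persistent_pipelined_time(rtt, transmit, n_objects):
    """One connection: 2 RTT for the base page, then about one RTT for the rest."""
    base = 2 * rtt + transmit
    return base + rtt + n_objects * transmit

rtt, transmit = 0.100, 0.010                 # seconds; illustrative values only
slow = nonpersistent_time(rtt, transmit, 10)
fast = persistent_pipelined_time(rtt, transmit, 10)
print(f"nonpersistent: {slow:.2f} s, persistent with pipelining: {fast:.2f} s")
```

With these numbers the nonpersistent case takes 11 x (2 x 0.1 + 0.01) = 2.31 s, while the persistent pipelined case takes 0.21 + 0.1 + 10 x 0.01 = 0.41 s.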

HTTP Request Messages


- GET: retrieve document specified by URL
- PUT: store specified document under given URL
- HEAD: retrieve info about document specified by URL
- OPTIONS: retrieve information about available options
- POST: give information to the server
- DELETE: remove document specified by URL
- TRACE: loopback request message
- CONNECT: reserved in the original HTTP/1.1 specification; used to establish a tunnel through a proxy

HTTP Request Format


request-line (method URL HTTP-version)
headers (0 or more)
<blank line>
body (only for requests that carry data, such as POST or PUT)

The client's browser constructs and sends the message. Typical HTTP request:

GET http://www.sitams.org/ece/faculty_details/index.html HTTP/1.0
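The request layout (request line, headers, blank line, optional body) can be assembled as in the sketch below. `build_request` is a hypothetical helper written for this example; it does not validate header names or compute `Content-Length` for you.

```python
def build_request(method, url, headers=None, body=b"", version="HTTP/1.0"):
    """Assemble request-line, headers, blank line, and optional body."""
    lines = [f"{method} {url} {version}"]        # request-line
    for name, value in (headers or {}).items():  # zero or more headers
        lines.append(f"{name}: {value}")
    head = "\r\n".join(lines) + "\r\n\r\n"       # blank line ends the header section
    return head.encode("ascii") + body

# The typical request from the slide: no headers, no body.
message = build_request("GET", "http://www.sitams.org/ece/faculty_details/index.html")
print(message.decode("ascii"))
```

Lines end in CRLF, and the empty line is what tells the receiver that the headers are finished.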

HTTP Response Codes


- 1xx Informational: request received, processing. Ex: 100: server agrees to handle client's request.
- 2xx Success: action received, understood, accepted. Ex: 200: request succeeded; 204: no content present.
- 3xx Redirection: further action necessary. Ex: 301: page moved; 304: cached page still valid.
- 4xx Client Error: bad syntax or cannot be fulfilled. Ex: 403: forbidden page; 404: page not found.
- 5xx Server Error: server failed. Ex: 500: internal server error; 503: try again later.

HTTP Response Format


status-line (HTTP-version response-code response-phrase)
headers (0 or more)
<blank line>
body

Web servers construct and send response messages. Typical HTTP response:

HTTP/1.0 301 Moved Permanently
Location: http://www.sitams.org/ece/faculty/index.html
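Going the other way, a response can be split into status line, headers, and body, and the first digit of the code gives its class. The function names below are invented for this sketch, and it assumes a well-formed message with a response phrase present.

```python
def parse_response(raw):
    """Split raw response bytes into (version, code, phrase, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line separates head and body
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")     # field name, colon, field value
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body

def status_class(code):
    """Name the class of a status code from its first digit."""
    classes = {1: "Informational", 2: "Success", 3: "Redirection",
               4: "Client Error", 5: "Server Error"}
    return classes.get(code // 100, "Unknown")

raw = (b"HTTP/1.0 301 Moved Permanently\r\n"
       b"Location: http://www.sitams.org/ece/faculty/index.html\r\n\r\n")
version, code, phrase, headers, body = parse_response(raw)
print(code, status_class(code), headers["location"])
```

Header names are stored in lower case here because HTTP field names are case-insensitive.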

HTTP Headers
Both requests and responses contain a variable number of header fields
- Each consists of a field name, colon, space, and field value
- 17 possible header types, divided into three categories: Request, Response, Body

Example: Date: Tuesday, 10-Dec-13 11:30:01 GMT
Example: Content-Length: 3001

User-server state: cookies


Many major Web sites use cookies

Four components:
1) cookie header line of HTTP response message
2) cookie header line in HTTP request message
3) cookie file kept on user's host, managed by user's browser
4) back-end database at Web site

Example:
- Susan always accesses the Internet from her PC
- she visits a specific e-commerce site for the first time
- when the initial HTTP request arrives at the site, the site creates a unique ID and an entry in the backend database for that ID

Cookies: keeping state (cont.)


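[Figure: message exchange between client and Amazon server]
- Client's cookie file initially holds: ebay 8734
- Client sends a usual HTTP request msg
- Amazon server creates ID 1678 for the user and creates an entry in the backend database
- Server replies with a usual HTTP response plus "Set-cookie: 1678"
- Client's cookie file now holds: ebay 8734, amazon 1678
- Client sends a usual HTTP request msg plus "cookie: 1678"; the server takes a cookie-specific action, accessing the backend database, and returns a usual HTTP response msg
- One week later: the client again sends a usual HTTP request msg plus "cookie: 1678"; the server again takes a cookie-specific action, accessing the backend database, and returns a usual HTTP response msg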
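The cookie exchange can be imitated with two toy classes. This is a simplified sketch: `ToyCookieServer` and `ToyBrowser` are invented names, the cookie value is a bare ID as on the slide (real cookies are name=value pairs with attributes), and the "cookie-specific action" is just a visit counter in a dictionary standing in for the backend database.

```python
import itertools

class ToyCookieServer:
    def __init__(self, first_id=1678):
        self._ids = itertools.count(first_id)
        self.database = {}                      # backend database: ID -> visit count

    def handle(self, request_headers):
        """Return the extra response headers for one request."""
        response_headers = {}
        user_id = request_headers.get("Cookie")
        if user_id not in self.database:        # first visit (or unknown ID)
            user_id = str(next(self._ids))      # create unique ID
            self.database[user_id] = 0          # create entry in backend database
            response_headers["Set-Cookie"] = user_id
        self.database[user_id] += 1             # cookie-specific action
        return response_headers

class ToyBrowser:
    def __init__(self):
        self.cookie_file = {"ebay": "8734"}     # cookie file kept on the user's host

    def visit(self, site_name, server):
        headers = {}
        if site_name in self.cookie_file:       # send the stored cookie back
            headers["Cookie"] = self.cookie_file[site_name]
        response = server.handle(headers)
        if "Set-Cookie" in response:            # remember the new cookie
            self.cookie_file[site_name] = response["Set-Cookie"]
        return response

amazon = ToyCookieServer()
browser = ToyBrowser()
browser.visit("amazon", amazon)                 # first visit: server sets the cookie
browser.visit("amazon", amazon)                 # later visit: browser sends it back
print(browser.cookie_file, amazon.database)
```

The state lives in the cookie file and the backend database; each HTTP message only carries the ID between them.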

Cookies (continued)
What cookies can bring:
- authorization
- shopping carts
- recommendations
- user session state (Web e-mail)

Aside: cookies and privacy
- cookies permit sites to learn a lot about you
- you may supply name and e-mail to sites

How to keep "state":
- protocol endpoints: maintain state at sender/receiver over multiple transactions
- cookies: HTTP messages carry state

Web caches (proxy server)


Goal: satisfy client request without involving origin server
- user sets browser: Web accesses via cache
- browser sends all HTTP requests to cache
- if the object is in the cache, the cache returns the object
- else the cache requests the object from the origin server, then returns the object to the client

[Figure: two clients send requests to a proxy server, which contacts the origin servers when needed]
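The cache's decision (return the stored object, or fetch it from the origin and keep a copy) is sketched below. `ProxyCache` is a hypothetical class for illustration; the origin server is passed in as a plain function, and the sketch ignores expiry and size limits.

```python
class ProxyCache:
    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> object bytes
        self._store = {}                  # cached objects by URL
        self.hits = 0
        self.misses = 0

    def get(self, url):
        """Return the object for url, contacting the origin only on a miss."""
        if url in self._store:            # object in cache: return it
            self.hits += 1
            return self._store[url]
        self.misses += 1
        obj = self._fetch(url)            # else request it from the origin server
        self._store[url] = obj            # keep a copy for later requests
        return obj

def fake_origin(url):
    """Stand-in for an origin server, used only for this demonstration."""
    return b"object at " + url.encode("ascii")

cache = ProxyCache(fake_origin)
cache.get("http://www.someschool.edu/someDept/pic.gif")   # miss: goes to origin
cache.get("http://www.someschool.edu/someDept/pic.gif")   # hit: served from cache
print(cache.hits, cache.misses)
```

The cache acts as a server toward the browser and as a client toward the origin, which is why `get` both answers and issues requests.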

More about Web caching


- cache acts as both client and server
- typically cache is installed by ISP (university, company, residential ISP)

Why Web caching?
- reduce response time for client request
- reduce traffic on an institution's access link
- Internet dense with caches: enables "poor" content providers to effectively deliver content (but so does P2P file sharing)

Caching example
Assumptions
- average object size = 100,000 bits
- avg. request rate from institution's browsers to origin servers = 15/sec
- delay from institutional router to any origin server and back to router = 2 sec

Consequences
- utilization on LAN = 15%
- utilization on access link = 100%
- total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + milliseconds

[Figure: origin servers on the public Internet, connected by a 1.5 Mbps access link to an institutional network with a 10 Mbps LAN and an institutional cache]

Caching example (cont)


Possible solution
- increase bandwidth of access link to, say, 10 Mbps

Consequence
- utilization on LAN = 15%
- utilization on access link = 15%
- total delay = Internet delay + access delay + LAN delay = 2 sec + msecs + msecs
- often a costly upgrade

[Figure: origin servers on the public Internet, connected by a 10 Mbps access link to an institutional network with a 10 Mbps LAN and an institutional cache]

Caching example (cont)


Possible solution: install cache
- suppose hit rate is 0.4

Consequence
- 40% of requests will be satisfied almost immediately
- 60% of requests satisfied by origin server
- utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
- total avg delay = Internet delay + access delay + LAN delay = 0.6*(2.01) secs + 0.4*milliseconds < 1.4 secs

[Figure: origin servers on the public Internet, connected by a 1.5 Mbps access link to an institutional network with a 10 Mbps LAN and an institutional cache]
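The numbers in the caching example can be reproduced with a few lines of arithmetic. The constants come from the stated assumptions; the 10 msec delay for a cache hit is an assumption made here, since the slide only says "milliseconds".

```python
OBJECT_BITS = 100_000        # average object size, bits
REQUESTS_PER_SEC = 15        # average request rate
INTERNET_DELAY = 2.0         # seconds, router to origin server and back

def utilization(link_bps, miss_fraction=1.0):
    """Fraction of a link's capacity used by the traffic that crosses it."""
    offered_bps = OBJECT_BITS * REQUESTS_PER_SEC * miss_fraction
    return offered_bps / link_bps

def avg_delay_with_cache(hit_rate, access_delay=0.010, hit_delay=0.010):
    """Average delay when hit_rate of requests are served by the cache."""
    miss_part = (1 - hit_rate) * (INTERNET_DELAY + access_delay)
    hit_part = hit_rate * hit_delay      # hit_delay is an assumed LAN delay
    return miss_part + hit_part

print("LAN:", utilization(10_000_000))                    # 10 Mbps LAN
print("access link:", utilization(1_500_000))             # 1.5 Mbps link, no cache
print("access link with cache:", utilization(1_500_000, miss_fraction=0.6))
print("avg delay with cache:", avg_delay_with_cache(0.4))
```

The offered load is 15 x 100,000 = 1.5 Mbps, which saturates the 1.5 Mbps access link without a cache; with a 0.4 hit rate only 0.9 Mbps crosses the link, and the average delay works out to about 0.6 x 2.01 + 0.4 x 0.01 = 1.21 sec.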

Conditional GET
Goal: don't send object if cache has up-to-date cached version
- cache: specify date of cached copy in HTTP request
  If-modified-since: <date>
- server: response contains no object if cached copy is up-to-date:
  HTTP/1.0 304 Not Modified

[Figure: two exchanges between cache and server]
- object not modified: the cache sends an HTTP request msg with "If-modified-since: <date>"; the server returns the HTTP response "HTTP/1.0 304 Not Modified"
- object modified: the cache sends an HTTP request msg with "If-modified-since: <date>"; the server returns the HTTP response "HTTP/1.0 200 OK" followed by <data>
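The server's side of a conditional GET might look like the sketch below. `conditional_get` is a hypothetical function; it assumes the date header uses the standard HTTP date format in GMT and parses it with `email.utils.parsedate_to_datetime` from the Python standard library.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_get(request_headers, last_modified, body):
    """Return (status_line, body); the body is empty when the cached copy is current."""
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        cached_at = parsedate_to_datetime(since)   # date of the cached copy
        if last_modified <= cached_at:             # object not modified
            return "HTTP/1.0 304 Not Modified", b""
    return "HTTP/1.0 200 OK", body                 # object modified, or no date sent

# Illustrative values: the object last changed on 10 Dec 2013.
last_modified = datetime(2013, 12, 10, 11, 30, 1, tzinfo=timezone.utc)
up_to_date = {"If-Modified-Since": format_datetime(last_modified, usegmt=True)}
out_of_date = {"If-Modified-Since": "Mon, 09 Dec 2013 00:00:00 GMT"}
print(conditional_get(up_to_date, last_modified, b"<data>"))
print(conditional_get(out_of_date, last_modified, b"<data>"))
```

The 304 reply carries no object, so the cache saves the transfer time while still confirming that its copy is valid.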

FTP: the file transfer protocol


[Figure: a user at a host works through the FTP user interface and FTP client, which uses the local file system; files are transferred between the FTP client and the FTP server, which uses the remote file system]

- transfer file to/from remote host
- client/server model
- client: side that initiates transfer (either to/from remote)
- server: remote host
- ftp: RFC 959
- ftp server: port 21

FTP: separate control, data connections


- FTP client contacts FTP server at port 21; TCP is the transport protocol
- client authorized over control connection
- client browses remote directory by sending commands over control connection
- when server receives file transfer command, server opens 2nd TCP connection (for file) to client
- after transferring one file, server closes data connection
- server opens another TCP data connection to transfer another file
- control connection: "out of band"
- FTP server maintains "state": current directory, earlier authentication

[Figure: FTP client and FTP server joined by a TCP control connection on port 21 and a TCP data connection on port 20]

FTP commands, responses


Sample commands:
- sent as ASCII text over control channel
- USER username
- PASS password
- LIST: return list of files in current directory
- RETR filename: retrieves (gets) file
- STOR filename: stores (puts) file onto remote host

Sample return codes
- status code and phrase (as in HTTP)
- 331 Username OK, password required
- 125 data connection already open; transfer starting
- 425 Can't open data connection
- 452 Error writing file
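Because commands and replies on the control connection are plain ASCII lines, they are easy to build and parse. The two helpers below are hypothetical and handle only single-line replies; multi-line replies, which RFC 959 also allows, are left out of this sketch.

```python
def format_command(verb, argument=None):
    """Build one control-channel command line, terminated by CRLF."""
    line = verb if argument is None else f"{verb} {argument}"
    return (line + "\r\n").encode("ascii")

def parse_reply(line):
    """Split a single-line reply into (status code, phrase)."""
    text = line.decode("ascii").rstrip("\r\n")
    code, _, phrase = text.partition(" ")
    return int(code), phrase

print(format_command("USER", "username"))
print(format_command("LIST"))
print(parse_reply(b"331 Username OK, password required\r\n"))
```

As with HTTP, the three-digit code is meant for the program and the phrase for the human reading the session.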
