
Hypertext Transfer Protocol (HTTP)
By
Mohan Kumar Noothalapati
Agenda
• About HTTP
• HTTP 1.1 vs. HTTP 1.0
• HTTP URLs
• HTTP methods
• WebDAV
• Request/Response formats
• Request/Response headers
• HTTP Status Codes
• HTTP Cookie
• HTTP Caching
• DNS
Hypertext Transfer Protocol

• The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed,
collaborative, hypermedia information systems. HTTP is the foundation of data
communication for the World Wide Web. HTTP has been in use by the World-Wide
Web global information initiative since 1990.
• Language of the Web
• protocol used for communication between web browsers and web servers
• TCP port 80
• Standards development was coordinated by IETF and W3C.
HTTP 1.0 vs. HTTP 1.1
Feature              HTTP/1.0                          HTTP/1.1
Connections          Not persistent; opt in with       Persistent by default
                     Connection: Keep-Alive
Pipelining           No pipelining                     HTTP pipelining
Caching              Expires and Pragma: no-cache      Cache-Control introduced (e.g. no-cache, max-age=0)
Range requests       Not available                     Partial document transfers with ranges
Conditional fetch    If-Modified-Since only            Further conditions (e.g. If-None-Match with ETag)

HTTP persistent connection (also known as HTTP keep-alive or HTTP connection reuse) is the idea
of using a single TCP connection to send and receive multiple HTTP requests/responses, as opposed to
opening a new connection for every single request/response pair.
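The idea of connection reuse can be tried locally. The following is a minimal sketch, assuming a throwaway server on 127.0.0.1 and the hypothetical paths /first and /second; it sends two requests through one http.client connection and checks that the same socket carried both.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the server keep the connection open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where this response ends,
        # which is what makes reusing the connection possible.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_twice_on_one_connection():
    """Return (bodies, reused) after two GETs over a single connection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPathHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    try:
        bodies, sockets = [], []
        for path in ("/first", "/second"):  # hypothetical paths
            conn.request("GET", path)
            response = conn.getresponse()
            bodies.append(response.read().decode("ascii"))
            sockets.append(conn.sock)
        reused = sockets[0] is not None and sockets[0] is sockets[1]
        return bodies, reused
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

If the server closed the connection after each response, http.client would silently reconnect and the socket check would fail.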
URI, URN, URL
• Uniform Resource Identifier (URI)
• Identifies a resource. Ex: scheme://host/path/object#name
• Uniform Resource Name (URN)
• The name of the resource within a namespace. Ex: urn:<namespace>:<name>, such as urn:isbn:0451450523
• Uniform Resource Locator (URL)
• How to find the resource.
HTTP - URLs
• URL
• Uniform Resource Locator
• protocol (http, ftp, mailto)
• host name (name.domain name)
• port (usually 80 but many on 8080)
• directory path to the resource
• resource name
• http://www.techonthenet.com:80/oracle/index.php
• http://xxx.myplace.com:80/download/httpex.exe
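As a small illustration of the parts listed above, this sketch splits one of the example URLs with Python's standard urllib.parse. The helper name describe_url and the dictionary keys are made up for this example.

```python
import posixpath
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into protocol, host, port, directory path, and resource name."""
    parts = urlsplit(url)
    directory, resource = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL does not spell out a port
        "directory": directory,
        "resource": resource,
    }


info = describe_url("http://www.techonthenet.com:80/oracle/index.php")
print(info)
```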
HTTP - methods
• Methods
• GET
• retrieve a URL from the server
• simple page request
• run a program
• run a program with arguments attached to the URL
• Limited data can be passed with URL as parameters
• POST
• preferred method for forms processing
• run a program
• can send more data than GET method.
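• keeps form data out of the URL (and so out of browser history and most server logs); the data is only encrypted when HTTPS is used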
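The difference in where GET and POST carry their arguments can be sketched as follows. The path /search and the parameter names are hypothetical; the host is the example host used elsewhere in these slides.

```python
from urllib.parse import urlencode


def build_get(path, params):
    # GET: the arguments travel in the URL itself, after a "?".
    return f"GET {path}?{urlencode(params)} HTTP/1.1\r\nHost: www.example.org\r\n\r\n"


def build_post(path, params):
    # POST: the same arguments travel in the message body instead.
    body = urlencode(params)
    return (
        f"POST {path} HTTP/1.1\r\n"
        "Host: www.example.org\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


params = {"q": "http", "page": "2"}
print(build_get("/search", params))
print(build_post("/search", params))
```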
HTTP - methods
• Methods (cont.)
• PUT
• Used to transfer a file from the client to the server
• HEAD
• requests only the response headers for a URL (no body is returned)
• used for conditional URL handling for performance enhancement schemes
• DELETE
• Deletes a specified resource
• OPTIONS
• HTTP methods that the server supports.
• CONNECT
• Converts the request connection to a transparent TCP/IP Tunnel usually for SSL.
• PATCH
• Is used to apply partial modifications to a resource.
• TRACE
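• Echoes the received request back to the client, so the client can see what changes intermediate servers made.
• TRACK and DEBUG are non-standard, vendor-specific methods that are not part of HTTP/1.1; they are commonly disabled because of security issues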
WebDAV

• Web Distributed Authoring and Versioning (WebDAV) is an extension of the Hypertext Transfer Protocol
(HTTP) that facilitates collaboration between users in editing and managing documents and files stored on
World Wide Web servers.
• PROPFIND — used to retrieve properties, stored as XML, from a web resource.
• PROPPATCH — used to change and delete multiple properties on a resource in a single atomic act
• MKCOL — used to create collections
• COPY — used to copy a resource from one URI to another
• MOVE — used to move a resource from one URI to another
• LOCK — used to put a lock on a resource. WebDAV supports both shared and exclusive locks.
• UNLOCK — used to remove a lock from a resource
• Status codes such as 102 Processing, 207 Multi-Status, and 424 Failed Dependency
Request & Response Formats

• Request Format
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
*(( general-header | request-header ) CRLF)
CRLF
[ message-body ]

Method = "OPTIONS" | "GET" | "HEAD" | "POST" | "PUT" | "DELETE" | "TRACE" | "CONNECT"

• Response Format
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
*(( general-header | response-header ) CRLF)
CRLF
[ message-body ]
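The two first-line formats above can be taken apart with plain string splitting. This is a minimal sketch with made-up function names; it does no validation beyond expecting the right number of fields.

```python
def parse_request_line(line):
    """Split 'Method SP Request-URI SP HTTP-Version' into its three parts."""
    method, request_uri, version = line.rstrip("\r\n").split(" ")
    return method, request_uri, version


def parse_status_line(line):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase'."""
    # The reason phrase may itself contain spaces, so split at most twice.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


print(parse_request_line("GET /index.html HTTP/1.1\r\n"))
print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```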
HTTP Headers

• HTTP header fields are components of the message header of HTTP requests and responses.
• The header fields are transmitted after the request or response line, the first line of a message.
Header fields are colon-separated name-value pairs in clear-text string format, terminated by a
carriage return (CR) and line feed (LF) character sequence. The end of the header fields is
indicated by an empty line, resulting in the transmission of two consecutive CR-LF pairs. Long
lines can be folded into multiple lines; continuation lines are indicated by the presence of a
space (SP) or horizontal tab (HT) as the first character on the next line. (Line folding was
later deprecated by RFC 7230.)
GET /dumprequest HTTP/1.1
Host: djce.org.uk
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:21.0) Gecko/20100101 Firefox/21.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Referer: http://search.conduit.com/Results.aspx
Connection: keep-alive
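A request head like the one above can be parsed in a few lines. This is only a sketch: the function name is made up, repeated header fields simply overwrite each other, and the folded Accept-Language line in the sample is added here to exercise the continuation rule.

```python
def parse_message_head(raw):
    """Return (first_line, headers) for the head of an HTTP message."""
    head, _, _body = raw.partition("\r\n\r\n")  # the blank line ends the headers
    lines = head.split("\r\n")
    first_line = lines[0]
    headers = {}
    last_name = None
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and last_name:
            # Continuation line: fold it into the previous field's value.
            headers[last_name] += " " + line.strip()
            continue
        name, _, value = line.partition(":")
        last_name = name.strip().lower()  # field names are case-insensitive
        headers[last_name] = value.strip()
    return first_line, headers


raw_request = (
    "GET /dumprequest HTTP/1.1\r\n"
    "Host: djce.org.uk\r\n"
    "Accept-Language: en-US,\r\n"
    " en;q=0.5\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)
first_line, headers = parse_message_head(raw_request)
print(first_line, headers)
```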
HTTP Request Packets

• Sent from client to server
• Consists of HTTP header
• header is hidden in browser environment
• contains:
• content type / mime type
• content length
• user agent - browser issuing request
• content types user agent can handle
• and a URL
HTTP Request Headers

• Follow the request line, which carries the HTTP method
• headers are terminated by a blank line
• Header Fields:
• Accept, Accept-Encoding, Accept-Language, Referer, Authorization, Charge-To, If-Modified-Since,
Pragma, etc.
• Byte serving is the process of sending only a portion of an HTTP/1.1 message from a server to
a client. Byte serving uses the Range HTTP request header and the Accept-Ranges and
Content-Range HTTP response headers.
• Chunked transfer encoding is a data transfer mechanism in version 1.1 of the Hypertext
Transfer Protocol (HTTP) in which data is sent in a series of "chunks". It uses the
Transfer-Encoding HTTP header in place of the Content-Length header, which the protocol would
otherwise require. Because the Content-Length header is not used, the sender does not need
to know the length of the content before it starts transmitting a response to the receiver.
Senders can begin transmitting dynamically-generated content before knowing the total size
of that content.
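Chunked transfer encoding is easy to see in miniature. In this sketch (function names are made up; chunk extensions and trailers are ignored), each chunk is a hexadecimal size line followed by the data, and a zero-size chunk ends the body.

```python
def encode_chunked(pieces):
    """Yield each piece as one chunk; the sender never needs the total length."""
    for piece in pieces:
        if piece:
            yield b"%x\r\n%s\r\n" % (len(piece), piece)
    yield b"0\r\n\r\n"  # a zero-length chunk marks the end of the body


def decode_chunked(data):
    """Reassemble a body sent with Transfer-Encoding: chunked."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The chunk size is hexadecimal; ignore any ";extension" after it.
        size = int(data[pos:line_end].split(b";")[0], 16)
        if size == 0:
            break
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
    return bytes(body)


wire = b"".join(encode_chunked([b"Wiki", b"pedia"]))
print(wire)
print(decode_chunked(wire))
```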
Accept:
• List of media types which will be accepted by the client
• <field> = Accept: <entry> *[,<entry>]
• <entry> = <content type> *[;<param>]
• <param> = <attr> = <float>
• <attr> = q / mxs / mxb
• <float> = <ANSI-C floating point>
• Accept: text/html
• Accept: audio/basic; q=1
• if no Accept header is present, text/plain is assumed (in HTTP/1.1, the client is instead assumed to accept all media types)
• may contain wildcards (*)
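To show how the q parameter ranks entries, this sketch parses an Accept value into (media type, q) pairs. The function name is made up, and parameters other than q are ignored.

```python
def parse_accept(value):
    """Return (media_type, q) pairs, most preferred first."""
    entries = []
    for entry in value.split(","):
        media_type, *params = [part.strip() for part in entry.split(";")]
        q = 1.0  # an entry without an explicit q counts as fully preferred
        for param in params:
            name, _, number = param.partition("=")
            if name.strip() == "q":
                q = float(number)
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their original order.
    return sorted(entries, key=lambda item: item[1], reverse=True)


ranked = parse_accept("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
print(ranked)
```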
Accept-Encoding

• Like Accept, but the list is a list of acceptable encoding schemes
• Ex.
• Accept-Encoding: x-compress, x-zip
User-Agent

• Software product used by original client


• <field> = User-Agent: <product>
• <product> = <word> [/<version>]
• <version> = <word>
• Ex.
• User-Agent: IBM WebExplorer DLL /v960311
Referer

• For the server’s benefit, the client lists the URL of the document (or document type) from
which the URL in the request was obtained.
• Allows the server to generate back-links, logging, tracing of bad links…
• Ex.
• Referer: http://www.w3.com/xxx.html
Pragma:

• Same format as Accept
• Directives for servers
• should be passed through by proxies, but may also be acted on by a proxy
• only pragma currently defined is no-cache; the proxy should get the document from
the owning server rather than from its cache
If-Modified-Since:
• Used with GET to make a conditional GET
• if the requested document has not been modified since the specified date, a
304 Not Modified response is sent back to the client instead of the document
• client can then display cached version
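The server side of a conditional GET can be sketched as below. The function name and the tuple it returns are assumptions for illustration; the date handling uses Python's email.utils, which reads and writes the HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(if_modified_since, last_modified, document):
    """Return (status, headers, body) for a GET with an optional If-Modified-Since."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            return 304, headers, b""  # Not Modified: no body is sent
    return 200, headers, document


last_modified = datetime(2021, 6, 9, 10, 18, 14, tzinfo=timezone.utc)
print(conditional_get("Wed, 09 Jun 2021 10:18:14 GMT", last_modified, b"<html>"))
print(conditional_get("Tue, 08 Jun 2021 10:18:14 GMT", last_modified, b"<html>"))
```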
HTTP Response Headers
• Response packets are Sent by server to client browser in
response to a Request Packet
• Status Header
• Entities
• Content-Encoding:
• Content-Length:
• Content-Type:
• Expires:
• Last-Modified:
• extension-header

• Body – content (usually html)


Status Header

• “HTTP/1.1 statuscode phrase”


• Codes:
• 1xx – Informational - Request received, continuing process
• 2xx – Successful - The action was successfully received, understood, and accepted.
• 3xx – Redirection - Further action must be taken in order to complete the request.
• 4xx – Client Error - The request contains bad syntax or cannot be fulfilled.
• 5xx – Server Error - The server failed to fulfill an apparently valid request
• The Internet Assigned Numbers Authority (IANA) maintains the official registry of HTTP
status codes.
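The class of a status code is simply its first digit. This sketch (the STATUS_CLASSES table and function name are made up here) pairs that rule with the reason phrases bundled in Python's http.HTTPStatus.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe_status(code):
    """Return (class name, reason phrase) for a numeric status code."""
    status_class = STATUS_CLASSES[code // 100]
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "(not known to this Python version)"
    return status_class, phrase


print(describe_status(404))
print(describe_status(503))
```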
Status Codes
• 100 Continue
• 200 OK
• 201 Created
• 202 Accepted
• 204 No Content
• 301 Moved Permanently
• 302 Moved Temporarily
• 304 Not Modified
• 400 Bad Request
• 401 Unauthorized
• 403 Forbidden
• 404 Not Found
• 417 Expectation Failed
• 500 Internal Server Error
• 501 Not Implemented
• 502 Bad Gateway
• 503 Service Unavailable
• 504 Gateway Timeout
HTTP Cookie

• HTTP cookie also known as web cookie or browser cookie


• Small piece of data.
• The term "cookie" was derived from "magic cookie", which is the packet of
data a program receives and sends again unchanged.
• Lou Montulli applied for patent in 1995
Types of Cookies
• Session cookie (also known as in-memory cookie or transient cookie): A temporary cookie
without an expiry date or validity interval; it is discarded when the browser session ends.
• Persistent cookie: Outlasts user sessions; it expires at a set date or after a maximum age (for example, 1 year).
• Secure cookie: Sent only over HTTPS, so the cookie is encrypted in transit.
• HttpOnly cookie: Used only when transmitting via HTTP/S; it is not exposed to client-side scripts.
• Third-party cookie: A cookie set with a domain different from the one shown in
the address bar.
• Supercookie: A cookie with an origin of a Top-Level Domain (TLD) or an
effective Top-Level Domain.
• Zombie cookie: A cookie that is automatically recreated after a user has deleted
it (for example, Evercookie by Samy Kamkar).
Structure of a Cookie

• A cookie is small (browsers typically limit each cookie to about 4 KB) and consists of six
parameters:
• Name of the cookie
• Value of the cookie
• The expiry of the cookie (using Greenwich Mean Time)
• The path the cookie is good for
• The domain the cookie is good for
• The need for a secure connection to use the cookie
• Only the first two parameters are required for the successful operation of the cookie.
Setting a Cookie & Its attributes
• Browser to Server
GET /index.html HTTP/1.1
Host: www.example.org

• Server to Browser
HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: name=value
Set-Cookie: name2=value2; Expires=Wed, 09 Jun 2021 10:18:14 GMT

• Browser to Server
GET /spec.html HTTP/1.1
Host: www.example.org
Cookie: name=value; name2=value2
Accept: */*

• Cookie attributes are the name–value pair, a cookie domain, a path,
expiration time or maximum age, Secure flag and HttpOnly flag.
Set-Cookie: LSID=DQAAAK…Eaem_vYg; Domain=docs.foo.com; Path=/accounts; Expires=Wed, 13 Jan 2021 22:23:01 GMT; Secure; HttpOnly
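The attributes of a Set-Cookie value can be read with Python's standard http.cookies module. In this sketch the cookie value abc123 is a made-up stand-in, since the value in the example above is elided; the attributes are copied from that example.

```python
from http.cookies import SimpleCookie

# "abc123" is a placeholder value invented for this example.
header_value = (
    "LSID=abc123; Domain=docs.foo.com; Path=/accounts; "
    "Expires=Wed, 13 Jan 2021 22:23:01 GMT; Secure; HttpOnly"
)

cookie = SimpleCookie()
cookie.load(header_value)
morsel = cookie["LSID"]

print(morsel.value)
print(morsel["domain"], morsel["path"], morsel["expires"])
print(bool(morsel["secure"]), bool(morsel["httponly"]))

# What the browser would send back: only the name=value pair.
cookie_header = "; ".join(f"{name}={item.value}" for name, item in cookie.items())
print("Cookie:", cookie_header)
```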
HTTP Cache (Web Cache)
• A web cache is a mechanism for the temporary storage (caching) of web documents, such
as HTML pages and images, to reduce bandwidth usage, server load, and perceived lag.
• It can be used in various systems (clients, servers, and networks), such as search engines,
network-aware forward caches, reverse caches, web proxy servers, and content delivery networks.
• HTTP defines three basic mechanisms for controlling caches: freshness, validation, and
invalidation
• Cache-Control
• no-store - never cache this message
• no-cache - may cache, but must revalidate before reuse
• public - may be cached by any cache
• private - intended for a single user
• max-age - sets the expiration (in seconds)
• must-revalidate - requires revalidation once the copy is stale
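The freshness mechanism can be sketched with a tiny Cache-Control parser. Function names are made up, and the decision rule is deliberately simplified: it ignores Expires, heuristic freshness, and the public/private distinction.

```python
def parse_cache_control(value):
    """Turn 'public, max-age=3600' into {'public': True, 'max-age': '3600'}."""
    directives = {}
    for item in value.split(","):
        name, _, arg = item.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or True
    return directives


def may_serve_from_cache(directives, age_seconds):
    """Rough check: can a stored copy be reused without revalidation?"""
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or max_age is True:
        return False  # no explicit lifetime; this sketch plays it safe
    return age_seconds < int(max_age)


directives = parse_cache_control("public, max-age=3600")
print(directives)
print(may_serve_from_cache(directives, 120))
```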
DNS
• The “Domain Name System”
• Created in 1983 by Paul Mockapetris (originally RFCs 882 and 883, later superseded by RFCs 1034
and 1035), then modified, updated, and enhanced by a myriad of subsequent RFCs
• What Internet users use to reference anything by name on the Internet
• The mechanism by which Internet software translates names to addresses and vice versa
• A lookup mechanism for translating objects into other objects
• A globally distributed, loosely coherent, scalable, reliable, dynamic database
• Comprised of three components
• A “name space”
• Servers making that name space available
• Resolvers (clients) which query the servers about the name space
Name Space
• The name space is the structure of the DNS database
• An inverted tree with the root node at the top
• Each node has a label
• The root node has a null label, written as “”

[Figure: the name space as an inverted tree. The root node "" is at the top, with top-level nodes below it, then second-level nodes, then third-level nodes.]

Domain Names
• A domain name is the sequence of labels from a node to the root, separated
by dots (“.”s), read left to right
• The name space has a maximum depth of 127 levels
• Domain names are limited to 255 characters in length
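• A node’s domain name identifies its position in the name space
[Figure: example name space. The root "" is at the top; top-level nodes are com, edu, gov, int, mil, net, org; second-level nodes are nominum, metainfo, berkeley, nwu, nato, army, uu; lower-level nodes are west, east, www, dakota, tornado.]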
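The rules for forming a domain name can be written down directly. In this sketch (function names are made up), the root's null label is what produces the trailing dot, and the two limits are the ones stated on the slide.

```python
MAX_DEPTH = 127   # maximum depth of the name space, per the slide
MAX_LENGTH = 255  # maximum length of a domain name, per the slide


def labels_to_name(labels):
    """Join labels from a node up to the root; the root's null label gives the trailing dot."""
    return ".".join(labels)


def within_limits(name):
    """Check a domain name against the depth and length limits."""
    labels = name.rstrip(".").split(".")
    return len(labels) <= MAX_DEPTH and len(name) <= MAX_LENGTH


name = labels_to_name(["www", "nominum", "com", ""])
print(name, within_limits(name))
```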
The Resolution Process
• Let’s look at the resolution process step-by-step, as the workstation annie.west.sprockets.com runs: ping www.nominum.com.
• The workstation annie asks its configured name server, dakota (dakota.west.sprockets.com), for www.nominum.com’s address: “What’s the IP address of www.nominum.com?”
• The name server dakota asks a root name server, m (m.root-servers.net), for www.nominum.com’s address
• The root server m refers dakota to the com name servers: “Here’s a list of the com name servers. Ask one of them.”
• This type of response is called a “referral”
• The name server dakota asks a com name server, f (f.gtld-servers.net), for www.nominum.com’s address
• The com name server f refers dakota to the nominum.com name servers: “Here’s a list of the nominum.com name servers. Ask one of them.”
• The name server dakota asks a nominum.com name server, ns1.sanjose (ns1.sanjose.nominum.net), for www.nominum.com’s address
• The nominum.com name server ns1.sanjose responds with www.nominum.com’s address: “Here’s the IP address for www.nominum.com”
• The name server dakota responds to annie with www.nominum.com’s address
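The chain of referrals can be imitated with a small in-memory table. This is a toy sketch, not a DNS client: the SERVERS table is invented for illustration, and the address 192.0.2.10 comes from the range reserved for documentation, not from any real lookup.

```python
# Toy data standing in for real name servers. The address below is from the
# documentation range 192.0.2.0/24; it is NOT www.nominum.com's real address.
SERVERS = {
    "m.root-servers.net": {"referrals": {"com": "f.gtld-servers.net"}},
    "f.gtld-servers.net": {"referrals": {"nominum.com": "ns1.sanjose.nominum.net"}},
    "ns1.sanjose.nominum.net": {"answers": {"www.nominum.com": "192.0.2.10"}},
}


def resolve(name, server="m.root-servers.net"):
    """Follow referrals from the root until some server answers."""
    trace = []
    while True:
        trace.append(server)
        data = SERVERS[server]
        if name in data.get("answers", {}):
            return data["answers"][name], trace
        # Pick the referral whose zone is a suffix of the queried name.
        for zone, next_server in data.get("referrals", {}).items():
            if name == zone or name.endswith("." + zone):
                server = next_server
                break
        else:
            raise LookupError(f"no referral for {name} at {server}")


address, trace = resolve("www.nominum.com")
print(address)
print(" -> ".join(trace))
```

The trace lists the servers that dakota would query in turn: the root server, the com server, and finally the nominum.com server.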
The Current TLDs
• Root: "."
• Generic TLDs (gTLDs): COM (Commercial Organizations), NET (Network Infrastructure), ORG (Other Organizations)
• Country Code TLDs (ccTLDs): AF (Afghanistan), AL (Albania), DZ (Algeria), ..., YU (Yugoslavia), ZM (Zambia), ZW (Zimbabwe)
• International TLDs (iTLDs): INT (International Treaty Organizations), ARPA (Transition Device)
• US Legacy TLDs (usTLDs): GOV (Governmental Organizations), MIL (Military Organizations), EDU (Educational Institutions)
Registries, Registrars, and Registrants
• Registrants: end users request an add/modify/delete
• Registrars: a registrar submits the add/modify/delete to the registry
• Registry: the registry updates the zone in the Registry Zone DB
• The master is updated, and then the slaves are updated
