
o User Agents : Capable of sending and receiving SIP requests.

1. End Devices: SIP phone, PC/laptop with SIP Client, PDA, mobile phone
2. PSTN Gateways are a type of User Agent

o SIP Proxy Servers : Forward or “proxy” requests on behalf of User Agents, route requests, and
consult databases (DNS, Location Server). There can be any number of proxies.

o Location Server : Database of locations of SIP User Agents. Queried by Proxies in routing.
Updated by User Agents by Registration

o DNS Server : SRV (Service) Records used to locate Inbound Proxy Servers

The roles of UAC and UAS, as well as Proxy and Redirect Servers, are defined on a
transaction-by-transaction basis.

For example, the User Agent initiating a call acts as a UAC when sending the initial INVITE
request and as a UAS when receiving a BYE request from the callee.

Similarly, the same software can act as a Proxy Server for one request and as a Redirect Server
for the next request.

Proxy, Location, and Registrar Servers are logical entities; implementations may combine them
into a single application.
o Intermediary entity that acts as both a server and a client for the purpose of making requests on
behalf of other clients

o Primarily plays the role of routing and is transparent to end devices

o Interprets and, if necessary, rewrites a request message before forwarding it

o Header fields that can be legitimately modified are:

 Request-URI

 Via

 Record-Route

 Route

 Max-Forwards

 Proxy-Authorization

Performs the routing function, i.e., determines to which hop (UA/proxy/redirect) signaling should be relayed

Header fields that can be legitimately modified by proxy servers are: Request-URI, Via, Record-Route,
Route, Max-Forwards, and Proxy-Authorization. These header fields are therefore not intact end-to-end.

The Request-URI indicates the user or service to which this request is being addressed (i.e. the
destination address)

The Via indicates the path taken by the request and identifies the location where the response is to be
sent (i.e. the source address)

Record-Route is inserted by a proxy to force future requests in the dialog to be routed through the proxy

Route is used to force routing for a request through the listed set of proxies

Max-Forwards limits the number of hops a request can transit on the way to its destination (i.e. maximum
number of hops allowed)

Proxy-Authorization allows the client to identify itself (or its user) to a proxy that requires authentication

When a proxy forwards a final response, it generates a CANCEL request for all pending client
transactions associated with this response context.

A proxy also generates a CANCEL request for all pending client transactions associated with this
response context when it receives a 6xx response. A pending client transaction is one that has
received a provisional response, but no final response (it is in the proceeding state) and has not
had an associated CANCEL generated for it.

A stateful proxy responds to a CANCEL, rather than simply forwarding a response it would
receive from a downstream element. For that reason, CANCEL is referred to as a "hop-by-hop"
request, since it is responded to at each stateful proxy hop.

When a response is received by an element, it first tries to locate a client transaction matching
the response and performs the following processing:

1. Find the appropriate response context

2. Update for provisional responses

3. Remove the topmost Via

4. Add the response to the response context

5. Check to see if this response should be forwarded immediately

6. When necessary, choose the best final response from the response context

7. Aggregate authorization header field values if necessary

8. Optionally rewrite Record-Route header field values

9. Forward the response

10. Generate any necessary CANCEL requests

The forking of SIP requests means that multiple dialogs can be established from a single request.

The Back-To-Back User Agent (B2BUA) is a SIP based logical entity that can receive and process INVITE
messages as a SIP User Agent Server (UAS). It also acts as a SIP User Agent Client (UAC) that determines
how the request should be answered and how to initiate outbound calls. Unlike a SIP proxy server, the
B2BUA maintains complete call state and participates in all call requests.

B2BUA does originate signaling.

B2BUA is Call-Stateful

In the case of message-oriented transports (such as UDP), if the message has a Content-Length header
field, the message body is assumed to contain that many bytes.

If there are additional bytes in the transport packet beyond the end of the body, they are discarded. If
the transport packet ends before the end of the message body, this is considered an error. If the
message is a response, it must be discarded. If the message is a request, the element should generate a 400
(Bad Request) response. If the message has no Content-Length header field, the message body is
assumed to end at the end of the transport packet.

In the case of stream-oriented transports such as TCP, the Content-Length header field indicates the
size of the body. The Content-Length header field must be used with stream-oriented transports.
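
The rules above for message-oriented transports can be sketched in a few lines. This is only an illustration: the function name and the returned action strings are assumptions made for this example, header line folding is ignored, and a malformed Content-Length value is not handled.

```python
def extract_body(datagram: bytes, is_request: bool):
    """Apply the Content-Length rules to one UDP datagram.

    Returns (body, action). The action strings are hypothetical labels:
    "ok", "discard" (truncated response) or "reply_400" (truncated request).
    """
    head, sep, rest = datagram.partition(b"\r\n\r\n")
    if not sep:
        rest = b""                      # no blank line, so no body at all
    length = None
    for line in head.split(b"\r\n")[1:]:            # skip the start line
        name, _, value = line.partition(b":")
        # "l" is the compact form of Content-Length
        if name.strip().lower() in (b"content-length", b"l"):
            length = int(value.strip())
    if length is None:
        return rest, "ok"               # body runs to the end of the packet
    if len(rest) < length:
        # packet ended before the body did: this is an error
        return None, "reply_400" if is_request else "discard"
    return rest[:length], "ok"          # bytes beyond the body are discarded
```

On a stream transport such as TCP the same header is what tells the receiver where one message ends and the next begins, which is why it cannot be omitted there.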

Error Handling

Error handling is independent of whether the message was a request or response.

If the transport user asks for a message to be sent over an unreliable transport, and the result is an ICMP
error, the behavior depends on the type of ICMP error. Host, network, port or protocol unreachable
errors, or parameter problem errors cause the transport layer to inform the transport user of a failure in
sending. Source quench and TTL exceeded ICMP errors are ignored.

If the transport user asks for a request to be sent over a reliable transport, and the result is a connection
failure, the transport layer informs the transport user of a failure in sending.

Client Transaction

The client transaction provides its functionality through the maintenance of a state machine.

There are two types of client transaction state machines, depending on the method of the request
passed by the TU. One handles client transactions for INVITE requests. This type of machine is referred
to as an INVITE client transaction. Another type handles client transactions for all requests except INVITE
and ACK. This is referred to as a non-INVITE client transaction. There is no client transaction for ACK. If
the TU wishes to send an ACK, it passes one directly to the transport layer for transmission.

Server Transaction

The server transaction is responsible for the delivery of requests to the TU and the reliable transmission
of responses. It accomplishes this through a state machine. As with the client transactions, the state
machine depends on whether the received request is an INVITE request.

1. Method: RFC 3261 defines six methods:

a. REGISTER

b. INVITE

c. ACK

d. CANCEL

e. BYE

f. OPTIONS

2. Request-URI: SIP or SIPS URI or a general URI (RFC 2396)

3. SIP-Version: Include the version of SIP in use, and follow [H3.1] (with HTTP replaced by SIP, and
HTTP/1.1 replaced by SIP/2.0)

a. Must include a SIP-Version of "SIP/2.0"

b. SIP-Version string is case-insensitive

c. Version number is treated as literal string

1. SIP-Version: Include the version of SIP in use and follow [H3.1] (with HTTP replaced by SIP, and
HTTP/1.1 replaced by SIP/2.0)

a. Must include a SIP-Version of "SIP/2.0"

b. SIP-Version string is case-insensitive

c. Version number is treated as literal string

2. Status-Code: 3-digit integer that indicates the outcome of an attempt to understand and satisfy
a request

a. First digit defines the class of response

b. Last two digits intended for use by automata

3. Reason-Phrase: Short textual description of the Status-Code, intended for the human user
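
The Request-Line and Status-Line fields listed above can be pulled apart with a small parser. The sketch below is an assumption-laden illustration (the function name and the dictionary keys are invented for this example), not a complete SIP parser.

```python
# The six methods defined by RFC 3261
METHODS = {"REGISTER", "INVITE", "ACK", "CANCEL", "BYE", "OPTIONS"}

def parse_start_line(line: str) -> dict:
    """Split a SIP start line into its three space-separated fields."""
    parts = line.split(" ", 2)
    if len(parts) != 3:
        raise ValueError("start line needs three space-separated parts")
    if parts[0].upper() == "SIP/2.0":
        # Status-Line: SIP-Version SP Status-Code SP Reason-Phrase
        version, code, reason = parts
        if len(code) != 3 or not code.isdigit():
            raise ValueError("Status-Code must be a 3-digit integer")
        return {"kind": "response", "version": version, "code": int(code),
                "class": int(code[0]),   # first digit defines the class
                "reason": reason}
    # Request-Line: Method SP Request-URI SP SIP-Version
    method, uri, version = parts
    if version.upper() != "SIP/2.0":     # version string is case-insensitive
        raise ValueError("unsupported SIP-Version")
    return {"kind": "request", "method": method, "uri": uri,
            "version": version, "rfc3261_method": method in METHODS}
```

For instance, `parse_start_line("SIP/2.0 180 Ringing")` yields a response of class 1, while `parse_start_line("BYE sip:bob@example.com SIP/2.0")` yields a request whose method is one of the six core methods.
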
A session invitation consists of one INVITE request, which is usually sent to a proxy. The proxy
immediately sends a 100 Trying response to stop retransmissions and forwards the request onward.

All provisional responses generated by callee are sent back to the caller. See 180 Ringing response in the
call flow. The response is generated when callee's phone starts ringing.

A 200 OK is generated once the callee picks up the phone and it is retransmitted by the callee's user
agent until it receives an ACK from the caller. The session is established at this point.

Session termination is accomplished by sending a BYE request within the dialog established by the INVITE. BYE
messages are sent directly from one user agent to the other unless a proxy on the path of the INVITE
request indicated that it wishes to stay on the path by using record routing.

The party wishing to tear down a session sends a BYE request to the other party involved in the session. The
other party sends a 200 OK response to confirm the BYE and the session is terminated.

ACK is handled separately because every 200 OK message must be delivered. A 200 OK not only
establishes a session; it can also be generated by multiple entities when a proxy server forks
the request, and all of these responses must be delivered to the calling user agent. Therefore user agents take
responsibility in this case and retransmit 200 OK responses until they receive an ACK.

We have seen in the previous slides what transactions are: one transaction includes the INVITE and its
responses, and another transaction includes the BYE and its responses when a session is being torn down. But
those two transactions should be somehow related; both of them belong to the same dialog.

We have seen that the CSeq header field is used to order messages; in fact it is used to order requests
within a dialog. The number must be monotonically increased for each new request sent within a dialog,
otherwise the peer will handle it as an out-of-order request or a retransmission. In fact, the CSeq number
identifies a transaction within a dialog, because a request and its associated responses are called a
transaction. This means that only one transaction in each direction can be active within a dialog. One
could also say that a dialog is a sequence of transactions.

Dialogs are also used to route the messages between user agents. For example,

Let's suppose that one user wants to talk to another. The caller knows the SIP
address of the callee, but this address doesn't say anything about the current location of
the user, i.e. the caller doesn't know to which host to send the request. Therefore the INVITE request
will be sent to a proxy server.

The request will be sent from proxy to proxy until it reaches one that knows the current location of the
callee. This process is called routing. Once the request reaches the callee, the callee's user agent will
create a response that will be sent back to the caller. The callee's user agent will also put a Contact header
field into the response, which will contain the current location of the user. The original request also
contained a Contact header field, which means that both user agents know the current location of the
peer.

Because the user agents know the location of each other, it is not necessary to send further requests to any
proxy; they can be sent directly from user agent to user agent. That's exactly how dialogs facilitate
routing.
Further messages within a dialog are sent directly from user agent to user agent. This is a significant
performance improvement because proxies do not see all the messages within a dialog, they are used to
route just the first request that establishes the dialog. The direct messages are also delivered with much
smaller latency because a typical proxy usually implements complex routing logic.

A dialog created by INVITE is terminated using BYE (or CANCEL, while the INVITE is still pending). Similarly,
a dialog created by SUBSCRIBE is terminated when the subscription terminates, through NOTIFY or SUBSCRIBE itself.

Two users can exchange SDP documents via email, or even snail mail, to set up a session.

A session is established if there is a proper exchange of SDP between two parties and this exchange
results in media being exchanged between the parties.

SIP Events describes a method for setting up a SIP dialog. In this case, the dialog is used for the context
of sending NOTIFY messages to the subscribing endpoint. As such, there is no session associated with a
dialog established by a SUBSCRIBE request.
A property of this selection requirement is that a UA will place a different tag into the From header of an
INVITE than it would place into the To header of the response to the same INVITE. This is needed in
order for a UA to invite itself to a session, a common case for "hairpinning" of calls in PSTN gateways.
Similarly, two INVITEs for different calls will have different From tags, and two responses for different
calls will have different To tags.

When a server transaction is constructed for a request, it enters the "Proceeding" state. The server
transaction must generate a 100 (Trying) response unless it knows that the TU will generate a provisional or
final response within 200 ms, in which case it may generate a 100 (Trying) response.

If, while in the "Proceeding" state, the TU passes a 2xx response to the server transaction, the server
transaction passes this response to the transport layer for transmission. It is not retransmitted by the
server transaction; retransmissions of 2xx responses are handled by the TU. The server transaction then
transitions to the "Terminated" state.

While in the "Proceeding" state, if the TU passes a response with status code from 300 to 699 to the
server transaction, the response is passed to the transport layer for transmission, and the state machine
enters the "Completed" state.

If an ACK is received while the server transaction is in the "Completed" state, the server transaction
transitions to the "Confirmed" state.
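
The INVITE server transaction transitions can be sketched as a small state machine. The class and method names are illustrative assumptions, the state names follow RFC 3261 (Proceeding, Completed, Confirmed, Terminated), and timers and retransmissions are left out entirely.

```python
class InviteServerTransaction:
    """Simplified INVITE server transaction (no timers, no retransmissions)."""

    def __init__(self):
        self.state = "Proceeding"   # entered when the transaction is created
        self.sent = []              # status codes handed to the transport layer

    def tu_response(self, status_code: int) -> None:
        """The TU passes a response down to the transaction."""
        if self.state != "Proceeding":
            raise RuntimeError(f"cannot send {status_code} in state {self.state}")
        self.sent.append(status_code)
        if 200 <= status_code <= 299:
            self.state = "Terminated"   # 2xx retransmissions are the TU's job
        elif 300 <= status_code <= 699:
            self.state = "Completed"    # now waiting for the ACK
        # 1xx responses leave the transaction in "Proceeding"

    def ack_received(self) -> None:
        """An ACK only matters after a 300-699 final response."""
        if self.state == "Completed":
            self.state = "Confirmed"
```

A transaction that sends 180 and then 486 ends up in "Completed" and moves to "Confirmed" once the ACK arrives; one that sends 200 goes straight to "Terminated".
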
The initial state, "Calling", is entered when the TU initiates a new client transaction with an INVITE
request. If an unreliable transport is being used, the client transaction starts timer A with a value of T1. If
a reliable transport is being used, the client transaction should not start timer A (timer A controls
request retransmissions). For any transport, the client transaction starts timer B with a value of 64*T1
seconds (timer B controls transaction timeouts).

If the client transaction is still in the "Calling" state when timer B fires, the client transaction informs the
TU that a timeout has occurred. The client transaction must not generate an ACK. The value of 64*T1 is
equal to the amount of time required to send seven (7) requests in the case of an unreliable transport.
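
The seven-request figure can be checked with a short calculation. The sketch assumes the RFC 3261 default of T1 = 500 ms and the RFC 3261 rule that timer A doubles every time it fires; neither value is stated in these notes, and the function name is made up for illustration.

```python
def invite_send_times(t1: float = 0.5):
    """Times (in seconds) at which an INVITE is sent over an unreliable
    transport before timer B (64*T1) fires."""
    timer_b = 64 * t1
    times = [0.0]            # the initial transmission
    interval = t1            # timer A starts with a value of T1
    t = interval
    while t < timer_b:
        times.append(t)      # timer A fired: retransmit the request
        interval *= 2        # timer A doubles after every retransmission
        t += interval
    return times
```

With T1 = 0.5 s the request goes out at 0, 0.5, 1.5, 3.5, 7.5, 15.5 and 31.5 seconds, which is seven transmissions before timer B fires at 32 seconds.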

If the client transaction receives a provisional response while in the "Calling" state, it transitions to the
"Proceeding" state. Any further provisional responses are passed up to the TU while in the "Proceeding"
state.

When in either the "Calling" or "Proceeding" states, reception of a response with status code from
300-699 causes the client transaction to transition to "Completed".

When in either the "Calling" or "Proceeding" states, reception of a 2xx response causes the client
transaction to enter the "Terminated" state.
A user's location-specific address

Callees bind to this address using SIP REGISTER method

Callers use this address to establish real-time communication with callees

Location-independent addresses

E-mail or Web-based addresses

Leverage off Domain Name Service (DNS)

The recipient of the request receives a set of Record-Route header fields in the message. It must mirror
all the Record-Route header fields into responses because the originator of the request also needs to
know the set of proxies.
The left message flow in the figure shows how a BYE (a request within the dialog established by INVITE) is sent directly
to the other user agent when there is no Record-Route header field in the message. The right message flow
shows how the situation changes when the proxy puts a Record-Route header field into the message.

A route set is an ordered collection of SIP or SIPS URIs which represents the list of proxies that must be
traversed when sending a particular request.
Strict routing implies that the entire set of SIP nodes which may be visited is listed, in order of visitation,
in the Route header. No other nodes may be visited by this message, and all the listed nodes must be
visited in the given order or the message has "failed".
Loose routing implies that the indicated nodes must be visited before the message can be delivered to
the target indicated in the original request URI. The message may visit other nodes before, between or
after any node specified on the loose route.
For strict routing, replace the Request-URI with the topmost Route Header and place the Request-URI at
the bottom of the Route header list.
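
The difference between the two routing styles shows up when a user agent builds a request from its route set. The following sketch follows the RFC 3261 rule that a route URI carrying the `lr` parameter marks a loose router; the function names and the example URIs are assumptions for illustration, and URIs are taken as bare strings without angle brackets.

```python
def is_loose_router(uri: str) -> bool:
    """True when the URI carries the 'lr' parameter."""
    params = uri.split("?", 1)[0].split(";")[1:]
    return any(p.lower() == "lr" or p.lower().startswith("lr=") for p in params)

def build_request_target(remote_target: str, route_set: list):
    """Return (request_uri, route_headers) for a request within a dialog."""
    if not route_set:
        return remote_target, []
    if is_loose_router(route_set[0]):
        # Loose routing: the Request-URI keeps the final target and the
        # Route header lists the proxies that must be visited.
        return remote_target, list(route_set)
    # Strict routing: the first route entry becomes the Request-URI and
    # the real target moves to the bottom of the Route list.
    return route_set[0], list(route_set[1:]) + [remote_target]
```

With a route set of `["sip:p1.example.com;lr", "sip:p2.example.com;lr"]` the Request-URI stays the callee's contact; if the first entry lacks `lr`, that entry becomes the Request-URI and the contact is appended as the last Route value.
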
All the requests generated by the SIP proxy in case of parallel forking will have the same Call-ID, From, To,
and CSeq but a different branch parameter, which the proxy uses to distinguish where each message
originated.

Now, how does the proxy know this was a spiral, and not a loop? Using the branch ID. The branch ID is
supposed to contain a hash of the R-URI. So, when the request arrives again at the proxy, it finds its
previous Via entry (because of the host name), and it matches. Then, it computes the hash of the R-URI
in the incoming request, and compares it to the hash in the branch ID. If they are not the same, it's a
spiral. If they're the same, it's a loop.

To detect loops, proxies insert branch parameters that consist of two parts: the first part is the
normal, globally unique branch value, and the second part is used to detect loops
and spirals.

Loop detection is performed by verifying that, when a request returns to a proxy, the fields (including
any Route, Proxy-Require and Proxy-Authorization header fields) having an impact on the processing of
the request, including the incoming Request-URI and any header fields affecting the request's admission
or routing have not changed. The value placed in this part of the branch parameter reflects all of those
fields. This is to ensure that if the request is routed back to the proxy and one of those fields changes, it
is treated as a spiral and not a loop. A common way to create this value is to compute a cryptographic
hash of the To tag, From tag, Call-ID header field, the Request-URI of the request received (before
translation), the topmost Via header, and the sequence number from the CSeq header field, in addition
to any Proxy-Require and Proxy-Authorization header fields that may be present.

The request method is not included in the calculation of the branch parameter because, in the case of
CANCEL and ACK for non-2xx responses, the branch parameter is the same as that of the original request.
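
A two-part branch value of this kind might be built as sketched below. Several things here are assumptions made for the example: SHA-256 as the hash, a dot as the separator between the unique part and the loop-detection part, and all function and parameter names. Only the `z9hG4bK` prefix comes from RFC 3261.

```python
import hashlib
import secrets

MAGIC_COOKIE = "z9hG4bK"   # prefix RFC 3261 requires on branch values

def loop_hash(to_tag, from_tag, call_id, request_uri, top_via, cseq_number,
              proxy_require="", proxy_authorization=""):
    """Hash the fields that affect routing; the method is deliberately left out."""
    fields = [to_tag, from_tag, call_id, request_uri, top_via,
              str(cseq_number), proxy_require, proxy_authorization]
    digest = hashlib.sha256("\x00".join(fields).encode("utf-8")).hexdigest()
    return digest[:16]

def make_branch(**fields):
    """Unique part + '.' + loop-detection part (layout is an assumption)."""
    unique = secrets.token_hex(8)
    return f"{MAGIC_COOKIE}{unique}.{loop_hash(**fields)}"

def is_loop(own_branch, **fields):
    """Compare the stored hash with one computed from the returning request.

    Equal hashes mean nothing relevant changed (loop); different hashes
    mean the request was retargeted on the way back (spiral).
    """
    return own_branch.rsplit(".", 1)[-1] == loop_hash(**fields)
```

When the proxy finds its own Via entry in an arriving request, it would call `is_loop` with the branch from that Via entry and the fields of the arriving request.
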
1. Session description

v= (protocol version)

o= (owner/creator and session identifier).

s= (session name)

i=* (session information)

u=* (URI of description)

e=* (email address)

p=* (phone number)

c=* (connection information - not required if included in all media)

b=* (bandwidth information)

One or more time descriptions (see below)

z=* (time zone adjustments)

k=* (encryption key)

a=* (zero or more session attribute lines)

Zero or more media descriptions (see below)

2. Time description

t= (time the session is active)

r=* (zero or more repeat times)

3. Media description

m= (media name and transport address)

i=* (media title)

c=* (connection information - optional if included at session-level)

b=* (bandwidth information)

k=* (encryption key)

a=* (zero or more media attribute lines)

The m line in the answer (m=video 0 RTP/AVP 31) indicates that the offer (m=video 51372 RTP/AVP 31)
is rejected (or not accepted). The remaining m lines are accepted by the answerer.
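
Spotting rejected streams in an answer comes down to looking for a zero port on each m line. A minimal sketch (the function name is illustrative, and the audio line used in the usage note is a made-up value, not taken from the original offer and answer):

```python
def rejected_media(answer_sdp: str):
    """Return the media types whose m= line carries port 0 in an answer."""
    rejected = []
    for line in answer_sdp.splitlines():
        if line.startswith("m="):
            media, port = line[2:].split()[:2]
            if port.split("/")[0] == "0":    # port may be written as port/count
                rejected.append(media)
    return rejected
```

Given an answer containing `m=audio 49920 RTP/AVP 0` and `m=video 0 RTP/AVP 31`, the function reports only the video stream as rejected.
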
HTTP Basic Authentication

HTTP basic authentication requires the transmission of a username and a matching password embedded
in the header of an HTTP request. Included in a SIP request, this user information could be used by a SIP
proxy server or destination user agent to authenticate a SIP client or the previous SIP hop in a proxy
chain. Because the clear text password can be easily sniffed and therefore poses a serious security risk,
the use of HTTP basic authentication has been deprecated by SIP 2.0.

Pretty Good Privacy (PGP)

Pretty Good Privacy could be potentially used to authenticate and optionally encrypt MIME payloads
contained in SIP messages but SIP 2.0 has deprecated the use of PGP in favor of S/MIME.
Cryptography is an important element of any strategy to address data transmission security
requirements. It is the practical art of converting messages or data into a different form, such that no-
one can read them without having access to the 'key'. The message may be converted using a 'code' (in
which case each character or group of characters is substituted by an alternative one), or a 'cypher' or
'cipher' (in which case the message as a whole is converted, rather than individual characters).

Cryptography comprises two distinct classes: symmetric and asymmetric

• Symmetric cryptography

• Involves a single, secret key, which both the message-sender and the message-
recipient must have

• Used by the sender to encrypt the message, and by the recipient to decrypt it

• Provides a means of satisfying the requirement of message transmission
security, because the content cannot be read without the secret key

• Can also be used to address the integrity and authentication requirements

• Sender creates a summary of the message, or 'message authentication
code' (MAC), encrypts it with the secret key, and sends that with the
message

• Recipient then re-creates the MAC, decrypts the MAC that was sent,
and compares the two

• If they are identical, then the message that was received must have
been identical with that which was sent.

• Major difficulty with symmetric schemes is that the secret key has to be
possessed by both parties, and hence has to be transmitted from whomever
creates it to the other party. But if the key is compromised, all of the data
transmission security measures are undermined. The steps taken to provide a
secure mechanism for creating and passing on the secret key are referred to as
'key management'

• Asymmetric cryptography (e.g. RSA: Rivest, Shamir and Adleman; DSS: Digital Signature Standard)

• Involves two related keys, referred to as a 'key-pair', one of which only the
owner knows (the 'private key') and the other which anyone can know (the
'public key')

• Only one party needs to know the private key

• Knowledge of the public key by a third party does not compromise the security
of data transmissions

• A 1024-bit asymmetric key length is regarded as being necessary to provide security
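
The MAC steps listed under symmetric cryptography can be tried out with Python's standard library. Note the substitution: this sketch uses HMAC-SHA256, a keyed hash, rather than encrypting a message summary as described above; the integrity check it provides is the same in spirit. The function names are illustrative.

```python
import hashlib
import hmac

def make_mac(key: bytes, message: bytes) -> bytes:
    """Sender side: compute a MAC over the message with the shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, received_mac: bytes) -> bool:
    """Recipient side: re-create the MAC and compare it with the one received."""
    return hmac.compare_digest(make_mac(key, message), received_mac)
```

If a single byte of the message changes in transit, or the recipient holds a different key, `verify` returns False.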


• AES Cipher suite for TLS protocol

• Uses RSA Key Exchange Algorithm

• Use the AES (Advanced Encryption Standard) 128 bit keys in cipher block chaining (CBC)

• Use SHA-1 (Secure Hash Standard)

The flow below shows the edited SSLDump output of one host forming a TLS connection
to another. Note that the client proposed three protocol suites including the required
TLS_RSA_WITH_AES_128_CBC_SHA. The certificate returned by the server contains a Subject Alternative
Name.

Reference : Rescorla, E.K., "SSL and TLS - Designing and Building Secure Systems", 2001.
Full Cone

A computer behind a NAT, sending and receiving on port 8000, is mapped to an external
IP:port on the NAT. Anyone on the Internet can send packets to that IP:port and those packets
will be passed on to the client machine listening on port 8000.

Restricted Cone

In the case where the client sends out a packet to external computer 1, the NAT maps the client’s internal IP:port to an external IP:port, and External 1 can send back packets to that destination.
However, the NAT will block packets coming from External 2, until the client sends out a packet to
External 2’s IP address. Once that is done, both External 1 and External 2 can send packets back to the
client, and they will both have the same mapping through the NAT.

Port Restricted Cone

If the client sends to External 1 on port 10101, the NAT will only allow through packets to the client that
come from External 1’s IP address and port 10101. Again, if the client has sent out packets to multiple IP:port pairs, they
can all respond to the client, and all of them will respond to the same mapped IP:port on the NAT.

Symmetric

If the client sends from one internal IP:port to Computer B, it may be mapped to one external IP:port,
whereas if the client sends from the same port to a different IP (Computer A), it is mapped differently.
Computer B can only respond to its mapping and Computer A can only
respond to its mapping. If either one tries to send to the other’s mapped IP:port, those packets will be
dropped.
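
The four behaviours can be compared side by side with a toy model. Everything in it is an assumption made for illustration: the class name, the kind labels, the starting port number, the example addresses in the usage note, and the simplification that there is a single client using a single internal port.

```python
class Nat:
    """Toy NAT for one internal client socket."""

    def __init__(self, kind: str):
        self.kind = kind          # "full", "restricted", "port_restricted", "symmetric"
        self.contacted = set()    # (ip, port) destinations the client has sent to
        self.mappings = {}        # mapping key -> external port
        self.next_port = 40000    # arbitrary starting point for external ports

    def outbound(self, dst):
        """Client sends to dst=(ip, port); return the external port used."""
        self.contacted.add(dst)
        # Cone NATs reuse one mapping; a symmetric NAT maps per destination.
        key = dst if self.kind == "symmetric" else None
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.next_port += 1
        return self.mappings[key]

    def inbound_allowed(self, src, ext_port) -> bool:
        """Would a packet from src=(ip, port) to ext_port reach the client?"""
        if self.kind == "symmetric":
            return self.mappings.get(src) == ext_port
        if ext_port != self.mappings.get(None):
            return False
        if self.kind == "full":
            return True
        if self.kind == "restricted":
            return src[0] in {ip for ip, _ in self.contacted}
        return src in self.contacted       # port restricted
```

After a client behind a restricted cone NAT sends to 198.51.100.1, any port on that host may reply, but 198.51.100.2 is blocked; behind a symmetric NAT each destination sees a different external port and can only reply to its own.
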
SIP Signaling Issues

1. SIP Proxy does not communicate back to SIP client on NAT’ed channel

2. Pinhole in Firewall/NAT will timeout on inactivity. Typically less than 1 minute.

a. If this occurs, client can’t receive incoming calls

Media Traversal Issues

1. IP address & port sent in SIP INVITE/200 OK (SDP) is Private, and not globally routable.

2. Media must be initiated in Private -> Public direction

3. RTCP (port +1) fails through Firewall with NAPT function

4. Pinhole in Firewall/NAT will time out on inactivity. Typically less than 1 minute.

A TCP connection is always initiated with the 3-way handshake, which establishes and negotiates the
actual connection over which data will be sent. The whole session is begun with a SYN packet, then a
SYN/ACK packet and finally an ACK packet to acknowledge the whole session establishment. At this
point the connection is established and able to start sending data.
The above solution (NAT probe or STUN server) will only work for the first 3 types of NAT. The 4th case –
symmetric NATs – will not allow this scheme since they have different mappings depending on the
target IP address. So the mapping that the NAT assigns between the client and the NAT probe is
different than that assigned between the client and the gateway. In the case of a symmetric NAT, the
client must send out RTP to, and receive RTP back from the same IP address. Any RTP connection
between an endpoint outside a NAT and one inside a NAT must be established point-to-point, and so
(even if a SIP connection has already been established) the endpoint outside the NAT must wait until it
receives a packet from the client before it can know where to reply. This is known as Connection
Oriented Media.
1. UA sends an INVITE to the NAT Proxy through the NAT

2. The NAT Proxy contacts the RTP Relay and requests it to set up a session.

3. The RTP Relay assigns an available pair of ports to this Call. It responds to the NAT Proxy with
downstream available port in RTP Relay. The NAT Proxy uses this to modify the SDP information
of the received INVITE request.

4. The NAT Proxy forwards the SIP INVITE request with modified SDP (reflecting the RTP Relay’s
IP:port) on to the Voice Gateway.

5. The Gateway replies (in the 200 OK) with its own SDP information including the port to receive
RTP packets.

6. The NAT Proxy contacts the RTP Relay to supply the IP:port of the gateway (if the gateway was
also behind a symmetric NAT, then the NAT Proxy would instruct the Relay to wait for packets
from the Gateway before setting the IP:port to forward RTP on to the Gateway).

7. The Relay responds to the NAT Proxy with the upstream available RTP Port.

8. The NAT Proxy forwards the response upstream back to the UA after modifying the response
SDP with the IP:port of the RTP Relay.

9. UA begins sending RTP to the IP:port it received in the 200 OK – to the RTP Relay.

10. RTP Relay notes the IP:port that it received the packet from (for the first packet), and passes on
the packet to the IP:port of the gateway.

11. RTP packets proceed from the gateway to the RTP Relay.

12. The RTP Relay forwards those packets to the client (according to IP:port that it saved when it
received the first RTP packet from the client).

When BYE is received by the NAT Proxy, it forwards this information over to the RTP Relay which tears
down the session.
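
The relay's part in steps 9 to 12 is to learn the client's public address from the first packet it receives and then shuttle packets both ways. The sketch below models only that bookkeeping, with no sockets; the class and method names and the example addresses in the usage note are assumptions made for illustration.

```python
class RtpRelaySession:
    """Forwarding decisions for one relayed call (no real sockets)."""

    def __init__(self, gateway_addr):
        self.gateway_addr = gateway_addr   # known from the gateway's SDP
        self.client_addr = None            # unknown until the client sends

    def on_packet(self, src, payload):
        """Return (destination, payload), or None if the packet is dropped."""
        if src == self.gateway_addr:
            if self.client_addr is None:
                return None                # nowhere to forward yet
            return self.client_addr, payload
        if self.client_addr is None:
            self.client_addr = src         # save the NAT's public IP:port
        if src != self.client_addr:
            return None                    # not part of this session
        return self.gateway_addr, payload
```

A gateway packet arriving before the client has sent anything is dropped because the relay has no destination for it; once the client's first packet has been seen, traffic flows in both directions.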

The following considerations should be noted:

1. The client will always need to send and receive RTP on the same port.

2. This solution will work for all types of NATs, but because of the delay associated with the RTP
Relay (which may be substantial, especially if the RTP Relay is not close to at least one of the
endpoints), it should probably not be used unless a Symmetric NAT is involved. In other NAT
scenarios, modification of the SDP will be sufficient.

3. The client will not hear any voice until the first packet is sent to the RTP Relay. That could cause
problems when receiving a 183 message as part of the call setup, since the gateway at that point
opens a one-way media stream and passes back network announcements over that stream. If
the client has not yet sent its first RTP packet, the RTP relay does not yet know its public IP:port.

4. This is just one way of implementing an RTP Relay. There are other possibilities, including
schemes that do not insert themselves into the SIP flow.
ICE Solution

1. On deciding to initiate a SIP voice session the VOIP client starts a local STUN and TURN client to
obtain a mapping.

2. The client now constructs a SIP INVITE message. The INVITE request will use the addresses it has
obtained in the previous STUN/TURN interactions to populate the SDP of the SIP INVITE.


o=test 2890844526 2890842807 IN IP4

c=IN IP4

t=0 0

m=audio 5601 RTP/AVP 0

a=candidate:H83jksd 1.0 rtp_uname_frag_1 rtp_pass_1 5601 rtcp_uname_frag_1 rtcp_pass_1 5611

a=candidate:Hye73hd 0.8 rtp_uname_frag_2 rtp_pass_2 5608 rtcp_uname_frag_2 rtcp_pass_2 5618

a=candidate:H82hjjh 0.5 rtp_uname_frag_3 rtp_pass_3 5600

1. The SDP has been constructed to include all the available addresses that have been assembled.
• The first 'candidate' address contains the two STUN derived addresses for both RTP and
RTCP traffic. This entry has been given the highest priority (1.0) by the client and also
inserted as the default address.

• The second 'candidate' address contains the two TURN derived addresses for both RTP
and RTCP traffic. This entry has been given the second highest priority (0.8).

• The third and final 'candidate' address contains a local interface address that has not
been derived externally. This entry has been given the lowest priority (0.5).

2. The SIP signaling then traverses the NAT and sets up the SIP session. On advertising a candidate
address, the client should have a local STUN server running on each advertised candidate
address. This is for the purpose of responding to incoming connectivity checks.

3. The remote destination will also carry out similar STUN connectivity checks which then allows
media to be streamed to the client behind the NAT using the advertised connections. Two way
audio is now possible between the two clients.
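
The receiving side has to order the advertised candidates before it starts its connectivity checks. The sketch below reads only the identifier and priority from candidate lines written in the syntax of the example above, which is an early ICE draft syntax and differs from the candidate attribute later standardised for ICE. The function name is illustrative.

```python
def candidates_by_priority(sdp: str):
    """Return (candidate_id, priority) pairs, highest priority first."""
    found = []
    prefix = "a=candidate:"
    for line in sdp.splitlines():
        if line.startswith(prefix):
            # Only the first two tokens are interpreted; the rest is opaque here.
            cand_id, priority, *_rest = line[len(prefix):].split()
            found.append((cand_id, float(priority)))
    found.sort(key=lambda c: c[1], reverse=True)
    return found
```

For the three candidate lines in the example SDP this gives H83jksd (1.0) first, then Hye73hd (0.8), then H82hjjh (0.5), matching the STUN, TURN, local ordering described above.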

Handles authentication, authorization and accounting requests for a particular realm

DIAMETER protocol

Used to provide AAA framework for applications

SIP server

To exchange message between the clients

To add AAA related information in the SIP messages

Implements DIAMETER client

Interact with the network AAA mechanisms

DIAMETER Subscriber Locator (SL)

Serves for the purposes of locating the DIAMETER server that contains the user related data
1. SIP User Agent Client (UAC) sends a SIP REGISTER request to SIP server 1, which will receive the
SIP request. We assume that this SIP server may be located, e.g., at the edge of the
administrative home domain.

2. The Diameter client in SIP server 1 will contact its Diameter server by sending a Diameter User-
Authorization-Request (UAR) message to determine if this user is allowed to receive service, and
if so, request the address of a local SIP server capable of handling this user.

3. The Diameter server will answer with a Diameter User-Authorization-Answer (UAA) message
which will indicate either a list of capabilities that SIP server 1 may use to select an appropriate
SIP server (SIP server 2) and/or a SIP or SIPS URI pointing to SIP server 2.

4. SIP server 1 will forward the SIP REGISTER request to an appropriate SIP server (SIP server 2).

5. The Diameter client in SIP server 2 will then request user authentication from the Diameter
server by sending a Diameter Multimedia-Auth-Request (MAR) message.

6. The Diameter server will respond with a Diameter Multimedia-Auth-Answer (MAA) message
with Result-Code AVP set to the value DIAMETER_MULTI_ROUND_AUTH.

7. The Diameter server will also include a challenge, which SIP server 2 will map into the
WWW-Authenticate header of the SIP 401 (Unauthorized) response, which is sent back to SIP
server 1

8. And then to the SIP UAC.

9. SIP server 1 will receive the next SIP REGISTER request, containing the user credentials.

10. The Diameter client in SIP server 1 will contact a Diameter server by sending a Diameter UAR
message to determine the SIP server allocated to the user.

11. The Diameter server will send the SIP or SIPS URI of SIP server 2 in a Diameter UAA message.

12. SIP server 1 will then forward the SIP REGISTER request to SIP server 2.

13. SIP server 2 will extract the credentials from the SIP REGISTER request. The Diameter client in
SIP server 2 will send those credentials in a Diameter MAR message to the Diameter server.

14. At this point, the Diameter server will be able to authenticate the user, and upon success, will
return a Diameter MAA message with the AVP Result-Code set to the value

15. SIP server 2 will then generate a SIP 200 (OK) response which is forwarded to SIP server 1

16. And eventually to the SIP UAC
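
The two REGISTER rounds above can be sketched as a small simulation. This is a toy model, not a Diameter implementation: the class and function names, the example user and URI, and the plain-string credentials are all hypothetical. The success result code name `DIAMETER_SUCCESS` is the generic success code of the Diameter base protocol and is used here as an assumption, since the steps above do not name the value.

```python
from typing import Dict, List, Optional, Tuple

MULTI_ROUND = "DIAMETER_MULTI_ROUND_AUTH"
SUCCESS = "DIAMETER_SUCCESS"  # assumption: not named in the steps above


class DiameterServer:
    """Toy Diameter server holding user data for one realm."""

    def __init__(self, users: Dict[str, str], assigned_server: str) -> None:
        self.users = users                      # user -> expected credentials
        self.assigned_server = assigned_server  # URI of SIP server 2

    def uar(self, user: str) -> Optional[str]:
        """UAR/UAA: return the URI of the SIP server for an allowed user."""
        return self.assigned_server if user in self.users else None

    def mar(self, user: str, credentials: Optional[str]) -> Tuple[str, Optional[str]]:
        """MAR/MAA: return (result code, challenge or None)."""
        if credentials is not None and self.users.get(user) == credentials:
            return SUCCESS, None
        # No or wrong credentials: ask for another round with a challenge.
        return MULTI_ROUND, "nonce-for-" + user


def register(diameter: DiameterServer, user: str,
             credentials: Optional[str], trace: List[str]) -> int:
    """One REGISTER round: UAC -> SIP server 1 -> SIP server 2 and back."""
    trace.append("SIP REGISTER -> server 1")
    trace.append("Diameter UAR")
    server2 = diameter.uar(user)
    trace.append("Diameter UAA")
    if server2 is None:
        # Refusal handling is outside the flow described above.
        raise PermissionError(user + " is not allowed to receive service")
    trace.append("SIP REGISTER -> " + server2)
    trace.append("Diameter MAR")
    code, _challenge = diameter.mar(user, credentials)
    trace.append("Diameter MAA " + code)
    status = 200 if code == SUCCESS else 401
    trace.append(f"SIP {status} -> server 1 -> UAC")
    return status


diameter = DiameterServer({"alice": "secret"}, "sip:server2.example.com")
trace: List[str] = []
first = register(diameter, "alice", None, trace)       # steps 1 to 8
second = register(diameter, "alice", "secret", trace)  # steps 9 to 16
print(first, second)
print("\n".join(trace))
```

In a real deployment the credentials would be a digest response rather than a plain string; the simulation only shows the order of the UAR/UAA and MAR/MAA exchanges.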

1. A SIP User Agent sends a SIP request to its outbound SIP proxy server. In this case, the message
is a SIP INVITE request, but it could be any other SIP request. We assume that this SIP request
does not contain any credentials at this time. The outbound SIP proxy server needs to
authenticate and authorize the proxy services offered to the user.

2. The Diameter client in the SIP server sends a Multimedia-Auth-Request (MAR) message to the
Diameter server.

3. The Diameter server sends a Multimedia-Auth-Answer (MAA) message that includes all the data
necessary for the SIP server to challenge the user, typically with HTTP Digest Authentication
indicated in the MAA message.

4. The SIP server uses this data to create a SIP 407 (Proxy Authentication Required) response
that contains a challenge.

5. The SIP UA will create a new INVITE request that contains the credentials.

6. The Diameter client in the SIP server will send the credentials to the Diameter server in a new
Diameter MAR message.

7. The Diameter server will validate the credentials and authorize the SIP transaction in a Diameter
MAA message.

8. The SIP server forwards the SIP INVITE request to its destination as per regular SIP procedures.

9. Eventually, the session setup will be confirmed with a SIP 200 (OK) response

10. That is forwarded to the SIP UA. The session setup is complete.
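
Steps 4 to 7 rely on HTTP Digest: the UA answers the challenge with a hash computed from its credentials and the nonce it received. A minimal sketch of that computation with the MD5 algorithm, following the formula in RFC 2617, might look like this. The function name and the example user, realm, password, nonce and URI are made up for illustration.

```python
import hashlib
from typing import Optional


def _md5_hex(text: str) -> str:
    """Hex-encoded MD5 of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str,
                    nc: str = "00000001", cnonce: str = "",
                    qop: Optional[str] = None) -> str:
    """Compute the Digest 'response' value for a challenge."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")   # secret part
    ha2 = _md5_hex(f"{method}:{uri}")                  # request part
    if qop:
        # With quality of protection, the nonce count and client nonce are mixed in.
        return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical values for the second INVITE sent by the SIP UA.
print(digest_response("alice", "example.com", "secret",
                      "INVITE", "sip:bob@example.com", nonce="abc123"))
```

The party validating the credentials repeats the same computation with the stored password and compares the result with the value sent by the UA.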

Edge routers

Implement all mechanisms needed to perform admission control decisions and policing functions

COPS protocol

Used to make QoS reservation requests to the QoS access points

SIP server

To exchange messages between the clients

To add QoS related information in the SIP messages

To negotiate QoS parameters among them

Interact with the network QoS mechanisms


Enhanced SIP

1. The call setup starts with a standard SIP INVITE message sent by the caller to the local Q-SIP
server (caller-side Q-SIP server). The message carries the callee URI in the SIP header and the
session specification in the SDP body (media, codecs, source ports, etc.).

2. The Q-SIP server decides whether a QoS session has to be started or not. The Q-SIP server
extracts the required information from the message and inserts the additional Q-SIP header and
the Record-Route header information (to ensure that all the messages for this session will pass
through itself) into the INVITE message. The Q-SIP server then forwards the INVITE message
towards the invited callee.

3. When the Q-SIP server on the callee side (callee-side Q-SIP server) receives an INVITE message
that contains the SIP QoS extensions, it understands that a session with QoS has to be set up.
It therefore extracts the needed information from the message, removes the Q-SIP extension
and inserts a Record-Route header.

4. When the callee responds with a 200 OK message, it is passed back to the last Q-SIP server, that
is, the Q-SIP server that controls the access network of the callee.

5. At this point the Q-SIP server on the callee side has all the information needed to request a
specific QoS reservation from the edge router (ER) on the callee access network for the
callee-to-caller traffic flow.

6. When the callee-side Q-SIP server receives a positive response to the QoS reservation request,
it stores this QoS information, completing the QoS state, and sends the extension information
for the callee side within the 200 OK message toward the caller.

7. When the caller-side Q-SIP server receives the 200 OK message with the complete QoS session
indicators, it completes the QoS session setup by sending the QoS request to the ER on the
caller access network for the caller-to-callee traffic flow.

8. If the response is positive, the QoS state is completed, and the 200 OK is forwarded to the caller.

The fundamental difference from the unidirectional QoS reservation mode is that now there is only one
interaction with the QoS provider. In this case, when the caller-side Q-SIP server receives a 200 OK
response message for a QoS call, it starts a "bidirectional" QoS reservation with the local QoS provider.
The callee-side Q-SIP server still participates in Q-SIP signaling but does not talk to a QoS provider.
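
The difference between the two reservation modes can be summarized in a short sketch that lists which Q-SIP server contacts its QoS provider, and for which traffic flow. The function name, the mode strings and the tuple layout are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (Q-SIP server making the request, traffic flow being reserved)
Reservation = Tuple[str, str]


def plan_reservations(mode: str) -> List[Reservation]:
    """Return the QoS reservation requests issued during call setup."""
    if mode == "unidirectional":
        # The callee side reserves first, then the caller side (steps 5 and 7).
        return [
            ("callee-side", "callee-to-caller"),
            ("caller-side", "caller-to-callee"),
        ]
    if mode == "bidirectional":
        # Only the caller-side server talks to its local QoS provider.
        return [("caller-side", "bidirectional")]
    raise ValueError(f"unknown reservation mode: {mode!r}")


for mode in ("unidirectional", "bidirectional"):
    print(mode, plan_reservations(mode))
```

In the bidirectional mode the single entry reflects the one interaction with the QoS provider, while the callee-side server appears in no reservation at all.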