SIP
• SIP is a signaling protocol used to create, modify, and terminate a multimedia session over
the Internet Protocol. A session is nothing but a simple call between two endpoints. An
endpoint can be a smartphone, a laptop, or any device that can receive and send multimedia
content over the Internet.
• Session=Signaling + Media
• It is defined in RFC 3261
• Basic call flows are illustrated in RFC 3665; service call flows are illustrated in RFC 5359
• SIP version 2.0
• It is an application layer protocol.
• Default port of SIP is 5060
SIP - Network Elements
• User Agent- UAC & UAS(Caller & callee)
• Proxy Server- Stateless & Stateful
• Registrar Server
• Redirect Server
• Location Server
User Agent
It is the endpoint of a SIP network. It can initiate, modify, or terminate a session. It could be a
softphone, a mobile phone, or a laptop.
User agents are of two types −
User Agent Client (UAC) − The entity that sends a request and receives a response.
User Agent Server (UAS) − The entity that receives a request and sends a response.
Proxy Server
It is the network element that takes a request from a user agent and forwards it to another user.
There are two types of proxy servers −
Stateless Proxy Server − It simply forwards the message received. This type of server does
not store any information about a call.
Stateful Proxy Server − This type of proxy server keeps track of every request and response
received and can use that state later if required. It can retransmit a request.
Registrar Server
The registrar server accepts registration requests from user agents.
Location Server
The location server provides information about a UA's possible locations to the redirect and proxy
servers.
Redirect Server
The redirect server receives requests and looks up the intended recipient of the request in the
location database created by the registrar.
The redirect server uses the database for getting location information and responds with 3xx
(Redirect response) to the user.
What is DNS?
The Domain Name System (DNS) is the phonebook of the Internet. Humans access
information online through domain names, like xyz@[Link]. UAs
interact through Internet Protocol (IP) addresses. DNS translates domain names
to IP addresses.
Each device connected to the Internet has a unique IP address which other
machines use to find the device. DNS servers eliminate the need for humans to
memorize IP addresses such as [Link]
How does DNS work?
The process of DNS resolution involves converting a hostname (FQDN) (such as
[Link]) into a computer-friendly IP address (such as [Link]). An IP
address is given to each device on the Internet, and that address is necessary to find
the appropriate Internet device.
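The lookup step described above can be sketched in Python with the standard resolver. This is only an illustration: the helper name resolve is made up for these notes, and the call uses a literal loopback address so that it runs without network access.

```python
import socket

def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses the system resolver reports."""
    # getaddrinfo returns tuples of (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

# A numeric address resolves to itself; an FQDN would go through DNS instead.
print(resolve("127.0.0.1"))
```

Passing an FQDN instead of the literal address makes the same call go out to the configured DNS servers.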
FQDN
An FQDN (fully-qualified domain name) is a complete domain name consisting of the
hostname and domain name that can be assigned to an IP address.
Rules/Important points:
1) For every request we can have multiple provisional responses.
2) For every request there will be only one final response.
3) INVITE is the only request that uses a three-way handshake (INVITE, final response, ACK).
4) CANCEL always travels in the same direction as the INVITE it cancels.
5) RTP stands for Real-time Transport Protocol; it carries the actual media between two UAs.
6) There are three ways to hold a call:
i) Media port 0
ii) Attribute line a=sendonly in the request and a=recvonly in the response
iii) Set the IP as [Link] in the connection information line in SDP
7) A user agent can handle only one INVITE at a time.
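The three hold signals from rule 6 can be checked with a few lines of Python. This is a rough sketch, not a full SDP parser: the function name is_on_hold is hypothetical, and the null connection address 0.0.0.0 is an assumption taken from the older RFC 2543 hold convention, since the notes do not spell the address out.

```python
def is_on_hold(sdp):
    """Return True if the SDP text uses any of the three hold signals."""
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...; port 0 disables the stream
            if line.split()[1] == "0":
                return True
        elif line in ("a=sendonly", "a=recvonly"):
            # sendonly appears in the hold request, recvonly in its answer
            return True
        elif line.startswith("c=") and line.split()[-1] == "0.0.0.0":
            # assumed null connection address (older hold style)
            return True
    return False

active = "v=0\nc=IN IP4 192.0.2.10\nm=audio 49170 RTP/AVP 0\n"
print(is_on_hold(active))                  # active call
print(is_on_hold(active + "a=sendonly\n")) # held call
```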
SIP – Messaging Flow chart
CORE REQUESTS:
INVITE: INVITE is used to initiate a session with another user agent.
BYE: BYE is the method used to terminate an established session. This is a SIP request that
can be sent by either the caller or the callee to end a session.
REGISTER: REGISTER request performs the registration of a user agent. This request is sent
by a user agent to a registrar server.
CANCEL: CANCEL is used to terminate a session which is not established.
ACK: ACK is used to acknowledge the final responses to an INVITE method. An ACK always
goes in the direction of INVITE.
OPTIONS: OPTIONS method is used to query a user agent or a proxy server about its
capabilities without ringing.
Difference between CANCEL and BYE:
CANCEL terminates a session that is not yet established.
CANCEL can only be sent by the UAC (the UA that sent the INVITE).
BYE terminates an established session.
BYE can be sent by either of the two parties.
Note: We cannot use CANCEL in place of BYE or vice versa.
Difference between IPv4 and IPv6
IPv4 uses a 32-bit IP address whereas IPv6 uses a 128-bit IP address.
IPv4 addresses are written in decimal (numeric) whereas IPv6 addresses are written in
hexadecimal (alphanumeric).
IPv4 groups (octets) are separated by a dot (.) whereas IPv6 groups are
separated by a colon (:).
IPv4 offers 12 header fields whereas IPv6 offers 8 header fields.
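The address-size difference can be confirmed with Python's standard ipaddress module. The helper name describe and the two sample addresses (taken from the documentation ranges) are illustrative only.

```python
import ipaddress

def describe(address):
    """Return (IP version, address length in bits) for a textual address."""
    ip = ipaddress.ip_address(address)
    return ip.version, ip.max_prefixlen

print(describe("192.0.2.1"))    # dotted decimal IPv4 address
print(describe("2001:db8::1"))  # colon-separated hexadecimal IPv6 address
```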
Extension Requests:
PRACK:
It stands for provisional acknowledgment.
It is used to make any provisional response reliable, except 100 Trying.
To identify each reliable provisional response uniquely we use the RSeq header.
To match each PRACK request to the response it acknowledges we use the RAck header.
The RAck header combines the RSeq number with the CSeq number and method of that response.
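As a small illustration of how the RAck value ties a PRACK to one provisional response, the sketch below parses an RAck value and compares it with the RSeq and CSeq of a response. The function names and the sample numbers are hypothetical.

```python
def parse_rack(value):
    """Split an RAck value such as '1 314159 INVITE' into its three parts."""
    rseq, cseq_number, method = value.split()
    return int(rseq), int(cseq_number), method

def prack_matches(rack_value, response_rseq, response_cseq):
    """response_cseq is a (number, method) tuple taken from the response's CSeq."""
    return parse_rack(rack_value) == (response_rseq, *response_cseq)

# A PRACK acknowledging the response with RSeq 1 and CSeq "314159 INVITE"
print(prack_matches("1 314159 INVITE", 1, (314159, "INVITE")))
```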
REFER: The REFER method indicates that the recipient should contact a third party using the
contact information provided in the request.
A UA refers another UA to access the URI of the dialog.
A REFER request has the Refer-To and Referred-By headers and may carry a Replaces header parameter.
The response to REFER is 202 Accepted.
NOTIFY: It is used to notify the user of events. It is always sent at the start
and termination of a REFER or a subscription.
The response to NOTIFY is 200 OK.
UPDATE: It is used for changing some of the media parameters before and after call
establishment.
SUBSCRIBE
The SUBSCRIBE method is used to request current state and state updates from a remote
node.
SUBSCRIBE requests SHOULD contain an "Expires" header. This expires value indicates
the duration of the subscription.
The response to SUBSCRIBE is 200 OK.
Note: We use the SUBSCRIBE method with Expires: 0 to unsubscribe the user.
PUBLISH:
PUBLISH was introduced to save bandwidth and processing time. Instead of sending
NOTIFY to all endpoints, you send one single PUBLISH to the presence
server, and the server notifies all your subscribers about your new status.
INFO: It is used to pass any information to another UA with which it has established a
media session.
The SIP INFO Method is designed to transmit application level control information such as
DTMF tones along the SIP signaling path.
SIP responses
A SIP response is a message generated by a user agent server (UAS) to reply to a request generated by
a UAC.
1xx: Provisional/Informational Responses
Informational responses are used to indicate that the call is in progress.
Example: 100 Trying, 180 Ringing etc.
2xx: Success Final Responses
This class of responses indicates that a request has been accepted.
Example: 200 OK, 202 Accepted etc.
3xx: Redirect Responses
This class of responses is sent by redirect servers in response to INVITE.
Example: 301 Moved Permanently, 305 Use Proxy etc.
4xx: Client Failure Responses
Client error responses indicate that the request cannot be fulfilled because of an error on
the UAC side. Example: 400 Bad Request, 403 Forbidden, 404 Not Found etc.
5xx: Server Failure Responses
This class of responses indicates that the request cannot be processed because of an error on
the server (UAS side).
Example: 500 Internal Server Error, 503 Service Unavailable, 504 Server Time-out etc.
6xx: Global Failure Responses
This response class indicates that the server knows that the request will fail wherever it is tried. As a
result, the request should not be sent to other locations.
Example: 603 Decline, 606 Not Acceptable, 608 Rejected etc.
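The six response classes follow directly from the first digit of the status code, which a short Python sketch makes explicit. The names RESPONSE_CLASSES, classify and is_final are invented for this example.

```python
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Failure",
    5: "Server Failure",
    6: "Global Failure",
}

def classify(status_code):
    """Map a SIP status code to its response class by its first digit."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]

def is_final(status_code):
    """Every response from 200 upwards is final; 1xx responses are provisional."""
    return status_code >= 200

print(classify(180), is_final(180))
print(classify(486), is_final(486))
```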
Mandatory Headers in SIP-
• To
• From
• Via
• CSeq
• Max-Forwards
• Call-ID
To: Logical address of the destination.
From: Logical address of the source. The “From” header field indicates the
identity of the initiator of the request.
Via: It records the sender's own address, where the response has to be received.
Every proxy in the request path adds to the top of the “Via” list the address and
port on which it received the message, then forwards it onwards. When
processing responses, each proxy in the return path processes the
contents of the “Via” field in reverse order, removing its address from the
top.
Branch ID: Branch IDs help proxies to match responses to forked
requests. Without branch IDs, a proxy server would not be able to
tell which forked request a response belongs to. The branch ID is a
parameter of the Via header. It always starts with [Link] This seven-character
combination of digits and letters is called the magic cookie.
Received parameter:
“received” is a standard parameter in the Via header, which contains the
actual source address from which the packet was received.
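The push-and-pop behaviour of the Via header can be sketched as a simple stack. This is an illustration under stated assumptions: the host names are made up, the function names are hypothetical, and the branch prefix z9hG4bK is the magic cookie that RFC 3261 requires at the start of every branch value.

```python
import secrets

MAGIC_COOKIE = "z9hG4bK"  # branch prefix required by RFC 3261

def new_branch():
    """Create a branch value: magic cookie plus a random token."""
    return MAGIC_COOKIE + secrets.token_hex(8)

def forward_request(via_stack, own_address):
    """A proxy pushes its own Via entry on top before forwarding a request."""
    return [f"SIP/2.0/UDP {own_address};branch={new_branch()}"] + via_stack

def forward_response(via_stack, own_address):
    """A proxy pops its own top Via entry; the next entry is where the response goes."""
    sent_by = via_stack[0].split()[1].split(";")[0]
    if sent_by != own_address:
        raise ValueError("top Via does not belong to this proxy")
    return via_stack[1:]

stack = ["SIP/2.0/UDP ua.example.com:5060;branch=" + new_branch()]
stack = forward_request(stack, "p1.example.com:5060")
stack = forward_request(stack, "p2.example.com:5060")
print(len(stack))  # the UA entry plus one entry per proxy
```

On the way back, forward_response is called by p2 first and then by p1, leaving only the UA's own entry.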
Max-Forwards:
This is a mandatory header used to detect loops in the network.
The Max-Forwards header limits the number of hops a request can make on the way
to its destination.
It consists of an integer that is decremented by one at each hop. If the Max-Forwards
value reaches 0 before the request reaches its destination, the request is
rejected with a 483 (Too Many Hops) error response.
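The hop-counting rule can be simulated in a few lines. The function name traverse and its return shape are assumptions made for this sketch; each loop iteration stands for one proxy on the path.

```python
def traverse(max_forwards, proxies):
    """Simulate a request crossing a number of proxies.

    Returns ("delivered", remaining_value) or ("rejected", 483).
    """
    for _ in range(proxies):
        if max_forwards == 0:
            # A proxy that receives Max-Forwards: 0 must not forward the request.
            return ("rejected", 483)
        max_forwards -= 1  # decremented by one at each hop
    return ("delivered", max_forwards)

print(traverse(70, 3))  # a typical initial value crossing three proxies
print(traverse(2, 3))   # runs out of hops at the third proxy
```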
Call-ID
It is used to uniquely identify a call between two user agents.
A Call-ID is always created by a user agent and is never modified by a server.
CSeq: The CSeq header field, also known as command sequence, is a
required header field in every request. It contains a sequence number and the
request method. The number usually increases by 1 for each new request, with the
exception of CANCEL and ACK requests, which use the CSeq number of the INVITE request
to which they refer.
Contact header: The Contact header contains the address at which the sender wants to
receive subsequent requests.
The “Contact” header field provides a single SIP URI that can be used to contact
the sender of the INVITE for subsequent requests.
The Contact header is mandatory only in an INVITE request.
Optional headers:
Allow: The Allow header lists the set of methods supported by the user agent
generating the message. Snapshot is as below for reference.
Supported Header: The Supported header field is used to list one or more optional tags
implemented by UA. Snapshot is as below for reference.
Content-Length
Indicates the size of the message body in bytes (octets).
SIP messages that do not contain a message body set Content-Length to
0.
Content-Type
Content-Type indicates the media type of the message body sent to the recipient.
Examples include application/sdp.
Error-Info Header:
The Error-Info header provides a pointer to additional information about the error
status response.
This contains the address of announcement server so that the UA can get the audio
of error response from proxy.
This is usually used in call management and is shared by proxy to UAC.
Transaction:
• A request together with its responses, up to and including the final response, is called a
transaction.
• If the final response to an INVITE is successful (2xx), the ACK
is a different transaction.
• If the final response to an INVITE is anything other than a
successful final response, the ACK is part of the same transaction.
• A transaction can be identified by the branch ID and CSeq.
Dialog:
• Dialog is a peer-to-peer relationship between two user agents.
• A confirmed dialog is established only by a successful final response to the INVITE.
• Dialog can be identified by the To tag, From tag, and Call-ID.
• TAGS: There is the local tag (also known as From tag) which is assigned
by the sender of a message or the UAC. There is also the remote tag
(also known as To tag) which is assigned by the final recipient of the
message or the UAS (User Agent Server). The UAC puts its tag in the
From header and the UAS puts its tag in the To header. So, when a
message leaves a UAC it has one tag in the From header and there is no
tag in the To header. When a UAS receives that message and responds
with a SIP response (e.g. 180 Ringing), it adds a tag to the To
header.
• Types of Dialog:
• There are two types of dialog.
• 1) Early Dialog
• 2) Confirmed Dialog
• Early dialog is the state where there is a chance of call establishment but
it is not confirmed, whereas confirmed dialog is the state where the
call is certain to be established.
• Early dialog is set when we get a provisional response to the INVITE.
• Confirmed dialog is set when we get a successful final response to the
INVITE.
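The way the Call-ID and the two tags combine into a dialog identifier can be sketched as follows. The helper names and the sample header values are hypothetical; the point is that no dialog exists until the UAS has added its To tag.

```python
import re

def tag_of(header_value):
    """Extract the tag parameter from a From or To header value, if present."""
    match = re.search(r";tag=([^;>\s]+)", header_value)
    return match.group(1) if match else None

def dialog_id(call_id, from_header, to_header):
    """Return (Call-ID, From tag, To tag), or None while a tag is still missing."""
    from_tag, to_tag = tag_of(from_header), tag_of(to_header)
    if from_tag is None or to_tag is None:
        return None
    return (call_id, from_tag, to_tag)

from_h = "Alice <sip:alice@example.com>;tag=1928301774"
# Initial INVITE: the To header has no tag yet, so there is no dialog.
print(dialog_id("a84b4c76e66710", from_h, "Bob <sip:bob@example.com>"))
# Response from the UAS: the To tag is now present.
print(dialog_id("a84b4c76e66710", from_h, "Bob <sip:bob@example.com>;tag=a6c85cf"))
```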
Early Media:
Early media is the ability of two SIP User Agents to communicate before a SIP
call is actually established. Before the call is set up, the gateway might provide
in-band tones or announcements that inform the caller of the call progress.
Session: Signaling and RTP media together make up a session in SIP.
Notes:
The user-busy call flow has zero dialogs because it has no successful final response.
It has 1 transaction.
Notes:
Cancel call flow has one transaction.
The CSeq of a CANCEL carries the same number as the INVITE it cancels, for example "1 CANCEL".
CANCEL has no effect after the 200 OK.
CANCEL can be sent only by the UA that sent the INVITE.
Basic call flow:
In a basic call flow, we have 3 transactions,1 dialog and 1 session.
CODECS:
It stands for coder decoder.
Codecs are compression technologies and have two components, an
encoder to compress the files, and a decoder to decompress.
A codec is a device or computer program which encodes or decodes a data
stream or signal.
It encodes analog audio as digital signals and decodes digital back into
analog.
Audio Codec                    Bit Rate   MOS
G.711A (PCMA) / G.711U (PCMU)  64 kbps    4.4
G.722                          64 kbps    3.9
G.729                          8 kbps     3.9
G.726                          16 kbps    4.3
G.728                          16 kbps    3.6
Video codecs
MPEG-4
H.264
Note:If the codec negotiation fails:
MOS:
It stands for Mean Opinion Score.
The MOS value is mainly used to assess voice quality.
VoIP calls are rated on a MOS scale from 1 to 5.
Value  Quality    Meaning
5      Excellent  No effort is required to understand the speech.
4      Good       With attentive listening, speech can be understood without effort.
3      Fair       The speech can be understood with a slight effort.
2      Poor       It takes a lot of concentration and effort to understand the transmitted speech.
1      Bad        Despite great effort, one cannot communicate.
Session Description Protocol
SDP stands for Session Description Protocol. It is used to describe multimedia
sessions in a format understood by the participants over a network.
SDP was originally defined in RFC 2327, which was obsoleted by RFC 4566 and later by RFC 8866.
SDP version is “0”.
Purpose of SDP
The purpose of SDP is to convey information about media streams in
multimedia sessions to help participants join or gather info of a particular
session.
Fields in SDP:
Session description (* denotes optional)
v = (protocol version)
o = (owner/creator and session identifier)
o=<username> <session-id> <version> <network-type> <address-type> <address>
s = (session name)
i =* (session information)
u =* (URI of description)
e =* (email address)
p =* (phone number)
c =* (connection information - not required if included in all media)
c=<network-type> <address-type> <connection-address>
t = (time the session is active; "t=0 0" means a permanent, unbounded session)
b =* (bandwidth information)
z =* (time zone adjustments)
k =* (encryption key)
m = (media name and transport address), for example:
m=audio 49430 RTP/AVP 0 6 8 99
a =* (zero or more session attribute lines)
Example:
v=0
o=alice 2890844526 2890844526 IN IP4 [Link]
s=
c=IN IP4 [Link]
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
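Because every SDP line has the form type=value, a description like the example above can be read with a very small parser. The sketch below is illustrative only: parse_sdp and the key "streams" are invented names, and the sample uses addresses from the 192.0.2.0/24 documentation range rather than the ones in the example.

```python
def parse_sdp(text):
    """Parse SDP text into session-level fields plus a list of media streams."""
    session = {"streams": []}
    current = session  # fields before the first m= line belong to the session
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition("=")
        if key == "m":
            media, port, proto, *formats = value.split()
            current = {"media": media, "port": int(port), "proto": proto,
                       "formats": formats, "attributes": []}
            session["streams"].append(current)
        elif key == "a":
            current.setdefault("attributes", []).append(value)
        else:
            current[key] = value
    return session

sample = "\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
])
parsed = parse_sdp(sample)
print(parsed["streams"][0]["port"], parsed["streams"][0]["attributes"])
```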
Q value in SIP:
The q value is a floating-point number in the range 0 to 1.0, specified as a parameter in
the Contact header field. The higher the q value, the higher the priority of that
device. Contacts with q value 1.0 have maximum priority.
This is mainly used in serial forking.
Format:
Contact:<sip:johdoe@[Link];transport=udp>;expires=3600;q=0.9
Note: q-value is set by the client during the Registration process.
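For serial forking, a proxy can simply try the registered contacts in descending q order. The sketch below shows that ordering; the function names and contact addresses are hypothetical, and treating a contact without a q parameter as 1.0 is an assumption made only for this example.

```python
import re

def q_value(contact):
    """Read the q parameter of a Contact value; assume 1.0 when it is absent."""
    match = re.search(r";q=([01](?:\.\d+)?)", contact)
    return float(match.group(1)) if match else 1.0

def serial_fork_order(contacts):
    """Order contacts from highest to lowest priority."""
    return sorted(contacts, key=q_value, reverse=True)

contacts = [
    "<sip:joe@192.0.2.1>;q=0.5",
    "<sip:joe@192.0.2.2>;expires=3600;q=0.9",
    "<sip:joe@192.0.2.3>;q=0.1",
]
for contact in serial_fork_order(contacts):
    print(contact)
```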
ROUTE AND RECORD ROUTE:
Record-Route: The Record-Route header is inserted into requests by proxies that
want to be in the path of subsequent requests for the same Call-ID (dialog).
SIP record routing is a mechanism by which a proxy can inform user agents that it
wants to stay on the path of all future messages.
The recorded route is then used by the user agent to route subsequent requests.
Route: It lists the addresses through which a request has to be sent.
Parallel forking
DTMF
Dual-tone multi-frequency (DTMF) refers to the tones generated by a telephone
when the keys are pressed. These tones are transmitted over the voice channel.
DTMF is used to control automated equipment and signal user intent, such as the number
they wish to dial.
DTMF is of two types:
In-band and out-of-band DTMF
When talking about VoIP, in-band DTMF means that the tones are sent in the audio
stream in the traditional way, whereas out-of-band DTMF is sent as specially
formatted data packets.
Globally Routable User Agent URI (GRUU)
Several applications of the Session Initiation Protocol (SIP) require a user agent (UA) to
construct and distribute a URI that can be used by anyone on the Internet to route a call to
that specific UA instance. A URI that routes to a specific UA instance is called a Globally
Routable UA URI (GRUU).
SIP Architecture:
Syntax and Encoding
The lowest layer of SIP is its syntax and encoding. Its encoding is specified using an
augmented Backus-Naur Form grammar (BNF).
Transport Layer
The second level is the transport layer. It defines how a client sends requests and
receives responses and how a Server receives requests and sends responses over
the network.
Transaction layer
When a request gets a final response in SIP, the exchange is called a transaction.
The transaction layer makes sure that every transaction is completed at every hop and
at every stage within a session.
Transaction user
Each of the SIP entities, except the Stateless proxies, is a transaction user.
Retransmission and Re-invite:
When a request is sent again for the same purpose because the initial request did
not receive a response, it is called retransmission of the request.
When an INVITE is sent within the same session for a different purpose, such as call hold,
it is called a re-INVITE.
The CSeq value of a re-INVITE changes with each request (if sent from the same UA).
The CSeq value of a retransmitted request remains the same.
PAYLOAD:
In computer networking, the payload is the actual data to be transmitted,
as opposed to the headers that accompany it.
SIP URI: (Uniform Resource Identifier)
A SIP URI is the SIP addressing scheme used to call another person via SIP. It is,
essentially, a user’s SIP “phone number,” and it is in a format similar to
email. A SIP URI communicates whom to call via SIP.
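The email-like layout of a SIP URI (scheme, user, host, optional port and parameters) can be taken apart with plain string operations. This is a simplified sketch: parse_sip_uri is a made-up helper, the addresses are examples, IPv6 host literals and URI headers are not handled, and the default ports 5060 (sip) and 5061 (sips) are filled in when no port is given.

```python
DEFAULT_PORTS = {"sip": 5060, "sips": 5061}

def parse_sip_uri(uri):
    """Split a SIP URI into scheme, user, host, port and parameters."""
    scheme, _, rest = uri.partition(":")
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a SIP URI: {uri}")
    rest, *params = rest.split(";")
    user, _, hostport = rest.rpartition("@")
    host, _, port = hostport.partition(":")
    return {
        "scheme": scheme,
        "user": user or None,
        "host": host,
        "port": int(port) if port else DEFAULT_PORTS[scheme],
        # each parameter is name=value; a bare name gets an empty value
        "params": dict(p.partition("=")[::2] for p in params),
    }

print(parse_sip_uri("sip:alice@example.com;transport=udp"))
```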
Question: Why is the ACK considered part of the transaction in the case of an
unsuccessful final response, and a different transaction in the case of a successful final
response?
Answer: In the case of a successful final response, the purpose of sending the INVITE
has been achieved, so the transaction user sends the ACK as a new transaction.
In the case of an unsuccessful final response, the purpose of sending the INVITE has not
been achieved, so the transaction user does not send the ACK. Instead, the transaction layer
sends the ACK to complete the transaction. This is why the ACK is considered part of the
transaction in the case of an unsuccessful final response and a different transaction in
the case of a successful final response.
Automata and Byeless parameter:
The <automata> element indicates whether the service represents an
automaton (such as a voicemail server, conference server, or
recording device) or a human.
The byeless parameter indicates that the machine is not capable of sending a BYE to
terminate the session.
SIP Rendering Parameter
It provides a positive indication whether the User Agent setting the parameter is
currently rendering any of the media it is receiving in the context of a
specific session.
It MUST only be used in a Contact header field in a dialog created using the INVITE
request.
This parameter has three legal values: "yes", "no", and "unknown".
The value "yes" indicates positive knowledge that the User Agent is
rendering at least one of the streams of media that it is receiving.
The value "no" indicates positive knowledge that the User Agent is
rendering none of the media that it is receiving. The value
"unknown" indicates that the User Agent does not know whether the
media associated with the session is being rendered
Race condition:
A race condition occurs when a UAC (User Agent Client) sends a CANCEL in the
early state while the UAS (User Agent Server) is sending a 200 OK to the
initial INVITE from the other side.
The message that is received first is processed first and the other is ignored.
STRICT ROUTING AND LOOSE ROUTING:
STRICT ROUTING: The Request-URI always contains the URI of the next
hop.
The next hop can be another SIP router or the destination user agent.
Strict routing is a legacy approach.
The problem with strict routing is in the process of specifying the entire proxy
set in the initial request before starting the SIP dialog. The processing throws
away the information contained in the received R-URI. The behaviour of UAs
with outbound-proxy was problematic. The whole system would fail if there
was a failure in one of the elements.
LOOSE ROUTING
The Request-URI always contains the URI of the destination.
SIP TIMERS
Timer    Default value      Meaning
T1       500 ms             Round-trip time (RTT) estimate
T2       4 sec.             Maximum retransmission interval for non-INVITE requests and INVITE responses
T4       5 sec.             Maximum duration that a message can remain in the network
Timer A  initially T1       INVITE request retransmission interval, for UDP only
Timer B  64*T1              INVITE transaction timeout timer
Timer D  > 32 sec. for UDP  Wait time for response retransmissions
Timer E  initially T1       Non-INVITE request retransmission interval, UDP only
Timer F  64*T1              Non-INVITE transaction timeout timer
Timer G  initially T1       INVITE response retransmission interval
Timer H  64*T1              Wait time for ACK receipt
Timer I  T4 for UDP         Wait time for ACK retransmissions
Timer J  64*T1 for UDP      Wait time for retransmissions of non-INVITE requests
Timer K  T4 for UDP         Wait time for response retransmissions
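To see how Timers A/B and E/F interact over UDP, the sketch below lists the moments at which a client retransmits a request that never gets a response. It assumes the default values from the table (T1 = 500 ms, T2 = 4 sec.) and the RFC 3261 behaviour that the interval doubles after each retransmission, with the non-INVITE interval capped at T2; the function name retransmit_times is hypothetical.

```python
T1 = 0.5  # seconds, RTT estimate
T2 = 4.0  # seconds, cap for the non-INVITE retransmission interval

def retransmit_times(is_invite, t1=T1, t2=T2):
    """Elapsed times (in seconds) of UDP retransmissions before the 64*T1 timeout."""
    timeout = 64 * t1  # Timer B (INVITE) or Timer F (non-INVITE)
    interval, elapsed, times = t1, 0.0, []
    while elapsed + interval < timeout:
        elapsed += interval
        times.append(elapsed)
        # Timer A doubles without a cap; Timer E doubles up to T2.
        interval = interval * 2 if is_invite else min(interval * 2, t2)
    return times

print(retransmit_times(True))   # INVITE: Timer A
print(retransmit_times(False))  # non-INVITE: Timer E
```

With the defaults, both transactions give up at 64*T1 = 32 seconds, but the non-INVITE request is retransmitted more often because its interval stops growing at T2.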