
Kurento: a media server technology for convergent WWW/mobile real-time
multimedia communications supporting WebRTC.

Luis López Fernández, Miguel París Díaz, Raúl Benítez Mejías
Grupo de Sistemas y Comunicaciones (GSyC)
Universidad Rey Juan Carlos (URJC)
(Madrid), Spain.
llopez@gsyc.es, mparisdiaz@gsyc.es, rbenitez@gsyc.es

Francisco Javier López, José Antonio Santos
Naevatec
Las Rozas (Madrid), Spain
fjlopez@naevatec.com, jcaden@naevatec.com

978-1-4673-5828-6/13/$31.00 © 2013 IEEE

Abstract: WebRTC technologies are an opportunity for
achieving a real convergence between WWW, desktop and
mobile multimedia real-time communications services, which
will contribute to defeating fragmentation and shall provide
significant advantages to users and developers all around the
world. Following this vision, in this paper we introduce
Kurento, a media server technology based on open source
software capable of demonstrating how this convergence could
take place by combining a SIP/HTTP based signaling plane
and a powerful media server infrastructure built on top of the
GStreamer software stack. The presented technology is
suitable for sending and delivering real-time multimedia
through different protocols and formats and capable of
providing advanced processing capabilities, which include
media mixing, transcoding and filtering. Thanks to this,
Kurento could push current WebRTC capabilities beyond
plain peer-to-peer communication.

Keywords: WebRTC; SIP; HTTP; RTP; SDP;
WWW/Mobile Convergence; IMS; GStreamer; Mobicents

I. INTRODUCTION

Real-time communications are one of the most relevant
technologies of today's Internet. Services such as Skype,
Google Hangouts, Tokbox or Apple's FaceTime count their
users in billions. This phenomenon, together with the
popularization of social networking, is opening a new
paradigm of human communications that is slowly
cannibalizing the traditional phone service, which has been
dominant during the last century. At the core of this new
paradigm, we find two types of devices: desktop computers
and smartphone platforms. Desktop computers, through
WWW browsers, have traditionally been the way users
prefer to access most Internet services, including social
networks. However, in the last few years, smartphone
platforms have contributed with their ubiquity and mobility,
and their adoption is skyrocketing.
The area of real-time multimedia communication
services has traditionally experienced a severe fragmentation
of solutions due to lack of interoperability. For this reason,
the above-mentioned platform diversity may aggravate the
fragmentation problem. This situation is extremely negative
for users, who need to deal with different services to
communicate with different contacts. Clearly, this is an
obstacle to the massive adoption of such services and to the
emergence of a whole new industry around Internet real-time
communication.
In this scenario, there is an increasing effort to create
standardized technologies and services suitable for
defeating fragmentation and for enabling an effective
convergence of desktop WWW browsers and
mobile/smartphone platforms. One of the most relevant
initiatives in this area is WebRTC [1]. WebRTC belongs to
the HTML5 ecosystem and is aimed at providing Real Time
multimedia Communications (RTC) on the WWW. It has
awakened significant interest among developers (see
http://www.webrtc.org). In contrast to previous
proprietary WWW multimedia technologies such as Flash or
Silverlight, WebRTC has been conceived to be open in a
wide sense, both by building on open standards and by
providing open source software implementations.
WebRTC standards are still under development and they
will require some additional time to consolidate. However,
there are a number of ingredients that WebRTC will clearly
incorporate in the near future. First, it will provide
interoperable multimedia communications through
well-established standardized protocols, codecs and formats.
Second, it will enable an effective convergence between
desktop WWW platforms and smartphones. This
convergence shall take place through a double mechanism.
The first is based on the fact that all relevant smartphone
platforms are adhering to HTML5 and, hence, their WWW
browsers will come with built-in WebRTC capabilities when
the standard consolidates. The second is that the WebRTC
stack is being open sourced under permissive licenses, which
makes it simple and attractive for any developer or vendor to
incorporate WebRTC capabilities on top of the native APIs
of mobile platforms.
Given this, WebRTC is an opportunity for the creation of
a truly open and interoperable technology, which could
catalyze the emergence of a new generation of novel and
non-fragmented social communication services. However,
for this to happen, the WebRTC ecosystem needs to evolve
further and provide more than pure peer-to-peer video
conferencing, as it does now. Current efforts on WebRTC
are concentrated on building the client side capabilities,
while only minor contributions have occurred at the server
side. Nevertheless, a state-of-the-art WebRTC capable media
server could provide very interesting features such as media
recording, media mixing for group communications, media
adaptation and transcoding for integration into legacy
systems, etc. In this paper we contribute to that vision by
introducing Kurento, an open source based media server
capable of strengthening the WebRTC ecosystem in several
directions.
The structure of this paper is as follows. First, we review
the current state of the art and show how Kurento advances
it. Second, we examine the current status of the WebRTC
implementation and explain how the associated APIs work.
After that, we introduce the Kurento Media Server
architecture and show how WebRTC can be integrated into
it. To conclude, we explain how WebRTC enabled
applications can be created on top of the resulting
infrastructure.

II. STATE OF THE ART AND CONTRIBUTIONS

During the preceding decade, the most successful
multimedia solutions on the WWW have been based on
proprietary technologies that made their convergence with
other technologies difficult for a number of reasons. First,
because they used non-standard protocols controlled by
commercial companies that had their own objectives and
roadmaps, which were not necessarily aligned with those of
users. Second, because those WWW multimedia technologies
were not designed for real-time communications and the
quality of experience they provide is not always satisfactory.
More recently, many different initiatives have emerged
to produce a technology capable of bringing together the
mobile and WWW real-time communication worlds. One of
the most remarkable of them has come from the IMS
ecosystem [2, 3, 4]. However, the IMS model has not
succeeded in spreading beyond operators, and those
initiatives did not generate a critical mass of users.
In this context, WebRTC has appeared, bringing to reality
all the required ingredients for achieving an effective
convergence between Web and mobile services. Early
experiments and implementations of WebRTC were carried
out by Ericsson. However, many different companies and
individuals are currently involved in its standardization and
prototyping. The standardization efforts are split into two
complementary initiatives. On one hand, the RTCWeb group
of the IETF (http://tools.ietf.org/wg/rtcweb/charters) is
focused on the definition of the required protocols and
interoperability mechanisms. On the other, the WebRTC
group of the W3C
(http://www.w3.org/2011/04/webrtc-charter.html) is working
to define a number of APIs suitable for providing, through
scripting languages, web browser support for interacting
with media devices (microphones, webcams, speakers, etc.),
media encoding/decoding capabilities and media transport
features.
Although a relevant number of drafts for the protocols
and APIs have already appeared [5, 6], WebRTC is still in
its infancy. However, independently of the details,
WebRTC is being designed in such a way that its integration
with real-time mobile communication services is immediate.
First, because it solves all the complex details of real WWW
architectures such as NAT traversal, browser integration,
media security, etc. Second, because it is based on a
well-known standard for real-time multimedia: the RTP/RTCP
protocol stack. Although WebRTC does not specify any type
of signaling protocol, it is fully compatible with SIP (many
current WebRTC applications are based on SIP). These two
ingredients make WebRTC services directly usable in
IMS and other VoIP infrastructures, which are usually based
on combining SIP and RTP/RTCP.
In this context, WebRTC brings a clear opportunity for
the creation of non-fragmented and universal communication
services accessible from WWW, smartphones and traditional
VoIP systems. Nevertheless, WebRTC efforts are currently
concentrated on creating a client technology capable of
providing peer-to-peer (P2P) media. Although the provision
of a P2P communication model is a remarkable first step, for
its mass adoption WebRTC needs to progress further so that
a number of requirements demanded by users (which cannot
be met with the current state of the standards) are
fulfilled:
- The compatibility with group communications
  satisfying social interaction schemes.
- The capability of providing value added services
  such as call recording, call redirecting, answering
  machines, etc.
- The interoperability with legacy WWW media
  technologies such as Flash or Silverlight. In the same
  direction, the capability of integrating into legacy
  multimedia communication systems based on VoIP
  or other similar schemes.
- The ability to satisfy novel tendencies for the
  creation of media-aware and context-aware services
  involving computer vision, media augmentation,
  content searching, etc. This type of capability is
  expected to be the catalyst of a whole new
  ecosystem of professional services in areas such as
  security, entertainment, eHealth, eLearning, etc.
In this paper, we contribute to pushing the success of
WebRTC by enriching current state-of-the-art technologies
with the following enablers:
A. Contribution: innovative media server architecture
abstracting complex details of application development.
We have created a powerful media server architecture
combining the Mobicents/JBoss Application Server [7] and
the GStreamer [http://www.gstreamer.net/] multimedia stack.
To understand why this architecture is challenging and
relevant, we need a basic understanding of GStreamer.
GStreamer is architected around two main concepts: media
elements and media pipelines. A media element can be seen
as a black box capable of acting as a media sink, as a
media source or as a media processor. Hence, media
elements are usually associated with a specific function
performed on the stream. Currently, GStreamer provides
more than 1000 different media elements with diverse
capabilities such as, for example, reading/writing streams
from/to files, mixing several media streams into one,
transcoding streams to/from many different formats
(including H.263, H.264, VP8, Ogg, Vorbis, MP3, AMR and
Speex), detecting faces in a video stream, blending
multiple streams, etc. Building on this, a media pipeline is a
chain of media elements where the output generated by one
element is fed into one (or several) downstream elements.
Hence, the pipeline can be seen as a machine performing
complex media processing composed of a sequence of
individual media operations. In summary, GStreamer is a
powerful architecture enabling the creation of rich
applications performing complex media processing.
However, using GStreamer to create applications and
integrating it into WWW services and systems is extremely
complex and requires huge efforts and considerable expertise
from developers.
Our architecture addresses this problem by abstracting the
GStreamer stack on top of the JBoss/Mobicents capabilities.
This means that the GStreamer media elements and pipelines
can be managed through specialized Java stubs that can be
used in the context of standard SIP and WWW Servlet
applications. In other words, we bring the simplicity of
WWW development into GStreamer. Besides, given that
Mobicents may behave as an IMS Application Server, the
integration with mobile, VoIP and IPTV network
environments is immediate.

B. Contribution: WebRTC support for GStreamer.
In the current state of the art it is not possible to use the
WebRTC protocol stack in GStreamer, given that there is
no media element providing the required capabilities. For
this reason, another major contribution of our work is the
creation of the enablers that add to GStreamer the
capability of receiving and sending WebRTC streams. Given
the above discussion, this is a clear advance over the state
of the art, since it makes it possible to inject those
streams into special-purpose pipelines, which can be
managed in a simple and seamless manner through our
media server. This opens a whole new spectrum of
multimedia services for WebRTC applications including the
support for flexible group communications, the integration of
augmented reality, the use of computer vision for enhancing
and personalizing services, etc.

III. WEBRTC TECHNOLOGIES

The current WebRTC architecture is depicted in Fig. 1. As
can be observed, it is based on separating the signaling and
media planes. Following the WebRTC philosophy, signaling
is not part of the standards and its specific implementation is
left to the application developer. Currently, different types of
protocols are used for that purpose, SIP and XMPP being the
most popular ones. The objective of the signaling protocol is
to make possible the negotiation of the media formats and
transport parameters between the two communicating
end-points. This requires the exchange of SDPs that describe
first the offer and later the agreed answer. The details of this
negotiation are out of the scope of this paper. The interested
reader can find them in the JavaScript Session Establishment
Protocol draft [5].
The WebRTC browser capabilities are exposed through
an API [6] designed around two complementary concepts:
PeerConnection and MediaStream.

A. PeerConnection
PeerConnection is the WebRTC component that handles
communication of streaming data between peers. The
capabilities of this component are exposed to developers
through a JavaScript object. This object abstracts a large
number of tedious details and complexities associated with
the inner workings of video and audio, including packet loss
concealment, echo cancellation, bandwidth adaptation,
automatic gain control, ICE control for NAT traversal, etc.

B. MediaStream
The MediaStream represents the media plane of the
WebRTC protocol stack and comes in two flavors: local
and remote. Local MediaStreams are used as a handle for
managing the audio and video captured locally (i.e. through
the browser of the local peer) by a webcam or microphone. A
local MediaStream can be rendered through a standard
HTML5 <video> tag.
In the same way, a remote MediaStream may carry video
and audio channels and can be rendered using the <video>
tag. However, in this case the media does not come
from the local camera or microphone, but from the remote
peer at the other end of the communication. From a protocol
perspective, the stream traverses the network using the
negotiated formats and ICE candidates. To secure the
communication, SRTP is used, with DTLS as one of the
possible key exchange mechanisms.

Figure 1. Current WebRTC architecture is based on the typical separation
between signaling and media planes. The media plane is based on direct
browser-to-browser secure RTP connections using ICE/STUN/TURN for
NAT traversal. The signaling protocol is not specified and the application
developer can select her preferred option for creating it.
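
As an illustration of the offer/answer exchange described in this section, the following Python sketch models the negotiation in a highly simplified form: each end-point lists the codecs it supports in order of preference, and the answerer keeps, per media type, only the offered codecs it also supports. The `Description` class and the `answer` function are hypothetical names introduced here for illustration; real SDPs are text documents that also carry transport addresses, ICE candidates and keys, and the rule of dropping media types without a common codec is an assumption of this toy model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Description:
    """Toy stand-in for an SDP: media type -> codecs in preference order."""
    media: Dict[str, List[str]] = field(default_factory=dict)


def answer(offer: Description, local: Description) -> Description:
    """Build the answer: per offered media type, keep the offered codecs
    that the answering end-point also supports, in the offerer's order."""
    agreed: Dict[str, List[str]] = {}
    for kind, offered in offer.media.items():
        supported = set(local.media.get(kind, []))
        common = [codec for codec in offered if codec in supported]
        if common:  # toy rule: media types with no common codec are dropped
            agreed[kind] = common
    return Description(agreed)


# Example: a browser offers audio and video to a server with wider support.
browser_offer = Description({"audio": ["opus", "PCMU"], "video": ["VP8"]})
server_caps = Description({"audio": ["PCMU", "opus", "AMR"],
                           "video": ["H264", "VP8"]})
sdp_answer = answer(browser_offer, server_caps)
```

In this sketch the answer preserves the offerer's preference order, so the first codec of each list is the one both sides would be expected to use.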

IV. KURENTO ARCHITECTURE FOR REAL-TIME COMMUNICATIONS

Once we have understood the basic concepts around
WebRTC, we may present the media server solution where
we wish to integrate its capabilities: Kurento. Kurento is a
Free Open Source Software initiative whose source code is
available here: http://code.google.com/p/kurento/. The
Kurento signaling plane is based on Mobicents [7], an Open
Source platform written on top of the JBoss Application
Server. Developing multimedia applications based on
Mobicents has a number of advantages given that the
underlying JBoss Microcontainer exposes to developers all
the features of a professional and mature Java EE server
infrastructure including database connectivity, transactional
capabilities, messaging, web services, ESB connectivity,
security, clustering, seamless web integration, etc. Hence, the
JBoss/Mobicents stack acts as an Application Server where
application business logic may be created and deployed.
In addition, and given that Mobicents does not provide
video capabilities, the Kurento media plane has been created
independently, based on the rich multimedia features
provided by the GStreamer project. The conceptual
representation of the Kurento architecture can be seen in
Figure 2, where the separation between the Kurento
Signaling Server (KSS) and the Kurento Media Server
(KMS) can be observed.

Figure 2. The architecture of Kurento is split into signaling and media
planes. The former is based on the JBoss/Mobicents Java EE stack, while
the latter has been built on top of the GStreamer media pipeline framework.
Both communicate using Thrift RPCs for exchanging media control
information. This architecture exposes the powerful media capabilities of
GStreamer through the flexible and interoperable framework provided by
Java EE technologies.

Observing that figure, we can understand precisely what
the contributions of Kurento to the developer community
are:
- First, a C++ wrapper to GStreamer, which exposes its
  capabilities (i.e. media elements and pipelines) through
  low-latency and efficient RPCs based on Thrift [8].
- Second, a number of Java proxies which consume such
  RPCs and which are embeddable into the Mobicents
  Application Server.
- Third, a number of extensions to the Mobicents
  Application Server enabling the management of the
  lifecycle of such proxies and their coordination in the
  context of Media Sessions.
- Fourth, a development framework based on such Media
  Sessions, which simplifies the creation of applications
  combining WWW/SIP Servlets with the advanced
  multimedia processing capabilities of GStreamer.
Given this architecture, it is easy to understand that
Kurento can be used in any type of application where the
signaling is based on SIP or HTTP and the media is
represented and transported in any of the protocols and
formats supported by GStreamer. This makes Kurento an
ideal candidate for the provision of advanced multimedia
applications requiring more than peer-to-peer or media
switching capabilities.

V. WEBRTC INTEGRATION INTO KURENTO

The integration of WebRTC into Kurento requires, as a
first step, providing a WebRTC capable media plane on top
of GStreamer. For this reason, and as part of the research
effort described in this paper, we have created a GStreamer
media element providing such capability. We have called it
webrtcbin (a bin in GStreamer is a special type of element
that contains other elements and manages them). This
component has also been open sourced and is available here:
http://code.google.com/p/kurento/source/checkout?repo=gstplugins-webrtc.
The webrtcbin is composed of four basic
elements: nicesink, nicesrc, srtpprotect and
srtpunprotect. Nicesink and nicesrc are elements
provided by the libnice package; they are responsible for
sending and receiving data to/from the remote client
following the ICE protocol. Srtpunprotect and srtpprotect
are the elements that handle the security.
These elements can also multiplex/demultiplex RTP and
RTCP channels if they come within the same SRTP stream.
The current implementation works in bundle mode,
which is the default in the Chrome WebRTC implementation.
Bundle mode uses only one stream for all the media (audio,
video and RTCP packets). In the future, this feature should
be configurable and no-bundle mode will be supported as
well. This module has been tested against Chrome using VP8
as video codec and OPUS as audio codec, using pre-existing
GStreamer modules to encode and decode media.

Figure 3. Architecture of the GStreamer webrtcbin component. This
media element implements the protocol stack required for receiving and
delivering WebRTC multimedia streams. Thanks to it, Kurento Media
Server is capable of combining the GStreamer pipeline architecture with
the WebRTC capabilities.
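
To make the structure of the bin more concrete, the following Python sketch models it as two chains of elements: a receive chain (nicesrc followed by srtpunprotect) and a send chain (srtpprotect followed by nicesink). This is only a toy model under stated assumptions, not the GStreamer API nor the actual webrtcbin code: `chain`, `ToyWebRtcBin`, `toy_protect`, `toy_unprotect` and `KEY` are hypothetical names, packets are plain byte strings, and the XOR scrambling merely stands in for SRTP, which it does not implement.

```python
from typing import Callable, List

Element = Callable[[bytes], bytes]


def chain(*elements: Element) -> Element:
    """Link elements so that the output of each one feeds the next one."""
    def run(packet: bytes) -> bytes:
        for element in elements:
            packet = element(packet)
        return packet
    return run


KEY = 0x5A  # toy key; real SRTP uses negotiated keys and proper ciphers


def toy_protect(packet: bytes) -> bytes:
    """Placeholder for srtpprotect: reversible scrambling, NOT real SRTP."""
    return bytes(b ^ KEY for b in packet)


def toy_unprotect(packet: bytes) -> bytes:
    """Placeholder for srtpunprotect: undoes toy_protect."""
    return bytes(b ^ KEY for b in packet)


class ToyWebRtcBin:
    """Groups a receive chain and a send chain, as a bin groups elements."""

    def __init__(self) -> None:
        self.sent: List[bytes] = []  # what the toy nicesink put on the wire
        self.receive = chain(self._nicesrc, toy_unprotect)
        self.send = chain(toy_protect, self._nicesink)

    def _nicesrc(self, packet: bytes) -> bytes:
        return packet  # a real nicesrc would read from the ICE transport

    def _nicesink(self, packet: bytes) -> bytes:
        self.sent.append(packet)  # a real nicesink would write to the network
        return packet


# Loopback: what arrives protected is unprotected, then sent back protected.
webrtc_bin = ToyWebRtcBin()
incoming = toy_protect(b"rtp-payload")
media = webrtc_bin.receive(incoming)
webrtc_bin.send(media)
```

Between `receive` and `send`, any other processing element could be placed in the chain, which is the point of exposing WebRTC streams inside a pipeline.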

VI. CREATING A SIMPLE WEBRTC APPLICATION WITH
SERVER SUPPORT BASED ON WEBRTCBIN

For illustration and validation purposes, we explain
how to create a simple demo application performing a media
loopback, so that the streams the web client sends to the
server are sent back to it. This application is not of special
practical interest, but we include it because it makes it easy
to understand the different elements and modules that
are required for creating applications based on Kurento and
WebRTC. This example also allows evaluating the
complexity a developer needs to face for building such an
application.
The client side of the application has been built on top of
Chrome 25 and uses a signaling plane based on a minimal
SIP implementation using a WebSocket transport created ad
hoc for the experiment. As can be observed in Fig. 4, the
Web application establishes a media session through the
same sequence of API calls used to establish a peer-to-peer
connection with a remote browser. This sequence involves
invoking the appropriate primitives on the
RTCPeerConnection JavaScript object exposed by the
WebRTC API. In other words, from the perspective of the
Web application developer, the Kurento stack is
indistinguishable from another WebRTC client. Hence,
application developers do not need to execute any special
actions to communicate with our media server.
As Fig. 4 shows, once the call invitation has been
received, the KSS has the opportunity of executing
application logic deciding whether the call is accepted or not
(more details about this are presented in the following
section). If it is, the KMS is instructed, through a
number of Thrift RPC calls, to create the appropriate
GStreamer pipeline, which contains a webrtcbin capable of
sending and receiving media. This bin uses the SDP offer
received from the remote peer to initialize its media
capabilities and generates the appropriate information for
issuing an answer, which includes the supported media
formats, the ICE candidates and the ciphering keys used for
sending the local SRTP streams. Given this, the KMS logic
is capable of building the answer SDP and delivering it as
the return value of the Thrift RPC.
At this point, the KSS is able to create and issue the SIP
OK message in response to the preceding INVITE. Upon
reception, the Web client signaling plane delivers the remote
SDP to the application, which uses it to assign the remote
description and the remote ICE candidates to the local
RTCPeerConnection object, which enables the WebRTC
stack to initiate the media exchange. When the call is
established, both the client and the server side applications
receive a signal, so that specific actions (such as rendering
the media flows in the client or recording the exchanged
media in the server) can be executed.
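
The server side of this sequence can be summarized with a small Python sketch. It is a hedged toy model, not the Kurento or Mobicents API: `ToyMediaServer`, `ToySignalingServer`, `process_offer` and `on_invite` are hypothetical names, the SDP is reduced to a list of codec names, and an ordinary method call plays the role of the Thrift RPC. The numeric values 200 and 603 are the standard SIP status codes for OK and Decline.

```python
from typing import Callable, Dict, List, Tuple

Sdp = Dict[str, List[str]]  # toy SDP: just {"codecs": [...]}


class ToyMediaServer:
    """Stands in for the KMS; process_offer plays the role of the RPC."""

    def __init__(self, supported: List[str]) -> None:
        self.supported = supported
        self.pipelines: List[str] = []

    def process_offer(self, offer: Sdp) -> Sdp:
        # Create the pipeline holding the WebRTC bin, then build the answer.
        self.pipelines.append("loopback")
        return {"codecs": [c for c in offer["codecs"] if c in self.supported]}


class ToySignalingServer:
    """Stands in for the KSS: runs application logic, then asks the KMS."""

    def __init__(self, kms: ToyMediaServer,
                 accept: Callable[[str], bool]) -> None:
        self.kms = kms
        self.accept = accept  # application logic deciding on each call

    def on_invite(self, caller: str, offer: Sdp) -> Tuple[int, Sdp]:
        if not self.accept(caller):
            return 603, {}  # SIP 603 Decline, no media server involvement
        return 200, self.kms.process_offer(offer)  # SIP 200 OK with answer


kms = ToyMediaServer(supported=["VP8", "opus"])
kss = ToySignalingServer(
    kms, accept=lambda caller: caller != "sip:blocked@example.org")
status, sdp_answer = kss.on_invite(
    "sip:alice@example.org", {"codecs": ["opus", "VP8", "H264"]})
```

The sketch keeps the division of labor of the text: the signaling server decides whether a call is accepted, and only then is the media server asked to create a pipeline and produce the answer.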

VII. CREATING CONVERGENT REAL-TIME MULTIMEDIA
SERVICES
After this simple example, we can introduce more complex
applications involving convergent scenarios. The creation of
rich and convergent WebRTC applications based on Kurento
requires developers to implement the server side code
that specifies the media processing logic. To start, a User
Agent (UA) needs to be created and registered. UAs are
server-side stateful objects that take the responsibility of
managing communication end-points. In other words, UAs
are in charge of specific server side SIP/HTTP URIs.
Developers can subscribe event listeners to UAs. This makes
it possible to provide the specific logic that will be executed
when incoming calls arrive, when outgoing calls are issued
and when a call is established or terminated by a UA.
On each of these events, the UA provides its listeners with
a reference to a Call object. This object gives the application
developer access to the low-level media stack. This is
performed through Joinable objects inspired by the
Joinable interface introduced in JSR 309 [9]. The
developer is able to instantiate different joinables such as
recorders, transcoders, filters, etc., which correspond to the
different media elements available in the GStreamer stack.
Calls, on their side, also provide joinables representing
their incoming and outgoing media streams. Given this, the
application developer can create the media processing logic
by joining the different media elements so that a media
pipeline can be created performing the desired actions (e.g.
adapt, augment, filter, record, transcode, etc.) on the media.
A particularly interesting media object is the Mixer. The
Mixer represents a GStreamer object capable of mixing
media following a scheme defined by the specific type of
mixer. Some of the currently available schemes include:
audio mixing (full-duplex or half-duplex), audio mixing plus
half-duplex video mixing, audio mixing plus grid video
mixing (i.e. composing a video wall grid from the individual
incoming flows), audio mixing plus selecting the video
channel associated with the strongest audio signal, etc.
Mixers allow connecting clients through Ports, which are
also joinable objects.
With this information in mind, we can understand how a
rich convergent multimedia application can be created
integrating WebRTC clients with SIP softphones or other
types of videoconferencing applications on smartphones,
tablets or desktop PCs. For example, if we want to build a
group video-chat among all those types of devices, we simply
need to create a UA listening at the appropriate (SIP or HTTP)
URIs, which clients will call to join the chat. That UA will
instantiate a Mixer which will be ready for combining the
different incoming streams into a unified group call through
the desired mixing scheme. Upon reception of a Call, the
specific incoming call listener simply needs to create the
appropriate GStreamer media end-point capable of
communicating with the calling client. This end-point will be
based on our webrtcbin component when the caller is a
WebRTC capable browser. It will be a standard RTP/RTCP
end-point when the caller is a traditional SIP phone. It may
even be a Flash based client, in which case we may use the
RTMP GStreamer capabilities. Independently of the type of
end-point, the important aspect is that all of them are
joinables, and can be joined to the Mixer.
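
The group video-chat logic described above can be sketched in Python as follows. The classes below are a toy model written for this illustration: they are neither the Kurento API nor the JSR 309 interfaces, and the names `Joinable`, `Mixer`, `create_port` and `on_incoming_call`, as well as the end-point labels, are assumptions. The sketch only tracks which objects are joined; no media flows through it.

```python
from typing import List


class Joinable:
    """Toy media object that can be connected to other joinables."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: List["Joinable"] = []

    def join(self, other: "Joinable") -> None:
        """Connect both objects bidirectionally (joining twice is a no-op)."""
        if other not in self.peers:
            self.peers.append(other)
            other.peers.append(self)


class Mixer:
    """Toy mixer: hands out one joinable port per participant."""

    def __init__(self) -> None:
        self.ports: List[Joinable] = []

    def create_port(self) -> Joinable:
        port = Joinable(f"port-{len(self.ports)}")
        self.ports.append(port)
        return port


def on_incoming_call(mixer: Mixer, client_kind: str) -> Joinable:
    """Incoming-call listener: pick an end-point type, join it to the mixer."""
    endpoint_types = {"webrtc": "webrtcbin", "sip": "rtp", "flash": "rtmp"}
    endpoint = Joinable(endpoint_types[client_kind])
    endpoint.join(mixer.create_port())
    return endpoint


# Three heterogeneous clients call the same UA and end up in one mixer.
mixer = Mixer()
endpoints = [on_incoming_call(mixer, kind)
             for kind in ("webrtc", "sip", "flash")]
```

The listener is the only place that depends on the client type; once an end-point exists, joining it to a mixer port is the same operation for every kind of caller.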
A very relevant aspect of the joinable mechanism is that
Kurento provides a transparent transcoding mechanism
capable of transforming the media from the source joinable
format to the destination joinable format without requiring
additional actions from developers. Mixers require media in
raw format to be able to apply the mixing scheme. For this
reason, joining to a mixer usually involves transcoding the
source/mixed streams to/from raw. From a practical
perspective, this means that the above-mentioned video chat
application allows some clients to use, for example,
VP8/OPUS, while others may be using H.264/AMR and
others H.263/Speex, etc. In all cases, joining the downstream
media flows to the mixer will involve their transcoding to
raw, while joining the upstream media flows will involve
transcoding the raw media provided by the mixer to the
expected format of each individual call.
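
The decision behind this transparent transcoding can be illustrated with a short Python sketch. It is an assumption-laden simplification rather than Kurento's actual logic: the function `conversion_steps` and the constant `RAW` are hypothetical, formats are plain strings, and every conversion is assumed to go through raw media.

```python
from typing import List

RAW = "raw"


def conversion_steps(source_format: str, sink_format: str) -> List[str]:
    """Return the toy transcoding steps needed to connect two formats."""
    if source_format == sink_format:
        return []  # formats match: media is passed through untouched
    steps = []
    if source_format != RAW:
        steps.append(f"decode {source_format}")  # compressed -> raw
    if sink_format != RAW:
        steps.append(f"encode {sink_format}")  # raw -> compressed
    return steps


# A VP8 caller joined to a mixer, and the mixer joined to an H.264 caller.
to_mixer = conversion_steps("VP8", RAW)
from_mixer = conversion_steps(RAW, "H.264")
```

With this rule, a client joined to a mixer needs one decoding step on its way in and one encoding step on its way out, whatever codec the other participants use.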
VIII. CONCLUSIONS
In this paper we have introduced Kurento: a media server
technology compatible with WebRTC clients and capable of
demonstrating how WebRTC applications can interoperate
with other mobile and desktop real-time communication
services in a seamless and simple way. We expect Kurento to
contribute to the consolidation of the WebRTC ecosystem by
showing a pathway toward more advanced and universal
real-time communication services.

REFERENCES
[1] S. Loreto, S.P. Romano, "Real-Time Communications in the
    Web: Issues, Achievements, and Ongoing Standardization Efforts,"
    Internet Computing, IEEE, vol. 16, no. 5, pp. 68-73, 2012. doi:
    10.1109/MIC.2012.115
[2] D. Lozano, L.A. Galindo, L. Garcia, "WIMS 2.0: Converging IMS
    and Web 2.0. Designing REST APIs for the Exposure of
    Session-Based IMS Capabilities," Next Generation Mobile
    Applications, Services and Technologies, 2008. NGMAST '08. The
    Second International Conference on, pp. 18-24, 16-19 Sept. 2008.
    doi: 10.1109/NGMAST.2008.97
[3] S. Islam, J.C. Grégoire, "Convergence of IMS and Web Services: A
    Review and a Novel Thin Client Based Architecture,"
    Communication Networks and Services Research Conference
    (CNSR), 2010 Eighth Annual, pp. 221-228, 11-14 May
    2010. doi: 10.1109/CNSR.2010.10
[4] L. Lopez-Fernandez, D. Gonzalez-Martinez, D.L. Llanos,
    C. Maestre-Terol, "AFICUS: An architecture for a future internet
    of User Generated Contents," Intelligence in Next Generation
    Networks (ICIN), 2011 15th International Conference on,
    pp. 207-212, 4-7 Oct. 2011. doi: 10.1109/ICIN.2011.6081076
[5] J. Uberti, C. Jennings, "Javascript Session Establishment Protocol,"
    Internet Draft draft-uberti-rtcweb-jsep-02, Internet Engineering Task
    Force, Feb. 2012.
[6] A. Bergkvist, D.C. Burnett, C. Jennings and A. Narayanan,
    "WebRTC 1.0: Real-time Communications Between Browsers,"
    W3C Editor's Draft, 16 Jan. 2013.
[7] J. Deruelle, "JSLEE and SIP-Servlets Interoperability with Mobicents
    Communication Platform," Next Generation Mobile Applications,
    Services and Technologies, 2008. NGMAST '08. The Second
    International Conference on, pp. 634-639, 16-19 Sept. 2008.
    doi: 10.1109/NGMAST.2008.91
[8] M. Slee, A. Agarwal and M. Kwiatkowski, "Thrift: scalable
    cross-language services implementation," Whitepaper, Facebook, 156
    University Ave, Palo Alto, CA.
[9] T. Ericson, M. Brandt, "JSR 309-Overview of Media Server Control
    API," Public Final Draft, Media Server Control API v1.0, 2009.

[Sequence diagram. Participants on the WebRTC Client side: WWW
Application, WebRTC stack, Client Signaling Plane. Participants on the
Kurento Server Infrastructure side: Kurento Signaling Server, Kurento
Media Server, WebRTC Bin.]
Figure 4. Sequence diagram showing the different components, messages and interactions involved in a WebRTC application using the Kurento
infrastructure and the webrtcbin GStreamer module.