
Introduction to Multimedia Synchronization
Klara Nahrstedt
cs598KN
Content
Notion of Synchronization
Intra-object and inter-object synchronization
Live and Synthetic Synchronization
Synchronization Requirements
Reference Model for Synchronization
Synchronization in Distributed Environments
Synchronization Specification
Notion of Synchronization
Multimedia synchronization is understood in terms of
three kinds of relations: content relations, spatial
relations, and temporal relations
Content relation: defines a dependency of media
objects on some data
Example: the dependency between a filled
spreadsheet and a graphic that represents the data
listed in the spreadsheet

Spatial Relation
The spatial relation is represented through the layout relation
and defines the space used for the presentation of a media
object on an output device at a certain point in time
in a multimedia document
Example: desktop publishing
Layout frames are placed on an output device and content
is assigned to each frame
Positioning of layout frames:
Fixed to a position in the document
Fixed to a position on a page
Relative to the position of another frame
The concept of frames is also used for positioning time-dependent
objects
Example: in window-based systems, layout frames correspond
to windows, and a video can be positioned in a window
Temporal Relation
Temporal relations define temporal dependencies
between media objects
Example: lip synchronization
This relation is the focus of our papers; we will not
discuss content or spatial relations further
Time-dependent objects represent a media stream,
because temporal relations exist between
consecutive units of the stream
Time-independent objects are traditional media such
as images or text.

Temporal Relations (2)
Temporal synchronization is supported by
many system components:
OS (CPU scheduling, semaphores during IPC)
Communication systems (traffic shaping, network
scheduling)
Databases
Document handling
Synchronization is needed at several levels of
a multimedia system
Temporal Relations (3)
Level 1: the OS and lower communication layers handle
single streams
The objective is to avoid jitter in the presentation of one
stream
Level 2: on top of this sits the run-time support
for synchronization of multiple multimedia streams
(schedulers)
The objective is a bounded skew between the various streams
Level 3: the next level holds the run-time support for
synchronization between time-dependent and time-
independent media, together with handling of user
interaction
The objective is a bounded skew between time-dependent and
time-independent media

Specification of Synchronization
Implicit Specification
Temporal relations may be specified implicitly during
the capturing of media objects; the goal of the presentation is
to present the media in the same way as they were originally
captured
Examples: audio/video recording and playback, VOD applications
Explicit Specification
Temporal relations may be specified explicitly in the case of
presentations that are composed of independently captured
or otherwise created objects
Example: a slide show, where the presentation designer
selects appropriate slides,
creates the audio,
defines the units of the audio presentation stream, and
defines the units of the audio stream at which the slides have to
be presented
Inter-Object and Intra-Object
Synchronization
Intra-object synchronization refers to the time
relation between various presentation units of one
time-dependent media object


Inter-object synchronization refers to the
synchronization between media objects

[Figure: timeline with 40 ms spacing between consecutive LDUs (intra-object synchronization) and aligned presentation of Audio 1, Video, Slides, Animation, and Audio 2 (inter-object synchronization)]
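A minimal sketch of the two notions (constants and function names are assumptions for illustration, not from the slides): intra-object synchronization preserves the 40 ms spacing between consecutive LDUs of one stream, while inter-object synchronization concerns the skew between corresponding LDUs of different streams.

```python
FRAME_PERIOD = 0.040   # 40 ms between consecutive LDUs, as in the figure
JITTER_BOUND = 0.010   # assumed tolerance for intra-stream jitter

def intra_object_ok(presentation_times):
    """Intra-object sync: consecutive LDUs of one stream keep their spacing."""
    gaps = [b - a for a, b in zip(presentation_times, presentation_times[1:])]
    return all(abs(g - FRAME_PERIOD) <= JITTER_BOUND for g in gaps)

def inter_object_skew(audio_time, video_time):
    """Inter-object sync: skew between corresponding LDUs of two streams."""
    return audio_time - video_time

# A stream presented at exactly 40 ms intervals satisfies intra-object sync
times = [i * FRAME_PERIOD for i in range(5)]
assert intra_object_ok(times)
assert inter_object_skew(times[2], times[2]) == 0.0
```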
Classification of Synchronization Units
Logical Data Units (LDUs)
Samples or pixels
Notes or frames
Movements or scenes
Symphony or movie
Fixed LDU vs Variable LDU
LDU specification during recording vs LDU
defined by user
Open LDU vs Closed LDU
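As an illustration, one possible way to model this classification in code (class and field names are assumptions; the slides do not define a data structure):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class LDU:
    level: str                     # e.g. "sample", "frame", "scene", "movie"
    duration: Optional[float]      # fixed duration in seconds; None = open LDU
    children: Tuple["LDU", ...] = ()  # lower-level LDUs contained in this unit

# A 25 fps video frame is a fixed, closed LDU; a user-defined scene has a
# variable duration (the sum of its frames); a live stream is an open LDU.
frame = LDU(level="frame", duration=0.040)
scene = LDU(level="scene", duration=10 * 0.040, children=(frame,) * 10)
live  = LDU(level="stream", duration=None)
```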
Live Synchronization
The goal is to exactly reproduce at the
presentation the temporal relations as they
existed during the capturing process
Need to capture temporal relation information
during capturing
Live sync is needed in conversational
services
Examples: video conferencing, video phone
Recording followed by playback is instead considered a
retrieval service, i.e., a presentation with delay
Synthetic Synchronization
Temporal relations are artificially specified
Often used in presentation and retrieval-based
systems with stored data objects that are arranged
to provide new combined multimedia objects
Examples: authoring and tutoring systems
Need synchronization editors to support flexible
synchronization relations between media
Two phases: (1) the specification phase defines
temporal relations; (2) the presentation phase presents
the data in a synchronized mode
Example: four recorded audio messages relate to parts of an engine in
an animation; the animation sequence shows a slow 360-degree
rotation of the engine.
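A minimal sketch of what the specification phase could produce for this example (the list structure, clip names, and angle ranges are assumptions for illustration):

```python
# Synthetic synchronization specification: each audio message is attached
# to the rotation-angle interval of the animation during which it plays.
spec = [
    # (start_angle, end_angle, audio clip); engine rotates 0..360 degrees
    (0,   90,  "audio_msg_1.wav"),
    (90,  180, "audio_msg_2.wav"),
    (180, 270, "audio_msg_3.wav"),
    (270, 360, "audio_msg_4.wav"),
]

def clip_for_angle(angle):
    """Presentation phase: pick the audio message for the current frame."""
    for start, end, clip in spec:
        if start <= angle < end:
            return clip
    return None

assert clip_for_angle(100) == "audio_msg_2.wav"
```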
Synchronization Requirements
For intra-object synchronization:
Accuracy concerning jitter and end-to-end delays in the
presentation of LDUs
For inter-object synchronization:
Accuracy in the parallel presentation of media objects
Implications of the blocking method:
Fine for time-independent media
Gap problem for time-dependent media
What does the blocking of a stream mean for the output
device?
Should previous parts be repeated in the case of speech or
music?
Should the last picture of a stream be shown?
How long can such a gap exist?
Synchronization Requirements (2)
Solutions to the gap problem
Restricted blocking method
Switch to an alternative presentation if the gap between late video
and audio exceeds a predefined threshold
E.g., show the last picture as a still image
Re-sampling of a stream
Speed up or slow down streams for the purpose of
synchronization
Off-line re-sampling is used after capturing media streams
with independent devices
Example: a concert captured with two independent audio and
video devices
Online re-sampling is used during a presentation when
a gap between media streams occurs at run time
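A minimal sketch of the restricted blocking decision (the 80 ms threshold is an assumed placeholder, not a value the slides prescribe for this method):

```python
GAP_THRESHOLD = 0.080  # assumed threshold in seconds

def handle_gap(gap_seconds):
    """Restricted blocking: pick an action when one stream runs late."""
    if gap_seconds <= GAP_THRESHOLD:
        return "block"        # short gap: wait for the late stream
    return "show-still"       # long gap: switch to the alternative
                              # presentation, e.g. freeze the last picture

assert handle_gap(0.050) == "block"
assert handle_gap(0.200) == "show-still"
```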
Synchronization Requirements (3)
Lip synchronization requirements refer to the temporal
relation between the audio and video streams of a human
speaking
The time difference between related audio and video LDUs is
called the synchronization skew
Streams are in sync if skew = 0 or |skew| ≤ bound
Streams are out of sync if |skew| > bound
Bounds:
Audio/video in sync means -80 ms ≤ skew ≤ 80 ms
Audio/video out of sync means skew < -160 ms or skew > 160 ms
Transient means -160 ms ≤ skew < -80 ms or 80 ms < skew ≤ 160 ms
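These bounds translate directly into a small classifier; a minimal sketch (the function name is an assumption), with skew in seconds taken as audio time minus video time:

```python
def classify_skew(skew, in_bound=0.080, out_bound=0.160):
    """Classify audio/video skew against the lip-sync bounds above."""
    if abs(skew) <= in_bound:
        return "in sync"
    if abs(skew) <= out_bound:
        return "transient"
    return "out of sync"

assert classify_skew(0.050) == "in sync"
assert classify_skew(-0.120) == "transient"
assert classify_skew(0.200) == "out of sync"
```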
Synchronization Requirements (4)
Pointer synchronization requirements are very
important in computer-supported cooperative work
(CSCW)
We need synchronization between graphics, pointers
and audio
Comparison:
Lip sync error: skew between 40 and 60 ms
Pointer sync error: skew between 250 and 1500 ms
Bounds:
Pointer/audio/graphics in sync means -500 ms ≤ skew
≤ 750 ms
Out of sync means skew < -1000 ms or skew > 1250 ms
Transient means -1000 ms ≤ skew < -500 ms or 750 ms <
skew ≤ 1250 ms
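Since the pointer bounds are asymmetric, the same idea needs explicit lower and upper limits; a minimal sketch with the bounds above (the function name is an assumption):

```python
def classify_skew_asym(skew, in_lo=-0.500, in_hi=0.750,
                       out_lo=-1.000, out_hi=1.250):
    """Classify skew against asymmetric in-sync/transient bounds."""
    if in_lo <= skew <= in_hi:
        return "in sync"
    if out_lo <= skew <= out_hi:
        return "transient"
    return "out of sync"

assert classify_skew_asym(0.600) == "in sync"
assert classify_skew_asym(-0.800) == "transient"
assert classify_skew_asym(1.400) == "out of sync"
```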
Synchronization Requirements (5)
Digital audio on CD-ROM:
Maximum allowable jitter in perception experiments is 5-
10 ns; other experiments suggest 2 ms
The combination of audio and animation is not as
stringent as lip synchronization
Maximum allowable skew is +/- 80 ms
Stereo audio is tightly coupled
Maximum allowable skew is 20 ms; because of listening
errors, the suggested skew is +/- 11 ms
Loosely coupled audio channels: speaker and
background music
Maximum allowable skew is 500 ms.
Synchronization Requirements (6)
Production-level synchronization should be
guaranteed prior to the presentation of data
at the user interface
For example, in the case of recording synchronized
data for subsequent playback
Stored data should be captured and recorded with no
skew
For playback, the defined lip sync boundaries are 80 ms
For playback at a local and a remote workstation
simultaneously, the sync skew should be between -160 ms
and 0 ms (video should be ahead of audio for the remote
station due to pre-fetching)
Synchronization Requirements (7)
Presentation-level synchronization should be
defined at the user interface
This synchronization focuses on human
perception
Examples:
Video and image, overlay: +/- 240 ms
Video and image, non-overlay: +/- 500 ms
Audio and image (music with notes): +/- 5 ms
Audio and slide show (loosely coupled image): +/- 500 ms
Audio and text (text annotation): +/- 240 ms
Reference Model for Synchronization
Synchronization of multimedia objects is
classified with respect to a four-level system:
SPECIFICATION LEVEL (an open layer; includes applications and tools
that allow the creation of synchronization specifications, e.g., sync editors); editing
and formatting, mapping of user QoS to abstractions at the object level
OBJECT/SERVICE LEVEL (operates on all types of media and hides
differences between discrete and continuous media); plans, coordinates and
initiates presentations
STREAM LEVEL (operates on multiple media streams, provides inter-
stream synchronization); resource reservation and scheduling
MEDIA LEVEL (operates on a single stream, treated as a sequence of LDUs;
provides intra-stream synchronization); file and device access
Synchronization in Distributed
Environments
Synchronization information must be transmitted with the audio
and video streams so that the receiver side can synchronize the
streams
Delivery of the complete sync information can be done before the
start of the presentation
This is used in synthetic synchronization
Advantage: simple implementation
Disadvantage: presentation delay
Delivery of sync information can use out-of-band
communication via a separate sync channel
This is used in live synchronization
Advantage: no additional presentation delays
Disadvantage: an additional channel is needed; additional errors can occur

Synchronization in Distributed
Environments (2)
Delivery of synchronization
information can also use in-band
communication via multiplexed data streams,
i.e., synchronization information is carried in the
headers of the multimedia PDUs
Advantage: related sync information is delivered
together with the media units
Disadvantage: difficult to use for multiple sources
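A minimal sketch of what such an in-band header could carry (the field names are assumptions in the spirit of RTP-style headers, not a format defined in the slides):

```python
from dataclasses import dataclass

@dataclass
class MediaPDU:
    stream_id: int        # which stream this LDU belongs to
    sequence_number: int  # ordering and loss detection within the stream
    timestamp: float      # media time of the LDU, used by the sink for
                          # intra- and inter-stream synchronization
    payload: bytes        # the encoded LDU itself

# The 42nd LDU of a 40 ms-per-LDU stream
pdu = MediaPDU(stream_id=1, sequence_number=42,
               timestamp=42 * 0.040, payload=b"...")
```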
Synchronization in Distributed
Environments (3)
Location of Synchronization Operations
It is possible to synchronize media objects by recording
the objects together and leaving them together as one object,
i.e., combining objects into a new media object during
creation; the synchronization operation then happens at
the recording site
The synchronization operation can be placed at the sink. In
this case the demand on bandwidth is larger because
additional sync information must be transported
The synchronization operation can be placed at the source.
In this case the demand on bandwidth is smaller
because the streams are multiplexed according to the
synchronization requirements


Synchronization in Distributed
Environments (4)
Clock Synchronization
Consider synchronization accuracy between
clocks at source and destination
Global time-based synchronization needs
clock synchronization
In order to re-synchronize, we can allocate
buffers at the sink and start transmission of
audio and video in advance, or use NTP
(Network Time Protocol) to bound the
maximum clock offset
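A minimal sketch of the buffering idea, assuming the sink simply delays playback long enough to absorb the bounded clock offset and network jitter (the function name and formula are illustrative assumptions):

```python
def sink_buffer_delay(max_clock_offset, max_jitter):
    """Start playback this long after the first LDU arrives, so that the
    bounded clock offset (e.g. via NTP) plus delivery jitter can never
    starve the output device."""
    return max_clock_offset + max_jitter

# e.g. 10 ms NTP offset bound + 30 ms jitter -> start playback 40 ms late
delay = sink_buffer_delay(0.010, 0.030)
assert abs(delay - 0.040) < 1e-9
```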
Synchronization in Distributed
Environments (5)
Other Synchronization Issues
Synchronization in a distributed environment is a multi-
step process
Sync must be considered during object acquisition (during
video digitization)
Sync must be considered during retrieval (synchronized
access to frames of a stored video)
Sync must be considered during delivery of LDUs to the
network (traffic shaping; see the sketch after this list)
Sync must be considered during transport (use isochronous
protocols if possible)
Sync must be considered at the sink (synchronized delivery to the
output devices)
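One of these steps, traffic shaping at the sender, can be illustrated with a minimal token-bucket sketch (the class and parameter names are assumptions, not from the slides):

```python
import time

class TokenBucket:
    """LDUs are released to the network only when enough tokens
    (bytes of sending budget) have accumulated."""
    def __init__(self, rate_bytes_per_s, bucket_size):
        self.rate, self.capacity = rate_bytes_per_s, bucket_size
        self.tokens, self.last = bucket_size, time.monotonic()

    def send(self, ldu_size):
        now = time.monotonic()
        # refill tokens at the configured rate, capped at bucket capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= ldu_size:
            self.tokens -= ldu_size
            return True    # LDU may be sent now
        return False       # hold the LDU until more tokens accumulate
```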

Synchronization Specification Methods
Interval-based Specification
The presentation duration of an object is considered
as an interval
Examples of operations: A before(0) B, A overlap B, A
starts B, A equals B, A during B, A while(0,0) B
Advantage: easy to handle open LDUs and
therefore user interactions
Disadvantage: the model does not include skew
specifications
[Figure: interval timeline of Audio 1, Video 1, slides, animation, and Audio 2]
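A minimal sketch of interval-based operations, assuming intervals are (start, end) pairs; reading before(δ) as "B starts δ after A ends" is one plausible interpretation of the operator:

```python
def before(a, b, delta=0):   # A before(delta) B
    return b[0] - a[1] == delta

def overlaps(a, b):          # A overlap B
    return a[0] < b[0] < a[1] < b[1]

def starts(a, b):            # A starts B: same start, A ends earlier
    return a[0] == b[0] and a[1] < b[1]

def equals(a, b):            # A equals B
    return a == b

def during(a, b):            # A during B: A strictly inside B
    return b[0] < a[0] and a[1] < b[1]

audio1, video1 = (0, 10), (0, 10)
assert equals(audio1, video1)
assert during((3, 7), video1)
```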
Synchronization Specification (2)
Control Flow-based Specification: Hierarchical
Approach
The flow of concurrent presentation threads is synchronized at
predefined points of the presentation
Basic hierarchical specification: 1. serial synchronization, 2.
parallel synchronization of actions
An action can be atomic or compound
An atomic action handles the presentation of a single media object,
user input, or a delay
Compound actions are combinations of synchronization
operators and atomic actions
Delay as an atomic action allows modeling of further
synchronization (e.g., a delay in a serial presentation)
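A minimal sketch of the hierarchical approach (class names and durations are assumptions): atomic actions carry a duration, and serial/parallel compound actions compose them.

```python
class Atomic:
    def __init__(self, name, dur):
        self.name, self.dur = name, dur
    def duration(self):
        return self.dur

class Serial:                 # actions follow one another
    def __init__(self, *actions):
        self.actions = actions
    def duration(self):
        return sum(a.duration() for a in self.actions)

class Parallel:               # actions start together; longest one dominates
    def __init__(self, *actions):
        self.actions = actions
    def duration(self):
        return max(a.duration() for a in self.actions)

# Audio 1 in parallel with video, then slides in parallel with a delayed
# second audio clip; a delay is just another atomic action.
spec = Serial(Parallel(Atomic("audio1", 10), Atomic("video", 10)),
              Parallel(Atomic("slides", 12),
                       Serial(Atomic("delay", 2), Atomic("audio2", 10))))
assert spec.duration() == 22
```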
Synchronization Specification (3)
Control Flow-based Specification: Hierarchical Approach
[Figure: hierarchical specification tree combining Audio 1, Video, Slides, Animation, and Audio 2 via serial and parallel operators]
Advantage: easy to understand; natural support of hierarchy; integration of
interactive objects is easy
Disadvantage: an additional description of skews and QoS is necessary; we must add
presentation durations
Synchronization Specification (4)
Control Flow-based Specification: Timed Petri Nets
Advantage: timed Petri nets allow all kinds of
synchronization specifications
Disadvantage: complex specifications become difficult,
and the model offers insufficient abstraction of media object
content because the media objects must be split into
sub-objects

transition Input place with token
Output place
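A minimal sketch of a timed-Petri-net firing rule (the marking/transition encoding is an assumption): a transition fires only when all of its input places hold tokens, which is exactly how lip sync can require both an audio and a video LDU to be ready.

```python
def fire(marking, transition):
    """transition = (input_places, delay, output_places)."""
    inputs, delay, outputs = transition
    if all(marking.get(p, 0) > 0 for p in inputs):
        for p in inputs:
            marking[p] -= 1     # consume one token from each input place
        # a real scheduler would wait `delay` seconds before producing
        for p in outputs:
            marking[p] = marking.get(p, 0) + 1
        return True
    return False                # not enabled: some input place is empty

# Presentation fires only when audio and video LDUs are both available
marking = {"audio_ready": 1, "video_ready": 1, "presented": 0}
t_present = (("audio_ready", "video_ready"), 0.040, ("presented",))
assert fire(marking, t_present) and marking["presented"] == 1
```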
Summary
Different synchronization frameworks
Little's synchronization framework (Boston University)
Goal: support retrieval and delivery of multimedia
Firefly system (Buchanan and Zellweger)
Goal: automatically generate consistent presentation schedules
for interactive documents
HyTime: standard system with a hypermedia/time-based
structuring language
Goal: a standard for the structured representation of hypermedia
information
HyTime is an application of the Standard Generalized Markup
Language (SGML)
