
Google VP9 Summit, 2014-06-06

The Daala Video Codec:


Research Update
Monty Montgomery <monty@xiph.org>
(Mozilla, Xiph.Org)
The Daala Video Codec, Google VP9 Summit, 2014-06-06
Why Free Codecs Matter
...that's Free with a capital F

Free refers to control, not just cost

Encumbered codecs are a billion dollar toll-tax on communications tools

Codec licensing is used as weaponry in competitive battles

Licensing regimes are universally discriminatory

The success of the Internet was based on innovation without asking permission

Why Free Codecs Matter (continued)

Many applications can't tolerate any codec licensing costs at all

even the cost of just counting the users is too much

Ignoring the licensing creates risks that can show up at any time

a tax on success

Compatibility is usually the big cost, not CPU, bandwidth, etc.

...or begging forgiveness

...but that's missing the usual motivations behind new codecs!
http://xkcd.com/927/

More and More Codecs

An organization can't license an encumbered codec when there's no acceptable license offered

Building a new codec from scratch may cost less than licensing

Adversarial licensing is a risk in a competitive market

FRAND is often none of Fair, Reasonable, or Non-Discriminatory


Changing the Game

Creating good codecs isn't easy...

But we don't need many. Without weird competitive pressures the whole world can cooperate

Best implementations of the patented codecs are already often the free software ones

Where RF is established non-free codecs see no adoption. See: JPEG. Network effect decides

Unfortunately many different people care about many different things

Convincing everyone means being better in almost every way, not just one or two

What About Video Codecs?

Some existing royalty-free formats

Theora is circa 1999/2000 technology

VP8 solidly better than h264 baseline profile, but the bar was moving to high profile at release

VP9 advances performance but shares the same architecture technically and politically

Structural similarity to patented tech makes FUD too easy

Single company sponsorship makes some parties uneasy, even with permissive licenses

Strategy is Essential: Pitfalls to Avoid

Bad IPR story

Overoptimistic, late rush to market

Supporting competitors for short term gain

and driving off your partners at the same time

Releasing uncompelling technology

Merely competitive isn't good enough when you're the underdog

Exerting complete control over format

Occasionally throwing technology over a wall with a permissive license is not the same as open development.

Outside input is needed to improve technology, build an excited community of early adopters, sway critics, and find embarrassing bugs.

Giving up all of the above in order to speed time to market isn't worth it.

Late/Nonexistent hardware support

A real spec is not optional


Strategy is Essential: Now for the DOs

Design alternatives to avoid the worst patent thickets

Read and analyze patents, and publish the results

Patent the new technology we develop

Use a patent license that encourages adoption and discourages defection

Target next-next-generation to avoid rushing to market

Run the open project as an actually open project

Document, document, document!

(the whole point of a Doomsday Machine is lost if you keep it a secret!)

Strategy is Essential: These Parts Will Be Hard

Be best-in-class or go home

Woo competitors and critics

especially those who think they're allies

Find new niches, uses, applications that are unoccupied and fill them

Hardware Support

Next Generation Video: Daala

Let's take some of the strategy that worked in Opus, and apply it to video:

Work in a public process in a recognized SDO with a strong IPR disclosure policy and Opus-like patent licensing

Question assumptions in the conventional structure of video codecs, no sacred cows

Target applications where high flexibility is essential

optimize for perception not PSNR

[..]-Second Introduction to Video Coding

Most video codecs use the same basic ideas:

Prediction: Consider what you know about previous or typical content to predict future data

Transformation: Rearrange the information to make it more compressible

Quantization: Strategically lower the resolution of the transformed data

Entropy coding: Code the quantized data taking probability distribution into account

[..]-Second Introduction to Video Coding: Prediction

Intra-Prediction: Predict portions of the current frame from already decoded portions of the current frame

Inter-Prediction: Predict portions of the current frame from previous decoded frames

Motion Compensation to eliminate temporal redundancy


[..]-Second Introduction to Video Coding: Transformation

Map spatial pixel values into some other more compressible representation via a 2D transform, usually the DCT.

[..]-Second Introduction to Video Coding: Quantization and Coding

Quantization: Compute the difference remaining after prediction, then lower its resolution.

This is the lossy part

Coding: The quantized error signal is (hopefully) random numbers from some probability distribution.

Pack it efficiently into the bitstream
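
The four steps above can be sketched on a toy 1-D signal. This is only an illustration under stated assumptions: the function names, the 4-sample block, and the step size are hypothetical, the previous block stands in for the predictor, and entropy coding is left out (the integer `levels` are what a real coder would pack into the bitstream).

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(x)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, v in enumerate(x))
            for k in range(n)]

def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c[k]
                * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
            for i in range(n)]

def code_block(block, prediction, step):
    # Prediction: only the difference from the predictor is coded.
    residual = [b - p for b, p in zip(block, prediction)]
    # Transformation: move the residual into a more compressible basis.
    coeffs = dct(residual)
    # Quantization: the lossy step, coefficients become small integers.
    levels = [round(c / step) for c in coeffs]
    # Decoder side: dequantize, inverse transform, add the prediction back.
    recon_residual = idct([l * step for l in levels])
    recon = [p + r for p, r in zip(prediction, recon_residual)]
    return levels, recon

prev = [100, 102, 104, 106]   # already decoded data used as the predictor
cur = [101, 103, 106, 109]    # block being coded
levels, recon = code_block(cur, prev, step=2.0)
max_err = max(abs(a - b) for a, b in zip(cur, recon))
```

A larger `step` produces more zero levels (fewer bits) at the cost of a larger `max_err`.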


Daala Technological Differences (so far)

Lapped transforms rather than traditional DCT

Implemented via reversible lifting

Multisymbol arithmetic encoding

Frequency domain intra-prediction

Pspherical vector quantization

Chroma plane prediction from luma planes

Overlapping-block motion compensation

Time-frequency resolution switching


DCT Blocking Artifacts

When we have few bits, quantization errors may cause a step discontinuity between blocks

Error correlated along block edge → highly visible

Standard solution: a loop filter

Move pixel values near block edges closer to each other


Lapped Transforms

No more blocking artifacts, without loop filter

Computationally cheaper than wavelets

Better compression than DCT or Wavelets

Doesn't completely disrupt block-based DCT infrastructure

More details at http://people.xiph.org/~xiphmont/demo/daala/demo1.shtml

Transform     4-point      8-point      16-point
KLT           7.5825 dB    8.8462 dB    9.4781 dB
DCT           7.5701 dB    8.8259 dB    9.4555 dB
LT            8.6060 dB    9.5572 dB    9.8614 dB
9/7 Wavelet   9.46 dB
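
The slide does not say how the figures in the table were computed. The DCT row looks consistent with the usual theoretical coding gain on a unit-variance AR(1) source with correlation 0.95, so the sketch below assumes that model (an assumption, not something the source states). It covers only the plain DCT; the KLT, lapped transform, and wavelet rows are not reproduced. The function names are illustrative.

```python
import math

def dct_matrix(n):
    """Rows are the orthonormal DCT-II basis vectors of length n."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)]
            for k in range(n)]

def coding_gain_db(basis, rho=0.95):
    """Coding gain of an orthonormal transform on a unit-variance AR(1) source.

    rho is the assumed correlation between adjacent samples.
    """
    n = len(basis)
    log_sum = 0.0
    for row in basis:
        # Variance of this coefficient: row * R * row^T, R[i][j] = rho^|i-j|.
        var = sum(row[i] * row[j] * rho ** abs(i - j)
                  for i in range(n) for j in range(n))
        log_sum += math.log10(var)
    # For an orthonormal transform the arithmetic mean of the variances is 1,
    # so the gain reduces to minus the mean log-variance.
    return -10.0 * log_sum / n

gains = {n: coding_gain_db(dct_matrix(n)) for n in (4, 8, 16)}
```

Under that assumed model, `gains` can be compared against the DCT row of the table.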

Why not Wavelets?

Wavelets were touted as the next big thing in video coding 10-15 years ago.

Good LF resolution (models correlation well)

Better time resolution in HF (prevents ringing)

Smooth basis functions (no blocking artifacts)

Why not Wavelets? (continued)

Good for large scale correlations, but codecs didn't use them for that

Wavelets break down at low rates

HF texture requires more bits to code separately at every spatial position

Extreme low-passing is typical


Arithmetic Coding

Given some symbol probabilities, efficiently pack symbols into a bitstream

Inherently serial; major performance limitation in hardware

There are many fast approximations if your symbols are binary

But many of them are patented

What about non-binary?

We used multisymbol coding in Opus


Arithmetic Coding

Turns out that non-binary coding makes part of the process inherently parallel

Reduces the serial part in direct proportion to the symbol range

~2x speedup when testing on top of VP8

Multisymbol probability modeling is harder, but often more powerful

[Figure: successive subdivision of the interval [0, 1) by symbol; final range: [58/108, 59/108)]
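
To make the interval-narrowing idea concrete, here is a minimal multisymbol arithmetic coder using exact fractions. It is a teaching sketch, not the range coder used in Opus or Daala: real coders work with fixed-width integers and emit bits incrementally. The symbol alphabet and frequency table are made up for the example.

```python
from fractions import Fraction

def encode(symbols, freqs):
    """Narrow [low, low + width) once per symbol; return the final interval."""
    total = sum(freqs.values())
    cum, acc = {}, 0
    for s in sorted(freqs):          # cumulative frequency below each symbol
        cum[s] = acc
        acc += freqs[s]
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        low += width * Fraction(cum[s], total)
        width *= Fraction(freqs[s], total)
    return low, width

def decode(value, count, freqs):
    """Recover `count` symbols from any value inside the final interval."""
    total = sum(freqs.values())
    out = []
    low, width = Fraction(0), Fraction(1)
    for _ in range(count):
        acc = 0
        for s in sorted(freqs):
            s_low = low + width * Fraction(acc, total)
            s_width = width * Fraction(freqs[s], total)
            if s_low <= value < s_low + s_width:
                out.append(s)
                low, width = s_low, s_width
                break
            acc += freqs[s]
    return out

freqs = {"A": 3, "B": 2, "C": 1}     # hypothetical 3-symbol model
message = list("ABCA")
low, width = encode(message, freqs)
decoded = decode(low + width / 2, len(message), freqs)
```

Each non-binary symbol narrows the interval in one step, where a binary coder would need several sequential decisions to express the same choice.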

Typical Intra-Prediction

The intra-prediction modes for 4x4 blocks in WebM (VP8).

Pros:

Uses image data from neighboring blocks

Only need to remember 1 pixel border

Parameterizable for any angle θ

Predicts difficult to code features well

edges are extended

Efficient implementation (~no multiplies)

Cons:

Poor prediction in textured areas

Blocks L, UL, U, UR must be decoded

Doesn't work with overlapped blocks!
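
As a rough sketch of what such spatial predictors do, the code below implements three simple modes (vertical, horizontal, DC) for a 4x4 block from its 1-pixel border and picks the one with the lowest sum of absolute differences. It is not VP8's exact mode set or rounding; the mode names and helper functions are illustrative.

```python
def predict_4x4(mode, above, left):
    """above: 4 pixels from the row above; left: 4 pixels from the left column."""
    if mode == "V":    # extend the row above downward
        return [list(above) for _ in range(4)]
    if mode == "H":    # extend the left column rightward
        return [[left[r]] * 4 for r in range(4)]
    if mode == "DC":   # flat block at the rounded mean of the 8 border pixels
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(mode)

def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p) for brow, prow in zip(block, pred)
               for b, p in zip(brow, prow))

def best_mode(block, above, left):
    """Pick the mode whose prediction is closest to the block."""
    return min(("V", "H", "DC"),
               key=lambda m: sad(block, predict_4x4(m, above, left)))

# A vertical edge: the row above continues straight down into the block.
above = [10, 10, 200, 200]
left = [10, 10, 10, 10]
block = [[10, 10, 200, 200] for _ in range(4)]
mode = best_mode(block, above, left)
```

Only additions, copies, and a shift are needed, which matches the "~no multiplies" point above.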


Decoding an Intra Frame

Neighboring Blocks:
Decoded Image
Unpredicted
Predicted
Currently Predicting
Needs Post-filter
Prediction Support

Intra-Prediction in the Coefficient Domain

We don't have the pixels needed for a traditional intra predictor

Lapping reduces the need for prediction, but only somewhat

Why not predict in the lapped DCT domain?

Each coefficient for the block predicted as a weighted sum of the neighboring blocks' coeffs

If not for the lapping we could have the same predictors either way

Directions don't have a clear meaning in the transform domain, so how do we design these?
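
The remark that, without lapping, the same predictors would exist in either domain can be checked on a small case. The sketch below uses a 1-D, non-lapped, 4-point orthonormal DCT and a "horizontal" predictor that repeats the last pixel of the left neighbor; the block size and names are assumptions made for illustration. It builds the equivalent weight matrix `W` so that each predicted coefficient is a weighted sum of the neighbor's coefficients.

```python
import math

N = 4  # block length, chosen only for illustration

def basis(k, i):
    """Sample i of the k-th orthonormal DCT-II basis vector."""
    scale = math.sqrt((1 if k == 0 else 2) / N)
    return scale * math.cos(math.pi * (i + 0.5) * k / N)

def dct(x):
    return [sum(basis(k, i) * x[i] for i in range(N)) for k in range(N)]

# W[k][j]: weight of the left neighbor's coefficient j when predicting
# coefficient k. The neighbor's last pixel is sum_j basis(j, N-1) * L[j],
# and the DCT of a constant block p has coefficients p * sum_i basis(k, i).
W = [[sum(basis(k, i) for i in range(N)) * basis(j, N - 1)
      for j in range(N)] for k in range(N)]

def predict_coeffs(left_coeffs):
    """Coefficient-domain prediction: one weighted sum per coefficient."""
    return [sum(W[k][j] * left_coeffs[j] for j in range(N)) for k in range(N)]

left = [3.0, 5.0, 8.0, 13.0]
pixel_domain = dct([left[-1]] * N)          # predict in pixels, then transform
coeff_domain = predict_coeffs(dct(left))    # predict directly on coefficients
```

With lapped transforms the weights can no longer be derived from a pixel-domain rule this way, which is why they have to be designed (or trained) directly.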

Machine Learning for Intra-Predictors

Original VP8 modes
K-Means refinement
Training Image

So far: ~0.25 dB more coding gain than classic intrapred, plus actually works with lapped transforms

Not Just Limited to Directions!

Mode 1 now predicts periodic texture!

Time-Frequency Resolution Switching (TF)

Opus uses TF to make different time/frequency resolution tradeoffs in each audio band (thus the name)

Daala uses TF to cheaply merge/split blocks in the transform domain without reversing or repeating the transform

More details at http://people.xiph.org/~xiphmont/demo/daala/demo3.shtml
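
One way to picture merging blocks without redoing the transform is a 2-point Haar butterfly applied across matching coefficients of two adjacent blocks. The sketch below is a simplified 1-D illustration of that idea, not Daala's actual TF filters; `tf_merge` and `tf_split` are hypothetical names.

```python
import math

S = math.sqrt(0.5)

def dct2(a, b):
    """Orthonormal 2-point DCT (the same as a Haar butterfly)."""
    return [S * (a + b), S * (a - b)]

def tf_merge(block_a, block_b):
    """Combine matching coefficients of two adjacent blocks.

    'low' holds the sums (finer frequency, coarser position) and 'high'
    the differences. The operation is orthonormal, so it is invertible
    and preserves energy.
    """
    low = [S * (a + b) for a, b in zip(block_a, block_b)]
    high = [S * (a - b) for a, b in zip(block_a, block_b)]
    return low, high

def tf_split(low, high):
    """Inverse of tf_merge: recover the two original coefficient blocks."""
    return ([S * (l + h) for l, h in zip(low, high)],
            [S * (l - h) for l, h in zip(low, high)])

pixels = [10.0, 12.0, 20.0, 26.0]
a = dct2(pixels[0], pixels[1])    # coefficients of the first 2-point block
b = dct2(pixels[2], pixels[3])    # coefficients of the second 2-point block
low, high = tf_merge(a, b)
a2, b2 = tf_split(low, high)
```

In this example `low[0]` equals the DC term a 4-point orthonormal DCT would give for the merged block, while the remaining terms only approximate the larger transform.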


Chroma Plane Prediction from Luma (CfL)

RGB → YUV moves most of the entropy into the luma channel but residual local correlation remains, esp. edge locations

Existing published CfL techniques work in the pixel (spatial) domain

Predicting chroma from luma in the pixel domain can be computationally complex

But in the frequency domain it's fast!

TF enables frequency domain CfL with subsampled chroma
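
A very reduced picture of frequency-domain CfL: treat the chroma AC coefficients of a block as a scaled copy of the co-located luma AC coefficients and code only what the scaling misses. The sketch below fits the scale by least squares on made-up coefficients. How the scale is derived or signaled in Daala is not described on the slide, so `fit_alpha` is purely illustrative.

```python
def fit_alpha(luma_ac, chroma_ac):
    """Least-squares scale so that chroma_ac is approximately alpha * luma_ac."""
    num = sum(l * c for l, c in zip(luma_ac, chroma_ac))
    den = sum(l * l for l in luma_ac)
    return num / den if den else 0.0

def predict_chroma(luma_ac, alpha):
    """One multiply per coefficient, with no pixel-domain work."""
    return [alpha * l for l in luma_ac]

# Hypothetical AC coefficients of co-located luma and chroma blocks.
luma_ac = [40.0, -12.0, 6.0, 0.0, -2.0]
chroma_ac = [-20.0, 6.0, -3.0, 0.0, 1.0]

alpha = fit_alpha(luma_ac, chroma_ac)
residual = [c - p for c, p in zip(chroma_ac, predict_chroma(luma_ac, alpha))]
```

Here the chroma block is exactly -0.5 times the luma block, so the residual left to code is zero. Real content leaves a nonzero residual.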


Chroma from Luma (continued)

So far, our CfL results do somewhat worse on every objective metric (PSNR, SSIM, fast-SSIM, PSNRHVS)

But it looks clearly better!

Most metrics designed on greyscale

For more information on transform-domain CfL, see:

http://people.xiph.org/~xiphmont/demo/daala/demo4.shtml

Motion Compensation

Predict frames from past (sometimes future) frames, compensating for things that move

Traditional motion compensation displaces blocks of pixels, creates blocking artifacts

Overlapped-Block Motion Compensation

Overlap the predictions from multiple nearby MVs, and blend them with a window

Also a form of multi-hypothesis prediction
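
In one dimension, blending with a window amounts to a cross-fade between the predictions obtained from two neighboring motion vectors. The sketch below uses a linear ramp as the window. The window shape and function name are assumptions, since the slide does not specify the window used.

```python
def obmc_blend(pred_a, pred_b):
    """Blend two predictions of the same overlap region.

    pred_a comes from the motion vector of the block on the left and
    pred_b from the block on the right. The weights ramp linearly and
    always sum to 1, so agreeing predictions pass through unchanged.
    """
    n = len(pred_a)
    out = []
    for i in range(n):
        w_b = (i + 0.5) / n          # weight of the right-hand prediction
        out.append((1.0 - w_b) * pred_a[i] + w_b * pred_b[i])
    return out

# Two motion vectors that disagree about the overlap region.
blended = obmc_blend([0.0, 0.0, 0.0, 0.0], [8.0, 8.0, 8.0, 8.0])
```

Instead of a hard step where the two predictions meet, the result ramps smoothly from one to the other. The same averaging is also what can blur sharp features or create ghosting when the two predictions disagree.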

OBMC (continued)

Used by Dirac

Also want to avoid blocking artifacts with wavelets

PSNR improvements as much as 1 dB

Issues

Motion vectors no longer independent

Can use iterative refinement, dynamic programming (Chen and Willson, 2000), bigger cost of ignoring this

Can blur sharp features

Can add ghosting artifacts

Handling multiple block sizes


Variable Block Size

Need a way to change block size that doesn't create blocking artifacts

Dirac subdivides all blocks to the smallest level and copies MVs

Lots of setup overhead for smaller blocks

Redundant computations for adjacent blocks with same MV

Adaptive Subdivision

Allow artifact-free subdivision in a 4-8 mesh

Neighbors differ by at most 1 level of subdivision

Fine-grained control (MV rate doubles each level)

Efficient R-D optimization methods (Balmelli 2001)

Developed for compressing triangle mesh/terrain data

Larger interpolation kernels, less setup overhead, fewer redundant calculations


Multiresolution Blending

Technique due to Burt and Adelson 1983

Decompose predictor into low-pass and high-pass subbands LL, HL, LH, HH

Blend with small window in high-pass bands

Like enblend used for panorama stitching

Reduces ghosting and blurring

Proposed simplification

One level of Haar decomposition (no multiplies)

Blend LL band like OBMC, copy the rest

Reduces OBMC multiplies by 75%


Edge-directed Subpel Interpolation

Fractional pixel vectors need interpolation

Possible to do better than linear filters

All we need is something fast enough for video

David Schleef has something, we haven't tried it yet

[Figure labels: whole pel, half pel, quarter pel]

Pspherical Vector Quantization

Preserving the overall energy in a band turned out to be perceptually critical for audio

Opus designed to explicitly preserve energy

Take a set of values and treat them as a point on an N-dimensional sphere: the radius is the energy, and the angle is the detail. Code these separately.

Intuitively it makes sense for image coding: Might it be better for low quality blocks to become noisy instead of blurry? Film grain

PVQ provides a convenient, well-tested means of gain-shape coding via algebraic codebooks

Pspherical Vector Quantization (continued)

We want a fast algebraic representation of evenly distributed points on the surface of a sphere

Don't know how to do that for arbitrary dimension

Use evenly distributed points on a pyramid instead

Pyramid Vector Quantization (Fischer, 1986)

Warp the pyramid into a sphere, thus pspherical

For N-dimensional vector, allocate K "pulses"

Codebook: normalized vectors with integer coordinates whose magnitudes sum to K
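
The codebook definition above is small enough to enumerate by brute force for tiny N and K. The sketch below lists every integer vector whose absolute values sum to K and then normalizes each one onto the unit sphere. The function names are illustrative, and a practical coder would index the codebook combinatorially instead of listing it.

```python
import math
from itertools import product

def pvq_codebook(n, k):
    """All integer vectors of dimension n whose absolute values sum to k."""
    return [v for v in product(range(-k, k + 1), repeat=n)
            if sum(abs(x) for x in v) == k]

def normalize(v):
    """Project a pyramid point onto the unit sphere (the 'warp')."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)

codebook = pvq_codebook(2, 3)                  # N = 2, K = 3
positive_only = [v for v in codebook if all(x >= 0 for x in v)]
unit_vectors = [normalize(v) for v in codebook]
```

For N = 2 and K = 3 this gives 4 codewords when values are restricted to be non-negative and 12 when signs are allowed.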

PVQ Enumeration

Assume the following codebook:

dimension N=2

Resolution (pulses) K=3

Vector values are positive

PVQ Enumeration (continued)

Assume the following codebook:

dimension N=2

Resolution (pulses) K=3

Vector values may be positive or negative

Pspherical Warping

Assume the following codebook:

dimension N=2

Resolution (pulses) K=3

Vector values may be positive or negative

Project codebook points onto unit circle

Pspherical Codebooks

PVQ with Prediction

Video provides us with useful predictors

We want to treat vectors in the direction of the prediction as special

They are much more likely!

Subtracting and coding the residual would lose energy preservation

Solution: align the codebook axes with the prediction, treat one dimension differently

2-D Projection Example

Input + Prediction

Compute Householder Reflection

Apply Reflection

Compute & code angle θ

Code other dimensions

[Figure labels: Prediction, Input, θ]

Prediction via Theta-PVQ

Creates another intuitive parameter, θ

"How much like the predictor are we?"

θ = 0 → use predictor exactly

θ determines how many pulses go in the "prediction" direction

K (and thus bitrate) for remaining N-1 dimensions adjusted down

Remaining N-1 dimensions have N-2 degrees of freedom (no redundancy)

Can repeat for more predictors
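
The reflection and angle steps can be sketched as follows. This is a plain floating-point illustration of the geometry (gain, angle θ, and the remaining N-1 values), with hypothetical function names. It does not include the quantization of any of these quantities and assumes the input and prediction are both nonzero.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def householder_reflect(x, r):
    """Reflect x with the Householder reflection that maps r onto an axis.

    The axis m is the largest-magnitude component of r, and the reflection
    sends r to -s * |r| * e_m, where s is the sign of r[m].
    """
    m = max(range(len(r)), key=lambda i: abs(r[i]))
    s = 1.0 if r[m] >= 0 else -1.0
    v = [c / norm(r) for c in r]
    v[m] += s                                   # reflection normal
    vv = sum(c * c for c in v)
    vx = sum(a * b for a, b in zip(v, x))
    z = [a - 2.0 * vx / vv * b for a, b in zip(x, v)]
    return z, m, s

def theta_decompose(x, r):
    """Split x into gain, angle to the prediction, and the other dimensions."""
    z, m, s = householder_reflect(x, r)
    gain = norm(x)
    cos_theta = max(-1.0, min(1.0, -s * z[m] / gain))
    theta = math.acos(cos_theta)                # 0 means "exactly the predictor"
    rest = [c for i, c in enumerate(z) if i != m]
    return gain, theta, rest

prediction = [3.0, 4.0]
gain_same, theta_same, rest_same = theta_decompose([3.0, 4.0], prediction)
gain_orth, theta_orth, rest_orth = theta_decompose([4.0, -3.0], prediction)
```

When the input equals the prediction, θ is 0 and nothing is left in the other dimensions. When the input is orthogonal to the prediction, θ is π/2 and all of the energy sits in the remaining dimensions.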


Today's Formats Are a Long Way From Exhausting the Possible

How about unblending a cross-fade?

Spatial Sparsity-Induced Prediction for Images and Video: A Simple Way to Reject Structured Interference
Gang Hua and Onur G. Guleryuz (2011)

Recent Work & Updates

Monty's demo pages at:
https://people.xiph.org/~xiphmont/demo
document and explain many of these techniques in more detail, but there have been new developments even since then.

Demo 5 (discussing Pspherical Vector Quantization in detail) is coming, I promise!

The Road Ahead

The techniques we've been working with appear to work, but there is much to be done

Industry is currently distracted figuring out how they're going to deploy HEVC (/VP9)

Your participation is welcome!

http://xiph.org/daala

Opus benefited from some applications served by no other audio codec.

Does something similar exist for video?


Daala: Additional Resources

Website: http://www.xiph.org/daala

Mailing list: daala@xiph.org

IRC: #daala on irc.freenode.net

Git repository: git://git.xiph.org/daala.git

Demos: http://people.xiph.org/~xiphmont/demo/

Questions?
