
25/1/2016

Why is python itertools "consume" recipe faster than calling next n times? - Stack Overflow


Why is python itertools consume recipe faster than calling next n times?

In the python documentation for itertools it provides the following "recipe" for advancing an iterator n steps:
    import collections
    from itertools import islice

    def consume(iterator, n):
        "Advance the iterator n-steps ahead. If n is none, consume entirely."
        # Use functions that consume iterators at C speed.
        if n is None:
            # feed the entire iterator into a zero-length deque
            collections.deque(iterator, maxlen=0)
        else:
            # advance to the empty slice starting at position n
            next(islice(iterator, n, n), None)

I'm wondering why this recipe is fundamentally different from something like this (aside from the handling of consuming the whole iterator):
    def other_consume(iterable, n):
        for i in xrange(n):
            next(iterable, None)

I used timeit to confirm that, as expected, the above approach is much slower. What's going on in the recipe that allows for this superior performance? I get that it uses islice, but looking at islice, it APPEARS to be doing fundamentally the same thing as the code above:
    def islice(iterable, *args):
        s = slice(*args)
        it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
        nexti = next(it)
        ### it seems as if this loop yields from the iterable n times via enumerate
        ### how is this different from calling next n times?
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)

Note: even if instead of importing islice from itertools I define it using the python equivalent from the docs shown above, the recipe is still faster.
EDIT: timeit code here:

    timeit.timeit('a = iter([random() for i in xrange(1000000)]); consume(a, 1000000)',
                  setup="from __main__ import consume, random", number=10)
    timeit.timeit('a = iter([random() for i in xrange(1000000)]); other_consume(a, 1000000)',
                  setup="from __main__ import other_consume, random", number=10)

other_consume is ~2.5x slower each time I run this

python python-2.7 itertools


edited May 19 '13 at 0:08
asked May 18 '13 at 23:04
qwwqwwq

2 Answers

The documentation on itertools.islice() is flawed and doesn't handle the edge case for start == stop properly. It is exactly that edge case that consume() uses.

http://stackoverflow.com/questions/16629845/why-is-python-itertools-consume-recipe-faster-than-calling-next-n-times


For islice(it, n, n), exactly n elements are consumed from it but nothing is ever yielded. Instead, StopIteration is raised after those n elements have been consumed.

The Python version you used to test with, on the other hand, raises StopIteration immediately without ever consuming anything from it. This makes any timings against this pure-python version incorrect and way too fast.

This is because the xrange(n, n, 1) iterator immediately raises StopIteration:
    >>> it = iter(xrange(1, 1))
    >>> print next(it)
    Traceback (most recent call last):
      File "prog.py", line 4, in <module>
        print next(it)
    StopIteration
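The difference between the two behaviours can be checked directly. The following is a minimal sketch, assuming Python 3 (so range and sys.maxsize stand in for xrange and sys.maxint); docs_islice is a port of the pure-Python code quoted in the question, and remaining_after is a hypothetical helper added only for this illustration. The try/except is also an addition: it reproduces the silent generator exit that Python 2 performed when next(it) failed.

```python
import sys
from itertools import islice


def docs_islice(iterable, *args):
    """Python 3 port of the pure-Python islice quoted in the question."""
    s = slice(*args)
    it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
    try:
        # For start == stop the range is empty, so this call fails at once,
        # before a single element has been pulled from `iterable`.
        nexti = next(it)
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)
    except StopIteration:
        # Python 2 ended the generator silently here; Python 3.7+ would
        # otherwise convert the StopIteration into a RuntimeError (PEP 479).
        return


def remaining_after(islice_impl, n=5, total=10):
    """Advance a fresh iterator via islice_impl(it, n, n); return what is left."""
    it = iter(range(total))
    next(islice_impl(it, n, n), None)
    return list(it)


real_left = remaining_after(islice)       # C implementation from itertools
docs_left = remaining_after(docs_islice)  # port of the documented equivalent
print(real_left)  # [5, 6, 7, 8, 9]
print(docs_left)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With the real islice the first five items are gone; with the documented equivalent the iterator is untouched, which is why a consume built on it looks so fast.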
edited May 19 '13 at 0:14
answered May 18 '13 at 23:49
Martijn Pieters

I think this only explains a modest amount of the speed difference. Using the naive version takes about 10 times as long as the itertools.islice recipe. Replacing the for i in range(n) with an enumerate loop (breaking when the index equals n) reduces the time taken to 7 times as long as the itertools version, but I think the rest comes from having the logic entirely in C. - Blckknght May 19 '13 at 0:03

Indeed, it's the flaw in the docs that makes the OP test flawed. - Martijn Pieters May 19 '13 at 0:16

Indeed, as implemented my consume never consumes anything! - qwwqwwq May 19 '13 at 0:19

And apologies for the slower answer speed; testing and typing all this on an iPhone is slow. - Martijn Pieters May 19 '13 at 0:20

The reason that the recipe is faster is that its key pieces (islice, deque) are implemented in C, rather than in pure Python. Part of it is that a C loop is faster than for i in xrange(n). Another part is that Python function calls (e.g. next()) are more expensive than their C equivalents.

The version of itertools.islice that you've copied from the documentation is not correct, and its apparently great performance is because a consume function using it doesn't consume anything. (For that reason I'm not showing that version's test results below, though it was pretty fast! :)
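For comparison, a pure-Python stand-in that does consume its input when start == stop might look like the sketch below. It assumes Python 3; islice_sketch and _SENTINEL are hypothetical names, and the helper is deliberately limited to non-negative integer start and stop with a step of 1, so it is not a full replacement for itertools.islice.

```python
_SENTINEL = object()  # marks exhaustion without relying on StopIteration


def islice_sketch(iterable, start, stop):
    """Simplified stand-in for itertools.islice(iterable, start, stop)."""
    it = iter(iterable)
    # Skip the first `start` items. This is the consuming step that the
    # version copied from the docs never reaches when start == stop.
    for _ in range(start):
        if next(it, _SENTINEL) is _SENTINEL:
            return
    # Yield the items in [start, stop); an empty range when start >= stop.
    for _ in range(stop - start):
        item = next(it, _SENTINEL)
        if item is _SENTINEL:
            return
        yield item


it = iter(range(10))
next(islice_sketch(it, 3, 3), None)  # same call shape as in the recipe
first_after_skip = next(it)
print(first_after_skip)  # 3
```

A consume function built on this helper would advance the iterator correctly, but every step is still a Python-level next() call, so it should not be expected to match the C implementation's speed.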
Here are a couple different implementations, so we can test what is fastest:
    import collections
    from itertools import islice

    # this is the official recipe
    def consume_itertools(iterator, n):
        "Advance the iterator n-steps ahead. If n is none, consume entirely."
        # Use functions that consume iterators at C speed.
        if n is None:
            # feed the entire iterator into a zero-length deque
            collections.deque(iterator, maxlen=0)
        else:
            # advance to the empty slice starting at position n
            next(islice(iterator, n, n), None)

    # your initial version, using a for loop on a range
    def consume_qwwqwwq(iterator, n):
        for i in xrange(n):
            next(iterator, None)

    # a slightly better version, that only has a single loop:
    def consume_blckknght(iterator, n):
        if n <= 0:
            return
        for i, v in enumerate(iterator, start=1):
            if i == n:
                break

Timings on my system (Python 2.7.3 64-bit on Windows 7):

    >>> test = 'consume(iter(xrange(100000)), 1000)'
    >>> timeit.timeit(test, 'from consume import consume_itertools as consume')
    7.623556181657534
    >>> timeit.timeit(test, 'from consume import consume_qwwqwwq as consume')
    106.8907442334584
    >>> timeit.timeit(test, 'from consume import consume_blckknght as consume')
    56.81081856366518

My assessment is that a nearly empty Python loop takes seven or eight times longer to run than the equivalent loop in C. Looping over two sequences at once (as consume_qwwqwwq does by calling next on iterator in addition to the for loop on the xrange) makes the cost roughly double.
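The comparison can be re-run on a current interpreter with a small harness like the one below. This is only a sketch, assuming Python 3 (range instead of xrange); the function names consume_next_calls and consume_enumerate, the helper time_consumer, and the repeat count are choices made for this illustration, and the absolute numbers will differ from the Python 2.7.3 timings quoted above.

```python
import timeit
from itertools import islice


def consume_itertools(iterator, n):
    # The recipe's approach: the empty slice [n, n) is skipped to in C.
    next(islice(iterator, n, n), None)


def consume_next_calls(iterator, n):
    # One Python-level next() call per step, plus the loop over range(n).
    for _ in range(n):
        next(iterator, None)


def consume_enumerate(iterator, n):
    # A single Python loop driven by the iterator itself.
    if n <= 0:
        return
    for i, _ in enumerate(iterator, start=1):
        if i == n:
            break


def time_consumer(consumer, size=100000, n=1000, number=200):
    """Total seconds for `number` runs of consumer on a fresh iterator."""
    return timeit.timeit(lambda: consumer(iter(range(size)), n), number=number)


consumers = (consume_itertools, consume_next_calls, consume_enumerate)
timings = {f.__name__: time_consumer(f) for f in consumers}
for name, seconds in timings.items():
    print("%-20s %.4f s" % (name, seconds))
```

Because each run builds a fresh iterator and only then advances it, the measurement stays focused on the consuming step rather than on constructing the data.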
edited May 19 '13 at 0:48
answered May 18 '13 at 23:18
Blckknght

I am implementing islice in pure python and still getting the same results as I explain in my question, so this can't be the reason. - qwwqwwq May 18 '13 at 23:25

@qwwqwwq How are you timing it? I imagine you are either using a tiny amount of data, so the time it takes to do it really doesn't matter, and overhead happens to be higher, or you are getting the timing wrong. - Latty May 18 '13 at 23:54

@qwwqwwq: Your pure Python islice implementation doesn't work correctly. It does nothing when start and stop are both the same (next(iter(xrange(100, 100))) raises StopIteration immediately). Since that's what the consume function calls it with, your consume will not consume anything (but it will do so very quickly!). - Blckknght May 19 '13 at 0:00

ah I see.. I copied that code for islice straight from the docs, guess it was incorrect..? if you put in an answer I will accept - qwwqwwq May 19 '13 at 0:14

@qwwqwwq: I've updated with several different implementations that I've timed against each other. A pure Python version using enumerate to loop over the iterator directly takes about 7.5 times longer than the itertools recipe. The simple code you posed as an example takes almost twice as long (14 times as long as the itertools code in my test). So, I think the doubled next calls that @MartijnPieters mentioned in his answer are responsible for about half the slowdown you were seeing. The rest of the story is that any Python loop is going to be slower than the equivalent in C. - Blckknght May 19 '13 at 0:46
