1/7/2015

Saving the floating-point state (Linus Torvalds)

Newsgroups: fa.linux.kernel
From: torvalds@transmeta.com (Linus Torvalds)
Subject: Re: context switch vs. signal delivery [was: Re: Accelerating user mode linux]
Original-Message-ID: <ail2qh$bf0$1@penguin.transmeta.com>
Date: Mon, 5 Aug 2002 05:36:20 GMT
Message-ID: <fa.k1162hv.5mq5i9@ifi.uio.no>

In article <m3u1mb5df3.fsf@averell.firstfloor.org>,
Andi Kleen <ak@muc.de> wrote:
> Ingo Molnar <mingo@elte.hu> writes:
>
>
>> actually the opposite is true, on a 2.2 GHz P4:
>>
>>   $ ./lat_sig catch
>>   Signal handler overhead: 3.091 microseconds
>>
>>   $ ./lat_ctx -s 0 2
>>   2 0.90
>>
>> ie. *process to process* context switches are 3.4 times faster than signal
>> delivery. Ie. we can switch to a helper thread and back, and still be
>> faster than a *single* signal.
>
> This is because the signal save/restore does a lot of unnecessary stuff.
> One optimization I implemented at one time was adding a SA_NOFP signal
> bit that told the kernel that the signal handler did not intend
> to modify floating point state (few signal handlers need FP) It would
> not save the FPU state then and reached quite some speedup in signal
> latency.
>
> Linux got a lot slower in signal delivery when the SSE2 support was
> added. That got this speed back.

This will break _horribly_ when (if) glibc starts using SSE2 for things
like memcpy() etc.

I agree that it is really sad that we have to save/restore FP on
signals, but I think it's unavoidable. Your hack may work for you, but
it just gets really dangerous in general. Having signals randomly
subtly corrupt some SSE2 state just because the signal handler uses
something like memcpy (without even realizing that that could lead to
trouble) is bad, bad, bad.

In other words, "not intending to" does not imply "will not". It's just
potentially too easy to change SSE2 state by mistake.

And yes, this signal handler thing is clearly visible on benchmarks.
MUCH too clearly visible. I just didn't see any safe alternatives
(and I still don't ;()

Linus
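
The guarantee defended in this message can be observed from user space. The following is a minimal sketch, not code from the thread: it assumes a POSIX system with `sigaction` and `setitimer`, and every function name in it (`fp_using_handler`, `fp_workload`, `fp_state_survives_signals`) is made up for illustration. A floating-point loop is run once undisturbed and once while a timer signal keeps interrupting it with a handler that itself does FP arithmetic. If the kernel saves and restores FP state around the handler, both runs should produce bit-identical results.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t handler_calls;
static volatile double handler_sink = 1.0;

/* Hypothetical handler that deliberately uses floating point.
   Reading the seed from a volatile keeps the compiler from folding
   the arithmetic away at build time. */
static void fp_using_handler(int signo)
{
    (void)signo;
    double x = handler_sink;
    for (int i = 1; i <= 16; i++)
        x = x * 1.000001 + (double)i / 3.0;
    handler_sink = x;
    handler_calls++;
}

/* FP computation that a signal may interrupt at any instruction. */
static double fp_workload(long iterations)
{
    double acc = 0.0;
    for (long i = 1; i <= iterations; i++)
        acc += 1.0 / (double)i;
    return acc;
}

/* Returns 1 if the interrupted run matches the quiet run bit for bit,
   0 if it does not, and -1 if the signal or timer setup failed. */
int fp_state_survives_signals(long iterations)
{
    double quiet = fp_workload(iterations);

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fp_using_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGALRM, &sa, &old) != 0)
        return -1;

    /* Fire SIGALRM roughly every millisecond while the loop runs. */
    struct itimerval tick = { {0, 1000}, {0, 1000} };
    struct itimerval off  = { {0, 0}, {0, 0} };
    if (setitimer(ITIMER_REAL, &tick, NULL) != 0) {
        sigaction(SIGALRM, &old, NULL);
        return -1;
    }
    double noisy = fp_workload(iterations);
    setitimer(ITIMER_REAL, &off, NULL);
    sigaction(SIGALRM, &old, NULL);

    return memcmp(&quiet, &noisy, sizeof quiet) == 0;
}
```

Calling `fp_state_survives_signals(20000000L)` is one way to try it. A matching result only shows that nothing went wrong in that run; it says nothing about how the kernel implements the save, and with a SA_NOFP-style shortcut the failure would be exactly the rare, timing-dependent kind described above.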

Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: context switch vs. signal delivery [was: Re: Accelerating user
Original-Message-ID: <Pine.LNX.4.44.0208050922570.1753100000@home.transmeta.com>
Date: Mon, 5 Aug 2002 16:39:34 GMT
Message-ID: <fa.m7f8dqv.17gi8gs@ifi.uio.no>

On Mon, 5 Aug 2002, Jamie Lokier wrote:
> Linus Torvalds wrote:
>> I agree that it is really sad that we have to save/restore FP on
>> signals, but I think it's unavoidable.
>
> Couldn't you mark the FPU as unused for the duration of the
> handler, and let the lazy FPU mechanism save the state when it is used
> by the signal handler?

Nope. Believe me, I gave some thought to clever things to do.

The kernel won't even _see_ a longjmp() out of a signal handler, so the
kernel has a really hard time trying to do any clever lazy stuff.

Also, people who play games with FP actually change the FP data on the
stack frame, and depend on signal return to reload it. Admittedly I've
only ever seen this on SIGFPE, but anyway - this is all done with integer
instructions that just touch bit patterns on the stack.. The kernel can't
catch it sanely.

> For sophisticated user space uses, like the above, I'd like to see
> a trap handling mechanism that saves only the _minimum_ state.

I would not mind an extra per-signal flag that says "don't bother with FP
saves" (the same way we already have "don't restart" etc), but I would be
very nervous if glibc used it by default (even if glibc doesn't use SSE2
in memcpy, gcc itself can do it, and obviously _users_ may just do it
themselves).

So it would have to be explicitly enabled with a SA_NOFPSIGHANDLER flag or
something.

(And yes, it's the FP stuff that takes most of the time. I think the
lmbench numbers for signal delivery tripled when that went in).

Linus
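
The longjmp() point in this message can be made concrete with a small sketch. It is an illustration under assumptions, not code from the thread: it uses the POSIX `sigsetjmp`/`siglongjmp` pair, and the names `escaping_handler` and `handler_left_via_longjmp` are hypothetical. The handler never returns normally, so the signal-return path that would reload the saved state is never taken, and a kernel that deferred work until signal return would not get the chance to do it.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf escape_point;

/* Hypothetical handler that leaves by jumping instead of returning.
   Control resumes at the sigsetjmp() call site; the normal return
   from the signal frame is skipped entirely. */
static void escaping_handler(int signo)
{
    (void)signo;
    siglongjmp(escape_point, 1);
}

/* Returns 1 if control came back through siglongjmp(), 0 if the
   handler somehow returned normally, -1 if sigaction() failed. */
int handler_left_via_longjmp(void)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = escaping_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, &old) != 0)
        return -1;

    int via_jump = 0;
    /* savemask = 1 so the signal mask in effect here is restored on
       the jump; otherwise SIGUSR1 would stay blocked afterwards. */
    if (sigsetjmp(escape_point, 1) == 0)
        raise(SIGUSR1);      /* handler runs and jumps away */
    else
        via_jump = 1;        /* reached only through siglongjmp() */

    sigaction(SIGUSR1, &old, NULL);
    return via_jump;
}
```

`handler_left_via_longjmp()` returning 1 shows the handler exited without a normal return. Any lazy scheme therefore has to leave the interrupted context's FP state correct at every moment the handler runs, not just at the point where it returns.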

Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: context switch vs. signal delivery [was: Re: Accelerating user mode
Original-Message-ID: <Pine.LNX.4.44.0208051317480.11693100000@home.transmeta.com>
Date: Mon, 5 Aug 2002 20:24:54 GMT
Message-ID: <fa.l3t7nqv.1n143hl@ifi.uio.no>

On Mon, 5 Aug 2002, Oliver Neukum wrote:
>
>> Also, people who play games with FP actually change the FP data on the
>> stack frame, and depend on signal return to reload it. Admittedly I've
>> only ever seen this on SIGFPE, but anyway - this is all done with integer
>> instructions that just touch bit patterns on the stack.. The kernel can't
>> catch it sanely.
>
> Could the fp state be put on its own page and the dirty bit
> evaluated in the decision whether to restore fpu state?

I'm sure anything is _possible_, but there are a few problems with that
approach. In particular, playing VM games tends to be quite expensive on
SMP, since you need to make sure that the TLB entry for that page is
invalidated on all the other CPU's before you insert the FPU page.

Also, you'd need to play games with dirty bit handling, since the page
_is_ dirty (it contains FP data), so the VM must know to write it out if
it pages things. That's ok - we have separate per-page and per-TLB-entry
dirty bits anyway, but right now the VM layer knows it can move the TLB
entry dirty bit into the per-page dirty bit and drop it - which wouldn't
be the case if we also have a FPU dirty bit.

That's fixable - we could just make a "software TLB dirty bit" that is
updated whenever the hardware TLB dirty bit is cleared and moved into the
per-page dirty bit.

But the end result sounds rather complicated, especially since all the
page table walking necessary for setting this all up is likely to be about
as expensive as the thing we're trying to avoid..

Rule of thumb: it almost never pays to be "clever".

Linus

Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: context switch vs. signal delivery [was: Re: Accelerating user
Original-Message-ID: <Pine.LNX.4.44.0208050910420.1753100000@home.transmeta.com>
Date: Mon, 5 Aug 2002 16:22:27 GMT
Message-ID: <fa.m6uudiv.170o88u@ifi.uio.no>

On 5 Aug 2002, Andi Kleen wrote:
>
> I think the possibility at least for memcpy is rather remote. Any sane
> SSE memcpy would only kick in for really big arguments (for small
> memcpys it doesn't make any sense at all because of the context save/possible
> reformatting penalty overhead). So only people doing really
> big memcpys could be possibly hurt, and that is rather unlikely.

And this is why the kernel _has_ to save the FP state.

It's the "only happens in a blue moon" bugs that are the absolute _worst_
bugs. I want to optimize the kernel until I'm blue in the face, but the
kernel must NEVER EVER have a "non-stable" interface.

Signal handlers that don't restore state are hard as _hell_ to debug. Most
of the time it doesn't really matter (unless the lack of restore is
something really major like one of the most common integer registers), but
then depending on what libraries you use, and just _exactly_ when the
signal comes in, you get subtle data corruption that may not show up until
much later.

At which point your programmer wonders if he mistakenly wandered into
MS-Windows land.

No thank you. I'll take slow signal handlers over ones that _sometimes_
don't work.

> After all Linux should give you enough rope to shoot yourself in the foot ;)

On purpose, yes. It's ok to take careful aim, and say "I'm now shooting
myself in the foot".

And yes, it's also ok to say "I don't know what I'm doing, so I may be
shooting myself in the foot" (this is obviously the most common
foot-shooter).

And if you come to me and complain about how drunk you were, and how you
shot yourself in the foot by mistake due to that, I'll just ignore you.

BUT - and this is a big BUT - if you are doing everything right, and you
actually know what you're doing, and you end up shooting yourself in the
foot because the kernel was taking a shortcut, then I think the kernel is
_wrong_.

And I'd rather have a slow kernel that does things right, than a fast
kernel which screws with people.

> In theory you could do a super hack: put the FP context into an unmapped
> page on the stack and only save with lazy FPU or access to the unmapped
> page.

That would be extremely interesting especially with signal handlers that
do a longjmp() thing.

The real fix for a lot of programs on x86 would be for them to never ever
use FP in the first place, in which case the kernel would be able to just
not save and restore it at all.

However, glibc fiddles with the fpu at startup, even for non-FP programs.
Dunno what to do about that.

Linus

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [patch 2.6.13-rc3a] i386: inline restore_fpu
Date: Tue, 26 Jul 2005 21:53:46 UTC
Message-ID: <fa.hekf6kt.hne20f@ifi.uio.no>
Original-Message-ID: <Pine.LNX.4.58.0507261438540.19309@g5.osdl.org>

On Tue, 26 Jul 2005, Chuck Ebbert wrote:
>
> Since fxsave leaves the FPU state intact, there ought to be a better way to do
> this but it gets tricky. Maybe using the TSC to put a timestamp in every thread
> save area?

We used to have totally lazy FP saving, and not touch the FP state at
_all_ in the scheduler except to just set the TS bit.

It worked wonderfully well on UP, but getting it working on SMP is a major
pain, since the lazy state you want to switch back into might be cached on
some other CPU's registers, so we never did it on SMP. Eventually it got
too painful to maintain two totally different logical code-paths between
UP and SMP, and some bug or other ended up resulting in the current "lazy
on a time slice level" thing which works well in SMP too.

Also, a lot of the cost is really the save, and before SSE2 the fnsave
would clear the FPU state, so you couldn't just do a save and try to elide
just the restore in the lazy case. In SSE2 (with fxsave) we _could_ try to
do that, but the thing is, I doubt it really helps.

First off, 99% of all programs don't hit the nasty case at all, and for
something broken like volanomark that _does_ hit it, I bet that there is
more than one thread using the FP, so you can't just cache the FP state
in the CPU _anyway_.

So we could enhance the current state by having a "non-lazy" mode like in
the example patch, except we'd have to make it a dynamic flag. Which could
either be done by explicitly marking binaries we want to be non-lazy, or
by just dynamically noticing that the rate of FP restores is very high.

Does anybody really care about volanomark? Quite frankly, I think you'd
see a _lot_ more performance improvement if you could instead teach the
Java stuff not to use FP all the time, so it feels a bit like papering
over the _real_ bug if we'd try to optimize this abnormal and silly case
in the kernel.

Linus
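
The trade-off described in this message can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code: it models a single CPU (the UP case), the type `fpu_counts` and the function `simulate_fpu` are invented for the example, and the "eager" mode is simplified to save and restore on every switch between different tasks. In the lazy mode a switch is free (it stands for just setting the TS bit), and the save/restore pair is paid only when a task that actually uses FP runs while another task's state is still live in the registers.

```c
#include <stddef.h>

struct fpu_counts {
    unsigned saves;      /* times a task's FP state was written to memory */
    unsigned restores;   /* times a task's FP state was loaded into the FPU */
};

/* schedule[i] is the task that runs in time slice i.
   uses_fp[t] is nonzero if task t executes FP instructions in its slices.
   lazy != 0 models "set TS and trap on first FP use";
   lazy == 0 models saving and restoring on every task switch. */
struct fpu_counts simulate_fpu(const int *schedule, size_t slices,
                               const int *uses_fp, int lazy)
{
    struct fpu_counts n = {0, 0};
    int owner = -1;   /* task whose FP state is live in the registers */

    for (size_t i = 0; i < slices; i++) {
        int task = schedule[i];

        if (owner == task)
            continue;   /* registers already hold this task's state */

        if (lazy && !uses_fp[task])
            continue;   /* TS is set, but this task never traps on it */

        if (owner >= 0)
            n.saves++;  /* spill the previous owner's state */
        n.restores++;   /* load (or initialise) the new owner's state */
        owner = task;
    }
    return n;
}
```

With the schedule 0, 1, 0, 2, 0 and only task 0 using FP, the lazy model needs a single restore and no saves, while the eager model pays on every switch. Once a second task also uses FP, the lazy model starts paying a save and a restore each time the two alternate, which is the "more than one thread using the FP" case where caching state in the CPU stops helping. The model ignores SMP entirely, where the live state could be sitting in another CPU's registers.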


http://yarchive.net/comp/linux/fp_state_save.html
