Backend
Learn Redis the hard way (in production)
Posted on 25 January 2017 by Andy Grunwald
For our products, like the trivago hotel search, we are using Redis a lot. The use
cases vary: caching, temporary storage of data before moving it into another
storage, or a typical database for hotel metadata including persistence.
The main parts of the hotel search are built with PHP and the Symfony Framework for
the frontend (web) and Java for the backend part. In this article, we will focus on the
collaboration between our PHP application and Redis. Both are running fine, but it
was a long and hard way up to the current situation. This is the story of how we
learned to use Redis, including our failures and experience.
The beginning
This story began on Friday, September 3, 2010. At 10:33 am the first classes of Predis,
a Redis client library for PHP, were committed into our codebase.
http://tech.trivago.com/2017/01/25/learnredisthehardwayinproduction/? 1/15
29/3/2017 Learnredisthehardway(inproduction)trivagotechblog
This moment can be marked as the introduction of Redis into our PHP stack. Fast
forward to February 2013. We replaced the library Predis (PHP implementation) with
the PHP extension phpredis (C implementation). The reason was simple:
Performance. The replacement went well. Everything with Redis was fine and we
enjoyed the summer that year.
The real fun started exactly one year later in February 2014. Around this time, we
launched new features and added new languages to our platform. The result of this:
The HTTP traffic doubled in a short time. Due to good capacity planning on the
hardware side, we were able to handle the growth. On the software side however, we
were confronted with 40% of the incoming requests resulting in HTTP 500: Internal
Server Error.
After investigating our logs, we saw errors related to the PHP <-> Redis connection
handling. Most of them were "read error on connection" and "Redis server went away".
Our logging was verbose and gave us enough detail to start our debugging session:
www7.trivago.com20140202102259941|WARN|...Redis\ConnectException: Unable to connect:
#0 /.../vendor/.../Redis/RedisPool.php(106): ...\Redis\RedisPool->connect(Object(Redis), O
#1 /.../vendor/.../Redis/RedisClient.php(130): ...\Redis\RedisPool->get('default', true)
#2 /.../vendor/.../Redis/RedisClient.php(94): ...\Redis\RedisClient->setMode(false)
...
#17 /.../app/bootstrap.php.cache(551): Symfony\Bundle\FrameworkBundle\HttpKernel->handle(O
#18 /.../web/app.php(15): Symfony\Component\HttpKernel\Kernel->handle(Object(Symfony\Compo
#19 {main}
|12.34.56.78|www.trivago.de|/?aDateRange%5Barr%5D=20140520&aDateRange%5Bdep%5D=2014
A quick Google search showed that we were not alone with this issue. See
"debug read error on connection #70" and "read error on connection #492".
Debugging and fixing the issue
Based on the discussion in the tickets, we thought: Congratulations, we got ourselves
a nasty bug there.
We had no clue what the root cause was. Redis had been working fine for more than
3.5 years, and had never caused us any trouble before. This left us with the question:
How do we continue from here? Our first attempt was to try everything that was
mentioned in the GitHub issue:
Raising PHP connection and command timeouts from 500 ms to 2.5 seconds
Disabling the PHP setting default_socket_timeout
Disabling SYN cookies on the host systems
Checking the number of file descriptors on Redis and web servers
Raising the mbuffer of the host systems
Controlling and adjusting the TCP backlog sizes
and much more
Nothing helped to solve this issue in a reliable way. So back to traditional debugging!
We tried to reproduce the issue in our pre-production environments. Sadly, we were
not successful. We thought that those issues only appeared with much higher traffic.
So we continued to deep dive into our applications.
Closing Redis connections at the end of a web request
PHP applications are usually stateless. Everything you allocate in a request is gone
once the request finishes. At this time, we were not using php-fpm and persistent
connections. This means that every HTTP request would create a new Redis
connection. While checking our connection handling to Redis, we found that we
opened a connection, but never closed it.
This should not make a difference in newer PHP versions. PHP will automatically
close the connection when your script ends. In older versions, however, this could
lead to problems like stale connections or memory leaks. Besides that: It is good
practice to close your connections, so we fixed it. But that did not help us with our
initial problem.
A/B Testing of connection libraries
We kept on searching and asked ourselves if we hit a bug in phpredis (PHP
extension). To verify this hypothesis, we implemented an A/B Test. The necessary
infrastructure to run A/B Tests was already there. So we used it and switched from
the C extension back to the Predis library.
We were pleased to find that Predis was still being maintained and had received a
lot of development love.
Due to good code structure, this change was done quickly. We implemented one
interface, replaced the phpredis connection implementation with Predis and
reconfigured our dependency injection container.
We deployed the test to 20% of our users in one data center. The errors occurred
again. In both libraries! Was this another failure on our road to fix this bug? No! We
considered it a partial success. We were able to exclude the extension (phpredis) as
a possible root cause.
Upgrade Redis
Our next step was to have a deeper look at the Redis side. Common steps in an
investigation of a bug are checking out the issue tracker of the project or getting in
contact with one of the maintainers. The first thing they will ask is "What version are
you running?". Once you mention a version that is not the latest in the upstream, they
will answer "Please upgrade and check if it still occurs". This is what we did.
At this time we were running Redis v2.6. The latest upstream version was v2.8.9. We
thought that maybe we hit a bug that was already resolved. Unfortunately, this was
not the case. We made no progress related to our problem, but at least our Redis
servers were up to date. :)
Debugging latency problems
After reading a lot of documentation, we came across a feature for debugging latency
problems. It's called Redis Software Watchdog. It was (and still is) marked as
experimental in the official documentation but we wanted to give it a try. The idea was
to identify long running and blocking commands.
Tip
Redis is single-threaded. Every command may block other commands. Keep this
in mind if you start thinking about problems or use cases. This seems obvious, but
is mostly overlooked and the root cause of many issues.
So we activated Watchdog, waited a few seconds and Murphy's law kicked in. We
hit a bug and our Redis server crashed. In production! See
"Software watchdog crashes redis during rdb save point #1771". Again: Related to our
main problem, we made no significant progress at this time. Instead, we had a new
problem: An offline Redis database (which was immediately restarted, of course). So
we kept going.
We started the next try by measuring the latency baseline of our Redis setup. The
numbers of the intrinsic latency looked pretty good. The base latency looked
horrifying.
$ redis-cli --latency -p 6380 -h 1.2.3.4
min: 0, max: 463, avg: 2.03 (19443 samples)
We checked the Redis logs and discovered that Redis was saving data to disk every
few minutes:
...
[20398] 22 May 09:20:55.351 * 10000 changes in 60 seconds. Saving...
[20398] 22 May 09:20:55.759 * Background saving started by pid 41941
[41941] 22 May 09:22:48.197 * DB saved on disk
[20398] 22 May 09:22:49.321 * Background saving terminated with success
[20398] 22 May 09:25:23.299 * 10000 changes in 60 seconds. Saving...
[20398] 22 May 09:25:23.644 * Background saving started by pid 42027
[20398] 22 May 09:26:50.646 # Accepting client connection: accept: Software caused connect
[20398] 22 May 09:26:50.900 # Accepting client connection: accept: Software caused connect
...
We run Redis on bare metal servers with (nearly) default configuration, because it is
shipped with sane defaults. Our first question was: Why did the fork of a background
saving process take ~400 ms? (Have a look at the first two log lines.) After reading a
few mailing list posts and the implementation of BGSAVE, we understood why. Redis
is forking a background process and needs to copy the page table. So, if you have a
big Redis instance with many keys, it will take time. Even on bare metal, without
virtualization. By now, this behaviour has been added to the official documentation.
See "Fork time in different systems".
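To get a feeling for why the fork takes longer on bigger instances, the size of the page table that has to be copied can be estimated. The following is a rough sketch, assuming 4 KB memory pages and 8 bytes per page table entry (typical for Linux on x86-64; other platforms may differ). The helper name `page_table_bytes` is made up for this example.

```python
def page_table_bytes(dataset_bytes, page_size=4096, entry_size=8):
    """Estimate the page table size for a process holding dataset_bytes in memory.

    page_size and entry_size are assumptions (4 KB pages, 8-byte entries).
    """
    pages = -(-dataset_bytes // page_size)  # ceiling division
    return pages * entry_size


GIB = 1024 ** 3
MIB = 1024 ** 2

for size_gib in (1, 8, 24):
    table = page_table_bytes(size_gib * GIB)
    print(f"{size_gib:>2} GiB dataset -> ~{table / MIB:.0f} MiB page table to copy on fork")
```

The estimate grows linearly with the memory used by the instance, which matches the observation that big instances with many keys fork more slowly.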
As a follow-up we deactivated Redis snapshots for services where persistence was
not needed. This reduced the amount of "read error on connection" by more than
30%.
For instances where persistence is needed, the usage of snapshot points can be
tricky. If you have a lot of traffic on your instances and your application is doing write
operations per request you will have more key modifications. This leads to more
BGSAVE triggers and (possibly) a bigger amount of rejected connections. The reasons
are higher process fork times and blocked Redis instances.
Tip
Review your persistence requirements and Redis configuration. Does your app
modify more keys when you get more traffic and do you need persistence?
Consider AOF or rolling BGSAVE as an alternative to standard snapshotting. This
may avoid connection / command execution timeouts and blocking Redis
instances.
This was the case in our situation. Our application is reading and writing keys, but we
didn't want to deactivate persistence globally. We deactivated the snapshot points in
those Redis instances and activated cronjobs that will call the BGSAVE command at a
specific time (rolling BGSAVE). With this we know when a dump is triggered and can
avoid those during high traffic times. An alternative to rolling BGSAVE operations
would be a separate slave instance for persistence. This slave will not handle real
traffic and its only purpose is to take care of persistence. In use cases with higher
persistence requirements we prefer to use AOF. If you want to know more about
rolling BGSAVE we started a small post on the old Redis mailing list to discuss this
topic. See "Rolling BGSAVE instead of (pre) configured save points".
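The rolling BGSAVE idea can be sketched as a small generator for crontab entries. This is an illustration only, not the setup described above: the hosts, ports, start hour and the 15-minute gap between instances are made-up assumptions. The only real command being scheduled is `redis-cli -h <host> -p <port> BGSAVE`.

```python
def rolling_bgsave_crontab(instances, start_hour=3, gap_minutes=15):
    """Return crontab lines that trigger BGSAVE on one instance at a time.

    instances: list of (host, port) tuples (hypothetical values).
    start_hour: hour of the day assumed to be low traffic.
    gap_minutes: spacing between two dumps so that they do not overlap.
    """
    lines = []
    for index, (host, port) in enumerate(instances):
        offset = index * gap_minutes
        hour = (start_hour + offset // 60) % 24
        minute = offset % 60
        lines.append(f"{minute} {hour} * * * redis-cli -h {host} -p {port} BGSAVE")
    return lines


if __name__ == "__main__":
    example = [("10.0.0.1", 6379), ("10.0.0.1", 6380), ("10.0.0.2", 6379)]
    print("\n".join(rolling_bgsave_crontab(example)))
```

For this to have the intended effect, the configured save points have to be disabled on those instances (for example with `save ""` in redis.conf), otherwise Redis keeps triggering snapshots on its own.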
This change was considered a success. We reduced our error / timeout problems
immediately (even in non-peak traffic times). But we still saw errors popping up here
and there, so we were not done yet.
One instance per data context
Since Redis was introduced into our web stack, it has been adopted by more and
more teams for various use cases.
They used the existing Redis instances and stored their data in a different database.
This way, they could start right away. This was great for several reasons, but led us
to our next challenge.
One team had a cronjob running every 15 minutes that dumped data from a MySQL
database into a shared Redis instance via the Pipelining feature. Due to the
single-threaded nature of Redis the shared Redis instance was blocked every 15 minutes for
several seconds.
Tip
Be aware of Redis instances that are shared between teams and data contexts.
Due to different use cases and long running commands they may block each
other. Remember: Redis is single-threaded.
We moved the cronjob to its own Redis instance. As a result, the (former) shared
instance threw far fewer connection and timeout errors than before. Due to the split of
data contexts the number of commands per instance was reduced. Furthermore,
starting several Redis instances per server leads to a better utilization of computing
resources. Why? Again: Redis is single-threaded! And modern servers have a lot of
cores.
Side note: The usage of SELECT and multiple databases inside one Redis instance
was mentioned as an anti-pattern by Salvatore.
O(n) can kill you
Okay, we found several causes and reduced the amount of connection and timeout
errors by an order of magnitude. Everything went well for a long time and our Redis
setup was healthy. Time went by, teams implemented new features into our
application and our traffic continued to grow. The traffic growth went fast and the
connection and command timeout errors came back.
Our first thought: Really? Murphy? Are you there?
Luckily we saw a pattern in the occurrence of the message. It occurred periodically
every 5 minutes. Based on our knowledge from the last investigation we started right
away. We measured the base latency, enabled watchdog and read the SLOWLOG
documentation.
In a very short time, compared to previous investigations, we identified a cronjob that
fires the KEYS * command against a big Redis instance. Luckily the Redis
documentation describes the time complexity per command with the help of the
Big O notation. The time complexity of the KEYS command is defined as:
O(N) with N being the number of keys in the database
In big databases and depending on the pattern you apply on the KEYS command this
operation can lead to a long blocking Redis instance.
Tip
Take a dedicated look at your Redis commands and their time complexity. A few
commands with a high Big O estimation can slow down the performance of your
Redis instance. Often there are alternative commands that serve nearly the same
purpose and are a better choice (e.g. the SCAN family as a replacement for KEYS).
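As an illustration of the SCAN alternative: instead of one blocking KEYS call, the keyspace is walked in small batches with a cursor, so other clients get served in between. The sketch below assumes a client object with a `scan(cursor, match, count)` method that returns `(next_cursor, keys)`, as redis-py's `Redis.scan` does. The `InMemoryStub` class is hypothetical and exists only to make the example runnable without a Redis server.

```python
import fnmatch


def iter_keys(client, pattern="*", batch_size=100):
    """Yield keys matching pattern using cursor-based SCAN instead of KEYS."""
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=batch_size)
        yield from keys
        if cursor == 0:  # a returned cursor of 0 means the iteration is complete
            break


class InMemoryStub:
    """Hypothetical stand-in that mimics the (cursor, keys) contract of SCAN."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def scan(self, cursor=0, match=None, count=10):
        batch = self._keys[cursor:cursor + count]
        next_cursor = cursor + count
        if next_cursor >= len(self._keys):
            next_cursor = 0
        if match is not None:
            batch = [k for k in batch if fnmatch.fnmatchcase(k, match)]
        return next_cursor, batch


stub = InMemoryStub([f"hotel:{i}" for i in range(25)] + ["session:1"])
hotel_keys = list(iter_keys(stub, pattern="hotel:*", batch_size=10))
print(len(hotel_keys), "keys found without a single KEYS call")
```

Against a real server, SCAN may return the same key more than once and a batch may be empty even though the iteration is not finished, so the calling code should tolerate both.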
Back in 2014 there was only a small note in the
"Latency generated by slow commands" documentation, which said the KEYS
command should only be used for debugging purposes. In the meantime the
command reference was extended and a warning related to this was added.
Based on this new experience we had a look at our application code again. We
checked all Redis commands with special attention to the use case, the data
structure used and their time complexity. This was a lot of work, but it paid off. We
optimized over 40% of the executed commands which led to less time spent in the
communication with Redis. In the end this led to an overall faster response time of our
web stack. We were fine again.
One connection per request
We accepted the challenge with the ever-growing amount of traffic in the following
months and further optimized our application and stack in several ways. Some of our
goals were tackling the consumption of memory per request, optimizing our database
queries (slow query log), tuning our caching layers and adding more hardware (web
servers) to our data centers. Especially the last change, adding more web servers to
our stack, created yet another challenge.
As mentioned earlier, we were dealing with stateless applications. Without the usage
of php-fpm and persistent connections this means:
1. An HTTP request comes in
2. The application creates connections to various services
3. Operations are executed, queries are run, requests are made, etc.
4. The application closes connections created in 2.
5. The response is delivered to the client
This worked great so far and this is the way many applications work. But if you scale
the number of servers that can accept incoming requests, your traffic grows and you
don't pay special attention to your 3rd party components, this can go wrong. Very
wrong.
Depending on the request, the application creates third party connections, executes
one or two commands and disconnects again. 50% up to 75% of the commands we
execute are used for connection handling. Remember, Redis is single-threaded. If you
have a lot of clients that try to connect to your Redis instance continuously, you will
keep your instance busy with connection handling instead of executing the
commands you run your business logic on. This may lead to a slowdown / blocking of
your Redis instance. The (simplified) image above visualizes the problem. Every
arrow represents one HTTP client request.
Tip
Consider a proxy between your application and your 3rd party components. If you
have a high connection / command ratio or Redis is used as a platform across
many different teams, a proxy can be very beneficial. It can reduce the connection
overhead or act as a firewall for expensive and unwanted commands.
This problem sounds like a typical proxy problem. Subsequent research showed that
we were not alone and that this problem had been solved before. One solution is
twemproxy by Twitter. twemproxy was created specifically for this use case. You install
this proxy on every web server and twemproxy holds a persistent connection to your
Redis instance(s). Your application will only connect to the local proxy which should
be a lot faster, because it connects to a unix domain socket instead of an external
service via network. And even better: It supports memcached as well. This was good
news for us, because memcached is part of our stack and might face this problem in
the future.
So we introduced twemproxy into our stack. It was not a "put it in and everything is
working" project. We had several small adjustments to make before it was a success,
like
getting twemproxy up and running on FreeBSD (OS we use for web and Redis
servers)
adding Redis SELECT on connect / support for multiple databases in Redis
configuring our mbuf values correctly
checking for usage of Redis commands not supported by twemproxy
As described earlier the usage of multiple databases was marked as an anti-pattern.
At this time we were not able to move all our applications away from connecting to
different databases on one Redis instance so the capability to SELECT a database
other than the default one was still a requirement for us.
Another advantage of twemproxy is its ability to block expensive commands. So it can
act like a circuit breaker for commands like KEYS and dangerous operations like
FLUSHALL and FLUSHDB.
The downside of this: Every new command of future Redis versions needs to be
supported by twemproxy as well. If you upgrade your Redis installation to use new
features, like GEO commands, twemproxy support needs to be added and deployed
as well.
The deployment of twemproxy was a great success. We eliminated all timeout and
connection errors that were left (without buying new hardware).
Bonus round: Shard your data
All known root causes for the connection and command timeouts were solved. The
growth and traffic of our platform still continued and we continued to optimize our
Redis usage.
Two of our Redis use cases were caching of calculated data and short-term (~1 min.)
storage. At some point in time one machine will not be able to handle those use cases
alone anymore, because
the data won't fit into RAM
the number of read / write requests is too high for one machine
As a follow-up we started to shard our data over several machines using
consistent hashing. Luckily this way of sharding is natively supported by twemproxy.
This resulted in:
Reduction of traffic / load / requests per machine
Improved reliability of this caching infrastructure during node failures
Both points are a big win, especially the second one. If a machine fails (e.g. hardware
failure) our service is able to operate normally. Only a small percentage of compute
and storage power is sacrificed. We applied this pattern to every use case where it
made sense or was applicable. We didn't regret it. Most prominently, this optimization
had its debut last year when a node in this component failed.
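To illustrate why consistent hashing limits the damage of a node failure, here is a minimal hash ring. It is a generic sketch, not twemproxy's implementation: the node names, the use of MD5 and the number of virtual points per node are assumptions chosen for the example.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent hash ring with several virtual points per node."""

    def __init__(self, nodes, points_per_node=100):
        self._ring = []  # sorted list of (hash, node) tuples
        for node in nodes:
            for i in range(points_per_node):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        """Return the node owning the first ring point at or after the key's hash."""
        index = bisect.bisect_left(self._hashes, self._hash(key))
        return self._ring[index % len(self._ring)][1]  # wrap around at the end


keys = [f"cache:{i}" for i in range(1000)]
before = HashRing(["redis-a", "redis-b", "redis-c", "redis-d"])
after = HashRing(["redis-a", "redis-b", "redis-c"])  # redis-d failed

moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
# Only keys that lived on the failed node get a new owner.
print(len(moved), "of", len(keys), "keys moved")
```

With a naive `hash(key) % number_of_nodes` scheme, removing one node would remap most keys. With the ring, keys on the surviving nodes stay where they are, which is what keeps the cache largely warm during a node failure.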
Conclusion
In this post, we told you our painful story of how we learned to use and benefit from
Redis. During the time of identifying and fixing these issues, we faced multiple
challenges like the constantly increasing HTTP traffic to our platform and
understanding the implications of operating such a database.
Looking back, this was not only a technical issue, and now this seems to be obvious.
The root causes of these errors were more of a conceptual issue. The way we used
Redis was not ideal for this kind of traffic and growth.
After understanding the issues more and more it was clear that there is no silver
bullet to solve this problem. There were several important lessons like
understanding how commands are executed (single-threaded)
properly configuring the way Redis persists data (BGSAVE)
splitting data storage per service and avoiding shared Redis instances (multiple
databases)
understanding the time complexity of Redis commands (KEYS / O(n))
controlling the amount of TCP/IP connections to your Redis instances (twemproxy)
sharding your data once it doesn't fit onto one machine anymore and accepting
machine failures (consistent hashing)
We had a hard but exciting time and all of us learned a lot. Ever since we applied the
changes described here to our setup we didn't face any bigger issue. But be aware:
There are many other things you have to take care of when you run Redis in
production like
replication and required RAM
automatic failover via Sentinel
correct use of eviction policies
basic understanding of the Redis security model (also
possible attack patterns by insufficient configuration)
proper monitoring and alerting
Did you experience similar issues / problems? Or were you able to use any of the tips
mentioned here? Let us know in the comment section.
Andy Grunwald
Comments
Peeyush Chandel 2 months ago
Nice article with detailed explanation.
We also faced similar challenges and solved them after fixing various parts as you
did.
One of the instances I remember is:
All of a sudden one fine night one of our Redis instances stopped accepting writes.
Then I checked the logs and found that Linux was not able to fork the process for
background save, and by default Redis stops accepting writes if it is not able to
save them to disk (which is okay for persistence).
Then, debugging more, we figured out that we don't even need the persistence on
that instance.
And the solution was:
1. Turned off persistence on that instance as it is not needed.
2. Made vm.overcommit_memory = 1 in sysctl.conf
3. As our Redis was backed up by a database, I changed the memory policy to
allkeys-lru, so that in the case of memory overflow Redis can remove any key;
otherwise, by default, it will only remove the keys which have an expiry.
Also, pay attention to the Redis connection timeout and max client connection settings
in the Redis config file. When you use persistent connections without any
proxy then you need to maintain the pool of connections by yourself and keep
recycling the connections which are already dead.
Hey Peeyush,
thanks for writing down your outage and how you solved it. Really valuable
and maybe some other readers have the exact same issue.
Mangat Rai > Peeyush Chandel 13 hours ago
We used to have the exact same issue: BGSAVE would fail (vm overcommit is
0 by default), which would kick Redis into read-only mode. For our design,
persistence was required as the database was built over time and cannot
be built again. Also we can't afford to lose any keys (other than expiry..).
At that time rolling BGSAVE didn't occur to me. Ended up using Aerospike
after using Redis for 1 year.
hughdbrown 3 hours ago
Just some typos. Feel free to delete this comment when you have fixed them.
s/This was great for several reasons, but lead us to our next challenge./This was
great for several reasons, but led us to our next challenge./
s/We optimized over 40% of the executed commands which lead to less time spent
in the communication with Redis./We optimized over 40% of the executed
commands which led to less time spent in the communication with Redis./
s/In the end this lead to an overall faster response time of our web stack./In the
end this led to an overall faster response time of our web stack./
Thank you @hughdbrown.
I fixed them and deployed a new version. I am fine with leaving this
comment here :)