
Ansys HPC

HPC is an abbreviation for high-performance computing and introduces the concept of solving
problems in parallel in order to reduce the simulation time. The computational power available to
common users is evolving rapidly, giving the opportunity to solve complex simulations within a
reasonable time. Distributing the computational effort across several cores, and thereby reducing
the computational time significantly, is a key advantage for quickly producing accurate results.

Introduction
Several parameters are significant when investigating the speedup of simulations performed in Ansys.
Hardware factors such as memory speed, CPU clock speed, disk I/O speed and interconnects all
contribute to the simulation time; however, these factors are not covered in this blog. The
focus here is to illustrate the possibilities for speedup by using parallel solvers or GPUs when
performing either CFD or FE analysis.

When running a simulation on several cores, the problem is divided into N pieces, where each
domain is solved independently. The pieces communicate at their intersections, and when the
solver has finished, the solution is reassembled into one single result file. Since the pieces have
to communicate, doubling the number of cores will not always reduce the simulation time by a
factor of two. The benefit of increasing the number of cores can be visualized in a scalability
chart, which plots the reduction in simulation time against the number of cores used.
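
The quantities plotted in such a scalability chart can be computed directly from measured run times. The sketch below assumes the usual definitions (speedup is the serial time divided by the parallel time, and parallel efficiency is the speedup divided by the number of cores). The function name and the timings are hypothetical and invented for illustration only; they are not measurements from the simulations discussed here.

```python
def scalability_table(timings):
    """Return (cores, speedup, efficiency) rows from wall-clock timings.

    `timings` maps a core count to the measured wall-clock time in seconds
    and must include the serial run (1 core), which is used as the reference.
    """
    serial_time = timings[1]
    rows = []
    for cores in sorted(timings):
        speedup = serial_time / timings[cores]   # how many times faster than serial
        efficiency = speedup / cores             # 1.0 means perfectly linear scaling
        rows.append((cores, speedup, efficiency))
    return rows


# Hypothetical timings in seconds, invented purely for illustration.
example_timings = {1: 1200.0, 2: 640.0, 4: 340.0, 8: 200.0}

for cores, speedup, efficiency in scalability_table(example_timings):
    print(f"{cores:>3} cores: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

An efficiency below 1.0 reflects the communication overhead described above: the pieces spend part of their time exchanging data instead of solving.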

Scalability of a CFD airfoil simulation

Comparing the simulation time with the number of cores used yields a scalability chart.
The chart below is one example, generated from a CFD simulation of an airfoil with a mesh
containing 9,930,000 nodes. As the graph shows, the speedup is linear as the number of cores
increases up to 33 cores. This indicates perfectly linear scaling down to a node-to-core ratio
of about 300,000 nodes per core. Communication between the parallelized parts reduces the
benefit of adding further cores only to a small extent.
Figure: Speedup versus number of cores (measured performance compared with linear scaling) for the CFD airfoil simulation. Hardware: Intel Xeon Nehalem X5550, 2.67 GHz, 24 GB memory per node.
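
The node-to-core ratio quoted for the airfoil case follows from simple arithmetic: 9,930,000 mesh nodes spread over 33 cores is roughly 300,000 nodes per core. A minimal sketch of that calculation is given below. The helper names are hypothetical, and the 300,000 figure is only the value observed in this one benchmark, not a general rule for other meshes or hardware.

```python
def nodes_per_core(mesh_nodes, cores):
    """Average number of mesh nodes handled by each core."""
    return mesh_nodes / cores


def cores_at_ratio(mesh_nodes, min_nodes_per_core):
    """Largest core count that keeps at least `min_nodes_per_core` nodes on each core."""
    return mesh_nodes // min_nodes_per_core


AIRFOIL_MESH_NODES = 9_930_000  # mesh size quoted for the airfoil case

print(f"{nodes_per_core(AIRFOIL_MESH_NODES, 33):,.0f} nodes per core on 33 cores")
print(f"{cores_at_ratio(AIRFOIL_MESH_NODES, 300_000)} cores at 300,000 nodes per core")
```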

Scalability of a CFD IndyCar simulation

The chart below gives another example of scalability in CFD simulations. The mesh of the
surrounding fluid domain embedding the IndyCar contains 483,360 nodes. As the graph
indicates, quite good scaling is achieved in this CFD simulation as well: results are
produced 6 times faster when running on 8 cores instead of a serial run.
Figure: Speedup versus number of cores (measured performance compared with linear scaling) for the CFD IndyCar simulation. Hardware: Intel Xeon Nehalem X5550, 2.67 GHz, 24 GB memory per node.
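
A speedup of 6 on 8 cores corresponds to a parallel efficiency of 6/8 = 75%. As a rough way to interpret that figure, the sketch below applies Amdahl's law, S = 1 / (f + (1 - f) / N), to estimate the fraction f of the work that behaves as if it were serial. Amdahl's law is not used in the original text; it is brought in here only as a common first-order model, and it lumps communication overhead into the serial fraction. The function names are hypothetical.

```python
def amdahl_speedup(serial_fraction, cores):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def amdahl_serial_fraction(speedup, cores):
    """Serial fraction implied by a measured speedup (Amdahl's law solved for f)."""
    if cores <= 1:
        raise ValueError("cores must be greater than 1")
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


# Values quoted for the IndyCar case: 6 times faster on 8 cores.
measured_speedup, cores = 6.0, 8

efficiency = measured_speedup / cores
serial_fraction = amdahl_serial_fraction(measured_speedup, cores)

print(f"parallel efficiency: {efficiency:.0%}")
print(f"implied serial fraction: {serial_fraction:.3f}")
# Round trip: feeding the fraction back into the model recovers the measured speedup.
print(f"model speedup on {cores} cores: {amdahl_speedup(serial_fraction, cores):.2f}")
```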

Speeding up FE analysis by the use of GPU (Graphics Processing Unit)
Unfortunately, FE analyses do not provide the same processing scalability as CFD. However, the GPU
offers an extensive computational resource and can be used to speed up the process
significantly in many applications.

Figure: The GPU and the CPU, connected through a PCI Express channel.

A multicore CPU, typically with 4 to 8 cores, is a powerful unit for general-purpose computations.
The GPU, on the other hand, typically contains hundreds of cores and is well suited to highly
parallel code, within memory constraints. The GPU is a massive source of computational power in
modern computers, capable of handling vector operations quickly, with performance several orders
of magnitude higher than a conventional CPU. However, the GPU is not a replacement for the CPU;
rather, the two resources can work together in a collaborative fashion to perform the ANSYS simulation.
As of today, ANSYS supports all NVIDIA Tesla cards and the Quadro 6000 for GPU acceleration.
Presented here are results from a benchmark study conducted by engineers in the NVIDIA performance
lab. The model V13sp-5 is from the standard ANSYS benchmark set and is one of the best models to
represent real-world customer practice. It was derived from the turbomachinery industry and
comprises a typical nonlinear static analysis with large deflection, but with only a single
equilibrium iteration from a full solution that would require 25 iterations. As the graphs show,
a significant benefit can be achieved in this setup.

ANSYS HPC Offering
The same ANSYS HPC licenses can be used for ANSYS FEA or ANSYS CFD. They can be purchased
one by one, where one HPC license gives the user the ability to run an FEA or CFD solver
on an additional core, or in packs of 8.
ANSYS HPC Pack
ANSYS HPC Packs enable the parallel capacity you need in order to do high-fidelity
simulations that provide enhanced insight. Packs work as illustrated. Each simulation job
will consume one or more HPC Pack licenses. As you add Packs, the amount of parallel
capacity enabled increases rapidly: one pack allows 8-way, two packs allow 32-way, three
packs allow 128-way, and so on up to 5 packs for extremely high fidelity using 2048 processes.
Packs cannot be split, and a job always takes at least one full pack.
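
The quoted pack sizes (8, 32, 128 and 2048 for one, two, three and five packs) are consistent with each additional pack multiplying the allowed parallelism by four. The sketch below encodes that pattern as 8 × 4^(packs − 1). The formula is inferred from the numbers above rather than taken from ANSYS licensing documentation, so the four-pack value of 512 it produces is an assumption, and the function names are hypothetical.

```python
MAX_PACKS = 5  # the text describes combinations of up to 5 packs


def max_parallel_processes(packs):
    """Parallel processes enabled by `packs` HPC Packs on a single job."""
    if not 1 <= packs <= MAX_PACKS:
        raise ValueError(f"packs must be between 1 and {MAX_PACKS}")
    return 8 * 4 ** (packs - 1)


def packs_required(processes):
    """Smallest number of whole packs that enables `processes` parallel processes."""
    for packs in range(1, MAX_PACKS + 1):
        if max_parallel_processes(packs) >= processes:
            return packs
    raise ValueError("more processes requested than 5 packs enable")


for packs in range(1, MAX_PACKS + 1):
    print(f"{packs} pack(s): up to {max_parallel_processes(packs)}-way parallel")

# Packs cannot be split, so a 33-core job needs three packs under this pattern.
print(packs_required(33))
```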
