
Eugene Magnier

Astro 734 : Lecture 09

Parallel and Distributed Processing

Eugene Magnier Astronomy 734 Spring 2006


Lecture Overview

Motivations
    understanding bottlenecks
    multiplex processing or I/O?
Types of Parallel and Distributed Processing
    Multitasking vs Multithreading
    Multicomputer vs multiprocessor
    Parallel processing vs distributed processing
    MPI vs PVM
Condor
PanControl
PanTasks


Overview: Motivations

The Problem: processing one-at-a-time takes too long
The Solution: do more than one at a time!
Understanding your bottlenecks: too much data or too much work?
    measure your processing speed
        add timing points within the code
        time complete, representative jobs
    count your data I/Os
        is it local or network?
    measure your I/Os
        time dd if=file of=/dev/null
    examine your throughputs
        seconds for processing? seconds for I/O?
        compare CPU (Gigacycles / sec) to I/O (Megabytes / sec)
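The timing advice above can be tried with a few lines of Python. This is only a sketch under stated assumptions: `time_io`, `time_cpu`, and `make_test_file` are hypothetical helper names, the sum of squares stands in for real processing, and a freshly written scratch file will usually be served from the page cache, so the I/O figure is optimistic compared with a cold read from disk or network.

```python
import os
import tempfile
import time


def time_io(path, block_size=1 << 20):
    """Read a file start to finish; return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            total += len(block)
    return total, time.perf_counter() - start


def time_cpu(n_items):
    """Stand-in for real processing: sum of squares over n_items values."""
    start = time.perf_counter()
    result = sum(i * i for i in range(n_items))
    return result, time.perf_counter() - start


def make_test_file(n_bytes):
    """Write a scratch file of n_bytes zeros and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(b"\0" * n_bytes)
    return path


if __name__ == "__main__":
    path = make_test_file(8 << 20)
    try:
        n_bytes, io_seconds = time_io(path)
        _, cpu_seconds = time_cpu(1_000_000)
    finally:
        os.remove(path)
    print(f"I/O: {n_bytes / 1e6:.1f} MB in {io_seconds:.4f} s")
    print(f"CPU: {cpu_seconds:.4f} s")
    print("bottleneck:", "I/O" if io_seconds > cpu_seconds else "CPU")
```

For a real measurement, point `time_io` at a representative data file and replace `time_cpu` with the actual processing step.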


Multitasking vs Multithreading

The simplest parallel processing: multiple jobs on your own machine
Multitasking
    separate programs
    independent data
    handled by kernel automatically
Multithreading
    multiple realizations of the same program
    shared memory
    independent processing
    requires care with memory and messages
    programs must be written to use multithreading

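[Figure: two flow diagrams. Multiple programs (multitasking): chip 1 and chip 2 are processed, then results are collected. Single program (multithreading): read data, process chip 1 and chip 2, then collect results.]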
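The "requires care with memory" point can be illustrated with a small Python sketch. It assumes a hypothetical `process_chips` function in which squaring a number stands in for per-chip work; the threads share one dictionary, so each write is guarded by a lock. In standard CPython the global interpreter lock limits CPU-bound speedup from threads, so this shows the programming model rather than a performance gain.

```python
import threading


def process_chips(chip_ids, n_threads=2):
    """Process chips on several threads that share one results dict."""
    results = {}
    lock = threading.Lock()

    def worker(my_chips):
        for chip in my_chips:
            value = chip * chip          # stand-in for real per-chip work
            with lock:                   # shared memory needs protection
                results[chip] = value

    threads = []
    for k in range(n_threads):
        # Deal the chips out round-robin, one slice per thread.
        t = threading.Thread(target=worker, args=(chip_ids[k::n_threads],))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()                         # the "collect results" step
    return results
```

Calling `process_chips([1, 2, 3, 4])` returns one entry per chip regardless of which thread handled it.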


Parallel Computer Concepts

Multiprocessor
    several CPU chips on a single motherboard
    standard since ~2000
    thermal limitations in a single CPU
    traditional supercomputer design: 100s - 1000s of chips in a single box
Multi-core processor
    increase number of transistors, but not clock speed
    2 (4, 8...) "cores" (CPUs) on a single chip
    may share cache
    AMD released dual-core chips in the fall
    Linux kernel manages multiple threads and multiple processes on both multiprocessor and multi-core machines
Multiple computers
    distribute the load to multiple boxes
    network to tie machines together
    "data networks" vs "signal networks"


Parallel processing vs Distributed processing

Distributed processing
    multiple jobs which require little or no intercommunication
    data is not shared between distributed jobs
    Examples
        large number of individual images
        hundreds of distinct spectra
        data preparation
Parallel processing
    multiple jobs require frequent communication
    data is heavily shared between jobs
    Examples
        large N-body simulations
        full-sky astrometric / photometric analysis
        very large matrix inversion
        very large FFTs
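The distributed case, where jobs share nothing, maps onto a simple worker pool. The sketch below is an illustration only: `flatten_image` is a hypothetical job that just renames its input, and a thread pool stands in for a set of machines. A process pool or a real job manager could be swapped in without changing the structure, because no job needs data from any other.

```python
from concurrent.futures import ThreadPoolExecutor


def flatten_image(name):
    """Hypothetical per-image job: needs nothing from any other job."""
    return name.replace(".fits", ".flat")


def run_distributed(names, n_workers=4):
    """Run one independent job per input; results come back in input order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(flatten_image, names))
```

A parallel-processing problem such as an N-body simulation would not fit this pattern, since each step needs data held by the other workers.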


MPI vs PVM

MPI: Message Passing Interface Library
    allow distributed processes to share data
    send messages
    block for messages
    highly efficient for message passing
PVM: Parallel Virtual Machine
    also provides a message passing library
    includes resource and process control layer
    provides a single point for interactions
both require developer to program to the model
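The "send messages, block for messages" model can be sketched without MPI or PVM. The example below is not MPI code: it uses Python threads and `queue.Queue` as a stand-in for processes and a message channel, and the names `worker` and `exchange` are hypothetical. It shows only the shape of the pattern, in which a receiver blocks until a message arrives and replies with a result.

```python
import queue
import threading


def worker(inbox, outbox):
    """Block on the inbox, reply with the square, stop on None."""
    while True:
        message = inbox.get()        # blocks until a message arrives
        if message is None:
            break
        outbox.put((message, message * message))


def exchange(values):
    """Send each value to the worker and gather the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for v in values:
        inbox.put(v)                 # "send message"
    inbox.put(None)                  # sentinel: no more work
    thread.join()
    return [outbox.get() for _ in values]
```

Writing a program in this style is what "program to the model" means: the communication points are explicit in the code.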


Condor

Layered on top of PVM
Provides management of distributed jobs
    jobs don't require recompilation
    expects heterogeneous cluster with machine owners!
    somewhat restrictive on behavior of jobs


PanControl

manages distributed jobs (like Condor)
manage machines in pool
jobs can request or demand specific machines
simple user interface
interfaces with PanTasks

    host add foo
    host add bar

    job program -host foo
    job program +host bar
    job program +host baz
    job program

    for i 0 100
        sprintf input chip.%02d.fits
        sprintf output chip.%02d.flat
        job process $input $output
    end

    check job 0
    stdout job 0
    stderr job 1
    delete job 5
    host off foo
    host on foo


PanTasks

Regularly scheduled task evaluation
Tasks (potentially) spawn jobs
Jobs may be local or parallel
Jobs may be targeted to specific machines

    task datalist
        command ls /data/foo
        periods -exec 5.0
        periods -timeout 50.0
        periods -poll 1.0
        task.exit 0
            queueprint stdout
            queuedelete stdout
        end
        task.exit 1
            queuepush failure "task failed"
        end
    end

    task datalist
        periods -exec 5.0
        periods -timeout 50.0
        periods -poll 1.0
        task.exec
            $file = `next.file`
            if ($file == "none")
                break
            end
            command cp /data/foo/$file /data/bar
        end
        task.exit 0
            queueprint stdout
            queuedelete stdout
            queuepush copied $file
        end
        task.exit 1
            queuepush failure $file
        end
    end


PanTasks: Process Loop / User Interface

User issues commands, loads scripts via readline
Client / Server Model designed, not yet implemented

[Figure: PanTasks process loop. Labels: CheckControl, pcontrol server, user cmds, readline, CheckTasks, task queue, CheckChild, job queue.]


PanTasks: Task Loop vs Job Loop

Limited number of tasks / jobs per interrupt cycle
maximum of one evaluation per cycle
task / job queues are continuously cycled

[Figure: two loops side by side. Task loop: next task, check task timer, run task prep, construct job cmd, submit job. Job loop: next job, check job timer, check job status, return job results.]
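The idea of a continuously cycled queue with a cap on the work done per cycle can be modelled loosely in Python. This is a sketch, not the PanTasks implementation: the task layout (a dict with `name`, `period`, and `next_time` keys), the function name `run_cycle`, and the `max_per_cycle` parameter are all assumptions made for illustration.

```python
from collections import deque


def run_cycle(task_queue, now, max_per_cycle=2):
    """Visit at most max_per_cycle tasks, then return the ones that fired.

    Each task is a dict with 'name', 'period' and 'next_time' keys
    (a hypothetical layout, not the PanTasks data structure).
    """
    fired = []
    for _ in range(min(max_per_cycle, len(task_queue))):
        task = task_queue.popleft()           # next task
        if now >= task["next_time"]:          # check task timer
            fired.append(task["name"])        # stand-in for "submit job"
            task["next_time"] = now + task["period"]
        task_queue.append(task)               # back of the queue: keep cycling
    return fired
```

Because each visited task goes to the back of the queue, every task is eventually checked even though a single cycle only looks at a few of them, which keeps the loop responsive to user commands.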


PanTasks: Pcontrol Host States

Pcontrol monitors hosts
hosts may be added / deleted
"down" hosts are automatically re-attached
ssh communication to the hosts

[Figure: Pcontrol Host Queue & States. States: new, busy, done, down, idle, off, delete. Transition labels: USER: host (name); LOOP: StartJob; USER: host -off (name); USER: host -on (name); USER: host -delete (name).]


PanTasks: Pcontrol Job States

Pcontrol moves jobs and hosts in parallel
Users may delete pending jobs, kill running jobs, or harvest crash/exit jobs.
PanTasks scheduler is normal Pcontrol "user"

[Figure: Pcontrol job states. States: new, hung, pending, busy, done, exit, crash, delete. Transition labels: U1, P1, T1, T2, U2.]


PanTasks: Pcontrol Process Loop / User Interface

User issues commands, loads scripts via readline
very similar to PanTasks scheduler loop (same code)

[Figure: Pcontrol Server process loop. Labels: command src (scheduler), readline, and three CheckChild steps, each attached to a pcontrol client.]


PanTasks: Pclient Process Loop / User Interface

User issues commands
Pclient launches, monitors background child process
reports stdout, stderr, exit status
very simple command set

[Figure: Pclient process loop. Labels: command src (pcontrol), readline, CheckChild, child process.]


PanTasks: loading tests

Pclient: demonstrated job rates of ~500 per second
Pcontrol: tests to manage 62 nodes, ~150 jobs per second total rate
PanTasks: schedule / harvest ~50 jobs per second
Requirement: >> 640 jobs / 45 seconds (14 per second)
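The requirement figure is simple arithmetic and can be checked directly. The sketch below assumes the numbers as read from the slide (640 jobs in 45 seconds, and a PanTasks rate of roughly 50 jobs per second); `jobs_per_second` is a hypothetical helper name.

```python
def jobs_per_second(n_jobs, seconds):
    """Average rate needed to finish n_jobs within the given time."""
    return n_jobs / seconds


required = jobs_per_second(640, 45)   # requirement quoted on the slide
demonstrated = 50                     # approximate PanTasks rate from the slide
headroom = demonstrated / required

print(f"required: {required:.1f} jobs/s, headroom: {headroom:.1f}x")
```

Under these assumptions the required rate is about 14.2 jobs per second, so the slowest layer in the tests still has roughly a factor of 3.5 in hand.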


IPP computing: distributed processing

classical parallel (eg MPI) vs distributed processing
increase total MHz
increase total I/O rate
connection to "targeted" processing
observatory system

[Figure: cluster layout. Labels: metadata db server, pantasks server, DVO server, Gig Switch, OTA Nodes, Sky Nodes.]
