Eugene Magnier
Lecture
Overview
  Motivations
    understanding bottlenecks
    multiplex processing or I/O?
  Types of Parallel and Distributed Processing
    Multitasking vs Multithreading
    Multicomputer vs multiprocessor
    Parallel processing vs distributed processing
    MPI vs PVM
  Condor
  PanControl
  PanTasks
Overview: Motivations
The Problem: processing one-at-a-time takes too long
The Solution: do more than one at a time!
Understanding your bottlenecks: too much data or too much work?
  measure your processing speed
    add timing points within the code
    time complete, representative jobs
  count your data I/Os
    is it local or network?
  measure your I/Os
    time dd if=file of=/dev/null
  examine your throughputs
    seconds for processing?
    seconds for I/O?
    compare CPU (Gigacycles / sec) to I/O (Megabytes / sec)
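The measurements listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `busy_work` is a hypothetical stand-in for the real processing step, and the scratch file is created by the sketch itself, so the read is likely served from the operating system's cache and will look faster than a cold read from disk or network.

```python
import os
import tempfile
import time


def time_call(func, *args):
    """Return (result, elapsed_seconds) for one call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def read_all(path, block_size=1 << 20):
    """Read a file in blocks and discard the data, like dd of=/dev/null."""
    total = 0
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            total += len(block)
    return total


def busy_work(n):
    """Hypothetical stand-in for the real processing step."""
    return sum(i * i for i in range(n))


# Build a small scratch file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.write(os.urandom(4 << 20))  # 4 MiB of test data
    scratch_path = scratch.name

n_bytes, io_seconds = time_call(read_all, scratch_path)
_, cpu_seconds = time_call(busy_work, 200_000)
os.remove(scratch_path)

print(f"I/O: {n_bytes / 1e6:.1f} MB in {io_seconds:.4f} s")
print(f"CPU: {cpu_seconds:.4f} s")
print("bottleneck:", "I/O" if io_seconds > cpu_seconds else "processing")
```

For a real pipeline, time complete, representative jobs on the actual data rather than a synthetic file.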
Multitasking vs Multithreading
The simplest parallel processing: multiple jobs on your own machine
Multitasking
  separate programs
  independent data
  handled by kernel automatically
Multithreading
  multiple realizations of the same program
  shared memory
  independent processing
  requires care with memory and messages
  programs must be written to use multithreading
[Diagram: read data, chip 1, chip 2]
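The "requires care with memory" point can be illustrated with a small Python sketch. The per-chip work here is a hypothetical placeholder: four threads update one shared counter, and a lock protects each update. Note that in CPython the global interpreter lock limits CPU-bound speedup from threads, so this shows the shared-memory discipline rather than a performance gain.

```python
import threading

counter = 0                      # memory shared by every thread
counter_lock = threading.Lock()  # guards updates to the shared counter


def process_chunk(n_items):
    """Stand-in for per-chip work: count items into the shared total."""
    global counter
    for _ in range(n_items):
        with counter_lock:       # without the lock, updates can be lost
            counter += 1


threads = [threading.Thread(target=process_chunk, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(counter)  # 40000
```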
Multiprocessor
  several CPU chips on a single motherboard
  standard since ~2000
  thermal limitations in a single CPU
  traditional supercomputer design: 100s - 1000s of chips in a single box
Multi-core processor
  increase number of transistors, but not clock speed
  2 (4, 8...) "cores" (CPUs) on a single chip
  may share cache
  AMD released dual-core chips in the fall
Linux kernel manages multiple threads and multiple processes on both multiprocessor and multi-core machines
Multiple computers
  distribute the load to multiple boxes
  network to tie machines together
  "data networks" vs "signal networks"
Distributed processing
  multiple jobs which require little or no intercommunication
  Data is not shared between distributed jobs
  Examples
    large number of individual images
    hundreds of distinct spectra
    data preparation
Parallel processing
  multiple jobs require frequent communication
  Data is heavily shared between jobs
  Examples
    large N-body simulations
    full-sky astrometric / photometric analysis
    very large matrix inversion
    very large FFTs
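The difference between the two cases shows up in the data dependencies. In this minimal sketch (function names and toy data are illustrative assumptions), `reduce_image` needs only its own input, so the jobs can run anywhere in any order. `smooth_step` needs edge values from neighbouring chunks on every step, which is where the frequent communication comes from; it is written serially here just to make that dependency visible.

```python
from concurrent.futures import ThreadPoolExecutor


def reduce_image(pixels):
    """Independent job: needs only its own image (here, a list of numbers)."""
    mean = sum(pixels) / len(pixels)
    return [p - mean for p in pixels]


def smooth_step(chunks):
    """Coupled job: every chunk needs edge values from its neighbours."""
    new_chunks = []
    for i, chunk in enumerate(chunks):
        left = chunks[i - 1][-1] if i > 0 else chunk[0]
        right = chunks[i + 1][0] if i < len(chunks) - 1 else chunk[-1]
        padded = [left] + chunk + [right]
        new_chunks.append(
            [(padded[k - 1] + padded[k] + padded[k + 1]) / 3
             for k in range(1, len(padded) - 1)]
        )
    return new_chunks


# Distributed style: no data is shared between the jobs.
images = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
with ThreadPoolExecutor(max_workers=2) as pool:
    reduced = list(pool.map(reduce_image, images))
print(reduced)

# Parallel style: each step exchanges boundary values between chunks.
chunks = smooth_step([[0.0, 0.0], [9.0, 9.0]])
print(chunks)
```

On a real cluster the first pattern maps onto separate machines with no coordination beyond job submission, while the second needs a message-passing layer between the workers.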
MPI vs PVM
MPI: Message Passing Interface Library
  allow distributed processes to share data
  send messages
  block for messages
  highly efficient for message passing
PVM: Parallel Virtual Machine
  also provides a message passing library
  includes resource and process control layer
  provides a single point for interactions
both require developer to program to the model
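The send / block-for-messages pattern can be sketched without either library. The example below is not MPI or PVM code: it uses Python's standard `queue` and `threading` modules as a stand-in, with two queues playing the role of the message channels between a controller and one worker.

```python
import queue
import threading

to_worker = queue.Queue()    # messages sent to the worker
from_worker = queue.Queue()  # replies sent back


def worker():
    """Block for messages, reply to each, stop on the None sentinel."""
    while True:
        message = to_worker.get()      # blocks until a message arrives
        if message is None:
            break
        from_worker.put(sum(message))  # send the result back


thread = threading.Thread(target=worker)
thread.start()

to_worker.put([1, 2, 3])               # send a message
reply = from_worker.get()              # block for the reply
to_worker.put(None)                    # tell the worker to finish
thread.join()

print(reply)  # 6
```

"Programming to the model" means the program has to be structured around these explicit sends and blocking receives from the start.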
Condor
  Layered on top of PVM
  Provides management of distributed jobs
  jobs don't require recompilation
  expects heterogeneous cluster with machine owners!
  somewhat restrictive on behavior of jobs
PanControl
  manages distributed jobs (like Condor)
  manage machines in pool
  jobs can request or demand specific machines
  simple user interface
  interfaces with PanTasks

    host add foo
    host add bar

    job program
    job -host foo program
    job +host bar program
    job +host baz program

    for i 0 100
      sprintf input chip.%02d.fits
      sprintf output chip.%02d.flat
      job process $input $output
    end
    check job 0
    stdout job 0
    stderr job 1
    delete job 5
    host off foo
    host on foo
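The distinction between requesting and demanding a machine can be sketched as a host-selection rule. The slides do not spell out the semantics, so the reading below is an assumption: a demanded host is mandatory and the job waits for it, while a requested host is preferred but any idle host will do. `pick_host` is a hypothetical helper, not part of PanControl.

```python
def pick_host(idle_hosts, requested=None, demanded=None):
    """Choose a host for one job, or return None if it must keep waiting.

    Assumed semantics (not confirmed by the slides):
      demanded  -> the job runs only on that host
      requested -> that host is preferred, any idle host is acceptable
    """
    if demanded is not None:
        return demanded if demanded in idle_hosts else None
    if requested is not None and requested in idle_hosts:
        return requested
    return idle_hosts[0] if idle_hosts else None


idle = ["bar", "baz"]                     # foo is busy in this example
print(pick_host(idle))                    # bar  (any idle host)
print(pick_host(idle, requested="foo"))   # bar  (falls back)
print(pick_host(idle, demanded="foo"))    # None (waits for foo)
print(pick_host(idle, demanded="baz"))    # baz
```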
PanTasks
  Regularly scheduled task evaluation
  Tasks (potentially) spawn jobs
  Jobs may be local or parallel
  Jobs may be targeted to specific machines
    task datalist
      command ls /data/foo
      periods -exec 5.0
      periods -timeout 50.0
      periods -poll 1.0
      task.exit 0
        queueprint stdout
        queuedelete stdout
      end
      task.exit 1
        queuepush failure "task failed"
      end
    end

    task datalist
      periods -exec 5.0
      periods -timeout 50.0
      periods -poll 1.0
      task.exec
        $file = `next.file`
        if ($file == "none")
          break
        end
        command cp /data/foo/$file /data/bar
      end
      task.exit 0
        queueprint stdout
        queuedelete stdout
        queuepush copied $file
      end
      task.exit 1
        queuepush failure $file
      end
    end
User issues commands, loads scripts via readline
Client / Server Model designed, not yet implemented
[Diagram: user cmds, readline; CheckControl, pcontrol server; CheckTasks, task queue; CheckChild, job queue]
Limited number of tasks / jobs per interrupt cycle
  maximum of one evaluation per cycle
  task / job queues are continuously cycled
[Diagram: next task, next job, submit job]
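The cycling behaviour can be sketched with two rotating queues. This is an illustrative reading of the slide, not the PanTasks source: `run_cycle` and `max_per_cycle` are hypothetical names, and each queue entry is modelled as a plain callable.

```python
from collections import deque


def run_cycle(tasks, jobs, max_per_cycle=2):
    """One scheduler cycle: visit a few tasks and a few jobs, then return.

    Items are taken from the front of each queue and put back at the end,
    so the queues are cycled continuously. No item is visited twice in
    the same cycle.
    """
    visited = []
    for pending in (tasks, jobs):
        for _ in range(min(max_per_cycle, len(pending))):
            item = pending.popleft()
            visited.append(item())   # evaluate this task or check this job
            pending.append(item)     # rotate it to the back of the queue
    return visited


tasks = deque([lambda: "task A", lambda: "task B", lambda: "task C"])
jobs = deque([lambda: "job 1"])

first = run_cycle(tasks, jobs)
second = run_cycle(tasks, jobs)
print(first)   # ['task A', 'task B', 'job 1']
print(second)  # ['task C', 'task A', 'job 1']
```

Capping the work per cycle keeps the loop responsive, since control returns to the caller (for example, to read user commands) after a bounded amount of work.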
Pcontrol monitors hosts
  hosts may be added / deleted
  "down" hosts are automatically re-attached
  ssh communication to the hosts
Pcontrol Host Queue & States
[Diagram: host states new, idle, busy, done, down, off, delete. Transition labels: USER: host (name); LOOP: StartJob; USER: host -off (name); USER: host -on (name); USER: host -delete (name)]
Pcontrol moves jobs and hosts in parallel
Users may delete pending jobs, kill running jobs, or harvest crash/exit jobs.
PanTasks scheduler is normal Pcontrol "user"
[Diagram: job states new, pending, busy, done, exit, crash, hung, delete. Transition labels: U1, P1, T1, T2, U2]
User issues commands, loads scripts via readline
very similar to PanTasks scheduler loop (same code)
[Diagram: Pcontrol Server with readline and CheckChild loops, connected to several pcontrol clients]
User issues commands
Pclient launches, monitors background child process
reports stdout, stderr, exit status
very simple command set
[Diagram: Pclient: command src (pcontrol), readline, CheckChild, child process]
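The launch / monitor / report cycle can be sketched with Python's standard `subprocess` module. This is not the Pclient implementation: `run_and_monitor` is a hypothetical helper, and the polling loop is only safe for children that write a small amount of output, since a child that fills the pipe buffer would stall until its output is read.

```python
import subprocess
import sys
import time


def run_and_monitor(argv, poll_seconds=0.05):
    """Launch argv in the background, poll until it exits, report results."""
    child = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    while child.poll() is None:      # None means the child is still running
        time.sleep(poll_seconds)     # a real client could handle commands here
    out, err = child.communicate()   # collect what the child wrote
    return out, err, child.returncode


# Usage: a tiny child that prints one line and exits with status 3.
out, err, status = run_and_monitor(
    [sys.executable, "-c", "import sys; print('ok'); sys.exit(3)"]
)
print(repr(out), repr(err), status)
```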
Pclient: demonstrated job rates of ~500 per second
Pcontrol: tests to manage 62 nodes, ~150 jobs per second total rate
PanTasks: schedule / harvest ~50 jobs per second
Requirement: FF 640 jobs / 45 seconds (14 per second)
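The headroom implied by these figures can be checked with a few lines of arithmetic. The sketch reads the quoted rates as roughly 500, 150 and 50 jobs per second and the requirement as 640 jobs per 45 seconds; those readings are assumptions taken from the slide text.

```python
required = 640 / 45  # jobs per second, from the stated requirement

# Approximate measured rates in jobs per second, as read from the slide.
measured = {"Pclient": 500, "Pcontrol": 150, "PanTasks": 50}

print(f"required: {required:.1f} jobs/s")
for name, rate in measured.items():
    print(f"{name}: ~{rate} jobs/s, {rate / required:.1f}x the requirement")
```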
classical parallel (eg MPI) vs distributed processing
  increase total MHz
  increase total I/O rate
connection to "targeted" processing
observatory system
[Diagram: metadata db server, pantasks server, DVO server, Gig Switch, TJ Nodes, Sky Nodes]