
5/10/2014

Gevent Tutorial

gevent For the Working Python Developer

gevent is a concurrency library based around libev. It provides a clean API for a variety of concurrency and network related tasks.

Intro

The structure of this tutorial assumes an intermediate level knowledge of Python but not much else. No knowledge of concurrency is expected. The goal is to give you the tools you need to get going with gevent and use it to solve or speed up your applications today.

The primary pattern provided by gevent is the Greenlet, a lightweight coroutine provided to Python as a C extension module. Greenlets all run inside of the OS process for the main program but are scheduled cooperatively on top of the libev event loop. This differs from subprocesses, which are new processes spawned by the OS.

Greenlets

Synchronous & Asynchronous Execution

The core idea of concurrency is that a larger task can be broken down into a collection of subtasks whose operation does not depend on the other tasks and thus can be run side-by-side (asynchronously) instead of one at a time (synchronously).

A somewhat synthetic example defines a task function which is non-deterministic (i.e. its output is not guaranteed to give the same result for the same inputs). In this case the side effect of running the function is that the task pauses its execution for a random number of seconds.

import random

import gevent


def task(pid):
    """Some non-deterministic task."""
    gevent.sleep(random.randint(0, 2))
    print('Task', pid, 'done')


def synchronous():
    # Ten tasks, numbered 1 to 10, run one after another.
    for i in range(1, 11):
        task(i)


def asynchronous():
    # The same ten tasks, each wrapped in its own greenlet.
    threads = []
    for i in range(1, 11):
        threads.append(gevent.spawn(task, i))
    gevent.joinall(threads)


print('Synchronous:')
synchronous()

print('Asynchronous:')
asynchronous()


In the synchronous case all the tasks are run sequentially, which results in the main program blocking (i.e. pausing the execution of the main program) while each task executes.

The important parts of the program are gevent.spawn, which wraps the given function inside a Greenlet thread, and gevent.joinall. The list of initialized greenlets is stored in the array threads, which is passed to the gevent.joinall function; this blocks the current program to run all the given greenlets. The execution will step forward only when all the greenlets terminate.

The output is:

Synchronous:
Task 1 done
Task 2 done
Task 3 done
Task 4 done
Task 5 done
Task 6 done
Task 7 done
Task 8 done
Task 9 done
Task 10 done
Asynchronous:
Task 2 done
Task 3 done
Task 5 done
Task 10 done
Task 8 done
Task 6 done
Task 9 done
Task 1 done
Task 4 done
Task 7 done

The important fact to notice is that the order of execution in the async case is essentially random and that the total execution time in the async case is much less than in the sync case. In fact, the maximum time for the synchronous case to complete is when each task pauses for 2 seconds, resulting in 20 seconds for the whole queue. In the async case the maximum runtime is roughly 2 seconds since none of the tasks block the execution of the others.
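The timing argument above can be checked with a little arithmetic. The sketch below is illustrative only and does not use gevent: it assumes ten tasks whose pauses are whole seconds between 0 and 2, and the helper names sequential_runtime and concurrent_runtime are hypothetical. Run one after another, the pauses add up; run side by side, only the longest pause matters (scheduling overhead is ignored).

```python
import random


def sequential_runtime(delays):
    """Total time when each task must finish before the next one starts."""
    return sum(delays)


def concurrent_runtime(delays):
    """Approximate total time when all pauses overlap (overhead ignored)."""
    return max(delays)


# Ten tasks, each pausing for 0, 1 or 2 seconds (assumed for illustration).
delays = [random.randint(0, 2) for _ in range(10)]

# Worst case: every task draws the longest possible pause.
worst_case = [2] * 10

print('sequential:', sequential_runtime(delays), 'seconds')
print('concurrent:', concurrent_runtime(delays), 'seconds')
print('worst case:', sequential_runtime(worst_case), 'vs',
      concurrent_runtime(worst_case), 'seconds')
```

The worst case reproduces the figures quoted above: 20 seconds when the pauses are summed and 2 seconds when they overlap.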

In a more common use case, fetching data from a server asynchronously, the runtime of fetch() will differ between requests depending on the load on the remote server.

from gevent import monkey

# Patch the standard socket module so that blocking network calls yield
# to other greenlets; without this the requests run one after another.
monkey.patch_socket()

import json
from urllib.request import urlopen

import gevent


def fetch(pid):
    response = urlopen('http://json-time.appspot.com/time.json')
    result = response.read()
    json_result = json.loads(result)
    datetime = json_result['datetime']

    print('Process', pid, datetime)
    return json_result['datetime']


def synchronous():
    for i in range(1, 10):
        fetch(i)


def asynchronous():
    threads = []
    for i in range(1, 10):
        threads.append(gevent.spawn(fetch, i))
    gevent.joinall(threads)


print('Synchronous:')
synchronous()

print('Asynchronous:')
asynchronous()

Race Conditions

The perennial problem involved with concurrency is known as a race condition. Simply put, it occurs when two concurrent threads / processes depend on some shared resource but also attempt to modify its value. This results in resources whose values depend on the order of execution. This is a problem, and in general one should very much try to avoid race conditions since they result in program behavior which is globally non-deterministic.

One approach to avoiding race conditions is to simply not have any global state shared between threads. To communicate, threads instead pass stateless messages between each other.
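A minimal sketch of both ideas, under stated assumptions: the names shared, unsafe_increment, owner and sender are hypothetical, and gevent.sleep(0) is used only to force a switch between the read and the write. In the first half, five greenlets update a shared counter and lose updates. In the second half, the same five increments are sent as messages to a single greenlet that owns the total.

```python
import gevent
from gevent.queue import Queue

# Shared state: every greenlet reads, yields, then writes back a stale value.
shared = {'count': 0}


def unsafe_increment():
    current = shared['count']        # read the shared value
    gevent.sleep(0)                  # yield so other greenlets can run
    shared['count'] = current + 1    # write back a possibly stale value


gevent.joinall([gevent.spawn(unsafe_increment) for _ in range(5)])
print('shared state total:', shared['count'])   # updates have been lost

# Message passing: only the owner greenlet ever touches the total.
inbox = Queue()


def owner(expected):
    total = 0
    for _ in range(expected):
        total += inbox.get()         # blocks until a message arrives
    return total


def sender():
    gevent.sleep(0)                  # yield, as in the unsafe version
    inbox.put(1)                     # send an increment instead of writing


owner_greenlet = gevent.spawn(owner, 5)
gevent.joinall([gevent.spawn(sender) for _ in range(5)])
message_total = owner_greenlet.get()   # wait for the owner and take its result
print('message passing total:', message_total)
```

Because greenlets only switch at explicit yield points, the lost update here needs the sleep between the read and the write; any blocking call in that position would have the same effect.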

Spawning threads

gevent provides a few wrappers around Greenlet initialization. Some of the most common patterns are:

import gevent
from gevent import Greenlet


def foo(message, n):
    """
    Each thread will be passed the message and n arguments
    in its initialization.
    """
    print(message)
    gevent.sleep(n)


# Initialize a new Greenlet instance running the named function foo.
# Greenlet.spawn already schedules the greenlet, so no start() call is needed.
thread1 = Greenlet.spawn(foo, "Hello", 1)

# Wrapper for creating and running a new Greenlet from the named
# function foo, with the passed arguments
thread2 = gevent.spawn(foo, "I live!", 2)

# Lambda expressions
thread3 = gevent.spawn(lambda x: (x + 1), 2)

threads = [thread1, thread2, thread3]

# Block until all threads complete.
gevent.joinall(threads)

In addition to using the base Greenlet class, you may also subclass the Greenlet class and override the _run method.

import gevent
from gevent import Greenlet


class MyGreenlet(Greenlet):

    def __init__(self, message, n):
        Greenlet.__init__(self)
        self.message = message
        self.n = n

    def _run(self):
        print(self.message)
        gevent.sleep(self.n)


g = MyGreenlet("Hi there!", 3)
g.start()
g.join()

Greenlet State

Like any other segment of code, Greenlets can fail in various ways. A greenlet may throw an exception, fail to halt, or consume too many system resources.

The internal state of a greenlet is generally a time-dependent parameter. There are a number of flags on greenlets which let you monitor the state of the thread:

started -- Boolean, indicates whether the Greenlet has been started.
ready() -- Boolean, indicates whether the Greenlet has halted.
successful() -- Boolean, indicates whether the Greenlet has halted and not thrown an exception.
value -- arbitrary, the value returned by the Greenlet.
exception -- exception, the uncaught exception instance thrown inside the greenlet.

import gevent


def win():
    return 'You win!'


def fail():
    raise Exception('You fail at failing.')


winner = gevent.spawn(win)
loser = gevent.spawn(fail)

print(winner.started)  # True
print(loser.started)   # True

# Exceptions raised in the Greenlet stay inside the Greenlet.
try:
    gevent.joinall([winner, loser])
except Exception as e:
    print('This will never be reached')

print(winner.value)  # 'You win!'
print(loser.value)   # None

print(winner.ready())  # True
print(loser.ready())   # True

print(winner.successful())  # True
print(loser.successful())   # False

# The exception raised in fail will not propagate outside the
# greenlet. A stack trace will be printed to stderr but it
# will not unwind the stack of the parent.
print(loser.exception)

# It is possible though to raise the exception again outside:
# raise loser.exception
# or with
# loser.get()

Greenlets that fail to yield when the main program receives a SIGQUIT may hold the program's execution longer than expected. This results in so-called "zombie processes" which need to be killed from outside of the Python interpreter.

A common pattern is to listen for SIGQUIT events in the main program and to invoke gevent.shutdown before exit.

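import gevent


def run_forever():
    gevent.sleep(1000)


def main():
    thread = gevent.spawn(run_forever)
    thread.join()  # keep the main program alive until the greenlet ends


if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        # gevent.shutdown() belongs to early gevent releases and may be
        # absent from newer ones.
        gevent.shutdown()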

Timeouts

 

Timeouts are a constraint on the runtime of a block of code or a Greenlet.

import gevent
from gevent import Timeout

# Illustrative value; any limit shorter than the 10-second sleep works.
seconds = 5

timeout = Timeout(seconds)
timeout.start()


def wait():
    gevent.sleep(10)


try:
    gevent.spawn(wait).join()
except Timeout:
    print('Could not complete')

Timeouts can also be used as a context manager in a with statement.

import gevent
from gevent import Timeout

time_to_wait = 5  # seconds


class TooLong(Exception):
    pass


with Timeout(time_to_wait, TooLong):
    gevent.sleep(10)

In addition, gevent also provides timeout arguments for a variety of Greenlet and data structure related calls. For example:

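import gevent
from gevent import Timeout


def wait():
    gevent.sleep(2)


# join() returns quietly when its timeout expires, so check ready().
thread1 = gevent.spawn(wait)
thread1.join(timeout=1)
if not thread1.ready():
    print('Thread 1 timed out')

# --

# get() raises Timeout when the timer expires first.
timer = Timeout.start_new(1)
thread2 = gevent.spawn(wait)
try:
    thread2.get(timeout=timer)
except Timeout:
    print('Thread 2 timed out')

# --

# with_timeout() runs the function and raises Timeout if it takes too long.
try:
    gevent.with_timeout(1, wait)
except Timeout:
    print('Thread 3 timed out')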
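Timeouts do not have to surface as exceptions. The sketch below shows two ways of handling them quietly: the timeout_value keyword of gevent.with_timeout, and passing False as the exception of a Timeout used as a context manager. The function slow and the short durations are assumptions chosen only to keep the example quick.

```python
import gevent
from gevent import Timeout


def slow():
    gevent.sleep(0.5)
    return 'finished'


# with_timeout returns timeout_value instead of raising when the
# timeout expires before the function returns.
result = gevent.with_timeout(0.1, slow, timeout_value='gave up')
print(result)

# Passing False as the exception makes the context manager swallow the
# timeout; the rest of the block is simply abandoned when it fires.
completed = False
with Timeout(0.1, False):
    gevent.sleep(0.5)
    completed = True   # skipped, because the timeout fires first

print('completed:', completed)
```

Returning a sentinel value suits calls where a timeout is an expected outcome; raising remains the better choice when a timeout indicates a real failure.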

Data Structures


Events

Events are a form of asynchronous communication between Greenlets.

import gevent
from gevent.event import Event

a = Event()


def setter():
    """
    After 3 seconds, wake all threads waiting on a.
    """
    gevent.sleep(3)
    a.set()


def waiter():
    """
    After 3 seconds the wait call will unblock.
    """
    a.wait()  # blocking
    print('I live!')


gevent.joinall([
    gevent.spawn(setter),
    gevent.spawn(waiter),
])

An extension of the Event object is the AsyncResult, which allows you to send a value along with the wakeup call. This is sometimes called a future or a deferred, since it holds a reference to a future value that can be set on an arbitrary time schedule.

import gevent
from gevent.event import AsyncResult

a = AsyncResult()


def setter():
    """
    After 3 seconds set the result of a.
    """
    gevent.sleep(3)
    a.set('Hello!')


def waiter():
    """
    After 3 seconds the get call will unblock after the setter
    puts a value into the AsyncResult.
    """
    print(a.get())


gevent.joinall([
    gevent.spawn(setter),
    gevent.spawn(waiter),
])
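Because an AsyncResult holds its value once set, any number of greenlets can wait on the same one. The following sketch assumes a hypothetical producer and two consumers, with a short delay in place of the 3 seconds used above. Both consumers receive the same value, and a later get() returns it immediately.

```python
import gevent
from gevent.event import AsyncResult

result = AsyncResult()
received = []


def producer():
    gevent.sleep(0.1)
    result.set('Hello!')    # wakes every greenlet blocked in get()


def consumer(name):
    # Blocks until the producer sets the value.
    received.append((name, result.get()))


gevent.joinall([
    gevent.spawn(producer),
    gevent.spawn(consumer, 'first'),
    gevent.spawn(consumer, 'second'),
])

print(received)

# The value stays available after it has been set, so this does not block.
print(result.get())
```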


Queues

Queues are ordered sets of data that have the usual put / get operations but are written in a way such that they can be safely manipulated across Greenlets.

For example, if one Greenlet grabs an item off of the queue, the same item will not be grabbed by another Greenlet executing simultaneously.

import gevent
from gevent.queue import Queue

tasks = Queue()


def worker(n):
    while not tasks.empty():
        task = tasks.get()
        print('Worker %s got task %s' % (n, task))
        gevent.sleep(0.5)

    print('Quitting time!')


def boss():
    for i in range(1, 25):
        tasks.put_nowait(i)


gevent.spawn(boss).join()

gevent.joinall([
    gevent.spawn(worker, 'steve'),
    gevent.spawn(worker, 'john'),
    gevent.spawn(worker, 'nancy'),
])

Queues can also block on either put or get as the need arises. Each of the put and get operations has a non-blocking counterpart, put_nowait and get_nowait, which will not block but instead raise either gevent.queue.Empty or gevent.queue.Full if the operation is not possible.

In this example we have the boss running simultaneously with the workers and have a restriction on the Queue that it can contain no more than three elements. This restriction means that the put operation will block until there is space on the queue. Conversely, the get operation will block if there are no elements on the queue to fetch. It also takes a timeout argument to allow the worker to exit with the exception gevent.queue.Empty if no work can be found within the time frame of the timeout.

import gevent
from gevent.queue import Queue, Empty

tasks = Queue(maxsize=3)


def worker(n):
    try:
        while True:
            task = tasks.get(timeout=1)  # decrements queue size by 1
            print('Worker %s got task %s' % (n, task))
            gevent.sleep(0.5)
    except Empty:
        print('Quitting time!')


def boss():
    """
    Boss will wait to hand out work until an individual worker is
    free since the maxsize of the task queue is 3.
    """
    for i in range(1, 10):
        tasks.put(i)
    print('Assigned all work in iteration 1')

    for i in range(10, 20):
        tasks.put(i)
    print('Assigned all work in iteration 2')


gevent.joinall([
    gevent.spawn(boss),
    gevent.spawn(worker, 'steve'),
    gevent.spawn(worker, 'john'),
    gevent.spawn(worker, 'bob'),
])
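The non-blocking counterparts described above can be seen in isolation with a bounded queue and no worker greenlets at all. This is a small sketch: the maxsize of 2 and the events list are assumptions made only to keep the output short.

```python
from gevent.queue import Queue, Empty, Full

tasks = Queue(maxsize=2)
events = []

# Fill the queue to its limit without blocking.
tasks.put_nowait('a')
tasks.put_nowait('b')

# A third put cannot succeed, so Full is raised instead of blocking.
try:
    tasks.put_nowait('c')
except Full:
    events.append('full')

# Drain the queue; items come back in the order they were put.
events.append(tasks.get_nowait())
events.append(tasks.get_nowait())

# Nothing is left, so Empty is raised instead of blocking.
try:
    tasks.get_nowait()
except Empty:
    events.append('empty')

print(events)
```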

Locks and Semaphores

 

Groups and Pools

Actor Model

The actor model is a higher level concurrency model popularized by the language Erlang. In short, the main idea is that you have a collection of independent Actors, each of which has an inbox from which it receives messages from other Actors. The main loop inside the Actor iterates through its messages and takes action according to its desired behavior.

Gevent does not have a primitive Actor type, but we can define one very simply using a Queue inside of a subclassed Greenlet.

import gevent
from gevent.queue import Queue


class Actor(gevent.Greenlet):

    def __init__(self):
        self.inbox = Queue()
        gevent.Greenlet.__init__(self)

    def recieve(self, message):
        """
        Define in your subclass.
        """
        raise NotImplementedError()

    def _run(self):
        self.running = True

        while self.running:
            message = self.inbox.get()
            self.recieve(message)

In a use case:

class Echo(Actor):
    def recieve(self, message):
        print(message)


class Speaker(Actor):
    def recieve(self, message):
        if message == 'start':
            for i in range(1, 5):
                echo.inbox.put('Hey there!')


echo = Echo()
speak = Speaker()

echo.start()
speak.start()

speak.inbox.put('start')
gevent.joinall([echo, speak])

Real World Applications

Holding Side Effects

In this example we hold the side effects of executing an arbitrary string.

from gevent import Greenlet

env = {}


def run_code(code, env={}):
    local = locals()
    local.update(env)
    exec(code, globals(), local)
    return local


while True:
    code = input('>')

    g = Greenlet.spawn(run_code, code, env)
    g.join()  # block until code executes

    # If successful then pass the locals to the next command
    if g.value:
        env = g.get()
    else:
        print(g.exception)

WSGI Servers

from gevent.pywsgi import WSGIServer


def application(environ, start_response):
    status = '200 OK'
    body = b'Hello Cruel World!'  # WSGI response bodies are bytes

    headers = [
        ('Content-Type', 'text/html')
    ]

    start_response(status, headers)
    return [body]


WSGIServer(('', 8000), application).serve_forever()

Long Polling
Chat Server
License
This is a collaborative document published under the MIT license. Forking on GitHub is encouraged.