<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Perplexed Labs &#187; multiprocessing</title>
	<atom:link href="http://blog.perplexedlabs.com/tag/multiprocessing/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.perplexedlabs.com</link>
	<description>web development war stories from the frontlines to the backend</description>
	<lastBuildDate>Mon, 16 May 2011 14:19:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.2</generator>
		<item>
		<title>Python data sharing in the multiprocessing module</title>
		<link>http://blog.perplexedlabs.com/2010/03/04/python-data-sharing-in-the-multiprocessing-module/</link>
		<comments>http://blog.perplexedlabs.com/2010/03/04/python-data-sharing-in-the-multiprocessing-module/#comments</comments>
		<pubDate>Thu, 04 Mar 2010 13:00:08 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[multiprocessing]]></category>

		<guid isPermaLink="false">http://blog.perplexedlabs.com/?p=430</guid>
		<description><![CDATA[Python's multiprocessing module is a great tool that abstracts the details of forking and managing child processes in an interface inspired by the threading module. The benefit to using processes over threads is that you effectively avoid the issues of the GIL (Global Interpreter Lock). I wanted to share my experience with sharing static data [...]


]]></description>
			<content:encoded><![CDATA[<p>Python's <a href="http://docs.python.org/library/multiprocessing.html">multiprocessing</a> module is a great tool that abstracts the details of forking and managing child processes behind an interface inspired by the <a href="http://docs.python.org/library/threading.html">threading</a> module. The benefit of using processes over threads is that each process runs its own interpreter with its own GIL (Global Interpreter Lock), so CPU-bound work can run in parallel across cores.</p>
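<p>I wanted to share my experience with sharing static data between the parent and the forked children. The solution I ultimately went with is simple to implement and works well. It takes advantage of the fact that a forked child starts as a copy of the parent, so every module the parent has imported, along with whatever state that module holds at the time of the fork, is already present in the child. If you house your data in a shared module and populate it before creating the pool, it's accessible in both places. Keep in mind that each child gets its own copy, so changes made after the fork are not seen by the other processes, and that this relies on children being forked (on Windows, where children are spawned rather than forked, state set at runtime in the parent is not carried over).</p>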
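<p>As a minimal, self-contained sketch of that mechanism (not the code from this post): the names <code>shared</code>, <code>read_shared</code> and <code>run_with_fork</code> are made up for illustration, a module-level dict stands in for the shared module, and the sketch assumes Python 3 on a POSIX platform where the "fork" start method is available.</p>

```python
import multiprocessing

# Stand-in for a shared module: plain module-level state.
shared = {}


def read_shared(key):
    # Runs in a child process and reads the state the parent set before forking.
    return shared.get(key)


def run_with_fork(keys):
    # Assumption: the "fork" start method exists on this platform (POSIX).
    ctx = multiprocessing.get_context("fork")
    pool = ctx.Pool(2)
    try:
        return pool.map(read_shared, keys)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    # Populate the shared state before the pool (and its children) is created.
    shared["host"] = "db1"
    print(run_with_fork(["host", "missing"]))
```

<p>The children never receive <code>shared</code> as an argument; they see it only because they were forked after the parent filled it in.</p>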
<p>The directory structure looks like this:</p>
<blockquote>
<pre>
mypackage/
    __init__.py
    mp.py
    myglobals.py
myscript.py
</pre>
</blockquote>
<p>Here's my light wrapper around the multiprocessing module, mp.py:</p>
<pre class="brush: python; title: ;">
import multiprocessing

import MySQLdb

# mp.py lives inside mypackage, so import its sibling module through the package
from mypackage import myglobals

# handles each unit of work, in this case a SQL query
def worker_do(sql):
    myglobals.cursor.execute(sql)

# called once upon worker initialization
def worker_init():
    myglobals.conn = MySQLdb.connect(**myglobals.config['db'])
    myglobals.cursor = myglobals.conn.cursor()
    myglobals.cursor.execute('SET AUTOCOMMIT=1')

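# wrapper for multiprocessing module
def do_work(queue, num_processes):
    # each worker process runs worker_init once, opening its own connection
    pool = multiprocessing.Pool(num_processes, initializer=worker_init)
    # chunksize=1 hands the queries to the workers one at a time
    pool.map(worker_do, queue, 1)
    pool.close()
    pool.join()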
</pre>
<p>And here's my example script, myscript.py:</p>
<pre class="brush: python; title: ;">
import sys

# mp and myglobals live inside mypackage (see the directory structure above)
from mypackage import mp, myglobals

def main():
   # anything placed in the myglobals module before the pool is created
   # will be accessible by the child processes
   #
   # we could retrieve this config info programmatically from a file
   # via ConfigParser; for simplicity I hard-coded it here
   myglobals.config = {
      'db': {
         'host': 'db1',
         'user': 'dbuser',
         'passwd': 'dbpasswd',
         'db': 'dbase'
      }
   }

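   # build a whole bunch of queries to perform via the workers
   # (build_queries is a placeholder for your own function that returns
   # a list of SQL strings; it is not defined in this example)
   queries = build_queries()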

   # perform the multiprocessing operation
   mp.do_work(queries, 4)

   return 0

if __name__ == '__main__':
   sys.exit(main())
</pre>
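<p>The comments in the script mention loading the config from a file via ConfigParser instead of hard-coding it. A rough sketch of what that could look like, assuming Python 3's <code>configparser</code> module and an INI layout I made up for illustration (<code>SAMPLE_INI</code> and <code>load_config</code> are hypothetical names, and the text is parsed from a string so the example stays self-contained):</p>

```python
import configparser

# Hypothetical INI content mirroring the hard-coded dict in myscript.py.
SAMPLE_INI = """
[db]
host = db1
user = dbuser
passwd = dbpasswd
db = dbase
"""


def load_config(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Convert each section to a plain dict so the result has the same
    # nested-dict shape as the hard-coded config.
    return {section: dict(parser[section]) for section in parser.sections()}


if __name__ == "__main__":
    print(load_config(SAMPLE_INI))
```

<p>With a real file you would call <code>parser.read(path)</code> instead of <code>read_string</code>, then assign the result to <code>myglobals.config</code> before creating the pool.</p>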
<p>In this example, the benefit is that your database configuration stays DRY: it is defined once, in the parent, and every child process reads the same data from myglobals.</p>
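<p>The shared-module approach depends on children inheriting the parent's module state, which holds when they are forked. Where children are spawned instead (Windows, for example), one alternative is to hand the config to each worker explicitly through the pool's <code>initargs</code> parameter. The following is only a sketch under that assumption, written for Python 3: <code>_config</code> and the optional <code>start_method</code> argument are illustrative additions, and the worker looks up a config value rather than running a query so that the example runs without a database.</p>

```python
import multiprocessing

# Per-process copy of the config, filled in by worker_init.
_config = None


def worker_init(config):
    # Called once in each worker; the config arrives as an explicit argument.
    global _config
    _config = config


def worker_do(key):
    # Stand-in for real work: look up a value from the db section.
    return _config["db"][key]


def do_work(keys, num_processes, config, start_method=None):
    # start_method=None uses the platform default ("fork", "spawn", ...).
    ctx = multiprocessing.get_context(start_method)
    pool = ctx.Pool(num_processes, initializer=worker_init, initargs=(config,))
    try:
        return pool.map(worker_do, keys, 1)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    cfg = {"db": {"host": "db1", "user": "dbuser"}}
    print(do_work(["host", "user"], 2, cfg))
```

<p>The config is pickled and sent to each worker, so it has to be picklable, and the call to <code>do_work</code> should stay under the <code>if __name__ == "__main__":</code> guard when children are spawned.</p>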


]]></content:encoded>
			<wfw:commentRss>http://blog.perplexedlabs.com/2010/03/04/python-data-sharing-in-the-multiprocessing-module/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

