Simple Introductory Code

To illustrate how to code with Pymp, the sample code from the Pymp website shown in Listing 3 begins with basic Python code. To keep things simple, it is serial code that initializes a single array. Listing 4 is the Pymp version of the same code.

Listing 3

Python Code

from __future__ import print_function

import numpy as np

ex_array = np.zeros((100,), dtype='uint8')
for index in range(0, 100):
  ex_array[index] = 1
  print('Yay! {} done!'.format(index))

Listing 4

Pymp Code

from __future__ import print_function

import pymp

ex_array = pymp.shared.array((100,), dtype='uint8')
with pymp.Parallel(4) as p:
  for index in p.range(0, 100):
    ex_array[index] = 1
    # The parallel print function takes care of asynchronous output.
    p.print('Yay! {} done!'.format(index))

The first change to the serial code is to create the array with the pymp.shared.array() method so that every process can access it. The next change is the with pymp.Parallel(4) as p statement, which creates four parallel processes. Remember that these are forked processes, not threads.

The final change is to replace the range function with p.range(0, 100), which divides the loop iterations among the processes. According to the Pymp website, this corresponds to OpenMP's static schedule.
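
If the loop iterations do not all take the same amount of time, a static schedule may not balance the work evenly. My reading of the Pymp documentation is that the Parallel object also provides p.xrange, a dynamically scheduled counterpart to p.range, and a p.thread_num property that identifies the calling process; the short sketch below assumes both names exist in your Pymp version.

from __future__ import print_function

import pymp

ex_array = pymp.shared.array((100,), dtype='uint8')
with pymp.Parallel(4) as p:
  # p.xrange hands out iterations dynamically (comparable to OpenMP's
  # dynamic schedule), which can help when iteration costs vary.
  for index in p.xrange(0, 100):
    ex_array[index] = 1
    # p.thread_num reports which forked process handled this index.
    p.print('Yay! {} done by process {}!'.format(index, p.thread_num))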

The approach illustrated in these code samples bypasses the GIL in favor of the operating system's fork method. From the GitHub site, "Due to the copy-on-write strategy, this causes only a minimal overhead and results in the expected semantics." Note that using the system fork operation excludes Windows, because it lacks a fork mechanism.
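
One consequence of forking is worth spelling out: an ordinary Python variable assigned inside the parallel region stays private to the process that assigned it, whereas pymp.shared data structures are visible to every worker. The sketch below, loosely modeled on an example in the Pymp README, illustrates the difference and uses p.lock to serialize updates to a single shared element; the variable names are my own.

from __future__ import print_function

import pymp

counter = pymp.shared.array((1,), dtype='uint32')  # shared: seen by all processes
with pymp.Parallel(4) as p:
  private_total = 0                  # private: each forked process has its own copy
  for index in p.range(0, 100):
    private_total += 1
    with p.lock:                     # serialize concurrent updates to shared memory
      counter[0] += 1
  p.print('Process {} handled {} iterations'.format(p.thread_num, private_total))

print('Shared counter after the parallel region: {}'.format(counter[0]))

Without the lock, simultaneous increments of counter[0] could be lost, which is the same race condition OpenMP programmers guard against with critical sections or atomic updates.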

Laplace Solver Example

The next example, the common Laplace solver, is a little more detailed. The code is definitely not the most efficient (it uses explicit loops), but I hope it illustrates how to use Pymp. For the curious, timings are included in the code. Listing 5 is the Python version, and Listing 6 shows the Pymp version of the code. Changed lines are marked with arrows (--> <--).

Listing 5

Python Laplace Solver

import numpy
from time import perf_counter

nx = 1201
ny = 1201

# Solution and previous solution arrays
sol = numpy.zeros((nx,ny))
soln = sol.copy()

for j in range(0,ny-1):
  sol[0,j] = 10.0
  sol[nx-1,j] = 1.0
# end for

for i in range(0,nx-1):
  sol[i,0] = 0.0
  sol[i,ny-1] = 0.0
# end for

# Iterate
start_time = perf_counter()
for kloop in range(1,100):
  soln = sol.copy()

  for i in range(1,nx-1):
    for j in range(1,ny-1):
      sol[i,j] = 0.25 * (soln[i,j-1] + soln[i,j+1] + soln[i-1,j] + soln[i+1,j])
    # end j for loop
  # end i for loop
# end for
end_time = perf_counter()

print(' ')
print('Elapsed wall clock time = %g seconds.' % (end_time-start_time) )
print(' ')

Listing 6

Pymp Laplace Solver

-->  import pymp  <--
from time import perf_counter

nx = 1201
ny = 1201

# Solution and previous solution arrays
-->  sol = pymp.shared.array((nx,ny))  <--
-->  soln = pymp.shared.array((nx,ny))  <--

for j in range(0,ny-1):
  sol[0,j] = 10.0
  sol[nx-1,j] = 1.0
# end for

for i in range(0,nx-1):
  sol[i,0] = 0.0
  sol[i,ny-1] = 0.0
# end for

# Iterate
start_time = perf_counter()
-->  with pymp.Parallel(6) as p:  <--
  for kloop in range(1,100):
    soln = sol.copy()

    # Only the outer loop is split across the processes with p.range; the
    # inner loop stays a normal range so each process updates every column
    # of the rows it owns.
    for i in p.range(1,nx-1):
      for j in range(1,ny-1):
        sol[i,j] = 0.25 * (soln[i,j-1] + soln[i,j+1] + soln[i-1,j] + soln[i+1,j])
      # end j for loop
    # end i for loop
  # end kloop for loop
# end with
end_time = perf_counter()

print(' ')
print('Elapsed wall clock time = %g seconds.' % (end_time-start_time) )
print(' ')

To show that Pymp is actually doing what it is supposed to do, Table 1 lists the timings for various numbers of cores. Notice that the total time decreases as the number of cores increases, as expected.

Table 1

Pymp Timings

Number of Cores     Total Time (sec)
Base (serial)       165
1                   94
2                   42
4                   10.9
6                   5
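
To produce numbers like those in Table 1, the only thing that changes from run to run is the process count handed to pymp.Parallel(). One way to avoid editing the source each time is to take the count from the command line, as in the hedged sketch below (the script name and argument handling are my own and not part of the article's code, and p.num_threads is assumed to be available in your Pymp version).

import sys

import pymp

# Take the desired process count from the command line (default: 4).
num_procs = int(sys.argv[1]) if len(sys.argv) > 1 else 4

with pymp.Parallel(num_procs) as p:
  p.print('Hello from process {} of {}'.format(p.thread_num, p.num_threads))

Running the script as, for example, python3 pymp_hello.py 6 forks six workers. The Pymp documentation also mentions the PYMP_NUM_THREADS (or OMP_NUM_THREADS) environment variable for setting a default count, which I have not used here.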

Summary

Ever since Python was created, people have been trying to achieve multithreaded computation with it. Over the years, several tools have been created that perform computations outside of Python and integrate the results back into Python programs.

Over time, parallel programming approaches have become standards. OpenMP is one of the original standards and is very popular in C/C++ and Fortran programming, so a large number of developers know and use it in application development. However, OpenMP is used with compiled, not interpreted, languages.

Fortunately, the innovative Python Pymp tool was created. It is an OpenMP-like Python module that uses the fork mechanism of the operating system instead of threads to achieve parallelism. As illustrated in the examples in this article, it's not too difficult to port some applications to Pymp.
