serial

Ben Payne 2015-01-16

serial

$ python serial_print_to_file.py
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]

This Python script reads "parameters.input" to determine how many integers to write to a file and to stdout. The integers are written as plain text to output.dat and as a binary pickle to output.pkl.
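The behavior described above can be sketched roughly as follows. This is not the contents of serial_print_to_file.py; the function names are hypothetical, and two things are assumed rather than known: that "parameters.input" holds the count as its first whitespace-separated token, and that output.dat gets one integer per line.

```python
import pickle
from pathlib import Path


def read_count(path="parameters.input"):
    """Return the number of integers to write.

    Assumption: the count is the first whitespace-separated token in the file.
    """
    return int(Path(path).read_text().split()[0])


def write_outputs(count, text_path="output.dat", pickle_path="output.pkl"):
    """Write 1..count as plain text (one per line, assumed) and as a pickle."""
    numbers = list(range(1, count + 1))
    with open(text_path, "w") as text_file:
        for number in numbers:
            text_file.write(f"{number}\n")
    with open(pickle_path, "wb") as pickle_file:
        pickle.dump(numbers, pickle_file)
    return numbers
```

Calling `print(write_outputs(read_count()))` in a directory containing "parameters.input" would then print the list to stdout as well as writing both files.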

multiple independent workers, sequential

$ ./launch_workers.sh 6 print_to_file_worker.py

This bash script launches the "print_to_file_worker.py" script, which generates two sets of output files, output_<N>.dat and output_<N>.pkl, where <N> is the index of the file (here there are 6, so files are numbered 1 to 6). Since 6 does not divide 100 evenly, the last file (output_6) contains more lines than the others:

$ wc -l output_*.dat
      16 output_1.dat
      16 output_2.dat
      16 output_3.dat
      16 output_4.dat
      16 output_5.dat
      20 output_6.dat
     100 total

Each worker includes a "sleep" delay to mimic a computational task. Since the independent workers run sequentially, the total time is the sum of the individual worker delays.
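To make the timing claim concrete, here is a minimal in-process sketch. The `worker` and `run_sequentially` names are hypothetical stand-ins, not the repository's scripts, and the delays passed in are arbitrary.

```python
import time


def worker(delay_seconds):
    """Stand-in for a computational task: it only sleeps."""
    time.sleep(delay_seconds)


def run_sequentially(delays):
    """Run one worker per delay, back to back; return elapsed wall time."""
    started = time.perf_counter()
    for delay_seconds in delays:
        worker(delay_seconds)
    return time.perf_counter() - started
```

With six workers that each sleep for one second, `run_sequentially([1] * 6)` should take about six seconds.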

To join these files,

$ python join_worker_output.py 6

To check that this produced the right output:

$ wc -l output_from_joiner.dat 
     100 output_from_joiner.dat
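A joiner of this kind could look roughly like the sketch below. It is not the contents of join_worker_output.py; the function name and the `directory` parameter are hypothetical, and each worker file is assumed to end with a newline so that lines from adjacent files do not run together.

```python
from pathlib import Path


def join_worker_output(num_workers, directory=".",
                       joined_name="output_from_joiner.dat"):
    """Concatenate output_1.dat .. output_<num_workers>.dat in index order."""
    directory = Path(directory)
    joined_path = directory / joined_name
    with open(joined_path, "w") as joined:
        # Loop over the indices explicitly: a sorted glob would place
        # output_10.dat before output_2.dat.
        for file_indx in range(1, num_workers + 1):
            part = directory / f"output_{file_indx}.dat"
            joined.write(part.read_text())
    return joined_path
```

Iterating over the numeric index, rather than over a directory listing, keeps the joined integers in their original order once there are ten or more workers.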

math of dividing indices among workers

Suppose I have the indices 1:13 and 3 computation cores.

indices:   1 2 3 4 5 6 7 8 9 10 11 12 13
CPU cores: <- 1 -> <- 2 -> <-    3    ->

Let num_items = 13 and num_cores = 3.

Then chunk_size = floor(num_items/num_cores) = floor(13/3) = 4.

Let core_indx = 1:num_cores = 1:3. Then we want

core_indx | start | end
----------------------
     1    |   1   |  4
     2    |   5   |  8
     3    |   9   | 13

To capture this, let

    start = (core_indx-1)*chunk_size + 1

    if core_indx = num_cores, then end = num_items
    else end = core_indx*chunk_size
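The rule above translates directly into Python. The function name `chunk_bounds` is hypothetical; the variable names follow the text, and the bounds are 1-based and inclusive, as in the table.

```python
def chunk_bounds(core_indx, num_items, num_cores):
    """Return the 1-based, inclusive (start, end) indices for one core."""
    chunk_size = num_items // num_cores  # floor(num_items / num_cores)
    start = (core_indx - 1) * chunk_size + 1
    if core_indx == num_cores:
        # The last core also takes the leftover items.
        end = num_items
    else:
        end = core_indx * chunk_size
    return start, end


for core_indx in range(1, 4):
    print(core_indx, chunk_bounds(core_indx, num_items=13, num_cores=3))
```

For 13 items on 3 cores this reproduces the table: (1, 4), (5, 8), (9, 13). For 100 items on 6 workers, chunk_size is 16, so the first five workers get 16 items each and the last gets 20, which matches the `wc -l` output shown earlier.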

multiple independent workers, parallel

The launch_workers bash script can run the "print_to_file_worker" scripts either sequentially or, using subshells, in parallel.

sequential:
python $2 $core_indx $1

parallel:
( python $2 $core_indx $1 ) &

When run in parallel, the total delay is roughly the time for one worker to run, rather than the sum over all workers.
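For comparison, the same two launch modes can be sketched in Python with the standard `subprocess` module. This is an illustration, not a translation of launch_workers.sh: the function name `launch_workers` and the `parallel` flag are hypothetical, the argument order mirrors `python $2 $core_indx $1`, and the final wait on all workers is an assumption (the text does not say whether the bash script waits).

```python
import subprocess
import sys


def launch_workers(script, num_workers, parallel=False):
    """Start one worker per core index; return the list of exit codes."""
    running = []
    exit_codes = []
    for core_indx in range(1, num_workers + 1):
        command = [sys.executable, script, str(core_indx), str(num_workers)]
        if parallel:
            # Like "( python ... ) &": start the worker and move on.
            running.append(subprocess.Popen(command))
        else:
            # Like "python ...": block until this worker finishes.
            exit_codes.append(subprocess.run(command).returncode)
    # In parallel mode, collect every worker before returning.
    exit_codes.extend(process.wait() for process in running)
    return exit_codes
```

With `parallel=True` all workers sleep at the same time, so the elapsed time is close to that of a single worker plus process start-up overhead.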
