Exercise: Process N files in parallel


Create a script that, given two numbers N and X, creates N files (1.txt to N.txt). In each file put X rows of random ASCII characters: digits, lower- and upper-case letters, and spaces (see the string module). Each row should be 0-80 characters long, with a random length chosen separately for each row. Use the script to create a set of files.
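
One possible starting point for the generator is sketched below, assuming Python and its standard string and random modules. The names ALPHABET, random_row and create_files, and the optional directory parameter, are made up for this illustration and are not part of the exercise.

```python
import random
import string
import sys
from pathlib import Path

# Allowed characters: digits, lower- and upper-case letters, and space.
ALPHABET = string.digits + string.ascii_letters + " "


def random_row(max_len=80):
    """Return one row of 0 to max_len random characters from ALPHABET."""
    length = random.randint(0, max_len)  # both ends are inclusive
    return "".join(random.choices(ALPHABET, k=length))


def create_files(n, x, directory="."):
    """Create 1.txt .. n.txt in directory, each holding x random rows."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(1, n + 1):
        path = target / f"{i}.txt"
        with path.open("w") as fh:
            for _ in range(x):
                fh.write(random_row() + "\n")
        paths.append(path)
    return paths


# Hypothetical command line: python create_files.py N X
# Nothing happens unless exactly two arguments are given.
if __name__ == "__main__" and len(sys.argv) == 3:
    create_files(int(sys.argv[1]), int(sys.argv[2]))
```

Because a row may have length 0, some lines in the generated files will be empty. That is expected.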

Write a script that, given a list of files, reads all of them, counts how many times each digit(!) appears in each file, and prints a combined report. First write the script in a single-process (linear) way. Then convert it to use multiprocessing. This version should also accept a number that sets the size of the pool. Ideally you'd only need to write a few lines of code for this, and you'd be able to use the code from the previous (linear) solution as a module.
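
The relationship between the two versions can be sketched as follows, assuming Python's multiprocessing.Pool. This is only one way to structure it. All function names here are invented for the illustration, and both versions are shown in one block for brevity, whereas in the exercise the parallel script would import the counting function from the linear one.

```python
from collections import Counter
from multiprocessing import Pool

DIGITS = "0123456789"


def count_digits_in_file(path):
    """Return a Counter of how often each digit appears in one file."""
    counts = Counter()
    with open(path) as fh:
        for line in fh:
            counts.update(ch for ch in line if ch in DIGITS)
    return counts


def count_linear(paths):
    """Single-process version: handle the files one after another."""
    return {path: count_digits_in_file(path) for path in paths}


def count_parallel(paths, pool_size):
    """Pool version: the same per-file function, spread over pool_size workers."""
    with Pool(pool_size) as pool:
        results = pool.map(count_digits_in_file, paths)  # keeps input order
    return dict(zip(paths, results))


def combine(per_file):
    """Add up the per-file counters into one overall Counter."""
    combined = Counter()
    for counts in per_file.values():
        combined.update(counts)
    return combined
```

The per-file function does all the work and shares no state, so the only difference between the two versions is who calls it. On platforms that start worker processes by re-importing the main script, the call to count_parallel should sit under an if __name__ == "__main__": guard.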

Submit the 3 scripts.

The report could look like this:


         0   1   2   3   4   5   6   7   8   9
1.txt    3   1   3   2   8   3   2   3   2   6
2.txt    6   5   3   1   6   7   4   4   4   4
3.txt    6   3   4   7   2   5   5   1   7   6
TOTAL   15   9  10  10  16  15  11   8  13  16
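
A layout like the one above could be produced along these lines. This is a sketch that assumes the per-file counts are available as a dictionary mapping each file name to a mapping from digit character to count. The function name format_report and the fixed column width of 4 are choices made for this illustration.

```python
def format_report(per_file):
    """Format per-file digit counts plus a TOTAL row as an aligned text table."""
    digits = "0123456789"
    rows = list(per_file.items())
    # The TOTAL row is the column-wise sum over all files.
    total = {d: sum(counts.get(d, 0) for _, counts in rows) for d in digits}
    rows.append(("TOTAL", total))
    # The name column is one character wider than the longest row label.
    name_width = max(len(name) for name, _ in rows) + 1
    lines = [" " * name_width + "".join(f"{d:>4}" for d in digits)]
    for name, counts in rows:
        cells = "".join(f"{counts.get(d, 0):>4}" for d in digits)
        lines.append(f"{name:<{name_width}}{cells}")
    return "\n".join(lines)
```

Counts of 1000 or more would fill the 4-character cells completely, so a wider cell is needed for large inputs.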

Create 100 files with 10000 rows in each and measure how long the linear version takes compared with the parallel version at various pool sizes.
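
The measurement could be done with a small helper like the one below, a sketch using time.perf_counter from the standard library. The names timed and compare are hypothetical. The linear and parallel arguments stand for whatever functions your own scripts expose, assumed here to take a list of paths and, for the parallel one, a pool size as the second argument.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def compare(linear, parallel, paths, pool_sizes):
    """Time the linear run once and the parallel run once per pool size."""
    timings = []
    _, seconds = timed(linear, paths)
    timings.append(("linear", seconds))
    for size in pool_sizes:
        _, seconds = timed(parallel, paths, size)
        timings.append((f"pool={size}", seconds))
    return timings
```

A single run per setting can be noisy, so repeating each measurement a few times and comparing the best or the average gives a fairer picture.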