
ShakerMaker Parallel Processing #7

@MohamedZouatine

Dear Professor @jaabell,

Do you have an example of running ShakerMaker with parallel processing to generate a DRM box?

I set up an example that generates a DRM box, but it has been running for a couple of days because it uses only a single processor. When I switched to parallel processing using the command

mpirun -np 10 python

I encountered the error shown below. I suspect this happens because multiple processors are trying to write simultaneously to the generated HDF5 DRM file.

ShakerMaker Run begin. dt=0.005 nfft=2048 dk=0.02 tb=200 tmin=0.0 tmax=100

Starting ShakerMaker…

ShakerMaker Run begin. dt=0.005 nfft=2048 dk=0.02 tb=200 tmin=0.0 tmax=100

Traceback (most recent call last):
File "/home/zouatine/ShakerMaker/examples/drm_example_santiago_10.py", line 102, in <module>
model.run(dt=dt, nfft=nfft, dk=dk, tb=tb, writer=writer) # writer only on rank 0
Traceback (most recent call last):
Traceback (most recent call last):
File "/home/zouatine/ShakerMaker/examples/drm_example_santiago_10.py", line 102, in <module>
model.run(dt=dt, nfft=nfft, dk=dk, tb=tb, writer=writer) # writer only on rank 0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/shakermaker.py", line 157, in run
writer.initialize(self._receivers, 2*nfft)
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/slw_extensions/drmhdf5stationlistwriter.py", line 32, in initialize
self._h5file = h5py.File(self._filename, mode="w")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 564, in __init__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 244, in make_fid
fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "h5py/_objects.pyx", line 56, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 57, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 122, in h5py.h5f.create
BlockingIOError: [Errno 11] Unable to synchronously create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/shakermaker.py", line 157, in run
writer.initialize(self._receivers, 2*nfft)
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/slw_extensions/drmhdf5stationlistwriter.py", line 32, in initialize
self._h5file = h5py.File(self._filename, mode="w")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 564, in __init__
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 244, in make_fid
fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "h5py/_objects.pyx", line 56, in h5py._objects.with_phil.wrapper
File "/home/zouatine/ShakerMaker/examples/drm_example_santiago_10.py", line 102, in <module>
model.run(dt=dt, nfft=nfft, dk=dk, tb=tb, writer=writer) # writer only on rank 0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/shakermaker.py", line 157, in run
writer.initialize(self._receivers, 2*nfft)
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/slw_extensions/drmhdf5stationlistwriter.py", line 32, in initialize
self._h5file = h5py.File(self._filename, mode="w")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 564, in __init__
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 244, in make_fid
fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "h5py/_objects.pyx", line 56, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 57, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 122, in h5py.h5f.create
BlockingIOError: [Errno 11] Unable to synchronously create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
key = name drm_10layers
key = h [0.01481481 0.01481481 0.01481481]
key = drmbox_x0 [ 0. 10. 0.]
File "h5py/_objects.pyx", line 57, in h5py._objects.with_phil.wrapper
key = drmbox_xmax 0.08888888888888889
File "h5py/h5f.pyx", line 122, in h5py.h5f.create
BlockingIOError: [Errno 11] Unable to synchronously create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
key = drmbox_ymax 10.088888888888889
key = drmbox_zmax 0.16296296296296298
key = drmbox_xmin -0.08888888888888889
key = drmbox_ymin 9.911111111111111
key = drmbox_zmin 0.0
key = created_by ---
key = program_used ShakeMaker version 1.1
key = created_on 26-Sep-2025 (14:13:12.345156)

Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.


mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[47521,1],1]
Exit code: 1
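
One way to test my suspicion would be to make sure that only one process ever constructs the HDF5 writer. Below is a minimal sketch of that guard, assuming mpi4py is the MPI binding in use. `get_rank`, `writer_for_rank` and `make_drm_writer` are hypothetical helper names of my own, not ShakerMaker API, and I have not verified whether `model.run` accepts `writer=None` on the non-root ranks.

```python
def get_rank():
    """Return this process's MPI rank, or 0 if mpi4py is not installed."""
    try:
        from mpi4py import MPI  # assumption: mpi4py is the MPI binding in use
    except ImportError:
        return 0
    return MPI.COMM_WORLD.Get_rank()


def writer_for_rank(rank, make_writer):
    """Call make_writer() on rank 0 only; every other rank gets None."""
    return make_writer() if rank == 0 else None


# Intended use in the driver script (make_drm_writer is a hypothetical
# factory that builds the DRM HDF5 writer; passing writer=None on the
# other ranks is an unverified assumption):
#   writer = writer_for_rank(get_rank(), make_drm_writer)
#   model.run(dt=dt, nfft=nfft, dk=dk, tb=tb, writer=writer)
```

With this guard the factory is never called on ranks other than 0, so those ranks cannot open the output file from the driver script. It does not change what the library itself does inside `run`.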

Your help will be highly appreciated.
