Dear Professor @jaabell,
Do you have an example of running ShakerMaker with parallel processing to generate a DRM box?
I worked on an example where I generated a DRM box, but the process has been running for a couple of days because I used only a single processor. When I switched to parallel processing using the command
mpirun -np 10 python
I encountered the error shown below. I suspect this happens because multiple processors are trying to write simultaneously to the generated HDF5 DRM file.
ShakerMaker Run begin. dt=0.005 nfft=2048 dk=0.02 tb=200 tmin=0.0 tmax=100
Starting ShakerMaker…
ShakerMaker Run begin. dt=0.005 nfft=2048 dk=0.02 tb=200 tmin=0.0 tmax=100
Traceback (most recent call last):
File "/home/zouatine/ShakerMaker/examples/drm_example_santiago_10.py", line 102, in <module>
model.run(dt=dt, nfft=nfft, dk=dk, tb=tb, writer=writer) # writer only on rank 0
Traceback (most recent call last):
Traceback (most recent call last):
File "/home/zouatine/ShakerMaker/examples/drm_example_santiago_10.py", line 102, in <module>
model.run(dt=dt, nfft=nfft, dk=dk, tb=tb, writer=writer) # writer only on rank 0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/shakermaker.py", line 157, in run
writer.initialize(self._receivers, 2*nfft)
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/slw_extensions/drmhdf5stationlistwriter.py", line 32, in initialize
self._h5file = h5py.File(self._filename, mode="w")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 564, in __init__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 244, in make_fid
fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "h5py/_objects.pyx", line 56, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 57, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 122, in h5py.h5f.create
BlockingIOError: [Errno 11] Unable to synchronously create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/shakermaker.py", line 157, in run
writer.initialize(self._receivers, 2*nfft)
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/slw_extensions/drmhdf5stationlistwriter.py", line 32, in initialize
self._h5file = h5py.File(self._filename, mode="w")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 564, in __init__
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 244, in make_fid
fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "h5py/_objects.pyx", line 56, in h5py._objects.with_phil.wrapper
File "/home/zouatine/ShakerMaker/examples/drm_example_santiago_10.py", line 102, in <module>
model.run(dt=dt, nfft=nfft, dk=dk, tb=tb, writer=writer) # writer only on rank 0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/shakermaker.py", line 157, in run
writer.initialize(self._receivers, 2*nfft)
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/shakermaker/slw_extensions/drmhdf5stationlistwriter.py", line 32, in initialize
self._h5file = h5py.File(self._filename, mode="w")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 564, in __init__
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zouatine/shakermaker-py311-env/lib/python3.11/site-packages/h5py/_hl/files.py", line 244, in make_fid
fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "h5py/_objects.pyx", line 56, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 57, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 122, in h5py.h5f.create
BlockingIOError: [Errno 11] Unable to synchronously create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
key = name drm_10layers
key = h [0.01481481 0.01481481 0.01481481]
key = drmbox_x0 [ 0. 10. 0.]
File "h5py/_objects.pyx", line 57, in h5py._objects.with_phil.wrapper
key = drmbox_xmax 0.08888888888888889
File "h5py/h5f.pyx", line 122, in h5py.h5f.create
BlockingIOError: [Errno 11] Unable to synchronously create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
key = drmbox_ymax 10.088888888888889
key = drmbox_zmax 0.16296296296296298
key = drmbox_xmin -0.08888888888888889
key = drmbox_ymin 9.911111111111111
key = drmbox_zmin 0.0
key = created_by ---
key = program_used ShakeMaker version 1.1
key = created_on 26-Sep-2025 (14:13:12.345156)
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[47521,1],1]
Exit code: 1
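To illustrate the mechanism I suspect, here is a minimal sketch that uses only the Python standard library, not h5py or ShakerMaker. It assumes a Linux system and that the HDF5 file lock behaves like a non-blocking exclusive `flock`; the file name `drm_demo.h5drm` and the helpers `try_exclusive_lock` and `demo` are hypothetical names chosen for the illustration. Two independent handles on the same file stand in for two MPI ranks that both try to create the output file.

```python
import errno
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Open `path` and take a non-blocking exclusive lock on it.

    Returns the open file object on success. Raises BlockingIOError
    if another open handle already holds a lock on the same file.
    """
    f = open(path, "a+b")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        raise
    return f


def demo():
    """Return the errno seen by a second locker, or None if it succeeded."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical output file name, standing in for the DRM HDF5 file.
        path = os.path.join(tmp, "drm_demo.h5drm")
        first = try_exclusive_lock(path)  # plays the role of the first rank
        try:
            try:
                # A second "rank" opening and locking the same file.
                second = try_exclusive_lock(path)
            except BlockingIOError as exc:
                return exc.errno
            second.close()
            return None
        finally:
            first.close()


result = demo()
print("second locker errno:", result, errno.errorcode.get(result))
```

If this assumption about the locking holds, the second handle fails with the same kind of `BlockingIOError` ("Resource temporarily unavailable") that appears in the traceback above, which would be consistent with more than one process calling `writer.initialize` on the same file.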
Your help would be highly appreciated.