pepatac bulker crate issues with /etc/resolv.conf, genrich, and macs2/python problems #159

Closed
rcorces opened this issue Nov 12, 2020 · 15 comments

Comments

@rcorces
Collaborator

rcorces commented Nov 12, 2020

I'm having a few issues getting the pepatac bulker crate to work. Honestly, some of this might be better posted to bulker instead.

  1. I get a persistent warning (looks like every time a singularity container is loaded) when running pepatac from within the crate:
    WARNING: Skipping mount /var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
    This problem is specific to singularity containers associated with the pepatac:1.0.4 bulker container and does not happen with other bulker containers (alpine or coreutils)

  2. I'm having issues with python from within the bulker container. I think this is because bulker is trying to use a local version of python based on this message which occurs during crate loading:
    Host commands available: python3, perl

First, the container couldn't find pararead.
Then macs2 threw a very bizarre error:
Fatal Python error: _Py_HashRandomization_Init: failed to get random numbers to initialize Python
When I launch the _macs2 singularity container, it tries to access a python version at /usr/local/bin/python which is not the version of python that I use (and I think that python install is actually botched).

(p3.8.5) [rcorces@pelayo bulker]$ bulker activate databio/pepatac:1.0.4
Bulker config: /corces/home/shared/tools/bulker/bulker_config.yaml
Activating bulker crate: databio/pepatac:1.0.4
\[\033[01;93m\]databio/pepatac|\[\033[00m\]\[\033[01;34m\]\w\[\033[00m\]\$
databio/pepatac|/corces/home/shared/tools/bulker$ _macs2
WARNING: Skipping mount /var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
Singularity> python
Fatal Python error: _Py_HashRandomization_Init: failed to get random numbers to initialize Python

Singularity> which python
/usr/local/bin/python
Singularity> 

I'm not totally clear on how python is working within bulker. Even if the python3 executable provided with the pepatac:1.0.4 crate were called, I don't know where it would get the corresponding libraries/modules if not from the local environment.

@nsheff
Member

nsheff commented Nov 12, 2020

  1. That warning is something I see all the time too. It has to do with specific containers, and I think it's related to the interaction between quay, biocontainers, and singularity. I think it's just a warning, and things work despite it in most cases. I have looked into that quite a bit, but I don't remember what I concluded at this point... just that it works anyway and it's a container image issue. Oh yeah -- I think it has to do with containers that are packaged from bioconda. So this is not really a bulker issue, but a problem with those particular biocontainers that we're relying on in the manifest.

  2. We've struggled a bit with the best way for bulker to handle python. The problem is that if we include a python package in bulker, then you can't use the local python. Since bulker's written in python, you have to have one. It gets tricky... and then python commands get executed in the container and can't do stuff in the parent environment, which kills a lot of things we want. Another related issue: python console scripts are problematic bulker#41

So, the way we're currently doing it is that python and packages will all just use your native one, and bulker doesn't touch it, which is why it's listed under "host commands".

So, what this means is that this is the one "weak link" of bulker -- you do have to install all the python related stuff manually. It's best if you just use bulker to handle all the non-python dependencies. While in theory you could bundle those up, you'd have to define a separate monolithic container for each use case, which would defeat the purpose of bulker.

@nsheff
Member

nsheff commented Nov 12, 2020

Also just to be clear:

https://github.com/databio/hub.bulker.io/blob/d694c77bfd5bb5a8a4a5060e9b1321ee148705d4/databio/pepatac_1.0.4.yaml#L8

python3 is not containerized in the pepatac manifest. So it's calling your native python. So, just make sure you follow the install instructions to install all the python packages listed in requirements.txt, and you'll need a working native install of python3 of course (which you presumably already have since you're using bulker successfully).
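
Because python3 runs on the host here, a quick pre-flight check can confirm that the host interpreter is able to import what the pipeline needs. The following is a minimal sketch and not part of bulker or PEPATAC: `missing_modules` is a hypothetical helper, and the module list is a placeholder (import names can differ from the package names in requirements.txt).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names the current interpreter cannot find."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not importable.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Placeholder list: replace with the import names of the packages
    # listed in the pipeline's requirements.txt.
    wanted = ["pararead"]
    for name in missing_modules(wanted):
        print(f"not importable from this python3: {name}")
```

Run it with the same python3 that is active when the crate is loaded, since that is the interpreter the host commands will use.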

@jpsmith5
Contributor

To follow up on the warning: I have also always seen it, with no negative consequences. I'm not sure there was any conclusion beyond that.

@rcorces
Collaborator Author

rcorces commented Nov 12, 2020

Yes - the mentioned warning has no negative effects other than showing up all the time.

Is there a rationale for why some steps of the pipeline (whichever one requires pararead) seem to be using the correct python install while macs2 appears to be searching for python at /usr/local/bin?
I actually don't even have a python install at /usr/local/bin. I think this might be related to how macs2 is normally installed (via pip). If that is the case, how would macs2 get properly bundled into a container given what you mentioned about python?

@rcorces
Collaborator Author

rcorces commented Nov 12, 2020

with respect to this:

just make sure you follow the install instructions to install all the python packages listed in requirements.txt, and you'll need a working native install of python3 of course (which you presumably already have since you're using bulker successfully).

This could be clarified in the instructions. Currently, the wording "otherwise" made me believe that the python requirements did not apply when running inside the container:

If you want to use containers, you need our multi-container environment manager, bulker, and either docker or singularity -- please see instructions in how to run PEPATAC with containers. Otherwise, follow these instructions to install the requirements natively:

Or you could explicitly mention the python requirements here:
http://pepatac.databio.org/en/latest/run-container/

@jpsmith5
Contributor

Bulker's macs2 is itself a container, hosted on quay.io. Presumably, then, the hosted container includes its own python. I'm not sure why it would be searching there, unless that is the path within the container. Will investigate further.

I'll definitely clarify, and add in a note on the run-container page.

@nsheff
Member

nsheff commented Nov 12, 2020

Is there a rationale for why some steps of the pipeline (whichever one requires pararead) seem to be using the correct python install while macs2 appears to be searching for python at /usr/local/bin?
I actually don't even have a python install at /usr/local/bin. I think this might be related to how macs2 is normally installed (via pip). If that is the case, how would macs2 get properly bundled into a container given what you mentioned about python?

To clarify -- pararead is not running in a container. macs2 is running in a container, so it's looking for the python in the container.

This makes it seem like the macs2 container is broken.
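
To see which commands in an activated crate run on the host and which go through a container, one option is to look at what each name resolves to on PATH. This is a rough heuristic sketch, assuming that containerized commands are short wrapper scripts that name the container runtime (as the `_macs2` script quoted in this thread does); `describe_command` and `runtime_in_script` are hypothetical helpers, not bulker functions.

```python
import shutil
from pathlib import Path

# Runtimes to look for inside a wrapper script (assumption: one of these).
CONTAINER_RUNTIMES = ("singularity", "docker")


def runtime_in_script(script_text):
    """Return the first container runtime named in a wrapper script, or None."""
    for runtime in CONTAINER_RUNTIMES:
        if runtime in script_text:
            return runtime
    return None


def describe_command(name, search_path=None):
    """Report where `name` resolves and whether it looks like a container wrapper."""
    resolved = shutil.which(name, path=search_path)
    if resolved is None:
        return f"{name}: not found on PATH"
    try:
        # Read only the start of the file; wrapper scripts are short.
        head = Path(resolved).read_bytes()[:4096].decode("utf-8", errors="replace")
    except OSError as exc:
        return f"{name}: {resolved} (unreadable: {exc})"
    runtime = runtime_in_script(head)
    if runtime is None:
        return f"{name}: {resolved} (no container runtime mentioned; likely a host command)"
    return f"{name}: {resolved} (wrapper script that calls {runtime})"


if __name__ == "__main__":
    for command in ("python3", "perl", "macs2"):
        print(describe_command(command))
```

Run inside the activated crate, this would be expected to report python3 and perl as host commands and macs2 as a wrapper, which matches the explanation above of why pararead and macs2 see different python installs.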

@nsheff
Member

nsheff commented Nov 12, 2020

That macs2 container is working fine on our cluster with singularity.

ba databio/pepatac 
macs2 --help
usage: macs2 [-h] [--version]
             {callpeak,bdgpeakcall,bdgbroadcall,bdgcmp,bdgopt,cmbreps,bdgdiff,filterdup,predictd,pileup,randsample,refinepeak}
             ...

macs2 -- Model-based Analysis for ChIP-Sequencing
 macs2 --version
macs2 2.2.7.1

and interactively like you did:

databio/pepatac|~$ _macs2
WARNING: Bind mount '/home/ns5bc => /home/ns5bc' overlaps container CWD /home/ns5bc, may not be available
WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container
WARNING: Skipping mount /opt/singularity/3.5.3/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
Singularity> python
Python 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

@rcorces
Collaborator Author

rcorces commented Nov 12, 2020

Does your setup happen to have python at /usr/local/bin/python? If so, the macs2 container would find it just fine. But for anyone who doesn't have python there, it would fail?

@nsheff
Member

nsheff commented Nov 12, 2020

does your setup happen to have python at /usr/local/bin/python

No.

ll /usr/local/bin/python
ls: cannot access /usr/local/bin/python: No such file or directory

/usr/local/bin/python is present in the container anyway, so that doesn't make sense

databio/pepatac|~$ _macs2
WARNING: Bind mount '/home/ns5bc => /home/ns5bc' overlaps container CWD /home/ns5bc, may not be available
WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container
WARNING: Skipping mount /opt/singularity/3.5.3/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
Singularity> which python
/usr/local/bin/python
Singularity> /usr/local/bin/python
Python 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

@nsheff
Member

nsheff commented Nov 12, 2020

what do you see for

cat `which _macs2`
#!/bin/sh

LC_ALL=C singularity shell \
  -B "/tmp:/tmp" \
  -B "/sfs:/sfs" \
  -B "/etc:/etc" \
  -B "/usr/share/zoneinfo/America/New_York:/usr/share/zoneinfo/America/New_York" \
  /project/shefflab/bulker/simages/quay.io/biocontainers/macs2:2.2.7.1--py37h516909a_0

@nsheff
Member

nsheff commented Nov 12, 2020

I think it's an issue with the way singularity is configured on your server.

maybe it's not able to access /dev/random or something?

https://stackoverflow.com/questions/61381288/linux-apache-python-cgi-error-failed-to-get-random-numbers-to-initialize-python

you could try some other python container to make sure it's not just the macs container.

can you do:

Singularity> ls -l /dev/random
crw-rw-rw-    1 root     root        1,   8 Jun 17 09:30 /dev/random
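
A complementary probe is to check whether a given interpreter starts at all, and whether it starts once it no longer needs random bytes at startup. The sketch below makes two assumptions: that setting `PYTHONHASHSEED` lets CPython derive its hash secret from the seed instead of asking the OS for randomness, and that the variable reaches the interpreter being tested (for a container, that depends on how the runtime passes environment variables). `python_starts` is a hypothetical helper. A fixed seed is a diagnostic here, not a recommended setting.

```python
import os
import subprocess
import sys


def python_starts(command, hash_seed=None, timeout=60):
    """Return (ok, stderr) for running `command` + ['-c', 'pass'].

    `command` is an argument list that launches a Python interpreter.
    """
    env = dict(os.environ)
    if hash_seed is not None:
        # With PYTHONHASHSEED set, CPython derives its hash secret from the
        # seed instead of requesting random bytes from the OS at startup.
        env["PYTHONHASHSEED"] = str(hash_seed)
    result = subprocess.run(
        list(command) + ["-c", "pass"],
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode == 0, result.stderr.strip()


if __name__ == "__main__":
    # Probes the interpreter running this script; swap in an argument list
    # that launches a container's python to test that one instead.
    for seed in (None, 0):
        ok, err = python_starts([sys.executable], hash_seed=seed)
        print(f"PYTHONHASHSEED={seed}: {'starts' if ok else 'fails: ' + err}")
```

If an interpreter fails without the seed but starts with it, that would point at the source of randomness rather than at the container image itself.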

@rcorces
Collaborator Author

rcorces commented Nov 12, 2020

To answer your questions:

databio/pepatac|/corces/home/shared/tools/bulker$ cat `which _macs2`
#!/bin/sh

LC_ALL=C singularity shell \
  -B "/corces:/corces" \
  -B "/dev:/dev" \
  -B "/tmp:/tmp" \
  -B "/opt:/opt" \
  /corces/home/shared/tools/singularity/simages/quay.io/biocontainers/macs2:2.2.7.1--py37h516909a_0databio/pepatac|
databio/pepatac|/corces/home/shared/tools/bulker$ _macs2
WARNING: Skipping mount /var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
Singularity> ls -l /dev/random
crw-rw-rw-    1 root     root        1,   8 Aug 13 13:03 /dev/random
Singularity> exit

For what it's worth, I just edited the bulker manifest file to use my local macs2 and deleted the container and reloaded it. The whole pipeline works now.

I'm going to assume that it's something specific to my system which is hard to troubleshoot and really not necessary. It could easily be something related to how I've configured python or my python virtualenv? Happy to keep plugging away at this but only if you think it's useful. Otherwise, feel free to close.
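
For anyone comparing setups later, the two `_macs2` wrapper scripts quoted in this thread differ in their `-B` bind options. The sketch below only extracts and diffs those options; `bind_sources` and `compare_binds` are hypothetical helpers, and the thread does not establish whether any of these differences is related to the Python startup error.

```python
import re

# Matches bind options of the form: -B "source:destination"
BIND_PATTERN = re.compile(r'-B\s+"([^":]+):([^"]+)"')


def bind_sources(script_text):
    """Return the set of host paths a wrapper script binds into the container."""
    return {src for src, _dst in BIND_PATTERN.findall(script_text)}


def compare_binds(script_a, script_b):
    """Return (only_in_a, only_in_b) as sorted lists of host paths."""
    a, b = bind_sources(script_a), bind_sources(script_b)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Bind options copied from the two wrapper scripts quoted in this thread.
    nsheff = (
        '-B "/tmp:/tmp" -B "/sfs:/sfs" -B "/etc:/etc" '
        '-B "/usr/share/zoneinfo/America/New_York:/usr/share/zoneinfo/America/New_York"'
    )
    rcorces = '-B "/corces:/corces" -B "/dev:/dev" -B "/tmp:/tmp" -B "/opt:/opt"'
    only_nsheff, only_rcorces = compare_binds(nsheff, rcorces)
    print("only in nsheff's script:", only_nsheff)
    print("only in rcorces's script:", only_rcorces)
```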

@rcorces changed the title from "pepatac bulker crate issues with /etc/resolv.conf and macs2/python problems" to "pepatac bulker crate issues with /etc/resolv.conf, genrich, and macs2/python problems" on Nov 13, 2020
@rcorces
Collaborator Author

rcorces commented Nov 13, 2020

Forgot to mention that genrich is included in the pepatac.yaml file but is not yet present in the bulker crate, so pepatac complains. I'm sure this is just because you are in the process of updating things, but I figured I would mention it.

@jpsmith5
Contributor

Yes, in the process of adding some new options in. Thanks for the heads up. Will make sure the manifest is also updated.


3 participants