Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add requirements.txt and update .gitignore for Python #5475

Merged
merged 1 commit into from
May 19, 2024

Conversation

RecRanger
Copy link
Contributor

Fixes #5474

requirements.txt Outdated
@@ -0,0 +1 @@
protobuff~=3.20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This says protobuff with two f's - is this right or a typo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks

@solardiz
Copy link
Member

I'm not into Python. I'd appreciate to hear from someone who is, @exploide maybe? Thank you!

@exploide
Copy link
Contributor

Ok, I see multiple issues here.

  • protobuff with two f's is indeed incorrect. The correct package is named protobuf and the one with two f's doesn't even exist. However it could be added afterwards, which would result in a typosquatting attack if it contains malicious content. So this is directly a security vulnerability.
  • Pinning the version down to an older version forever is probably not what we want. We should aim to make the script compatible with the latest release of protobuf.
  • A top-level requirements.txt in the john repo is probably not appropriate. This particular dependency only affects a single script. And since the scripts are more or less stand-alone and independent, it could as well be the case that another script just requires a different version of protobuf (hypothetical example).
  • There is a bunch of other dependencies needed from within other scripts. And these are all not mentioned in this version of requirements.txt. It is very specific for the script that requires protobuf.

So I see two solutions: Fix the incompatibility in the script to make it work with latest protobuf. Or if this requires too much effort for the moment, include the version constraint in the pip install instruction suggested by the import exception handling instead.
A top-level requirements.txt does not feel right for a project like john. But I'm not the one to decide.

@RecRanger
Copy link
Contributor Author

Good catch, I was moving too fast and made that typo. Fixed just now.

I would strongly suggest adding this requirements.txt file with whatever versions are required to make it work right now. Then, you can upgrade dependencies in the future if that's a desire.

I also just added all the other dependencies to the requirements.txt file which are mentioned via a pip install printed message. I didn't put versions on them, but they should all have pinned versions when tested.

@solardiz
Copy link
Member

Thank you for the review @exploide! I'm not sure what is best to do here. @RecRanger argues for this approach further in #5474.

@RecRanger Thank you for the fix and additions. When fixing an error in a commit within a pending PR, we normally amend the commit and force-push - not add another commit. We also prefix commit titles with subsystem name whenever relevant (would be "requirements.txt: " starting with the second commit here). In this case, though, I suggest merging all 3 commits into one. We can do that at PR merge time by using GitHub's "squash and merge" feature, although this will make GitHub appear as the committer in git history.

@solardiz
Copy link
Member

@RecRanger Where does the dependency on pycrypto come from? Is it indirect? Do we need to list indirect dependencies? (This may be fine. I am just asking.)

@RecRanger
Copy link
Contributor Author

I personally always use Squash-and-Merge when working in semi-anon/informal teams (e.g., open source projects) to make reverting changes easier. Will do better next time though!

I got the library list with a grep 'pip install' search. The pycrypto library came from:

  • telegram2john.py
  • DPAPImk2john.py

Indirect dependencies should not be listed. Libraries are assumed to have valid pyproject.toml/setuptools configurations which indicate which dependencies to install.

@exploide
Copy link
Contributor

From #5474

I don't see anything wrong with continuing to use the older version of Protobuf (which is tested, validated, etc.). Upgrading the dependency seems like a pointless endeavor for no gain.

In principle, I would always recommend to make a software compatible with the latest version of a dependency. For example, it might happen that a vulnerability becomes known for the older version or that there is an unpleasant bug. Furthermore, if someone decides to package this for a Linux distribution, they will fail because they cannot ship an older version of a Python package than the latest one already packaged for the distribution.

The error message explicitly says that the current Python code uses features that are deprecated in protobuf>3.20. We'd likely have to regenerate the Python protobuf files, iirc (which is a huge pain).

But in this case, I agree. I don't have any experience with protobuf file generation and if it is a huge pain, then we should not enforce it for the moment. (But welcome it when someone likes to do the upgrade.)

I also just added all the other dependencies to the requirements.txt file which are mentioned via a pip install printed message. I didn't put versions on them, but they should all have pinned versions when tested.

There are different points of view on this topic. Personally, I don't like pinning version numbers as long as there is no process of keeping them up to date. Otherwise you will soon have a requirements.txt which just contains a bunch of horribly outdated package versions. Which I don't like for reasons as explained above. I would only introduce version constraints where they are technically necessary. However, I know that there are different approaches which emphasize the desire of having a tested set of working dependency versions.

I already tried to explain that a combined global requirements.txt doesn't feel intuitive for me, because this is not a Python project at all. john is a C software and then there are just a bunch of standalone Python scripts as auxiliary tools. And not only Python, there is also Perl and maybe some other stuff.
I don't see a big advantage of a requirements.txt here. But I believe you see an advantage. So it is fine for me to include a requirements.txt but again, I'm not the one to decide.

Since the file is only useful for the scripts in run/ I think requirements.txt should also be moved to run/. As I mentioned, john itself is not a Python project. But this is not really important.

Where does the dependency on pycrypto come from?

Indeed, pycrypto is a dependency. However, it is insecure and deprecated and should no longer be used. There is pycryptodome as a drop-in replacement. Is has the same API. So please list this as a requirement instead. I will conduct a separate PR soon, which also adapts the error message within the scripts.

Sorry for this lengthy answer. Just wanted to make my points clear. Whether to include a requirements.txt or not is up to @solardiz

@solardiz
Copy link
Member

Thank you guys for the discussion. I had actually thought of us adding a requirements.txt and related documentation before, when I saw that the btcrecover has that, although sure enough they're primarily a Python project and we're not. Incidentally, IIRC our two pre-generated protobuf files came from btcrecover, so maybe we should look there (at all forks of it) for any updates.

Since the file is only useful for the scripts in run/ I think requirements.txt should also be moved to run/. As I mentioned, john itself is not a Python project. But this is not really important.

This may actually be a good idea. It could possibly stay available in a distro package that includes our scripts, but does not automatically get the Python modules pre-installed as the distro package dependencies. What do you think, @RecRanger?

@RecRanger
Copy link
Contributor Author

Projects should work out of the box. This PR is how you make it work out-of-the-box.

Ideas for AFTER this bare-minimum PR is merged

  1. If you want opt-in requirements for features, then setup a pyproject.toml.
  2. Requirements can be pinned with version ranges.
  3. Bots can be used to bump requirements, given adequate test cases.

There is no perfect solution for balancing security, functionality, and developer experience. This is the most basic solution. Let's upgrade incrementally from a bare-minimum.

@solardiz
Copy link
Member

I've just merged @exploide's #5478 (obviously triggered by your work here @RecRanger, thanks!) so now we need to revise the changes here accordingly.

Fix protobuf typo in requirements.txt

Add other python dependencies for various scripts

Update gitignore for Python files

Switch pycrypto dep to pycryptodome in requirements.txt
@RecRanger
Copy link
Contributor Author

Revised the "pycrypto" to "pycryptodome" dependency. Also squashed the commits.

@solardiz
Copy link
Member

Thanks, but now the commit message records change history within this PR, which is temporary and irrelevant for our git history. So if it's just one commit, it should say just "Add requirements.txt and update .gitignore for Python" (or this could be 2 commits).

__pycache__/
*.pyo
*.pyd
**/venv/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why two asterisks in **/venv/?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesn't really matter. It's redundant in this case but shouldn't harm either. https://stackoverflow.com/questions/41761128/are-leading-asterisks-redundant-in-gitignore-path-matching-syntax

Could be simplified to venv/ and maybe add .venv/ as well as it is another common virtualenv name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two asterisks means that it can be nested any depth. May be redundant, but it definitely works.

@RecRanger
Copy link
Contributor Author

Soooo like, is someone gonna merge this PR? As much as I'd love critique on my git technique and gitignore setup, I'm actually just here because the 100 contributors that came before me didn't setup the gitignore for 1 out of 2 languages used in this repo.

Copy link
Member

@solardiz solardiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message is still off, but the changes look good. Thanks!

@solardiz solardiz changed the title Fix requirements.txt Add requirements.txt and update .gitignore for Python May 19, 2024
@solardiz solardiz merged commit 28ba4d7 into openwall:bleeding-jumbo May 19, 2024
7 of 8 checks passed
@solardiz
Copy link
Member

Soooo like, is someone gonna merge this PR?

OK, I did a squash-and-merge on this to set a commit message that matches the final changes. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Python requirements.txt file with pinned version numbers
3 participants