
Added environment setup instructions#181

Closed
PriyankaKetkarBNL wants to merge 19 commits into main from Issue180_AddEnvironmentSetupInstructions

@PriyankaKetkarBNL
Collaborator

@PriyankaKetkarBNL PriyankaKetkarBNL commented Mar 2, 2025

Addresses Issue #180

These instructions have been tested several times by me and have been used successfully by several novice users during the 2025-1 cycle.

@EliotGann
Collaborator

I'm not sure if this belongs in PyHyperScattering itself. It is very specific to a certain setup (local Windows, Anaconda, JupyterLab) and can be misleading about nominal requirements. Ideally, setup should be as simple as adding pyhyperscattering to the list of packages you want to create a conda environment from. If it isn't, then I think the work here should go into making that happen. I think specific setups should probably be linked more closely with specific loaders, maybe, or documented on beamline wikis?

@pbeaucage
Collaborator

I would broadly echo Eliot's sentiment; there is some good stuff here but I'm not sure the overall scheme really fits into this project's documentation.

I would generally cluster my thoughts/objections into 3 themes, in no particular order:

  1. Documentation for this project doesn't live in random markdown files; it should be inside the existing HTML docs website and organized accordingly. Expecting a novice user to find a random markdown file...well, seems unlikely.

  2. The overall tone of these instructions is very much focused on the user reciting magic incantations into a terminal without much understanding of what they're doing or why. This isn't a good way to train users anyway (what if one step was rm -rf /*?) and seems especially counterproductive for a flexible toolkit library intended to support further development of bespoke software, rather than an application. We want to be teaching our users fundamental understanding so they can be the captains of how they use the tools, rather than repeating incantations.

  3. Even if you accept points 1 and 2 as fine, these instructions are remarkably specific: to Windows, to NSLS2, to an odd way of installing git, to a semi-superuser way of linking different environments into the same Jupyter server. There are a lot of tutorials out there on "how do I set up Python/Jupyter" and I don't really see a compelling reason why we should write one rather than link to somebody else's, which is likely going to be better maintained. They also aren't organized in a clear way (section headings, etc.), which makes it hard to separate basic tutorial from specific instructions (such as the sub requirement specs, which probably do merit docs, and common install issues).

So, where do we go?

I think a valuable contribution would be a page in the real docs that links out to tutorial content on setting up Python/Jupyter and includes info on the sub requirement specs and common install issues. As an example, you might look at the AFL-agent setup page: https://pages.nist.gov/AFL-agent/en/23-documentation-improvements-v2/tutorials/installation.html

Collaborator

@pbeaucage pbeaucage left a comment

See comment above.

@PriyankaKetkarBNL
Collaborator Author

Sure, I made a few changes, and I have a few questions before I revise further.

Re: Eliot’s comments, I’d like to learn more about the details. I think the python version is a direct requirement for PyHyperScattering. And I think the Microsoft C++ Build Tools was directly needed for PyHyperScattering. These issues came up prior to loading any NSLS II-specific data. Are there ways to set these up in the codebase rather than in an environment?

Re: Peter’s comment 1, I moved everything to an environment_setup.rst page in the getting_started docs section and added more subheadings. I’ll admit, I completely forgot that a real docs page existed.

My one pushback is that I believe this level of hand-holding specific to our use case is a must to have somewhere in public, whether it’s in the real docs, the RSoXS wiki, or elsewhere. Re: Peter’s comment 3, I specifically had at least 2 users this cycle attempt to Google general environment setup instructions and fail to have a working PyHyperScattering. From my own experience, it’s not always obvious how to apply generic instructions to our use case.

Re: Peter’s comment 2, I largely agree with the spirit, but I think it’s also important to consider what’s been happening in practice for at least the past year: several novice users and even repeat users barely manage to get samples prepared by beam time and don’t have the bandwidth to review the wiki, PyHyperScattering documentation, etc., regardless of how early I send it to them. Given my own bandwidth, it’s also more practical right now to help novice users troubleshoot setup from a single uniform procedure. Otherwise, we are losing beam time just not being able to see data. Most users are naturally curious about understanding the deeper workings, but not having a structured starting point makes the learning curve too steep.

My intention for balancing Peter’s comment 2 against the need for a quick and reliable procedure was to format the instructions to start with “what” should be run, followed by an explanation of “why” we are running it. I had some ideas on using different text colors for the “what” and the “why”, but I haven’t figured out how to format that yet. Could you let me know specific examples of where the “why” is not clear? Also, could you clarify what about the git installation is odd and where I am linking different environments onto the same Jupyter server? I am not aware of what the “normal” way to do this is.

@pbeaucage
Collaborator

pbeaucage commented Mar 4, 2025

Some specific point answers below, I'll take a look and edit this version of the page in line review mode.

> Also, could you clarify what about the git installation is odd

Well, I am not a Windows person. But my understanding is that the normal way one installs a Windows package is with a downloaded exe and a GUI installer app. In this case I would suggest users download the GitHub Desktop app, or install command line git using a package manager like conda. I personally use GitHub Desktop and encourage everyone to do the same.

> where I am linking different environments onto the same Jupyter server?

The line where you install just ipykernel in the server environment. You have the server running somewhere -- base?? -- and then your separate kernel. A simpler way to do this is install jupyterlab in the same environment and run the server in that environment. No need to handle environment switching that way, and that's a much easier way to run something like an "application"
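
One way to see whether the kernel and the server share an environment is to ask the kernel from inside a notebook. The following is a minimal sketch using only the standard library; the function name `describe_kernel_environment` is made up for illustration and is not part of PyHyperScattering.

```python
import importlib.util
import sys


def describe_kernel_environment():
    """Report which Python environment this kernel is running in."""
    return {
        # Full path of the interpreter running this kernel.
        "executable": sys.executable,
        # Root directory of the active environment (conda env or venv).
        "prefix": sys.prefix,
        # True if JupyterLab is importable from this same environment.
        "jupyterlab_here": importlib.util.find_spec("jupyterlab") is not None,
    }


if __name__ == "__main__":
    for key, value in describe_kernel_environment().items():
        print(f"{key}: {value}")
```

If `jupyterlab_here` comes back `False` while the notebook is open in JupyterLab, the server is probably running from a different environment than the kernel, which is the split setup described above.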

> python version is a direct requirement for PyHyperScattering

I'm not sure about this, either way. If we are incompatible with newest Python - totally could be - that is worth an issue. We assert some version compatibility in pyproject.toml and that should be up to date.
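
For anyone wanting to check what an installed copy declares, here is a hedged sketch. It assumes the compatibility assertion in pyproject.toml is the standard `requires-python` field, which shows up as `Requires-Python` in the installed package metadata; the helper name `declared_python_requirement` is hypothetical.

```python
import sys
from importlib import metadata


def declared_python_requirement(dist_name):
    """Return the Requires-Python string of an installed package, or None."""
    try:
        # .get() returns None when the field is absent from the metadata.
        return metadata.metadata(dist_name).get("Requires-Python")
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return None


if __name__ == "__main__":
    running = ".".join(str(part) for part in sys.version_info[:3])
    declared = declared_python_requirement("pyhyperscattering")
    print(f"running Python {running}; package declares: {declared}")
```

This only reports what the metadata claims; whether every dependency has a working build for that Python version is a separate question.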

> And I think the Microsoft C++ Build Tools was directly needed for PyHyperScattering

PyHyperScattering is a pure Python package, so no, there is no need for a C compiler. You will need a C compiler for compiled packages if there is no binary for your system. I don't know offhand which compiled package we depend on that wouldn't have x64 Windows binaries. We could mention this in troubleshooting, but the error messages saying that your system doesn't have a compiler are really clear. We should be building the habit of reading and understanding errors rather than throwing up hands.
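
As a rough troubleshooting aid, one could check whether any common compiler executable is visible on `PATH`. This is only a sketch: the candidate list and the name `compilers_on_path` are illustrative assumptions, and MSVC's `cl` is normally only on `PATH` inside a Developer Command Prompt, so "not found" for `cl` is not conclusive.

```python
import shutil

# Common compiler executable names; this list is illustrative, not exhaustive.
CANDIDATES = ("cl", "gcc", "clang", "cc")


def compilers_on_path(candidates=CANDIDATES):
    """Map each candidate compiler name to its path, or None if not on PATH."""
    return {name: shutil.which(name) for name in candidates}


if __name__ == "__main__":
    for name, path in compilers_on_path().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

The error message from pip remains the authoritative signal; this check just helps interpret it.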

Collaborator

@pbeaucage pbeaucage left a comment

some specific comments here, sorry for the delay in looking at this

@@ -0,0 +1,120 @@
.. _Set_up_Python:

Collaborator

As GitHub warning says, this needs to be included in the top level toctree.


Please follow the instructions below to set up an appropriate environment to use PyHyperScattering. All of these setup steps should be run in a terminal, not a Jupyter notebook.

These instructions have been tested for the use of PyHyperScattering in a local JupyterLab notebook using the Anaconda distribution (https://www.anaconda.com/download) on a Windows computer. The instructions below *might* work for other platforms (e.g., NSLS II Jupyterhub, Google Colab), but there are no guarantees; recently, the NSLS II Jupyterhub has been especially incompatible with PyHyperScattering.

Collaborator

A reference to a specific facility's platform is entirely inappropriate to have here. Also, simply setting this incompatibility in stone is unproductive; it would be prudent for NSLS2 staff to actually engage and make this work on their systems, especially with recent fixes to dependency management, things might work a lot better.

Collaborator

Also, it is worth mentioning that Colab and JupyterHub are container-based platforms where you do not really have shell access (or only have it inside the container), so I wouldn't offhand expect the command line parts to work.

Collaborator

Also, we shouldn't be endorsing a commercial product like Anaconda. Miniforge is probably what I would recommend, or we could refer out to somebody else who can endorse a product.

Download Git
------------

To aid this workflow, download Git (https://git-scm.com/download/win). Then in the command prompt (not Anaconda Prompt), run ``winget install --id Git.Git -e --source winget``. After this, if you are able to run ``git --version`` and have a version number outputted, the installation was successful. If Anaconda Prompt was open, it may need to be restarted.

Collaborator

As I said in the comment chain, if the user is going to use git (they don't need it unless they plan to develop), they should really use GitHub Desktop, which is a lot easier.

Create and activate an environment
----------------------------------

- Open the Anaconda Prompt. Do not use the terminal feature after opening JupyterLab.

Collaborator

Idk what the second sentence means? Do not use the terminal within jupyterlab? I don't offhand know why.

Install Python
~~~~~~~~~~~~~~

Check the Python version. Use version 3.11 or lower for PyHyperScattering to work.

Collaborator

see comment in thread -- is this true? it would be better to have a softer recommendation like "use a recent Python version but perhaps not the newest one"

Collaborator

As of 0.2.9 we are now verified to be compatible with all Python up to current 3.13, so this is incorrect.

Collaborator Author

I'm still running into issues. Below is a screenshot of the error I encountered with Python 3.13.2, and I ran into a similar error when I downgraded to Python 3.13. I meant to attach the full log, but when I enter the file path into my Windows browser, I get a "Windows can't find" error. I don't encounter this error when I use Python 3.12. Noting this for reference until I can troubleshoot further, and then I'll update the instructions.

[image: screenshot of the error message]

Collaborator

Those error messages look like you don't have a C compiler to build a 1.x numpy against Python 3.13; I don't know why it wouldn't be able to find it, assuming you do have one. The CI build chain does fail on 3.13/Windows, but that is due to the container running out of resources while compiling, so I figured it would work. It does test successfully against 3.13 on Mac and Linux.

Comment on lines +73 to +77
- ``pip install "git+https://github.com/usnistgov/PyHyperScattering.git#egg=PyHyperScattering[bluesky]"`` installs the latest commit on the main branch.

- ``pip install "git+https://github.com/usnistgov/PyHyperScattering.git@Issue170_UpdateDatabrokerImports#egg=PyHyperScattering[bluesky]"`` installs the latest commit on the branch named ``Issue170_UpdateDatabrokerImports``.

- ``pip install "git+https://github.com/usnistgov/PyHyperScattering.git@6657973#egg=PyHyperScattering[bluesky]"`` installs commit ``6657973``.

Collaborator

I am really uncomfortable telling somebody who doesn't know how to set up Python to install unsupported, bleeding-edge variants of this package. There is an implicit agreement in running from main or especially a branch that it might break at any time; a user reading this isn't going to "get" that and will develop a bad impression of the software.


- ``pip install "git+https://github.com/usnistgov/PyHyperScattering.git@6657973#egg=PyHyperScattering[bluesky]"`` installs commit ``6657973``.

``pip install pyhyperscattering[ui]`` installs the necessary dependencies to draw a mask. Make sure to install the ``[ui]`` dependencies of the same version/branch/commit of PyHyperScattering used to install the ``[bluesky]`` dependencies.

Collaborator

You can, I believe, chain the dependency specs together: pyhyperscattering[bluesky,ui]
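
To illustrate the chained form, here is a small sketch that builds the requirement string and the matching pip command as an argument list. The helper names are hypothetical, and nothing is installed by running it; it only prints the command.

```python
import sys


def requirement_with_extras(package, extras):
    """Build a requirement string such as 'package[extra1,extra2]'."""
    if not extras:
        return package
    return f"{package}[{','.join(extras)}]"


def pip_install_command(requirement):
    """Return the pip command as an argument list for the current interpreter."""
    return [sys.executable, "-m", "pip", "install", requirement]


if __name__ == "__main__":
    req = requirement_with_extras("pyhyperscattering", ["bluesky", "ui"])
    print(" ".join(pip_install_command(req)))
```

Keeping the command as a list sidesteps shell quoting of the square brackets, which some shells would otherwise try to expand.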


If there are errors during installation or later on, it might be necessary to install additional packages and then retry the pip installs. Below is a list of what might be needed.

- Microsoft C++ Build Tools (https://visualstudio.microsoft.com/visual-cpp-build-tools/). This is installed outside the Anaconda prompt. Computer should be restarted after this installation.

Collaborator

Should explain that some dependencies of PHS use compiled C code and if there is no binary for your system, you need a C compiler such as MS Visual C++, as indicated in the error message you get.


- ``pip install --upgrade holoviews`` This may be necessary if mask drawing is not working. The ``--upgrade`` is necessary to ensure that the package will get upgraded even if some version of it is currently installed.

- ``pip install natsort`` allows use of the natsort package, but is not necessary for the main functioning of PyHyperScattering.

Collaborator

...do we use natsort anywhere? it should be added to requirements if so. if not, it isn't really in scope here?

Open JupyterLab
---------------

- Start up JupyterLab from the Anaconda command prompt. Do not open JupyterLab using Anaconda's GUI menu.

Collaborator

I'm confused as to why.

@pbeaucage
Collaborator

pbeaucage commented Apr 7, 2025

I'd encourage training the use of the JupyterLab Desktop app rather than having users mess with manual environment management. The setup instructions in that case are essentially

  1. Download and install JupyterLab desktop
  2. Open the app
  3. Open terminal and type "pip install pyhyperscattering[bluesky,ui]" or whatever other extras the user desires. OR use the "environment manager" inside the app to create an environment with this package.
  4. Open a notebook and run.
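
After step 4, a first notebook cell along these lines could confirm that the package is visible to the running kernel. This is a sketch under the assumption that the distribution is named `pyhyperscattering`, as in the pip command above; `installed_version` is a made-up helper name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("pyhyperscattering")
    print(version or "pyhyperscattering is not installed in this environment")
```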

A lot of the "quirks" of this (git, specific branch install, separate server env, etc) are really only necessary for developers, who hopefully don't need detailed and specific instructions.

@PriyankaKetkarBNL PriyankaKetkarBNL marked this pull request as draft April 25, 2025 13:33
@pbeaucage
Collaborator

Closing this PR; some of the requested improvements are in #211. Feel free to reopen if there are other things worth merging in from this.

@pbeaucage pbeaucage closed this Jun 21, 2025

Development

Successfully merging this pull request may close these issues.

docs: Add thorough instructions for setting up an appropriate environment to run PyHyperScattering

3 participants