new guide quickstart, csc-2892 #31

Open
EmiliaSzynwald wants to merge 2 commits into master from es-guide-quickstart

@@ -0,0 +1,101 @@
---
sort: 2
---

# Quick Start Guide

## Account Access

A Star HPC account is required to access and submit jobs to the Star HPC cluster.

The application process may require justification of the need for HPC resources, detailing the kind of work you intend to do, the resources you expect to use, and sometimes the anticipated outcomes of your research.

Access to the cluster comes with responsibilities and certain privileges. Your account comes with the responsibility to use the resources wisely and efficiently, to respect the shared nature of the environment, and to contribute to the overall HPC community.

Users should understand the policies on data privacy and user responsibilities.

## Getting started on the cluster (account, quota, password)

Before you start using the Star cluster, please read Hofstra University's [Acceptable Use Guidelines](http://www.hofstra.edu/scs/aug).

A general introduction to the Star HPC cluster, research community, and support group can be found at [starhpc.hofstra.io](https://starhpc.hofstra.io).

To be able to work on the Star cluster, you must have an approved account on the cluster.

## How to get an account on Star

Members of Hofstra University, Nassau Community College, or Adelphi University, as well as other researchers affiliated with these institutions, may apply for an account.

### Requesting an account

To get an account on Star, you need to complete the [request form](https://access.starhpc.hofstra.io/apply). There, you will need to provide us with the following information:

- Your full name, date of birth, and nationality.
- Your position (master's student, PhD student, postdoc, staff member, visitor/guest).
- Your mobile phone number. This is necessary for password recovery.
- Your institutional email address (i.e., your work email at the research institution to which you belong).
- The name and address of the institution you belong to, including the name of the center, institute, etc.
- Your institutional username or a preferred username. If you are a member of Hofstra and already have a Hofstra account, you must enter your Hofstra username. A username is a sequence of two to thirty lowercase alphanumeric characters, where the first character must be a lowercase letter.
- Any necessary additional group and account memberships.
- Optional: If you know in advance, please let us know how many CPU hours you expect to use, how much long-term storage space (GB) you will need, and what software you will use. Partial answers are also welcome. The more we know about the needs of our users, the better services we can provide and the better we can plan for the future.

**If you are a staff member and need to get a local project,** we need information about the project:
- Name of the project
- Brief description of the project
- Field of science for the project
- Names of additional members of the project

**If you are a student, PhD candidate, or postdoc,** you also need to provide us with the name of your advisor and the name of the project you are to be a member of.

Submit the above information through the online registration form.

## Login node

### About the login node

The login node serves as the gateway or entry point to the cluster. Note that most software tools are not available on the login node and it is not intended for prototyping, building software, or running computationally intensive tasks. Instead, the login node is specifically for accessing the cluster and performing only very basic tasks, such as copying and moving files, submitting jobs, and checking the status of existing jobs. For development tasks, use one of the development nodes, which are accessed the same way as the large compute nodes. The compute nodes are where all the actual computational work is performed. They are accessed by launching jobs through Slurm with `sbatch` or `srun`.

### Connection and credentials

Access to the cluster is provided through SSH to the login node. Upon your account's creation, you can access the login node using the address provided in your welcome email.

If you have existing Linux lab credentials, use them to log in. Otherwise, login credentials will be provided to you.
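
A typical connection from a terminal looks like the following sketch; the port and the `[login_node]` placeholder are taken from the connection examples later in these guides, and the actual address and username are the ones in your welcome email:

```bash
# Placeholder values: replace the address and username with those from your welcome email.
ssh -p 5010 your_username@[login_node].hofstra.edu
```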

Additionally, the login node provides access to your Linux lab files, **but note that** the login node is **not** just another Linux lab machine. It simply shares certain features (e.g., credentials and files) for convenience.

## Scheduler policies

Users should understand the cluster's policies regarding CPU and GPU usage, including time limits and priority settings.

Users are advised to learn how to check their usage statistics to manage their resource allocation efficiently.
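
For example, on Slurm-based clusters you can typically review your current and recent usage with standard commands like the following (a minimal sketch; the exact accounting fields configured on Star may differ):

```bash
# Jobs you currently have queued or running
squeue -u $USER

# Your job history and elapsed run times over the past week
sacct -u $USER -S $(date -d '7 days ago' +%F) --format=JobID,JobName,Elapsed,State
```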

## Storage policies

Storage quotas and usage limits are put in place to ensure fair use and equitable distribution of the resources among all users.

It is important to know where to store different types of data (such as large datasets or temporary files).

Your home directory (`/home/your_username`) provides a limited amount of storage for scripts, source code, and small datasets.

Project-specific directories may be created upon request for shared storage among multiple accounts.

## Further Reading

To make proper use of the cluster, please familiarize yourself with the basics of using Slurm, fundamental HPC concepts, and the cluster's architecture.

You may be familiar with the `.bashrc`, `.bash_profile`, or `.cshrc` files for environment customization. To support the different environments needed for different software packages, [environment modules]({{site.baseurl}}{% link software/env-modules.md %}) are used. Modules allow you to load and unload various software environments tailored to your computational tasks.
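
For example, a minimal module workflow looks like this (a sketch; the module names available on Star may vary, so check `module avail` first):

```bash
module avail           # see which software environments are available
module load python     # load an environment for your task
module unload python   # unload it when you are done
```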

@@ -0,0 +1,142 @@
# Star HPC LONG Start Guide

Welcome to Star HPC! This guide will help you log in, run your first job, and start using the cluster. For an overview of policies, see the [Account & Access Overview](https://starhpc.hofstra.io/account-policies/).

## Connect to Star HPC

### Step 1: SSH into the Login Node

Use your provided credentials to connect:

```bash
ssh -p 5010 your_username@[login_node].hofstra.edu
```

> If you're using a Windows system, you can connect using [PuTTY](https://www.putty.org/) or [WSL + OpenSSH].

## Submit Your First Job

### Step 2: Load the Sample Python Job Script

Copy the sample job script into your home directory:

```bash
cp -r /fs1/shared/docs/examples/python_hello ~
cd ~/python_hello
```

### Step 3: Submit the Job

#### Using a Batch Job (Non-interactive)

To submit a batch (non-interactive) job, simply run the `sbatch` command in the terminal. For example:

```bash
sbatch python_hello.sh
```
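
On success, `sbatch` prints the ID of the newly queued job, which you will need for monitoring and cancellation. The output looks roughly like this (the job ID will differ):

```
Submitted batch job 123456
```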

For more information, see [Batch jobs](https://docs.starhpc.hofstra.io/jobs/submitting-jobs.html#batch-jobs-non-interactive).

#### Using an Interactive Job

To submit an interactive job, simply run the `srun` command with your desired options directly in the terminal. For example:

```bash
srun --pty python ~/python_hello/hello_world.py
```

Once submitted, you'll be placed in an interactive shell on the allocated node when resources become available. Your prompt will change to indicate you're on the compute node.

For more information, see [Interactive jobs](https://docs.starhpc.hofstra.io/jobs/submitting-jobs.html#interactive-jobs).

### Step 4: Check the Output

```bash
cat python_hello.out
```

## Common Job Monitoring Commands

### Check your job status

```bash
sacct --user=<your_username>
```
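
If you want a more focused view, these variants may be helpful (a sketch using standard Slurm tools; adjust the format fields as needed):

```bash
squeue -u <your_username>                                           # only jobs still queued or running
sacct --user=<your_username> --format=JobID,JobName,State,Elapsed   # condensed job history
```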

### Cancel a Job

```bash
scancel <jobid>
```

### Transfer Files

Use `scp` or `rsync` to transfer files to/from the cluster:

```bash
scp -P 5010 myfile.txt your_username@[login_node].hofstra.edu:/home/your_username/
```
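
Because the login node uses a non-default SSH port, `rsync` needs the port passed through its `-e` option. A minimal sketch, using the same placeholder hostname and port as above:

```bash
rsync -avz -e 'ssh -p 5010' myfile.txt your_username@[login_node].hofstra.edu:/home/your_username/
```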

Learn more about monitoring and managing jobs at [Monitoring Jobs](https://docs.starhpc.hofstra.io/jobs/monitoring-jobs.html).

## How to run a Jupyter Notebook

### Step 5: Load the Jupyter Notebook Job Script

Copy the job script into your home directory:

```bash
cp -r /fs1/shared/docs/examples/jupyter ~
cd ~/jupyter
```

### Step 6: Submit the Job

```bash
sbatch jupyter_notebook.sbatch
```

Upon your job's submission to the queue, you will see output indicating your job's ID. Replace the `<jobid>` placeholder with this value throughout the rest of the Jupyter Notebook section.

Additionally, if you run `squeue` immediately after submitting your job, you might see a message such as "Node Unavailable" next to your job. Before proceeding, wait until your job has changed to the RUNNING state as reported by the `squeue` command.

### Step 7: Check your output file for the SSH command

```bash
cat jupyter_notebook_<jobid>.out  # Run this command in the directory where the .out file is located.
```

### Step 8: Run the SSH port-forwarding command

Open a new terminal on your local machine and run the SSH command provided in the output file. If prompted for a password, use your Linux lab password if you haven't set up SSH keys. You might be asked to enter your password multiple times. **Note** that the command will appear to hang after a successful connection; this is the expected behavior. Do not terminate the command (Ctrl + C), as this will disconnect your Jupyter Notebook session.
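
The exact command is generated by the job script and written to the `.out` file, but it generally has the shape of an SSH local port forward. A hypothetical example only; the node name `cn01`, port `8888`, and login address below are placeholders, so always copy the real command from your `.out` file:

```bash
ssh -p 5010 -L 8888:cn01:8888 your_username@[login_node].hofstra.edu
```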

### Step 9: Find and open the link in your browser

Wait about 30 seconds after executing the SSH port-forwarding command on your local machine; it takes a little time for the `.err` file to be updated with your link.

Check the error file on the login node for your Jupyter Notebook's URL:

```bash
cat jupyter_notebook_<jobid>.err | grep '127.0.0.1'  # Run this command in the directory where the .err file is located.
```

Copy the URL from the error file (either of the two printed lines works) and paste it into your **local machine's browser**.
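
The lines you are looking for resemble Jupyter's standard startup message, for example (hypothetical port and token):

```
http://127.0.0.1:8888/?token=0123456789abcdef0123456789abcdef
```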

### Step 10: Clean up

If you're done before the job reaches its walltime limit, clean up your session by running this command on the login node:

```bash
scancel <jobid>
```

Afterwards, press Ctrl + C in your local computer's terminal session where you ran the port-forwarding command to terminate the SSH connection.

Learn more about running and creating a job script with [Jupyter Notebook](https://docs.starhpc.hofstra.io/software/jupyter-notebook.html).

## Next Steps

- Learn about job scheduling with [Slurm](https://docs.starhpc.hofstra.io/jobs/submitting-jobs.html)
- Explore available software with `module avail`
- Store large datasets in project directories (request access if needed)

For help or questions, visit [https://github.com/StarHPC/Issues](https://github.com/StarHPC/Issues) or email `[email protected]`.

@@ -0,0 +1,137 @@
# Star HPC Quick Start Guide

Welcome to Star HPC! This guide will help you log in, run your first job, and start using the cluster. For an overview of policies, see the [Account & Access Overview].

## Connect to Star HPC

### Step 1: SSH into the Login Node

Use your provided credentials to connect:

```bash
ssh -p 5010 [email protected]
```

> If you're using a Windows system, you can connect using [PuTTY](https://www.putty.org/) or [WSL + OpenSSH].

## Set Up Your Environment

### Step 2: Load a Module

Star uses environment modules to manage software. Load a basic module like Python:

```bash
module avail          # View available modules
module load python    # Load the Python module
```

You can use `module list` to see what you have loaded.
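
To remove a module from your environment when you no longer need it, unload it (standard environment-modules commands; the module names on Star may differ):

```bash
module unload python   # remove the Python module from your environment
module purge           # unload all currently loaded modules at once
```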

## Submit Your First Job

### Step 3: Create a Simple Job Script

Create a file called `test_job.slurm`:

```bash
nano test_job.slurm
```

Paste this content (the `#SBATCH` lines at the beginning of a job script are not just comments; they are directives processed by `sbatch`):

```bash
#!/bin/bash
#SBATCH --job-name=test_job
#SBATCH --output=test_job.out
#SBATCH --error=test_job.err
#SBATCH --nodes=1
#SBATCH --time=10:00
#SBATCH --mem=1G

# Replace this comment with the commands you wish to run in the script.
```

Save and exit.

### Example

Create a file called `hello.slurm`:

```bash
nano hello.slurm
```

Paste this content (again, the `#SBATCH` lines are directives processed by `sbatch`):

```bash
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --output=hello.out
#SBATCH --time=00:01:00
#SBATCH --ntasks=1

echo "Hello from Star HPC!"
```

Save and exit.

### Step 4: Submit the Job

#### Using a Batch Job (Non-interactive)

```bash
sbatch hello.slurm
```

For more information, see [Batch jobs](https://docs.starhpc.hofstra.io/jobs/submitting-jobs.html#batch-jobs-non-interactive).

#### Using an Interactive Job

To submit an interactive job, simply run the `srun` command with your desired options directly in the terminal. For example:

```bash
# Option 1: run the sample shell script directly
chmod +x ~/python_hello/python_hello.sh
srun --pty --ntasks=1 --cpus-per-task=1 --time=01:00:00 --mem=4G ~/python_hello/python_hello.sh

# Option 2: run the sample Python script through the Python interpreter
srun --pty --ntasks=1 --cpus-per-task=1 --time=01:00:00 --mem=4G python ~/python_hello/hello_world.py
```

Once submitted, you'll be placed in an interactive shell on the allocated node when resources become available. Your prompt will change to indicate you're on the compute node.

For more information, see [Interactive jobs](https://docs.starhpc.hofstra.io/jobs/submitting-jobs.html#interactive-jobs).

### Step 5: Monitor Your Job

```bash
squeue -u your_username
```

Once complete, check the output:

```bash
cat hello.out
```
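
Given the script above, the output file should contain the echoed line:

```
Hello from Star HPC!
```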

### Cancel a Job

```bash
scancel <jobid>
```

## Transfer Files

Use `scp` or `rsync` to transfer files to/from the Star cluster:

```bash
scp -P 5010 myfile.txt [email protected]:/home/your_username/
```

### How to run a Jupyter Notebook

TODO: Include the necessary commands, a sample sbatch job template (e.g., `cp /fs1/shared/job-examples/juperter-notebook.sbatch /fs1/projects/<your_project>`), and an SSH port-forwarding command.

TODO: Have users copy the template into their own directory and submit that copy. Change the node range from 1-30 to any available node, because the example would not work if that node is down (Alex may have already changed this).

Learn more about running [Jupyter Notebook](https://docs.starhpc.hofstra.io/software/jupyter-notebook.html).

## Next Steps

- Learn about job scheduling with [Slurm](https://docs.starhpc.hofstra.io/jobs/submitting-jobs.html)
- Explore available software with `module avail`
- Store large datasets in project directories (request access if needed)

For help or questions, visit [https://github.com/StarHPC/Issues](https://github.com/StarHPC/Issues) or email `[email protected]`.
Please add how to unload a module