2 changes: 1 addition & 1 deletion binder/Dockerfile
@@ -39,4 +39,4 @@ RUN find_notebooks | xargs nbstripout
WORKDIR ${HOME}/tutorials

# install HTCondor Python bindings pre-release, when necessary
RUN python -m pip install --no-cache-dir --upgrade htcondor==9.1.3
RUN python -m pip install --no-cache-dir --upgrade htcondor==10.7.0
8 changes: 4 additions & 4 deletions tutorials/Advanced-Job-Submission-And-Management.ipynb
@@ -12,7 +12,7 @@
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/htcondor/htcondor-python-bindings-tutorials/master?urlpath=lab/tree/Advanced-Job-Submission-And-Management.ipynb)\n",
"\n",
"The two most common HTCondor command line tools are `condor_q` and `condor_submit`.\n",
"In the previous module, we learned about the `xquery()` method that corresponds to `condor_q`. Here, we will learn the Python binding equivalent of `condor_submit` in greater detail.\n",
"In the previous module, we learned about the `query()` method that corresponds to `condor_q`. Here, we will learn the Python binding equivalent of `condor_submit` in greater detail.\n",
"\n",
"We start by importing the relevant modules:"
]
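Not part of the diff: for readers following along, a minimal sketch of the `condor_submit` equivalent discussed in this cell, assuming a reachable local schedd; the executable, filenames, and resource requests are illustrative only.

```python
import htcondor

# Describe the job with the same key/value pairs a submit file would use.
sub = htcondor.Submit({
    "executable": "/bin/sleep",
    "arguments": "10",
    "output": "sleep-$(ProcId).out",
    "error": "sleep-$(ProcId).err",
    "log": "sleep.log",
    "request_memory": "128MB",
})

# submit() queues the job(s) and returns a SubmitResult holding the new ClusterId.
schedd = htcondor.Schedd()
submit_result = schedd.submit(sub, count=1)
print("Submitted cluster", submit_result.cluster())
```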
@@ -255,7 +255,7 @@
"submit_result = schedd.submit(sub, count=5) # queues 5 copies of this job\n",
"schedd.edit([f\"{submit_result.cluster()}.{idx}\" for idx in range(2)], \"foo\", '\"bar\"') # sets attribute foo to the string \"bar\" for the first two jobs\n",
" \n",
"for ad in schedd.xquery(\n",
"for ad in schedd.query(\n",
" constraint=f\"ClusterId == {submit_result.cluster()}\",\n",
" projection=[\"ProcId\", \"JobStatus\", \"foo\"],\n",
"):\n",
@@ -272,7 +272,7 @@
"source": [
"schedd.act(htcondor.JobAction.Hold, f\"ClusterId == {submit_result.cluster()} && ProcId >= 2\")\n",
"\n",
"for ad in schedd.xquery(\n",
"for ad in schedd.query(\n",
" constraint=f\"ClusterId == {submit_result.cluster()}\",\n",
" projection=[\"ProcId\", \"JobStatus\", \"foo\"],\n",
"):\n",
@@ -334,7 +334,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.9.2"
}
},
"nbformat": 4,
140 changes: 2 additions & 138 deletions tutorials/Advanced-Schedd-Interactions.ipynb
@@ -42,7 +42,7 @@
"source": [
"## Job and History Querying\n",
"\n",
"In [HTCondor Introduction](HTCondor-Introduction.ipynb), we covered the `Schedd.xquery` method\n",
"In [HTCondor Introduction](HTCondor-Introduction.ipynb), we covered the `Schedd.query` method\n",
"and its two most important keywords:\n",
"\n",
"* ``requirements``: Filters the jobs the schedd should return.\n",
@@ -107,135 +107,6 @@
"print(len(schedd.query(projection=[\"ProcID\"], constraint=f\"ClusterId=={submit_result.cluster()}\")))"
]
},
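Not part of the diff: the hunk above only shows queue querying, so here is a hedged sketch of the history side mentioned in the cell's heading, assuming the schedd keeps job history; argument names vary between binding versions, so they are passed positionally.

```python
import htcondor

schedd = htcondor.Schedd()

# Completed jobs leave the queue and land in the schedd's history file.
# history() takes a constraint, a projection, and a match limit, much like query().
for ad in schedd.history(
    "JobStatus == 4",                     # completed jobs only
    ["ClusterId", "ProcId", "ExitCode"],  # keep the returned data small
    10,                                   # stop after 10 ads
):
    print(ad)
```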
{
"cell_type": "markdown",
"metadata": {
"pycharm": {}
},
"source": [
"The ``sum(1 for _ in ...)`` syntax is a simple way to count the number of items\n",
"produced by an iterator without buffering all the objects in memory."
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {}
},
"source": [
"## Querying many Schedds\n",
"\n",
"On larger pools, it's common to write Python scripts that interact with not one but many schedds. For example,\n",
"if you want to implement a \"global query\" (equivalent to ``condor_q -g``; concatenates all jobs in all schedds),\n",
"it might be tempting to write code like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {}
},
"outputs": [],
"source": [
"jobs = []\n",
"for schedd_ad in htcondor.Collector().locateAll(htcondor.DaemonTypes.Schedd):\n",
" schedd = htcondor.Schedd(schedd_ad)\n",
" jobs += schedd.xquery()\n",
"print(len(jobs))"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {}
},
"source": [
"This is sub-optimal for two reasons:\n",
"\n",
"* ``xquery`` is not given any projection, meaning it will pull all attributes for all jobs -\n",
" much more data than is needed for simply counting jobs.\n",
"* The querying across all schedds is serialized: we may wait for painfully long on one or two\n",
" \"bad apples.\"\n",
"\n",
"We can instead begin the query for all schedds simultaneously, then read the responses as\n",
"they are sent back. First, we start all the queries without reading responses:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {}
},
"outputs": [],
"source": [
"queries = []\n",
"coll_query = htcondor.Collector().locateAll(htcondor.DaemonTypes.Schedd)\n",
"for schedd_ad in coll_query:\n",
" schedd_obj = htcondor.Schedd(schedd_ad)\n",
" queries.append(schedd_obj.xquery())"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {}
},
"source": [
"The iterators will yield the matching jobs; to return the autoclusters instead of jobs, use\n",
"the ``AutoCluster`` option (``schedd_obj.xquery(opts=htcondor.QueryOpts.AutoCluster)``). One\n",
"auto-cluster ad is returned for each set of jobs that have identical values for all significant\n",
"attributes. A sample auto-cluster looks like:\n",
"\n",
" [\n",
" RequestDisk = DiskUsage;\n",
" Rank = 0.0;\n",
" FileSystemDomain = \"hcc-briantest7.unl.edu\";\n",
" MemoryUsage = ( ( ResidentSetSize + 1023 ) / 1024 );\n",
" ImageSize = 1000;\n",
" JobUniverse = 5;\n",
" DiskUsage = 1000;\n",
" JobCount = 1;\n",
" Requirements = ( TARGET.Arch == \"X86_64\" ) && ( TARGET.OpSys == \"LINUX\" ) && ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) && ( ( TARGET.HasFileTransfer ) || ( TARGET.FileSystemDomain == MY.FileSystemDomain ) );\n",
" RequestMemory = ifthenelse(MemoryUsage isnt undefined,MemoryUsage,( ImageSize + 1023 ) / 1024);\n",
" ResidentSetSize = 0;\n",
" ServerTime = 1483758177;\n",
" AutoClusterId = 2\n",
" ]\n",
"\n",
"We use the `poll` function, which will return when a query has available results:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {}
},
"outputs": [],
"source": [
"job_counts = {}\n",
"for query in htcondor.poll(queries):\n",
" schedd_name = query.tag()\n",
" job_counts.setdefault(schedd_name, 0)\n",
" count = len(query.nextAdsNonBlocking())\n",
" job_counts[schedd_name] += count\n",
" print(\"Got {} results from {}.\".format(count, schedd_name))\n",
"print(job_counts)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {}
},
"source": [
"The `QueryIterator.tag` method is used to identify which query is returned; the\n",
"tag defaults to the Schedd's name but can be manually set through the ``tag`` keyword argument\n",
"to `Schedd.xquery`."
]
},
{
"cell_type": "markdown",
"metadata": {
@@ -264,13 +135,6 @@
"):\n",
" print(ad)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -289,7 +153,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.9.2"
}
},
"nbformat": 4,
4 changes: 2 additions & 2 deletions tutorials/ClassAds-Introduction.ipynb
@@ -11,7 +11,7 @@
"Launch this tutorial in a Jupyter Notebook on Binder: \n",
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/htcondor/htcondor-python-bindings-tutorials/master?urlpath=lab/tree/ClassAds-Introduction.ipynb)\n",
"\n",
"In this tutorial, we will learn the basics of the [ClassAd language](https://research.cs.wisc.edu/htcondor/classad/classad.html),\n",
"In this tutorial, we will learn the basics of the [ClassAd language](https://htcondor.org/classad/classad.html),\n",
"the policy and data exchange language that underpins all of HTCondor.\n",
"ClassAds are fundamental in the HTCondor ecosystem, so understanding them will be good preparation for future tutorials.\n",
"\n",
@@ -366,7 +366,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
"version": "3.9.6"
}
},
"nbformat": 4,
12 changes: 2 additions & 10 deletions tutorials/DAG-Creation-And-Submission.ipynb
@@ -406,8 +406,7 @@
"os.chdir(dag_dir)\n",
"\n",
"schedd = htcondor.Schedd()\n",
"with schedd.transaction() as txn:\n",
" cluster_id = dag_submit.queue(txn)\n",
"cluster_id = schedd.submit(dag_submit).cluster()\n",
" \n",
"print(f\"DAGMan job cluster is {cluster_id}\")\n",
"\n",
@@ -464,13 +463,6 @@
"source": [
"Image(dag_dir / \"mandelbrot.png\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -489,7 +481,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.9.2"
}
},
"nbformat": 4,
20 changes: 4 additions & 16 deletions tutorials/HTCondor-Introduction.ipynb
@@ -212,7 +212,7 @@
},
"outputs": [],
"source": [
"for job in schedd.xquery(projection=['ClusterId', 'ProcId', 'JobStatus']):\n",
"for job in schedd.query(projection=['ClusterId', 'ProcId', 'JobStatus']):\n",
" print(repr(job))"
]
},
@@ -233,7 +233,7 @@
"\n",
"Depending on how quickly you executed the above cell, you might see all jobs idle (`JobStatus = 1`) or some jobs running (`JobStatus = 2`) above.\n",
"\n",
"As with the Collector's `query` method, we can also filter out jobs using `xquery`:"
"As with the Collector's `query` method, we can also filter out jobs using `query`:"
]
},
{
@@ -244,22 +244,10 @@
},
"outputs": [],
"source": [
"for ad in schedd.xquery(constraint = 'ProcId >= 5', projection=['ProcId']):\n",
"for ad in schedd.query(constraint = 'ProcId >= 5', projection=['ProcId']):\n",
" print(ad.get('ProcId'))"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {}
},
"source": [
"Astute readers may notice that the `Schedd` object has both `xquery` and `query` methods.\n",
"The difference between them is primarily how memory is managed:\n",
"- `query` returns a _list_ of ClassAds, meaning all objects are held in memory at once. This utilizes more memory, but the results are immediately available.\n",
"- `xquery` returns an _iterator_ that produces ClassAds. This only requires one ClassAd to be in memory at once."
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -315,7 +303,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.9.2"
}
},
"nbformat": 4,
6 changes: 3 additions & 3 deletions tutorials/Scalable-Job-Tracking.ipynb
@@ -14,7 +14,7 @@
"The Python bindings provide two scalable mechanisms for tracking jobs:\n",
"\n",
"* **Poll-based tracking**: The Schedd can be periodically polled\n",
" through the use of `Schedd.xquery` to get job\n",
" through the use of `Schedd.query` to get job\n",
" status information.\n",
"* **Event-based tracking**: Using the job's *user log*, Python can\n",
" see all job events and keep an in-memory representation of the\n",
@@ -34,14 +34,14 @@
"Beside the technical means of polling, important aspects to consider are *how often*\n",
"the poll should be performed and *how much* data should be retrieved.\n",
"\n",
"**Note**: When `Schedd.xquery` is used, the query will cause the schedd to fork\n",
"**Note**: When `Schedd.query` is used, the query will cause the schedd to fork\n",
"up to ``SCHEDD_QUERY_WORKERS`` simultaneous workers. Beyond that point, queries will\n",
"be handled in a non-blocking manner inside the main ``condor_schedd`` process. Thus, the\n",
"memory used by many concurrent queries can be reduced by decreasing ``SCHEDD_QUERY_WORKERS``.\n",
"\n",
"A job tracking system should not query the Schedd more than once a minute. Aim to minimize the\n",
"data returned from the query through the use of the projection; minimize the number of jobs returned\n",
"by using a query constraint. Better yet, use the ``AutoCluster`` flag to have `Schedd.xquery`\n",
"by using a query constraint. Better yet, use the ``AutoCluster`` flag to have `Schedd.query`\n",
"return a list of job summaries instead of individual jobs.\n",
"\n",
"Advantages:\n",
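Not part of the diff: a rough sketch of the poll-based approach described above, assuming a reachable local schedd and a known cluster id; the interval and projection are illustrative.

```python
import time

import htcondor

schedd = htcondor.Schedd()

def poll_job_statuses(cluster_id, interval=60):
    """Poll one cluster's job statuses, following the advice above."""
    while True:
        ads = schedd.query(
            constraint=f"ClusterId == {cluster_id}",  # limit the jobs returned
            projection=["ProcId", "JobStatus"],       # pull only the attributes we need
        )
        if not ads:
            break  # no matching jobs left in the queue
        counts = {}
        for ad in ads:
            counts[ad["JobStatus"]] = counts.get(ad["JobStatus"], 0) + 1
        print(counts)
        time.sleep(interval)  # keep the poll rate at roughly once a minute or slower
```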
4 changes: 2 additions & 2 deletions tutorials/Submitting-and-Managing-Jobs.ipynb
@@ -90,7 +90,7 @@
"pycharm": {}
},
"source": [
"The available descriptors are documented in the [`condor_submit` manual page](https://htcondor.readthedocs.io/en/latest/man-pages/condor_submit.html).\n",
"The available descriptors are documented in the `condor_submit` [manual page](https://htcondor.readthedocs.io/en/latest/man-pages/condor_submit.html).\n",
"The keys of the Python dictionary you pass to `htcondor.Submit` should be the same as for the submit descriptors, and the values should be **strings containing exactly what would go on the right-hand side**.\n",
"\n",
"Note that we gave the `Submit` object several relative filepaths.\n",
@@ -617,7 +617,7 @@
"- Modify the code, or add new code to it, to pass the test. Do whatever it takes!\n",
"- You can run the test by running the block it is in.\n",
"- Feel free to look at the test for clues as to how to modify the code.\n",
"- Many of the exercises can be solved either by using Python to generate inputs, or by using advanced features of the [ClassAd language](https://htcondor.readthedocs.io/en/latest/misc-concepts/classad-mechanism.html#htcondor-s-classad-mechanism). Either way is valid!\n",
"- Many of the exercises can be solved either by using Python to generate inputs, or by using advanced features of the [ClassAd language](https://htcondor.readthedocs.io/en/latest/classads/classad-mechanism.html#htcondor-s-classad-mechanism). Either way is valid!\n",
"- Don't modify the test. That's cheating!"
]
},
29 changes: 1 addition & 28 deletions tutorials/index.ipynb
@@ -46,33 +46,6 @@
"1. [DAG Creation and Submission](DAG-Creation-And-Submission.ipynb) - Using `htcondor.dags` to create and submit a DAG.\n",
"1. [Personal Pools](Personal-Pools.ipynb) - Using `htcondor.personal` to create and manage a \"personal\" HTCondor pool.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbsphinx-toctree": {
"hidden": true,
"maxdepth": 2
}
},
"source": [
"1. [Submitting and Managing Jobs](Submitting-and-Managing-Jobs.ipynb)\n",
"1. [ClassAds Introduction](ClassAds-Introduction.ipynb)\n",
"1. [HTCondor Introduction](HTCondor-Introduction.ipynb)\n",
"1. [Advanced Job Submission and Management](Advanced-Job-Submission-And-Management.ipynb)\n",
"1. [Advanced Schedd Interaction](Advanced-Schedd-Interactions.ipynb)\n",
"1. [Interacting with Daemons](Interacting-With-Daemons.ipynb)\n",
"1. [Scalable Job Tracking](Scalable-Job-Tracking.ipynb)\n",
"1. [DAG Creation and Submission](DAG-Creation-And-Submission.ipynb)\n",
"1. [Personal Pools](Personal-Pools.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -91,7 +64,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.9.2"
}
},
"nbformat": 4,