Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mymdc proxy #226

Merged
merged 3 commits into from
Apr 4, 2024
Merged

Mymdc proxy #226

merged 3 commits into from
Apr 4, 2024

Conversation

JamesWrigley
Copy link
Member

In a nutshell, DAMNIT will now look for mymdc credentials in usr/mymdc-credentials.yml whenever a mymdc#... argument is used in a variable, and use them to retrieve mymdc fields (currently just the sample and run type are allowed).

The credentials file must have the format:

token: <token>
server: <server-url>

Fixes #34.

@JamesWrigley JamesWrigley added the enhancement New feature or request label Mar 18, 2024
@JamesWrigley JamesWrigley self-assigned this Mar 18, 2024
@JamesWrigley
Copy link
Member Author

Oh, and I decided to stick with requests instead of metadata_client because that seemed like a more lightweight dependency, and the code is barely longer than using the metadata_client API anyway.

@@ -45,6 +46,45 @@ class DataType(Enum):
Timestamp = "timestamp"


def get_run_info(run, run_no, proposal):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would cache this function, as to not query mymdc multiple times for the same information.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 8ce6281.

@@ -45,6 +46,45 @@ class DataType(Enum):
Timestamp = "timestamp"


def get_run_info(run, run_no, proposal):
import yaml
import requests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move imports top of the file? Unless there's a good reason not to.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would guess James has done this because the ctxrunner file is imported in DAMNIT's own frontend & backend processes, as well as the subprocesses executing the context file. By hiding the imports inside functions like this, DAMNIT doesn't require them.

We increasingly run the context file in a separate Python environment from the DAMNIT application - perhaps it's time to make that the default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, though with the new class in 8ce6281 it's a little annoying to import everything where necessary so I moved them to the top.


run_info = run_res["runs"][0]

return headers, server, run_info
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it shaky to have a function called get_run_info, to return a header with a token and url to be reused in other fonctions in addition to the result it should give (the run info).
I'd split that part in a separate function to do the requests to mymdc (taking the API endpoint as argument).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could even be a small class, with the server URL & token as instance attributes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here I was trying not to make another client 😄 Added a wee one in 8ce6281.

for name, var in self.vars.items():
mymdc_args = var.arg_dependencies("mymdc#")
for arg_name, annotation in mymdc_args.items():
if annotation not in ["sample", "run_type"]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicking: I'd call sample sample_name, as the sample data in mymdc may also contains more information.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed in 8ce6281.

Comment on lines 61 to 62
run_res = requests.get(f"{server}/api/mymdc/proposals/by_number/{proposal}/runs/{run_no}",
headers=headers, timeout=MYMDC_TIMEOUT).json()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
run_res = requests.get(f"{server}/api/mymdc/proposals/by_number/{proposal}/runs/{run_no}",
headers=headers, timeout=MYMDC_TIMEOUT).json()
resp = requests.get(f"{server}/api/mymdc/proposals/by_number/{proposal}/runs/{run_no}",
headers=headers, timeout=MYMDC_TIMEOUT)
resp.raise_for_status()
run_res = resp.json()

We probably want something like this basically everywhere we make a request - requests doesn't turn HTTP errors into exceptions unless you explicitly ask it to, so if we get something like a 404 or a 401, the failure would be quite confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, fixed in 8ce6281.

@JamesWrigley JamesWrigley force-pushed the mymdc-proxy branch 2 times, most recently from 24b5a36 to 323535e Compare March 24, 2024 06:39
@JamesWrigley
Copy link
Member Author

Oh, forgot to mention that I added support for propagating changes in the GUI in 323535e.

Copy link
Member

@tmichela tmichela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check if the yaml file exists, and if not ask the proxy to produce one?

@RobertRosca
Copy link
Member

Just glanced through the code, I'll leave a review later.

For the caching did you consider just using native HTTP response caching? e.g. https://requests-cache.readthedocs.io/en/stable/

Responses from MyMdC include cache info but always explicitly disable caching, spoke to ITDM about allowing cached responses and they said they'll look into it.

For the proxy I can add the relevant headers to the responses.

@JamesWrigley
Copy link
Member Author

Should we also check if the yaml file exists, and if not ask the proxy to produce one?

Excellent point, I forgot about that... added in 1d4a877.

For the caching did you consider just using native HTTP response caching? e.g. https://requests-cache.readthedocs.io/en/stable/

Responses from MyMdC include cache info but always explicitly disable caching, spoke to ITDM about allowing cached responses and they said they'll look into it.

I didn't, but that does look like the way to go 🤔 I'd say let's stick with explicit caching for now until mymdc implements it, I wouldn't dare change the response in the proxy.

@JamesWrigley
Copy link
Member Author

(if there's no objections I'll probably merge this at the end of the week)

@tmichela
Copy link
Member

tmichela commented Apr 3, 2024

LGTM

@RobertRosca you said you wanted to have a closer look at that, did you have the chance to do so?

@RobertRosca
Copy link
Member

I'll have a quick look now

Copy link
Member

@RobertRosca RobertRosca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments but they're minor suggestions, so LGTM 👍

# delay in the GUI and because people don't usually close the GUI
# immediately after editing a variable so it's unlikely that the process
# will die before the update is sent (as happened with the backend).
self.kafka_prd.send(self.update_topic, message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's a concern, I guess one option would be to use loop.run_soon, or store a reference to the coroutine and, only when the gui is being closed, await it or .close() it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I added a call to .flush() in fbced4c which should do the same thing.

def __init__(self, proposal, timeout=10, init_server="https://exfldadev01.desy.de/zwop"):
self.proposal = proposal
self.timeout = timeout
self._cache = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With how the cache is used (keys as a tuple of arguments basically as far as I could tell?) it seems pretty similar to what using the functools cache decorator would do right?

e.g.

    @cache
    def _run_info(self, run):
        response = requests.get(f"{self.server}/api/mymdc/proposals/by_number/{self.proposal}/runs/{run}",
                                    headers=self._headers, timeout=self.timeout)
        ....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, but when we make this public the caching should probably be disabled by default (at least until mymdc implements caching headers), so I'll stick with the dictionary for now so we can add an option to disable caching later.

In a nutshell, DAMNIT will now look for mymdc credentials in
`usr/mymdc-credentials.yml` whenever a `mymdc#...` argument is used in a
variable and use them to retrieve mymdc fields (currently just the sample and
run type).
Currently only for the supported message types. Caveat: adding a new editable
column doesn't preserve the column position in the update.
@JamesWrigley JamesWrigley merged commit 5560256 into master Apr 4, 2024
3 checks passed
@JamesWrigley JamesWrigley deleted the mymdc-proxy branch April 4, 2024 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lightweight integration with myMdC
4 participants