Add user defined YAML tools #19434
base: dev
container: str


class AdminToolSource(ToolSourceBase):
Totally fine just dropping the Admin version here if it is easier. The point was to establish some abstractions that could be used by the feature you're implementing and CWL - you've gone a lot farther with the user tools so I'm happy to drop any old admin-only, insecure functionality if it would help. Either way though is fine.
# and is scoped to an individual user and never adds to global toolbox
dynamic_tools_manager: DynamicToolManager = depends(DynamicToolManager)

@router.get("/api/unprivileged_tools")
I feel like I would prefer a name like 'user_tools' but if it is a big change - feel free to ignore. Also sorry I'm reviewing by commit so if any of these changes are undone in future commits feel free to ignore.
Also like the last comment - feel free to just dump the dynamic tools API endpoint and replace it with this. I think I like that name better also.
Absolutely open to changing how we name things, we did already have 3 or 4 different versions when we hacked on this in Berlin. I'll keep that for a last pass when it's nearing completion though.
This is amazing. I don't hate any of it. The CWL fields that I think we would struggle to fill out with our runtime are the only part that caused significant stress, but I didn't see an attempt to fill those out. I would have semantic questions, like whether name is going to be the name or the collection identifier, etc., but I don't think those are details you've tackled yet unless I missed the commit.
I would have started with locking tools down at the XML layer and having dozens of test cases around making sure tool action expressions cannot be evaluated, etc., for unprivileged tools. I understand that part is pretty unsexy, though, and I think there is some chance that having a fully defined model means those things are completely unreachable, so that might have been unnecessary work. I think we need to at least audit all the features before the final merge.
While reviewing the commits, I created a list of things I'd like to see broken out into smaller PRs to clean up the core. None of this is essential - if it works, it works - but any of that extra effort would be appreciated and would, I think, ease follow-up reviews and help isolate potential problems.
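As a rough illustration of the kind of audit test mentioned above, here is a minimal Python sketch of an allowlist check on a parsed tool definition. The function names, the allowlist itself, and the `actions` key used in the usage note are hypothetical; this is not Galaxy's schema or validation code, only a sketch assuming the tool has already been parsed into a dict.

```python
from __future__ import annotations

# Hypothetical allowlist built from the keys that appear in this PR's examples.
# It is NOT Galaxy's real schema; a real audit would derive it from the model.
ALLOWED_TOP_LEVEL_KEYS = {
    "class",
    "name",
    "version",
    "container",
    "shell_command",
    "base_command",
    "arguments",
    "inputs",
    "outputs",
}


def find_disallowed_keys(tool: dict) -> list[str]:
    """Return top-level keys that are not on the allowlist, sorted for stable output."""
    return sorted(key for key in tool if key not in ALLOWED_TOP_LEVEL_KEYS)


def assert_user_tool_is_safe(tool: dict) -> None:
    """Raise ValueError if the definition uses anything outside the allowlist."""
    disallowed = find_disallowed_keys(tool)
    if disallowed:
        raise ValueError(f"Disallowed keys in user tool: {', '.join(disallowed)}")
```

A test case per XML feature could then assert that a definition carrying that feature (for example, a made-up `actions` key) is rejected, which would make the audit repeatable rather than a one-off review.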
Suggestions from my current work on the cwl-1.0 branch.
)

from galaxy import (
    exceptions,
    model,
)
from galaxy.exceptions import DuplicatedIdentifierException
from galaxy.model import DynamicTool
from galaxy.managers.context import ProvidesUserContext
from galaxy.managers.context import ProvidesUserContext
@@ -1381,6 +1400,62 @@ def test_dynamic_tool_no_id(self):
output_content = self.dataset_populator.get_history_dataset_content(history_id)
assert output_content == "Hello World 2\n"

# This works except I don't want to add it to the schema right now,
# since I think the shell_command is what we'lll go with (at least initially)
- # since I think the shell_command is what we'lll go with (at least initially)
+ # since I think the shell_command is what we'll go with (at least initially)
Maybe the way to address this is to focus on documentation: a tool translation guide, perhaps, where each feature from XML is listed (under contents in https://docs.galaxyproject.org/en/master/dev/schema.html) along with how to port it to YAML and any security considerations. I did a lot of work on syncing XSD and YAML model docs in #18787 - I think we will want something like that for the broader tools, right? We will need to keep model docs and XSD docs synchronized but also have separate customizations for each. It is kind of a hard problem, but worth thinking about, and it may be worth capturing security concerns at this point.
hidden: Mapped[Optional[bool]] = mapped_column(default=False)
active: Mapped[Optional[bool]] = mapped_column(default=True)
These 2 columns are also present in "DynamicTool". Is the duplication there because the plan is to make DynamicTools shareable by associating multiple user ids with the same tool id?
The primary benefit is that the command section does not have access to the app, to any dangling reference to the database connection, or to any other secrets. There are two flavors here: one uses base_command and arguments and allows building up an (escaped) argv list; the other is a shortcut for writing shell scripts and feels a bit more like writing a very simple cheetah section.

base_command style:

```yml
name: base_command tool
class: GalaxyTool
version: 1.0.0
base_command: cat
arguments:
- $(inputs.input.path)
- '>'
- output.fastq
inputs:
- type: data
  name: input
outputs:
  output:
    type: data
    from_work_dir: output.fastq
    name: output
```

shell_command style:

```yml
name: shell_command tool
class: GalaxyTool
version: 1.0.0
shell_command: cat '$(inputs.input.path)' > output.fastq
inputs:
- type: data
  name: input
outputs:
  output:
    type: data
    from_work_dir: output.fastq
    name: output
```
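To make the difference between the two flavors concrete, here is a minimal Python sketch of how the `$(...)` references in the examples above might be resolved against a plain dict of inputs. It is an assumption-laden illustration, not how Galaxy evaluates these expressions: the helper names (`lookup`, `render_base_command`, `render_shell_command`), the dotted-path-only regex, and the choice to quote only resolved references are all hypothetical.

```python
import re
import shlex

# Matches simple dotted references such as $(inputs.input.path).
# Assumption: only plain dotted paths are supported in this sketch.
REFERENCE = re.compile(r"\$\(([A-Za-z_][\w.]*)\)")


def lookup(dotted_path, context):
    """Walk a dotted path like 'inputs.input.path' through nested dicts."""
    value = context
    for part in dotted_path.split("."):
        value = value[part]
    return value


def render_base_command(base_command, arguments, context):
    """base_command flavor: resolve each argument; quote resolved values for the shell."""
    tokens = [base_command]
    for argument in arguments:
        match = REFERENCE.fullmatch(argument)
        if match:
            # Resolved values are escaped so that spaces or metacharacters stay inert.
            tokens.append(shlex.quote(str(lookup(match.group(1), context))))
        else:
            # Literal tokens (such as the '>' in the example) pass through unchanged.
            tokens.append(argument)
    return " ".join(tokens)


def render_shell_command(shell_command, context):
    """shell_command flavor: substitute references in place; quoting is the author's job."""
    return REFERENCE.sub(lambda m: str(lookup(m.group(1), context)), shell_command)
```

With a context of `{"inputs": {"input": {"path": "/data/my reads.fastq"}}}`, both flavors of the example tool render the same command line in this sketch. The point is that neither function receives anything beyond the inputs dict, so there is no app object or database handle to reach.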
Will probably need this later for efficiency and ignoring `$()` outside of shell_command.
Simply re-use models by index and set values. It's currently a high-water mark situation, and there will be a warning once 200 models (i.e. 200 embedded fragments) are created, but that seems pretty unlikely.
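The re-use-by-index pattern described in the commit message above can be sketched in Python as follows. This is only an illustration of the pooling idea under stated assumptions: the `ModelPool` class, its `factory` argument, and the dict-based "models" are hypothetical stand-ins and do not reflect the PR's actual editor code.

```python
import warnings


class ModelPool:
    """Re-use model objects by index; the pool only grows (high-water mark)."""

    def __init__(self, factory, warn_at=200):
        self._factory = factory  # creates a new, empty model
        self._models = []        # never shrinks
        self._warn_at = warn_at

    def assign(self, values):
        """Set the i-th value on the i-th pooled model, creating models on demand."""
        for index, value in enumerate(values):
            if index == len(self._models):
                self._models.append(self._factory())
                if len(self._models) == self._warn_at:
                    warnings.warn(f"{self._warn_at} models created; pool never shrinks")
            self._models[index]["value"] = value
        return self._models[: len(values)]

    @property
    def high_water_mark(self):
        """Largest number of models ever needed at once."""
        return len(self._models)
```

For example, `ModelPool(dict).assign(["a", "b"])` followed by `assign(["c"])` on the same pool hands back the same first model with its value overwritten, and the pool still holds two models afterwards.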
ec05106 to a0c1e15
Co-authored-by: Nicola Soranzo <[email protected]>
This work enhances the existing YAML tool format to bring it close to feature parity with XML tools, and strips inherently unsafe elements, which should eventually allow a subset of trusted users to bring their own tools.
I'll follow up with a more extensive description, but here's a screenshot of the embedded tool editor.