Ensure that relative imports can be imported without requiring ./ in front of the import file name #350

Merged
merged 4 commits into from
Apr 28, 2025
6 changes: 3 additions & 3 deletions README.md
@@ -11,7 +11,7 @@ Runtime support for linkml generated data classes.

 ## About

-This python library provides runtime support for [LinkML](https://linkml.io/linkml/) datamodels.
+This Python library provides runtime support for [LinkML](https://linkml.io/linkml/) datamodels.

 See the [LinkML repo](https://github.com/linkml/linkml) for the [Python Dataclass Generator](https://linkml.io/linkml/generators/python.html) which will convert a schema into a Python object model. That model will have dependencies on functionality in this library.

@@ -24,8 +24,8 @@ See [working with data](https://linkml.io/linkml/data/index.html) in the documen

 This repository also contains the Python dataclass representation of the [LinkML metamodel](https://github.com/linkml/linkml-model), and various utility functions that are useful for working with LinkML data and schemas.

-It also includes the [SchemaView](https://linkml.io/linkml/developers/manipulating-schemas.html) class for working with LinkML schemas
+It also includes the [SchemaView](https://linkml.io/linkml/developers/manipulating-schemas.html) class for working with LinkML schemas.

 ## Notebooks

-See the [notebooks](https://github.com/linkml/linkml-runtime/tree/main/notebooks) folder for examples
+See the [notebooks](https://github.com/linkml/linkml-runtime/tree/main/notebooks) folder for examples.
33 changes: 12 additions & 21 deletions linkml_runtime/utils/schemaview.py
@@ -108,25 +108,6 @@ def is_absolute_path(path: str) -> bool:
     drive, tail = os.path.splitdrive(norm_path)
     return bool(drive and tail)

-def _resolve_import(source_sch: str, imported_sch: str) -> str:
-    if os.path.isabs(imported_sch):
-        # Absolute import paths are not modified
-        return imported_sch
-    if urlparse(imported_sch).scheme:
Collaborator Author

urlparse does not check for URL validity, which isn't obvious if you haven't read the docs for the function. A better test here would be to check for a colon (:), which should only occur in well-formed URLs and CURIEs, but not in file system paths.
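
To illustrate the point about urlparse, here is a minimal sketch using only the standard library. The sample strings and the two helper functions are made up for illustration and do not come from the PR.

```python
from urllib.parse import urlparse

# Hypothetical sample values, chosen only to show how urlparse behaves.
samples = [
    "linkml:types",           # CURIE
    "./types",                # relative path
    "C:/schemas/types.yaml",  # path with a drive letter
    "http:???",               # has a scheme but is not a usable URL
]


def has_scheme(value: str) -> bool:
    """Same style of check as the removed code: true whenever urlparse finds a scheme."""
    return bool(urlparse(value).scheme)


def has_colon(value: str) -> bool:
    """The simpler test suggested in the comment above."""
    return ":" in value


for s in samples:
    print(f"{s!r}: scheme={urlparse(s).scheme!r}, has_colon={has_colon(s)}")
```

urlparse only splits the string: it reports a scheme for the CURIE, for the drive-letter path, and for the malformed `http:???` alike. The plain colon test gives the same answers for these samples, so drive-letter paths would still need the absolute-path check that runs before it.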

Collaborator Author

Also -- values of imported_sch starting with file:// or file:/// do not resolve, and http, https, etc. are rejected unless there is a mapping for them in the prefixes section of the schema.

Contributor

plz add failing tests illustrating these cases and then fix :)

Collaborator Author

@ialarmedalien, Apr 1, 2025

It isn't clear to me what the desired behaviour is here, which is the main reason why I didn't do anything. I thought that the point of the prefixes section was so that external resources could be referred to without using URLs - specify the URL there and the importer Does The Right Thing automatically. file:// (or file:///) don't work at all due to bugs in the underlying hbreader package. I don't know whether this is a problem throughout the codebase, but I think I would consider ditching the package and using something more standard if so.

The docs/code are a bit ambiguous as the code enforces prefixes but the docs (or at least the imports slot range, uriorcurie) suggest URLs are OK. The documentation page on imports only talks about CURIEs or local paths, so I don't think that URIs should be allowed.

Collaborator Author

ETA: I've just realised it's a code comment that's leading me astray:

                    # origin schema. Imports can be a URI or Curie, and imports from the same
                    # directory don't require a ./, so if the current (sn) import is a relative

The docs are pretty clear that the field contents should be local paths or CURIEs.

Collaborator Author

@ialarmedalien, Apr 3, 2025

Thanks for asking in slack.

I should emphasise that the implementation assumes that anything curie-like is treated as a prefixed entity with an entry for that prefix in the prefixes section -- i.e. if you have http://example.com/schema and ftp://my-fave-schemas.com/path/to/file in your imports, it expects there to be entries for http and ftp in the prefixes section, and throws an error if there are not.
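
The behaviour described in this comment can be sketched roughly as follows. This is not the library's code: `expand_import` and the `prefixes` dictionary are hypothetical names used only to show why a URL in the imports list ends up needing a prefix entry for its scheme.

```python
def expand_import(imported: str, prefixes: dict[str, str]) -> str:
    """Hypothetical sketch: treat anything containing ':' as prefix:local."""
    if ":" not in imported:
        # No colon: handled as a local path, nothing to expand.
        return imported
    prefix, local = imported.split(":", 1)
    if prefix not in prefixes:
        # A URL such as http://example.com/schema lands here unless the
        # schema declares a prefix literally named "http".
        raise ValueError(f"no entry for prefix {prefix!r} in the prefixes section")
    return prefixes[prefix] + local


prefixes = {"linkml": "https://w3id.org/linkml/"}

print(expand_import("linkml:types", prefixes))
print(expand_import("./types", prefixes))

try:
    expand_import("http://example.com/schema", prefixes)
except ValueError as err:
    print(f"rejected: {err}")
```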

Contributor

Well that's certainly a bug

Collaborator Author

I think we need a core LinkML person to chime in here.

Contributor

I personally would say that is out of scope for this PR; i don't think you need to fix all of URI resolution rn. i think the shortcomings of hbreader are known, and i also think the only feasible way to resolve the problem is a plugin architecture that allows people to handle all the various schemes one might want to use to resolve a URI.

Collaborator Author

@ialarmedalien, Apr 21, 2025

Yeah, it did seem like a bit too much for this PR -- I wasn't sure from your comment whether or not you thought it needed fixing now or whether it was something that could be added to the TODO list.

-        # File with URL schemes are not modified
-        return imported_sch
-
-    if WINDOWS:
-        path = PurePath(os.path.normpath(PurePath(source_sch).parent / imported_sch)).as_posix()
-    else:
-        path = os.path.normpath(str(Path(source_sch).parent / imported_sch))
-
-    if imported_sch.startswith(".") and not path.startswith("."):
-        # Above condition handles cases where both source schema and imported schema are relative paths: these should remain relative
-        return f"./{path}"
Comment on lines -124 to -126
Collaborator Author

@ialarmedalien, Mar 31, 2025

Why?

It looks like this was put in to fudge test results -- that is not a good reason to keep it in.

Contributor

see #368 (comment)

they are functionally equivalent. i mildly prefer having explicit relative path annotations, or some way of telling that these are relative paths/paths at all, but yes string conventions are a weak way of doing that compared to proper typing

Collaborator Author

@ialarmedalien, Apr 1, 2025

I agree with your (linked) comment about wanting the explicit paths -- when I first looked at this code, I was wondering about the possibility of using Path objects but that turned out to be a non-trivial exercise to implement. More than wanting explicit paths, I would prefer everything to be treated uniformly: either keep everything as it is in the source, or convert everything to absolute paths. The previous approach altered some paths but not others, which didn't sit well with a pedant like me!


-    return path
Collaborator Author

summary of logic in this function:

  • if the imported_sch path is absolute or is URI / CURIE-like, leave as-is
  • normalise the imported_sch path, assuming that source_sch and imported_sch are siblings
  • add ./ to the normalised imported_sch path if it originally started with ./ (not really necessary)

==> this can be refactored and simplified
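
One way the first two steps of that summary could be written, as a rough sketch only: `resolve_import` is a hypothetical name, and `posixpath` is swapped in for the `os.path` and `Path` calls of the original so the result is the same on every platform. The third step (re-adding `./`) is deliberately left out.

```python
import posixpath


def resolve_import(source_sch: str, imported_sch: str) -> str:
    """Hypothetical simplification of the logic summarised above."""
    # Step 1: absolute paths and anything URI- or CURIE-like are left as-is.
    if ":" in imported_sch or posixpath.isabs(imported_sch):
        return imported_sch
    # Step 2: normalise relative to the directory of the importing schema.
    parent = posixpath.dirname(source_sch)
    return posixpath.normpath(posixpath.join(parent, imported_sch))


# The layout from the deduplication comment in imports_closure:
print(resolve_import("main", "./subdir/subschema"))   # subdir/subschema
print(resolve_import("subdir/subschema", "./types"))  # subdir/types
print(resolve_import("main", "./types"))              # types
print(resolve_import("main", "linkml:types"))         # linkml:types
```

With this normalisation the two `types` schemas get distinct keys (`types` and `subdir/types`), while `one` and `./one` imported from the same file would map to the same key.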



@dataclass
class SchemaUsage:
@@ -319,14 +300,24 @@ def imports_closure(self, imports: bool = True, traverse: Optional[bool] = None,
                     # path, and the target import doesn't have : (as in a curie or a URI)
                     # we prepend the relative path. This WILL make the key in the `schema_map` not
                     # equal to the literal text specified in the importing schema, but this is
-                    # essential to sensible deduplication: eg. for
+                    # essential to sensible deduplication: e.g. for
                     # - main.yaml (imports ./types.yaml, ./subdir/subschema.yaml)
                     # - types.yaml
                     # - subdir/subschema.yaml (imports ./types.yaml)
                     # - subdir/types.yaml
                     # we should treat the two `types.yaml` as separate schemas from the POV of the
                     # origin schema.
-                    i = _resolve_import(sn, i)
+
+                    # if i is not a CURIE and sn looks like a path with at least one parent folder,
+                    # normalise i with respect to sn
+                    if "/" in sn and ":" not in i:
+                        if WINDOWS:
+                            # This cannot be simplified. os.path.normpath() must be called before .as_posix()
+                            i = PurePath(
+                                os.path.normpath(PurePath(sn).parent / i)
+                            ).as_posix()
+                        else:
+                            i = os.path.normpath(str(Path(sn).parent / i))
                     todo.append(i)

# add item to closure
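
The "cannot be simplified" comment in the Windows branch can be checked on any platform with a small sketch. It uses `PureWindowsPath` and `ntpath`, which are what `PurePath` and `os.path` resolve to on Windows; the values of `sn` and `i` are hypothetical.

```python
import ntpath
from pathlib import PureWindowsPath

sn = "subdir/subschema"  # importing schema (hypothetical value)
i = "../types"           # import as written in that schema (hypothetical value)

joined = PureWindowsPath(sn).parent / i

# pathlib keeps ".." segments, so as_posix() alone does not normalise the path.
without_normpath = joined.as_posix()

# normpath collapses ".." first and returns a backslash-separated string;
# wrapping it in a path object and calling as_posix() restores "/" separators.
with_normpath = PureWindowsPath(ntpath.normpath(joined)).as_posix()

print(without_normpath)  # subdir/../types
print(with_normpath)     # types
```

Calling `as_posix()` first and `normpath()` second would hand forward slashes to the Windows `normpath`, which converts them back to backslashes, so the order in the PR is the one that yields POSIX-style keys.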
@@ -0,0 +1,13 @@
+id: four
+name: import_four
+title: Import Four
+description: |
+  Import loaded by the StepChild class.
+imports:
+  - linkml:types
+classes:
+  Four:
+    attributes:
+      value:
+        range: string
+        ifabsent: "Four"
@@ -0,0 +1,14 @@
+id: one
+name: import_one
+title: Import One
+description: |
+  Import loaded by the StepChild class.
+imports:
+  - linkml:types
+  - two
+classes:
+  One:
+    attributes:
+      value:
+        range: string
+        ifabsent: "One"
@@ -0,0 +1,16 @@
+id: stepchild
+name: stepchild
+title: stepchild
+description: |
+  Child class that imports files in the same directory as itself without consistently using `./` in the link notation.
+imports:
+  - linkml:types
+  - one
+  - two
+  - ./three
+classes:
+  StepChild:
+    attributes:
+      value:
+        range: string
+        ifabsent: "StepChild"
@@ -0,0 +1,14 @@
+id: three
+name: import_three
+title: Import Three
+description: |
+  Import loaded by the StepChild class.
+imports:
+  - linkml:types
+  - ./four
+classes:
+  Three:
+    attributes:
+      value:
+        range: string
+        ifabsent: "Three"
@@ -0,0 +1,13 @@
+id: two
+name: import_two
+title: Import Two
+description: |
+  Import loaded by the StepChild class.
+imports:
+  - linkml:types
+classes:
+  Two:
+    attributes:
+      value:
+        range: string
+        ifabsent: "Two"
@@ -8,10 +8,11 @@ imports:
   - ../../L0_1/cousin
   - ./L2_0_0_0/child
   - ./L2_0_0_1/child
+  - L2_0_0_2/stepchild
 classes:
   Main:
     description: "Our intrepid main class!"
     attributes:
       value:
         range: string
-        ifabsent: "Main"
+        ifabsent: "Main"
10 changes: 8 additions & 2 deletions tests/test_utils/test_schemaview.py
@@ -19,7 +19,7 @@

 SCHEMA_NO_IMPORTS = Path(INPUT_DIR) / 'kitchen_sink_noimports.yaml'
 SCHEMA_WITH_IMPORTS = Path(INPUT_DIR) / 'kitchen_sink.yaml'
-SCHEMA_WITH_STRUCTURED_PATTERNS = Path(INPUT_DIR) / "pattern-example.yaml"
+SCHEMA_WITH_STRUCTURED_PATTERNS = Path(INPUT_DIR) / 'pattern-example.yaml'
 SCHEMA_IMPORT_TREE = Path(INPUT_DIR) / 'imports' / 'main.yaml'
 SCHEMA_RELATIVE_IMPORT_TREE = Path(INPUT_DIR) / 'imports_relative' / 'L0_0' / 'L1_0_0' / 'main.yaml'
 SCHEMA_RELATIVE_IMPORT_TREE2 = Path(INPUT_DIR) / 'imports_relative' / 'L0_2' / 'main.yaml'
@@ -357,7 +357,7 @@ def test_caching():
     view.add_class(ClassDefinition('X'))
     assert len(['X']) == len(view.all_classes())
     view.add_class(ClassDefinition('Y'))
-    assert len(['X', 'Y']) == len(view.all_classes())
+    assert len(['X', 'Y']) == len(view.all_classes())
     # bypass view method and add directly to schema;
     # in general this is not recommended as the cache will
     # not be updated
@@ -546,6 +546,11 @@ def test_imports_relative():
         '../L1_0_1/dupe',
         './L2_0_0_0/child',
         './L2_0_0_1/child',
+        'L2_0_0_2/two',
+        'L2_0_0_2/one',
+        'L2_0_0_2/four',
+        'L2_0_0_2/three',
+        'L2_0_0_2/stepchild',
         'main'
     ]

@@ -716,6 +721,7 @@ def test_slot_inheritance():
     with pytest.raises(ValueError):
         view.slot_ancestors('s5')

+
 def test_attribute_inheritance():
     """
     Tests attribute inheritance edge cases.
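
To see why test_imports_relative now expects the five `L2_0_0_2/...` entries, the traversal can be imitated with a toy closure over an in-memory copy of the new fixtures. This is a simplified sketch, not SchemaView itself: the `IMPORTS` table is a hypothetical stand-in that lists only the stepchild branch of `main`, `posixpath` replaces the platform-specific calls, and no claim is made about the order in which the real method returns its results.

```python
import posixpath

# Hypothetical stand-in for the fixture files: schema key -> imports as written.
IMPORTS = {
    "main": ["L2_0_0_2/stepchild"],
    "L2_0_0_2/stepchild": ["linkml:types", "one", "two", "./three"],
    "L2_0_0_2/one": ["linkml:types", "two"],
    "L2_0_0_2/two": ["linkml:types"],
    "L2_0_0_2/three": ["linkml:types", "./four"],
    "L2_0_0_2/four": ["linkml:types"],
}


def imports_closure(root: str) -> list[str]:
    """Collect every schema reachable from root, resolving imports as the PR does."""
    seen: list[str] = []
    todo = [root]
    while todo:
        sn = todo.pop()
        if sn in seen:
            continue
        seen.append(sn)
        for i in IMPORTS.get(sn, []):
            # Same condition as the new code: i is not a CURIE and sn has a parent folder.
            if "/" in sn and ":" not in i:
                i = posixpath.normpath(posixpath.join(posixpath.dirname(sn), i))
            todo.append(i)
    return seen


print(sorted(imports_closure("main")))
```

Both `one` and `./three` resolve against the folder of the importing schema, so every import of the stepchild branch ends up keyed under `L2_0_0_2/`, with or without a leading `./`.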