Filter parameter and inconsistent outputs #602
Comments
Propagation is indeed broken; I filed a separate issue for this: #603. I'll investigate why the conjunction of characteristics filters doesn't work. |
This is the issue we talked about today. The first example case above now returns 549. The second example case above now returns... nothing. I suspect the second case is because DOID isn't loaded on dev. |
Popping back to confirm that this issue still appears to exist, albeit in a different form. I am adding up-to-date examples with a list of grievances below, along with the assumptions I am making when expecting the results, in case there is a mismatch there.
Case 1: incomplete inheritance
Case 2: failure to return the overlap of two queries of allCharacteristics.valueUri or allCharacteristics.value when "and" is used
Case 3: duplicated results when querying for IDs
|
Just to be clear, if you submit multiple The and return no result because of how the SQL is generated. I do a join on the characteristics, so having multiple conjunctive clauses on the same attribute will not work. I need to adjust these queries to use subqueries instead of joins. |
None of the examples do this, do they? Or are we talking about using "and"? If it's case 3, those are separate calls. |
I've investigated this, and it turns out that resolving it requires significant work. At the most fundamental level, we cannot express a conjunction on a one-to-many relation with a single join. The conjunction actually turns into a contradiction, c.id = 1 and c.id = 2, which is why you get zero results. Instead, we need to either create one join per clause, c1.id = 1 and c2.id = 2, or use subqueries: id in (select ... join c on c.id = 1) and id in (select ... join c on c.id = 2). I think I will ultimately opt for expressing those using subqueries, since that appears to be the simplest and most flexible approach. This is being looked at in #708. |
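For anyone following along, here is a minimal sketch of the three query shapes described in that comment, using Python's built-in sqlite3. The dataset and characteristic tables, their columns, and the values 'A' and 'B' are made up for illustration; this is not the project's actual schema or generated SQL.

```python
import sqlite3

# Hypothetical schema: one dataset has many characteristics (one-to-many).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset (id INTEGER PRIMARY KEY);
CREATE TABLE characteristic (dataset_id INTEGER, value_uri TEXT);
INSERT INTO dataset (id) VALUES (1), (2), (3);
INSERT INTO characteristic VALUES (1, 'A'), (1, 'B'), (2, 'A'), (3, 'B');
""")

def ids(sql):
    """Run a query and return the sorted first column."""
    return sorted(row[0] for row in conn.execute(sql))

# Single join: both clauses test the SAME joined row, so the
# conjunction is a contradiction and nothing matches.
single_join = ids("""
SELECT DISTINCT d.id FROM dataset d
JOIN characteristic c ON c.dataset_id = d.id
WHERE c.value_uri = 'A' AND c.value_uri = 'B'
""")

# One join per clause: each clause gets its own alias.
join_per_clause = ids("""
SELECT DISTINCT d.id FROM dataset d
JOIN characteristic c1 ON c1.dataset_id = d.id
JOIN characteristic c2 ON c2.dataset_id = d.id
WHERE c1.value_uri = 'A' AND c2.value_uri = 'B'
""")

# Subqueries: each clause becomes an independent membership test.
subqueries = ids("""
SELECT d.id FROM dataset d
WHERE d.id IN (SELECT dataset_id FROM characteristic WHERE value_uri = 'A')
  AND d.id IN (SELECT dataset_id FROM characteristic WHERE value_uri = 'B')
""")

print(single_join, join_per_clause, subqueries)  # [] [1] [1]
```

Only dataset 1 carries both values, so the last two forms return it while the single join returns nothing. The subquery form also avoids the need for DISTINCT, since it never multiplies rows.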
Add support for subqueries in filters (fix #602)
allCharacteristics.valueUri = http://purl.obolibrary.org/obo/MONDO_0005180 and allCharacteristics.valueUri = http://purl.obolibrary.org/obo/UBERON_0000955 now returns the expected 16 intersecting results. |
How curious that results appear duplicated when filtered by ID. I'll investigate that... |
I know what's going on: the ACL entries are joined in the query, but it's lacking a |
That only leaves the propagation. I'll investigate it now. |
The propagation still seems a bit incomplete. Using the annotation/search/datasets endpoint, I ran a search for http://purl.obolibrary.org/obo/UBERON_0000955. This returns 3950 results, whereas the datasets endpoint with a filter returns 1796 results. These calls aren't exactly equivalent, so I examined some of the missing cases. Experiment 18 is one such example. It is annotated with frontal cortex and frontal lobe, both of which should be children of brain (http://purl.obolibrary.org/obo/UBERON_0000955), yet the experiment isn't returned. |
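To make the expectation concrete, here is a rough sketch of what propagation is assumed to mean in this comment: filtering by a term should also match datasets annotated with any of its descendants. The hierarchy below is a toy one, not the real UBERON structure, and every dataset other than experiment 18 is hypothetical. It says nothing about how the server actually implements inference.

```python
from collections import deque

# Toy child lists for illustration only, NOT the real UBERON hierarchy.
CHILDREN = {
    "brain": ["frontal lobe"],
    "frontal lobe": ["frontal cortex"],
}

# Hypothetical annotations; only experiment 18's two terms come from the thread.
ANNOTATIONS = {
    18: {"frontal cortex", "frontal lobe"},
    100: {"brain"},
    101: {"liver"},
}

def descendants(term, children):
    """Return the term plus everything reachable through child links."""
    seen = {term}
    queue = deque([term])
    while queue:
        for child in children.get(queue.popleft(), ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

def filter_by_term(term, annotations, children):
    """IDs annotated with the term or with any of its descendants."""
    wanted = descendants(term, children)
    return sorted(i for i, terms in annotations.items() if terms & wanted)

# Without propagation only the exact annotation matches.
exact_only = sorted(i for i, terms in ANNOTATIONS.items() if "brain" in terms)
propagated = filter_by_term("brain", ANNOTATIONS, CHILDREN)

print(exact_only, propagated)  # [100] [18, 100]
```

Under this reading, a filter on brain that omits experiment 18 is behaving like the exact_only case for at least some child terms.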
These two endpoints should return exactly the same thing. I'm deploying the fix for #729 so you can test it on the dev. Otherwise that means that the inference is behaving differently from prod to dev. This might be expected because we use slimmer ontologies. |
Filter terms separated by "and" should return more and more specialised results as more terms are added, but this doesn't seem to be the case, for reasons unknown. I am using the presence of dataset 549 as the indicator in the examples below, but the results are consistent with examining the entire output via offset.
Case 1
Case 2
Looking at the filter term in the outputs, dataset 549 doesn't have the term UBERON_0000955 or any of the associated terms that it propagates to. The closest thing it has is UBERON_0002038, which is a region of the brain. It does have the other term, DOID_14330.