Placeholder for discussing support for HCA and FAANG context #95

@henrietteharmse

This ticket is a discussion point for adding support for HCA and FAANG contexts. The suggestions below are meant as a concrete starting point that people can point at and say whether it makes sense or not.

Currently, HCA and FAANG use graph restrictions on some of their fields to limit which ontology terms those fields can be mapped to.

Here is an example from FAANG for their experiments_chip-seq_dna-binding_proteins field:

              "graph_restriction": {
                "ontologies": ["obo:chebi"],
                "classes": ["CHEBI:15358"],
                "relations": ["rdfs:subClassOf"],
                "direct": false,
                "include_self": false
              }

Here is an example from HCA for their cell type field:

            "graph_restriction":  {
                "ontologies" : ["obo:hcao", "obo:cl"],
                "classes": ["CL:0000003"],
                "relations": ["rdfs:subClassOf"],
                "direct": false,
                "include_self": false
            },
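One possible reading of the direct and include_self flags, as a minimal sketch rather than the HCA or FAANG implementation: a term is allowed if it is a descendant of one of the listed classes, where direct limits this to immediate children and include_self also admits the listed class itself. The toy hierarchy, the EX: identifiers and the function names are hypothetical, and the ontologies and relations keys are ignored here (only rdfs:subClassOf is modelled).

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy: term -> direct parents via rdfs:subClassOf.
# These IDs are placeholders, not real ontology content.
PARENTS: Dict[str, List[str]] = {
    "EX:child": ["EX:root"],
    "EX:grandchild": ["EX:child"],
    "EX:other": [],
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Collect all transitive parents of a term."""
    seen: Set[str] = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def satisfies(term: str, restriction: dict, parents: Dict[str, List[str]]) -> bool:
    """Check a term against a graph_restriction (assumed semantics)."""
    roots = set(restriction["classes"])
    # include_self: the listed class itself counts as a valid term.
    if restriction["include_self"] and term in roots:
        return True
    # direct: only immediate children qualify; otherwise any descendant does.
    if restriction["direct"]:
        candidates = set(parents.get(term, []))
    else:
        candidates = ancestors(term, parents)
    return bool(candidates & roots)
```

With a restriction on EX:root, direct false and include_self false, EX:grandchild and EX:child are allowed but EX:root itself is not.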

Currently our project definition looks as follows:

{
  "name": "Project name",                     // MANDATORY
  "description": "Some description",
  "numberOfReviewsRequired": 3,
  "datasources": [
     "atlas",
     "uniprot",
     "gwas",
     ...
  ],
  "ontologies": [
     "efo",
     "mondo",
     "hp",
     "ordo"
  ],
  "preferredMappingOntologies": [ "efo" ]
}

To support HCA and FAANG, we need to add a fields array to our project definition, where each entry names a field and gives the graph restriction that applies to it.
Here is an example for FAANG:

 "fields": [
    {
      "fieldName": "experiments_chip-seq_dna-binding_proteins",
      "graphRestriction": {
        "ontologies": ["obo:chebi"],
        "classes": ["CHEBI:15358"],
        "relations": ["rdfs:subClassOf"],
        "direct": false,
        "includeSelf": false
      }
    },
    {
      "fieldName": "otherField",
      "graphRestriction": {
        ...
      }
    }
 ]

Here is an example for HCA:


 "fields": [
    {
      "fieldName": "cell type",
      "graphRestriction": {
        "ontologies": ["obo:hcao", "obo:cl"],
        "classes": ["CL:0000003"],
        "relations": ["rdfs:subClassOf"],
        "direct": false,
        "includeSelf": false
      }
    },
    {
      "fieldName": "otherField",
      "graphRestriction": {
        ...
      }
    }
 ]
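Since the HCA and FAANG schemas use snake_case keys (graph_restriction, include_self) while the proposed project definition uses camelCase, a small conversion step would probably be needed when importing their restrictions. The following is only a sketch under that assumption; to_field_entry and KEY_MAP are hypothetical names, not existing code.

```python
# Hypothetical mapping from the schema's snake_case keys to the
# camelCase keys proposed for the project definition.
KEY_MAP = {"include_self": "includeSelf"}


def to_field_entry(field_name: str, graph_restriction: dict) -> dict:
    """Build one entry of the proposed "fields" array from a schema restriction."""
    converted = {KEY_MAP.get(key, key): value for key, value in graph_restriction.items()}
    return {"fieldName": field_name, "graphRestriction": converted}


# The HCA cell type restriction quoted earlier in this ticket.
hca_cell_type = {
    "ontologies": ["obo:hcao", "obo:cl"],
    "classes": ["CL:0000003"],
    "relations": ["rdfs:subClassOf"],
    "direct": False,
    "include_self": False,
}

entry = to_field_entry("cell type", hca_cell_type)
```

Keys without an entry in KEY_MAP are copied unchanged, so ontologies, classes, relations and direct pass straight through.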

Currently our upload format looks as follows:

{
  "data": [
    {
      "upstreamId": "ID",    // Optional
      "priority": 3,         // Optional
      "text": "TEXT",        // Mandatory
      "context": "field"     // Optional (if not provided, data points will be auto-assigned to the `default` context)
    }
  ]
}

I do not think our upload file format needs to change, provided the context of each data point names one of the fields defined for that project.
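To make that assumption checkable, an upload could be validated against the project's fields before it is accepted. The sketch below is one way to do it, not a decided behaviour: unknown_contexts is a hypothetical helper, and whether an unknown context should be rejected or silently mapped to default is still open.

```python
from typing import List

DEFAULT_CONTEXT = "default"


def unknown_contexts(project: dict, upload: dict) -> List[str]:
    """Return contexts used in the upload that the project does not define."""
    known = {field["fieldName"] for field in project.get("fields", [])}
    known.add(DEFAULT_CONTEXT)
    unknown: List[str] = []
    for point in upload["data"]:
        # A missing context falls back to the default context.
        context = point.get("context", DEFAULT_CONTEXT)
        if context not in known and context not in unknown:
            unknown.append(context)
    return unknown


project = {"name": "Project name", "fields": [{"fieldName": "cell type"}]}
upload = {
    "data": [
        {"text": "TEXT", "context": "cell type"},
        {"text": "TEXT"},
        {"text": "TEXT", "context": "organ"},
    ]
}
problems = unknown_contexts(project, upload)
```

Here only the third data point is flagged, because "organ" is not among the project's fields.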
