Skip to content

Conversation

@joneubank
Copy link
Contributor

@joneubank joneubank commented Apr 21, 2024

Implement conditional restrictions as described in: #199 Epic: Conditional Restrictions

This feature branch will collect the work for each part of the project as described in the linked Epic, once all parts are available it can be merged as a complete feature into the primary branch.

The outcome of this feature will be to have conditional restrictions available for fields in Lectern Dictionaries, and the script restrictions no longer available. This applies both to the dictionaries maintained by the lectern schema, and the validation implementation provided by the lectern client.

Breakdown

Task Status PR
Remove script restriction In Progress

| Restriction | Used with Field Types | Type | Description | Examples |
| ----------- | ----------------------------- | --------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `codeList` | `integer`, `number`, `string` | Array of type of the field | An array of values of the type matching this field. Data provided for this field must have one of the values in this list. | `["Weak", "Average", "Strong"]` |
| `compare` | all | [ComparedFieldsRule](#comparedfieldsrule-data-structure) object | Enforces that this field has a value based on the provided value in another field. Examples would be to ensure that the two values are not equal, or for numeric values ensure one is greater than the other. | `{ "fields": ["age_at_diagnosis"], "relation": "greaterThanOrEqual" }` Ensure that a field such as `age_at_death` is greater than the provided `age_at_diagnosis` |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@joneubank joneubank Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we want to consistently use camelCase for json properties. for constant values like "greaterThan" we could use upper snake case instead to indicate they are constant values: GREATER_THAN

I might've missed this property in the docs, thanks for pointing it out.

> }
> ```
A `meta` object is available to allow the dictionary creator to add custom properties to the Lectern Dictionary. The `meta` property is available to all Dictionary, Schema, and Field objects. Providing a `meta` value is optional, but if provided it must be a JSON object. There are no restrictions on the field names that can be added to the `meta` object other than they must be valid JSON. The values for meta properties are restricted to being a `string`, `number`, or `boolean` value. This means that meta data cannot be nested or accept arrays.
Copy link

@edsu7 edsu7 Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that meta data cannot be nested or accept arrays.
Is this due to a technical limitation?

For PCGL currently this is the "workaround" ( a string mimicing an array of dictionary objects):
https://github.com/Pan-Canadian-Genome-Library/data-dictionary/blob/71fee8cc4cb2db40ec2e747f297a2ba9d2401e73/schemas/participant.json#L46C11-L46C148

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok good feedback, string parsing is way worse... we were trying to keep this for simple usage and known property names... this shows you want to use it with any number of unknown properties. this can be changed I think.

@edsu7
Copy link

edsu7 commented Apr 24, 2024

Would the following be an appropriate usage:

      {
        "name": "date_of_birth",
        "description": "Indicate participant's date of birth.",
        "valueType": "string",
        "meta": {
          "displayName": "Date Of Birth",
          "mappings": "All4One: Month,year of birth;mCODE STU3: Patient.birthDate;Phenopacket: Individual.date_of_birth;Beacon V2: Individual.date_of_birth",
          "source": "MOH"
        },
        "restrictions": {
          "if" : {
            "conditions": [
              {
                "fields": ["vital_status"],
                "match": {"value": "Deceased"}
              }
            ]
          },
          "then" : {
            "required": true,
            "regex": "#/regex/yyyy_mm"
          },
          "else" : {
            "empty" : true
          }
        }
      }

@joneubank
Copy link
Contributor Author

Would the following be an appropriate usage:

@edsu7 looks semantically correct. The rules for this would be that when vital_status equals Deceased then this field (date_of_birth) must be provided (required) and must match the regex stored in the variable #/regex/yyyy_mm; otherwise (vital_status does not equal Deceased) this field must be empty.

That doesn't make sense for a field titled date_of_birth but assuming you meant a date_of_death then thats perfect.

@edsu7
Copy link

edsu7 commented May 22, 2024

A couple of more examples - to mimic https://github.com/icgc-argo/argo-dictionary/blob/1a9783899c434c54b3585ac41a12f8099621c9eb/references/validationFunctions/chemotherapy/doseReduction.js
https://github.com/icgc-argo/argo-dictionary/blob/1a9783899c434c54b3585ac41a12f8099621c9eb/references/validationFunctions/chemotherapy/drugDose.js

{
  "name": "prescribed_cumulative_drug_dose",
  "description": "Indicate the total prescribed cumulative drug dose in the same units specified in chemotherapy_drug_dose_units.",
  "valueType": "number",
  "meta": {
    "displayName": "Prescribed Cumulative Drug Dose",
    "core": true,
  },
  "restrictions": {
    "if" : {
      "conditions": [
        {
          "fields": ["chemotherapy_drug_dose_units"],
          "match": {"exists":false]}
        },
        {
          "fields": ["actual_cumulative_drug_dose"],
          "match": {"exists":true]}
        }
      ],
      "case":"any"
    },
    "then" : {
      "empty": true
    },
    "else" : {
      "required": true
    }
  }
}

https://github.com/icgc-argo/argo-dictionary/blob/1a9783899c434c54b3585ac41a12f8099621c9eb/references/validationFunctions/biomarker/intervalOrId.js

{
  "name": "submitter_specimen_id",
  "description": "Unique identifier of the specimen, assigned by the data provider.",
  "valueType": "string",
  "meta": {
    "displayName": "Submitter Specimen Id",
    "core": true,
  },
  "restrictions": {
    "if" : {
      "conditions": [
        {
          "fields": ["submitter_primary_diagnosis_id","submitter_treatment_id","submitter_follow_up_id","test_interval"],
          "match": {"exists":true}
        }
      ],
      "case":"any"
    },
    "then" : {
      "empty": true
    },
    "else" : {
      "required": true,
      "regex": "^[A-Za-z0-9\-\._]{1,64}$"
    }
  }
}

@joneubank
Copy link
Contributor Author

joneubank commented May 24, 2024

@edsu7 Replying to your examples. They seem to be mismatched with the links you gave so I'll interpret what you wrote and if that is what you expected then we're good!

  1. If either of your conditions is found, then this field must be empty, otherwise it is required. So... if either chemotherapy_drug_dose_units has no value, OR actual_cumulative_drug_dose has a value, then prescribed_cumulative_drug_dose must not have a value. prescribed_cumulative_drug_dose is required whenever all those conditions are false (chemotherapy_drug_dose_units has a value and actual_cumulative_drug_dose has no value).

  2. This restriction matches the second half of the script linked, the part that checks for the existence of an ID field. As you have written it, this condition is true when ALL the listed fields have a value. So this means if any of those are missing then this condition fails. This is because you have written "case":"any" in the if block, so it applies to the list of conditions, not the fields inside the one condition. It can be updated to match what you want by:

{
  "restrictions": {
    "if" : {
      "conditions": [
        {
          "fields": ["submitter_primary_diagnosis_id","submitter_treatment_id","submitter_follow_up_id","test_interval"],
          "match": {"exists":true},
          "case":"any"
        }
      ],
    },
    "then" : {
      "empty": true
    },
    "else" : {
      "required": true,
      "regex": "^[A-Za-z0-9\-\._]{1,64}$"
    }
  }
}

@joneubank joneubank force-pushed the feat/conditional-restrictions branch from 617eb89 to 6b9b4a5 Compare May 24, 2024 21:26
@joneubank joneubank force-pushed the feat/conditional-restrictions branch from b7a7ad8 to cbe0f7a Compare May 26, 2024 18:28
@joneubank joneubank changed the title Proposed dictionary structure with conditional restrictions Add conditional restrictions to dictionary fields May 26, 2024
@joneubank
Copy link
Contributor Author

joneubank commented Aug 1, 2024

Closing this PR as these changes were made before significant refactoring of the code base. The relevant changes will be replicated into a new feature branch based on the latest develop branch.

The proposed conditional restrictions spec has been committed to the main code base as part of the dictionary reference doc

@joneubank joneubank closed this Aug 1, 2024
@joneubank joneubank deleted the feat/conditional-restrictions branch August 2, 2024 01:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants