-
-
Notifications
You must be signed in to change notification settings - Fork 21
MDIM/Explorer config harmonization #4032
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Partly addressing this in #4030 |
Let us know when you plan to move files or a big restructure, please. There are some open PRs for mdims and rebasing them could get nasty. |
I've merged my changes from #4030, which have put in place the space of I'm resolving conflicts now: |
Hey @lucasrodes thanks a lot for starting this. For this specific issue, is there anything left to do, or do you want to consider it a tracking issue? |
Hey Pablo! So this issue is rather high-level, so probably belongs to the "tracking realm". I'll keep an updated list of related PRs/issues in the description. |
Thanks @lucasrodes, I see some overlap with: Feel free to integrate this into one of those already existing issues, to avoid too much dispersion. |
@pabloarosado The description of #3969 already points readers to #3992, so I'll go ahead and close it. On #3992, I think it is more general than this one, since it also considers things like: update workflow, wizard app, csv-to-etl migration of explorers, etc. This issue instead is focussed on harmonizing the how we handle configs of ETL-based explorers and mdims in ETL, so they almost feel the same. Note that this issue is actually mentioned in the description of #3992 (point 2). I've edited a bit the description of this issue to be a bit more explicit. |
I've finished the first iteration of harmonizing the tooling for Explorers and MDIMs in #4035. COVID explorer and mdims now use this new logic. Find below a more detailed description of the improvements from this work. Model summaryI've abstracted config logic from Explorers and MDIMs, and put a model in place (very similar to what we have in Diagram in
|
for fname in filenames: | |
paths.log.info(fname) | |
config = paths.load_mdim_config(fname) | |
multidim.upsert_multidim_data_page( | |
config=config, | |
paths=paths, | |
mdim_name=fname_to_mdim_name(fname), | |
) |
Explorer
create_dataset
encapsulates logic on validation, processing and uploading to owid-content
. In the future, we should probably have a upsert_explorer
method.
etl/etl/steps/export/explorers/covid/latest/covid.py
Lines 33 to 46 in cbaefcd
def run(dest_dir: str) -> None: | |
# | |
# Load inputs. | |
# | |
# Load grapher config from YAML | |
config = paths.load_explorer_config() | |
# Create explorer | |
ds_explorer = create_explorer( | |
dest_dir=dest_dir, | |
config=config, | |
paths=paths, | |
tolerate_extra_indicators=True, | |
) |
Future work
- We need a schema for Explorers, to validate the YAML config, like we do with MDIMs.
- The config of explorers is still specific to explorers (minor differences now); I think we should try to modify it and bring it closer to MDIMs.
- At the moment the source of truth for the "schema" of an explorer view seems to live in a TypeScript file. It should have its own JSON file!
- We need a place in the DB for explorers (connected to above's point of defining a schema).
- Test this changes by migrating some explorers to use this model.
- Add some docs as it becomes clearer.
- Wizard templating.
- Test other kinds of explorers: code/yaml combinations.
Thanks @lucasrodes, this is looking really good! One idea to make the workflow of explorers/mdims a bit more straightforward (more similar to any other ETL data step) would be to adapt |
Recent work to this space: |
Currently, we have tooling for MDIMs and ETL-
export
Explorers inetl.multidim
andetl.explorer
. However, these two modules could share some logic. The current structure is not very suitable for this.Generally, this tooling is used for handling config files, which ideally should be very close to one another.
Proposal
etl.collections
. In it, we have:etl.collections.base
(preliminar name): which contains common logic.etl.collections.utils
(preliminar name): which contains smaller util functions.etl.collections.multidim
: MDIM-specific logicetl.collections.explorer
: Explorer-specific logic.Others:
paths.load_mdim_config
, which returns a plain dictionary. This function is inetl.helpers
, which can't really import frometl.collections.multidim
, since this latter imports the former already.Related work
create_mdim
and.save
(alignment with data steps) #4106The text was updated successfully, but these errors were encountered: