My name is Cyril. I am an applied mathematics engineer interested in machine learning, data assimilation, time series and, more broadly, cartographic data. On my GitHub you will find several Python packages, all available on PyPI and conda-forge:
Machine Learning
- optimask: For handling missing data in arrays. It maximizes the amount of valid data available before training a model.
- timefiller: For imputing missing data in a set of correlated time series, or for forecasting with covariates that contain missing data. Efficient and easy to use, built on optimask.
- apyxl: A simple wrapper around xgboost, shap, and hyperopt to produce explainable non-linear regressions in one line of code. apyxl is not intended for production but rather as an aid to understanding or a first approach to a dataset.
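To make the missing-data problem behind optimask concrete, here is a small NumPy-only sketch. It does not use optimask and does not reproduce its algorithm or API. The helper name `naive_nan_free_candidates` and the toy array are assumptions made for illustration. It only shows why choosing which rows and columns to drop matters: the two naive strategies (drop every row with a NaN, or drop every column with a NaN) can both keep fewer cells than a mixed choice.

```python
import numpy as np


def naive_nan_free_candidates(x):
    """Hypothetical helper (not part of optimask).

    Returns the two NaN-free sub-arrays obtained by the naive strategies:
    dropping every row that has a NaN, or dropping every column that has one.
    """
    nan_mask = np.isnan(x)
    rows_ok = ~nan_mask.any(axis=1)  # rows without any NaN
    cols_ok = ~nan_mask.any(axis=0)  # columns without any NaN
    return x[rows_ok, :], x[:, cols_ok]


# Toy 5x4 array with three missing cells.
x = np.arange(20, dtype=float).reshape(5, 4)
x[0, 1] = np.nan
x[3, 1] = np.nan
x[4, 3] = np.nan

by_rows, by_cols = naive_nan_free_candidates(x)

# A mixed choice, picked by hand for this toy example:
# drop row 4 and column 1.
mixed = np.delete(np.delete(x, 4, axis=0), 1, axis=1)

print(by_rows.shape)  # (2, 4): 8 valid cells kept
print(by_cols.shape)  # (5, 2): 10 valid cells kept
print(mixed.shape)    # (4, 3): 12 valid cells kept
```

On this toy array the mixed choice keeps more valid cells than either naive strategy. Finding a good mixed choice automatically on real data is the kind of task a dedicated tool is meant for.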
Large Data
- batchstats: An extension of NumPy for computing statistics on data that is larger than available memory or that arrives in batches.
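As an illustration of the general idea of batch statistics, the sketch below accumulates a per-column mean and variance one batch at a time with plain NumPy, using the standard pairwise update for combining means and sums of squared deviations. It is not batchstats code and does not reflect its API. The class name `RunningMeanVar` and its methods are hypothetical.

```python
import numpy as np


class RunningMeanVar:
    """Hypothetical helper (not part of batchstats).

    Accumulates the per-column mean and variance of 2-D batches
    without ever holding the full dataset in memory.
    """

    def __init__(self):
        self.n = 0        # number of rows seen so far
        self.mean = None  # running per-column mean
        self.m2 = None    # running per-column sum of squared deviations

    def update(self, batch):
        batch = np.asarray(batch, dtype=float)
        b_n = batch.shape[0]
        if b_n == 0:
            return self
        b_mean = batch.mean(axis=0)
        b_m2 = ((batch - b_mean) ** 2).sum(axis=0)
        if self.n == 0:
            self.n, self.mean, self.m2 = b_n, b_mean, b_m2
            return self
        # Merge the statistics of the new batch into the running ones.
        total = self.n + b_n
        delta = b_mean - self.mean
        self.mean = self.mean + delta * b_n / total
        self.m2 = self.m2 + b_m2 + delta**2 * self.n * b_n / total
        self.n = total
        return self

    def variance(self, ddof=0):
        return self.m2 / (self.n - ddof)


# Usage: feed the data in chunks, then compare with the in-memory result.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 3))

stats = RunningMeanVar()
for chunk in np.array_split(data, 7):
    stats.update(chunk)

print(np.allclose(stats.mean, data.mean(axis=0)))
print(np.allclose(stats.variance(), data.var(axis=0)))
```

Only three small arrays are kept between batches, so memory use depends on the number of columns rather than the number of rows.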
Weather Data ☁️
- meteofetch: A client for the Météo-France data available on https://meteo.data.gouv.fr/. It works out of the box, since no API key is required.
- isd-fetch: For retrieving ground-based weather observations from around the globe.
I aim to produce well-written, documented, and easy-to-use open-source packages. Do not hesitate to open an issue if you encounter a bug or difficulty. 🙂