
Commit

Merge branch 'release/3.1.0'
dermatologist committed Nov 14, 2023
2 parents 9355247 + a57876a commit 7de89d4
Showing 15 changed files with 101 additions and 75 deletions.
2 changes: 1 addition & 1 deletion .devcontainer/Dockerfile
@@ -1,5 +1,5 @@
# [Choice] Python version: 3, 3.9, 3.8, 3.7, 3.6
-ARG VARIANT=3.8
+ARG VARIANT="3.10"
FROM mcr.microsoft.com/vscode/devcontainers/python:${VARIANT}

# [Option] Install Node.js
2 changes: 1 addition & 1 deletion .devcontainer/devcontainer.json
@@ -5,7 +5,7 @@
"context": "..",
"args": {
// Update 'VARIANT' to pick a Python version: 3, 3.6, 3.7, 3.8, 3.9
-"VARIANT": "3.8",
+"VARIANT": "3.10",
// Options
"INSTALL_NODE": "false",
"NODE_VERSION": "lts/*"
4 changes: 2 additions & 2 deletions .github/workflows/docs.yml
@@ -12,9 +12,9 @@ jobs:
steps:
- uses: actions/checkout@v3
- name: Set up Python
-uses: actions/setup-python@v3
+uses: actions/setup-python@v4.1.0
with:
-python-version: '3.8'
+python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
4 changes: 2 additions & 2 deletions .github/workflows/publish.yml
@@ -12,9 +12,9 @@ jobs:
steps:
- uses: actions/checkout@v3
- name: Set up Python
-uses: actions/setup-python@v3
+uses: actions/setup-python@v4.1.0
with:
-python-version: '3.8'
+python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
4 changes: 2 additions & 2 deletions .github/workflows/pytest.yml
@@ -16,12 +16,12 @@ jobs:
strategy:
max-parallel: 4
matrix:
-python-version: [3.8]
+python-version: [3.10.13]

steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
-uses: actions/setup-python@v3
+uses: actions/setup-python@v4.1.0
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
6 changes: 3 additions & 3 deletions .github/workflows/tox.yml
@@ -1,4 +1,4 @@
-name: Python Test
+name: Tox Test

on:
push:
@@ -12,12 +12,12 @@ jobs:
strategy:
max-parallel: 4
matrix:
-python-version: [3.8]
+python-version: [3.10.13]

steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
-uses: actions/setup-python@v3
+uses: actions/setup-python@v4.1.0
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
7 changes: 6 additions & 1 deletion .readthedocs.yml
@@ -16,7 +16,12 @@ sphinx:
formats:
- pdf

+build:
+  os: ubuntu-22.04
+  tools:
+    python: "3.11"
+
python:
-  version: 3.8
install:
- requirements: docs/requirements.txt
- {path: ., method: pip}
15 changes: 14 additions & 1 deletion CHANGELOG.md
@@ -2,7 +2,20 @@

## [Unreleased](https://github.com/dermatologist/fhiry/tree/HEAD)

-[Full Changelog](https://github.com/dermatologist/fhiry/compare/2.1.0...HEAD)
+[Full Changelog](https://github.com/dermatologist/fhiry/compare/3.0.0...HEAD)

**Implemented enhancements:**

- Flattening FHIR resources / bundle for LLMs [\#144](https://github.com/dermatologist/fhiry/issues/144)

**Closed issues:**

- Performance warning: DataFrame is highly fragmented [\#135](https://github.com/dermatologist/fhiry/issues/135)
- 'charmap' codec can't decode byte 0x81 in position 1603 [\#133](https://github.com/dermatologist/fhiry/issues/133)

## [3.0.0](https://github.com/dermatologist/fhiry/tree/3.0.0) (2023-03-09)

[Full Changelog](https://github.com/dermatologist/fhiry/compare/2.1.0...3.0.0)

**Implemented enhancements:**

8 changes: 5 additions & 3 deletions README.md
@@ -5,14 +5,16 @@ Virtual flattened view of *FHIR Bundle / ndjson / FHIR server / BigQuery!*
[![PyPI download total](https://img.shields.io/pypi/dm/fhiry.svg)](https://pypi.python.org/pypi/fhiry/)
![GitHub tag (latest by date)](https://img.shields.io/github/v/tag/dermatologist/fhiry)

-[Bulk data export using FHIR](https://hl7.org/fhir/uv/bulkdata/export/index.html) is needed to export a cohort for data analytics or machine learning.
-:fire: **Fhiry** is a [python](https://www.python.org/) package to facilitate this by converting a folder of [FHIR bundles](https://www.hl7.org/fhir/bundle.html)/ndjson into a [pandas](https://pandas.pydata.org/docs/user_guide/index.html) data frame for analysis and importing
-into ML packages such as Tensorflow and PyTorch. Fhiry also supports FHIR server search and FHIR tables on BigQuery.
+:fire: **FHIRy** is a [python](https://www.python.org/) package to facilitate health data analytics and machine learning by converting a folder of [FHIR bundles](https://www.hl7.org/fhir/bundle.html)/ndjson from [bulk data export](https://hl7.org/fhir/uv/bulkdata/export/index.html) into a [pandas](https://pandas.pydata.org/docs/user_guide/index.html) data frame for analysis. You can import the dataframe
+into ML packages such as Tensorflow and PyTorch. **FHIRy also supports FHIR server search and FHIR tables on BigQuery.**

Test this with the [synthea sample](https://synthea.mitre.org/downloads) or the downloaded ndjson from the [SMART Bulk data server](https://bulk-data.smarthealthit.org/). Use the 'Discussions' tab above for feature requests.

:sparkles: Checkout [this template](https://github.com/dermatologist/kedro-multimodal) for Multimodal machine learning in healthcare!

:fire: Checkout [MedPrompt](https://github.com/dermatologist/medprompt) for Medical LLM prompts, including FHIR related prompts, such as text-to-FHIRQuery mapper!


## Installation

### Stable
24 changes: 11 additions & 13 deletions dev-requirements.txt
@@ -1,5 +1,5 @@
#
-# This file is autogenerated by pip-compile with Python 3.8
+# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
# pip-compile dev-requirements.in
@@ -12,18 +12,20 @@ babel==2.9.1
# via sphinx
backports-entry-points-selectable==1.1.0
# via virtualenv
-certifi==2022.12.7
+certifi==2023.7.22
# via
# -c requirements.txt
# requests
-charset-normalizer==3.1.0
+charset-normalizer==3.3.2
# via
# -c requirements.txt
# requests
commonmark==0.9.1
# via recommonmark
coverage[toml]==5.5
-# via pytest-cov
+# via
+# coverage
+# pytest-cov
distlib==0.3.2
# via virtualenv
docutils==0.17.1
@@ -40,15 +42,13 @@ idna==3.4
# requests
imagesize==1.2.0
# via sphinx
-importlib-metadata==5.1.0
-# via sphinx
iniconfig==1.1.1
# via pytest
jinja2==3.0.1
# via sphinx
markupsafe==2.0.1
# via jinja2
-packaging==23.0
+packaging==23.2
# via
# -c requirements.txt
# pytest
@@ -73,13 +73,13 @@ pytest==7.1.2
# pytest-cov
pytest-cov==3.0.0
# via -r dev-requirements.in
-pytz==2022.7.1
+pytz==2023.3.post1
# via
# -c requirements.txt
# babel
recommonmark==0.7.1
# via -r dev-requirements.in
-requests==2.28.2
+requests==2.31.0
# via
# -c requirements.txt
# responses
@@ -124,17 +124,15 @@ tox==3.25.0
# via -r dev-requirements.in
types-toml==0.10.8.1
# via responses
-urllib3==1.26.14
+urllib3==2.1.0
# via
# -c requirements.txt
# requests
# responses
virtualenv==20.8.0
# via tox
-wheel==0.37.1
+wheel==0.41.0
# via -r dev-requirements.in
-zipp==3.11.0
-# via importlib-metadata

# The following packages are considered to be unsafe in a requirements file:
# setuptools
57 changes: 29 additions & 28 deletions requirements.txt
@@ -1,93 +1,94 @@
#
-# This file is autogenerated by pip-compile with Python 3.8
+# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
# pip-compile
#
-cachetools==5.3.0
+cachetools==5.3.2
# via google-auth
-certifi==2022.12.7
+certifi==2023.7.22
# via requests
-charset-normalizer==3.1.0
+charset-normalizer==3.3.2
# via requests
-db-dtypes==1.0.5
+db-dtypes==1.1.1
# via fhiry (setup.py)
-google-api-core[grpc]==2.11.0
+google-api-core[grpc]==2.14.0
# via
# google-api-core
# google-cloud-bigquery
# google-cloud-core
-google-auth==2.16.2
+google-auth==2.23.4
# via
# google-api-core
# google-cloud-core
-google-cloud-bigquery==3.6.0
+google-cloud-bigquery==3.13.0
# via fhiry (setup.py)
-google-cloud-core==2.3.2
+google-cloud-core==2.3.3
# via google-cloud-bigquery
google-crc32c==1.5.0
# via google-resumable-media
-google-resumable-media==2.4.1
+google-resumable-media==2.6.0
# via google-cloud-bigquery
-googleapis-common-protos==1.58.0
+googleapis-common-protos==1.61.0
# via
# google-api-core
# grpcio-status
-grpcio==1.51.3
+grpcio==1.59.2
# via
# google-api-core
# google-cloud-bigquery
# grpcio-status
-grpcio-status==1.51.3
+grpcio-status==1.59.2
# via google-api-core
idna==3.4
# via requests
-numpy==1.24.2
+numpy==1.26.2
# via
# db-dtypes
# pandas
# pyarrow
-packaging==23.0
+packaging==23.2
# via
# db-dtypes
# google-cloud-bigquery
-pandas==1.5.3
+pandas==2.1.3
# via
# db-dtypes
# fhiry (setup.py)
-proto-plus==1.22.2
+proto-plus==1.22.3
# via google-cloud-bigquery
-protobuf==4.22.1
+protobuf==4.25.0
# via
# google-api-core
# google-cloud-bigquery
# googleapis-common-protos
# grpcio-status
# proto-plus
-pyarrow==11.0.0
+pyarrow==14.0.1
# via db-dtypes
-pyasn1==0.4.8
+pyasn1==0.5.0
# via
# pyasn1-modules
# rsa
-pyasn1-modules==0.2.8
+pyasn1-modules==0.3.0
# via google-auth
python-dateutil==2.8.2
# via
# google-cloud-bigquery
# pandas
-pytz==2022.7.1
+pytz==2023.3.post1
# via pandas
-requests==2.28.2
+requests==2.31.0
# via
# google-api-core
# google-cloud-bigquery
rsa==4.9
# via google-auth
six==1.16.0
-# via
-# google-auth
-# python-dateutil
-tqdm==4.65.0
+# via python-dateutil
+tqdm==4.66.1
# via fhiry (setup.py)
-urllib3==1.26.14
+tzdata==2023.3
+# via pandas
+urllib3==2.1.0
# via requests
2 changes: 2 additions & 0 deletions setup.cfg
@@ -32,6 +32,8 @@ classifiers =
Operating System :: OS Independent
Programming Language :: Python
Programming Language :: Python :: 3.8
Programming Language :: Python :: 3.9
Programming Language :: Python :: 3.10
Topic :: Scientific/Engineering :: Information Analysis


9 changes: 7 additions & 2 deletions src/fhiry/base_fhiry.py
@@ -93,12 +93,17 @@ def add_patient_id(self):
        """Create a patientId column with the resource.id if a Patient resource or with the resource.subject.reference if other resource type
        """
        try:
-            self._df['patientId'] = self._df.apply(lambda x: x['resource.id'] if x['resource.resourceType']
+            # PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
+            newframe = self._df.copy()
+            newframe['patientId'] = self._df.apply(lambda x: x['resource.id'] if x['resource.resourceType']
                                                   == 'Patient' else self.check_subject_reference(x), axis=1)
+            self._df = newframe
        except:
            try:
-                self._df['patientId'] = self._df.apply(lambda x: x['id'] if x['resourceType']
+                newframe = self._df.copy()
+                newframe['patientId'] = self._df.apply(lambda x: x['id'] if x['resourceType']
                                                       == 'Patient' else self.check_subject_reference(x), axis=1)
+                self._df = newframe
            except:
                pass
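The change above sidesteps pandas' `PerformanceWarning` about highly fragmented DataFrames by assigning the new column on a consolidated copy. A minimal sketch of the same copy-then-assign pattern (the `check_subject_reference` branch is replaced here with a placeholder `None`, since that helper lives in the fhiry class):

```python
import pandas as pd

# A flattened-bundle-like frame, similar to what pd.json_normalize produces.
df = pd.DataFrame({
    "resource.resourceType": ["Patient", "Observation"],
    "resource.id": ["p1", "o1"],
})

# Copying first consolidates the frame's internal blocks, so the column
# assignment below does not trigger the fragmentation warning that
# repeated insert-style assignments on a wide frame can raise.
newframe = df.copy()
newframe["patientId"] = df.apply(
    lambda x: x["resource.id"]
    if x["resource.resourceType"] == "Patient"
    else None,  # the real code calls self.check_subject_reference(x) here
    axis=1,
)
df = newframe
print(df["patientId"].tolist())  # ['p1', None]
```

The warning only surfaces on frames with many columns, but the copy is cheap insurance either way.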

2 changes: 1 addition & 1 deletion src/fhiry/fhiry.py
@@ -50,7 +50,7 @@ def delete_col_raw_coding(self, delete_col_raw_coding):
self._delete_col_raw_coding = delete_col_raw_coding

    def read_bundle_from_file(self, filename):
-        with open(filename, 'r') as f:
+        with open(filename, encoding='utf8', mode='r') as f:
            json_in = f.read()
            json_in = json.loads(json_in)
            return pd.json_normalize(json_in['entry'])
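Pinning the encoding to UTF-8 addresses the `'charmap' codec can't decode byte 0x81` error (issue #133), which occurs when the platform's default codec (e.g. cp1252 on Windows) is not UTF-8. A small sketch of the idea, using a throwaway bundle file:

```python
import json
import os
import tempfile

# Write a minimal bundle containing a non-ASCII name; a legacy 8-bit
# default codec could fail to decode such bytes on read.
bundle = {"entry": [{"resource": {"resourceType": "Patient", "name": "Łukasz"}}]}
path = os.path.join(tempfile.mkdtemp(), "bundle.json")
with open(path, "w", encoding="utf8") as f:
    json.dump(bundle, f, ensure_ascii=False)

# Passing encoding explicitly, as the commit does, makes the read
# independent of the platform's default encoding.
with open(path, encoding="utf8", mode="r") as f:
    json_in = json.loads(f.read())

print(json_in["entry"][0]["resource"]["name"])  # Łukasz
```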
