Commit d77a965: applying pre-commit
1 parent 1018383 commit d77a965

36 files changed: +388 -101 lines
.pre-commit-config.yaml (new file, +104)
```yaml
default_language_version:
  python: python3

ci:
  autofix_commit_msg: |
    [pre-commit.ci] auto fixes from pre-commit.com hooks
  autofix_prs: true
  autoupdate_branch: "master"
  autoupdate_commit_msg: "[pre-commit.ci] pre-commit autoupdate"
  autoupdate_schedule: quarterly
  skip: []
  submodules: false

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: check-yaml
      - id: check-json
      - id: check-executables-have-shebangs
      - id: check-toml
      - id: check-docstring-first
      - id: check-added-large-files
      - id: end-of-file-fixer
      - id: trailing-whitespace
      - id: check-case-conflict
      - id: mixed-line-ending
      - id: forbid-new-submodules
      - id: pretty-format-json
        args: ["--autofix", "--no-sort-keys", "--indent=4"]

  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.9.7
    hooks:
      - id: ruff
        name: ruff lint docs & examples
        args:
          [
            "--fix",
            "--select=E,W,F,I,D", # E=errors, W=warnings, F=pyflakes, I=import sorting, D=docstring rules
            "--ignore=E402,E501,F401,D103,D400,D100,D101,D102,D105,D107,D415,D417,D205", # combined ignores
          ]
        files: ^(docs|examples)/
      - id: ruff
        name: ruff lint braindecode preview
        args:
          [
            "--fix",
            "--preview",
            "--select=NPY201",
            "--ignore=D100,D101,D102,D105,D107,D415,D417,D205",
          ]
        files: ^braindecode/
      - id: ruff
        name: ruff lint docs & examples docstrings
        args:
          [
            "--fix",
            "--select=D", # docstring rules
            "--ignore=D103,D400,E402,D100,D101,D102,D105,D107,D415,D417,D205", # drop these specific checks
          ]
        files: ^(docs|examples)/
      - id: ruff-format
        name: ruff format code
        files: ^(braindecode|docs|examples)/

  - repo: https://github.com/codespell-project/codespell
    rev: v2.4.1
    hooks:
      - id: codespell
        args:
          - --ignore-words-list=carin,splitted,meaned,wil,whats,additionals,alle,alot,bund,currenty,datas,farenheit,falsy,fo,haa,hass,iif,incomfort,ines,ist,nam,nd,pres,pullrequests,resset,rime,ser,serie,te,technik,ue,unsecure,withing,zar,mane,THIRDPARTY
          - --skip="./.*,*.csv,*.json,*.ambr,*.toml"
          - --quiet-level=2
        exclude_types: [csv, json]
        exclude: ^(tests/|generated/|\.github)

  - repo: https://github.com/asottile/blacken-docs
    rev: 1.19.1
    hooks:
      - id: blacken-docs
        exclude: ^.github|CONTRIBUTING.md

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.15.0
    hooks:
      - id: mypy
        files: braindecode/.*\.py|test/.*\.py|examples/.*\.py
        additional_dependencies: [types-PyYAML, types-requests]
        verbose: true

  - repo: https://github.com/PyCQA/isort
    rev: 6.0.1
    hooks:
      - id: isort
        exclude: ^\.gitignore
```
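These hooks can be reproduced locally before pushing. A minimal sketch, assuming `pre-commit` is installed in the active environment; the two CLI commands are the standard ones, merely wrapped in Python here for illustration:

```python
import subprocess

# Register the git hook once, then run every configured hook against all files,
# which is effectively what this commit applied.
subprocess.run(["pre-commit", "install"], check=True)
subprocess.run(["pre-commit", "run", "--all-files"], check=True)
```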

DevNotes.md (+1 -1)
```diff
@@ -5,5 +5,5 @@ pip install -r requirements.txt
 ## signalstore mongodb
 - Check args functions to double check input to db query
 - Create_index to the collection once its created to speed up querying
-- `find` has deserialization to convert timestamp to correct milisecond format and json_schema from bytes to dict
+- `find` has deserialization to convert timestamp to correct millisecond format and json_schema from bytes to dict
 - `add` has serialization before insert into db
```
File renamed without changes.
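The signalstore notes above map naturally onto `pymongo` calls. A rough sketch, using a hypothetical `signalstore.records` collection and hypothetical record fields (`data_name`, `timestamp`, `json_schema`) that are not taken from the actual codebase:

```python
import json

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # hypothetical connection string
coll = client["signalstore"]["records"]  # hypothetical database/collection names

# Create the index once the collection exists, to speed up querying.
coll.create_index("data_name")


def find(query: dict) -> list[dict]:
    """Deserialize on read: normalize timestamps, decode json_schema bytes -> dict."""
    results = []
    for doc in coll.find(query):
        doc["timestamp"] = int(doc["timestamp"])  # coerce to millisecond integer
        if isinstance(doc.get("json_schema"), bytes):
            doc["json_schema"] = json.loads(doc["json_schema"])
        results.append(doc)
    return results


def add(record: dict) -> None:
    """Serialize before insert, mirroring the `add` note."""
    record = {**record, "json_schema": json.dumps(record["json_schema"]).encode()}
    coll.insert_one(record)
```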

README.md (+8 -2)
````diff
@@ -39,7 +39,10 @@ To use the data from a single subject, enter:
 
 ```python
 from eegdash import EEGDashDataset
-ds_NDARDB033FW5 = EEGDashDataset({'dataset': 'ds005514', 'task': 'RestingState', 'subject': 'NDARDB033FW5'})
+
+ds_NDARDB033FW5 = EEGDashDataset(
+    {"dataset": "ds005514", "task": "RestingState", "subject": "NDARDB033FW5"}
+)
 ```
 
 This will search and download the metadata for the task **RestingState** for subject **NDARDB033FW5** in BIDS dataset **ds005514**. The actual data will not be downloaded at this stage. Following standard practice, data is only downloaded once it is processed. The **ds_NDARDB033FW5** object is a fully functional BrainDecode dataset, which is itself a PyTorch dataset. This [tutorial](https://github.com/sccn/EEGDash/blob/develop/notebooks/tutorial_eoec.ipynb) shows how to preprocess the EEG data, extracting portions of the data containing eyes-open and eyes-closed segments, then perform eyes-open vs. eyes-closed classification using a (shallow) deep-learning model.
@@ -48,7 +51,10 @@ To use the data from multiple subjects, enter:
 
 ```python
 from eegdash import EEGDashDataset
-ds_ds005505rest = EEGDashDataset({'dataset': 'ds005505', 'task': 'RestingState'}, target_name='sex')
+
+ds_ds005505rest = EEGDashDataset(
+    {"dataset": "ds005505", "task": "RestingState"}, target_name="sex"
+)
 ```
 
 This will search and download the metadata for the task 'RestingState' for all subjects in BIDS dataset 'ds005505' (a total of 136). As above, the actual data will not be downloaded at this stage so this command is quick to execute. Also, the target class for each subject is assigned using the target_name parameter. This means that this object is ready to be directly fed to a deep learning model, although the [tutorial script](https://github.com/sccn/EEGDash/blob/develop/notebooks/tutorial_sex_classification.ipynb) performs minimal processing on it, prior to training a deep-learning model. Because 14 gigabytes of data are downloaded, this tutorial takes about 10 minutes to execute.
````
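Because the returned object is a PyTorch dataset, it can be batched with a standard `DataLoader` once the recordings are cut into windows. A minimal sketch, assuming braindecode's windowing utilities and hypothetical window sizes (none of this is part of the commit):

```python
from braindecode.preprocessing import create_fixed_length_windows
from torch.utils.data import DataLoader

# Cut each recording into fixed-length windows (the sizes here are made up).
windows_ds = create_fixed_length_windows(
    ds_ds005505rest,
    window_size_samples=500,
    window_stride_samples=500,
    drop_last_window=True,
)

# Standard PyTorch batching and shuffling over (X, y, ind) window tuples.
loader = DataLoader(windows_ds, batch_size=64, shuffle=True)
X, y, _ = next(iter(loader))  # X: (batch, channels, time); y: the 'sex' targets
```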

docs/conf.py (+10 -10)
```diff
@@ -6,26 +6,26 @@
 # -- Project information -----------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
 
-project = 'EEGDash'
-copyright = '2025, Arnaud Delorme, Dung Truong'
-author = 'Arnaud Delorme, Dung Truong'
+project = "EEGDash"
+copyright = "2025, Arnaud Delorme, Dung Truong"
+author = "Arnaud Delorme, Dung Truong"
 
 # -- General configuration ---------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
 
-extensions = ['sphinx.ext.autodoc']
-
-templates_path = ['_templates']
-exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
+extensions = ["sphinx.ext.autodoc"]
 
+templates_path = ["_templates"]
+exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
 
 
 # -- Options for HTML output -------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
 
-html_theme = 'alabaster'
-html_static_path = ['_static']
+html_theme = "alabaster"
+html_static_path = ["_static"]
 
 import os
 import sys
-sys.path.insert(0, os.path.abspath('..'))
+
+sys.path.insert(0, os.path.abspath(".."))
```

docs/convert_xls_2_martkdown.py (+12 -7)
```diff
@@ -1,29 +1,34 @@
 #!/usr/bin/env python3
 import sys
+
 import pandas as pd
 
+
 def excel_to_markdown(filename, sheet_name=0):
     # Read the specified sheet from the Excel file
     df = pd.read_excel(filename, sheet_name=sheet_name)
-
+
     # Convert dataset IDs into Markdown links
     # Format: [dataset_id](https://nemar.org/dataexplorer/detail?dataset_id=dataset_id)
-    df['DatasetID'] = df['DatasetID'].astype(str).apply(
-        lambda x: f"[{x}](https://nemar.org/dataexplorer/detail?dataset_id={x})"
+    df["DatasetID"] = (
+        df["DatasetID"]
+        .astype(str)
+        .apply(lambda x: f"[{x}](https://nemar.org/dataexplorer/detail?dataset_id={x})")
     )
-
+
     # Replace "Schizophrenia/Psychosis" with "Psychosis" in the entire DataFrame
     df = df.replace("Schizophrenia/Psychosis", "Psychosis")
-
+
     # Convert the DataFrame to a Markdown table (excluding the index)
     markdown = df.to_markdown(index=False)
     return markdown
 
-if __name__ == '__main__':
+
+if __name__ == "__main__":
     if len(sys.argv) < 2:
         print("Usage: python script.py <excel_filename> [sheet_name]")
         sys.exit(1)
-
+
     excel_filename = sys.argv[1]
     sheet = sys.argv[2] if len(sys.argv) > 2 else 0
 
```

eegdash.egg-info/PKG-INFO (new file, +124)
````text
Metadata-Version: 2.4
Name: eegdash
Version: 0.0.8
Summary: EEG data for machine learning
Author-email: Young Truong <[email protected]>, Arnaud Delorme <[email protected]>
License: GNU General Public License

    Copyright (C) 2024-2025

    Young Truong, UCSD, [email protected]
    Arnaud Delorme, UCSD, [email protected]

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Project-URL: Homepage, https://github.com/sccn/EEG-Dash-Data
Project-URL: Issues, https://github.com/sccn/EEG-Dash-Data/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: xarray
Requires-Dist: python-dotenv
Requires-Dist: s3fs
Requires-Dist: mne
Requires-Dist: pynwb
Requires-Dist: h5py
Requires-Dist: pymongo
Requires-Dist: joblib
Requires-Dist: braindecode
Requires-Dist: mne-bids
Requires-Dist: pybids
Requires-Dist: pymatreader
Requires-Dist: pyarrow
Requires-Dist: tqdm
Requires-Dist: numba
Requires-Dist: pre-commit
Dynamic: license-file

# EEG-Dash
To leverage recent and ongoing advancements in large-scale computational methods and to ensure the preservation of scientific data generated from publicly funded research, the EEG-DaSh data archive will create a data-sharing resource for MEEG (EEG, MEG) data contributed by collaborators for machine learning (ML) and deep learning (DL) applications.

## Data source
The data in EEG-DaSh originates from a collaboration involving 25 laboratories, encompassing 27,053 participants. This extensive collection includes MEEG data, a combination of EEG and MEG signals. The data is sourced from various studies conducted by these labs, involving both healthy subjects and clinical populations with conditions such as ADHD, depression, schizophrenia, dementia, autism, and psychosis. The data also spans different mental states, such as sleep, meditation, and cognitive tasks. In addition, EEG-DaSh will incorporate a subset of the data converted from NEMAR, which includes 330 MEEG BIDS-formatted datasets, further expanding the archive with well-curated, standardized neuroelectromagnetic data.

## Featured data

The following HBN datasets are currently featured on EEGDash. Documentation about these datasets is available [here](https://neuromechanist.github.io/data/hbn/).

| DatasetID | Participants | Files | Sessions | Population | Channels | Is 10-20? | Modality | Size |
|---|---|---|---|---|---|---|---|---|
| [ds005505](https://nemar.org/dataexplorer/detail?dataset_id=ds005505) | 136 | 5393 | 1 | Healthy | 129 | other | Visual | 103 GB |
| [ds005506](https://nemar.org/dataexplorer/detail?dataset_id=ds005506) | 150 | 5645 | 1 | Healthy | 129 | other | Visual | 112 GB |
| [ds005507](https://nemar.org/dataexplorer/detail?dataset_id=ds005507) | 184 | 7273 | 1 | Healthy | 129 | other | Visual | 140 GB |
| [ds005508](https://nemar.org/dataexplorer/detail?dataset_id=ds005508) | 324 | 13393 | 1 | Healthy | 129 | other | Visual | 230 GB |
| [ds005510](https://nemar.org/dataexplorer/detail?dataset_id=ds005510) | 135 | 4933 | 1 | Healthy | 129 | other | Visual | 91 GB |
| [ds005512](https://nemar.org/dataexplorer/detail?dataset_id=ds005512) | 257 | 9305 | 1 | Healthy | 129 | other | Visual | 157 GB |
| [ds005514](https://nemar.org/dataexplorer/detail?dataset_id=ds005514) | 295 | 11565 | 1 | Healthy | 129 | other | Visual | 185 GB |

A total of [246 other datasets](datasets.md) are also available through EEGDash.

## Data format
EEGDash queries return a **PyTorch Dataset** formatted to facilitate machine learning (ML) and deep learning (DL) applications. PyTorch Datasets provide an efficient, scalable, and flexible structure for this purpose: they integrate seamlessly with PyTorch's DataLoader, enabling efficient batching, shuffling, and parallel data loading, which is essential for training deep learning models on large EEG datasets.

## Data preprocessing
EEGDash datasets are processed using the popular [BrainDecode](https://braindecode.org/stable/index.html) library. In fact, EEGDash datasets are BrainDecode datasets, which are themselves PyTorch datasets. This means that any preprocessing possible on BrainDecode datasets is also possible on EEGDash datasets. Refer to the [BrainDecode](https://braindecode.org/stable/index.html) tutorials for guidance on preprocessing EEG data.

## EEG-Dash usage

### Install
Use your preferred Python environment manager with Python > 3.9 to install the package.
* To install the eegdash package, use the following command: `pip install eegdash`
* To verify the installation, start a Python session and type: `from eegdash import EEGDash`

### Data access

To use the data from a single subject, enter:

```python
from eegdash import EEGDashDataset
ds_NDARDB033FW5 = EEGDashDataset({'dataset': 'ds005514', 'task': 'RestingState', 'subject': 'NDARDB033FW5'})
```

This will search and download the metadata for the task **RestingState** for subject **NDARDB033FW5** in BIDS dataset **ds005514**. The actual data will not be downloaded at this stage. Following standard practice, data is only downloaded once it is processed. The **ds_NDARDB033FW5** object is a fully functional BrainDecode dataset, which is itself a PyTorch dataset. This [tutorial](https://github.com/sccn/EEGDash/blob/develop/notebooks/tutorial_eoec.ipynb) shows how to preprocess the EEG data, extracting portions of the data containing eyes-open and eyes-closed segments, then perform eyes-open vs. eyes-closed classification using a (shallow) deep-learning model.

To use the data from multiple subjects, enter:

```python
from eegdash import EEGDashDataset
ds_ds005505rest = EEGDashDataset({'dataset': 'ds005505', 'task': 'RestingState'}, target_name='sex')
```

This will search and download the metadata for the task 'RestingState' for all subjects in BIDS dataset 'ds005505' (a total of 136). As above, the actual data will not be downloaded at this stage, so this command is quick to execute. The target class for each subject is assigned using the target_name parameter, so this object is ready to be fed directly to a deep learning model, although the [tutorial script](https://github.com/sccn/EEGDash/blob/develop/notebooks/tutorial_sex_classification.ipynb) performs minimal processing on it prior to training a deep-learning model. Because 14 gigabytes of data are downloaded, this tutorial takes about 10 minutes to execute.

### Automatic caching

EEGDash automatically caches the downloaded data in the .eegdash_cache folder of the current directory from which the script is called. This means that if you run the tutorial [scripts](https://github.com/sccn/EEGDash/tree/develop/notebooks), the data will only be downloaded the first time the script is executed.

## Education -- Coming soon...

We organize workshops and educational events to foster cross-cultural education and student training, offering both online and in-person opportunities in collaboration with US and Israeli partners. Events for 2025 will be announced via the EEGLABNEWS mailing list. Be sure to [subscribe](https://sccn.ucsd.edu/mailman/listinfo/eeglabnews).

## About EEG-DaSh

EEG-DaSh is a collaborative initiative between the United States and Israel, supported by the National Science Foundation (NSF). The partnership brings together experts from the Swartz Center for Computational Neuroscience (SCCN) at the University of California San Diego (UCSD) and Ben-Gurion University (BGU) in Israel.

![Screenshot 2024-10-03 at 09 14 06](https://github.com/user-attachments/assets/327639d3-c3b4-46b1-9335-37803209b0d3)
````
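The automatic-caching behaviour described in the README above can be inspected from Python. A small sketch; only the `.eegdash_cache` folder name comes from the text, the rest is generic standard-library code:

```python
from pathlib import Path

cache = Path.cwd() / ".eegdash_cache"  # created on first download, per the README
if cache.exists():
    n_bytes = sum(f.stat().st_size for f in cache.rglob("*") if f.is_file())
    print(f"cache holds {n_bytes / 1e9:.2f} GB; reruns will reuse it")
else:
    print("no cache yet; the first run will download the data")
```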

eegdash.egg-info/SOURCES.txt (new file, +25)
```text
LICENSE
README.md
pyproject.toml
eegdash/__init__.py
eegdash/data_config.py
eegdash/data_utils.py
eegdash/main.py
eegdash.egg-info/PKG-INFO
eegdash.egg-info/SOURCES.txt
eegdash.egg-info/dependency_links.txt
eegdash.egg-info/requires.txt
eegdash.egg-info/top_level.txt
eegdash/features/__init__.py
eegdash/features/datasets.py
eegdash/features/decorators.py
eegdash/features/extractors.py
eegdash/features/serialization.py
eegdash/features/utils.py
eegdash/features/feature_bank/__init__.py
eegdash/features/feature_bank/complexity.py
eegdash/features/feature_bank/connectivity.py
eegdash/features/feature_bank/csp.py
eegdash/features/feature_bank/dimensionality.py
eegdash/features/feature_bank/signal.py
eegdash/features/feature_bank/spectral.py
```

eegdash.egg-info/dependency_links.txt (new file, +1: a single blank line)
eegdash.egg-info/requires.txt (new file, +16)

```text
xarray
python-dotenv
s3fs
mne
pynwb
h5py
pymongo
joblib
braindecode
mne-bids
pybids
pymatreader
pyarrow
tqdm
numba
pre-commit
```

eegdash.egg-info/top_level.txt (new file, +1)

```text
eegdash
```
