Update data tools #61
Conversation
Pull request overview
This PR expands and cleans up data_tools by adding new dataset download helpers (Swarm, SuperMAG, SuperDARN, CHAMP, AMPERE, DMSP SSUSI/SSIES), a convenience loader for event-based workflows, and small SSUSI-related updates in the existing loader code.
Changes:
- Added new downloader modules for multiple data sources (SuperMAG, SuperDARN, Swarm, CHAMP, AMPERE, DMSP SSUSI/SSIES) plus an event-oriented “prepare + subset” helper.
- Updated `dataloader.read_ssusi()` to support selecting the SSUSI source and adjusted SSIES/Iridium handling.
- Added/updated supporting assets (station list CSV, SuperDARN Zenodo catalog CSV, `data_tools` README notes).
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 18 comments.
| File | Description |
|---|---|
| `lompe/utils/time.py` | Formatting changes + new `date2doy()` helper for YYYY-MM-DD strings. |
| `lompe/data_tools/swarm.py` | New Swarm downloader using viresclient and HDF output. |
| `lompe/data_tools/supermag.py` | New SuperMAG downloader with parallel station downloads and HDF output. |
| `lompe/data_tools/superdarn.py` | New SuperDARN downloader/parser driven by a Zenodo URL catalog and HDF output. |
| `lompe/data_tools/get_lompe_data.py` | New "prepare event data once" + "extract subsets into `lompe.Data`" helper and example main. |
| `lompe/data_tools/dmsp_ssusi.py` | New SSUSI downloader (CDAWeb) + processing to NetCDF. |
| `lompe/data_tools/dmsp_ssies.py` | New SSIES downloader via Madrigal FTP-like pages + HDF output. |
| `lompe/data_tools/dataloader.py` | SSUSI source option + SSIES file filtering changes + Iridium glob extension change + formatting tweaks. |
| `lompe/data_tools/champ.py` | New CHAMP downloader/processor producing HDF. |
| `lompe/data_tools/ampere.py` | New AMPERE raw download + conversion via existing `read_iridium()`. |
| `lompe/data_tools/README` | Updated dataset list and SSUSI source note. |
| `lompe/data/supermag_stations.csv` | Added SuperMAG station metadata CSV. |
| `lompe/data/sdarn_2010_to_2021.csv` | Added SuperDARN Zenodo catalog CSV. |
```python
# URL of the Zenodo record
url = filtered_df['url'].tolist()[0]
```
If there is no matching (year, month) entry in the CSV, filtered_df will be empty and tolist()[0] will raise IndexError. Add an explicit empty check and return/raise a clear “event not covered” error.
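Such a guard might look like the following sketch; `get_record_url` and the `year`/`month` column names are hypothetical stand-ins for the actual catalog lookup in the PR:

```python
import pandas as pd

def get_record_url(catalog: pd.DataFrame, year: int, month: int) -> str:
    """Look up the Zenodo record URL for (year, month), failing loudly
    when the event is not covered by the catalog CSV."""
    filtered_df = catalog[(catalog['year'] == year) & (catalog['month'] == month)]
    if filtered_df.empty:
        raise ValueError(
            f"No SuperDARN Zenodo record covers {year}-{month:02d}; "
            "the event is not in the catalog CSV."
        )
    # URL of the Zenodo record
    return filtered_df['url'].tolist()[0]
```

Raising here gives the caller an actionable message instead of an opaque `IndexError` from deep inside the downloader.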
```diff
@@ -0,0 +1,156 @@
from joblib import Parallel, delayed
```
joblib is imported at module import time, but it’s not listed in pyproject.toml dependencies. This will cause import lompe.data_tools.supermag to fail in a standard install. Consider moving the import inside download_supermag() with a helpful error message, and/or adding joblib as an optional dependency.
Suggested change:

```diff
-from joblib import Parallel, delayed
+try:
+    from joblib import Parallel, delayed
+except ImportError:
+    def _missing_joblib(*args, **kwargs):
+        raise ImportError(
+            "joblib is required to use download_supermag() with parallel execution. "
+            "Install joblib to enable this functionality."
+        )
+    Parallel = _missing_joblib
+    delayed = _missing_joblib
```
```python
from .champ import download_champ
from .superdarn import download_sdarn
from .swarm import download_swarm
from .dmsp_ssies import download_dmsp_ssies
```
download_dmsp_ssies is imported but never used in this module. Consider removing it until it’s actually wired into prepare_event_data() to avoid unused-import warnings.
Suggested change:

```diff
-from .dmsp_ssies import download_dmsp_ssies
```
```python
for line in lines:
    if line.startswith("token ="):
        token_value = line.split('=', 1)[1].strip()
        if token_value:
            print("Swarm token is present:", token_value)
```
The token-check loop only prints when a non-empty token is found; if the file exists but has no token = ... entry (or it’s empty), the function will still proceed to make the request and fail later. Consider explicitly detecting the “no token found/empty token” case and returning with a clear message.
Suggested change:

```diff
-for line in lines:
-    if line.startswith("token ="):
-        token_value = line.split('=', 1)[1].strip()
-        if token_value:
-            print("Swarm token is present:", token_value)
+token_value = None
+for line in lines:
+    if line.startswith("token ="):
+        token_value = line.split('=', 1)[1].strip()
+        break
+if not token_value:
+    print("Token is missing or empty. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
+    return
+print("Swarm token is present:", token_value)
```
```python
import requests
import pandas as pd
import xarray as xr
from bs4 import BeautifulSoup
```
This module imports third-party packages (requests, bs4) at import time, but they are not declared in pyproject.toml dependencies/optionals. That means import lompe.data_tools.superdarn can fail in a standard install even if the caller never uses these functions. Prefer lazy-importing inside the download functions (and raising a clear install hint) and/or declare these as optional dependencies.
```python
import numpy as np
import pandas as pd
import lompe
from lompe.data_tools import datadownloader, dataloader
```
datadownloader and dataloader are imported but not used in this module (only referenced in commented code). Removing unused imports avoids linter warnings and reduces import-time work.
Suggested change:

```diff
-from lompe.data_tools import datadownloader, dataloader
```
```python
try:
    files = glob.glob(
        f"{temp_sdarn_path}*{event.replace('-', '')}*.nc")
    files.sort()
```
If no files are downloaded/found, files will be empty and the downstream processing will fail (e.g., missing 'time' column / ddd.index[0]). Add a guard for if not files: (and ideally clean up the temp directory) before continuing.
Suggested change:

```diff
     files.sort()
+    if not files:
+        print(f"No SuperDARN files found for event {event}")
+        shutil.rmtree(temp_sdarn_path, ignore_errors=True)
+        return None
```
```python
    for line in lines:
        if line.startswith("token ="):
            token_value = line.split('=', 1)[1].strip()
            if token_value:
                print("Swarm token is present:", token_value)
except:
    print("Token is missing or empty. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
    return
```
Bare except: will swallow unrelated errors (e.g., permission issues, parse errors) and makes debugging difficult. Catch specific exceptions (e.g., FileNotFoundError, PermissionError) and handle the “token missing” case explicitly.
Suggested change:

```diff
-    for line in lines:
-        if line.startswith("token ="):
-            token_value = line.split('=', 1)[1].strip()
-            if token_value:
-                print("Swarm token is present:", token_value)
-except:
-    print("Token is missing or empty. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
-    return
+except FileNotFoundError:
+    print("Token is missing or empty. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
+    return
+except PermissionError:
+    print("Unable to read Swarm token configuration due to file permissions. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
+    return
+except OSError as e:
+    print(f"Unable to read Swarm token configuration: {e}")
+    return
+token_value = None
+for line in lines:
+    if line.startswith("token ="):
+        token_value = line.split('=', 1)[1].strip()
+        break
+if not token_value:
+    print("Token is missing or empty. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
+    return
+print("Swarm token is present:", token_value)
```
```diff
 fn = basepath + event.replace('-', '') + 'Amp_invert.ncdf'
 if not os.path.isfile(fn):
-    files = glob.glob(basepath + '*' + event.replace('-', '') + '*.ncdf')
+    files = glob.glob(basepath + '*' + event.replace('-', '') + '*.nc')
```
The fallback search now only globs for *.nc, but AMPERE fitted files are often *.ncdf (and the default filename is still Amp_invert.ncdf). Consider globbing both *.nc and *.ncdf to avoid missing valid files.
Suggested change:

```diff
-    files = glob.glob(basepath + '*' + event.replace('-', '') + '*.nc')
+    files = (
+        glob.glob(basepath + '*' + event.replace('-', '') + '*.nc') +
+        glob.glob(basepath + '*' + event.replace('-', '') + '*.ncdf')
+    )
```
```python
        (irid.lon.values, irid.lat.values, irid.r.values))
else:
    irid_B = np.empty((3, 0))
    irid_coords = np.empty((2, 0))
```
When irid is empty, irid_coords is created with shape (2, 0), but space_mag_fac expects 3 coordinate rows (lon, lat, r). This will raise ShapeError in lompe.Data; make the empty-case coords (3, 0).
Suggested change:

```diff
-    irid_coords = np.empty((2, 0))
+    irid_coords = np.empty((3, 0))
```
Cleaned up data_tools and added a case example showing how to use the data tools.