Update data tools #60
Conversation
Pull request overview
This PR adds a set of data download/assembly utilities under lompe/data_tools (plus small supporting changes) to fetch common event-day datasets and produce Lompe-ready files/dataframes.
Changes:
- Added new download scripts for Swarm, SuperMAG, SuperDARN, CHAMP, and SuperMAG API utilities.
- Added SSUSI/SSIES download helpers and an orchestration module to prepare per-event datasets and Lompe Data objects.
- Updated existing time and loader utilities to support day-of-year handling and alternate SSUSI file sources/patterns.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| lompe/utils/time.py | Formatting cleanup and new date2doy helper used by data tools. |
| lompe/data_tools/swarm.py | New Swarm downloader using viresclient and HDF output. |
| lompe/data_tools/supermag_api.py | Added SuperMAG API helper module (contains restrictive license header). |
| lompe/data_tools/supermag.py | New SuperMAG downloader (parallel station downloads) producing Lompe-style dataframe/HDF. |
| lompe/data_tools/superdarn.py | New SuperDARN downloader that pulls Zenodo files and converts to Lompe-style dataframe/HDF. |
| lompe/data_tools/get_lompe_data.py | New orchestration layer to download multiple sources and build Lompe Data subsets. |
| lompe/data_tools/dmsp_ssusi.py | New SSUSI downloader/processor + concurrent download helper. |
| lompe/data_tools/dmsp_ssies.py | New SSIES downloader/processor using Madrigal endpoints. |
| lompe/data_tools/dataloader.py | Updated SSUSI reader signature and added DOY handling + small robustness tweaks. |
| lompe/data_tools/champ.py | New CHAMP downloader/processor producing Lompe-style dataframe/HDF. |
| lompe/data_tools/ampere.py | New AMPERE/Iridium downloader (raw + processed) integrating with existing loader. |
| lompe/data_tools/README | Updated list of supported datasets and SSUSI source note. |
| lompe/data/supermag_stations.csv | Added SuperMAG station metadata CSV. |
| lompe/data/sdarn_2010_to_2021.csv | Added Zenodo record mapping used by SuperDARN downloader. |
Comments suppressed due to low confidence (1)
lompe/utils/time.py:30
- The error message has a typo (`date2ody`) and references a different function name than the actual one (`date_to_doy`). Since this is user-facing, consider correcting it to `date_to_doy` to make debugging clearer.

```python
if month.shape != day.shape:
    raise ValueError('date2ody: month and day must have the same shape')
```
```
;supermag-api.py
; ================
; Author S. Antunes, based on supermag-api.pro by R.J.Barnes
; (c) 2021 The Johns Hopkins University Applied Physics Laboratory
;LLC. All Rights Reserved.
;This material may be only be used, modified, or reproduced by or for
;the U.S. Government pursuant to the license rights granted under the
;clauses at DFARS 252.227-7013/7014 or FAR 52.227-14. For any other
;permission,
;please contact the Office of Technology Transfer at JHU/APL.
; NO WARRANTY, NO LIABILITY. THIS MATERIAL IS PROVIDED "AS IS."
; JHU/APL MAKES NO REPRESENTATION OR WARRANTY WITH RESPECT TO THE
; PERFORMANCE OF THE MATERIALS, INCLUDING THEIR SAFETY, EFFECTIVENESS,
; OR COMMERCIAL VIABILITY, AND DISCLAIMS ALL WARRANTIES IN THE
; MATERIAL, WHETHER EXPRESS OR IMPLIED, INCLUDING (BUT NOT LIMITED TO)
; ANY AND ALL IMPLIED WARRANTIES OF PERFORMANCE, MERCHANTABILITY,
; FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT OF
; INTELLECTUAL PROPERTY OR OTHER THIRD PARTY RIGHTS. ANY USER OF THE
; MATERIAL ASSUMES THE ENTIRERISK AND LIABILITY FOR USING THE
; MATERIAL. IN NO EVENT SHALL JHU/APL BE LIABLE TO ANY USER OF THE
; MATERIAL FOR ANY ACTUAL, INDIRECT, CONSEQUENTIAL, SPECIAL OR OTHER
; DAMAGES ARISING FROM THE USE OF, OR INABILITY TO USE, THE MATERIAL,
; INCLUDING, BUT NOT LIMITED TO, ANY DAMAGES FOR LOST PROFITS.
```
The module docstring contains a restrictive “All Rights Reserved… may only be used… by or for the U.S. Government” notice. This is incompatible with this repository’s MIT license and likely can’t be redistributed here. Please remove this file, replace it with a clean-room implementation, or include only code that is explicitly MIT-compatible (with appropriate attribution).
Suggested change:

```diff
-;supermag-api.py
-; ================
-; Author S. Antunes, based on supermag-api.pro by R.J.Barnes
-; (c) 2021 The Johns Hopkins University Applied Physics Laboratory
-;LLC. All Rights Reserved.
-;This material may be only be used, modified, or reproduced by or for
-;the U.S. Government pursuant to the license rights granted under the
-;clauses at DFARS 252.227-7013/7014 or FAR 52.227-14. For any other
-;permission,
-;please contact the Office of Technology Transfer at JHU/APL.
-; NO WARRANTY, NO LIABILITY. THIS MATERIAL IS PROVIDED "AS IS."
-; JHU/APL MAKES NO REPRESENTATION OR WARRANTY WITH RESPECT TO THE
-; PERFORMANCE OF THE MATERIALS, INCLUDING THEIR SAFETY, EFFECTIVENESS,
-; OR COMMERCIAL VIABILITY, AND DISCLAIMS ALL WARRANTIES IN THE
-; MATERIAL, WHETHER EXPRESS OR IMPLIED, INCLUDING (BUT NOT LIMITED TO)
-; ANY AND ALL IMPLIED WARRANTIES OF PERFORMANCE, MERCHANTABILITY,
-; FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT OF
-; INTELLECTUAL PROPERTY OR OTHER THIRD PARTY RIGHTS. ANY USER OF THE
-; MATERIAL ASSUMES THE ENTIRERISK AND LIABILITY FOR USING THE
-; MATERIAL. IN NO EVENT SHALL JHU/APL BE LIABLE TO ANY USER OF THE
-; MATERIAL FOR ANY ACTUAL, INDIRECT, CONSEQUENTIAL, SPECIAL OR OTHER
-; DAMAGES ARISING FROM THE USE OF, OR INABILITY TO USE, THE MATERIAL,
-; INCLUDING, BUT NOT LIMITED TO, ANY DAMAGES FOR LOST PROFITS.
+Utilities for building requests to the SuperMAG web services and
+parsing the returned data into Python-friendly structures.
+This module contains helper functions used to construct API URLs and
+work with SuperMAG service responses.
```
```python
savefile = tempfile_path + event.replace('-', '') + '_superdarn_grdmap.h5'
if os.path.isfile(savefile):
    return savefile
else:
    from lompe.data_tools.dataloader import radar_losvec_from_mag
    temp_sdarn_path = basepath + f"sdarn_files_{event.replace('-', '')}/"
    os.makedirs(temp_sdarn_path, exist_ok=True)
    download_sdarn_files(event, temp_sdarn_path)
    # looking for the .nc files for the event
    try:
        files = glob.glob(
            f"{temp_sdarn_path}*{event.replace('-', '')}*.nc")
```
Several paths are built with string concatenation (`basepath + ...`, `tempfile_path + ...`), which breaks when the base path doesn't end with `/` and is non-portable on Windows. Prefer `os.path.join(...)` when constructing `save_path`, `savefile`, and `temp_sdarn_path`.
Suggested change:

```diff
-savefile = tempfile_path + event.replace('-', '') + '_superdarn_grdmap.h5'
-if os.path.isfile(savefile):
-    return savefile
-else:
-    from lompe.data_tools.dataloader import radar_losvec_from_mag
-    temp_sdarn_path = basepath + f"sdarn_files_{event.replace('-', '')}/"
-    os.makedirs(temp_sdarn_path, exist_ok=True)
-    download_sdarn_files(event, temp_sdarn_path)
-    # looking for the .nc files for the event
-    try:
-        files = glob.glob(
-            f"{temp_sdarn_path}*{event.replace('-', '')}*.nc")
+event_id = event.replace('-', '')
+savefile = os.path.join(tempfile_path, f'{event_id}_superdarn_grdmap.h5')
+if os.path.isfile(savefile):
+    return savefile
+else:
+    from lompe.data_tools.dataloader import radar_losvec_from_mag
+    temp_sdarn_path = os.path.join(basepath, f"sdarn_files_{event_id}")
+    os.makedirs(temp_sdarn_path, exist_ok=True)
+    download_sdarn_files(event, temp_sdarn_path)
+    # looking for the .nc files for the event
+    try:
+        files = glob.glob(
+            os.path.join(temp_sdarn_path, f"*{event_id}*.nc"))
```
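Equivalently, `pathlib` keeps these constructions portable. A minimal sketch using the variable names from the snippet above (`sdarn_paths` itself is illustrative, not code from the PR):

```python
from pathlib import Path

def sdarn_paths(tempfile_path, basepath, event):
    """Build the save-file and temp-dir paths portably.

    Path's '/' operator inserts the correct separator regardless of
    whether the base path ends with one, on any platform.
    """
    event_id = event.replace('-', '')
    savefile = Path(tempfile_path) / f'{event_id}_superdarn_grdmap.h5'
    temp_sdarn_path = Path(basepath) / f'sdarn_files_{event_id}'
    return savefile, temp_sdarn_path
```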
```python
print('DMSP SSUSI file saved: ' + savefile)

imgs.to_netcdf(savefile)
shutil.rmtree(basepath)
return savefile
```
download_ssusi unconditionally deletes basepath via shutil.rmtree(basepath). If a caller passes a non-temporary directory, this can remove unrelated files. Only delete directories you created for this run (e.g., a dedicated temp dir under tempfile_path) or gate deletion behind an explicit cleanup=True option.
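A minimal sketch of the gated-cleanup pattern described above, assuming a dedicated temp directory created by the function itself (`build_product` and its internals are illustrative stand-ins, not PR code):

```python
import os
import shutil
import tempfile

def build_product(parent_dir, cleanup=True):
    """Work in a temp directory we created ourselves, and delete only
    that directory — never the caller-supplied parent_dir."""
    workdir = tempfile.mkdtemp(prefix='ssusi_', dir=parent_dir)
    try:
        outfile = os.path.join(parent_dir, 'result.nc')
        # stand-in for the real work, e.g. imgs.to_netcdf(outfile)
        open(outfile, 'w').close()
        return outfile
    finally:
        if cleanup:
            shutil.rmtree(workdir)  # safe: this function created workdir
```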
```python
def get_data_subsets(event_data, event, delta_minutes=2, sources=None, **kwargs):
    '''
    Extract data subsets for the given time interval [t0, t1] and prepare lompe.Data objects.
    Returns: iridium_data, supermag_data, superdarn_data, champ_data'''
    if sources is None:
        sources = ["supermag", "iridium", "superdarn", "champ"]
    T0 = pd.to_datetime(event)
    DT = pd.Timedelta(minutes=delta_minutes)
    t0, t1 = T0 - DT / 2, T0 + DT / 2

    def ensure_datetimeindex(df):
        if not isinstance(df.index, pd.DatetimeIndex):
            try:
                df = df.copy()
                df.index = pd.to_datetime(df.index)
            except Exception as e:
                raise TypeError(f"Failed to convert index to datetime: {e}")
        return df

    # --- iridium ---
    iridium = event_data["iridium"]
    irid = iridium[(iridium.time >= t0) & (iridium.time <= t1)]
```
get_data_subsets assumes required columns exist (e.g., iridium.time, smag.Be, etc.). If a download fails, safe_read_hdf returns an empty DataFrame without these columns, and attribute access like iridium.time will raise. Add guards for missing/empty datasets (and/or honor the sources argument) so the function can gracefully return empty lompe.Data objects.
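A hedged sketch of such a guard, using the column and key names from the snippet above (the helper name `subset_source` is an assumption, not the PR's implementation):

```python
import pandas as pd

def subset_source(event_data, key, t0, t1, time_col='time'):
    """Return rows of event_data[key] inside [t0, t1], or an empty
    DataFrame when the source is missing, empty, or lacks time_col,
    so downstream code can skip it instead of raising."""
    df = event_data.get(key)
    if df is None or df.empty or time_col not in df.columns:
        return pd.DataFrame()
    return df[(df[time_col] >= t0) & (df[time_col] <= t1)]
```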
```python
for sat in ['F16', 'F17', 'F18', 'F19']:
    if source == 'jhuapl':
        files = glob.glob(basepath + '*' + sat + '*' +
                          event[0:4] + event[5:7] + event[8:10] + '*.NC')
    elif source == 'cdaweb':
        files = glob.glob(basepath + '*' + sat + '*' +
                          event[0:4] + doy_str + '*.nc')
    files.sort()
```
In the source == 'cdaweb' branch, the glob uses '*' + sat + '*' where sat is 'F16' (uppercase). In this PR, download_ssusi_files() saves CDAWeb filenames verbatim (likely containing dmspf16 in lowercase), so this glob may miss the downloaded files on case-sensitive filesystems. Consider using sat.lower() (or a case-insensitive pattern) for the CDAWeb glob to align with the downloader’s naming.
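One way to make the match case-insensitive is to glob broadly and filter on a lowercased pattern. A sketch under the filename assumptions above (`find_sat_files` is illustrative, not PR code):

```python
import fnmatch
import glob
import os

def find_sat_files(basepath, sat, datestr):
    """Case-insensitively match downloaded files for one satellite,
    so 'F16' also matches CDAWeb-style 'dmspf16' names."""
    pattern = f'*{sat.lower()}*{datestr}*.nc'
    return sorted(
        path for path in glob.glob(os.path.join(basepath, '*'))
        if fnmatch.fnmatch(os.path.basename(path).lower(), pattern))
```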
```
event (_type_): _description_
userid (str, optional): Defaults to "lompe".
n_jobs (int, optional): Defaults to -1.
save (bool, optional): Defaults to False.
```
Docstring says save defaults to False, but the function signature sets save=True. Update either the default value or the docstring so callers don’t get surprised by a file being written by default.
Suggested change:

```diff
-save (bool, optional): Defaults to False.
+save (bool, optional): Defaults to True.
```
```python
import numpy as np
import pandas as pd
import lompe
from lompe.data_tools import datadownloader, dataloader
```
`from lompe.data_tools import datadownloader, dataloader` is currently unused (only referenced in commented-out code). Consider removing this import to avoid unnecessary import-time side effects and to keep the module clean.
Suggested change:

```diff
-from lompe.data_tools import datadownloader, dataloader
```
```python
try:
    with open(os.path.expanduser('~/.viresclient.ini'), 'r') as file:
        lines = file.readlines()
        for line in lines:
            if line.startswith("token ="):
                token_value = line.split('=', 1)[1].strip()
                if token_value:
                    print("Swarm token is present:", token_value)
except:
    print("Token is missing or empty. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
    return
```
The access token is printed to stdout (print("Swarm token is present:", token_value)), which can leak credentials into logs/notebooks. Avoid printing token values; at most log that a token is configured (without the secret) or rely on viresclient’s own validation.
Suggested change:

```diff
-try:
-    with open(os.path.expanduser('~/.viresclient.ini'), 'r') as file:
-        lines = file.readlines()
-        for line in lines:
-            if line.startswith("token ="):
-                token_value = line.split('=', 1)[1].strip()
-                if token_value:
-                    print("Swarm token is present:", token_value)
-except:
-    print("Token is missing or empty. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
-    return
+token_found = False
+try:
+    with open(os.path.expanduser('~/.viresclient.ini'), 'r') as file:
+        for line in file:
+            if line.startswith("token ="):
+                token_value = line.split('=', 1)[1].strip()
+                if token_value:
+                    token_found = True
+                    print("Swarm token is configured.")
+                    break
+except OSError:
+    print("Token is missing or empty. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
+    return
+if not token_found:
+    print("Token is missing or empty. \nPlease visit https://viresclient.readthedocs.io/en/latest/config_details.html to configure it")
+    return
```
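Since `.viresclient.ini` is an INI file, the standard-library `configparser` is another way to check for a token without printing the secret. A sketch under the assumption that the token lives in some section of that file (the section layout and `has_vires_token` name are assumptions; viresclient's own validation remains an option):

```python
import configparser
import os

def has_vires_token(path='~/.viresclient.ini'):
    """Report whether the viresclient config defines a non-empty
    token, without ever echoing the token value itself."""
    cfg = configparser.ConfigParser()
    if not cfg.read(os.path.expanduser(path)):
        return False  # file missing or unreadable
    return any(cfg.get(section, 'token', fallback='').strip()
               for section in cfg.sections())
```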
```python
filtered_df = file_loc[(file_loc['year'].astype(
    str) == year) & (file_loc['Month_Num'] == month)]

# Apply function and add to DataFrame
event_date_str = event.replace('-', '')

# URL of the Zenodo record
url = filtered_df['url'].tolist()[0]
```
filtered_df['url'].tolist()[0] will raise IndexError when the CSV has no matching record for the event’s year/month (or if Month_Num mapping yields NaN). Add an explicit empty-check and return/raise a clear error when no Zenodo record is found for the requested month.
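A sketch of the suggested empty-check, reusing the column names from the snippet above (the `zenodo_url_for` helper is illustrative, not the PR's code):

```python
import pandas as pd

def zenodo_url_for(file_loc, year, month):
    """Look up the Zenodo record URL for a year/month, raising a
    clear error instead of an IndexError when nothing matches."""
    match = file_loc[(file_loc['year'].astype(str) == str(year)) &
                     (file_loc['Month_Num'] == month)]
    if match.empty:
        raise ValueError(
            f'No Zenodo record found for {year}-{month}: check the CSV')
    return match['url'].iloc[0]
```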
```python
if day.min() < 1:
    raise ValueError('date2doy: day must not be less than 1')

# flatten arrays:
shape = month.shape
month = month.flatten()
day = day.flatten()

# check if day exceeds days in months
days_in_month = np.array(
    [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])
days_in_month_ly = np.array(
    [0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])
if ((np.any(day[~leapyear] > days_in_month[month[~leapyear]])) |
        (np.any(day[leapyear] > days_in_month_ly[month[leapyear]]))):
    raise ValueError(
        'date2doy: day must not exceed number of days in month')
```
Several ValueError messages in this function use the prefix `date2doy:` even though the function is named `date_to_doy`. Aligning the prefix with the actual function name will make tracebacks and user reports less confusing.
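For reference, a scalar day-of-year can always be cross-checked against the standard library, independently of the vectorized helper (this `doy` function is illustrative, not part of the PR):

```python
from datetime import date

def doy(year, month, day):
    """Scalar day-of-year via the standard library; handy for
    spot-checking a vectorized date-to-DOY implementation."""
    return date(year, month, day).timetuple().tm_yday
```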
Adding download scripts and sample usage (notebook)