Open
Description
Feature Request: Include files with tokens in production-snapshot
Hi ClearlyDefined community!
We're building an internal tool to help our developers manage license compliance, and we'd love to leverage ClearlyDefined's data. Specifically, we want to use the production-snapshot blob storage to avoid API rate limits while fetching license files for the libraries our teams use.
The Challenge
Currently, we're unable to retrieve license files from the attachments endpoint because:
- The backup process uses the `definitions-trimmed` MongoDB collection, which doesn't include the `files` array
- Without the `files` array and the `token` properties of its entries, we can't fetch license files via the `/attachments/{token}` endpoint
This makes it impossible to retrieve the actual license files for components stored in the production snapshots.
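To make the gap concrete, here is a minimal Python sketch of how a downstream tool might turn a definition into attachment URLs. The helper names (`attachment_url`, `retrievable_files`) are hypothetical, and the base URL is an assumption about the public API host; only the `/attachments/{token}` path comes from the description above. With a trimmed definition there is no `files` array, so nothing can be resolved.

```python
from typing import Any, Dict, Iterator, Tuple

# Assumption: the public API host; adjust if you target another deployment.
API_BASE = "https://api.clearlydefined.io"


def attachment_url(token: str, base: str = API_BASE) -> str:
    """Build the /attachments/{token} URL for a single file token."""
    return f"{base.rstrip('/')}/attachments/{token}"


def retrievable_files(definition: Dict[str, Any]) -> Iterator[Tuple[str, str]]:
    """Yield (path, url) for every file entry that carries a token."""
    files = definition.get("files")
    # In a trimmed definition the files array is absent, so nothing is yielded.
    if not isinstance(files, list):
        return
    for entry in files:
        token = entry.get("token")
        if token:
            yield entry.get("path", ""), attachment_url(token)


# A trimmed definition as found in today's snapshot (no files array).
trimmed = {
    "_id": "npm/npmjs/-/react-native-navigation-bar-color/2.0.2",
    "licensed": {},
}
print(list(retrievable_files(trimmed)))  # prints []
```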
Proposed Solution
We'd like to propose modifying the backup job to:
- Fetch data from the `definitions-paged` collection (which includes the `files` array)
- Filter to keep only files that have a `token` property (making them easily retrievable)
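The filtering step could look roughly like the following Python sketch. This is an illustration of the idea only, not the backup job's actual code; the function name `keep_files_with_tokens` and the `package/index.js` entry are hypothetical.

```python
import copy
from typing import Any, Dict


def keep_files_with_tokens(definition: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the definition whose files array holds only entries with a token."""
    result = copy.deepcopy(definition)
    files = result.get("files")
    if isinstance(files, list):
        result["files"] = [
            f for f in files if isinstance(f, dict) and f.get("token")
        ]
    return result


# A definition as it might come from the paged collection (second entry is made up).
paged = {
    "_id": "npm/npmjs/-/react-native-navigation-bar-color/2.0.2",
    "files": [
        {
            "path": "package/LICENSE",
            "license": "MIT",
            "token": "4bcebe9a76f1fbdef1ca52e59f8a97d45444ccdf6816cf4e9ce19af60b9ad6a0",
        },
        {"path": "package/index.js"},  # no token, so it is dropped
    ],
}

snapshot_doc = keep_files_with_tokens(paged)
print([f["path"] for f in snapshot_doc["files"]])  # prints ['package/LICENSE']
```

The input is copied rather than mutated, so the same document can still be used elsewhere in the job unchanged.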
Example
Current behavior - Files array is not present:
{
  "_id": "npm/npmjs/-/react-native-navigation-bar-color/2.0.2",
  "files": 16,
  "licensed": { ... }
}
Proposed behavior - Only files with tokens are included:
{
  "_id": "npm/npmjs/-/react-native-navigation-bar-color/2.0.2",
  "files": 16,
  "licensed": { ... },
  "files": [
    {
      "path": "package/LICENSE",
      "license": "MIT",
      "hashes": {
        "sha1": "8da5d6d75a66a60aedf29a5e70c07e4441b7cb13",
        "sha256": "4bcebe9a76f1fbdef1ca52e59f8a97d45444ccdf6816cf4e9ce19af60b9ad6a0"
      },
      "token": "4bcebe9a76f1fbdef1ca52e59f8a97d45444ccdf6816cf4e9ce19af60b9ad6a0"
    }
  ]
}
Benefits
- Downstream tools can fetch license files directly from ClearlyDefined without hitting API limits
- Snapshot size should grow only modestly, since only files with retrievable tokens are included
- Makes it clear which files are available for retrieval
- Maintains backwards compatibility (only adds data that was previously missing)
PR Available
We've implemented this feature in PR.
Happy to iterate based on community feedback!
derbauer97