The IPFS Backend Router provides a flexible, pluggable backend system for IPFS operations within ipfs_accelerate_py. It implements a preference-based fallback strategy:
- ipfs_kit_py (Preferred) - Full distributed storage capabilities
- HuggingFace Cache (Fallback 1) - Local storage with IPFS-like addressing
- Kubo CLI (Fallback 2) - Standard IPFS daemon via CLI
```
┌─────────────────────────────────────────────────────────┐
│                   IPFS Backend Router                   │
├─────────────────────────────────────────────────────────┤
│   Convenience Functions (add_bytes, cat, pin, etc.)     │
├─────────────────────────────────────────────────────────┤
│   Backend Selection Layer                               │
│   • Environment-based configuration                     │
│   • Automatic fallback on failure                       │
│   • Backend registry for custom providers               │
├─────────────────────────────────────────────────────────┤
│   Backend Implementations                               │
├──────────────┬──────────────┬───────────────────────────┤
│ ipfs_kit_py  │  HF Cache    │  Kubo CLI                 │
│ (Preferred)  │ (Fallback 1) │ (Fallback 2)              │
└──────────────┴──────────────┴───────────────────────────┘
```
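The selection layer's automatic fallback can be sketched as follows. This is an illustration of the strategy, not the router's actual code; `select_backend` and the stand-in factories are hypothetical, and the `ipfs_kit` factory simulates an environment where ipfs_kit_py is not installed.

```python
def select_backend(candidates):
    """Return (name, instance) from the first factory that initializes."""
    errors = {}
    for name, factory in candidates:
        try:
            return name, factory()
        except Exception as exc:  # backend unavailable -> try the next one
            errors[name] = exc
    raise RuntimeError(f"No IPFS backend available: {errors}")

def _ipfs_kit_unavailable():
    # Stand-in for an environment where ipfs_kit_py is not installed.
    raise ImportError("ipfs_kit_py not installed")

# Preference order mirrors the list above; dict is a stand-in factory
# that always succeeds.
name, backend = select_backend([
    ("ipfs_kit", _ipfs_kit_unavailable),
    ("hf_cache", dict),
    ("kubo", dict),
])
assert name == "hf_cache"  # ipfs_kit failed, so the first fallback won
```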
The router is included in ipfs_accelerate_py. For full functionality:

```bash
# Basic installation (includes the HF Cache and Kubo backends)
pip install ipfs_accelerate_py

# With ipfs_kit_py for distributed storage (recommended)
pip install ipfs_accelerate_py[ipfs_kit]

# Ensure Kubo (go-ipfs) is installed for the CLI backend:
# https://docs.ipfs.tech/install/command-line/
```

Basic usage:

```python
from ipfs_accelerate_py import ipfs_backend_router

# Store data and get a CID
data = b"Hello, IPFS!"
cid = ipfs_backend_router.add_bytes(data, pin=True)
print(f"Stored with CID: {cid}")

# Retrieve data
retrieved = ipfs_backend_router.cat(cid)
assert retrieved == data

# Store a file
cid = ipfs_backend_router.add_path("/path/to/model.bin", pin=True)

# Retrieve to a specific path
ipfs_backend_router.get_to_path(cid, output_path="/path/to/output.bin")
```

Model manager integration:

```python
from ipfs_accelerate_py.model_manager import ModelManager

# Enable IPFS storage for models
manager = ModelManager(enable_ipfs=True)

# Store a model to IPFS
cid = manager.store_model_to_ipfs(
    model_path="/path/to/model",
    model_id="bert-base-uncased"
)

# Retrieve a model from IPFS
success = manager.retrieve_model_from_ipfs(
    cid=cid,
    output_path="/path/to/cache",
    model_id="bert-base-uncased"
)
```

Model server cache integration (inside an async context):

```python
from ipfs_accelerate_py.hf_model_server.loader.cache import ModelCache

# Enable IPFS for the model cache
cache = ModelCache(max_size=10, enable_ipfs=True)

# Store model weights to IPFS
cid = await cache.store_model_to_ipfs(
    model_id="gpt2",
    model_path="/path/to/gpt2"
)

# Retrieve the model from IPFS
success = await cache.retrieve_model_from_ipfs(
    cid=cid,
    output_path="/path/to/cache/gpt2"
)
```

Configuration is controlled through environment variables:

| Variable | Default | Description |
|---|---|---|
| `IPFS_BACKEND` | (auto) | Force a specific backend (e.g., "ipfs_kit", "hf_cache", "kubo") |
| `ENABLE_IPFS_KIT` | true | Enable the ipfs_kit_py backend |
| `IPFS_KIT_DISABLE` | false | Explicitly disable ipfs_kit_py |
| `ENABLE_HF_CACHE` | true | Enable the HuggingFace cache backend |
| `KUBO_CMD` | ipfs | IPFS CLI command path |
| `HF_HOME` | ~/.cache/huggingface | HuggingFace cache directory |
| `IPFS_ROUTER_CACHE` | 1 | Enable backend instance caching |
| `ENABLE_IPFS_MODEL_CACHE` | false | Enable IPFS for the model server cache |
| `ENABLE_IPFS_MODEL_STORAGE` | false | Enable IPFS for the model manager |
```bash
# Default behavior: prefers ipfs_kit_py, falls back to HF cache, then Kubo
export ENABLE_IPFS_KIT=true
python your_script.py
```

```bash
# Disable ipfs_kit_py and Kubo; use the HF cache
export IPFS_KIT_DISABLE=1
export IPFS_BACKEND=hf_cache
python your_script.py
```

```bash
# Force the Kubo backend with an explicit binary path
export IPFS_BACKEND=kubo
export KUBO_CMD=/usr/local/bin/ipfs
python your_script.py
```

```bash
# Minimal configuration for testing
export IPFS_KIT_DISABLE=1
export IPFS_BACKEND=hf_cache
export HF_HOME=/tmp/test_cache
python -m pytest
```

Features:
- Full distributed storage
- P2P content distribution
- Multi-backend support (IPFS, S3, Filecoin)
- Automatic local caching
Requirements:
- `ipfs_kit_py` package installed
- Network connectivity for distributed operations
Use Cases:
- Production deployments
- Distributed model sharing
- Content-addressed model registry
Features:
- Local filesystem storage
- IPFS-like CID generation
- Integrates with HF model cache
- No external dependencies
Requirements:
- Write access to the `HF_HOME` directory
Use Cases:
- Development environments
- CI/CD pipelines
- Air-gapped deployments
- Local model caching
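The "IPFS-like CID generation" listed above can be sketched without any IPFS daemon. This is an illustrative reimplementation, not the backend's actual code: `cid_for_bytes` is a hypothetical helper that derives a CIDv1-style identifier (raw codec, sha2-256 multihash, base32 multibase) directly from bytes.

```python
import base64
import hashlib

def cid_for_bytes(data: bytes) -> str:
    """Derive a CIDv1-like identifier from raw bytes: sha2-256 multihash,
    raw codec, rendered as lowercase base32 with the multibase 'b' prefix."""
    digest = hashlib.sha256(data).digest()
    multihash = bytes([0x12, 0x20]) + digest     # sha2-256 code, 32-byte length
    cid_bytes = bytes([0x01, 0x55]) + multihash  # CIDv1, raw codec
    encoded = base64.b32encode(cid_bytes).decode().lower().rstrip("=")
    return "b" + encoded

cid = cid_for_bytes(b"Hello, IPFS!")
# Content addressing: identical bytes always map to the identical CID.
assert cid == cid_for_bytes(b"Hello, IPFS!")
assert cid != cid_for_bytes(b"different bytes")
```

Because the identifier is a pure function of the content, a plain local directory keyed by such CIDs behaves like an IPFS store for lookup purposes.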
Features:
- Standard IPFS daemon integration
- Full IPFS feature set
- Block-level operations
- IPNS support
Requirements:
- Kubo (go-ipfs) installed and in PATH
- IPFS daemon running (for some operations)
Use Cases:
- Existing IPFS infrastructure
- Full IPFS protocol support
- Gateway operations
- `add_bytes(data, pin=True)` — Store bytes and return a CID.
- `cat(cid)` — Retrieve data by CID.
- `pin(cid)` — Pin content to prevent garbage collection.
- `unpin(cid)` — Unpin content.
- `block_put(data)` — Store a raw block and return its CID.
- `block_get(cid)` — Get a raw block by CID.
- `add_path(path, ...)` — Add a file or directory to IPFS.
- `get_to_path(cid, output_path)` — Retrieve content and save it to a path.
- `get_backend()` — Get the current backend instance.
- Set the global default backend.
- `register_ipfs_backend(name, factory)` — Register a custom backend provider.
```python
from ipfs_accelerate_py import ipfs_backend_router

class CustomBackend:
    def add_bytes(self, data: bytes, *, pin: bool = True) -> str:
        # Custom implementation
        pass

    def cat(self, cid: str) -> bytes:
        # Custom implementation
        pass

    # Implement other required methods...

# Register
ipfs_backend_router.register_ipfs_backend(
    "custom",
    lambda: CustomBackend()
)

# Use
import os
os.environ["IPFS_BACKEND"] = "custom"
cid = ipfs_backend_router.add_bytes(b"data")
```

Sharing a dataset over IPFS:

```python
from ipfs_accelerate_py import ipfs_backend_router
import datasets

# Store a dataset to IPFS
dataset = datasets.load_dataset("squad", split="train[:100]")
dataset.save_to_disk("/tmp/squad_sample")
cid = ipfs_backend_router.add_path("/tmp/squad_sample", recursive=True)

# Share the CID; others can retrieve it
ipfs_backend_router.get_to_path(cid, output_path="/tmp/squad_retrieved")
retrieved = datasets.load_from_disk("/tmp/squad_retrieved")
```

Sharing a Transformers model:

```python
from transformers import AutoModel, AutoTokenizer
from ipfs_accelerate_py import ipfs_backend_router

# Save the model locally
model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model.save_pretrained("/tmp/bert")
tokenizer.save_pretrained("/tmp/bert")

# Store to IPFS
cid = ipfs_backend_router.add_path("/tmp/bert", recursive=True)
print(f"Model CID: {cid}")

# Retrieve from IPFS
ipfs_backend_router.get_to_path(cid, output_path="/tmp/bert_from_ipfs")
model_retrieved = AutoModel.from_pretrained("/tmp/bert_from_ipfs")
```

Backend comparison:

| Backend | Latency | Throughput | Network | Use Case |
|---|---|---|---|---|
| ipfs_kit_py | Low-Med | High | Required | Production |
| HF Cache | Very Low | Very High | None | Development |
| Kubo CLI | Medium | Medium | Optional | Full IPFS |
- Enable Caching: Keep `IPFS_ROUTER_CACHE=1` for backend reuse
- Local-First: Use the HF Cache backend for local development
- Pin Important Data: Use `pin=True` for models you'll reuse
- Batch Operations: Use `add_path` for directories instead of individual files
- Choose the Right Backend: Use ipfs_kit_py for distributed storage, the HF cache for local work
Problem: The ipfs_kit_py backend is unavailable.
Solution: Install ipfs_kit_py, or set `IPFS_KIT_DISABLE=1` to skip it.

Problem: A CID cannot be resolved.
Solution: The data is not stored locally. Check that the CID is correct, or use a different backend.

Problem: Kubo CLI operations fail.
Solution:
- Ensure Kubo is installed and in PATH
- Check that the IPFS daemon is running: `ipfs daemon`
- Verify the `KUBO_CMD` environment variable

Problem: The wrong backend is selected.
Solution:
- Clear the backend cache: `ipfs_backend_router._get_default_backend_cached.cache_clear()`
- Check environment variables
- Verify backend registration
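The `cache_clear()` call above works because the router memoizes its backend selection; the name `_get_default_backend_cached` suggests `functools.lru_cache`. A minimal illustration, using a hypothetical `get_backend_name`, of why stale settings persist until the cache is cleared:

```python
import functools
import os

os.environ.pop("IPFS_BACKEND", None)

@functools.lru_cache(maxsize=1)
def get_backend_name() -> str:
    # The environment is read only on the first call; later calls return
    # the memoized result until cache_clear() is invoked.
    return os.environ.get("IPFS_BACKEND", "auto")

first = get_backend_name()             # "auto"
os.environ["IPFS_BACKEND"] = "hf_cache"
stale = get_backend_name()             # still "auto": the cache is warm
get_backend_name.cache_clear()
fresh = get_backend_name()             # "hf_cache": re-read after clearing
```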
```python
import logging
logging.basicConfig(level=logging.DEBUG)

from ipfs_accelerate_py import ipfs_backend_router

# This will show backend selection and operations
backend = ipfs_backend_router.get_backend()
print(f"Using backend: {type(backend).__name__}")
```

Run the test suite:

```bash
# All router tests
pytest test/test_ipfs_backend_router.py -v

# Specific backend tests
pytest test/test_ipfs_backend_router.py::TestHuggingFaceCacheBackend -v

# With coverage
pytest test/test_ipfs_backend_router.py --cov=ipfs_accelerate_py.ipfs_backend_router
```

- Data Privacy: IPFS content is content-addressed and potentially public
- Pinning: Pin sensitive data carefully; it's harder to remove
- Network Access: ipfs_kit_py and Kubo may expose data to network
- Local Storage: HF Cache keeps everything local by default
The router is API-compatible with `ipfs_datasets_py.ipfs_backend_router`:

```python
# Old code
from ipfs_datasets_py import ipfs_backend_router
cid = ipfs_backend_router.block_put(data)

# New code (drop-in replacement)
from ipfs_accelerate_py import ipfs_backend_router
cid = ipfs_backend_router.block_put(data)
```

Key differences:
- Backend preference: ipfs_kit_py → HF cache → Kubo (vs. just Kubo)
- Additional backends available
- Better fallback handling
To add a new backend:
- Implement the `IPFSBackend` protocol
- Register with `register_ipfs_backend()`
- Add tests in `test_ipfs_backend_router.py`
- Document in this guide
See ipfs_backend_router.py for protocol definition.
See main repository LICENSE file.