Skip to content

fix: cache album ID in Firestore — prevent duplicate Family Faces albums#20

Merged
yhyatt merged 5 commits intomainfrom
fix/album-id-cache
Feb 27, 2026
Merged

fix: cache album ID in Firestore — prevent duplicate Family Faces albums#20
yhyatt merged 5 commits intomainfrom
fix/album-id-cache

Conversation

@yhyatt
Copy link
Owner

@yhyatt yhyatt commented Feb 27, 2026

Problem

appendonly scope (DMAF's Google Photos scope) cannot list albums — GET /v1/albums returns 403. The old ensure_album silently failed the list check, then fell through to create a new album on every Cloud Run invocation. Cloud Run runs hourly → one new "Family Faces" album per hour.

Fix

  • ensure_album: removed the broken list GET. Now create-only, with a clear docstring warning callers to cache the result.
  • get_or_create_album_id: new function. Checks dmaf_config/google_photos_album in Firestore first. Returns cached ID if album name matches. Creates + caches only on first run (or if album name changes in config).
  • __main__.py: uses get_or_create_album_id instead of ensure_album.

After this fix

  • First Cloud Run scan: creates "Family Faces", stores ID in Firestore
  • Every subsequent scan: reads ID from Firestore, zero Google Photos API list/create calls

Tests

  • TestEnsureAlbum: updated to test create-only behaviour
  • TestGetOrCreateAlbumId: 3 new tests — cache hit, cache miss (first run), album name change

…lbums

Root cause: appendonly scope cannot list albums (GET /v1/albums returns 403).
The old ensure_album silently failed the list, then created a new album on
every Cloud Run invocation — one new 'Family Faces' album per hour.

Fix:
- ensure_album: remove broken list logic, create-only with clear docstring
- get_or_create_album_id: new function that checks Firestore cache first
  (dmaf_config/google_photos_album), creates once, caches the ID
- __main__.py: use get_or_create_album_id instead of ensure_album
- Tests: replace list-based tests with create-only + cache hit/miss tests
Copilot AI review requested due to automatic review settings February 27, 2026 09:08
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes repeated creation of duplicate Google Photos albums under the photoslibrary.appendonly scope by caching the created album ID in Firestore and reusing it across Cloud Run invocations.

Changes:

  • Make ensure_album create-only (remove broken album listing logic under appendonly scope) and document the limitation.
  • Add get_or_create_album_id to read/write a cached album ID in Firestore, creating the album only on first run (or when the configured name changes).
  • Update the entrypoint and tests to use and validate the Firestore-cached album behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
src/dmaf/google_photos/api.py Removes album listing from ensure_album and introduces Firestore-backed get_or_create_album_id.
src/dmaf/__main__.py Switches album selection to get_or_create_album_id during startup.
src/dmaf/google_photos/__init__.py Exposes get_or_create_album_id as part of the public google_photos API.
tests/test_google_photos_api.py Updates album tests for create-only behavior and adds tests for the Firestore cache logic.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 88 to 114
from google.cloud import firestore as _fs

logger = logging.getLogger(__name__)
db = _fs.Client(project=firestore_project)
ref = db.collection(firestore_collection).document("google_photos_album")

doc = ref.get()
if doc.exists:
data = doc.to_dict() or {}
cached_id = data.get("album_id")
cached_name = data.get("album_name")
if cached_id and cached_name == album_name:
logger.debug(f"Using cached album ID for '{album_name}': {cached_id[:12]}...")
return str(cached_id)
# Album name changed — fall through to create a new one

album_id = ensure_album(creds, album_name)
if album_id:
ref.set({
"album_name": album_name,
"album_id": album_id,
"created_at": _fs.SERVER_TIMESTAMP,
})
logger.info(f"Created and cached Google Photos album '{album_name}' -> {album_id[:12]}...")
return album_id


Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_or_create_album_id imports google.cloud.firestore unconditionally. google-cloud-firestore is not part of the default or .[dev] dependencies, so this will raise ImportError at runtime (and in CI) even when users just want album uploads. Consider wrapping the Firestore import/client creation in a try/except ImportError and falling back to ensure_album (or raising a clear, actionable error telling users to install google-cloud-firestore).

Suggested change
from google.cloud import firestore as _fs
logger = logging.getLogger(__name__)
db = _fs.Client(project=firestore_project)
ref = db.collection(firestore_collection).document("google_photos_album")
doc = ref.get()
if doc.exists:
data = doc.to_dict() or {}
cached_id = data.get("album_id")
cached_name = data.get("album_name")
if cached_id and cached_name == album_name:
logger.debug(f"Using cached album ID for '{album_name}': {cached_id[:12]}...")
return str(cached_id)
# Album name changed — fall through to create a new one
album_id = ensure_album(creds, album_name)
if album_id:
ref.set({
"album_name": album_name,
"album_id": album_id,
"created_at": _fs.SERVER_TIMESTAMP,
})
logger.info(f"Created and cached Google Photos album '{album_name}' -> {album_id[:12]}...")
return album_id
logger = logging.getLogger(__name__)
try:
from google.cloud import firestore as _fs
db = _fs.Client(project=firestore_project)
ref = db.collection(firestore_collection).document("google_photos_album")
doc = ref.get()
if doc.exists:
data = doc.to_dict() or {}
cached_id = data.get("album_id")
cached_name = data.get("album_name")
if cached_id and cached_name == album_name:
logger.debug(
f"Using cached album ID for '{album_name}': {cached_id[:12]}..."
)
return str(cached_id)
# Album name changed — fall through to create a new one
album_id = ensure_album(creds, album_name)
if album_id:
ref.set(
{
"album_name": album_name,
"album_id": album_id,
"created_at": _fs.SERVER_TIMESTAMP,
}
)
logger.info(
f"Created and cached Google Photos album '{album_name}' -> {album_id[:12]}..."
)
return album_id
except ImportError:
# Firestore is optional; fall back to non-cached album creation and
# emit a clear, actionable warning for users who want caching.
logger.warning(
"google-cloud-firestore is not installed; skipping Google Photos album "
"ID caching. Install 'google-cloud-firestore' to enable caching."
)
return ensure_album(creds, album_name)

Copilot uses AI. Check for mistakes.
Comment on lines 168 to 170
@patch("dmaf.google_photos.api.requests.post")
def test_api_error_no_albums_returned(self, mock_post, mock_get):
"""Test handling API response without albums key."""
@patch("google.cloud.firestore.Client")
def test_returns_cached_id(self, mock_fs_client, mock_post):
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests patch google.cloud.firestore.Client, but CI installs dmaf.[dev] which does not include google-cloud-firestore. If that dependency isn’t installed, these tests will error before running. Consider using pytest.importorskip("google.cloud.firestore") / conditional skipping, or patching via sys.modules so the tests don’t require the external Firestore package.

Copilot uses AI. Check for mistakes.
Comment on lines 212 to 215
album_id = get_or_create_album_id(
creds,
settings.google_photos_album_name,
firestore_project=settings.dedup.firestore_project,
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

settings.dedup.firestore_project is optional when dedup.backend is sqlite, but it’s passed into get_or_create_album_id unconditionally. This can break local runs (album upload silently disabled via the broad exception handler) unless Firestore is configured. Consider using Firestore caching only when dedup.backend == "firestore" (and firestore_project is set), otherwise fall back to ensure_album or require an explicit Firestore project for album caching.

Suggested change
album_id = get_or_create_album_id(
creds,
settings.google_photos_album_name,
firestore_project=settings.dedup.firestore_project,
album_kwargs: dict = {}
if (
getattr(settings, "dedup", None) is not None
and getattr(settings.dedup, "backend", None) == "firestore"
and getattr(settings.dedup, "firestore_project", None)
):
album_kwargs["firestore_project"] = settings.dedup.firestore_project
album_id = get_or_create_album_id(
creds,
settings.google_photos_album_name,
**album_kwargs,

Copilot uses AI. Check for mistakes.
from dmaf.database import get_database
from dmaf.face_recognition import best_match, load_known_faces
from dmaf.google_photos import create_media_item, ensure_album, get_creds, upload_bytes
from dmaf.google_photos import create_media_item, ensure_album, get_creds, get_or_create_album_id, upload_bytes
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import line likely exceeds the repo’s Ruff line-length limit (100) and can fail CI linting. Please wrap it using parentheses and one imported name per line (or otherwise format to <= 100 chars).

Suggested change
from dmaf.google_photos import create_media_item, ensure_album, get_creds, get_or_create_album_id, upload_bytes
from dmaf.google_photos import (
create_media_item,
ensure_album,
get_creds,
get_or_create_album_id,
upload_bytes,
)

Copilot uses AI. Check for mistakes.
Kai (OpenClaw) and others added 4 commits February 27, 2026 11:30
…d-firestore installed

The old tests patched google.cloud.firestore.Client directly — this fails in
CI's base test matrix where google-cloud-firestore is not installed.

Extract _firestore_client(project) into a thin helper that owns the local
import, then mock dmaf.google_photos.api._firestore_client in tests instead.
No package install required in test environments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@yhyatt yhyatt merged commit e879c84 into main Feb 27, 2026
6 checks passed
@yhyatt yhyatt deleted the fix/album-id-cache branch February 27, 2026 10:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants