SQLite cache #59
Conversation
📝 Walkthrough

Centralizes caching into a SQLite-backed CacheManager with migrations, adds HTML/page-level caching and pagination, makes CLI/TUI archive-aware (archive/restore/delete flows), threads `page_size` through generation, and migrates tests to the DB-backed cache.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  participant CLI as CLI / main
  participant Cache as CacheManager
  participant DB as SQLite (cache.db)
  participant Conv as Converter
  participant Rend as HtmlRenderer
  CLI->>Cache: init(project_dir, db_path)
  Cache->>DB: open connection, enable WAL, run_migrations()
  Cache->>DB: ensure/load project row & aggregates
  CLI->>Cache: has_cache_changes(project_dir, page_size)
  Cache->>DB: query messages, cached_files, html_pages
  DB-->>Cache: cache metadata (timestamps, counts, page info)
  alt up-to-date & page_size unchanged
    CLI->>Conv: skip regeneration
  else regeneration required or page_size changed
    CLI->>Conv: convert_jsonl_to_html(..., page_size)
    Conv->>Cache: load_cached_entries(file/session)
    Cache->>DB: SELECT messages WHERE file_id/session_id
    DB-->>Cache: messages list
    Cache-->>Conv: session_data
    Conv->>Conv: _assign_sessions_to_pages(session_data, page_size)
    loop per page
      Conv->>Rend: generate(messages_subset, page_info, page_stats)
      Rend-->>Conv: HTML output
      Conv->>Cache: update_page_cache(page_meta)
      Cache->>DB: INSERT/UPDATE html_pages, page_sessions, html_cache
    end
    Conv->>Cache: update project aggregates
    Cache->>DB: UPDATE projects totals
  end
  CLI->>Cache: get_cache_stats()
  Cache->>DB: SELECT aggregates
  DB-->>Cache: stats
  Cache-->>CLI: GenerationStats summary
```
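The `_assign_sessions_to_pages` step in the diagram can be sketched as a greedy grouping by message count. This is a simplified stand-in; the real signature and session data shape may differ:

```python
def assign_sessions_to_pages(sessions, page_size):
    """Greedily group sessions into pages of roughly page_size messages.

    `sessions` is assumed to be a list of (session_id, message_count) pairs.
    A session is never split across pages, so a single page may exceed
    page_size when one session alone is larger than the limit.
    """
    pages = []
    current, current_count = [], 0
    for session_id, count in sessions:
        # Start a new page when adding this session would overflow the current one.
        if current and current_count + count > page_size:
            pages.append(current)
            current, current_count = [], 0
        current.append(session_id)
        current_count += count
    if current:
        pages.append(current)
    return pages
```

Because sessions are kept whole, a `page_size` change invalidates the page assignment, which is why the diagram re-generates when `page_size` differs.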
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~180 minutes

Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
test/test_cache_integration.py (1)
395-411: Test references obsolete per-project cache directory structure.

This test still references `project / "cache"` and `cache_dir.glob("*.json")`, which no longer exist with the SQLite migration. The cache is now stored in a single `cache.db` file at the parent level.

This test should be updated to test SQLite database corruption recovery, or removed if no longer applicable.
Consider updating the test to verify SQLite corruption recovery:
```diff
 def test_cache_corruption_recovery(self, setup_test_project):
     """Test recovery from corrupted cache files."""
     project_dir = setup_test_project

     # Create initial cache
     convert_jsonl_to_html(input_path=project_dir, use_cache=True)

-    # Corrupt cache file
-    cache_dir = project_dir / "cache"
-    cache_files = list(cache_dir.glob("*.json"))
-    if cache_files:
-        cache_file = [f for f in cache_files if f.name != "index.json"][0]
-        cache_file.write_text("corrupted json data", encoding="utf-8")
+    # Corrupt SQLite database
+    cache_db = project_dir.parent / "cache.db"
+    if cache_db.exists():
+        cache_db.write_bytes(b"corrupted data")

     # Should recover gracefully
     output = convert_jsonl_to_html(input_path=project_dir, use_cache=True)
     assert output.exists()
```

test/test_integration_realistic.py (7)
561-583: Test references obsolete per-project cache structure.

`test_uses_existing_cache` checks for `project / "cache" / "index.json"` and verifies its mtime, but with the SQLite migration the cache is now stored in `cache.db` at the parent level. This test should be updated to work with SQLite or marked as testing legacy behavior that no longer applies.
645-667: Test references obsolete cache/index.json structure.

`test_cache_version_mismatch_triggers_rebuild` manipulates `project / "cache" / "index.json"`, which no longer exists with SQLite. This test should be updated to test SQLite cache version handling.
669-698: Test references obsolete per-file cache structure.

`test_missing_cache_files_regenerated` manipulates individual JSON cache files in the cache directory, but SQLite uses a single database file. This test logic no longer applies.
722-753: Test references obsolete cache/index.json for incremental updates.

`test_adding_lines_triggers_cache_update` checks the mtime of `project / "cache" / "index.json"` to verify the cache was updated. With SQLite, it should check `cache.db` or use `last_updated` from the project data.
819-844: Test references obsolete cache structure for new file detection.

`test_new_file_detected_and_processed` reads and compares `cache/index.json` content. It should use `CacheManager.get_cached_project_data()` to verify the session count instead.
1162-1180: Test references obsolete cache/index.json for corruption handling.

`test_corrupted_cache_index_handled` corrupts and checks `cache/index.json`, which no longer exists. It should test SQLite database corruption recovery instead.
1182-1199: Test references obsolete cache directory structure.

`test_missing_cache_directory_handled` deletes and checks for recreation of the `project / "cache"` directory. With SQLite at the parent level, this test logic no longer applies.
🧹 Nitpick comments (7)
claude_code_log/cli.py (1)
279-284: Consider moving `import shutil` to the top of the file.

The `shutil` import is repeated inside conditional blocks. Moving it to the module-level imports would be cleaner and avoid potential repeated import overhead during iteration.

```diff
 import logging
 import os
+import shutil
 import sys
 from pathlib import Path
```

Then remove the inline imports at lines 281 and 299.
Also applies to: 299-302
test/test_cache_sqlite_integrity.py (2)
617-659: Concurrent readers test could be flaky under high load.

The 5-second timeout per thread may not be sufficient in CI environments under load. Consider increasing the timeout or using a more robust synchronization mechanism.
```diff
 # Start multiple reader threads
 threads = [threading.Thread(target=read_data) for _ in range(5)]
 for t in threads:
     t.start()
 for t in threads:
-    t.join(timeout=5)
+    t.join(timeout=10)

 assert len(errors) == 0, f"Errors occurred: {errors}"
 assert all(results), "Not all reads succeeded"
```
862-869: `time.sleep(1.1)` may cause test flakiness.

The 1.1-second sleep assumes an mtime resolution of 1 second. On some filesystems (particularly HFS+ on older macOS) mtime resolution is 1 second, but on others it can be finer. The test should still work, but be aware this adds ~1 second to test runtime.
Consider using a mock or explicitly setting the mtime instead of sleeping:

```python
import os

# After writing the file
future_time = time.time() + 2
os.utime(jsonl_file, (future_time, future_time))
```

claude_code_log/cache_sqlite.py (4)
93-96: Consider logging instead of printing for cache invalidation messages.

Using `print()` for status messages may pollute stdout in library usage scenarios. Consider using the logging module for consistency with the rest of the codebase.

```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 # In _ensure_project_exists:
-    print(
-        f"Cache version incompatible: {cached_version} -> {self.library_version}, invalidating cache"
-    )
+    logger.info(
+        "Cache version incompatible: %s -> %s, invalidating cache",
+        cached_version,
+        self.library_version,
+    )
```
266-267: Move the `dateparser` import to module level.

The `dateparser` import is done inside the method. While this may have been intentional for lazy loading, it's imported at the top level in other files per the coding guidelines, and keeping imports consistent aids maintainability.

```diff
+import dateparser
 import json
 import sqlite3
```

Then remove the import at line 267.
279-289: Redundant `else` branch with identical logic.

Both branches of the `if`/`else` for `to_dt` processing result in the same time adjustment, so the conditional can be collapsed entirely.

```diff
 if to_date:
     to_dt = dateparser.parse(to_date)
     if to_dt:
-        if to_date in ["today", "yesterday"] or "days ago" in to_date:
-            to_dt = to_dt.replace(
-                hour=23, minute=59, second=59, microsecond=999999
-            )
-        else:
-            to_dt = to_dt.replace(
-                hour=23, minute=59, second=59, microsecond=999999
-            )
+        to_dt = to_dt.replace(
+            hour=23, minute=59, second=59, microsecond=999999
+        )
```

The time is set to end-of-day regardless of the date format, so the condition check is redundant here.
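The unconditional end-of-day adjustment can be exercised with plain `datetime` (dateparser itself is out of scope here; this just demonstrates the clamping the comment describes):

```python
from datetime import datetime

def end_of_day(dt: datetime) -> datetime:
    """Clamp a parsed 'to' date to the last microsecond of its day."""
    return dt.replace(hour=23, minute=59, second=59, microsecond=999999)

# Whatever time of day dateparser resolves "yesterday" or "2025-11-30" to,
# the filter boundary becomes 23:59:59.999999 of that day.
d = end_of_day(datetime(2025, 11, 30, 9, 15))
```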
349-369: Consider using `executemany` for batch inserts.

Inserting entries one at a time in a loop is less efficient than using `executemany`. For large files with many messages, this could significantly impact performance.

```diff
-    # Insert all entries in a batch
-    for entry in entries:
-        serialized = self._serialize_entry(entry, file_id)
-        conn.execute(
-            """
-            INSERT INTO messages (
-                project_id, file_id, type, timestamp, session_id,
-                ...
-            )
-            """,
-            serialized,
-        )
+    # Insert all entries in a batch
+    serialized_entries = [self._serialize_entry(entry, file_id) for entry in entries]
+    conn.executemany(
+        """
+        INSERT INTO messages (
+            project_id, file_id, type, timestamp, session_id,
+            _uuid, _parent_uuid, _is_sidechain, _user_type, _cwd, _version,
+            _is_meta, _agent_id, _request_id,
+            input_tokens, output_tokens, cache_creation_tokens, cache_read_tokens,
+            _leaf_uuid, _level, _operation, content
+        ) VALUES (
+            :project_id, :file_id, :type, :timestamp, :session_id,
+            :_uuid, :_parent_uuid, :_is_sidechain, :_user_type, :_cwd, :_version,
+            :_is_meta, :_agent_id, :_request_id,
+            :input_tokens, :output_tokens, :cache_creation_tokens, :cache_read_tokens,
+            :_leaf_uuid, :_level, :_operation, :content
+        )
+        """,
+        serialized_entries,
+    )
```
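For context, the `executemany` pattern can be seen in isolation with a toy schema (this is not the project's real `messages` table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (session_id TEXT, content TEXT)")

rows = [{"session_id": f"s{i}", "content": f"msg {i}"} for i in range(1000)]

# executemany lets SQLite prepare the statement once and bind each row
# against it, instead of re-parsing the SQL text for every insert.
conn.executemany(
    "INSERT INTO messages (session_id, content) VALUES (:session_id, :content)",
    rows,
)
conn.commit()
```

Named-parameter binding (`:session_id`) works with a list of dicts, which fits naturally if `_serialize_entry` returns a mapping per entry.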
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- claude_code_log/cache.py (2 hunks)
- claude_code_log/cache_sqlite.py (1 hunks)
- claude_code_log/cli.py (3 hunks)
- claude_code_log/converter.py (2 hunks)
- claude_code_log/migrations/001_initial_schema.sql (1 hunks)
- claude_code_log/migrations/__init__.py (1 hunks)
- claude_code_log/migrations/runner.py (1 hunks)
- claude_code_log/tui.py (1 hunks)
- test/test_cache.py (6 hunks)
- test/test_cache_integration.py (8 hunks)
- test/test_cache_sqlite_integrity.py (1 hunks)
- test/test_integration_realistic.py (4 hunks)
- test/test_project_matching.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
test/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Organize tests into categories with pytest markers to avoid async event loop conflicts: unit tests (no mark), TUI tests (@pytest.mark.tui), browser tests (@pytest.mark.browser), and snapshot tests
Files:

- test/test_project_matching.py
- test/test_cache_sqlite_integrity.py
- test/test_cache_integration.py
- test/test_integration_realistic.py
- test/test_cache.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use ruff for code formatting and linting, with ruff check --fix for automatic fixes
Use pyright and mypy for type checking in Python code
Target Python 3.10+ with support for modern Python features and type hints
Files:

- test/test_project_matching.py
- claude_code_log/tui.py
- claude_code_log/migrations/__init__.py
- claude_code_log/cache_sqlite.py
- test/test_cache_sqlite_integrity.py
- test/test_cache_integration.py
- test/test_integration_realistic.py
- claude_code_log/migrations/runner.py
- claude_code_log/cli.py
- test/test_cache.py
- claude_code_log/cache.py
- claude_code_log/converter.py
claude_code_log/tui.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use Textual for implementing the interactive Terminal User Interface (TUI) in Python
Files:
claude_code_log/tui.py
claude_code_log/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use dateparser for natural language date parsing to support date range filtering with expressions like 'today', 'yesterday', 'last week', and relative dates
Files:

- claude_code_log/tui.py
- claude_code_log/cache_sqlite.py
- claude_code_log/cli.py
- claude_code_log/cache.py
- claude_code_log/converter.py
claude_code_log/cli.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use Click for CLI interface and argument parsing in Python CLI files
Files:
claude_code_log/cli.py
🧠 Learnings (2)
📚 Learning: 2025-11-30T17:16:32.473Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.473Z
Learning: Structure the project with separate modules for parser.py (data extraction), renderer.py (HTML generation), converter.py (high-level orchestration), cli.py (CLI interface), models.py (Pydantic data structures), tui.py (Textual TUI), and cache.py (cache management)
Applied to files:

- test/test_cache_integration.py
- test/test_integration_realistic.py
📚 Learning: 2025-11-30T17:16:32.473Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.473Z
Learning: Applies to claude_code_log/models.py : Use Pydantic models for parsing and validating transcript JSON data, including TranscriptEntry (union of UserTranscriptEntry, AssistantTranscriptEntry, SummaryTranscriptEntry), UsageInfo, and ContentItem
Applied to files:
claude_code_log/cache.py
🧬 Code graph analysis (9)
test/test_project_matching.py (1)

- claude_code_log/cache_sqlite.py (1)
  - get_working_directories (451-465)
claude_code_log/tui.py (3)

- test/test_cache.py (1)
  - cache_manager (39-42)
- test/test_cache_sqlite_integrity.py (1)
  - cache_manager (37-39)
- claude_code_log/cache_sqlite.py (1)
  - get_working_directories (451-465)
claude_code_log/migrations/__init__.py (1)

- claude_code_log/migrations/runner.py (1)
  - run_migrations (135-156)
test/test_cache_sqlite_integrity.py (2)

- claude_code_log/cache.py (1)
  - SessionCacheData (24-37)
- test/test_cache.py (2)
  - temp_project_dir (26-29)
  - cache_manager (39-42)
test/test_cache_integration.py (3)

- test/test_cache.py (1)
  - cache_manager (39-42)
- test/test_cache_sqlite_integrity.py (1)
  - cache_manager (37-39)
- claude_code_log/cache_sqlite.py (1)
  - get_cached_project_data (475-547)
test/test_integration_realistic.py (2)

- claude_code_log/cache.py (1)
  - get_library_version (69-112)
- claude_code_log/cache_sqlite.py (1)
  - get_cached_project_data (475-547)
claude_code_log/cli.py (4)

- test/test_cache.py (1)
  - cache_manager (39-42)
- test/test_cache_sqlite_integrity.py (1)
  - cache_manager (37-39)
- claude_code_log/cache_sqlite.py (2)
  - get_working_directories (451-465)
  - clear_cache (549-557)
- claude_code_log/converter.py (1)
  - convert_jsonl_to_html (38-128)
test/test_cache.py (2)

- test/test_cache_sqlite_integrity.py (2)
  - cache_manager (37-39)
  - temp_project_dir (28-33)
- claude_code_log/cache_sqlite.py (4)
  - is_file_cached (211-232)
  - save_cached_entries (311-372)
  - get_cached_project_data (475-547)
  - clear_cache (549-557)
claude_code_log/converter.py (1)

- claude_code_log/cache_sqlite.py (1)
  - get_working_directories (451-465)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.11)
🔇 Additional comments (43)
claude_code_log/migrations/runner.py (6)
1-27: LGTM! Clean utility functions for migration infrastructure.

The helper functions are well-designed: checksum computation uses SHA256, migration number parsing with regex is correct, and the migrations directory discovery is reliable using `__file__`.
29-56: Consider implications of dropping the old `_schema_version` table.

Dropping the `_schema_version` table when upgrading from the old format will cause all migrations to be reapplied. If data tables already exist, the `CREATE TABLE IF NOT EXISTS` statements will succeed, but this could cause issues if future migrations have different semantics (e.g., `INSERT` statements that would duplicate data).

For the current initial schema this is safe since it only uses `IF NOT EXISTS`. Ensure future migrations account for potential re-application.
135-156: LGTM! Solid migration execution with proper cleanup.

Good use of `try`/`finally` for connection cleanup, a 30-second timeout for concurrent access, and sorted application order. The `PRAGMA foreign_keys = ON` is correctly placed before migrations.
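The pattern praised here (sorted application order, `try`/`finally` cleanup, a busy timeout) can be sketched as follows. This is a simplified stand-in: migrations arrive as a `{number: sql}` mapping rather than the project's numbered `.sql` files, and checksum tracking is omitted:

```python
import sqlite3

def run_migrations(db_path, migrations):
    """Apply pending migrations in sorted order; idempotent across calls."""
    conn = sqlite3.connect(db_path, timeout=30)  # waits on locks up to 30s
    try:
        conn.execute("PRAGMA foreign_keys = ON")  # before any migration runs
        conn.execute(
            "CREATE TABLE IF NOT EXISTS schema_migrations (number INTEGER PRIMARY KEY)"
        )
        applied = {r[0] for r in conn.execute("SELECT number FROM schema_migrations")}
        for number in sorted(migrations):
            if number in applied:
                continue  # already recorded, skip
            conn.executescript(migrations[number])
            conn.execute(
                "INSERT INTO schema_migrations (number) VALUES (?)", (number,)
            )
        conn.commit()
    finally:
        conn.close()  # always release the connection, even on failure
```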
59-90: LGTM! Well-structured migration state functions.

Clean separation between applied, available, and pending migrations. The set-based filtering in `get_pending_migrations` is efficient.
113-133: LGTM! Checksum verification provides good integrity checking.

The verification function properly compares stored vs current checksums and returns actionable warnings.
159-163: LGTM! Handles empty schema version correctly.

Properly returns 0 when no migrations have been applied (`row[0] is None`).

claude_code_log/migrations/__init__.py (1)
1-5: LGTM! Clean public API for the migrations package.

Properly exposes only `run_migrations` via `__all__`, keeping internal functions private.

claude_code_log/migrations/001_initial_schema.sql (4)
5-22: LGTM! Projects table well-designed with appropriate aggregates.

Good use of a `UNIQUE` constraint on `project_path` and pre-computed token aggregates for efficient querying.
23-37: LGTM! File tracking table properly designed for invalidation.

Using `REAL` for mtime values is correct for Unix timestamps, and the composite unique constraint prevents duplicate entries per project.
39-61: LGTM! Sessions table supports efficient querying.

The index on `cwd` (line 61) directly supports the `get_working_directories()` query shown in the relevant snippets.
63-114: LGTM! Messages table balances normalization with query simplicity.

The flattened schema with prefixed columns for different entry types avoids complex joins while maintaining clarity. The indexes support common access patterns (by timestamp, session, file).
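A trimmed-down sketch of the table relationships these comments describe, exercised through Python's `sqlite3`. Column lists are abbreviated and invented for illustration; the real `001_initial_schema.sql` has many more fields:

```python
import sqlite3

SCHEMA = """
CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    project_path TEXT UNIQUE NOT NULL
);
CREATE TABLE cached_files (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    file_name TEXT NOT NULL,
    mtime REAL NOT NULL,
    UNIQUE (project_id, file_name)
);
CREATE TABLE sessions (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    cwd TEXT
);
CREATE INDEX idx_sessions_cwd ON sessions(cwd);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE CASCADE
conn.executescript(SCHEMA)

# The cwd index serves distinct-working-directory lookups:
conn.execute("INSERT INTO projects (project_path) VALUES ('/p')")
conn.executemany(
    "INSERT INTO sessions (project_id, cwd) VALUES (1, ?)",
    [("/home/a",), ("/home/a",), ("/home/b",)],
)
dirs = [r[0] for r in conn.execute("SELECT DISTINCT cwd FROM sessions ORDER BY cwd")]
```

Deleting a project row then cascades to its `cached_files` and `sessions`, which is the behavior the cascade-delete integrity test verifies.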
test/test_project_matching.py (3)
29-58: LGTM! Test correctly updated for the new `get_working_directories()` API.

The mock now returns a list directly from `get_working_directories()`, matching the new SQLite cache interface.
70-89: LGTM! Subdirectory matching test properly updated.

Correctly tests that subdirectory paths match parent project working directories.
114-125: Remove unnecessary `CacheManager` mock from test.

The test patches `CacheManager` and mocks `get_cached_project_data.return_value = None`, but the `find_projects_by_cwd` implementation doesn't use the cache; it relies solely on path-based matching via `convert_project_path_to_claude_dir`. The mock is unnecessary and misleading; the test comment about "fallback to project name matching when no cache data available" doesn't reflect the actual implementation. Remove the mock patch to simplify the test and accurately reflect what's being tested.

Likely an incorrect or invalid review comment.
claude_code_log/tui.py (1)
388-393: LGTM! Clean migration to the new `get_working_directories()` API.

The change correctly calls `get_working_directories()` directly from the cache manager, with proper exception handling preserved.

claude_code_log/converter.py (2)
469-470: LGTM! Correctly uses the new centralized cache API.

The change to retrieve working directories via `cache_manager.get_working_directories()` aligns with the SQLite cache migration. This method queries distinct `cwd` values from the sessions table, which is the correct source for this data.
611-611: Consistent with the new cache API pattern.

Same pattern as in `_generate_individual_session_files`: correctly sources working directories from the cache manager rather than the deprecated `cached_project_data.working_directories`.

test/test_cache_integration.py (7)
98-113: LGTM! Tests correctly updated for SQLite cache location.

The assertions properly verify that:

- The SQLite `cache.db` is created at the parent (projects) level
- Cache data is accessed via `CacheManager.get_cached_project_data()`
- Cache state is validated through the new API
115-139: Test correctly verifies cache clearing via the CacheManager API.

The updated docstring and assertions align with the SQLite-based cache behavior, where clearing removes data from the database rather than deleting files.
163-173: LGTM! All-projects caching test updated for SQLite.

Correctly verifies the shared `cache.db` at the projects level and uses `CacheManager` to validate per-project cache data.
204-212: LGTM! Converter cache test updated correctly.
226-230: LGTM! No-cache test updated for SQLite semantics.

The test correctly verifies that with `use_cache=False`, the project data remains empty (`total_message_count == 0`) rather than checking for file absence.
255-265: LGTM! Hierarchy processing test correctly updated.
427-437: LGTM! Version upgrade test updated for SQLite.

Correctly uses `get_cached_project_data()` to verify project data and version storage.
242-261: LGTM! Cache clearing test correctly updated for SQLite.

Properly verifies that:

- `cache.db` exists after processing
- `cache.db` is deleted after `--clear-cache`
415-427: LGTM! Cache creation test correctly migrated to SQLite.

Uses `CacheManager.get_cached_project_data()` to verify cache contents instead of reading `index.json`.
465-469: LGTM! Version storage test updated correctly.

Uses the new CacheManager API to retrieve and verify the cached version.
claude_code_log/cache.py (2)
2-11: LGTM! Clean module reorganization.

The docstring update clearly explains that this module now provides data models while the actual cache implementation uses SQLite in `cache_sqlite.py`. The simplified imports are appropriate.
113-125: LGTM! Clean backwards-compatible re-export pattern.

The re-export of `SQLiteCacheManager` as `CacheManager` maintains API compatibility for existing consumers. The `noqa: E402` comment correctly explains why the import is placed at the bottom (to avoid circular imports between `cache.py` and `cache_sqlite.py`).

The `__all__` explicitly defines the public API surface, which is good practice.
98-107: LGTM! Initialization tests correctly verify SQLite database location.

The tests properly verify that:

- `db_path` points to `parent / "cache.db"`
- The database file is created on initialization
119-120: LGTM! Uses the new `is_file_cached()` API.

Replaces filesystem-based cache verification with the proper API method.
132-161: LGTM! Direct SQLite verification for internal structure.

Appropriately uses direct database queries to verify the internal storage structure. The query correctly filters by `project_id` since the database is shared, and properly orders by timestamp with `NULLS LAST` for summary entries that lack timestamps.
188-201: LGTM! Version compatibility test updated for SQLite.

Uses `get_cached_project_data()` to verify version storage and compatibility behavior.
258-273: LGTM! Clear cache test properly verifies SQLite state.

The test correctly:

- Verifies data exists before clearing
- Uses `is_file_cached()` and `get_cached_project_data()` to verify the cleared state
- Checks both `cached_files` and `sessions` are empty after clearing
577-587: LGTM! Missing cache entry test updated.

Correctly uses `is_file_cached()` to check cache state and verifies `load_cached_entries()` returns `None` for uncached files.

claude_code_log/cli.py (2)
190-210: LGTM! Clean migration to the new cache API.

The transition from `project_cache.working_directories` to `cache_manager.get_working_directories()` is well-implemented. The fallback logic correctly rebuilds the cache when no working directories are found and re-fetches them.
259-268: The review comment is incorrect. The cache.db deletion path is already correct.

When `all_projects=True`, `input_path` equals `~/.claude/projects/`, which is the projects directory. The `SQLiteCacheManager` stores the cache at `project_path.parent / "cache.db"`, where `project_path` is an individual project directory like `~/.claude/projects/myproject/`. Therefore `project_path.parent` is `~/.claude/projects/`, placing the cache at `~/.claude/projects/cache.db`.

The deletion code `cache_db = input_path / "cache.db"` correctly resolves to `~/.claude/projects/cache.db`. The suggested diff applying `input_path.parent / "cache.db"` would incorrectly look for the cache at `~/.claude/cache.db`, which is not where it's created.

Likely an incorrect or invalid review comment.
test/test_cache_sqlite_integrity.py (4)
27-39: LGTM! Clean fixture setup.The fixtures properly create isolated temporary directories and cache managers for testing. The
temp_project_dirfixture creates a subdirectory to match expected usage patterns.
89-175: Solid cascade delete test with proper verification.

The test correctly verifies that deleting a project cascades to all related records (`cached_files`, `messages`, `sessions`). The use of raw SQL to verify counts is appropriate for this integrity test.
692-701: Performance assertion threshold looks reasonable.

The 2-second threshold for querying 1000 entries with date filtering is generous enough to avoid flakiness while still catching major performance regressions.
244-284: Well-designed foreign key constraint tests.

These tests properly verify that SQLite foreign key enforcement is working by attempting invalid inserts and expecting `IntegrityError`. Good coverage of both `file_id` and `project_id` constraints.
60-70: Connection management is well-implemented.

The use of `@contextmanager` with proper `try`/`finally` ensures connections are always closed. The 30-second timeout, WAL mode, and foreign key enforcement are appropriate for concurrent access patterns.
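A minimal sketch of the pattern being praised, assuming a helper named `get_connection` (the real method name and details may differ):

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def get_connection(db_path):
    """Yield a SQLite connection with WAL, foreign keys, and a 30s busy timeout."""
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        conn.execute("PRAGMA journal_mode = WAL")  # readers don't block the writer
        conn.execute("PRAGMA foreign_keys = ON")   # per-connection, must be re-set
        yield conn
        conn.commit()
    finally:
        conn.close()  # runs even if the caller raises
```

WAL mode is persistent in the database file, but `foreign_keys` is per-connection, which is why it belongs inside the context manager rather than in one-time setup.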
559-585: Version compatibility logic is sound, but the `breaking_changes` dict is empty.

The version compatibility checking infrastructure is well-designed, with support for both exact versions and wildcard patterns. The empty `breaking_changes` dict means all versions are currently compatible, which is correct for an initial implementation.
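One plausible shape for the exact-plus-wildcard check, sketched with `fnmatch`. This is an assumption about the mechanism, not the project's actual code, and the `breaking_changes` entries below are invented examples:

```python
import fnmatch

BREAKING_CHANGES: dict[str, str] = {}  # e.g. {"0.3.*": "message schema rewrite"}

def is_compatible(cached_version: str, patterns: dict[str, str] = BREAKING_CHANGES) -> bool:
    """Return False if the cached version matches any breaking-change pattern.

    Patterns may be exact versions ("0.4.0") or wildcards ("0.3.*");
    fnmatch handles both, since a pattern without wildcards matches exactly.
    """
    return not any(fnmatch.fnmatch(cached_version, pat) for pat in patterns)
```

With the dict empty, every cached version passes, matching the "all versions currently compatible" observation above.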
Actionable comments posted: 1
🧹 Nitpick comments (2)
claude_code_log/cache_sqlite.py (1)
35-37: `CURRENT_SCHEMA_VERSION` is unused and can drift from actual schema state.

`CURRENT_SCHEMA_VERSION` is defined but not referenced anywhere; the code relies on the migrations system (`run_migrations`) for real schema/version tracking. This constant risks going stale and confusing future changes. Consider either wiring it into the migration/version logic or removing it to avoid dead configuration.

claude_code_log/renderer.py (1)
83-92: Project display-name logic aligns with cache-backed working_directories.

The updated `get_project_display_name`:

- Filters out temp paths via `_is_tmp_path` before choosing a "best" working directory.
- Picks the least-nested path (fewest parts), breaking ties by original order.
- Falls back to transforming `project_dir_name` (strip leading `-`, replace `-` with `/`) when no usable dirs remain.

This matches how `TemplateProject` and the new tests expect it to behave, and keeps UI display names stable when running under pytest/macOS temp dirs.

One minor nit: `_is_tmp_path` currently treats any path containing `"pytest-"` as temporary. If you ever have real projects whose paths include that substring, you might want to tighten the check (e.g., only when the basename starts with `pytest-` or when the path is under known temp roots like `/tmp/` or `/private/var/folders/`).

Also applies to: 94-123
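The selection and fallback rules listed above can be sketched as follows. This is a simplified stand-in: the `_is_tmp_path` heuristics here are assumptions for illustration, not the real implementation:

```python
from pathlib import PurePosixPath

def _is_tmp_path(path: str) -> bool:
    # Assumed heuristic: pytest temp dirs and common temp roots.
    return "pytest-" in path or path.startswith(("/tmp/", "/private/var/folders/"))

def get_project_display_name(project_dir_name: str, working_directories: list[str]) -> str:
    usable = [d for d in working_directories if not _is_tmp_path(d)]
    if usable:
        # Least-nested path wins; min() is stable, so ties keep original order.
        return min(usable, key=lambda d: len(PurePosixPath(d).parts))
    # Fallback: "-Users-me-proj" -> "Users/me/proj"
    return project_dir_name.lstrip("-").replace("-", "/")
```

Note the fallback is lossy for directory names that legitimately contain hyphens, which is an inherent limitation of the `-`-encoded directory naming rather than of this function.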
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- claude_code_log/cache_sqlite.py (1 hunks)
- claude_code_log/converter.py (5 hunks)
- claude_code_log/renderer.py (2 hunks)
- claude_code_log/tui.py (2 hunks)
- claude_code_log/utils.py (1 hunks)
- test/__snapshots__/test_snapshot_html.ambr (2 hunks)
- test/test_project_display_name.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- claude_code_log/tui.py
🧰 Additional context used
📓 Path-based instructions (4)
claude_code_log/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use dateparser for natural language date parsing to support date range filtering with expressions like 'today', 'yesterday', 'last week', and relative dates
Files:

- claude_code_log/cache_sqlite.py
- claude_code_log/utils.py
- claude_code_log/renderer.py
- claude_code_log/converter.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use ruff for code formatting and linting, with ruff check --fix for automatic fixes
Use pyright and mypy for type checking in Python code
Target Python 3.10+ with support for modern Python features and type hints
Files:

- claude_code_log/cache_sqlite.py
- claude_code_log/utils.py
- test/test_project_display_name.py
- claude_code_log/renderer.py
- claude_code_log/converter.py
test/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Organize tests into categories with pytest markers to avoid async event loop conflicts: unit tests (no mark), TUI tests (@pytest.mark.tui), browser tests (@pytest.mark.browser), and snapshot tests
Files:
test/test_project_display_name.py
claude_code_log/renderer.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use mistune for quick Markdown rendering with syntax highlighting support in server-side template rendering
Files:
claude_code_log/renderer.py
🧠 Learnings (4)
📚 Learning: 2025-11-30T17:16:32.473Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.473Z
Learning: Applies to claude_code_log/models.py : Use Pydantic models for parsing and validating transcript JSON data, including TranscriptEntry (union of UserTranscriptEntry, AssistantTranscriptEntry, SummaryTranscriptEntry), UsageInfo, and ContentItem
Applied to files:
claude_code_log/utils.py
📚 Learning: 2025-11-30T17:16:32.473Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.473Z
Learning: Applies to test/test_snapshot_html.py : Use syrupy for HTML snapshot regression testing to detect unintended changes in HTML output
Applied to files:
test/__snapshots__/test_snapshot_html.ambr
📚 Learning: 2025-11-30T17:16:32.473Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.473Z
Learning: Applies to claude_code_log/templates/**/*.html : Use Jinja2 templates for HTML generation, including session navigation with table of contents, message rendering with different content types, and token display for individual messages and session totals
Applied to files:
test/__snapshots__/test_snapshot_html.ambr
📚 Learning: 2025-11-30T17:16:32.473Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.473Z
Learning: Applies to claude_code_log/renderer_timings.py : Implement performance timing instrumentation via the CLAUDE_CODE_LOG_DEBUG_TIMING environment variable to identify performance bottlenecks in rendering phases, with detailed timing for initialization, deduplication, session summary processing, main message loop, and template rendering
Applied to files:
claude_code_log/renderer.py
🧬 Code graph analysis (4)
claude_code_log/cache_sqlite.py (3)

- claude_code_log/migrations/runner.py (1)
  - run_migrations (135-156)
- claude_code_log/models.py (6)
  - AssistantTranscriptEntry (210-213)
  - QueueOperationTranscriptEntry (231-248)
  - SummaryTranscriptEntry (216-220)
  - SystemTranscriptEntry (223-228)
  - UserTranscriptEntry (203-207)
  - parse_transcript_entry (350-433)
- claude_code_log/cache.py (4)
  - CachedFileInfo (14-21)
  - ProjectCache (40-66)
  - SessionCacheData (24-37)
  - get_library_version (69-112)
claude_code_log/utils.py (1)

- claude_code_log/models.py (1)
  - TextContent (60-62)
test/test_project_display_name.py (1)

- claude_code_log/renderer.py (1)
  - get_project_display_name (94-122)
claude_code_log/converter.py (2)

- test/test_cache.py (1)
  - cache_manager (39-42)
- claude_code_log/cache_sqlite.py (2)
  - get_working_directories (451-465)
  - get_cached_project_data (475-547)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.11)
🔇 Additional comments (5)
test/__snapshots__/test_snapshot_html.ambr (1)
11371-11371: LGTM: Snapshot update reflects test fixture naming.

The snapshot correctly captures the updated page title using a descriptive test identifier (`test_multi_session_html0`) instead of the generic `tmp`. This change is consistent across both the `<title>` tag and the `<h1>` element, and appears to be an expected side effect of the SQLite cache refactoring.

Based on learnings, syrupy is used for HTML snapshot regression testing to detect unintended changes in HTML output.
Also applies to: 13739-13739
claude_code_log/utils.py (1)
6-6: TextContent-aware length calculation looks correct.

Importing `TextContent` and restricting `extract_text_content_length()` to sum only `TextContent` items (or stripped string input) matches how other parts of the code treat `ContentItem` unions and avoids over-counting tool/thinking/image payloads. No issues from a typing or behavior standpoint.

Also applies to: 103-115
test/test_project_display_name.py (1)
118-140: New tests accurately capture tmp-path filtering and fallback behavior.

The added tests exercise both key branches of `get_project_display_name`:
- Prefer non-tmp working directories when present (pytest/macOS temp paths filtered out).
- Fall back to `project_dir_name` transformation when all working_directories look temporary.

This provides good coverage for the new `_is_tmp_path` logic and the fallback path conversion.

claude_code_log/converter.py (1)
81-87: Working-directory handling now consistently sourced from the cache.

Using `cache_manager.get_working_directories()` in:
- `convert_jsonl_to_html` for the main project title,
- `_generate_individual_session_files` for per-session titles, and
- `process_projects_hierarchy` for project summaries (both cached and fallback paths),

and ensuring `_update_cache_with_session_data(cache_manager, messages)` runs in the fallback path, keeps `working_directories` derived from the same SQLite-backed session `cwd` data everywhere. This makes display names and index summaries consistent with the new cache design.

Also applies to: 465-472, 612-613, 641-642, 714-716
claude_code_log/cache_sqlite.py (1)
243-249: SQLite ORDER BY NULLS LAST is supported; no fix required.

SQLite has supported `NULLS LAST` and `NULLS FIRST` in `ORDER BY` clauses since SQLite 3.30.0 (October 2019). The project requires Python 3.10+, which bundles SQLite 3.36.0 or newer by default. The code at lines 243-249 and 291-305 uses valid, supported syntax and will not cause runtime SQL errors.

The `ORDER BY timestamp NULLS LAST` queries are correct and portable to the SQLite versions available in this project's target environment.

Likely an incorrect or invalid review comment.
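A quick way to confirm this locally is to run the same ordering against an in-memory database. This is a standalone sketch, not code from the PR, and it assumes the bundled SQLite is at least 3.30.0 (which CPython 3.10+ normally ships with):

```python
import sqlite3

# Standalone check that ORDER BY ... NULLS LAST works on the bundled SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER, timestamp TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, "2025-01-02T10:00:00Z"), (2, None), (3, "2025-01-01T09:00:00Z")],
)
rows = conn.execute(
    "SELECT id FROM messages ORDER BY timestamp NULLS LAST"
).fetchall()
print([r[0] for r in rows])  # → [3, 1, 2]: the NULL timestamp sorts last
```

On an SQLite older than 3.30.0 this raises `sqlite3.OperationalError` instead, which is the failure mode the original concern was about.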
claude_code_log/cache_sqlite.py
Outdated
```python
cache_ver = version.parse(cache_version)
current_ver = version.parse(self.library_version)

for breaking_version_pattern, min_required in breaking_changes.items():
    min_required_ver = version.parse(min_required)

    if current_ver >= min_required_ver:
        if breaking_version_pattern.endswith(".x"):
            major_minor = breaking_version_pattern[:-2]
            if str(cache_ver).startswith(major_minor):
                return False
        else:
            breaking_ver = version.parse(breaking_version_pattern)
            if cache_ver <= breaking_ver:
                return False

return True
```
🧩 Analysis chain
🌐 Web query:
packaging.version.parse non-PEP440 InvalidVersion exception behavior
💡 Result:
Short answer: Historically packaging.version.parse returned a LegacyVersion object for many non‑PEP 440 strings; more recently the library moved to strict PEP 440 behavior and now raises packaging.version.InvalidVersion for strings that do not conform. [1][2][3]
Details and sources:
- Current packaging docs show parse(...) is a thin wrapper that constructs a Version and that Version raises InvalidVersion when the string does not match PEP 440. [1]
- The LegacyVersion behavior (returning a LegacyVersion for arbitrary strings) used to exist and is discussed in community posts; that behavior has been deprecated and removed in favor of raising InvalidVersion. See the pip/packaging issue discussion about deprecating legacy versions. [2][3]
If you have a specific version string that's being parsed, paste it and I can tell you whether current packaging will return a Version or raise InvalidVersion.
Sources:
[1] packaging — Version Handling documentation. (packaging.pypa.io)
[2] pip/pip / Deprecate legacy versions discussion. (github.com)
[3] StackOverflow and historical examples of LegacyVersion behavior. (stackoverflow.com)
Citations:
- 1: https://packaging.pypa.io/en/stable/version.html?utm_source=openai
- 2: Deprecate legacy versions and version specifiers pypa/pip#12063
- 3: https://stackoverflow.com/questions/67639128/python-version-number-comparison-with-not-equal?utm_source=openai
🏁 Script executed:
#!/bin/bash
# Find the get_library_version method and see what it returns
rg -A 10 "def get_library_version" claude_code_log/

Repository: daaain/claude-code-log
Length of output: 736
🏁 Script executed:
#!/bin/bash
# Look for existing exception handling around version.parse calls
rg -B 3 -A 3 "version\.parse" claude_code_log/

Repository: daaain/claude-code-log
Length of output: 1397
🏁 Script executed:
#!/bin/bash
# Search for InvalidVersion exception handling patterns
rg "InvalidVersion" claude_code_log/

Repository: daaain/claude-code-log
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Get full get_library_version() implementation
rg -A 30 "def get_library_version" claude_code_log/cache.py

Repository: daaain/claude-code-log
Length of output: 1275
🏁 Script executed:
#!/bin/bash
# Look at the _is_cache_version_compatible method to understand cache_version source
rg -B 5 "_is_cache_version_compatible" claude_code_log/cache_sqlite.py | head -50

Repository: daaain/claude-code-log
Length of output: 547
🏁 Script executed:
#!/bin/bash
# Find where _is_cache_version_compatible is called
rg -B 2 -A 2 "_is_cache_version_compatible\(" claude_code_log/

Repository: daaain/claude-code-log
Length of output: 890
🏁 Script executed:
#!/bin/bash
# Get the complete get_library_version() implementation including final fallback
rg -A 50 "def get_library_version" claude_code_log/cache.py | head -80

Repository: daaain/claude-code-log
Length of output: 1918
Harden cache-version compatibility against malformed version strings
_is_cache_version_compatible() calls version.parse(cache_version) and version.parse(self.library_version) directly. Since get_library_version() explicitly returns "unknown" as a fallback (when package metadata is unavailable), and cache_version comes from the database and could be any string from previous installations, packaging.version.parse() will raise InvalidVersion for non-PEP 440 strings, breaking cache initialization instead of gracefully invalidating the cache.
Wrap the parse calls in a try-except block:
```diff
-        cache_ver = version.parse(cache_version)
-        current_ver = version.parse(self.library_version)
+        try:
+            cache_ver = version.parse(cache_version)
+            current_ver = version.parse(self.library_version)
+        except Exception:
+            # On invalid version strings, treat cache as incompatible and rebuild.
+            return False
```

This ensures cache initialization is robust against malformed version strings from older databases or failed installations.
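As a standalone illustration of the guard being suggested (the function name and the simplified compatibility rule below are illustrative, not the PR's actual logic):

```python
from packaging import version
from packaging.version import InvalidVersion


def safe_is_compatible(cache_version: str, library_version: str) -> bool:
    """Treat any non-PEP 440 version string (e.g. "unknown") as an
    incompatible cache instead of letting InvalidVersion propagate."""
    try:
        cache_ver = version.parse(cache_version)
        current_ver = version.parse(library_version)
    except InvalidVersion:
        # Malformed version from an old DB or a failed install: rebuild cache.
        return False
    # Simplified rule for illustration only; the real check also consults
    # a breaking_changes table.
    return cache_ver == current_ver
```

With this shape, `safe_is_compatible("unknown", "1.2.3")` simply returns `False` rather than crashing cache initialization.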
🤖 Prompt for AI Agents
In claude_code_log/cache_sqlite.py around lines 569 to 585, version.parse is
called directly on cache_version and self.library_version which can be
non-PEP440 (e.g. "unknown" or corrupted DB values) and will raise
InvalidVersion; wrap the parsing in a try/except around both parse calls, catch
packaging.version.InvalidVersion (and Exception as a fallback), and treat any
parse error as an incompatible cache by returning False (i.e., invalidate the
cache) so initialization continues robustly.
While in general I think it's a good move, I have one concern: this currently "freezes" the model we have, and if we make that evolve, the DB schema will have to evolve as well, with possibly non-trivial changes needed. I'd like to continue (and eventually finish!) some major refactoring in this area, with a clean separation between the parsed content data model and the rendered content data model. And I'm not exactly sure yet where this is going to lead me... So, either we wait a bit for my changes and freeze the
I'm still working on this as I'm still not happy with the performance, so it's totally fine holding it back until you've done the refactoring. But also, I set up the migrations to make it possible to evolve the schema. We can even write SQL statements that transform existing data so it wouldn't even be necessary to regenerate the cache in most cases. |
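For readers following along, schema evolution of this kind needs very little machinery. A minimal sketch using SQLite's `user_version` pragma (the real runner lives in claude_code_log/migrations/runner.py and its details may differ; table names here are illustrative):

```python
import sqlite3


def apply_migrations(conn: sqlite3.Connection, migrations: list[tuple[int, str]]) -> None:
    """Apply numbered SQL migrations that haven't run yet, tracking
    progress in PRAGMA user_version so reruns are no-ops."""
    (current,) = conn.execute("PRAGMA user_version").fetchone()
    for number, sql in sorted(migrations):
        if number > current:
            conn.executescript(sql)
            conn.execute(f"PRAGMA user_version = {number}")
    conn.commit()


MIGRATIONS = [
    (1, "CREATE TABLE html_cache (path TEXT PRIMARY KEY, message_count INTEGER);"),
    (2, "ALTER TABLE html_cache ADD COLUMN library_version TEXT;"),
]

conn = sqlite3.connect(":memory:")
apply_migrations(conn, MIGRATIONS)
apply_migrations(conn, MIGRATIONS)  # rerunning is a no-op
```

Because migrations can be plain SQL (including data-transforming `UPDATE` statements), most schema changes wouldn't force a full cache regeneration.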
I managed to make evaluation of unchanged projects super quick by adding a table that holds HTML generation metadata, but now there's a big question highlighted by Claude:
This massive combined session HTML is quite laggy in the browser even on my beefy M2 Max Mac, so I'm thinking about introducing pagination. That would make it impossible to search all sessions ever (at least in the static page itself, we could introduce a server building on the SQLite db), but I'm guessing people would mostly want to look in a certain time interval anyway (like I would know if something happened in the last few months or more than a year ago for example). So I'm thinking about paginating every 5K(?) messages on session boundaries (so create a new page each time a session goes over the message number limit). Maybe have 5K as the default but make it configurable. |
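The scheme described above can be sketched in a few lines. This is a simplified stand-in for what later became `_assign_sessions_to_pages` in the PR; the name and the exact boundary rule are assumed:

```python
def assign_sessions_to_pages(
    session_message_counts: list[tuple[str, int]], page_size: int = 2000
) -> list[list[str]]:
    """Assign whole sessions to pages in order, starting a new page once
    the running message count exceeds page_size. Sessions are never split,
    so a single oversized session still gets a page of its own."""
    pages: list[list[str]] = []
    current: list[str] = []
    count = 0
    for session_id, n_messages in session_message_counts:
        current.append(session_id)
        count += n_messages
        if count > page_size:
            pages.append(current)
            current, count = [], 0
    if current:
        pages.append(current)
    return pages


print(assign_sessions_to_pages([("a", 1500), ("b", 800), ("c", 100)]))
# → [['a', 'b'], ['c']]: page 1 closes once it crosses 2000 messages
```

Note that with this rule a page can end up somewhat over the limit (it closes only after the session that crosses the threshold), which is exactly the trade-off that keeps sessions whole.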
I implemented pagination and it's so much smoother and faster; I actually decreased the default page size to 2K messages, as 5K was still a lot of sessions.
Actionable comments posted: 3
🧹 Nitpick comments (6)
claude_code_log/templates/components/page_nav_styles.css (1)
68-74: Remove unnecessary pseudo-element definitions.

The `.page-nav-link.prev::before` and `.page-nav-link.next::after` rules define pseudo-elements with empty content but add no styling. The arrow glyphs come from HTML text ("← Previous", "Next →"), not CSS. These rules appear to be unused boilerplate.

Apply this diff to remove them:
```diff
-.page-nav-link.prev::before {
-  content: '';
-}
-
-.page-nav-link.next::after {
-  content: '';
-}
```

test/test_pagination.py (1)
74-76: Assertion may not verify ordering.

Using `sorted(pages[0])` to compare means this test won't catch if the function returns sessions in the wrong order. Since `test_sessions_sorted_chronologically` explicitly tests ordering, this may be intentional for this test case, but consider whether order matters here.

If order doesn't matter for this test, the current assertion is fine. If it should preserve insertion order, use:
```diff
- assert sorted(pages[0]) == ["s1", "s2", "s3"]
+ assert pages[0] == ["s1", "s2", "s3"]
```

claude_code_log/cli.py (2)
436-441: Consider adding page_size validation.

The `--page-size` option accepts any integer, including zero or negative values, which could cause issues downstream.

Add a callback or use Click's built-in validation:
```diff
 @click.option(
     "--page-size",
     type=int,
     default=2000,
+    callback=lambda ctx, param, value: value if value > 0 else ctx.fail("--page-size must be positive"),
     help="Maximum messages per page for combined transcript (default: 2000). Sessions are never split across pages.",
 )
```

Alternatively, use `click.IntRange(min=1)` as the type.
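As a sketch of the `IntRange` variant (a standalone toy command, not the PR's actual CLI):

```python
import click


@click.command()
@click.option(
    "--page-size",
    type=click.IntRange(min=1),  # Click rejects 0 and negative values at parse time
    default=2000,
    show_default=True,
    help="Maximum messages per page; sessions are never split across pages.",
)
def generate(page_size: int) -> None:
    click.echo(f"page_size={page_size}")
```

Invoking this with `--page-size 0` fails with a usage error before any conversion work starts, so no downstream code ever sees an invalid value.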
259-268: Direct cache.db deletion may fail if the database is in use.

If another process has the SQLite database open, `unlink()` may fail on some platforms (especially Windows). The current exception handling logs a warning, which is acceptable, but consider using a more graceful approach.

The current warning-based approach is reasonable for a CLI tool. If more robustness is needed, you could add retry logic or instruct users to ensure no other processes are accessing the cache.
claude_code_log/renderer.py (1)
83-92: Temp-path filtering in `get_project_display_name` is Unix-centric.

`_is_tmp_path` only matches Unix/macOS patterns (`/tmp/`, `/var/folders/`, `pytest-`, etc.). On Windows, typical temp dirs (e.g. `%TEMP%`, `C:\Users\...\AppData\Local\Temp`) won't be filtered, so "noisy" working directories may still leak into display names.

If Windows support matters, consider extending `_is_tmp_path` to:
- Normalize paths with `Path(path).as_posix().lower()`; and
- Also match common Windows temp patterns (e.g. `\AppData\Local\Temp\`, `\Temp\`).

Otherwise the logic looks good and should produce much nicer project names when pytest/tmp dirs are present.

Please validate behavior on any non-Unix environments you care about.
Also applies to: 94-123
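A cross-platform version might look like this. It is a sketch: the real pattern list in `_is_tmp_path` is assumed, and backslashes are normalized by plain string replacement so the check behaves the same on any host OS:

```python
def is_tmp_path(path: str) -> bool:
    """Return True if the path looks like a temp directory on Unix,
    macOS, or Windows. Normalizes separators and case first."""
    normalized = path.replace("\\", "/").lower()
    patterns = (
        "/tmp/",                 # Unix
        "/var/folders/",         # macOS per-user temp
        "pytest-",               # pytest tmp_path factories
        "/appdata/local/temp/",  # Windows %TEMP%
        "/windows/temp/",        # Windows system temp
    )
    return any(p in normalized for p in patterns)
```

Normalizing to lowercase forward-slash form means the same pattern table covers both path styles, rather than maintaining separate Unix and Windows branches.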
claude_code_log/converter.py (1)
867-885: Good incremental session HTML regeneration; consider wiring regen count into stats.

`_generate_individual_session_files` now:
- Uses `CacheManager.is_html_stale(session_html_name, session_id)` plus `is_html_outdated` to avoid unnecessary per-session regeneration.
- Updates `html_cache` with the pre-dedup message count (from cache) to keep `is_html_stale` consistent.
- Returns `regenerated_count`.

In `process_projects_hierarchy`, `GenerationStats.sessions_regenerated` is currently set from `len(stale_sessions)` computed before calling `convert_jsonl_to_html`, not from the actual `regenerated_count`. If you care about accurate reporting, you might want to plumb the returned count back into stats.

Functionally this is fine; it's just a stats accuracy improvement.
Also applies to: 934-985
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- claude_code_log/cache.py (4 hunks)
- claude_code_log/cli.py (7 hunks)
- claude_code_log/converter.py (17 hunks)
- claude_code_log/migrations/002_html_cache.sql (1 hunk)
- claude_code_log/migrations/003_html_pagination.sql (1 hunk)
- claude_code_log/renderer.py (10 hunks)
- claude_code_log/templates/components/page_nav_styles.css (1 hunk)
- claude_code_log/templates/transcript.html (1 hunk)
- test/__snapshots__/test_snapshot_html.ambr (5 hunks)
- test/test_html_regeneration.py (4 hunks)
- test/test_pagination.py (1 hunk)
🧰 Additional context used
📓 Path-based instructions (6)
claude_code_log/templates/**/*.html
📄 CodeRabbit inference engine (CLAUDE.md)
Use Jinja2 templates for HTML generation, including session navigation with table of contents, message rendering with different content types, and token display for individual messages and session totals
Files:
claude_code_log/templates/transcript.html
claude_code_log/cli.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use Click for CLI interface and argument parsing in Python CLI files
Files:
claude_code_log/cli.py
claude_code_log/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use dateparser for natural language date parsing to support date range filtering with expressions like 'today', 'yesterday', 'last week', and relative dates
Files:
claude_code_log/cli.py, claude_code_log/cache.py, claude_code_log/converter.py, claude_code_log/renderer.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use ruff for code formatting and linting, with ruff check --fix for automatic fixes
Use pyright and mypy for type checking in Python code
Target Python 3.10+ with support for modern Python features and type hints
Files:
claude_code_log/cli.py, test/test_html_regeneration.py, test/test_pagination.py, claude_code_log/cache.py, claude_code_log/converter.py, claude_code_log/renderer.py
test/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Organize tests into categories with pytest markers to avoid async event loop conflicts: unit tests (no mark), TUI tests (@pytest.mark.tui), browser tests (@pytest.mark.browser), and snapshot tests
Files:
test/test_html_regeneration.py, test/test_pagination.py
claude_code_log/renderer.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use mistune for quick Markdown rendering with syntax highlighting support in server-side template rendering
Files:
claude_code_log/renderer.py
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Generate cache-aware HTML output with automatic cache validation and leverage existing cache system for fast loading
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/templates/**/*.html : Use Jinja2 templates for HTML generation, including session navigation with table of contents, message rendering with different content types, and token display for individual messages and session totals
Applied to files:
claude_code_log/templates/transcript.html, test/__snapshots__/test_snapshot_html.ambr, claude_code_log/converter.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/cli.py : Use Click for CLI interface and argument parsing in Python CLI files
Applied to files:
claude_code_log/cli.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to test/test_snapshot_html.py : Use syrupy for HTML snapshot regression testing to detect unintended changes in HTML output
Applied to files:
test/__snapshots__/test_snapshot_html.ambr, test/test_html_regeneration.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Group messages chronologically by timestamp across all sessions and process entire project hierarchies with linked index pages
Applied to files:
claude_code_log/migrations/003_html_pagination.sql, claude_code_log/converter.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/models.py : Use Pydantic models for parsing and validating transcript JSON data, including TranscriptEntry (union of UserTranscriptEntry, AssistantTranscriptEntry, SummaryTranscriptEntry), UsageInfo, and ContentItem
Applied to files:
claude_code_log/cache.py, claude_code_log/converter.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Structure the project with separate modules for parser.py (data extraction), renderer.py (HTML generation), converter.py (high-level orchestration), cli.py (CLI interface), models.py (Pydantic data structures), tui.py (Textual TUI), and cache.py (cache management)
Applied to files:
claude_code_log/converter.py, claude_code_log/renderer.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/renderer.py : Use mistune for quick Markdown rendering with syntax highlighting support in server-side template rendering
Applied to files:
claude_code_log/renderer.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/renderer_timings.py : Implement performance timing instrumentation via the CLAUDE_CODE_LOG_DEBUG_TIMING environment variable to identify performance bottlenecks in rendering phases, with detailed timing for initialization, deduplication, session summary processing, main message loop, and template rendering
Applied to files:
claude_code_log/renderer.py
🧬 Code graph analysis (2)
test/test_pagination.py (2)
- claude_code_log/cache.py (10): CacheManager (168-1184), SessionCacheData (39-52), get_page_count (1173-1184), get_page_size_config (900-914), update_page_cache (1010-1091), get_page_data (916-958), is_page_stale (1093-1144), invalidate_all_pages (1146-1171), get_all_pages (960-1008), get_library_version (119-162)
- claude_code_log/converter.py (3): _get_page_html_path (106-113), _assign_sessions_to_pages (116-150), convert_jsonl_to_html (338-513)

claude_code_log/cache.py (4)
- claude_code_log/migrations/runner.py (1): run_migrations (135-156)
- claude_code_log/models.py (4): AssistantTranscriptEntry (210-213), QueueOperationTranscriptEntry (231-248), SystemTranscriptEntry (223-228), UserTranscriptEntry (203-207)
- claude_code_log/converter.py (1): summary (68-103)
- claude_code_log/renderer.py (1): is_html_outdated (154-165)
🔇 Additional comments (19)
claude_code_log/templates/transcript.html (1)
20-20: Page navigation UI implementation looks solid.

The conditional rendering approach is clean, and the placement before message content is sensible. The template properly guards against missing page_info/page_stats.

Please verify that the calling code in `renderer.py` and `converter.py` always populates `page_info` and `page_stats` with the expected structure:
- `page_info` (dict): `page_number`, `prev_link`, `next_link`
- `page_stats` (dict): `message_count`, `date_range`, `token_summary` (optional)

This guards against KeyError if the Python code is refactored later.
Also applies to: 27-51
claude_code_log/templates/components/page_nav_styles.css (1)
3-66: CSS styling is clean and well-structured.

The flexbox layout, color scheme via CSS variables, and hover transitions all look good. Proper spacing and visual hierarchy throughout.

Ensure that the five CSS variables referenced are defined in `global_styles.css` or another globally-loaded stylesheet: `--system-warning-color`, `--text-muted`, `--border-light`, `--bg-hover`, `--session-bg-dimmed`.

test/__snapshots__/test_snapshot_html.ambr (5)
4219-4292: Snapshot styles updated for page navigation (Test Session HTML).

The added `.page-navigation` block and related classes are coherent and self-contained; including them in this snapshot correctly reflects the new paginated transcript UI without affecting existing content.

8974-9047: Consistent page navigation styles in Edge Cases snapshot.

This hunk mirrors the same page-navigation CSS used elsewhere, which keeps the Edge Cases snapshot in sync with the shared pagination styling while remaining non-invasive for the page structure.

11523-11523: More descriptive multi-session `<title>` text.

Updating the `<title>` to "Claude Transcripts - test_multi_session_html0" makes the snapshot more specific and consistent with the underlying test case naming.

13887-13960: Multi-session snapshot now includes nav styles and matching H1.

The page-navigation CSS matches other snapshots, and the `<h1 id="title">Claude Transcripts - test_multi_session_html0</h1>` correctly aligns the heading with the updated `<title>`, keeping the syrupy HTML snapshot aligned with the Jinja templates. Based on learnings, this matches the intended snapshot behavior.

Also applies to: 13965-13967

18780-18853: Page navigation styles added to Test Transcript snapshot.

Adding the shared `.page-navigation` styles here keeps this snapshot consistent with the global layout and pagination styling, with no impact on existing markup or semantics.

claude_code_log/migrations/003_html_pagination.sql (1)
1-39: Well-structured pagination schema.

The migration correctly implements the pagination tables with appropriate constraints:
- Foreign keys with CASCADE deletion ensure referential integrity
- UNIQUE constraints prevent duplicate page numbers and session mappings
- Indexes cover the expected query patterns (project lookup, page-session joins, session lookup)
The nullable `first_timestamp`/`last_timestamp` fields handle edge cases appropriately.

claude_code_log/migrations/002_html_cache.sql (1)
1-18: Clean HTML cache schema for incremental regeneration.

The table design appropriately tracks HTML generation metadata with:
- Nullable `source_session_id` distinguishing combined vs session-specific HTML
- `message_count` for cache invalidation sanity checks
- `library_version` enabling version-based regeneration

test/test_html_regeneration.py (2)
51-57: Good backward-compatible assertion.

The flexible check for either the legacy per-file skip message or the new early-exit message ensures the test works correctly as the implementation evolves.

319-492: Comprehensive test coverage for incremental HTML cache.

The `TestIncrementalHtmlCache` class provides thorough coverage:
- `test_html_cache_tracks_generation`: Validates cache entries for both combined and session HTML
- `test_incremental_regeneration_only_updates_changed_sessions`: Confirms only modified sessions are regenerated
- `test_html_cache_detects_library_version_change`: Verifies version-based staleness detection
- `test_html_cache_message_count_validation`: Tests session staleness when message counts change

test/test_pagination.py (2)
351-394: Well-designed test helper for session message generation.

The `_create_session_messages` helper creates realistic alternating user/assistant messages with proper timestamps, session IDs, and usage data. This provides good test data for integration tests.

397-544: Thorough integration tests for pagination.

The integration tests cover key scenarios:
- Small projects without pagination
- Large projects creating multiple pages
- Page size change triggering regeneration
- Very small page sizes respecting session boundaries
- Navigation links and stats in paginated output
claude_code_log/cli.py (2)
190-210: Good migration to new cache API.

The relative matching logic correctly uses `cache_manager.get_working_directories()` and handles the case where the cache needs to be built by calling `convert_jsonl_to_html` before re-fetching directories.

586-593: Correct propagation of page_size to process_projects_hierarchy.

The `page_size` parameter is properly passed through to the hierarchy processing function.

claude_code_log/renderer.py (1)
200-248: Pygments/Markdown caching changes look correct and should reduce render overhead.

The introduction of `_get_cached_lexer`, `_get_markdown_formatter`, `_get_cached_formatter`, and `_get_markdown_renderer` removes repeated lexer/formatter construction and repeated mistune initialization:
- `_create_pygments_plugin` now calls the cached helpers instead of re-creating lexers/formatters.
- `_highlight_code_with_pygments` uses the cached formatter and your precomputed lexer map.

Given the existing timing hooks (`_markdown_timings`, `_pygments_timings`), this is a solid and idiomatic optimization for the renderer's hot path.

Also applies to: 281-307, 690-778
claude_code_log/converter.py (2)
1051-1120: Exception handler can crash when no previous project exists.

In `process_projects_hierarchy`'s `except` block:

```python
print(
    f"Warning: Failed to process {project_dir}: {e}\n"
    f"Previous (in alphabetical order) file before error: {project_summaries[-1]}"
    f"\n{traceback.format_exc()}"
)
```

If the first project in the sorted list fails, `project_summaries` is still empty and `project_summaries[-1]` will raise `IndexError`, masking the original exception and breaking error reporting.

Safer pattern:

```diff
-        print(
-            f"Warning: Failed to process {project_dir}: {e}\n"
-            f"Previous (in alphabetical order) file before error: {project_summaries[-1]}"
-            f"\n{traceback.format_exc()}"
-        )
+        previous = project_summaries[-1] if project_summaries else "<none>"
+        print(
+            f"Warning: Failed to process {project_dir}: {e}\n"
+            f"Previous (in alphabetical order) project summary: {previous}"
+            f"\n{traceback.format_exc()}"
+        )
```

This preserves the diagnostic context without risking a second failure.
Also applies to: 1263-1271
⛔ Skipped due to learnings
Learnt from: CR Repo: daaain/claude-code-log PR: 0 File: CLAUDE.md:0-0 Timestamp: 2025-11-30T17:16:32.495Z Learning: Structure the project with separate modules for parser.py (data extraction), renderer.py (HTML generation), converter.py (high-level orchestration), cli.py (CLI interface), models.py (Pydantic data structures), tui.py (Textual TUI), and cache.py (cache management)
381-402: The review comment references code and methods that do not exist in the repository.

The mentioned file `claude_code_log/converter.py`, the code snippet shown (lines 381-402 and 435-487), and the referenced methods (`_generate_paginated_html`, `is_page_stale`, `use_pagination`, `cache_manager.is_html_stale()`, `cache_manager.get_stale_sessions()`) cannot be found in the codebase. The `claude_code_log` directory itself does not exist in the repository. The concern about pagination logic and page-level cache staleness checks is therefore not applicable to the current codebase.

Likely an incorrect or invalid review comment.
claude_code_log/cache.py (1)
372-378: Remove incorrect SQLite syntax concern; `NULLS LAST` is supported.

SQLite has supported the `NULLS LAST` modifier in `ORDER BY` clauses since version 3.30.0 (released October 2019). The syntax used at lines 374 and 433 is valid and will execute correctly on modern SQLite versions.

Since the project targets Python 3.10+ and does not specify a legacy SQLite version requirement, the code is compatible with contemporary environments where SQLite 3.30.0+ is standard.
```python
# Build page_info for navigation
has_prev = page_num > 1
# Pre-enable next link if this page exceeds threshold (anticipating future pages)
# or if there are more pages
page_exceeds_threshold = page_message_count > page_size
has_next = page_num < len(pages) or page_exceeds_threshold

page_info = {
    "page_number": page_num,
    "prev_link": _get_page_html_path(page_num - 1) if has_prev else None,
    "next_link": _get_page_html_path(page_num + 1) if has_next else None,
}

# Build page_stats
date_range = ""
if first_timestamp and last_timestamp:
    first_fmt = format_timestamp(first_timestamp)
    last_fmt = format_timestamp(last_timestamp)
    if first_fmt == last_fmt:
        date_range = first_fmt
    else:
        date_range = f"{first_fmt} - {last_fmt}"
elif first_timestamp:
    date_range = format_timestamp(first_timestamp)

token_parts: List[str] = []
if total_input_tokens:
    token_parts.append(f"Input: {total_input_tokens:,}")
if total_output_tokens:
    token_parts.append(f"Output: {total_output_tokens:,}")
if total_cache_creation_tokens:
    token_parts.append(f"Cache Create: {total_cache_creation_tokens:,}")
if total_cache_read_tokens:
    token_parts.append(f"Cache Read: {total_cache_read_tokens:,}")
token_summary = " | ".join(token_parts) if token_parts else None
```
has_next can point to a non‑existent next page.
In _generate_paginated_html:
```python
page_exceeds_threshold = page_message_count > page_size
has_next = page_num < len(pages) or page_exceeds_threshold
...
"next_link": _get_page_html_path(page_num + 1) if has_next else None,
```

Because `_assign_sessions_to_pages` closes a page once `current_count > page_size`, it's possible for the last page to have `page_message_count > page_size`. In that case:
- `page_num == len(pages)`
- `page_exceeds_threshold` is `True`
- `has_next` becomes `True`, but page `page_num + 1` is never generated
This yields a “Next” link to a missing HTML file.
Unless you explicitly want a pre‑wired link to a future page (which static HTML can't satisfy yet), it's safer to base `has_next` solely on `len(pages)`:

```diff
-    page_exceeds_threshold = page_message_count > page_size
-    has_next = page_num < len(pages) or page_exceeds_threshold
+    has_next = page_num < len(pages)
```

or, if you want to retain the threshold concept, guard it so it never creates a link beyond `len(pages)`.
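For the guarded variant, a small self-contained sketch (the function name and file-name helper are illustrative, not taken from the PR) shows that the guard effectively collapses into the plain `len(pages)` check:

```python
def compute_next_link(page_num, num_pages, page_message_count, page_size, get_path):
    """Return the next-page link, or None if no further page was generated."""
    page_exceeds_threshold = page_message_count > page_size
    # Guarded threshold: never link past the last generated page. Note that the
    # guard makes the threshold term redundant, which is why basing has_next
    # solely on num_pages is the simpler equivalent.
    has_next = page_num < num_pages or (
        page_exceeds_threshold and page_num + 1 <= num_pages
    )
    return get_path(page_num + 1) if has_next else None


def path_for(n):
    # Hypothetical file naming, for illustration only.
    return f"combined_transcripts_page_{n}.html"


# Last page over the threshold: no dangling link is produced.
assert compute_next_link(2, 2, 3000, 2000, path_for) is None
# A middle page still links forward normally.
assert compute_next_link(1, 2, 2500, 2000, path_for) == path_for(2)
```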
🤖 Prompt for AI Agents
In claude_code_log/converter.py around lines 268 to 303, has_next can be True
when page_num == len(pages) because page_exceeds_threshold may be True,
producing a next_link to a non-existent page; change the logic to only create a
next link when the next page actually exists (e.g., set has_next = page_num <
len(pages) or, if keeping the threshold concept, ensure you also check page_num
+ 1 <= len(pages) before assigning next_link) so next_link never points beyond
the last generated page.
Yes, this is an optimisation to not have to regenerate the last page. Do you have a better idea how to handle this?
@daaain I understand the optimization goal—avoiding regeneration of the last page when new content arrives. However, creating a "next" link to a non-existent page causes a 404 error for users, which is a significant UX issue.
Here are some better approaches:
1. Accept one-page regeneration (recommended)
```python
has_next = page_num < len(pages)
```

When new messages arrive and trigger a new page, simply regenerate the previous last page to add the next link. The performance cost is minimal (one page), and you get reliable navigation without broken links.
2. Generate a placeholder page
When page_exceeds_threshold is true for the last page, generate an empty page N+1:
```python
if page_num == len(pages) and page_exceeds_threshold:
    # Generate placeholder page with "No content yet" message
    # This keeps the link valid but shows users it's empty
    ...
```

3. Use a conditional indicator
Instead of a clickable link, show different UI when threshold is exceeded on the last page:
```python
if page_num < len(pages):
    ...  # Normal next link
elif page_exceeds_threshold:
    ...  # Show "More content coming soon" text (non-clickable)
```

My recommendation: Go with option 1. The cost of regenerating one page when new content arrives is negligible compared to the confusion and broken experience of 404 links. Static HTML pages should have reliable navigation; users won't understand why a "Next" link doesn't work.
The current threshold check is useful for detecting when a page is full, but it shouldn't create navigation to pages that don't exist yet. The regeneration on content update is the right trade-off here.
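To make the failure mode concrete, here is a minimal sketch of the greedy assignment (simplified from the PR's `_assign_sessions_to_pages`, which closes a page only after its count exceeds `page_size`; the exact signature is assumed). It shows the last page itself can exceed the threshold, so a threshold-driven next link would dangle:

```python
def assign_sessions_to_pages(session_counts, page_size):
    """Greedy pagination: sessions never split across pages, so a page is
    closed only once its running message count exceeds page_size."""
    pages, current, count = [], [], 0
    for session_id, n in session_counts:
        current.append(session_id)
        count += n
        if count > page_size:
            pages.append(current)
            current, count = [], 0
    if current:
        pages.append(current)
    return pages


pages = assign_sessions_to_pages([("a", 1500), ("b", 900)], page_size=2000)
assert pages == [["a", "b"]]  # one page holding 2400 messages, over the threshold
# Only one page exists, so a "next" link from it would point nowhere; hence
# has_next should depend only on len(pages):
for page_num in range(1, len(pages) + 1):
    assert (page_num < len(pages)) is False
```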
🧠 Learnings used
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Group messages chronologically by timestamp across all sessions and process entire project hierarchies with linked index pages
Well, instead of regenerating the page in full, we can maybe simply edit that previous file in place, to update its next page link (like, remove a "last-page" class that would hide it when present).
Besides, instead of a threshold in the number of messages (not necessarily a good indicator of the size), what do you think about the idea of using the "compaction messages" as delimiters? They're sort of guaranteed to occur regularly in long sessions, and after a relatively constant amount of tokens (unless there were manual /compact or /clear commands, well, those could also be used for delimiting pages?).
The only downside I see is that it's less obvious to detect at the SQL level (and potentially brittle), but I think the idea is worth exploring.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
I like the idea of a class that can be removed in a targeted way with a simple string search or regex!
Not sure about the compaction messages, we can look at them and have them as an optional page breaking point if that creates a better page size, but I rarely let sessions get to that point, prefer to let Claude generate a handover and paste it into a new context.
With the number of messages, I was thinking if we're taking 1000s on each page, the lengths will average out? But we could count characters too, that would probably be more precise?
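The in-place edit could be as small as a regex substitution over the previous page's HTML. A hedged sketch (the `last-page` class and helper name are hypothetical, matching the idea above rather than actual PR code):

```python
import re
from pathlib import Path


def unlock_next_link(page_path: Path) -> bool:
    """Remove the hypothetical 'last-page' class (which would hide a
    pre-rendered next link via CSS) from an already generated page.
    Returns True if the file was modified."""
    html = page_path.read_text(encoding="utf-8")
    # Drop the class token while preserving any sibling classes on the element.
    patched, count = re.subn(
        r'(class="[^"]*?)\s*\blast-page\b\s*([^"]*")',
        r"\1\2",
        html,
        count=1,
    )
    if count:
        page_path.write_text(patched, encoding="utf-8")
    return bool(count)
```

This way the old last page is patched with a targeted string edit instead of being re-rendered from the message data.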
I think /clear creates a new session?
Yes, /clear creates a new session (and I should probably use that more often :P).
I looked back at some of the long sessions I have, and using the compaction message as a separator is not that great an idea: it happens too often.
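On the character-counting idea from earlier in the thread, SQLite can aggregate that cheaply at the SQL level. A minimal sketch with a hypothetical schema (the PR's real tables differ):

```python
import sqlite3

# Hypothetical messages table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (session_id TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [("a", "x" * 100), ("a", "y" * 50), ("b", "z" * 300)],
)
# Character totals per session: a finer page-size signal than raw message
# counts, while still being a single GROUP BY query.
rows = conn.execute(
    "SELECT session_id, SUM(LENGTH(content)) "
    "FROM messages GROUP BY session_id ORDER BY session_id"
).fetchall()
assert rows == [("a", 150), ("b", 300)]
```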
```diff
     silent: bool = True,
+    page_size: int = 2000,
 ) -> Path:
-    """Process the entire ~/.claude/projects/ hierarchy and create linked HTML files."""
+    """Process the entire ~/.claude/projects/ hierarchy and create linked HTML files.
+
+    Args:
+        projects_path: Path to the projects directory
+        from_date: Optional date filter start
+        to_date: Optional date filter end
+        use_cache: Whether to use SQLite cache
+        generate_individual_sessions: Whether to generate per-session HTML files
+        silent: If True, suppress verbose per-file logging (show summary only)
+        page_size: Maximum messages per page for combined transcript pagination
+    """
```
silent flag doesn’t fully suppress per‑project logging.
process_projects_hierarchy documents:
silent: If True, suppress verbose per-file logging (show summary only)
But it unconditionally prints per‑project lines:
print(f" {project_dir.name}: cached ({stats.total_time:.1f}s)")
...
print(f" {project_dir.name}: {detail} ({stats.total_time:.1f}s)")and the final summary is also always printed.
If you want `silent=True` to show only the final summary (or truly nothing), wrap the per‑project prints in `if not silent:` and decide whether the global summary should also respect `silent`. For example:

```diff
-    print(f"  {project_dir.name}: cached ({stats.total_time:.1f}s)")
+    if not silent:
+        print(f"  {project_dir.name}: cached ({stats.total_time:.1f}s)")
 ...
-    print(f"  {project_dir.name}: {detail} ({stats.total_time:.1f}s)")
+    if not silent:
+        print(f"  {project_dir.name}: {detail} ({stats.total_time:.1f}s)")
```

This will align behavior with the parameter's documentation.
Also applies to: 1086-1120, 1287-1304
🤖 Prompt for AI Agents
In claude_code_log/converter.py around lines 1000-1013 (and also address similar
prints at 1086-1120 and 1287-1304): the function documents that silent=True
should suppress verbose per-file logging, but there are unconditional print()
calls for each project and for the final summary. Wrap the per-project print
statements in an if not silent: block so they only run when verbose is desired,
and decide whether the final summary should respect silent (wrap it in if not
silent or add a separate summary_only flag); ensure all similar print
occurrences in the other indicated line ranges are handled the same way to match
the documented behavior.
The per-project lines are the summaries so this is as intended
@daaain, thank you for the clarification! That makes sense—the per-project lines serve as the summary output, while silent=True suppresses the more granular per-file logging that happens inside convert_jsonl_to_html. The current behavior aligns with the documented intent. I appreciate you explaining the design decision!
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
Force-pushed from eef048d to 094489c (Compare)
This is now shaping up really nicely, did a rebase and fixed some bits that broke. Now it takes less than 2 seconds to rescan all my projects, which fills the SQLite db to almost 900MB 😅

```shell
(claude-code-log) ➜  claude-code-log git:(feat/sqlite-cache) ✗ j cli
uv run claude-code-log
Processing all projects in /Users/dain/.claude/projects...
  -Users-dain: cached (0.0s)
  -Users-dain-Desktop: cached (0.0s)
  -Users-dain-Obsidian: cached (0.0s)
  -Users-dain-truenas-backup: cached (0.0s)
  -Users-dain-workspace-JSSoundRecorder: cached (0.0s)
  -Users-dain-workspace-browser-llm-script-interaction-extension: cached (0.1s)
  -Users-dain-workspace-claude-code-log: cached (0.3s)
  -Users-dain-workspace-coderabbit-review-helper: cached (0.0s)
  -Users-dain-workspace-danieldemmel-me-next: cached (0.0s)
  ...
Processed 24 projects in 1.8s
Successfully processed 24 projects and created index at /Users/dain/.claude/projects/index.html
```
Actionable comments posted: 1
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
test/test_cache_integration.py (1)
395-411: Test uses obsolete JSON cache pattern.

This test still references the old per-project JSON cache (`cache_dir.glob("*.json")`), but the codebase has migrated to SQLite. The test will:

- Not find any JSON cache files (since SQLite is used)
- Skip the corruption logic entirely via the `if cache_files:` guard
- Pass without actually testing corruption recovery

Update this test to corrupt the SQLite database instead:

```diff
 def test_cache_corruption_recovery(self, setup_test_project):
     """Test recovery from corrupted cache files."""
     project_dir = setup_test_project

     # Create initial cache
     convert_jsonl_to_html(input_path=project_dir, use_cache=True)

-    # Corrupt cache file
-    cache_dir = project_dir / "cache"
-    cache_files = list(cache_dir.glob("*.json"))
-    if cache_files:
-        cache_file = [f for f in cache_files if f.name != "index.json"][0]
-        cache_file.write_text("corrupted json data", encoding="utf-8")
+    # Corrupt SQLite cache database
+    cache_db = project_dir.parent / "cache.db"
+    if cache_db.exists():
+        cache_db.write_bytes(b"corrupted database content")

     # Should recover gracefully
     output = convert_jsonl_to_html(input_path=project_dir, use_cache=True)
     assert output.exists()
```

test/test_integration_realistic.py (4)

561-698: Tests still use obsolete JSON cache pattern.

TestCacheHTMLPersistence references the old `cache/index.json` pattern throughout:

- Line 567: `cache_file = project / "cache" / "index.json"`
- Line 653: `cache_index = project / "cache" / "index.json"`
- Lines 675-698: References `cache_dir` and per-file JSON caches

These tests will likely fail or be skipped since the SQLite migration removed JSON cache files.

Update these tests to use CacheManager for cache validation:

```python
# Instead of:
cache_index = project / "cache" / "index.json"
cache_data = json.loads(cache_index.read_text())
assert cache_data["version"] == get_library_version()

# Use:
cache_manager = CacheManager(project, get_library_version())
cached_data = cache_manager.get_cached_project_data()
assert cached_data is not None
assert cached_data.version == get_library_version()
```

719-753: Test uses obsolete JSON cache pattern.

test_adding_lines_triggers_cache_update references `cache_index = project / "cache" / "index.json"`, which no longer exists with SQLite caching. The test will be skipped due to the `if not cache_index.exists()` guard.

Update to use CacheManager:

```diff
-    cache_index = project / "cache" / "index.json"
-    if not cache_index.exists():
-        pytest.skip("Cache not generated by fixture")
-
-    original_cache_mtime = cache_index.stat().st_mtime
+    cache_manager = CacheManager(project, get_library_version())
+    original_data = cache_manager.get_cached_project_data()
+    if not original_data:
+        pytest.skip("Cache not generated by fixture")
+
+    original_message_count = original_data.total_message_count

     # ... after modification ...

-    assert cache_index.stat().st_mtime > original_cache_mtime
+    new_data = cache_manager.get_cached_project_data()
+    assert new_data.total_message_count > original_message_count
```

819-844: Test uses obsolete JSON cache pattern.

test_new_file_detected_and_processed references `cache_index = project / "cache" / "index.json"` and reads JSON directly. This needs updating for SQLite.

Similar fix needed, using CacheManager to validate session counts:

```diff
-    cache_index = project / "cache" / "index.json"
-    if not cache_index.exists():
-        pytest.skip("Cache not generated by fixture")
-
-    original_cache = json.loads(cache_index.read_text())
-    original_session_count = len(original_cache.get("sessions", {}))
+    cache_manager = CacheManager(project, get_library_version())
+    original_data = cache_manager.get_cached_project_data()
+    if not original_data:
+        pytest.skip("Cache not generated by fixture")
+
+    original_session_count = len(original_data.sessions)

     # ... after adding new file ...

-    new_cache = json.loads(cache_index.read_text())
-    assert len(new_cache.get("sessions", {})) > original_session_count
+    # Re-fetch to get updated data
+    cache_manager = CacheManager(project, get_library_version())
+    new_data = cache_manager.get_cached_project_data()
+    assert len(new_data.sessions) > original_session_count
```

1162-1199: Tests reference obsolete cache directory structure.

TestErrorHandlingRealistic tests still reference `cache/index.json` and the `cache` directory structure that no longer exists:

- Line 1168: `cache_index = project / "cache" / "index.json"`
- Line 1188: `cache_dir = project / "cache"`
- Lines 1198-1199: Assertions on `cache_dir` and `index.json`

These tests need updating to validate SQLite cache corruption/recovery rather than JSON files.
♻️ Duplicate comments (3)
claude_code_log/converter.py (2)
1023-1340: `silent` flag doesn't suppress per-project and summary output.

The docstring on line 1034 states `silent=True` should "suppress verbose per-file logging (show summary only)", but lines 1119 and 1152 unconditionally print per-project progress, and lines 1334–1340 unconditionally print the global summary. Users passing `silent=True` still see all this output.

This is the same issue flagged in a previous review and remains unfixed.

Wrap per-project prints in `if not silent:` guards:

```diff
-    print(f"  {project_dir.name}: cached ({stats.total_time:.1f}s)")
+    if not silent:
+        print(f"  {project_dir.name}: cached ({stats.total_time:.1f}s)")
```

Apply the same guard to line 1152 and decide whether lines 1334–1340 (global summary) should also respect `silent` or remain unconditional. Based on the docstring, you may want a separate flag (e.g., `show_summary`) if you always want the final summary regardless of `silent`.

Also applies to: 1152-1152, 1334-1340

269-280: `has_next` logic can point to non-existent pages.

Lines 273–274 set `has_next = page_num < len(pages) or page_exceeds_threshold`, which means if the last page (`page_num == len(pages)`) happens to exceed `page_size`, `has_next` becomes `True` and line 279 generates `next_link` to a page that was never created.

The comment on lines 271–272 about "anticipating future pages" doesn't apply to static HTML: you can't link to pages that don't exist.

This is the same issue flagged in a previous review and remains unfixed.

Fix by removing the threshold check from `has_next`:

```diff
-    # Pre-enable next link if this page exceeds threshold (anticipating future pages)
-    # or if there are more pages
-    page_exceeds_threshold = page_message_count > page_size
-    has_next = page_num < len(pages) or page_exceeds_threshold
+    # Only show next link if there's actually a next page
+    has_next = page_num < len(pages)
```

claude_code_log/cache.py (1)

1093-1144: Page staleness check still doesn't compare session message counts.

Lines 1131–1139 query `message_count` from the `sessions` table but never compare it against `page_data.message_count` or per-session counts. The TODO comment on lines 1141–1142 confirms this is incomplete. This allows paginated pages to remain marked as "up_to_date" even when sessions have gained or lost messages, resulting in stale HTML.

This is the same issue flagged in a previous review and remains unfixed.

Implement the missing comparison:

```diff
 for session_id in page_data.session_ids:
     row = conn.execute(
         """SELECT message_count FROM sessions
            WHERE project_id = ? AND session_id = ?""",
         (self._project_id, session_id),
     ).fetchone()
     if not row:
         return True, "session_missing"
-    # We need to check if session content changed
-    # For now, just check if session exists
+    # Compare session message count to detect changes
+    # Note: page_data doesn't store per-session counts, so we need a workaround
+    # For now, return stale if any session count changed from when page was created
+    # (This requires storing per-session counts in page cache or re-aggregating here)
```

Alternatively, store per-session message counts in the `page_sessions` table and compare them here, or invalidate all pages whenever `total_message_count` changes at the project level.
🧹 Nitpick comments (6)
claude_code_log/cli.py (1)
436-441: Consider adding validation for --page-size.

The `page_size` option accepts any integer, including zero or negative values, which could cause issues downstream (e.g., infinite loops or division by zero).

Add a callback to validate the minimum page size:

```diff
 @click.option(
     "--page-size",
     type=int,
     default=2000,
+    callback=lambda ctx, param, value: value if value >= 100 else
+        ctx.fail("--page-size must be at least 100"),
     help="Maximum messages per page for combined transcript (default: 2000). Sessions are never split across pages.",
 )
```

Alternatively, use `click.IntRange(min=100)` for cleaner validation.

claude_code_log/renderer.py (3)
83-122: Project-display-name logic and tmp-path filtering look good, with minor optional refinements.

Filtering `working_directories` via `_is_tmp_path` before picking the least‑nested path is a sensible way to avoid pytest/macOS temp paths influencing display names. The fallback that reverses the encoded project dir name is also reasonable.

If you find index pages end up showing full paths in the fallback case, consider additionally taking `Path(display_name).name` there so you still get a short repo name, but that's optional and can be adjusted based on how `project_dir_name` is actually populated.

200-248: Lexer/formatter caching is a solid performance improvement.

Introducing `_lexer_cache`, `_formatter_cache`, and `_get_markdown_formatter` to reuse Pygments lexers and `HtmlFormatter` instances should significantly reduce per‑message overhead, especially for large transcripts with many code blocks. The cache keys (`lang:stripall` and `(linenos, linenostart)`) are straightforward and align with how these objects are configured.

One optional refinement: since `_get_cached_lexer` already handles `ClassNotFound`, the `except ClassNotFound` branch in `_highlight_code_with_pygments` is now effectively dead and could be removed in a follow-up to simplify the control flow.

699-773: Optimized `_highlight_code_with_pygments` matches SQLite/perf goals and integrates with new caches.

Switching from `get_lexer_for_filename` to a one‑time `get_all_lexers()` pattern/extension map avoids filesystem I/O and antivirus hits, and then relying on `_get_cached_lexer` plus `_get_cached_formatter` reuses heavy Pygments objects. The extension‑first lookup and fnmatch fallback preserve the previous behavior while being more efficient.

As noted above, the outer `try/except ClassNotFound` is now redundant because `_get_cached_lexer` already falls back to `TextLexer`, but keeping it doesn't break anything.

claude_code_log/cache.py (2)
119-163: Broad exception handling hides errors in version detection.

All three version-detection strategies use `except Exception: pass`, which silently swallows import errors, file I/O issues, and parsing failures. This makes debugging difficult if, for example, the toml library is missing or pyproject.toml is malformed.

Consider either catching more specific exceptions or logging suppressed errors to aid troubleshooting.

```diff
 except Exception:
-    pass
+    # Package not installed, will try file-based methods
+    pass
```

396-396: Move `dateparser` import to module level.

The `import dateparser` on line 396 is inside the method. Module-level imports improve readability, enable type-checking tools to detect missing dependencies earlier, and align with the coding guideline that recommends using dateparser for date parsing throughout the project.

Move to the top of the file:

```diff
 from pathlib import Path
 from typing import Any, Dict, Generator, List, Optional
+
+import dateparser
```

Then remove the import from line 396.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
- claude_code_log/cache.py (4 hunks)
- claude_code_log/cli.py (7 hunks)
- claude_code_log/converter.py (17 hunks)
- claude_code_log/migrations/001_initial_schema.sql (1 hunks)
- claude_code_log/migrations/002_html_cache.sql (1 hunks)
- claude_code_log/migrations/003_html_pagination.sql (1 hunks)
- claude_code_log/migrations/__init__.py (1 hunks)
- claude_code_log/migrations/runner.py (1 hunks)
- claude_code_log/renderer.py (10 hunks)
- claude_code_log/templates/components/page_nav_styles.css (1 hunks)
- claude_code_log/templates/transcript.html (1 hunks)
- claude_code_log/tui.py (2 hunks)
- claude_code_log/utils.py (0 hunks)
- test/__snapshots__/test_snapshot_html.ambr (5 hunks)
- test/test_cache.py (7 hunks)
- test/test_cache_integration.py (8 hunks)
- test/test_cache_sqlite_integrity.py (1 hunks)
- test/test_html_regeneration.py (4 hunks)
- test/test_integration_realistic.py (4 hunks)
- test/test_pagination.py (1 hunks)
- test/test_project_display_name.py (1 hunks)
- test/test_project_matching.py (2 hunks)
💤 Files with no reviewable changes (1)
- claude_code_log/utils.py
🚧 Files skipped from review as they are similar to previous changes (8)
- claude_code_log/migrations/init.py
- claude_code_log/templates/transcript.html
- test/test_project_matching.py
- claude_code_log/migrations/001_initial_schema.sql
- claude_code_log/templates/components/page_nav_styles.css
- claude_code_log/migrations/002_html_cache.sql
- claude_code_log/migrations/runner.py
- test/test_cache_sqlite_integrity.py
🧰 Additional context used
📓 Path-based instructions (6)
test/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Organize tests into categories with pytest markers to avoid async event loop conflicts: unit tests (no mark), TUI tests (@pytest.mark.tui), browser tests (@pytest.mark.browser), and snapshot tests
Files:
- test/test_cache_integration.py
- test/test_project_display_name.py
- test/test_integration_realistic.py
- test/test_pagination.py
- test/test_html_regeneration.py
- test/test_cache.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use ruff for code formatting and linting, with ruff check --fix for automatic fixes
Use pyright and mypy for type checking in Python code
Target Python 3.10+ with support for modern Python features and type hints
Files:
- test/test_cache_integration.py
- test/test_project_display_name.py
- test/test_integration_realistic.py
- test/test_pagination.py
- claude_code_log/cli.py
- test/test_html_regeneration.py
- claude_code_log/cache.py
- test/test_cache.py
- claude_code_log/tui.py
- claude_code_log/converter.py
- claude_code_log/renderer.py
claude_code_log/cli.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use Click for CLI interface and argument parsing in Python CLI files
Files:
claude_code_log/cli.py
claude_code_log/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use dateparser for natural language date parsing to support date range filtering with expressions like 'today', 'yesterday', 'last week', and relative dates
Files:
- claude_code_log/cli.py
- claude_code_log/cache.py
- claude_code_log/tui.py
- claude_code_log/converter.py
- claude_code_log/renderer.py
claude_code_log/tui.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use Textual for implementing the interactive Terminal User Interface (TUI) in Python
Files:
claude_code_log/tui.py
claude_code_log/renderer.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use mistune for quick Markdown rendering with syntax highlighting support in server-side template rendering
Files:
claude_code_log/renderer.py
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Generate cache-aware HTML output with automatic cache validation and leverage existing cache system for fast loading
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Structure the project with separate modules for parser.py (data extraction), renderer.py (HTML generation), converter.py (high-level orchestration), cli.py (CLI interface), models.py (Pydantic data structures), tui.py (Textual TUI), and cache.py (cache management)
Applied to files:
- test/test_cache_integration.py
- test/test_integration_realistic.py
- claude_code_log/cache.py
- claude_code_log/converter.py
- claude_code_log/renderer.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Group messages chronologically by timestamp across all sessions and process entire project hierarchies with linked index pages
Applied to files:
- claude_code_log/migrations/003_html_pagination.sql
- claude_code_log/converter.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/templates/**/*.html : Use Jinja2 templates for HTML generation, including session navigation with table of contents, message rendering with different content types, and token display for individual messages and session totals
Applied to files:
- test/__snapshots__/test_snapshot_html.ambr
- claude_code_log/converter.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to test/test_snapshot_html.py : Use syrupy for HTML snapshot regression testing to detect unintended changes in HTML output
Applied to files:
- test/__snapshots__/test_snapshot_html.ambr
- test/test_html_regeneration.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/cli.py : Use Click for CLI interface and argument parsing in Python CLI files
Applied to files:
claude_code_log/cli.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/models.py : Use Pydantic models for parsing and validating transcript JSON data, including TranscriptEntry (union of UserTranscriptEntry, AssistantTranscriptEntry, SummaryTranscriptEntry), UsageInfo, and ContentItem
Applied to files:
- claude_code_log/cache.py
- claude_code_log/converter.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/renderer.py : Use mistune for quick Markdown rendering with syntax highlighting support in server-side template rendering
Applied to files:
claude_code_log/renderer.py
📚 Learning: 2025-11-30T17:16:32.495Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-30T17:16:32.495Z
Learning: Applies to claude_code_log/renderer_timings.py : Implement performance timing instrumentation via the CLAUDE_CODE_LOG_DEBUG_TIMING environment variable to identify performance bottlenecks in rendering phases, with detailed timing for initialization, deduplication, session summary processing, main message loop, and template rendering
Applied to files:
claude_code_log/renderer.py
🧬 Code graph analysis (8)
test/test_project_display_name.py (1)
claude_code_log/renderer.py (1)
get_project_display_name(94-122)
test/test_integration_realistic.py (1)
claude_code_log/cache.py (2)
get_library_version(119-162)get_cached_project_data(604-676)
test/test_pagination.py (2)
claude_code_log/cache.py (10)
CacheManager(168-1184)SessionCacheData(39-52)get_page_count(1173-1184)get_page_size_config(900-914)update_page_cache(1010-1091)get_page_data(916-958)is_page_stale(1093-1144)invalidate_all_pages(1146-1171)get_all_pages(960-1008)get_library_version(119-162)claude_code_log/converter.py (3)
_get_page_html_path(107-114)_assign_sessions_to_pages(117-151)convert_jsonl_to_html(339-514)
claude_code_log/cli.py (2)
claude_code_log/cache.py (2)
get_working_directories(580-594)CacheManager(168-1184)claude_code_log/converter.py (1)
convert_jsonl_to_html(339-514)
test/test_html_regeneration.py (4)
test/conftest.py (1)
test_data_dir(11-13)test/test_cache.py (1)
cache_manager(42-45)test/test_pagination.py (1)
cache_manager(156-159)claude_code_log/cache.py (2)
CacheManager(168-1184)get_library_version(119-162)
test/test_cache.py (1)
claude_code_log/cache.py (4)
is_file_cached(340-361)save_cached_entries(440-501)get_cached_project_data(604-676)clear_cache(678-686)
claude_code_log/tui.py (1)
claude_code_log/cache.py (1)
get_working_directories(580-594)
claude_code_log/renderer.py (1)
claude_code_log/renderer_timings.py (1)
timing_stat(88-110)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.11)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.10)
🔇 Additional comments (32)
claude_code_log/tui.py (1)
8-8: LGTM! Clean migration to SQLite-backed cache.

The changes correctly adapt to the centralized CacheManager:

- Proper type annotation with `List[str]` instead of `None`
- Direct call to `cache_manager.get_working_directories()` aligns with the SQLite cache schema
- Exception handling maintains graceful fallback behavior

Based on the relevant code snippet, `get_working_directories()` queries distinct `cwd` values from the sessions table, which is the right approach for the cache migration.

Also applies to: 388-393
test/test_project_display_name.py (1)
118-140: LGTM! Excellent test coverage for temporary path filtering.

The two new tests appropriately verify the `_is_tmp_path` filtering logic:

- `test_tmp_paths_filtered_out` confirms that real project paths are preferred over pytest/macOS temp directories
- `test_only_tmp_paths_falls_back` ensures graceful fallback when only temporary paths are available

The test data uses realistic temporary path patterns, and the assertions clearly validate the expected behavior.
test/__snapshots__/test_snapshot_html.ambr (2)
4275-4348: LGTM! Snapshot correctly reflects pagination CSS.

The added page navigation styling across all snapshots correctly captures the expected HTML output for the new pagination feature. The CSS includes proper visual hierarchy, hover states, and responsive layout as described in the PR objectives.
Based on learnings, syrupy snapshot tests detect unintended HTML changes, and these updates align with the pagination implementation.
Also applies to: 9081-9154, 13989-14062, 18933-19006
11574-11574: LGTM! Test identifier improvement.

The title update from "tmp" to "test_multi_session_html0" improves test clarity and maintains consistency between the `<title>` tag and the H1 heading.

Also applies to: 14067-14067
claude_code_log/migrations/003_html_pagination.sql (1)
1-39: Well-structured pagination schema.

The migration correctly establishes pagination support with proper constraints:
- Unique constraint on `(project_id, page_number)` prevents duplicate pages
- Cascade deletes ensure cleanup when projects are removed
- Indexes on foreign keys and `session_id` will optimize common queries

Note: `page_sessions.session_id` is TEXT without a foreign key to the `sessions` table. This appears intentional for flexibility (sessions can be recreated without breaking page mappings), but it means orphaned session references are possible if sessions are deleted without page regeneration.

test/test_cache_integration.py (3)
98-114: LGTM - Correct validation of --no-cache behavior.

The test properly verifies that:
- SQLite database file is created (infrastructure exists)
- Project data exists but has zero messages (cache not populated)
This correctly reflects the new centralized cache architecture where the database exists but caching is bypassed.
115-139: LGTM - Cache clearing test updated correctly.

The test now validates cache clearing through CacheManager's SQL-backed data rather than filesystem artifacts.
427-441: LGTM - Version upgrade test uses SQLite correctly.

The test properly validates version tracking through the CacheManager and the `cached_data.version` property.

test/test_html_regeneration.py (4)
51-57: LGTM - Flexible assertion handles output format changes.

The updated assertion correctly accepts either the legacy per-file skip message or the new early-exit message, making the test resilient to output format changes while still validating the skip behavior.
321-361: LGTM - Comprehensive html_cache tracking test.

The test thoroughly validates that:
- Combined transcript cache entries have `source_session_id = None`
- Session HTML cache entries correctly link to their session IDs
- Message counts are tracked in cache entries
363-415: Excellent incremental regeneration test.

This test is critical for validating the performance optimization. It properly:
- Creates isolated sessions in separate files
- Modifies only one session
- Asserts the unmodified session's HTML remains unchanged (same mtime)
The 1.1-second sleep ensures filesystem timestamp granularity is respected.
417-492: LGTM - Version mismatch and staleness detection tests.

These tests validate important cache invalidation scenarios:
- Library version changes trigger regeneration
- Message count changes mark sessions as stale
test/test_integration_realistic.py (2)
242-261: LGTM - SQLite cache deletion test.

The test correctly validates that:
- `cache.db` is created after processing
- `cache.db` is deleted after `--clear-cache`
415-427: LGTM - Cache creation test updated for SQLite.

The test properly validates:
- SQLite `cache.db` exists after processing
- Per-project data is accessible via CacheManager
- Version and sessions structure are present
claude_code_log/cli.py (4)
259-302: LGTM - Cache clearing handles both SQLite and legacy JSON caches.

The implementation properly:
- Deletes the centralized SQLite `cache.db`
- Cleans up old per-project JSON cache directories for migration compatibility
- Handles both all-projects and single-directory modes

The inline `import shutil` is acceptable for lazy loading since it's only needed in cleanup paths.
190-210: LGTM - Direct use of get_working_directories().

Replacing `project_cache.working_directories` with `cache_manager.get_working_directories()` is cleaner and avoids loading the full project cache just to get directories.
586-593: LGTM - page_size correctly threaded to process_projects_hierarchy.

The parameter is properly passed through to the hierarchy processing function.
635-643: LGTM - page_size correctly threaded to convert_jsonl_to_html.

The parameter is properly passed for single directory/file processing.
test/test_pagination.py (3)
21-135: Session-to-page assignment tests look correct and align with `_assign_sessions_to_pages` semantics

The test matrix around single/multiple sessions, overflow, chronological ordering, and empty input closely matches the implementation in `claude_code_log.converter._assign_sessions_to_pages`, including the “overflow allowed but no session splitting” behavior. This gives good confidence that later pagination changes won’t silently break session grouping.
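For illustration, the “overflow allowed but no session splitting” semantics boil down to a greedy assignment. This is a simplified stand-in, not the actual `_assign_sessions_to_pages` (which operates on richer session objects); the function name and the `(session_id, message_count)` tuples here are assumptions for the sketch:

```python
from typing import List, Tuple


def assign_sessions_to_pages(
    sessions: List[Tuple[str, int]],  # (session_id, message_count), chronological
    page_size: int,
) -> List[List[str]]:
    """Greedy page assignment: sessions are never split across pages,
    so a page may overflow page_size when one session exceeds the limit."""
    pages: List[List[str]] = []
    current: List[str] = []
    current_count = 0
    for session_id, count in sessions:
        # Start a new page only when the current page is non-empty and
        # adding this session would push it past the page size.
        if current and current_count + count > page_size:
            pages.append(current)
            current = []
            current_count = 0
        current.append(session_id)
        current_count += count
    if current:
        pages.append(current)
    return pages
```

Note that a single oversized session still gets its own (overflowing) page rather than being split, which is exactly what the tests assert.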
142-346: Solid coverage of page-level cache API behavior

The fixture + `CacheManager` tests exercise the full page-cache surface (`get_page_count`, `get_page_size_config`, `update_page_cache`, `get_page_data`, `get_all_pages`, `is_page_stale`, `invalidate_all_pages`) with realistic values. Using a temp project dir and verifying both count and per-page metadata (timestamps, token totals, session ordering) is a good fit for the new SQLite schema.
351-544: Pagination + converter integration tests are well-structured

The integration tests construct small synthetic projects to cover no-pagination, multi-page pagination, page-size changes, tiny page sizes (one session per page), and HTML-level checks for navigation and stats. These scenarios mirror the performance and UX concerns described in the PR objectives and should catch regressions in `_generate_paginated_html`, `page_info`, and `page_stats` wiring.

test/test_cache.py (7)
25-32: Fixture change correctly isolates SQLite DB per test

Creating a `project` subdirectory and relying on `project_dir.parent / "cache.db"` ensures each test instance gets its own database file under a unique temporary parent, which avoids cross-test coupling given the centralized cache design.
95-104: Initialization test nicely pins down new `db_path` contract

Asserting `project_path`, `library_version`, and that `db_path == temp_project_dir.parent / "cache.db"` (and exists) captures the intended public behavior of `CacheManager` in the SQLite-backed world. This should make future refactors to cache location safer.
112-124: `is_file_cached` assertion strengthens save/load coverage

Checking `cache_manager.is_file_cached(jsonl_path)` immediately after `save_cached_entries` validates the new mtime-based cache-validity logic, not just round-trip serialization, which is a good addition to this test.
186-205: Version-mismatch test correctly exercises cache-compatibility behavior

Validating that a `CacheManager` instantiated with a newer, but compatible, library version still sees the original cached project version (1.0.0) via `get_cached_project_data` is a good guard against accidentally wiping or rewriting caches on harmless upgrades.
259-277: `clear_cache` behavior is well specified

Asserting that `is_file_cached` becomes false and that `cached_files` and `sessions` are emptied (while `get_cached_project_data()` still returns a project record) precisely matches the documented intent of `clear_cache`: reset project data without deleting the DB itself.
580-587: Error-handling test for missing cache entries is appropriate

The combination of `assert not cache_manager.is_file_cached(jsonl_path)` and expecting `load_cached_entries` to return `None` when the file exists but was never cached aligns with the new SQLite-backed semantics and should prevent regressions in “cold cache” behavior.
599-618: Permission-handling test remains useful despite SQLite migration

Even though the cache is now centralized in `cache.db`, keeping this test around a read-only `cache` directory under the project path is a good defensive check that unexpected filesystem layout or permissions don’t cause `CacheManager` construction to blow up.

claude_code_log/renderer.py (2)
277-306: Cached mistune renderer + timing hook are wired correctly

Using a module-level `_markdown_renderer` via `_get_markdown_renderer()` avoids repeatedly constructing mistune instances and re-adding the Pygments plugin, while `render_markdown` still wraps rendering in `timing_stat("_markdown_timings")` so the existing timing instrumentation continues to work. This matches the guideline to use mistune for fast markdown rendering with syntax highlighting, and should help with the markdown-heavy bottleneck identified in the PR description.
3128-3303: `generate_html` pagination context (page_info/page_stats) is threaded through cleanly

Extending `generate_html`’s signature with optional `page_info` and `page_stats` and passing them directly into the `transcript.html` template fits well with the new pagination design. Existing call sites that only pass `messages`, `title`, and `combined_transcript_link` remain valid, while paginated callers (e.g., `_generate_paginated_html` via the converter) can now drive per-page navigation and stats, as exercised by `test/test_pagination.py`.

claude_code_log/converter.py (2)
39-105: LGTM!

The `GenerationStats` dataclass provides clear tracking of cache usage, HTML regeneration, timing, and errors/warnings, with a well-formatted summary output. The design is clean and maintainable.
436-466: Pagination decision logic is well-designed.

Lines 444–451 enable pagination when `total_message_count > page_size` or `existing_page_count > 1`. This "sticky" behavior prevents toggling between paginated and non-paginated modes as message counts fluctuate around the threshold, which would otherwise cause unnecessary regeneration and confuse users with inconsistent navigation.
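The sticky decision rule quoted above amounts to a one-liner. `should_paginate` is an illustrative name for this sketch, not the converter's actual function:

```python
def should_paginate(
    total_message_count: int,
    page_size: int,
    existing_page_count: int,
) -> bool:
    """Sticky pagination decision: once a project has been split into
    multiple pages, keep paginating even if the message count later
    drops back under the threshold, to avoid mode flapping."""
    return total_message_count > page_size or existing_page_count > 1
```

The second clause is what makes the behavior sticky: a project already rendered as multiple pages stays paginated even when its message count dips under `page_size`.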
|
I'm a bit torn on this below. I myself have `"cleanupPeriodDays": 99999`, but the default is only 30 days of log retention, so keeping deleted logs could be really nice. On the other hand, it could be confusing and unwanted behaviour, and it also adds complexity. I'm leaning towards not doing it, but wanted to at least have it documented here.

Rearchitecture: SQLite Cache as Source of Truth for Archived Sessions

Problem: Claude Code has cleanupPeriodDays (default 30 days), which auto-deletes session JSONL files once they age out.

Solution: Treat the SQLite cache as the canonical data store. Auto-detect "archived" sessions whose JSONL files have been deleted.

User Requirements

Implementation Plan

Phase 1: Schema & Core Infrastructure
1.1 Add Migration: ALTER TABLE sessions ADD COLUMN is_archived INTEGER DEFAULT 0;
1.2 Update SessionCacheData Model (claude_code_log/cache.py)
1.3 Add CacheManager Methods (claude_code_log/cache.py)

Phase 2: Data Loading Flow
2.1 Update ensure_fresh_cache() (claude_code_log/converter.py)
2.2 Update load_directory_transcripts() (claude_code_log/parser.py)

Phase 3: TUI Changes (claude_code_log/tui.py)
3.1 Remove JSONL File Requirement
3.2 Update populate_table()
3.3 Update action_resume_selected()
3.4 Update Expanded Row Content

Phase 4: HTML Template Changes
4.1 Session Navigation (templates/components/session_nav.html)
4.2 Transcript Template (templates/transcript.html)
4.3 Index Template (templates/index.html)
4.4 CSS Styles: .session-link.archived { opacity: 0.8; border-left: 3px solid

Phase 5: Testing

Files to Modify

Migration Strategy
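The auto-detection step sketched in Phase 2 could look roughly like this. This is a minimal sketch against a deliberately simplified schema; the real `sessions` table in cache.py has more columns and different relationships, and `mark_archived_sessions` is a hypothetical helper name:

```python
import sqlite3
from pathlib import Path
from typing import List

# Illustrative mini-schema only; not the real cache.py schema.
SCHEMA = """
CREATE TABLE sessions (
    session_id TEXT PRIMARY KEY,
    jsonl_path TEXT NOT NULL,
    is_archived INTEGER DEFAULT 0
);
"""


def mark_archived_sessions(conn: sqlite3.Connection) -> List[str]:
    """Flag sessions whose backing JSONL file no longer exists on disk,
    instead of deleting their cached messages."""
    archived: List[str] = []
    rows = conn.execute(
        "SELECT session_id, jsonl_path FROM sessions WHERE is_archived = 0"
    ).fetchall()
    for session_id, jsonl_path in rows:
        if not Path(jsonl_path).exists():
            conn.execute(
                "UPDATE sessions SET is_archived = 1 WHERE session_id = ?",
                (session_id,),
            )
            archived.append(session_id)
    conn.commit()
    return archived
```

Running this during cache refresh would let cleanupPeriodDays delete the JSONL files while their messages survive in SQLite, flagged for the TUI and HTML templates described in Phases 3-4.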
|
|
On the other hand, we could also use the SQLite db as is to restore deleted JSONL files (until a manual cache reset):

Yes, we can fully restore JSONL files from the database! Here's what we store:

To restore a JSONL file, you'd simply:

```python
# Query messages for a specific file
rows = conn.execute(
    "SELECT content FROM messages WHERE file_id = ? ORDER BY timestamp",
    (file_id,),
)

# Write each row's content as a JSONL line
with open(output_path, "w") as f:
    for row in rows:
        f.write(row["content"] + "\n")
```

The only thing we'd lose is the exact order within same-timestamp messages. This is actually quite reassuring - it means the SQLite cache is a complete backup of the transcript data.
|
- Migrate from JSON cache to SQLite for better performance and integrity
- Add database migrations system for schema versioning
- Implement incremental HTML regeneration (only rebuild changed sessions)
- Add pagination support for large projects
- Improve working directories handling
- Use explicit UTF-8 encoding for Windows compatibility
936ac19 to a534b79
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
claude_code_log/converter.py (1)
980-1032: `ensure_fresh_cache` and `has_cache_changes` corrupt cache aggregates when called with date filters.

When `from_date`/`to_date` are provided:
- Both functions treat date filters as invalidation reasons, forcing cache recomputation.
- `ensure_fresh_cache` calls `load_directory_transcripts(project_dir, cache_manager, from_date, to_date, ...)`, which loads filtered messages only via `load_cached_entries_filtered`.
- `_update_cache_with_session_data` then calculates aggregates (message counts, timestamps, token totals) based on this subset, overwriting the global cache.

Subsequent calls without date filters will:
- See `cached_project_data` present and skip the cache update.
- Use stale, subset-based aggregates in `sessions` and `projects` metadata (affecting pagination, HTML staleness checks, and index stats).

The fix requires removing date filters from the invalidation logic in both `ensure_fresh_cache` and `has_cache_changes`, and always passing `None, None` to `load_directory_transcripts` to ensure cache aggregates reflect the full dataset. Date filtering should apply only at the presentation layer (via `filter_messages_by_date`), not during cache population.
♻️ Duplicate comments (2)
claude_code_log/migrations/runner.py (1)
29-56: Migration application and recording aren’t atomic; risk of “applied but not recorded” state.
`apply_migration()` runs `conn.executescript(content)` and only then inserts into `_schema_version` and commits. If the insert or commit fails (disk full, constraint error, concurrent writer), the migration SQL may be applied while `_schema_version` doesn’t record it, leaving the DB in an inconsistent state and making future non-idempotent migrations fragile. This matches a previously raised concern on this file.

To make migration + recording atomic, wrap them in a single transaction or savepoint. For example:
Suggested SAVEPOINT-based fix
```diff
 def apply_migration(
     conn: sqlite3.Connection, version: int, migration_path: Path
 ) -> None:
     """Apply a single migration and record it in _schema_version."""
     content = migration_path.read_text(encoding="utf-8")
     checksum = _compute_checksum(content)
-    # Execute the migration SQL
-    conn.executescript(content)
-
-    # Record the migration
-    conn.execute(
-        """
-        INSERT INTO _schema_version (version, filename, applied_at, checksum)
-        VALUES (?, ?, ?, ?)
-        """,
-        (version, migration_path.name, datetime.now().isoformat(), checksum),
-    )
-    conn.commit()
+    savepoint = f"migration_{version}"
+    try:
+        conn.execute(f"SAVEPOINT {savepoint}")
+        # Execute the migration SQL
+        conn.executescript(content)
+        # Record the migration
+        conn.execute(
+            """
+            INSERT INTO _schema_version (version, filename, applied_at, checksum)
+            VALUES (?, ?, ?, ?)
+            """,
+            (version, migration_path.name, datetime.now().isoformat(), checksum),
+        )
+        conn.execute(f"RELEASE SAVEPOINT {savepoint}")
+    except Exception as e:
+        conn.execute(f"ROLLBACK TO SAVEPOINT {savepoint}")
+        conn.execute(f"RELEASE SAVEPOINT {savepoint}")
+        raise RuntimeError(
+            f"Failed to apply/record migration {version}: {e}. "
+            "Database may be in inconsistent state."
+        ) from e
```

Please double-check the exact SQLite `executescript` transaction semantics against your target Python/SQLite versions before adopting this, as behavior can differ slightly across versions.

Also applies to: 92-111
claude_code_log/cache.py (1)
1095-1146: Paginated pages don’t become stale when session message counts change.
`is_page_stale()` fetches `message_count` from `sessions` for each `session_id` on the page but only checks whether the row exists. If a session gains or loses messages, `html_pages.message_count` and the HTML file stay unchanged, and the page is treated as `"up_to_date"`. This can leave paginated combined transcripts out of sync with the SQLite cache.

You’re already computing page-level `message_count` in `update_page_cache()`, so a simple consistency check here would fix it.

One possible fix is to compare the current summed session counts against the stored page message count:
Suggested consistency check in `is_page_stale`

```diff
     # Check if any session on this page has changed
     with self._get_connection() as conn:
-        for session_id in page_data.session_ids:
-            row = conn.execute(
-                """SELECT message_count FROM sessions
-                WHERE project_id = ? AND session_id = ?""",
-                (self._project_id, session_id),
-            ).fetchone()
-
-            if not row:
-                return True, "session_missing"
-
-            # We need to check if session content changed
-            # For now, just check if session exists
+        total_message_count = 0
+        for session_id in page_data.session_ids:
+            row = conn.execute(
+                """SELECT message_count FROM sessions
+                WHERE project_id = ? AND session_id = ?""",
+                (self._project_id, session_id),
+            ).fetchone()
+
+            if not row:
+                return True, "session_missing"
+
+            total_message_count += row["message_count"] or 0
+
+        if total_message_count != page_data.message_count:
+            return True, "session_updated"

     return False, "up_to_date"
```

If you prefer to count messages after deduplication instead, you could also adjust `_generate_paginated_html()` so that `page_message_count` is computed from a consistent source and stored into `html_pages.message_count`, then compare against that here.
🧹 Nitpick comments (20)
claude_code_log/converter.py (2)
415-481: GenerationStats is a good foundation, but currently underused.

The dataclass looks well-structured for per-project reporting, but most fields (e.g., `cache_time`, `render_time`, `combined_regenerated`) aren’t populated yet, and `summary()` isn’t called. That’s fine for now, but consider either wiring these fields into `process_projects_hierarchy()` or trimming unused attributes to reduce drift.
1294-1432: Return value of `_generate_individual_session_files` is unused; consider wiring into stats or dropping.

The helper now returns `regenerated_count`, but callers (e.g., `convert_jsonl_to`) ignore it and `GenerationStats.sessions_regenerated` is driven instead from a stale-session pre-scan. Either:
- Use the returned count to update `GenerationStats.sessions_regenerated`, or
- Make `_generate_individual_session_files` return `None` again to avoid confusion.

claude_code_log/html/templates/components/page_nav_styles.css (1)
1-74: LGTM - Well-structured pagination styling.

The CSS is clean and maintainable, with:
- Semantic class naming following consistent patterns
- Proper use of CSS variables for theming consistency
- Modern flexbox layouts with responsive wrapping
- Smooth hover transitions for better UX
- Appropriate visual hierarchy through sizing and colors
Minor: Consider clarifying or removing empty pseudo-elements.
Lines 68-73 define `::before` and `::after` pseudo-elements with empty content:

```css
.page-nav-link.prev::before { content: ''; }
.page-nav-link.next::after { content: ''; }
```

If these are placeholders for future icons or styling, consider adding a comment explaining the intent. Otherwise, they can be removed to simplify the stylesheet.
🔎 Optional: Add clarifying comment
```diff
+/* Placeholder pseudo-elements for potential arrow icons or spacing */
 .page-nav-link.prev::before { content: ''; }
 .page-nav-link.next::after { content: ''; }
```

test/test_pagination.py (1)
142-346: Page cache API tests are well structured but assume ordering semantics

The `TestPageCacheMethods` suite exercises read/write, staleness reasons, invalidation, and `get_all_pages()` in a way that tightly matches the new API. Note that the assertions on `all_pages[0].page_number == 1` and `[1].page_number == 2` implicitly require `get_all_pages()` to return rows ordered by `page_number`; keep that ordering guaranteed in the implementation or document it, since tests and callers now rely on it.
49-57: Skip-message assertions correctly handle both legacy and new messages

Relaxing the checks to accept either per-file “skipping regeneration” or the new “All HTML files are current” early-exit message makes the tests robust across the refactor while still verifying that regeneration is skipped. The approach of scanning `mock_print.call_args_list` by substring is fine for this use case.

Also applies to: 106-115
321-492: Incremental HTML cache tests thoroughly validate html_cache behavior

The `TestIncrementalHtmlCache` tests exercise combined vs per-session cache entries, selective regeneration based on per-session changes, version mismatches, and message-count-based staleness. They accurately mirror the `CacheManager.get_html_cache`/`is_html_stale` behavior and look correct. The 1.1s sleeps are appropriate given the 1s mtime tolerance, but will add some runtime; only worth revisiting if the suite becomes slow.
91-177: SQL cascade, token-sum, and FK constraint tests are well targeted

The cascade-delete, token aggregation, and foreign-key tests exercise exactly the right invariants (projects → cached_files/messages/sessions, messages.file_id/project_id validity, and sum(input_tokens/output_tokens) correctness). Using `_get_connection` and `_project_id` is acceptable for this “white-box” integrity module; just be aware these tests will need updating if internal table names or relationships change.

Also applies to: 179-244, 246-286
288-402: Serialization round-trip tests cover complex content types effectively

The round-trip test for ToolUse, ToolResult, and Thinking content ensures that `content` JSON and the `create_transcript_entry` path preserve entry types and content item types. If you want even stronger guarantees later, you could also assert key fields (e.g., `tool_use_id`, `thinking` text) match, but the current checks are already valuable.
610-711: WAL, concurrency, and performance tests are reasonable but mildly timing-sensitive

The WAL mode check and concurrent reader test validate important SQLite configuration choices and multi-threaded read safety. The “large dataset” performance test with 1k rows and a 2s upper bound is conservative enough for most CIs, but if you ever see flakes under load, consider relaxing the threshold or asserting only on successful completion plus a sanity bound on result size.
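For context, the connection setup these tests validate can be sketched as follows. This is illustrative only; cache.py's actual `open` logic (helper name `open_cache_db` included) may set different or additional pragmas:

```python
import sqlite3
from pathlib import Path


def open_cache_db(db_path: Path) -> sqlite3.Connection:
    """Open the shared cache DB with WAL journaling so concurrent
    readers don't block the writer (and vice versa)."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row
    # WAL is persistent: once set, it sticks to the database file.
    conn.execute("PRAGMA journal_mode=WAL")
    # Foreign keys are per-connection and off by default in SQLite.
    conn.execute("PRAGMA foreign_keys=ON")
    return conn
```

Note that `journal_mode=WAL` requires a file-backed database; an in-memory database silently stays in `memory` mode, which is one reason these tests use a real temp file.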
test/test_integration_realistic.py (1)
845-860: UTF-8 encoding usage is now consistent for new writes

The updated `write_text`/`open(..., encoding="utf-8")` calls for JSONL and HTML writes make the tests more robust across platforms and match the rest of the codebase’s encoding discipline. No behavior changes beyond safer I/O.

Also applies to: 902-903, 959-961, 1507-1508, 1523-1540
claude_code_log/cli.py (3)
186-255: Relative-match discovery now correctly leverages cached working directories

The changes in `_find_relative_matches` to use `CacheManager.get_working_directories()` (and to lazily build the cache via `convert_jsonl_to_html(..., silent=True)` when empty) are consistent with the new SQLite cache design and should make CWD matching more accurate once sessions are cached. This does mean the first run for uncached projects can be a bit heavier, but that’s expected for priming the cache.
257-315: _clear_caches now handles both SQLite and legacy JSON caches

For `--all-projects`, deleting the shared `cache.db` and then cleaning per-project `cache` directories is the right migration path from JSON to SQLite. For a single directory, using `CacheManager.clear_cache()` and then removing any legacy `cache` directory preserves other projects’ DB data while cleaning the one project. The behavior matches the updated tests.
459-487: The --page-size option is correctly wired, but consider adding input validation

The `--page-size` parameter is properly threaded through both `process_projects_hierarchy()` and `convert_jsonl_to()`, allowing CLI users to tune pagination of combined transcripts. Currently, it accepts any integer without validation. Consider constraining it to positive values with `click.IntRange(min=1)` to avoid unexpected behavior from zero or negative inputs, which would cause every session to be on its own page or trigger odd edge cases in the pagination logic.
475-535: Pagination metadata plumbing into the renderer is backward-compatible

Extending `HtmlRenderer.generate` and `generate_html` with `page_info` and `page_stats` and adding them to the Jinja context cleanly supports paginated transcripts without breaking existing call sites (the new params are optional and appended). The docstrings match the new behavior, and `generate_session` still calls `generate` via keyword args so it remains unaffected.

Also applies to: 604-628
test/test_cache_integration.py (3)
88-114: --no-cache CLI test captures intended “no new data written” semantics

`test_cli_no_cache_flag` now asserts that after `--clear-cache` and a subsequent run with `--no-cache`, the shared `cache.db` exists but the project’s `total_message_count` remains 0. This matches `CacheManager.clear_cache()` plus a non-caching run and is a reasonable contract for “no-cache” mode.
115-140: Converter/CLI cache-on vs cache-off tests align with SQLite behavior but hard-code version

The tests that assert cache presence (`cache.db` + `cached_files` entries) vs emptiness (`total_message_count == 0`) correctly reflect the difference between `use_cache=True` and `use_cache=False`. They currently construct `CacheManager(..., "1.0.0")`; to reduce coupling to `_is_cache_version_compatible` rules, you might consider using `get_library_version()` (or patching it) here as you do elsewhere, but it’s not strictly required given the current compatibility logic.

Also applies to: 196-231
232-266: Hierarchy and version‑upgrade integration tests exercise the right cache touchpoints
`test_process_projects_hierarchy_with_cache` and `test_cache_version_upgrade_scenario` validate that project hierarchies populate `cache.db` and that a pre-existing project row (created under an “old” version) doesn’t prevent later runs under a “new” version from proceeding. If you later extend `_is_cache_version_compatible` with more breaking ranges, these tests will be valuable to catch regressions.

Also applies to: 431-442
claude_code_log/utils.py (2)
59-66: Consider adding Windows temp path patterns if cross-platform support is needed.

The current patterns cover macOS and Unix temp directories. If the tool runs on Windows, you may want to add patterns like `"/AppData/Local/Temp/"` or use `tempfile.gettempdir()` for a more robust check.

🔎 Optional: Platform-agnostic temp path detection

```diff
+import tempfile
+
 def _is_temp_path(path_str: str) -> bool:
     """Check if a path is a temporary/test path that should be filtered out."""
     temp_patterns = [
         "/private/var/folders/",  # macOS temp
         "/tmp/",  # Unix temp
         "/var/folders/",  # macOS temp (alternate)
+        "/AppData/Local/Temp/",  # Windows temp
     ]
-    return any(pattern in path_str for pattern in temp_patterns)
+    # Also check against the system temp directory
+    system_temp = tempfile.gettempdir()
+    return any(pattern in path_str for pattern in temp_patterns) or path_str.startswith(system_temp)
```
81-104: Consider extracting duplicate fallback logic.

The fallback conversion logic (removing the leading `-` and replacing `-` with `/`) appears twice: lines 87-90 and 100-104. This could be extracted to a small helper for maintainability.

🔎 Optional: Extract fallback logic

```diff
+def _convert_project_dir_to_path(project_dir_name: str) -> str:
+    """Convert a project directory name to a display path."""
+    display_name = project_dir_name
+    if display_name.startswith("-"):
+        display_name = display_name[1:].replace("-", "/")
+    return display_name
+
+
 def get_project_display_name(
     project_dir_name: str, working_directories: Optional[list[str]] = None
 ) -> str:
     ...
     if working_directories:
         real_dirs = [wd for wd in working_directories if not _is_temp_path(wd)]
         if not real_dirs:
-            display_name = project_dir_name
-            if display_name.startswith("-"):
-                display_name = display_name[1:].replace("-", "/")
-            return display_name
+            return _convert_project_dir_to_path(project_dir_name)
     ...
     else:
-        display_name = project_dir_name
-        if display_name.startswith("-"):
-            display_name = display_name[1:].replace("-", "/")
-        return display_name
+        return _convert_project_dir_to_path(project_dir_name)
```

test/test_cache.py (1)
434-516: Consider extracting the repeated patched_method logic to a helper.

The same `patched_method` implementation is duplicated across three tests (`test_breaking_change_exact_version_incompatible`, `test_breaking_change_pattern_matching`, `test_multiple_breaking_changes`) with only the `breaking_changes` dict varying. Extracting to a parameterized helper would reduce duplication.

🔎 Optional: Extract version check helper

```python
def _create_version_checker(cache_manager, breaking_changes):
    """Create a version compatibility checker with custom breaking changes."""
    def patched_method(cache_version):
        if cache_version == cache_manager.library_version:
            return True
        from packaging import version
        cache_ver = version.parse(cache_version)
        current_ver = version.parse(cache_manager.library_version)
        for breaking_version_pattern, min_required in breaking_changes.items():
            min_required_ver = version.parse(min_required)
            if current_ver >= min_required_ver:
                if breaking_version_pattern.endswith(".x"):
                    major_minor = breaking_version_pattern[:-2]
                    if str(cache_ver).startswith(major_minor):
                        return False
                else:
                    breaking_ver = version.parse(breaking_version_pattern)
                    if cache_ver <= breaking_ver:
                        return False
        return True
    return patched_method
```

Then in tests:

```python
cache_manager._is_cache_version_compatible = _create_version_checker(
    cache_manager, {"0.3.3": "0.3.4"}
)
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
- claude_code_log/cache.py
- claude_code_log/cli.py
- claude_code_log/converter.py
- claude_code_log/html/renderer.py
- claude_code_log/html/templates/components/page_nav_styles.css
- claude_code_log/html/templates/transcript.html
- claude_code_log/migrations/001_initial_schema.sql
- claude_code_log/migrations/002_html_cache.sql
- claude_code_log/migrations/003_html_pagination.sql
- claude_code_log/migrations/__init__.py
- claude_code_log/migrations/runner.py
- claude_code_log/renderer.py
- claude_code_log/tui.py
- claude_code_log/utils.py
- test/__snapshots__/test_snapshot_html.ambr
- test/__snapshots__/test_snapshot_markdown.ambr
- test/test_cache.py
- test/test_cache_integration.py
- test/test_cache_sqlite_integrity.py
- test/test_html_regeneration.py
- test/test_integration_realistic.py
- test/test_message_types.py
- test/test_pagination.py
- test/test_performance.py
- test/test_project_display_name.py
- test/test_project_matching.py
- test/test_sidechain_agents.py
- test/test_tui.py
🚧 Files skipped from review as they are similar to previous changes (7)
- claude_code_log/migrations/003_html_pagination.sql
- test/test_performance.py
- claude_code_log/migrations/002_html_cache.sql
- test/__snapshots__/test_snapshot_html.ambr
- claude_code_log/migrations/001_initial_schema.sql
- claude_code_log/migrations/__init__.py
- test/test_project_matching.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use the `-n auto` flag for parallel test execution with pytest in Python projects (e.g., `uv run pytest -n auto -m "not (tui or browser)" -v`)
Use `ruff format` for code formatting in Python projects
Use `ruff check --fix` for linting and automatic fixes in Python projects
Use `uv run pyright` or `uv run ty check` for type checking in Python projects
Files:
- test/test_message_types.py
- test/test_pagination.py
- test/test_cache_integration.py
- claude_code_log/tui.py
- claude_code_log/cli.py
- test/test_integration_realistic.py
- test/test_project_display_name.py
- test/test_cache_sqlite_integrity.py
- test/test_tui.py
- test/test_sidechain_agents.py
- test/test_html_regeneration.py
- claude_code_log/html/renderer.py
- claude_code_log/renderer.py
- claude_code_log/migrations/runner.py
- claude_code_log/cache.py
- test/test_cache.py
- claude_code_log/utils.py
- claude_code_log/converter.py
claude_code_log/renderer.py
📄 CodeRabbit inference engine (CLAUDE.md)
When adding new message types or modifying CSS class generation, ensure the timeline's message type detection logic is updated accordingly and that the filter is applied consistently to messages in both the main transcript and timeline
Files:
claude_code_log/renderer.py
🧠 Learnings (6)
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Implement interactive TUI (Terminal User Interface) for browsing Claude Code sessions with real-time navigation, summaries, and quick actions
Applied to files:
test/__snapshots__/test_snapshot_markdown.ambr
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/templates/components/timeline.html : When adding new message types or modifying CSS class generation in renderer.py, ensure the timeline's message type detection logic in timeline.html is updated accordingly to maintain feature parity
Applied to files:
claude_code_log/html/templates/transcript.html, claude_code_log/renderer.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/renderer.py : When adding new message types or modifying CSS class generation, ensure the timeline's message type detection logic is updated accordingly and that the filter is applied consistently to messages in both the main transcript and timeline
Applied to files:
claude_code_log/html/templates/transcript.html
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/templates/components/timeline.html : Use Playwright to test browser runtime features and message filtering consistency in the timeline component
Applied to files:
claude_code_log/html/templates/transcript.html
📚 Learning: 2025-12-09T23:52:47.578Z
Learnt from: daaain
Repo: daaain/claude-code-log PR: 59
File: test/test_cache.py:135-165
Timestamp: 2025-12-09T23:52:47.578Z
Learning: SQLite supports NULLS FIRST and NULLS LAST in ORDER BY since v3.30.0 (Oct 2019). Do not flag SQL that uses these clauses as an error when reviewing Python tests or code that interacts with SQLite. If reviewing SQL strings, verify the target SQLite version supports NULLS FIRST/LAST and ensure the syntax is used correctly for the database in use.
Applied to files:
test/test_message_types.py, test/test_pagination.py, test/test_cache_integration.py, test/test_integration_realistic.py, test/test_project_display_name.py, test/test_cache_sqlite_integrity.py, test/test_tui.py, test/test_sidechain_agents.py, test/test_html_regeneration.py, test/test_cache.py
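The clause in question can be exercised directly with Python's bundled `sqlite3` module (a standalone sketch with a toy table, not the project's schema; requires SQLite ≥ 3.30.0 as noted above):

```python
import sqlite3

# Toy stand-in for a messages table: NULLS LAST pushes rows with a
# missing timestamp to the end of an otherwise ascending sort.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER, timestamp TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, "2025-01-02T00:00:00Z"), (2, None), (3, "2025-01-01T00:00:00Z")],
)

rows = conn.execute(
    "SELECT id FROM messages ORDER BY timestamp NULLS LAST"
).fetchall()
print([r[0] for r in rows])  # → [3, 1, 2]: NULL timestamp sorts last
```

Without `NULLS LAST`, SQLite's default sorts NULLs first in ascending order, which is why the cache's `ORDER BY timestamp NULLS LAST` matters for chronological message loading.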
📚 Learning: 2025-11-30T23:24:07.840Z
Learnt from: cboos
Repo: daaain/claude-code-log PR: 54
File: claude_code_log/renderer.py:2912-2945
Timestamp: 2025-11-30T23:24:07.840Z
Learning: In claude_code_log/renderer.py, the agentId field is currently only set on Task tool_result messages, not on tool_use messages, because the agentId is generated after the tool_use is logged. The _reorder_sidechain_template_messages function relies on this to avoid duplicate sidechain insertion.
Applied to files:
test/test_sidechain_agents.py
🧬 Code graph analysis (13)
test/test_message_types.py (1)
- claude_code_log/converter.py (1): `load_transcript` (111-288)

test/test_pagination.py (2)
- claude_code_log/cache.py (2): `SessionCacheData` (39-52), `get_library_version` (119-162)
- claude_code_log/converter.py (2): `_get_page_html_path` (483-490), `_assign_sessions_to_pages` (493-527)

test/test_cache_integration.py (1)
- claude_code_log/cache.py (2): `CacheManager` (168-1186), `get_cached_project_data` (604-676)

claude_code_log/tui.py (1)
- claude_code_log/cache.py (1): `get_working_directories` (580-594)

claude_code_log/cli.py (2)
- claude_code_log/cache.py (3): `get_working_directories` (580-594), `CacheManager` (168-1186), `clear_cache` (678-686)
- claude_code_log/converter.py (1): `convert_jsonl_to_html` (716-740)

test/test_integration_realistic.py (1)
- claude_code_log/cache.py (3): `CacheManager` (168-1186), `get_library_version` (119-162), `get_cached_project_data` (604-676)

test/test_project_display_name.py (1)
- claude_code_log/utils.py (1): `get_project_display_name` (69-104)

test/test_cache_sqlite_integrity.py (1)
- claude_code_log/cache.py (9): `SessionCacheData` (39-52), `save_cached_entries` (440-501), `update_session_cache` (503-536), `_get_connection` (190-199), `load_cached_entries` (363-378), `get_cached_project_data` (604-676), `get_cache_stats` (718-748), `get_working_directories` (580-594), `is_file_cached` (340-361)

test/test_html_regeneration.py (1)
- claude_code_log/cache.py (3): `get_library_version` (119-162), `get_html_cache` (752-774), `is_html_stale` (808-869)

claude_code_log/renderer.py (2)
- claude_code_log/html/renderer.py (1): `is_outdated` (587-598)
- claude_code_log/markdown/renderer.py (1): `is_outdated` (884-901)

claude_code_log/cache.py (4)
- claude_code_log/factories/transcript_factory.py (1): `create_transcript_entry` (219-238)
- claude_code_log/migrations/runner.py (1): `run_migrations` (135-156)
- claude_code_log/models.py (5): `AssistantTranscriptEntry` (175-178), `QueueOperationTranscriptEntry` (203-220), `SummaryTranscriptEntry` (181-186), `SystemTranscriptEntry` (189-200), `UserTranscriptEntry` (168-172)
- claude_code_log/renderer.py (2): `get` (121-132), `is_html_outdated` (2308-2320)

test/test_cache.py (3)
- test/test_cache_sqlite_integrity.py (2): `cache_manager` (37-39), `temp_project_dir` (28-33)
- test/test_pagination.py (2): `cache_manager` (156-159), `temp_project_dir` (143-146)
- claude_code_log/cache.py (4): `is_file_cached` (340-361), `save_cached_entries` (440-501), `get_cached_project_data` (604-676), `clear_cache` (678-686)

claude_code_log/utils.py (1)
- claude_code_log/cache.py (1): `SessionCacheData` (39-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.11)
🔇 Additional comments (31)
test/__snapshots__/test_snapshot_markdown.ambr (1)
321-327: Snapshot header rename is consistent with the test name. The updated header better reflects the multi-session markdown test identifier; no further changes needed.
test/test_tui.py (1)
90-96: Explicit UTF-8 encoding on test files is a good hardening. Using `encoding="utf-8"` for both the JSONL fixture and the non-directory test file avoids platform-default surprises when data contains non-ASCII. Also applies to: 796-801
test/test_message_types.py (1)
151-186: Missing-file tests correctly pin `load_transcript` behavior. These tests exercise both noisy and silent modes, asserting an empty list and appropriate warning behavior; they match the new `FileNotFoundError` handling in `load_transcript`.
test/test_sidechain_agents.py (1)
19-35: UTF-8 encoding on sidechain fixtures is appropriate. Specifying `encoding="utf-8"` on all test data reads/writes makes these sidechain/agent tests robust to locale differences and non-ASCII content. Also applies to: 66-80, 103-115, 131-145, 176-178, 285-299
claude_code_log/converter.py (3)
152-160: Graceful handling of deleted JSONL files is correct and matches tests. Catching `FileNotFoundError` in `load_transcript` and returning an empty list (with an optional warning) handles racey session cleanup without failing the whole conversion, and aligns with the new tests.
530-714: Pagination pipeline is consistent; only the staleness logic (in CacheManager) needs tightening. The `_get_page_html_path`, `_assign_sessions_to_pages`, and `_generate_paginated_html` functions cleanly group sessions into pages without splitting sessions, reuse `generate_html` with `page_info`/`page_stats`, and update `html_pages`/`page_sessions` via `update_page_cache`. With the `is_page_stale` fix in the cache layer, this should give you efficient, per-page incremental regeneration.
1434-1774: Project-hierarchy summary structure looks good; printing respects the "summary only" intent but not full silence. `process_projects_hierarchy()` now tracks per-project `GenerationStats`, surfaces warnings/errors, and prints a concise global summary, which is useful. Note that `silent` currently suppresses per-file logging (via `convert_jsonl_to`) but not per-project status lines or the final summary, which matches the docstring ("show summary only") but not a "no output at all" interpretation. If you ever need a fully quiet mode, a second flag (e.g., `quiet`) may be clearer.
claude_code_log/tui.py (2)
8-8: LGTM - Import addition aligns with type annotation usage. The `List` import is used for the type hint at line 524, making the code more explicit about the expected type.
524-526: LGTM - Clean migration to the CacheManager API. The refactor improves code clarity by:
- Using a more precise type annotation (`List[str]` instead of implicit `None`)
- Calling `get_working_directories()` directly instead of accessing it via cached project data
- Maintaining exception handling to preserve fallback behavior
The implementation correctly aligns with the new CacheManager-based architecture.
claude_code_log/html/templates/transcript.html (2)
20-20: LGTM - Standard CSS inclusion for pagination styling. The include follows the established pattern for component-based CSS organization in this template.
27-51: LGTM - Well-structured pagination UI. The page navigation block is properly implemented with:
- Conditional rendering via `{% if page_info %}` for backward compatibility
- Clean semantic HTML structure with appropriate CSS classes
- Nested conditionals for optional elements (stats, prev/next links)
- Consistency with the template's existing patterns
claude_code_log/renderer.py (1)
2308-2320: LGTM - Clean convenience wrapper for HTML staleness checks. The function provides a clean public API for checking HTML file versions:
- The local import pattern correctly avoids circular dependencies
- Simple delegation to `HtmlRenderer.is_outdated()`
- Clear docstring explaining return values
The implementation is straightforward and correct. This likely supports the HTML caching infrastructure introduced in the PR.
test/test_project_display_name.py (2)
118-128: LGTM - Test correctly validates temporary path filtering. The test ensures that common temporary paths (pytest, macOS temp directories) are properly filtered out, with the real project directory being selected for the display name. This aligns with the new filtering logic in `utils.py`.
130-140: LGTM - Test correctly validates fallback behavior. The test ensures proper fallback to project directory name conversion when all working directories are temporary paths. This validates the edge-case handling in the filtering logic.
test/test_pagination.py (3)
test/test_pagination.py (3)
24-35: `_get_page_html_path` tests accurately codify the filename contract. The three tests cover page 1 and higher-page naming and directly mirror the converter implementation; no issues here.
40-140: Pagination assignment tests match the new overflow semantics. The `_assign_sessions_to_pages` scenarios (single/multiple sessions, large sessions, overflow, chronological ordering, empty input) line up with the implementation (add-then-check, allow overflow to keep sessions intact). This is a solid, readable test matrix.
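The add-then-check semantics described above can be sketched as follows — a simplified stand-in for `_assign_sessions_to_pages`, operating on plain `(session_id, message_count)` pairs rather than the real `SessionCacheData`:

```python
def assign_sessions_to_pages(session_counts, page_size):
    """Greedy add-then-check assignment: a session always joins the
    current page, and the page closes once its total reaches page_size.
    A session may therefore overflow a page, but it is never split
    across two pages.

    session_counts: list of (session_id, message_count), chronological.
    """
    pages, current, current_total = [], [], 0
    for session_id, count in session_counts:
        current.append(session_id)
        current_total += count
        if current_total >= page_size:  # check AFTER adding: overflow allowed
            pages.append(current)
            current, current_total = [], 0
    if current:  # flush the final, possibly under-filled page
        pages.append(current)
    return pages

# Session "b" overflows page 1 (900 + 1500 > 2000) but stays whole.
print(assign_sessions_to_pages([("a", 900), ("b", 1500), ("c", 100)], 2000))
# → [['a', 'b'], ['c']]
```

The add-then-check order is the key design choice: checking before adding would instead start a new page whenever a session doesn't fit, trading overflow for more, smaller pages.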
351-549: Integration pagination tests give good end-to-end coverage. The converter integration tests cover no-pagination, multi-page generation, page_size reconfiguration, tiny page sizes, navigation links, and stats rendering. They closely reflect the intended semantics of `_assign_sessions_to_pages` and the new HTML navigation, and look correct.
test/test_html_regeneration.py (1)
448-492: Message-count based staleness detection test matches the cache semantics. `test_html_cache_message_count_validation` correctly drives `ensure_fresh_cache` to update session metadata and then asserts that `is_html_stale` flags the session as `session_updated` once the DB message_count diverges from the cached HTML count. This is a precise check of the intended invariant.
test/test_cache_sqlite_integrity.py (2)
453-574: Ordering and NULL-token aggregate tests match the cache design. The timestamp-ordering and NULL-token handling tests correctly validate that `load_cached_entries` uses `ORDER BY timestamp NULLS LAST` and that aggregates use `COALESCE(SUM(...), 0)` semantics. These directly align with the SQL in `CacheManager` and look sound.
887-923: Migration integrity tests nicely pin down schema-version behavior. The `_schema_version` tests ensure migrations record checksums and are applied exactly once, which is critical for long-term cache stability. The assertions on version > 0, `.sql` filenames, and 64-char checksums are appropriate.
test/test_integration_realistic.py (2)
238-251: CLI clear-cache integration tests correctly target the new SQLite cache. `TestCLIWithProjectsDir.test_clear_cache_with_projects_dir` now asserts that `cache.db` is created under the projects dir after processing and removed after `--clear-cache`. This matches `_clear_caches` behavior (delete shared DB for `--all-projects`) and is a good high-level safeguard. Also applies to: 266-268
421-434: Real-data cache tests validate both DB presence and version recording. `TestCacheWithRealData.test_cache_creation_all_projects` and `test_cache_version_stored` now assert that a shared `cache.db` exists and that `CacheManager.get_cached_project_data()` returns a non-None structure with a non-empty `version` equal to `get_library_version()`. This ties the realistic-data path neatly to the new SQLite cache semantics. Also applies to: 473-478
claude_code_log/utils.py (2)
7-10: LGTM! Correct use of TYPE_CHECKING for circular import avoidance. The `TYPE_CHECKING` guard correctly allows type hints without runtime circular dependency issues.
191-226: LGTM! Solid implementation for multi-type entry handling. The late import pattern correctly avoids circular dependencies at runtime while maintaining type safety via `TYPE_CHECKING`. The duck-typing approach with `getattr` and `hasattr` handles the union type gracefully.
Minor observation: Line 218's `getattr(entry, "timestamp", "")` is slightly redundant since `hasattr` already confirmed the attribute exists, but it's a safe defensive pattern.
test/test_cache.py (7)
26-33: LGTM! Good test isolation pattern. Creating a project subdirectory ensures each test gets a unique `db_path` (at `parent/cache.db`), preventing test interference when using the shared SQLite database.
108-113: LGTM! Clear validation of the SQLite database location. The test properly verifies the database is created at the expected parent-level path.
138-167: LGTM! Direct SQLite verification is appropriate for testing internal storage. The test correctly queries the database to verify the message storage structure. The `NULLS LAST` syntax is valid in SQLite 3.30+ (per retrieved learnings). Good isolation via `project_id` filtering.
169-187: Mtime-based test may be flaky on fast filesystems. The `time.sleep(1.1)` works in most cases, but some filesystems (especially in CI with containers) may have lower mtime resolution or caching that could cause intermittent failures.
Consider whether you've observed any flakiness in CI. If so, a more robust approach would be to directly manipulate the cached mtime in the database or use `os.utime()` to explicitly set a different modification time.
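A minimal sketch of the `os.utime()` approach (standalone; not tied to the cache schema):

```python
import os
import tempfile

# Set the mtime explicitly instead of sleeping, so mtime-based
# invalidation tests don't depend on filesystem timer resolution.
with tempfile.NamedTemporaryFile(delete=False, suffix=".jsonl") as f:
    f.write(b"{}\n")
    path = f.name

before = os.stat(path).st_mtime
# Push the modification time 10 seconds forward; no time.sleep() needed.
os.utime(path, (os.stat(path).st_atime, before + 10))
after = os.stat(path).st_mtime
os.unlink(path)
```

Because the new mtime is set deterministically, the test can assert `after > before` regardless of how quickly the surrounding code runs.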
189-207: LGTM! Version compatibility test is well-structured. The test correctly validates that compatible version upgrades preserve the cache data without requiring re-creation.
263-285: LGTM! Comprehensive cache clearing test. Good test structure with explicit precondition verification before the clear operation, and thorough postcondition checks on both file cache status and project data.
593-603: LGTM! Clear test for uncached file handling. The test properly distinguishes between a file existing on disk and having a cache entry.
Actionable comments posted: 0
🧹 Nitpick comments (3)
test/__snapshots__/test_snapshot_html.ambr (1)
14150-14236: LGTM: Consistent snapshot updates with a minor suggestion. The CSS and title updates maintain consistency across the snapshot. The empty page navigation block correctly reflects a test case without pagination controls.
Optional consideration: The comment at line 14225 mentions "revealed via in-place editing when new page is created." This approach requires that the HTML structure and class names remain stable. Consider ensuring the in-place editing logic is resilient to template changes, or document the contract between the template and the editing logic.
claude_code_log/converter.py (2)
483-491: Add input validation for page_number.
`_get_page_html_path` doesn't validate that `page_number` is positive. Passing 0 or negative values would produce incorrect filenames like `"combined_transcripts_0.html"` or `"combined_transcripts_-1.html"`.
🔎 Suggested validation
```diff
 def _get_page_html_path(page_number: int) -> str:
     """Get the HTML filename for a given page number.

     Page 1 is combined_transcripts.html, page 2+ are combined_transcripts_N.html
     """
+    if page_number < 1:
+        raise ValueError(f"Page number must be >= 1, got {page_number}")
     if page_number == 1:
         return "combined_transcripts.html"
     return f"combined_transcripts_{page_number}.html"
```
537-572: Consider validating page_size parameter.
`_assign_sessions_to_pages` doesn't validate that `page_size > 0`. A zero or negative page_size would cause logical issues (sessions would never trigger page closure, or every session would close a page).
🔎 Suggested validation
```diff
 def _assign_sessions_to_pages(
     sessions: Dict[str, SessionCacheData], page_size: int
 ) -> List[List[str]]:
     """Assign sessions to pages, never splitting sessions across pages.

     Args:
         sessions: Dict mapping session_id to SessionCacheData
         page_size: Maximum messages per page (overflow allowed to keep sessions intact)

     Returns:
         List of pages, each containing a list of session_ids
     """
+    if page_size < 1:
+        raise ValueError(f"page_size must be >= 1, got {page_size}")
     pages: List[List[str]] = []
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
claude_code_log/converter.pyclaude_code_log/html/templates/components/page_nav_styles.cssclaude_code_log/html/templates/transcript.htmltest/__snapshots__/test_snapshot_html.ambrtest/test_pagination.py
🚧 Files skipped from review as they are similar to previous changes (3)
- claude_code_log/html/templates/components/page_nav_styles.css
- claude_code_log/html/templates/transcript.html
- test/test_pagination.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use-n autoflag for parallel test execution with pytest in Python projects (e.g.,uv run pytest -n auto -m "not (tui or browser)" -v)
Useruff formatfor code formatting in Python projects
Useruff check --fixfor linting and automatic fixes in Python projects
Useuv run pyrightoruv run ty checkfor type checking in Python projects
Files:
claude_code_log/converter.py
🧠 Learnings (2)
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/templates/components/timeline.html : Use Playwright to test browser runtime features and message filtering consistency in the timeline component
Applied to files:
test/__snapshots__/test_snapshot_html.ambr
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Implement interactive TUI (Terminal User Interface) for browsing Claude Code sessions with real-time navigation, summaries, and quick actions
Applied to files:
test/__snapshots__/test_snapshot_html.ambr
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (ubuntu-latest, 3.11)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.10)
🔇 Additional comments (17)
test/__snapshots__/test_snapshot_html.ambr (4)
4290-4368: LGTM: Well-structured page navigation CSS. The CSS styling for paginated transcripts is well-organized with a clear hierarchy, appropriate visual feedback (hover effects, transitions), and thoughtful details like the glassmorphic background and shadow effects. The approach aligns with the pagination feature described in the PR objectives.
9172-9250: LGTM: Expected snapshot duplication. The CSS block is correctly duplicated in this test case snapshot, ensuring the "Edge Cases" test captures the complete expected HTML output with page navigation styles.
11721-11721: LGTM: Improved test specificity. Updating the title to include the specific test project name improves snapshot clarity and makes test failures easier to identify.
19169-19255: LGTM: Complete snapshot coverage. The final test case snapshot correctly includes the page navigation CSS, ensuring all HTML generation test cases capture the pagination feature additions.
claude_code_log/converter.py (13)
152-160: Good defensive handling of race conditions. The try/except block correctly handles the case where a file is deleted between directory globbing and opening (e.g., during Claude Code session cleanup). Returning an empty list is appropriate, and the warning message is clear and conditionally suppressed.
415-481: LGTM - Well-structured stats tracking. The `GenerationStats` dataclass is well-designed with appropriate defaults and helper methods. The `summary()` method produces clear, multi-line output for per-project reporting.
690-703: Pagination approach successfully addresses previous concerns. The implementation uses the `is_last_page` flag (line 692) with the `next_link` always provided (line 697). The template can hide the next link with a `"last-page"` CSS class when `is_last_page` is True. When a new page is created, `_enable_next_link_on_previous_page()` removes this class (lines 702-703), revealing the link without full page regeneration. This elegant solution avoids 404 errors while minimizing regeneration cost, as discussed in the past review comments.
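The in-place reveal can be illustrated with a toy helper. The class names and markup here are assumptions for illustration only, not the project's actual template output:

```python
def enable_next_link(html: str) -> str:
    # Sketch of the in-place edit described above: the next link is
    # always rendered but hidden by a "last-page" class, so dropping
    # that class reveals it without regenerating the whole page.
    # Relies on the class attribute staying byte-for-byte stable.
    return html.replace(' class="page-next last-page"', ' class="page-next"')

page = '<a class="page-next last-page" href="combined_transcripts_2.html">Next</a>'
updated = enable_next_link(page)
print(updated)  # the "last-page" class is gone; the link becomes visible
```

The trade-off this pattern makes: a simple string substitution is cheap, but it couples the editing logic to the exact attribute text emitted by the template, which is the stability contract flagged in the snapshot comments above.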
770-786: LGTM - Consistent page_size parameter addition. The `page_size` parameter with a default of 2000 is consistently added to both `convert_jsonl_to_html` and `convert_jsonl_to`, maintaining backward compatibility. Also applies to: 799-814
850-870: Effective early-exit optimization for unchanged projects. The cache-aware short-circuit correctly skips expensive message loading when:
- The cache is fresh (`cache_was_updated == False`)
- No date filters are applied
- The combined HTML is current (not stale and not outdated)
- Either individual sessions aren't being generated OR no sessions are stale
This optimization significantly improves performance for large multi-project hierarchies, as noted in the PR objectives (e.g., "rescan of 24 projects processed in ~1.8s").
905-975: Well-designed pagination mode selection. The pagination decision logic (lines 913-921) is sound:
- Enables pagination for HTML directory mode without date filters
- Uses pagination when messages exceed `page_size` OR when existing pages exist
- The `existing_page_count > 1` check prevents mode flip-flopping, maintaining pagination once enabled even if the message count drops
The dual code paths (paginated vs. single-file) are appropriately guarded and handle cache updates correctly.
993-1024: LGTM - Fast cache change detection. `has_cache_changes()` provides efficient mtime-based invalidation checking without loading messages. The logic correctly handles:
- Missing cache (return True)
- Date filters (return True)
- Modified files (check via cache_manager)
- Stale cache detection (zero messages but files exist)
The function complements `ensure_fresh_cache()` by providing a fast pre-check before expensive message loading.
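The fast pre-check pattern can be sketched as follows; the helper name and the shape of the `cached_mtimes` mapping are hypothetical, standing in for what `has_cache_changes()` reads from the cache:

```python
import os
import tempfile

def has_modified_files(jsonl_paths, cached_mtimes):
    """Mtime pre-check sketch: compare on-disk mtimes to the values
    recorded at the last cache update, without parsing a single message.

    cached_mtimes: dict mapping path -> mtime at last cache update
    (a hypothetical shape for illustration).
    """
    for path in jsonl_paths:
        cached = cached_mtimes.get(path)
        if cached is None or os.path.getmtime(path) > cached:
            return True  # new or modified file: cache needs refreshing
    return False

with tempfile.NamedTemporaryFile(delete=False, suffix=".jsonl") as f:
    f.write(b"{}\n")
    p = f.name

snapshot = {p: os.path.getmtime(p)}
unchanged = has_modified_files([p], snapshot)  # mtime matches snapshot
os.utime(p, (os.path.getmtime(p), os.path.getmtime(p) + 5))
changed = has_modified_files([p], snapshot)    # mtime now newer
os.unlink(p)
```

Because only `stat()` calls are involved, this check stays cheap even for projects with many JSONL files, which is what makes the two-phase pattern (fast mtime check, then expensive message loading) worthwhile.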
1026-1037: LGTM - Improved documentation clarity. The updated docstring clearly explains the return value and the relationship with `has_cache_changes()`, helping developers understand the two-phase cache checking pattern (fast mtime check → expensive message loading).
1340-1478: LGTM - Enhanced session file generation with a return value. The function improvements are well-designed:
- Returns `int` (regenerated session count) for statistics tracking
- Adds a `silent` parameter for suppressing per-session output
- Filters sessions to only tracked ones (line 1380): `session_ids = session_ids & set(session_data.keys())`
- Uses cache-aware staleness checks (`is_html_stale`) for incremental regeneration
- Updates html_cache after regeneration with accurate message counts from the cache
The cache integration enables fine-grained HTML regeneration, as intended by the SQLite cache refactor.
1480-1536: LGTM - Enhanced hierarchy processing with comprehensive statistics. The refactored `process_projects_hierarchy` adds valuable improvements:
- Per-project timing and statistics via `GenerationStats`
- A `silent` flag (default `True`) for suppressing per-file verbosity while keeping per-project summaries (aligns with past review discussion)
- `page_size` parameter support (default 2000)
- Aggregated metrics: total projects, projects with updates, total sessions
The initialization section is well-organized and sets up the necessary tracking structures.
1550-1621: Efficient dual-path processing with correct cache routing. The fast/slow path logic is well-implemented:
Fast path (lines 1581-1586):
- Skips regeneration for unchanged projects
- Reports cached status with timing
Slow path (lines 1587-1621):
- Updates the cache and regenerates output
- Tracks file updates and session regenerations
- Reports detailed progress
Cache staleness check (lines 1560-1571) correctly distinguishes paginated vs. non-paginated projects:
- Paginated projects: checks `is_page_stale(1, page_size)` for page 1
- Non-paginated projects: checks `is_html_stale(output_path.name, None)` for the single-file cache
This dual-cache-table approach aligns with the PR's pagination architecture.
1623-1782: Robust project data collection with a cache-first approach. The implementation correctly prioritizes cached data while maintaining a fallback:
Primary path (lines 1623-1680):
- Uses `cached_project_data` for efficient aggregates
- Filters sessions to exclude warmup-only and agent-only sessions (lines 1673-1674)
- Builds the project summary from cache without re-parsing messages
Fallback path (lines 1682-1771):
- Prints a warning about cache absence (rare case)
- Loads and processes messages directly
- Ensures the cache is updated with session data (lines 1689-1691), including working_directories
- Manually calculates aggregates
Exception handling (lines 1773-1782):
- Records the error in stats
- Prints a detailed warning with traceback
- Continues processing remaining projects (fail-safe behavior)
This defensive design ensures the multi-project scan completes even if individual projects fail.
1784-1822: LGTM - Clear summary reporting with appropriate index regeneration.
Index generation (lines 1784-1798):
- Regenerates when the renderer is outdated, date filters are applied, or any cache was updated
- The `any_cache_updated` flag ensures the index reflects new data even when no files changed in some projects
Summary output (lines 1799-1821):
- Reports errors/warnings per project (lines 1807-1811)
- Provides a global summary with elapsed time, update count, and index status
- Output is always shown (consistent with the `silent` flag design: suppress per-file logs, show per-project summaries)
The comprehensive reporting helps users understand what work was performed, which is especially valuable for large multi-project scans.
Two fixes for cache invalidation issues:
1. Filter agent files in the modification check
   - get_modified_files() checked all .jsonl including agent-*.jsonl
   - But load_directory_transcripts() excludes agent files
   - Result: unreferenced agent files were always marked as "modified"
2. Skip archived sessions in the staleness check (preserve data)
   - When JSONL files are deleted, cached sessions remain
   - Instead of pruning, skip them in get_stale_sessions()
   - Show an "N archived" count in output for visibility
   - Preserves data for potential future archive/restore features
Output now shows: project-name: cached, 3 archived (0.0s)
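The first fix can be illustrated with a small glob filter; the helper name here is hypothetical, sketching what the modification check needs to match:

```python
import tempfile
from pathlib import Path

def modified_candidate_files(project_dir):
    # Skip agent-*.jsonl so the modification check considers the same
    # set of files that load_directory_transcripts() actually loads.
    return [
        p.name
        for p in sorted(Path(project_dir).glob("*.jsonl"))
        if not p.name.startswith("agent-")
    ]

with tempfile.TemporaryDirectory() as d:
    (Path(d) / "session.jsonl").write_text("{}\n")
    (Path(d) / "agent-123.jsonl").write_text("{}\n")
    names = modified_candidate_files(d)
print(names)  # → ['session.jsonl']
```

Keeping the two code paths on the same file set is the point: any file that only one side sees becomes a permanent "modified" false positive.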
So, as I was testing #76, I also could observe the behavior of the new cache and paginated combined transcripts, and all seems to work great! The cache.db takes up less space than the combined The only concern I have is that we're using the
* Add archived project support and session deletion
Features:
- Show fully archived projects (cached but no JSONL files) in TUI and HTML index
- Projects with [ARCHIVED] badge in project selector and greyed-out styling in index
- Archived projects automatically open in archived view mode
- Delete sessions from cache via 'd' key with confirmation dialog
- Restore JSONL now produces compact JSON format matching original files
- Project list refreshes after restore to update archived status
Implementation:
- Add get_all_cached_projects() to discover archived projects from cache.db
- Add delete_session() and delete_project() methods to CacheManager
- Update CLI to discover and track archived projects, refresh on return
- Update SessionBrowser to handle is_archived_project flag
- Add DeleteConfirmScreen modal for delete confirmation
Tests:
- Add tests for compact JSON export, session/project deletion
- Add tests for get_all_cached_projects with active/archived detection
- Add TUI test for archived project loading in archived mode
* Simplify views to show both archived and current + implement delete as well as archive and restore
That is a very good point about naming the cache a more unique
I'll also switch the
Actionable comments posted: 9
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
test/test_cache_integration.py (1)
98-114: The `--no-cache` test is self-fulfilling due to CacheManager initialization side effects; it cannot verify that the flag actually prevents caching. `CacheManager.__init__` unconditionally calls `_init_database()` (which runs migrations and creates `cache.db`) and `_ensure_project_exists()` (which inserts/updates the project record). When the test instantiates `CacheManager` after running `--no-cache`, it creates the cache database regardless of what the `--no-cache` flag did. The test cannot distinguish between a cache.db created by the flag (indicating a bug) and one created by the subsequent `CacheManager` instantiation (expected). Check that `cache.db` does not exist or has not been modified before instantiating `CacheManager` to verify that `--no-cache` actually prevented cache creation. Alternatively, mock or isolate the cache path to avoid side effects from initialization.
claude_code_log/cli.py (3)
225-294: Relative matching shouldn't generate HTML as a side effect. In `_find_relative_matches`, calling `convert_jsonl_to_html(project_dir, silent=True)` (line 242) can generate/overwrite outputs just to populate cache metadata. Prefer `ensure_fresh_cache(project_dir, cache_manager, silent=True)` (cache-only) to populate `working_directories`.
498-526: `--page-size` should be validated (> 0) before use. A value of 0 or negative will produce odd pagination behavior. Add a click constraint or an early check (and fail fast with a clear error).
Proposed patch
@@
 @click.option(
     "--page-size",
     type=int,
     default=2000,
     help="Maximum messages per page for combined transcript (default: 2000). Sessions are never split across pages.",
 )
@@
 def main(
@@
     page_size: int,
@@
 ) -> None:
@@
     try:
+        if page_size <= 0:
+            raise click.BadParameter("--page-size must be a positive integer", param_hint="--page-size")

Also applies to: 668-735
54-106: Pass `projects_dir` to `_launch_tui_with_cache_check` or provide it to `CacheManager` explicitly.
`CacheManager(project_path, ...)` incorrectly assumes the cache database is at `project_path.parent / "cache.db"`. For archived projects, `project_path` may be the original stored path (e.g., `/Users/dain/workspace/project1`), making `project_path.parent` point to the wrong location. The actual shared database is always at `projects_dir / "cache.db"`. Either pass `projects_dir` as a parameter to `_launch_tui_with_cache_check` and forward it to `CacheManager`, or modify `CacheManager` to accept an explicit `db_path` parameter.
claude_code_log/converter.py (1)
1528-1973: Fast-path project processing is promising, but it depends on page/session staleness being correct.
Given that `is_page_stale()` currently doesn't detect session changes, the "nothing to do" path (Lines 1663-1671) risks skipping required page regeneration. Once page staleness compares real content (message counts / timestamps), this path should become reliable.
🤖 Fix all issues with AI agents
In @claude_code_log/cache.py:
- Around line 380-439: In load_cached_entries_filtered, the SQL date bounds are
built with from_dt.isoformat()/to_dt.isoformat() which can mismatch stored
timestamps that end with 'Z'; normalize the bounds to UTC 'Z' form before
appending to params (e.g. ensure from_dt/to_dt are timezone-aware UTC via
from_dt = from_dt.astimezone(timezone.utc) or set tzinfo=timezone.utc, then
produce a bound like from_dt.isoformat().replace('+00:00','Z') or use strftime
to output YYYY-MM-DDTHH:MM:SS[.fff]Z) so lexicographic SQL comparisons match
stored JSON values; update imports to include datetime.timezone and adjust where
from_dt/to_dt are parsed and appended to params, leaving the rest of
load_cached_entries_filtered and the use of _get_file_id/_deserialize_entry
unchanged.
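The normalization described above can be sketched as a small helper; `to_utc_z` is a hypothetical name, not a function from this codebase:

```python
from datetime import datetime, timezone


def to_utc_z(dt: datetime) -> str:
    """Normalize a datetime to a UTC ISO-8601 string ending in 'Z',
    so lexicographic SQL comparisons match 'Z'-suffixed stored timestamps."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are already UTC; adjust if the
        # parsed bounds are actually in local time.
        dt = dt.replace(tzinfo=timezone.utc)
    else:
        dt = dt.astimezone(timezone.utc)
    return dt.isoformat().replace("+00:00", "Z")


print(to_utc_z(datetime(2025, 1, 2, 3, 4, 5)))  # → 2025-01-02T03:04:05Z
```

Both bounds would then be appended to `params` in this form, matching the `Z`-suffixed JSON values byte-for-byte.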
- Around line 1326-1370: delete_session currently removes messages, html_cache
and sessions but leaves cached_files entries intact so is_file_cached() can
still return True and load_cached_entries() yields empty results; update
delete_session to also invalidate the file cache by deleting rows from
cached_files for the same project_id and source_session_id (i.e., execute
"DELETE FROM cached_files WHERE project_id = ? AND source_session_id = ?" using
self._project_id and session_id) so future runs will reparse the JSONL and
rebuild the session; ensure this deletion happens before committing and
alongside the existing deletions.
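The invalidation pattern can be sketched against a toy schema (columns simplified to the ones named above; this is not the project's real migration):

```python
import sqlite3

# Toy schema mirroring the tables named in the review, columns simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (project_id INTEGER, session_id TEXT);
    CREATE TABLE cached_files (project_id INTEGER, source_session_id TEXT);
    INSERT INTO messages VALUES (1, 's1'), (1, 's2');
    INSERT INTO cached_files VALUES (1, 's1'), (1, 's2');
""")


def delete_session(conn: sqlite3.Connection, project_id: int, session_id: str) -> None:
    # Delete the session's messages...
    conn.execute(
        "DELETE FROM messages WHERE project_id = ? AND session_id = ?",
        (project_id, session_id),
    )
    # ...and invalidate the file cache so the JSONL is reparsed next run.
    conn.execute(
        "DELETE FROM cached_files WHERE project_id = ? AND source_session_id = ?",
        (project_id, session_id),
    )
    conn.commit()


delete_session(conn, 1, "s1")
remaining = conn.execute("SELECT source_session_id FROM cached_files").fetchall()
print(remaining)  # → [('s2',)]
```

Without the second `DELETE`, `is_file_cached()` would keep returning True while the messages are gone.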
In @claude_code_log/converter.py:
- Around line 910-942: When use_pagination is true but cached_data is None,
session_data is set to {} which leads _generate_paginated_html to produce empty
pages; instead, detect missing cached_data and derive session_data from messages
(e.g., group messages by session id or date and build the same session
ordering/metadata shape expected by _generate_paginated_html) before calling
_generate_paginated_html so pagination is based on messages rather than an empty
cache; keep the cache_manager.get_cached_project_data() call but fallback to
building session metadata from messages and pass that session_data into
_generate_paginated_html.
In @claude_code_log/tui.py:
- Around line 1036-1041: Replace bare except: pass blocks in the UI/cache update
sections with explicit exception handling: catch Exception as e (not bare
except), log the exception via the module logger or self.logger with context
(e.g., when calling self.populate_table() / self.update_stats()), and add a
short comment explaining why suppression is acceptable; where possible narrow
the except to specific errors (e.g., AttributeError, OSError) for the
cache-building and UI update blocks and re-raise unexpected exceptions. Ensure
each occurrence (the archived-project UI update around
self.populate_table()/self.update_stats(), the cache building block, and the
other UI update block) is updated to log e and include a brief comment stating
why errors can be ignored if widgets are unmounted.
🟡 Minor comments (12)
test/test_cache_integration.py-835-848 (1)
835-848: The `get_all_cached_projects` test may fail if returned paths are normalized/realpathed.
Comparing `str(active_dir)` to returned strings is brittle if production code uses `Path.resolve()`, `os.path.realpath`, or different path separators. Prefer normalizing both sides (e.g., `Path(p).resolve()`).
Proposed fix

- project_paths = {p[0] for p in projects}
- assert str(active_dir) in project_paths
- assert str(archived_dir) in project_paths
+ project_paths = {Path(p[0]).resolve() for p in projects}
+ assert active_dir.resolve() in project_paths
+ assert archived_dir.resolve() in project_paths

Committable suggestion skipped: line range outside the PR's diff.
test/__snapshots__/test_snapshot_html.ambr-464-487 (1)
464-487: Archived card dimming via `opacity: 0.6` affects all child text visibility; consider a more targeted visual treatment.
When `opacity: 0.6` is applied to `.project-card.archived`, it creates a CSS stacking context that dims all child elements, not just the background. This includes the project name (blue link), stats text, and the badge, making them harder to read. Use an alternative approach such as adjusting only `background-color` and keeping text at full opacity, or apply opacity only to a pseudo-element to preserve text contrast.
test/test_tui.py-1621-1682 (1)
1621-1682: Restore test is good; ensure JSONL content is parseable if future code reads it.
`export_session_to_jsonl` is mocked to return `['{"type":"user"}']` (Line 1649), which is likely not parseable by current Pydantic models if any code path later loads it. Consider returning a minimal valid entry (sessionId/timestamp/message/etc.) to keep the test robust against refactors.
claude_code_log/cli.py-28-52 (1)
28-52: `_discover_projects()`: good direction; ensure stable ordering and no duplicates.
Consider returning `project_dirs` in a deterministic order (e.g., sorted) after appending archived paths, and guarding against duplicate paths differing only by string form (symlink/resolve differences).
claude_code_log/converter.py-998-1029 (1)
998-1029: `has_cache_changes()` includes agent files; ensure this matches the intended invalidation policy.
Elsewhere you explicitly exclude `agent-*.jsonl` from modified-file checks. Here it uses `glob("*.jsonl")` (Line 1012). If this function is used for gating work, it can reintroduce "agent-only change triggers work" behavior.
test/test_tui.py-923-974 (1)
923-974: Archived-project test setup contradicts its own premise (creates a JSONL file).
`test_archived_project_loads_archived_sessions` says "no JSONL files" but creates `session-123.jsonl` (Lines 929-932). This can mask behavior differences in code paths that key off `glob("*.jsonl")`. Prefer creating only the directory and relying on cache mocks.
test/test_tui.py-90-106 (1)
90-106: Fixture split by session is good; ensure "summary" placement matches real data.
Writing one JSONL per session is a solid realism improvement. One concern: the fixture writes `"type": "summary"` into `session-456.jsonl` regardless of `sessionId` (Lines 102-105). If summaries can live outside the session file in real datasets, this coupling may hide bugs in session discovery/summary association.
claude_code_log/cache.py-1389-1434 (1)
1389-1434: Archived project detection is OK, but can get expensive.
`get_all_cached_projects()` calls `glob("*.jsonl")` per project path (Line 1421). On large caches this can become a bottleneck. Consider storing `has_jsonl`/`is_archived` as a derived DB field updated during scans, or batching filesystem checks.
test/test_tui.py-16-16 (1)
16-16: Good: `ProjectSelector` is now exercised, but this is a public-surface signal.
Since tests now import `ProjectSelector` directly, consider whether `claude_code_log.tui` intends to expose it as part of its supported API surface (or if tests should import from an internal module path instead).
test/test_tui.py-1684-1935 (1)
1684-1935: Several ProjectSelector tests write JSON without explicit encoding.
A few `Path.write_text(...)` calls omit `encoding="utf-8"` (e.g., Lines 1694, 1698, 1741-1742, 1803, 1834, 1919). For consistency with the rest of the suite (and Windows stability), set the encoding explicitly.
Proposed patch

- (project1 / "session-1.jsonl").write_text('{"type":"user"}\n')
+ (project1 / "session-1.jsonl").write_text('{"type":"user"}\n', encoding="utf-8")
@@
- (project2 / "session-2.jsonl").write_text('{"type":"user"}\n')
+ (project2 / "session-2.jsonl").write_text('{"type":"user"}\n', encoding="utf-8")
@@
- jsonl1.write_text('{"type":"user"}\n')
- jsonl2.write_text('{"type":"user"}\n')
+ jsonl1.write_text('{"type":"user"}\n', encoding="utf-8")
+ jsonl2.write_text('{"type":"user"}\n', encoding="utf-8")
@@
- jsonl.write_text('{"type":"user"}\n')
+ jsonl.write_text('{"type":"user"}\n', encoding="utf-8")
@@
- (project_path / "session-1.jsonl").write_text('{"type":"user"}\n')
+ (project_path / "session-1.jsonl").write_text('{"type":"user"}\n', encoding="utf-8")

claude_code_log/cache.py-119-163 (1)
119-163: Narrow exception handling for better diagnostics; the tomllib suggestion is not applicable here.
Since `toml` is already a declared dependency and the project targets Python 3.10+, using `tomllib` (only available in 3.11+) would add conditional logic without benefit. However, the broad `except Exception` blocks in this function mask the actual error (e.g., `toml.TomlDecodeError`, `FileNotFoundError`, import errors). Catch only the expected exceptions so that unexpected failures surface and are easier to debug.
claude_code_log/cache.py-189-205 (1)
189-205: Move `PRAGMA journal_mode = WAL` to initialization with environment-aware error handling.
Setting `PRAGMA journal_mode = WAL` on every connection (line 195) is redundant since WAL mode is persistent at the database level. Set it once during `_init_database()` instead, with graceful error handling for environments that don't support WAL (read-only filesystems, network mounts, etc.).
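A sketch of the suggested once-at-init approach, using only the stdlib `sqlite3` API (the function name is illustrative; the real `_init_database()` also runs migrations):

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "cache.db")


def init_database(path: str) -> sqlite3.Connection:
    """Open the database and enable WAL once; WAL is persistent in the
    file, so per-connection PRAGMAs are unnecessary afterwards."""
    conn = sqlite3.connect(path, timeout=30.0)
    try:
        conn.execute("PRAGMA journal_mode = WAL")
    except sqlite3.OperationalError:
        # Some environments (network mounts, read-only FS) reject WAL;
        # fall back to the default journal mode rather than failing.
        pass
    return conn


init_database(db_path).close()

# A fresh connection still sees WAL without re-issuing the PRAGMA.
mode = sqlite3.connect(db_path).execute("PRAGMA journal_mode").fetchone()[0]
print(mode)  # → wal
```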
🧹 Nitpick comments (8)
test/test_cache_integration.py (2)
149-174: Write JSONL with explicit `encoding="utf-8"` and `newline="\n"` everywhere (cross-platform stability).
New tests mix explicit UTF-8 in some places and default encoding in many others; JSONL fixtures with non-ASCII content will intermittently fail on Windows/macOS locales.
Proposed fix (apply pattern everywhere you write JSONL)
- with open(jsonl_file, "w") as f:
+ with open(jsonl_file, "w", encoding="utf-8", newline="\n") as f:
      for entry in sample_jsonl_data:
          f.write(json.dumps(entry) + "\n")

Also applies to: 205-213, 226-231, 242-266, 325-328, 457-461, 491-497, 519-522, 548-551, 576-579, 599-602, 627-633, 658-661, 706-709, 736-739, 784-787, 823-826
445-616: Archive/restore workflow: good coverage, but restoration should preserve JSONL file invariants explicitly.
Consider writing restored JSONL with `encoding="utf-8"` + `newline="\n"` (and ideally ensuring a trailing newline) to avoid platform diffs; also consider asserting that restore preserves the ordering guarantees expected by the loader (timestamp plus a stable tie-breaker, if applicable).
claude_code_log/html/templates/components/project_card_styles.css (1)
116-138: Avoid `opacity` for archived cards (it reduces text contrast); prefer non-destructive styling.
Consider removing the opacity and instead adjusting the border/background (or adding a subtle grayscale) so the text remains fully legible.
Possible tweak
 .project-card.archived {
-  opacity: 0.6;
-  background-color: #f5f5f522;
+  background-color: #f5f5f522;
+  border-left-color: #88888855;
 }
 .project-card.archived:hover {
-  opacity: 0.8;
+  /* keep full text contrast */
+  background-color: #f5f5f533;
 }
4318-4396: Page navigation CSS is duplicated across contexts; consider centralizing it, and avoid no-op pseudo-elements.
The `.page-navigation` block matches the paginated combined transcript UI, but it's repeated in multiple snapshots; if this originates from templates, consider consolidating it into a shared stylesheet/component to avoid style drift. Also, `.page-nav-link.prev::before` and `.page-nav-link.next::after` set `content: ''` (Lines 4385-4391 and repeats), which is a no-op; either remove those rules or render meaningful arrow content/labels in HTML/CSS.
Also applies to: 9200-9278, 14178-14256, 19197-19275
test/test_tui.py (1)
1192-1531: Archive/Delete modal flow coverage is strong; add a few "state changed" assertions.
The keypress-driven modal tests are valuable. A small improvement: after confirming archive/delete, also assert that the in-memory session lists update (e.g., the session is removed from `app.sessions`/`app.archived_sessions`) to ensure UI state stays consistent with filesystem/cache operations.
claude_code_log/cli.py (1)
296-348: Cache cleanup update is good; consider also cleaning derived outputs tied to `cache.db` deletion.
Deleting the shared `cache.db` is correct for "clear all". If HTML/page caches depend on DB state, consider whether you also want to remove `combined_transcripts*.html`/`session-*.html` during `--clear-cache --all-projects` (or at least message that outputs may now be stale).
claude_code_log/converter.py (1)
420-486: GenerationStats looks fine; tiny nit: default_factory can be simplified.
`field(default_factory=list)` is enough (no need for `lambda: []`).
8-8: Optional: Modernize type hints.
Line 8 imports `List` from typing, but line 1192 is the only usage (`List[str]`). Modern Python supports `list[str]` directly.
♻️ Proposed modernization

-from typing import Any, ClassVar, List, Optional, cast
+from typing import Any, ClassVar, Optional, cast

And at line 1192:

- working_directories: List[str] = []
+ working_directories: list[str] = []
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- claude_code_log/cache.py
- claude_code_log/cli.py
- claude_code_log/converter.py
- claude_code_log/html/templates/components/project_card_styles.css
- claude_code_log/html/templates/index.html
- claude_code_log/tui.py
- dev-docs/restoring-archived-sessions.md
- test/__snapshots__/test_snapshot_html.ambr
- test/test_cache_integration.py
- test/test_tui.py
✅ Files skipped from review due to trivial changes (1)
- dev-docs/restoring-archived-sessions.md
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use the `-n auto` flag for parallel test execution with pytest in Python projects (e.g., `uv run pytest -n auto -m "not (tui or browser)" -v`)
Use `ruff format` for code formatting in Python projects
Use `ruff check --fix` for linting and automatic fixes in Python projects
Use `uv run pyright` or `uv run ty check` for type checking in Python projects
Files:
- claude_code_log/cache.py
- test/test_cache_integration.py
- test/test_tui.py
- claude_code_log/cli.py
- claude_code_log/converter.py
- claude_code_log/tui.py
🧠 Learnings (5)
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/templates/components/timeline.html : When adding new message types or modifying CSS class generation in renderer.py, ensure the timeline's message type detection logic in timeline.html is updated accordingly to maintain feature parity
Applied to files:
claude_code_log/html/templates/index.html
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/renderer.py : When adding new message types or modifying CSS class generation, ensure the timeline's message type detection logic is updated accordingly and that the filter is applied consistently to messages in both the main transcript and timeline
Applied to files:
claude_code_log/html/templates/index.html
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/templates/components/timeline.html : Use Playwright to test browser runtime features and message filtering consistency in the timeline component
Applied to files:
test/__snapshots__/test_snapshot_html.ambr
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Implement interactive TUI (Terminal User Interface) for browsing Claude Code sessions with real-time navigation, summaries, and quick actions
Applied to files:
- test/__snapshots__/test_snapshot_html.ambr
- test/test_tui.py
- claude_code_log/cli.py
- claude_code_log/tui.py
📚 Learning: 2025-12-09T23:52:47.578Z
Learnt from: daaain
Repo: daaain/claude-code-log PR: 59
File: test/test_cache.py:135-165
Timestamp: 2025-12-09T23:52:47.578Z
Learning: SQLite supports NULLS FIRST and NULLS LAST in ORDER BY since v3.30.0 (Oct 2019). Do not flag SQL that uses these clauses as an error when reviewing Python tests or code that interacts with SQLite. If reviewing SQL strings, verify the target SQLite version supports NULLS FIRST/LAST and ensure the syntax is used correctly for the database in use.
Applied to files:
- test/test_cache_integration.py
- test/test_tui.py
🧬 Code graph analysis (3)
test/test_cache_integration.py (1)
claude_code_log/cache.py (8)
- CacheManager (168-1386)
- get_cached_project_data (604-676)
- get_archived_sessions (939-978)
- export_session_to_jsonl (980-1011)
- load_session_entries (1013-1036)
- delete_session (1326-1369)
- delete_project (1371-1386)
- get_all_cached_projects (1389-1433)
test/test_tui.py (1)
claude_code_log/tui.py (5)
- ProjectSelector (35-394)
- SessionBrowser (919-1803)
- run_session_browser (1825-1868)
- populate_table (108-168)
- populate_table (1102-1179)
claude_code_log/tui.py (1)
claude_code_log/cache.py (10)
- CacheManager (168-1386)
- get_library_version (119-162)
- get_cached_project_data (604-676)
- clear_cache (678-686)
- export_session_to_jsonl (980-1011)
- SessionCacheData (39-52)
- get_archived_sessions (939-978)
- get_working_directories (580-594)
- load_session_entries (1013-1036)
- delete_session (1326-1369)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.11)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (windows-latest, 3.10)
🔇 Additional comments (17)
claude_code_log/html/templates/index.html (1)
62-70: Archived marker wiring looks consistent (class + badge + hint gated).
The template change is minimal and should compose cleanly with existing project card rendering.
test/test_cache_integration.py (2)
434-437: No changes needed; the test correctly assumes CacheManager initialization creates the project row.
The test is correct. `CacheManager.__init__` calls `_ensure_project_exists()`, which always creates a project record in the database if one doesn't exist (via `_create_project()`). This happens synchronously during initialization, not during conversion/scan. Therefore, `get_cached_project_data()` will return the project row with the correct version immediately after `CacheManager` is instantiated, without requiring any additional cache population step.
115-140: No issue found; the test correctly asserts cache state.
After `--clear-cache`, `clear_cache()` deletes the old project data and creates a new project record. This means `get_cached_project_data()` always returns a non-None `ProjectCache` (with empty `cached_files` and `sessions` dicts). The test correctly asserts this state: the cache exists but contains no files or sessions.
test/__snapshots__/test_snapshot_html.ambr (2)
1679-1685: Index snapshot changes for the "open combined transcript" hint look consistent.
The whitespace/hint placement around `.transcript-link-hint` appears intentional and shouldn't affect semantics.
Also applies to: 1698-1704
11749-11749: Title update in snapshot looks intentional and consistent with the new fixture name.
`<title>` changing to `Claude Transcripts - test_multi_session_html0` aligns with the updated snapshot content.
812-813: Good: explicit UTF-8 when writing files.
This keeps tests stable across platforms/locales.
976-1190: Remove this comment; the assertion pattern is not brittle.
The code inserts plain Python strings into DataTable cells (not Rich Text objects), and `[ARCHIVED]` is literal text, not Rich markup. When `get_row_at()` returns these plain strings, `str()` will reliably preserve the literal `[ARCHIVED]` prefix. The concern about markup representation changing does not apply here since rendering happens at display time, and these tests check the cell values at insertion time, which remain stable.
Likely an incorrect or invalid review comment.
claude_code_log/cache.py (2)
1436-1445: Export list looks consistent with new API.
`__all__` includes the new models and discovery helper as expected.
440-498: All constraints and SQL features are properly implemented.
The migrations confirm:
- `cached_files` enforces `UNIQUE(project_id, file_name)` for the `INSERT OR REPLACE` at line 440
- `html_cache` enforces `UNIQUE(project_id, html_path)` for the `INSERT OR REPLACE` at line 791
- `html_pages` enforces `UNIQUE(project_id, page_number)` for the `INSERT OR REPLACE` at line 1178

The `NULLS LAST` syntax used in `ORDER BY` clauses (lines 374, 433, 996, 1032) is supported in SQLite 3.30.0+ (October 2019). Python 3.10+ (required by pyproject.toml) ships with sqlite3 3.36.0+, which provides full support for this syntax.
claude_code_log/converter.py (3)
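The interplay between those `UNIQUE` constraints and `INSERT OR REPLACE` can be seen in a toy table (schema simplified from the migration):

```python
import sqlite3

# Without the UNIQUE constraint, each "replace" would simply insert a
# duplicate row; with it, a conflict on (project_id, file_name) triggers
# deletion of the old row before the new insert.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cached_files (
        project_id INTEGER,
        file_name TEXT,
        mtime REAL,
        UNIQUE(project_id, file_name)
    )
""")

conn.execute("INSERT OR REPLACE INTO cached_files VALUES (1, 'a.jsonl', 100.0)")
conn.execute("INSERT OR REPLACE INTO cached_files VALUES (1, 'a.jsonl', 200.0)")

rows = conn.execute("SELECT * FROM cached_files").fetchall()
print(rows)  # → [(1, 'a.jsonl', 200.0)]
```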
157-165: Good: handles the JSONL deletion race cleanly.
Catching `FileNotFoundError` around `open` is the right fix for transient deletions during scans.
542-577: Paging assignment seems OK (session boundaries preserved).
The overflow-per-session behavior is consistent with the "never split sessions" intent.
1345-1483: Per-session incremental regeneration is a good step forward.
The move to `is_html_stale()` per session plus updating `html_cache` will reduce unnecessary regeneration.
claude_code_log/tui.py (5)
1681-1694: Verify project archive state transition logic.
After restoring a single session, lines 1685-1686 set `self.is_archived_project = False`. This flag appears to indicate whether the project directory contains any JSONL files (all archived) vs. some JSONL files existing (mixed or current). The logic is correct if `is_archived_project` means "project directory has zero JSONL files" (since we just created one). However, if it means "all sessions are archived," then restoring one session when others remain archived would incorrectly flip the flag. The subsequent `load_sessions()` call will recompute session states, so this may work correctly in practice. Consider adding a comment clarifying the intended semantics of `is_archived_project` to prevent future confusion.
1076-1092: Good handling of a test environment edge case.
The comment at lines 1077-1078 clearly explains why archived session detection is conditional on `valid_session_ids` (JSONL files) being present. This handles test environments where cached sessions exist without corresponding JSONL files by treating them as current rather than archived, allowing tests to proceed normally. This design choice is well-documented and makes sense in context.
1044-1050: Correct implementation of agent file filtering.
Lines 1045-1049 properly exclude `agent-*.jsonl` files from cache invalidation checks, as these files are loaded via session references rather than directly. This prevents false cache invalidation and aligns with the PR objectives. The inline comment clearly explains the rationale.
232-252: Excellent use of `check_action` for contextual command availability.
Both `ProjectSelector.check_action` (lines 232-252) and `SessionBrowser.check_action` (lines 1621-1638) properly restrict actions based on the current item's state (archived vs. current). This provides good UX by:
- Preventing archive actions on already-archived items
- Preventing restore actions on current items
- Preventing resume actions on archived sessions (no JSONL to resume from)
The conditional action availability makes the interface more intuitive and prevents error states.
Also applies to: 1621-1638
519-917: Well-designed confirmation screens with context-aware options.
The five new modal confirmation screens (Archive/Delete/Restore for both sessions and projects) follow consistent design patterns and provide excellent UX:
- Smart option presentation: Delete screens show different options based on archived state:
  - Archived items: only "delete cache" (permanent warning)
  - Current items: "cache only" vs "both JSONL + cache" options
- Appropriate visual styling: warning/error/success colors match action severity
- Consistent interaction patterns: y/n/Enter/Escape bindings work intuitively
The contextual messaging helps users understand the consequences of their actions, especially the permanence of deleting archived items that have no JSONL backup.
I read up a bit on it: JSONB support is relatively recent (SQLite 3.45.0, 2024-01-15), and between the SQLite shipped with some Python installers (but not all) and OS versions, I don't want to get into the weeds, so I'll try zlib.
Actionable comments posted: 0
🧹 Nitpick comments (6)
claude_code_log/migrations/001_initial_schema.sql (1)
6-21: Redundant index on unique column.
`idx_projects_path` on line 21 is unnecessary since `project_path` already has a `UNIQUE` constraint (line 8), which implicitly creates an index in SQLite.
♻️ Suggested fix
 );
-CREATE INDEX IF NOT EXISTS idx_projects_path ON projects(project_path);

test/test_pagination.py (1)
66-76: Assertion may be too permissive.
Using `sorted(pages[0])` on line 76 masks the actual ordering of sessions. Since sessions should be in chronological order (as tested in `test_sessions_sorted_chronologically`), this test should verify the exact order rather than just membership.
♻️ Suggested fix
 assert len(pages) == 1
-assert sorted(pages[0]) == ["s1", "s2", "s3"]
+# Sessions should be in chronological order
+assert pages[0] == ["s1", "s2", "s3"]

test/test_cache_integration.py (1)
395-411: Test may be testing an obsolete cache format.
This corruption test targets the old JSON-based cache directory (`project/cache/*.json`), but the PR migrates to SQLite. The test passes because it gracefully handles the missing/empty cache directory, but it no longer tests actual cache corruption recovery.
Consider either:
- Removing this test as the JSON cache format is deprecated
- Updating it to test SQLite database corruption scenarios
♻️ Suggested update for SQLite corruption test
 def test_cache_corruption_recovery(self, setup_test_project):
-    """Test recovery from corrupted cache files."""
+    """Test recovery from corrupted SQLite cache database."""
     project_dir = setup_test_project
     # Create initial cache
     convert_jsonl_to_html(input_path=project_dir, use_cache=True)
-    # Corrupt cache file
-    cache_dir = project_dir / "cache"
-    cache_files = list(cache_dir.glob("*.json"))
-    if cache_files:
-        cache_file = [f for f in cache_files if f.name != "index.json"][0]
-        cache_file.write_text("corrupted json data", encoding="utf-8")
+    # Corrupt SQLite database
+    db_path = project_dir.parent / "claude-code-log-cache.db"
+    if db_path.exists():
+        db_path.write_bytes(b"corrupted database content")
     # Should recover gracefully
     output = convert_jsonl_to_html(input_path=project_dir, use_cache=True)
     assert output.exists()

claude_code_log/cache.py (1)
1398-1454: Silent exception handling may hide issues.
The `get_all_cached_projects` function silently catches all exceptions on line 1439 and returns an empty list. While this prevents crashes, it could hide legitimate issues (e.g., database corruption, permission errors). Consider logging the exception for debugging purposes.
♻️ Suggested improvement
+import logging
+
 try:
     conn = sqlite3.connect(db_path, timeout=30.0)
     # ... existing code ...
-except Exception:
-    pass
+except Exception as e:
+    logging.debug(f"Failed to query cached projects from {db_path}: {e}")
 return result

claude_code_log/converter.py (2)
1586-1606: Remove redundant import.
`json` is already imported at the top of the file (line 4), so the local import on line 1594 is unnecessary.
♻️ Suggested fix
 def _get_cleanup_period_days() -> Optional[int]:
     """Read cleanupPeriodDays from Claude Code settings.

     Checks ~/.claude/settings.json for the cleanupPeriodDays setting.

     Returns:
         The configured cleanup period in days, or None if not set/readable.
     """
-    import json
-
     settings_path = Path.home() / ".claude" / "settings.json"
     if not settings_path.exists():
         return None
645-656: Simplify token access pattern for consistency with rest of codebase.The fields in
UsageInfoareOptional[int]with defaultNone, so the secondor 0is necessary to handleNonevalues. However, thegetattrcall with default0is redundant—the attribute always exists onUsageInfoinstances. Useusage.input_tokens or 0directly (as done elsewhere in this file, lines ~515-525), instead ofgetattr(usage, "input_tokens", 0) or 0.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- claude_code_log/cache.py
- claude_code_log/cli.py
- claude_code_log/converter.py
- claude_code_log/migrations/001_initial_schema.sql
- claude_code_log/tui.py
- test/test_cache.py
- test/test_cache_integration.py
- test/test_integration_realistic.py
- test/test_pagination.py
🚧 Files skipped from review as they are similar to previous changes (1)
- test/test_integration_realistic.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use the `-n auto` flag for parallel test execution with pytest in Python projects (e.g., `uv run pytest -n auto -m "not (tui or browser)" -v`)
Use `ruff format` for code formatting in Python projects
Use `ruff check --fix` for linting and automatic fixes in Python projects
Use `uv run pyright` or `uv run ty check` for type checking in Python projects
Files:
- test/test_pagination.py
- test/test_cache.py
- claude_code_log/cache.py
- test/test_cache_integration.py
- claude_code_log/converter.py
- claude_code_log/cli.py
- claude_code_log/tui.py
🧠 Learnings (3)
📚 Learning: 2025-12-09T23:52:47.578Z
Learnt from: daaain
Repo: daaain/claude-code-log PR: 59
File: test/test_cache.py:135-165
Timestamp: 2025-12-09T23:52:47.578Z
Learning: SQLite supports NULLS FIRST and NULLS LAST in ORDER BY since v3.30.0 (Oct 2019). Do not flag SQL that uses these clauses as an error when reviewing Python tests or code that interacts with SQLite. If reviewing SQL strings, verify the target SQLite version supports NULLS FIRST/LAST and ensure the syntax is used correctly for the database in use.
Applied to files:
- test/test_pagination.py
- test/test_cache.py
- test/test_cache_integration.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/renderer.py : When adding new message types or modifying CSS class generation, ensure the timeline's message type detection logic is updated accordingly and that the filter is applied consistently to messages in both the main transcript and timeline
Applied to files:
claude_code_log/converter.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Implement interactive TUI (Terminal User Interface) for browsing Claude Code sessions with real-time navigation, summaries, and quick actions
Applied to files:
- claude_code_log/cli.py
- claude_code_log/tui.py
🧬 Code graph analysis (4)
test/test_pagination.py (2)
claude_code_log/cache.py (7)
SessionCacheData (40-53), get_page_count (1315-1326), get_page_size_config (1042-1056), update_page_cache (1152-1233), get_page_data (1058-1100), is_page_stale (1235-1286), get_all_pages (1102-1150)
claude_code_log/converter.py (2)
_get_page_html_path (488-495), _assign_sessions_to_pages (542-576)
claude_code_log/cache.py (4)
claude_code_log/factories/transcript_factory.py (1)
create_transcript_entry (219-238)
claude_code_log/migrations/runner.py (1)
run_migrations (135-156)
claude_code_log/models.py (4)
AssistantTranscriptEntry (175-178), SummaryTranscriptEntry (181-186), SystemTranscriptEntry (189-200), UserTranscriptEntry (168-172)
claude_code_log/renderer.py (2)
get (121-132), is_html_outdated (2308-2320)
test/test_cache_integration.py (1)
claude_code_log/cache.py (7)
get_cached_project_data (616-688), get_archived_sessions (951-990), export_session_to_jsonl (992-1013), load_session_entries (1015-1038), delete_session (1328-1378), delete_project (1380-1395), get_all_cached_projects (1398-1442)
claude_code_log/cli.py (3)
claude_code_log/cache.py (5)
CacheManager (169-1395), get_all_cached_projects (1398-1442), get_library_version (120-163), get_cached_project_data (616-688), get_modified_files (608-614)
claude_code_log/converter.py (1)
ensure_fresh_cache (1132-1184)
claude_code_log/tui.py (2)
run_session_browser (1827-1870), run_project_selector (1808-1824)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.11)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (windows-latest, 3.10)
🔇 Additional comments (46)
claude_code_log/migrations/001_initial_schema.sql (1)
24-114: Well-structured schema with appropriate normalization. The schema design is solid:
- Proper foreign key relationships with ON DELETE CASCADE ensure data integrity
- Token counts are denormalized at session level for efficient aggregation queries
- Indexes cover the primary query patterns (by timestamp, project, file, session, uuid)
- Using BLOB for compressed content is appropriate for the zlib-compressed JSON storage
test/test_pagination.py (3)
142-346: LGTM! Fixtures and TestPageCacheMethods provide good coverage of the page cache API:
- Empty state handling
- Page creation and retrieval
- Staleness detection with page size changes
- Invalidation and cleanup
351-550: LGTM! The _create_session_messages helper and TestPaginationIntegration class provide excellent end-to-end coverage including:
- Single page vs multi-page scenarios
- Page size configuration changes
- Navigation link presence
- Statistics in generated HTML
553-736: LGTM! The remaining test classes provide good coverage:
- TestNextLinkInPlaceUpdate: Verifies in-place HTML modification for next links
- TestPaginationNextLinkVisibility: Ensures correct visibility state across pages
- TestPaginationFallbackWithoutCache: Critical test ensuring the system degrades gracefully when cache data is unavailable
test/test_cache_integration.py (4)
85-191: LGTM! The CLI integration tests correctly verify the new SQLite cache behavior:
- Cache database creation at parent level
- Cache clearing semantics
- Multi-project caching with --all-projects
- Date filtering with cached data
193-359: LGTM! The converter and performance integration tests appropriately verify:
- Cache creation and utilization during conversion
- Hierarchy processing with multi-project caching
- Graceful handling of cache bypass with use_cache=False
The performance test correctly focuses on completion rather than timing assertions, which would be flaky in CI.
445-853: LGTM! Excellent test coverage for the archived sessions functionality:
- Detection of archived sessions after JSONL deletion
- Partial archival scenarios
- Full export/restore workflow with data integrity verification
- Message ordering preservation
- Session and project deletion with cascade behavior
- Edge cases like empty sessions and non-existent entries
855-913: LGTM! The TestGetAllCachedProjects class properly verifies:
- Discovery of both active and archived projects with correct is_archived flags
- Graceful handling of empty directories (no cache DB)
- Graceful handling of nonexistent directories
claude_code_log/cli.py (6)
28-52: LGTM! The _discover_projects function correctly:
- Identifies active projects (directories with JSONL files)
- Identifies archived projects from cache (in DB but no JSONL files)
- Prevents duplicates by checking against active project paths
- Returns both lists for downstream use
54-106: LGTM! The updated _launch_tui_with_cache_check properly differentiates:
- Archived projects: Load directly from cache, error if no sessions found
- Active projects: Check for modified files, rebuild cache if needed
The user feedback messages are clear and informative.
225-293: LGTM! The refactoring to use get_working_directories() directly instead of loading the full ProjectCache is more efficient for this use case. The logic remains correct for relative path matching.
296-354: LGTM! The _clear_caches function now properly handles:
- Deletion of the centralized SQLite cache database
- Cleanup of legacy per-project JSON cache directories
- Both --all-projects mode and single directory mode
This ensures clean migration from the old cache format.
498-503: LGTM! The --page-size option is well-documented with a sensible default (2000 messages) and clear help text explaining that sessions are never split across pages.
553-640: LGTM! The TUI flow updates correctly:
- Re-discover projects on each loop iteration (handles restores/changes)
- Pass the archived_projects set to the project selector
- Determine is_archived state before launching the session browser
- Handle the "back_to_projects" navigation flow
test/test_cache.py (4)
26-34: LGTM! The fixture correctly creates a project subdirectory so each test gets a unique database path (temp_dir/project/ results in temp_dir/claude-code-log-cache.db), ensuring test isolation.
141-170: LGTM! The test_message_storage_with_timestamps test properly validates the SQLite storage structure by:
- Querying the database directly
- Filtering by project_id for test isolation
- Using ORDER BY timestamp NULLS LAST (supported since SQLite 3.30.0 per learnings)
- Verifying correct timestamp ordering and NULL handling for summaries
266-328: Excellent edge case test! This test catches an important bug where timestamps with a 'Z' suffix could be incorrectly excluded due to string comparison issues with the .999999 microseconds format. The detailed assertion message helps diagnose failures.
330-704: LGTM! The remaining test classes properly validate:
- Library version detection with multiple fallback strategies
- Cache version compatibility with breaking change patterns
- Error handling for missing cache entries (SQLite-aware assertions)
The version compatibility tests are thorough, covering exact matches, patterns, and multiple breaking changes.
claude_code_log/cache.py (11)
56-86: LGTM! The new HtmlCacheEntry and PageCacheData models are well-defined with appropriate fields for tracking HTML generation state and pagination metadata.
120-163: LGTM! The get_library_version function provides robust version detection with three fallback strategies:
- Package metadata (for installed packages)
- importlib.resources + toml (for development)
- File-relative path (final fallback)
Each strategy is properly wrapped in try/except to gracefully fall through.
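A minimal sketch of the first strategy with a graceful fall-through, using only the stdlib importlib.metadata (illustrative names, not the project's actual get_library_version):

```python
from importlib import metadata


def get_version_with_fallback(package: str, default: str = "0.0.0") -> str:
    """Try installed-package metadata first; fall back to a default."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # A real implementation would try further strategies here
        # (pyproject.toml via importlib.resources, then a file-relative path).
        return default


print(get_version_with_fallback("surely-not-installed-abc123"))  # 0.0.0
```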
169-261: LGTM! The CacheManager initialization is well-designed:
- Uses context manager pattern for safe connection handling
- Enables foreign keys and WAL mode for integrity and concurrency
- Runs migrations automatically on initialization
- Properly handles version incompatibility by clearing stale cache data
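The two pragmas mentioned above can be demonstrated in isolation (a standalone sketch, not the project's actual CacheManager code):

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "cache.db")
conn = sqlite3.connect(db_path)
# Referential integrity is off by default in SQLite; enable it per connection
conn.execute("PRAGMA foreign_keys = ON")
# WAL lets readers proceed concurrently while a writer is active
journal_mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
fk_enabled = conn.execute("PRAGMA foreign_keys").fetchone()[0]
print(journal_mode, fk_enabled)  # wal 1
conn.close()
```

Note that `PRAGMA foreign_keys` is per-connection, while the WAL journal mode persists in the database file.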
263-328: LGTM! The serialization/deserialization methods are well-implemented:
- Properly extracts type-specific fields for each TranscriptEntry variant
- Uses zlib.compress for efficient BLOB storage (addresses the PR concern about DB size)
- Uses compact JSON format (separators=(",", ":"))
- Deserialization leverages the create_transcript_entry factory for clean type dispatch
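A self-contained sketch of this round-trip, assuming the same zlib + compact-JSON approach (illustrative names only):

```python
import json
import zlib

entry = {"type": "assistant", "uuid": "abc", "content": "hello " * 50}


def serialize(obj: dict) -> bytes:
    # Compact separators drop the spaces json.dumps inserts by default
    return zlib.compress(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def deserialize(blob: bytes) -> dict:
    return json.loads(zlib.decompress(blob).decode("utf-8"))


blob = serialize(entry)
assert deserialize(blob) == entry
# Repetitive JSON compresses well, which keeps the BLOB column small
print(len(json.dumps(entry)), "->", len(blob))
```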
343-364: LGTM! The is_file_cached method correctly handles:
- Missing project ID (returns False)
- Non-existent files (returns False)
- Modification time comparison with 1-second tolerance (handles filesystem precision differences)
383-450: LGTM! The load_cached_entries_filtered method properly handles:
- Date parsing with dateparser for flexible user input
- Normalization to UTC 'Z' format for consistent string comparison with stored timestamps
- NULL timestamp handling (summaries are always included)
- SQL-based filtering for efficiency
This addresses the edge case tested in test_filtered_loading_with_z_suffix_boundary.
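The UTC-'Z' normalization can be sketched like this (a hypothetical helper, not the project's actual implementation):

```python
from datetime import datetime, timezone


def normalize_to_utc_z(dt: datetime) -> str:
    """Render a datetime as an ISO-8601 UTC string with a 'Z' suffix."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive inputs are UTC
    return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


print(normalize_to_utc_z(datetime(2025, 6, 8, 17, 30)))  # 2025-06-08T17:30:00Z
```

Strings normalized this way compare correctly with stored 'Z'-suffixed timestamps because lexicographic and chronological order coincide for a fixed ISO-8601 layout.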
515-688: LGTM! The session and project management methods are well-implemented:
- update_session_cache: Uses INSERT OR REPLACE for clean upsert semantics
- update_project_aggregates: Updates denormalized counts efficiently
- get_working_directories: Simple distinct query for CWD matching
- get_cached_project_data: Properly assembles the full ProjectCache from multiple tables
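The INSERT OR REPLACE upsert semantics can be shown in isolation (hypothetical table name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (session_id TEXT PRIMARY KEY, message_count INTEGER)"
)
# INSERT OR REPLACE deletes any row that conflicts on the primary key and
# inserts the new one, giving simple upsert semantics
for count in (3, 7):
    conn.execute(
        "INSERT OR REPLACE INTO sessions (session_id, message_count) VALUES (?, ?)",
        ("s1", count),
    )
rows = conn.execute("SELECT session_id, message_count FROM sessions").fetchall()
print(rows)  # [('s1', 7)]
```

One caveat of OR REPLACE is that the conflicting row is deleted and re-inserted, so ON DELETE CASCADE foreign keys referencing it will fire.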
762-925: LGTM! The HTML cache methods provide comprehensive staleness detection:
- is_html_stale: Checks library version, file existence, embedded version, and message count changes
- get_stale_sessions: Properly skips archived sessions to avoid perpetual regeneration attempts
- Uses the is_html_outdated helper from the renderer for embedded version checking
951-1038: LGTM! The archived session methods provide the foundation for the restore workflow:
- get_archived_sessions: Returns session data for sessions whose JSONL files were deleted
- export_session_to_jsonl: Returns compact JSON strings ready for file restoration
- load_session_entries: Deserializes entries for HTML/Markdown rendering of archived sessions
1040-1326: LGTM! The page cache methods support the pagination feature:
- update_page_cache: Maintains page metadata and session mappings with proper upsert logic
- is_page_stale: Detects configuration changes, version mismatches, and missing sessions
- invalidate_all_pages: Returns paths for downstream cleanup when re-pagination is needed
- Session ordering is preserved through the session_order column
1328-1395: LGTM! The delete methods properly clean up related data:
- delete_session: Removes messages, HTML cache entries, the cached_files entry, and the session record
- delete_project: Relies on CASCADE delete for clean removal
The explicit deletion in delete_session (rather than relying only on CASCADE) ensures the cached_files entry is removed, which is critical for is_file_cached() behavior as tested in test_delete_session_invalidates_file_cache.
claude_code_log/converter.py (8)
420-486: LGTM! The GenerationStats dataclass is well-structured with proper use of field(default_factory=...) for mutable defaults. The summary method provides useful diagnostic output.
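The field(default_factory=...) pattern being praised can be shown with a hypothetical dataclass (not the actual GenerationStats):

```python
from dataclasses import dataclass, field


@dataclass
class Stats:
    # default_factory gives each instance its own list; a shared mutable
    # default (`= []`) is rejected by dataclasses for exactly this reason
    errors: list = field(default_factory=list)
    files_processed: int = 0


a, b = Stats(), Stats()
a.errors.append("oops")
print(a.errors, b.errors)  # ['oops'] []
```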
542-577: LGTM! The pagination logic correctly ensures sessions are never split across pages, allowing pages to overflow the limit rather than breaking session continuity.
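A minimal sketch of this "never split a session" assignment, assuming pages may overflow rather than break a session (hypothetical helper, not the actual _assign_sessions_to_pages):

```python
def assign_sessions_to_pages(session_counts, page_size):
    """Greedy assignment: open a new page only once the current one is full.

    session_counts: list of (session_id, message_count) in display order.
    A session is always placed whole, so a page may exceed page_size.
    """
    pages, current, count = [], [], 0
    for session_id, n in session_counts:
        if current and count >= page_size:
            pages.append(current)
            current, count = [], 0
        current.append(session_id)
        count += n
    if current:
        pages.append(current)
    return pages


pages = assign_sessions_to_pages([("a", 1), ("b", 3), ("c", 1)], page_size=2)
print(pages)  # [['a', 'b'], ['c']]
```

Session "b" overflows the first page to 4 messages instead of being split across a page boundary.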
676-862: LGTM! The paginated HTML generation handles edge cases well: page size changes trigger full regeneration, orphan pages are cleaned up, and empty projects still get a single page. The staleness checks avoid unnecessary regeneration.
157-166: LGTM! Good defensive handling for the race condition where a file may be deleted between discovery and opening. This is a common issue with log files that can be cleaned up by external processes.
1099-1130: LGTM! The fast cache invalidation check correctly defers expensive message loading while conservatively detecting when updates are needed.
1007-1024: LGTM! The pagination decision logic correctly handles both the initial case (messages exceed page_size) and the hysteresis case (existing pages should be maintained even if the message count decreases).
1686-1760: LGTM! The refactored stats aggregation provides clear fast/slow paths, correctly handles both paginated and non-paginated projects, and tracks archived session counts appropriately.
1973-2028: No changes needed: CacheManager already handles archived projects without filesystem access. CacheManager stores the project path as a string identifier in the SQLite database and performs all operations (initialization, cache lookup, working directory retrieval) purely through database queries. It never accesses the filesystem for the project_path, so non-existent directories pose no issue for archived projects. The exception handling at lines 1976-1983 provides additional safety if any unexpected errors occur.
Likely an incorrect or invalid review comment.
claude_code_log/tui.py (9)
100-168: LGTM! The archived project handling in ProjectSelector correctly shows indicators, skips cache building for archived projects, and properly strips the indicator when resolving selections.
519-917: LGTM! The confirmation modal screens follow consistent patterns, properly scope CSS, and implement appropriate keybindings. The conditional behavior for archived vs non-archived items is correctly handled.
1078-1095: Potential inconsistency in archived session detection. When valid_session_ids is empty (no JSONL files), all cached sessions are treated as "current" rather than "archived". The comment mentions test environments, but this contradicts the archival model, where sessions without JSONL files should be archived.
Consider whether this special case is intentional or whether it should instead mark all sessions as archived when no JSONL files exist (similar to the is_archived_project handling above).
1827-1864: LGTM! The run_session_browser function properly handles archived projects that may not have JSONL files or even the project directory, falling back to cached data when available.
1623-1640: LGTM! The check_action method correctly gates actions based on session state: archive and resume only work for current sessions, restore only works for archived sessions.
1135-1181: LGTM! The merged display of current and archived sessions maintains chronological ordering and provides clear visual distinction with the [ARCHIVED] prefix.
309-340: The clear_cache() method exists in the CacheManager class and is correctly implemented. No action required.
1654-1671: The export_session_to_jsonl() method exists in the CacheManager class and has the correct signature. It returns List[str] containing decompressed JSON strings, which matches the usage in the code: iterating over the list and writing each string as a line in the JSONL file is correct.
1780-1784: delete_session() method exists and works as expected.
The CacheManager.delete_session() method is implemented at claude_code_log/cache.py:1328 with the correct signature def delete_session(self, session_id: str) -> bool:. It returns True on successful deletion and False if the session is not found or the project ID is invalid. The code correctly captures and uses the boolean return value.
|
Update: the new db is actually 592MB after I realised that there were some sessions that I only had in the old cache and had to restore 🙀 this is an amazing coincidence, I've been working on a new app that uses Claude Code as inference backend and a rogue local installation must have cleared up my old sessions as I was experimenting with it!
Anyway, a bit less than half the size is still pretty good going! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (7)
test/test_cli.py (1)
84-94: Helper function is well-structured but could specify encoding. The create_project_with_jsonl helper writes JSONL without explicit encoding. For consistency with other test files and to prevent potential encoding issues on Windows, consider specifying UTF-8 encoding.
Suggested fix:

```diff
 def create_project_with_jsonl(
     projects_dir: Path, name: str, jsonl_data: list[dict]
 ) -> Path:
     """Helper to create a project directory with JSONL file."""
     project_dir = projects_dir / name
     project_dir.mkdir(exist_ok=True)
     jsonl_file = project_dir / "session-1.jsonl"
-    with open(jsonl_file, "w") as f:
+    with open(jsonl_file, "w", encoding="utf-8") as f:
         for entry in jsonl_data:
             f.write(json.dumps(entry) + "\n")
     return project_dir
```

test/test_renderer_timings.py (2)
12-19: Empty test body for default state. The test_debug_timing_disabled_by_default test has no assertions. While the docstring explains the limitation, consider either adding an assertion or marking it as a documentation-only test.
Suggested approach:

```diff
 def test_debug_timing_disabled_by_default(self):
     """DEBUG_TIMING is False by default."""
-    # Import with fresh module state
-
-    # Note: We can't easily test the default since the module is already loaded
-    # This test just documents the expected default behavior
-    # The actual value depends on environment at import time
+    # Note: We can't easily test the true default since the module is already
+    # loaded. This test verifies the behavior when env var is unset.
+    import os
+    import importlib
+    import claude_code_log.renderer_timings as rt
+
+    # Ensure env var is not set
+    os.environ.pop("CLAUDE_CODE_LOG_DEBUG_TIMING", None)
+    importlib.reload(rt)
+
+    # Should default to False when env var is not set
+    assert rt.DEBUG_TIMING is False
```
165-204: Consider timing reliability for timing_stat tests. The test uses time.sleep(0.01) and asserts duration >= 0.01. While generally reliable, this could be flaky under heavy system load. Consider using a slightly larger margin or a more lenient assertion.
Suggested improvement for test reliability:

```diff
 def test_tracks_operation_timing_when_enabled(
     self, monkeypatch: pytest.MonkeyPatch
 ):
     """Tracks operation timing when DEBUG_TIMING enabled."""
     monkeypatch.setenv("CLAUDE_CODE_LOG_DEBUG_TIMING", "1")
     import importlib
     import claude_code_log.renderer_timings as rt

     importlib.reload(rt)
     rt._timing_data.clear()
     rt._timing_data["_test_timings"] = []
     rt._timing_data["_current_msg_id"] = "msg-123"

     with rt.timing_stat("_test_timings"):
         time.sleep(0.01)

     assert len(rt._timing_data["_test_timings"]) == 1
     duration, msg_id = rt._timing_data["_test_timings"][0]
-    assert duration >= 0.01
+    assert duration >= 0.009  # Allow small margin for system load
     assert msg_id == "msg-123"
```
186-200: Potential race condition in pending migrations test. The test closes the connection after _ensure_schema_version_table, then calls run_migrations(db_path), which creates a new connection, then reconnects to check pending migrations. While this works, the pattern of reconnecting could miss edge cases.
Consider using consistent connection handling:

```diff
 def test_none_pending_after_all_applied(self, tmp_path: Path):
     """No migrations pending after all applied."""
     db_path = tmp_path / "test.db"
-    conn = sqlite3.connect(db_path)

     # Run all migrations
     run_migrations(db_path)

     # Reconnect and check
     conn = sqlite3.connect(db_path)
-    pending = get_pending_migrations(conn)
-    assert len(pending) == 0
-
-    conn.close()
+    try:
+        pending = get_pending_migrations(conn)
+        assert len(pending) == 0
+    finally:
+        conn.close()
```
442-452: Consider the to_date time handling. When to_date is a specific time (e.g., "2025-06-08 15:00"), the code still sets it to end-of-day (23:59:59). This may be intentional for inclusive day filtering, but if precise time filtering is desired, you might want to skip the replacement for timestamps that already include a time component.
521-544: Consider executemany for batch insertion. The loop inserts entries one at a time. For large files with many messages, using executemany with a list of parameter tuples could improve performance. This is a minor optimization and may not be critical given the current use patterns.
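The suggested executemany batching can be sketched as follows (illustrative table and data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (uuid TEXT, body TEXT)")
rows = [(f"uuid-{i}", f"message {i}") for i in range(100)]
# One executemany call replaces 100 round-trips through execute
conn.executemany("INSERT INTO messages (uuid, body) VALUES (?, ?)", rows)
count = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(count)  # 100
```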
1459-1481: Silent exception handling may hide issues. The broad except Exception: pass at lines 1480-1481 silently swallows all errors, including potential database corruption or schema issues. Consider logging the exception for debugging purposes.
Proposed improvement:

```diff
-    except Exception:
-        pass
+    except Exception as e:
+        import logging
+        logging.debug(f"Failed to query cached projects: {e}")
     return result
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
claude_code_log/cache.py, claude_code_log/cli.py, test/conftest.py, test/test_cache.py, test/test_cache_integration.py, test/test_cache_sqlite_integrity.py, test/test_cli.py, test/test_migrations.py, test/test_renderer_timings.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use the `-n auto` flag for parallel test execution with pytest in Python projects (e.g., `uv run pytest -n auto -m "not (tui or browser)" -v`)
Use `ruff format` for code formatting in Python projects
Use `ruff check --fix` for linting and automatic fixes in Python projects
Use `uv run pyright` or `uv run ty check` for type checking in Python projects
Files:
test/test_cache_sqlite_integrity.py, test/test_cli.py, test/test_cache_integration.py, claude_code_log/cli.py, test/test_cache.py, test/test_renderer_timings.py, claude_code_log/cache.py, test/conftest.py, test/test_migrations.py
🧠 Learnings (5)
📚 Learning: 2025-12-09T23:52:47.578Z
Learnt from: daaain
Repo: daaain/claude-code-log PR: 59
File: test/test_cache.py:135-165
Timestamp: 2025-12-09T23:52:47.578Z
Learning: SQLite supports NULLS FIRST and NULLS LAST in ORDER BY since v3.30.0 (Oct 2019). Do not flag SQL that uses these clauses as an error when reviewing Python tests or code that interacts with SQLite. If reviewing SQL strings, verify the target SQLite version supports NULLS FIRST/LAST and ensure the syntax is used correctly for the database in use.
Applied to files:
test/test_cache_sqlite_integrity.py, test/test_cli.py, test/test_cache_integration.py, test/test_cache.py, test/test_renderer_timings.py, test/conftest.py, test/test_migrations.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Implement interactive TUI (Terminal User Interface) for browsing Claude Code sessions with real-time navigation, summaries, and quick actions
Applied to files:
claude_code_log/cli.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/templates/components/timeline.html : Use Playwright to test browser runtime features and message filtering consistency in the timeline component
Applied to files:
test/test_renderer_timings.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/templates/components/timeline.html : When adding new message types or modifying CSS class generation in renderer.py, ensure the timeline's message type detection logic in timeline.html is updated accordingly to maintain feature parity
Applied to files:
test/test_renderer_timings.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/renderer.py : When adding new message types or modifying CSS class generation, ensure the timeline's message type detection logic is updated accordingly and that the filter is applied consistently to messages in both the main transcript and timeline
Applied to files:
test/test_renderer_timings.py
🧬 Code graph analysis (5)
test/test_cache_sqlite_integrity.py (2)
claude_code_log/cache.py (9)
SessionCacheData (41-54), save_cached_entries (483-544), update_session_cache (546-579), _get_connection (222-231), load_cached_entries (397-412), load_cached_entries_filtered (414-481), get_cache_stats (761-791), get_working_directories (623-637), is_file_cached (374-395)
claude_code_log/models.py (9)
AssistantMessageModel (131-141), AssistantTranscriptEntry (175-178), TextContent (55-59), ThinkingContent (107-110), ToolResultContent (99-104), ToolUseContent (92-96), UsageInfo (81-89), UserMessageModel (123-128), UserTranscriptEntry (168-172)
test/test_cache.py (2)
test/test_cache_integration.py (2)
parent (118-119), exists (121-122)
claude_code_log/cache.py (2)
CacheManager (190-1426), get_all_cached_projects (1429-1483)
test/test_renderer_timings.py (2)
claude_code_log/renderer_timings.py (4)
set_timing_var (24-32), log_timing (36-84), timing_stat (88-110), report_timing_statistics (113-134)
claude_code_log/renderer.py (1)
get (121-132)
claude_code_log/cache.py (4)
claude_code_log/factories/transcript_factory.py (1)
create_transcript_entry(219-238)claude_code_log/migrations/runner.py (1)
run_migrations(135-156)claude_code_log/models.py (5)
AssistantTranscriptEntry(175-178)QueueOperationTranscriptEntry(203-220)SummaryTranscriptEntry(181-186)SystemTranscriptEntry(189-200)UserTranscriptEntry(168-172)claude_code_log/renderer.py (2)
get(121-132)is_html_outdated(2308-2320)
test/test_migrations.py (1)
claude_code_log/migrations/runner.py (10)
_compute_checksum (16-18), _ensure_schema_version_table (29-56), _parse_migration_number (21-26), apply_migration (92-110), get_applied_migrations (59-65), get_available_migrations (68-82), get_current_version (159-163), get_pending_migrations (85-89), run_migrations (135-156), verify_migrations (113-132)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.11)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (windows-latest, 3.11)
🔇 Additional comments (53)
test/conftest.py (2)
4-9: LGTM! Clean TYPE_CHECKING pattern for type hints. The TYPE_CHECKING block correctly imports CacheManager only for type annotation purposes, avoiding runtime import overhead and potential circular dependency issues.
17-51: Well-designed fixtures for test isolation. The three new fixtures provide proper SQLite database isolation for parallel test execution:
- isolated_cache_dir creates a unique project directory
- isolated_db_path provides a unique database path
- isolated_cache_manager combines both, with a local import to avoid circular dependencies
The local import of CacheManager inside the fixture body is the correct approach when the type is only needed in a TYPE_CHECKING block.
test/test_cli.py (5)
21-41: Good fixture design for CLI test isolation. The ProjectsSetup class and cli_projects_setup fixture properly encapsulate both the projects directory and the isolated database path, using an environment variable override for cache isolation. This enables parallel test execution without database conflicts.
106-162: Comprehensive project discovery tests. The TestDiscoverProjects class thoroughly tests:
- Active project discovery with JSONL files
- Archived project detection from cache
- Empty directory handling
Good coverage of the discovery logic including edge cases.
165-241: Cache clearing tests are well-structured. Tests properly verify:
- Single project cache clearing
- All projects cache clearing (database deletion)
- No-op behavior for single file paths
The assertions correctly check cache state before and after operations.
313-429: Good CLI integration test coverage. The main command tests cover key scenarios including help output, default behavior, cache/HTML clearing flags, format options, no-cache mode, and custom output paths. The assertions appropriately verify both exit codes and expected side effects.
432-469: Error handling tests appropriately verify graceful degradation. Tests verify the CLI handles edge cases without crashing:
- Invalid format options
- Empty project directories
- Malformed JSONL content
Using exit_code in (0, 1) is reasonable for tests where the exact behavior may vary based on error handling paths.
test/test_renderer_timings.py (3)
20-59: Good coverage of the DEBUG_TIMING environment variable. Tests correctly verify that various truthy values (1, true, yes, TRUE) enable the debug timing flag. The module reload pattern ensures fresh state for each test.
62-89: Thorough tests for set_timing_var behavior. Tests verify both enabled and disabled states, ensuring variables are stored only when DEBUG_TIMING is active.
207-291: Comprehensive report_timing_statistics tests. Tests verify output format, sorting behavior, empty handling, and the 10-slowest limit. The string position assertions for sorting order are clever and effective.
test/test_cache_sqlite_integrity.py (9)
29-32: Local fixture properly overrides conftest fixture. The local cache_manager fixture with an explicit db_path ensures test isolation while using the same pattern as the conftest fixtures.
84-176: Thorough cascade delete verification. The test properly verifies that deleting a project cascades to all related records (files, messages, sessions). Direct SQL verification ensures the cascade behavior works at the database level.
247-286: Good foreign key constraint tests. Tests verify that SQLite enforces referential integrity by expecting sqlite3.IntegrityError for invalid foreign key references.
289-406: Excellent serialization round-trip test coverage. Tests verify that complex content types (tool_use, tool_result, thinking) survive JSON serialization and deserialization. The field-by-field comparison is appropriate since default values may differ.
629-636: WAL mode verification is straightforward and correct. A simple test confirms the WAL journal mode is enabled as expected.
639-688: Concurrent reader test is well-designed. The test verifies that multiple reader threads can access the database simultaneously without blocking. Using 5 threads with a 5-second timeout is reasonable for this verification.
690-736: Performance test uses appropriate scale. Using 1000 entries (reduced from 10k per the comment) balances coverage with test execution time. The 2-second threshold is reasonable for filtered loading.
900-928: Consider potential flakiness in the mtime detection test. The test uses time.sleep(1.1) to ensure a different mtime. While this should be sufficient on most systems, filesystem mtime resolution varies. The existing approach is reasonable but worth noting.
930-968: Migration integrity tests are comprehensive. Tests verify checksum storage, SHA256 format, and idempotent migration application. These are important for ensuring database schema evolution works correctly.
test/test_cache_integration.py (6)
17-23: Clean container class for test setup data. ProjectSetup provides a simple container for bundling related test context.
96-139: Well-designed Path-like wrapper for backward compatibility. ProjectWithCache implements the essential Path-like interface (__fspath__, __truediv__, glob, iterdir, etc.) to maintain compatibility with existing tests while carrying the additional db_path context. This is a pragmatic approach to avoid rewriting all dependent tests.
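A minimal sketch of such a Path-like wrapper, limited to the dunder methods named above (this is not the actual ProjectWithCache):

```python
import os
from pathlib import Path


class PathLikeWrapper:
    """Wraps a Path plus extra context while staying usable as a path."""

    def __init__(self, path: Path, db_path: Path):
        self._path = Path(path)
        self.db_path = db_path  # extra context carried alongside

    def __fspath__(self) -> str:
        # Lets open(), os.fspath() and friends accept the wrapper directly
        return str(self._path)

    def __truediv__(self, other) -> Path:
        # Supports `wrapper / "child"` like a real Path
        return self._path / other

    def glob(self, pattern: str):
        return self._path.glob(pattern)


w = PathLikeWrapper(Path("/tmp/project"), Path("/tmp/cache.db"))
print(os.fspath(w), w / "session.jsonl")
```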
526-728: Comprehensive archived sessions integration tests. Excellent coverage of the archive workflow:
- Session archival when JSONL files are deleted
- Partial archival with some files remaining
- Export to JSONL for restoration
- Loading session entries for rendering
- Full archive/restore workflow end-to-end
- Message order preservation
- Empty/non-existent session handling
- Compact JSON export format verification
These tests thoroughly validate the new archive functionality.
810-891: Critical test for delete_session cache invalidation. The test_delete_session_invalidates_file_cache test verifies an important behavior: that deleting a session also removes the cached_files entry. The detailed docstring explains the bug this prevents. Good defensive testing.
936-994: Good coverage for the `get_all_cached_projects` function. Tests verify:
- Finding both active and archived projects
- Empty directory handling
- Non-existent directory handling
The `is_archived` flag verification correctly checks project state based on JSONL file presence.
476-492: Test targets outdated JSON cache structure; SQLite backend doesn't create cache files in project directory. The test assumes a `cache` subdirectory with JSON files at `project_dir / "cache"`, but the current implementation stores all cache data in a SQLite database (`claude-code-log-cache.db`). The `cache_dir.glob("*.json")` call will return an empty list, making the `if cache_files:` condition always False. This test doesn't actually exercise SQLite corruption recovery and should be updated to test the actual cache backend.
Likely an incorrect or invalid review comment.
test/test_migrations.py (6)
23-44: Thorough migration filename parsing tests. Tests cover standard format, multiple underscores, and invalid formats with appropriate error handling verification.
47-67: Good checksum computation tests. Verifies consistency, content sensitivity, and SHA256 output format (64 hex characters).
70-113: Schema version table tests cover upgrade path. Tests verify both fresh table creation and upgrade from old format (adding checksum column). This ensures smooth migration from previous schema versions.
232-268: Clever approach to testing modified migration warnings. Using a fake checksum to simulate a modified migration file effectively tests the verification logic without actually modifying real migration files.
271-315: Comprehensive `run_migrations` tests. Tests verify:
- All pending migrations are applied
- Core tables are created
- Multiple runs are idempotent
- Database file is created if missing
This covers the essential migration runner behaviors.
318-367: Good current version tests. Tests verify zero for a fresh database, highest version detection (even with out-of-order insertion), and alignment with available migrations after execution.
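The behaviors these migration tests pin down (checksums stored as 64-character SHA256 hex, idempotent re-runs, highest-version detection) can be sketched as a minimal runner. The migration contents, table layout, and function names below are illustrative stand-ins for the real `run_migrations`, which parses versions from filenames like `001_initial.sql`.

```python
import hashlib
import sqlite3

# Hypothetical in-memory migration set standing in for SQL files on disk.
MIGRATIONS = {
    1: "CREATE TABLE projects (id INTEGER PRIMARY KEY, path TEXT)",
    2: "CREATE TABLE messages (id INTEGER PRIMARY KEY, project_id INTEGER)",
}

def checksum(sql: str) -> str:
    # SHA256 hex digest: 64 characters, stable for identical content.
    return hashlib.sha256(sql.encode("utf-8")).hexdigest()

def run_migrations(conn: sqlite3.Connection) -> int:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS schema_version (
               version INTEGER PRIMARY KEY, checksum TEXT)"""
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]
    for version in sorted(MIGRATIONS):
        if version <= current:
            continue  # already applied: re-running is a no-op (idempotent)
        sql = MIGRATIONS[version]
        conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_version (version, checksum) VALUES (?, ?)",
            (version, checksum(sql)),
        )
    conn.commit()
    return conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]

conn = sqlite3.connect(":memory:")
assert run_migrations(conn) == 2
assert run_migrations(conn) == 2  # second run is idempotent
row = conn.execute("SELECT checksum FROM schema_version WHERE version = 1").fetchone()
assert len(row[0]) == 64  # SHA256 hex format
```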
test/test_cache.py (7)
26-33: Good fixture update for SQLite isolation. Creating a project subdirectory ensures each test gets a unique parent directory for the shared cache database, enabling proper test isolation.
98-116: Updated initialization and path tests reflect SQLite backend. Tests correctly verify that the SQLite database is created at the parent level (`parent/claude-code-log-cache.db`).
141-170: Direct SQL verification is appropriate for storage tests. The test uses `ORDER BY timestamp NULLS LAST`, which is supported in SQLite 3.30.0+ (per retrieved learnings). This approach validates the internal storage structure directly.
266-328: Excellent edge case test for Z suffix timestamp boundary. This test addresses a specific bug where timestamps ending with 'Z' (e.g., `23:59:59Z`) could be incorrectly excluded due to string comparison issues with `.999999` microsecond suffixes. The detailed docstring explains the problem being tested.
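The failure mode is easy to reproduce with plain strings, since 'Z' sorts after both '.' and the digits; the values below are illustrative.

```python
from datetime import datetime, timezone

# A message stamped at the very end of the day, with a Z suffix:
msg_ts = "2025-01-15T23:59:59Z"
# A "to" bound expanded to end-of-day with a microsecond suffix:
to_bound = "2025-01-15T23:59:59.999999"

# Naive string comparison excludes the message: 'Z' (0x5A) sorts after '.' (0x2E),
# so the Z-suffixed timestamp compares greater than the bound.
assert not (msg_ts <= to_bound)

# Normalizing both sides to aware datetimes gives the correct answer.
parsed = datetime.fromisoformat(msg_ts.replace("Z", "+00:00"))
bound = datetime.fromisoformat(to_bound).replace(tzinfo=timezone.utc)
assert parsed <= bound
```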
330-352: Clear cache test updated for SQLite semantics. The test correctly verifies that cache clearing removes files and sessions from the SQLite database while maintaining the project record.
657-704: Error handling tests are thorough. Tests cover missing cache entries, missing JSONL files, and directory permission issues with appropriate fallback behavior.
707-792: Comprehensive env var precedence tests. The `TestCachePathEnvVar` class thoroughly tests the cache path resolution priority:
- Explicit `db_path` parameter (highest)
- `CLAUDE_CODE_LOG_CACHE_PATH` environment variable
- Default location (`parent/claude-code-log-cache.db`)
Tests also verify that `get_all_cached_projects` respects the same precedence rules.
claude_code_log/cli.py (7)
20-25: LGTM! Import changes correctly bring in the new SQLite-based cache utilities needed for the refactored cache system.
33-56: LGTM! The `_discover_projects` function correctly identifies active projects (with JSONL files) and archived projects (cached but without JSONL files). The logic properly differentiates between them using the cache's `is_archived` flag and cross-referencing with active project paths.
59-111: LGTM! The archived project handling is well-implemented. The function correctly:
- Skips JSONL file operations for archived projects
- Loads session data directly from cache for archived projects
- Provides appropriate error messaging when archived projects have no cached sessions
- Maintains existing cache refresh logic for non-archived projects
239-260: LGTM! Using `get_working_directories()` instead of loading the full project cache is more efficient for this specific use case. The fallback logic to build the cache and re-fetch directories is preserved correctly.
306-352: LGTM! The cache clearing logic properly handles the migration from JSON-based to SQLite-based caching:
- Deletes the centralized SQLite database for all-projects mode
- Cleans up legacy JSON cache directories for backwards compatibility
- Uses appropriate error handling with warnings for non-critical failures
503-508: LGTM! The `--page-size` CLI option is well-defined with a sensible default of 2000 messages per page. The help text clearly explains that sessions are never split across pages, which is an important constraint for pagination.
558-642: LGTM! The TUI flow correctly handles archived projects throughout:
- Re-discovers projects on each loop iteration to reflect changes from restore operations
- Consistently passes the `archived_projects` set to the project selector
- Correctly determines `is_archived` status via set membership before launching the TUI
- All three TUI branch cases (single project, single match, multiple projects) are handled uniformly
claude_code_log/cache.py (8)
57-87: LGTM! The `HtmlCacheEntry` and `PageCacheData` Pydantic models are well-structured with appropriate fields for tracking HTML generation metadata and paginated transcript pages. The models include all necessary information for cache invalidation decisions (timestamps, message counts, library version, session IDs).
170-184: LGTM! The `get_cache_db_path` function correctly implements the configuration priority (env var > default) with a unique, descriptive filename as discussed in the PR.
221-231: LGTM! The connection management is well-implemented:
- Context manager ensures connections are properly closed
- WAL mode enables concurrent reads while writing
- Foreign keys enforcement maintains referential integrity
- 30-second timeout handles contention gracefully
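A minimal sketch of such a connection context manager, assuming the pragma and timeout values described above (this is not the project's exact code):

```python
import sqlite3
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator

@contextmanager
def connect(db_path: Path) -> Iterator[sqlite3.Connection]:
    # timeout=30.0 makes writers wait on a locked database instead of failing fast.
    conn = sqlite3.connect(db_path, timeout=30.0)
    try:
        conn.execute("PRAGMA journal_mode=WAL")  # concurrent reads while writing
        conn.execute("PRAGMA foreign_keys=ON")   # enforce referential integrity
        yield conn
    finally:
        conn.close()

db = Path(tempfile.mkdtemp()) / "cache.db"
with connect(db) as conn:
    mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    fk = conn.execute("PRAGMA foreign_keys").fetchone()[0]

assert mode == "wal"
assert fk == 1
```

Note that WAL and foreign-key enforcement are per-connection settings in SQLite, which is why they belong inside the context manager rather than being set once at database creation.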
294-359: LGTM! The serialization approach is well-designed:
- Extracts queryable fields (timestamp, session_id, tokens) into indexed columns
- Stores full content as zlib-compressed JSON for space efficiency
- Deserialization correctly uses the factory pattern from relevant code snippets
- Compact JSON format (`separators=(",", ":")`) minimizes storage overhead
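The store/load round trip can be sketched in a few lines; the payload here is illustrative, and the real schema additionally splits queryable fields into their own indexed columns.

```python
import json
import zlib

# Hypothetical entry: queryable fields go to indexed columns; the full payload
# is stored as zlib-compressed compact JSON in a BLOB column.
entry = {"type": "user", "sessionId": "abc123", "message": {"content": "hi " * 200}}

blob = zlib.compress(
    json.dumps(entry, separators=(",", ":")).encode("utf-8")
)

# Round trip on load:
restored = json.loads(zlib.decompress(blob).decode("utf-8"))

assert restored == entry
# Repetitive transcript text compresses well:
assert len(blob) < len(json.dumps(entry).encode("utf-8"))
```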
863-863: Lazy import avoids circular dependency. The `from .renderer import is_html_outdated` is placed inside the method to avoid circular imports. This is a reasonable pattern, but consider documenting this dependency relationship.
1486-1496: LGTM! The `__all__` export list properly exposes all public entities needed by consumers of this module.
121-164: No action required. The `toml` package is already listed in the project dependencies (`toml>=0.10.2`), so the function's use of it in fallback scenarios is safe and properly supported.
Likely an incorrect or invalid review comment.
1393-1398: The filename assumption is correct. Session JSONL files are indeed named `{session_id}.jsonl` throughout the codebase (verified in `claude_code_log/tui.py` lines 376, 1667, 1725, 1775), so the `delete_session` method's deletion logic will properly clean up the corresponding `cached_files` entry without orphaning it.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @claude_code_log/cache.py:
- Around line 1388-1438: The delete_session method fails to remove page_sessions
entries pointing to the removed session; update the delete_session function to
also execute a DELETE FROM page_sessions WHERE project_id = ? AND session_id = ?
using self._project_id and session_id (before committing), so page_sessions no
longer reference the deleted session; ensure this new delete runs alongside the
existing deletes (messages, html_cache, cached_files, sessions) and that
self._update_last_updated(conn) and conn.commit() remain after all deletions.
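The deletion-order fix described above can be sketched against a toy schema. The table names follow the review, but the real `cached_files` table is keyed differently, so treat this purely as an illustration of the invariant: after deletion, no row in any related table still references the session.

```python
import sqlite3

# Toy schema standing in for the real tables; illustrates the invariant only.
conn = sqlite3.connect(":memory:")
tables = ("messages", "html_cache", "cached_files", "sessions", "page_sessions")
for table in tables:
    conn.execute(f"CREATE TABLE {table} (project_id INTEGER, session_id TEXT)")
for table in ("messages", "sessions", "page_sessions"):
    conn.execute(f"INSERT INTO {table} VALUES (1, 's1')")

def delete_session(conn: sqlite3.Connection, project_id: int, session_id: str) -> None:
    # The existing deletes plus page_sessions, so no page row can reference
    # a session that no longer exists; commit once after all deletions.
    for table in ("messages", "html_cache", "cached_files", "sessions", "page_sessions"):
        conn.execute(
            f"DELETE FROM {table} WHERE project_id = ? AND session_id = ?",
            (project_id, session_id),
        )
    conn.commit()

delete_session(conn, 1, "s1")
remaining = sum(
    conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables
)
assert remaining == 0
```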
🧹 Nitpick comments (5)
claude_code_log/cache.py (5)
121-164: Consider consolidating duplicate `toml` imports. The `import toml` statement appears twice within the function (lines 135 and 152). While this works due to Python's import caching, consolidating the import at the top of the function or using a single try-except block would be cleaner.
♻️ Suggested refactor
```diff
 def get_library_version() -> str:
     """Get the current library version from package metadata or pyproject.toml."""
     # First try to get version from installed package metadata
     try:
         from importlib.metadata import version as get_version

         return get_version("claude-code-log")
     except Exception:
         # Package not installed or other error, continue to file-based detection
         pass

+    # Import toml once for file-based detection methods
+    try:
+        import toml
+    except ImportError:
+        return "unknown"
+
     # Second approach: Use importlib.resources for more robust package location detection
     try:
         from importlib import resources
-        import toml

         # Get the package directory and navigate to parent for pyproject.toml
         package_files = resources.files("claude_code_log")
```
254-256: Avoid printing to stdout in library code. Using `print()` for version mismatch notification may interfere with non-CLI consumers of this library. Consider using the `logging` module or allowing callers to handle this via a callback/return value.
♻️ Suggested refactor
```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 # In _ensure_project_exists:
 # Check version compatibility
 if not self._is_cache_version_compatible(cached_version):
-    print(
+    logger.info(
         f"Cache version incompatible: {cached_version} -> {self.library_version}, invalidating cache"
     )
```
442-452: Simplify redundant conditional branches. Both branches of the `if`-`else` at lines 445-452 set the same time (23:59:59.999999), making the condition unnecessary.
♻️ Suggested simplification
```diff
 if to_date:
     to_dt = dateparser.parse(to_date)
     if to_dt:
-        if to_date in ["today", "yesterday"] or "days ago" in to_date:
-            to_dt = to_dt.replace(
-                hour=23, minute=59, second=59, microsecond=999999
-            )
-        else:
-            to_dt = to_dt.replace(
-                hour=23, minute=59, second=59, microsecond=999999
-            )
+        # Always set to end of day for "to" date bound
+        to_dt = to_dt.replace(
+            hour=23, minute=59, second=59, microsecond=999999
+        )
```
527-547: Consider using `executemany()` for batch inserts. Inserting entries one at a time in a loop is functional but slower for large files. Using `executemany()` with a list of serialized entries would improve performance.
♻️ Batch insert optimization
```diff
 # Delete existing messages for this file
 conn.execute("DELETE FROM messages WHERE file_id = ?", (file_id,))

-# Insert all entries in a batch
-for entry in entries:
-    serialized = self._serialize_entry(entry, file_id)
-    conn.execute(
-        """
-        INSERT INTO messages (
-            project_id, file_id, type, timestamp, session_id,
-            ...
-        ) VALUES (
-            :project_id, :file_id, :type, :timestamp, :session_id,
-            ...
-        )
-        """,
-        serialized,
-    )
+# Insert all entries in a batch using executemany
+serialized_entries = [self._serialize_entry(entry, file_id) for entry in entries]
+conn.executemany(
+    """
+    INSERT INTO messages (
+        project_id, file_id, type, timestamp, session_id,
+        _uuid, _parent_uuid, _is_sidechain, _user_type, _cwd, _version,
+        _is_meta, _agent_id, _request_id,
+        input_tokens, output_tokens, cache_creation_tokens, cache_read_tokens,
+        _leaf_uuid, _level, _operation, content
+    ) VALUES (
+        :project_id, :file_id, :type, :timestamp, :session_id,
+        :_uuid, :_parent_uuid, :_is_sidechain, :_user_type, :_cwd, :_version,
+        :_is_meta, :_agent_id, :_request_id,
+        :input_tokens, :output_tokens, :cache_creation_tokens, :cache_read_tokens,
+        :_leaf_uuid, :_level, :_operation, :content
+    )
+    """,
+    serialized_entries,
+)
```
1509-1510: Silent exception swallowing may hide issues. The bare `except Exception: pass` silently ignores all errors including database corruption, permission issues, or unexpected failures. Consider logging the exception for debugging purposes.
♻️ Add logging for exceptions
```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 except Exception:
-    pass
+    logger.debug("Failed to load cached projects", exc_info=True)
 return result
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
claude_code_log/cache.py, test/test_pagination.py, claude_code_log/tui.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use the `-n auto` flag for parallel test execution with pytest in Python projects (e.g., `uv run pytest -n auto -m "not (tui or browser)" -v`)
Use `ruff format` for code formatting in Python projects
Use `ruff check --fix` for linting and automatic fixes in Python projects
Use `uv run pyright` or `uv run ty check` for type checking in Python projects
Files:
claude_code_log/cache.py, test/test_pagination.py, claude_code_log/tui.py
🧠 Learnings (2)
📚 Learning: 2025-12-09T23:52:47.578Z
Learnt from: daaain
Repo: daaain/claude-code-log PR: 59
File: test/test_cache.py:135-165
Timestamp: 2025-12-09T23:52:47.578Z
Learning: SQLite supports NULLS FIRST and NULLS LAST in ORDER BY since v3.30.0 (Oct 2019). Do not flag SQL that uses these clauses as an error when reviewing Python tests or code that interacts with SQLite. If reviewing SQL strings, verify the target SQLite version supports NULLS FIRST/LAST and ensure the syntax is used correctly for the database in use.
Applied to files:
test/test_pagination.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Implement interactive TUI (Terminal User Interface) for browsing Claude Code sessions with real-time navigation, summaries, and quick actions
Applied to files:
claude_code_log/tui.py
🧬 Code graph analysis (1)
claude_code_log/cache.py (4)
claude_code_log/factories/transcript_factory.py (1)
`create_transcript_entry` (219-238)
claude_code_log/migrations/runner.py (1)
`run_migrations` (135-156)
claude_code_log/models.py (5)
`AssistantTranscriptEntry` (175-178), `QueueOperationTranscriptEntry` (203-220), `SummaryTranscriptEntry` (181-186), `SystemTranscriptEntry` (189-200), `UserTranscriptEntry` (168-172)
claude_code_log/renderer.py (2)
`get` (121-132), `is_html_outdated` (2308-2320)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.10)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.11)
🔇 Additional comments (18)
claude_code_log/cache.py (1)
1319-1332: LGTM - SQL parameterization is correctly applied. The `IN` clause placeholders are safely generated from the list length, and actual session IDs are passed as bound parameters, preventing SQL injection.
test/test_pagination.py (6)
1-19: LGTM - Clean imports and test setup. The imports are well-organized, pulling in necessary fixtures and the specific functions under test (`_get_page_html_path`, `_assign_sessions_to_pages`).
37-139: Good test coverage for pagination logic edge cases. The tests comprehensively cover:
- Single/multiple sessions below threshold
- Overflow behavior when sessions exceed page size
- Chronological ordering
- Empty input handling
- Complex multi-page scenarios
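The packing rule these tests describe (fill pages chronologically, never split a session, let an oversized session overflow onto its own page) can be sketched as follows. This is a hypothetical re-implementation, not the project's `_assign_sessions_to_pages`.

```python
def assign_sessions_to_pages(
    sessions: list[tuple[str, int]], page_size: int
) -> list[list[str]]:
    """Assign (session_id, message_count) pairs to pages, never splitting a session."""
    pages: list[list[str]] = []
    current: list[str] = []
    count = 0
    for session_id, n_messages in sessions:
        if current and count + n_messages > page_size:
            pages.append(current)  # close the page rather than split a session
            current, count = [], 0
        current.append(session_id)
        count += n_messages
    if current:
        pages.append(current)
    return pages

# page_size 100: "a" and "b" pack together (80 messages); "c" (150) exceeds
# the page size on its own, so it overflows onto a dedicated page.
pages = assign_sessions_to_pages([("a", 40), ("b", 40), ("c", 150)], page_size=100)
assert pages == [["a", "b"], ["c"]]
```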
155-159: Fixture patches version but passes same value to CacheManager. The `cache_manager` fixture patches `get_library_version` to return `mock_version`, then passes the same `mock_version` to `CacheManager`. This is correct, but the patch may be unnecessary since the version is explicitly passed. However, keeping the patch is safer in case internal code paths call `get_library_version()`.
276-301: Test for `session_missing` correctly isolates the check. The test creates a page cache entry but doesn't add the session to the sessions table, correctly testing that `is_page_stale` detects missing sessions. The mock of `is_html_outdated` properly isolates the staleness check.
549-594: Good integration test coverage for pagination scenarios. Tests correctly:
- Create realistic JSONL data with helper function
- Verify single vs. multiple page generation based on `page_size`
- Check for presence of expected HTML files
705-770: Thorough edge case testing for next link updates. Tests properly cover:
- Removing the `last-page` class when enabling the next link
- No modification when already visible
- Graceful handling of missing files and invalid page numbers
claude_code_log/tui.py (11)
65-78: LGTM - Clean initialization with optional archived_projects parameter. The default `None` with fallback to an empty set handles the optional parameter correctly, maintaining backward compatibility.
287-293: Good error handling with individual file tracking. The try-except per file with tracking of succeeded/failed files provides good UX - partial failures are reported clearly without stopping the entire operation.
340-342: Cache deletion order is correct. Cache is cleared first regardless of option (`cache_only` or `both`), which is the desired behavior. File deletion follows if requested.
412-424: Good idempotency check before restoration. The code correctly filters out sessions that already exist as JSONL files, preventing accidental overwrites and providing clear feedback when nothing needs restoration.
1117-1122: Good exclusion of agent files from session loading. Agent files (`agent-*.jsonl`) are correctly excluded since they're loaded via session references, not as standalone sessions.
1694-1711: Well-designed action gating based on session state. The `check_action` method correctly:
- Allows archive/resume only for current sessions (with JSONL)
- Allows restore only for archived sessions (cache-only)
- Returns `True` for unhandled actions as a safe default
1735-1741: LGTM - Safe file writing with directory creation. The code ensures the project directory exists before writing, handles the case where the project directory was deleted, and writes messages correctly with newline separators.
1836-1867: Delete confirmation callback handles all cases correctly. The callback:
- Only attempts JSONL deletion for the `"both"` option
- Uses `CacheManager.delete_session()` for cache cleanup
- Clears selection and refreshes UI
1599-1611: Correct dual-path loading for archived vs. current sessions. The code properly distinguishes:
- Archived sessions: loads messages from cache via `load_session_entries()`
- Current sessions: loads messages from JSONL files via `load_directory_transcripts()`
This ensures archived sessions can still be rendered to HTML/Markdown.
1898-1935: Good handling of archived projects in entry point. The `run_session_browser` function correctly:
- Allows archived projects to load from cache even if the directory doesn't exist
- Attempts cache lookup before showing "does not exist" error
- Checks for cached sessions when no JSONL files are found for archived projects
286-316: Non-atomic file deletion is acceptable for TUI context. The file deletion loop isn't atomic (if interrupted, some files may be deleted but the project not marked archived). This is acceptable because:
- Atomic deletion of multiple files isn't possible without a transaction manager
- The partial state is recoverable - user can archive again
- The code correctly only marks as archived if ALL files deleted
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/test_cache_integration.py (1)
476-492: Stale test: references old JSON-based cache structure. This test looks for `project_dir / "cache" / "*.json"` files, but the new implementation uses SQLite. The test will silently pass without corrupting anything since `cache_files` will be empty, making the `if cache_files:` block a no-op.
🔧 Update test for SQLite corruption recovery
```diff
 def test_cache_corruption_recovery(self, setup_test_project):
     """Test recovery from corrupted cache files."""
-    project_dir = setup_test_project
+    project_dir = setup_test_project.project_dir
+    db_path = setup_test_project.db_path

     # Create initial cache
     convert_jsonl_to_html(input_path=project_dir, use_cache=True)

-    # Corrupt cache file
-    cache_dir = project_dir / "cache"
-    cache_files = list(cache_dir.glob("*.json"))
-    if cache_files:
-        cache_file = [f for f in cache_files if f.name != "index.json"][0]
-        cache_file.write_text("corrupted json data", encoding="utf-8")
+    # Corrupt SQLite database
+    assert db_path.exists()
+    with open(db_path, "r+b") as f:
+        f.seek(100)  # Skip SQLite header
+        f.write(b"corrupted data here")

-    # Should recover gracefully
-    output = convert_jsonl_to_html(input_path=project_dir, use_cache=True)
-    assert output.exists()
+    # Should recover gracefully (may recreate database)
+    try:
+        output = convert_jsonl_to_html(input_path=project_dir, use_cache=True)
+        assert output.exists()
+    except Exception:
+        # Some form of error handling is acceptable
+        pass
```
🧹 Nitpick comments (3)
claude_code_log/cache.py (3)
444-452: Redundant conditional branches - both execute identical code. The `if` and `else` branches both set `to_dt` to end of day (23:59:59.999999). The condition check serves no purpose since both paths do the same thing.
♻️ Simplify by removing the dead branch
```diff
 if to_dt:
-    if to_date in ["today", "yesterday"] or "days ago" in to_date:
-        to_dt = to_dt.replace(
-            hour=23, minute=59, second=59, microsecond=999999
-        )
-    else:
-        to_dt = to_dt.replace(
-            hour=23, minute=59, second=59, microsecond=999999
-        )
+    to_dt = to_dt.replace(
+        hour=23, minute=59, second=59, microsecond=999999
+    )
```
528-547: Consider `executemany` for batch inserts. Individual `execute()` calls in a loop work correctly but are less efficient for large entry counts. Using `executemany()` with a list of pre-serialized dictionaries would reduce Python↔SQLite round-trips.
♻️ Batch insert optimization
```diff
-for entry in entries:
-    serialized = self._serialize_entry(entry, file_id)
-    conn.execute(
-        """
-        INSERT INTO messages (
-            project_id, file_id, type, timestamp, session_id,
-            _uuid, _parent_uuid, _is_sidechain, _user_type, _cwd, _version,
-            _is_meta, _agent_id, _request_id,
-            input_tokens, output_tokens, cache_creation_tokens, cache_read_tokens,
-            _leaf_uuid, _level, _operation, content
-        ) VALUES (
-            :project_id, :file_id, :type, :timestamp, :session_id,
-            :_uuid, :_parent_uuid, :_is_sidechain, :_user_type, :_cwd, :_version,
-            :_is_meta, :_agent_id, :_request_id,
-            :input_tokens, :output_tokens, :cache_creation_tokens, :cache_read_tokens,
-            :_leaf_uuid, :_level, :_operation, :content
-        )
-        """,
-        serialized,
-    )
+serialized_entries = [self._serialize_entry(entry, file_id) for entry in entries]
+conn.executemany(
+    """
+    INSERT INTO messages (
+        project_id, file_id, type, timestamp, session_id,
+        _uuid, _parent_uuid, _is_sidechain, _user_type, _cwd, _version,
+        _is_meta, _agent_id, _request_id,
+        input_tokens, output_tokens, cache_creation_tokens, cache_read_tokens,
+        _leaf_uuid, _level, _operation, content
+    ) VALUES (
+        :project_id, :file_id, :type, :timestamp, :session_id,
+        :_uuid, :_parent_uuid, :_is_sidechain, :_user_type, :_cwd, :_version,
+        :_is_meta, :_agent_id, :_request_id,
+        :input_tokens, :output_tokens, :cache_creation_tokens, :cache_read_tokens,
+        :_leaf_uuid, :_level, :_operation, :content
+    )
+    """,
+    serialized_entries,
+)
```
1516-1517: Silent `except Exception: pass` may hide important errors. Database corruption, permission errors, or schema issues will be silently ignored, making debugging difficult. Consider logging the exception or at least narrowing the exception type.
♻️ Add minimal error logging
```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 try:
     conn = sqlite3.connect(actual_db_path, timeout=30.0)
     # ... existing code ...
 except Exception:
-    pass
+    logger.debug("Failed to read cached projects from %s", actual_db_path, exc_info=True)
```

Or if logging isn't available, at least narrow the exception types:

```diff
-except Exception:
-    pass
+except (sqlite3.Error, OSError):
+    # Database unavailable or corrupted - return empty list
+    pass
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
claude_code_log/cache.py, test/test_cache_integration.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use the `-n auto` flag for parallel test execution with pytest in Python projects (e.g., `uv run pytest -n auto -m "not (tui or browser)" -v`)
Use `ruff format` for code formatting in Python projects
Use `ruff check --fix` for linting and automatic fixes in Python projects
Use `uv run pyright` or `uv run ty check` for type checking in Python projects
Files:
claude_code_log/cache.py, test/test_cache_integration.py
🧠 Learnings (1)
📚 Learning: 2025-12-09T23:52:47.578Z
Learnt from: daaain
Repo: daaain/claude-code-log PR: 59
File: test/test_cache.py:135-165
Timestamp: 2025-12-09T23:52:47.578Z
Learning: SQLite supports NULLS FIRST and NULLS LAST in ORDER BY since v3.30.0 (Oct 2019). Do not flag SQL that uses these clauses as an error when reviewing Python tests or code that interacts with SQLite. If reviewing SQL strings, verify the target SQLite version supports NULLS FIRST/LAST and ensure the syntax is used correctly for the database in use.
Applied to files:
test/test_cache_integration.py
🧬 Code graph analysis (1)
claude_code_log/cache.py (4)
claude_code_log/factories/transcript_factory.py (1)
`create_transcript_entry` (219-238)
claude_code_log/migrations/runner.py (1)
`run_migrations` (135-156)
claude_code_log/models.py (5)
`AssistantTranscriptEntry` (175-178), `QueueOperationTranscriptEntry` (203-220), `SummaryTranscriptEntry` (181-186), `SystemTranscriptEntry` (189-200), `UserTranscriptEntry` (168-172)
claude_code_log/renderer.py (2)
`get` (121-132), `is_html_outdated` (2308-2320)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (ubuntu-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.11)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (windows-latest, 3.11)
🔇 Additional comments (4)
claude_code_log/cache.py (1)
1-87: Well-structured SQLite cache implementation. The migration to SQLite with zlib compression, WAL mode, and proper foreign key constraints is a solid architectural choice. The data models are clean, and the overall design supports the archival, pagination, and version compatibility features well.
test/test_cache_integration.py (3)
526-696: Excellent test coverage for archived sessions workflow. The tests comprehensively cover the archive-restore lifecycle: caching → file deletion → archive detection → export → restore → verification. The test at line 648 (`test_full_archive_and_restore_workflow`) is particularly valuable as an end-to-end integration test.
1001-1059: Good coverage for `get_all_cached_projects`. Tests properly verify the function handles:
- Mixed active/archived projects with correct `is_archived` flags
- Empty directories (no cache database)
- Nonexistent directories
96-139: Clever Path-like wrapper for backward compatibility. The `ProjectWithCache` class implementing `__fspath__`, `__truediv__`, and other Path-like methods allows existing tests to continue working while providing access to `db_path` when needed. This is a pragmatic approach to incremental migration.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In @claude_code_log/cache.py:
- Around line 241-266: Replace the print() in _ensure_project_exists with the
library logger and include a UTC timestamp: obtain or use the module/class
logger (e.g., self.logger or logging.getLogger(__name__)), call logger.warning
or logger.info instead of print, and prepend or format the message with
datetime.datetime.now(datetime.timezone.utc).isoformat() to produce an explicit
UTC timestamp; make the same replacement for the other print occurrences
referenced around lines 269-296 (the blocks that call
_is_cache_version_compatible, _clear_project_data, and _create_project) so all
cache-invalidation messages use the logger + UTC timestamp consistently.
- Around line 1087-1102: The get_page_size_config() implementation returns the
oldest page because the query orders by page_number ASC; update the SQL in
get_page_size_config() (and keep LIMIT 1) to order by page_number DESC so it
selects the most recent page's page_size_config, and adjust the docstring only
if you choose a different “most recent” definition (e.g., use updated_at) to
ensure the comment matches the query.
- Around line 417-477: The bug in load_cached_entries_filtered() is that
tz-aware parsed datetimes are being formatted with a "Z" without converting them
to UTC, shifting filter bounds; change the handling of from_dt and to_dt so that
if dt.tzinfo is not None you call dt = dt.astimezone(timezone.utc) (and if
dt.tzinfo is None keep the existing replace(tzinfo=timezone.utc) behavior), then
format to_bound/from_bound from that UTC-converted dt; update the branches that
set from_bound and to_bound (and ensure timezone is imported where needed) so
comparisons use true UTC "Z" timestamps.
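The conversion the instruction asks for can be sketched as follows, with a hypothetical helper name; an aware datetime is converted to UTC with `astimezone`, while a naive one keeps the existing assume-UTC behavior.

```python
from datetime import datetime, timezone

def to_utc_bound(dt: datetime) -> str:
    # Convert aware datetimes to UTC; assume naive ones are already UTC.
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)
    else:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

# A +02:00 local time names an instant two hours earlier in UTC.
local = datetime.fromisoformat("2025-01-15T12:00:00+02:00")
assert to_utc_bound(local) == "2025-01-15T10:00:00Z"

# A naive datetime passes through with only the Z suffix attached.
naive = datetime(2025, 1, 15, 12, 0, 0)
assert to_utc_bound(naive) == "2025-01-15T12:00:00Z"
```

Without the `astimezone` step, formatting the aware value directly would emit "12:00:00Z", shifting the filter bound by the UTC offset.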
🧹 Nitpick comments (5)
test/test_cache_integration.py (4)
25-50: Make JSONL I/O encoding consistent (avoid locale-dependent failures).
A lot of the new test scaffolding correctly isolates the DB path, but many JSONL writes still omit `encoding="utf-8"` (some places include it, others don't). This can break on non-UTF-8 default locales and is hard to debug.
Proposed fix (apply pattern throughout file)
```diff
-with open(jsonl_file, "w") as f:
+with open(jsonl_file, "w", encoding="utf-8") as f:
     for entry in sample_jsonl_data:
         f.write(json.dumps(entry) + "\n")
```

Also applies to: 141-156, 214-251, 270-310, 311-353, 355-425, 443-507, 527-1060
96-140: Avoid passing `ProjectWithCache` where a `Path` is expected (or fully delegate). `ProjectWithCache` is a partial `Path` façade; this is brittle if production code uses additional `Path` APIs (e.g., `.resolve()`, `.suffix`, `.with_name()`, etc.). Prefer consistently passing `setup_test_project.project_dir` into `CacheManager` / converter APIs, and keep the wrapper only for tests that truly need it.
Safer direction (example changes)
```diff
-project_dir = setup_test_project
+project_dir = setup_test_project.project_dir
 ...
-cache_manager = CacheManager(project_dir, "1.0.0")
+cache_manager = CacheManager(project_dir, "1.0.0")
```

Also applies to: 188-213, 252-269, 297-310, 426-442
355-425: “Performance” assertion is a no-op; consider asserting cache behavior instead.
`assert second_run_time >= 0` will always pass and doesn't validate caching. If you want a stable integration assertion, consider checking something deterministic (e.g., that the cache contains files/sessions, or that "stale sessions" is empty on the second run).
500-507: Catching only `FileNotFoundError` can hide real regressions.
If empty projects are an accepted input, assert the expected output behavior; otherwise, assert the exception. The current `try`/`except` makes it too easy for unrelated failures to slip by.

claude_code_log/cache.py (1)
124-168: Use `tomllib` with a fallback to `toml` for Python 3.10 compatibility, and log failures.
Since the project supports Python 3.10+ (not 3.11+), `tomllib` cannot be the sole choice, but attempting it first with a fallback to the already-required `toml` dependency is sound. More importantly, all three exception handlers currently swallow errors silently, masking version detection failures. Add logging using the available `logger` to make failures visible for debugging.

Suggested fix
def get_library_version() -> str:
    """Get the current library version from package metadata or pyproject.toml."""
    # First try to get version from installed package metadata
    try:
        from importlib.metadata import version as get_version

        return get_version("claude-code-log")
    except Exception:
-        # Package not installed or other error, continue to file-based detection
-        pass
+        logger.debug("Failed to detect version from package metadata.", exc_info=True)

    # Second approach: Use importlib.resources for more robust package location detection
    try:
        from importlib import resources
-        import toml
+        try:
+            import tomllib as toml  # py>=3.11
+        except ImportError:  # pragma: no cover
+            import toml  # type: ignore[no-redef]

        # Get the package directory and navigate to parent for pyproject.toml
        package_files = resources.files("claude_code_log")
        # Convert to Path to access parent reliably
        package_root = Path(str(package_files)).parent
        pyproject_path = package_root / "pyproject.toml"

        if pyproject_path.exists():
-            with open(pyproject_path, "r", encoding="utf-8") as f:
+            with open(pyproject_path, "rb") as f:
                pyproject_data = toml.load(f)
            return pyproject_data.get("project", {}).get("version", "unknown")
    except Exception:
-        pass
+        logger.debug("Failed to detect version from package files.", exc_info=True)

    # Final fallback: Try to read from pyproject.toml using file-relative path
    try:
-        import toml
+        try:
+            import tomllib as toml  # py>=3.11
+        except ImportError:  # pragma: no cover
+            import toml  # type: ignore[no-redef]

        project_root = Path(__file__).parent.parent
        pyproject_path = project_root / "pyproject.toml"

        if pyproject_path.exists():
-            with open(pyproject_path, "r", encoding="utf-8") as f:
+            with open(pyproject_path, "rb") as f:
                pyproject_data = toml.load(f)
            return pyproject_data.get("project", {}).get("version", "unknown")
    except Exception:
-        pass
+        logger.debug("Failed to detect version from file-relative path.", exc_info=True)

    return "unknown"
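The tomllib-first pattern the suggestion relies on can be illustrated standalone. This is a minimal sketch under assumptions: the helper name and the `[project].version` lookup mirror the review discussion but are not the project's actual code.

```python
import tempfile
from pathlib import Path


def load_pyproject_version(pyproject_path: Path) -> str:
    """Read [project].version, preferring stdlib tomllib (Python 3.11+)."""
    try:
        import tomllib  # stdlib on Python 3.11+

        with open(pyproject_path, "rb") as f:  # tomllib requires binary mode
            data = tomllib.load(f)
    except ImportError:  # Python 3.10: fall back to the third-party toml package
        import toml

        with open(pyproject_path, "r", encoding="utf-8") as f:
            data = toml.load(f)
    return data.get("project", {}).get("version", "unknown")


# Demo against a throwaway pyproject.toml
demo = Path(tempfile.mkdtemp()) / "pyproject.toml"
demo.write_text('[project]\nname = "demo"\nversion = "1.2.3"\n', encoding="utf-8")
version = load_pyproject_version(demo)
```

Note the mode difference: `tomllib.load` requires a binary file object, while the legacy `toml` package reads text.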
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
claude_code_log/cache.py
test/test_cache_integration.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use the `-n auto` flag for parallel test execution with pytest in Python projects (e.g., `uv run pytest -n auto -m "not (tui or browser)" -v`)
Use `ruff format` for code formatting in Python projects
Use `ruff check --fix` for linting and automatic fixes in Python projects
Use `uv run pyright` or `uv run ty check` for type checking in Python projects
Files:
test/test_cache_integration.py
claude_code_log/cache.py
🧠 Learnings (1)
📚 Learning: 2025-12-09T23:52:47.578Z
Learnt from: daaain
Repo: daaain/claude-code-log PR: 59
File: test/test_cache.py:135-165
Timestamp: 2025-12-09T23:52:47.578Z
Learning: SQLite supports NULLS FIRST and NULLS LAST in ORDER BY since v3.30.0 (Oct 2019). Do not flag SQL that uses these clauses as an error when reviewing Python tests or code that interacts with SQLite. If reviewing SQL strings, verify the target SQLite version supports NULLS FIRST/LAST and ensure the syntax is used correctly for the database in use.
Applied to files:
test/test_cache_integration.py
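The recorded learning about `NULLS FIRST`/`NULLS LAST` is easy to verify directly against the stdlib `sqlite3` module (assuming SQLite 3.30+, which ships with all recent CPython builds):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (timestamp TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?)",
    [("2024-01-02T00:00:00Z",), (None,), ("2024-01-01T00:00:00Z",)],
)
# Summaries carry no timestamp; NULLS LAST keeps them after all dated entries.
rows = [
    r[0]
    for r in conn.execute(
        "SELECT timestamp FROM messages ORDER BY timestamp NULLS LAST"
    )
]
```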
🧬 Code graph analysis (2)
test/test_cache_integration.py (1)
claude_code_log/cache.py (6)
get_cached_project_data (661-733)
export_session_to_jsonl (1037-1058)
load_session_entries (1060-1083)
delete_session (1385-1442)
is_file_cached (377-398)
get_all_cached_projects (1462-1516)
claude_code_log/cache.py (4)
claude_code_log/factories/transcript_factory.py (1)
create_transcript_entry (219-238)
claude_code_log/migrations/runner.py (1)
run_migrations (135-156)
claude_code_log/models.py (4)
AssistantTranscriptEntry (175-178)
SummaryTranscriptEntry (181-186)
SystemTranscriptEntry (189-200)
UserTranscriptEntry (168-172)
claude_code_log/renderer.py (2)
get (121-132)
is_html_outdated (2308-2320)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (ubuntu-latest, 3.11)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.14)
def _ensure_project_exists(self) -> None:
    """Ensure project record exists and get its ID."""
    project_path_str = str(self.project_path)

    with self._get_connection() as conn:
        row = conn.execute(
            "SELECT id, version FROM projects WHERE project_path = ?",
            (project_path_str,),
        ).fetchone()

        if row:
            self._project_id = row["id"]
            cached_version = row["version"]

            # Check version compatibility
            if not self._is_cache_version_compatible(cached_version):
                print(
                    f"Cache version incompatible: {self._project_cache.version} -> {self.library_version}, invalidating cache"
                    f"Cache version incompatible: {cached_version} -> {self.library_version}, invalidating cache"
                )
                self.clear_cache()
                self._project_cache = None
        except Exception as e:
            print(f"Warning: Failed to load cache index, will rebuild: {e}")
            self._project_cache = None

        # Initialize empty cache if none exists
        if self._project_cache is None:
            self._project_cache = ProjectCache(
                version=self.library_version,
                cache_created=datetime.now().isoformat(),
                last_updated=datetime.now().isoformat(),
                project_path=str(self.project_path),
                cached_files={},
                sessions={},
            )
                self._clear_project_data(conn)
                self._project_id = self._create_project(conn)
        else:
            self._project_id = self._create_project(conn)

        conn.commit()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid print() in library code; use logger and write UTC timestamps.
print() in _ensure_project_exists() will pollute CLI/TUI output and is hard to test. Also, timestamps are currently local-time datetime.now().isoformat(); using UTC avoids cross-machine weirdness.
Proposed fix
- now = datetime.now().isoformat()
+ now = datetime.now(timezone.utc).isoformat()
@@
- print(
- f"Cache version incompatible: {cached_version} -> {self.library_version}, invalidating cache"
- )
+ logger.warning(
+ "Cache version incompatible: %s -> %s; invalidating cache",
+ cached_version,
+ self.library_version,
+ )

Also applies to: 269-296
🤖 Prompt for AI Agents
In @claude_code_log/cache.py around lines 241 - 266, Replace the print() in
_ensure_project_exists with the library logger and include a UTC timestamp:
obtain or use the module/class logger (e.g., self.logger or
logging.getLogger(__name__)), call logger.warning or logger.info instead of
print, and prepend or format the message with
datetime.datetime.now(datetime.timezone.utc).isoformat() to produce an explicit
UTC timestamp; make the same replacement for the other print occurrences
referenced around lines 269-296 (the blocks that call
_is_cache_version_compatible, _clear_project_data, and _create_project) so all
cache-invalidation messages use the logger + UTC timestamp consistently.
def _deserialize_entry(self, row: sqlite3.Row) -> TranscriptEntry:
    """Convert SQLite row back to TranscriptEntry."""
    content_dict = json.loads(zlib.decompress(row["content"]).decode("utf-8"))
    return create_transcript_entry(content_dict)
Handle corrupted content rows (zlib/json) so partial corruption doesn’t crash reads.
SQLite can stay readable even if individual BLOBs are corrupted; _deserialize_entry() will currently raise and fail the whole operation. Consider treating failures as a cache-miss signal (and possibly invalidating the project).
Proposed fix (local hardening)
def _deserialize_entry(self, row: sqlite3.Row) -> TranscriptEntry:
"""Convert SQLite row back to TranscriptEntry."""
- content_dict = json.loads(zlib.decompress(row["content"]).decode("utf-8"))
- return create_transcript_entry(content_dict)
+ try:
+ content_dict = json.loads(zlib.decompress(row["content"]).decode("utf-8"))
+ return create_transcript_entry(content_dict)
+ except (zlib.error, UnicodeDecodeError, json.JSONDecodeError, ValueError) as e:
+ raise sqlite3.DatabaseError(f"Corrupted cached message content: {e}") from e

Also applies to: 400-477, 1037-1084
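The failure modes in question can be exercised in isolation; a minimal sketch of the hardened round-trip (the function name is an illustration, not the project's method):

```python
import json
import zlib


def deserialize_content(blob: bytes) -> dict:
    """Decompress and parse a cached message, surfacing corruption explicitly."""
    try:
        return json.loads(zlib.decompress(blob).decode("utf-8"))
    except (zlib.error, UnicodeDecodeError, json.JSONDecodeError, ValueError) as e:
        raise ValueError(f"Corrupted cached message content: {e}") from e


# Healthy round-trip
good_blob = zlib.compress(json.dumps({"type": "user"}).encode("utf-8"))
roundtrip = deserialize_content(good_blob)

# A corrupted BLOB raises one well-defined exception instead of leaking zlib.error
corruption_detected = False
try:
    deserialize_content(b"definitely not zlib data")
except ValueError:
    corruption_detected = True
```

Funnelling `zlib.error`/`JSONDecodeError` into a single exception type lets callers treat a bad row as a cache miss without a bare `except Exception`.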
def load_cached_entries_filtered(
    self, jsonl_path: Path, from_date: Optional[str], to_date: Optional[str]
) -> Optional[list[TranscriptEntry]]:
    """Load cached entries with efficient timestamp-based filtering."""
) -> Optional[List[TranscriptEntry]]:
    """Load cached entries with SQL-based timestamp filtering."""
    if not self.is_file_cached(jsonl_path):
        return None

    cache_file = self._get_cache_file_path(jsonl_path)
    try:
        with open(cache_file, "r", encoding="utf-8") as f:
            cache_data = json.load(f)
    # If no date filtering needed, fall back to regular loading
    if not from_date and not to_date:
        return self.load_cached_entries(jsonl_path)

    # If no date filtering needed, fall back to regular loading
    if not from_date and not to_date:
        return self.load_cached_entries(jsonl_path)
    file_id = self._get_file_id(jsonl_path)
    if file_id is None:
        return None

    # Parse date filters
    from .parser import parse_timestamp
    import dateparser
    # Parse dates
    import dateparser

    from_dt = None
    to_dt = None

    if from_date:
        from_dt = dateparser.parse(from_date)
        if from_dt and (
            from_date in ["today", "yesterday"] or "days ago" in from_date
        ):
            from_dt = from_dt.replace(hour=0, minute=0, second=0, microsecond=0)

    if to_date:
        to_dt = dateparser.parse(to_date)
        if to_dt:
            to_dt = to_dt.replace(hour=23, minute=59, second=59, microsecond=999999)

    # Build query with SQL-based filtering
    sql = "SELECT content FROM messages WHERE file_id = ?"
    params: List[Any] = [file_id]

    if from_dt:
        # Normalize to UTC 'Z' format for consistent string comparison
        # with stored timestamps (which use 'Z' suffix from JSONL)
        if from_dt.tzinfo is None:
            from_dt = from_dt.replace(tzinfo=timezone.utc)
        from_bound = from_dt.strftime("%Y-%m-%dT%H:%M:%SZ")
        # Include entries with NULL timestamp (like summaries) OR within date range
        sql += " AND (timestamp IS NULL OR timestamp >= ?)"
        params.append(from_bound)

    if to_dt:
        # Normalize to UTC 'Z' format for consistent string comparison
        if to_dt.tzinfo is None:
            to_dt = to_dt.replace(tzinfo=timezone.utc)
        to_bound = to_dt.strftime("%Y-%m-%dT%H:%M:%SZ")
        sql += " AND (timestamp IS NULL OR timestamp <= ?)"
        params.append(to_bound)

    sql += " ORDER BY timestamp NULLS LAST"

    with self._get_connection() as conn:
        rows = conn.execute(sql, params).fetchall()

    return [self._deserialize_entry(row) for row in rows]
Timezone bounds are wrong for tz-aware inputs in load_cached_entries_filtered().
If dateparser.parse() returns a tz-aware datetime (e.g., user passes “2023-01-01 10:00 PST”), the code formats it as ...Z without converting to UTC, which shifts the effective filter window.
Proposed fix
if from_dt:
# Normalize to UTC 'Z' format for consistent string comparison
# with stored timestamps (which use 'Z' suffix from JSONL)
- if from_dt.tzinfo is None:
- from_dt = from_dt.replace(tzinfo=timezone.utc)
+ if from_dt.tzinfo is None:
+ from_dt = from_dt.replace(tzinfo=timezone.utc)
+ else:
+ from_dt = from_dt.astimezone(timezone.utc)
from_bound = from_dt.strftime("%Y-%m-%dT%H:%M:%SZ")
@@
if to_dt:
# Normalize to UTC 'Z' format for consistent string comparison
- if to_dt.tzinfo is None:
- to_dt = to_dt.replace(tzinfo=timezone.utc)
+ if to_dt.tzinfo is None:
+ to_dt = to_dt.replace(tzinfo=timezone.utc)
+ else:
+ to_dt = to_dt.astimezone(timezone.utc)
to_bound = to_dt.strftime("%Y-%m-%dT%H:%M:%SZ")

What `dateparser` version is used, and does the project expect `--from-date/--to-date` values with explicit timezones (e.g., “PST”)?
🤖 Prompt for AI Agents
In @claude_code_log/cache.py around lines 417 - 477, The bug in
load_cached_entries_filtered() is that tz-aware parsed datetimes are being
formatted with a "Z" without converting them to UTC, shifting filter bounds;
change the handling of from_dt and to_dt so that if dt.tzinfo is not None you
call dt = dt.astimezone(timezone.utc) (and if dt.tzinfo is None keep the
existing replace(tzinfo=timezone.utc) behavior), then format to_bound/from_bound
from that UTC-converted dt; update the branches that set from_bound and to_bound
(and ensure timezone is imported where needed) so comparisons use true UTC "Z"
timestamps.
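The difference between `replace()` and `astimezone()` is the crux of this fix; a self-contained sketch of the corrected normalization (the helper name is an illustration):

```python
from datetime import datetime, timedelta, timezone


def to_utc_bound(dt: datetime) -> str:
    """Normalize a parsed datetime to a UTC 'Z' string for lexicographic comparison."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # treat naive input as already-UTC
    else:
        dt = dt.astimezone(timezone.utc)  # convert aware input, shifting the clock
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


pst = timezone(timedelta(hours=-8))
aware = to_utc_bound(datetime(2023, 1, 1, 10, 0, tzinfo=pst))  # 10:00 PST is 18:00 UTC
naive = to_utc_bound(datetime(2023, 1, 1, 10, 0))
```

`replace(tzinfo=...)` only relabels the timezone without moving the clock, which is exactly the bug: a 10:00 PST bound would be compared as if it were 10:00 UTC.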
def get_page_size_config(self) -> Optional[int]:
    """Get the configured page size from the most recent page, if any."""
    if self._project_id is None:
        return None

    with self._get_connection() as conn:
        row = conn.execute(
            """SELECT page_size_config FROM html_pages
            WHERE project_id = ?
            ORDER BY page_number ASC
            LIMIT 1""",
            (self._project_id,),
        ).fetchone()

    return row["page_size_config"] if row else None
get_page_size_config() returns the oldest page, not the “most recent”.
The query orders page_number ASC but the docstring says “most recent page”.
Proposed fix
- """SELECT page_size_config FROM html_pages
- WHERE project_id = ?
- ORDER BY page_number ASC
- LIMIT 1""",
+ """SELECT page_size_config FROM html_pages
+ WHERE project_id = ?
+ ORDER BY page_number DESC
+ LIMIT 1""",🤖 Prompt for AI Agents
In @claude_code_log/cache.py around lines 1087 - 1102, The
get_page_size_config() implementation returns the oldest page because the query
orders by page_number ASC; update the SQL in get_page_size_config() (and keep
LIMIT 1) to order by page_number DESC so it selects the most recent page's
page_size_config, and adjust the docstring only if you choose a different “most
recent” definition (e.g., use updated_at) to ensure the comment matches the
query.
def delete_session(self, session_id: str) -> bool:
    """Delete a session and its messages from cache.

    Args:
        session_id: The session ID to delete

    Returns:
        True if session was deleted, False if not found
    """
    if self._project_id is None:
        return False

    with self._get_connection() as conn:
        # Check if session exists
        row = conn.execute(
            "SELECT id FROM sessions WHERE project_id = ? AND session_id = ?",
            (self._project_id, session_id),
        ).fetchone()

        if not row:
            return False

        # Delete messages for this session
        conn.execute(
            "DELETE FROM messages WHERE project_id = ? AND session_id = ?",
            (self._project_id, session_id),
        )

        # Delete HTML cache entries for this session
        conn.execute(
            "DELETE FROM html_cache WHERE project_id = ? AND source_session_id = ?",
            (self._project_id, session_id),
        )

        # Delete page_sessions entries referencing this session
        conn.execute(
            """DELETE FROM page_sessions WHERE session_id = ?
            AND page_id IN (SELECT id FROM html_pages WHERE project_id = ?)""",
            (session_id, self._project_id),
        )

        # Delete cached_files entry for this session's JSONL file
        # File name pattern is {session_id}.jsonl
        conn.execute(
            "DELETE FROM cached_files WHERE project_id = ? AND file_name = ?",
            (self._project_id, f"{session_id}.jsonl"),
        )

        # Delete the session record
        conn.execute(
            "DELETE FROM sessions WHERE project_id = ? AND session_id = ?",
            (self._project_id, session_id),
        )

        self._update_last_updated(conn)
        conn.commit()

    return True
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Checking migrations for messages table structure ==="
find . -type f -name "*.py" -path "*/migrations*" | head -20
echo -e "\n=== 2) Searching for JSONL file naming patterns ==="
rg -n '{session_id}\.jsonl|f"\{session_id\}\.jsonl"|session_id.*\.jsonl' --type py -A 2 -B 2 | head -50
echo -e "\n=== 3) Checking where cached_files table is used ==="
rg -n 'cached_files' --type py -B 2 -A 2 | head -80
echo -e "\n=== 4) Searching for session file creation/writing logic ==="
rg -n 'write.*jsonl|jsonl.*write|\.jsonl' --type py -B 2 -A 2 | head -100
echo -e "\n=== 5) Checking file structure of cache.py around session methods ==="
rg -n 'def.*session|JSONL|jsonl' claude_code_log/cache.py -B 1 -A 1 | head -50

Repository: daaain/claude-code-log
Length of output: 415
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Checking migration files ==="
cat claude_code_log/migrations/runner.py | head -100
echo -e "\n=== 2) Searching for jsonl file naming (fixed regex) ==="
rg -n 'jsonl' claude_code_log/cache.py -B 2 -A 2
echo -e "\n=== 3) Finding where session files are created/saved ==="
rg -n 'session_id.*\.jsonl|\.jsonl.*session' --type py | head -30
echo -e "\n=== 4) Check cached_files creation/insertion ==="
rg -n 'INSERT.*cached_files|cached_files.*INSERT' --type py -B 3 -A 3 | head -60
echo -e "\n=== 5) Search for any session file path construction ==="
rg -n 'session.*path|path.*session|file.*name.*session' --type py | head -40
echo -e "\n=== 6) Check if there are tests for delete_session ==="
rg -n 'delete_session|test.*session' --type py | head -30

Repository: daaain/claude-code-log
Length of output: 25875
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check if delete_session also deletes the actual JSONL file ==="
rg -n 'def delete_session' claude_code_log/cache.py -A 80 | head -100
echo -e "\n=== 2) Search for where delete_session is called ==="
rg -n 'delete_session\(' --type py | head -30
echo -e "\n=== 3) Check for tests specifically testing delete_session ==="
rg -n 'test.*delete.*session|delete_session' test/ --type py -B 2 -A 10 | head -80
echo -e "\n=== 4) Look for file deletion in relation to session deletion ==="
rg -n 'unlink\|remove\|Path.*session.*delete' --type py | head -40
echo -e "\n=== 5) Check if there's a specific pattern for managing session JSONL file lifecycle ==="
rg -n 'session.*\.jsonl.*delete|delete.*session.*file' --type py -B 2 -A 2 | head -40

Repository: daaain/claude-code-log
Length of output: 8961
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Look at tui.py delete_session action to see full context ==="
rg -n 'def action_delete_session' claude_code_log/tui.py -A 50 | head -70
echo -e "\n=== 2) Check if JSONL files are ever created with names other than {session_id}.jsonl ==="
rg -n 'write.*text\|\.write_text' claude_code_log/cache.py | grep -i jsonl || echo "No matches in cache.py"
echo -e "\n=== 3) Search for any session file with different naming ==="
rg -n '\.jsonl' claude_code_log/ --type py | grep -v 'f"{session_id}.jsonl' | grep -v 'jsonl_path' | head -20
echo -e "\n=== 4) Look at migration files to confirm file_name column in cached_files ==="
find claude_code_log/migrations -name "*.sql" -type f -exec cat {} \; | head -200
echo -e "\n=== 5) Check if there are any unique constraints on cached_files ==="
rg -n 'UNIQUE\|PRIMARY KEY' claude_code_log/migrations/ -A 2 -B 2 | head -50

Repository: daaain/claude-code-log
Length of output: 11006
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check if cached_files entries are only created for session JSONL files ==="
rg -n 'INSERT INTO cached_files|save_cached_entries' claude_code_log/cache.py -B 5 -A 10 | head -50
echo -e "\n=== 2) Check for any agent file handling in cache.py ==="
rg -n 'agent' claude_code_log/cache.py | head -20
echo -e "\n=== 3) Verify the UNIQUE constraint on cached_files is enforced ==="
rg -n 'UNIQUE.*project_id.*file_name' claude_code_log/migrations/ -B 5 -A 5
echo -e "\n=== 4) Check tests for delete_session to see expected behavior ==="
rg -n 'def test.*delete.*session' test/test_cache_integration.py -A 30 | head -100
echo -e "\n=== 5) Verify if a session can map to multiple cached_files ==="
rg -n 'cached_files.*session|session.*cached_files' claude_code_log/cache.py -B 2 -A 2 | head -40

Repository: daaain/claude-code-log
Length of output: 7985
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check if sessions table has any reference to source file ==="
rg -n 'sessions.*table|CREATE TABLE.*sessions' claude_code_log/migrations/ -A 30 | head -50
echo -e "\n=== 2) Check how session_id is extracted from JSONL files ==="
rg -n 'sessionId|session_id' claude_code_log/cache.py -B 2 -A 2 | grep -A 3 -B 3 'extract\|parse' | head -40
echo -e "\n=== 3) Verify that a JSONL file name is always {session_id}.jsonl ==="
rg -n 'def.*session|session_id.*=' claude_code_log/converter.py | head -30
echo -e "\n=== 4) Check if there's a way to query cached_files by session_id ==="
rg -n 'session_id.*cached_files|cached_files.*session_id' claude_code_log/cache.py -B 2 -A 2
echo -e "\n=== 5) Look at how messages are linked to files in database ==="
rg -n 'file_id|file_name.*session' claude_code_log/cache.py | head -30

Repository: daaain/claude-code-log
Length of output: 4208
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check the relationship between messages, sessions, and cached_files ==="
rg -n 'messages.*session_id|session_id.*messages' claude_code_log/migrations/ -B 5 -A 5 | head -60
echo -e "\n=== 2) Can a single session_id have messages from different file_ids? ==="
rg -n 'file_id.*session_id|session_id.*file_id' claude_code_log/migrations/ -B 3 -A 3
echo -e "\n=== 3) Look at how converter creates sessions and maps to files ==="
rg -n 'add_session|update_session' claude_code_log/converter.py -B 2 -A 8 | head -50
echo -e "\n=== 4) Verify if file_name in cached_files is always {session_id}.jsonl ==="
cat claude_code_log/cache.py | sed -n '479,520p'
echo -e "\n=== 5) Check if a single session can be stored across multiple JSONL files ==="
rg -n 'session.*append|multiple.*session|session.*merge' claude_code_log/ --type py | head -20

Repository: daaain/claude-code-log
Length of output: 998
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check if save_cached_entries validates filename matches session_id ==="
rg -n 'def save_cached_entries' claude_code_log/cache.py -A 40 | head -50
echo -e "\n=== 2) Search for any logic that creates JSONL files with custom names ==="
rg -n 'Path.*\.jsonl|\.jsonl.*Path' claude_code_log/converter.py -B 2 -A 2 | head -40
echo -e "\n=== 3) Check if there's any code that could generate sessions from multiple files ==="
rg -n 'session.*aggregate|aggregate.*session|merge.*session' claude_code_log/ --type py | head -20
echo -e "\n=== 4) Verify that a session ID always comes from the file being processed ==="
cat claude_code_log/converter.py | sed -n '1110,1160p'
echo -e "\n=== 5) Check the relationship between file processing and session creation ==="
rg -n 'def load_transcript|def process.*jsonl' claude_code_log/converter.py -A 5 | head -40

Repository: daaain/claude-code-log
Length of output: 740
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check if save_cached_entries validates filename matches session_id ==="
rg -n 'def save_cached_entries' claude_code_log/cache.py -A 40 | head -50
echo -e "\n=== 2) Search for any logic that creates JSONL files with custom names ==="
rg -n 'Path.*jsonl|jsonl.*Path' claude_code_log/converter.py -B 2 -A 2 | head -40
echo -e "\n=== 3) Check if there's any code that could generate sessions from multiple files ==="
rg -n 'session.*aggregate|aggregate.*session|merge.*session' claude_code_log/ --type py | head -20
echo -e "\n=== 4) Verify relationship: one JSONL file = one session ==="
sed -n '1110,1160p' claude_code_log/converter.py
echo -e "\n=== 5) Check the key assumption: file_name column in cached_files stores just filename ==="
sed -n '479,510p' claude_code_log/cache.py

Repository: daaain/claude-code-log
Length of output: 5401
delete_session() assumes {session_id}.jsonl file naming; verify this contract is validated.
The code at lines 1426-1431 deletes the cached_files entry by matching f"{session_id}.jsonl", but there is no validation that this file_name actually exists in the cached_files table or corresponds to the session being deleted. If a JSONL file is named differently, the cached_files entry will silently remain, leaving stale cache. While the entire codebase consistently uses the {session_id}.jsonl naming pattern, consider adding explicit validation in delete_session() to confirm the cached_files entry exists before attempting deletion, or use a more robust approach (e.g., query cached_files for the session's actual messages and delete via file_id).
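The "more robust approach" can be sketched against a simplified two-table schema; the table and column names mirror the review discussion, but this stripped-down layout is an illustration, not the real migration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cached_files (id INTEGER PRIMARY KEY, project_id INTEGER,
                               file_name TEXT);
    CREATE TABLE messages (id INTEGER PRIMARY KEY, project_id INTEGER,
                           session_id TEXT, file_id INTEGER);
    """
)
# A session whose JSONL file does NOT follow the {session_id}.jsonl convention:
conn.execute("INSERT INTO cached_files VALUES (1, 1, 'renamed.jsonl')")
conn.execute("INSERT INTO messages VALUES (1, 1, 'abc123', 1)")

# Resolve the file via the session's own messages (must run BEFORE deleting
# the messages, since the subquery reads their file_id links).
conn.execute(
    """DELETE FROM cached_files
       WHERE project_id = ?
         AND id IN (SELECT DISTINCT file_id FROM messages
                    WHERE project_id = ? AND session_id = ?)""",
    (1, 1, "abc123"),
)
remaining = conn.execute("SELECT COUNT(*) FROM cached_files").fetchone()[0]
```

A name-pattern `DELETE` would leave `renamed.jsonl` behind; joining through `file_id` removes the stale entry regardless of naming.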
I'm pretty happy with this branch now @cboos, added a lot more tests and also addressed all your feedback from the archive management PR, here, and even some bits from your scratchpad. Basically, there's full archive management, more intuitive TUI (with esc / enter navigation, etc), cache file having a unique name by default and supporting setting an alternative path outside of

I think this is also going to be a great foundation for more advanced functionality that needs fast access to the logs in any shape and form!

Sounds great! I can't wait to test it... but I'll probably have to wait for this WE ;-) Feel free to merge in the meantime.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
claude_code_log/tui.py (5)
108-116: Clamp column widths to avoid negative/invalid widths on small terminals.
`self.size.width - 13` can go <= 0.

Proposed fix

@@
- table.add_column("Project", width=self.size.width - 13)
+ table.add_column("Project", width=max(20, self.size.width - 13))
  table.add_column("Sessions", width=10)
1507-1519: `format_timestamp()` “short” vs “long” formats are currently identical.
Both branches return "%m-%d %H:%M", so `time_width = 16` doesn't buy you anything.

Proposed fix

@@
- elif short_format:
-     return dt.strftime("%m-%d %H:%M")
- else:
-     return dt.strftime("%m-%d %H:%M")
+ elif short_format:
+     return dt.strftime("%m-%d %H:%M")
+ else:
+     return dt.strftime("%Y-%m-%d %H:%M")
1728-1803: Optimize session message loading for current sessions to avoid loading all messages.
For current sessions, `load_directory_transcripts()` loads all sessions from all `.jsonl` files, which `renderer.generate_session()` then filters by `session_id` internally. While correctness is preserved through the renderer's filtering, this is inefficient when exporting a single session. Consider loading only the specific session's file (e.g., `{project_path}/{session_id}.jsonl`) for current sessions, similar to how archived sessions load from the cache manager by ID.
108-203: Fix ambiguous project selection (directory name collisions can select the wrong project).
The current implementation resolves row selection by parsing and matching against `project_path.name`, which fails when multiple projects have the same directory name (e.g., `/path/a/project` and `/path/b/project`). The first collision match would always be selected, making duplicate-named projects unselectable.
Replace the text-parsing approach with direct row-index-to-list-index mapping. Since `populate_table()` adds rows sequentially without sorting or filtering, the table row order is always guaranteed to match `self.projects` order. This eliminates parsing overhead and collision risk entirely.

Proposed fix
def _update_selected_project_from_cursor(self) -> None:
    """Update the selected project based on the current cursor position."""
    try:
        table = cast(DataTable[str], self.query_one("#projects-table", DataTable))
-       row_data = table.get_row_at(table.cursor_row)
-       if row_data:
-           # Extract project display from the first column
-           project_display = str(row_data[0]).strip()
-
-           # Remove the arrow indicator if present
-           if project_display.startswith("→"):
-               project_display = project_display[1:].strip()
-
-           # Remove the archived indicator if present
-           if project_display.endswith(" [ARCHIVED]"):
-               project_display = project_display[:-11].strip()
-
-           # Find the matching project path
-           for project_path in self.projects:
-               if project_path.name == project_display:
-                   self.selected_project_path = project_path
-                   break
+       row = table.cursor_row
+       if 0 <= row < len(self.projects):
+           self.selected_project_path = self.projects[row]
    except Exception:
        # If widget not mounted yet or we can't get the row data, don't update selection
        pass
1362-1547: Use Textual row keys to track full session IDs and eliminate prefix-matching ambiguity.
Prefix-matching with `startswith()` can select the wrong session if session IDs share the same 8-character prefix. Additionally, the loop through session dictionaries on every cursor movement adds unnecessary O(n) complexity.
Pass the full `session_id` as the row key when calling `table.add_row(...)`:

table.add_row(
    session_id[:8],
    preview,
    start_time,
    end_time,
    str(session_data.message_count),
    token_display,
    key=session_id,  # Use full session ID as row key
)

Then retrieve it in `_update_selected_session_from_cursor()` using the Textual API instead of scanning:

row_key = table.get_row_key(table.cursor_row)
self.selected_session_id = str(row_key)

claude_code_log/renderer.py (1)
1491-1557: Harden `_filter_messages()` against string content for consistency with `_render_messages()`.
While `message.message.content` should always be `list[ContentItem]` for user/assistant messages after factory processing (via `create_message_content()`), iterating a string as a list would cause characters to be processed, breaking `has_tool_items` detection and `extract_text_content()` behavior. Apply the same defensive pattern already used in `_render_messages()`:

Proposed fix

@@
- else:
-     message_content = message.message.content
+ else:
+     raw_content = message.message.content
+     message_content = raw_content if isinstance(raw_content, list) else []
@@
- text_content = extract_text_content(message_content)
+ text_content = (
+     extract_text_content(message_content)
+     if message_content
+     else (str(raw_content) if raw_content is not None else "")
+ )
🤖 Fix all issues with AI agents
In @claude_code_log/tui.py:
- Around line 571-619: The _split_into_pages function currently uses a capturing
split re.split(r"(\n(?=## ))", ...) which yields delimiter-only chunks and
causes odd boundaries; replace that with a multiline zero-width split like
re.split(r"(?m)(?=^##\s)", content) or otherwise split on a non-capturing
lookahead so you don't get standalone "\n" tokens, and ensure you trim or skip
empty tokens and preserve section headings when building pages (handle oversized
sections by splitting on lines as you already do). Also verify and update usages
of viewer.document.update to use the supported public Textual API for the
installed version (e.g., viewer.update or recreating the widget / calling the
documented method) rather than directly mutating viewer.document; locate
references to viewer.document.update in the same module and swap to the public
method or widget-recreation approach if Textual v6.5.0 requires it.
In @test/test_tui.py:
- Around line 1690-1920: Several tests call Path.write_text(...) to create JSONL
files without explicitly setting encoding; update those write_text calls in
tests like test_archive_project_action, test_delete_project_cache_only,
test_delete_project_both, test_restore_project_not_archived_shows_warning and
other occurrences in test_tui.py to pass encoding="utf-8" (e.g.,
Path(...).write_text('{"type":"user"}\n', encoding="utf-8")) so file writes are
consistent and explicit across ProjectSelector-related tests.
🧹 Nitpick comments (3)
test/test_message_types.py (1)
176-187: Consider verifying no warning is printed in silent mode.

The test validates that `silent=True` returns an empty list, but doesn't verify that no warning message was printed. This would strengthen the test coverage for the silent behavior.

♻️ Suggested enhancement

```diff
-def test_load_transcript_missing_file_silent_mode():
+def test_load_transcript_missing_file_silent_mode(capsys: CaptureFixture[str]):
     """Test that load_transcript handles missing files in silent mode."""
     nonexistent_file = Path("/tmp/nonexistent-session-xyz789.jsonl")

     # Ensure it doesn't exist
     if nonexistent_file.exists():
         nonexistent_file.unlink()

     # Should return empty list without printing
     messages = load_transcript(nonexistent_file, silent=True)
     assert messages == [], f"Expected empty list, got {messages}"
+
+    # Should NOT print any warning
+    captured = capsys.readouterr()
+    assert "Warning" not in captured.out, "Silent mode should not print warnings"
+
     print("✓ Test passed: Missing file in silent mode returns empty list")
```

claude_code_log/tui.py (1)
254-317: File operations: consider atomic JSONL restore + avoid "disappearing project" when file deletion fails.

- Restore writes directly to the final `*.jsonl` path; a crash/exception can leave a partial file.
- In delete-project (`result == "both"`), if some JSONL deletions fail you still remove the project from the list, which can hide a still-present project directory from the UI.

Suggested tweaks:

- Write to `*.jsonl.tmp` then `Path.replace()` to make restore atomic.
- If any JSONL delete fails, consider keeping the project in `self.projects` (or re-scan) so the UI reflects reality.

Also applies to: 333-382, 399-466
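A minimal sketch of the atomic-restore idea (the helper name is hypothetical; the real restore code would write the exported JSONL lines):

```python
import os
import tempfile
from pathlib import Path

def atomic_write_text(target: Path, content: str) -> None:
    """Write to a sibling *.tmp file, then atomically move it into place."""
    tmp = target.with_suffix(target.suffix + ".tmp")
    tmp.write_text(content, encoding="utf-8")
    os.replace(tmp, target)  # os.replace is atomic on POSIX filesystems

with tempfile.TemporaryDirectory() as d:
    out = Path(d) / "session-1.jsonl"
    atomic_write_text(out, '{"type":"user"}\n')
    print(out.read_text(encoding="utf-8"))
```

A crash mid-write then leaves only a stray `.tmp` file behind rather than a truncated `.jsonl`.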
test/test_tui.py (1)
150-277: Tests may be flaky due to fixed `pause(0.1/0.2/1.0)` timings; prefer waiting on observable UI state.

Where possible, wait for the table to have the expected `row_count`, or for a widget to exist, instead of sleeping. (Not sure what the best "wait for" primitive is in Textual v6.5.0's test pilot—confirm available helpers and I can suggest concrete replacements.)
Also applies to: 326-377, 585-595
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (18)
- claude_code_log/converter.py
- claude_code_log/factories/user_factory.py
- claude_code_log/html/renderer_code.py
- claude_code_log/html/tool_formatters.py
- claude_code_log/html/utils.py
- claude_code_log/renderer.py
- claude_code_log/tui.py
- pyproject.toml
- stubs/pygments/__init__.pyi
- stubs/pygments/formatter.pyi
- stubs/pygments/formatters/__init__.pyi
- stubs/pygments/lexer.pyi
- stubs/pygments/lexers/__init__.pyi
- stubs/pygments/util.pyi
- test/test_cache.py
- test/test_index_timezone.py
- test/test_message_types.py
- test/test_tui.py
✅ Files skipped from review due to trivial changes (1)
- stubs/pygments/lexer.pyi
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py:

- Use the `-n auto` flag for parallel test execution with pytest in Python projects (e.g., `uv run pytest -n auto -m "not (tui or browser)" -v`)
- Use `ruff format` for code formatting in Python projects
- Use `ruff check --fix` for linting and automatic fixes in Python projects
- Use `uv run pyright` or `uv run ty check` for type checking in Python projects
Files:
claude_code_log/factories/user_factory.py, test/test_tui.py, claude_code_log/html/tool_formatters.py, test/test_cache.py, claude_code_log/html/utils.py, claude_code_log/converter.py, claude_code_log/renderer.py, test/test_index_timezone.py, claude_code_log/tui.py, claude_code_log/html/renderer_code.py, test/test_message_types.py
claude_code_log/renderer.py
📄 CodeRabbit inference engine (CLAUDE.md)
When adding new message types or modifying CSS class generation, ensure the timeline's message type detection logic is updated accordingly and that the filter is applied consistently to messages in both the main transcript and timeline
Files:
claude_code_log/renderer.py
🧠 Learnings (5)
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Implement interactive TUI (Terminal User Interface) for browsing Claude Code sessions with real-time navigation, summaries, and quick actions
Applied to files:
test/test_tui.py, claude_code_log/tui.py
📚 Learning: 2025-12-09T23:52:47.578Z
Learnt from: daaain
Repo: daaain/claude-code-log PR: 59
File: test/test_cache.py:135-165
Timestamp: 2025-12-09T23:52:47.578Z
Learning: SQLite supports NULLS FIRST and NULLS LAST in ORDER BY since v3.30.0 (Oct 2019). Do not flag SQL that uses these clauses as an error when reviewing Python tests or code that interacts with SQLite. If reviewing SQL strings, verify the target SQLite version supports NULLS FIRST/LAST and ensure the syntax is used correctly for the database in use.
Applied to files:
test/test_tui.py, test/test_cache.py, test/test_index_timezone.py, test/test_message_types.py
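For illustration, a self-contained `sqlite3` snippet showing the clause in action (table and column names are made up, not the project's schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT, last_ts TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("a", "2025-01-02"), ("b", None), ("c", "2025-01-01")],
)
# NULLS LAST keeps sessions without a timestamp at the end of a DESC sort
rows = conn.execute(
    "SELECT id FROM sessions ORDER BY last_ts DESC NULLS LAST"
).fetchall()
print([r[0] for r in rows])  # → ['a', 'c', 'b']
```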
📚 Learning: 2025-11-30T23:24:07.840Z
Learnt from: cboos
Repo: daaain/claude-code-log PR: 54
File: claude_code_log/renderer.py:2912-2945
Timestamp: 2025-11-30T23:24:07.840Z
Learning: In claude_code_log/renderer.py, the agentId field is currently only set on Task tool_result messages, not on tool_use messages, because the agentId is generated after the tool_use is logged. The _reorder_sidechain_template_messages function relies on this to avoid duplicate sidechain insertion.
Applied to files:
claude_code_log/converter.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/templates/components/timeline.html : When adding new message types or modifying CSS class generation in renderer.py, ensure the timeline's message type detection logic in timeline.html is updated accordingly to maintain feature parity
Applied to files:
claude_code_log/renderer.py, claude_code_log/html/renderer_code.py
📚 Learning: 2025-12-31T16:37:39.494Z
Learnt from: CR
Repo: daaain/claude-code-log PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-31T16:37:39.494Z
Learning: Applies to claude_code_log/renderer.py : When adding new message types or modifying CSS class generation, ensure the timeline's message type detection logic is updated accordingly and that the filter is applied consistently to messages in both the main transcript and timeline
Applied to files:
claude_code_log/renderer.py, claude_code_log/html/renderer_code.py
🧬 Code graph analysis (9)
stubs/pygments/formatters/__init__.pyi (1)
stubs/pygments/formatter.pyi (1)
Formatter(5-7)
claude_code_log/factories/user_factory.py (1)
claude_code_log/models.py (1)
ImageContent(70-78)
stubs/pygments/lexers/__init__.pyi (1)
stubs/pygments/lexer.pyi (1)
Lexer(5-7)
claude_code_log/html/tool_formatters.py (1)
claude_code_log/html/utils.py (1)
escape_html(191-198)
test/test_cache.py (1)
claude_code_log/cache.py (5)
is_file_cached (377-398), get_cached_project_data (661-733), load_cached_entries_filtered (417-477), CacheManager (193-1459), get_all_cached_projects (1462-1516)
claude_code_log/html/utils.py (7)
- stubs/pygments/__init__.pyi (1): highlight (5-5)
- stubs/pygments/lexer.pyi (1): Lexer (5-7)
- stubs/pygments/lexers/__init__.pyi (2): get_lexer_by_name (11-11), TextLexer (7-9)
- stubs/pygments/formatter.pyi (1): Formatter (5-7)
- stubs/pygments/formatters/__init__.pyi (1): HtmlFormatter (7-16)
- stubs/pygments/util.pyi (1): ClassNotFound (3-6)
- claude_code_log/renderer_timings.py (1): timing_stat (88-110)
claude_code_log/renderer.py (1)
claude_code_log/html/renderer.py (1)
is_outdated(587-598)
claude_code_log/html/renderer_code.py (5)
- stubs/pygments/__init__.pyi (1): highlight (5-5)
- stubs/pygments/lexer.pyi (1): Lexer (5-7)
- stubs/pygments/formatter.pyi (1): Formatter (5-7)
- stubs/pygments/formatters/__init__.pyi (1): HtmlFormatter (7-16)
- stubs/pygments/util.pyi (1): ClassNotFound (3-6)
test/test_message_types.py (1)
claude_code_log/converter.py (1)
load_transcript(116-293)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (ubuntu-latest, 3.12)
- GitHub Check: test (windows-latest, 3.12)
- GitHub Check: test (windows-latest, 3.13)
- GitHub Check: test (windows-latest, 3.11)
- GitHub Check: test (ubuntu-latest, 3.11)
- GitHub Check: test (windows-latest, 3.14)
- GitHub Check: test (ubuntu-latest, 3.13)
- GitHub Check: test (ubuntu-latest, 3.14)
- GitHub Check: test (windows-latest, 3.10)
- GitHub Check: test (ubuntu-latest, 3.10)
🔇 Additional comments (47)
test/test_index_timezone.py (1)
14-14: LGTM! Type ignore cleanup is appropriate. Removing `type: ignore[call-non-callable]` comments from `pytest.skip()` calls is correct cleanup. Modern pytest type stubs properly define `skip` as a callable, so these suppressions are no longer needed.

Also applies to: 26-26, 71-71, 85-85

stubs/pygments/lexers/__init__.pyi (1)

1-20: LGTM! The type stubs correctly define `TextLexer` as extending `Lexer`, and the function signatures for `get_lexer_by_name` and `get_all_lexers` accurately reflect the pygments API. The return type for `get_all_lexers` properly captures the 4-tuple structure `(name, aliases, patterns, mimetypes)`.

stubs/pygments/util.pyi (1)

1-6: LGTM! The `ClassNotFound` exception stub correctly inherits from `Exception` and provides adequate typing for error handling in code that catches this pygments-specific exception.

stubs/pygments/formatter.pyi (1)

1-7: LGTM! The base `Formatter` class stub is correctly defined with the `**options` pattern matching the pygments API, and properly serves as the parent class for `HtmlFormatter`.

stubs/pygments/__init__.pyi (1)

1-5: LGTM! The `highlight` function stub correctly captures the core API. Using `Any` for lexer/formatter parameters is acceptable for minimal stubs, though you could optionally use `Lexer` and `Formatter` types for stricter checking if desired.

stubs/pygments/formatters/__init__.pyi (1)

1-16: LGTM! The `HtmlFormatter` stub provides excellent typing coverage for the commonly used parameters. The `linenos: bool | Literal["table", "inline"]` union type correctly captures that this parameter accepts boolean values or specific string literals, which matches the pygments API behavior.

pyproject.toml (2)

57-59: LGTM! Consistent with existing Pyright configuration. The `extra-paths = ["stubs"]` setting aligns with the Pyright `stubPath = "stubs"` configuration on line 65, ensuring both type checkers use the same custom stubs for untyped libraries.

119-119: The `ty>=0.0.11` constraint is valid and current. Version 0.0.11 is confirmed to exist on PyPI and is the latest available version. No changes needed.
claude_code_log/html/tool_formatters.py (2)
545-545: LGTM — Unicode handling preserved. The `ensure_ascii=False` correctly preserves non-ASCII characters in JSON output, which is important for internationalization.

Note: The AI summary incorrectly states that `ensure_ascii` was removed and defaults to `True`. The code clearly retains `ensure_ascii=False`.

561-564: LGTM — Cleaner type handling. The explicit `cast(object, value)` satisfies static type checkers without relying on `# type: ignore` comments. The added comment clarifies the fallback intent.

claude_code_log/factories/user_factory.py (2)
447-449: LGTM! The type ignore removal is appropriate here. The `hasattr(item, "text")` check on line 447 ensures the attribute exists at runtime, and the explicit `: str` annotation handles the type.

463-465: Remove or clarify the "duck-typed" terminology—the code is type-safe. The concern about `item.model_dump()` failing is unfounded. Because `content_list` is typed as `list[ContentItem]` (a Union of Pydantic models), every item is guaranteed to be a Pydantic `BaseModel`. This `elif` branch handles other Union members that happen to have compatible `source` and `type == "image"` attributes, not arbitrary duck-typed objects. Since `model_dump()` is a Pydantic method available on all Union members, the code is type-safe. The comment would be clearer if it described the actual scenario: "Convert image-like Pydantic models to our ImageContent model."

Likely an incorrect or invalid review comment.
test/test_message_types.py (3)
7-8: LGTM! Clean import of `CaptureFixture` for proper typing of the pytest fixture parameter.

153-174: LGTM! This test properly verifies the race condition handling where a file may be deleted between glob and open. The assertions correctly check both the return value (empty list) and the warning output.

193-194: LGTM! Appropriate comment explaining why the capsys-dependent test is skipped when running directly.
claude_code_log/html/renderer_code.py (4)
15-20: LGTM! Clean import structure using explicit Pygments public API types (`Lexer`, `Formatter`) instead of relying on broad type-ignores. This improves type safety and makes the code more maintainable.

54-54: LGTM! Proper use of underscore prefix for unused loop variables (`_name`, `_mimetypes`), following Python conventions.

98-121: LGTM! Good refactoring pattern: initializing a default `TextLexer` before the try block ensures the fallback is always available without duplicating fallback logic. The explicit `Lexer` type annotation improves clarity.

124-133: LGTM! The explicit `Formatter` type annotation and removal of type-ignore on the `highlight` call align with the improved typing throughout this module.

claude_code_log/html/utils.py (3)
203-208: LGTM! Import structure matches the pattern established in `renderer_code.py`, maintaining consistency across the codebase for Pygments type usage.

219-233: LGTM! The default `TextLexer` initialization pattern and explicit type annotations are consistent with the changes in `renderer_code.py`. Good code consistency across the HTML rendering modules.

445-447: Pragmatic workaround for Jinja2's strict globals typing. The `Any` cast is a reasonable approach to bypass Jinja2's overly strict type definition for `env.globals`. The comment explains the intent clearly.

test/test_cache.py (7)
26-33: LGTM! Good approach: creating a nested project subdirectory ensures each test gets a unique `db_path` at the parent level, providing proper test isolation for the shared SQLite database.

104-116: LGTM! Tests correctly verify the new `db_path` property and its location at the parent directory level, aligned with the SQLite-backed cache design.

141-170: LGTM! Direct SQLite queries are appropriate here to verify the internal storage structure. Filtering by `project_id` ensures test isolation when using a shared database. The use of `NULLS LAST` is valid (supported since SQLite 3.30.0). Based on learnings, this SQL syntax is correct.

266-328: Excellent edge case coverage. This test catches a subtle bug where ISO format with microseconds (`.999999`) vs the UTC `Z` suffix could cause incorrect timestamp comparisons. The detailed assertion message will help debug any regressions.
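To make the failure mode concrete (illustrative values, not the project's test data): lexicographic comparison of mixed-format ISO strings can invert the ordering, while parsing to `datetime` does not.

```python
from datetime import datetime

earlier = "2025-01-01T10:00:00Z"
later = "2025-01-01T10:00:00.500000+00:00"

# As raw strings, '.' sorts before 'Z', so the later timestamp compares smaller:
assert later < earlier

def parse_iso(ts: str) -> datetime:
    # Normalize the Z suffix; datetime.fromisoformat only accepts it from 3.11
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

# Parsed comparison gives the correct ordering:
assert parse_iso(later) > parse_iso(earlier)
print("string order wrong, datetime order correct")
```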
330-352: LGTM! Test properly verifies that `clear_cache` removes cached files and sessions while preserving the project structure in the SQLite database.

660-670: LGTM! Test renamed to better reflect the scenario (missing cache entry rather than corrupted file), appropriate for the SQLite-backed design where file corruption is less relevant.

707-792: Comprehensive environment variable test coverage. The tests properly verify the precedence order: explicit `db_path` > `CLAUDE_CODE_LOG_CACHE_PATH` env var > default location. Using `monkeypatch` for environment manipulation is the correct pytest approach.

claude_code_log/converter.py (15)
6-9: LGTM! Import additions are appropriate for the new functionality: `dataclass` for `GenerationStats`, cache-related imports, and `is_html_outdated` for HTML regeneration checks.

Also applies to: 23-28, 39-39

157-166: LGTM! Proper handling of the race condition where a file may be deleted between glob discovery and file open. The graceful degradation to an empty list with an optional warning is the right approach.

420-486: LGTM! `GenerationStats` is a well-designed dataclass for tracking per-project generation metrics. The `summary()` method provides a clean, formatted output for reporting.
542-576: Session assignment may produce pages larger than expected.

The logic adds sessions to the current page and then checks if the limit is exceeded. This means a page can contain significantly more messages than `page_size` if a large session is added. While the docstring notes "overflow allowed to keep sessions intact," consider whether there should be an upper bound or warning for extremely large sessions. For instance, should a session with, say, 50,000 messages be allowed on a single page, or should there be a secondary split mechanism or warning for such cases?
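A rough sketch of the greedy assignment the review describes (hypothetical shape, not the actual converter code): sessions are appended in order, and a page closes once its running message count reaches `page_size`, so a single huge session always lands whole on one page.

```python
def assign_sessions_to_pages(
    sessions: list[tuple[str, int]],  # (session_id, message_count)
    page_size: int,
) -> list[list[str]]:
    pages: list[list[str]] = []
    current: list[str] = []
    count = 0
    for session_id, n in sessions:
        current.append(session_id)
        count += n
        if count >= page_size:  # overflow allowed: sessions stay intact
            pages.append(current)
            current, count = [], 0
    if current:
        pages.append(current)
    return pages

print(assign_sessions_to_pages([("s1", 30), ("s2", 90), ("s3", 10)], 50))
# → [['s1', 's2'], ['s3']]: the 90-message session overflows page 1
```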
579-674: LGTM! Good fallback implementation for building session data when cache is unavailable. The logic correctly filters warmup sessions and extracts token usage from assistant messages.

726-732: LGTM! Proper cleanup of orphan pages when page count decreases. This prevents stale HTML files from accumulating when sessions are deleted or pagination configuration changes.

793-805: Page navigation logic is sound. The `page_info` dict correctly provides previous/next links with an appropriate `is_last_page` flag. Enabling the previous page's next link when creating a new page ensures navigation stays consistent.

952-972: Good early exit optimization. This fast path avoids expensive message loading when all HTML files are current. The check order (combined stale → session stale) is logical.

1015-1023: Verify pagination threshold behavior. Pagination is enabled when `total_message_count > page_size` OR `existing_page_count > 1`. The second condition preserves pagination for projects that were previously paginated but now have fewer messages. This seems intentional to avoid constantly switching between modes.

1099-1129: LGTM! `has_cache_changes` provides a fast mtime-only check without loading messages, enabling efficient early-exit paths in the pipeline.

1456-1460: Good API enhancement. Returning the count of regenerated sessions enables better progress reporting in the calling code.

1561-1577: LGTM! Proper tracking of HTML cache updates per session, enabling incremental regeneration on subsequent runs.

1586-1627: Helpful user guidance for archived sessions. Reading `cleanupPeriodDays` from settings and printing a note with a link to restoration documentation improves the user experience when encountering archived sessions.

1973-2028: LGTM! Archived project handling is well-integrated: cached data is retrieved, displayed with an `[ARCHIVED]` indicator, and added to project summaries with an `is_archived: True` flag for UI differentiation.

2055-2073: LGTM! Clean summary output with timing, error/warning reporting, and the archived sessions note when applicable.
claude_code_log/renderer.py (2)
1738-1753: Good: `_render_messages()` now safely handles list vs string content. The list-vs-string branching + `chunk_message_content()` integration makes message handling clearer and more robust.

2308-2320: `is_html_outdated()` helper looks fine; consider reusing `get_renderer("html")` if instantiation gets hot. Probably negligible, but worth keeping in mind if this ends up in a tight loop.

test/test_tui.py (1)
90-107: Fixture update to per-session JSONL (+ explicit UTF-8) is a solid improvement.
Matches real-world layout and removes implicit-encoding ambiguity.
```python
def _split_into_pages(self, content: str) -> list[str]:
    """Split markdown content into pages by character count.

    Splits at section boundaries (## ) when possible to avoid
    cutting mid-section, but will split within sections if
    a single section exceeds PAGE_SIZE_CHARS.
    """
    import re

    if len(content) <= self.PAGE_SIZE_CHARS:
        return [content]

    pages: list[str] = []
    current_page = ""

    # Split by level 2 headings, keeping the delimiter
    sections = re.split(r"(\n(?=## ))", content)

    for section in sections:
        if not section:
            continue

        # If adding this section exceeds page size
        if len(current_page) + len(section) > self.PAGE_SIZE_CHARS:
            # If current page has content, save it
            if current_page.strip():
                pages.append(current_page)
                current_page = ""

            # If section itself exceeds page size, split it by lines
            if len(section) > self.PAGE_SIZE_CHARS:
                lines = section.split("\n")
                for line in lines:
                    if len(current_page) + len(line) + 1 > self.PAGE_SIZE_CHARS:
                        if current_page.strip():
                            pages.append(current_page)
                        current_page = line + "\n"
                    else:
                        current_page += line + "\n"
            else:
                current_page = section
        else:
            current_page += section

    # Don't forget the last page
    if current_page.strip():
        pages.append(current_page)

    return pages if pages else [content]
```
Pagination split logic + Textual API usage need a quick sanity check.
- `re.split(r"(\n(?=## ))", content)` likely produces delimiter-only chunks (`"\n"`) and can lead to odd page boundaries.
- `viewer.document.update(...)` may be Textual-version sensitive (and `document` might not be the supported public API).

If Textual v6.5.0 supports it, prefer updating via a public method/property (or recreating the widget) and use a multiline heading split like `(?m)(?=^##\s)` to avoid delimiter tokens.
Also applies to: 724-744
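A standalone sketch of the difference between the two split patterns:

```python
import re

content = "intro\n## A\nbody a\n## B\nbody b"

# Capturing split leaves delimiter-only "\n" tokens in the result:
with_capture = re.split(r"(\n(?=## ))", content)
print(with_capture)  # → ['intro', '\n', '## A\nbody a', '\n', '## B\nbody b']

# A zero-width multiline lookahead yields clean, heading-led chunks instead:
clean = re.split(r"(?m)(?=^## )", content)
print(clean)  # → ['intro\n', '## A\nbody a\n', '## B\nbody b']
```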
🤖 Prompt for AI Agents
In @claude_code_log/tui.py around lines 571 - 619, The _split_into_pages
function currently uses a capturing split re.split(r"(\n(?=## ))", ...) which
yields delimiter-only chunks and causes odd boundaries; replace that with a
multiline zero-width split like re.split(r"(?m)(?=^##\s)", content) or otherwise
split on a non-capturing lookahead so you don't get standalone "\n" tokens, and
ensure you trim or skip empty tokens and preserve section headings when building
pages (handle oversized sections by splitting on lines as you already do). Also
verify and update usages of viewer.document.update to use the supported public
Textual API for the installed version (e.g., viewer.update or recreating the
widget / calling the documented method) rather than directly mutating
viewer.document; locate references to viewer.document.update in the same module
and swap to the public method or widget-recreation approach if Textual v6.5.0
requires it.
```python
        """Test that Enter key selects the highlighted project."""
        with tempfile.TemporaryDirectory() as temp_dir:
            project1 = Path(temp_dir) / "project1"
            project1.mkdir()
            (project1 / "session-1.jsonl").write_text('{"type":"user"}\n')

            project2 = Path(temp_dir) / "project2"
            project2.mkdir()
            (project2 / "session-2.jsonl").write_text('{"type":"user"}\n')

            app = ProjectSelector(
                projects=[project1, project2],
                matching_projects=[],
                archived_projects=set(),
            )

            async with app.run_test() as pilot:
                await pilot.pause(0.2)

                # Select first project and press Enter
                await pilot.press("enter")
                await pilot.pause(0.1)

    @pytest.mark.asyncio
    async def test_escape_key_quits(self):
        """Test that Escape key quits the application."""
        with tempfile.TemporaryDirectory() as temp_dir:
            project1 = Path(temp_dir) / "project1"
            project1.mkdir()

            app = ProjectSelector(
                projects=[project1],
                matching_projects=[],
                archived_projects=set(),
            )

            async with app.run_test() as pilot:
                await pilot.pause(0.2)

                # Press Escape to quit
                await pilot.press("escape")
                await pilot.pause(0.1)

    @pytest.mark.asyncio
    async def test_archive_project_action(self):
        """Test archiving a project deletes JSONL files."""
        with tempfile.TemporaryDirectory() as temp_dir:
            project_path = Path(temp_dir) / "project1"
            project_path.mkdir()
            jsonl1 = project_path / "session-1.jsonl"
            jsonl2 = project_path / "session-2.jsonl"
            jsonl1.write_text('{"type":"user"}\n')
            jsonl2.write_text('{"type":"user"}\n')

            app = ProjectSelector(
                projects=[project_path],
                matching_projects=[],
                archived_projects=set(),
            )

            async with app.run_test() as pilot:
                await pilot.pause(0.2)

                # Select the project
                app.selected_project_path = project_path

                # Both JSONL files should exist
                assert jsonl1.exists()
                assert jsonl2.exists()

                # Press 'a' to archive and then confirm
                await pilot.press("a")
                await pilot.pause(0.1)
                await pilot.press("y")
                await pilot.pause(0.1)

                # JSONL files should be deleted
                assert not jsonl1.exists()
                assert not jsonl2.exists()

                # Project should now be in archived set
                assert project_path in app.archived_projects

    @pytest.mark.asyncio
    async def test_archive_project_already_archived_shows_warning(self):
        """Test archiving an already archived project shows warning."""
        with tempfile.TemporaryDirectory() as temp_dir:
            project_path = Path(temp_dir) / "project1"
            project_path.mkdir()

            app = ProjectSelector(
                projects=[project_path],
                matching_projects=[],
                archived_projects={project_path},  # Already archived
            )

            async with app.run_test() as pilot:
                await pilot.pause(0.2)

                # Select the archived project
                app.selected_project_path = project_path

                # Try to archive - should show warning
                await pilot.press("a")
                await pilot.pause(0.1)

    @pytest.mark.asyncio
    async def test_delete_project_cache_only(self):
        """Test deleting project cache only keeps JSONL files."""
        with tempfile.TemporaryDirectory() as temp_dir:
            project_path = Path(temp_dir) / "project1"
            project_path.mkdir()
            jsonl = project_path / "session-1.jsonl"
            jsonl.write_text('{"type":"user"}\n')

            app = ProjectSelector(
                projects=[project_path],
                matching_projects=[],
                archived_projects=set(),
            )

            with patch.object(CacheManager, "clear_cache"):
                async with app.run_test() as pilot:
                    await pilot.pause(0.2)

                    # Select the project
                    app.selected_project_path = project_path

                    # Press 'd' to delete and choose cache only
                    await pilot.press("d")
                    await pilot.pause(0.1)
                    await pilot.press("c")  # Cache only
                    await pilot.pause(0.1)

                    # JSONL file should still exist
                    assert jsonl.exists()

    @pytest.mark.asyncio
    async def test_delete_project_both(self):
        """Test deleting project cache and JSONL files."""
        with tempfile.TemporaryDirectory() as temp_dir:
            project_path = Path(temp_dir) / "project1"
            project_path.mkdir()
            jsonl = project_path / "session-1.jsonl"
            jsonl.write_text('{"type":"user"}\n')

            app = ProjectSelector(
                projects=[project_path],
                matching_projects=[],
                archived_projects=set(),
            )

            with patch.object(CacheManager, "clear_cache"):
                async with app.run_test() as pilot:
                    await pilot.pause(0.2)

                    # Select the project
                    app.selected_project_path = project_path

                    assert jsonl.exists()

                    # Press 'd' to delete and choose both
                    await pilot.press("d")
                    await pilot.pause(0.1)
                    await pilot.press("b")  # Both
                    await pilot.pause(0.1)

                    # JSONL file should be deleted
                    assert not jsonl.exists()

    @pytest.mark.asyncio
    async def test_restore_project_creates_directory(self):
        """Test restoring a project creates directory if missing."""
        with tempfile.TemporaryDirectory() as temp_dir:
            project_path = Path(temp_dir) / "deleted_project"
            # Don't create the directory

            mock_session_data = {
                "session-123": SessionCacheData(
                    session_id="session-123",
                    first_timestamp="2025-01-01T10:00:00Z",
                    last_timestamp="2025-01-01T10:01:00Z",
                    message_count=1,
                    first_user_message="Test",
                    total_input_tokens=10,
                    total_output_tokens=10,
                ),
            }

            app = ProjectSelector(
                projects=[project_path],
                matching_projects=[],
                archived_projects={project_path},  # Archived project
            )

            with (
                patch.object(CacheManager, "get_cached_project_data") as mock_cache,
                patch.object(
                    CacheManager,
                    "export_session_to_jsonl",
                    return_value=['{"type":"user"}'],
                ),
            ):
                mock_cache.return_value = Mock(sessions=mock_session_data)

                async with app.run_test() as pilot:
                    await pilot.pause(0.2)

                    # Select the project
                    app.selected_project_path = project_path

                    # Directory should not exist
                    assert not project_path.exists()

                    # Press 'r' to restore and confirm
                    await pilot.press("r")
                    await pilot.pause(0.1)
                    await pilot.press("y")
                    await pilot.pause(0.1)

                    # Directory should now exist
                    assert project_path.exists()

    @pytest.mark.asyncio
    async def test_restore_project_not_archived_shows_warning(self):
        """Test restoring a non-archived project shows warning."""
        with tempfile.TemporaryDirectory() as temp_dir:
            project_path = Path(temp_dir) / "project1"
            project_path.mkdir()
            (project_path / "session-1.jsonl").write_text('{"type":"user"}\n')
```
Make test file writes consistently UTF-8 (a few write_text() calls omit encoding).
Some tests use Path.write_text('...') without encoding="utf-8" while others specify it.
Example patch pattern
```diff
- (project1 / "session-1.jsonl").write_text('{"type":"user"}\n')
+ (project1 / "session-1.jsonl").write_text('{"type":"user"}\n', encoding="utf-8")
```

(Apply similarly to the other `write_text(...jsonl...)` sites.)
🤖 Prompt for AI Agents
In @test/test_tui.py around lines 1690 - 1920, Several tests call
Path.write_text(...) to create JSONL files without explicitly setting encoding;
update those write_text calls in tests like test_archive_project_action,
test_delete_project_cache_only, test_delete_project_both,
test_restore_project_not_archived_shows_warning and other occurrences in
test_tui.py to pass encoding="utf-8" (e.g.,
Path(...).write_text('{"type":"user"}\n', encoding="utf-8")) so file writes are
consistent and explicit across ProjectSelector-related tests.
Oh right! Today I added pagination to the Markdown view in the TUI: heading links scroll, links pop up a message saying they won't work, and the ToC outline can be toggled now 😸 Will just play a bit more with it in the coming days and do another sweep of CodeRabbit reviews; I'll be away with work, so weekend to merge sounds good. I'm thinking of doing this as a 1.0.0 release as it's kind of a breaking change from the previous JSON cache? |


Pretty comprehensive cache rewrite where all the bits using cache have been retooled to use SQLite.
The schema is created as a migration to make it easier to update when new log types are added by Anthropic.
Also added lots of tests, including integrity to make sure db constraints are working.
Haven't measured timings yet, but it feels much faster on my 123K+ message CC projects dir, which yielded an 848MB db 😅
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.