
Import Diigo shared bookmarks and add curation/rendering pipeline #244

Open
dckc wants to merge 17 commits into master from diigo-shared-monthly

dckc commented Mar 6, 2026

This PR imports, curates, and publishes monthly shared-bookmark roundup pages. It also adds tooling and presentation updates that keep the workflow repeatable and safer.

Highlights:

  • add/refine monthly bookmark rendering and regenerate roundup pages
  • normalize selected legacy tag case (HTML, RChain, URI)
  • add expand_tco_urls.py with cache + batching (--next N)
  • switch monthly rendering to cache-only t.co expansion
  • add mixed-case tag allowlist checker
  • hide/de-emphasize roundup pages in homepage/list views

Closes #243
Refs #32

dckc added 13 commits March 6, 2026 16:23
- Rendering changes
  - Switch monthly links section from markdown bullets to HTML list markup (ul/li)
  - Render each bookmark in Diigo-like order:
    - title link
    - line break
    - metadata line with date first, then tags
    - line break
    - optional description paragraph
    - optional annotation quote blocks
  - Omit no_tag from visible per-item metadata
  - Remove textual labels from content blocks:
    - no "tags:" wrapper
    - no "description:" prefix
    - no "annotations:" heading

- Annotation handling
  - Render annotations as raw trusted HTML inside blockquote
  - Render annotation comments as note-styled quote blocks

- Support functions
  - Add helpers for HTML-focused output:
    - annotation_html(...)
    - inline_html_text(...)
    - visible_tags(...)

- Test/quality updates
  - Expand doctest assertions for new HTML output contract
  - Keep regression coverage for malformed title suffixes and multiline text
  - Re-run lint and doctest successfully before commit
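
The per-item layout described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in monthly_bookmarks.py: the helper names come from the commit message, but their signatures and bodies, the `render_item` function, and the bookmark field names (`url`, `title`, `date`, `tags`, `description`, `annotations`, `content`, `comment`) are all assumptions.

```python
from html import escape

# Assumption: "no_tag" is the only placeholder tag hidden from readers.
HIDDEN_TAGS = {"no_tag"}


def visible_tags(tags):
    """Drop placeholder tags that should not appear in per-item metadata."""
    return [tag for tag in tags if tag not in HIDDEN_TAGS]


def inline_html_text(text):
    """Collapse whitespace (including newlines) and escape HTML specials."""
    return escape(" ".join(text.split()))


def annotation_html(annotation):
    """Emit trusted annotation HTML in a blockquote, plus a note-styled comment."""
    # The annotation body is inserted unescaped because it is treated as trusted HTML.
    blocks = [f"<blockquote>{annotation['content']}</blockquote>"]
    comment = annotation.get("comment")
    if comment:
        blocks.append(
            f'<blockquote class="note">{inline_html_text(comment)}</blockquote>'
        )
    return "\n".join(blocks)


def render_item(bookmark):
    """Render one bookmark as an <li>: title link, metadata line, then optional blocks."""
    href = escape(bookmark["url"], quote=True)
    # Date comes first on the metadata line, followed by the visible tags.
    meta = " ".join([bookmark["date"], *visible_tags(bookmark.get("tags", []))])
    lines = [
        "<li>",
        f'<a href="{href}">{inline_html_text(bookmark["title"])}</a>',
        "<br>",
        inline_html_text(meta),
        "<br>",
    ]
    if bookmark.get("description"):
        lines.append(f"<p>{inline_html_text(bookmark['description'])}</p>")
    for annotation in bookmark.get("annotations", []):
        lines.append(annotation_html(annotation))
    lines.append("</li>")
    return "\n".join(lines)
```

Only the annotation body bypasses escaping in this sketch; titles, metadata, descriptions, and comments are escaped, which keeps the "trusted HTML" surface as small as possible.
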
Align frontmatter tag casing across pages by normalizing exact tag tokens:
- html -> HTML
- rchain -> RChain
- uri -> URI

This is isolated as its own commit and only changes tag-list metadata lines.
Regenerated pages/**/bookmarks-YYYY-MM.md using the updated bookmark renderer and current t.co cache-only linkification path.
Add a canonical mixed-case tag list and map tag parsing through it so tags like KC/HTML/RChain/URI retain intended capitalization while all other tags still normalize to lowercase.
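
One way to read the mapping described above is a case-insensitive lookup into the canonical list, with lowercase as the fallback. The sketch below assumes exactly that; `MIXED_CASE_TAGS` and `normalize_tag` are hypothetical names, and the real list may contain more entries than the four named in the commit message.

```python
# Assumption: only the four tags named in the commit message are listed here.
MIXED_CASE_TAGS = ("KC", "HTML", "RChain", "URI")

# Index the canonical spellings by their lowercase form for O(1) lookup.
_CANONICAL = {tag.lower(): tag for tag in MIXED_CASE_TAGS}


def normalize_tag(tag):
    """Return the canonical spelling for allowlisted tags, lowercase otherwise."""
    key = tag.strip().lower()
    return _CANONICAL.get(key, key)
```

With this shape, `normalize_tag("rchain")` yields `RChain`, while a tag outside the list such as `Python` is lowercased.
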
Remove live URL expansion from monthly_bookmarks.py rendering. Linkification now resolves short URLs from the tco cache file only, adds --tco-cache CLI support, and threads cache data through post rendering.
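
A cache-only expansion step could look like the sketch below. It assumes the cache file is a JSON object mapping each short t.co URL to its resolved URL; that layout, the regular expression, and the function names are assumptions rather than details taken from monthly_bookmarks.py.

```python
import json
import re

# Assumption: t.co short links use an alphanumeric path segment.
TCO_URL = re.compile(r"https?://t\.co/[A-Za-z0-9]+")


def load_tco_cache(path):
    """Load the short-URL cache; assumed to be a JSON object of short -> long URLs."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def expand_from_cache(text, cache):
    """Replace cached t.co links; uncached links are left as-is, with no network access."""
    return TCO_URL.sub(lambda match: cache.get(match.group(0), match.group(0)), text)
```

Because a cache miss leaves the short link untouched, rendering stays deterministic and offline; filling in the missing entries is left to the separate expansion utility.
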
Introduce a standalone t.co expansion utility that supports ndjson or rendered markdown inputs, updates a JSON cache, and offers --next N batching with throttled progress output.
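
The `--next N` batching idea can be illustrated with the sketch below. It is not the code in expand_tco_urls.py: the function names, the default batch size, the delay, and the injected `resolve` callable (which stands in for the HTTP redirect lookup) are all assumptions made to keep the example self-contained and offline.

```python
import argparse
import time


def pending_urls(candidates, cache):
    """Return candidate URLs not yet in the cache, de-duplicated, in input order."""
    seen = set()
    pending = []
    for url in candidates:
        if url not in cache and url not in seen:
            seen.add(url)
            pending.append(url)
    return pending


def expand_batch(candidates, cache, resolve, limit, delay=1.0, sleep=time.sleep):
    """Resolve at most `limit` uncached URLs, pausing `delay` seconds between lookups."""
    batch = pending_urls(candidates, cache)[:limit]
    for index, url in enumerate(batch, start=1):
        cache[url] = resolve(url)
        print(f"[{index}/{len(batch)}] {url} -> {cache[url]}")
        if index < len(batch):
            sleep(delay)  # throttle between requests, but not after the last one
    return len(batch)


def build_parser():
    """CLI sketch; only the --next flag is taken from the PR description."""
    parser = argparse.ArgumentParser(description="Expand t.co URLs into a cache")
    # Assumption: a default of 10 is arbitrary and chosen for illustration only.
    parser.add_argument("--next", dest="limit", type=int, default=10, metavar="N")
    return parser
```

Running the tool repeatedly with a small N would then work through the backlog in order, since URLs resolved in earlier runs are already in the cache and are skipped.
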
Add a checker script that scans page metadata tags, imports the canonical mixed-case allowlist from monthly_bookmarks.py, and exits non-zero on mismatches.
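
The core of such a checker might be sketched as below. This version assumes a tag is a mismatch when it differs from its canonical spelling (allowlisted form, otherwise lowercase); front-matter parsing is left out because the page metadata format is not shown here, and the allowlist is inlined rather than imported from monthly_bookmarks.py. All names are hypothetical.

```python
# Assumption: inlined copy of the allowlist; the real checker imports it instead.
MIXED_CASE_TAGS = ("KC", "HTML", "RChain", "URI")


def tag_mismatches(tags, allowlist=MIXED_CASE_TAGS):
    """Return (found, expected) pairs for tags that are not in canonical form."""
    canonical = {tag.lower(): tag for tag in allowlist}
    mismatches = []
    for tag in tags:
        expected = canonical.get(tag.lower(), tag.lower())
        if tag != expected:
            mismatches.append((tag, expected))
    return mismatches


def check_pages(pages):
    """Report mismatches for a {path: [tags]} mapping and return an exit status."""
    status = 0
    for path, tags in sorted(pages.items()):
        for found, expected in tag_mismatches(tags):
            print(f"{path}: tag {found!r} should be {expected!r}")
            status = 1
    return status
```

A script entry point would pass the return value of `check_pages` to `sys.exit`, so that any mismatch produces a non-zero exit code suitable for CI or a pre-commit hook.
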
Exclude bookmark roundup pages from homepage post selection and homepage tag counts, and apply list-view presentation changes so roundup entries are visibly de-emphasized elsewhere.
Record initial resolved t.co URL mappings for cache-only bookmark rendering and repeatable link expansion workflows.
Restore the local shared-bookmarks ndjson export with the AWS-signin credential token redacted to satisfy push protection.
dckc changed the title from "Import and curate Diigo shared bookmarks with cache-only URL expansion" to "Import Diigo shared bookmarks and add curation/rendering pipeline" on Mar 6, 2026
dckc added 4 commits March 6, 2026 17:47
Create a converter that initializes a fresh Zotero DB from official source schema SQL and imports shared Diigo bookmarks as webpage items with URL/title/tags plus annotation notes.
- Why:
  - Keep a local, reviewable source snapshot for DB bootstrap fallback.
- What:
  - Add licensing notices.
  - Add only required schema SQL files.
- Scope:
  - No runtime behavior changes yet; data conversion logic remains separate.
- Schema compatibility:
  - Resolve item type and field IDs by name from the target DB.
  - Stop using hard-coded numeric IDs that mismatched Zotero 8.
- Bootstrap path:
  - Add --template-db support (defaulting to zero-items profile DB).
  - Copy template DB before import and assert it is empty (items=0).
  - Keep vendored SQL init as fallback when no template DB is found.
- Import behavior:
  - Create imported parent items as webpage, not dictionaryEntry.
  - Preserve updated_at into dateModified/clientDateModified.
  - Keep Diigo annotations as child notes and include import/readlater tags.
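
Resolving IDs by name rather than hard-coding them could look like the following sketch. The table and column names (`itemTypes.typeName`, `fields.fieldName`) reflect the Zotero client schema as I understand it and should be checked against the target database; the function names are hypothetical.

```python
import sqlite3


def item_type_id(conn, type_name):
    """Look up an item type ID by name in the target DB instead of hard-coding it."""
    row = conn.execute(
        "SELECT itemTypeID FROM itemTypes WHERE typeName = ?", (type_name,)
    ).fetchone()
    if row is None:
        raise LookupError(f"unknown item type: {type_name}")
    return row[0]


def field_id(conn, field_name):
    """Look up a field ID by name in the target DB."""
    row = conn.execute(
        "SELECT fieldID FROM fields WHERE fieldName = ?", (field_name,)
    ).fetchone()
    if row is None:
        raise LookupError(f"unknown field: {field_name}")
    return row[0]
```

Failing loudly on an unknown name means a schema mismatch surfaces at import time, instead of silently producing items of the wrong type.
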
- Input:
  - projects/diigo-bak/diigo-bookmarks-shared.ndjson
- Baseline:
  - ~/Zotero zero-items/zotero.sqlite (Zotero 8 profile seed)
- Output:
  - projects/diigo-bak/diigo-zotero-vendored.sqlite
- Result:
  - Collection items imported as webpage type with Zotero 8-compatible schema rows.
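
The copy-then-verify bootstrap from a baseline database can be sketched as below. The function name, its return convention, and the choice of exception are assumptions; the only details taken from the commit messages are that the template is copied before import, that it must contain zero items, and that vendored SQL is the fallback when no template is found.

```python
import shutil
import sqlite3
from pathlib import Path


def bootstrap_from_template(template_db, output_db):
    """Copy the template DB to the output path and check that it holds no items.

    Returns False when the template is missing, so the caller can fall back
    to initializing the database from the vendored schema SQL.
    """
    template_db, output_db = Path(template_db), Path(output_db)
    if not template_db.exists():
        return False
    shutil.copyfile(template_db, output_db)
    conn = sqlite3.connect(output_db)
    try:
        (count,) = conn.execute("SELECT COUNT(*) FROM items").fetchone()
    finally:
        conn.close()
    if count != 0:
        raise ValueError(f"template DB is not empty: items={count}")
    return True
```

Checking the copy rather than the template keeps the original profile database untouched and read-only from the converter's point of view.
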
dckc force-pushed the diigo-shared-monthly branch from e07b195 to db575f4 on March 7, 2026 18:59
dckc force-pushed the master branch 2 times, most recently from ef5eb36 to 4630db1 on April 22, 2026 14:26

Development

Successfully merging this pull request may close these issues.

syndicate diigo bookmarks on madmode.com
