Data Model

Bruce D'Ambrosio edited this page Nov 23, 2025 · 5 revisions

Reasoning needs a working memory. Cognitive scientists have many elaborate models of working memory that have influenced the design of LLM cognitive agents, but I get confused trying to understand how they all fit together. So.

Cognitive Workbench currently implements four layers at which it can work with memory.

Base layer: pieces of text, called Notes, and sets of these, called Collections.

These come with basic CRUD and set-theoretic operations: create-note, create-collection, add (a Note or Collection to a Collection), remove (a Note from a Collection), union, set-difference, intersection, is-empty, size, etc. These primitives operate structurally on Notes and Collections; they do not inspect Note content (well, trivially, is-empty and size do). Except for a few primitives (e.g. is-empty), every primitive and tool outputs either a Note or a Collection.
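The base layer can be sketched as follows. This is a minimal illustration, not the actual Cognitive Workbench implementation: the class names mirror the document, but the constructor and method signatures are assumptions.

```python
# Sketch of the base layer: Notes, Collections, and structural primitives.
# Names and signatures are illustrative, not the real API.
from dataclasses import dataclass, field
import itertools

_ids = itertools.count(1)  # toy stand-in for real Note IDs

@dataclass(frozen=True)
class Note:
    text: str
    id: int = field(default_factory=lambda: next(_ids))

class Collection:
    def __init__(self, items=()):
        self.items = set(items)            # Notes (or nested Collections)

    def add(self, item):                   # add a Note or Collection
        self.items.add(item)
        return self

    def remove(self, note):                # remove a Note
        self.items.discard(note)
        return self

    def union(self, other):
        return Collection(self.items | other.items)

    def intersection(self, other):
        return Collection(self.items & other.items)

    def difference(self, other):
        return Collection(self.items - other.items)

    def is_empty(self):                    # one of the few content-adjacent peeks
        return len(self.items) == 0

    def size(self):
        return len(self.items)
```

Note that every operation here returns a Note or a Collection (or, for is-empty and size, a scalar), matching the closure property described above.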

The next layer up is operations on content that assume no format other than raw text.

These include index and search. All Notes are indexed (embedded) and are searchable using search-notes. Similarly, all Collections can be searched via search-collections. A particular Collection can be indexed using index and then searched using search-within-collection with that Collection as the target. Large Notes are segmented and indexed by segment.

Search primitives:

  • search-notes: Global discovery across all Notes (no target needed).
  • search-collections: Global discovery across all Collections (no target needed).
  • search-within-collection: Search within a specific indexed Collection (requires a target Collection, which must be indexed first).

All search primitives return a Collection of structured Notes matching the query-web/semantic-scholar format:

  • text: First paragraph preview (200 chars max)
  • format: "text" or "json"
  • metadata.source_id: Original Note/Collection ID
  • metadata.uri: URI field (Note/Collection ID or extracted URI from source)
  • metadata.score: Search relevance score (0.0-1.0)
  • metadata.type: "Note" or "Collection"
  • char_count: Length of text preview
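To make the result shape concrete, here is a toy sketch of a search hit in the format above. The helper names and the naive term-overlap scoring are assumptions for illustration; the real system scores with embeddings.

```python
# Sketch of the structured Note a search primitive returns.
# make_search_hit and search_notes are illustrative helpers, not the real API.
def make_search_hit(note_text, source_id, uri, score, hit_type):
    preview = note_text.split("\n\n")[0][:200]   # first paragraph, 200 chars max
    return {
        "text": preview,
        "format": "text",
        "metadata": {
            "source_id": source_id,              # original Note/Collection ID
            "uri": uri,
            "score": round(score, 3),            # relevance in 0.0-1.0
            "type": hit_type,                    # "Note" or "Collection"
        },
        "char_count": len(preview),
    }

def search_notes(query, notes):
    """Toy stand-in for search-notes: ranks by naive term overlap."""
    hits = []
    for note_id, text in notes.items():
        terms = query.lower().split()
        score = sum(t in text.lower() for t in terms) / max(len(terms), 1)
        if score > 0:
            hits.append(make_search_hit(text, note_id, note_id, score, "Note"))
    return sorted(hits, key=lambda h: h["metadata"]["score"], reverse=True)
```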

Above that come operations that assume JSON Note content and provide minimal SQL-like content operations.

SQL-like Collection Operations (require dict/JSON Notes):

  • project: Extract specific fields (SELECT columns) → new Collection with subset of fields
  • pluck: Extract single field as simple values → Collection of values
  • filter-structured: Filter by field conditions (WHERE clauses) → filtered Collection
  • sort: Sort by field value (ORDER BY) → sorted Collection
  • join: Combine two Collections on matching field (INNER JOIN) → merged Collection
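The five operations above can be sketched over plain dicts. Function names mirror the primitives; the signatures (e.g. taking a predicate for filter-structured) are assumptions, and the real tools operate on Collections of JSON Notes rather than Python lists.

```python
# Sketch of the SQL-like operations, modeling JSON Notes as dicts in a list.
def project(coll, fields):                       # SELECT columns
    return [{k: row[k] for k in fields if k in row} for row in coll]

def pluck(coll, field):                          # single field -> simple values
    return [row[field] for row in coll if field in row]

def filter_structured(coll, pred):               # WHERE clause as a predicate
    return [row for row in coll if pred(row)]

def sort_by(coll, field, descending=False):      # ORDER BY
    return sorted(coll, key=lambda r: r[field], reverse=descending)

def join(left, right, on):                       # INNER JOIN on matching field
    index = {}
    for row in right:
        index.setdefault(row[on], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[on], [])]
```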

Use cases:

  • project: Extract URLs from search results, get title+year from papers
  • pluck: Get just titles as simple list, extract scores for analysis
  • filter-structured: Papers after 2020, results with score>0.5, venue contains "NeurIPS"
  • sort: Rank by score (descending), chronological by year, alphabetical by title
  • join: Merge papers with citation data, combine user info with profiles
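These use cases chain naturally. The following stand-alone sketch combines three of them on toy data (filter papers after 2020, sort by score descending, pluck titles), written with plain list operations so it runs on its own; the real primitives would take and return Collections.

```python
# Illustrative pipeline: filter-structured, then sort, then pluck.
papers = [
    {"title": "Alpha", "year": 2019, "score": 0.9},
    {"title": "Beta",  "year": 2021, "score": 0.4},
    {"title": "Gamma", "year": 2022, "score": 0.8},
]

recent = [p for p in papers if p["year"] > 2020]                  # filter-structured
ranked = sorted(recent, key=lambda p: p["score"], reverse=True)   # sort (descending)
titles = [p["title"] for p in ranked]                             # pluck
# titles == ["Gamma", "Beta"]
```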
