-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using SQL based db as primary data storage (along regular files) #2259
Comments
for many features a database is absolutely required. for example, auto completion is strongly tied to syntax. whether human readable format or database each are good solutions with advantages and disadvantages. what matter s is acknowledging their limitations and taking advantage of either one. |
if you want edit markdown with external tools focus on this part, don't bloat it with features. if you want feature rich software use a database. |
To be honest, i don't think it will be implemented in boostnote. But at the moment, there is an ongoing discussion to use a database for notes in the new version of boostnote that is developed from scratch at the moment. Please feel free to a) help to implement it and b) join the discussion a) : https://github.com/BoostIO/Boost |
I think we could keep both. If any external process changes a note the app just picks it up and integrate into the database. By this file could be editable but we still gain from the use of database. (the usability of current filesystem storage by any external tooling is another topic to discuss - there open issues about it already) |
Hi. I started using boostnote because I like the fact I have "physical files" I can move about, and store in different devices. If I wanted a database I would just create one directly... On the other hand, early Evernote used a db for metadata IIRC while keeping the content in files, so perhaps that is a the best of both worlds? |
maybe not a good idea:
|
Considering the current "plaintext" cson format as something to read without Boostnote is far from reality. Neither the naming convention, nor the format is suitable for a human to work with. With real plain text with front-matter to store metadata AND and a practical file naming convention could make it useful for non-boostnote use. Incremental synchronization should not be an issue. As I mentioned it could be imagined as "primary" or "operational" storage. Where boostnote would be working. It would be a perfectly viable solution to "synchronize" between file system periodically and frequently. By this you could get the best of both solution. Plus imagine if you intend to make a "corporate" level document synchronization? just replace sqlite to some remote database solution and you got shared documents automatically. (keep in mind that locking a file while modifying is an issue today because the files could be modified in parallel - so having a reliable file level or db level locking is needed to implement anyway) |
CSON isn't ideal but I can 100% open them in Atom, VS Code, SED, AWK, Python or any other text based tool and manipulate them (global search and replace across all notes for example). The format is more suitable for humans to work with than some of the other file formats I work with at work every day. Remember this is a tool for programmers. I was using One Note on Mac at work and it got screwed up such that there are notes on my local drive in some database format that Microsoft will not disclose and can't fix that never got sync'd to the server (that is somehow broken now) so I'm having to copy out years of notes manually into Boost Note because One Note is holding them hostage in some database. I don't want to see a repeat of that dumpster fire ever again. I guess I could just use md files and a directory structure with Atom / VS Code if Boost Note is going to start requiring putting notes into a DB. More complexity, more points of failure, less reliability. |
Concepts
This issue is more like a discussion of an architectural change.
Summary: what is using sql based storages a primary/operational storage for the application.
Detail: Currently it is using the filesystem with CSON file format (CSON is not so widely supported). All operation on data is relying on file manipulation and every change what changes data structure and algorithm on data structure requires "handmade" code. As an example: if we would need a new relationship to implement between data it has to change storage format (what is obvious) and need to implement the corresponding algorithm (+ UI change, obviously, otherwise none of the improvement makes sense :) )
At first, I would consider using SQLite as embeddable, self-containing SQL storage engine.
The concept would be that from live, operational point of view using SQL based storage have several advantages (not necessarily in importance/relevancy):
(Remark: On the other hand, we should not ignore filesystem as secondary storage if the file system would be more human readable (neither file names nor CSON structure helps too much in it). I think a simple SQL file is much more easy to process with existing tooling.)
This concept could be found in ZimWiki. When starting up it is building an SQLite database and then it is used internally for efficiency.
Common use case examples
list of directories
list nodes in directories
list node with (any?) sort criteria
list all node with directoreis
counting nodes in folders
search by tags
as you could see it has endless flexibility in search and the heavy lifting is done by the sql engine.
snippets having java fragment
Let me demonstrate some potential improvements
storeing pre-compiled html
Although I did not notice any performance issue when rendering html from markdown I already noticed some delay on rendering diagrams (e.g. plantuml).
If the rendered html is "cached" in the system (including diagram images, etc) there would be no delay just show the html part.
Actually, it worth testing is there any issue of rendering large documents (as for a moment there is not such a thing as note hierarchy so there is no "natural" way of splitting larger writings - e.g into sections).
note cross reference
Once the note is prepared it could include the graph of cross-reference pages for easy navigation (e.g. "Related pages") based on links, tags, parent elements (wither tag, node or folder parent)
folder/note/tag hierarchy
This is the simplest implementation (for tags) to add but probably not so efficient when querying. There are other models (like closure table) with better query characteristics (and a little more complicated insert/update logic)
searching for a child of specific tag
ex: parent path (as nexted directory structure breadcrums)
ex: count by parent aggregate tag. aka sum of child counts
In general, the same logic could be applied to folders and notes too.
According to my investigation, the following issues could gain advantages from this implementation:
#tag_h1#tag_h2#tag_h3
) Hierarchy in tags (like#tag_h1#tag_h2#tag_h3
) #2099order by
is an easy win.Used SQL data structure:
For experimental purposes, I could provide a script what is converting boosnote files into this database structure.
The text was updated successfully, but these errors were encountered: