| 1 | +--- |
| 2 | +title: Document Graph |
| 3 | +date : 2021-Dec-17 |
| 4 | +author : ypp34 |
| 5 | +--- |
| 6 | + |
| 7 | +# Introduction |
| 8 | + |
| 9 | +Document Graph is a flexible persistence and modeling framework for blockchain applications. |
| 10 | + |
| 11 | +# Use Cases |
| 12 | + |
| 13 | +Document Graph currently supports several use cases. The real-world examples below are ones we use it for ourselves. Notably, all of them can be handled within the same graph, where nodes can be linked to each other across contexts. |
| 14 | + |
| 15 | +## DAOs |
| 16 | +DAOs have descriptive attributes. They also have members, proposals, assignments, quests, bounties, badges, and many other data types that are maintained to support a fully functional platform. |
| 17 | + |
| 18 | +Here's an actual example of how Document Graph is used to manage a few components within a DAO's information space. |
| 19 | + |
| 20 | + |
| 21 | + |
| 22 | +## Supply Chain |
| 23 | +We use Document Graph for a supply chain traceability application. An ```Item``` and its various ```Checkpoints``` are saved as ```documents```. When a new QR code ```scan``` event occurs, it extends the graph for that item to contain a new ```Checkpoint```. |
| 24 | + |
| 25 | +Since a ```Checkpoint``` is a document, it can store any arbitrary data that describes that event, such as weight, longitude, latitude, cost, and even photos or other files such as packing lists. Files are stored in IPFS and anchored into the on-chain document. |
| 26 | + |
| 27 | +Users have the option to save data directly onto the chain (for access within smart contracts) or to save the data to IPFS. Users also have the option to encrypt the data with a password of their choice. |
| 28 | + |
| 29 | +{{< plantuml id="eg" >}} |
| 30 | +@startuml |
| 31 | +digraph G { |
| 32 | + rankdir=LR; |
| 33 | + Item -> Checkpoint_1 [label="scan"]; |
| 34 | + Item -> Checkpoint_2 [label="scan"]; |
| 35 | + Item -> Checkpoint_3 [label="scan"]; |
| 36 | +} |
| 37 | +@enduml{{< /plantuml >}} |
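The scan flow in the diagram above can be sketched in a few lines of Rust. This is a minimal in-memory illustration, not the on-chain implementation: the `Graph`, `Document`, and `Edge` types, the `record_scan` function, and the sample field names are all assumptions made for this example.

```rust
use std::collections::BTreeMap;

/// A document: an ID plus arbitrary labeled data (strings only, for brevity).
#[derive(Debug)]
struct Document {
    id: u64,
    kind: &'static str,
    data: BTreeMap<String, String>,
}

/// A directed, labeled edge between two documents.
#[derive(Debug, PartialEq)]
struct Edge {
    from: u64,
    to: u64,
    label: &'static str,
}

/// Hypothetical in-memory graph; the real graph lives on-chain.
#[derive(Default)]
struct Graph {
    documents: Vec<Document>,
    edges: Vec<Edge>,
}

impl Graph {
    /// Stores a new document and returns its ID.
    fn add_document(&mut self, kind: &'static str, data: &[(&str, &str)]) -> u64 {
        let id = self.documents.len() as u64 + 1;
        let data = data
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        self.documents.push(Document { id, kind, data });
        id
    }

    /// Handles a QR code scan: stores a new Checkpoint and links it to the item.
    fn record_scan(&mut self, item: u64, data: &[(&str, &str)]) -> u64 {
        let checkpoint = self.add_document("Checkpoint", data);
        self.edges.push(Edge { from: item, to: checkpoint, label: "scan" });
        checkpoint
    }

    /// Follows every "scan" edge that starts at the item.
    fn checkpoints_of(&self, item: u64) -> Vec<&Document> {
        self.edges
            .iter()
            .filter(|e| e.from == item && e.label == "scan")
            .filter_map(|e| self.documents.iter().find(|d| d.id == e.to))
            .collect()
    }
}

fn main() {
    let mut graph = Graph::default();
    let item = graph.add_document("Item", &[("name", "coffee lot")]);
    graph.record_scan(item, &[("weight_kg", "60"), ("latitude", "4.61")]);
    graph.record_scan(item, &[("packing_list_cid", "<IPFS CID>")]);
    for checkpoint in graph.checkpoints_of(item) {
        println!("{} {} {:?}", checkpoint.kind, checkpoint.id, checkpoint.data);
    }
}
```

Each scan only appends a document and an edge, so the item's history grows without modifying earlier checkpoints.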
| 38 | + |
| 39 | +## Accounting |
| 40 | +We also use the Document Graph to store data required to operate a double-entry accounting system. This system allows users to generate classical accounting reports such as Balance Sheet and Income Statement from blockchain transactions. |
| 41 | + |
| 42 | +Accountants must be able to tag transactions with additional data, references, files, etc., and they must also be able to assign a general ledger account to appropriately calculate debits and credits. |
| 43 | + |
| 44 | +# Features |
| 45 | + |
| 46 | +## End-user Accessibility |
| 47 | +The Document Graph Explorer allows any user with a blockchain account to create and edit content that they own within the graph, and to create edges between any two nodes in the graph. |
| 48 | + |
| 49 | +This level of accessibility to non-technical users is unprecedented. It allows them to collaboratively create and connect content with all of the benefits of blockchain. This level of capability was previously only accessible to highly technical engineers. |
| 50 | + |
| 51 | +## File and IPFS Integration |
| 52 | +Document Graph has integrated support for storing data or files within IPFS and saving that file’s CID (hash) within the document. |
| 53 | + |
| 54 | +## Encryption Support |
| 55 | +Document Graph Explorer supports encryption of a specific content item’s value. In DGE, the user is prompted to enter a password that is used for symmetric AES encryption. This secret simply encrypts the value and the ciphertext is persisted in the document. |
| 56 | + |
| 57 | +In a future release, we may integrate with the [Khala/Phala](https://phala.network/en/) confidential blockchain. Also, we are evaluating integration with [PAD](https://www.pad.tech/) as a trustless way to share the secret in a manner that alerts the owner when the secret is accessed. This is useful for cases such as one-time decryption and “in case of emergency” access. |
| 58 | + |
| 59 | +## GraphQL Caching |
| 60 | +Document Graph supports easy integration with [DGraph](https://dgraph.io), an open source distributed graph engine. The document graph cache listens for new blocks, and upon finalization, updates the DGraph graph to reflect any updates to the on-chain graph. DGraph has excellent tooling and ergonomics for querying, custom types or schemas, data visualizer, full-text search, and much more. |
| 61 | + |
| 62 | +## Composable SDK Experience (CLI) |
| 63 | +Document Graph is built to be highly composable and also support an ergonomic developer experience. In addition to the Document Graph Explorer web application, there is a CLI (written in Go) that supports all of the create, read, update, and delete operations directly against the blockchain. |
| 64 | + |
| 65 | +## Plugin Architecture for Custom Renderers and Editors |
| 66 | +Developers can include special fields within their document to enable custom viewers or editors. For example, if the field “preferred_renderer” or “preferred_editor” is populated with an endpoint, this endpoint will be used to render or edit the document. |
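A minimal sketch of how a client might resolve the renderer for a document, assuming the document's fields are available as a simple string map. The `renderer_for` function, the `DEFAULT_RENDERER` fallback, and the example URLs are hypothetical; only the `preferred_renderer` field name comes from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical fallback endpoint; not part of Document Graph.
const DEFAULT_RENDERER: &str = "https://example.org/default-renderer";

/// Returns the endpoint that should render the document: the value of the
/// `preferred_renderer` field when populated, otherwise the fallback.
fn renderer_for(fields: &HashMap<String, String>) -> &str {
    fields
        .get("preferred_renderer")
        .map(String::as_str)
        .filter(|endpoint| !endpoint.trim().is_empty())
        .unwrap_or(DEFAULT_RENDERER)
}

fn main() {
    let mut doc = HashMap::new();
    doc.insert("title".to_string(), "Quarterly report".to_string());
    // No preferred renderer yet, so the fallback is used.
    println!("{}", renderer_for(&doc));

    doc.insert(
        "preferred_renderer".to_string(),
        "https://example.org/report-viewer".to_string(),
    );
    // The populated field now takes precedence.
    println!("{}", renderer_for(&doc));
}
```

The same lookup would apply to `preferred_editor`; treating a blank value as "not populated" is a design choice of this sketch.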
| 67 | + |
| 68 | + |
| 69 | +# Background of Semantic Triples |
| 70 | +Databases typically store data in a tabular format, where a table has a set of columns and rows, like a spreadsheet. |
| 71 | + |
| 72 | +However, IRL, we tend to think in statements of ```Subject```, ```Predicate``` (```Verb```), and ```Object```. This is particularly true in the English language where word order is subject, verb, object (SVO). This information construct is known as ```triples```, or [semantic triples](https://en.wikipedia.org/wiki/Semantic_triple). The [W3C Resource Description Framework](https://www.w3.org/RDF/) is a leading specification for organizing these statements into semantic triples. |
| 73 | + |
| 74 | +As Lenhert describes [here](https://www.synaptica.com/triples-triads-and-semantics/): |
| 75 | + |
| 76 | +> "One of the fundamental benefits of these RDF-based controlled vocabulary structures are their dual nature as being both human intelligible as well as machine readable (and, therefore, portable). Like their basic constituent subject-verb-object basis, ontologies convey rich meaning in relatively simple and compact statements, making them extremely useful in modeling complex knowledge environments." |
| 77 | + |
| 78 | +The following diagrams express semantic triples in various terms. |
| 79 | +### Using the natural English [SVO](https://en.wikipedia.org/wiki/Subject%E2%80%93verb%E2%80%93object) terms |
| 80 | + |
| 81 | +{{< plantuml id="dg1" >}} |
| 82 | +@startuml |
| 83 | +digraph G { |
| 84 | + rankdir=LR; |
| 85 | + Subject -> Object [label="Verb"]; |
| 86 | +} |
| 87 | +@enduml{{< /plantuml >}} |
| 88 | + |
| 89 | + |
| 90 | +### Using common [graph theory](https://en.wikipedia.org/wiki/Vertex_(graph_theory)) terms |
| 91 | + |
| 92 | +{{< plantuml id="dg2" >}} |
| 93 | +@startuml |
| 94 | +digraph G { |
| 95 | + rankdir=LR; |
| 96 | + Vertex1 -> Vertex2 [label="Edge"]; |
| 97 | +} |
| 98 | +@enduml{{< /plantuml >}} |
| 99 | + |
| 100 | + |
| 101 | +### Using the Document Graph terms |
| 102 | + |
| 103 | +{{< plantuml id="dg3" >}} |
| 104 | +@startuml |
| 105 | +digraph G { |
| 106 | + rankdir=LR; |
| 107 | + Document1 -> Document2 [label="Edge"]; |
| 108 | +} |
| 109 | +@enduml{{< /plantuml >}} |
| 110 | + |
| 111 | + |
| 112 | +Note that edges are both directional and labeled. |
| 113 | + |
| 114 | + |
| 115 | +### Simple Example of a Semantic Triple |
| 116 | + |
| 117 | +{{< plantuml id="dg4" >}} |
| 118 | +@startuml |
| 119 | +digraph G { |
| 120 | + rankdir=LR; |
| 121 | + Document1 -> Document2 [label="Edge"]; |
| 122 | +} |
| 123 | +@enduml{{< /plantuml >}} |
| 124 | + |
| 125 | + |
| 126 | +## Blockchain Change Management |
| 127 | +At their core, blockchains are used to manage data and state changes on that data. The structure of this data is nearly always based on a custom-built struct within a pallet or smart contract. Applications built on blockchains are typically written for a specific smart contract’s data structure, so the application is tightly coupled to that data structure. When a data structure changes, it typically requires a complex migration of existing data and source code within the smart contract, and possibly changes to other backend or service layers and to the application itself. |
| 128 | + |
| 129 | +For example, consider the following simplified struct within a smart contract. Upon approval, the amount would be sent to the account indicated. |
| 130 | + |
| 131 | +```rust |
| 132 | +pub struct Proposal { |
| 133 | +    pub label: String, // owned String; a bare `str` field is unsized and would not compile here |
| 134 | +    pub account: AccountId, // AccountId is assumed to be provided by the chain's runtime |
| 135 | +    pub amount: u32, |
| 136 | +} |
| 137 | +``` |
| 138 | + |
| 139 | +Then, let's say the DAO wants to add support for recurring payments. This would need to be supported within the data model by adding a time period to use as the recurring period and the total number of payments. |
| 140 | + |
| 141 | +```rust |
| 142 | +pub struct Proposal { |
| 143 | +    pub label: String, |
| 144 | +    pub account: AccountId, |
| 145 | +    pub amount: u32, |
| 146 | +    pub time_period_seconds: u32, // new: length of one recurring period |
| 147 | +    pub payment_count: u32, // new: total number of payments |
| 148 | +} |
| 149 | +``` |
| 150 | + |
| 151 | +Depending on the blockchain, this change management event requires a series of gymnastics by developers to maintain a reasonable user experience. For example, on non-upgradable platforms, it may require clients to query from both structs, or if upgradable, may require erasing data from the original data store and recreating it (losing metadata in the process). Some may require maintaining both structures and adding a roll-up of sorts to maintain client backwards compatibility. |
| 152 | + |
| 153 | +All of these complications create a significant change management burden on software developers. |
| 154 | + |
| 155 | +### Document Graph Solves This |
| 156 | +Document Graph alleviates the need for developers to change their persistence layer when making upgrades to their data model. |
| 157 | + |
| 158 | +This is achieved by supporting flexible variant pairs within a consistently structured ```document``` type and support for an ```edge``` type used to link the graph's documents in meaningful ways. |
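To illustrate the idea, here is a minimal sketch of the proposal from the previous section stored as labeled variant pairs. The `Value` enum, the `Document` helper methods, and the sample accounts are assumptions made for this example rather than the actual Document Graph API. The point is that the recurring-payment fields are added as data, while the `Document` type itself does not change.

```rust
/// A value that can hold one of a few types (a subset, for illustration).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Text(String),
    Int(i64),
}

/// A document is a list of labeled values; its shape never changes.
#[derive(Debug, Default)]
struct Document {
    content: Vec<(String, Value)>,
}

impl Document {
    /// Appends a labeled value and returns the document for chaining.
    fn with(mut self, label: &str, value: Value) -> Self {
        self.content.push((label.to_string(), value));
        self
    }

    /// Looks up the first value stored under the label, if any.
    fn get(&self, label: &str) -> Option<&Value> {
        self.content
            .iter()
            .find(|(l, _)| l.as_str() == label)
            .map(|(_, v)| v)
    }
}

/// Number of payments a proposal makes; one-off proposals omit the label.
fn payment_count(proposal: &Document) -> i64 {
    match proposal.get("payment_count") {
        Some(Value::Int(n)) => *n,
        _ => 1,
    }
}

fn main() {
    // The original proposal shape.
    let one_off = Document::default()
        .with("label", Value::Text("Fund the audit".into()))
        .with("account", Value::Text("alice".into()))
        .with("amount", Value::Int(500));

    // The upgraded shape: two extra pairs, same Document type.
    let recurring = Document::default()
        .with("label", Value::Text("Monthly hosting".into()))
        .with("account", Value::Text("bob".into()))
        .with("amount", Value::Int(50))
        .with("time_period_seconds", Value::Int(2_592_000))
        .with("payment_count", Value::Int(12));

    for proposal in [&one_off, &recurring] {
        println!("{:?}: {} payment(s)", proposal.get("label"), payment_count(proposal));
    }
}
```

Clients that predate the upgrade keep reading the labels they know, and newer clients treat a missing `payment_count` as a single payment, so old and new proposals can coexist without a migration.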
| 159 | + |
| 160 | +# Specification |
| 161 | +## Document Data Model |
| 162 | + |
| 163 | +{{< plantuml id="dg5" >}} |
| 164 | +@startuml |
| 165 | +@startmindmap |
| 166 | +* Document |
| 167 | +** Header |
| 168 | +*** Hash |
| 169 | +*** Owner |
| 170 | +*** Type |
| 171 | +*** Created & Updated Dates |
| 172 | +** Content Group 1 |
| 173 | +*** Content Item 1 |
| 174 | +**** Key (label) |
| 175 | +**** Value |
| 176 | +*** Content Item n |
| 177 | +** Content Group n |
| 178 | +@endmindmap |
| 179 | +@enduml{{< /plantuml >}} |
| 180 | + |
| 181 | + |
| 182 | +Each document consists of the following: |
| 183 | + |
| 184 | +- Header |
| 185 | + - [required] creator (account) |
| 186 | + - [required] graph contract (where the document is saved) |
| 187 | + - [optional] created date (timepoint) |
| 188 | + - [optional] updated date (timepoint) |
| 189 | + - [optional] hash of content (not including certificates or header) |
| 190 | +- Content |
| 191 | + - Value (variant) = ```std::variant <asset, string, time_point, name, int64> ``` |
| 192 | + - Content = an optionally labeled FlexValue |
| 193 | + - Content Group = vector<Content> |
| 194 | + - Content Groups = vector<ContentGroup> |
| 195 | + - Each document contains a single instance of ContentGroups. |
| 196 | + - This provides enough flexibility to support: |
| 197 | + - data of all supported types, |
| 198 | + - short clauses of annotated text, |
| 199 | + - longer form sequenced data, e.g. chapters. |
| 200 | +- Certificates [optional] |
| 201 | + - Each document has ```0..n``` certificates. |
| 202 | + - Certificate |
| 203 | + - certifier: the 'signer' |
| 204 | + - notes: string data provided by signer |
| 205 | + - certification_date: time_point |
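The structure listed above could be modeled roughly as follows. This is a sketch in Rust under stated assumptions, not the contract's actual types: `asset`, `name`, and `time_point` are chain-specific types approximated here with plain Rust types, the field names are illustrative, and the content hash is left unset because the hashing scheme is not described in this section.

```rust
use std::time::SystemTime;

/// Stand-in for the spec's variant; chain-specific types are approximated.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum FlexValue {
    Asset { amount: i64, symbol: String },
    Text(String),
    TimePoint(SystemTime),
    Name(String),
    Int64(i64),
}

/// Content: an optionally labeled FlexValue.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Content {
    label: Option<String>,
    value: FlexValue,
}

type ContentGroup = Vec<Content>;
type ContentGroups = Vec<ContentGroup>;

#[allow(dead_code)]
#[derive(Debug)]
struct Header {
    creator: String,                   // required
    graph_contract: String,            // required
    created_date: Option<SystemTime>,  // optional
    updated_date: Option<SystemTime>,  // optional
    content_hash: Option<String>,      // optional
}

#[allow(dead_code)]
#[derive(Debug)]
struct Certificate {
    certifier: String,
    notes: String,
    certification_date: SystemTime,
}

#[derive(Debug)]
struct Document {
    header: Header,
    content_groups: ContentGroups, // a single instance per document
    certificates: Vec<Certificate>, // 0..n
}

impl Document {
    fn new(creator: &str, graph_contract: &str, content_groups: ContentGroups) -> Self {
        Document {
            header: Header {
                creator: creator.to_string(),
                graph_contract: graph_contract.to_string(),
                created_date: Some(SystemTime::now()),
                updated_date: None,
                content_hash: None, // hashing is out of scope for this sketch
            },
            content_groups,
            certificates: Vec::new(),
        }
    }

    /// Appends a certificate signed by `certifier`.
    fn certify(&mut self, certifier: &str, notes: &str) {
        self.certificates.push(Certificate {
            certifier: certifier.to_string(),
            notes: notes.to_string(),
            certification_date: SystemTime::now(),
        });
    }
}

fn main() {
    let details = vec![
        Content { label: Some("title".to_string()), value: FlexValue::Text("Fund the audit".to_string()) },
        Content { label: Some("amount".to_string()), value: FlexValue::Int64(500) },
    ];
    let clauses = vec![Content { label: None, value: FlexValue::Text("An unlabeled clause".to_string()) }];

    let mut doc = Document::new("alice", "docgraph", vec![details, clauses]);
    doc.certify("bob", "Reviewed and approved");
    println!(
        "{} group(s), {} certificate(s), created by {}",
        doc.content_groups.len(),
        doc.certificates.len(),
        doc.header.creator
    );
}
```

Keeping the label optional is what lets the same type carry both keyed data and sequenced, unlabeled text.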