
Commit ceb9d5e

committed
README.md - some minor fixes + word wrap
1 parent e176165 commit ceb9d5e

File tree

1 file changed

+47 -13 lines changed


README.md

Lines changed: 47 additions & 13 deletions
@@ -4,13 +4,32 @@
 
 ## About
 
-ZSON is a PostgreSQL extension for transparent JSONB compression. Compression is based on a shared dictionary of strings most frequently used in specific JSONB documents (not only keys, but also values, array elements, etc).
+ZSON is a PostgreSQL extension for transparent JSONB compression. Compression is
+based on a shared dictionary of strings most frequently used in specific JSONB
+documents (not only keys, but also values, array elements, etc).
 
-In some cases ZSON can save half of your disk space and give you about 10% more TPS. Memory is saved as well. See [docs/benchmark.md](docs/benchmark.md). Everything depends on your data and workload though. Don't believe any benchmarks, re-check everything on your data, configuration, hardware, workload and PostgreSQL version.
+In some cases ZSON can save half of your disk space and give you about 10% more
+TPS. Memory is saved as well. See [docs/benchmark.md](docs/benchmark.md).
+Everything depends on your data and workload, though. Don't believe any
+benchmarks, re-check everything on your data, configuration, hardware, workload
+and PostgreSQL version.
 
-ZSON was originally created in 2016 by [Postgres Professional](https://postgrespro.ru/) team: researched and coded by [Aleksander Alekseev](http://eax.me/); ideas, code review, testing, etc by [Alexander Korotkov](http://akorotkov.github.io/) and [Teodor Sigaev](http://www.sigaev.ru/).
+ZSON was originally created in 2016 by [Postgres Professional][pgpro] team:
+researched and coded by [Aleksander Alekseev][me]; ideas, code review, testing,
+etc by [Alexander Korotkov][ak] and [Teodor Sigaev][ts].
 
-See also discussions on [pgsql-general@](https://www.postgresql.org/message-id/flat/20160930185801.38654a1c%40e754), [Hacker News](https://news.ycombinator.com/item?id=12633486), [Reddit](https://www.reddit.com/r/PostgreSQL/comments/55mr4r/zson_postgresql_extension_for_transparent_jsonb/) and [HabraHabr](https://habrahabr.ru/company/postgrespro/blog/312006/).
+[me]: http://eax.me/
+[ak]: http://akorotkov.github.io/
+[ts]: http://www.sigaev.ru/
+[pgpro]: https://postgrespro.ru/
+
+See also discussions on [pgsql-general@][gen], [Hacker News][hn], [Reddit][rd]
+and [HabraHabr][habr].
+
+[gen]: https://www.postgresql.org/message-id/flat/20160930185801.38654a1c%40e754
+[hn]: https://news.ycombinator.com/item?id=12633486
+[rd]: https://www.reddit.com/r/PostgreSQL/comments/55mr4r/zson_postgresql_extension_for_transparent_jsonb/
+[habr]: https://habrahabr.ru/company/postgrespro/blog/312006/
 
 ## Install
 
@@ -75,15 +94,20 @@ Example:
 select zson_learn('{{"table1", "col1"}, {"table2", "col2"}}');
 ```
 
-You can create a temporary table and write some common JSONB documents to it manually or use existing tables. The idea is to provide a subset of real data. Lets say some document *type* is twice as frequent as some other document type. ZSON expects that there will be twice as many documents of the first type as those of the second one in a learning set.
+You can create a temporary table and write some common JSONB documents into it
+manually or use the existing tables. The idea is to provide a subset of real
+data. Let's say some document *type* is twice as frequent as another document
+type. ZSON expects that there will be twice as many documents of the first type
+as those of the second one in a learning set.
 
 Resulting dictionary could be examined using this query:
 
 ```
 select * from zson_dict;
 ```
 
-Now ZSON type could be used as a complete and transparent replacement of JSONB type:
+Now ZSON type could be used as a complete and transparent replacement of JSONB
+type:
 
 ```
 zson_test=# create table zson_example(x zson);
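
As a concrete illustration of the learning-set paragraph in the hunk above (an editorial sketch, not part of this commit): the table `doc_sample`, its column `x` and the sample documents are hypothetical, and a real learning set would contain far more rows, with document types in roughly the same proportions as in production.

```
-- Hypothetical learning set: a temporary table holding a representative
-- sample of real JSONB documents.
create temporary table doc_sample(x jsonb);

insert into doc_sample(x) values
  ('{"type": "user",  "name": "alice", "active": true}'),
  ('{"type": "user",  "name": "bob",   "active": false}'),
  ('{"type": "order", "status": "shipped", "items": ["aaa", "bbb"]}');

-- Build the shared dictionary from the sample; the argument format
-- ('{{"table", "column"}, ...}') is the one shown in the README above.
select zson_learn('{{"doc_sample", "x"}}');

-- Examine the resulting dictionary.
select * from zson_dict;
```
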
@@ -99,15 +123,20 @@ zson_test=# select x -> 'aaa' from zson_example;
 
 ## Migrating to a new dictionary
 
-When schema of JSONB documents evolve ZSON could be *re-learned*:
+When a schema of JSONB documents evolves ZSON could be *re-learned*:
 
 ```
 select zson_learn('{{"table1", "col1"}, {"table2", "col2"}}');
 ```
 
-This time *second* dictionary will be created. Dictionaries are cached in memory so it will take about a minute before ZSON realizes that there is a new dictionary. After that old documents will be decompressed using the old dictionary and new documents will be compressed and decompressed using the new dictionary.
+This time *second* dictionary will be created. Dictionaries are cached in memory
+so it will take about a minute before ZSON realizes that there is a new
+dictionary. After that old documents will be decompressed using the old
+dictionary and new documents will be compressed and decompressed using the new
+dictionary.
 
-To find out which dictionary is used for a given ZSON document use zson\_info procedure:
+To find out which dictionary is used for a given ZSON document use zson\_info
+procedure:
 
 ```
 zson_test=# select zson_info(x) from test_compress where id = 1;
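
For rows written before a re-learn, one possible way to move them onto the newest dictionary is to force a decompress/compress round trip. This is only a sketch, not something the README above documents: it assumes the usual casts between `zson` and `jsonb` are available, reuses the `zson_example` table from the earlier example, and should be verified on a copy of the data first.

```
-- Hypothetical migration: rewrite existing rows so they are recompressed
-- with the newest dictionary (assumes zson <-> jsonb casts exist).
update zson_example set x = x::jsonb::zson;

-- Check which dictionary each document uses afterwards.
select zson_info(x) from zson_example;
```
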
@@ -119,13 +148,15 @@ zson_test=# select zson_info(x) from test_compress where id = 2;
 zson_info | zson version = 0, dict version = 0, ...
 ```
 
-If **all** ZSON documents are migrated to the new dictionary the old one could be safely removed:
+If **all** ZSON documents are migrated to the new dictionary the old one could
+be safely removed:
 
 ```
 delete from zson_dict where dict_id = 0;
 ```
 
-In general it's safer to keep old dictionaries just in case. Gaining a few KB of disk space is not worth the risk of losing data.
+In general, it's safer to keep old dictionaries just in case. Gaining a few KB
+of disk space is not worth the risk of losing data.
 
 ## When it's a time to re-learn?
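
Before running the `delete from zson_dict where dict_id = 0` shown above, it may be worth confirming that no document still references dictionary 0. A rough, purely illustrative check that relies only on the textual `zson_info` output quoted above (using the `test_compress` table from the earlier example):

```
-- Count documents whose zson_info still reports the old dictionary.
-- The string match depends on the "dict version = 0, ..." output format
-- shown above; adjust it if your version prints something different.
select count(*)
from test_compress
where zson_info(x)::text like '%dict version = 0,%';
```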

@@ -137,7 +168,10 @@ A good heuristic could be:
 select pg_table_size('tt') / (select count(*) from tt)
 ```
 
-... i.e. average document size. When it suddenly starts to grow it's time to re-learn.
+... i.e. average document size. When it suddenly starts to grow it's time to
+re-learn.
 
-However, developers usually know when they change a schema significantly. It's also easy to re-check whether current schema differs a lot from the original using zson\_dict table.
+However, developers usually know when they change a schema significantly. It's
+also easy to re-check whether the current schema differs a lot from the original
+one using zson\_dict table.
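
One way to act on this heuristic (again only a sketch: the `zson_size_log` table is made up, and `tt` is the example table from the query above) is to record the average document size periodically and re-learn once it starts creeping up:

```
-- Hypothetical log table for the average-document-size heuristic.
create table if not exists zson_size_log(
    measured_at timestamptz not null default now(),
    avg_bytes   bigint      not null
);

-- Run periodically (e.g. from cron); divides by zero on an empty table,
-- so guard accordingly in real use.
insert into zson_size_log(avg_bytes)
select pg_table_size('tt') / (select count(*) from tt);

-- Recent trend: sudden growth here suggests it is time to re-learn.
select * from zson_size_log order by measured_at desc limit 10;
```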
