Commit f1b6bf6

add sphinx documentation setup
1 parent a84e016 commit f1b6bf6

233 files changed: +51110 -80 lines changed

Changelog.md

Lines changed: 3 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,3 +1,5 @@
1+
# Changelog
2+
13
▶️ *If you're having an issue with a breaking change, or migrating your data between versions, open an [issue](https://github.com/pirate/ArchiveBox/issues) to get help.*
24

35
**`ArchiveBox` was previously named `Pocket Archive Stream` and then `Bookmark Archiver`.**
@@ -88,4 +90,4 @@ See the [releases](https://github.com/pirate/ArchiveBox/releases) page for versi
8890
- added Pocket-format export support
8991

9092
---
91-
- v0.0.0 released: created Pocket Archive Stream 2017/05/05
93+
- v0.0.0 released: created Pocket Archive Stream 2017/05/05

Chromium-Install.md

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
1+
# Chromium Install
2+
13
By default, ArchiveBox looks for any existing installed version of Chrome/Chromium and uses it if found. You can optionally install a specific version and set the environment variable `CHROME_BINARY` to force ArchiveBox to use that one, e.g.:
24

35
- `CHROME_BINARY=google-chrome-beta`
@@ -6,7 +8,7 @@ By default, ArchiveBox looks for any existing installed version of Chrome/Chromi
68

79
If you don't already have Chrome installed, I recommend installing Chromium instead of Google Chrome, as it's the open-source fork of Chrome that doesn't send as much tracking data to Google.
810

9-
#### Check for existing Chrome/Chromium install
11+
**Check for existing Chrome/Chromium install:**
1012

1113
<img src="https://i.imgur.com/FxFoIMH.jpg" width="25%" align="right"/>
1214

@@ -48,4 +50,4 @@ apt install google-chrome-beta
4850

4951
## Troubleshooting
5052

51-
If you encounter problems setting up Google Chrome or Chromium, see the [Troubleshooting](https://github.com/pirate/ArchiveBox/wiki/Troubleshooting#chromiumgoogle-chrome) page.
53+
If you encounter problems setting up Google Chrome or Chromium, see the [Troubleshooting](https://github.com/pirate/ArchiveBox/wiki/Troubleshooting#chromiumgoogle-chrome) page.

Configuration.md

Lines changed: 2 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
1-
▶️ *The default ArchiveBox config file can be found here: [`etc/ArchiveBox.conf.default`](https://github.com/pirate/ArchiveBox/blob/master/etc/ArchiveBox.conf.default).*
1+
# Configuration
22

3+
▶️ *The default ArchiveBox config file can be found here: [`etc/ArchiveBox.conf.default`](https://github.com/pirate/ArchiveBox/blob/master/etc/ArchiveBox.conf.default).*
34

45
Configuration is done through environment variables. You can pass in settings using all the usual environment variable methods: e.g. by using the `env` command, exporting variables in your shell profile, or sourcing a `.env` file before running the command.
56

@@ -384,31 +385,3 @@ Path or name of the curl binary to use.
384385

385386

386387
<img src="https://i.imgur.com/almAbwK.png" width="100%"/>
387-
388-
---
389-
390-
391-
# Creating a Config File
392-
393-
*Note: If you're using Docker, see the [[Docker]] page for configuration instructions.*
394-
395-
To set up a persistent config:
396-
397-
1. Copy `etc/ArchiveBox.conf.default` to `~/.ArchiveBox.conf`
398-
```bash
399-
cp ArchiveBox/etc/ArchiveBox.conf.default ~/.ArchiveBox.conf
400-
```
401-
402-
2. Edit your options inside `~/.ArchiveBox.conf`, e.g.:
403-
```bash
404-
CHROME_BINARY=google-chrome-stable
405-
RESOLUTION=1440,900
406-
FETCH_PDF=False
407-
```
408-
409-
3. Source your config file when you run your archive script:
410-
```bash
411-
eval export $(grep -v '^#' ~/path/to/your/ArchiveBox.conf); ./archive https://example.com/rss/feed.xml
412-
```
413-
414-
Improving this process is on the roadmap, in future versions you'll be able to pass a config file directly to the archive command.

Docker.md

Lines changed: 16 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,6 @@
1-
# Overview
1+
# Docker
2+
3+
## Overview
24

35
Running ArchiveBox with Docker allows you to manage it in a container without exposing it to the rest of your system. Usage with Docker is similar to usage of ArchiveBox normally, with a few small differences.
46

@@ -30,7 +32,7 @@ echo 'https://example.com' | docker run -i -v ~/ArchiveBox:/data nikisweeting/ar
3032

3133
<img src="https://i.imgur.com/knwOtky.png" height="40px" align="right">
3234

33-
# Docker Compose
35+
## Docker Compose
3436

3537
An example [`docker-compose.yml`](https://github.com/pirate/ArchiveBox/blob/master/docker-compose.yml) config with ArchiveBox and an Nginx server to serve the archive is included in the project root. You can edit it as you see fit, or just run it as it comes out-of-the-box.
3638

@@ -40,7 +42,7 @@ docker --version
4042
Docker version 18.09.1, build 4c52b90 # must be >= 17.04.0
4143
```
4244

43-
## Setup
45+
### Setup
4446

4547
```bash
4648
git clone https://github.com/pirate/ArchiveBox && cd ArchiveBox
@@ -50,7 +52,7 @@ docker-compose up -d
5052

5153
Then open [`http://127.0.0.1:8098`](http://127.0.0.1:8098) or `data/index.html` to view the archive (HTTP, not HTTPS).
5254

53-
## Usage
55+
### Usage
5456

5557
First, make sure you're `cd`'ed into the same folder as your `docker-compose.yml` file (e.g. the project root) and that your containers have been started with `docker-compose up -d`.
5658

@@ -73,13 +75,13 @@ docker-compose exec archivebox /bin/archive https://example.com/some/feed.rss
7375
```
7476
Passing a URL as an argument here does not archive the specified URL, it downloads it and archives the links *inside* of it, so only use it for RSS feeds or other *lists of links* you want to add. To add an individual link you want to archive use the instruction above and pass via stdin instead of by argument.
7577

76-
## Accessing the data
78+
### Accessing the data
7779

7880
The outputted archive data is stored in `data/` (relative to the project root), or whatever folder path you specified in the `docker-compose.yml` `volumes:` section. Make sure the `data/` folder on the host has permissions initially set to `777` so that the ArchiveBox command is able to set it to the specified `OUTPUT_PERMISSIONS` config setting on the first run.
7981

8082
To access your archive, you can open `data/index.html` directly, or you can use the provided Nginx server running inside docker on [`http://127.0.0.1:8098`](http://127.0.0.1:8098).
8183

82-
## Configuration
84+
### Configuration
8385

8486
ArchiveBox running with docker-compose accepts all the same environment variables as normal, see the full list on the [[Configuration]] page.
8587

@@ -107,9 +109,9 @@ If you want to access your archive server with HTTPS, put a reverse proxy like N
107109

108110
---
109111

110-
# Docker
112+
## Docker
111113

112-
## Setup
114+
### Setup
113115

114116
Fetch and run the ArchiveBox Docker image to create your initial archive.
115117
```bash
@@ -120,7 +122,7 @@ Replace `~/ArchiveBox` in the command above with the full path to a folder to us
120122

121123
Make sure the data folder you use on the host is either a new, uncreated path or, if it already exists, has permissions initially set to `777` so that the ArchiveBox command is able to set it to the specified `OUTPUT_PERMISSIONS` config setting on the first run.
122124

123-
## Usage
125+
### Usage
124126

125127
**To add a single URL to the archive** or a list of links from a file, pipe them in via stdin. This will archive each link passed in.
126128
```bash
@@ -135,17 +137,17 @@ docker run -v -v ~/ArchiveBox:/data nikisweeting/archivebox /bin/archive 'https:
135137
```
136138
Passing a URL as an argument here does not archive the specified URL, it downloads it and archives the links *inside* of it, so only use it for RSS feeds or other *lists of links* you want to add. To add an individual link use the instruction above and pass via stdin instead of by argument.
137139

138-
## Accessing the data
140+
### Accessing the data
139141

140-
### Using a bind folder
142+
#### Using a bind folder
141143

142144
Use the flag:
143145
```bash
144146
-v /full/path/to/folder/on/host:/data
145147
```
146148
This will use the folder `/full/path/to/folder/on/host` on your host to store the ArchiveBox output.
147149

148-
### Using a named Docker data volume
150+
#### Using a named Docker data volume
149151

150152
```bash
151153
docker volume create archivebox-data
@@ -164,7 +166,7 @@ screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty
164166
cd /var/lib/docker/volumes/archivebox-data/_data
165167
```
166168

167-
## Configuration
169+
### Configuration
168170

169171
ArchiveBox in Docker accepts all the same environment variables as normal, see the list on the [[Configuration]] page.
170172

@@ -176,4 +178,4 @@ echo 'https://example.com' | docker run -i -v ~/ArchiveBox:/data nikisweeting/ar
176178
Or you can create an `ArchiveBox.env` file (copy from the default `etc/ArchiveBox.conf.default`) and pass it in like so:
177179
```bash
178180
docker run -i -v ~/ArchiveBox:/data --env-file=ArchiveBox.env nikisweeting/archivebox
179-
```
181+
```

Install.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
1+
# Install
2+
13
ArchiveBox only has a few main dependencies apart from `python3`, and they can all be installed using your normal package manager. It usually takes 1min to get up and running if you use the [helper script](#automatic-setup), or about 5min if you install everything [manually](#manual-setup).
24

35
<img src="https://lh4.googleusercontent.com/KWaqSJ_J9nSaGZugZWGR_mC18xxbGj2pVScriSzP8hX7KiUSw6L3VVL8rhDxQKIwxaCsfSFUO1B2pipEM4h7L-HJOGXo7yZK8a3DBVERwqfEZ8GxpeHPwh8P4LSkqVjPGRx5XYs" width="20%" align="right"/>
@@ -131,4 +133,4 @@ You may optionally specify a second argument to `archive.py export.html 15324242
131133
First, if you don't already have docker installed, follow the official install instructions for Linux, macOS, or Windows https://docs.docker.com/install/#supported-platforms.
132134

133135

134-
Then see the [[Docker]] page for next steps.
136+
Then see the [[Docker]] page for next steps.

Makefile

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
# Minimal makefile for Sphinx documentation
2+
#
3+
4+
# You can set these variables from the command line.
5+
SPHINXOPTS =
6+
SPHINXBUILD = sphinx-build
7+
SOURCEDIR = .
8+
BUILDDIR = _build
9+
10+
# Put it first so that "make" without argument is like "make help".
11+
help:
12+
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
13+
14+
.PHONY: help Makefile
15+
16+
# Catch-all target: route all unknown targets to Sphinx using the new
17+
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
18+
%: Makefile
19+
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
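
The catch-all rule in the Makefile above forwards any unknown target to `sphinx-build -M`. As a rough sketch of that expansion, the shell function below prints the command `make html` would run. It assumes the default variable values shown in the diff and empty `SPHINXOPTS` and `O`; `sphinx_cmd` is a hypothetical helper used only for illustration and is not part of the commit.

```shell
#!/bin/sh
# Defaults copied from the Makefile in this commit.
SPHINXBUILD="sphinx-build"
SOURCEDIR="."
BUILDDIR="_build"

# sphinx_cmd is a hypothetical helper for illustration only: it prints the
# command the catch-all rule would run for a given target, without running it.
sphinx_cmd() {
    printf '%s -M %s "%s" "%s"\n' "$SPHINXBUILD" "$1" "$SOURCEDIR" "$BUILDDIR"
}

sphinx_cmd html
```

With Sphinx installed, running `make html` from the directory containing this Makefile should write the rendered pages under `_build/html`, and `make help` lists the other available builders.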

Publishing-Your-Archive.md

Lines changed: 9 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
## Publishing Your Archive
1+
# Publishing Your Archive
22

33
The archive produced by `./archive` is suitable for serving on any provider that can host static html (e.g. github pages!).
44

@@ -19,16 +19,15 @@ Make sure you're not running any content as CGI or PHP, you only want to serve s
1919

2020
Urls look like: `https://archive.example.com/archive/1493350273/en.wikipedia.org/wiki/Dining_philosophers_problem.html`
2121

22-
**Security WARNING & Content Disclaimer**
22+
## Security Concerns
2323

24-
Re-hosting other people's content has security implications for any other sites sharing your hosting domain. Make sure you understand
25-
the dangers of hosting unknown archived CSS & JS files [on your shared domain](https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy).
26-
Due to the security risk of serving some malicious JS you archived by accident, it's best to put this on a domain or subdomain
27-
of its own to keep cookies separate and slightly mitigate [CSRF attacks](https://en.wikipedia.org/wiki/Cross-site_request_forgery) and other nastiness.
24+
Re-hosting other people's content has security implications for any other sites sharing your hosting domain. Make sure you understand the dangers of hosting unknown archived CSS & JS files [on your shared domain](https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy).
25+
Due to the security risk of serving some malicious JS you archived by accident, it's best to put this on a domain or subdomain of its own to keep cookies separate and slightly mitigate [CSRF attacks](https://en.wikipedia.org/wiki/Cross-site_request_forgery) and other nastiness.
2826

29-
You may also want to blacklist your archive in `/robots.txt` if you don't want to be publicly associated with all the links you archive via search engine results.
27+
## Copyright Concerns
28+
29+
Be aware that some sites you archive may not allow you to rehost their content publicly for copyright reasons, it's up to you to host responsibly and respond to takedown requests appropriately.
3030

31-
Be aware that some sites you archive may not allow you to rehost their content publicly for copyright reasons,
32-
it's up to you to host responsibly and respond to takedown requests appropriately.
31+
You may also want to blacklist your archive in `/robots.txt` if you don't want to be publicly associated with all the links you archive via search engine results.
3332

34-
Please modify the `FOOTER_INFO` config variable to add your contact info to the footer of your index.
33+
Please modify the `FOOTER_INFO` config variable to add your contact info to the footer of your index.

Quickstart.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
1+
# Quickstart
2+
13
<div align="center">
24
<img src="https://i.imgur.com/ZbHpEf8.jpg" width="40%"/>
35
</div>
@@ -72,4 +74,4 @@ Open `output/index.html` to view your archive. (favicons will appear next to ea
7274
- Read [[Configuration]] to learn about the various archive method options
7375
- Read [[Scheduled Archiving]] to learn how to set up automatic daily archiving
7476
- Read [[Publishing Your Archive]] if you want to host your archive for others to access online
75-
- Read [[Troubleshooting]] if you encounter any problems
77+
- Read [[Troubleshooting]] if you encounter any problems

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
../README.md

Roadmap.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
## Roadmap
1+
# Roadmap
22

33
<img src="https://i.imgur.com/es97GGV.png" width="20%" align="right"/>
44

@@ -11,7 +11,7 @@ If you feel like contributing a PR, some of these tasks are pretty easy. Feel f
1111

1212
---
1313

14-
# Planned Specification
14+
## Planned Specification
1515

1616
ArchiveBox is going to migrate towards this design spec over the next 3 months bit by bit as functionality gets implemented and refactors are released.
1717

@@ -694,4 +694,4 @@ services:
694694
695695
---
696696
697-
**IMPORTANT**: *Please don't work on any of these major long-term tasks without [contacting me first](https://nicksweeting.com/blog#Contact-Me), work is already in progress for many of these, and I may have to reject your PR if it doesn't align with the existing work!*
697+
**IMPORTANT**: *Please don't work on any of these major long-term tasks without [contacting me first](https://nicksweeting.com/blog#Contact-Me), work is already in progress for many of these, and I may have to reject your PR if it doesn't align with the existing work!*

Scheduled-Archiving.md

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,6 @@
1-
## Schedule daily importing of new links into your archive
1+
# Scheduled Archiving
2+
3+
## Using Cron
24

35
To schedule regular archiving you can use any task scheduler like `cron`, `at`, `systemd`, etc.
46

@@ -8,7 +10,9 @@ ones as necessary.
810

911
For some example configs, see the [`etc/cron.d`](https://github.com/pirate/ArchiveBox/blob/master/etc/cron.d) and [`etc/supervisord`](https://github.com/pirate/ArchiveBox/blob/master/etc/supervisord) folders.
1012

11-
## Example: Import Firefox browser history every 24 hours
13+
## Examples
14+
15+
### Example: Import Firefox browser history every 24 hours
1216

1317
This example exports your browser history and archives it once a day:
1418

@@ -26,7 +30,7 @@ cd /opt/ArchiveBox
2630
0 0 * * * www-data /opt/ArchiveBox/bin/firefox_custom.sh
2731
```
2832

29-
## Example: Import an RSS feed from Pocket every 12 hours
33+
### Example: Import an RSS feed from Pocket every 12 hours
3034

3135
This example imports your Pocket bookmark feed and archives any new links every 12 hours:
3236

@@ -43,4 +47,4 @@ cd /opt/ArchiveBox
4347
**Then create a new file `/etc/cron.d/ArchiveBox-Pocket` to tell cron to run your script every 12 hours:**
4448
```bash
4549
0 */12 * * * www-data /opt/ArchiveBox/bin/pocket_custom.sh
46-
```
50+
```

Security-Overview.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
1+
# Security Overview
2+
13
## Usage Modes
24

35
ArchiveBox has three common usage modes outlined below.
@@ -64,4 +66,4 @@ How much are you planning to archive? Only a few bookmarked articles, or thousa
6466

6567
Are you publishing your archive? If so, make sure you're only serving it as HTML and not accidentally running it as php or cgi, and put it on its own domain not shared with other services. This is done in order to avoid cookies leaking between your main domain and domains hosting content you don't control. Many companies put user provided files on separate domains like googleusercontent.com and github.io to avoid this problem.
6668

67-
Published archives automatically include a `robots.txt` `Disallow: /` to block search engines from indexing them. You may still wish to publish your contact info in the index footer though using [`FOOTER_INFO`](https://github.com/pirate/ArchiveBox/wiki/Configuration#FOOTER_INFO) so that you can respond to any DMCA and copyright takedown notices if you accidentally rehost copyrighted content.
69+
Published archives automatically include a `robots.txt` `Disallow: /` to block search engines from indexing them. You may still wish to publish your contact info in the index footer though using [`FOOTER_INFO`](https://github.com/pirate/ArchiveBox/wiki/Configuration#FOOTER_INFO) so that you can respond to any DMCA and copyright takedown notices if you accidentally rehost copyrighted content.
