Commit 5b8cba6

Create bettergist_archive.md
1 parent 0436ad9 commit 5b8cba6

1 file changed

+135
-0
lines changed

bettergist_archive.md

@@ -0,0 +1,135 @@
I run the Bettergist Collector, which has archived every single PHP Packagist project
since April 2019, now on a bi-weekly basis.

Internally, I now have 400,000+ PHP packages' git repos, substantially more than
the currently-alive PHP project count of 365,000+ (because I've been running the
archive since 2019).

## The Internal Setup

The way I've been handling this for large institutional clients is:

1. I use [Gitea](https://about.gitea.com/) to host them all for free, on-premise,
using a custom Gitea API app I made for the process.
2. Since Jan 2022, Bettergist has stored everything as a shallow-cloned git repo, which
(using complex commands) my script converts into full-remote repos that are
imported into Gitea via #1.
3. For dead projects archived before then, a simple script creates a git repo and
imports it via #1.
4. The Gitea instance is hosted on dedicated servers, which hold bi-weekly
updated clones of the entire PHP Packagist system, including dead packages.
5. The companies use [a custom packagist URL config](https://packagist.com/docs/setup)
to pull in that private Bettergist Archive instead of traditional Packagist.
6. Once a quarter, the entire PHP Packagist archive is compressed and stored
on-site, on AWS S3, on Mega.nz, and on USB drives dispersed across three continents
(in Texas, Colombia, and the UAE).
7. A custom Composer plugin package can be installed on their repos that reports
to the Bettergist Collector which Composer packages are currently being actively
used by clients (installs within the last year). It then includes each archived
version of those active packages in the active git repo.

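From the client's side, step 5 above could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual Bettergist setup: the mirror URL `https://bettergist-mirror.example.com` and the helper name `point_at_private_repo` are hypothetical. The only parts that come from Composer itself are the `repositories` key, the `composer` repository type, and the `"packagist.org": false` entry that turns off the default Packagist lookup.

```python
import json


def point_at_private_repo(composer_config: dict, mirror_url: str) -> dict:
    """Return a copy of a composer.json dict that resolves packages from a
    private Composer repository instead of packagist.org.

    mirror_url is a placeholder; the real archive URL is not public.
    Any existing "repositories" entry is overwritten in the copy.
    """
    updated = dict(composer_config)
    updated["repositories"] = [
        # A "composer"-type repository serves its own package metadata.
        {"type": "composer", "url": mirror_url},
        # Disable the default packagist.org repository.
        {"packagist.org": False},
    ]
    return updated


if __name__ == "__main__":
    original = {"name": "acme/app", "require": {"monolog/monolog": "^3.0"}}
    # Hypothetical mirror URL, for illustration only.
    rewritten = point_at_private_repo(
        original, "https://bettergist-mirror.example.com"
    )
    print(json.dumps(rewritten, indent=4))
```

Writing the printed JSON back to `composer.json` would make `composer install` resolve against the private mirror only. Whether the real archive also needs authentication is not stated here, so that part is left out.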
## Reasons companies pay for the access:

1. All of their PHP dependencies are guaranteed to be present now and in the future,
irrespective of the wishes of the authors (pursuant to the rights of the open source
license).
2. It is much more immune to supply-chain attacks: because of the bi-weekly archival
schedule, such attacks are usually discovered and remedied before the next archive
run takes place.
3. It is much, much, much faster in various countries. For instance, there is an
Alibaba Cloud instance that is over 500% faster for clients in China,
because it is inside of China's Great Firewall.
4. It is largely immune to any particular country's IP irregularities, being hosted
in Colombia, Germany, China, and Singapore.

## Access to the Bettergist packagist repo:

This service is currently only offered to publicly-traded corporations with market caps
of at least $1 billion. This is to ensure a limited number of clients and the
highest tier of service.

Access costs **$5/developer/month**, paid per year, with a 30-day trial period and
a 60-day money-back guarantee for any reason.

The goals of this project are not primarily for profit but for the secure archiving of
every open source PHP project published via Packagist.org, for the greater posterity
of human civilization, particularly in an edge case such as the shutdown of GitHub, or
even a grid-down event.

## Purchasing The USB

You can submit a request to purchase a USB drive containing every reachable project plus the dead
projects in PHP's Packagist database by [messaging me](https://twitter.com/hopeseekr)
on Twitter / X.

* **Archive Size:** 112.1 GB compressed / 471.5 GB uncompressed
* **Saved Dead Packages:** 18,995 (82,805 GitHub stars, 30,622,705 installs)

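As a quick sanity check on the archive sizes listed above, the sketch below works out the compression ratio and whether the uncompressed archive fits on a drive of a given size. The helper names are illustrative only, and it assumes decimal gigabytes and ignores filesystem overhead.

```python
# Sizes taken from the figures listed above.
COMPRESSED_GB = 112.1
UNCOMPRESSED_GB = 471.5


def compression_ratio(uncompressed_gb: float, compressed_gb: float) -> float:
    """How many times larger the archive is once unpacked."""
    return uncompressed_gb / compressed_gb


def fits_on_drive(archive_gb: float, drive_gb: float) -> bool:
    """True if the archive fits, ignoring filesystem overhead."""
    return archive_gb <= drive_gb


if __name__ == "__main__":
    ratio = compression_ratio(UNCOMPRESSED_GB, COMPRESSED_GB)
    print(f"Unpacked archive is about {ratio:.1f}x the compressed size")
    for drive_gb in (256, 512):
        verdict = "fits" if fits_on_drive(UNCOMPRESSED_GB, drive_gb) else "does not fit"
        print(f"{drive_gb} GB drive: uncompressed archive {verdict}")
```

The compressed archive fits on far smaller media, but working from an unpacked copy needs a drive in the 512 GB class.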
The USB contains every PHP Packagist package downloadable since January 2022,
and 98%+ of those available since April 2019, updated once per quarter.

It also contains the latest PHP source code for PHP v5.6 through 8.4, along with a compressed
Docker image of Ubuntu 22.04 with complete devtools installed, permitting you to compile PHP from
scratch in the event of a full-on collapse, or upon recovery of the USB drive thousands of years into
the far future.

The current price is $4,715.00 USD, payable via bank wire, Zelle, Bitcoin, Ethereum, $USDT, etc.

An archive of the Top 1 Million most-installed NPM packages is also available.

## Use Cases for The USB

### Extinction Level Event Protection

You can have a relatively up-to-date archive of a huge percentage of the PHP Open Source ecosystem
self-hosted on your own USB stick (if 512 GB or larger) or your hard drive.

This means you can code PHP completely off-grid, off-network, and in a verifiable clean-room
environment, suitable for Dark Labs, Beyond Top Secret blackops dev sites, any of the offworld
bases and outposts, and even the interstellar spacecraft that we don't officially have; all without
risk of detection of that deep space signal that couldn't possibly exist.

In the event of a grid-down [**fire sale event**](https://diehard.fandom.com/wiki/Fire_Sale) where
the Internet is offline or destroyed (EMP attack, supervolcano eruption, asteroid impact, alien
invasion, etc.), you can help rebuild society by preserving your own copy of the PHP open source
ecosystem.

The USB drive contains everything you need to get a modern PHP stack both compiled and running on
the latest Ubuntu LTS, along with instructions.

So if you are an archivist at heart, consider buying one and burying it underground or something.
Photographic proof of it being secured underground in suitable fireproof, waterproof, EMP-resistant
packaging will result in you being refunded all but the first $250 (manufacturing, shipping, and USB cost).

### Snapshot Over Time (Collectible)

By regularly purchasing the USB drives, you can grab a snapshot of the PHP ecosystem over time.
This will become increasingly valuable as a collectible, per historic norms, for the same reason
that 40-year-old McDonald's plastic straws are currently being sold for thousands of dollars.
![McD's straw for $2,620 on 2024-08-23](https://github.com/user-attachments/assets/e8eedaa6-be83-48d1-8385-6d498f2961df)

But this is even more valuable. You can, for instance, compare the coding practices of 2019 to those of 2025.

### AI Training / LLM Model Generation

Beyond the obvious benefit of being able to train an LLM Foundation Model against all of the
open source PHP source instantly, you can pair it with exclusive access to the Bettergist API
for targeted training on already pre-ranked data.

The Bettergist Collector project also contains the Bettergist Analyzer.

This CLI analyzes every single PHP package once per quarter, recording:
- disk space
- PHP stats (lines of code, number of classes, methods, lines per class, lines per method, etc.)
- PHPStan best error level with no errors / warnings.
- Has PHPUnit tests?
- Do the PHPUnit tests pass?
- PHP Mess Detector results (clean code?)
- GitHub Stars
- Composer installs
- and much more.

With all of this readily-available data via the Bettergist REST API, you, too, can quickly
and efficiently build an LLM Foundation Model off of existing open source engines that will
be able to code in PHP with much more fidelity and expertise than even GitHub Copilot (*),
at least subjectively, from our own clients' testimonies.

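To make the idea of training on pre-ranked data concrete, here is a minimal sketch of how the Analyzer metrics listed above might be used to pick training candidates. Everything in it is an assumption for illustration: the `PackageMetrics` fields, the `select_training_candidates` helper, and the thresholds are hypothetical and do not reflect the real Bettergist REST API schema, which is not described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PackageMetrics:
    """Hypothetical per-package record mirroring the metrics listed above."""
    name: str
    phpstan_level: Optional[int]  # best level with no errors, None if none passes
    has_phpunit_tests: bool
    phpunit_tests_pass: bool
    github_stars: int
    composer_installs: int


def select_training_candidates(
    packages: List[PackageMetrics],
    min_phpstan_level: int = 5,   # illustrative threshold
    min_installs: int = 1000,     # illustrative threshold
) -> List[PackageMetrics]:
    """Keep packages with passing tests and clean static analysis,
    ordered by install count (most installed first)."""
    selected = [
        p for p in packages
        if p.has_phpunit_tests
        and p.phpunit_tests_pass
        and p.phpstan_level is not None
        and p.phpstan_level >= min_phpstan_level
        and p.composer_installs >= min_installs
    ]
    return sorted(selected, key=lambda p: p.composer_installs, reverse=True)


if __name__ == "__main__":
    # Made-up sample records, not real Analyzer output.
    sample = [
        PackageMetrics("acme/solid", 8, True, True, 120, 50_000),
        PackageMetrics("acme/untested", 9, False, False, 900, 80_000),
        PackageMetrics("acme/popular", 6, True, True, 4_000, 2_000_000),
    ]
    for pkg in select_training_candidates(sample):
        print(pkg.name)
```

Filtering on test results and static-analysis level before training is one plausible way to bias a model toward cleaner code. The right thresholds would depend on how the actual metrics are distributed.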
