Feature: Strip URL tracking parameters on paste #7420
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
First time contributor checklist:
Contributor checklist:
main
branchpnpm run ready
run passes successfully (more about tests here)Description
Currently if you paste URLs with tracking parameters in Signal, it will not attempt to remove them. Some of these parameters, eg Instagram's
igshid
parameter, are immediately personally identifying, showing the recipient the profile of the sender. I have personal knowledge of users in anonymous group chats accidentally doxxing themselves by pasting links with tracking parameters.This PR implements a URL tracking parameter stripper. Every time a URL is pasted into the message composition box, it is filtered for known tracking parameters, according to AdGuard's tracking filter list. This is gated by a checkbox in settings (on by default). Here's what it looks like:
signal-sanitize-demo.mp4
Design notes
It is important that Signal preserve the privacy of its users while also not unfairly singling out any specific threat to privacy. The reasonable choice here is to let someone else decide who counts as a tracker. AdGuard maintains the most widely used filter lists in the world, publishing them for free on Github, with daily diffs in the hundreds of lines. By using their list, it is possible to keep a relatively complete and unbiased overview of trackers.
It is also important that Signal respect user choice. This is why there's a checkbox in the settings for users to opt in/out. This feature is on by default, since it is a privacy feature, has no obvious (to me) downsides, and whose utility may be opaque to lay users.
Testing
I included a set of known-answer tests for the URL stripping functionality. The parser and rule algorithm is very simple, so I didn't have any surprises when writing tests (besides forgetting things about how URLs are parsed by the browser).
Further, even if this somehow has a bug in it, the damage is limited. The stripping algorithm only deletes query parameters (everything after the
?
). So even if a bug made the stripping algorithm too overzealous, it would only delete query parameters, and not other parts of the URL.Performance
I ran some quick benchmarks on my Macbook Air M1. The
applyAllRules()
function takes 6ms on the first execution, 4ms for the second, <1ms for subsequent executions. I did not notice any lag when pasting in Signal with this feature on.What I'm missing
The new settings checkbox text is written in English, but not any other language. It would be good to have some translations for
icu:Preferences__auto-remove-url-tracking--title
andicu:Preferences__auto-remove-url-tracking--description
.Future work
Strip URLs that appear in larger blocks of text. Currently the tracker stripping hook is only triggered when the paste contents is a plain URL and nothing else. We could imagine expanding this to look for URLs in a larger paste, though. This comes with tradeoffs on user experience: a user pasting a large block of text might be surprised to find that a link in the middle was modified. Something to think about.
Implementing for other platforms. The parameter stripping code was intentionally written to be as simple as possible. It should be relatively straightforward to port the code to iOS and Android, and have the same behavior on all platforms. There's an old iOS PR that attempts this. Might be a good starting point.
Support regex removeparams. The parser doesn't support rules with regex
removeparams
yet, e.g.,removeparam=/^__s=[A-Za-z0-9]{6\,}/
. There are 43 such rules, out of 2155 total. It's a concretely small impact (2%), but it'd be nice if this could be implemented regardless.cc @moiseev-signal