.NET 8 C# URL normalizer.
URL normalization, also known as URL canonicalization, is the process of converting the text representation of a URL to a standard form so that differently formatted URLs can be checked for equivalence.
- Duplicate slashes are removed
  file://example.com/foo//bar.html → file://example.com/foo/bar.html
- Default port is removed
  ftp://example.com:21/ → ftp://example.com/
- Dot-segments are removed
  file://example.com/foo/./bar/baz/../qux → file://example.com/foo/bar/qux
- Empty path is converted to "/"
  ftp://example.com → ftp://example.com/
- Percent-encoded triplets are uppercased
  ftp://example.com/foo%2a → ftp://example.com/foo%2A
- Percent-encoded triplets of unreserved characters are decoded
  ftp://example.com/%7Efoo → ftp://example.com/~foo
- Scheme and host are lowercased
  FTP://[email protected]/Foo → ftp://[email protected]/Foo
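
The path and percent-encoding rules above can be sketched in plain C#. This is only an illustration of the technique and not the library's implementation: `NormalizePath` and `NormalizePercentEncoding` are hypothetical helper names, and the sketch assumes each helper receives just the path component of a URL.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Toimik.UrlNormalization):
// collapses duplicate slashes, resolves "." and ".." segments,
// and converts an empty path to "/".
static string NormalizePath(string path)
{
    // Dropping empty entries is what collapses runs of slashes.
    var segments = path.Split('/', StringSplitOptions.RemoveEmptyEntries);
    var kept = new List<string>();
    foreach (var segment in segments)
    {
        if (segment == ".")
        {
            continue; // "." refers to the current directory
        }

        if (segment == "..")
        {
            // ".." removes the previous segment, if there is one
            if (kept.Count > 0)
            {
                kept.RemoveAt(kept.Count - 1);
            }

            continue;
        }

        kept.Add(segment);
    }

    if (kept.Count == 0)
    {
        return "/";
    }

    // Keep a trailing slash when the path pointed at a directory.
    var endsWithDirectory = path.EndsWith('/')
        || segments[^1] == "."
        || segments[^1] == "..";
    return "/" + string.Join("/", kept) + (endsWithDirectory ? "/" : "");
}

// Hypothetical helper: decodes triplets of unreserved characters
// (letters, digits, "-", ".", "_", "~") and uppercases all other triplets.
static string NormalizePercentEncoding(string text)
{
    return Regex.Replace(text, "%[0-9a-fA-F]{2}", match =>
    {
        var c = (char)Convert.ToInt32(match.Value.Substring(1), 16);
        var isUnreserved = (c >= 'A' && c <= 'Z')
            || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9')
            || c == '-' || c == '.' || c == '_' || c == '~';
        return isUnreserved ? c.ToString() : match.Value.ToUpperInvariant();
    });
}

Console.WriteLine(NormalizePath("/foo/./bar/baz/../qux"));  // /foo/bar/qux
Console.WriteLine(NormalizePath("/foo//bar.html"));         // /foo/bar.html
Console.WriteLine(NormalizePercentEncoding("/%7Efoo%2a"));  // /~foo%2A
```

Splitting on "/" and rebuilding the path keeps the sketch short. A production normalizer would also need to treat the scheme, host, and port separately from the path.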
- Directory index can be removed (optional, via removableDirectoryIndexNames)
  http://example.com/default.asp → http://example.com/
  http://example.com/a/index.html → http://example.com/a/
- Fragment can be removed (optional, via isFragmentIgnored)
  http://example.com/bar.html#section1 → http://example.com/bar.html
- Scheme can be changed (optional, via preferredScheme)
  https://example.com/ → http://example.com/
- Query parameters are sorted
  http://example.com/display?lang=en&article=fred → http://example.com/display?article=fred&lang=en
- User-info can be removed (optional, via isUserInfoIgnored)
  http://user:[email protected] → http://example.com/
- Empty query is removed
  http://example.com/display? → http://example.com/display
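
As a rough illustration of the query and directory-index rules above, the following sketch uses only standard .NET APIs. It is not the library's code: `NormalizeQuery` and `RemoveDirectoryIndex` are hypothetical helpers, the index names are taken from the examples above rather than from the library's defaults, and sorting by parameter name with ordinal comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of Toimik.UrlNormalization):
// sorts "name=value" pairs by name and drops an empty query entirely.
static string NormalizeQuery(string query)
{
    var pairs = query.TrimStart('?')
        .Split('&', StringSplitOptions.RemoveEmptyEntries)
        .OrderBy(pair => pair.Split('=', 2)[0], StringComparer.Ordinal)
        .ToArray();
    return pairs.Length == 0 ? string.Empty : "?" + string.Join("&", pairs);
}

// Hypothetical helper: strips the last path segment when it is a known
// directory index name, leaving the directory's trailing slash in place.
static string RemoveDirectoryIndex(string path, ISet<string> indexNames)
{
    var lastSlash = path.LastIndexOf('/');
    var fileName = path.Substring(lastSlash + 1);
    return indexNames.Contains(fileName)
        ? path.Substring(0, lastSlash + 1)
        : path;
}

// Assumed index names, copied from the examples in the list above.
var indexNames = new HashSet<string> { "default.asp", "index.html" };

Console.WriteLine(NormalizeQuery("?lang=en&article=fred"));           // ?article=fred&lang=en
Console.WriteLine(NormalizeQuery("?") == string.Empty);               // True
Console.WriteLine(RemoveDirectoryIndex("/a/index.html", indexNames)); // /a/
Console.WriteLine(RemoveDirectoryIndex("/default.asp", indexNames));  // /
```

Sorting on the parameter name alone, rather than on the whole pair, keeps names that share a prefix in a predictable order, and the stable sort preserves the relative order of repeated names.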

PM> Install-Package Toimik.UrlNormalization
> dotnet add package Toimik.UrlNormalization

// Use default arguments
// var normalizer = new UrlNormalizer();
// Use custom arguments
var normalizer = new UrlNormalizer(isAdjacentSlashesCollapsed: false);
var url = ...
var normalizedUrl = normalizer.Normalize(url);

// Use default arguments
// var normalizer = new HttpUrlNormalizer();
// Use custom arguments
var normalizer = new HttpUrlNormalizer(
    preferredScheme: "https",
    isUserInfoIgnored: false,
    removableDirectoryIndexNames: new HashSet<string>(0), // override the default
    isFragmentIgnored: false);
var url = ...
var normalizedUrl = normalizer.Normalize(url);