Skip to content

URL crate is failing to parse these existing URLs #489

Open
@mixalturek

Description

@mixalturek

Hi there, I have these five URLs that exist but can't be parsed by url crate due to invalid domain name. This issue is similar to #483 with subdomains ending on - character, but these ones use punycode.

use url::Url;

fn main() {
    let urls = vec![
        "http://mail.163.com.xn----9mcjf9b4dbm09f.com/iloystgnjfrgthteawvo/indexx.php",
        "http://shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/sitemap.html",
        "http://count.shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/bvv",
        "http://count.shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/index.php",
        "http://count.shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/iloystgnjfrgthteawvo/index.php"
    ];

    for url in urls {
        match url.parse::<Url>() {
            Ok(_) => println!("Parsing successful: {}", url),
            Err(e) => println!("Parsing failed: {}, {}", e, url),
        }
    }
}
Parsing failed: invalid international domain name, http://mail.163.com.xn----9mcjf9b4dbm09f.com/iloystgnjfrgthteawvo/indexx.php
Parsing failed: invalid international domain name, http://shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/sitemap.html
Parsing failed: invalid international domain name, http://count.shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/bvv
Parsing failed: invalid international domain name, http://count.shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/index.php
Parsing failed: invalid international domain name, http://count.shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/iloystgnjfrgthteawvo/index.php

All of them ping and respond.

http get http://count.shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/iloystgnjfrgthteawvo/index.php
HTTP/1.1 200 OK
cache-control: max-age=0, private, must-revalidate
connection: close
content-length: 384
content-type: text/html; charset=utf-8
date: Fri, 01 Mar 2019 15:12:05 GMT
server: nginx
set-cookie: sid=603521ac-3c34-11e9-b489-941a3615a412; path=/; domain=xn----9mcjf9b4dbm09f.com; HttpOnly

<html><head><title>Loading...</title></head><body><script type='text/javascript'>window.location.replace('http://count.shdedgelanimailnoticeborad.count.mail.163.com.xn----9mcjf9b4dbm09f.com/iloystgnjfrgthteawvo/index.php?js=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJqcyI6MX0.fADWc9hUOlh58R9UzufQBROmie3I7c7vE835oE6YmU4&uuid=603521ac-3c34-11e9-b489-941a3615a412');</script></body></html>

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions