Fix false negative http code 404 in verification
Some links returned HTTP 404 during checking even though they were
working correctly.

This happened because a / character was appended to the link before the
request, turning it into a different URL from the original. If the server
did not serve the original path with a trailing /, the request returned
a 404 error.

The result was a false negative: a working link was reported as broken.
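The failure mode described above can be reproduced locally with a minimal sketch. Everything in it is an assumption made for illustration: the `ExactPathHandler` server, the `/page` route, and the `status_of` and `demo` helpers are hypothetical, and the standard library's `urllib` stands in for `requests` so the example has no third-party dependencies.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ExactPathHandler(BaseHTTPRequestHandler):
    """Hypothetical server that serves /page but not /page/."""

    def do_GET(self):
        status = 200 if self.path == '/page' else 404
        self.send_response(status)
        self.end_headers()

    def log_message(self, *args):
        # Silence per-request logging for the demo.
        pass


def status_of(url: str) -> int:
    """Return the HTTP status code of a GET request to url."""
    # An empty ProxyHandler keeps the request on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as error:
        return error.code


def demo() -> tuple:
    """Request the same link with and without an appended slash."""
    server = HTTPServer(('127.0.0.1', 0), ExactPathHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        link = f'http://127.0.0.1:{server.server_port}/page'
        return status_of(link), status_of(link + '/')
    finally:
        server.shutdown()
        server.server_close()
```

On this hypothetical server the original link succeeds while the same link with `/` appended is reported as missing, which is the false negative the commit message describes.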
matheusfelipeog committed Feb 7, 2022
1 parent 51b4166 commit c2bdd9e
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions scripts/validate/links.py
@@ -17,7 +17,7 @@ def find_links_in_text(text: str) -> List[str]:
     raw_links = re.findall(link_pattern, text)

     links = [
-        str(raw_link[0]).rstrip('/') for raw_link in raw_links
+        str(raw_link[0]) for raw_link in raw_links
     ]

     return links
@@ -49,6 +49,7 @@ def check_duplicate_links(links: List[str]) -> Tuple[bool, List]:
     has_duplicate = False

     for link in links:
+        link = link.rstrip('/')
         if link not in seen:
             seen[link] = 1
         else:
@@ -163,7 +164,7 @@ def check_if_link_is_working(link: str) -> Tuple[bool, str]:
     error_message = ''

     try:
-        resp = requests.get(link + '/', timeout=25, headers={
+        resp = requests.get(link, timeout=25, headers={
             'User-Agent': fake_user_agent(),
             'host': get_host_from_link(link)
         })
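Taken together, the three hunks appear to separate two concerns: the link is requested exactly as written, and trailing slashes are stripped only when comparing links for duplicates. The sketch below illustrates that split under stated assumptions. `find_duplicate_links` is a hypothetical helper written for this example, not the repository's `check_duplicate_links`, whose full body is not shown in the diff.

```python
from typing import Dict, List, Tuple


def find_duplicate_links(links: List[str]) -> Tuple[bool, List[str]]:
    """Treat links that differ only by trailing slashes as duplicates.

    Hypothetical helper: the input list is left untouched, so the link
    that is later requested keeps its original form.
    """
    seen: Dict[str, int] = {}
    duplicates: List[str] = []

    for link in links:
        key = link.rstrip('/')  # normalized only for the comparison
        if key in seen:
            duplicates.append(link)
        else:
            seen[key] = 1

    return bool(duplicates), duplicates


links = [
    'https://example.com/api/',
    'https://example.com/api',
    'https://example.com/docs',
]
has_duplicate, duplicates = find_duplicate_links(links)
```

Here the second entry is flagged as a duplicate of the first, while `links[0]` still ends in `/` and would be requested unchanged.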