-
Notifications
You must be signed in to change notification settings - Fork 398
Script to check links on .md files #690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can accept and merge this, but I feel it won't be used/useful w/o introducing a proper CI setup. It can look like an extra CircleCI step that run the Node app and triggers this script on all .md files that has changed compared to master. Also, you don't even add a mention about the script to the docs.
- It does not fix 652 (we'll need to reopen the ticket) or remove from the description and change title, please
- It's a good practice to have
set -euxo pipefail
- It's a good practice to keep it clean from debug prints and other commented code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about set -euox pipefail
to make it more robust?
I don't think it will make this script more robust. By the way, maybe we don't have to merge this PR if @algomaster99 is going to itegrate it somewhere else. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the script 🙂 but also not seeing point (nor problem) in merging until it gets integrated to CI tests.
UPDATE: Please see my actual #690 (review) below...
@jorgeorpinel @shcheklein @dashohoxha I altered one URL in Just for testing, I disabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I found one thing to review (see below)
And also, this doesn't seem to run on Mac:
$ ./scripts/check-links.sh
./scripts/check-links.sh: line 29: shopt: globstar: invalid shell option name
grep: static/**/*.md: No such file or directory
@@ -0,0 +1,41 @@ | |||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be #!/bin/sh
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorgeorpinel Can you try and let me know?
I think you found an engine bug. It should return a 404 header.
p.s. no need to wait for that bug to be fixed before finalizing this. You can test with some other 404 returning URL like http://connectivitycheck.gstatic.com/ |
@jorgeorpinel @algomaster99 it might be not an easy thing to fix though - it's the way the engine is written/works (or at least it might turn out that it's easier to fix in the script, e.g. using selenium of what not). |
@jorgeorpinel @shcheklein One way is to Also, I think the script is written fine. We might need to grep its response codes and fail the test whenever we get |
@algomaster99 there is not easy way to grep the response. You need to execute JS. It's not some static content. |
@shcheklein My workaround was getting the content using |
@algomaster99 does wget generated response contain 404 for those links? |
@shcheklein Not for those but it works for URLs like |
@algomaster99 so, the point is to being able to test for any docs related URLs. For majority of them, wget method won't work because of the dynamic nature of the engine. |
I think we should definitely open a separate issue for the false 404 pages rendered by React which doesn't really block this PR. If that's not fixable, maybe we can use a different strategy to check these internal links, like check that the file exists (since the URL paths are also file paths). |
@jorgeorpinel Okay, I will open one :) |
@jorgeorpinel I'm not sure I understand why do you consider them "false" 404s, though? |
Because the engine renders a huge number "404", however it already responded with HTTP 200: I guess they're not "false"; They are actual 404s, but with an incorrect response header. |
@jorgeorpinel yep, but this is because it's not even reloading the page, right? no way it can redo a response if you click on a broken link on an already opened page in docs. |
@jorgeorpinel so, this is the way the engine works - it preloads a simple "empty" page with basic elements. This request returns 200. Then it executes some JS to download the md file to render it on the page. |
Yes, I understand it's dynamic but in theory it could check for the path validity before responding 200 and preloading a simple layout. And I do think it reloads when you click on a link (but it responds with |
Sure. |
The script currently reports these issues to me:
Maybe not all of them need to be fixed, but some of them should be fixed. |
By the way, I see two reasons why this script is not 100% reliable:
Maybe there are other problems too. |
@dashohoxha thanks for the list in #690 (comment) please open separate issue so we can address when possible. I believe @algomaster99 is taking over this branch and PR so no need to close it. Aman, please note Dashamir's notes in #690 (comment) – things to fix in the script (if not now, feel free to open separate issue to improve it after CI integration). Thanks |
closing this until we find a way to run it via CI and check the links reliably |
Fix #652