Most YT videos are not archived despite setting video crawler env vars #986

Open
1 task done
raddevon opened this issue Feb 5, 2025 · 1 comment


raddevon commented Feb 5, 2025

Describe the Bug

Almost no YouTube videos are downloaded, even with the correct options set.

Steps to Reproduce

  1. Bookmark some YouTube videos
  2. Enable video downloading for Hoarder
  3. Admin Settings > Actions > Recrawl All Links

Expected Behaviour

Most videos should be downloaded and archived.

Screenshots or Additional Context

I have the following env vars set for my instance:

Variable                              Value
CRAWLER_VIDEO_DOWNLOAD                true
CRAWLER_VIDEO_DOWNLOAD_MAX_SIZE       20000
CRAWLER_VIDEO_DOWNLOAD_TIMEOUT_SEC    7200

Container logs are not particularly helpful, at least to my eyes.

2025-02-05T20:55:41.737Z info: [VideoCrawler][7822] Attempting to download a file from "https://www.youtube.com/watch?v=Ev7FqNa5rD0" to "/tmp/video_downloads/dfc38c95-8944-46a8-bda2-bcaaf030fac8" using the following arguments: "https://www.youtube.com/watch?v=Ev7FqNa5rD0,-f,best[filesize<20000M],-o,/tmp/video_downloads/dfc38c95-8944-46a8-bda2-bcaaf030fac8,--no-playlist"
2025-02-05T20:55:45.297Z error: [VideoCrawler][7822] Failed to download a file from "https://www.youtube.com/watch?v=Ev7FqNa5rD0" to "/tmp/video_downloads/dfc38c95-8944-46a8-bda2-bcaaf030fac8"
2025-02-05T20:55:45.297Z info: [VideoCrawler][7822] Video Download Completed successfully
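
The log records the exact arguments but not the downloader's own error output, so replaying the same arguments by hand might surface the real error. The sketch below rebuilds a shell command from the logged, comma-joined argument string. It assumes the executable is `yt-dlp` on the PATH (the log does not name the binary), and `command_from_logged_args` and `allow_unknown_filesize` are hypothetical helper names. The second helper is only something to try, not a confirmed cause: yt-dlp's format filters skip formats whose value is unknown unless a `?` follows the operator, so `filesize<?20000M` would also accept formats that report no size.

```python
import shlex


def command_from_logged_args(logged: str) -> list[str]:
    """Turn the comma-joined argument string from the log into an argv list."""
    # None of the arguments in this report contain a comma, so a plain
    # split is enough here. "yt-dlp" as the program name is an assumption.
    return ["yt-dlp", *logged.split(",")]


def allow_unknown_filesize(argv: list[str]) -> list[str]:
    """Variant to try: also accept formats whose file size is not reported."""
    return [arg.replace("[filesize<", "[filesize<?") for arg in argv]


# Argument string copied from the failing log entry above.
logged_args = (
    "https://www.youtube.com/watch?v=Ev7FqNa5rD0,-f,best[filesize<20000M],"
    "-o,/tmp/video_downloads/dfc38c95-8944-46a8-bda2-bcaaf030fac8,--no-playlist"
)

argv = command_from_logged_args(logged_args)
relaxed = allow_unknown_filesize(argv)

# shlex.join quotes the URL and the format selector so the command can be
# pasted into a shell inside the container.
print(shlex.join(argv))
print(shlex.join(relaxed))
```

Running either printed command inside the container should show yt-dlp's own message for the failing URL, which the Hoarder log does not include.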

The following example is the only one I'm aware of out of 20 that actually succeeded, and I'm not sure why it would be any different from the other 19.

2025-02-05T20:43:58.063Z info: [VideoCrawler][7618] Attempting to download a file from "https://www.youtube.com/watch?v=A7HF7v9wUUA" to "/tmp/video_downloads/7cf05c3b-1f98-4278-a58c-f47b9fc93325" using the following arguments: "https://www.youtube.com/watch?v=A7HF7v9wUUA,-f,best[filesize<20000M],-o,/tmp/video_downloads/7cf05c3b-1f98-4278-a58c-f47b9fc93325,--no-playlist"
2025-02-05T20:44:10.252Z info: [VideoCrawler][7618] Finished downloading a file from "https://www.youtube.com/watch?v=A7HF7v9wUUA" to "/tmp/video_downloads/7cf05c3b-1f98-4278-a58c-f47b9fc93325"
2025-02-05T20:44:12.587Z info: [VideoCrawler][7618] Finished downloading video from "https://www.youtube.com/watch?v=A7HF7v9wUUA" and adding it to the database

2025-02-05T20:44:12.587Z info: [VideoCrawler][7618] Video Download Completed successfully
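
Because "Video Download Completed successfully" is logged for both the failed and the successful job, that line cannot be used to tell them apart. The following is a minimal sketch for tallying outcomes per job from container logs, assuming every line has the shape seen in this report (`timestamp level: [VideoCrawler][job id] message`). `classify_jobs` and the outcome labels are hypothetical names chosen for illustration.

```python
import re

# Matches lines such as:
# 2025-02-05T20:55:45.297Z error: [VideoCrawler][7822] Failed to download ...
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<level>\w+): \[VideoCrawler\]\[(?P<job>\d+)\] (?P<msg>.*)$"
)


def classify_jobs(log_text: str) -> dict[str, str]:
    """Map each job id to 'failed', 'archived', or 'unknown'."""
    outcomes: dict[str, str] = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # blank lines or lines from other components
        job, level, msg = match["job"], match["level"], match["msg"]
        outcomes.setdefault(job, "unknown")
        if level == "error" and msg.startswith("Failed to download"):
            outcomes[job] = "failed"
        elif msg.startswith("Finished downloading video") and outcomes[job] != "failed":
            outcomes[job] = "archived"
        # "Video Download Completed successfully" is ignored on purpose:
        # it appears after failures as well.
    return outcomes


# Log lines copied from this report.
SAMPLE = '''
2025-02-05T20:55:41.737Z info: [VideoCrawler][7822] Attempting to download a file from "https://www.youtube.com/watch?v=Ev7FqNa5rD0" to "/tmp/video_downloads/dfc38c95-8944-46a8-bda2-bcaaf030fac8" using the following arguments: "https://www.youtube.com/watch?v=Ev7FqNa5rD0,-f,best[filesize<20000M],-o,/tmp/video_downloads/dfc38c95-8944-46a8-bda2-bcaaf030fac8,--no-playlist"
2025-02-05T20:55:45.297Z error: [VideoCrawler][7822] Failed to download a file from "https://www.youtube.com/watch?v=Ev7FqNa5rD0" to "/tmp/video_downloads/dfc38c95-8944-46a8-bda2-bcaaf030fac8"
2025-02-05T20:55:45.297Z info: [VideoCrawler][7822] Video Download Completed successfully
2025-02-05T20:43:58.063Z info: [VideoCrawler][7618] Attempting to download a file from "https://www.youtube.com/watch?v=A7HF7v9wUUA" to "/tmp/video_downloads/7cf05c3b-1f98-4278-a58c-f47b9fc93325" using the following arguments: "https://www.youtube.com/watch?v=A7HF7v9wUUA,-f,best[filesize<20000M],-o,/tmp/video_downloads/7cf05c3b-1f98-4278-a58c-f47b9fc93325,--no-playlist"
2025-02-05T20:44:10.252Z info: [VideoCrawler][7618] Finished downloading a file from "https://www.youtube.com/watch?v=A7HF7v9wUUA" to "/tmp/video_downloads/7cf05c3b-1f98-4278-a58c-f47b9fc93325"
2025-02-05T20:44:12.587Z info: [VideoCrawler][7618] Finished downloading video from "https://www.youtube.com/watch?v=A7HF7v9wUUA" and adding it to the database
2025-02-05T20:44:12.587Z info: [VideoCrawler][7618] Video Download Completed successfully
'''

results = classify_jobs(SAMPLE)
for job_id, outcome in results.items():
    print(job_id, outcome)
```

Feeding the full container log through the same function would give a per-job count of failures without relying on the final "Completed successfully" line.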

Some of the videos are age-gated, so it would make sense if those failed, but most of the failed downloads (like the example above) are not. I believe yt-dlp uses embeds to download age-gated videos without an account, so it makes sense that a video that is age-gated and also has embedding disabled would fail. That doesn't seem to be the only issue here, though.

Device Details

No response

Exact Hoarder Version

0.22.0

Have you checked the troubleshooting guide?

  • I have checked the troubleshooting guide and I haven't found a solution to my problem

raddevon commented Feb 5, 2025

I just checked all 20 YouTube links, and the one successful example in the issue body is the only one that downloaded. Here are a few others that failed with messages just like the ones shared in the issue body:
