Add requests checking and retry to solar flare fetch #92

jeanconn · 2025-03-03T15:25:06Z

Description

Add requests checking and retry to solar flare fetch

Fixes issue/bug in #90 that a failed fetch will create a cached "image" file that is the http error response like

jeanconn-fido> more AR_CH_20250302.png 
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>404 Not Found</title>
<style type="text/css"><!--
.style3 {
  font-family: Verdana, Arial, Helvetica, sans-serif;
  color: #CC0000;
}
.s {
  font-family: Verdana, Arial, Helvetica, sans-serif;
  font-size: 11px;
  font-weight: normal;
  color: #000000;
  line-height: 18px;
  text-align: center;
  border: 1px solid #CCCCCC;
  background-color: #FFFFEC;
}
body {
  background-color: #FFFFFF;
  margin-top: 100px;
}
--></style>
</head>
<body>
<div align="center">
<h2><span class="style3">404 Not Found</span></h2>
<table border="0" cellpadding="8" cellspacing="0" width="460">
<tbody><tr><td class="s">The requested URL was not found on this server.</td></tr></tbody>
</table>
</div>
</body>
</html>

Functional testing

I locally edited the script to change the value of img_url to a non-existent remote file, and confirmed that

the previously cached image was deleted/invalidated
the retry tried to get the file as requested
no new file was output into the cache directory

(ska3) flame:arc jean$ python get_solar_flare_png.py --image-cache-dir myretest --out-file new.png
WARN1NG: get_last_referenced_web_image(url=https://www.solen.info/solar/index.html, img_src_pattern=<img src=\"(images/AR_CH_\d{8}\.png)\", cache_dir=myretest) excepti0n: 404 Client Err0r: Not Found for url: https://www.solen.info/solar/images/AR_CH_20250303.png_broken, retrying in 5 seconds...
WARN1NG: get_last_referenced_web_image(url=https://www.solen.info/solar/index.html, img_src_pattern=<img src=\"(images/AR_CH_\d{8}\.png)\", cache_dir=myretest) excepti0n: 404 Client Err0r: Not Found for url: https://www.solen.info/solar/images/AR_CH_20250303.png_broken, retrying in 5 seconds...
Failed to get page or image: 404 Client Error: Not Found for url: https://www.solen.info/solar/images/AR_CH_20250303.png_broken

jeanconn · 2025-03-03T15:26:03Z

This comes up because I just went to put together an FSDS for the new replan central page and realized the image at bottom was just broken.

taldcroft · 2025-03-03T16:46:08Z

Couple of ideas for discussion:

Since the image only updates once a day, it seems like the cached image can be retained if the creation date of the file is less than 24 hours old. If the site goes down for a few hours there is no need to invalidate the cache completely.

For the (apparently common) case that the site is not responding, we don't need the logs cluttered with full tracebacks. It is probably sufficient to print some short benign message to the logs. People will notice if the image disappears for long stretches (and anyway there is nothing we can really do to fix the site being down).

About making this job run asynchronously, that is basically happening now with the entire main() in a try/except block.

jeanconn · 2025-03-03T16:54:39Z

If the site goes down for a few hours there is no need to invalidate the cache completely.

Though since the displayed image file is a separate copy not in the cache, invalidating the cache / deleting the cached image is benign and the right thing to do if we've moved on to a new day and don't have the file.

jeanconn · 2025-03-03T16:56:13Z

For the (apparently common) case that the site is not responding, we don't need the logs cluttered with full tracebacks.

Sure, so should we just include that handling in main to not spit out the traceback? And should this do the retry or not? I figured the retry would likely work for remote server hiccups.

jeanconn · 2025-03-03T16:56:58Z

About making this job run asynchronously, that is basically happening now with the entire main() in a try/except block.

Good point.

jeanconn · 2025-03-03T21:25:38Z

Also regarding "For the (apparently common) case that the site is not responding" - I have no information that it is common - just that the code in master did not invalidate the cached image (and instead created a broken one) on a failure and would have blithely served the failure text instead of the image for the rest of the day.

taldcroft · 2025-03-04T10:53:48Z

Looks OK but you need to update the functional testing.

My other question is whether the "Failed" and "Error" is going to trigger jobwatch emails / spam. I don't remember where things stand with the hourly job watch, but I don't think we need to be getting hourly messages if this particular image is not available.

jeanconn · 2025-03-04T17:55:51Z

I at least did the mangle_alert_words in the retry (to remove the number of lines with printed "real" warnings) and redid the functional testing.

With regard to jobwatch emails/spam, if this fails all attempts the last failure to get the page or image is and would be caught by the arc3 watch_cron_logs (to send an email once a day I think) and the daily log checking in jobwatch for replan central is also going to be red (if we don't add an exception). We shouldn't get hourly messages from jobwatch because the hourly watch just checks a subset of replan central outputs and this new image is not in the list.

This might be the right level of spam for now.

taldcroft

LGTM.

Add requests checking and retry to solar flare fetch

856ee76

jeanconn requested a review from taldcroft March 3, 2025 15:25

Print out HTTPError instead of traceback

ff84a38

Mangle alert words in the retry

9d8cc3f

taldcroft approved these changes Mar 4, 2025

View reviewed changes

jeanconn merged commit 0eed3e6 into master Mar 4, 2025
2 checks passed

jeanconn deleted the solar-flare-patch branch March 4, 2025 18:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add requests checking and retry to solar flare fetch #92

Add requests checking and retry to solar flare fetch #92

jeanconn commented Mar 3, 2025 •

edited

Loading

jeanconn commented Mar 3, 2025

taldcroft commented Mar 3, 2025

jeanconn commented Mar 3, 2025 •

edited

Loading

jeanconn commented Mar 3, 2025

jeanconn commented Mar 3, 2025

jeanconn commented Mar 3, 2025 •

edited

Loading

taldcroft commented Mar 4, 2025

jeanconn commented Mar 4, 2025

taldcroft left a comment

Add requests checking and retry to solar flare fetch #92

Add requests checking and retry to solar flare fetch #92

Conversation

jeanconn commented Mar 3, 2025 • edited Loading

Description

Functional testing

jeanconn commented Mar 3, 2025

taldcroft commented Mar 3, 2025

jeanconn commented Mar 3, 2025 • edited Loading

jeanconn commented Mar 3, 2025

jeanconn commented Mar 3, 2025

jeanconn commented Mar 3, 2025 • edited Loading

taldcroft commented Mar 4, 2025

jeanconn commented Mar 4, 2025

taldcroft left a comment

Choose a reason for hiding this comment

jeanconn commented Mar 3, 2025 •

edited

Loading

jeanconn commented Mar 3, 2025 •

edited

Loading

jeanconn commented Mar 3, 2025 •

edited

Loading