Skip to content

FIREFLY-1727: Handle Redis failure #1754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 6, 2025
Merged

Conversation

loitly
Copy link
Contributor

@loitly loitly commented May 1, 2025

Ticket: https://jira.ipac.caltech.edu/browse/FIREFLY-1727

  • Try to reconnect until Redis is back online
  • Show error status to client
  • Show status in ServerStatus page
  • Refactor LostConnection.jsx component to use JoyUI

Test: https://fireflydev.ipac.caltech.edu/firefly-1727-handle-redis-failure/firefly/
It's a bit harder to test. To simulate Redis down, I've deployed this instance with 2 replicas.

  • You can find and delete the Redis pod to see that Firefly can recover from a redis restart.
  • You can find and edit the Redis deployment so that i cannot start back up to see what happen when it's down for a long time.
  • Both requires kubectl access.

Locally:

  • You can see the LostConnection.jsx when you shutdown tomcat
  • Find redis process, then kill it and you will see it's got restarted
  • If you install Redis locally, then you can add -Dredis.host=127.0.0.1 to Tomcat's bin/setenv.sh to see how it behaves when Redis is down.
    • brew install redis
    • brew services stop redis
    • brew services start redis

- Try to reconnect until Redis is back online
- Show error status to client
- Show status in ServerStatus page
- Refactor LostConnection.jsx component to use JoyUI
@loitly loitly added this to the 2025.3 milestone May 1, 2025
@loitly loitly requested a review from robyww May 1, 2025 20:02
@loitly loitly self-assigned this May 1, 2025
Copy link
Contributor

@robyww robyww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

}
throw new ConnectException("Unable to connect to Redis at " + redisHost + ":" + REDIS_PORT);
String status = failSince == null ? "OK" :
"Failed since " + DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneId.systemDefault()).format(failSince);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not looking it up but I would think

withZone(ZoneId.systemDefault()

would have a default.

@@ -210,6 +221,10 @@ export function isAppReady() {
return get(flux.getState(), [APP_DATA_PATH, 'isReady']);
}

export function getConnectionStatus() {
return get(flux.getState(), [APP_DATA_PATH, 'connectionStatus']);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not

flux.getState()?.[APP_DATA_PATH]?.connectionnStatus

I find get to not be very readable and unnecessary these days.

const clzName = decor === 'small' ? 'lost-connection--small' : 'lost-connection';
const msg = 'You are no longer connected to the server. Try reloading the page to reconnect.';
if (decor !== 'full') return <div className = {clzName} title = {msg} />;
export function LostConnection () {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad you did this!

@loitly loitly merged commit 09a3047 into dev May 6, 2025
@loitly loitly deleted the FIREFLY-1727-handle-redis-failure branch May 6, 2025 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants