Automatic translator updates #1
Comments
It would be great to have the translators repo URL in a config file, to be able to use a fork instead of the original translators repo.
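A minimal sketch of what that could look like, assuming a hypothetical `translatorsRepoURL` config key and a hypothetical `TRANSLATORS_REPO_URL` environment variable (neither is an existing translation-server option). The default points at the upstream translators repo on GitHub.

```javascript
// Upstream translators repository, used when nothing else is configured.
const DEFAULT_TRANSLATORS_REPO = 'https://github.com/zotero/translators';

// Resolve the repo URL in priority order: environment variable, config value, default.
// `translatorsRepoURL` and `TRANSLATORS_REPO_URL` are illustrative names only.
function resolveTranslatorsRepo(config = {}, env = process.env) {
    return env.TRANSLATORS_REPO_URL
        || config.translatorsRepoURL
        || DEFAULT_TRANSLATORS_REPO;
}
```

A fork would then be selected with something like `resolveTranslatorsRepo({ translatorsRepoURL: 'https://github.com/example/translators' })`, where the fork URL is a placeholder.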
Summarized some thoughts here:

Main problem: The translators are currently pulled during startup of the Lambda environment and not updated afterward. They can end up out of date, and if a translator changes, the new version will not be used until AWS decides to drop the existing container and create a new one.

Idea 1: Add a middleware to the Lambda that pulls metadata, checks the /tmp/ folder for cached translator code, and pulls the code from repo.zotero.org if the file doesn't exist or is too old.

Idea 2: Maybe we could use Lambda layers? Translators can be packaged as a layer for the Lambda function. The streaming server (correct me if I'm wrong about how that piece works) can notify another agent (most likely another Lambda function) that fetches the metadata, asks repo.zotero.org about updated translators, creates a new layer, and updates the translation server's Lambda function to use the new layer. That way, fetching and updating translators happens in only one place, the translation server is independent of this logic, and repo.zotero.org is not checked for no reason.

edit: In fact, the actual layering may not be needed. This other agent (most likely another Lambda) can pull the latest translation-server code with the latest translators and run the deployment script. That way it's one less thing to worry about.

Idea 3: Skip the streaming server and try to use GitHub Actions CI. On push to master in the translators repo, we can pull translation-server, move the latest translator files into the right folder of translation-server, and then run the deploy script to update the Lambdas.
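As a rough illustration of Idea 1, the cache check could look something like the sketch below. The cache directory, the one-hour age limit, the one-file-per-translator layout, and the injected `fetchTranslator` function are all assumptions made for the example. The actual request to repo.zotero.org is deliberately left to the caller.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical cache location (under /tmp on Lambda) and maximum age.
const CACHE_DIR = path.join(os.tmpdir(), 'translators-cache');
const MAX_AGE_MS = 60 * 60 * 1000;

// True if the cached file is missing or was last modified more than maxAgeMs ago.
function needsRefresh(filePath, maxAgeMs = MAX_AGE_MS, now = Date.now()) {
    let stat;
    try {
        stat = fs.statSync(filePath);
    }
    catch (e) {
        if (e.code === 'ENOENT') return true;
        throw e;
    }
    return now - stat.mtimeMs > maxAgeMs;
}

// Return translator code from the cache, refetching when missing or stale.
// `fetchTranslator(translatorID)` is supplied by the caller and must resolve
// to the translator source as a string.
async function getTranslatorCode(translatorID, fetchTranslator, options = {}) {
    const { cacheDir = CACHE_DIR, maxAgeMs = MAX_AGE_MS } = options;
    const filePath = path.join(cacheDir, translatorID + '.js');
    if (needsRefresh(filePath, maxAgeMs)) {
        const code = await fetchTranslator(translatorID);
        fs.mkdirSync(cacheDir, { recursive: true });
        fs.writeFileSync(filePath, code);
        return code;
    }
    return fs.readFileSync(filePath, 'utf8');
}
```

This only checks file age. Comparing against translator metadata, as the idea describes, would need an additional lookup that is not shown here.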
Just to clarify, translators are currently updated when we update the git submodule and redeploy. It's not related to the Lambda execution environment at the moment.

I don't think we need or want to overly focus on Lambda here. That's how we deploy it, but I don't think there's any fundamental reason we can't have the same solution for Docker or straight Node deployments. So most of the logic here should just go in the main logic outside of lambda.js. (The Lambda part does imply that using the streaming server doesn't make sense, since we can't use a persistent connection. I'm not sure if we were even deploying to Lambda when I opened this ticket, but ignore that in any case.)

I think continuing to use a submodule for the base set of translators is OK. Most translators don't change for years at a time, so the server will be able to continue to use hundreds of them without downloading updates, and automatic updates should also be an optional setting.

I think the basic process is:
So after a server deployment, it will request

For Lambda, since this is all cached outside of the invocation, execution environments will share the same set of updated translators. New ones will have to start with the submodule set. (We'll be able to see how often that's happening, and that can influence how often we bother updating the submodule, but I don't expect it to be much of a problem, as long as we're only fetching updated translators.)
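To make the caching point concrete, here is a small sketch of an overlay kept at module scope, which on Lambda persists across warm invocations of the same execution environment. The function names and the `Map`-based base set are hypothetical. The `translatorID` and `lastUpdated` fields are assumed to follow the translator metadata format, with `lastUpdated` as a sortable timestamp string.

```javascript
// Module scope: created once per process (or per Lambda execution environment),
// so updates fetched in one invocation are visible to later ones.
const updatedTranslators = new Map(); // translatorID -> translator object

// Prefer a downloaded update; otherwise fall back to the submodule (base) set.
function getTranslator(baseTranslators, translatorID) {
    return updatedTranslators.get(translatorID) || baseTranslators.get(translatorID);
}

// Store an update only if it is newer than the version currently in use.
// Returns true when the overlay was changed.
function applyUpdate(baseTranslators, translator) {
    const current = getTranslator(baseTranslators, translator.translatorID);
    if (current && current.lastUpdated >= translator.lastUpdated) {
        return false;
    }
    updatedTranslators.set(translator.translatorID, translator);
    return true;
}
```

A fresh environment starts with an empty overlay, so it serves the submodule set until the first update is applied.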
Pull from the translators repo at startup and connect to the streaming server for immediate updates. These should be optional.