Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

📜 Add tags to mailchimp users #6080

Merged
merged 4 commits into from
Jan 15, 2025
Merged

📜 Add tags to mailchimp users #6080

merged 4 commits into from
Jan 15, 2025

Conversation

boryanagoncharenko
Copy link
Collaborator

@boryanagoncharenko boryanagoncharenko commented Dec 31, 2024

This PR adds the following functionality:

  • when a user, who is subscribed to the newsletter, changes their country, the change is reflected in the Mailchimp subscriber profile.
  • when a user creates or customizes a class, a corresponding tag is added to their Mailchimp subscriber profile.

Fixes partially #3995

How to test
Ensure you have access to the Mailchimp account of Hedy.

  • Create a new teacher account and subscribe to the newsletter. Check that the Mailchimp subscriber has the tags 'teacher' and the country you selected.
  • Create a class and do not customize it. Check that the Mailchimp subscriber has a 'created_class' tag.
  • Customize the class and check that the Mailchimp subscriber has a new tag: 'customized_class'
  • Go to the profile page of the teacher and change their country. Check that the Mailchimp subscriber has the country tag updated accordingly. The old country, if any, should be removed and the new country, if any, should be added.
  • Go to the profile page of the teacher and change their email. Check that there is a Mailchimp subscriber with the new email address and that all tags from the old subscriber are present. Check that the old Mailchimp subscriber is archived.

Note that currently you cannot subscribe to the newsletter after you signed up. So you cannot check if a teacher with existing customized classes gets the right tags when they subscribed.

To consider:
All functions for managing subscriptions are blocking but we do not need their output to continue with the response and the calls to Mailchimp are not retried. Is it worth investing in these for this feature?

@boryanagoncharenko boryanagoncharenko marked this pull request as draft December 31, 2024 14:09
@@ -36,42 +32,6 @@

env = os.getenv("HEROKU_APP_NAME")

MAILCHIMP_API_URL = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@@ -50,6 +51,7 @@ def create_class(self, user):
}

self.db.store_class(Class)
add_class_created_to_subscription(user['email'])
Copy link
Collaborator

@rix0rrr rix0rrr Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how fast or reliable the mailchimp API is, but in case it's slow or buggy: any way we can avoid doing this if it's already been done? (By keeping flags in Dynamo or sth?)

Also are we catching exceptions for low-value API calls like this, in case Mailchimp is down? (I think we're better off not setting a label than crashing the page)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit of a bummer this feels weird to be part of database.py, no?

Otherwise that would seem like a much better place for all these repetitive calls...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The calls to Mailchimp are defenseless at this point - no exception handling, retries, timeouts, and they are blocking the response while they do not need to. I don't think retries are needed in this case - I confirmed this with the POs. We definitely need exception handling (so that we do not fail the whole call) and timeouts (perhaps maximum 5 seconds?). I am not sure about making the call non-blocking. This could be a fire-and-forget call but I am on the fence on the cost vs benefit. What do you think?

I did not think of storing the data in the DB to reduce the Mailchimp calls. These operations are not frequent and if we have timeouts + exception handling, would it be necessary?

I was tempted to put this in the database.py but it truly does not belong there semantically. I am also a bit afraid moving this to the database.py file could lead to potential bugs in the future because other maintainers would probably not expect it either.

Copy link
Collaborator

@rix0rrr rix0rrr Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other maintainers would probably not expect it either.

Yeah that's extremely fair. And I also hate to introduce another domain modeling layer just for the occasional call that needs to do both database and something else.

As for the calls themselves: I think tossing them onto a background thread and catching and logging errors is probably a nice middle ground. This should not be a huge lift; we just need to pay attention that the HTTP calls we are doing do not access mutable global state, and the library is thread-safe itself as well.

Something like:

def fire_and_forget(f):
  def run_this_and_catch_errors():
    try:
      f()
    except Exception:
      logger.exception('Exception in background thread')

  t = threading.Thread(target=run_this_and_catch_errors, daemon=True)
  t.start()

We have some threading code in there already, to deal with logging to S3 (and, quite anciently, to something called JSONBin). You could have a look there as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointers. I have seen the threading code but tbh I was a bit hesitant to spawn threads like that since they could go out of hand. Since the calls for creating or updating a subscription are not a lot, probably this will be ok and this is just me overthinking. So, let's go this way.

Btw I thought nowadays people use asyncio and I have no experience with it, so this is what I meant by cost of implementation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the calls for creating or updating a subscription are not a lot, probably this will be ok and this is just me overthinking

That's what I'm assuming. If we're at all concerned, we should probably use this: https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor, set it to a reasonable number of workers (could be 1), and just send jobs to that (that we never wait for). Maybe that's just the safer default anyway...

Btw I thought nowadays people use asyncio and I have no experience with it, so this is what I meant by cost of implementation.

async io is quite a trend... and the benefits are also that it is trendy, not that it is better. Or at least, I'm not convinced that it is in common scenario's.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me check that I understand your suggestion. If we swap the thread.start() with creating a new ThreadPoolExecutor on every request, that too can go out of hand. So probably you mean that we should create a global ThreadPoolExecutor on app startup and use it whenever we need to run and not wait for a function. Is this correct? If this is the case though, based on the docs, a ThreadPoolExecutor does not seem like the right thing to use:

All threads enqueued to ThreadPoolExecutor will be joined before the interpreter can exit. Note that the exit handler which does this is executed before any exit handlers added using atexit. This means exceptions in the main thread must be caught and handled in order to signal threads to exit gracefully. For this reason, it is recommended that ThreadPoolExecutor not be used for long-running tasks.

I did not realize asyncio is coroutines. I'd also not go for that in our case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think that docstring precludes using it, but the individual threads are probably also fine for our use case

@wraps(f)
def inner(*args, **kws):
if MAILCHIMP_API_URL:
return f(*args, **kws)
Copy link
Collaborator

@rix0rrr rix0rrr Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nitpick: probaby we shouldn't be doing return f(...): f can only ever return None because it may be that we're not calling it at all, in which case we will definitely return None so you can't rely on a return value anyway?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, thanks

Copy link
Contributor

mergify bot commented Jan 15, 2025

Thank you for contributing! Your pull request is now going on the merge train (choo choo! Do not click update from main anymore, and be sure to allow changes to be pushed to your fork).

Copy link
Contributor

mergify bot commented Jan 15, 2025

Thank you for contributing! Your pull request is now going on the merge train (choo choo! Do not click update from main anymore, and be sure to allow changes to be pushed to your fork).

@mergify mergify bot merged commit 4e12603 into main Jan 15, 2025
11 checks passed
@mergify mergify bot deleted the newsletter_3995 branch January 15, 2025 20:13
boryanagoncharenko added a commit that referenced this pull request Jan 22, 2025
mergify bot pushed a commit that referenced this pull request Jan 22, 2025
This reverts commit 4e12603.

Temporarily reverts the change in #6080 

The revert is needed because the content for the Mailchimp emails is not ready. Releasing this change without the counterpart in Mailchimp, would result in either users receiving irrelevant content or double emails. We will merge this back to mail when the content is ready.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants