Add ability to handle pages of transactions asynchronously #22
base: develop
620fb0f
04e15c9
EmielSchmeink
left a comment
Tested locally and seems to work.
Status update: This is no longer needed. It was only used by Data Quality scripts when running time limits were very tight (Azure Functions). Since we now use Argo Workflows instead, those limits no longer apply. If this is picked up at a later point, we need to think carefully about how to integrate async into the current SDK without breaking existing contracts, i.e. while keeping backward compatibility.
Proposed changes
Recall that, when querying transactions, the dataset service returns a paginated response: only 10,000 transactions can be queried at a time (we call this one page). Currently, the SDK hides this by requesting all pages one by one until all transactions have been received.
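The current one-by-one behaviour can be sketched roughly as follows. This is only an illustration, not the SDK's actual code: `fetch_all` and the `fetch_page(offset, limit)` callable are hypothetical names, and stopping on a short page is an assumed termination rule (the real service may signal the last page differently).

```python
from typing import Callable, List


def fetch_all(
    fetch_page: Callable[[int, int], List[dict]],  # hypothetical: (offset, limit) -> one page
    page_size: int = 10_000,
) -> List[dict]:
    """Request pages sequentially and return all transactions in one list."""
    transactions: List[dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        transactions.extend(page)
        # Assumption: a page shorter than page_size means there is nothing left.
        if len(page) < page_size:
            break
        offset += page_size
    return transactions
```

Note that nothing is handed to the caller until the last page has arrived, which is the limitation this PR addresses.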
However, in some cases it can be useful to start processing a page of transactions while waiting for the next one to be returned. This PR adds that functionality by adding a callback parameter to the `get_transactions` function; the callback is called after each page of transactions is received.
Types of changes
What types of changes does your code introduce to this repository?
Testing
Explain here how to run your unit tests, and explain how to execute manual testing.
Unit testing
n/a
Manual testing
Install snapshot version using
Then, for instance, something like the following. It should print the different pages and their sizes (always 10,000) multiple times.
Further comments
Uses Python's built-in asyncio to handle concurrency. This allows us to process a page of transactions while waiting for the next page of transactions to be returned by the API.
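A minimal sketch of the idea, under stated assumptions: `fetch_page` is a hypothetical stand-in for one paginated API request, `get_transactions_sketch` is a simplified illustration rather than the SDK's real `get_transactions` signature, and `PAGE_SIZE` is shrunk from 10,000 to 3 to keep the example small. Each page's callback is scheduled as an asyncio task, so it can run while the next page is being awaited.

```python
import asyncio
from typing import Any, Callable, Coroutine, List

PAGE_SIZE = 3  # stand-in for the service's 10,000-transaction page size
Page = List[int]


async def fetch_page(page_number: int, total: int) -> Page:
    """Hypothetical stand-in for one paginated API request."""
    await asyncio.sleep(0.01)  # simulates network latency
    start = page_number * PAGE_SIZE
    return list(range(start, min(start + PAGE_SIZE, total)))


async def get_transactions_sketch(
    total: int,
    callback: Callable[[Page], Coroutine[Any, Any, None]],
) -> List[int]:
    """Fetch every page and pass each one to `callback` as soon as it arrives."""
    transactions: List[int] = []
    pending = []
    page_number = 0
    while page_number * PAGE_SIZE < total:
        page = await fetch_page(page_number, total)
        transactions.extend(page)
        # Schedule the callback as a task: it runs while the next page is awaited.
        pending.append(asyncio.create_task(callback(page)))
        page_number += 1
    # Make sure every callback has finished before returning.
    await asyncio.gather(*pending)
    return transactions


async def print_page_size(page: Page) -> None:
    """Example callback: report the size of each received page."""
    print(f"received page with {len(page)} transactions")


if __name__ == "__main__":
    asyncio.run(get_transactions_sketch(7, print_page_size))
```

Returning the full list as well as invoking the callback is one way such a change could stay compatible with callers that expect all transactions at once; whether the PR does this is not stated here.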