RFC: handle batch messages in parallel in batch module

**Is your feature request related to a problem? Please describe.**

At the moment, the batch module processes messages sequentially ([code](https://github.com/aws-powertools/powertools-lambda-java/blob/bb9bb2e540b58938caf29a46731f4fedc8fea155/powertools-batch/src/main/java/software/amazon/lambda/powertools/batch/handler/SqsBatchMessageHandler.java#L67)). Processing them in parallel could improve performance.

**Describe the solution you'd like**
- The `BatchMessageHandler` could provide a `processBatchInParallel` method with the same signature as `processBatch` but different behaviour (parallel instead of serial processing).
- Instead of iterating through the list of messages, we could use [`CompletableFuture`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html). It would be something like this (probably not that easy, but it's a start):

```java
// Size the pool independently of the vCPU count; the factor of 2 is a starting point to tune.
ExecutorService executor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors() * 2);

// Submit every message to the pool; each future yields a failure item or an empty Optional.
List<CompletableFuture<Optional<SQSBatchResponse.BatchItemFailure>>> futures = event.getRecords().stream()
        .map(sqsMessage -> CompletableFuture.supplyAsync(
                () -> processTheMessageAndReturnOptionalOfBatchItemFailure(sqsMessage, context), executor))
        .collect(Collectors.toList());

// Wait for all messages (join() avoids the checked exceptions of get()), then keep only the failures.
List<SQSBatchResponse.BatchItemFailure> batchItemFailures = futures.stream()
        .map(CompletableFuture::join)
        .filter(Optional::isPresent)
        .map(Optional::get)
        .collect(Collectors.toList());

// Release the pool threads once the batch is done.
executor.shutdown();
```
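
To try the idea outside Lambda, here is a minimal, self-contained sketch of the same approach. It assumes plain `String` message IDs instead of SQS records, and the `CompletableFutureBatchSketch` class, its `processMessage` helper, and the `poolSize` parameter are hypothetical stand-ins rather than part of the batch module. A failing message is reported as a failure item instead of aborting the batch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

final class CompletableFutureBatchSketch {

    // Hypothetical stand-in for the per-message handler: returns the message ID
    // when processing fails, or an empty Optional on success.
    static Optional<String> processMessage(String messageId) {
        try {
            Thread.sleep(50); // simulate I/O-bound work such as a downstream call
            if (messageId.startsWith("fail")) {
                throw new IllegalStateException("simulated failure for " + messageId);
            }
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.of(messageId);
        } catch (RuntimeException e) {
            // A handler exception becomes a failure item for this message only.
            return Optional.of(messageId);
        }
    }

    static List<String> processBatchInParallel(List<String> messageIds, int poolSize) {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        try {
            // Submit everything first; joining inside the same stream would serialize the work.
            List<CompletableFuture<Optional<String>>> futures = messageIds.stream()
                    .map(id -> CompletableFuture.supplyAsync(() -> processMessage(id), executor))
                    .collect(Collectors.toList());

            // Join in submission order so failures come back in batch order.
            return futures.stream()
                    .map(CompletableFuture::join)
                    .filter(Optional::isPresent)
                    .map(Optional::get)
                    .collect(Collectors.toList());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> failures = processBatchInParallel(
                Arrays.asList("ok-1", "fail-2", "ok-3", "fail-4"), 4);
        System.out.println(failures); // prints [fail-2, fail-4]
    }
}
```

Creating the pool per call keeps the sketch simple; a real handler might prefer to keep the executor in a field so it is reused across invocations.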

**Describe alternatives you've considered**
Streams provide a `parallel` method, whose parallelism is based on the number of vCPUs (`Runtime.getRuntime().availableProcessors()`). With `CompletableFuture`, we can choose the size of the executor's thread pool, potentially more than the number of vCPUs. We should probably run some load tests on Lambda to check whether that's actually better, because `parallel` is probably much easier to implement.
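
For comparison, the parallel-stream alternative could look roughly like the sketch below. It is self-contained and assumes plain `String` message IDs; the `ParallelStreamBatchSketch` class and its `processMessage` helper are hypothetical stand-ins, not part of the batch module. The stream runs on the common `ForkJoinPool`, so there is no executor to size or shut down.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class ParallelStreamBatchSketch {

    // Hypothetical stand-in for the per-message handler: returns the message ID
    // when processing fails, or an empty Optional on success.
    static Optional<String> processMessage(String messageId) {
        try {
            if (messageId.startsWith("fail")) {
                throw new IllegalStateException("simulated failure for " + messageId);
            }
            return Optional.empty();
        } catch (RuntimeException e) {
            // A handler exception becomes a failure item for this message only.
            return Optional.of(messageId);
        }
    }

    static List<String> processBatchInParallel(List<String> messageIds) {
        // parallelStream() uses the common ForkJoinPool; collecting an ordered
        // stream to a list keeps the failures in batch order.
        return messageIds.parallelStream()
                .map(ParallelStreamBatchSketch::processMessage)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> failures = processBatchInParallel(
                Arrays.asList("ok-1", "fail-2", "ok-3", "fail-4"));
        System.out.println(failures); // prints [fail-2, fail-4]
    }
}
```

The trade-off is the one described above: less code, but the degree of parallelism is tied to the common pool rather than chosen per handler, which may matter for I/O-bound message processing.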

**Additional context**

RFC: handle batch messages in parallel in batch module #1540

Activity

scottgerring commented on Dec 21, 2023

jeromevdl commented on Dec 21, 2023

itsmichaelwang commented on Dec 22, 2023

jeromevdl commented on Dec 22, 2023

scottgerring commented on Dec 22, 2023

jeromevdl commented on Dec 22, 2023

jeromevdl commented on Dec 22, 2023

scottgerring commented on Dec 22, 2023

kyuseoahn commented on Jan 8, 2024

scottgerring commented on Jan 16, 2024

kyuseoahn commented on Jan 16, 2024

scottgerring commented on Jan 25, 2024

jeromevdl commented on Jul 4, 2024
