DOC-6604 - thread pools #2298

Merged
17 commits
bd69929
DOC-6604 - thread pools material on 4.13
paulmoloneyr3 Apr 30, 2025
25c4b66
DOC-6604 - thread pool updates
paulmoloneyr3 May 8, 2025
d633f06
DOC-6604 - fixed link
paulmoloneyr3 May 8, 2025
b0155c0
Merge branch 'release/platform/4.13' into paulmoloneyr3/DOC-6604-thre…
paulmoloneyr3 May 8, 2025
471de58
Update content/en/platform/corda/4.13/enterprise/cordapps/thread-pool…
paulmoloneyr3 May 8, 2025
36742ff
DOC-6604 - review by Adel
paulmoloneyr3 May 8, 2025
bed1da6
Update content/en/platform/corda/4.13/enterprise/cordapps/thread-pool…
paulmoloneyr3 May 9, 2025
1a30ad9
Update content/en/platform/corda/4.13/enterprise/node/operating/monit…
paulmoloneyr3 May 9, 2025
499b8ea
Update content/en/platform/corda/4.13/enterprise/node/operating/monit…
paulmoloneyr3 May 9, 2025
e8c4e32
DOC-6604 - review updates
paulmoloneyr3 May 12, 2025
61c5bf6
Merge branch 'release/platform/4.13' into paulmoloneyr3/DOC-6604-thre…
paulmoloneyr3 Jun 30, 2025
3f66028
DOC-6604 - added threadpools update to release notes
paulmoloneyr3 Jun 30, 2025
7211aed
DOC-6604 - merged two almost identical topics: performance tuning and…
paulmoloneyr3 Jul 1, 2025
d8aa7ee
DOC-6604 - updated release notes with thread pools
paulmoloneyr3 Jul 1, 2025
2cec746
DOC-6604 - updates after review by Lajos
paulmoloneyr3 Jul 1, 2025
614524b
DOC-6604 - updates after github suggestions by Lajos
paulmoloneyr3 Jul 1, 2025
a97f8d2
Update content/en/platform/corda/4.13/enterprise/node/operating/node-…
paulmoloneyr3 Jul 1, 2025
164 changes: 164 additions & 0 deletions content/en/platform/corda/4.13/enterprise/cordapps/thread-pools.md
---
date: '2025-04-20'
menu:
corda-enterprise-4-13:
identifier: corda-enterprise-4-13-cordapps-flows-segthreadpools
parent: corda-enterprise-4-13-cordapps-flows
tags:
- api
- service
- classes
title: Using additional thread pools
weight: 10
---

Corda Enterprise executes flows in *thread pools*. A thread pool is a group of pre-created, idle threads, ready to execute tasks. The default Corda Enterprise configuration creates a single thread pool, whose size is configured by the *[flowThreadPoolSize]({{< relref "../node/setup/corda-configuration-fields.html#enterpriseconfiguration" >}})* parameter. Open Source Corda is single-threaded.

Corda 4.12 and earlier versions support only the single, default thread pool described above. From Corda 4.13 onward, Corda Enterprise lets operators define *multiple* thread pools and assign flows to them, so that particular flows can be prioritized and segregated from other flows.

For example, if there are slow-running reporting flows and more important transactional flows on the same system, the reporting flows can be separated into a dedicated thread pool so that they do not block the transactional flows.
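The effect of this segregation can be sketched outside Corda with plain `java.util.concurrent` executors. This is only an illustration of the idea, not Corda's implementation; the class name and method names are invented for the example, and the pool names mirror the scenario above:

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class PoolSegregationSketch {
    // Two independent fixed-size pools: slow reporting work cannot starve transactions.
    private final Map<String, ExecutorService> pools = Map.of(
            "reporting", Executors.newFixedThreadPool(3),
            "transactions", Executors.newFixedThreadPool(3));

    /** Submit a task to the named pool and return its pending result. */
    public <T> Future<T> submit(String pool, Callable<T> task) {
        return pools.get(pool).submit(task);
    }

    public void shutdown() {
        pools.values().forEach(ExecutorService::shutdownNow);
    }

    public static void main(String[] args) throws Exception {
        PoolSegregationSketch sketch = new PoolSegregationSketch();
        // A slow "reporting flow" occupies only the reporting pool...
        sketch.submit("reporting", () -> { Thread.sleep(2000); return null; });
        // ...so a "transactional flow" still completes promptly in its own pool.
        Future<String> tx = sketch.submit("transactions", () -> "committed");
        System.out.println(tx.get(1, TimeUnit.SECONDS));
        sketch.shutdown();
    }
}
```

With a single shared pool, three slow reporting tasks would occupy all threads and queue the transactional task behind them; with separate pools, each workload only competes for its own threads.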

Corda Enterprise targets the flow thread pools directly when it starts a flow, so a pool that is performing badly and has accumulated a large queue does not delay the starting of flows in other pools.

## Configuring thread pools

Thread pools are defined in the [node configuration]({{< relref "../node/setup/corda-configuration-file.md" >}}) by adding an `additionalFlowThreadPools` array within the `tuning` object. The array contains one or more objects, each specifying an additional thread pool through a `threadPool` property (the name of the pool) and a `size` property (its size in number of threads).

### Example 1: Two Defined Thread Pools

The following sample configuration defines two thread pools based on the example above, `reporting` and `transactions`, each with three available threads:

```
enterpriseConfiguration {
    tuning {
        additionalFlowThreadPools = [
            {
                threadPool = reporting,
                size = 3
            },
            {
                threadPool = transactions,
                size = 3
            }
        ]
    }
}
```

The related flows then need to be tagged accordingly:

```
@FlowThreadPool("reporting")
```

and

```
@FlowThreadPool("transactions")
```

### Example 2: One Defined Thread Pool and Default Thread Pool

An alternative configuration defines only one additional thread pool (in this case, `reporting`) and uses the default thread pool for everything else. As in previous versions of Corda, the size of the default thread pool (named `default`) is specified by the *[flowThreadPoolSize]({{< relref "../node/setup/corda-configuration-fields.html#enterpriseconfiguration" >}})* parameter.

```
enterpriseConfiguration {
    tuning {
        flowThreadPoolSize = 3
        additionalFlowThreadPools = [
            {
                threadPool = reporting,
                size = 3
            }
        ]
    }
}
```

Only the flows related to reporting then need to be tagged accordingly:

```
@FlowThreadPool("reporting")
```

## Logging

The Corda node's [startup log]({{< relref "../node/operating/monitoring-and-logging/overview.md" >}}) outputs the defined thread pools and their sizes; for example:

```
Created flow thread pools: reporting(3), transactions(3), default(20)
```

## Default flow-to-thread pool mapping rules

How flows are mapped to thread pools depends on:

- The thread pool configuration
- Whether the installed CorDapps customize the thread pool mapping rules

The default Corda `FlowSchedulerMapper` applies the following rules, in order of highest priority first:

1. If a flow is annotated with `@FlowThreadPool("threadpoolname")` and the referenced thread pool is defined in the configuration, then that flow is executed in the specified pool. If the specified thread pool is not present in the node configuration, then the default thread pool is used instead.

2. If a thread pool named `Peer-Origin` is defined, then all flows started via a peer Corda node and **not** annotated with a specific thread pool will be executed in that thread pool. Otherwise, such flows are executed in the default thread pool.

3. If a thread pool named `RPC-Origin` is defined, then all flows started via RPC (for example, by a client application) and **not** annotated with a specific thread pool will be executed in that thread pool. Otherwise, such flows are executed in the default thread pool.

4. If none of the above rules apply to a flow, then the default behavior is the same as in previous versions of Corda: the flow is executed in the default thread pool.
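Put together, the rules amount to a small decision function. The following sketch is not Corda's code — the class name, method name, and parameter types are invented for illustration — but the rule order and the `Peer-Origin`/`RPC-Origin` pool names follow the list above:

```java
import java.util.Set;

public class DefaultMappingSketch {
    /**
     * Resolve the thread pool for a flow, mirroring the default rules.
     *
     * @param annotatedPool   value of @FlowThreadPool, or null if the flow is not annotated
     * @param startedByPeer   true if the flow was started by a peer node rather than via RPC
     * @param configuredPools names of the additional thread pools defined in the node configuration
     */
    public static String resolvePool(String annotatedPool,
                                     boolean startedByPeer,
                                     Set<String> configuredPools) {
        // Rule 1: an annotated flow runs in its pool if the pool exists;
        // otherwise it falls back to the default pool.
        if (annotatedPool != null) {
            return configuredPools.contains(annotatedPool) ? annotatedPool : "default";
        }
        // Rules 2 and 3: un-annotated flows go to an origin-based pool, if one is defined.
        String originPool = startedByPeer ? "Peer-Origin" : "RPC-Origin";
        if (configuredPools.contains(originPool)) {
            return originPool;
        }
        // Rule 4: otherwise, the default pool, as in previous versions of Corda.
        return "default";
    }
}
```

For example, with `reporting` and `RPC-Origin` configured, an un-annotated flow started via RPC resolves to `RPC-Origin`, while an un-annotated flow started by a peer resolves to `default` because no `Peer-Origin` pool exists.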


## Customizing flow-to-thread pool mapping rules

CorDapps can override the default mapping logic above by providing a class that implements the [FlowSchedulerMapper interface](https://github.com/corda/corda/blob/release/os/4.13/core/src/main/kotlin/net/corda/core/flows/scheduler/mapper/FlowSchedulerMapper.kt):

```kotlin
interface FlowSchedulerMapper {
    fun getScheduler(
        invocationContext: InvocationContext,
        flowLogic: Class<out FlowLogic<*>>,
        ourIdentity: CordaX500Name
    ): String
}
```

The default mapping logic is implemented in [FlowSchedulerMapperImpl](https://github.com/corda/corda/blob/release/os/4.13/core/src/main/kotlin/net/corda/core/flows/scheduler/mapper/FlowSchedulerMapperImpl.kt).


At startup, Corda scans the installed CorDapps for classes implementing the `FlowSchedulerMapper` interface.
If it finds exactly one candidate with a constructor that accepts a set of strings, it uses that class as the flow mapper and logs:

```
Using custom flow scheduler mapper. Class {classname}
```

Corda aborts with an exception if it finds more than one candidate class, or if the candidate has no matching constructor.

The mapper's constructor receives the set of available additional thread pool names as an argument.
Its `getScheduler` method is called each time a flow is scheduled, and must return the name of the thread pool in which the flow should be executed.

Package your custom scheduler mapper in a separate CorDapp; this makes it simple to add to or remove from the system.
Bundling the mapper with your main CorDapp would also prevent installing multiple such apps side by side, because each would contribute its own custom scheduler mapper.

## Thread pool metrics

The following [metric]({{< relref "../node/operating/monitoring-and-logging/node-metrics.md" >}}) was introduced in 4.13 specifically for thread pools:

| Name | Description |
|--------------------------|-------------------------------------|
| QueueSizeTotal           | The sum of the queue sizes across all thread pools |

The following metrics are now broken down by thread pool:

| Previously | Corda 4.13 onward |
|------------------------------------------------|--------------------------------------------------------------------|
| ActiveThreads | ActiveThreads.{threadpoolname} |
| QueueSize | QueueSize.{threadpoolname} |
| QueueSizeOnInsert | QueueSizeOnInsert.{threadpoolname} |
| StartupQueueTime | StartupQueueTime.{threadpoolname} |
| FlowDuration.{Success/Failure}.{flowclassname} | FlowDuration.{Success/Failure}.{flowclassname}.{threadpoolname} |

Metrics related to the default thread pool do not have a *.default* suffix; this is for backward compatibility.
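These metrics can be read over JMX like any other node metric. The following is a minimal sketch using the standard `javax.management` API; the class name is invented, the service URL is a placeholder you supply, the node must expose a JMX endpoint, and reading the `Value` attribute assumes the metric is exposed as a standard gauge:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class FlowMetricsReader {
    /** Read a flow metric (for example "QueueSizeTotal" or "QueueSize.reporting") from a node's JMX endpoint. */
    public static Object readMetric(String jmxUrl, String metricName) throws Exception {
        try (JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(jmxUrl))) {
            MBeanServerConnection server = connector.getMBeanServerConnection();
            // Metric object names follow the pattern shown in the tables above.
            ObjectName name = new ObjectName("net.corda:type=Flows,name=" + metricName);
            return server.getAttribute(name, "Value");
        }
    }
}
```

For a locally running node with JMX enabled, a call such as `readMetric("service:jmx:rmi:///jndi/rmi://localhost:7005/jmxrmi", "QueueSizeTotal")` (port is an assumption) would return the total queue size.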

There are two types of cache:
- **Weight-based:** Measured by the number of bytes of memory occupied by the entries

{{< note >}}
The available set of metrics depends on the cache type. The `maximum-size` and `sizePercent` metrics are only available for size-based caches, while `maximum-weight`, `weight`, and `weightPercent` metrics are only available for weight-based caches.
{{< /note >}}

{{< table >}}

## Flows

Note that metrics related to the default thread pool do not have a *.default* suffix; this is for backward compatibility.

{{< table >}}

|Metric Query|Description|
|----------------------------------------------------------------|--------------------------------------------------------------------------------------|
|net.corda:type=Flows,name=ActiveThreads|The total number of threads running flows.|
|net.corda:type=Flows,name=ActiveThreads.{threadpool}|The total number of threads running flows for the specified [thread pool]({{< relref "../../../cordapps/thread-pools.md" >}}).|
|net.corda:type=Flows,name=CheckpointVolumeBytesPerSecondCurrent|The current rate at which checkpoint data is being persisted.|
|net.corda:type=Flows,name=CheckpointVolumeBytesPerSecondHist|A histogram indicating the rate at which bytes are being checkpointed.|
|net.corda:type=Flows,name=Checkpointing Rate|The rate at which checkpoint events are occurring.|
|net.corda:type=Flows,name=Error|The total number of flows failed with an error.|
|net.corda:type=Flows,name=ErrorPerMinute|The rate at which flows fail with an error.|
|net.corda:type=Flows,name=Finished|The total number of completed flows (both successfully and unsuccessfully).|
|net.corda:type=Flows,name=InFlight|The number of in-flight flows.|
|net.corda:type=Flows,name=QueueSize|The current size of the queue for flows waiting to be executed.|
|net.corda:type=Flows,name=QueueSizeOnInsert|A histogram showing the queue size at the point new flows are added.|
|net.corda:type=Flows,name=QueueSize.{threadpool}|The current size of the queue for flows waiting to be executed for the specified thread pool.|
|net.corda:type=Flows,name=QueueSizeOnInsert.{threadpool}|A histogram showing the queue size at the point new flows are added for the specified thread pool.|
|net.corda:type=Flows,name=QueueSizeTotal|The sum of the queue sizes across all thread pools.|
|net.corda:type=Flows,name=Started|The total number of flows started.|
|net.corda:type=Flows,name=StartedPerMinute|The rate at which flows are started.|
|net.corda:type=Flows,name=StartupQueueTime|This timer measures the time a flow spends queued before it is executed.|
|net.corda:type=Flows,name=StartupQueueTime.{threadpool}|This timer measures the time a flow spends queued before it is executed for the specified thread pool.|
|net.corda:type=Flows,name=Success|The total number of successful flows.|
|net.corda:type=Flows,name=<action_name>|A histogram indicating the time taken to execute a particular action. See the following section for more details.|
|net.corda:type=Flows,name=FlowDuration.Success.{flowclassname}|The duration of the specified flow in the default thread pool, if successful.|
|net.corda:type=Flows,name=FlowDuration.Failure.{flowclassname}|The duration of the specified flow in the default thread pool, if failed.|
|net.corda:type=Flows,name=FlowDuration.Success.{flowclassname}.{threadpoolname}|The duration of the specified flow in the specified thread pool, if successful.|
|net.corda:type=Flows,name=FlowDuration.Failure.{flowclassname}.{threadpoolname}|The duration of the specified flow in the specified thread pool, if failed.|

{{< /table >}}



{{< note >}}
`maximumPoolSize` cannot be less than the sum of the configured flow thread pool sizes plus `enterpriseConfiguration.tuning.rpcThreadPoolSize` plus 2. See [Optimising node performance]({{< relref "../../node/operating/optimizing.md" >}}) for more details. The defaults depend on the machine the node runs on; if `maximumPoolSize` is set below the minimum, an error appears showing the minimum required value.{{< /note >}}




## Adjusting the node settings

The main parameters that can be tweaked for a Corda Enterprise node are:

- **The number of thread pools used:** For more information, see [thread pools]({{< relref "../../cordapps/thread-pools.md" >}}).
- **The number of flow threads:** This is the number of flows that can be live and active in the state machine at the same time. The default value for this is twice the number of processor cores available on the machine, capped at 30.
- **The number of RPC threads:** This is the number of calls the RPC server can handle in parallel, enqueuing requests to the state machine. The default for this is the number of processor cores available on the machine.
- **The amount of heap space the node process can allocate:** The default for this is 512 megabytes.

The recommended approach is to start with a low number of flow threads (for example, 1 per gigabyte of heap memory) and increase the number over several runs. In tests at R3, giving a node twice as many flow threads as RPC threads proved a sensible starting point, but the best ratio depends on the hardware and the use case, so it is worth experimenting with it.

You can also define additional thread pools; for more information, see [Using additional thread pools]({{< relref "../../cordapps/thread-pools.md" >}}).

### Fine-tuning the Artemis configuration

The following configuration options control some aspects of Artemis and can affect the throughput and latency of an application:
- `p2pConfirmationWindowSize`: The size of the in-memory buffer used by the broker to buffer completed commands before acknowledging them to the client.
- `brokerConnectionTtlCheckIntervalMs`: The interval at which acknowledgements of completed commands are sent if `p2pConfirmationWindowSize` is not exhausted in time.
- `journalBufferSize`: The size of the in-memory buffer used to store messages before they are flushed to disk.
- `journalBufferTimeout`: The interval at which Artemis messages buffered in memory are flushed to disk if `journalBufferSize` is not exhausted in time.

As a result, you can control how frequently Artemis persists messages to disk and how frequently acknowledgements are sent back to clients. These values can affect the latency of flows, since a flow is expected to wait less on Artemis if it flushes messages to disk and sends acknowledgements more frequently. However, such configuration tweaks can also affect the throughput of flows, since flushing to disk more frequently and sending acknowledgements more frequently can result in a reduced efficiency of the utilisation of the disk and network resources. It is important that you benchmark any changes to these values in order to make sure that you have achieved the desired balance between throughput and latency.
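As an illustrative sketch only, such options might appear in the node configuration as shown below. The values are arbitrary, and the placement of each option within the configuration file is an assumption; verify both against the configuration fields reference before use:

```
// hypothetical placement and values -- verify against the configuration fields reference
p2pConfirmationWindowSize = 1048576

enterpriseConfiguration {
    tuning {
        brokerConnectionTtlCheckIntervalMs = 20
        journalBufferSize = 1048576
        journalBufferTimeout = 3333333
    }
}
```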

### Fine-tuning transaction resolution

In some cases, a node might have to resolve the provenance chain of a transaction from a counterparty. The configuration option `backchainFetchBatchSize` controls how many transactions the node requests at a time when performing this resolution. It defaults to a relatively large value, but you might need to increase it further if nodes have to resolve extremely large transaction chains. Increasing this value can reduce the latency of flows, since nodes can resolve a transaction chain with fewer round trips. It can also improve throughput, because shorter-lived flows let nodes complete more of them. However, it might also increase the utilisation of network bandwidth and node resources in general, so the actual results depend on your environment.
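As a sketch, with an arbitrary value, and with the top-level placement of the key being an assumption to verify against the configuration fields reference:

```
backchainFetchBatchSize = 100
```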

## Disk access

The node needs to write log files to disk, and has an Artemis spool directory that is used for the durable queues on the hard disk, so disk I/O for the node’s working directory has an impact on the node performance. For optimal performance, this should be on a fast, local disk.