Resource Allocation (CPU, Memory)
The machine resources (cores, memory) are handled by the Hadean Platform. During startup, the Platform inspects the available resources on each machine. When it receives a request from Simulate to create a new process, the Hadean Platform allocates resources from a machine with sufficient spare capacity.
The total number of workers that can be spawned on a single machine depends on the resources available on that machine and on the resources required by each worker.
Total number of workers = Min(Nb_cores / Nb_cores_per_worker, Total_memory / Memory_per_worker)
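As a rough illustration of the formula above, the following sketch computes the upper bound for one machine. The helper name max_workers and its parameters are hypothetical and are not part of the Simulate API. The result is only an upper bound, since the Simulate internal processes and the Background Worker also consume machine resources.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of the Simulate API): upper bound on the
// number of workers a single machine can host, following
//   Min(Nb_cores / Nb_cores_per_worker, Total_memory / Memory_per_worker).
// cores_per_worker may be fractional (e.g. 0.1 for a "1/10" setting).
std::int64_t max_workers(double machine_cores, double cores_per_worker,
                         double machine_memory_mib, double memory_per_worker_mib) {
    assert(cores_per_worker > 0.0 && memory_per_worker_mib > 0.0);
    // Small tolerance so that ratios such as 1 / 0.1 are not rounded down
    // because of floating-point error.
    const double eps = 1e-9;
    const auto by_cores = static_cast<std::int64_t>(
        std::floor(machine_cores / cores_per_worker + eps));
    const auto by_memory = static_cast<std::int64_t>(
        std::floor(machine_memory_mib / memory_per_worker_mib + eps));
    // The scarcer resource determines how many workers fit.
    return std::min(by_cores, by_memory);
}
```

For instance, max_workers(8, 2.0, 16384, 48) returns 4: the core requirement (8 / 2 = 4) is far more restrictive than memory (16384 / 48 is about 341).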
The availability of a single machine is therefore limited by its initial capacity, the number of workers it currently hosts, and their individual requirements. The machine resources are also shared with the Simulate internal processes and the Background Worker. If no machine can satisfy a new spawning request, Simulate will reuse or resize existing workers.
To give you an idea of machine sizing: for local runs, the cores and memory of the local WSL machine depend on the WSL configuration, which is set in the .wslconfig file. See Global configuration options with .wslconfig for more information. For remote runs, clusters generally have a higher capacity (e.g. multiple machines with more than 8 cores).
The CPU and RAM allocation for each worker is defined through the static arguments set when initialising the Manager. Resource requirements can be set for both a Cell Worker and the Background Worker, as shown below. The first block defines the Cell Worker allocation and the second the Background Worker allocation. If the second block is not set, the Cell Worker allocation is used for the Background Worker.
auto static_args = arguments.to_octree_params<octree_traits_t>();
static_args.resources = {
    {
        "1/10", // Cell Worker CPU cores (may be fractional)
        "8MiB"  // Cell Worker RAM allocation
    },
    {
        "1/10", // Background Worker CPU cores (overrides the Cell Worker value)
        "16MiB" // Background Worker RAM allocation
    }
};
The CPU and RAM values are parsed and handled as strings. The first value is the number of CPU cores to assign, which can be fractional (e.g. "1/10"). The second value is the RAM allocation (e.g. "8MiB").
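To make the meaning of these strings concrete, here is a minimal sketch of how such values could be interpreted. It is not the platform's actual parser: the helper names parse_cores and parse_memory_bytes are hypothetical, and the set of accepted units (KiB, MiB, GiB) is an assumption, since only MiB appears in this section.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helpers for illustration only; the platform's real parsing
// rules may differ.

// Interprets "N" or "N/D" as a (possibly fractional) number of cores.
double parse_cores(const std::string& text) {
    assert(!text.empty());
    const auto slash = text.find('/');
    if (slash == std::string::npos) {
        return std::stod(text);  // whole or decimal value, e.g. "2"
    }
    const double numerator = std::stod(text.substr(0, slash));
    const double denominator = std::stod(text.substr(slash + 1));
    if (denominator == 0.0) {
        throw std::invalid_argument("zero denominator in: " + text);
    }
    return numerator / denominator;  // e.g. "1/10" -> 0.1
}

// Interprets "<integer><unit>" as a byte count.
// Assumed units: KiB, MiB, GiB (binary multiples).
std::uint64_t parse_memory_bytes(const std::string& text) {
    std::size_t pos = 0;
    const unsigned long long value = std::stoull(text, &pos);
    const std::string unit = text.substr(pos);
    if (unit == "KiB") return value << 10;
    if (unit == "MiB") return value << 20;
    if (unit == "GiB") return value << 30;
    throw std::invalid_argument("unsupported unit in: " + text);
}
```

Under these assumptions, "1/10" means one tenth of a core, so up to ten such workers could share a single core, and "8MiB" corresponds to 8 × 1024 × 1024 bytes.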
An internal empirical study has shown that simulations perform best when each worker has a dedicated physical core. On servers that support hyper-threading, this requires setting cores=2, since each physical core typically exposes two logical cores. The core requirement then usually acts as the limiting factor on the number of worker processes per machine. Memory is set to a safe value that is unlikely to saturate the machine capacity, e.g. 48MiB.
For best performance, it is highly recommended that you tune the worker requirements, which can have a big impact on overall simulation performance. You can also configure split/merge operations through the estimate_load function, which lets you define the load managed by each worker. Values defined by the user override the default values. If no requirements are defined, the default resource values are: cores=2, memory=48MiB.