The scheduler handles all requisite operations, such as resource accounting, machine provisioning, bootstrapping of server nodes and transmission of processes to spawn. It starts managers based on a configuration file that lists the IP addresses, ports, CPUs and memory of the available machines. It has an overall view of everything that is running, and once it has created one or more managers it can request that they spawn user processes. User processes are defined by the code that runs as requested by the end user, or by an already running process (by making a spawn() call).
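As an illustration of the resource accounting described above, here is a minimal sketch in Python. The configuration layout (JSON with "machines", "ip", "port", "cpus", "memory_mb") and the names Machine, load_machines and place are assumptions made for this example; the actual Hadean configuration format is not shown in the text.

```python
import json
from dataclasses import dataclass

# Hypothetical configuration format: the text only says the file lists
# IP addresses, ports, CPUs and memory for each available machine.
EXAMPLE_CONFIG = """
{
  "machines": [
    {"ip": "10.0.0.1", "port": 8000, "cpus": 4, "memory_mb": 8192},
    {"ip": "10.0.0.2", "port": 8000, "cpus": 8, "memory_mb": 16384}
  ]
}
"""


@dataclass
class Machine:
    ip: str
    port: int
    cpus: int
    memory_mb: int
    used_cpus: int = 0
    used_memory_mb: int = 0

    def can_fit(self, cpus, memory_mb):
        """True if the unreserved CPUs and memory cover the request."""
        return (self.cpus - self.used_cpus >= cpus
                and self.memory_mb - self.used_memory_mb >= memory_mb)


def load_machines(text):
    """Parse the (assumed) JSON configuration into Machine records."""
    return [Machine(**entry) for entry in json.loads(text)["machines"]]


def place(machines, cpus, memory_mb):
    """Reserve resources on the first machine with room; return it or None."""
    for machine in machines:
        if machine.can_fit(cpus, memory_mb):
            machine.used_cpus += cpus
            machine.used_memory_mb += memory_mb
            return machine
    return None
```

A first-fit policy is used here only because it is the simplest to read; a real scheduler would likely weigh more factors when choosing a machine.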
The platform provides a robust and scalable foundation for Hadean applications. Programs link against the Hadean library, which provides an asynchronous spawn() function that can be called to create a new process at any time.
When the spawn() function is called, the library communicates with the scheduler, providing both the program and information on how to run it. A locality API is available to indicate preferences about the kinds of infrastructure on which the process should run, for instance geographical location, proximity to other infrastructure and available hardware.
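To make the shape of such a request concrete, the following is a sketch in Python using asyncio. None of these names come from the Hadean library: Locality, FakeScheduler and this spawn signature are hypothetical stand-ins that only mirror what the text describes (an asynchronous call that sends a program, run information and locality preferences to a scheduler).

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Locality:
    # Hypothetical fields mirroring the preferences named in the text.
    region: Optional[str] = None          # geographical location
    near: Optional[str] = None            # proximity to other infrastructure
    hardware: List[str] = field(default_factory=list)  # e.g. ["gpu"]


class FakeScheduler:
    """Stand-in scheduler: records each request and returns a process id."""

    def __init__(self):
        self.requests = []

    async def submit(self, program, args, locality):
        await asyncio.sleep(0)  # placeholder for network I/O
        self.requests.append((program, args, locality))
        return len(self.requests)


async def spawn(scheduler, program, args=(), locality=None):
    """Send the program and run information to the scheduler."""
    return await scheduler.submit(program, tuple(args), locality or Locality())


async def main():
    scheduler = FakeScheduler()
    # Two spawns issued concurrently, each with its own preferences.
    pids = await asyncio.gather(
        spawn(scheduler, "worker", ["--shard", "0"], Locality(region="eu-west")),
        spawn(scheduler, "worker", ["--shard", "1"], Locality(hardware=["gpu"])),
    )
    return scheduler, pids


if __name__ == "__main__":
    _, process_ids = asyncio.run(main())
    print(process_ids)
```

Because the call is asynchronous, the caller can issue several spawns and continue working while the scheduler decides where each process should run.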
The scheduler, if necessary, provisions the infrastructure (machines, networking, etc.) and launches a manager on each new server; each manager has responsibility for the machine on which it runs. The scheduler then requests that the manager launch the provided program. The manager runs the user-supplied program on the (possibly newly-created) machines, and runs an enforcer that forwards any interesting output from the process to a logging service. The enforcer administers the strong isolation and predictability guarantees provided by the Hadean model, and ensures that dangerous syscalls or instructions that could break these guarantees are sanitised.
Hadean channels serve as a first-class distributed IPC (Inter-Process Communication) primitive, ensuring all programs are both distribution-agnostic and scale-agnostic. After spawn(), channels (either local or distributed) may be created between the spawning and spawned processes for the purpose of IPC during the program’s execution.
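The idea that program logic is written against a channel interface, regardless of where the other end lives, can be sketched with an in-process analogy in Python. This is not the Hadean channel API: LocalChannel is a hypothetical stand-in built on queue.Queue, and a thread plays the role of the spawned process.

```python
import queue
import threading


class LocalChannel:
    """In-process stand-in for a one-directional channel."""

    def __init__(self):
        self._queue = queue.Queue()

    def send(self, message):
        self._queue.put(message)

    def receive(self, timeout=5.0):
        return self._queue.get(timeout=timeout)


def child(inbox, outbox):
    """Spawned side: square each number received until None arrives."""
    while True:
        item = inbox.receive()
        if item is None:
            break
        outbox.send(item * item)


def run_parent(values):
    """Spawning side: create two channels, start the child, exchange data."""
    to_child, from_child = LocalChannel(), LocalChannel()
    worker = threading.Thread(target=child, args=(to_child, from_child))
    worker.start()
    for value in values:
        to_child.send(value)
    to_child.send(None)  # sentinel telling the child to stop
    results = [from_child.receive() for _ in values]
    worker.join()
    return results
```

The child function only calls send and receive, so a distributed implementation with the same two methods could be substituted without changing it, which is the sense in which such code is distribution-agnostic.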
Dynamic scalability is a core component of Hadean, ensuring problems such as under-/over-provisioning are avoided. The scheduler is built with a plugin architecture that can be used to add or remove backends for different means of provisioning machines - different clouds, on-prem deployments, et cetera.
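A plugin architecture of this kind is commonly expressed as an abstract interface plus a registry of implementations. The sketch below, in Python, assumes hypothetical names (ProvisioningBackend, StaticOnPremBackend, BackendRegistry) and a simple "try each backend in order" policy; the source does not describe how the Hadean scheduler selects a backend.

```python
from abc import ABC, abstractmethod


class ProvisioningBackend(ABC):
    """Interface each provisioning plugin is assumed to implement."""

    @abstractmethod
    def provision(self, cpus, memory_mb):
        """Return the IP address of a machine, or raise RuntimeError."""

    @abstractmethod
    def release(self, ip):
        """Hand the machine back to the underlying infrastructure."""


class StaticOnPremBackend(ProvisioningBackend):
    """Toy backend serving machines from a fixed list of addresses."""

    def __init__(self, ips):
        self._free = list(ips)

    def provision(self, cpus, memory_mb):
        if not self._free:
            raise RuntimeError("no on-prem machines left")
        return self._free.pop(0)

    def release(self, ip):
        self._free.append(ip)


class BackendRegistry:
    """Backends can be added or removed at run-time by name."""

    def __init__(self):
        self._backends = {}

    def add(self, name, backend):
        self._backends[name] = backend

    def remove(self, name):
        self._backends.pop(name, None)

    def provision(self, cpus, memory_mb):
        """Try each registered backend in insertion order."""
        for name, backend in self._backends.items():
            try:
                return name, backend.provision(cpus, memory_mb)
            except RuntimeError:
                continue  # this backend is exhausted; try the next one
        raise RuntimeError("no backend could provision a machine")
```

A cloud backend would implement the same two methods by calling the provider's own API, leaving the registry and its callers unchanged.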
Hadean pools resources to ensure that they are available in a timely manner. The platform maintains a pre-allocated list of machines, each of which has had a manager and enforcer installed in anticipation. Before these machines can be used, the user executable must also be transferred to them, which happens at run-time. Each machine is controlled by an on-machine manager. Managers are in turn under the control of the Hadean scheduler, and can be said to be part of the Hadean cluster.
The Hadean platform automatically requests and "warms up" (transfers the necessary components, e.g. manager, enforcer, etc.) new boxes as its needs increase. The sequence of events is:
1. A spawn() command is called and there is insufficient "space" on the existing box(es), so Hadean requests a new box from the underlying infrastructure/cloud host
2. Hadean receives the IP address of the new box
3. The manager and enforcer are sent to the box
4. The user’s executable can now be sent to the new box
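The sequence above can be sketched as a small simulation in Python. Everything here is an assumption made for illustration: Box, FakeCloud and Cluster are hypothetical names, "space" is modelled as a fixed number of process slots per box, and sending components is modelled as adding names to a set.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    ip: str
    capacity: int  # process slots, a stand-in for "space"
    components: set = field(default_factory=set)
    processes: list = field(default_factory=list)

    def has_space(self):
        return len(self.processes) < self.capacity


class FakeCloud:
    """Stand-in for the underlying infrastructure/cloud host."""

    def __init__(self):
        self._next = 1

    def request_box(self):
        ip = f"10.0.0.{self._next}"
        self._next += 1
        return ip


class Cluster:
    def __init__(self, cloud, slots_per_box=2):
        self.cloud = cloud
        self.slots_per_box = slots_per_box
        self.boxes = []
        self.log = []

    def spawn(self, executable):
        box = next((b for b in self.boxes if b.has_space()), None)
        if box is None:
            # Not enough space on existing boxes: request a new one.
            self.log.append("request box")
            ip = self.cloud.request_box()
            # The cloud answers with the new box's IP address.
            self.log.append(f"received {ip}")
            box = Box(ip, self.slots_per_box)
            # Warm up the box by sending the manager and enforcer.
            box.components.update({"manager", "enforcer"})
            self.log.append(f"warmed {ip}")
            self.boxes.append(box)
        # The user's executable can now be sent to the box.
        box.processes.append(executable)
        self.log.append(f"sent {executable} to {box.ip}")
        return box.ip
```

Calling spawn repeatedly fills the existing box first and only requests a new one once every slot is taken.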
When resources are no longer required by either the program or the pool, they are automatically released back to the cloud environment.
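One way to picture the interaction between the pool and release is the following Python sketch. It assumes a hypothetical policy in which the pool keeps at most target_size idle machines and returns any surplus to the cloud; the names CountingCloud and WarmPool, and the policy itself, are illustrative rather than taken from Hadean.

```python
class CountingCloud:
    """Stand-in cloud that tracks which machines are currently allocated."""

    def __init__(self):
        self.active = set()
        self._next = 1

    def provision(self):
        ip = f"10.0.1.{self._next}"
        self._next += 1
        self.active.add(ip)
        return ip

    def release(self, ip):
        self.active.discard(ip)


class WarmPool:
    """Keeps up to target_size idle machines ready for immediate use."""

    def __init__(self, cloud, target_size):
        self.cloud = cloud
        self.target_size = target_size
        # Pre-allocate machines in anticipation of demand.
        self.idle = [cloud.provision() for _ in range(target_size)]

    def acquire(self):
        """Prefer a warm machine; fall back to provisioning a new one."""
        return self.idle.pop() if self.idle else self.cloud.provision()

    def give_back(self, ip):
        """Keep the machine if the pool needs it, otherwise release it."""
        if len(self.idle) < self.target_size:
            self.idle.append(ip)
        else:
            self.cloud.release(ip)
```

In this sketch the pool is not refilled when a machine is acquired; it is only topped up as machines are given back, which keeps the example short.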