Metrics

Overview

HadeanOS provides an API for outputting metrics from any HadeanOS process or component. Applications that use the metrics API need to include the hadean/metrics.hh header.

These metrics will be very familiar if you have used Prometheus before: our API is essentially the Prometheus C++ API, with counters, gauges, families, and so on.

Serialisation

For metrics, serialisation means taking the in-memory metrics and writing them out, either to disk or over the network. HadeanOS does not invoke this serialisation automatically, so you should invoke it yourself at regular intervals:

hadean::metrics::serialise(
    hadean::metrics::get_metrics_registry().Collect()
);

However, because Aether also uses Hadean metrics, and because engine code and user code share the same metrics registry, Aether workers and the manager already call this serialisation, so you may omit it on those processes.
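For processes that do need to serialise metrics themselves, one way to get "regular intervals" is a small helper polled from the main loop. The following is a minimal sketch, not part of the Hadean API: the PeriodicTask class and its poll() method are hypothetical names introduced here for illustration, and the real serialise call appears only in a comment.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper (not part of the Hadean API): runs a callback at most
// once per interval. Call poll() from the process's main loop.
class PeriodicTask {
public:
    using Clock = std::chrono::steady_clock;

    PeriodicTask(Clock::duration interval, std::function<void()> task)
        : interval_(interval), task_(std::move(task)) {}

    // Runs the task if it has never run, or if at least one interval has
    // passed since the last run. Returns true when the task was run.
    bool poll(Clock::time_point now = Clock::now()) {
        if (has_run_ && now - last_run_ < interval_) {
            return false;
        }
        task_();
        last_run_ = now;
        has_run_ = true;
        return true;
    }

private:
    Clock::duration interval_;
    std::function<void()> task_;
    Clock::time_point last_run_{};
    bool has_run_ = false;
};

// Intended use (assumes hadean/metrics.hh is included):
//
//   PeriodicTask flush(std::chrono::seconds(1), [] {
//       hadean::metrics::serialise(
//           hadean::metrics::get_metrics_registry().Collect());
//   });
//   while (running) { /* ... do work ... */ flush.poll(); }
```

The one-second interval is an arbitrary choice for the sketch. Passing the current time as a parameter keeps the helper easy to test with a fixed clock.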

Adding a Metric

There are a few pieces of code necessary to add a metric:

  1. Add some state to hold the metric family and the metric:
    struct per_worker_data {
        ...
        hadean::metrics::Family<hadean::metrics::Gauge>* total_entities_family;
        hadean::metrics::Gauge* total_entities_metric;
    };
  2. Add the initialisation logic:
    // Register a gauge family named "total_entities" in the shared registry.
    hadean::metrics::Family<hadean::metrics::Gauge>& entities_family(
        hadean::metrics::BuildGauge()
            .Name("total_entities")
            .Register(hadean::metrics::get_metrics_registry()));
    // Add one gauge to the family, labelled with this worker's id.
    hadean::metrics::Gauge& entities_metric = entities_family.Add({{"worker_id", std::to_string(aether_state.get_worker().as_u64())}});
    // Store &entities_family and &entities_metric in the per_worker_data
    // fields declared in the previous step so that later steps can use them.
  3. Set or increment the counter or gauge at an appropriate point:
    total_entities_metric->Set(store.num_agents_local());
  4. Add a call to the serialisation:
    hadean::metrics::serialise(
        hadean::metrics::get_metrics_registry().Collect()
    );
  5. Remove metrics when working with finite-lifetime processes, which may exit before the end of the whole program:
    total_entities_family->Remove(total_entities_metric);

That's enough to have your metric available via our metrics API.
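The removal step is easy to forget when a process can exit along several paths. One option is to tie the removal to scope with a small RAII guard. This is only a sketch under assumptions: ScopedMetric is a hypothetical name, not part of the Hadean API, and FakeFamily and FakeGauge are stand-ins so that the example compiles without the Hadean headers. The guard relies only on the family exposing Remove(metric pointer), as in the Remove call in the list above.

```cpp
#include <cassert>

// Hypothetical guard: removes a metric from its family when the guard goes
// out of scope. FamilyT must provide a Remove(MetricT*) member.
template <typename FamilyT, typename MetricT>
class ScopedMetric {
public:
    ScopedMetric(FamilyT& family, MetricT& metric)
        : family_(&family), metric_(&metric) {}

    ~ScopedMetric() { family_->Remove(metric_); }

    // Non-copyable so the metric is removed exactly once.
    ScopedMetric(const ScopedMetric&) = delete;
    ScopedMetric& operator=(const ScopedMetric&) = delete;

    // Forward access to the underlying metric, e.g. guard->Set(...).
    MetricT* operator->() const { return metric_; }
    MetricT& operator*() const { return *metric_; }

private:
    FamilyT* family_;
    MetricT* metric_;
};

// Stand-in types for illustration only. Real code would use
// hadean::metrics::Family<hadean::metrics::Gauge> and hadean::metrics::Gauge.
struct FakeGauge {
    double value = 0;
    void Set(double v) { value = v; }
};

struct FakeFamily {
    int removed = 0;
    void Remove(FakeGauge*) { ++removed; }
};
```

With the real types, the guard would be constructed from the family and gauge references created during initialisation and kept alive for as long as the process should report the metric.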

Viewing your Metrics

Follow the instructions in Metrics to set up a Prometheus server with the SDK.

You can then view your metrics while a simulation is running by going to http://<prometheus-address>:<prometheus-port>/graph. The port defaults to 9090, and the address might be localhost.

If you followed the snippets in this document, try typing total_entities into the query box and pressing Execute. Also try ticking the "stacked" box to see a stacked plot of the values for each of the workers.

[Screenshot of the metrics in Prometheus]