Logging
The Hadean Platform includes a logging framework that simplifies gathering the logs generated by your code and by the supporting Hadean services, which may be running across multiple remote servers. It uses the concept of Sources (where logs are created) and Sinks (where logs are stored) to provide flexibility in accessing logged information. A default logging implementation, accessible via the Hadean CLI, is provided to help with debugging, but it should be overridden when moving to a production environment because it automatically deletes old log files to save space.
The logging mechanism picks up logged information directed to stdout or stderr, which means you can use a logging library of your choosing within your application. Any log lines captured on stdout will be logged at level INFO, whilst anything written to stderr will be logged at level ERROR.
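For illustration, a minimal Rust program is sketched below; the platform does not require Rust or any particular logging library, and the use of plain println!/eprintln! here is purely illustrative. Lines written to stdout are captured at INFO, lines written to stderr at ERROR.
fn main() {
    // Written to stdout: captured by the logging framework at level INFO
    println!("starting application");

    // Written to stderr: captured at level ERROR
    eprintln!("something went wrong");
}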
When a log line is sent to stdout/stderr, additional metadata is added to the log message. The resulting data is stored in JSON format. An example of the log format can be found below.
{
  "hadean_pid": "127.0.0.1.18002.0",
  "timestamp": "2023-01-10T16:47:03.012973+00:00",
  "log_stream": "stdout",
  "log_group": "App",
  "message": {
    "level": "Info",
    "target": "",
    "module_path": null,
    "file": null,
    "line": null,
    "key_values": {
      "thread": "tokio-runtime-worker",
      "process_name": "24",
      "process_id": "140270151174944",
      "thread_id": "140270151174944",
      "pid": "127.0.0.1.18002.0"
    },
    "message": "my message"
  }
}
Note that the level field in the JSON indicates whether the log line was produced on stdout (INFO) or on stderr (ERROR).
The process_id and thread_id in the above logs provide a unique identifier for the process creating the log message across the entire cluster. For user-generated logs, the names for these will be null.
Logging within the Hadean Platform works with the concept of Sources and Sinks and makes use of the Vector backend, which provides support for a wide variety of logging endpoints.
Sources define the location that a log line has been generated from. Several sources are available by default for all Hadean platform applications; these are defined below.
Source | Description |
---|---|
platform-scheduler-logs | Logs produced by the Hadean platform scheduler. |
platform-manager-logs | Logs produced by the Hadean platform manager. |
In addition, logs generated within user code will also be assigned a source; by default this will be app-logs.
Source | Description |
---|---|
app-logs | Any logs generated by user code in your application. When running another Hadean framework such as Simulate or Connect, logs generated by that framework will also appear here. |
events-logs | |
Logging data can also have a defined output location, referred to as a Sink. The Hadean platform supports all of the Sinks that are supported by the underlying Vector backend, a full list of which can be found here.
To configure a Vector endpoint you will need to define a sink in your runtime configuration. Vector supports a wide range of endpoints, described in detail in its documentation.
As an example we will look at setting up a local file as a sink. Details of this endpoint can be found here.
When running on a cluster, configuring a local file sink will result in logs being written to the gateway machine; you will still need to access that machine's file system in order to retrieve the logs.
[vector.sinks.local_file]
type = "file"
inputs = [ "platform-scheduler-logs", "platform-manager-logs", "app-logs", "events-logs" ]
compression = "none"
path = "/var/log/hadean/my-simulation-%Y-%m-%d.log"
encoding.codec = "json"
The result of the above configuration is that all log output from the listed sources will be written to a date-stamped file under /var/log/hadean.
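Since each log line is a JSON document, the file can be inspected with standard command line tools. For example, assuming jq is installed on the gateway machine, the current day's file from the configuration above could be tailed and reduced to just the log messages:
tail -f /var/log/hadean/my-simulation-$(date +%Y-%m-%d).log | jq -r '.message.message'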
Any of the Vector sinks can be used to consume logging data. Below is an example of a configuration that can be used to send logs to Datadog's logging service. Note that you will need to change the API key to your own.
[vector.sinks.datadog]
type = "datadog_logs"
encoding.codec = "json"
inputs = [ "platform-scheduler-logs", "platform-manager-logs", "app-logs", "events-logs" ]
default_api_key = "004e2f8fe3ade167c7c219ad61a38940" # CHANGE ME
compression = "gzip"
region = "us"
#encoding.timestamp_format = "unix"
batch.timeout_secs = 1 # The maximum age of a batch before it is flushed.
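The same pattern works with any other Vector sink. As a further illustrative sketch (the sink name console_debug is arbitrary and not part of the platform defaults), Vector's console sink can be used to print only application logs to the terminal of the machine running the sink:
[vector.sinks.console_debug]
type = "console"          # Vector's built-in console sink, useful for quick debugging
inputs = [ "app-logs" ]   # only consume logs generated by user code
encoding.codec = "json"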
The platform provides a pre-configured default logging endpoint that can be accessed directly via the Hadean CLI. This is provided for development purposes, as it allows a developer to easily access logs generated on a cloud cluster from their development machine; however, it is not designed for production runs. The default logging endpoint will retain 1GB of data, with older log lines being discarded once this limit is reached in order to reduce the risk of servers in the cluster running out of space.
If you choose to configure a custom logging endpoint, the default endpoint will be disabled, meaning the hadean cluster logs command will no longer be available. The Hadean CLI is able to pull down logs that have been generated using the default logging on remote clusters. Logs are accessed via the command
hadean cluster logs --name [cluster name] [options]
Running this command on a cluster will stream all of the logs available from the last (or currently active) application run to the console on stdout.
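For example, the streamed output can be redirected to a local file for later inspection (the cluster name my-cluster is illustrative):
hadean cluster logs --name my-cluster > my-cluster.log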
Logs will be streamed from the first saved log line, and output will continue to be streamed to the console as new log lines are generated, tailing the active logs.
Log files can become very large. By default the simulation will retain up to 2GB of data, with a guarantee of retaining at least the last 1GB of generated logs; the retained size is controlled by the log-size-limit setting in the runtime configuration, for example:
[logger]
log-size-limit = "5000 MiB"
In order to reduce data transfer it is possible to specify filtering options that reduce the data sent over the network.
Option | Details |
---|---|
--filter <FILTER> | Apply the provided filtering options to log lines before streaming them down from the servers |
--filter-stderr | Only show logs at levels ERROR and above |
--filter-stdout | Only show logs at level INFO and above |
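For instance, to stream only error-level output from a run (again using the illustrative cluster name my-cluster):
hadean cluster logs --name my-cluster --filter-stderr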
FILTER can be composed of any of the following criteria:
- timestamp: =, !=, <, >, <=, >= string (in ISO 8601) or number (unix timestamp)
- message: =, != string (strict equality) or ~ regex
- hadean processid: =, != string (strict equality)
- log stream: =, != string (strict equality) or ~ regex
- log group: =, != string (strict equality) or ~ regex
- (left AND right)
- (left OR right)
- NOT (right)
e.g.
--filter '(timestamp > "2022-10-20T10:48" OR processid = "10.2.1.4.20002.0")'
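Such a filter expression is passed to the logs command in full, for example:
hadean cluster logs --name my-cluster --filter '(timestamp > "2022-10-20T10:48" OR processid = "10.2.1.4.20002.0")'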
If multiple application runs have been performed on a cluster, log files from previous runs will still be available. The following options can be used to select a specific previous run when accessing log files.
Option | Details |
---|---|
--list | Lists the IDs of previous application runs that still have logs available for download from the cluster |
--id <ID> | Specify a specific application run for which to access log files |
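For example, the available runs can first be listed and a specific run then selected by its ID (the cluster name my-cluster is illustrative):
hadean cluster logs --name my-cluster --list
hadean cluster logs --name my-cluster --id <ID>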