SimKube Simulation Controller #
The Simulation Controller watches for new Simulation custom resources (CRs) posted to the API server and configures
a simulation to run based on the parameters specified in the CR. The controller does not run the simulation itself;
it handles setup and cleanup, and launches an sk-driver Kubernetes Job to perform the simulation.
Usage #
Usage: sk-ctrl [OPTIONS]
Options:
--driver-secrets <DRIVER_SECRETS>
--use-cert-manager
--cert-manager-issuer <CERT_MANAGER_ISSUER> [default: ]
-v, --verbosity <VERBOSITY> [default: info]
-h, --help Print help
Details #
The Simulation Controller does the following on receipt of a new Simulation:
- Runs all preStart hooks
- Verifies that all the expected pre-existing objects are present in the cluster
- Creates a SimulationRoot "meta" object to hang objects off of that should persist for the whole simulation
- Creates the namespace for the simulation driver to run in
- Creates custom resources for the Prometheus operator to configure metrics collection
- Creates a MutatingWebhookConfiguration for the simulation driver
- Creates a Service for the simulation driver
- Sets up certificates for the simulation driver mutating webhook (currently requires the use of cert-manager)
- Creates the simulation driver Job
- Waits for the driver to complete
- Cleans up all "meta" resources
- Runs all postStop hooks
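
The ordering above can be sketched as a plain sequence of steps. This is only an illustration in Python, not the controller's actual code: the names `STEPS` and `run_simulation` are hypothetical, and error handling, retries, and the real Kubernetes API calls are all left out.

```python
from typing import Callable, List

# Hypothetical step labels mirroring the list above, in the same order.
STEPS: List[str] = [
    "run preStart hooks",
    "verify expected objects exist",
    "create SimulationRoot",
    "create driver namespace",
    "create Prometheus operator resources",
    "create MutatingWebhookConfiguration",
    "create driver Service",
    "set up webhook certificates",
    "create driver Job",
    "wait for driver to complete",
    "clean up meta resources",
    "run postStop hooks",
]


def run_simulation(execute: Callable[[str], None]) -> List[str]:
    """Run each step in order and return the steps that completed."""
    done: List[str] = []
    for step in STEPS:
        execute(step)  # an exception here stops the sequence
        done.append(step)
    return done


if __name__ == "__main__":
    completed = run_simulation(lambda step: print(f"-> {step}"))
    print(f"{len(completed)} steps completed")
```

Passing the step executor in as a callable keeps the ordering separate from whatever actually performs each step, which makes the sequence easy to inspect in isolation.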
Simulation Custom Resource #
Simulations are controlled by a Simulation custom resource object, which specifies, among other things, how to configure the simulation driver, metrics collection, and any hooks. The Simulation CR is cluster-scoped (not tied to a single namespace) because it must create SimulationRoots.
SimulationRoot Custom Resource #
The SimulationRoot CR is an empty object that all the simulated objects hang off of for easy cleanup: instead of having to write our own cleanup code, we just delete the SimulationRoot object and allow the Kubernetes garbage collector to clean up everything it owns. The SimulationRoot is cluster-scoped because during the course of a simulation we may create additional namespaces to run simulated pods in. Note that the driver itself is not owned by the SimulationRoot, so that users can still see the results and logs from the driver after the simulation is over.
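
Kubernetes garbage collection works through `metadata.ownerReferences`, so "hanging an object off" the SimulationRoot amounts to adding an owner reference that points at it. The sketch below shows what such a reference might look like. It is illustrative only: the `simkube.io/v1` API version, the object names, and the UID are assumptions, and the helper functions are hypothetical rather than part of SimKube.

```python
from typing import Any, Dict


def owner_reference(root_name: str, root_uid: str) -> Dict[str, Any]:
    """Build an ownerReferences entry pointing at a SimulationRoot."""
    return {
        "apiVersion": "simkube.io/v1",  # assumed API group/version
        "kind": "SimulationRoot",
        "name": root_name,
        "uid": root_uid,
    }


def with_owner(manifest: Dict[str, Any], owner: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the manifest with the owner reference appended."""
    metadata = dict(manifest.get("metadata", {}))
    metadata["ownerReferences"] = list(metadata.get("ownerReferences", [])) + [owner]
    return {**manifest, "metadata": metadata}


if __name__ == "__main__":
    # Hypothetical names and UID, for illustration only.
    root = owner_reference("sk-example-root", "00000000-0000-0000-0000-000000000000")
    namespace = {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": "virtual-default"}}
    print(with_owner(namespace, root))
```

Once the owner is deleted, the garbage collector removes every object that lists it in `ownerReferences`. An object without such a reference, like the driver Job described above, is left in place.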
Configuring Metrics Collection #
!!! note In the future we may move metrics collection out of SimKube proper and instead run it as a standard "hook".
If you do not want to use Prometheus for metrics collection, or wish to configure it differently, you can disable metrics collection using skctl --disable-metrics and configure your own metrics solution with a preStart hook.
SimKube depends on the Prometheus operator being installed in your simulation cluster, as it creates custom resources
understood by this operator. The metricsConfig section of the Simulation spec controls how this is set up. The
namespace and serviceAccount fields are the namespace and service account that the Prometheus operator uses.
SimKube will spawn a new Prometheus pod with extremely high resolution (currently 1 second) for the duration of the
simulation. The Prometheus pod will be torn down at the end of the simulation, so it is recommended that you configure
at least one remote write target for Prometheus for long-term metrics storage. The remoteWriteConfigs section is
simply a list of RemoteWriteSpec objects from the Prometheus operator API. You can configure as many of these as you
want using any supported Prometheus remote write target.
One suggested approach is to use prom2parquet in order to save the Prometheus timeseries data to S3 in the Parquet columnar data format.
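
As a rough illustration of the shape described above, the following Python sketch builds a metricsConfig section and prints it as JSON. The field names `namespace`, `serviceAccount`, and `remoteWriteConfigs` come from the text above, and `url` is a required field of the Prometheus operator's RemoteWriteSpec. Everything else is an assumption: the helper `metrics_config` is hypothetical, as are the namespace, service account, and endpoint values, and the exact layout of the rest of the Simulation spec is not shown.

```python
import json
from typing import Any, Dict, List


def metrics_config(namespace: str, service_account: str, remote_write_urls: List[str]) -> Dict[str, Any]:
    """Build a metricsConfig section with one RemoteWriteSpec per URL."""
    return {
        "namespace": namespace,
        "serviceAccount": service_account,
        # Each entry is a RemoteWriteSpec; only the required `url` field is set here.
        "remoteWriteConfigs": [{"url": url} for url in remote_write_urls],
    }


if __name__ == "__main__":
    config = metrics_config(
        namespace="monitoring",            # hypothetical namespace
        service_account="prometheus-k8s",  # hypothetical service account
        remote_write_urls=["http://prom2parquet.monitoring:1234/receive"],  # hypothetical endpoint
    )
    print(json.dumps({"metricsConfig": config}, indent=2))
```

Additional remote write targets are simply more entries in the list; any other RemoteWriteSpec field supported by the Prometheus operator could be added to each entry in the same way.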