Health checking

Tweek Gateway expose health endpoints at:

  • http://tweek/health (check the health of the gateway)
  • http://tweek/status (check the status of all Tweek upstreams)

Both endpoints can return only 200 status code.

A status result example:

{
    "repository revision": "f56b389882fbe5431bc83dabf7755f2c10c13fd4",
    "services": {
        "api": {
            "healthy": {
                "CouchbaseConnection": "OK",
                "EnvironmentDetails": "Host = 3a3e561de09f, Version = 1.0.0-rc10",
                "LocalHttp": "OK",
                "QueryHealthCheck": "OK",
                "RulesRepository": "CurrentLabel = f56b389882fbe5431bc83dabf7755f2c10c13fd4, LastCheckTime = 07/27/2020 08:09:30"
            },
        "status": "Healthy"
    },
    "authoring": {},
    "publishing": {}
    }
}

Additonally, each Tweek service has a “/health” status that will return error status code if there’s a problem.

These endpoints can be used with k8s readiness/liveness probes, Docker healthchecks, etc…

Monitoring

Both the api and the gateway expose metrics promethus metrics at “/metrics”

Gateway service metrics:

request_duration_seconds (summary)
request_duration_seconds_histogram (histogram)

Api service uses the default webapp metrics defined at https://app-metrics.io

Logging

All Tweek services’ logs are written to std, and can be collected with tools such as FluentD.