Prometheus
Tutorial to scrape your application metrics on an h8lio managed cluster, using the Prometheus Operator pattern.
The Operator is already there
On h8lio you do not install the kube-prometheus-stack chart yourself. The Prometheus Operator and its CRDs (Prometheus, Alertmanager, ServiceMonitor, PodMonitor, Probe, PrometheusRule, ScrapeConfig, ThanosRuler) are installed and reconciled cluster-wide by the platform. Your job is to create instances and scrape definitions as custom resources in your own namespaces, within the RBAC granted to your role.
| Role | monitoring.coreos.com resources |
|---|---|
| Owner / Admin / Operator | full access (create / update / delete Prometheus, Alertmanager, ServiceMonitor, PodMonitor, Probe, PrometheusRule, ThanosRuler) |
| Developer | read-only (get / list / watch) |
Installing the chart would also pull in a node-exporter DaemonSet and cluster-scoped CRDs, which a namespace tenant cannot deploy on a managed cluster. Node and host-level metrics are the platform’s responsibility and are surfaced in the shared Grafana described below.
Two ways to see your metrics
The shared Grafana (zero setup)
h8lio provisions a shared Grafana at https://monitoring.h8l.io with a set of read-only dashboards scoped to your organization (CPU, memory, storage per namespace). It is wired automatically when your organization is created, so basic namespace monitoring needs no work from you.
Your own Prometheus + Grafana (custom metrics)
To scrape application metrics (MariaDB, PostgreSQL, your own services) and build your own dashboards and alerts, run your own Prometheus instance. The recommended layout is a dedicated monitoring namespace (an h8lio cluster such as acme-monitoring) that observes your organization’s other namespaces (acme-prod, acme-staging, …). All namespaces of an organization share the label tenant: <organization>, which makes cross-namespace selection a one-liner.
Requirements
- kubectl configured for your cluster (see kubectl)
- An Owner, Admin, or Operator role on the organization
- A monitoring namespace (cluster) to host the instance, for example acme-monitoring
- A deployed service exposing a metrics endpoint (typically a Service with a named metrics port)
Create a Prometheus instance
A Prometheus resource tells the platform Operator to provision and reconcile a Prometheus server for you. It needs a ServiceAccount, and that account needs read access to the discovery objects in every namespace you want to scrape.
Create the instance in your monitoring namespace. The namespace selectors below match every namespace of your organization through its shared tenant label:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: prometheus
      namespace: acme-monitoring
    ---
    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      name: acme
      namespace: acme-monitoring
    spec:
      replicas: 1
      serviceAccountName: prometheus
      retention: 15d
      resources:
        requests:
          memory: 512Mi
      # discover ServiceMonitors / PodMonitors / PrometheusRules across
      # every namespace of your organization (they share the `tenant` label)
      serviceMonitorNamespaceSelector:
        matchLabels:
          tenant: acme
      podMonitorNamespaceSelector:
        matchLabels:
          tenant: acme
      ruleNamespaceSelector:
        matchLabels:
          tenant: acme
      # within those namespaces, select all objects (no label filter)
      serviceMonitorSelector: {}
      podMonitorSelector: {}
      ruleSelector: {}
      # persist the TSDB on a Ceph-backed volume
      storage:
        volumeClaimTemplate:
          spec:
            storageClassName: eu-west-fr-gra-block-nvme-ec-ext4
            resources:
              requests:
                storage: 20Gi

The Prometheus pod discovers its targets through the Kubernetes API, so its ServiceAccount needs get / list / watch on the discovery objects in each namespace it scrapes. A namespace-scoped Role (not a ClusterRole) is enough. Apply the pair below in every namespace you collect from, including acme-monitoring itself:

    # repeat per scraped namespace: acme-prod, acme-staging, acme-monitoring, ...
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: prometheus
      namespace: acme-prod
    rules:
      - apiGroups: [""]
        resources: ["services", "endpoints", "pods"]
        verbs: ["get", "list", "watch"]
      - apiGroups: ["discovery.k8s.io"]
        resources: ["endpointslices"]
        verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: prometheus
      namespace: acme-prod
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: prometheus
    subjects:
      # the instance's ServiceAccount lives in the monitoring namespace
      - kind: ServiceAccount
        name: prometheus
        namespace: acme-monitoring

Once the Operator reconciles the resource, reach the Prometheus UI locally:

    kubectl -n acme-monitoring port-forward svc/prometheus-operated 9090:9090

Then open localhost:9090 and check Status → Targets.
Service Monitoring
A ServiceMonitor declaratively selects the Kubernetes Services to scrape. Create it in the application’s namespace; the instance picks it up because that namespace carries the tenant label its selector matches.
Example ServiceMonitor scraping MariaDB metrics (exposed by a mysqld-exporter sidecar):

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: mariadb
      # the application's namespace
      namespace: acme-prod
    spec:
      # Service label whose value becomes the Prometheus job name
      # ("mariadb" here); if the Service lacks this label, the job
      # name falls back to the Service name
      jobLabel: app.kubernetes.io/instance
      endpoints:
        - interval: 15s
          # MariaDB service "metrics" endpoint
          port: metrics
      # MariaDB service selector
      selector:
        matchLabels:
          app.kubernetes.io/component: primary
          app.kubernetes.io/instance: mariadb

See the ServiceMonitor CRD for the full schema.

PodMonitor is the equivalent that targets pods directly when there is no Service.
Once the ServiceMonitor is applied, the target and its endpoints appear in the Prometheus UI under Status → Targets. If the target is up, you can run your first PromQL queries against the scraped metrics.
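For pods that expose metrics without a Service in front of them, a PodMonitor could look like the sketch below. The name worker, the app.kubernetes.io/name label, and the metrics port name are hypothetical; substitute the labels and the named container port of your own pods.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  # hypothetical workload name
  name: worker
  # the application's namespace (it must carry the tenant label)
  namespace: acme-prod
spec:
  # assumed pod label; adjust to the labels your pods actually carry
  selector:
    matchLabels:
      app.kubernetes.io/name: worker
  podMetricsEndpoints:
    - interval: 30s
      # named container port exposing the metrics (assumed name)
      port: metrics
      path: /metrics
```

The Prometheus instance created earlier selects it through podMonitorNamespaceSelector and podMonitorSelector, and the per-namespace Role already grants read access to pods.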
External targets (ScrapeConfig)
ServiceMonitor and PodMonitor cover in-cluster targets discovered through Kubernetes service discovery. For targets outside the cluster, or scrape configurations those CRDs cannot express, use the ScrapeConfig CRD (monitoring.coreos.com/v1alpha1). It is complementary: keep ServiceMonitor / PodMonitor as the default for your apps, and reach for ScrapeConfig only for external or static targets and advanced service discovery (cloud, DNS, file-based).
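As an illustration, a static ScrapeConfig for an exporter running outside the cluster might look like this. The resource name, the target address db.example.com:9104, and the extra label are placeholders. The sketch also rests on two assumptions this page does not confirm: that your role is allowed to create ScrapeConfig objects, and that your Prometheus resource selects them.

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  # hypothetical name
  name: external-db
  namespace: acme-monitoring
spec:
  # fixed list of targets, no Kubernetes discovery involved
  staticConfigs:
    - targets:
        # placeholder host:port of an exporter outside the cluster
        - db.example.com:9104
      labels:
        # illustrative label attached to every scraped series
        origin: external
  metricsPath: /metrics
```

The Prometheus resource shown earlier declares selectors for ServiceMonitors, PodMonitors, and rules only. To have it pick up ScrapeConfig objects, it would presumably also need scrapeConfigSelector: {} and a scrapeConfigNamespaceSelector matching tenant: acme, by analogy with the other selectors.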
PrometheusRule defines recording and alerting rules. Create it in the application’s namespace, like the ServiceMonitor. Alerts surface in Alertmanager and can also drive Grafana Alerting.
Example PrometheusRule for the MariaDB service above:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: mariadb
      # the application's namespace
      namespace: acme-prod
    spec:
      groups:
        # rules group name
        - name: mariadb
          rules:
            # check if the MariaDB ServiceMonitor job is down
            - alert: MariaDB-Down
              annotations:
                # absent() only carries the labels from the matcher (job),
                # so the instance label is not available here
                message: No MariaDB target in job {{ $labels.job }} is up
                summary: MariaDB instance is down
              expr: absent(up{job="mariadb"} == 1)
              for: 5m
              labels:
                service: mariadb
                severity: warning
            # check if MariaDB has more than 100 active connections, using PromQL
            - alert: HighMariaDBConnections
              annotations:
                description: >-
                  MariaDB has more than 100 active connections for more
                  than 5 minutes.
                summary: High number of MariaDB connections
              expr: mysql_global_status_threads_connected > 100
              for: 5m
              labels:
                severity: warning

See the PrometheusRule CRD for the full schema.
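The example above only defines alerts. A recording rule, the other kind of rule a PrometheusRule can hold, precomputes an expression into a new series that dashboards and alerts can then query cheaply. The sketch below assumes the exporter publishes the mysql_global_status_queries counter and that the scrape job is named mariadb; the resource, group, and rule names are illustrative.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  # hypothetical name
  name: mariadb-recording
  namespace: acme-prod
spec:
  groups:
    - name: mariadb.recording
      # evaluate the group once per minute
      interval: 1m
      rules:
        # queries per second over the last 5 minutes, per instance
        - record: instance:mysql_global_status_queries:rate5m
          expr: sum by (instance) (rate(mysql_global_status_queries{job="mariadb"}[5m]))
```

Once evaluated, the new series can be queried by name, for example in a Grafana panel or in another rule's expr.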
Visualize in Grafana
For custom dashboards, run your own Grafana in the monitoring namespace (the same place that can host Loki for your logs) and add your Prometheus instance as a datasource. The Operator exposes the instance through the prometheus-operated Service:

    http://prometheus-operated.acme-monitoring.svc:9090

From there you can import community dashboards or build your own, and combine metrics with your Loki logs in a single place.
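If your Grafana is configured through provisioning files rather than the UI, the datasource can be declared as configuration. This is a minimal sketch assuming Grafana's file-based datasource provisioning (a YAML file placed in its provisioning/datasources directory); the datasource name is arbitrary.

```yaml
# Grafana datasource provisioning file (the version of the file format, not a Kubernetes API)
apiVersion: 1
datasources:
  # arbitrary display name
  - name: Prometheus (acme)
    type: prometheus
    # Grafana's backend proxies the queries, so the in-cluster URL works
    access: proxy
    url: http://prometheus-operated.acme-monitoring.svc:9090
    isDefault: true
```

How the file reaches the Grafana pod (ConfigMap mount, Helm values, or otherwise) depends on how you deploy Grafana.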
Next Steps
- Pair this with Loki for logs in the same monitoring namespace
- Import or build a Grafana dashboard for your metrics
- Configure Grafana Alerting or Alertmanager routing for your PrometheusRule alerts