Getting started with Valence and Declarative Performance

Written by Domenic Rosati | 15-05-2019

Earlier, we explained a little bit about what Valence is: ML-powered, declarative performance for resources and scaling in Kubernetes. Now we are going to explain how to get started.

First, we will go over a quick start to see how Valence works, and then we will walk through setting up your first Deployment with Valence. If you'd like more details, check out the docs in our repo.

Quick start for testing on example workloads

It's easy to get started with Valence. The following instructions set up two copies of the same application running the same workload; the only difference is that one of them is instrumented with Valence. This makes it easy to see the effects of Valence on your application.

The workloads for testing are:

  • todo-backend-django (this is a control workload not using Valence)
  • todo-backend-django-valence

They will use the following Service Level Objective (SLO) manifest:

  • slo-webapps

To begin:

  • Start on a fresh cluster such as docker-for-desktop or a testing instance of GKE.
  • Clone the Valence repo: git clone https://github.com/valencenet/valence-manifests
  • If your cluster already has metrics-server (GKE does by default), run make tooling-no-ms to generate a tooling manifest without it.
  • Apply the tooling (metrics-server, if you don't have it, and kube-state-metrics): kubectl apply -f tooling.yaml
  • Apply the Valence system: kubectl apply -f valence.yaml
  • Apply the example workloads and tooling: kubectl apply -f example-workloads.yaml
  • View results!
    • kubectl proxy svc/grafana -n valence-system &
    • open http://localhost:8001/api/v1/namespaces/valence-system/services/grafana/proxy
    • Authentication is the Grafana default: username admin, password admin.
    • Recommendations for replicas, requests, and limits, and live changes to those, should start coming in within 5-20 minutes.
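Taken together, the quick-start steps above can be run as a single shell session. This is a sketch, assuming kubectl is pointed at a fresh test cluster and make is available:

```shell
# Quick start, assuming kubectl targets a fresh test cluster.
git clone https://github.com/valencenet/valence-manifests
cd valence-manifests

# If your cluster already has metrics-server (e.g. GKE), regenerate
# tooling.yaml without it first:
# make tooling-no-ms

kubectl apply -f tooling.yaml           # metrics-server + kube-state-metrics
kubectl apply -f valence.yaml           # the Valence system
kubectl apply -f example-workloads.yaml # the two todo-backend-django apps
```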

Getting started on your first Deployment

Let's get started with installing Valence on your first workload on your own cluster. There are a few things we will have to do:

  1. Install Valence and ensure prerequisites are met
  2. Set up our first Service Level Objective
  3. Select the Deployment Valence will operate on
  4. Ensure the instrumentation is set up correctly
  5. See the results of our first deployment

1) Installing Valence

Valence itself is lightweight and easy to install. For more details on installation and what is installed, check out the docs.

Installation Steps

  • Install or confirm tooling is set up: if you don't currently have either metrics-server or kube-state-metrics, you can install them with kubectl apply -f tooling.yaml
    • Note: If you already have metrics-server (such as in a GKE cluster), run make tooling-no-ms first to generate a tooling manifest without it, then kubectl apply -f tooling.yaml
  • Make Valence:
make valence LICENSE=<YOUR.EMAIL> # This could also be your license key if you signed up for metered usage.
  • Install Valence: kubectl apply -f valence.yaml
    • Confirm Valence is installed: run kubectl get po -n valence-system and you should see a grafana pod, a prometheus pod, and an optimization-operator pod.

Note: Valence can easily be uninstalled anytime by deleting that manifest:

kubectl delete -f valence.yaml

2) Setting your first Service Level Objective

Valence is based on the notion of Declarative Performance. Operators use Valence by setting Service Level Objectives for their apps (or sets of apps), and it's Valence's job to model application behaviour and workload in order to configure the app for optimal performance and resourcing. We currently have SLOs defined for stateless HTTP applications, but we are working on supporting more (let us know what your ideal SLOs are at info@valence.net).

Example:

slo-webapps.yaml

apiVersion: optimizer.valence.io/v1alpha1
kind: ServiceLevelObjective
metadata:
  name: slo-webapps
spec:
  # First we define a selector.
  # We use this to label Deployments to tell Valence to meet the
  # following objectives for those Deployments.
  selector:
    slo: slo-webapps
  objectives:
    - type: HTTP
      http:
        latency:
          # Valid values are 99, 95, 90, 75, 50.
          percentile: 99
          responseTime: 100ms
        # Omit this for autoscaling (ie. latency objective valid for all throughputs).
        # This is throughput of queries per minute.
        throughput: 500
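As the comment above notes, omitting throughput makes the latency objective apply at all throughputs, which is the autoscaling case. A hypothetical latency-only SLO (the name slo-webapps-autoscale is our own, not from the repo) might look like:

```yaml
apiVersion: optimizer.valence.io/v1alpha1
kind: ServiceLevelObjective
metadata:
  name: slo-webapps-autoscale
spec:
  selector:
    slo: slo-webapps-autoscale
  objectives:
    - type: HTTP
      http:
        latency:
          percentile: 99
          responseTime: 100ms
        # No throughput field: the latency objective holds for all throughputs.
```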

3) Selecting the Deployment Valence will operate on

Selecting the SLO

Choose the Deployment(s) you'd like to be operated by that Service Level Objective and label them accordingly.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: todo-backend-django
  labels:
    app: todo-backend-django
    # Add this as a label to your Deployment to match the selector you defined above.
    slo: slo-webapps
...
  template:
    metadata:
      labels:
        app: todo-backend-django
        # Add this as a template label to your Deployment to match the selector you defined above.
        slo: slo-webapps
...
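If you'd rather not edit the manifest by hand, the Deployment-level label can also be added with kubectl. Note that kubectl label only sets the label on the Deployment object itself; the pod template label still needs to be set in the manifest as shown above:

```shell
# Adds the slo label to the Deployment object's metadata.
kubectl label deployment todo-backend-django slo=slo-webapps
```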

Adding Sidecar

Valence collects application metrics through a sidecar. If you’d prefer to collect metrics based on your ingress, load-balancer, envoy containers, linkerd, istio or otherwise, let the Valence team know. This will eventually be automated, all feedback is appreciated!

Add the proxy container to your Deployment and set the target address to the address where your application normally serves.

Example: todo-backend-django/deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: todo-backend-django
  labels:
    app: todo-backend-django
    slo: slo-webapps
...
  template:
    metadata:
      labels:
        app: todo-backend-django
        slo: slo-webapps
    spec:
      containers:
        - name: prometheus-proxy
          image: valencenet/prometheus-proxy:0.2.8
          imagePullPolicy: IfNotPresent
          env:
            - name: TARGET_ADDRESS
              value: 'http://127.0.0.1:8000' # the address your app is serving on
          args:
            - start
        # your base container goes after this one.
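One way to confirm the sidecar was added (a sketch, assuming your pods carry the app=todo-backend-django label) is to list the container names in a running pod:

```shell
# Should print both your app container and prometheus-proxy.
kubectl get pods -l app=todo-backend-django \
  -o jsonpath='{.items[0].spec.containers[*].name}'
```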

Note: Valence will make relatively frequent changes, so we recommend at least the following availability configuration for your Deployments:

spec:
  # Revision history limit should be low but greater than 1.
  revisionHistoryLimit: 3
  strategy:
    # Ensure we use rolling updates with:
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 10%

It also helps to use readiness and liveness probes to ensure availability.

4) Ensuring the instrumentation is set up correctly

Replace your existing Service with a Valence-compatible Service that points to the sidecar proxy for metrics collection.

Example: todo-backend-django/service.yaml

Change:

apiVersion: v1
kind: Service
metadata:
  labels:
    service: todo-backend-django
  name: todo-backend-django
spec:
  # Works with any service type, NodePort just an example.
  type: NodePort
  ports:
  - name: headless # example port name
    port: 80
    targetPort: 8080
  selector:
    app: todo-backend-django

To:

apiVersion: v1
kind: Service
metadata:
  name: todo-backend-django
  labels:
    service: todo-backend-django
    # Scrape prometheus metrics by valence.
    valence.net/prometheus: "true"
spec:
  type: NodePort
  ports:
  # This is the port you were exposing your application on.
  - name: headless # this name is arbitrary and can be changed to anything you want.
    port: 80
    targetPort: 8081 # this is the port prometheus-proxy is serving on
  # These three lines allow us to scrape application metrics.
  - name: prometheus
    port: 8181
    targetPort: 8181
  selector:
    app: todo-backend-django
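To sanity-check the instrumentation, you can port-forward to the prometheus port defined above and fetch the exposition endpoint. A sketch, assuming the proxy serves its metrics at /metrics; adjust the Service name and ports to your setup:

```shell
# Forward the prometheus port from the Service, then fetch a sample.
kubectl port-forward svc/todo-backend-django 8181:8181 &
curl -s http://localhost:8181/metrics | head
```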

5) Seeing the results

The recommendations are available in Prometheus exposition format. Valence exposes its metrics on the /metrics endpoint on port 8080 of the optimization-operator service in the valence-system namespace, so they can be scraped by Prometheus and similar metrics-collection tools in a standard way. The metrics can be accessed like:

kubectl port-forward svc/optimization-operator -n valence-system 8080 &
open http://localhost:8080/metrics

We expose the following metrics:

  • valence_recommendations_cpu_limits
  • valence_recommendations_cpu_requests
  • valence_recommendations_memory_limits
  • valence_recommendations_memory_requests
  • valence_recommendations_replicas
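Since the recommendations come out in plain Prometheus exposition format, they are easy to post-process with standard tools. A minimal sketch; the sample values below are made up for illustration, not real recommendations:

```shell
# Sample of the exposition format Valence emits (illustrative values).
cat > /tmp/valence-metrics.txt <<'EOF'
valence_recommendations_cpu_requests{deployment="todo-backend-django-valence"} 250
valence_recommendations_memory_requests{deployment="todo-backend-django-valence"} 512000000
valence_recommendations_replicas{deployment="todo-backend-django-valence"} 3
EOF

# Extract just the replica recommendation.
grep '^valence_recommendations_replicas' /tmp/valence-metrics.txt | awk '{print $2}'
```

In practice you would pipe the output of `curl -s http://localhost:8080/metrics` through the same grep/awk instead of a saved file.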

You can also use Grafana to see the recommendations.

kubectl proxy svc/grafana -n valence-system
open http://localhost:8001/api/v1/namespaces/valence-system/services/grafana/proxy

Authentication is the Grafana default:

  • username: admin
  • password: admin

Once you are in Grafana, look at the Valence Recommendations dashboard. You will see:

  • Memory recommendations and resources
  • CPU recommendations and resources
  • HTTP Request Count in Queries per Second
  • HTTP Latency at selected percentile
  • Replica recommendations and current replicas

Now What?

Congrats! You now have Valence running and making recommendations for your Deployment(s). You can either use those recommendations to set your Deployments' resources yourself or have Valence make the changes automatically with the annotation described below. If you have any issues, please let us know at info@valence.net.

What's next?

  • You can use Valence for free on up to 5 Deployments concurrently.
  • If you like the recommendations, you can have Valence make changes automatically using the annotation below.
  • Check out the more detailed documentation and installation instructions at: https://github.com/valencenet/valence-manifests
  • If you have requests for features such as different Service Level Objectives or Indicators, please let us know at info@valence.net; we would love to support them.

Annotations

You can use these optional annotations on the deployments managed by Valence:

  annotations:
    # Whether to make changes automatically with those recommendations.
    # And take control of your applications resources.
    valence.io/optimizer.configure: "true"
    # Minimum amount of replicas to recommend.
    valence.io/optimizer.min-replicas: "2"
    # Minimum cpu requests to recommend.
    valence.io/optimizer.min-cpu-requests: "100m"
    # Minimum memory requests to recommend.
    # For example: set this to your max heap size if you are using JVM.
    valence.io/optimizer.min-memory-requests: "500M"
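These annotations can also be applied to a live Deployment with kubectl annotate, which sets annotations on the Deployment object's metadata. For example:

```shell
# Turn on automatic configuration and set a replica floor.
kubectl annotate deployment todo-backend-django \
  valence.io/optimizer.configure="true" \
  valence.io/optimizer.min-replicas="2"
```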

Thanks for reading this guide on getting started with Valence. Please let us know if you have any feedback or feature requests at info@valence.net. As we continue to add features, we want to prioritize the ones that will be most useful for you.