Cyphernetes: A Graph Query Language for Kubernetes

Kubernetes has become the backbone of modern cloud-native architectures, yet querying and interacting with it at scale often feels like navigating a labyrinth. Tools like kubectl, jq, and jsonpath are indispensable, but they lack the expressive power needed for querying complex relationships across Kubernetes objects. Enter Cyphernetes, a graph-query language inspired by Cypher, which promises to bridge this gap with precision and scalability.

This post delves into Cyphernetes, its capabilities, challenges in Kubernetes observability, and comparisons with existing tools to highlight why it could be a game-changer for DevOps engineers.

The Querying Challenge in Kubernetes

Operating large-scale Kubernetes clusters with thousands of nodes comes with its own set of challenges:

Scalability Bottlenecks:
- Popular tools like k9s and other observability frameworks often make expensive queries against the Kubernetes API server (e.g., listing all pods in a cluster).
- Such queries, especially in high-density environments, can overload the API server, leading to performance degradation or even incidents.
APIs Designed for Specificity:
- Kubernetes API queries are typically constrained to individual resource types (e.g., pods or services).
- To retrieve insights spanning multiple resources, developers must execute multiple queries and stitch results manually—a tedious and error-prone process.
Lack of Indexing:
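- Exhaustive LIST queries that reach etcd, the Kubernetes backend, run as prefix range scans over keys such as /registry/pods/, and etcd keeps no secondary indexes on labels or fields. The API server applies label and field selectors only after the objects have been read, so a highly selective query can still cost as much as a full scan.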
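
To make the indexing point concrete, here is a minimal Python sketch of a prefix range scan followed by label filtering. The key layout mimics the /registry/<resource>/<namespace>/<name> convention; the FakeStore class, its data, and the read counter are hypothetical stand-ins, not etcd or API server code.

```python
from bisect import bisect_left


class FakeStore:
    """Sorted key-value store standing in for etcd (illustrative only)."""

    def __init__(self, items):
        self.keys = sorted(items)
        self.items = dict(items)
        self.reads = 0  # number of objects read from the store

    def range_prefix(self, prefix):
        # Jump to the first key >= prefix, then walk while the prefix matches.
        i = bisect_left(self.keys, prefix)
        while i < len(self.keys) and self.keys[i].startswith(prefix):
            self.reads += 1
            yield self.items[self.keys[i]]
            i += 1


def list_pods(store, label_selector):
    """Read every pod under the prefix, then filter by label afterwards."""
    return [
        obj for obj in store.range_prefix("/registry/pods/")
        if all(obj["labels"].get(k) == v for k, v in label_selector.items())
    ]


# Hypothetical cluster state: 1000 pods, 10 of them labelled app=web.
items = {"/registry/services/default/web": {"name": "web", "labels": {}}}
for n in range(1000):
    app = "web" if n % 100 == 0 else "other"
    items[f"/registry/pods/default/pod-{n}"] = {"name": f"pod-{n}", "labels": {"app": app}}

store = FakeStore(items)
matches = list_pods(store, {"app": "web"})
print(len(matches), "matches after reading", store.reads, "objects")
```

In this toy model the number of objects read is set by the prefix, not by how selective the label filter is.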

Cyphernetes: The Graph-Driven Solution

Cyphernetes leverages the graph-like nature of Kubernetes objects to offer a streamlined querying approach. Inspired by Cypher (the query language for graph databases like Neo4j), Cyphernetes enables:

Relational Logic: Efficiently query relationships between Kubernetes resources (e.g., linking pods, deployments, and services in a single command).
Customizable Macros: Define reusable query patterns to simplify everyday operations, like filtering pods by name or finding resources based on complex conditions.
Unified Interface: Avoid nested kubectl queries or piping through tools like jq, making queries more ergonomic and readable.

Technical Insights into Cyphernetes Queries

A typical Cyphernetes query could look like this:

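MATCH (p:Pod)->(s:Service)
WHERE p.status.phase != "Running" AND s.metadata.name =~ "^foo"
RETURN p.metadata.name, s.spec.clusterIP;

This query retrieves pods that are not in the “Running” phase and are served by services whose names start with “foo”, then returns the pod names and the service cluster IPs. Cyphernetes infers relationships between resource kinds, so the pattern joins the two nodes with a plain arrow instead of a named relationship type. Because =~ is a regular-expression match, the prefix is written as "^foo" and not as the SQL-style "foo%". The same task would otherwise require multiple commands and extensive scripting with kubectl and jq.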
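
For comparison, the manual version of that lookup can be sketched in Python. This is an illustration under stated assumptions, not how Cyphernetes works internally: the pod and service dictionaries are hypothetical, trimmed-down objects, the helper names are invented for this example, and namespace matching is left out for brevity.

```python
import re

# Hypothetical, trimmed-down objects as two separate LIST calls might return them.
pods = [
    {"metadata": {"name": "foo-api-1", "labels": {"app": "foo-api"}}, "status": {"phase": "Running"}},
    {"metadata": {"name": "foo-api-2", "labels": {"app": "foo-api"}}, "status": {"phase": "Pending"}},
    {"metadata": {"name": "bar-1", "labels": {"app": "bar"}}, "status": {"phase": "Failed"}},
]
services = [
    {"metadata": {"name": "foo-api"}, "spec": {"selector": {"app": "foo-api"}, "clusterIP": "10.0.0.10"}},
    {"metadata": {"name": "bar"}, "spec": {"selector": {"app": "bar"}, "clusterIP": "10.0.0.20"}},
]


def selects(service, pod):
    """True if every key/value in the service selector appears in the pod labels."""
    selector = service["spec"].get("selector") or {}
    labels = pod["metadata"].get("labels") or {}
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def not_running_behind(pods, services, name_pattern):
    """Pair each non-Running pod with the services that select it and match the regex."""
    rows = []
    for svc in services:
        if not re.search(name_pattern, svc["metadata"]["name"]):
            continue
        for pod in pods:
            if pod["status"]["phase"] != "Running" and selects(svc, pod):
                rows.append((pod["metadata"]["name"], svc["spec"]["clusterIP"]))
    return rows


rows = not_running_behind(pods, services, r"^foo")
print(rows)
```

The join logic (selector matching) and the filter logic both have to be written and maintained by hand here, which is the overhead a declarative query removes.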

Comparison with Existing Tools

kubectl + jq/jsonpath:
- These tools require multiple nested queries for complex tasks, which increases the cognitive load.
- Example: Joining pods, services, and deployments requires three separate API calls and custom filtering.
SQL-Based Tools (e.g., Steampipe):
- While Steampipe provides SQL-based querying over APIs, it treats Kubernetes data relationally, which may not align with its inherently graph-like structure.
Cyphernetes Advantages:
- Optimized for graph operations, making it ideal for multi-hop relationship queries.
- Simpler syntax for common DevOps scenarios, backed by a customizable macro system for daily tasks.
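
A multi-hop relationship can be illustrated with the ownership chain from a Pod up to its Deployment. The sketch below follows metadata.ownerReferences, a standard Kubernetes field, over hypothetical in-memory objects; the owner_chain helper is invented for this example and only follows the first owner of each object.

```python
# Hypothetical objects; only the fields needed for the traversal are shown.
objects = {
    ("Deployment", "web"): {"metadata": {"ownerReferences": []}},
    ("ReplicaSet", "web-5d9c"): {
        "metadata": {"ownerReferences": [{"kind": "Deployment", "name": "web"}]}
    },
    ("Pod", "web-5d9c-abcde"): {
        "metadata": {"ownerReferences": [{"kind": "ReplicaSet", "name": "web-5d9c"}]}
    },
}


def owner_chain(kind, name, objects):
    """Follow ownerReferences upward and return the path as (kind, name) pairs."""
    path = [(kind, name)]
    seen = {(kind, name)}
    while True:
        current = objects.get(path[-1], {})
        refs = current.get("metadata", {}).get("ownerReferences", [])
        if not refs:
            return path
        nxt = (refs[0]["kind"], refs[0]["name"])  # follow the first owner only
        if nxt in seen:  # guard against malformed, cyclic data
            return path
        seen.add(nxt)
        path.append(nxt)


chain = owner_chain("Pod", "web-5d9c-abcde", objects)
print(" -> ".join(f"{k}/{n}" for k, n in chain))
```

Each hop is one more lookup to code by hand, whereas a graph pattern expresses the whole path in a single expression.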

Overcoming Observability Bottlenecks

Several contributors to the Cyphernetes discussion highlight best practices for optimizing Kubernetes observability:

Avoid Direct API Queries at Scale:
- Instead of querying the API server directly, use controllers to watch objects and store them in external databases like PostgreSQL or Elasticsearch.
Client-Side Caching:
- Employ caching mechanisms to reduce load on the API server. For example, syncing data via client-go informers and processing changes incrementally improves performance significantly.
Optimized Backends:
- Pair Kubernetes data with scalable databases and leverage features like materialized views, JSONB storage, and trigger-based updates to handle large clusters effectively.
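
The caching idea can be sketched as a small in-memory store that applies watch-style events incrementally instead of re-listing. This is a simplified Python illustration, not client-go: the LocalCache class and the pod helper are hypothetical, and only the event shape (a type of ADDED, MODIFIED, or DELETED plus an object) mirrors what the Kubernetes watch API sends.

```python
from collections import defaultdict


class LocalCache:
    """Tiny in-memory store updated from watch-style events (illustrative only)."""

    def __init__(self):
        self.objects = {}               # "namespace/name" -> object
        self.by_app = defaultdict(set)  # value of the "app" label -> set of keys

    @staticmethod
    def key(obj):
        meta = obj["metadata"]
        return f'{meta["namespace"]}/{meta["name"]}'

    def _unindex(self, key):
        old = self.objects.pop(key, None)
        if old is not None:
            app = old["metadata"].get("labels", {}).get("app")
            if app is not None:
                self.by_app[app].discard(key)

    def apply(self, event):
        """Apply one event of type ADDED, MODIFIED, or DELETED."""
        obj = event["object"]
        key = self.key(obj)
        self._unindex(key)  # drop any stale copy and its index entry first
        if event["type"] in ("ADDED", "MODIFIED"):
            self.objects[key] = obj
            app = obj["metadata"].get("labels", {}).get("app")
            if app is not None:
                self.by_app[app].add(key)


def pod(name, app, phase):
    """Build a minimal, hypothetical pod object."""
    return {
        "metadata": {"namespace": "default", "name": name, "labels": {"app": app}},
        "status": {"phase": phase},
    }


cache = LocalCache()
for ev in [
    {"type": "ADDED", "object": pod("web-1", "web", "Pending")},
    {"type": "ADDED", "object": pod("web-2", "web", "Running")},
    {"type": "MODIFIED", "object": pod("web-1", "web", "Running")},
    {"type": "DELETED", "object": pod("web-2", "web", "Running")},
]:
    cache.apply(ev)

print(sorted(cache.by_app["web"]))
print(cache.objects["default/web-1"]["status"]["phase"])
```

Lookups by label then hit the local index, and the API server only has to stream changes.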

Adoption Challenges and Considerations

Learning Curve:
- For users accustomed to kubectl, adopting Cyphernetes may require familiarity with Cypher’s syntax and graph querying concepts.
Installation Perception:
- Initial focus on macOS via Homebrew sparked criticism for omitting Linux instructions. Ensuring broader platform support and inclusive documentation is crucial for widespread adoption.
Community Feedback:
- Early users suggest simplifying the query language for routine tasks while retaining advanced features for complex use cases.

Future Directions for Cyphernetes

To become a daily driver for Kubernetes operations, Cyphernetes could:

Enhance macro functionality for simplified, reusable queries.
Integrate with existing DevOps pipelines and CI/CD workflows.
Provide richer visualizations for graph-based data, akin to tools like Neo4j.

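Getting Started

Here’s a guide to getting started with Cyphernetes, including code and examples. It builds a local test cluster with sample resources to query. The secret-encryption steps rely on a standard Kubernetes API server feature and are independent of Cyphernetes itself.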

1. Install Required Tools

Make sure you have the following tools installed:

Kubernetes (kubectl)
Docker
Helm (optional, for package management)

2. Clone the Cyphernetes Repository

Cyphernetes is an open-source project. Start by cloning its repository:

git clone https://github.com/AvitalTamir/cyphernetes
cd cyphernetes

3. Set Up a Kubernetes Cluster

You can use any Kubernetes setup:

Minikube (for local testing)
Kind (Kubernetes in Docker)
Managed Kubernetes service like GKE, AKS, or EKS

Example: Set up a local Kubernetes cluster using Kind:

kind create cluster --name cyphernetes-cluster

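4. Configure Encryption at Rest (Optional)

Cyphernetes is a query tool that talks to the Kubernetes API; it does not manage encryption keys. To give the test cluster some realistic sensitive data, this step enables the built-in Kubernetes encryption of Secrets at rest, which is configured on the API server.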

Step 1: Create Encryption Config

Create an encryption-config.yaml:

kind: EncryptionConfiguration
apiVersion: apiserver.config.k8s.io/v1
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              # Placeholder: the key must decode to 32 bytes. Generate one with:
              #   head -c 32 /dev/urandom | base64
              secret: <BASE64-ENCODED-32-BYTE-KEY>
      - identity: {}
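
As a hedged aside on the key material: the Kubernetes documentation recommends a randomly generated 32-byte key for the aescbc provider, base64-encoded in the config file. The Python sketch below generates such a key and checks the decoded length. The function names are invented for this example, and the weak passphrase is only there to show how a human-readable string fails the length check.

```python
import base64
import secrets

EXPECTED_KEY_BYTES = 32  # length recommended for aescbc in the Kubernetes docs


def new_key_b64(size=EXPECTED_KEY_BYTES):
    """Return a random key of `size` bytes, base64-encoded for the config file."""
    return base64.b64encode(secrets.token_bytes(size)).decode("ascii")


def key_size(b64_key):
    """Decoded length in bytes of a base64-encoded key."""
    return len(base64.b64decode(b64_key, validate=True))


key = new_key_b64()
print(key_size(key) == EXPECTED_KEY_BYTES)

# A readable passphrase is not random and usually has the wrong length.
weak = base64.b64encode(b"secretkeyforencryption").decode("ascii")
print(key_size(weak), key_size(weak) == EXPECTED_KEY_BYTES)
```

Treat the generated value like any other credential and keep it out of version control.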

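Step 2: Point the API Server at the Encryption Config

EncryptionConfiguration is a configuration file read by kube-apiserver, not an API object, so kubectl apply cannot load it. Copy the file to each control plane node, make it readable by the API server (for example, by mounting it into the kube-apiserver static Pod), and pass its path with the --encryption-provider-config flag. The path below is an example:

--encryption-provider-config=/etc/kubernetes/enc/encryption-config.yaml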

Step 3: Restart Kubernetes Components

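Restart the API server so that it picks up the new flag. On kubeadm-style clusters the kubelet restarts it automatically when the static Pod manifest changes. With Kind, the simplest route is usually to recreate the cluster with a cluster config that mounts the file into the control plane node and sets the API server flag. Secrets created before this point stay unencrypted in etcd until they are rewritten.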

5. Deploy a Sample Encrypted Workload

Now that encryption is configured, create a Secret. The API server encrypts it when writing it to etcd.

Example: Create a secret.

kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password=securepassword

Retrieve the secret through the API:

kubectl get secrets db-credentials -o yaml

The API server decrypts Secrets transparently on read, so this output cannot show whether the data is encrypted at rest. The values under data are only base64-encoded:

data:
  password: c2VjdXJlcGFzc3dvcmQ=
  username: YWRtaW4=
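
Note that base64 is an encoding, not encryption. The short Python sketch below encodes the two literals used when creating the db-credentials secret and then reverses the encoding, which is all that anyone with read access to the Secret needs to do.

```python
import base64

# Values as they appear under `data:` for username=admin, password=securepassword.
data = {
    "username": base64.b64encode(b"admin").decode("ascii"),
    "password": base64.b64encode(b"securepassword").decode("ascii"),
}
print(data)

# Reversing the encoding requires no key.
decoded = {k: base64.b64decode(v).decode("utf-8") for k, v in data.items()}
print(decoded)
```

Access to Secrets through the API is therefore governed by RBAC, while encryption at rest protects the copy stored in etcd.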

6. Verifying Encryption

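To confirm encryption at rest, read the Secret directly from etcd, bypassing the API server.

If you have access to the etcd cluster (add the TLS certificate flags your etcd requires):

ETCDCTL_API=3 etcdctl get /registry/secrets/default/db-credentials | hexdump -C

An encrypted value starts with the prefix k8s:enc:aescbc:v1:key1: followed by ciphertext, where key1 is the key name from the encryption config. An unencrypted Secret shows its contents in readable form.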
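
The prefix check can also be scripted. The Python sketch below classifies a raw etcd value by its leading bytes, using the k8s:enc:<provider>:<version>:<key name>: layout that the Kubernetes documentation describes for encrypted resources. The function name and the sample byte strings are hypothetical.

```python
ENC_PREFIX = b"k8s:enc:"


def describe_stored_value(raw):
    """Classify a raw etcd value by its leading bytes."""
    if raw.startswith(ENC_PREFIX):
        # Layout: k8s:enc:<provider>:<version>:<key name>:<ciphertext>
        parts = raw.split(b":", 5)
        provider = parts[2].decode("ascii")
        key_name = parts[4].decode("ascii") if len(parts) > 4 else ""
        return f"encrypted (provider={provider}, key={key_name})"
    return "not encrypted"


# Hypothetical samples: one encrypted value and one plain protobuf-style value.
encrypted_sample = b"k8s:enc:aescbc:v1:key1:" + bytes([0x8F, 0x3A, 0x02, 0x3A, 0x10])
plain_sample = b"k8s\x00\n\x0c\n\x02v1\x12\x06Secret"

print(describe_stored_value(encrypted_sample))
print(describe_stored_value(plain_sample))
```

The key name in the output should match the first key listed in the encryption config, since that is the key used for new writes.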

7. Manage Workloads

Cyphernetes does not change how workloads are deployed: you keep using Kubernetes-native tools (kubectl, Helm) as usual. Secrets are encrypted at rest because of the API server configuration, not because of Cyphernetes, and the resulting objects become data you can query.

Example: Application Deployment Using a Secret

Deploy a sample application using a secret:

Create a deployment.yaml file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      containers:
        - name: secure-app
          image: nginx
          env:
            - name: DB_USERNAME
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: username
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password

Apply the deployment:

kubectl apply -f deployment.yaml

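This runs an nginx Deployment whose container reads the credentials from the db-credentials Secret as environment variables. The Secret is encrypted in etcd by the API server, but inside the container the values are plaintext. With the Deployment running, the cluster has Pods, a ReplicaSet, and a Secret to explore with Cyphernetes queries.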

Conclusion

Cyphernetes represents a significant step forward in Kubernetes observability and querying. By treating Kubernetes objects as interconnected entities, it unlocks powerful new capabilities for DevOps engineers managing complex clusters. While it may not replace existing tools like kubectl outright, it offers a compelling alternative for those seeking a more intuitive and efficient querying experience.

As Kubernetes ecosystems continue to grow in complexity, tools like Cyphernetes will play a pivotal role in reducing operational overhead, improving performance, and enabling faster insights into cluster state. For DevOps practitioners and SREs, mastering this tool could be the key to taming the Kubernetes sprawl.
