Keys & Caches Basics

Core Concepts

Keys & Caches is an experiment tracking and profiling library. With the help of AI, it surfaces bottlenecks, errors, and optimization opportunities that are otherwise hidden or hard to find across your ML stack, from PyTorch down to the GPU.

Projects and Runs

Project

A collection of related experiments (e.g., “image-classification”, “llm-finetuning”)

Run

A single execution of your code with specific configuration and results

Config

Hyperparameters and settings tracked for each run

Metrics

Values logged during execution (loss, accuracy, custom metrics)

Traces

Model execution profiles showing layer-by-layer performance

Timings

Function execution times captured by timing decorators

Artifacts

Generated files like model traces, timing data, and code snapshots
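
The relationships among these concepts can be pictured as a small data model. The sketch below is only an illustration, not kandc's internal representation: the `Project` and `Run` classes, their field names, and the example artifact path are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Run:
    """One execution of your code (hypothetical model, not a kandc class)."""
    name: str
    config: dict[str, Any] = field(default_factory=dict)            # hyperparameters and settings
    metrics: list[dict[str, float]] = field(default_factory=list)   # values logged during execution
    timings: dict[str, list[float]] = field(default_factory=dict)   # seconds per timed function
    artifacts: list[str] = field(default_factory=list)              # paths of generated files

    def log(self, values: dict[str, float]) -> None:
        # Store a copy so later changes to the caller's dict do not alter history
        self.metrics.append(dict(values))


@dataclass
class Project:
    """A collection of related runs."""
    name: str
    runs: list[Run] = field(default_factory=list)

    def new_run(self, name: str, **config: Any) -> Run:
        run = Run(name=name, config=config)
        self.runs.append(run)
        return run


project = Project("image-classification")
run = project.new_run("baseline", lr=1e-3, batch_size=32)
run.log({"loss": 0.9, "accuracy": 0.61})
run.artifacts.append("traces/forward_pass.json")
print(len(project.runs), run.config["lr"], run.metrics[-1]["loss"])
```

Each run carries its own config, metrics, timings, and artifacts, while the project only groups runs together.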

Initialization Modes

kandc supports three different modes depending on your needs:

Online Mode (Default)
Offline Mode
Disabled Mode

Online Mode: Full Cloud Experience

kandc.init(project="my-project")  # mode="online" is default

What happens:

🔐 Authentication: Browser opens for sign-in (first time only)
🌐 Dashboard: Automatically opens your run dashboard
☁️ Cloud sync: All data synced in real-time
📊 Live metrics: Charts update as your code runs

Requirements: Internet connection and authentication
Best for: Production experiments, team collaboration, sharing results

Mode Comparison

Feature	Online	Offline	Disabled
Metrics Logging	✅	✅	❌
Dashboard	✅	❌	❌
Cloud Sync	✅	❌	❌
Authentication	Required	None	None
Internet Required	Yes	No	No
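
One way to use this comparison is to choose the mode at startup. The sketch below transcribes the table into a dictionary and picks a mode from it. Only `mode="online"` appears in this guide, so the strings `"offline"` and `"disabled"`, the `MODE_FEATURES` dictionary, and the `choose_mode` helper are assumptions made for illustration.

```python
# Feature matrix transcribed from the comparison table above.
MODE_FEATURES = {
    "online": {
        "metrics_logging": True, "dashboard": True, "cloud_sync": True,
        "needs_auth": True, "needs_internet": True,
    },
    "offline": {
        "metrics_logging": True, "dashboard": False, "cloud_sync": False,
        "needs_auth": False, "needs_internet": False,
    },
    "disabled": {
        "metrics_logging": False, "dashboard": False, "cloud_sync": False,
        "needs_auth": False, "needs_internet": False,
    },
}


def choose_mode(want_metrics: bool, has_internet: bool, can_authenticate: bool) -> str:
    """Pick the richest mode whose requirements are met (hypothetical helper)."""
    if not want_metrics:
        return "disabled"
    if has_internet and can_authenticate:
        return "online"
    return "offline"


mode = choose_mode(want_metrics=True, has_internet=False, can_authenticate=False)
print(mode, MODE_FEATURES[mode])
# The result could then be passed to kandc.init(project="my-project", mode=mode),
# assuming the installed version accepts these mode strings.
```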

Basic Usage

Here’s a complete transformer example showing the core kandc workflow:

import time
import random
import torch
import torch.nn as nn
import kandc

@kandc.capture_model_class(model_name="SimpleTransformer")
class SimpleTransformer(nn.Module):
    def __init__(self, input_dim=32, seq_len=16, d_model=64, nhead=4, num_layers=2, num_classes=10):
        super().__init__()
        self.input_dim = input_dim
        self.seq_len = seq_len
        self.d_model = d_model

        # Project input to d_model
        self.input_proj = nn.Linear(input_dim, d_model)

        # Positional encoding (learnable)
        self.pos_embedding = nn.Parameter(torch.zeros(1, seq_len, d_model))

        # Transformer encoder
        encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)

        # Output head
        self.head = nn.Sequential(
            nn.LayerNorm(d_model),
            nn.Linear(d_model, num_classes)
        )

    def forward(self, x):
        # x: (batch, seq_len, input_dim)
        x = self.input_proj(x)  # (batch, seq_len, d_model)
        x = x + self.pos_embedding  # Add positional encoding
        x = self.transformer(x)  # (batch, seq_len, d_model)
        x = x.mean(dim=1)  # Pool over sequence
        x = self.head(x)   # (batch, num_classes)
        return x

def main():
    # Initialize experiment tracking
    kandc.init(
        project="optimize-transformer",
        name="test-run-1",
        config={"d_model": 64, "nhead": 4, "num_layers": 2, "seq_len": 16},
        tags=["transformer", "pytorch"]
    )

    # Create model and data
    model = SimpleTransformer()
    # Simulate a batch of 32 sequences, each of length 16, with 32 features
    data = torch.randn(32, 16, 32)

    # Run model (automatically profiled due to decorator)
    output = model(data)
    loss = output.mean()

    @kandc.timed(name="random_wait")
    def random_wait():
        time.sleep(random.random() * 2)
        return "processing_complete"

    processing_result = random_wait()

    # Log metrics with custom x values
    for i in range(10):
        time.sleep(0.1)  # Simulate training time
        x_value = i * 0.5
        kandc.log({
            "loss": loss.item(),
            "accuracy": random.random(),
            "model_params": sum(p.numel() for p in model.parameters())
        }, x=x_value)

    # Finish the run
    kandc.finish()

if __name__ == "__main__":
    main()

PyTorch Profiling & Performance Analysis

kandc integrates with PyTorch’s built-in profiler to capture detailed GPU/CPU performance metrics, memory usage, CUDA kernel execution, and more. All profiling data is automatically exported as Chrome traces viewable in Perfetto UI.

Model Class Profiling

The most convenient way to profile models is using the class decorator:

import torch
import torch.nn as nn
import kandc

@kandc.capture_model_class(
    model_name="SimpleTransformer",
    record_shapes=True,      # Record tensor shapes
    profile_memory=True      # Profile memory usage
)
class SimpleTransformer(nn.Module):
    def __init__(self, input_dim=32, seq_len=16, d_model=64, nhead=4, num_layers=2, num_classes=10):
        super().__init__()
        self.input_dim = input_dim
        self.seq_len = seq_len
        self.d_model = d_model

        # Project input to d_model
        self.input_proj = nn.Linear(input_dim, d_model)

        # Positional encoding (learnable)
        self.pos_embedding = nn.Parameter(torch.zeros(1, seq_len, d_model))

        # Transformer encoder
        encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)

        # Output head
        self.head = nn.Sequential(
            nn.LayerNorm(d_model),
            nn.Linear(d_model, num_classes)
        )

    def forward(self, x):
        # x: (batch, seq_len, input_dim)
        x = self.input_proj(x)  # (batch, seq_len, d_model)
        x = x + self.pos_embedding  # Add positional encoding
        x = self.transformer(x)  # (batch, seq_len, d_model)
        x = x.mean(dim=1)  # Pool over sequence
        x = self.head(x)   # (batch, num_classes)
        return x

# Usage
kandc.init(project="model-profiling")
model = SimpleTransformer()
# Simulate a batch of 32 sequences, each of length 16, with 32 features
data = torch.randn(32, 16, 32)
output = model(data)  # Automatically profiled!
kandc.finish()

Model Instance Profiling

For existing models, wrap them with capture_model_instance:

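import torch
import torchvision
import kandc

# Existing model
model = torchvision.models.resnet18()

# Wrap for profiling
model = kandc.capture_model_instance(
    model,
    model_name="ResNet18_Pretrained",
    record_shapes=True,
    profile_memory=True
)

# Example input: a batch of 4 RGB images at 224x224
data = torch.randn(4, 3, 224, 224)

# Now all forward passes are profiled
output = model(data)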

Advanced Profiling Options

For more control over profiling, use the wrapper and decorator classes directly:

import torch
import kandc
from kandc.annotators import ProfilerWrapper, ProfilerDecorator

# Example input used by the calls below
data = torch.randn(4, 8)

# Wrap any object with detailed profiling
class MyModel:
    def forward(self, x):
        return x * 2

    def predict(self, x):
        return self.forward(x)

model = MyModel()
profiled_model = ProfilerWrapper(
    model,
    name="MyModel",
    activities=['cpu', 'cuda'],  # Profile both CPU and CUDA
    record_shapes=True,          # Record tensor shapes
    profile_memory=True,         # Profile memory usage
    with_stack=True              # Include call stacks
)

# All method calls are now profiled with PyTorch profiler
result = profiled_model.forward(data)
result = profiled_model.predict(data)

# Or use as a decorator
@ProfilerDecorator(name="OptimizedModel", record_shapes=True)
class OptimizedModel:
    def predict(self, x):
        return x * 2

# Convenience functions
profiled_obj = kandc.profile(model, name="MyObject")

@kandc.profiler(name="MyFunction")
def my_function(x):
    return x * 2  # Stand-in for an expensive computation
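
Conceptually, a wrapper of this kind intercepts method calls and records something about each one. The stand-in below is not `ProfilerWrapper`: it is a hypothetical `CallRecorder` that uses `time.perf_counter` rather than the PyTorch profiler, purely to show the interception pattern.

```python
import time
from typing import Any


class CallRecorder:
    """Proxy that forwards attribute access and times every method call."""

    def __init__(self, target: Any, name: str) -> None:
        self._target = target
        self._name = name
        self.calls: list[tuple[str, float]] = []  # (qualified method name, seconds)

    def __getattr__(self, attr: str) -> Any:
        # Only reached for attributes that are not defined on the proxy itself
        value = getattr(self._target, attr)
        if not callable(value):
            return value

        def timed(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return value(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                self.calls.append((f"{self._name}.{attr}", elapsed))

        return timed


class Doubler:
    def forward(self, x: float) -> float:
        return x * 2

    def predict(self, x: float) -> float:
        return self.forward(x)


recorded = CallRecorder(Doubler(), name="Doubler")
print(recorded.predict(21))
print([name for name, _ in recorded.calls])
```

Note that `predict` calls `self.forward` on the wrapped object directly, so this simple proxy records only the outer call. A real profiler sees the nested work as well.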

Environment Control

Disable profiling globally without changing your code:

# Disable profiling
export KANDC_PROFILER_DISABLED=1
python my_script.py

# Enable profiling (default)
unset KANDC_PROFILER_DISABLED
python my_script.py
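
The same switch can be set from Python, as long as that happens before any profiling starts. In the sketch below, only the variable name `KANDC_PROFILER_DISABLED` and the value `1` come from the commands above. The `profiling_disabled` helper and its rule for which values count as "disabled" are assumptions, not kandc's documented parsing.

```python
import os
from typing import Mapping

ENV_VAR = "KANDC_PROFILER_DISABLED"


def profiling_disabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when the flag holds a truthy value (assumed: 1, true, yes)."""
    return env.get(ENV_VAR, "").strip().lower() in {"1", "true", "yes"}


# Equivalent of `export KANDC_PROFILER_DISABLED=1` for the current process
os.environ[ENV_VAR] = "1"
print(profiling_disabled())

# Equivalent of `unset KANDC_PROFILER_DISABLED`
os.environ.pop(ENV_VAR, None)
print(profiling_disabled())
```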

Function-Level Profiling

Profile any function with the capture_trace decorator:

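import torch
import kandc

@kandc.capture_trace(
    trace_name="data_preprocessing",
    record_shapes=True
)
def preprocess_batch(images, labels):
    # Your preprocessing code; here, scale pixel values to [0, 1]
    processed_images = images.float() / 255.0
    return processed_images, labels

# Usage with a stand-in batch of 8 RGB images
raw_images = torch.randint(0, 256, (8, 3, 32, 32))
labels = torch.randint(0, 10, (8,))
processed_images, labels = preprocess_batch(raw_images, labels)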

Timing Functions

Capture execution times for any function:

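import torch
import kandc

@kandc.timed(name="model_inference")
def run_inference(model, batch):
    with torch.no_grad():
        return model(batch)

# Or time existing functions
def load_batch(batch_size):
    return torch.randn(batch_size, 16, 32)  # Stand-in for real data loading

result = kandc.timed_call("data_loading", load_batch, batch_size=32)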
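
To make the idea concrete, here is a minimal standard-library sketch of what a timing decorator does. It is not kandc's implementation: `timed_sketch` and the `TIMINGS` dictionary are hypothetical names, and the durations stay in memory instead of being saved as run artifacts.

```python
import functools
import time
from collections import defaultdict
from typing import Any, Callable

# Collected durations in seconds, keyed by the name given to the decorator
TIMINGS: dict[str, list[float]] = defaultdict(list)


def timed_sketch(name: str) -> Callable:
    """Decorator factory that records wall-clock time for each call."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the wrapped function raises
                TIMINGS[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed_sketch(name="square_sum")
def square_sum(n: int) -> int:
    return sum(i * i for i in range(n))


total = square_sum(1000)
print(total, len(TIMINGS["square_sum"]))
```

Using `time.perf_counter` rather than `time.time` gives a monotonic, high-resolution clock, which matters for short functions.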

Viewing Performance Data

All profiling data is automatically saved as trace artifacts that you can view in multiple ways:

Dashboard Viewer
Download & View
Chrome Tracing

To use the Dashboard Viewer in your Keys & Caches dashboard:

Navigate to your run
Click the Artifacts tab
Select any trace artifact
Click “Open in Viewer” to view in embedded Perfetto UI

What you’ll see in traces:

Layer-by-layer execution times
GPU kernel execution details
Memory allocation patterns
Tensor shapes and operations
Call stacks and function relationships
CPU vs GPU time breakdown
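
A downloaded trace can also be inspected programmatically. The sketch below assumes the file follows the Chrome Trace Event Format: a top-level `traceEvents` list in which complete events carry `"ph": "X"` and a `dur` in microseconds. The sample events and the `summarize_trace` helper are made up for illustration.

```python
import json
from collections import Counter


def summarize_trace(trace: dict) -> list[tuple[str, float]]:
    """Total duration in milliseconds per event name, slowest first."""
    totals: Counter = Counter()
    for event in trace.get("traceEvents", []):
        # Only complete events ("X") have a duration; metadata events do not
        if event.get("ph") == "X":
            totals[event["name"]] += event.get("dur", 0) / 1000.0
    return totals.most_common()


# Tiny hand-written trace; for a real file use json.load(open(path)) instead
sample = json.loads("""
{"traceEvents": [
  {"name": "aten::linear",     "ph": "X", "ts": 0,    "dur": 1500, "pid": 1, "tid": 1},
  {"name": "aten::layer_norm", "ph": "X", "ts": 1500, "dur": 500,  "pid": 1, "tid": 1},
  {"name": "aten::linear",     "ph": "X", "ts": 2000, "dur": 2500, "pid": 1, "tid": 1},
  {"name": "process_name",     "ph": "M", "pid": 1, "args": {"name": "python"}}
]}
""")

for name, total_ms in summarize_trace(sample):
    print(f"{name}: {total_ms:.1f} ms")
```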

Logging Metrics

Track any metrics during your experiments:

Basic Logging

kandc.init(project="training")

# Log single values
kandc.log({"loss": 0.25})
kandc.log({"accuracy": 0.92, "f1_score": 0.89})

# Log with step numbers (useful for training loops)
for epoch in range(100):
    loss = train_epoch()
    kandc.log({"epoch_loss": loss}, step=epoch)

Code Snapshot Configuration

kandc automatically captures your source code for reproducibility. You can control this behavior:

Disable Code Capture

# Disable code snapshot completely
kandc.init(
    project="my-project",
    capture_code=False  # No code will be captured or uploaded
)

Custom Exclude Patterns

# Exclude specific files/directories from code capture
kandc.init(
    project="my-project",
    capture_code=True,
    code_exclude_patterns=[
        "*.pth",           # Model files
        "data/",           # Data directory
        "experiments/",    # Experiment outputs
        "*.log",           # Log files
        "temp_*"           # Temporary files
    ]
)
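
To preview which files a pattern list like this would skip, glob-style matching can be approximated with the standard library. How kandc matches patterns is not specified here, so the rules in this sketch are assumptions: a pattern ending in `/` matches a directory of that name anywhere in the path, and any other pattern is compared against the file name only. The `is_excluded` helper and the sample file list are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

EXCLUDE_PATTERNS = ["*.pth", "data/", "experiments/", "*.log", "temp_*"]


def is_excluded(path: str, patterns: list[str] = EXCLUDE_PATTERNS) -> bool:
    """Return True if the path matches any exclude pattern (assumed rules)."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: excluded if any parent directory has that name
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern):
            # File pattern: compared against the file name only
            return True
    return False


files = [
    "train.py",
    "checkpoints/best.pth",
    "data/train.csv",
    "logs/run.log",
    "temp_notes.md",
    "src/model.py",
]
kept = [f for f in files if not is_excluded(f)]
print(kept)
```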

What Gets Captured by Default

When capture_code=True (default), kandc captures:

Included Files

Source code files:

.py, .js, .ts, .jsx, .tsx
.java, .cpp, .c, .h, .hpp
.cs, .go, .rs, .rb, .php
.swift, .kt, .scala, .r
.sql, .sh, .bash, .zsh
.yaml, .yml, .json, .toml
.md, .rst, .txt
.html, .css, .scss

Configuration files:

requirements.txt, pyproject.toml
package.json, Dockerfile
.gitignore, .env.example
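
A quick way to check whether a given file falls under these defaults is to compare its name and suffix against the lists above. The `would_capture` helper is hypothetical, and exact, case-sensitive suffix matching is an assumption. Exclude patterns and .gitignore rules are not modeled.

```python
from pathlib import PurePosixPath

# Suffixes and file names transcribed from the lists above
SOURCE_SUFFIXES = {
    ".py", ".js", ".ts", ".jsx", ".tsx",
    ".java", ".cpp", ".c", ".h", ".hpp",
    ".cs", ".go", ".rs", ".rb", ".php",
    ".swift", ".kt", ".scala", ".r",
    ".sql", ".sh", ".bash", ".zsh",
    ".yaml", ".yml", ".json", ".toml",
    ".md", ".rst", ".txt",
    ".html", ".css", ".scss",
}
CONFIG_NAMES = {
    "requirements.txt", "pyproject.toml",
    "package.json", "Dockerfile",
    ".gitignore", ".env.example",
}


def would_capture(path: str) -> bool:
    """True if the file name or suffix is on one of the default lists."""
    p = PurePosixPath(path)
    return p.name in CONFIG_NAMES or p.suffix in SOURCE_SUFFIXES


for candidate in ["src/train.py", "Dockerfile", ".gitignore", "weights/model.pth"]:
    print(candidate, would_capture(candidate))
```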

File Handling

kandc respects your .gitignore file when uploading code snapshots and traces. Add large files to .gitignore to avoid uploading them.

# Large model files
*.pth
*.safetensors
*.bin

# Data files
data/
datasets/
*.csv

# Environment
.env
venv/

Best practice: Download large models in your script rather than uploading them:

# Good: Download at runtime
from transformers import AutoModel
model = AutoModel.from_pretrained("bert-large-uncased")

# Avoid: Uploading large local files
# model = torch.load("my_5gb_model.pth")  # This would be uploaded

Error Handling

kandc is designed to fail gracefully:

try:
    kandc.init(project="my-project")
    # Your code here
    value = 0.5  # Placeholder metric value
    kandc.log({"metric": value})
finally:
    kandc.finish()  # Always finish, even if there's an error

If authentication fails or the backend is unavailable, kandc automatically falls back to offline mode and continues working locally.
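
The fallback behavior follows a familiar pattern, sketched below with stand-in functions. None of this is kandc's code: `connect_online`, `start_offline`, and `init_with_fallback` are hypothetical, and the simulated `ConnectionError` only stands in for a failed sign-in or an unreachable backend.

```python
from typing import Callable


def connect_online(project: str) -> dict:
    """Stand-in for an online start-up that fails."""
    raise ConnectionError("backend unavailable")


def start_offline(project: str) -> dict:
    """Stand-in for a local-only session."""
    return {"project": project, "mode": "offline"}


def init_with_fallback(
    project: str,
    online: Callable[[str], dict] = connect_online,
    offline: Callable[[str], dict] = start_offline,
) -> dict:
    """Try the online path first; on failure, continue locally."""
    try:
        return online(project)
    except (ConnectionError, PermissionError) as exc:
        print(f"online init failed ({exc}); continuing offline")
        return offline(project)


session = init_with_fallback("my-project")
print(session["mode"])
```

Catching only connection and permission errors keeps unrelated bugs visible instead of silently switching modes.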

Ready to dive deeper? Check out our complete example or get support.
