Chisel CLI accelerates your Python functions by automatically running them on cloud GPUs. Get your first GPU-accelerated function running in minutes.

Installation

1. Install Chisel CLI

Install the chisel-cli package using pip:
pip install chisel-cli
We recommend using a virtual environment to avoid dependency conflicts.
2. Verify installation

Test that Chisel is installed correctly:
chisel --help
You should see the Chisel CLI help message with available commands.

Your First GPU Function

1. Create a Python script

Create a new file called my_first_gpu_script.py:
my_first_gpu_script.py
from chisel import ChiselApp, GPUType

app = ChiselApp("my-app", gpu=GPUType.A100_80GB_1)

@app.capture_trace(trace_name="matrix_ops")
def matrix_multiply(size=1000):
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
    
    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)
    result = torch.mm(a, b)
    
    return result.cpu().numpy()

if __name__ == "__main__":
    result = matrix_multiply(2000)
    print(f"Result shape: {result.shape}")
The @app.capture_trace decorator marks functions for GPU execution when using the chisel command.
2. Test locally first

Run your script locally to ensure it works:
python my_first_gpu_script.py
Local execution uses CPU and helps verify your code works before GPU execution.
3. Run on cloud GPU

Execute the same script on a cloud GPU:
chisel python my_first_gpu_script.py
First-time execution will open your browser for authentication with Herdora.
4. Monitor execution

Watch the real-time output during execution:
  • Live upload progress with spinner
  • Instant job ID and URL when ready
  • Real-time stdout/stderr streaming
  • Cost information when job completes
Your function now runs on a powerful A100 GPU in the cloud!

Key Concepts

ChiselApp Configuration

The ChiselApp class is your main interface for GPU acceleration:
from chisel import ChiselApp, GPUType

app = ChiselApp("app-name", gpu=GPUType.A100_80GB_2)
Parameters:
  • "app-name": Job tracking identifier for monitoring
  • gpu: GPU configuration (see options below)
  • upload_dir: Directory to upload (default: current directory)

GPU Types Available

Choose the right GPU configuration for your workload:
Type                  GPUs      Memory   Best For
GPUType.A100_80GB_1   1x A100   80GB     Development, inference
GPUType.A100_80GB_2   2x A100   160GB    Medium training
GPUType.A100_80GB_4   4x A100   320GB    Large models
GPUType.A100_80GB_8   8x A100   640GB    Massive models
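As a rough sizing aid, the table above can be turned into a lookup that picks the smallest tier covering your memory needs. This is purely illustrative: pick_gpu_tier is a hypothetical helper, not part of the Chisel API, and the strings mirror the GPUType names from the table.

```python
# Illustrative only: map a required memory budget (in GB) to the smallest
# GPU tier from the table above. pick_gpu_tier is NOT a Chisel API.
TIERS = [
    (80, "A100_80GB_1"),
    (160, "A100_80GB_2"),
    (320, "A100_80GB_4"),
    (640, "A100_80GB_8"),
]

def pick_gpu_tier(required_gb):
    """Return the name of the smallest tier whose memory covers required_gb."""
    for memory_gb, tier in TIERS:
        if required_gb <= memory_gb:
            return tier
    raise ValueError(f"{required_gb} GB exceeds the largest tier (640 GB)")
```

For example, a 40 GB workload fits the single-GPU tier, while a 200 GB workload needs the 4x A100 configuration.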

@capture_trace Decorator

Mark functions for GPU execution and tracing:
@app.capture_trace(trace_name="operation")
def my_function():
    # Runs on cloud GPU when using 'chisel' command
    pass
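Conceptually, the decorator is a transparent pass-through in local runs: your function executes unchanged, and the chisel runner uses the attached metadata to dispatch it to a GPU. The sketch below is a mental model of that behavior, not Chisel's actual implementation; capture_trace_sketch and the _trace_name attribute are invented for illustration.

```python
import functools

# Conceptual sketch (NOT Chisel's real implementation): locally, the
# decorated function just runs as-is; a runner could read the attached
# trace name to know which functions to execute remotely.
def capture_trace_sketch(trace_name):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Transparent pass-through in a local run.
            return func(*args, **kwargs)
        wrapper._trace_name = trace_name  # metadata a runner could inspect
        return wrapper
    return decorator

@capture_trace_sketch(trace_name="demo")
def add(a, b):
    return a + b
```

This is why the "test locally first" step works: decorated functions behave like ordinary Python functions until you run them through the chisel command.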

Command Line Usage

# Run any Python script on GPU
chisel python script.py

# Works with modules too
chisel python -m pytest
chisel jupyter notebook

Authentication

1. First-time setup

When you run chisel for the first time:
  1. Browser opens automatically
  2. Sign in or create Herdora account
  3. Credentials stored securely in ~/.chisel/
Authentication happens automatically - no manual setup required!
2. Credential management

Manage your authentication programmatically:
from chisel.auth import is_authenticated, clear_credentials

# Check authentication status
if is_authenticated():
    print("✅ Ready to use Chisel")

# Clear credentials (if needed)
clear_credentials()
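A check like is_authenticated can be sketched as a lookup in ~/.chisel/, where the docs say credentials are stored. The exact file name ("credentials.json" below) and the helper names are assumptions for illustration, not Chisel's actual internals.

```python
from pathlib import Path

# Sketch only: credentials live under ~/.chisel/ per the docs above.
# The file name "credentials.json" is an ASSUMPTION for illustration.
def chisel_dir(home=None):
    base = Path(home) if home is not None else Path.home()
    return base / ".chisel"

def has_stored_credentials(home=None):
    """Return True if a credentials file exists in the chisel directory."""
    return (chisel_dir(home) / "credentials.json").is_file()
```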

Common Patterns

Error Handling

import torch

@app.capture_trace(trace_name="robust")
def robust_function(data):
    try:
        device = "cuda" if torch.cuda.is_available() else "cpu"
        # GPU code here: tensor lands on the GPU when one is available
        tensor = torch.tensor(data, device=device)
        return tensor.cpu().numpy()
    except Exception as e:
        print(f"Error: {e}")
        raise

Memory Management

import torch
import gc

@app.capture_trace(trace_name="memory_efficient")
def memory_efficient_processing(data):
    torch.cuda.empty_cache()  # Clear cache
    
    # Process in chunks
    chunk_size = 1000
    results = []
    
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i+chunk_size]
        result = process_chunk(chunk)  # placeholder for your per-chunk GPU work
        results.append(result.cpu())  # Move to CPU
        
        del chunk, result
        if i % (chunk_size * 10) == 0:
            torch.cuda.empty_cache()
            gc.collect()
    
    return torch.cat(results)
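The chunking pattern above is just index arithmetic and works the same without torch. The helper below (process_in_chunks is an illustrative name, not a Chisel API) shows it on a plain list: each slice covers [i, i + chunk_size), and the final chunk may be shorter.

```python
# The same chunking pattern in plain Python: slice the input in strides
# of chunk_size, apply fn to each slice, and collect the results.
def process_in_chunks(data, chunk_size, fn):
    results = []
    for i in range(0, len(data), chunk_size):
        results.append(fn(data[i:i + chunk_size]))
    return results
```

For instance, summing a 10-element list in chunks of 4 yields three partial sums, the last over only two elements.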

Next Steps

Now that you have Chisel running, explore the rest of the documentation for more features.
Need help? Join our community or reach out to contact@herdora.com for support.