The GPUType enum provides standardized GPU configurations for different workloads. Each type represents a specific combination of GPU count and memory capacity.

GPUType Enum

from chisel import GPUType

# Available GPU configurations
GPUType.A100_80GB_1  # Single A100-80GB GPU
GPUType.A100_80GB_2  # 2x A100-80GB GPUs
GPUType.A100_80GB_4  # 4x A100-80GB GPUs  
GPUType.A100_80GB_8  # 8x A100-80GB GPUs

GPU Specifications

Each configuration pairs a GPU count with per-GPU memory; totals follow directly from the enum names:

| GPU Type | GPU Count | Memory per GPU | Total Memory |
|----------|-----------|----------------|--------------|
| A100_80GB_1 | 1 | 80 GB | 80 GB |
| A100_80GB_2 | 2 | 80 GB | 160 GB |
| A100_80GB_4 | 4 | 80 GB | 320 GB |
| A100_80GB_8 | 8 | 80 GB | 640 GB |

Selection Guide

Choose the right GPU configuration based on your specific requirements:

By Use Case

Recommended: A100_80GB_1

Perfect for:
  • Code development and debugging
  • Small dataset experimentation
  • Algorithm prototyping
  • Model inference testing
# Development setup
from chisel import ChiselApp, GPUType

dev_app = ChiselApp("development", gpu=GPUType.A100_80GB_1)

@dev_app.capture_trace(trace_name="debug")
def test_function(data):
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Development code here; echo the input as a placeholder result
    result = data
    return result

By Memory Requirements

| Model Size | Parameters | Recommended GPU | Memory Reasoning |
|------------|------------|-----------------|------------------|
| Small | < 1B | A100_80GB_1 | Fits comfortably in 80GB |
| Medium | 1B - 7B | A100_80GB_2 | Benefits from parallel processing |
| Large | 7B - 30B | A100_80GB_4 | Requires distributed memory |
| Massive | 30B+ | A100_80GB_8 | Needs maximum memory capacity |
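
This table can also be encoded for programmatic selection. The helper below is a hypothetical sketch, not part of the chisel API; it takes the parameter count in billions:

from chisel import GPUType

def gpu_for_model_size(params_billions: float) -> GPUType:
    # Hypothetical helper mirroring the table above
    if params_billions < 1:
        return GPUType.A100_80GB_1  # Small: fits comfortably in 80GB
    elif params_billions < 7:
        return GPUType.A100_80GB_2  # Medium: benefits from parallel processing
    elif params_billions < 30:
        return GPUType.A100_80GB_4  # Large: requires distributed memory
    else:
        return GPUType.A100_80GB_8  # Massive: needs maximum memory capacity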

Performance Scaling

GPU scaling is not always linear: inter-GPU communication overhead can erode gains, so profile your specific workload before moving to a larger configuration.
# Example: Testing scaling performance
from chisel import ChiselApp, GPUType

def benchmark_scaling(data):
    import time

    configurations = [
        GPUType.A100_80GB_1,
        GPUType.A100_80GB_2,
        GPUType.A100_80GB_4,
        GPUType.A100_80GB_8,
    ]

    for gpu_type in configurations:
        app = ChiselApp(f"benchmark-{gpu_type.name}", gpu=gpu_type)

        @app.capture_trace(trace_name="benchmark")
        def benchmark_workload(data, gpu_type=gpu_type):  # bind the loop variable
            start_time = time.time()

            # Your workload here
            result = process_data(data)

            elapsed = time.time() - start_time
            print(f"{gpu_type.name}: {elapsed:.2f}s")
            return result

        benchmark_workload(data)
Usage Examples

Basic Usage

from chisel import ChiselApp, GPUType

# Direct enum usage (recommended)
app = ChiselApp("my-app", gpu=GPUType.A100_80GB_2)

Multi-App Workflow

from chisel import ChiselApp, GPUType

# Different apps for different stages
preprocessing_app = ChiselApp("preprocess", gpu=GPUType.A100_80GB_1)
training_app = ChiselApp("train", gpu=GPUType.A100_80GB_4) 
evaluation_app = ChiselApp("evaluate", gpu=GPUType.A100_80GB_2)

@preprocessing_app.capture_trace(trace_name="data_prep")
def preprocess_data(raw_data):
    # Light GPU work for data preprocessing (placeholder transform)
    processed_data = [x * 2 for x in raw_data]
    return processed_data

@training_app.capture_trace(trace_name="model_training")
def train_model(processed_data):
    # Heavy GPU work that benefits from 4 GPUs (placeholder "model")
    trained_model = {"weights": processed_data}
    return trained_model

@evaluation_app.capture_trace(trace_name="model_eval")
def evaluate_model(model, test_data):
    # Medium GPU work for evaluation (placeholder metrics)
    metrics = {"accuracy": 0.0}
    return metrics
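
Chained together, the three stages form a simple pipeline. The sample data below is purely illustrative:

raw_data = [1, 2, 3]
processed = preprocess_data(raw_data)
model = train_model(processed)
metrics = evaluate_model(model, test_data=[4, 5, 6])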

Cost Optimization

Choose the smallest GPU configuration that meets your performance requirements to optimize costs.

Cost-Performance Guidelines

  1. Start Small: Begin with A100_80GB_1 for development
  2. Scale Gradually: Move to larger configurations only when needed
  3. Monitor Usage: Track GPU utilization to confirm resources are not sitting idle
  4. Batch Processing: Group operations to maximize GPU utilization
# Cost-aware GPU selection
from chisel import GPUType

def cost_effective_selection(workload_type, data_size, time_constraint):
    if workload_type == "development":
        return GPUType.A100_80GB_1
    
    elif workload_type == "training":
        if data_size < 1_000_000:  # Small dataset
            return GPUType.A100_80GB_1
        elif data_size < 10_000_000:  # Medium dataset
            return GPUType.A100_80GB_2
        else:  # Large dataset
            return GPUType.A100_80GB_4 if time_constraint else GPUType.A100_80GB_2
    
    elif workload_type == "inference":
        return GPUType.A100_80GB_2  # Good balance for most inference
    
    else:
        return GPUType.A100_80GB_1  # Safe default
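
For example, a medium-sized training job without a hard deadline resolves to the two-GPU configuration (values below are hypothetical):

# Illustrative call
gpu = cost_effective_selection(
    workload_type="training",
    data_size=5_000_000,
    time_constraint=False,
)
print(gpu)  # GPUType.A100_80GB_2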