
LLM Setup Guide

Choose your LLM backend: Local models (complete control, privacy) or AWS Bedrock (enterprise Claude with Zero Data Retention).

Choose Your Backend

  • Local (LM Studio, Ollama, llama.cpp, vLLM) - Your code never leaves your infrastructure. Full privacy and control.
  • AWS Bedrock - Enterprise Claude 4.5 with Zero Data Retention. Your data stays in your AWS account, is never used for training, and meets compliance requirements.

Option 1: Local LLM Backends

All local backends provide complete data privacy - your code never leaves your infrastructure. Choose based on your preferences:

LM Studio Setup

Prerequisites

System Requirements

  • LM Studio installed (lmstudio.ai)
  • Compatible model (Qwen3-30B-A3B or similar)
  • RAM: At least 16GB (32GB recommended for 30B models)

Setup Steps

1. Install LM Studio

Download from lmstudio.ai and install for your platform (macOS, Windows, Linux).

2. Download Model

  1. Open LM Studio
  2. Navigate to "Discover" tab
  3. Search for "qwen3-30b-a3b"
  4. Download Q4_K_M quantization (recommended for performance/quality balance)

Quantization Explained

Q4_K_M is a 4-bit quantization that reduces model size and RAM usage while maintaining excellent quality. Perfect for 30B models on consumer hardware.
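As a rough sanity check on RAM needs, you can estimate weight size from bits per parameter. This is an approximation: Q4_K_M averages roughly 4.5 bits per weight (it mixes 4-bit and higher-precision blocks), and actual file sizes vary by architecture.

```python
# Back-of-envelope model memory estimate.
# Assumption: Q4_K_M averages ~4.5 bits per weight; real GGUF files vary.
def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB

fp16 = model_size_gb(30, 16)   # unquantized half precision
q4km = model_size_gb(30, 4.5)  # Q4_K_M, approximate

print(f"FP16: ~{fp16:.0f} GB, Q4_K_M: ~{q4km:.1f} GB")
```

The quantized weights alone fit comfortably in 32GB; the extra headroom covers the KV cache for the context window and the OS.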

3. Start Local Server

  1. Click "Local Server" tab
  2. Select your downloaded model
  3. Configure server settings:
    • Port: 1234 (default)
    • Context Length: 20000 tokens
    • Max Tokens: 8000
  4. Click "Start Server"

You should see "Server running on http://localhost:1234" when successful.
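Before pointing drep at the server, you can confirm the OpenAI-compatible API actually responds by querying the standard /v1/models route. A minimal sketch (assumes the default port 1234; this is generic OpenAI-compatible client code, not part of drep):

```python
import json
import urllib.request

def models_url(endpoint: str) -> str:
    # Normalize the endpoint and append the standard OpenAI-compatible route.
    return endpoint.rstrip("/") + "/models"

def list_models(endpoint: str = "http://localhost:1234/v1") -> list:
    # Query GET /v1/models and return the advertised model IDs.
    with urllib.request.urlopen(models_url(endpoint), timeout=5) as resp:
        data = json.load(resp)
    return [m["id"] for m in data.get("data", [])]

if __name__ == "__main__":
    print(list_models())  # should include your loaded model, e.g. qwen3-30b-a3b
```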

4. Configure drep

Update your config.yaml:

yaml
llm:
  enabled: true
  endpoint: http://localhost:1234/v1
  model: qwen3-30b-a3b
  temperature: 0.2
  max_tokens: 8000

  # Rate limiting
  max_concurrent_global: 5
  requests_per_minute: 60
  max_tokens_per_minute: 100000

  # Cache configuration
  cache:
    enabled: true
    directory: ~/.cache/drep/llm
    ttl_days: 30
    max_size_gb: 10.0

  # Circuit breaker (optional)
  circuit_breaker_threshold: 5
  circuit_breaker_timeout: 60
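A note on how these example rate limits interact: with the values above, the token budget rather than the request count is usually the binding limit. The arithmetic:

```python
# Sketch: how the example rate limits above interact (values from config.yaml).
requests_per_minute = 60
max_tokens_per_minute = 100_000
max_tokens_per_request = 8_000

# Average token budget if you actually issue 60 requests per minute.
avg_token_budget = max_tokens_per_minute / requests_per_minute
print(f"Average budget per request: {avg_token_budget:.0f} tokens")

# If every request used its full max_tokens, the token limit (not the
# request limit) would be the effective throttle:
token_bound_rpm = max_tokens_per_minute / max_tokens_per_request
print(f"Requests/min sustainable at max_tokens: {token_bound_rpm:.1f}")
```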

5. Verify Setup

Test your configuration:

bash
drep scan owner/repo --show-metrics


Setup Complete!

If you see LLM analysis results and token usage metrics, your setup is working correctly.

Remote LM Studio

For remote LM Studio instances (e.g., a dedicated server):

yaml
llm:
  endpoint: https://lmstudio.example.com/v1
  api_key: ${LM_STUDIO_KEY}  # If authentication enabled

Set the API key as an environment variable:

bash
export LM_STUDIO_KEY=your-api-key-here
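The ${LM_STUDIO_KEY} placeholder follows common shell-style substitution. Many config loaders implement it with something like os.path.expandvars; this is a generic illustration of the pattern, not drep's actual loader:

```python
import os

# Illustration only: shell-style ${VAR} expansion as many YAML config
# loaders implement it. drep's actual loading logic may differ.
os.environ["LM_STUDIO_KEY"] = "sk-example-123"  # normally set via `export`

raw = "api_key: ${LM_STUDIO_KEY}"
expanded = os.path.expandvars(raw)
print(expanded)  # -> api_key: sk-example-123
```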

Model Recommendations

Choose a model based on your available RAM and performance requirements:

  • Qwen3-30B-A3B: 30B parameters, 32GB RAM required, medium speed, excellent quality
  • Llama-3-70B: 70B parameters, 64GB RAM required, slow, best quality
  • Mistral-7B: 7B parameters, 8GB RAM required, fast, good quality

Recommended: Qwen3-30B-A3B provides the best balance of quality and performance for code review tasks.

Ollama Setup

Ollama provides simple CLI-based model management with Docker-friendly deployment:

  1. Install Ollama from ollama.ai
  2. Pull a model: ollama pull qwen2.5-coder:32b
  3. Update config.yaml:
yaml
llm:
  endpoint: http://localhost:11434/v1  # Ollama OpenAI-compatible API
  model: qwen2.5-coder:32b

llama.cpp Setup

llama.cpp provides lightweight, low-level control with minimal dependencies:

  1. Build llama.cpp with server support
  2. Start the server: ./llama-server -m model.gguf --port 8080 (the binary was named ./server in older builds)
  3. Update config.yaml:
yaml
llm:
  endpoint: http://localhost:8080/v1
  model: your-model-name
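All three local backends expose the same OpenAI-compatible chat endpoint, so a single client works against any of them. A minimal request sketch (endpoint and model values here match the Ollama example above; adjust for your backend):

```python
import json
import urllib.request

def chat_request(endpoint: str, model: str, prompt: str) -> urllib.request.Request:
    # Build a standard OpenAI-compatible POST /chat/completions request.
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        endpoint.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = chat_request("http://localhost:11434/v1", "qwen2.5-coder:32b", "Say hi")
print(req.full_url)
# Send with: urllib.request.urlopen(req) when a server is running.
```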

Option 2: AWS Bedrock (Enterprise Claude with ZDR)

AWS Bedrock provides enterprise-grade Claude 4.5 models with Zero Data Retention (ZDR). Your data stays in your AWS infrastructure, is never used for model training, and meets strict compliance requirements.

Enterprise Benefits

  • Zero Data Retention - Your data is never stored or used for training
  • Data Sovereignty - Data stays in your AWS region and account
  • Compliance - Meets HIPAA, GDPR, SOC 2, and other regulatory requirements
  • Latest Claude Models - Claude Sonnet 4.5 and Haiku 4.5 available now
  • AWS Integration - Works with IAM, CloudWatch, VPC endpoints

Prerequisites

Setup Steps

1. Enable Model Access

  1. Go to AWS Bedrock Console (region: us-east-1 or your preferred region)
  2. Navigate to Model Access in the left sidebar
  3. Click Modify model access
  4. Select Anthropic Claude models:
    • Claude Sonnet 4.5
    • Claude Haiku 4.5
  5. Click Save changes
  6. Wait for access to be granted (usually instant)

2. Configure AWS Credentials

Bedrock uses the standard AWS credentials chain. Choose one method:

Method A: AWS CLI Configuration

bash
aws configure
# Enter your AWS Access Key ID
# Enter your AWS Secret Access Key
# Default region: us-east-1
# Default output format: json

Method B: Environment Variables

bash
export AWS_ACCESS_KEY_ID=your_access_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_access_key
export AWS_DEFAULT_REGION=us-east-1
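When using environment-based credentials, a quick pre-flight check can catch a missing variable before a scan fails mid-run. A generic illustration (not part of drep):

```python
import os

def missing_aws_vars(env) -> list:
    # The minimum set needed for environment-based Bedrock credentials.
    required = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION"]
    return [name for name in required if not env.get(name)]

if __name__ == "__main__":
    missing = missing_aws_vars(os.environ)
    print("Missing AWS variables:", ", ".join(missing) if missing else "none")
```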

Method C: Credentials File

Create ~/.aws/credentials:

ini
[default]
aws_access_key_id = your_access_key_id
aws_secret_access_key = your_secret_access_key

Set the default region in ~/.aws/config (the standard location for the region setting):

ini
[default]
region = us-east-1

3. Configure drep

Update config.yaml to use Bedrock:

yaml
llm:
  enabled: true
  provider: bedrock  # Required for Bedrock

  bedrock:
    region: us-east-1
    model: anthropic.claude-sonnet-4-5-20250929-v1:0

  # General LLM settings
  temperature: 0.2
  max_tokens: 4000

  # Rate limiting (lower for Bedrock)
  max_concurrent_global: 3
  requests_per_minute: 30

  # Cache configuration
  cache:
    enabled: true
    ttl_days: 30

4. Verify Setup

Test your Bedrock configuration:

bash
drep scan owner/repo --show-metrics


Supported Bedrock Models

  • Claude Sonnet 4.5 (model ID: anthropic.claude-sonnet-4-5-20250929-v1:0): best balance of speed, cost, and quality; recommended for most use cases. Availability: global, us, eu, jp
  • Claude Haiku 4.5 (model ID: anthropic.claude-haiku-4-5-20251001-v1:0): fastest response times, lower cost; good for simple checks. Availability: global, us, eu
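Bedrock model IDs pack several fields into one string: provider, model name, snapshot date, and version. A small illustrative parser can help when sanity-checking a config value (not a drep feature, and the field layout is an observed convention rather than a documented guarantee):

```python
def parse_bedrock_model_id(model_id: str) -> dict:
    # Assumed format: <provider>.<model>-<YYYYMMDD>-v<major>:<minor>
    provider, rest = model_id.split(".", 1)
    base, version = rest.rsplit("-v", 1)
    name, snapshot = base.rsplit("-", 1)
    return {"provider": provider, "model": name,
            "snapshot": snapshot, "version": "v" + version}

info = parse_bedrock_model_id("anthropic.claude-sonnet-4-5-20250929-v1:0")
print(info)
```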

Bedrock Regions

  • US East: us-east-1
  • US West: us-west-2
  • EU (Frankfurt): eu-central-1
  • Asia Pacific: ap-southeast-1

Region Tip

Use us-east-1 for maximum model availability. Check model availability in your region before configuring.

Bedrock Troubleshooting

AccessDeniedException

ThrottlingException

Invalid model ID

Credentials not found

Troubleshooting (Local Backends)

Connection Refused

If drep cannot connect to the LLM server:

Circuit Breaker is OPEN

If you see "Circuit breaker is OPEN" errors:

Cache Not Working

If cache hit rate stays at 0%:

Slow Performance

To improve performance:

Need More Help?

Check the main README or create an issue on GitHub.

Next Steps