1. Get Your Credentials
Copy API Key
Go to Settings → API Keys → Create new key (starts with rpt_)
Copy Workspace ID
Your Workspace ID is in the dashboard header (UUID format)
2. Update Your Code
Just change the base URL and add two headers. Everything else stays the same.
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-openai-key",
    base_url="https://proxy.raptordata.dev/v1",
    default_headers={
        "X-Raptor-Api-Key": "rpt_your-key",
        "X-Raptor-Workspace-Id": "your-workspace-id"
    }
)

# That's it! Use normally
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-your-openai-key',
  baseURL: 'https://proxy.raptordata.dev/v1',
  defaultHeaders: {
    'X-Raptor-Api-Key': 'rpt_your-key',
    'X-Raptor-Workspace-Id': 'your-workspace-id'
  }
});

// That's it! Use normally
const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});
curl https://proxy.raptordata.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-openai-key" \
  -H "X-Raptor-Api-Key: rpt_your-key" \
  -H "X-Raptor-Workspace-Id: your-workspace-id" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}'
3. Verify It Works
Check for Raptor headers in the response:
X-Raptor-Cache: miss               # "hit" when cached
X-Raptor-Latency-Ms: 5             # Raptor overhead (~5ms)
X-Raptor-Upstream-Latency-Ms: 450  # AI provider time
Make the same request twice. The second time, you’ll see X-Raptor-Cache: hit and a much faster response.
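One way to read these headers from Python is the OpenAI SDK's raw-response helper (available in openai-python v1+). A minimal sketch, reusing the client configured in step 2; the header names are the ones listed above, and the expected values restate the cache behavior described here:

# Sketch: inspect Raptor's response headers via the OpenAI SDK's with_raw_response helper.
def ask(prompt: str):
    raw = client.chat.completions.with_raw_response.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    print("cache:", raw.headers.get("x-raptor-cache"))
    print("raptor overhead (ms):", raw.headers.get("x-raptor-latency-ms"))
    return raw.parse()  # the usual ChatCompletion object

ask("Hello!")  # first call: cache: miss
ask("Hello!")  # second identical call: cache: hit, much faster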
4. Use Streaming
Streaming works out of the box. Just set stream: true (stream=True in Python):
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
const stream = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
What’s Happening?
Every request now flows through Raptor:
Your App → Raptor Proxy → OpenAI/Anthropic
                │
                ├── Firewall check (~2ms)
                ├── Cache lookup (~1ms)
                ├── Evidence logging (async)
                └── Forward to AI
Total overhead: ~5ms. Built in Rust for speed.
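If you want to sanity-check that ~5ms figure yourself, you can time a request end to end and compare it against the latency headers from step 3. A rough sketch (it reuses the client from step 2 and the with_raw_response helper; your measured numbers will also include network time to the proxy):

# Sketch: compare wall-clock latency with the latency headers Raptor reports.
import time

start = time.perf_counter()
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
elapsed_ms = (time.perf_counter() - start) * 1000

raptor_ms = raw.headers.get("x-raptor-latency-ms")
upstream_ms = raw.headers.get("x-raptor-upstream-latency-ms")
print(f"end-to-end: {elapsed_ms:.0f}ms, raptor: {raptor_ms}ms, upstream: {upstream_ms}ms")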
Next Steps