Skip to main content

Version Control

Raptor provides built-in version control for documents, similar to Git for code. Track document evolution, compare versions, and manage document lineage.

Core Concepts

Documents, Versions, and Variants

  • Document: Represents a file and its complete version history
  • Version: A specific iteration of a document (e.g., v1, v2, v3)
  • Variant: Different processing configurations of the same version (e.g., different chunk sizes)
Document (contract.pdf)
├── Version 1 (original)
│   ├── Variant A (chunk_size=512)
│   └── Variant B (chunk_size=1024)
└── Version 2 (revised)
    └── Variant A (chunk_size=512)

Creating Versions

Manual Linking

Link a new upload to an existing document:
import Raptor from '@raptor-data/ts-sdk';

const raptor = new Raptor({ apiKey: process.env.RAPTOR_API_KEY });

// Upload initial version
const v1 = await raptor.process('contract_v1.pdf', {
  versionLabel: 'Initial Draft'
});

console.log(`Document ID: ${v1.documentId}`);
console.log(`Version: ${v1.versionNumber}`); // 1

// Upload new version with manual link
const v2 = await raptor.process('contract_v2.pdf', {
  parentDocumentId: v1.documentId,
  versionLabel: 'Client Revisions'
});

console.log(`Version: ${v2.versionNumber}`); // 2
console.log(`Parent: ${v2.parentDocumentId}`); // Same as v1.documentId

Auto-Linking

Raptor can automatically detect new versions:
// Upload original
const v1 = await raptor.process('contract_2024.pdf');

// Upload updated version (different filename)
const v2 = await raptor.process('contract_2024_revised.pdf');

if (v2.autoLinked) {
  console.log('Auto-linked to parent!');
  console.log(`Confidence: ${v2.autoLinkConfidence}`); // 0.92
  console.log(`Method: ${v2.autoLinkMethod}`); // "metadata_and_content"

  // Explanations
  v2.autoLinkExplanation?.forEach(reason => {
    console.log(`  - ${reason}`);
  });
  // Output:
  //   - High filename similarity: 0.95
  //   - Upload time proximity: 2 hours apart
  //   - Content similarity: 94% overlap
}
See Auto-Linking for more details.

Listing Versions

Get All Versions

const versions = await raptor.listVersions('document-id');

console.log(`Total versions: ${versions.total_count}`);

versions.versions.forEach(version => {
  console.log(`Version ${version.version_number}: ${version.filename}`);
  console.log(`  Created: ${version.created_at}`);
  console.log(`  Variants: ${version.variants.length}`);

  // Primary variant info
  const primary = version.variants.find(v => v.is_primary);
  if (primary) {
    console.log(`  Chunks: ${primary.chunks_count}`);
    console.log(`  Status: ${primary.status}`);
  }
});

Get Specific Version

const version = await raptor.getVersion('document-id', 2);

console.log(`Version 2: ${version.filename}`);
console.log(`Variants: ${version.variants.length}`);

version.variants.forEach(variant => {
  console.log(`Variant ${variant.config_hash.substring(0, 8)}:`);
  console.log(`  Status: ${variant.status}`);
  console.log(`  Chunks: ${variant.chunks_count}`);
  console.log(`  Primary: ${variant.is_primary}`);
});

Version Labels

Add human-readable labels to versions:
// Set label during upload
const result = await raptor.process('contract.pdf', {
  versionLabel: 'Final - Approved by Legal'
});

// Update label later
await raptor.updateVersionLabel('document-id', 'v2.1 - Bug Fixes');

console.log('Label updated');

Default Versions

Set which version is returned by default:
// Get current default
const doc = await raptor.getDocument('document-id');
console.log(`Default version: ${doc.version_number}`);

// Set version 3 as default
await raptor.setDefaultVersion('document-id', 3);

// Now queries return version 3
const updated = await raptor.getDocument('document-id');
console.log(`New default: ${updated.version_number}`); // 3

Deleting Versions

Delete Specific Version

await raptor.deleteVersion('document-id', 2);

console.log('Version 2 deleted');

// If deleted version was latest, previous version is promoted
const doc = await raptor.getDocument('document-id');
console.log(`New latest: ${doc.version_number}`); // 1 (promoted)

Delete Document

Deleting a document deletes all versions and variants:
await raptor.deleteDocument('document-id');

console.log('Document and all versions deleted');
Soft deletes: Raptor uses soft deletes. Deleted content is marked as deleted but metadata is preserved for audit trails and analytics.

Reverting Versions

Create a new version based on an old version:
// Revert to version 1
const reverted = await raptor.revertToVersion('document-id', 1);

console.log(`Reverted from version ${reverted.revertedFromVersion}`);
console.log(`New version: ${reverted.newVersionNumber}`);
console.log(`New version ID: ${reverted.newVersionId}`);

Comparing Versions

See what changed between versions:
const diff = await raptor.compareDocuments('doc-v1-id', 'doc-v2-id');

console.log(diff.summary);
console.log(`Similarity: ${(diff.similarityScore * 100).toFixed(1)}%`);

console.log(`\nAdded: ${diff.diff.addedCount} chunks`);
diff.diff.addedChunks.forEach(chunk => {
  console.log(`+ [${chunk.chunk_index}] ${chunk.text.substring(0, 100)}...`);
});

console.log(`\nRemoved: ${diff.diff.removedCount} chunks`);
diff.diff.removedChunks.forEach(chunk => {
  console.log(`- [${chunk.chunk_index}] ${chunk.text.substring(0, 100)}...`);
});

console.log(`\nModified: ${diff.diff.modifiedCount} chunks`);

Processing Variants

Multiple processing configurations for the same version:

Reprocess with Different Config

// Original processing
const v1 = await raptor.process('document.pdf', {
  chunkSize: 512
});

// Reprocess with different chunk size (creates new variant)
const v2 = await raptor.process('document.pdf', {
  chunkSize: 1024
});

// Same document, same version, different variants
console.log(v1.documentId === v2.documentId); // true
console.log(v1.versionId === v2.versionId); // true
console.log(v1.variantId === v2.variantId); // false (different variants)

List Variants

const version = await raptor.getVersion('document-id', 1);

version.variants.forEach(variant => {
  console.log(`Variant ${variant.config_hash.substring(0, 8)}:`);
  console.log(`  Chunk size: ${variant.config.chunk_size}`);
  console.log(`  Strategy: ${variant.config.strategy}`);
  console.log(`  Chunks: ${variant.chunks_count}`);
  console.log(`  Primary: ${variant.is_primary}`);
});

Delete Variant

await raptor.deleteVariant('variant-id');

console.log('Variant deleted');

// If deleted variant was primary, another variant is promoted

Advanced: Document Lineage

Get Lineage

Get complete version history:
const lineage = await raptor.getDocumentLineage('document-id');

console.log(`Lineage ID: ${lineage.lineage_id}`);
console.log(`Total versions: ${lineage.total_versions}`);

lineage.documents.forEach(doc => {
  console.log(`${doc.version_label || doc.filename}`);
  console.log(`  Parent: ${doc.parent_document_id || 'None'}`);
  console.log(`  Similarity: ${doc.similarity_score}`);
  console.log(`  Created: ${doc.created_at}`);
});

Lineage Tree

Visualize version relationships:
const tree = await raptor.getDocumentLineageTree('document-id');

function printTree(node, indent = '') {
  console.log(`${indent}${node.filename}`);
  node.children.forEach(child => printTree(child, indent + '  '));
}

tree.roots.forEach(root => printTree(root));

// Output:
// contract_v1.pdf
//   contract_v2.pdf
//     contract_v3.pdf
//   contract_v2_alt.pdf

Lineage Stats

const stats = await raptor.getLineageStats('document-id');

console.log(`Total versions: ${stats.total_versions}`);
console.log(`Oldest version: ${stats.oldest_version.filename}`);
console.log(`Created: ${stats.oldest_version.created_at}`);

Lineage Changelog

Generate a changelog for entire lineage:
const changelog = await raptor.getLineageChangelog('document-id');

console.log(`${changelog.total_transitions} version transitions`);

changelog.changelogs.forEach(entry => {
  console.log(`\n${entry.from_label}${entry.to_label}`);
  console.log(`Similarity: ${entry.similarity_score}`);
  console.log(`Added: ${entry.diff.added_count} chunks`);
  console.log(`Removed: ${entry.diff.removed_count} chunks`);
  console.log(`Modified: ${entry.diff.modified_count} chunks`);
});

Best Practices

Use version labels: Add meaningful labels like “v1.0”, “Draft”, “Final” to make versions easily identifiable.
Auto-linking is smart: Raptor uses filename similarity, upload time proximity, file size, and content sampling to detect versions with 85%+ accuracy.
Deleting is soft: Deleted versions are soft-deleted (marked as deleted but not removed). This preserves audit trails and billing records.