Document Lineage
Document lineage tracks the full version history of a document, much as Git tracks code history. Use it to understand parent-child relationships, compare versions, and visualize how a document evolves over time.
Core Concepts
Lineage Structure
Lineage (UUID-based group)
├── Document A (root)
│   ├── Document B (child of A)
│   │   └── Document D (child of B)
│   └── Document C (child of A)
└── Document E (root - different branch)
Each document in a lineage has:
- One parent (except roots)
- Multiple children
- Version label (e.g., “v1.0”, “Draft”)
- Similarity score to parent (0.0-1.0)
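The parent-child structure above can be rebuilt client-side from a flat list of documents. The following is a minimal sketch, not part of the Raptor SDK: `LineageDoc`, `TreeNode`, and `buildLineageTree` are hypothetical names, and only the `id` and `parent_document_id` fields are taken from the examples in this guide.

```typescript
// Hypothetical minimal record; `id` and `parent_document_id` mirror the
// field names used in the lineage examples in this guide.
interface LineageDoc {
  id: string;
  parent_document_id: string | null;
}

interface TreeNode {
  id: string;
  children: TreeNode[];
}

// Group a flat list of documents into root nodes with nested children.
function buildLineageTree(docs: LineageDoc[]): TreeNode[] {
  const nodes = new Map<string, TreeNode>();
  for (const doc of docs) {
    nodes.set(doc.id, { id: doc.id, children: [] });
  }

  const roots: TreeNode[] = [];
  for (const doc of docs) {
    const node = nodes.get(doc.id)!;
    const parent = doc.parent_document_id
      ? nodes.get(doc.parent_document_id)
      : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      // No parent (or a parent outside the list): treat as a root.
      roots.push(node);
    }
  }
  return roots;
}

// The five documents from the diagram above.
const sampleDocs: LineageDoc[] = [
  { id: 'A', parent_document_id: null },
  { id: 'B', parent_document_id: 'A' },
  { id: 'C', parent_document_id: 'A' },
  { id: 'D', parent_document_id: 'B' },
  { id: 'E', parent_document_id: null },
];
const sampleRoots = buildLineageTree(sampleDocs);
console.log(sampleRoots.map(r => r.id).join(', ')); // prints "A, E"
```

Two passes keep the sketch independent of input order: a child may appear in the list before its parent.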
Get Lineage
Linear Lineage
Get all documents in chronological order:
import Raptor from '@raptor-data/ts-sdk';
const raptor = new Raptor({ apiKey: process.env.RAPTOR_API_KEY });
const lineage = await raptor.getDocumentLineage('document-id');
console.log(`Lineage ID: ${lineage.lineage_id}`);
console.log(`Total versions: ${lineage.total_versions}`);
lineage.documents.forEach(doc => {
  const label = doc.version_label || doc.filename;
  const parent = doc.parent_document_id ? `→ child of ${doc.parent_document_id}` : '(root)';
  // A root has no parent to compare against, so print N/A instead of 0.0%
  const similarity = doc.similarity_score == null
    ? 'N/A'
    : `${(doc.similarity_score * 100).toFixed(1)}%`;
  console.log(`${label} ${parent}`);
  console.log(`  Similarity: ${similarity}`);
  console.log(`  Created: ${doc.created_at}`);
  console.log(`  Current: ${doc.is_current ? 'Yes' : 'No'}`);
});
Example output:
contract_v1.pdf (root)
  Similarity: N/A
  Created: 2024-01-15T10:00:00Z
  Current: No
contract_v2.pdf → child of doc-abc
  Similarity: 87.3%
  Created: 2024-01-20T14:30:00Z
  Current: No
contract_final.pdf → child of doc-def
  Similarity: 94.1%
  Created: 2024-01-25T09:15:00Z
  Current: Yes
Include Deleted Documents
const lineage = await raptor.getDocumentLineage('document-id', {
  includeDeleted: true
});
lineage.documents.forEach(doc => {
  const deleted = doc.deleted_at ? '(deleted)' : '';
  console.log(`${doc.filename} ${deleted}`);
});
Lineage Tree
Visualize parent-child relationships as a tree:
const tree = await raptor.getDocumentLineageTree('document-id');
console.log(`Total versions: ${tree.totalVersions}`);
// Print each node, indenting children two spaces deeper than their parent
function printTree(node, indent = '') {
  console.log(`${indent}${node.filename} (${node.versionLabel || 'no label'})`);
  node.children.forEach(child => printTree(child, indent + '  '));
}
tree.roots.forEach(root => printTree(root));
Example output:
contract_v1.pdf (Initial Draft)
  contract_v2.pdf (Client Review)
    contract_v3.pdf (Final)
  contract_v2_alt.pdf (Alternative Approach)
template.pdf (Template)
  instance_1.pdf (Instance 1)
  instance_2.pdf (Instance 2)
Tree Structure
interface LineageTreeNode {
  id: string;
  filename: string;
  contentHash: string;
  versionLabel: string | null;
  createdAt: string;
  children: LineageTreeNode[];
}
interface DocumentLineageTreeResponse {
  totalVersions: number;
  roots: LineageTreeNode[];
}
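Because the tree is a plain recursive structure, summary figures can be computed with short recursive helpers. The sketch below assumes only the `LineageTreeNode` shape shown above; `countNodes`, `maxDepth`, `leafFilenames`, and `makeNode` are hypothetical helpers, not SDK functions, and the sample hash and timestamp values are placeholders.

```typescript
// Same shape as the LineageTreeNode interface above, repeated so the
// snippet is self-contained.
interface LineageTreeNode {
  id: string;
  filename: string;
  contentHash: string;
  versionLabel: string | null;
  createdAt: string;
  children: LineageTreeNode[];
}

// Count every node reachable from the given roots.
function countNodes(roots: LineageTreeNode[]): number {
  let total = 0;
  for (const node of roots) {
    total += 1 + countNodes(node.children);
  }
  return total;
}

// Length of the longest root-to-leaf chain; an empty forest has depth 0.
function maxDepth(roots: LineageTreeNode[]): number {
  let deepest = 0;
  for (const node of roots) {
    deepest = Math.max(deepest, 1 + maxDepth(node.children));
  }
  return deepest;
}

// Filenames of nodes with no children, i.e. versions nothing derives from.
function leafFilenames(roots: LineageTreeNode[]): string[] {
  const result: string[] = [];
  for (const node of roots) {
    if (node.children.length === 0) {
      result.push(node.filename);
    } else {
      result.push(...leafFilenames(node.children));
    }
  }
  return result;
}

// Placeholder builder for sample data; hash and timestamp values are made up.
function makeNode(filename: string, children: LineageTreeNode[] = []): LineageTreeNode {
  return {
    id: filename,
    filename,
    contentHash: 'placeholder',
    versionLabel: null,
    createdAt: '2024-01-15T10:00:00Z',
    children,
  };
}

const sampleForest: LineageTreeNode[] = [
  makeNode('contract_v1.pdf', [
    makeNode('contract_v2.pdf', [makeNode('contract_v3.pdf')]),
    makeNode('contract_v2_alt.pdf'),
  ]),
  makeNode('template.pdf', [makeNode('instance_1.pdf'), makeNode('instance_2.pdf')]),
];

console.log(countNodes(sampleForest)); // prints 7
console.log(maxDepth(sampleForest)); // prints 3
console.log(leafFilenames(sampleForest).join(', '));
// prints "contract_v3.pdf, contract_v2_alt.pdf, instance_1.pdf, instance_2.pdf"
```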
Lineage Stats
Get summary statistics:
const stats = await raptor.getLineageStats('document-id');
console.log(`Total versions: ${stats.total_versions}`);
console.log(`Oldest version: ${stats.oldest_version.filename}`);
console.log(`Created: ${stats.oldest_version.created_at}`);
Finding Similar Documents
Discover documents that might be related:
const similar = await raptor.findSimilarDocuments('document-id', {
  minSimilarity: 0.7, // Minimum 70% similarity
  limit: 10
});
console.log(`Found ${similar.count} similar documents`);
similar.suggestions.forEach(doc => {
  console.log(`${doc.filename}`);
  console.log(`  Similarity: ${(doc.similarityScore * 100).toFixed(1)}%`);
  console.log(`  Created: ${doc.createdAt}`);
  console.log(`  Label: ${doc.versionLabel || 'N/A'}`);
});
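Suggestions can be triaged client-side before deciding which ones to link. The sketch below splits them at the 0.85 cutoff that the Best Practices section names as a strong version relationship; `Suggestion` and `splitByStrength` are hypothetical names, and only `filename` and `similarityScore` come from the example above.

```typescript
// Hypothetical minimal suggestion shape; field names follow the example above.
interface Suggestion {
  filename: string;
  similarityScore: number; // 0.0-1.0
}

interface SplitSuggestions {
  strong: Suggestion[];  // likely versions of the same document
  related: Suggestion[]; // similar, but below the cutoff
}

// Partition suggestions at `threshold`, each group sorted best-first.
function splitByStrength(suggestions: Suggestion[], threshold = 0.85): SplitSuggestions {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...suggestions].sort((a, b) => b.similarityScore - a.similarityScore);
  return {
    strong: sorted.filter(s => s.similarityScore > threshold),
    related: sorted.filter(s => s.similarityScore <= threshold),
  };
}

const sampleSuggestions: Suggestion[] = [
  { filename: 'a.pdf', similarityScore: 0.75 },
  { filename: 'b.pdf', similarityScore: 0.9 },
  { filename: 'c.pdf', similarityScore: 0.8 },
];
const split = splitByStrength(sampleSuggestions);
console.log(split.strong.map(s => s.filename).join(', '));  // prints "b.pdf"
console.log(split.related.map(s => s.filename).join(', ')); // prints "c.pdf, a.pdf"
```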
Manual Linking
Link to Parent
Create a parent-child relationship manually:
const result = await raptor.linkToParent('child-id', {
  parentDocumentId: 'parent-id',
  versionLabel: 'v2.0 - Updated'
});
console.log(`Linked ${result.documentId} to ${result.parentDocumentId}`);
console.log(`Lineage ID: ${result.lineageId}`);
console.log(`Similarity: ${result.similarityScore}`);
console.log(`Label: ${result.versionLabel}`);
Unlink from Lineage
Remove a document from its lineage (this creates a new lineage for it):
const result = await raptor.unlinkFromLineage('document-id');
console.log(`Document ${result.documentId} unlinked`);
console.log(`New lineage ID: ${result.newLineageId}`);
console.log(`Parent: ${result.parentDocumentId}`); // null
Lineage Changelog
Generate a changelog showing all version transitions:
const changelog = await raptor.getLineageChangelog('document-id');
console.log(`${changelog.totalTransitions} version transitions\n`);
changelog.changelogs.forEach(entry => {
  console.log(`${entry.from_label || entry.from_filename} → ${entry.to_label || entry.to_filename}`);
  console.log(`  Similarity: ${(entry.similarity_score * 100).toFixed(1)}%`);
  console.log(`  Date: ${entry.created_at}`);
  const diff = entry.diff;
  console.log(`  Changes:`);
  console.log(`    + ${diff.added_count} chunks added`);
  console.log(`    - ${diff.removed_count} chunks removed`);
  console.log(`    ~ ${diff.modified_count} chunks modified`);
  if (diff.summary) {
    console.log(`  Summary: ${diff.summary}`);
  }
  console.log();
});
Example output:
Initial Draft → Client Review
  Similarity: 87.3%
  Date: 2024-01-20T14:30:00Z
  Changes:
    + 12 chunks added
    - 5 chunks removed
    ~ 8 chunks modified
  Summary: Added client feedback sections

Client Review → Final
  Similarity: 94.1%
  Date: 2024-01-25T09:15:00Z
  Changes:
    + 3 chunks added
    - 2 chunks removed
    ~ 4 chunks modified
  Summary: Minor corrections and formatting
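To see how much a lineage changed overall, the per-transition counts can be summed. This is a minimal sketch under the assumption that each entry carries a `diff` with the three count fields used above; `ChangelogDiff`, `ChangelogEntry`, and `totalChanges` are hypothetical names, not SDK types.

```typescript
// Hypothetical minimal shapes; the field names follow the changelog
// example above.
interface ChangelogDiff {
  added_count: number;
  removed_count: number;
  modified_count: number;
}

interface ChangelogEntry {
  diff: ChangelogDiff;
}

// Sum the added/removed/modified chunk counts across all transitions.
function totalChanges(entries: ChangelogEntry[]): ChangelogDiff {
  const totals: ChangelogDiff = { added_count: 0, removed_count: 0, modified_count: 0 };
  for (const { diff } of entries) {
    totals.added_count += diff.added_count;
    totals.removed_count += diff.removed_count;
    totals.modified_count += diff.modified_count;
  }
  return totals;
}

// The two transitions from the sample output above.
const sampleEntries: ChangelogEntry[] = [
  { diff: { added_count: 12, removed_count: 5, modified_count: 8 } },
  { diff: { added_count: 3, removed_count: 2, modified_count: 4 } },
];
const changeTotals = totalChanges(sampleEntries);
console.log(`+${changeTotals.added_count} -${changeTotals.removed_count} ~${changeTotals.modified_count}`);
// prints "+15 -7 ~12"
```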
Comparing Documents
See detailed differences between any two documents:
const diff = await raptor.compareDocuments('doc1-id', 'doc2-id');
console.log(diff.summary);
console.log(`Similarity: ${(diff.similarityScore * 100).toFixed(1)}%`);
// Added chunks
console.log(`\nAdded (${diff.diff.addedCount}):`);
diff.diff.addedChunks.forEach(chunk => {
  console.log(`+ [Chunk ${chunk.chunk_index}, Page ${chunk.page_number}]`);
  console.log(`  ${chunk.text.substring(0, 150)}...`);
});
// Removed chunks
console.log(`\nRemoved (${diff.diff.removedCount}):`);
diff.diff.removedChunks.forEach(chunk => {
  console.log(`- [Chunk ${chunk.chunk_index}, Page ${chunk.page_number}]`);
  console.log(`  ${chunk.text.substring(0, 150)}...`);
});
// Modified chunks
console.log(`\nModified (${diff.diff.modifiedCount}):`);
diff.diff.modifiedChunks.forEach(pair => {
  console.log(`~ [Chunk ${pair.old.chunk_index}]`);
  console.log(`  Before: ${pair.old.text.substring(0, 100)}...`);
  console.log(`  After: ${pair.new.text.substring(0, 100)}...`);
  console.log(`  Similarity: ${(pair.similarity * 100).toFixed(1)}%`);
});
Practical Examples
Track Contract Evolution
// Upload contract versions
const v1 = await raptor.process('contract_draft.pdf', {
  versionLabel: 'Initial Draft'
});
const v2 = await raptor.process('contract_client_review.pdf', {
  parentDocumentId: v1.documentId,
  versionLabel: 'Client Review'
});
const v3 = await raptor.process('contract_final.pdf', {
  parentDocumentId: v2.documentId,
  versionLabel: 'Final - Signed'
});
// Get complete history
const lineage = await raptor.getDocumentLineage(v3.documentId);
lineage.documents.forEach(doc => {
  console.log(`${doc.version_label}: ${doc.filename}`);
});
// Generate changelog
const changelog = await raptor.getLineageChangelog(v3.documentId);
console.log(`\nComplete changelog:`);
changelog.changelogs.forEach(entry => {
  console.log(`\n${entry.from_label} → ${entry.to_label}`);
  console.log(`Changes: +${entry.diff.added_count}, -${entry.diff.removed_count}`);
});
Visualize Document Tree
async function visualizeLineage(documentId: string) {
  const tree = await raptor.getDocumentLineageTree(documentId);
  console.log(`\nDocument Lineage (${tree.totalVersions} versions):`);
  console.log('='.repeat(50));
  function printNode(node, prefix = '', isLast = true) {
    const connector = isLast ? '└── ' : '├── ';
    const label = node.versionLabel || 'Unlabeled';
    console.log(`${prefix}${connector}${node.filename} (${label})`);
    // Continue the vertical rule only while later siblings remain
    const newPrefix = prefix + (isLast ? '    ' : '│   ');
    node.children.forEach((child, index) => {
      const childIsLast = index === node.children.length - 1;
      printNode(child, newPrefix, childIsLast);
    });
  }
  tree.roots.forEach((root, index) => {
    const isLast = index === tree.roots.length - 1;
    printNode(root, '', isLast);
  });
}
await visualizeLineage('document-id');
Example output:
Document Lineage (7 versions):
==================================================
├── contract_v1.pdf (Initial Draft)
│   ├── contract_v2.pdf (Client Review)
│   │   └── contract_v3.pdf (Final)
│   └── contract_v2_alt.pdf (Alternative)
└── template.pdf (Template)
    ├── instance_1.pdf (Instance 1)
    └── instance_2.pdf (Instance 2)
Find Related Documents
async function findRelatedDocs(documentId: string) {
  // Get documents in the same lineage
  const lineage = await raptor.getDocumentLineage(documentId);
  console.log('Documents in lineage:');
  lineage.documents.forEach(doc => {
    // similarity_score is a 0.0-1.0 fraction and may be missing (e.g. for roots)
    const similarity = doc.similarity_score == null
      ? 'root'
      : `${(doc.similarity_score * 100).toFixed(1)}% similar`;
    console.log(`  - ${doc.filename} (${similarity})`);
  });
  // Find similar documents outside lineage
  const similar = await raptor.findSimilarDocuments(documentId, {
    minSimilarity: 0.7,
    limit: 5
  });
  console.log('\nSimilar documents (outside lineage):');
  similar.suggestions.forEach(doc => {
    console.log(`  - ${doc.filename} (${(doc.similarityScore * 100).toFixed(1)}% similar)`);
  });
}
await findRelatedDocs('document-id');
Audit Document History
async function auditHistory(documentId: string) {
  const lineage = await raptor.getDocumentLineage(documentId, {
    includeDeleted: true
  });
  console.log('Document History Audit');
  console.log('='.repeat(50));
  lineage.documents.forEach((doc, index) => {
    // Compare against null so a genuine score of 0 is not reported as N/A
    const similarity = doc.similarity_score == null
      ? 'N/A'
      : `${(doc.similarity_score * 100).toFixed(1)}%`;
    console.log(`\n${index + 1}. ${doc.filename}`);
    console.log(`   ID: ${doc.id}`);
    console.log(`   Label: ${doc.version_label || 'N/A'}`);
    console.log(`   Parent: ${doc.parent_document_id || 'None (root)'}`);
    console.log(`   Similarity: ${similarity}`);
    console.log(`   Created: ${doc.created_at}`);
    console.log(`   Status: ${doc.deleted_at ? 'Deleted' : 'Active'}`);
    console.log(`   Current: ${doc.is_current ? 'Yes' : 'No'}`);
  });
  // Get stats
  const stats = await raptor.getLineageStats(documentId);
  console.log(`\nTotal versions: ${stats.total_versions}`);
  console.log(`Oldest: ${stats.oldest_version.filename} (${stats.oldest_version.created_at})`);
}
await auditHistory('document-id');
Best Practices
Use version labels: Add meaningful labels to make lineage easier to understand.
Similarity scores: Scores above 0.85 (85%) indicate strong version relationships.
Circular references: You cannot create a circular lineage (A → B → A). Raptor prevents this automatically.
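To illustrate what a circular-reference check involves, the sketch below walks up the ancestor chain of the proposed parent and reports a cycle if it reaches the child. This is only a client-side illustration under assumed data: `ParentMap` and `wouldCreateCycle` are hypothetical names, and nothing here describes how Raptor implements its own check.

```typescript
// Map from document ID to its parent ID (null for roots). Hypothetical
// client-side snapshot, not a Raptor SDK type.
type ParentMap = Map<string, string | null>;

// Returns true if making `parentId` the parent of `childId` would create
// a cycle, i.e. if `childId` is `parentId` itself or one of its ancestors.
function wouldCreateCycle(parents: ParentMap, childId: string, parentId: string): boolean {
  const seen = new Set<string>();
  let current: string | null = parentId;
  // `seen` guards against looping forever if the map is already corrupt.
  while (current !== null && !seen.has(current)) {
    if (current === childId) return true;
    seen.add(current);
    current = parents.get(current) ?? null;
  }
  return false;
}

// Lineage from the Core Concepts diagram: A → B → D, A → C, and root E.
const parentMap: ParentMap = new Map<string, string | null>([
  ['A', null],
  ['B', 'A'],
  ['C', 'A'],
  ['D', 'B'],
  ['E', null],
]);

console.log(wouldCreateCycle(parentMap, 'A', 'D')); // prints true (D descends from A)
console.log(wouldCreateCycle(parentMap, 'E', 'D')); // prints false
console.log(wouldCreateCycle(parentMap, 'B', 'B')); // prints true (self-link)
```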