Auto-Linking
Auto-linking detects when you upload a new version of an existing document, so no manual tracking is required. Raptor achieves 85%+ accuracy using metadata and content analysis.
How It Works
When you upload a document, Raptor analyzes:
- Metadata signals (fast)
  - Filename similarity
  - Upload time proximity
  - File size range
  - User upload patterns
- Content sampling (medium speed)
  - First 2KB text comparison
  - Trigram similarity
  - Chunk overlap detection
- Confidence scoring
  - Combines signals into 0-100% confidence
  - Auto-links if above threshold (default: 85%)
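To make the trigram signal concrete, here is a minimal sketch of one common way to score trigram similarity (Jaccard overlap of character trigrams). Raptor's actual scoring is not documented here, so the function names `trigrams` and `trigramSimilarity` and the normalization step are assumptions for illustration only.

```typescript
// Hypothetical helpers for illustration; not part of the Raptor SDK.

// Collect the set of 3-character substrings of a normalized string.
function trigrams(text: string): Set<string> {
  const normalized = text.toLowerCase().replace(/\s+/g, ' ').trim();
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= normalized.length; i++) {
    grams.add(normalized.slice(i, i + 3));
  }
  return grams;
}

// Jaccard similarity of the two trigram sets: shared / union, in the range 0..1.
function trigramSimilarity(a: string, b: string): number {
  const ga = trigrams(a);
  const gb = trigrams(b);
  if (ga.size === 0 && gb.size === 0) {
    return 1; // two inputs with no trigrams are treated as identical
  }
  let shared = 0;
  ga.forEach((gram) => {
    if (gb.has(gram)) shared++;
  });
  return shared / (ga.size + gb.size - shared);
}

// Related filenames score higher than unrelated ones.
console.log(trigramSimilarity('contract_2024.pdf', 'contract_2024_revised.pdf').toFixed(2));
console.log(trigramSimilarity('abc.pdf', 'xyz.pdf').toFixed(2));
```

The same function could be applied to the first 2KB of extracted text instead of filenames; only the input changes.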
Basic Usage
Auto-linking is enabled by default:
import Raptor from '@raptor-data/ts-sdk';
const raptor = new Raptor({ apiKey: process.env.RAPTOR_API_KEY });
// Upload original contract
const v1 = await raptor.process('contract_2024.pdf');
// documentId: "doc-abc"
// Upload updated contract (different filename!)
const v2 = await raptor.process('contract_2024_revised.pdf');
// Auto-linking detected the relationship
if (v2.autoLinked) {
  console.log('Automatically linked to parent!');
  console.log(`Parent: ${v2.parentDocumentId}`); // "doc-abc"
  // autoLinkConfidence is optional, so fall back to 0 if it is missing
  console.log(`Confidence: ${((v2.autoLinkConfidence ?? 0) * 100).toFixed(0)}%`); // "92%"
  console.log(`Method: ${v2.autoLinkMethod}`); // "metadata_and_content"
  // See why it was linked
  v2.autoLinkExplanation?.forEach(reason => console.log(` - ${reason}`));
  // Output:
  // - High filename similarity: 0.95
  // - Upload time proximity: 2 hours apart
  // - Same file size range: within 10%
  // - Content similarity: 94% chunk overlap
}
Auto-Link Settings
Get Current Settings
const settings = await raptor.getAutoLinkSettings();
console.log(`Enabled: ${settings.autoLinkEnabled}`);
console.log(`Threshold: ${settings.autoLinkThreshold}`); // 0.85 (85%)
Update Settings
Change auto-linking behavior for your account:
await raptor.updateAutoLinkSettings({
  autoLinkEnabled: true,
  autoLinkThreshold: 0.90 // Require 90% confidence
});
console.log('Auto-link settings updated');
Per-Upload Override
Override settings for a specific upload:
// Require very high confidence for this upload
const result = await raptor.process('sensitive-doc.pdf', {
  autoLink: true,
  autoLinkThreshold: 0.95 // Require 95% confidence
});
// Disable auto-link for this upload
const standalone = await raptor.process('new-doc.pdf', {
  autoLink: false // Don't auto-link this one
});
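If several call sites need the same overrides, it may help to centralize them. The sketch below is one possible approach: the `autoLink` and `autoLinkThreshold` option names come from the examples above, while `UploadKind` and `autoLinkOptionsFor` are hypothetical names introduced for illustration.

```typescript
// Option names mirror the per-upload overrides shown above.
interface AutoLinkOptions {
  autoLink: boolean;
  autoLinkThreshold?: number;
}

// Hypothetical upload categories; not part of the Raptor SDK.
type UploadKind = 'template' | 'sensitive' | 'standard';

// Map an upload category to the per-upload auto-link options.
function autoLinkOptionsFor(kind: UploadKind): AutoLinkOptions {
  switch (kind) {
    case 'template':
      // Templates are treated as independent documents
      return { autoLink: false };
    case 'sensitive':
      // Require 95% confidence before linking
      return { autoLink: true, autoLinkThreshold: 0.95 };
    default:
      // No threshold override: the account-level setting applies
      return { autoLink: true };
  }
}

console.log(autoLinkOptionsFor('sensitive'));
```

A call such as `raptor.process('sensitive-doc.pdf', autoLinkOptionsFor('sensitive'))` would then pass the same options as the explicit example above.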
Auto-Link Response
interface ProcessResult {
  // Auto-linking metadata
  autoLinked: boolean;
  autoLinkConfidence?: number;
  autoLinkExplanation?: string[];
  autoLinkMethod?: 'metadata' | 'metadata_and_content' | 'content_only' | 'none';
  parentDocumentId?: string;
  // ... other fields
}
Response Fields
autoLinked: Whether auto-linking detected and linked to a parent
autoLinkConfidence: Final confidence score (0.0-1.0)
autoLinkExplanation: Human-readable reasons for the linking decision
autoLinkMethod: Detection method used:
- metadata: Linked based on metadata only
- metadata_and_content: Combined metadata and content analysis
- content_only: Linked based on content similarity (low metadata confidence)
- none: No parent detected
parentDocumentId: ID of the detected parent document
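Because most of these fields are optional, code that reads them needs to handle missing values. The following is a minimal sketch of a summary helper; `AutoLinkInfo` repeats only the auto-link fields listed above, and `describeAutoLink` is a hypothetical name, not an SDK function.

```typescript
// Auto-link subset of the ProcessResult fields documented above.
interface AutoLinkInfo {
  autoLinked: boolean;
  autoLinkConfidence?: number;
  autoLinkExplanation?: string[];
  autoLinkMethod?: 'metadata' | 'metadata_and_content' | 'content_only' | 'none';
  parentDocumentId?: string;
}

// Hypothetical helper: build a one-line summary that tolerates missing optional fields.
function describeAutoLink(info: AutoLinkInfo): string {
  if (!info.autoLinked) {
    return 'No parent detected';
  }
  const confidence =
    info.autoLinkConfidence !== undefined
      ? `${(info.autoLinkConfidence * 100).toFixed(0)}%`
      : 'unknown';
  const parent = info.parentDocumentId ?? 'unknown';
  const method = info.autoLinkMethod ?? 'none';
  return `Linked to ${parent} via ${method} (confidence: ${confidence})`;
}

console.log(
  describeAutoLink({
    autoLinked: true,
    autoLinkConfidence: 0.92,
    autoLinkMethod: 'metadata_and_content',
    parentDocumentId: 'doc-abc',
  })
);
// prints "Linked to doc-abc via metadata_and_content (confidence: 92%)"
```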
Detection Methods
Metadata-Only Linking
Very high confidence from metadata alone (95%+):
// Same filename, uploaded 10 minutes apart, same size
const v1 = await raptor.process('contract.pdf');
const v2 = await raptor.process('contract.pdf');
console.log(v2.autoLinkMethod); // "metadata"
console.log(v2.autoLinkConfidence); // 0.98
Metadata + Content Linking
Medium metadata confidence, boosted by content analysis:
// Different filename, but similar content
const v1 = await raptor.process('contract_draft.pdf');
const v2 = await raptor.process('contract_final.pdf');
console.log(v2.autoLinkMethod); // "metadata_and_content"
console.log(v2.autoLinkConfidence); // 0.87
v2.autoLinkExplanation?.forEach(e => console.log(e));
// Output:
// - Metadata confidence: 0.78
// - Content overlap: 0.95
// - Content boost: +0.09
// - Final confidence: 0.87
Content-Only Linking
Low metadata confidence, but very high content overlap:
// Completely different filename, but identical content
const v1 = await raptor.process('old_name.pdf');
const v2 = await raptor.process('completely_different_name.pdf');
console.log(v2.autoLinkMethod); // "content_only"
console.log(v2.autoLinkConfidence); // 0.94
Examples
Track Contract Revisions
const raptor = new Raptor({ apiKey: process.env.RAPTOR_API_KEY });
// Initial draft
const draft = await raptor.process('contract_draft_v1.pdf', {
  versionLabel: 'Initial Draft'
});
// Client revisions (auto-linked)
const revised = await raptor.process('contract_revised_by_client.pdf', {
  versionLabel: 'Client Revisions'
});
if (revised.autoLinked) {
  // autoLinkConfidence is optional; round so the percentage prints cleanly
  const percent = ((revised.autoLinkConfidence ?? 0) * 100).toFixed(0);
  console.log(`Auto-linked with ${percent}% confidence`);
  // Get version history
  const lineage = await raptor.getDocumentLineage(revised.documentId);
  console.log(`Version ${lineage.total_versions} of ${lineage.total_versions}`);
}
Handle High-Confidence Linking
const result = await raptor.process('document.pdf');
// autoLinkConfidence is optional, so treat a missing value as 0
if (result.autoLinked && (result.autoLinkConfidence ?? 0) >= 0.95) {
  console.log('Very confident auto-link!');
  console.log('Explanation:', result.autoLinkExplanation);
  // Safe to assume this is a new version
  await sendNotification({
    message: `New version uploaded: v${result.versionNumber}`,
    confidence: result.autoLinkConfidence
  });
}
Verify Auto-Link Decision
const result = await raptor.process('updated-doc.pdf');
if (result.autoLinked) {
  console.log(`Linked to: ${result.parentDocumentId}`);
  console.log(`Confidence: ${result.autoLinkConfidence}`);
  console.log(`Method: ${result.autoLinkMethod}`);
  console.log('\nReasons:');
  result.autoLinkExplanation?.forEach((reason, i) => {
    console.log(`${i + 1}. ${reason}`);
  });
  // Verify the link is correct
  const parent = await raptor.getDocument(result.parentDocumentId);
  console.log(`\nParent document: ${parent.filename}`);
  // If incorrect, you can unlink
  // (needsManualReview is a placeholder for your own check)
  if (needsManualReview) {
    await raptor.unlinkFromLineage(result.documentId);
    console.log('Unlinked from incorrect parent');
  }
}
Disable for Specific Use Cases
// Don't auto-link for template documents
const template = await raptor.process('template.pdf', {
  autoLink: false,
  versionLabel: 'Template'
});
// Don't auto-link for bulk imports
async function bulkImport(files: File[]) {
  for (const file of files) {
    await raptor.process(file, {
      autoLink: false // Treat each as independent
    });
  }
}
Confidence Thresholds
| Threshold | Use Case | Behavior |
|---|---|---|
| 0.95+ | Very high confidence | Metadata-only linking |
| 0.85-0.95 | High confidence | Metadata + content linking (default) |
| 0.70-0.85 | Medium confidence | Content-only fallback |
| Below 0.70 | Low confidence | No link created |
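The bands in this table can be expressed as a small lookup. This is a sketch only: `ConfidenceBand` and `confidenceBand` are hypothetical names, and because the table's ranges share their boundary values, the sketch assumes a boundary score (for example exactly 0.95) belongs to the higher band.

```typescript
// Hypothetical labels for the four rows of the table above.
type ConfidenceBand = 'very_high' | 'high' | 'medium' | 'low';

// Map a confidence score (0.0-1.0) to its band.
// Assumption: boundary values fall into the higher band.
function confidenceBand(confidence: number): ConfidenceBand {
  if (confidence >= 0.95) return 'very_high'; // metadata-only linking
  if (confidence >= 0.85) return 'high';      // metadata + content linking
  if (confidence >= 0.70) return 'medium';    // content-only fallback
  return 'low';                               // no link created
}

console.log(confidenceBand(0.98)); // prints "very_high"
console.log(confidenceBand(0.87)); // prints "high"
console.log(confidenceBand(0.5));  // prints "low"
```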
Choosing a Threshold
// Conservative (fewer false positives)
await raptor.updateAutoLinkSettings({
  autoLinkThreshold: 0.95 // Only link if very confident
});
// Balanced (recommended)
await raptor.updateAutoLinkSettings({
  autoLinkThreshold: 0.85 // Default
});
// Aggressive (more links, some false positives)
await raptor.updateAutoLinkSettings({
  autoLinkThreshold: 0.75 // Link more liberally
});
Troubleshooting
Document Not Auto-Linked
If a document should have been linked but wasn’t:
const result = await raptor.process('document.pdf');
if (!result.autoLinked) {
  console.log('Not auto-linked');
  // Use ?? so that a confidence of 0 is still printed as a number
  console.log(`Confidence: ${result.autoLinkConfidence ?? 'N/A'}`);
  // Check threshold
  const settings = await raptor.getAutoLinkSettings();
  console.log(`Threshold: ${settings.autoLinkThreshold}`);
  // Manual link if needed (a missing confidence is treated as 0)
  if ((result.autoLinkConfidence ?? 0) >= 0.70) {
    // Link manually
    await raptor.linkToParent(result.documentId, 'parent-id', 'v2.0');
  }
}
Wrong Parent Linked
If auto-linking detected the wrong parent:
const result = await raptor.process('document.pdf');
if (result.autoLinked) {
  // Verify parent
  const parent = await raptor.getDocument(result.parentDocumentId);
  console.log(`Linked to: ${parent.filename}`);
  // isWrongParent is a placeholder for your own validation logic
  if (isWrongParent(parent)) {
    // Unlink from wrong parent
    await raptor.unlinkFromLineage(result.documentId);
    // Link to correct parent
    await raptor.linkToParent(result.documentId, 'correct-parent-id', 'v2.0');
  }
}
Too Many False Positives
Increase the threshold:
await raptor.updateAutoLinkSettings({
  autoLinkThreshold: 0.95 // Require higher confidence
});
Missing Expected Links
Lower the threshold or check file naming:
// Lower threshold
await raptor.updateAutoLinkSettings({
  autoLinkThreshold: 0.75
});
// Or use consistent naming
// Good: contract_v1.pdf, contract_v2.pdf
// Bad: abc.pdf, xyz.pdf
Best Practices
- Consistent naming helps: Use patterns like contract_v1.pdf, contract_v2.pdf for higher-confidence detection.
- Upload timing matters: Files uploaded close together (within 24 hours) get a confidence boost.
- Review high-stakes links: For critical documents, review auto-link decisions before proceeding.
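One way to apply the review practice is a small policy function that decides whether a link can be accepted automatically. This is a sketch under assumptions: `LinkAction`, `linkAction`, the `critical` flag, and the 0.95 cut-off are illustrative choices, not SDK behavior.

```typescript
// Hypothetical outcome of the policy; not part of the Raptor SDK.
type LinkAction = 'accept' | 'review' | 'none';

// Decide what to do with an auto-link result.
// Assumptions: critical documents always get a human review, and other
// documents are accepted automatically only at 0.95 confidence or higher.
function linkAction(
  autoLinked: boolean,
  confidence: number | undefined,
  critical: boolean
): LinkAction {
  if (!autoLinked) return 'none';
  if (critical) return 'review';
  return (confidence ?? 0) >= 0.95 ? 'accept' : 'review';
}

console.log(linkAction(true, 0.98, false)); // prints "accept"
console.log(linkAction(true, 0.98, true));  // prints "review"
console.log(linkAction(true, 0.87, false)); // prints "review"
```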