Auto-Linking
Auto-linking is a Raptor feature that automatically detects when an uploaded document is a new version of an existing document, so you do not have to link parent and child documents manually.
How It Works
When you upload a document with autoLink: true, Raptor uses a two-stage matching algorithm:
Stage 1: Metadata Matching (Fast)
Raptor extracts and compares filename patterns:
- Version numbers: v1.0, v2.0, 2023, 2024
- Status indicators: draft, final, revised
- Date patterns: 2024-01-15, jan-2024
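The filename patterns listed above can be illustrated with a small sketch. This is not Raptor's implementation: the regular expressions, the `filename_features` and `looks_like_same_document` helpers, and the rule that two files are related when their remaining base names match are all assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath

MONTHS = "jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec"


def _token(pattern: str) -> re.Pattern:
    """Compile a pattern that matches only when not glued to letters or digits."""
    return re.compile(rf"(?<![a-z0-9])(?:{pattern})(?![a-z0-9])", re.IGNORECASE)


# Assumed patterns, applied in order: full dates are removed before bare
# years so that "2024-01-15" is not split by the year rule.
TOKEN_PATTERNS = [
    ("date", _token(rf"\d{{4}}-\d{{2}}-\d{{2}}|(?:{MONTHS})-\d{{4}}")),
    ("version", _token(r"v\d+(?:\.\d+)*|(?:19|20)\d{2}")),
    ("status", _token(r"draft|final|revised")),
]


def filename_features(filename: str) -> dict:
    """Split a filename into version, status, and date tokens plus a base name."""
    stem = PurePosixPath(filename).stem  # drop the extension
    features = {}
    for name, pattern in TOKEN_PATTERNS:
        features[name] = [m.lower() for m in pattern.findall(stem)]
        stem = pattern.sub(" ", stem)  # remove matched tokens from the stem
    # Whatever is left, with separators normalized, is the base name.
    features["base"] = " ".join(re.split(r"[\s_\-.]+", stem.lower())).strip()
    return features


def looks_like_same_document(a: str, b: str) -> bool:
    """Hypothetical rule: same base name once version tokens are stripped."""
    fa, fb = filename_features(a), filename_features(b)
    return bool(fa["base"]) and fa["base"] == fb["base"]
```

Under these assumptions, `contract_v1.0_draft.pdf` and `Contract-v2.0-final.pdf` both reduce to the base name `contract`, so they would be treated as versions of the same document.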
Stage 2: Content Matching (Fallback)
If metadata matching fails or confidence is low, Raptor compares the first 2 KB of extracted text content using fuzzy matching.
Example:
Configuration
Global Settings (Account Level)
Set default auto-linking preferences in your dashboard:
Per-Request Override
Override account settings for specific uploads:
Response Format
When auto-linking succeeds, the response includes:
auto_link_method
- "metadata": Matched based on filename patterns (faster, more reliable)
- "content": Matched based on content similarity (slower, used as fallback)
auto_link_confidence
A score from 0.0 to 1.0 indicating match confidence:
- 0.95+: Very high confidence (near-identical filenames or content)
- 0.85-0.95: High confidence (clear version relationship)
- 0.70-0.85: Medium confidence (similar but less obvious)
- Below 0.70: Low confidence (auto-linking skipped)
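If you want to log or branch on these bands in client code, a helper along the following lines may be useful. It is a sketch: `confidence_band` is a hypothetical name, and because the listed ranges share their endpoints, it assumes that a boundary value (0.85 or 0.95) belongs to the higher band.

```python
def confidence_band(score: float) -> str:
    """Map an auto_link_confidence value to the band names used in the docs.

    Assumption: boundary values (0.70, 0.85, 0.95) fall into the higher band.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be between 0.0 and 1.0, got {score}")
    if score >= 0.95:
        return "very high"
    if score >= 0.85:
        return "high"
    if score >= 0.70:
        return "medium"
    return "low"  # per the docs, auto-linking is skipped in this band
```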
Examples
Example 1: Version-Numbered Documents
Example 2: Manual Parent Override
If auto-linking picks the wrong parent, override it:
Example 3: Finding Suggestions
Get suggestions before uploading:
Benefits
1. Automatic Version History
No manual tracking needed:
2. Deduplication
Auto-linked documents benefit from deduplication:
3. Consistency
Ensures related documents are properly linked:
Troubleshooting
Auto-Linking Not Working
Issue: auto_linked: false even though the documents are related
Solutions:
- Lower the threshold:
- Check filename patterns:
- Use manual linking:
Wrong Parent Detected
Issue: Auto-linking chose the wrong parent document.
Solution: Override with an explicit parent:
Performance Concerns
Issue: Uploads are slow when auto-linking is enabled.
Explanation: Content-based matching can take 1-2 seconds for large corpora.
Solutions:
- Use metadata matching (ensure good filename patterns)
- Reduce candidate pool (manually link if you know the parent)
- Disable for non-version uploads:
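One way to apply the last point is to decide per upload whether auto-linking makes sense before building the request. The sketch below only builds an options dictionary. The `autoLink` key comes from this page, while `ONE_OFF_KINDS`, `upload_options`, and the idea of classifying uploads by "kind" are hypothetical and would need to match your own workflow.

```python
# Hypothetical document kinds that never have successive versions.
ONE_OFF_KINDS = frozenset({"receipt", "scan", "photo"})


def upload_options(kind: str) -> dict:
    """Return per-request options, skipping auto-linking for one-off uploads."""
    return {"autoLink": kind.lower() not in ONE_OFF_KINDS}
```

With this in place, a receipt is uploaded with `autoLink` set to false and never triggers content matching, while contracts and reports keep the default behavior.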
Best Practices
Use consistent filename patterns
Adopt a naming convention for versions:
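A naming convention is easiest to keep consistent when filenames are generated rather than typed by hand. The helper below is a sketch of one possible convention (`<base>_v<major>.<minor>[_<status>].<ext>`). The function name and the exact layout are assumptions, chosen so that the version number and status indicator match the patterns described in Stage 1.

```python
import re

ALLOWED_STATUSES = {"draft", "final", "revised"}


def versioned_filename(base, major, minor=0, status=None, ext="pdf"):
    """Build a filename such as 'service-agreement_v2.1_final.pdf'."""
    # Lower-case the base name and collapse anything non-alphanumeric to '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", base.lower()).strip("-")
    if not slug:
        raise ValueError("base name must contain letters or digits")
    parts = [slug, f"v{major}.{minor}"]
    if status is not None:
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        parts.append(status)
    return "_".join(parts) + "." + ext
```

For example, `versioned_filename("Service Agreement", 2, 1, "final")` produces `service-agreement_v2.1_final.pdf`.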
Set appropriate thresholds
Adjust based on your use case:
- High precision (few false positives): 0.90+
- Balanced (recommended): 0.85
- High recall (catch more matches): 0.75
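To see how the choice of threshold changes the outcome, the sketch below applies the three values above to a match confidence. The preset names and the `accepts_match` helper are hypothetical, and 0.90 is used as the minimum for the "0.90+" setting.

```python
# Threshold values from the list above; the preset names are illustrative.
THRESHOLD_PRESETS = {
    "high_precision": 0.90,
    "balanced": 0.85,
    "high_recall": 0.75,
}


def accepts_match(confidence: float, preset: str = "balanced") -> bool:
    """Return True if a match with this confidence clears the preset threshold."""
    return confidence >= THRESHOLD_PRESETS[preset]
```

A match scored at 0.80 is accepted only under `high_recall`; under `balanced` or `high_precision` the document would be uploaded without a link.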
Add version labels
Always include version labels for clarity:
Monitor auto-link results
Check the response to verify correct linking:
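A simple check might look like the following sketch. It assumes the upload response has already been parsed into a dictionary and uses the `auto_linked`, `auto_link_method`, and `auto_link_confidence` fields described on this page. The `review_auto_link` helper, its return values, and the policy of flagging content-based matches for review are illustrative choices, not Raptor behavior.

```python
def review_auto_link(response: dict, min_confidence: float = 0.85) -> str:
    """Classify an upload response as 'not linked', 'needs review', or 'ok'."""
    if not response.get("auto_linked"):
        return "not linked"
    method = response.get("auto_link_method")
    confidence = response.get("auto_link_confidence", 0.0)
    # Assumed policy: content matches and lower-confidence links get a manual look.
    if method == "content" or confidence < min_confidence:
        return "needs review"
    return "ok"
```

Anything returned as "needs review" can then be queued for a manual check of the parent document.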