Documents API
The Documents API handles document upload, processing, version control, and retrieval.
Upload Document
Upload a document for processing with version control and auto-linking.
Request
Document file to process (PDF, DOCX, TXT, etc.)
Whether to store document content after processing. If false, content is auto-deleted while metadata is preserved.
Target chunk size in tokens (128-2048)
Overlap between chunks in tokens (0-512)
Chunking strategy: semantic or recursive
Extract section numbers from legal documents
Create chunks for images and figures
Calculate quality scores for chunks
Minimum quality threshold (0.0-1.0)
Generate AI context for orphan tables
Extract tables from documents
Generate context for tables using AI
Parent document ID for manual version linking
Human-readable version label (e.g., “v2.0”, “draft”)
Enable or disable auto-linking. null uses the account setting.
Confidence threshold for auto-linking (0.0-1.0)
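The ranges listed above can be checked on the client before a file is sent, which avoids a round trip that would end in an invalid-parameters error. The following is a minimal sketch, not the API's own validation: `chunk_size` is the only parameter name this page confirms, while `chunk_overlap`, `strategy`, `min_quality`, and `auto_link_threshold` are hypothetical names chosen for illustration.

```python
from typing import List, Optional

# Strategy values taken from the parameter description above.
ALLOWED_STRATEGIES = {"semantic", "recursive"}


def validate_upload_options(
    chunk_size: int,
    chunk_overlap: int,
    strategy: str,
    min_quality: Optional[float] = None,          # hypothetical parameter name
    auto_link_threshold: Optional[float] = None,  # hypothetical parameter name
) -> List[str]:
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    if not 128 <= chunk_size <= 2048:
        problems.append("chunk_size must be between 128 and 2048 tokens")
    if not 0 <= chunk_overlap <= 512:
        problems.append("chunk_overlap must be between 0 and 512 tokens")
    if strategy not in ALLOWED_STRATEGIES:
        problems.append("strategy must be 'semantic' or 'recursive'")
    # Both thresholds are optional and share the same 0.0-1.0 range.
    thresholds = (
        ("min_quality", min_quality),
        ("auto_link_threshold", auto_link_threshold),
    )
    for name, value in thresholds:
        if value is not None and not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be between 0.0 and 1.0")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every out-of-range option at once.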
Response
Processing variant ID
Document ID (unique across versions)
Version ID
Version number in lineage (starts at 1)
Whether this is a new document
Whether this is a new version of an existing document
Whether this is a new processing variant
Whether document is a duplicate (exact content hash match)
ID of canonical document if duplicate
Whether processing was skipped (duplicate or existing variant)
Processing status: pending, processing, completed, or failed
Celery task ID for async processing
Number of chunks (if completed)
Estimated page count for billing
Pages saved by duplicate detection
Whether chunk deduplication is available
Whether auto-linking detected a parent document and linked to it
Auto-link confidence score (0.0-1.0)
Human-readable reasons for auto-link decision
Detection method: metadata, metadata_and_content, or none
Get Document
Retrieve document information and processing status.
Path Parameters
Document ID
Query Parameters
Specific version number to retrieve (default: latest)
Specific variant ID to retrieve
Response
Variant ID (for compatibility)
Document filename
MIME type (e.g., application/pdf)
File size in bytes
Processing status: pending, processing, completed, or failed
Number of processed chunks
Total token count across all chunks
ISO 8601 timestamp
Whether original content is stored
Deletion timestamp (null if not deleted)
Version of extraction engine used
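Because processing is asynchronous, a client typically polls this endpoint until the status reaches completed or failed. The sketch below shows one way to do that under stated assumptions: the status is assumed to arrive in a field named `status`, and `fetch` is a hypothetical callable that performs the actual Get Document request and returns the parsed response as a dictionary.

```python
import time
from typing import Callable, Dict

# The two statuses after which no further change is expected.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_document(
    fetch: Callable[[], Dict],
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch() until the document reaches a terminal status.

    Raises TimeoutError if max_attempts calls all return a
    non-terminal status (pending or processing).
    """
    for attempt in range(max_attempts):
        doc = fetch()
        if doc.get("status") in TERMINAL_STATUSES:
            return doc
        # Skip the pause after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError("document did not reach a terminal status")
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A caller should still inspect the returned status, since failed is a terminal state too.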
List Documents
List all documents for the authenticated user.
Query Parameters
Maximum number of results (1-100)
Pagination offset
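With a maximum of 100 results per request, listing a larger collection means stepping the offset forward page by page. The following sketch assumes the endpoint returns a plain array of document objects and that a page shorter than the requested limit marks the end. `fetch_page` is a hypothetical callable that issues the request for a given limit and offset.

```python
from typing import Callable, Dict, Iterator, List


def iter_documents(
    fetch_page: Callable[[int, int], List[Dict]],
    limit: int = 100,
) -> Iterator[Dict]:
    """Yield every document by advancing the offset one page at a time."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < limit:
            return
        offset += len(page)
```

When the total is an exact multiple of the limit, the loop makes one extra request that returns an empty page before stopping.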
Response
Returns an array of document objects.
Get Document Chunks
Retrieve processed chunks for a document.
Path Parameters
Document ID
Query Parameters
Include full deduplication metadata (can be large)
Response
Returns an array of chunk objects.
Delete Document
Delete a document, cascading the deletion to all of its versions and variants.
Path Parameters
Document ID to delete
Response
Deleted document ID
Resource type (document)
Deletion status (deleted)
Cascade deletion information
Deletion timestamp
Compare Documents
Compare two documents and return a detailed diff.
Query Parameters
First document ID
Second document ID
Response
Human-readable summary of changes
Overall similarity score (0.0-1.0)
Detailed diff information
Error Responses
Invalid parameters (e.g., chunk_size out of range)
Invalid or missing API key
Document not found
File size exceeds maximum (100MB)
Rate limit exceeded
Server error
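A client usually separates errors worth retrying from errors that need a corrected request. The sketch below is one way to encode that split. The status codes are not stated on this page; the mapping assumes the conventional HTTP code for each description above (400, 401, 404, 413, 429, 500), and the backoff values are arbitrary illustrative defaults.

```python
# Assumed mapping: the codes are the conventional HTTP ones for each
# description, not values confirmed by this page.
ASSUMED_ERRORS = {
    400: "Invalid parameters",
    401: "Invalid or missing API key",
    404: "Document not found",
    413: "File size exceeds maximum (100MB)",
    429: "Rate limit exceeded",
    500: "Server error",
}

# Only rate limiting and server errors are likely to succeed on retry.
RETRYABLE = {429, 500}


def describe_error(status_code: int) -> str:
    """Return the documented description for a status code, if any."""
    return ASSUMED_ERRORS.get(status_code, "Unexpected error")


def is_retryable(status_code: int) -> bool:
    """True when repeating the same request later may succeed."""
    return status_code in RETRYABLE


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff in seconds: base * 2**attempt, limited to cap."""
    return min(cap, base * 2 ** attempt)
```

The other errors are treated as final because repeating an identical request would produce the same result until the parameters, API key, document ID, or file change.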