Document Processing Issues
This guide helps you troubleshoot problems with uploading and processing documents in IntelliRepo.
Understanding Document Processing
When you upload a document, IntelliRepo:
- Uploads the file to secure storage
- Extracts text from the document
- Chunks text into searchable segments
- Embeds chunks for semantic search
- Stores everything in the database
Any of these steps can fail. This guide covers common issues at each stage.
Document Status
| Status | Meaning | Action |
|---|---|---|
| Processing | Currently being processed | Wait for completion |
| Completed | Ready for search and chat | None needed |
| Failed | Error during processing | See troubleshooting below |
Failed Uploads
Symptoms
- Status shows "Failed"
- Error message displayed
- Document not searchable
Common Causes
1. Unsupported File Format
Supported formats:
- PDF (.pdf)
- Microsoft Word (.docx)
- Plain text (.txt)
- Markdown (.md)
- HTML (.html, .htm)
Not supported:
- Legacy Word (.doc)
- Excel, PowerPoint
- Images (PNG, JPG)
- Scanned PDFs (image-only)
Solution: Convert to a supported format before uploading.
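As a quick pre-upload check, the supported-formats list above can be turned into a small script. This is a minimal sketch, not IntelliRepo's own validation: the function name `is_supported_format` is hypothetical, and the `.md` extension for Markdown is an assumption.

```python
from pathlib import Path

# Extensions taken from the supported-formats list in this guide.
# ".md" for Markdown is an assumption.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".html", ".htm"}


def is_supported_format(filename: str) -> bool:
    """Return True if the file's extension is in the supported list."""
    # Compare case-insensitively so "Report.PDF" is treated like "report.pdf".
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["Report.PDF", "legacy.doc", "budget.xlsx", "notes.txt"]:
        print(name, "->", "ok" if is_supported_format(name) else "convert first")
```

An extension check only looks at the file name. It cannot tell a text-based PDF from a scanned one, so the selectable-text check described under "Scanned/Image-Only PDF" still applies.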
2. Corrupted File
Symptoms:
- File won't open on your computer
- Error when processing
Solution:
- Try opening the file locally
- If it doesn't open, the file is corrupted
- Re-download or re-create the file
- Upload the working version
3. Scanned/Image-Only PDF
Symptoms:
- PDF uploads but chat finds no content
- Status shows "Completed" but 0 chunks
Cause: The PDF contains images of text, not actual text.
How to check:
- Open the PDF
- Try to select/highlight text
- If you can't select text, it's image-only
Solutions:
- Use a PDF with selectable text
- Run OCR on the scanned PDF first
- Use a text-based source document
4. File Too Large
Plan limits:
| Plan | Max File Size |
|---|---|
| Solo | 25 MB |
| Pro | 50 MB |
| Team | 100 MB |
| Enterprise | 200 MB |
Solutions:
- Compress the file
- Split into multiple smaller files
- Remove unnecessary images/pages
- Upgrade your plan
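The plan limits above can be checked locally before uploading. The sketch below is illustrative only: the function names are hypothetical, and it assumes 1 MB means 1024 × 1024 bytes and that a file exactly at the limit is accepted. IntelliRepo may count sizes differently.

```python
import os

# Limits copied from the plan table above, in megabytes.
PLAN_LIMITS_MB = {"Solo": 25, "Pro": 50, "Team": 100, "Enterprise": 200}

# Assumption: 1 MB = 1024 * 1024 bytes.
BYTES_PER_MB = 1024 * 1024


def fits_plan(size_bytes: int, plan: str) -> bool:
    """Return True if a file of this size is within the plan's limit."""
    # Assumption: a file exactly at the limit is allowed.
    return size_bytes <= PLAN_LIMITS_MB[plan] * BYTES_PER_MB


def smallest_plan_for(size_bytes: int):
    """Return the lowest-limit plan that accepts the file, or None if none does."""
    for plan, limit_mb in sorted(PLAN_LIMITS_MB.items(), key=lambda item: item[1]):
        if size_bytes <= limit_mb * BYTES_PER_MB:
            return plan
    return None


def file_fits_plan(path: str, plan: str) -> bool:
    """Convenience wrapper that reads the size of a file on disk."""
    return fits_plan(os.path.getsize(path), plan)
```

For example, `smallest_plan_for(60 * 1024 * 1024)` returns `"Team"` under these assumptions, which suggests either splitting the file or upgrading from Pro.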
5. Encoding Issues
Symptoms:
- Text file fails processing
- Strange characters in results
Cause: File is not UTF-8 encoded.
Solution:
- Open in a text editor
- Save with UTF-8 encoding
- Re-upload
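If you have many text files, the same fix can be scripted. The sketch below is an example under stated assumptions: the function names are hypothetical, and the default source encoding (`cp1252`, common for files created on Windows) is a guess that you should replace with the file's actual encoding.

```python
from pathlib import Path


def is_utf8(path: str) -> bool:
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Decode src using source_encoding and write the text to dst as UTF-8."""
    # Assumption: source_encoding matches how the file was originally saved.
    # A wrong guess can raise an error or produce incorrect characters.
    text = Path(src).read_bytes().decode(source_encoding)
    # Write bytes directly so line endings are preserved as-is.
    Path(dst).write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    import sys

    source, target = sys.argv[1], sys.argv[2]
    if is_utf8(source):
        print("Already UTF-8; no conversion needed.")
    else:
        convert_to_utf8(source, target)
        print(f"Wrote UTF-8 copy to {target}")
```

Open the converted file and confirm that accented and special characters look right before re-uploading.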
Processing Stuck
Symptoms
- Document shows "Processing" for more than 10 minutes
- No progress indication
Solutions
- Refresh the page - Status may not have updated
- Wait a bit longer - Very large files take more time
- Delete and re-upload - If stuck for 15+ minutes
If Problems Persist
- Note the file name and type
- Check the file works locally
- Contact support with details
Poor Text Extraction
Symptoms
- Search results are incomplete
- Answers miss obvious content
- Sources show garbled text
Causes and Solutions
Complex PDF Layouts
PDFs with multiple columns, tables, or sidebars may not extract perfectly.
Solutions:
- Simplify the document layout
- Use a single-column version
- Export as plain text if possible
Password-Protected PDFs
Protected PDFs cannot be processed.
Solution: Remove the password before uploading.
Forms and Interactive Elements
PDF forms, buttons, and interactive elements are ignored.
Solution: Flatten the PDF or save as a static document.
Very Large Tables
Very large or complex tables may lose their row and column structure during extraction.
Solution: Consider extracting table data separately as text.
Zero Chunks After Processing
Symptoms
- Status shows "Completed"
- Chunk count shows 0
- Document not appearing in search results
Causes
- Empty document - File contains no text
- Image-only PDF - No extractable text
- Unsupported content - Only images, diagrams, etc.
Solutions
- Check that the document actually contains text
- Try selecting text in the original file
- Convert images to text using OCR before uploading
Slow Processing
Normal Processing Times
| Document Size | Expected Time |
|---|---|
| 1-10 pages | 5-15 seconds |
| 11-50 pages | 15-30 seconds |
| 51-100 pages | 30-60 seconds |
| More than 100 pages | 1-3 minutes |
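To judge whether a document is taking unusually long, the table can be expressed as a lookup. This is a rough sketch based only on the figures above: the function name is hypothetical, and it assumes a page count on a range boundary (10, 50, 100) belongs to the smaller range.

```python
def expected_processing_seconds(pages: int) -> tuple[int, int]:
    """Return the (minimum, maximum) expected processing time in seconds."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    # Assumption: boundary values (10, 50, 100) fall in the smaller range.
    if pages <= 10:
        return (5, 15)
    if pages <= 50:
        return (15, 30)
    if pages <= 100:
        return (30, 60)
    # More than 100 pages: 1-3 minutes, expressed in seconds.
    return (60, 180)


if __name__ == "__main__":
    low, high = expected_processing_seconds(42)
    print(f"A 42-page document should take roughly {low}-{high} seconds.")
```

These are typical times, not guarantees. Peak usage or a queue of other uploads can push processing beyond the upper figure.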
If Unusually Slow
- Large files take longer - be patient
- Peak usage times may affect speed
- Multiple uploads are processed sequentially
Best Practices for Documents
Before Uploading
- Verify the file opens correctly on your computer
- Confirm the text is selectable (not scanned)
- Check that the file size is within your plan's limit
- Use a supported format (.pdf, .docx, .txt, .md, .html, .htm)
Document Quality Tips
- Use text-based PDFs whenever possible
- Keep formatting simple - single columns work best
- Include clear headings for better search
- Remove sensitive data before uploading
File Naming
- Use descriptive names: "Employee-Handbook-2024.pdf"
- Avoid special characters: use hyphens or underscores
- Include version/date if relevant
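Renaming can be automated when you are preparing many files. The following is a minimal sketch of one possible cleanup rule, not a naming requirement enforced by IntelliRepo: the function name is hypothetical, and it assumes "special characters" means anything other than ASCII letters, digits, hyphens, and underscores.

```python
import re
from pathlib import Path


def sanitize_filename(name: str) -> str:
    """Replace runs of special characters in the file name with a single hyphen."""
    path = Path(name)
    # Assumption: only ASCII letters, digits, hyphens, and underscores are "safe".
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "-", path.stem).strip("-")
    # Fall back to a generic name if nothing usable is left.
    return (cleaned or "document") + path.suffix


if __name__ == "__main__":
    print(sanitize_filename("Employee Handbook (2024) final!.pdf"))
```

Note that this rule also replaces accented and non-Latin letters, so adjust the pattern if your file names rely on them.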
Re-Processing Documents
Currently, there's no "re-process" button. To re-process:
- Delete the document
- Upload it again
This is useful after:
- Fixing the source file
- System updates that improve processing
Still Having Issues?
If you've tried the above and still have problems:
- Note the exact error (screenshot if possible)
- Save the problematic file (we may ask for it)
- Contact support with:
  - File name and type
  - File size
  - Error message
  - What you've tried