Exception Hierarchy

All OCR exceptions inherit from OCRError, allowing you to catch all OCR errors at once or handle specific exceptions individually.
OCRError (base)
├── OCRProviderError          # Engine/provider level errors
├── OCRFileNotFoundError      # File not found or not a file
├── OCRUnsupportedFormatError # Unsupported file format
├── OCRProcessingError        # Error during OCR processing
└── OCRTimeoutError           # Layer 1 timeout exceeded

Base: OCRError

Parent of all OCR exceptions. Carries 3 attributes:
Attribute       Type              Description
message         str               Error message
error_code      str | None        Machine-readable error code (e.g. "LAYER1_TIMEOUT")
original_error  Exception | None  Wrapped original exception (if any)
str() output format: [ERROR_CODE] message (Original: original error)
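from upsonic.ocr import OCR, OCRError
from upsonic.ocr.layer_1.engines import EasyOCREngine

# Same setup as in the OCRTimeoutError example
engine = EasyOCREngine(languages=['en'])
ocr = OCR(layer_1_ocr_engine=engine, layer_1_timeout=30.0)

try:
    text = ocr.get_text("document.pdf")
except OCRError as e:
    print(e.message)        # "Layer 1 OCR timed out after 30.0s on page 2"
    print(e.error_code)     # "LAYER1_TIMEOUT"
    print(e.original_error) # None or original exception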
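
To make the three attributes and the str() format concrete, here is a minimal stand-in class. It is not Upsonic's implementation: the name SketchOCRError is hypothetical, and leaving out the parts that are None (error code or original error) is an assumption made for illustration.

```python
from typing import Optional


class SketchOCRError(Exception):
    """Hypothetical stand-in mirroring the three documented attributes."""

    def __init__(
        self,
        message: str,
        error_code: Optional[str] = None,
        original_error: Optional[Exception] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.error_code = error_code
        self.original_error = original_error

    def __str__(self) -> str:
        # Assumption: parts that are None are simply omitted from the string.
        parts = []
        if self.error_code is not None:
            parts.append(f"[{self.error_code}]")
        parts.append(self.message)
        if self.original_error is not None:
            parts.append(f"(Original: {self.original_error})")
        return " ".join(parts)


err = SketchOCRError(
    "Image could not be loaded",
    error_code="IMAGE_LOAD_FAILED",
    original_error=ValueError("bad header"),
)
print(str(err))  # [IMAGE_LOAD_FAILED] Image could not be loaded (Original: bad header)
```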

OCRProviderError

Raised when engine initialization fails or a required dependency is missing.
error_code                      When
UNSUPPORTED_LANGUAGE            Language not supported by the engine
READER_INIT_FAILED              EasyOCR reader creation failed
ENGINE_INIT_FAILED              RapidOCR engine initialization failed
TESSERACT_NOT_INSTALLED         Tesseract not installed on the system
VLLM_NOT_AVAILABLE              vLLM package not installed (DeepSeek)
UNSUPPORTED_MODEL_ARCHITECTURE  DeepSeek model architecture not supported by vLLM
MODEL_INIT_FAILED               DeepSeek model loading failed
CLIENT_INIT_FAILED              Ollama client connection failed
OLLAMA_NOT_AVAILABLE            ollama package not installed
from upsonic.ocr import OCRProviderError
from upsonic.ocr.layer_1.engines import EasyOCREngine

try:
    engine = EasyOCREngine(languages=['xyz'])
except OCRProviderError as e:
    # e.error_code == "UNSUPPORTED_LANGUAGE"
    print(e.message)
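
Because error_code is machine-readable, it can drive a remediation message. The sketch below maps a few of the provider codes listed above to hints. The helper name provider_hint and the hint wording are illustrative assumptions, not part of Upsonic.

```python
from typing import Optional

# Hints paraphrase the "When" descriptions above; the wording is illustrative.
PROVIDER_HINTS = {
    "UNSUPPORTED_LANGUAGE": "Choose a language the engine supports.",
    "TESSERACT_NOT_INSTALLED": "Install Tesseract on the system.",
    "VLLM_NOT_AVAILABLE": "Install the vLLM package.",
    "OLLAMA_NOT_AVAILABLE": "Install the ollama package.",
    "CLIENT_INIT_FAILED": "Check that the Ollama server is reachable.",
}

DEFAULT_HINT = "Check the engine configuration and logs."


def provider_hint(error_code: Optional[str]) -> str:
    """Return a remediation hint for a provider error code (hypothetical helper)."""
    # error_code may be None, so fall back to the default hint in that case.
    return PROVIDER_HINTS.get(error_code or "", DEFAULT_HINT)


print(provider_hint("TESSERACT_NOT_INSTALLED"))  # Install Tesseract on the system.
print(provider_hint(None))                       # Check the engine configuration and logs.
```

In an except OCRProviderError as e block, this would be called as provider_hint(e.error_code).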

OCRFileNotFoundError

Raised by Layer 0 (document_converter) when the file does not exist or the path is not a file.
error_code      When
FILE_NOT_FOUND  File does not exist
NOT_A_FILE      Path points to a directory
from upsonic.ocr import OCRFileNotFoundError

try:
    text = ocr.get_text("nonexistent_file.pdf")
except OCRFileNotFoundError as e:
    # e.error_code == "FILE_NOT_FOUND"
    print(e.message)
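
The two error codes correspond to two different filesystem conditions. The following sketch reproduces that distinction with pathlib so a path can be checked before it is handed to OCR. precheck_path is a hypothetical helper; whether Layer 0 performs its checks in exactly this way is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Optional


def precheck_path(path: str) -> Optional[str]:
    """Return the error code the path would likely trigger, or None if it is a file."""
    p = Path(path)
    if not p.exists():
        return "FILE_NOT_FOUND"
    if not p.is_file():
        # The path exists but is a directory (or another non-file entry).
        return "NOT_A_FILE"
    return None


with tempfile.TemporaryDirectory() as tmp:
    print(precheck_path(tmp))                             # NOT_A_FILE
    print(precheck_path(str(Path(tmp) / "missing.pdf")))  # FILE_NOT_FOUND
```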

OCRUnsupportedFormatError

Raised by Layer 0 when the file has an unsupported format. Supported formats: .png, .jpg, .jpeg, .bmp, .tiff, .tif, .gif, .webp, .pdf
error_code          When
UNSUPPORTED_FORMAT  File has an unsupported extension
from upsonic.ocr import OCRUnsupportedFormatError

try:
    text = ocr.get_text("document.docx")
except OCRUnsupportedFormatError as e:
    # e.error_code == "UNSUPPORTED_FORMAT"
    print(e.message)
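
Since the check is based on the file extension, unsupported inputs can be filtered out before calling OCR. This is a small sketch using the extension list above. has_supported_extension is a hypothetical helper, and comparing extensions case-insensitively is an assumption about how the library behaves.

```python
from pathlib import Path

# Extensions copied from the supported formats listed above.
SUPPORTED_EXTENSIONS = frozenset(
    {".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".tif", ".gif", ".webp", ".pdf"}
)


def has_supported_extension(path: str) -> bool:
    """Check only the final suffix of the path against the supported list."""
    # Assumption: ".PDF" and ".pdf" are treated the same.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(has_supported_extension("document.pdf"))   # True
print(has_supported_extension("document.docx"))  # False
```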

OCRProcessingError

Raised when an error occurs during OCR processing, either in an engine or in Layer 0. Each engine uses its own error code.
error_code                         Engine           When
EASYOCR_PROCESSING_FAILED          EasyOCR          readtext call failed
RAPIDOCR_PROCESSING_FAILED         RapidOCR         OCR call failed
TESSERACT_PROCESSING_FAILED        Tesseract        image_to_data call failed
DEEPSEEK_PROCESSING_FAILED         DeepSeek         vLLM generate failed
DEEPSEEK_BATCH_PROCESSING_FAILED   DeepSeek         Batch processing failed
DEEPSEEK_OLLAMA_PROCESSING_FAILED  DeepSeek Ollama  Ollama streaming failed
PADDLE_PROCESSING_FAILED           PaddleOCR        predict call failed
PDF_CONVERSION_FAILED              Layer 0          PDF to image conversion failed
IMAGE_LOAD_FAILED                  Layer 0          Image could not be loaded
MISSING_DEPENDENCY                 Layer 0          PyMuPDF (fitz) not installed
from upsonic.ocr import OCRProcessingError

try:
    text = ocr.get_text("corrupted_image.png")
except OCRProcessingError as e:
    print(e.error_code)     # "EASYOCR_PROCESSING_FAILED"
    print(e.original_error) # Original exception
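
The codes above fall into two groups: Layer 0 failures (the input could not be prepared) and engine failures (recognition itself failed). Here is a sketch of how a caller might tell them apart. failure_stage is a hypothetical helper, and relying on the _PROCESSING_FAILED suffix is an assumption based only on the codes listed here.

```python
from typing import Optional

# Layer 0 codes copied from the list above.
LAYER_0_CODES = frozenset(
    {"PDF_CONVERSION_FAILED", "IMAGE_LOAD_FAILED", "MISSING_DEPENDENCY"}
)


def failure_stage(error_code: Optional[str]) -> str:
    """Classify a processing error code as 'layer_0', 'engine', or 'unknown'."""
    if error_code in LAYER_0_CODES:
        return "layer_0"
    # Every engine code listed above ends with this suffix.
    if error_code is not None and error_code.endswith("_PROCESSING_FAILED"):
        return "engine"
    return "unknown"


print(failure_stage("IMAGE_LOAD_FAILED"))          # layer_0
print(failure_stage("EASYOCR_PROCESSING_FAILED"))  # engine
```

A Layer 0 result suggests checking the input file or installing PyMuPDF. An engine result suggests inspecting e.original_error for the underlying engine exception.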

OCRTimeoutError

Raised when layer_1_timeout is exceeded in the OCR orchestrator. The timeout is applied per page, not per document: if page 3 of a 5-page PDF times out, the error is raised for that page only.
error_code      When
LAYER1_TIMEOUT  layer_1_timeout seconds exceeded
from upsonic.ocr import OCR, OCRTimeoutError
from upsonic.ocr.layer_1.engines import EasyOCREngine

engine = EasyOCREngine(languages=['en'])
ocr = OCR(layer_1_ocr_engine=engine, layer_1_timeout=30.0)

try:
    text = ocr.get_text("large_file.pdf")
except OCRTimeoutError as e:
    # e.error_code == "LAYER1_TIMEOUT"
    # e.message == "Layer 1 OCR timed out after 30.0s on page 3"
    print(e.message)
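
One way to react to a timeout is to retry with a longer limit. The sketch below is generic so that it runs without Upsonic installed. get_text_with_retries, FakeOCR, and FakeTimeout are hypothetical names, and the fake engine only simulates a timeout.

```python
from typing import Any, Callable, Optional, Sequence, Type


def get_text_with_retries(
    make_ocr: Callable[[float], Any],
    path: str,
    timeouts: Sequence[float],
    timeout_error: Type[Exception],
) -> str:
    """Try each timeout in order; re-raise the last timeout error if all fail."""
    last_error: Optional[Exception] = None
    for timeout in timeouts:
        ocr = make_ocr(timeout)  # build a fresh OCR object for this timeout
        try:
            return ocr.get_text(path)
        except timeout_error as exc:
            last_error = exc
    if last_error is None:
        raise ValueError("timeouts must not be empty")
    raise last_error


class FakeTimeout(Exception):
    """Stand-in for OCRTimeoutError."""


class FakeOCR:
    """Stand-in OCR object that 'times out' below 60 seconds."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout

    def get_text(self, path: str) -> str:
        if self.timeout < 60.0:
            raise FakeTimeout(f"timed out after {self.timeout}s")
        return "recognized text"


result = get_text_with_retries(FakeOCR, "large_file.pdf", [30.0, 60.0], FakeTimeout)
print(result)  # recognized text
```

With Upsonic, make_ocr would presumably be something like lambda t: OCR(layer_1_ocr_engine=engine, layer_1_timeout=t), and timeout_error would be OCRTimeoutError.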

Import

# All from one place
from upsonic.ocr import (
    OCRError,
    OCRProviderError,
    OCRFileNotFoundError,
    OCRUnsupportedFormatError,
    OCRProcessingError,
    OCRTimeoutError,
)

# Or directly from exceptions module
from upsonic.ocr.exceptions import OCRTimeoutError

Catch Pattern

Handle exceptions from most specific to most general:
try:
    text = ocr.get_text("document.pdf")
except OCRTimeoutError:
    print("Timeout - increase timeout or try a smaller file")
except OCRFileNotFoundError:
    print("File not found")
except OCRUnsupportedFormatError:
    print("This format is not supported")
except OCRProviderError:
    print("Engine issue - missing dependency or unsupported language")
except OCRProcessingError:
    print("OCR processing error")
except OCRError:
    print("Unknown OCR error")
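
The ordering matters because Python uses the first except clause whose type matches, and every OCR exception is also an OCRError. The sketch below demonstrates this with local stand-in classes that mirror part of the documented hierarchy, so it runs without Upsonic installed. first_match is a hypothetical helper that imitates how an except chain is evaluated.

```python
# Local stand-ins mirroring part of the documented hierarchy (not the real classes).
class OCRError(Exception):
    pass


class OCRTimeoutError(OCRError):
    pass


class OCRProcessingError(OCRError):
    pass


def first_match(exc: Exception, handlers: list) -> str:
    """Return the label of the first handler type that matches, like an except chain."""
    for exc_type, label in handlers:
        if isinstance(exc, exc_type):
            return label
    return "unhandled"


specific_first = [(OCRTimeoutError, "timeout"), (OCRError, "generic")]
general_first = [(OCRError, "generic"), (OCRTimeoutError, "timeout")]

print(first_match(OCRTimeoutError(), specific_first))  # timeout
print(first_match(OCRTimeoutError(), general_first))   # generic
```

With the base class listed first, the timeout handler is never reached, which is why OCRError belongs at the end of the chain.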