Text Recognition

The Text Recognition module provides raw optical character recognition (OCR) independent of pipeline-based extraction.

Overview

While the standard extraction pipeline performs intelligent, structured data extraction, the Text Recognition module offers a lower-level interface for obtaining raw text content from documents and images.

Capabilities

  • Full-page text extraction with position coordinates
  • Multi-language text detection and recognition
  • Handwritten text recognition
  • Table structure detection and text extraction
  • Line-by-line and word-by-word segmentation

When to Use

  • When you need raw text without structured field extraction
  • For documents not covered by existing pipeline configurations
  • When building custom post-processing logic on top of OCR output
  • For text search and indexing workflows
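
As an illustration of the search and indexing use case above, the following is a minimal sketch of an inverted index built over recognized text lines. It assumes the OCR lines have already been collected into a plain list of strings; the function names `build_index` and `search` are hypothetical and are not part of the module.

```python
import re
from collections import defaultdict


def build_index(lines: list[str]) -> dict[str, set[int]]:
    """Map each lowercase token to the set of line numbers that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for line_no, line in enumerate(lines):
        for token in re.findall(r"\w+", line.lower()):
            index[token].add(line_no)
    return dict(index)


def search(index: dict[str, set[int]], query: str) -> list[int]:
    """Return sorted line numbers that contain every token in the query."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return []
    # Intersect the posting sets; an unknown token yields an empty set.
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(hits)


# Hypothetical OCR output, reduced to the text of each recognized line.
ocr_lines = ["Invoice number 1042", "Total amount due", "Invoice total 18.50"]
index = build_index(ocr_lines)
matches = search(index, "invoice total")  # lines containing both words
```

Matching whole tokens keeps the sketch short. OCR output often contains recognition errors, so a production index would likely add fuzzy matching or normalization on top of this.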

Output Format

The text recognition response includes:

  • pages — Array of page objects with extracted text
  • lines — Individual text lines with bounding box coordinates
  • words — Word-level segmentation with confidence scores
  • language — Detected language(s) of the document
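
The sketch below shows one way a client might read a response with these four fields. Only the top-level names `pages`, `lines`, `words`, and `language` come from the list above. The nested keys (`text`, `bbox`, `confidence`), the bounding box layout, and the sample values are assumptions made for illustration, so check them against an actual response before relying on them.

```python
from typing import Any

# Hypothetical sample response. Only the four top-level keys are documented;
# every nested key ("text", "bbox", "confidence") is an assumption.
SAMPLE_RESPONSE: dict[str, Any] = {
    "pages": [{"text": "Invoice 1042\nTotal due 18.50"}],
    "lines": [
        {"text": "Invoice 1042", "bbox": [40, 32, 210, 18]},
        {"text": "Total due 18.50", "bbox": [40, 60, 260, 18]},
    ],
    "words": [
        {"text": "Invoice", "confidence": 0.99},
        {"text": "1042", "confidence": 0.97},
        {"text": "Total", "confidence": 0.95},
        {"text": "due", "confidence": 0.62},
        {"text": "18.50", "confidence": 0.91},
    ],
    "language": ["en"],
}


def full_text(response: dict[str, Any]) -> str:
    """Join the text of every page, separating pages with a blank line."""
    return "\n\n".join(page.get("text", "") for page in response.get("pages", []))


def low_confidence_words(response: dict[str, Any], threshold: float = 0.8) -> list[str]:
    """Return the text of words whose confidence score is below the threshold."""
    return [
        word["text"]
        for word in response.get("words", [])
        if word.get("confidence", 0.0) < threshold
    ]


text = full_text(SAMPLE_RESPONSE)
suspect = low_confidence_words(SAMPLE_RESPONSE)
```

Flagging low-confidence words like this is a common first step in custom post-processing, since it identifies the spans most likely to need review or correction.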

Note: For structured data extraction, use the standard pipeline-based extraction endpoint instead.