Text recognition

What is text recognition?

Text recognition is the second stage of the OCR pipeline — it reads the text content from an image. The /recognition endpoint takes a pre-cropped image of a text region and returns the recognized string with a confidence score. Use recognition on its own when you already know where the text is:

Custom pipelines — run your own detection or segmentation, then pass cropped regions to Trace OCR for reading
Pre-cropped images — read text from images that already contain a single text snippet (scanned labels, badges, signs)
Model comparison — test different recognition architectures on the same crops to evaluate accuracy
Batch processing — send many pre-segmented text regions for recognition without re-running detection
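For the batch-processing case, one option is to split a long list of crops into smaller groups on the client and upload one group per request. The sketch below shows only the grouping step. The file names and the group size of 3 are hypothetical, and nothing here is specific to the Trace OCR API.

```python
from itertools import islice


def chunked(paths, size):
    """Yield successive lists of at most `size` paths."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(paths)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical crop file names produced by your own detection step.
crop_paths = [f"crop-{i:03d}.jpg" for i in range(7)]

# Each group would be uploaded in a single request to /recognition.
groups = list(chunked(crop_paths, size=3))
for group in groups:
    print(group)
```

Grouping on the client is separate from the server-side `recognition_batch_size` parameter, which controls how many crops the model processes in parallel within one request.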

Example: reading text from a poster

Here’s an event poster with hand-drawn text. We’ll crop individual text regions and send them to the recognition endpoint.

Event poster for Something Nice with stylized hand-drawn text

We crop two text regions from the poster: one with clean printed numerals (poster-crop-date.jpg) and one with hand-drawn lettering (poster-crop-title.jpg).

Send each crop to the recognition endpoint:

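# Assumes the multipart form field is named "files"; confirm the field name in the API reference.
curl -X POST https://ocr-api.trace.so/recognition/ \
  -F "files=@poster-crop-date.jpg" \
  -F "files=@poster-crop-title.jpg"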

The API returns the recognized text and confidence for each file:

[
  {
    "name": "poster-crop-date.jpg",
    "value": "2016",
    "confidence": 1.0
  },
  {
    "name": "poster-crop-title.jpg",
    "value": "Entrance-frec",
    "confidence": 0.45
  }
]

The clean numerals “2016” are recognized correctly, with a confidence of 1.0. The hand-drawn “Entrance Free” is partially misread as “Entrance-frec”, and the stylized lettering brings the confidence score down to 0.45. This is expected: recognition models work best on printed or clearly written text.
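Because each result carries a confidence score, a client can route uncertain readings to manual review. The following is a minimal sketch that parses the sample response shown above. The 0.8 cutoff and the function name are assumptions chosen for illustration, not values recommended by the API.

```python
import json

# Sample response copied from the example above.
RESPONSE_BODY = """
[
  {"name": "poster-crop-date.jpg", "value": "2016", "confidence": 1.0},
  {"name": "poster-crop-title.jpg", "value": "Entrance-frec", "confidence": 0.45}
]
"""

# Hypothetical cutoff; tune it against your own data.
MIN_CONFIDENCE = 0.8


def split_by_confidence(results, min_confidence=MIN_CONFIDENCE):
    """Return (accepted, needs_review) lists based on the confidence score."""
    accepted = [r for r in results if r["confidence"] >= min_confidence]
    needs_review = [r for r in results if r["confidence"] < min_confidence]
    return accepted, needs_review


results = json.loads(RESPONSE_BODY)
accepted, needs_review = split_by_confidence(results)

for item in accepted:
    print(f"OK      {item['name']}: {item['value']!r} ({item['confidence']:.2f})")
for item in needs_review:
    print(f"REVIEW  {item['name']}: {item['value']!r} ({item['confidence']:.2f})")
```

With this cutoff, the “2016” crop is accepted and the hand-drawn title is flagged for review.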

Parameters

All parameters are optional query parameters passed in the URL.

Parameter	Default	Description
`recognition_model`	`crnn_vgg16_bn`	The recognition architecture to use. See available models for options.
`recognition_batch_size`	`128`	Number of crops processed in parallel. Decrease if you run into memory issues on large batches.

# Use a different recognition model
# (the "files" field name and the crop file name are assumed)
curl -X POST "https://ocr-api.trace.so/recognition/?recognition_model=parseq" \
  -F "files=@poster-crop-title.jpg"
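When calling the endpoint from code, the query string can be assembled from the two parameters in the table. This sketch builds only the URL, using the endpoint and parameter names from this page. The helper name `build_recognition_url` and the batch size of 32 are illustrative assumptions.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl examples above.
RECOGNITION_URL = "https://ocr-api.trace.so/recognition/"


def build_recognition_url(recognition_model=None, recognition_batch_size=None):
    """Build the request URL, omitting parameters that were not set."""
    params = {}
    if recognition_model is not None:
        params["recognition_model"] = recognition_model
    if recognition_batch_size is not None:
        params["recognition_batch_size"] = recognition_batch_size
    if not params:
        return RECOGNITION_URL
    return f"{RECOGNITION_URL}?{urlencode(params)}"


# Example: switch the model and lower the batch size.
url = build_recognition_url(recognition_model="parseq", recognition_batch_size=32)
print(url)
```

Leaving a parameter unset keeps it out of the URL, so the server-side default applies.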

Recognition vs full OCR

The /recognition endpoint expects pre-cropped text images and returns only the recognized string. The /ocr endpoint handles detection and recognition together, returning a full document hierarchy.

	`/recognition`	`/ocr`
Input	Pre-cropped text image	Full document or page
Output	Text string + confidence	Nested hierarchy with bounding boxes
Detection	None — you provide the crops	Runs automatically
Speed	Faster (one model, no detection)	Slower (two models)
Use when	You already have text regions	You need end-to-end extraction

Next steps

Full OCR endpoint

Run detection and recognition together to get the full text content.

OCR pipeline

Understand the two-stage detection + recognition architecture.

Available models

Choose the right recognition model for your use case.

API reference

Full endpoint documentation with parameters and response schema.
