
What is text recognition?

Text recognition is the second stage of the OCR pipeline — it reads the text content from an image. The /recognition endpoint takes a pre-cropped image of a text region and returns the recognized string with a confidence score. Use recognition on its own when you already know where the text is:
  • Custom pipelines — run your own detection or segmentation, then pass cropped regions to Trace OCR for reading
  • Pre-cropped images — read text from images that already contain a single text snippet (scanned labels, badges, signs)
  • Model comparison — test different recognition architectures on the same crops to evaluate accuracy
  • Batch processing — send many pre-segmented text regions for recognition without re-running detection
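For the batch processing case, a large set of crops can be grouped into several requests instead of one very large upload. The sketch below only does the grouping and sends nothing. The helper name `chunked`, the chunk size of 4, and the file names are illustrative assumptions, not limits or conventions defined by the API.

```python
from itertools import islice


def chunked(paths, size):
    """Yield successive lists of at most `size` paths."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(paths)
    # islice pulls up to `size` items; an empty list ends the loop.
    while chunk := list(islice(iterator, size)):
        yield chunk


# Hypothetical file names for pre-segmented crops.
crop_paths = [f"crop-{i:03d}.jpg" for i in range(10)]

# Each chunk would become one request to the recognition endpoint.
for request_number, chunk in enumerate(chunked(crop_paths, size=4), start=1):
    print(request_number, chunk)
```

Grouping on the client keeps each upload small and makes it easy to retry a single failed request without resending every crop.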

Example: reading text from a poster

Here’s an event poster with hand-drawn text. We’ll crop individual text regions and send them to the recognition endpoint.
Event poster for Something Nice with stylized hand-drawn text
We crop two text regions from the poster — one with clean printed numerals, one with hand-drawn lettering:
Cropped text region showing 2016
Cropped text region showing Entrance Free
Send each crop to the recognition endpoint:
curl -X POST https://ocr-api.trace.so/recognition/ \
  -F "[email protected]" \
  -F "[email protected]"
The API returns the recognized text and confidence for each file:
[
  {
    "name": "poster-crop-date.jpg",
    "value": "2016",
    "confidence": 1.0
  },
  {
    "name": "poster-crop-title.jpg",
    "value": "Entrance-frec",
    "confidence": 0.45
  }
]
The clean numerals “2016” are recognized correctly with a confidence of 1.0. The hand-drawn “Entrance Free” is misread as “Entrance-frec”, and the stylized lettering brings the confidence score down to 0.45. This is expected: recognition models work best on printed or clearly written text.
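Since the misread crop also carries the lower score, one way to use the confidence value is to route uncertain results to manual review. This is a minimal sketch, assuming the response shape from the example above. The 0.8 threshold and the helper name `split_by_confidence` are illustrative choices, not values recommended by the API.

```python
import json

# Response body copied from the poster example.
RESPONSE_BODY = """
[
  {"name": "poster-crop-date.jpg", "value": "2016", "confidence": 1.0},
  {"name": "poster-crop-title.jpg", "value": "Entrance-frec", "confidence": 0.45}
]
"""


def split_by_confidence(results, threshold=0.8):
    """Split results into (accepted, needs_review) lists.

    The threshold is an assumed cutoff; tune it on your own data.
    """
    accepted = [r for r in results if r["confidence"] >= threshold]
    needs_review = [r for r in results if r["confidence"] < threshold]
    return accepted, needs_review


results = json.loads(RESPONSE_BODY)
accepted, needs_review = split_by_confidence(results)

for r in accepted:
    print(f"OK      {r['name']}: {r['value']!r} ({r['confidence']:.2f})")
for r in needs_review:
    print(f"REVIEW  {r['name']}: {r['value']!r} ({r['confidence']:.2f})")
```

With the poster response, the date crop is accepted and the title crop is flagged for review.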

Parameters

All parameters are optional query parameters passed in the URL.
Parameter | Default | Description
recognition_model | crnn_vgg16_bn | The recognition architecture to use. See available models for options.
recognition_batch_size | 128 | Number of crops processed in parallel. Decrease if you run into memory issues on large batches.
# Use a different recognition model
curl -X POST "https://ocr-api.trace.so/recognition/?recognition_model=parseq" \
  -F "[email protected]"
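When requests are built in code rather than typed by hand, the query string can be assembled from whichever parameters are set. The sketch below uses only Python's standard library and builds the URL without sending anything. The helper name `build_recognition_url` is hypothetical, and the batch size of 32 is an arbitrary example value.

```python
from urllib.parse import urlencode

# Base URL taken from the curl examples on this page.
RECOGNITION_URL = "https://ocr-api.trace.so/recognition/"


def build_recognition_url(recognition_model=None, recognition_batch_size=None):
    """Return the endpoint URL, adding only the parameters that were set."""
    params = {}
    if recognition_model is not None:
        params["recognition_model"] = recognition_model
    if recognition_batch_size is not None:
        params["recognition_batch_size"] = recognition_batch_size
    if not params:
        # No overrides: the server-side defaults apply.
        return RECOGNITION_URL
    return f"{RECOGNITION_URL}?{urlencode(params)}"


print(build_recognition_url())
print(build_recognition_url(recognition_model="parseq"))
print(build_recognition_url(recognition_model="parseq", recognition_batch_size=32))
```

Leaving a parameter out of the URL, rather than repeating its default, means the request keeps following the documented defaults if they change.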

Recognition vs full OCR

The /recognition endpoint expects pre-cropped text images and returns only the recognized string. The /ocr endpoint handles detection and recognition together, returning a full document hierarchy.
Endpoint | /recognition | /ocr
Input | Pre-cropped text image | Full document or page
Output | Text string + confidence | Nested hierarchy with bounding boxes
Detection | None (you provide the crops) | Runs automatically
Speed | Faster (one model, no detection) | Slower (two models)
Use when | You already have text regions | You need end-to-end extraction

Next steps