POST /ocr
Perform OCR
curl --request POST \
  --url https://ocr-api.trace.so/ocr/ \
  --header 'Content-Type: multipart/form-data' \
  --form 'files=@example-file'
[
  {
    "name": "<string>",
    "orientation": {},
    "language": {},
    "dimensions": [
      123,
      123
    ],
    "pages": [
      {
        "blocks": [
          {
            "geometry": [
              123
            ],
            "detection_score": 123,
            "lines": [
              {
                "geometry": [
                  123
                ],
                "detection_score": 123,
                "words": [
                  {
                    "value": "<string>",
                    "geometry": [
                      123
                    ],
                    "detection_score": 123,
                    "confidence": 123,
                    "text_orientation": {}
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
]

Query Parameters

detection_model
string
default:db_resnet50

Detection architecture to use. See the models page for available options.

recognition_model
string
default:crnn_vgg16_bn

Recognition architecture to use. See the models page for available options.

assume_straight_pages
boolean
default:true

Return axis-aligned bounding boxes. Set to false for rotated documents to get 4-point polygons instead.

preserve_aspect_ratio
boolean
default:true

Pad the image to preserve its aspect ratio before feeding it to the model.

detect_orientation
boolean
default:false

Detect and report the page orientation angle in the response.

detect_language
boolean
default:false

Detect and report the page language in the response.

symmetric_padding
boolean
default:true

Pad symmetrically (centered) rather than bottom-right only.

straighten_pages
boolean
default:false

Automatically rotate pages to correct detected skew before recognition.

detection_batch_size
integer
default:2

Number of pages processed in parallel during detection. Increase for multi-page PDFs if you have enough memory.

recognition_batch_size
integer
default:128

Number of text crops processed in parallel during recognition. Decrease if you run into memory issues on large batches.

disable_page_orientation
boolean
default:false

Skip page orientation classification entirely.

disable_text_orientation
boolean
default:false

Skip text crop orientation classification.

binary_threshold
number
default:0.1

Pixel-level threshold for the detection segmentation heatmap. Lower values detect more text but may introduce false positives.

box_threshold
number
default:0.1

Minimum confidence score to keep a detected bounding box. Lower values return more boxes but may include noise.

group_lines
boolean
default:true

Group detected words into lines based on spatial proximity.

group_blocks
boolean
default:false

Group detected lines into blocks based on spatial proximity.

paragraph_break
number
default:0.0035

Normalized vertical distance threshold for splitting text into separate blocks. Only used when group_blocks is enabled.
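
The query parameters above are appended to the endpoint URL. A minimal sketch of building that URL with the Python standard library follows. The helper name `build_ocr_url` is hypothetical, and sending booleans as lowercase `true`/`false` is an assumption about how the server parses them, not something this page states.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl sample on this page.
BASE_URL = "https://ocr-api.trace.so/ocr/"


def build_ocr_url(**params):
    """Return the endpoint URL with the given query parameters appended."""
    encoded = {}
    for name, value in params.items():
        if isinstance(value, bool):
            # Assumption: the API accepts lowercase "true"/"false" for booleans.
            encoded[name] = "true" if value else "false"
        else:
            encoded[name] = str(value)
    return f"{BASE_URL}?{urlencode(encoded)}" if encoded else BASE_URL


# Request 4-point polygons and block grouping for a rotated document.
url = build_ocr_url(
    assume_straight_pages=False,
    group_blocks=True,
    paragraph_break=0.0035,
)
print(url)
```

Parameters that are left out keep the defaults listed above, so only the values that differ need to be passed.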

Body

multipart/form-data
files
file[]

One or more files to process. Accepts JPEG, PNG, and PDF formats.
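
As an illustration of the request body, the sketch below encodes several files under the same `files` field using only the Python standard library. The helpers `build_multipart` and `make_request` are hypothetical names, and repeating the field name once per file is an assumption about how the `file[]` type is expected on the wire. Nothing is sent over the network here.

```python
import mimetypes
import urllib.request
import uuid


def build_multipart(files, field="files"):
    """Encode (filename, bytes) pairs as a multipart/form-data body.

    Returns (body, content_type). Every part reuses the same field name
    (assumed to be how the file[] field is submitted).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for filename, data in files:
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        chunks.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n".encode()
        )
        chunks.append(data)
        chunks.append(b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode())
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def make_request(url, files):
    """Build (but do not send) a POST request carrying the given files."""
    body, content_type = build_multipart(files)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


# Placeholder bytes stand in for real image and PDF contents.
request = make_request(
    "https://ocr-api.trace.so/ocr/",
    [("page1.png", b"<png bytes>"), ("scan.pdf", b"<pdf bytes>")],
)
print(request.get_method(), request.get_header("Content-type"))
```

Passing the request to `urllib.request.urlopen` would perform the upload. An HTTP client library that handles multipart encoding itself would make `build_multipart` unnecessary.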

Response

Successful Response

name
string
required
Example:

"example.jpg"

orientation
Orientation · object
required
Example:
{ "confidence": 0.99, "value": 0 }
language
Language · object
required
Example:
{ "confidence": 0.99, "value": "en" }
dimensions
Dimensions · tuple
required
pages
OCRPage · object[]
required
Example:
{
  "detection_score": 0.99,
  "geometry": [0, 0, 0, 0],
  "lines": [
    {
      "detection_score": 0.99,
      "geometry": [0, 0, 0, 0],
      "words": [
        {
          "confidence": 0.99,
          "detection_score": 0.99,
          "geometry": [0, 0, 0, 0],
          "text_orientation": { "value": 0 },
          "value": "example"
        }
      ]
    }
  ]
}
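
To show how the nested response can be consumed, the sketch below walks one item of the response array (pages, then blocks, then lines, then words, as in the sample response at the top of this page) and collects the text of each line. The helper `extract_lines` is a hypothetical name. Joining words with a single space and filtering on `confidence` are illustrative choices made on the client side, not documented API behavior.

```python
def extract_lines(document, min_confidence=0.0):
    """Return the text of each line in one response item, in response order.

    Words whose recognition confidence is below min_confidence are dropped.
    """
    lines = []
    for page in document.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                words = [
                    word["value"]
                    for word in line.get("words", [])
                    if word.get("confidence", 0.0) >= min_confidence
                ]
                if words:
                    # Assumption: words in a line are separated by one space.
                    lines.append(" ".join(words))
    return lines


# Hand-written sample that mirrors the response structure; not real API output.
sample = {
    "name": "example.jpg",
    "pages": [
        {
            "blocks": [
                {
                    "lines": [
                        {
                            "words": [
                                {"value": "Hello", "confidence": 0.99},
                                {"value": "wor1d", "confidence": 0.41},
                            ]
                        },
                        {"words": [{"value": "OCR", "confidence": 0.95}]},
                    ]
                }
            ]
        }
    ],
}

print(extract_lines(sample))       # ['Hello wor1d', 'OCR']
print(extract_lines(sample, 0.5))  # ['Hello', 'OCR']
```

Because the endpoint returns an array with one item per uploaded file, a caller would typically apply the function to each element of the parsed JSON.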