Tesseract OCR

Open-source optical character recognition (OCR) engine that extracts printed text from images.


Category: Image
License: Apache 2
Platforms: Windows macOS Linux
Released: 2005
Path: c:\tesseract\tesseract.exe
Benefits: Supports a wide variety of languages.
Notes: Latest downloadable Windows build is 5.5.0 (UB Mannheim, 2024-11-11). Newer source tags (5.5.1, 5.5.2) have no published Windows binary yet.

Version
Latest known: 5.5.0.20241111 (2024-11-11)

Examples

1. Outputs raw text based on the text identified in the image

tesseract.exe d:\images\image-with-text.png - -l eng
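The same call can be scripted. The following is a minimal Python sketch, not an official wrapper: the helper names `build_ocr_command` and `ocr_to_text` are hypothetical, and it assumes a `tesseract` executable is reachable on PATH (pass `exe` with the full path otherwise).

```python
import shutil
import subprocess


def build_ocr_command(image_path, lang="eng", exe="tesseract"):
    """Return the argument list that streams recognized text to stdout."""
    # "-" as the output base tells tesseract to write to stdout
    # instead of creating an output file.
    return [exe, image_path, "-", "-l", lang]


def ocr_to_text(image_path, lang="eng", exe="tesseract"):
    """Run tesseract and return the recognized text as a string."""
    # Assumption: the executable is on PATH or exe is a full path.
    if shutil.which(exe) is None:
        raise FileNotFoundError(f"{exe!r} was not found")
    result = subprocess.run(
        build_ocr_command(image_path, lang, exe),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.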

2. Converts an image file to a searchable PDF (output_file.pdf).

tesseract.exe input_file.tiff output_file pdf

Try it

OCR an image to plain text
Run 'tesseract <input> - -l eng' to read printed text from sample_page.png and stream the recognized text to stdout. The '-l eng' option selects the English language model for recognition.
OCR with bounding boxes (TSV)
Use 'tesseract <input> - tsv' to emit recognized words with per-word bounding boxes, page/block/paragraph IDs, and confidence scores in tab-separated form. Useful for layout-aware extraction.
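
The TSV output can be post-processed with a few lines of Python. This sketch assumes the header row uses the column names `level`, `left`, `top`, `width`, `height`, `conf`, and `text`, and that word rows carry `level` 5. Check both against the output of your own build. The function name `parse_tsv_words` and the `min_conf` parameter are hypothetical.

```python
import csv
import io


def parse_tsv_words(tsv_text, min_conf=0.0):
    """Return word-level entries (text, confidence, box) from tesseract TSV output."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # recognized text may contain quote characters
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        # Assumption: level 5 marks word rows; other levels describe
        # pages, blocks, paragraphs, and lines.
        if row.get("level") != "5" or not text:
            continue
        conf = float(row["conf"])
        if conf < min_conf:
            continue
        words.append({
            "text": text,
            "conf": conf,
            # Box is (left, top, width, height) in pixels.
            "box": (int(row["left"]), int(row["top"]),
                    int(row["width"]), int(row["height"])),
        })
    return words
```

Filtering on `min_conf` is one way to drop low-confidence words before layout-aware extraction. A suitable threshold depends on the input images.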


# Tesseract OCR - Help

Source: https://tesseract-ocr.github.io/

```
Usage:
  C:\CFusionExtra\tesseract\tesseract.exe --help | --help-extra | --version
  C:\CFusionExtra\tesseract\tesseract.exe --list-langs
  C:\CFusionExtra\tesseract\tesseract.exe imagename outputbase [options...] [configfile...]

OCR options:
  -l LANG[+LANG]        Specify language(s) used for OCR.
NOTE: These options must occur before any configfile.

Single options:
  --help                Show this help message.
  --help-extra          Show extra help for advanced users.
  --version             Show version information.
  --list-langs          List available languages for tesseract engine.
```
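
As an illustration of the `-l LANG[+LANG]` and `--list-langs` options above, the following Python sketch builds a combined language argument and parses the language listing. Both helper names are hypothetical. The parser assumes the listing begins with a header line ending in a colon, followed by one language code per line. That layout may differ between versions.

```python
def join_langs(langs):
    """Combine language codes into the LANG[+LANG] form expected by -l."""
    cleaned = [code.strip() for code in langs if code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


def parse_list_langs(output):
    """Extract language codes from the text printed by --list-langs."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Assumption: the first non-empty line is a header ending in ":".
    if lines and lines[0].endswith(":"):
        lines = lines[1:]
    return lines
```

For example, `join_langs(["eng", "deu"])` produces the string `eng+deu`, which can be passed directly after `-l`.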
