Caspian Office

OCR — image/PDF to text

Extract text from images and scanned PDFs in your browser (tesseract.js — English, Turkish, German, French & Spanish). No upload, no AI.

Private · runs in your browser
Offline · after first load
Free · no signup

What is the OCR tool?

A private, in-browser OCR tool that extracts editable text from images and scanned PDFs using tesseract.js — with support for English, Turkish, German, French and Spanish. Use it to pull text out of a photo, screenshot or scanned document so you can copy and reuse it. Nothing is uploaded and there's no AI involved; recognition runs entirely on your device, even offline once the engine and language model have loaded.

How to use OCR — image/PDF to text

Add your file — Drop an image or PDF onto the dropzone, or click to choose a file. PNG, JPG, WebP, BMP and PDF are all accepted.
Pick the language — Choose the language of the text from the dropdown (English, Turkish, German, French, Spanish, or combined English + Turkish) so the right model is used.
Extract the text — Click Extract text. The first run downloads the engine and the selected language model (roughly 1–5 MB per model), which are then cached; a progress bar tracks recognition.
Copy the result — Recognised text appears in the editable box on the right. Review it, fix anything OCR missed, then select and copy it into your document.
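
For readers curious how the first step might be checked in code, here is a minimal sketch of sorting a dropped file into "image", "pdf" or "unsupported" by its MIME type. The function name `classifyFile` and the `FileKind` type are hypothetical, chosen for illustration; the tool's actual validation code is not shown on this page.

```typescript
// Hypothetical result type: how a dropped file should be handled.
type FileKind = "image" | "pdf" | "unsupported";

// Classify a file by its MIME type, covering the formats listed in
// step one: PNG, JPG, WebP, BMP and PDF. Anything else is rejected.
function classifyFile(mimeType: string): FileKind {
  switch (mimeType.trim().toLowerCase()) {
    case "image/png":
    case "image/jpeg":
    case "image/webp":
    case "image/bmp":
      return "image"; // can be passed to OCR directly
    case "application/pdf":
      return "pdf"; // pages must be rasterised before OCR
    default:
      return "unsupported";
  }
}
```

In a browser, the value passed in would typically come from the `type` property of the dropped `File` object.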

Frequently asked questions

Is my image or PDF uploaded to a server?

No. Recognition runs entirely in your browser with the selected language model, so your file and the text it contains never leave your device.

Which languages can it recognise?

English, Turkish, German, French and Spanish, plus a combined English + Turkish option for documents that mix both.
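
As an illustration, the dropdown choices could map to Tesseract's three-letter language codes, with `+` joining the codes for a combined option. This is a sketch under assumptions: the names `UiLanguage`, `languageCode` and `modelsFor` are hypothetical, and the tool's own mapping may differ.

```typescript
// The dropdown options named on this page.
type UiLanguage =
  | "English"
  | "Turkish"
  | "German"
  | "French"
  | "Spanish"
  | "English + Turkish";

// Tesseract language codes; "+" combines more than one language.
const TESSERACT_CODES: Record<UiLanguage, string> = {
  English: "eng",
  Turkish: "tur",
  German: "deu",
  French: "fra",
  Spanish: "spa",
  "English + Turkish": "eng+tur",
};

// Code string to hand to the OCR engine for a given dropdown choice.
function languageCode(choice: UiLanguage): string {
  return TESSERACT_CODES[choice];
}

// Individual language models a choice depends on (one per code).
function modelsFor(choice: UiLanguage): string[] {
  return languageCode(choice).split("+");
}
```

Under this mapping, the combined option relies on two models, `eng` and `tur`, so both would need to be downloaded once before it can work offline.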

Why is the first run slow?

The first time you use a language, the OCR engine and that language model are downloaded (each model is roughly 1–5 MB). They're cached afterwards, so later runs are much faster.

Can it handle multi-page PDFs?

Yes. Each page of a scanned PDF is rasterised to an image and then recognised in turn, one page at a time.
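
The page-by-page approach can be sketched roughly as a loop that waits for one page to finish before starting the next. The `recognizePages` function and the `RecognizePage` callback type are hypothetical; the callback stands in for whatever rasterising and OCR calls the tool really makes, which are not shown here.

```typescript
// Hypothetical callback: turns one rasterised page into text.
// pageNumber is 1-based.
type RecognizePage<P> = (page: P, pageNumber: number) => Promise<string>;

// Recognise pages strictly in order and join the results with a
// blank line between pages.
async function recognizePages<P>(
  pages: readonly P[],
  recognize: RecognizePage<P>,
): Promise<string> {
  const texts: string[] = [];
  let pageNumber = 0;
  for (const page of pages) {
    pageNumber += 1;
    // Await each page before moving on, so only one page is
    // being recognised at any time.
    const text = await recognize(page, pageNumber);
    texts.push(text.trim());
  }
  return texts.join("\n\n");
}
```

Processing one page at a time keeps memory use bounded to a single page image, at the cost of total time growing with the page count.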

Is the text always perfect?

No OCR is flawless. Clear, high-contrast scans give the best results; the output is fully editable so you can correct any mistakes before using it.

Tips

Use a sharp, well-lit, high-contrast scan or photo — clean input gives far better accuracy.
Match the language setting to the document; the right model makes a noticeable difference.
Run the tool once while online, with each language you plan to use, so the engine and those language models are cached and it works offline afterwards.
Proofread the output before reusing it — punctuation and rare characters are the most likely to need a fix.
