dpScreenOCR User Manual

Version 1.4

1 About

dpScreenOCR is a free and open-source program to recognize text on the screen. Powered by Tesseract, it supports more than 100 languages and can split independent text blocks, such as columns.

2 Installation

2.1 Installing dpScreenOCR

2.1.1 Unix-like systems

The dpScreenOCR website provides several options, including repositories for Debian, Ubuntu, and derivatives. If there is no suitable choice for your system, download the source code tarball, unpack it, and follow the instructions in “doc/building-unix.txt”.

2.1.2 Windows

The dpScreenOCR website provides an installer and a ZIP archive. The latter doesn’t need installation: unpack it anywhere and run dpscreenocr.exe.

Both versions are identical. In particular, the ZIP variant is not a so-called portable application: it stores its configuration and other files in the same directories as the installer version.

2.2 Installing languages

2.2.1 Unix-like systems

Use your package manager to install languages for Tesseract. Package names vary across systems, but they usually start with “tesseract” and end with a language code or name. For example, the package for German is named tesseract-ocr-deu on Debian and Ubuntu, tesseract-langpack-deu on Fedora, and tesseract-data-deu on Arch Linux.

There are two special packs that provide extra features rather than languages: “osd” (automatic script and orientation detection) and “equ” (math and equation detection). dpScreenOCR doesn’t use them.

2.2.2 Windows

dpScreenOCR for Windows is shipped with the English language pack. To install other languages, use the language manager as described in the “Language manager” section.

Alternatively, you can install languages manually: download the needed Tesseract language files (for example, from the languages page) and place them in C:\Users\(your name)\AppData\Local\dpscreenocr\tesseract_5_data. To open this folder quickly, paste %LOCALAPPDATA%\dpscreenocr\tesseract_5_data into either the “Run” dialog (press Windows + R) or the address bar of File Explorer.
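
To check which language files ended up in that folder, a short script can list them. The following is a minimal Python sketch, not part of dpScreenOCR: the function names are hypothetical, and it assumes that each language is stored as a file with the standard Tesseract “.traineddata” extension.

```python
import os
from pathlib import Path


def tessdata_dir(local_appdata):
    """Return the language folder for a given %LOCALAPPDATA% path."""
    return Path(local_appdata) / "dpscreenocr" / "tesseract_5_data"


def installed_languages(directory):
    """Return the sorted base names of all *.traineddata files in a directory."""
    return sorted(path.stem for path in Path(directory).glob("*.traineddata"))


if __name__ == "__main__":
    # LOCALAPPDATA is normally set only on Windows; do nothing elsewhere.
    local_appdata = os.environ.get("LOCALAPPDATA")
    if local_appdata:
        for code in installed_languages(tessdata_dir(local_appdata)):
            print(code)
```

Each printed name is the base name of one language file, which for Tesseract is normally the language code.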

3 Usage

dpScreenOCR is simple to use:

  1. Choose languages and actions in the Main tab.
  2. Move the mouse pointer to the screen area containing text and press the hotkey shown in the Main tab to start the selection.
  3. Move the mouse so that the selection covers the text and press the hotkey again.

After these steps, dpScreenOCR will recognize the text from the selected area and process it according to the chosen actions.

The rest of this chapter describes the settings available in the Main tab.

3.1 Character recognition

3.1.1 Split text blocks

If this option is enabled, dpScreenOCR tries to split independent blocks of text, such as columns. Otherwise, the text is treated as one continuous block. This behavior is best illustrated by the following image, which shows a two-column text layout (A) recognized with (B) and without (C) the “Split text blocks” option. With the option enabled, the columns are recognized one after another; without it, each line is read straight across both columns.

This option does not affect paragraph detection.

3.1.2 Languages

This is the list of languages that dpScreenOCR can use to recognize text. You can select more than one, but be aware that this may slow down recognition and reduce its accuracy.

3.1.3 Language manager

The language manager allows you to install, update, and remove languages. It is not available on Unix-like systems, where you can handle languages via the system package manager as described in the “Installing languages” section.

When you open the manager, it will try to fetch the list of available languages from the Internet. If it fails (e.g., if there is no network connection), you can still remove languages using the corresponding tab.

3.2 Actions

The Actions group lets you choose what to do with the recognized text: copy it to the clipboard, add it to the history (located in the corresponding tab), or pass it as an argument to an executable.

3.2.1 Run executable

The “Run executable” action runs an executable with the recognized text as the first argument. The entry expects either an absolute path to the executable or, if the executable is located in one of the directories listed in your PATH environment variable, just its name.
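
As an illustration of this calling convention, here is a minimal Python sketch of a script that could be used with the action. It appends the recognized text to a file; the function name and the ocr_log.txt file in the home directory are hypothetical choices, not something dpScreenOCR defines. On Windows, running a script this way also requires a working file association, as described in the “Creating file associations” section.

```python
import sys
from pathlib import Path


def append_text(text, log_path):
    """Append one recognized text fragment to log_path, followed by a blank line."""
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(text + "\n\n")


if __name__ == "__main__":
    # The recognized text arrives as the first command-line argument.
    if len(sys.argv) > 1:
        append_text(sys.argv[1], Path.home() / "ocr_log.txt")
```

Because the text arrives as a single argument, the script does not need to join or re-quote anything, even when the text contains spaces or line breaks.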

3.2.1.1 Running scripts on Unix-like systems

Before using your script, make sure that it starts with a proper shebang line and that you have execute permission for it (run chmod +x your_script).

Here is an example Unix shell script that translates the recognized text to your native language using Translate Shell and displays the translation as a desktop notification.

#!/bin/sh

notify-send "Translation" "$(trans -b "$1")"

3.2.1.2 Running scripts on Windows

3.2.1.2.1 Batch files

dpScreenOCR doesn’t run batch files (“.bat” or “.cmd”) because there’s no way to safely pass them arbitrary text. Please use another scripting language instead.

3.2.1.2.2 Creating file associations

Before using a script, make sure that the file association is configured correctly so that you can launch the script just by its file name, without mentioning the interpreter explicitly. The simplest way to test this is to type the name of the script with some arguments in cmd.exe. If the script runs and receives all arguments, you can skip this section.
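
A small test script helps with this check. The following Python sketch (the file name show_args.py and the function name are hypothetical) prints every argument it receives, so a run such as show_args.py one "two words" should list two arguments in addition to the script path.

```python
import sys


def describe_args(argv):
    """Return one line per entry in argv, numbered from 0 (the script path)."""
    return ["{}: {!r}".format(index, value) for index, value in enumerate(argv)]


if __name__ == "__main__":
    for line in describe_args(sys.argv):
        print(line)
```

If only entry 0 is printed, the arguments are not being forwarded to the script.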

We will use Python as an example, but the process is similar for other languages. Open cmd.exe as administrator and run assoc with the extension of the script file as an argument:

> assoc .py

assoc prints the file type associated with the extension (typically Python.File for a standard Python installation). Pass this file type to ftype to see the command that opens such files:

> ftype Python.File

The command should start with the path to the interpreter and end with %*, which forwards all script arguments. If it doesn’t, set it with ftype, for example:

> ftype Python.File="C:\Windows\py.exe" "%L" %*

If the script still receives only one argument (the path to the script), Windows is using a different association for the given extension and ignoring the one set with assoc/ftype. To fix this, open regedit and make sure the values of the following keys use the correct path to the Python executable and end with %*:

HKEY_CLASSES_ROOT\Applications\python.exe\shell\open\command
HKEY_CLASSES_ROOT\py_auto_file\shell\open\command

A tip for Python users: note that in the examples above the association uses Python Launcher (py.exe) rather than a concrete Python executable (python.exe). This allows using shebang lines to select the Python version for each script. For more information, read “Using Python on Windows” in the Python documentation.

3.2.1.2.3 Hiding console window

Most scripting language interpreters for Windows come with a special version of the executable that doesn’t show the console window. For example, this is pyw.exe for Python.

The interpreter installer usually adds a special file association that allows you to hide the console window by changing the script extension (for example, to “.pyw” for Python). If such an association does not exist, you can create it as described in the previous section.

3.3 Hotkey

The hotkey starts and ends the on-screen selection. To cancel the selection, press Escape.

The hotkey is global: it works even if the dpScreenOCR window is minimized. If pressing the hotkey has no effect, it probably means that another program is already using it. In this case, try another key combination.

4 Tweaking

This section describes how to change some settings that are not available in the dpScreenOCR interface.

dpScreenOCR saves settings in settings.cfg. Depending on the platform, you can find it in the following directories:

Each line in settings.cfg contains an option as a key-value pair. A value is a string that, depending on the option, represents a boolean (true or false), number (like 10 or -5), file path, etc.

Values can contain the following escape sequences:

Any other character preceded by \ is kept as is. To preserve leading spaces, escape the first one with \; to preserve trailing spaces, put \ at the end of the line.

To reset an option to the default value, remove it from settings.cfg; to reset all options, clear or delete the file. Be aware that dpScreenOCR rewrites settings on exit, so make sure you close the program before making changes.

Here is a list of options that can only be changed by editing the settings file:

5 Troubleshooting

This section contains a list of possible issues and their solutions. If the solution doesn’t help, or you have a problem that is not listed here, please report it on the issue tracker. You can also contact the author by email; the link is at the bottom of the dpScreenOCR website.