Tiff to Text is designed to perform Optical Character Recognition (OCR) as a batch process. The program uses the OCR engine from Nuance (owner of OmniPage, formerly ScanSoft) that is included with Microsoft Office Document Imaging (MODI). Without question, this OCR engine is one of the five best in the world, and it is available in different languages. If English is not the language to be used, the setup will prompt the user for the location of MODI.
With Tiff to Text, the user has the option of processing all of the tif images in a file folder, as well as in all subfolders that contain tif images. The output is a matching file folder structure that contains either each tif image along with a matching text file created from the OCR, or just the text file.
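The folder mirroring described above can be pictured with a short Python sketch. This is only an illustration of the idea, not the program's actual code: the function name plan_outputs, its parameters, and the choice of a .txt extension are assumptions made for the example, and no OCR is performed here.

```python
from pathlib import Path

# Extensions treated as TIFF images (an assumption for this sketch).
TIFF_SUFFIXES = {".tif", ".tiff"}


def plan_outputs(input_root, output_root, include_subdirs=True):
    """Pair each TIFF under input_root with a text file path under output_root.

    Hypothetical helper: it only plans the paths and does not run any OCR.
    """
    input_root, output_root = Path(input_root), Path(output_root)
    # "**/*" walks the root and every subfolder; "*" looks at the root only.
    pattern = "**/*" if include_subdirs else "*"
    pairs = []
    for source in sorted(input_root.glob(pattern)):
        if source.is_file() and source.suffix.lower() in TIFF_SUFFIXES:
            # Keep the path relative to the input root so that the output
            # folder hierarchy matches the input hierarchy.
            relative = source.relative_to(input_root)
            pairs.append((source, (output_root / relative).with_suffix(".txt")))
    return pairs
```

Under these assumptions, an image stored in a subfolder named sub of the input folder would be paired with a text file in a subfolder named sub of the output folder.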
Tiff to Text is easy to set up, as the user only has to enter the input folder and output folder and make a few simple choices.
Required Setup information:
Include Sub Directories - OCRs all of the tif images in the subdirectories of the root folder
Duplicate Folder Structure - creates an output file folder hierarchy that matches that of the OCRd images; if not selected, all output files are placed in the output root
Display Status - lets the user see which file is being OCRd
Output Text File Only - creates only a matching text file from the OCR contents
Delete Original Tiff Image - deletes each file that was processed
Standard Output - normal OCR text file
Upper Case - all of the OCR text is converted to upper case
Lower Case - all of the OCR text is converted to lower case
Strict ASCII - outputs only the OCR text characters with character codes between 0 and 127
Printable ASCII - outputs only the OCR text characters with character codes 10 (Line Feed), 13 (Carriage Return), and 32 to 126
Custom - the user can enter a string of character codes to be returned from the OCR, such as 10, 13, 48-57, which would output only the digits (and line breaks) contained within the tif image
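
The character code options can be illustrated with a short Python sketch. It is a guess at the behaviour described in the list, not the program's own implementation: the names parse_codes and filter_text are hypothetical, and the sketch assumes that the Custom string is a comma-separated list of single codes and low-high ranges.

```python
def parse_codes(spec):
    """Turn a string such as "10, 13, 48-57" into a set of character codes.

    Hypothetical helper: the format accepted by the real program may differ.
    """
    codes = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as 48-57 includes both of its end points.
            low, high = (int(piece) for piece in part.split("-", 1))
            codes.update(range(low, high + 1))
        else:
            codes.add(int(part))
    return codes


def filter_text(text, allowed):
    """Keep only the characters whose code is in the allowed set."""
    return "".join(ch for ch in text if ord(ch) in allowed)


# The two fixed ASCII modes written as the same kind of string.
STRICT_ASCII = parse_codes("0-127")
PRINTABLE_ASCII = parse_codes("10, 13, 32-126")

digits_only = filter_text("Invoice 4821\r\nTotal: 97", parse_codes("10, 13, 48-57"))
# digits_only == "4821\r\n97"
```

Treating Strict ASCII and Printable ASCII as preset code strings is a design choice of this sketch; it lets one filter function cover all three modes.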