Prime Recognition - High Accuracy Optical Character Recognition

PrimeOCR Technical Specifications

Recognition Data Types

PrimeOCR recognizes the following data types:

Characters - Machine Print and Dot Matrix text in any of the following 11 languages:
  • Danish

  • English (US or UK)

  • German

  • Norwegian

  • Spanish

  • Dutch

  • French

  • Italian

  • Portuguese

  • Swedish

plus Russian, Simplified Chinese, Traditional Chinese, Japanese, and Korean characters.

 

Optical Marks (OMR) - When an area on the image is "zoned" as OMR, PrimeOCR will return the percentage of black space contained within the zone. This percentage can be used to determine whether a user has marked a selection on the page.

Graphics - PrimeOCR normally ignores any graphics (e.g., pictures) found on an image. It can instead be instructed to save the graphic to a file. A path to the graphic is added to the text output for later page reconstruction.
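The OMR step described above, turning a black-space percentage into a marked/unmarked decision, can be sketched as follows. This is a minimal illustration and not PrimeOCR code: the image is assumed to be a binary grid (1 = black, 0 = white), and the function names, the zone tuple layout, and the 20% threshold are all hypothetical.

```python
def black_percentage(image, zone):
    """Return the percentage of black pixels inside a rectangular zone.

    image: list of rows, each a list of 0 (white) / 1 (black); assumed format.
    zone:  (left, top, right, bottom), right/bottom exclusive; assumed layout.
    """
    left, top, right, bottom = zone
    total = (right - left) * (bottom - top)
    if total <= 0:
        raise ValueError("zone must have positive area")
    black = sum(sum(row[left:right]) for row in image[top:bottom])
    return 100.0 * black / total


def is_marked(image, zone, threshold=20.0):
    """Treat the zone as marked when black coverage reaches the threshold."""
    return black_percentage(image, zone) >= threshold


# A 4x4 page with a filled 2x2 checkbox in the top-left corner.
page = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
```

In practice the threshold would need tuning per form, since an empty checkbox outline already contributes some black space.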

 

PrimeOCR Access Methods

PrimeOCR can be accessed through:

The PrimeOCR Job Server - The Job Server controls PrimeOCR processing and can instruct PrimeOCR to process all images found in a directory/subdirectories, with no user intervention or coding. It also records major activities performed by PrimeOCR.
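The directory sweep that the Job Server performs can be pictured with a short sketch. This is not the Job Server's implementation; the function name and the extension list are assumptions, the latter loosely based on the input formats listed on this page.

```python
import os

# Extensions are an assumption based on the input formats this page lists.
IMAGE_EXTENSIONS = {".tif", ".tiff", ".pdf", ".jpg", ".jpeg", ".pcx"}


def find_images(root):
    """Return sorted paths of image files under root, including subdirectories."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```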

PrimeView/PrimeVerify - This graphical interface for end-users consists of two applications for sending images to the Job Server and editing PrimeOCR results. See the PrimeVerify Data Sheet for more information.

Software Developer's Kit (SDK) - The SDK consists of 32 simple, orthogonal API calls, packaged as a Dynamic Link Library (DLL) that can be called from the following languages:

  • C or C++

  • C++ .NET

  • Visual Basic

  • VB .NET

  • Any other language capable of accessing a DLL, such as PowerBuilder.

The SDK also includes:

 

 

Image Input

PrimeOCR will read images from either file or memory in the following formats:

  • TIFF (single or multi-page) all compression types

  • PDF

  • JPEG (color or grayscale)

  • PCX

  • PDA

Valid resolutions include 200, 240, 300, 400, and 600 DPI as well as Standard or Fine FAX.

 

 

Pre-Processing

PrimeOCR offers a variety of ways to enhance and define your image for optimal OCR results:

Image Enhancement:

Improves image quality for better OCR using features such as:
  • Deskew

  • Image Registration

  • Despeckle

  • Line Removal, etc.

Image Zoning:

  • Manual Zoning

  • Auto Zoning

  • Zone Content Restrictions include:  None, Alphabetic, Alphabetic Upper/Lower Case, Numeric, Graphic, and OMR.
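To illustrate why a content restriction on a zone can help, the sketch below filters candidate characters by the zone's restriction before choosing one. How PrimeOCR applies restrictions internally is not described here; the `Zone` class, the restriction names, and the (character, confidence) candidate format are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    left: int
    top: int
    right: int
    bottom: int
    restriction: str = "none"  # "none", "alphabetic", or "numeric" (assumed names)


def pick_character(candidates, zone):
    """Return the highest-confidence candidate allowed by the zone restriction.

    candidates: list of (character, confidence) pairs; a hypothetical format.
    Returns None when no candidate satisfies the restriction.
    """
    tests = {
        "none": lambda ch: True,
        "alphabetic": str.isalpha,
        "numeric": str.isdigit,
    }
    allowed = tests[zone.restriction]
    valid = [(ch, conf) for ch, conf in candidates if allowed(ch)]
    if not valid:
        return None
    return max(valid, key=lambda pair: pair[1])[0]
```

For example, when a glyph could be read as the letter "O" or the digit "0", a numeric zone rules out the letter even if it scored slightly higher.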

 

 

OCR Processing

PrimeOCR has several features that improve OCR accuracy, fault tolerance, and speed:

Configurable Accuracy

Using a "3 engine" voting configuration, the base PrimeOCR configuration produces 65% fewer errors than conventional OCR.  Even greater accuracy can be achieved through the following:

  • 4, 5, or 6 Engines - Add a 4th OCR engine to the base configuration for 75% fewer errors, a 5th engine for 80% fewer errors, or a 6th engine for 83% fewer errors.

  • Character Training - PrimeOCR can be trained to recognize specific character sets or fonts.

  • Engine Customization - Users may select which engines participate in the recognition process or even weight engine results differently.
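The idea of multi-engine voting with per-engine weights can be sketched as below. This is a deliberately simplified illustration, not PrimeOCR's algorithm: it assumes every engine returns a string of the same length that is already aligned character by character, and the `vote` function and engine names are hypothetical.

```python
from collections import defaultdict


def vote(engine_outputs, weights=None):
    """Combine equal-length strings from several engines by weighted voting.

    engine_outputs: dict of engine name -> recognized text (assumed aligned).
    weights:        dict of engine name -> vote weight; defaults to 1.0 each.
    """
    weights = weights or {}
    lengths = {len(text) for text in engine_outputs.values()}
    if len(lengths) != 1:
        raise ValueError("this sketch requires aligned, equal-length outputs")

    result = []
    for position in range(lengths.pop()):
        tally = defaultdict(float)
        for engine, text in engine_outputs.items():
            tally[text[position]] += weights.get(engine, 1.0)
        # max() returns the first key with the highest tally, so ties go to
        # the character proposed by the earliest-listed engine.
        result.append(max(tally, key=tally.get))
    return "".join(result)
```

As a reading of the percentages quoted above: on a page where a conventional engine makes 100 errors, 65% fewer errors means about 35 remain with three engines, 25 with four, 20 with five, and 17 with six.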

High Fault Tolerance

Automatic Engine Recovery - A poor quality image can cause a conventional OCR product to "crash". To solve this problem, PrimeOCR can sense when an engine fails and automatically reinitialize it for the next image. This increases throughput by allowing PrimeOCR to run unattended, 24 hours a day!
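The recover-and-continue pattern described above can be sketched in a few lines. This is a generic illustration rather than PrimeOCR's mechanism; `EngineCrash`, `process_batch`, and the `recognize` method are hypothetical names.

```python
class EngineCrash(Exception):
    """Stand-in for whatever failure an OCR engine might raise."""


def process_batch(images, make_engine):
    """Run every image, rebuilding the engine after any failure.

    make_engine: callable returning an object with a recognize(image) method
                 (a hypothetical interface).
    Returns a dict of image -> text, with None for images that failed.
    """
    engine = make_engine()
    results = {}
    for image in images:
        try:
            results[image] = engine.recognize(image)
        except EngineCrash:
            results[image] = None      # record the failure and move on
            engine = make_engine()     # fresh engine for the next image
    return results
```

The point of the design is that one bad image costs only that image, so a long batch does not stall waiting for an operator.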

Configurable Speed

  • Multi-Processor Support - This option allows PrimeOCR to utilize up to four processors in a multi-CPU system for faster throughput.

  • Selective Voting - While "Voting" takes longer than conventional OCR, you can speed up the processing on high quality images through Selective Voting. The result: faster OCR speeds on high quality documents and more processing power on lower quality documents.
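One way to picture Selective Voting is shown below. The page does not say how PrimeOCR decides when to skip voting, so this sketch assumes a confidence threshold on the first engine's result; the function name, the 0.95 threshold, and the (text, confidence) return format are all assumptions.

```python
def selective_recognize(image, engines, combine, threshold=0.95):
    """Use one engine when it is confident; otherwise consult all engines.

    engines: list of callables, each returning (text, confidence in 0..1).
    combine: callable that merges a list of texts into one result.
    Returns (text, number_of_engines_used).
    """
    text, confidence = engines[0](image)
    if confidence >= threshold:
        return text, 1
    # Low confidence: run the remaining engines and combine all results.
    texts = [text] + [engine(image)[0] for engine in engines[1:]]
    return combine(texts), len(engines)
```

Under this assumption, clean pages cost a single engine pass, and the extra passes are spent only on pages where the first result looks doubtful.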

 

Output Format

PrimeOCR can generate file output in the following formats:

  • ASCII - Text only output, left justified.

  • Formatted ASCII - Spaces are added to text to mimic the original imaging layout.

  • PDF - Converts scanned images, including color images, into PDF "Normal", "Image + Text", or "Image Only" files.  Accessible PDF and PDF/A output are also supported.

  • RTF - Retains original character attributes and page layout using frames and paragraph conventions.  Color/grayscale image zones are supported.

  • Comma Delimited ASCII - Useful for exporting text fields to other applications.

  • Confidence/Character Attribute Reporting - Provides text and information on each character to aid in OCR verification.  Attributes include line coordinates as well as character confidence, font, location, point size, style, etc.

  • HTML - Transfer OCR results directly to the Web for on-line viewing.  Color/grayscale image zones are supported.

  • XHTML

  • XML

  • Tab Delimited - Useful for forms-based applications.  Each defined zone's output is separated by a tab, so the file can be easily imported into any popular database application.

  • RRI3 - RRI's FormWorks compatible format.

  • ZYINDEX - ZyLab's ZyIndex compatible format.

  • Custom output - each conversion project is unique in its requirements.  Contact us if you need customized output including advanced parsing of text output or any other custom pre or post OCR processing.
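The comma-delimited and tab-delimited formats listed above differ mainly in the separator placed between zone outputs. The sketch below shows the general shape using Python's standard `csv` module; the exact quoting and line layout that PrimeOCR writes are not specified on this page, so the function name and the one-row-per-page layout are assumptions.

```python
import csv
import io


def zones_to_delimited(rows, delimiter=","):
    """Serialize rows of zone text as delimited text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Each inner list holds the text recognized in one page's zones.
rows = [["Smith, John", "42"], ["Doe", "7"]]
comma_text = zones_to_delimited(rows)
tab_text = zones_to_delimited(rows, delimiter="\t")
```

Note that zone text containing a comma has to be quoted in the comma-delimited form, which is one reason a tab separator can be more convenient for form data.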

 

 

Complementary Products for PrimeOCR

Prime Recognition has two add-on applications that can be customized for specific image document types to improve PrimeOCR accuracy rates:

  • PrimeZone - This custom pre-OCR auto-zoning application creates a zone template for each image based upon specific document types such as Phonebooks, Greenbar, etc.

  • PrimePost - This custom post-OCR utility performs automatic error correction based upon predefined document types.

 

System Requirements

Software:

  • Windows 2000, Windows XP, or Windows Server 2003.

Hardware:

  • Pentium or Pentium-compatible computer (e.g., AMD Athlon). Up to 6 CPUs supported.

  • A hard disk with 50-150 megabytes (Meg) of space for installation

  • At least 256 megabytes of Random Access Memory (RAM), 512 megabytes recommended.  Additional memory may be required for processing color/grayscale or higher resolution images.
