Standard Input Management
    • 30 Apr 2025


    Introduction

    The Standard Input Management Package adds OCR capabilities to Lasernet.

    It is a client/server solution for managing incoming business information and extracting data for direct import into an external workflow system. By setting up simple text recognition in the Lasernet OCR Editor, you can work with data fields in documents such as invoices and order confirmations, and extract their content into XML data with the Lasernet OCR Engine.
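
    To make the idea of extracting document fields into XML concrete, here is a minimal Python sketch. It does not show Lasernet's actual output format: the field names, the Document and Field element names, and the fields_to_xml helper are all hypothetical, chosen only to illustrate how recognized values might be serialized for import into a workflow system.

```python
import xml.etree.ElementTree as ET


def fields_to_xml(doc_type, fields):
    """Serialize recognized field values into a small XML document.

    The element and attribute names are illustrative assumptions,
    not the schema produced by the Lasernet OCR Engine.
    """
    root = ET.Element("Document", {"type": doc_type})
    for name, value in fields.items():
        field = ET.SubElement(root, "Field", {"name": name})
        field.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical values, as if read from a scanned invoice.
invoice_fields = {
    "InvoiceNumber": "INV-1001",
    "InvoiceDate": "2025-04-30",
    "Total": "1250.00",
}
xml_output = fields_to_xml("Invoice", invoice_fields)
print(xml_output)
```

    A downstream system could then parse this XML and map each Field element to its own import structure.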

    Note

    This package is sold only as an add-on to existing customers running Output Management.

    Dependencies

    Lasernet Base Server Package incl. XML in & out

    Additional Applications Included

    • Lasernet OCR Editor

    • Lasernet Dictionary Service

    Workflow

    Lasernet Input Management

    Modules

    | Type | Module | Description |
    |---|---|---|
    | Engine | OCR | The OCR Engine has properties to define the types of documents, types of OCR Fields, and types of Item Lines. |
    | Engine | PDF Splitter | A module that splits a single PDF document into multiple PDF documents. |
    | Engine / Modifier | Tesseract OCR* | A module based on Tesseract OCR text recognition. It accepts PDF input, adds image pre-processing for embedded images and scanned documents, and outputs PDFs that retain the original text. The module also supports TIFF as an input format, with OCR processing of multi-page files. |
    | Modifier | Barcode Reader | Extracts values from a wide range of linear and 2D barcodes in PDF documents and in PNG, JPEG, and TIFF images, and outputs them in XML, JSON, or JobInfos format. |
    | Modifier | Excel to XML | Converts Microsoft Excel files (2007 or newer) to XML or DataSet (XML with schema) format. |
    | Modifier | PDF to Text | Converts PDF format to Text format. |

    Lasernet OCR Editor

    The Lasernet OCR Editor does not require an additional license. It is installed as an application on the end user's desktop. The OCR Editor is used to define the criteria for recognizing the sender of a document and for identifying the data to be extracted for the external workflow.

    Dictionary Service

    The Lasernet Dictionary Service is an optional feature that automates the capture of OCR data in documents, replacing the manual process in the Lasernet OCR Editor. It is delivered ready to use with invoices and credit notes in Danish and English.

    Tesseract OCR

    Handwritten text, right-to-left languages, and Asian languages are not supported. The module is trained and bundled with the following language packages: Danish, Dutch, German, English, Finnish, French, Icelandic, Italian, Norwegian, Russian, Spanish, and Swedish.

    ABBYY FineReader (Not Maintained by the Lasernet Core Team)

    Formpipe offers sales and licenses for ABBYY FineReader. It is a stand-alone server application for OCR processing.

    ABBYY FineReader vs. Tesseract OCR

    ABBYY FineReader works significantly differently from Tesseract OCR. It runs a full OCR scan of the whole PDF document, including text strings already present in the document, rather than of images only.

    Depending on the ABBYY license, ABBYY FineReader is limited to a maximum number of documents processed per month. The Tesseract OCR module operates like other Lasernet modules, with a license model that supports an unlimited number of processed documents.

    Example of Usage (Tesseract OCR)

    A user receives PDF documents, such as invoices and orders, from their business partners. Data in the PDF documents is embedded as a mix of text strings and images, or as fully scanned documents. Tesseract OCR (Optical Character Recognition) adds image pre-processing and converts images into text strings. With Tesseract OCR, the PDF documents retain the original text from the input PDF, and only the embedded images are processed.
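
    The selective behavior described above can be sketched in Python. This is an illustration under stated assumptions only: the Page class, the run_ocr callable, and the ocr_images_only function are hypothetical stand-ins, not part of Lasernet or Tesseract. The sketch keeps text already present on a page and passes only the embedded images to OCR.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Page:
    """Toy model of a PDF page: existing text plus embedded images."""
    text: str = ""
    images: List[bytes] = field(default_factory=list)


def ocr_images_only(pages: List[Page], run_ocr: Callable[[bytes], str]) -> List[str]:
    """Keep each page's original text and append text recognized from its images."""
    results = []
    for page in pages:
        parts = [page.text] if page.text else []
        # Only embedded images are sent to OCR; existing text is left untouched.
        parts.extend(run_ocr(image) for image in page.images)
        results.append("\n".join(parts))
    return results


def fake_ocr(image: bytes) -> str:
    """Stand-in for a real OCR engine, used here for demonstration only."""
    return image.decode("ascii").upper()


pages = [
    Page(text="Invoice 1001", images=[b"total 1250.00"]),  # mixed text and image
    Page(images=[b"scanned page"]),  # fully scanned page
]
recognized = ocr_images_only(pages, fake_ocr)
print(recognized)
```

    A full-scan approach, as described for ABBYY FineReader, would instead run recognition over the entire page, including text that already exists in the PDF.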

    With the Lasernet OCR Editor, an end user can define OCR Forms that specify how to extract the information required for import into an external workflow system in any ERP system.

    The Lasernet Dictionary Service (installed from Lasernet Server License Manager) can add AutoCapture features to the solution.

    Example of Usage (ABBYY FineReader)

    Contact a member of the Formpipe sales team for more information.