Accurately extract entities, events, and facts

Entity recognition identifies critical textual elements such as persons, organizations, dates, and geographic objects. Event extraction reveals speech activities, commercial deals, and crimes, while fact extraction captures relations such as citizenship, employment, and family relationships. Powered by Compreno, a cutting-edge natural language processing technology, ABBYY InfoExtractor recognizes and disambiguates entities and events at the level of their meaning.

Identify relationships between entities and events

Through deep language-based analysis of documents, InfoExtractor SDK reveals relations between entities and events. Entity extraction is important, but recognizing that an entity has been replaced by a pronoun, or tracing its mentions throughout the text, lets you analyze the whole picture: for example, you can find the deals that link a buyer and a seller and identify the related financial figures or other information.


Add customized entities for specific cases

To ensure reliable entity recognition, InfoExtractor lets you create user ontology dictionaries for extracting complicated entities that can be critical for business (e.g. complex names like Aditya Prasad Kola, or organizations like Mengniu). A simple procedure described in the SDK’s documentation helps you add missing concepts to existing entity types. These new concepts are automatically covered by the extraction rules, which improves entity recognition in complex cases.

Use custom ontologies for industry solutions

In addition to basic ontologies that enable entity and event extraction for the most common domains, ontologies for specific industries or processes can be customized or developed from scratch on request by ABBYY professional linguistic services.

Work with text regardless of source

ABBYY’s world-famous Optical Character Recognition (OCR) technology is integrated into the InfoExtractor SDK, enabling the analysis of scanned files (in TIFF, JPEG and other image formats) and PDF documents. If large volumes of scanned documents need to be processed, the InfoExtractor SDK can also be seamlessly integrated with ABBYY Recognition Server.