Mastering Efficient Text Extraction with ABBYY OCR SDK – Smart Document Insight Generation

Mark

ABBYY FineReader Engine

The most comprehensive OCR SDK for software developers

Integrate AI-powered OCR features into your applications

Automated document analysis

Automated document analysis is a key part of the overall document recognition process. To perform this step with high precision, ABBYY FineReader Engine uses a range of advanced AI-based algorithms.

During document analysis, the engine examines the document's logical structure: it identifies the first and last pages and detects formatting elements such as footnotes, headers, footers, and tables of contents.
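
To make the idea of detecting headers and footers concrete, here is a minimal sketch of one simple heuristic: a line that recurs at the top or bottom of most pages is treated as a running header or footer. This is not how ABBYY FineReader Engine implements the step (the source does not describe its algorithms); the function name `find_running_lines`, the `min_share` threshold, and the sample pages are all assumptions made up for illustration.

```python
from collections import Counter


def find_running_lines(pages, position=0, min_share=0.6):
    """Return lines that recur at the same position on most pages.

    pages:     list of pages, each a list of text lines (top to bottom).
    position:  0 checks the first line (header candidate),
               -1 checks the last line (footer candidate).
    min_share: fraction of pages a line must appear on to count as running.
    """
    # Count how often each line occupies the chosen position.
    candidates = Counter(page[position].strip() for page in pages if page)
    threshold = min_share * len(pages)
    return {line for line, count in candidates.items() if line and count >= threshold}


# Made-up sample document: three pages of already extracted text lines.
pages = [
    ["Annual Report", "Introduction", "Body text", "Confidential"],
    ["Annual Report", "Results", "More body text", "Confidential"],
    ["Annual Report", "Outlook", "Closing text", "Confidential"],
]

headers = find_running_lines(pages, position=0)
footers = find_running_lines(pages, position=-1)
```

A production engine would also need to handle page numbers, slight wording differences, and position on the page, which this sketch ignores.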

At the same time, the layout of each page is detected, and the page is divided into individual objects such as text blocks, pictures, tables and table cells, barcodes, and separators. The document analysis algorithms also detect page orientation, identify double pages, detect vertical text, and define page areas that are not relevant to the OCR process.
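
One way to picture the output of layout analysis is as a small data model: a page with a detected orientation and a list of typed, rectangular blocks. The sketch below is an illustrative Python model only; the names `BlockType`, `Block`, and `PageLayout` are hypothetical and do not correspond to the FineReader Engine API.

```python
from dataclasses import dataclass, field
from enum import Enum


class BlockType(Enum):
    # Object kinds named in the text above.
    TEXT = "text"
    PICTURE = "picture"
    TABLE = "table"
    BARCODE = "barcode"
    SEPARATOR = "separator"


@dataclass(frozen=True)
class Block:
    kind: BlockType
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class PageLayout:
    rotation_degrees: int = 0  # detected page orientation
    blocks: list = field(default_factory=list)

    def blocks_of(self, kind):
        """Return all blocks of the given type, in their stored order."""
        return [b for b in self.blocks if b.kind is kind]


# Made-up layout for a single page.
layout = PageLayout(
    rotation_degrees=90,
    blocks=[
        Block(BlockType.TEXT, 50, 40, 550, 300),
        Block(BlockType.PICTURE, 50, 320, 300, 520),
        Block(BlockType.TABLE, 50, 540, 550, 760),
    ],
)
text_blocks = layout.blocks_of(BlockType.TEXT)
```

Keeping the block type separate from the coordinates lets later steps decide per block whether to recognize, preserve, or ignore it.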

As a result, ABBYY FineReader Engine can specify which text areas and fields should be recognized and which page areas, such as images or diagrams, should be kept in their original form. It also obtains information about the logical document structure (including formatting), which is used at the end of the OCR process to reconstruct the document exactly.

The results of this analysis are used to retrieve document structure and layout when documents are processed for further reuse, that is, when they need to be reconstructed exactly. All pictures and diagrams are preserved in their original form, without recognizing the text inside pictures or logos.
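
The split between areas that are recognized and areas that are kept as-is can be sketched as a simple routing step. The block labels, the dictionary layout, and the function `plan_page` below are assumptions for illustration, not part of any ABBYY interface.

```python
# Hypothetical block labels; which kinds fall into which group is an assumption.
RECOGNIZE = {"text", "table"}
PRESERVE = {"picture", "diagram"}


def plan_page(blocks):
    """Sort blocks into those to OCR, those to keep unchanged, and the rest."""
    plan = {"recognize": [], "preserve": [], "skip": []}
    for block in blocks:
        kind = block["kind"]
        if kind in RECOGNIZE:
            plan["recognize"].append(block)
        elif kind in PRESERVE:
            plan["preserve"].append(block)  # kept in original form, no OCR inside
        else:
            plan["skip"].append(block)      # e.g. separators, irrelevant areas
    return plan


# Made-up blocks: "box" is (left, top, right, bottom) in pixels.
blocks = [
    {"kind": "text", "box": (50, 40, 550, 300)},
    {"kind": "picture", "box": (50, 320, 300, 520)},
    {"kind": "separator", "box": (50, 530, 550, 532)},
]
plan = plan_page(blocks)
```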

Manual blocks specification for field-level recognition

Text recognition areas can also be defined manually. In this case, the relevant recognition field is specified directly and automated document analysis is not necessary. During the later recognition step, the recognizer receives the coordinates and properties of the requested fields and applies OCR only to the specified zones.
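
Field-level recognition can be sketched as follows: each manually defined zone is cropped from the page, and only the cropped region is handed to the recognizer. This is a toy model under stated assumptions. The page is a grid of characters standing in for pixels, `toy_recognizer` stands in for a real OCR call, and `Zone`, `crop`, and `recognize_fields` are hypothetical names rather than FineReader Engine functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def crop(page, zone):
    """Cut the zone's rectangle out of the page (a list of equal-length rows)."""
    return [row[zone.left:zone.right] for row in page[zone.top:zone.bottom]]


def recognize_fields(page, zones, recognizer):
    """Run the recognizer on each zone only; the rest of the page is ignored."""
    return {zone.name: recognizer(crop(page, zone)) for zone in zones}


def toy_recognizer(region):
    # Placeholder for a real OCR call: joins the rows and trims whitespace.
    return " ".join(row.strip() for row in region).strip()


# Toy page: characters stand in for pixels; values start at fixed columns.
page = [
    "INVOICE No:".ljust(14) + "4711",
    "Total:".ljust(11) + "99.50",
]
zones = [
    Zone("number", left=14, top=0, right=18, bottom=1),
    Zone("total", left=11, top=1, right=16, bottom=2),
]
fields = recognize_fields(page, zones, toy_recognizer)
```

Because the zones are known in advance, no layout analysis runs at all, which suits fixed forms where field positions do not change between documents.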


  • Title: Mastering Efficient Text Extraction with ABBYY OCR SDK – Smart Document Insight Generation
  • Author: Mark
  • Created at : 2024-08-21 17:42:50
  • Updated at : 2024-08-22 17:42:50
  • Link: https://some-guidance.techidaily.com/mastering-efficient-text-extraction-with-abby-ocr-sdk-smart-document-insight-generation/
  • License: This work is licensed under CC BY-NC-SA 4.0.