Mastering Efficient Text Extraction with ABBYY OCR SDK – Smart Document Insight Generation
ABBYY FineReader Engine
The most comprehensive OCR SDK for software developers
Integrate AI-powered OCR features into your applications
Automated document analysis
The automated document analysis step is a key part of the overall document recognition process. To perform this step with high precision, ABBYY FineReader Engine uses a range of advanced algorithms based on artificial intelligence.
During the document analysis step, the logical structure of the document is analyzed: the first and last pages of the document are identified, and formatting elements such as footnotes, headers, footers, and the table of contents are detected.
At the same time, the layout of each individual page is detected, and each page is divided into individual objects such as text blocks, pictures, tables and table cells, barcodes, and separators. Additionally, the document analysis algorithms detect page orientation, identify double pages, detect vertical text, and define page areas that are not relevant for the OCR process.
As a result, ABBYY FineReader Engine can specify which text areas and fields should be recognized and which page areas, such as images or diagrams, should be kept in their original form. At the same time, it receives information about the logical document structure (including its formatting), which is used at the end of the OCR process, when the document is exactly reconstructed.
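The split between areas to recognize and areas to keep as-is can be pictured with a small data model. The sketch below is not the ABBYY FineReader Engine API: the `Block` and `BlockType` types, the `split_blocks` helper, and the choice of which block types count as "recognized" are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass
from enum import Enum


class BlockType(Enum):
    # Object kinds named in the text; the enum itself is hypothetical.
    TEXT = "text"
    TABLE = "table"
    PICTURE = "picture"
    BARCODE = "barcode"
    SEPARATOR = "separator"


@dataclass(frozen=True)
class Block:
    block_type: BlockType
    left: int
    top: int
    right: int
    bottom: int


# Assumption for this sketch: these block types are passed to recognition,
# everything else is kept in its original form.
RECOGNIZED_TYPES = {BlockType.TEXT, BlockType.TABLE, BlockType.BARCODE}


def split_blocks(blocks):
    """Partition analyzed blocks into (to_recognize, to_preserve)."""
    to_recognize = [b for b in blocks if b.block_type in RECOGNIZED_TYPES]
    to_preserve = [b for b in blocks if b.block_type not in RECOGNIZED_TYPES]
    return to_recognize, to_preserve


# Made-up analysis result for one page.
page_blocks = [
    Block(BlockType.TEXT, 50, 40, 550, 300),
    Block(BlockType.PICTURE, 50, 320, 300, 500),
    Block(BlockType.TABLE, 320, 320, 550, 500),
]
to_recognize, to_preserve = split_blocks(page_blocks)
```

Here the text and table blocks end up in `to_recognize`, while the picture block lands in `to_preserve` and would be carried into the reconstructed document unchanged.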
The results of this analysis are used to retrieve the document structure and layout when documents are processed for further reuse, which means that the documents need to be exactly reconstructed. All pictures and diagrams are preserved in their original form, without recognizing the text inside pictures or logos.
Manual blocks specification for field-level recognition
The text recognition areas can also be set up manually. In this case, the relevant recognition field is defined directly, and automated document analysis is not necessary. During the later recognition step, the recognizer receives the coordinates and properties of the requested fields and applies OCR only to the specified zones.
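Field-level recognition with manually defined zones can be sketched as follows. This is an illustration under stated assumptions, not ABBYY code: the `Zone` type, the `crop` and `recognize_zones` helpers, and the toy recognizer are hypothetical, and the "page" is a tiny grid of characters standing in for pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    # Hypothetical field definition: a name plus pixel coordinates.
    name: str
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def crop(image, zone):
    """Return the part of a 2D pixel grid that falls inside the zone."""
    return [row[zone.left:zone.right] for row in image[zone.top:zone.bottom]]


def recognize_zones(image, zones, recognizer):
    """Run the recognizer on each zone only; the rest of the page is never read."""
    return {z.name: recognizer(crop(image, z)) for z in zones}


def toy_recognizer(pixels):
    # Stand-in for a real OCR call: each "pixel" is already a character.
    return "".join("".join(row) for row in pixels)


# Toy 4x6 page; dots mark areas outside any requested field.
page = [
    list("AB...."),
    list("CD...."),
    list("....XY"),
    list("....ZW"),
]

zones = [Zone("top_left", 0, 0, 2, 2), Zone("bottom_right", 4, 2, 6, 4)]
fields = recognize_zones(page, zones, toy_recognizer)
```

Because the zones are supplied up front, no layout analysis runs in this sketch; the recognizer sees only the cropped regions and returns one result per named field.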
- Title: Mastering Efficient Text Extraction with ABBYY OCR SDK – Smart Document Insight Generation
- Author: Mark
- Created at : 2024-08-21 17:42:50
- Updated at : 2024-08-22 17:42:50
- Link: https://some-guidance.techidaily.com/mastering-efficient-text-extraction-with-abby-ocr-sdk-smart-document-insight-generation/
- License: This work is licensed under CC BY-NC-SA 4.0.