The New Era of Document Processing

Article by Ahmed Zaidi | July 26, 2023

ABSTRACT

Explore the new era of document processing, from OCR to ML model-based solutions. Discover how OCR extracts text and provides its location on a document, while ML model-based processing adds context and meaning. Uncover the challenges of data labelling for ML model-based data extraction and its time-consuming nature. Learn how cutting-edge technologies are revolutionizing document processing.

OCR just extracts text and provides its location on a document.

In the late 1980s and 1990s, companies started using Optical Character Recognition (OCR) to digitize their paper documents. The idea was to optically detect characters on a digital image of a document and produce a computer-readable form of the text that could be used in digital systems. Once text was “read” off the document, various text manipulation techniques were used to determine the context and the “meaning” of the text. In simple terms, OCR technology was no smarter than a first grader who can identify letters of the alphabet.

For most of the 1990s and 2000s, OCR technology made modest advances. Product firms such as ABBYY and Kofax continued to add functionality; however, “reading” text off a document remained a template-based affair. Someone would identify the areas on a document that corresponded to each piece of data to be extracted. If the document format changed, the template needed to be updated. If a new format of a document containing the same data needed to be processed, a new template had to be created. For example, invoices from different vendors may require you to maintain one template per vendor.
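To make the template idea concrete, here is a minimal sketch of zone-based extraction. It assumes the OCR step has already produced words with page coordinates; the `Word` and `Zone` types, the coordinates, and the sample invoice are all hypothetical and do not reflect how any particular OCR product represents its templates.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    x: float  # left edge of the word on the page
    y: float  # top edge of the word on the page

@dataclass
class Zone:
    field: str  # name of the data element this zone captures
    x0: float
    y0: float
    x1: float
    y1: float

def extract_with_template(words, zones):
    """Return {field: text} by collecting the OCR words that fall inside each zone."""
    result = {}
    for zone in zones:
        inside = [
            w for w in words
            if zone.x0 <= w.x <= zone.x1 and zone.y0 <= w.y <= zone.y1
        ]
        # Read in natural order: top to bottom, then left to right.
        inside.sort(key=lambda w: (w.y, w.x))
        result[zone.field] = " ".join(w.text for w in inside)
    return result

# Hypothetical OCR output for one vendor's invoice layout.
words = [
    Word("Invoice", 50, 40), Word("No:", 110, 40), Word("INV-1001", 400, 40),
    Word("Total", 50, 700), Word("$250.00", 400, 700),
]

# Hypothetical template: one zone per field, valid for this vendor only.
vendor_a_template = [
    Zone("invoice_number", 380, 30, 560, 55),
    Zone("total", 380, 690, 560, 715),
]

fields = extract_with_template(words, vendor_a_template)
print(fields)
```

If another vendor prints the invoice number lower on the page, the same zone captures nothing and the field comes back empty, which is why each layout needed its own template.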

ML model-based processing adds context and meaning to text on a document.

In the era of machine learning, document digitization, which had become synonymous with OCR, evolved and took the name “Intelligent Document Processing” (IDP). IDP platforms introduced machine learning models that rely on a data set labelled by humans and combine the text on a document, the position of the text, and the context around it to identify data elements. This was more akin to a fifth grader who can not only read words but also apply some context to those words and ascertain “meaning”.

Data labelling for ML model-based data extraction is time-consuming.

However, the big challenge for machine learning model-based document processing was the amount of time and effort spent labelling data. ML models learn by ingesting data that a human has labelled and determine an output based on that learning. For example, a human may mark a field on 1,000 documents, and from this labelled data the model learns how to identify that field based on text and positional context. Labelling is time-consuming and, if done incorrectly, results in an inaccurate model.

LLM/GPT is a giant leap in intelligent document processing: no more labelling.

Enter Large Language Models (LLMs), often referred to as generative AI or by the name of one model family, GPT. Because LLMs have been pre-trained on a tremendous amount of data, they have contextual knowledge of language and can extract information from text contextually. So when asked a question such as “What is the invoice number on this invoice?”, the model can recognize from the text that the document is an invoice, and its pre-training allows it to respond with the appropriate text from the provided data. The use of LLMs has proven to be a giant leap in document processing. LLMs do not require data labelling or task-specific training for each document type. This significantly reduces the time required to implement document-based automation.
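A question-style extraction can be sketched as follows. This is only an illustration: the prompt wording, the function names, and the `fake_llm` stand-in are assumptions made for the example, and in practice `ask_llm` would wrap whichever hosted model you use.

```python
import json

def build_extraction_prompt(document_text, fields):
    """Ask for the named fields as JSON; no labelled training data is involved."""
    field_list = ", ".join(fields)
    return (
        "You are extracting data from a business document.\n"
        f"Return a JSON object with exactly these keys: {field_list}.\n"
        "Use null for any value that is not present in the document.\n\n"
        f"Document:\n{document_text}"
    )

def extract_fields(document_text, fields, ask_llm):
    """Send the prompt through ask_llm (any callable str -> str) and parse the reply."""
    prompt = build_extraction_prompt(document_text, fields)
    raw = ask_llm(prompt)
    data = json.loads(raw)
    # Keep only the requested keys; missing keys become None.
    return {name: data.get(name) for name in fields}

def fake_llm(prompt):
    # Hypothetical stand-in for a real model call, so the sketch runs offline.
    return '{"invoice_number": "INV-1001", "total": "250.00"}'

invoice_text = "ACME Corp\nInvoice No: INV-1001\nTotal due: $250.00"
result = extract_fields(invoice_text, ["invoice_number", "total"], fake_llm)
print(result)
```

Adding a new field or a new document type here means changing the list of field names, not labelling another batch of documents.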

Data size challenges

There are, however, some challenges with processing documents with LLMs. Most services that host LLMs, such as Azure or OpenAI, limit the amount of data that can be sent in a single request, so processing very large documents can be a challenge. This is usually solved by first retrieving the relevant pages from a document using a contextual search service such as Microsoft's Azure Cognitive Search, which can return the chunks of text that are relevant to the topic. Text from these chunks is then fed to the LLM to extract the required data.
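The chunk-then-retrieve step can be sketched in a few lines. Note that the scoring below is a plain keyword overlap used as a stand-in for a real search service; it is not how Cognitive Search ranks results, and the function names, chunk size, and sample document are assumptions for illustration.

```python
def split_into_chunks(text, max_words=50):
    """Split a long document into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def score(chunk, query):
    """Count how many distinct query terms appear in the chunk (case-insensitive)."""
    chunk_words = set(chunk.lower().split())
    return sum(1 for term in set(query.lower().split()) if term in chunk_words)

def top_chunks(text, query, k=2, max_words=50):
    """Return the k chunks that best match the query, best first."""
    chunks = split_into_chunks(text, max_words)
    ranked = sorted(chunks, key=lambda c: score(c, query), reverse=True)
    return ranked[:k]

# Hypothetical long document: boilerplate surrounding the one passage that matters.
document = (
    "terms and conditions apply " * 30
    + "invoice number INV-1001 total 250.00 "
    + "thank you for your business " * 30
)

relevant = top_chunks(document, "invoice number", k=1)
print(relevant[0])
```

Only the selected chunks are sent to the model, which keeps each request within the service's size limit.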

LLMs/GPT Hallucinate

Keep in mind that, in their current form, LLMs (especially in textual use cases) simply predict the most likely next word given their training and the context. This means that if the context is missing and the prompt does not specifically prevent it, the model may predict a word based on its training rather than the data in the document. This is referred to as hallucination. To reduce hallucinations, most platforms use prompt engineering and supply additional context retrieved from vector databases.
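Alongside prompt engineering, a simple post-check can flag values that do not appear in the source text. This is an additional safeguard offered as an assumption, not a technique the platforms above are known to use, and the function names and sample data are hypothetical. It only catches values that should appear verbatim in the document.

```python
import re

def normalize(text):
    """Collapse whitespace and lowercase so formatting differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()

def ungrounded_fields(extracted, document_text):
    """Return the names of fields whose value cannot be found in the document."""
    doc = normalize(document_text)
    return [
        name for name, value in extracted.items()
        if value is not None and normalize(str(value)) not in doc
    ]

document_text = "Invoice No: INV-1001\nTotal due: $250.00"

# Hypothetical model output: the PO number is not in the document at all.
extracted = {"invoice_number": "INV-1001", "po_number": "PO-7788", "due_date": None}

suspect = ungrounded_fields(extracted, document_text)
print(suspect)
```

Fields listed in `suspect` can be routed to a human reviewer instead of flowing straight into downstream systems.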

Conclusion

Using generative AI technology for document-based use cases can be extremely efficient. This technology marks a sea change for Intelligent Document Processing. As it evolves and becomes more mainstream, we foresee significant advancement in document processing use cases through LLMs. Many platforms, including OpenBots, UiPath, and most recently Snowflake, have already introduced LLM-based document processing capabilities that have shown significant promise.

Ahmed Zaidi

CEO & Managing Partner

Ahmed Zaidi is the Chief Executive Officer of Accelirate. As the CEO, Ahmed is responsible for overall company strategy and execution.
