RPA and Intelligent Automation for Optical Character Recognition Based Business Processes
Many organizations today have well-established OCR processes in which accuracy rates are determined by the OCR tool set deployed, and standard operating procedures are defined for the sorters, scanners, verifiers and data-entry operators.
Our purpose here is to address how an organization can improve the efficiency and accuracy of these business processes, and/or reduce their cost, by using a combination of OCR, RPA and AI technologies.
Document processing involves the following activities:
- Identify: determine the type of the document (image, machine-readable text form, handwritten document scanned into the system, etc.)
- Classify: based on the identification, classify the document into a recognizable category such as invoice, trade bill or time sheet
- Read: run character recognition on the document
- Interpret: derive conclusions based on the text recognized on the document
- Act/Assimilate: perform actions based on the conclusions, such as setting up reminders, sending notifications or storing data in a structured format
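The five activities above can be read as stages of a pipeline. The following is a minimal sketch under stated assumptions, not a production design: the `Document` class, the file-suffix and file-name rules, and the `Total:` pattern are all hypothetical, and the Read step simply copies the payload where a real system would call an OCR engine.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Document:
    name: str
    payload: str                 # stand-in for scanned content; plain text in this sketch
    kind: str = ""               # set by identify()
    doc_class: str = ""          # set by classify()
    text: str = ""               # set by read()
    fields: dict = field(default_factory=dict)   # set by interpret()
    actions: list = field(default_factory=list)  # set by act()

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".tif", ".tiff")

def identify(doc):
    # Identify: image vs machine-readable text, judged by file suffix (assumption).
    doc.kind = "image" if doc.name.lower().endswith(IMAGE_SUFFIXES) else "machine_text"
    return doc

def classify(doc):
    # Classify: hypothetical keyword rule on the file name.
    name = doc.name.lower()
    doc.doc_class = next((c for c in ("invoice", "timesheet") if c in name), "unknown")
    return doc

def read(doc):
    # Read: a real system would call an OCR engine here; the sketch copies the payload.
    doc.text = doc.payload
    return doc

def interpret(doc):
    # Interpret: pull a total amount out of the recognized text (hypothetical pattern).
    match = re.search(r"Total:\s*([\d,]+\.\d{2})", doc.text)
    if match:
        doc.fields["total"] = float(match.group(1).replace(",", ""))
    return doc

def act(doc):
    # Act/Assimilate: record what would be done; the sketch has no side effects.
    if "total" in doc.fields:
        doc.actions.append(("store", doc.doc_class, doc.fields["total"]))
    else:
        doc.actions.append(("notify_verifier", doc.name))
    return doc

def process(doc):
    # Run the five stages in the order listed above.
    for step in (identify, classify, read, interpret, act):
        doc = step(doc)
    return doc

result = process(Document("invoice_0042.png", "Invoice 42\nTotal: 1,250.00"))
print(result.kind, result.doc_class, result.actions)
```

Keeping each activity as a separate function means an individual stage, for example the stubbed Read step, can be swapped for a real tool without touching the others.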
The document sets that require OCR vary, but these are the ones we have seen most commonly:
- Letters from customers
- Tax Forms
- W-2 Forms
- Legal Bills of Cost
- PositivePay (duplicate data entry)
- Medical Coding/Transcription
- Claim Processing
- KnowledgeBase management
- Policy underwriting and issuance
- Rating Generators
Broadly, we classify these documents into the following forms:
- Pre-defined standard formats: These are typically compliance and regulatory documents such as tax forms and W-2 forms. Most OCR tools on the market read these efficiently, at 98-99% accuracy. Digital Process Automation (RPA/IA) tools can read, classify and store them in structured data systems seamlessly. These processes are usually highly repeatable and easy to automate with off-the-shelf process automation tools.
- Semi-Structured Enhanced-automation (Natural Language Processing): These documents contain free text in the form of paragraphs and bullet points with data embedded in them. Legal bills of cost and medical coding typically fall into this category. For these documents, most OCR tools extract the text, but some deductive reasoning over the extracted text is needed before it can be stored in structured systems. OCR tools like Ephesoft, Adobe Acrobat DC and Documentum are quite capable of doing these translations. RPA tools can interact with the APIs these OCR tools provide and absorb the data into the system. These tools are fairly good with pre-defined data and/or enhanced automation. For example, when we processed a letter of explanation from a first-time home buyer explaining their source of funds, we found the accuracy to be 80-85%. We recommend testing a tool's community edition against your specific needs before buying it.
- Non-Structured Documents: If you need to read documents whose context and format are not known in advance, or that contain regional slang, abbreviated words, short texts or even hashtags, we recommend developing a solution that combines an OCR tool, a Natural Language Processing (NLP) engine and RPA. The core of such a solution is quick to build and gives a reasonable assimilation of data, with accuracy rates from 75% to 90% depending on the source. For example, Arabic syntax varies between the United Arab Emirates and Kuwait, which kept the accuracy rate low at 78%. English text from countries such as Australia, England and the United States read at a higher 88%, and English from countries such as India and South Africa read at 85%.
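The three forms above map naturally onto a lookup from document form to tool chain. The sketch below is illustrative only: the `DocForm` names and step labels are hypothetical, the accuracy ranges are the ones quoted above, and the review estimate assumes the lower accuracy bound applies per document, which the figures above do not specify.

```python
from enum import Enum

class DocForm(Enum):
    STANDARD = "pre-defined standard format"
    SEMI_STRUCTURED = "semi-structured"
    NON_STRUCTURED = "non-structured"

# Tool chain per form, plus the accuracy range (percent) quoted in the text.
# Step labels are hypothetical names, not product features.
PLANS = {
    DocForm.STANDARD: (["ocr", "rpa"], (98, 99)),
    DocForm.SEMI_STRUCTURED: (["ocr", "extraction_rules", "rpa"], (80, 85)),
    DocForm.NON_STRUCTURED: (["ocr", "nlp_engine", "rpa"], (75, 90)),
}

def tool_chain(form):
    """Return the ordered processing steps for a document form."""
    return PLANS[form][0]

def worst_case_reviews(form, batch_size):
    """Documents a verifier may need to correct in a batch, assuming the
    lower accuracy bound applies per document (an assumption)."""
    low_pct, _high_pct = PLANS[form][1]
    return batch_size * (100 - low_pct) // 100

for form in DocForm:
    print(form.value, tool_chain(form), worst_case_reviews(form, 1000))
```

An estimate like this is mainly useful for sizing the verifier workload before committing to a tool for a given document form.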
Intelligent Automation (IA) also comes into play where the source of data varies, and the solution must learn the extraction technique from varied sources in varied forms and make an educated judgement. For example, say a mortgage lender or brokerage company wishes to automate the process of collating and consolidating the documentation required for underwriting. Typically, these companies have a document storage system that holds all the documents provided by a borrower, but reading these documents and ensuring that all sources of funds are well understood and documented requires a human. This is changing with intelligent automation solutions, which use a combination of OCR tools, NLP engines, process automation and machine learning. A typical solution for the above problem statement could be outlined as follows:
- First, go through the attached documents and identify the bank statements (typically the last 90 days are needed)
- Verify that these cover all the bank accounts listed on the Application Form
- Next, identify all sources of funds (deposits into these accounts and who the sender/depositor was)
- The typical sources of funds seen on a bank statement are salary ACH deposits, dividend income from investments and stock transactions
- A good intelligent automation solution should be able to correlate the ACH deposits with the W-2 forms submitted by the borrower and validate that the amounts and frequency match
- Next, the dividend income in the deposits should be correlated with the investments declared, and the amounts checked against broker statements
- Similarly, credit reports and IRS tax transcripts can be read into the system automatically
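The step that correlates ACH salary deposits with a W-2 can be sketched as follows. This is a rough illustration under stated assumptions: the `Deposit` class, the keyword match on the description and the ratio bounds are all hypothetical. Deposits are net pay while W-2 wages are gross, so the check accepts a range rather than an exact match.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from statistics import median

@dataclass
class Deposit:
    posted: date
    amount: float
    description: str

def salary_deposits(deposits, employer_keyword):
    """Keep ACH deposits whose description mentions the employer, oldest first."""
    key = employer_keyword.lower()
    matches = [d for d in deposits
               if "ach" in d.description.lower() and key in d.description.lower()]
    return sorted(matches, key=lambda d: d.posted)

def pay_interval_days(salary):
    """Median gap in days between consecutive salary deposits, or None."""
    gaps = [(b.posted - a.posted).days for a, b in zip(salary, salary[1:])]
    return median(gaps) if gaps else None

def consistent_with_w2(salary, w2_wages, min_ratio=0.55, max_ratio=1.0):
    """True if annualized net deposits fall within an assumed share of W-2 wages.
    The ratio bounds are illustrative assumptions, not underwriting rules."""
    interval = pay_interval_days(salary)
    if not interval or w2_wages <= 0:
        return False
    annualized_net = median(d.amount for d in salary) * 365 / interval
    return min_ratio <= annualized_net / w2_wages <= max_ratio

# Hypothetical statement: six biweekly payroll deposits plus one dividend.
start = date(2024, 1, 5)
statement = [Deposit(start + timedelta(days=14 * i), 2000.0, "ACH DEPOSIT ACME CORP PAYROLL")
             for i in range(6)]
statement.append(Deposit(date(2024, 2, 10), 150.0, "DIVIDEND BROKERAGE"))

salary = salary_deposits(statement, "acme corp")
print(pay_interval_days(salary), consistent_with_w2(salary, 70000.0))
```

A failed check would not reject the application by itself; in the process described above it would more plausibly flag the file for a human underwriter.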
This solution enables the company to bring its underwriting cycle down from days to hours, or even minutes, once the necessary documentation is available in the system. Such machine learning can be implemented using tools like AlchemyAPI, Aylien, Fluxifi, Textalytics and the like. These can also be used for forensic analysis of data to identify or extract machine-readable data.
Such operational automation requires machine learning and the ability to work with varied, undefined sources, a capability you can build within your organization over time. Below is the depiction of activities to solutions: