The world of automation is confusing enough, and the acronym soup adopted by vendors (AI, ML, OCR, ICR, RPA) makes it more so. Here, we aim to define the terms you’re hearing, explain how they differ and, hopefully, help you decide what’s best for your use case.
Advanced automation technologies have been around for decades, but adoption is growing now because businesses are under pressure to do things faster, cheaper and, ultimately, better. Every company has tedious document-driven processes, such as manual data entry or data extraction, that are begging to be simplified. Examples of the documents involved include invoices, insurance claims, medical forms, and mortgage documents. These processes are costly for your business: your workers struggle with the monotony of the tasks, and paying them to manage tactical processes takes time away from more important work and, in many instances, from their professional growth.
It’s easy to identify the processes that need automation. The tough part is finding the right solution in a sea of automation technologies. The automation industry is intimidating, and it’s easy to get tripped up trying to understand how the technologies differ and which is best suited for you.
What follows is a guide to navigating the automation world. We’ll cover artificial intelligence (AI), robotic process automation (RPA), optical character recognition (OCR), intelligent character recognition (ICR), machine learning (ML) and automated machine learning (AutoML). For each term, we’ll give a definition, explain the benefits and get you on the way to automating the processes that are bogging down your workforce.
AI as a field of computer science has been around for generations, but it has had a rebirth over the last decade as innovators think of cutting-edge ways to apply it. In essence, AI attempts to mimic human intelligence with algorithms. Digital voice assistants like Alexa and Siri are examples of AI that many of us use to simplify our lives as consumers. The applications are endless, and virtuous, too. In healthcare, AI is being used to reduce time spent on operational tasks (extracting information from medical records, for example, which Impira plays a role in) and also in the fight against COVID-19. At MIT, researchers have developed AI that detects asymptomatic COVID-19 from cough recordings made on a cell phone.
Much of the technology we’ll talk about in this post is a subset of AI. The benefits of AI are vast: done well, AI reduces the number of people needed for repetitive tasks. A retailer may currently have hundreds of employees specifically trained to input product information. AI solutions can take over most of that work by training a computer system to automate those processes, freeing up headcount for more strategic work.
RPA falls firmly into the business process automation camp, helping companies automate mundane tasks. RPA tools work by mimicking the actions of human users: they memorize the steps of a particular workflow within a particular UI and then replicate those steps without humans in the mix. This approach is effective for automating static workflows that occur within one environment, such as a payroll application, and deal with highly structured data, like employee records.
RPA is not ideal for anything that’s off script, which includes handling unstructured data or dealing with multiple-choice outcomes. In order to automate the processing of richer unstructured data and reap the tremendous efficiency gains that come from doing so, enterprises must look beyond RPA.
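To illustrate what "off script" means in practice, here is a minimal sketch of a recorded workflow replayed as code. The field names and the payroll calculation are hypothetical, chosen only for illustration; real RPA tools drive a UI rather than a dictionary, but the failure mode is similar.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """A highly structured input: every field is always present and typed."""
    employee_id: str
    hours: float
    hourly_rate: float


def run_payroll_script(record: dict) -> dict:
    """Replay a fixed sequence of steps, the way a recorded workflow would.

    The field names and steps are hypothetical. The point is that the script
    only knows the exact path it was recorded on.
    """
    # Step 1: read the fields from the exact places the script expects them.
    parsed = EmployeeRecord(
        employee_id=record["employee_id"],
        hours=float(record["hours"]),
        hourly_rate=float(record["hourly_rate"]),
    )
    # Step 2: compute gross pay and hand it to the target form.
    return {
        "employee_id": parsed.employee_id,
        "gross_pay": round(parsed.hours * parsed.hourly_rate, 2),
    }


# On script: the input matches what the workflow was recorded against.
on_script = run_payroll_script(
    {"employee_id": "E-1001", "hours": "40", "hourly_rate": "25.50"}
)

# Off script: free text that a person could read, but the script cannot.
try:
    run_payroll_script({"note": "Jane worked forty hours at $25.50/hr"})
    off_script_error = None
except KeyError as exc:
    off_script_error = f"missing field: {exc}"
```

The structured record is processed without trouble, while the free-text note fails at the first step because nothing in the script knows how to interpret it.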
When coupled with some of the intelligent applications we discuss below, RPA can be part of a solution for more complicated tasks, but on its own it is very limited as an intelligent automation technology.
Using OCR technology, users can convert pixels into characters of text, along with additional metadata including geometry, size, and page number. The technology has been around since the 1970s, with legacy companies like Xerox, Oracle, and Kofax among the early entrants. Many of the first eCommerce sites were built on the power of OCR services.
At its core, OCR remains important, but on its own it doesn’t scale to meet more robust needs. Extracting usable structured data from documents involves multiple steps, and it’s often assumed that OCR addresses all of them. It doesn’t. OCR turns an image into text, but it does not address text extraction (pulling the specific values you need out of that text) or post-processing, the step that eliminates wrong outputs and ensures better-quality results.
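To make the distinction between these steps concrete, here is a small sketch. The `OcrWord` structure, the sample words, and the cleanup rules are all assumptions made for illustration; they are not the output format of any particular OCR engine.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OcrWord:
    """One recognized word plus the kind of metadata OCR returns."""
    text: str
    page: int
    x: float       # left edge of the bounding box
    y: float       # top edge of the bounding box
    width: float
    height: float


# Hypothetical OCR output for one line of an invoice. Note the misread
# character: the letter "O" where the digit "0" was printed.
ocr_words = [
    OcrWord("Invoice", 1, 40, 100, 60, 12),
    OcrWord("Total:", 1, 105, 100, 45, 12),
    OcrWord("$1,2O4.50", 1, 155, 100, 70, 12),
]


def extract_total(words: List[OcrWord]) -> Optional[str]:
    """Extraction step: find the word that follows the 'Total:' label."""
    for label, value in zip(words, words[1:]):
        if label.text.rstrip(":").lower() == "total":
            return value.text
    return None


def clean_amount(raw: str) -> Optional[float]:
    """Post-processing step: repair common misreads and validate the result."""
    fixed = raw.replace("O", "0").replace("l", "1")
    fixed = fixed.replace("$", "").replace(",", "")
    if not re.fullmatch(r"\d+(\.\d{2})?", fixed):
        return None  # reject anything that still does not look like an amount
    return float(fixed)


raw_total = extract_total(ocr_words)  # the value as OCR recognized it
total = clean_amount(raw_total) if raw_total else None
```

OCR alone produces only the list of words. Locating the total and turning `"$1,2O4.50"` into a usable number are separate steps that something else has to perform.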
OCR is ideal for use cases that are document heavy and rely on unstructured formats like PDFs, images, and other files whose text cannot be edited digitally. You can quickly convert those files into editable documents for your teams. It’s a terrific technology for letting you edit and search what previously may have been “frozen” files, those that you couldn’t access or edit.
Using OCR makes editing easier and, more importantly, it reduces human error and is a first step toward eliminating the monotonous tasks we spoke of earlier. However, in addition to not addressing the extraction and post-processing steps noted above, it is most effective with printed text, which means handwritten text must be handled by other technology.
One other thing to consider: the quality of the source image is critical. The quality of your output is completely dependent on the quality of the initial image, and quality can be diminished during the OCR process. Keep this in mind as we transition to ICR.
ICR, sometimes called Intelligent OCR, takes the next step by combining artificial intelligence (AI) with OCR, which enables users to incorporate handwritten text and move beyond static text documents. As you can imagine, this comes in handy in any use case that involves a lot of notes: healthcare, financial services, and insurance come to mind. It uses AI to extract data from documents such as invoices, sales and purchase orders, and shipping receipts, and stores that data in an enterprise content management (ECM) system.
While it improves on the capabilities of OCR, ICR works best when a human is in the loop to reach and maintain a target accuracy threshold. With a human in the loop, ICR can continually learn and improve, increasing the accuracy of your data.
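One common way to put a human in the loop is to route low-confidence predictions to a reviewer and keep the corrections as future training examples. The sketch below assumes the recognition system reports a confidence score per field; the `Prediction` structure, the 0.90 cutoff, and the `reviewer` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Prediction:
    field_name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the recognition model


REVIEW_THRESHOLD = 0.90  # assumed cutoff; tune it to your accuracy target


def route(
    predictions: List[Prediction],
    ask_human: Callable[[Prediction], str],
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Accept confident predictions and send the rest to a reviewer."""
    accepted: Dict[str, str] = {}
    corrections: List[Tuple[str, str]] = []
    for p in predictions:
        if p.confidence >= REVIEW_THRESHOLD:
            accepted[p.field_name] = p.value
        else:
            corrected = ask_human(p)
            accepted[p.field_name] = corrected
            # Keep (model output, human answer) pairs so the system can
            # learn from them later.
            corrections.append((p.value, corrected))
    return accepted, corrections


predictions = [
    Prediction("patient_name", "Dana Smith", 0.97),
    Prediction("dosage", "5O mg", 0.62),  # handwritten and misread
]


def reviewer(p: Prediction) -> str:
    """Stand-in for a person correcting the value in a review screen."""
    return "50 mg"


accepted, corrections = route(predictions, reviewer)
```

Raising the threshold sends more fields to people and improves accuracy at the cost of review time; lowering it does the opposite.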
ICR systems can be costly but, if your business depends on handwritten data, ICR may be the best fit for you.
Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves. ICR, for instance, is an application of machine learning because it can improve based on what it learns.
Machine learning applications thrive when humans are in the loop. This enables you to train the system so it becomes smarter and more accurate. It’s also the technology we as consumers use most frequently: if you speak to your Alexa, you’re familiar with machine learning. If you’re using Google Maps, you’re familiar with machine learning.
Doing machine learning well takes people, money, and time, and even with all of those resources, you can’t guarantee success. Where we find companies fail is in not thoroughly scoping the business problem they are solving. One question to ask yourself when adopting machine learning for a specific process: could a human do it? If not, a computer probably cannot either.
Machine learning (ML) is a disruptive concept: you can train a computer to solve a problem by showing it a few examples. ML opens up many problems that have historically been very difficult to solve with software, like extracting valuable data from business documents (forms, invoices, etc.). Unfortunately, entering 2021, machine learning is still very difficult to use. You have to collect samples into a structured data format, train models explicitly, validate their results, deploy them as an API, and repeat the whole process as you encounter new examples that are not modeled correctly.
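The cycle described above (collect, train, validate, deploy, repeat) can be sketched in a few lines. The word-count classifier here is a deliberately simple stand-in for a real model, and the sample documents and labels are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (document text, label)


def train(samples: List[Sample]) -> Dict[str, Counter]:
    """Train: count how often each word appears under each label."""
    model: Dict[str, Counter] = defaultdict(Counter)
    for text, label in samples:
        model[label].update(text.lower().split())
    return dict(model)


def predict(model: Dict[str, Counter], text: str) -> str:
    """Predict the label whose training words best overlap the text.

    In a real deployment this function would sit behind an API.
    """
    words = text.lower().split()
    scores = {
        label: sum(counts[w] for w in words) for label, counts in model.items()
    }
    return max(scores, key=scores.get)


def validate(model: Dict[str, Counter], samples: List[Sample]) -> float:
    """Validate: fraction of held-out samples labeled correctly."""
    correct = sum(predict(model, text) == label for text, label in samples)
    return correct / len(samples)


# 1. Collect samples into a structured format.
training = [
    ("invoice number total amount due", "invoice"),
    ("invoice payment terms net 30 total", "invoice"),
    ("claim number policy holder incident date", "claim"),
    ("insurance claim damage report policy", "claim"),
]
holdout = [
    ("total amount due on invoice", "invoice"),
    ("policy holder filed a claim", "claim"),
]

# 2. Train, then 3. validate on data the model has not seen.
model = train(training)
accuracy = validate(model, holdout)

# 4. A new kind of document shows up and is modeled incorrectly.
new_doc = "loss statement total for adjuster"
first_guess = predict(model, new_doc)  # the word "total" pulls it to "invoice"

# 5. Repeat: a person labels the miss, and the model is retrained.
training.append((new_doc, "claim"))
model = train(training)
second_guess = predict(model, "adjuster loss statement")
```

Even in this toy version, every new kind of document means another round of labeling, retraining, and revalidating, which is the maintenance burden the text describes.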
Maintaining machine learning models can feel as repetitive and time-consuming as the tasks they’re meant to automate away, not to mention expensive and dependent on a rare skill set. The process of automating the machine learning development cycle is called AutoML. AutoML is nascent but, in theory, exposes the power of machine learning to a much broader audience.
Instead of requiring you to prepare for every possible scenario up front, as traditional machine learning does, AutoML learns from your own documents. As a result, you get all the benefits of an AI-driven approach over any set of documents you want to work with. When executed well, AutoML provides a user experience that stands out for its simplicity. You don’t need a computer engineering or data science degree: AutoML platforms can be operated by business users. In fact, they put an emphasis on subject matter experts who have an innate understanding of the process being automated and are capable of teaching the system.
An AutoML-based approach provides the benefits of OCR, ICR, and ML, learning the nuances of your documents continuously. And unlike pure RPA, AutoML can go "off script" and learn on the fly with your users. Impira AutoML can be used on its own or in tandem with an RPA solution to fully automate complex processes.
Impira AutoML works with as few as one example, which is often enough for structured forms. The more data you provide, the more accurate the results, so Impira will prompt you for input when it’s not confident enough in its predictions. This is where those subject matter experts are critical for continuous learning.
The platform also offers a number of tools to help you post-process data. It comes with a rich query language, which you can use to join data across documents, split or transform fields, and even aggregate across documents. We find that these tools save our users from writing code to further normalize the data in their documents. Finally, everything is accessible through our UI and a simple REST API, which you can use to upload files and query extracted data.
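For a sense of the hand-written normalization code that such post-processing tools are meant to replace, here is a plain Python sketch of the three operations mentioned: splitting a field, joining across documents, and aggregating. This is not Impira's query language or API; the records and field names are hypothetical.

```python
from collections import defaultdict
from typing import Tuple

# Hypothetical values already extracted from two sets of documents.
invoices = [
    {"invoice_id": "INV-1", "vendor": "Acme Corp", "total": "1,200.00 USD"},
    {"invoice_id": "INV-2", "vendor": "Acme Corp", "total": "300.50 USD"},
    {"invoice_id": "INV-3", "vendor": "Globex", "total": "99.00 USD"},
]
purchase_orders = [
    {"invoice_id": "INV-1", "po_number": "PO-77"},
    {"invoice_id": "INV-3", "po_number": "PO-81"},
]


def split_total(raw: str) -> Tuple[float, str]:
    """Split and transform: '1,200.00 USD' becomes (1200.0, 'USD')."""
    amount, currency = raw.split()
    return float(amount.replace(",", "")), currency


# Join: look up each invoice's purchase order by its shared invoice ID.
po_by_invoice = {po["invoice_id"]: po["po_number"] for po in purchase_orders}

rows = []
for inv in invoices:
    amount, currency = split_total(inv["total"])
    rows.append({
        "invoice_id": inv["invoice_id"],
        "vendor": inv["vendor"],
        "amount": amount,
        "currency": currency,
        "po_number": po_by_invoice.get(inv["invoice_id"]),  # None if no match
    })

# Aggregate: total spend per vendor across all invoices.
spend_by_vendor = defaultdict(float)
for row in rows:
    spend_by_vendor[row["vendor"]] += row["amount"]
```

Each of these steps is short on its own, but maintaining them across many document types is the kind of glue code a built-in query layer can take off your hands.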
The vision to bring machine learning to the masses is what drives us at Impira. If you want to learn more about our AutoML platform, contact us.