Flow chart of the steps from creating a collection and uploading files to applying OCR and custom ML models for data extraction through Impira AutoML

OCR in under two minutes with Impira AutoML

For document-heavy industries (and what industries aren’t?), OCR is no longer enough. Impira AutoML automates manual data entry tasks in minutes.


Optical character recognition (OCR) technologies are still great at identifying characters in PDFs, images, and handwritten notes, but they cannot succeed alone: they rely on other technologies to add the structure that makes this data usable. Today, pairing OCR with automated machine learning (AutoML) saves time and resources. AutoML allows anyone (your accountant, insurance agent, IT specialist, or healthcare receptionist) to set up and modify a unique machine learning (ML) model for a set of documents without writing a single line of code. With Impira AutoML, you train your machine learning model simply by highlighting the text you want to extract. To make a change or improve the accuracy of your model, all you need to do is confirm your data extraction matches within the platform. Each confirmation actively trains and improves the OCR-powered machine learning model in the background.

Eliminate time-consuming data extraction in three steps

With Impira AutoML, subject matter experts can set up their own automation workflow with little to no expertise in the technologies that power Impira.

Step 1: Create a new collection

A collection is the main feature within Impira for organizing and grouping files together. Each collection has its own dedicated ML model, and that model is customized to extract data only from the files within that collection.

Screenshot of how to add a collection. A collection is like a folder that groups your files together so that unique OCR and ML models can run on them.

Step 2: Upload your files

Upload your files to your new collection, and Impira will immediately begin the initial OCR processing. This process will detect words and letters embedded within your files. Once it completes, you’ll see that each file has its own row in the table. We recommend starting off by adding at least five similar files.

Screenshot of Impira's drag and drop upload functionality. Upload files that you wish to extract data from with OCR and Impira AutoML.

Step 3: Train a machine learning model to understand your files with Impira AutoML

The beauty of AutoML is that you can teach and retrain an ML model to continually process your files through a simple user interface, without knowing how to code. Simply open a file, highlight the text you want to extract, and watch the machine learning model pull out the same data from the rest of your files. Each additional data extraction match you confirm retrains the ML model and improves its accuracy.
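For readers curious about what "highlight, extract, confirm" could look like under the hood, here is a deliberately tiny Python sketch. It is not Impira's implementation or API: the `FieldExtractor` class, its `train` and `extract` methods, and the sample invoice text are all hypothetical, and the "model" is nothing more than a remembered label that appears before the highlighted value. It only illustrates the loop of learning from one highlighted example, applying that to other files, and improving when a correction is supplied.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class FieldExtractor:
    """Toy, hypothetical stand-in for a per-collection model of one field."""
    name: str
    # Label text seen immediately before highlighted values, e.g. "invoice number".
    anchors: Set[str] = field(default_factory=set)

    def train(self, text: str, highlighted: str) -> None:
        """Learn the label that precedes the highlighted value on its line."""
        for line in text.splitlines():
            idx = line.find(highlighted)
            if idx > 0:
                anchor = line[:idx].strip().rstrip(":").strip().lower()
                if anchor:
                    self.anchors.add(anchor)
                return

    def extract(self, text: str) -> Optional[str]:
        """Return the text that follows a known label, or None if none matches."""
        for line in text.splitlines():
            lowered = line.lower()
            for anchor in self.anchors:
                pos = lowered.find(anchor)
                if pos != -1:
                    value = line[pos + len(anchor):].lstrip(" :\t").strip()
                    if value:
                        return value
        return None


# Assumed sample OCR output for three files in one collection.
doc1 = "Invoice Number: INV-1001\nTotal Due: $250.00"
doc2 = "Invoice Number: INV-1002\nTotal Due: $99.50"
doc3 = "Invoice No. INV-1003\nTotal Due: $10.00"

extractor = FieldExtractor("invoice_number")
extractor.train(doc1, "INV-1001")        # the user highlights a value once
print(extractor.extract(doc2))           # INV-1002
print(extractor.extract(doc3))           # None: this layout has not been seen yet
extractor.train(doc3, "INV-1003")        # the user corrects the miss
print(extractor.extract(doc3))           # INV-1003
```

A real AutoML system would learn from layout, position, and many more signals than a text label, but the feedback loop has the same shape: every highlight or confirmation becomes another training example.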

That’s it, you’re done. It’s really that simple: Create a collection, upload, and highlight. 

We have more detailed how-to guides that can walk you through a hands-on demo. If you’re in the insurance industry, ACORD forms are likely the bane of your existence. Read our how-to guide on automating ACORD forms.

Unlock the data stuck in your PDFs, images, or other files today.

Try Impira AutoML's easy-to-use interface to set up custom ML models for all your data extraction needs.