Empowering users in machine learning interaction

Consumer and enterprise software has evolved to become highly intuitive and user-focused. Machine learning, however, has mostly operated outside the users' control, only vaguely influenced through implicit feedback. At Impira, we are working on bringing explicit machine learning training into the hands of the user with our new product, Mango Beta.

In consumer applications where machine learning is most prevalent, the technology exists mostly as a black box working behind the scenes. Machine learning controls major elements of the application that are typically outside of the end-user's control, like the sequence of posts on an Instagram feed, the ranking of Google's search results, and the automated responses from Lyft's or Uber’s customer support bots.

These consumer machine learning pipelines rely on implicit feedback from their users to operate. Each click, back button, conversion, or close of an app in frustration is noted and logged as one noisy signal among millions of others. These observations are pooled across users, anonymized, and then distilled into massive data pipelines to provide the training data required to continuously hone the highly complex models that power these features.
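As a rough illustration of how implicit signals can be pooled, the sketch below aggregates logged events into a per-item click rate and drops user identity along the way. The event fields (`user`, `item`, `action`) and the function name `pool_implicit_feedback` are hypothetical, chosen only for this example; real pipelines are far larger and noisier.

```python
from collections import defaultdict

# Hypothetical event log: each entry is one implicit signal from one user.
events = [
    {"user": "u1", "item": "post_a", "action": "click"},
    {"user": "u1", "item": "post_b", "action": "skip"},
    {"user": "u2", "item": "post_a", "action": "click"},
    {"user": "u2", "item": "post_b", "action": "click"},
    {"user": "u3", "item": "post_a", "action": "skip"},
]


def pool_implicit_feedback(events):
    """Pool events across users into a click rate per item.

    The user field is never read, so the result carries no user identity.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for event in events:
        shown[event["item"]] += 1
        if event["action"] == "click":
            clicked[event["item"]] += 1
    return {item: clicked[item] / shown[item] for item in shown}


click_rates = pool_implicit_feedback(events)
# Rank items by pooled click rate, highest first.
ranking = sorted(click_rates, key=click_rates.get, reverse=True)
```

No individual user chose this ranking; it emerges only from the aggregate, which is why the influence of any one person stays vague.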

What about a more explicit process to allow a user to intentionally train a machine learning model?

To the extent that there are products to control and shape your own models, these have largely come in the form of developer tools. Google AutoML and Amazon SageMaker Autopilot are examples of platforms that allow developers to provide custom training data and train black-box models that can be used for future prediction.

However, these offerings suffer from two major issues that limit their usefulness in real applications. The first (more obvious) issue is that they are only accessible to technical users. The second, more subtle issue is that they are explicitly designed for separate, sequential stages: training, evaluation, and finally execution in production. In other words, the simplest way to integrate this type of tool is to train a model manually with an initial dataset, then evaluate the model's performance against initial test data, and finally integrate the prediction API into a production application. This process, however, discourages an ongoing feedback loop. Without one, the trained model will fail to handle new cases or improve over time, eventually suffering from the same problem as the generic pre-trained image tagging APIs mentioned in our earlier blog post, Feedback Loops in Machine Learning. As a result, capturing ongoing feedback becomes a burden on the technical user, who must dedicate effort to engineering an application interface for collecting feedback and a pipeline for retraining and redeploying an updated model.

At Impira, we’re working on a very different approach, with the goal of bringing interactive machine learning training into the non-technical user journey. We’ve started with a constrained use case to unlock the data captured in PDF and image content, now live in our new Mango Beta product. To accomplish this goal, we have tackled these two major issues head-on:

1) Building an accessible and digestible interface for non-technical users

Making custom machine learning models accessible to both technical and non-technical users requires offering those models in a form that is easy to grasp. Technical users are comfortable providing data in an exacting format through APIs or command-line tools. They are also accustomed to evaluating and picking the right technical tool for each job (such as choosing whether to use Python or C++ as a programming language for a given project), and so can do the work to evaluate machine learning models against non-ML solutions like manually coded heuristics. For non-technical business users, on the other hand, the high-level goal is almost always the automation of a complex or painful business workflow. They want to stay focused on the workflow problem, and whether that automation gets accomplished through machine learning or explicit programming is beside the point.

This led us to draw inspiration from Microsoft Excel (and now Google Sheets), a simple yet elegant tool that successfully empowers non-technical users to perform all kinds of programming-like data transformations. Interactions within Excel feel intuitive for the user, and more importantly, do not get in the way of manipulating business data. Likewise, we believe that machine learning-powered interfaces will be most successful if they can live alongside more familiar and easy-to-use tools for explicit data manipulation, such as Excel.

Mango Beta, our latest product, released last Tuesday, February 11th, was designed with precisely this in mind. As shown below, the interface is a grid-like view that is familiar and easy to use for non-technical and technical users alike.

2) Moving from sequential to ongoing learning

Machine learning literature and tools commonly differentiate between three distinct phases. First, there is a period of training, when a model processes a dataset to learn the specific parameters to accurately make predictions. Second, that model is evaluated using a test dataset with known answers; the model and its parameters remain fixed, and the model's predictions are compared against the ground truth. This evaluation period allows for a clear measurement of how well the model is performing. Third, the model is deployed into an application, where its predictions are used to power some features like search ranking or image tagging.
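The three phases above can be sketched with a deliberately tiny model. Everything here is an assumption made for illustration: `KeywordModel` is a hypothetical toy classifier that votes by word counts, and the document snippets are made up. It stands in for whatever model a real platform would train.

```python
from collections import Counter, defaultdict


class KeywordModel:
    """Toy classifier: each word votes for the labels it was seen with."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, examples):
        for text, label in examples:
            for word in text.lower().split():
                self.counts[word][label] += 1

    def predict(self, text):
        votes = Counter()
        for word in text.lower().split():
            votes.update(self.counts.get(word, {}))
        # No known words means the model has nothing to say.
        return votes.most_common(1)[0][0] if votes else None


# Phase 1: training on an initial labeled dataset.
train_data = [
    ("invoice total due", "invoice"),
    ("invoice number", "invoice"),
    ("receipt paid cash", "receipt"),
    ("receipt thank you", "receipt"),
]
model = KeywordModel()
model.train(train_data)

# Phase 2: evaluation against held-out data with known answers.
test_data = [("invoice due date", "invoice"), ("cash receipt", "receipt")]
accuracy = sum(model.predict(t) == y for t, y in test_data) / len(test_data)

# Phase 3: deployment. The model is now frozen, so a new kind of
# document gets no useful prediction until someone retrains it.
deployed_prediction = model.predict("purchase order approved")
```

The model scores perfectly on the test set and still returns nothing for the purchase order, because nothing in the deployed phase feeds new cases back into training.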

However, we often see that users want to think about machine intelligence similarly to how humans learn. It is a strange requirement to have an initial period of data gathering during which the algorithm cannot output anything useful (training), followed by a separate period during which the algorithm can output but not get any smarter (evaluation). Far more intuitive is for the model to gain confidence and specificity over its predictions in a gradual and ongoing way after repeated interaction with the user.

One example of this difference between sequential and ongoing learning is how a model's predictions are handled. In a tool like Google AutoML, where you are directly involved in each step of the sequential training and evaluation, you would send in new training data and get back a newly trained model. At that point, you would have multiple versions of the model, one from the original set of training data and one from the updated set. If you had already used the previous model to make predictions on a bunch of unlabeled data, those stale predictions will not incorporate learnings from the new training data. As a result, you will need to recompute and update those predictions, as well as replace the model itself to generate newer and more accurate predictions.
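A minimal sketch of that bookkeeping follows. The `Prediction` record, the keyword-lookup `train` and `predict` helpers, and the sample documents are all hypothetical and are not the API of Google AutoML or any other platform. The point is only that stored predictions must be tagged with a model version so stale ones can be found and recomputed.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    doc_id: str
    label: str
    model_version: int


def train(examples):
    """Toy 'training': map each known keyword to its label."""
    return dict(examples)


def predict(model, text, default="unknown"):
    for word in text.lower().split():
        if word in model:
            return model[word]
    return default


docs = {"d1": "invoice 1042", "d2": "purchase order 77"}

# Version 1 is trained, then used to label all unlabeled documents.
model_v1 = train([("invoice", "invoice")])
stored = {
    doc_id: Prediction(doc_id, predict(model_v1, text), 1)
    for doc_id, text in docs.items()
}
label_before = stored["d2"].label

# New training data arrives, producing a second model version.
model_v2 = train([("invoice", "invoice"), ("purchase", "purchase_order")])
current_version = 2

# Every prediction made by an older version is stale and must be redone.
stale = [p.doc_id for p in stored.values() if p.model_version < current_version]
for doc_id in stale:
    stored[doc_id] = Prediction(
        doc_id, predict(model_v2, docs[doc_id]), current_version
    )
label_after = stored["d2"].label
```

In the sequential workflow, the version tracking, the stale check, and the recompute loop are all the user's responsibility.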

At Impira, we are working on abstracting these concerns away from the user. As users take more actions on their data, such as labeling, accepting, and correcting outputs of the model, these bits of user feedback are continuously fed back into the model for retraining. The predictions that appear in the user interface are a dynamic representation of the model's estimation at its most recent point, trained on the fullest set of user interaction data available. This creates a tight and interactive feedback loop between the user and the model, which is a unique advantage of having the machine learning process fully embedded and exposed within the application.

The Mango Beta product is our first experiment with ongoing learning. The interface exposes a highly interactive UI that allows the user to provide feedback to the model in real time. For example, when a predicted label for an image is correct, the user can confirm the prediction, thereby providing feedback to the model. This feedback helps the model make better predictions for other documents in that table, which are updated in real time.
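To make the idea of ongoing learning concrete, here is a minimal sketch of a table whose predictions refresh after every piece of user feedback. It is not Impira's implementation: `InteractiveTable`, its word-count "model", and the sample rows are hypothetical and chosen only to keep the example small and runnable.

```python
from collections import Counter, defaultdict


class InteractiveTable:
    """Toy table that retrains and refreshes predictions on each feedback."""

    def __init__(self, rows):
        self.rows = dict(rows)   # row_id -> document text
        self.confirmed = {}      # row_id -> label confirmed or corrected by user
        self.counts = defaultdict(Counter)
        self.predictions = {row_id: None for row_id in self.rows}

    def _retrain(self):
        # Rebuild from all feedback so a later correction replaces an old label.
        self.counts = defaultdict(Counter)
        for row_id, label in self.confirmed.items():
            for word in self.rows[row_id].lower().split():
                self.counts[word][label] += 1

    def _predict(self, text):
        votes = Counter()
        for word in text.lower().split():
            votes.update(self.counts.get(word, {}))
        return votes.most_common(1)[0][0] if votes else None

    def give_feedback(self, row_id, label):
        """Record one user action, then update the model and every row."""
        self.confirmed[row_id] = label
        self._retrain()
        for rid, text in self.rows.items():
            self.predictions[rid] = self.confirmed.get(rid, self._predict(text))


table = InteractiveTable({
    "r1": "invoice acme total",
    "r2": "invoice globex total",
    "r3": "receipt cafe cash",
    "r4": "receipt store cash",
})

table.give_feedback("r1", "invoice")
after_first = dict(table.predictions)

table.give_feedback("r3", "receipt")
after_second = dict(table.predictions)
```

After the first label, row `r2` already gets a prediction while `r4` has none; after the second, `r4` fills in too. There is no separate training or evaluation step for the user to manage, and no stale predictions to track.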

Ultimately, we believe that building an accessible interface and moving to an ongoing learning process can unlock a more intuitive and productive relationship between the user and machine learning automation. At Impira, we are working hard to make these ideas a reality in full-fledged applications with use cases around facial recognition, intelligent image tagging, PDF data extraction, and much more. If Impira interests you, sign up below to gain free access to our product. We'd love to hear your feedback. Please also reach out to us or connect with us on Facebook, Instagram, LinkedIn, and Twitter.
