The evolution of machine learning
Machine learning may feel like a new concept because it’s integral to futuristic sectors like self-driving cars, space travel, and the Internet of Things. In reality, it is not new at all: machine learning has been in use since the mid-20th century.
The history of machine learning dates back to the 1950s, when Arthur Samuel popularized the term after developing a self-learning computer program that could play checkers. He believed that by teaching computers to play games, engineers and mathematicians could develop similar tactics for solving general problems.
Machine learning applications have proliferated in the last decade. Techniques like convolutional neural networks and transformers now allow machine learning models to recognize complex patterns with incredible accuracy. These mechanisms have been invaluable in driving innovation in areas like image and voice recognition.
At this point, machine learning has been successfully folded into consumer applications like web search, language translation, and social networking. However, attempts to recreate its ubiquity in other sectors have been spotty due to the need for massive quantities of annotated data, high project costs, and accuracy expectations.
Traditional machine learning is data hungry
Machine learning models often require significant volumes of manually annotated data (“labeled data”). That means they require regular human intervention to retrain the underlying model with more input. That’s why these techniques were developed and improved by consumer tech companies like Facebook, Amazon, Microsoft, and Google, where data is abundant.
So while machine learning has evolved, until recently it remained out of reach for some sectors and for most organizations. The best hope for widespread adoption of machine learning across all industries comes in the form of “one-shot” learning. One-shot learning aims to overcome these limitations by allowing machines to learn from as little as one data point, putting machine learning within reach of companies that can’t devote intensive time and resources to it. With one-shot learning, models can be customized without mountains of upfront training data or a continual supply of new examples.
With one-shot learning, the barrier to entry for machine learning has been shattered. What once required state-of-the-art tooling is now feasible with fewer resources and at a lower cost. Forget large specialized teams and expensive infrastructure: non-technical users can create and train their own machine learning models instantly.
In sectors like tech and finance, which have typically required complex teams for this functionality, one-shot learning unlocks all sorts of new possibilities. Let’s walk through one-shot learning and take a look at what it can do in the real world.
Training machines for nuance
As humans, we easily manage a certain degree of ambiguity when we look at a document or hear a sentence. For example, we use our understanding of natural language to know that both “Grand Total” and “Total Amount Due” refer to the same concept in an invoice.
While we’re able to make connections like these intuitively and through experience, machines typically have difficulty handling the myriad nuances of human language. They can’t always recognize patterns and make educated guesses the way the human brain can.
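As a toy illustration of this kind of nuance, consider mapping differently worded invoice labels onto one canonical field. This is only a sketch using simple string similarity, not how any particular product works; the alias list and the 0.6 threshold are invented for the example:

```python
from difflib import SequenceMatcher

# Canonical fields and some surface forms a model might need to reconcile.
# These aliases are illustrative, not an exhaustive or real schema.
CANONICAL_ALIASES = {
    "total": ["grand total", "total amount due", "amount due", "balance due"],
}

def match_field(label, threshold=0.6):
    """Map a raw label to a canonical field by fuzzy string similarity."""
    label = label.lower().strip()
    best_field, best_score = None, 0.0
    for field, aliases in CANONICAL_ALIASES.items():
        for alias in aliases:
            score = SequenceMatcher(None, label, alias).ratio()
            if score > best_score:
                best_field, best_score = field, score
    return best_field if best_score >= threshold else None

print(match_field("Total Amount Due"))  # -> total
print(match_field("Grand Total"))       # -> total
```

Real systems learn these associations from context rather than from a hand-written alias table, but the goal is the same: treating different surface forms as one underlying concept.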
However, in the decades since machine learning was introduced, computers have already gotten better at this type of reasoning. With machine learning, computers model the same kinds of thinking by looking at examples and picking up on patterns. Over time, they get better at solving problems that rely on visual or language patterns, like reading data from a document or translating one language into another.
Teaching machines to handle nuance is easier with some forms of data than others. Unlike a raw image, video, or audio clip, which is captured with sensors, a document is generated with logic. That logic makes documents easier to train machine learning models on, because extracted data can be checked against a set of high-level rules and constraints.
How one-shot learning works
With one-shot learning, less data packs a bigger punch. Why? Simply put, one-shot learning uses fewer data points to train a machine learning model. This means you have much more control over how and what the model learns. And better yet, one-shot learning models continue to improve over time, with minimal training examples.
The one-shot learning difference lies in classification and categorization. Where most machine learning algorithms use hundreds or thousands of data points to learn low-level patterns, one-shot learning learns information about object categories. One-shot learning models bank on the fact that systems can use prior knowledge to classify new objects.
That’s why even one additional piece of data can make a difference in the accuracy of the algorithm. For example, the category “dog” may be learned in one shot based on previous knowledge of the “horse” or “fox” category, because those categories share similar distinguishing qualities.
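One common way to realize this idea is nearest-neighbor classification in an embedding space learned from prior categories: a single labeled example per new class becomes its prototype, and new inputs are assigned to the most similar prototype. A minimal sketch, where the three-dimensional embeddings are made up for illustration (in practice they would come from a pretrained encoder):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# One labeled embedding per class -- the "one shot".
prototypes = {
    "dog":   [0.9, 0.1, 0.3],
    "horse": [0.2, 0.8, 0.5],
}

def classify(embedding):
    """Assign the class whose single prototype is most similar."""
    return max(prototypes, key=lambda c: cosine(prototypes[c], embedding))

print(classify([0.85, 0.15, 0.25]))  # -> dog
```

The heavy lifting happens earlier, when the encoder learns from prior categories what features matter; classifying a new category then needs only one example.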
With one-shot learning, more accuracy is achieved with fewer hands on deck and less weight placed on upfront training. Some of the methodologies that allow a small set of data to pack a bigger punch include:
- Generative modeling — Modeling the higher-order rules by which a set of documents was generated. For example, the rule “X is to the right of the line” does not dictate an exact distance between X and the line; the exact location of X can be flexible as long as it follows that higher-order rule. Generative learning seeks to model how a document was created (from a certain set of rules) rather than how to decompose it into low-level features. Because those rules generalize, the model can be trained with fewer pieces of data.
- Rethinking existing training infrastructure — One-shot learning requires a change to the existing infrastructure that trains and deploys models, like Spark, Sagemaker, and Google AutoML, which were architected with technical users and long time-frames in mind. At Impira, we’ve developed new database technology to support incrementally retraining models and reevaluating predictions with new annotations and new files in real time. This way, you can run queries like “find all documents with at least 3 line items” that are optimized for semi-structured data.
- Intuitive interfaces to train the machine — Organizations that support one-shot learning can lower the barrier to entry in terms of technical abilities for the average user. Much like with a spreadsheet, the user interface looks simple but the backend is running complex database technology. Simple feedback loops placed throughout a one-shot learning platform (e.g., confidence indicators, predictive text, or other feedback indicators) allow users to improve the end results.
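The query mentioned above can be pictured as a filter over semi-structured extraction results. A toy sketch in plain Python — the document shapes, IDs, and field names are invented for illustration, and a real system would push this filter down into a database rather than scan in memory:

```python
# Each document is a semi-structured record: scalar fields plus
# a variable-length list of extracted line items.
documents = [
    {"id": "inv-001", "vendor": "Acme", "line_items": [
        {"desc": "Widgets", "amount": 120.0},
        {"desc": "Gadgets", "amount": 80.0},
        {"desc": "Shipping", "amount": 15.0},
    ]},
    {"id": "inv-002", "vendor": "Globex", "line_items": [
        {"desc": "Consulting", "amount": 500.0},
    ]},
]

def find_documents(docs, min_line_items):
    """Return documents with at least `min_line_items` extracted line items."""
    return [d for d in docs if len(d["line_items"]) >= min_line_items]

matches = find_documents(documents, 3)
print([d["id"] for d in matches])  # -> ['inv-001']
```

The point is that extracted data is queryable structure, not free text: as the model reevaluates predictions, queries like this immediately reflect the new results.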
The business applications of one-shot learning
With one-shot learning, businesses of all sizes can leverage machine learning to solve problems. In general, less investment is required to streamline traditional operations that aren’t yet machine learning-enabled.
One-shot learning can apply widely to many business applications. For tech companies, it can be applied anywhere from character and object recognition and classification to sentence completion, translation, labeling, and 3D object reconstruction.
In finance, it can be used for fraud detection, error detection and alerting, user intent classification, word similarity tasks, and document parsing. It also shows up in visual navigation for robotics and task completion using text data.
Beyond these sectors, you might find one-shot learning applied to things like voice cloning, IoT analytics, curve-fitting in mathematics, or one-shot drug discovery and other medical applications.
Broadly speaking, the ability to input and analyze more data with fewer tools and resources can have great impact on product development, revenue operations, user experience, and customer service.
One-shot learning and financial statement automation
To use a real-life example of one-shot learning in action, let’s consider the problem of finding key data within financial statements. PwC recently reported that 30-40% of the time spent on key financial processes could be eliminated with automation.
Knowledge workers spend valuable hours each week manually entering data into spreadsheets. This administrative work is a cost center for the organization, and that time might otherwise be devoted to analysis, strategy, and other revenue-generating activities that would positively impact the bottom line.
With one-shot learning, a machine learning model can quickly and accurately pull key data from financial documents, flagging anomalies and bringing irregularities to the attention of the relevant team. The entire process is automated and does not require a resource-intensive investment. That can directly translate into faster, more accurate financial analyses for quarterly and annual reporting. It can also prevent lost invoices, clear up backlogs, and help meet deadlines.
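The flagging step can be approximated by routing extractions whose model confidence falls below a threshold to human review. A hypothetical sketch — the field names, confidence scores, and the 0.9 threshold are all invented for illustration:

```python
def triage(extractions, threshold=0.9):
    """Split extracted fields into auto-accepted and flagged-for-review."""
    accepted, flagged = {}, {}
    for field, (value, confidence) in extractions.items():
        if confidence >= threshold:
            accepted[field] = value
        else:
            flagged[field] = value
    return accepted, flagged

# (value, model confidence) pairs for one financial statement.
extractions = {
    "total": ("$1,204.50", 0.98),
    "due_date": ("2023-04-01", 0.65),  # low confidence -> needs review
}

accepted, flagged = triage(extractions)
print(sorted(accepted))  # -> ['total']
print(sorted(flagged))   # -> ['due_date']
```

Each human correction then becomes a fresh training example, which is exactly where one-shot learning pays off: one annotation is enough to improve the model.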
One-shot learning removes the need for hours of manual data entry work, and gives that time back to be better applied elsewhere.
One-shot learning for all
From a business standpoint, machine learning becomes truly self-service with one-shot learning. Regardless of their technical competency level, anyone can create their own model without assembling a large dataset, a massive computing budget, or a cross-functional team.
How can you implement one-shot learning today? The first step is finding solutions that don’t require heavy services or data experts to build something custom. Identify processes within your organization that could benefit from “intelligent” automation, and explore the tools and services available to you.
Also, pay attention to breakthroughs and developments in machine learning, whether that’s new techniques, optimizations, or use cases. Advances in machine learning are happening all the time. One-shot learning is only one practical advance helping to democratize the efficiency and accuracy of machine learning.