Improve the quality of your product data with AI

The path to success in eCommerce starts with recognizing the importance of product data quality and then improving that quality with AI technology.

On July 23, 1983, passengers boarded Montreal-to-Edmonton Air Canada Flight 143, unaware of the terrifying sequence of events that would befall them. The plane’s Fuel Quantity Information System computer had malfunctioned, so the ground crew had to calculate manually how much fuel to load. Unfortunately for all, the crew forgot to take into account that this was Boeing’s new 767, the first model to fly in Canada using the metric system. When converting the fuel volume to weight, they used a factor of 1.77 pounds/liter instead of the 0.8 kg/liter required by the new all-metric plane, so the aircraft took off with less than half the fuel it needed. Somewhere over Red Lake, Ontario, the plane ran out of fuel. The engines went eerily quiet and hydraulic pressure was lost. Fortunately, the heroic captain was able to glide the plane into a safe emergency landing, sparing all 69 people aboard (61 passengers and 8 crew) with only a few injuries.
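To see how much damage a single wrong conversion factor can do, here is a minimal arithmetic sketch in Python. The two factors are the ones quoted above; the required fuel mass is a hypothetical round number chosen only for illustration, not the actual figure from the flight.

```python
# Conversion factors quoted in the paragraph above.
LB_PER_LITER = 1.77  # pounds per liter (the factor the crew used)
KG_PER_LITER = 0.8   # kilograms per liter (the factor the 767 required)


def liters_to_load(required_mass: float, mass_per_liter: float) -> float:
    """Convert a required fuel mass into the volume that must be pumped."""
    return required_mass / mass_per_liter


# Hypothetical requirement, for illustration only.
required_kg = 20_000.0

# Correct: kilograms divided by kilograms per liter.
correct_liters = liters_to_load(required_kg, KG_PER_LITER)

# Mistaken: the same kilogram figure divided by a pounds-per-liter factor.
mistaken_liters = liters_to_load(required_kg, LB_PER_LITER)

# Share of the needed fuel that the mistaken calculation asks for.
fraction_loaded = mistaken_liters / correct_liters
print(f"Loaded {fraction_loaded:.0%} of the fuel actually required")
```

Whatever mass is required, the ratio is always 0.8 / 1.77, or roughly 45%. The units were never checked, so nothing in the calculation itself signaled that anything was wrong.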

Though eCommerce would not arrive for another decade, those of us distributing product information online today can learn something from this cautionary tale. As captain of your ‘aircraft’, you should be concerned about data quality. In our modern age of integrated computer systems, we are all too familiar with the disastrous consequences of incorrect data transformations, lagging data updates, and incomplete data. While we may not have lives at stake, the data that we create and distribute powers every aspect of the consumer purchasing decision and therefore our revenues.

Why is product data important?

In the highly competitive online retail space, one of the best ways to differentiate is with your product data. Companies that focus on achieving and maintaining high-quality product data will have a leg up on their competition. Why? Because product data drives two very important business needs: search engine optimization (SEO) and customer experience.

SEO is the practice of increasing the traffic that organic search results send to your website or product listing. In many cases, multiple websites will list the same product. The consumer’s purchasing decision is often determined simply by which website appears first in their search results. By providing more robust and engaging product names, descriptions, bullet points, and keywords, you can increase the ‘findability’ of your products on sites like Google and Amazon.

When it comes to customer experience, product data is of the utmost importance. To make their buying decision, today’s consumers demand detailed information, such as product images and product descriptions. If it’s a food product, does the listing properly state the ingredients and allergen statements? Does the associated picture accurately reflect the product? These are just a handful of example data points that will have a positive or negative impact on your sales.

What is Data Quality?

Data quality can feel like a nebulous term. To set data quality standards, we must first consider which attributes define it. To be considered high quality, data must be:

  • Available: How hard is it for employees and consumers to obtain the information they need or want? Must internal employees track down a product manager to find data needed for entry into a Product Information Management (PIM) system? Do consumers in the online store have the information they need to make their purchasing decision?
  • Accessible: Is the information in the correct format? Is a Spanish description available for a Spanish-speaking consumer? Can the information be found in the location where it is expected to be?
  • Timely: Is the available data up to date? How frequently does one system update fields in another? Out-of-stock items should be listed as such, and new orders should be taken as soon as the inventory is available again.
  • Accurate: Is the information correct? If consumers do not have the correct information, they will not be able to make a proper buying decision, which will lead to lost sales and product returns. Have you ever published an incorrect price? There are websites dedicated to helping savvy consumers find unintended discounts.
  • Reliable: Can consumers trust the information provided to them? Is the information presented in a way that is believable to the consumer?
  • Transparent: Can consumers easily interpret the product information? Is it succinct and understandable at a glance?
  • Complete: Is all of the necessary information present? Have you provided the consumer with all the information they need to make their purchasing decision?
  • Relevant: Is everything you show pertinent to the purchase, or have you included unnecessary or extraneous information that may get in the way?
  • Rich: Does the data contain lots of meaningful information (without overwhelming the consumer)?
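Some of these attributes can be turned into simple, measurable checks. The Python sketch below scores one product record for completeness and timeliness. The field names, the seven-day freshness window, and the sample record are all hypothetical; each company would substitute its own standards.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical list of fields a product page needs before it can go live.
REQUIRED_FIELDS = ("name", "description", "price", "image_url", "ingredients")


def is_filled(value) -> bool:
    """Treat None and blank strings as missing; any other value counts."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    return True


def missing_fields(record: dict) -> list:
    """Complete: which required fields are absent or empty?"""
    return [f for f in REQUIRED_FIELDS if not is_filled(record.get(f))]


def completeness(record: dict) -> float:
    """Share of required fields that are filled, from 0.0 to 1.0."""
    filled = len(REQUIRED_FIELDS) - len(missing_fields(record))
    return filled / len(REQUIRED_FIELDS)


def is_timely(record: dict, now: datetime,
              max_age: timedelta = timedelta(days=7)) -> bool:
    """Timely: was the record updated within the allowed window?"""
    updated = record.get("updated_at")
    return updated is not None and now - updated <= max_age


# Hypothetical sample record.
now = datetime(2024, 1, 15, tzinfo=timezone.utc)
record = {
    "name": "Granola Bar",
    "description": "   ",   # blank, so it counts as missing
    "price": 2.49,
    "image_url": "https://example.com/bar.jpg",
    "ingredients": None,    # missing
    "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc),
}

print(missing_fields(record))
print(completeness(record))
print(is_timely(record, now))
```

Attributes such as Reliable or Transparent are harder to reduce to a rule and usually need human review or consumer feedback. Even so, tracking the measurable ones gives a baseline to improve against.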

Companies that hope to succeed in today’s market must develop their own definitions and standards for data quality.  However, even when a company finally understands the power of differentiating through data, it often struggles with how to proceed.

How to improve product data quality

If product data quality is so important, why is it so hard for companies to get it right? One major reason is that company data lives in multiple software systems. The team responsible for the Enterprise Resource Planning (ERP) system is usually not the same team responsible for the Product Lifecycle Management (PLM), PIM, or Digital Asset Management (DAM) systems. With a multitude of systems and owners, it’s nearly impossible to set consistent data quality standards across all of them. Companies today lack the tools they need to define, measure, and monitor product data quality across their various systems. This is where Impira comes in.

Impira’s Data Intelligence Platform serves as ‘air traffic control’ between your various systems.  Impira doesn’t necessarily replace a PIM, DAM, ERP, PLM, or logistics system. Instead, Impira tracks and monitors the information that flows between these systems.  Impira’s AI technology can:

  • Identify missing or incomplete data
  • Track the lineage of data as it flows to downstream systems
  • Spot data discrepancies using anomaly detection
  • Provide visibility into who owns various fields across systems and teams
  • Automatically enrich data, such as auto-tagging images or extracting data from documents (e.g., invoices and images)
  • Enable data access control and permissions for cross-team collaboration
  • Automatically transform, link, and convert data, and recommend data relationships
  • Monitor data synchronization frequency and surface last-updated timestamps
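To make the idea of spotting discrepancies between systems concrete, here is a minimal Python sketch that compares the same products as exported from two systems. It is a generic illustration, not Impira’s implementation or API. The function name, the SKU keys, the field names, and the sample exports are all hypothetical.

```python
def find_discrepancies(pim: dict, erp: dict, fields: list) -> list:
    """Compare two {sku: record} exports and list every mismatch.

    Each issue is a (sku, field, detail) tuple. The field is None when
    the whole product is missing from one of the systems.
    """
    issues = []
    for sku in sorted(pim.keys() | erp.keys()):
        if sku not in pim:
            issues.append((sku, None, "missing from PIM"))
            continue
        if sku not in erp:
            issues.append((sku, None, "missing from ERP"))
            continue
        for field in fields:
            pim_value = pim[sku].get(field)
            erp_value = erp[sku].get(field)
            if pim_value != erp_value:
                issues.append(
                    (sku, field, f"PIM={pim_value!r} ERP={erp_value!r}")
                )
    return issues


# Hypothetical exports from two systems.
pim = {
    "A1": {"price": 9.99, "weight_kg": 0.5},
    "B2": {"price": 4.50, "weight_kg": 1.0},
}
erp = {
    "A1": {"price": 9.99, "weight_kg": 0.5},
    "B2": {"price": 5.40, "weight_kg": 1.0},  # transposed digits
    "C3": {"price": 2.00, "weight_kg": 0.2},  # never entered in the PIM
}

for issue in find_discrepancies(pim, erp, ["price", "weight_kg"]):
    print(issue)
```

An exact comparison like this only works once both systems agree on units and formats. That is the same lesson as Flight 143: a mismatch can only be caught when each value is known to mean the same thing on both sides.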

Impira provides the tools to conduct an initial product data quality analysis and operates continuously to help iterate and improve that data over time.

“If you can’t measure it, you can’t improve it.” – attributed to Peter Drucker

AI and product data quality

The business insights you need to grow your revenues and reduce costs can only come from adopting a renewed focus on data quality.  In the past, this has meant hiring more people, or enforcing cumbersome workflows that delay getting a product to market. Today, those costly overheads can be reduced or even removed entirely by applying Artificial Intelligence.  For example:

  • Anomaly detection can be used to flag data points that fall outside typical or projected values. This could mean identifying an incorrect price within a category of similar products by taking into account historical and seasonal pricing information.
  • With Natural Language Processing, it’s possible to identify keywords (or a lack of keywords) within text, and even predict sales from the language of product descriptions. AI can offer auto-complete suggestions that help people work faster, or translate a category tag to a synonym within your product taxonomy.
  • Facial Recognition and Emotion Analysis can help gauge how spokespersons and fashion models affect consumer engagement with a brand or product.
  • Speech-to-Text can identify which phrases spoken within your product videos are contributing to success. Text transcribed from the audio can be added to the product metadata to enrich it for SEO purposes.
  • OCR/Text Recognition can be used to automatically extract text from product packaging flats and populate the corresponding fields for an online store. Text on the packaging can also be used to validate information intended for the product page (e.g., are the ingredients correct?).
  • Image object detection can identify products shown within ad creatives or lifestyle shots, so that marketing campaigns can be paused when an item has gone out of stock.
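As an illustration of the first item, the Python sketch below flags prices that sit far from the rest of their category. It uses a modified z-score based on the median and the median absolute deviation (MAD). This is a deliberately simple stand-in: it ignores the historical and seasonal signals mentioned above, and the function name, threshold, and sample prices are hypothetical.

```python
from statistics import median


def price_outliers(prices: dict, threshold: float = 3.5) -> list:
    """Return the SKUs whose price is far from the category median.

    Uses the modified z-score 0.6745 * |price - median| / MAD, which is
    less distorted by the outliers themselves than a mean-based score.
    """
    values = list(prices.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the prices are identical; this score cannot rank them.
        return []
    return [
        sku
        for sku, price in prices.items()
        if 0.6745 * abs(price - med) / mad > threshold
    ]


# Hypothetical prices for one category of similar products.
prices = {
    "mug-01": 12.99,
    "mug-02": 11.50,
    "mug-03": 13.25,
    "mug-04": 1.29,   # likely a misplaced decimal point
    "mug-05": 12.00,
}

print(price_outliers(prices))
```

A flagged price is not necessarily wrong, since it might be a genuine clearance item. The check is best used to route records to a person for review rather than to correct them automatically.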

AI is changing the way eCommerce companies operate.  At Impira, we can help you make the transition to the new age.  The right combination of product data quality and AI will be your flight plan to success.

To learn more, please reach out to us, and connect with us on Facebook, Instagram, LinkedIn, and Twitter.