Last updated

March 3, 2021

Impira terminology

Basics

These are the core building blocks shared throughout Impira.

Record: A set of information corresponding to a single source within a collection. In a collection, the values in a record are all tied to the same file instance. In the table view, each row corresponds to a record. The screenshot below displays two records, which correspond to the “Purchase-Order-5.doc” and “Purchase-Order-4.doc” files.

Field: A piece of information that each record within a collection contains. Each field has a type (for example, number) that is the same for every record. In the table view, each column represents a field. The screenshot below displays three fields: “File”, “total”, and “date.”

Value: A piece of data in a record that corresponds to a field. In the table view, each cell represents a value. In the screenshot below, “35,650” is the value of the cell for the “total” field of the “Purchase-Order-4.doc” record.

[Screenshot: table view showing the “File”, “total”, and “date” fields for the “Purchase-Order-5.doc” and “Purchase-Order-4.doc” records]
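The relationship between records, fields, and values can be sketched as a small table in Python. This is only an illustration of the terminology, not Impira's data model or API: the Field class, the value_of helper, and the field types assigned to “File” and “date” are assumptions. Values that the screenshot description does not state are left as None.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    type: str  # a field has one type shared by every record


# One Field per table column. The types here are assumed for illustration.
fields = [Field("File", "text"), Field("total", "number"), Field("date", "date")]

# One dict per table row (record); each entry is a value (a table cell).
# Only the "total" of Purchase-Order-4.doc is given in the text above.
records = [
    {"File": "Purchase-Order-5.doc", "total": None, "date": None},
    {"File": "Purchase-Order-4.doc", "total": 35650, "date": None},
]


def value_of(records, file_name, field_name):
    """Return the value (cell) for one field of the record tied to a file."""
    for record in records:
        if record["File"] == file_name:
            return record[field_name]
    raise KeyError(file_name)


print(value_of(records, "Purchase-Order-4.doc", "total"))
```

Looking up a value takes both a record (the row) and a field (the column), which mirrors how a cell is located in the table view.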

Organization

Impira makes it easy to organize your files and data. There are several different organizational tools within Impira.

Collection: A group of records where each record corresponds to a file.

Manual collection: A group of records grouped together by hand.

Smart collection: A group of records to which files are assigned by a specific query.

Dataset: A group of records where each record does not correspond to a specific file. Datasets commonly contain metadata and can be created from the contents of a spreadsheet or from scratch.
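The difference between a manual collection and a smart collection can be sketched as follows. This is a minimal illustration, assuming a Python predicate stands in for the query; the function names, file names, and totals are all hypothetical and are not part of Impira.

```python
# Hypothetical records; the file names and "total" values are made up.
records = [
    {"File": "invoice-a.pdf", "total": 120},
    {"File": "invoice-b.pdf", "total": 900},
    {"File": "invoice-c.pdf", "total": 4500},
]


def manual_collection(records, chosen_files):
    """Group records picked by hand, identified here by file name."""
    return [r for r in records if r["File"] in chosen_files]


def smart_collection(records, query):
    """Group every record whose file satisfies the query."""
    return [r for r in records if query(r)]


by_hand = manual_collection(records, {"invoice-a.pdf", "invoice-c.pdf"})
large = smart_collection(records, lambda r: r["total"] >= 500)

print([r["File"] for r in by_hand])
print([r["File"] for r in large])
```

In this sketch, membership in the manual collection changes only when someone edits the chosen set, while the smart collection picks up any record that satisfies the query.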

Data types

There are a number of different field types available in Impira.

text: Text of any type. Text values can be of any length.

date: A single date.

number: A single number.

checkbox: A binary true or false value.

join: A field that links each record to records in another collection or dataset.

function: The saved result of an IQL function for each record.
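As a rough illustration of what it means for every value in a field to share one type, the sketch below checks a value against four of the field types listed above. The mapping to Python types and the matches_type helper are assumptions made for this example; join and function fields are left out because they depend on other collections and on IQL.

```python
import datetime
from numbers import Real


def matches_type(value, field_type):
    """Return True if a value fits the named field type (illustrative only)."""
    if field_type == "text":
        return isinstance(value, str)  # text of any length
    if field_type == "date":
        return isinstance(value, datetime.date)  # a single date
    if field_type == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, Real) and not isinstance(value, bool)
    if field_type == "checkbox":
        return isinstance(value, bool)  # binary true or false
    raise ValueError(f"unsupported field type: {field_type}")


print(matches_type(35650, "number"))
print(matches_type(True, "number"))
```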

Machine learning field types

The ability to extract structured data from records is a core action within Impira. Machine learning fields help you automate the extraction of that information. There are several different types of machine learning fields within Impira in addition to manual fields.

Text extraction: The machine learning field type for extracting a single value for each record. A text extraction field can extract data of any length, but it returns at most one value per record. If the model is not confident that the relevant information is present in the record, it returns a blank value.

  • A text extraction field can be a text, number, or date type.
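The "at most one value, or blank" behavior described above can be sketched as picking the single best candidate and discarding it when confidence is too low. This is not how Impira's model works internally; the candidate list, the extract_single function, and the 0.5 cutoff are all hypothetical.

```python
def extract_single(candidates, min_confidence=0.5):
    """Return at most one value from (text, confidence) candidates.

    Returns None (a blank value) when there are no candidates or when the
    best candidate falls below the hypothetical confidence cutoff.
    """
    if not candidates:
        return None
    text, confidence = max(candidates, key=lambda c: c[1])
    return text if confidence >= min_confidence else None


print(extract_single([("35,650", 0.92), ("1,200", 0.40)]))
print(extract_single([("1,200", 0.40)]))
```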

Checkbox extraction [limited access]: The machine learning field type for extracting a binary True or False value. This model works for checkboxes, radio buttons, and other marks of true or false. To request access to this limited access feature, please reach out to the Impira team at info@impira.com.

Machine learning confidence states

For machine learning fields, in addition to generating the most accurate predictions, Impira communicates an estimate of uncertainty around those predictions.

Manual confidence: If a user manually extracts or corrects a value, we assign 100% confidence to that value. In the table view, these manually extracted values are denoted by a thick black border on the left side of the cell.

High confidence: If a machine learning model is highly confident in its prediction, it is denoted by a dashed green border on the left side of the cell.

Review Recommended: If a machine learning model is moderately confident in its prediction, it is denoted by a dashed red border along with a red triangle on the left side of the cell.

Blank prediction: If the machine learning model is not able to identify the value in the record, the cell will be blank and will carry either the high confidence indicator or the review recommended (moderate confidence) indicator. Not all blank predictions are incorrect, because some records may not contain the value in question.
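The confidence states above can be summarized in a short sketch that maps a cell to its state and to the indicator described for that state. Impira does not document a numeric cutoff here, so HIGH_THRESHOLD is a made-up value, and the Cell class and describe function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

HIGH_THRESHOLD = 0.9  # hypothetical cutoff; not documented by Impira

INDICATORS = {
    "Manual confidence": "thick black border",
    "High confidence": "dashed green border",
    "Review Recommended": "dashed red border with a red triangle",
}


@dataclass
class Cell:
    value: Optional[str]  # None represents a blank prediction
    confidence: float     # model confidence between 0 and 1
    manual: bool = False  # True if a user extracted or corrected the value


def describe(cell):
    """Return (state, indicator, is_blank) for a cell."""
    if cell.manual:
        state = "Manual confidence"  # treated as 100% confident
    elif cell.confidence >= HIGH_THRESHOLD:
        state = "High confidence"
    else:
        state = "Review Recommended"
    return state, INDICATORS[state], cell.value is None


print(describe(Cell("35,650", 0.97)))
print(describe(Cell(None, 0.60)))
```

A blank cell is not a separate state in this sketch: it is reported alongside whichever indicator the model's confidence selects, matching the description of blank predictions above.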