Get our latest book on "Top 10 artificial intelligence myths."
Download

Last updated

March 4, 2021

Table of contents

Using Impira Query Language (IQL) for advanced queries

Impira Query Language, or IQL, was designed to make it really easy for power users, developers, or knowledge workers to interactively search, filter, and transform the semi-structured data trapped in your files. IQL seeks to be as easy to use as a search bar, generally with a single line of text, yet provide powerful features that would otherwise require writing a complex query using something like SQL. 

IQL can be used for advanced queries within the Impira UI, to power a Function field on a collection, or to query data over the API.

Design principles

IQL is designed specifically for building business applications on metadata-rich information. In fact, all of your interactions in Impira are powered by IQL queries — from uploading new files and creating collections to adding machine learning fields to a collection to extract information.

We wanted IQL to be approachable, designed to fit in a single line, and to be written incrementally. You can start with a search query, refine the results, then quickly add a few instructions to start pivoting and aggregating your data.

Getting started

Once you have an account, created a collection and extracted some information, you will need to change your search mode to IQL. You can do this by clicking on the menu button on the right side of the search bar, selecting IQL, and clicking the Save button. For the rest of these examples we will be querying a collection of invoices, but you can use a collection of your own files, too.

Switch to IQL mode in the search bar

IQL concepts

Querying

The first part of a query is the entity class you want to query from. There are a number of entity classes available to query, but the two most common are `@files` and `@file_collections`. Each of your collections within Impira are assigned a unique identifier, and after changing your search mode to use IQL, you’ll see the name of the entity class in the search bar. 

The unique entity class is pre-filled in the search bar

Impira will pre-fill the entity class of the collection you’re currently working on, and it’ll prefix it with an `@` symbol to distinguish it from the other parts of the query. If you’re working with IQL via the API, you’ll need to include the entity class and the `@` symbol in your query. Running the query with just the entity class will return all of the fields for each row in the collection. Impira visualizes this in a table, but there’s a lot more information available for each file and field. You can see this by switching to the JSON view. 

Go ahead and switch to the JSON view and take a look just how rich the data is behind the scenes. Switch between the table view and the JSON view at any time by returning to the view menu.

Switch the application view from Table to JSON

For fields with nested values, you can use dot notation (.) to navigate the nested path. For a field name contains spaces, like `Invoice No`, you can write the name between backticks (`).

Projection

The second part of a query is the projection list. This is used to specify which fields or expressions you want to return, allowing you to selectively access the fields specific to your needs. The projection list is specified between square brackets, `[` and `]`, and each expression within the list is separated by a comma. 

Let’s query our entity class for the `Invoice No` and the `Due Date`:

``` language="js"
[`Invoice No`, `Due Date`]
```

Query for Invoice No and Due Date

Notice that this returns the complete JSON structure for only the fields specified in the projection. You can be more specific about obtaining just the extracted information from this result by using the dot notation to navigate through the result.

Let’s query for the extracted value for `Invoice No` and `Due Date`:

``` language="js"
[`Invoice No`.Label.Value, `Due Date`.Label.Value]
```

You’ll see your result change to a much simpler array of objects with only two properties per row in your collection.

JSON query result for Invoice No andDue Date fields

Aliasing

Items in the projection list can be given a new name, also known as aliasing. You can do this by prefacing a field name with a name of your own, separating your new name and the field name with a colon (`:`). This makes reading the results simpler, and also provides a powerful way to rename fields to meet the needs of downstream applications that consume data via Impira APIs.

Go ahead and change your projection list to include new names for your fields.

``` language="js"
[inv_no:`Invoice No`.Label.Value, due:`Due Date`.Label.Value]
```

JSON result with aliased field names

Switching back to the table view you’ll see that the new JSON result from your query is also visualized nicely with a header row that matches your aliased field names, and each row with the data you projected as part of the query. Switch back and forth between each view to get a different perspective on your result.

IQL query for Invoice No and Due Date shown in the Table view

Filters, sorting, limits, and offsets

The remainder of the IQL query, after the entity class and (optional) projection, is the filter. The filter is flexible and designed to allow you to write simple text queries to more specific queries over fields or expressions. 

Let’s take a look at some examples that build on the `Invoice No` and `Due Date` queries above.

Query for all invoice numbers with a due date greater than Jan 1st 2020:

``` language="js"
[inv_no:`Invoice No`.Label.Value, due:`Due Date`.Label.Value] due>2020-01-01
```

Query for a specific invoice number:

``` language="js"
[inv_no:`Invoice No`.Label.Value, due:`Due Date`.Label.Value] inv_no:451
```

Query for all invoice numbers with a due date in 2019 (notice in this example that the due date is filtered on both “less than” and “greater than” a specific date):

``` language="js"
[inv_no:`Invoice No`.Label.Value, due:`Due Date`.Label.Value] due>2019-01-01 due<2020-01-01
```

Query for the 10 most recent invoices, including Invoice No and Total Due, sorted by the highest due date:

``` language="js"
[inv_no:`Invoice No`.Label.Value, total:`Total Due`.Label.Value, due:`Due Date`.Label.Value] limit:10 highest:due
```

Query for the lowest 10 invoice totals:

``` language="js"
[inv_no:`Invoice No`.Label.Value, total:`Total Due`.Label.Value, due:`Due Date`.Label.Value] limit:10 lowest:total
```

For larger results, you’ll often want to paginate the results into smaller chunks. You can do this using a combination of limit and offset filters. For example, here’s a filter to get 100 records, beginning from the 300th record:

``` language="js"
limit:100 offset:300
```

Pivot and rollup

IQL also supports a concept called pivoting which is a neat way to group your results. Just like the above examples of limit, offset, highest, and lowest, you can set pivot to true. Let’s look at an example to see how you can easily group your data for more detailed analysis or to summarize your results.

Let’s assume you wanted to see all your invoices grouped by year. IQL has many built-in functions to transform and manipulate details, and for this example we’ll use the `year()` function to truncate the `Due Date` field to the year.

``` language="js"
[year(`Due Date`), {`Invoice No`, `Total Due`}] pivot:true
```

Query result showing all invoices broken down by year

Similarly, you can use other functions in your query to aggregate the results. With a small change to the query, we can calculate the total due for all invoices for each year. IQL has some additional conveniences too, and when an aggregate function like `sum()` is used, pivot is automatically set to true.

Combining what we’ve learned about filters and aliasing our projected fields, we can create a user friendly output that can be easily shared throughout your business.

``` language="js"
[`Invoice year`: year(`Due Date`), `Total spend`: sum(`Total Due`)] highest:`Total spend`
```

Query result showing total spend by year

In addition to pivoting, IQL supports a strong form of grouping called a rollup. A rollup will compute the value of each aggregate at each level of nesting in the query. Building on the previous example, you can expand this to include the month and see the total of all invoices by month and by year.

``` language="js"
[`Invoice Year`: year(`Due Date`), `Invoice Month`: month(`Due Date`),
   `Total spend`: sum(`Total Due`)] rollup:true
```

IQL Function Catalog

Need more? Here’s an extended catalog of IQL functions.