The Impira team is back at it, creating new features and improving your experience.
Impira supports running complex queries against collections and datasets using Impira Query Language (IQL). You can now use /poll to receive only the changes since the last time you ran a query. This lets you run a continuous workload in which you receive updates as they happen. Explore our Read API documentation here.
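To make the polling pattern concrete, here is a minimal sketch in Python. It is not the Impira client: the response shape (a `cursor` token plus a list of `changes`) and the `fetch_page` callback are assumptions made for illustration, so check the Read API documentation for the real request and response format.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical response shape, NOT taken from the Impira docs:
#   {"cursor": "<opaque token>", "changes": [...]}
PollResponse = Dict[str, Any]


def stream_changes(
    fetch_page: Callable[[Optional[str]], PollResponse],
    max_rounds: int,
) -> Iterator[Any]:
    """Yield changes round by round, passing the last cursor back each time."""
    cursor: Optional[str] = None  # None means "no previous poll yet"
    for _ in range(max_rounds):
        page = fetch_page(cursor)
        cursor = page["cursor"]  # remember where this poll ended
        yield from page["changes"]  # may be empty if nothing changed


def fake_fetch(cursor: Optional[str]) -> PollResponse:
    """Stand-in for the HTTP call to /poll, so the sketch runs offline."""
    pages = {
        None: {"cursor": "c1", "changes": ["row 1 added"]},
        "c1": {"cursor": "c2", "changes": []},
        "c2": {"cursor": "c3", "changes": ["row 1 updated"]},
    }
    return pages[cursor]


changes = list(stream_changes(fake_fetch, max_rounds=3))
print(changes)
```

In a real workload, `fetch_page` would issue the request to /poll and the loop would typically wait between rounds rather than stop after a fixed count.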
Impira supports configuring webhooks to subscribe to changes in collections and datasets. This is part of the new Collection automations feature, which supports continuously ingesting files and exporting data from Impira. Stay tuned: we’ll be shipping more of these features in the coming months.
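On the receiving side, a webhook is simply an HTTP request carrying a description of what changed. The sketch below shows one way a receiver might unpack such a request body in Python. The payload layout (a JSON object with a `changes` list) and the `handle_webhook` helper are hypothetical; the actual payload format is defined by Impira and is not described here.

```python
import json
from typing import Any, Callable, Dict, List

Change = Dict[str, Any]


def handle_webhook(body: bytes, on_change: Callable[[Change], None]) -> int:
    """Parse a webhook body and pass each change to a callback.

    Returns the number of changes handled.
    """
    payload = json.loads(body)
    # "changes" is an assumed field name, used only for this illustration.
    changes: List[Change] = payload.get("changes", [])
    for change in changes:
        on_change(change)
    return len(changes)


# Simulate one delivery with a made-up payload.
received: List[Change] = []
sample_body = json.dumps(
    {"collection": "invoices", "changes": [{"id": "a"}, {"id": "b"}]}
).encode("utf-8")
count = handle_webhook(sample_body, received.append)
print(count, received)
```

Keeping the parsing in a plain function like this makes it easy to call from whichever web framework serves the endpoint.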
New computed fields for confidence
Easily distinguish between files that may need a bit more review and files with fully confident data. With newly introduced computed fields, you can quickly inspect the confidence values and states across all extracted fields for each file in a collection. The three new boolean fields are `File.IsPreprocessed`, which tells you whether a file has completed preprocessing (uploading, analyzing text, producing a thumbnail, etc.); `__system.IsProcessed`, which tells you whether a record in a collection has completed processing across all of its ML fields; and `__system.IsConfident`, which tells you whether all machine learning values for a record are high confidence. You can read more about these fields in the ML confidence guide.
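As an illustration of how these flags might be used once records have been exported, the Python sketch below sorts records into three states. The field names `__system.IsProcessed` and `__system.IsConfident` come from the description above, but representing each record as a nested dictionary, the `name` key, and the `review_status` helper are assumptions made for this example.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def review_status(record: Record) -> str:
    """Classify a record using the two __system flags."""
    system = record.get("__system", {})
    if not system.get("IsProcessed", False):
        return "processing"  # ML fields are still being computed
    if not system.get("IsConfident", False):
        return "needs review"  # at least one value is not high confidence
    return "confident"


# Made-up records; the nested-dict layout is assumed, not the export format.
records: List[Record] = [
    {"name": "a.pdf", "__system": {"IsProcessed": True, "IsConfident": True}},
    {"name": "b.pdf", "__system": {"IsProcessed": True, "IsConfident": False}},
    {"name": "c.pdf", "__system": {"IsProcessed": False, "IsConfident": False}},
]
statuses = {r["name"]: review_status(r) for r in records}
print(statuses)
```

Checking `IsProcessed` first avoids flagging a record for review while its values are still being computed.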
New text parsing improvements
Impira has significantly upgraded its process for extracting text from documents and images. This includes improved preprocessing of images and documents for Optical Character Recognition (OCR), which will yield more accurate and complete extracted text, as well as more complete extraction of text embedded in PDFs.