How to read PDF files using AI Builder
Reading from PDF files is a common requirement in business applications, for example reading an invoice report from the sales team. Even when a process is digitized, data collected from users as receipts or feedback often arrives on paper and has to be entered into the app manually. To close this gap, the app should read the PDF and populate the values itself.
OCR is available in almost all RPA tools, including Power Automate desktop flows. AI Builder is built on top of Azure Cognitive Services and Form Recognizer, and it hides their complexity behind a three-step process, but that simplification comes at a higher cost.
Those who can handle the added complexity, or who need to read PDFs with a complex structure, can instead opt for Azure OCR in Cognitive Services or Form Recognizer to read effectively from PDF documents or images.
Power Platform provides AI Builder to train models that read documents. It already ships with some out-of-the-box (OOTB) templates that we can use as a starting point for building applications. Some of the templates provided by default are:
- Form Processing
- Category Classification
- Entity Extraction
- Object Detection
In Form Processing, a model is built from the form processing template, trained with a set of documents for each collection, and its fields are mapped. We can then call it from Power Automate and use the outputs in other processes.
Once a new model has been created, we need to complete three steps to use it:
- Extract: In the first step, we declare the fields that will hold values from the PDF file. Use the “Field” type to hold a single value and the “Table” type for list data.
- Collection: To train the model, we need to add at least 5 documents for each document type. Each set is grouped under a named collection.
- Tag: The fields created in the first step should be mapped in each document by selecting the area to capture.
The final step is to train the model, which takes a few minutes to complete.
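Before starting the training, it can help to check the setup against the rules described above. The following is only a rough sketch in Python and not part of AI Builder: `ModelPlan` and `readiness_problems` are hypothetical names made up for illustration, and the only rules checked are the ones mentioned in this post (fields are either "field" or "table", and every collection needs at least 5 documents).

```python
from dataclasses import dataclass, field
from typing import Dict, List

MIN_DOCS_PER_COLLECTION = 5  # minimum per collection, as stated in this post


@dataclass
class ModelPlan:
    """Hypothetical description of a form processing model before training."""
    # field name -> "field" (single value) or "table" (list data)
    fields: Dict[str, str] = field(default_factory=dict)
    # collection name -> sample document file names
    collections: Dict[str, List[str]] = field(default_factory=dict)


def readiness_problems(plan: ModelPlan) -> List[str]:
    """Return human-readable problems; an empty list means ready to train."""
    problems = []
    if not plan.fields:
        problems.append("no fields declared")
    for name, kind in plan.fields.items():
        if kind not in ("field", "table"):
            problems.append(f"field '{name}' has unknown type '{kind}'")
    if not plan.collections:
        problems.append("no collections added")
    for name, docs in plan.collections.items():
        if len(docs) < MIN_DOCS_PER_COLLECTION:
            problems.append(
                f"collection '{name}' has {len(docs)} documents, "
                f"needs at least {MIN_DOCS_PER_COLLECTION}"
            )
    return problems


if __name__ == "__main__":
    plan = ModelPlan(
        fields={"InvoiceNumber": "field", "LineItems": "table"},
        collections={"Vendor A": ["a1.pdf", "a2.pdf", "a3.pdf"]},
    )
    for problem in readiness_problems(plan):
        print(problem)
```

In this example the plan is reported as not ready, because the "Vendor A" collection has only three sample documents.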
The trained model can be called in a flow using the “Extract information” action. Values are returned in JSON format, which we can reshape into a readable format. Here it is formatted as a CSV table.
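To illustrate the JSON-to-CSV idea outside of a flow, here is a minimal Python sketch. The JSON shape used here (a `fields` object with a `value` and `confidence` per field) is an assumption made up for this example and is not the exact output schema of the action, and `fields_to_csv` is a hypothetical helper name.

```python
import csv
import io
import json


def fields_to_csv(payload: str) -> str:
    """Flatten an assumed {"fields": {name: {"value", "confidence"}}} JSON into CSV text."""
    data = json.loads(payload)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["field", "value", "confidence"])
    for name, info in data["fields"].items():
        # Missing keys become empty cells instead of raising an error.
        writer.writerow([name, info.get("value", ""), info.get("confidence", "")])
    return buffer.getvalue()


if __name__ == "__main__":
    # Sample payload in the assumed shape; the values are invented.
    sample = json.dumps({
        "fields": {
            "InvoiceNumber": {"value": "INV-001", "confidence": 0.98},
            "Total": {"value": "1,250.00", "confidence": 0.91},
        }
    })
    print(fields_to_csv(sample))
```

Values that contain a comma, such as the total above, are quoted by the `csv` module so the columns stay aligned when the file is opened in a spreadsheet.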
As you can see, a series of simple steps accomplishes a complex task within hours, but it comes at a higher cost. See the Power Platform Pricing post for more details.
This is not the end of the topic. We will cover more on Computer Vision, Azure OCR, and OCR in other RPA tools.
Please post your queries in the comment section. Happy Building 🙂