Using Artificial Intelligence (AI) to unlock insights from unstructured documents efficiently
Unstructured data is growing at an unprecedented rate. Gartner predicts that enterprise data will grow by 800 per cent in five years, with 80 per cent of it unstructured. This leaves a huge untapped opportunity to better understand and leverage unstructured data. Azure Form Recognizer applies advanced machine learning to accurately extract text, key/value pairs, and tables from documents. This increases your operational efficiency, improves customer experience, and helps you make better decisions.
It’s not the data, but what you do with it.
Let’s build an AI-powered form recognizer with Azure Form Recognizer (an Azure Cognitive Service):
Prerequisite: an Azure subscription. Create one for free.
Follow the steps below:
1. Understand the invoice: The sample data can be downloaded from the GitHub repository.
2. Create an Azure Storage Blob Container:
A. Follow the instructions below to create an Azure Storage account.
B. Create a blob container within the Azure Storage account.
3. Upload sample data to the Azure blob container:
Upload the sample data from the GitHub repository to the container as block blobs.
Copy the URL of the container; you’ll need it later.
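To give a feel for what that URL looks like, here is a minimal sketch that assembles a container URL from a storage account name and a container name, and optionally appends a shared access signature (SAS) token for a container that is not publicly readable. The account name, container name, and helper names are placeholders made up for this example, and the host pattern assumes the public Azure cloud.

```python
from urllib.parse import urlsplit


def container_url(account: str, container: str, sas_token: str = "") -> str:
    """Build the URL of a blob container (public Azure cloud host pattern).

    Pass `sas_token` when the container is not publicly readable.
    """
    url = f"https://{account}.blob.core.windows.net/{container}"
    token = sas_token.lstrip("?")
    return f"{url}?{token}" if token else url


def looks_like_container_url(url: str) -> bool:
    """Cheap sanity check: HTTPS, a blob host, and exactly one path segment."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    return (
        parts.scheme == "https"
        and parts.hostname is not None
        and ".blob." in parts.hostname
        and len(segments) == 1
    )


# Placeholder names for illustration only.
url = container_url("mystorageaccount", "invoices")
print(url)
```

The check is deliberately loose; it only catches obvious mistakes such as pasting the URL of a single blob instead of the container.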
4. Create a Form Recognizer resource:
Go to the Azure portal and create a new Form Recognizer resource. Use the information below:
When the Form Recognizer resource finishes deploying, find and select it from the All resources list in the portal.
Then select the Quick start tab, and save the values of Key and Endpoint to a temporary location. You’ll use them in the following steps.
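If you later want to call the service from a script, one option is to keep both values in environment variables rather than in the code. A minimal sketch, assuming the variable names FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY (hypothetical names chosen for this example); Ocp-Apim-Subscription-Key is the header Cognitive Services uses for key-based authentication.

```python
import os


def load_settings(env=os.environ) -> dict:
    """Read the endpoint and key saved from the Quick start tab."""
    endpoint = env.get("FORM_RECOGNIZER_ENDPOINT", "").strip().rstrip("/")
    key = env.get("FORM_RECOGNIZER_KEY", "").strip()
    if not endpoint or not key:
        raise ValueError("Set FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY first")
    return {"endpoint": endpoint, "key": key}


def auth_headers(key: str) -> dict:
    """Headers for key-based authentication against a Cognitive Services resource."""
    return {"Ocp-Apim-Subscription-Key": key}


# Example with placeholder values instead of real credentials.
settings = load_settings({
    "FORM_RECOGNIZER_ENDPOINT": "https://example.cognitiveservices.azure.com/",
    "FORM_RECOGNIZER_KEY": "0000",
})
print(settings["endpoint"])
```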
5. Create your logic app:
In this tutorial, we use Azure Logic Apps to automate and orchestrate tasks and workflows. Specifically, the logic app is triggered when an email arrives with an invoice attached that you want to analyze.
A. From the main Azure menu, select Create a resource > Integration > Logic App. Under Create logic app, provide details about your logic app as shown here.
B. After Azure deploys your app, on the Azure toolbar, select Notifications > Go to resource for your deployed logic app. In the Logic Apps Designer, under Templates, select Blank Logic App.
6. Configure the logic app to trigger the workflow when an email arrives
A. From the tabs, select All, select Office 365 Outlook, and then under Triggers, select When a new email arrives.
B. In the Office 365 Outlook box, select Sign in, and enter the details to log in to an Office 365 account.
C. In the next dialog box, perform the following steps.
D. Click Save from the toolbar at the top.
7. Configure the logic app to use Form Recognizer Train Model operation
A. Select New step, and under Choose an action, search for Form Recognizer. Follow the steps below:
B. In the Form Recognizer dialog box, provide a name for the connection, and enter the endpoint URL and the key from the Form Recognizer resource.
C. In the Train Model dialog box, for Source, enter the URL for the container where you uploaded the sample data.
D. Click Save from the toolbar at the top.
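Behind the connector, Train Model is a REST call. The sketch below builds such a request with the Python standard library without sending it. The path /formrecognizer/v1.0-preview/custom/train, the source body field, and the modelId response field are assumptions based on the v1.0 preview REST API, so check the API reference for the version your resource exposes. The endpoint, key, and container URL are placeholders.

```python
import json
import urllib.request

# Assumed path of the preview Train Model operation; verify against the API reference.
TRAIN_PATH = "/formrecognizer/v1.0-preview/custom/train"


def build_train_request(endpoint: str, key: str, container_sas_url: str) -> urllib.request.Request:
    """Prepare a POST asking the service to train on the documents in a blob container."""
    body = json.dumps({"source": container_sas_url}).encode("utf-8")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + TRAIN_PATH,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
    )


# Placeholder values; nothing is sent over the network here.
request = build_train_request(
    "https://example.cognitiveservices.azure.com",
    "0000",
    "https://mystorageaccount.blob.core.windows.net/invoices?sv=placeholder",
)
print(request.get_method(), request.full_url)

# To actually train, send the request and read the model ID (assumed field name):
# with urllib.request.urlopen(request) as response:
#     model_id = json.load(response)["modelId"]
```

Keeping the request construction separate from the network call makes it easy to inspect what would be sent before using a real key.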
8. Configure the logic app to use the Form Recognizer Analyze Form operation
A. Select New step, and under Choose an action, search for Form Recognizer. Follow the steps below:
B. In the Analyze Form dialog box, do the following steps:
1. Click the Model ID text box, and in the dialog box that opens, under the Dynamic Content tab, select modelId. This gives the workflow the model ID of the model you trained in the previous step.
2. Click the Document text box, and in the dialog box that opens, under the Dynamic Content tab, select Attachments Content. This configures the workflow to use the sample invoice file attached to the email that triggers it.
3. Click Save from the toolbar at the top.
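The Analyze Form operation can be pictured the same way: the document bytes are posted to a model-specific URL. A minimal sketch that builds, but does not send, such a request. The path /formrecognizer/v1.0-preview/custom/models/{modelId}/analyze and the use of the raw file bytes with an application/pdf content type are assumptions based on the v1.0 preview REST API; the endpoint, key, model ID, and document bytes are placeholders.

```python
import urllib.request
from urllib.parse import quote


def build_analyze_request(
    endpoint: str,
    key: str,
    model_id: str,
    document: bytes,
    content_type: str = "application/pdf",
) -> urllib.request.Request:
    """Prepare a POST that submits one document to a trained custom model."""
    # Assumed preview path; verify against the API reference for your resource.
    path = f"/formrecognizer/v1.0-preview/custom/models/{quote(model_id, safe='')}/analyze"
    return urllib.request.Request(
        url=endpoint.rstrip("/") + path,
        data=document,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Ocp-Apim-Subscription-Key": key,
        },
    )


# Placeholder values; in the logic app, the model ID comes from the Train Model
# step and the document comes from the email attachment.
analyze_request = build_analyze_request(
    "https://example.cognitiveservices.azure.com/",
    "0000",
    "00000000-0000-0000-0000-000000000000",
    b"%PDF-1.4 placeholder bytes",
)
print(analyze_request.full_url)
```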
9. Extract the table information from the invoice:
1. Select Add an action, and under Choose an action, search for Compose; from the actions that are available, select Compose.
2. In the Compose dialog box, click the Inputs text box, and from the dialog box that pops up, select tables.
3. Click Save.
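The tables value is part of the JSON that Analyze Form returns. A minimal sketch of turning one such table into plain rows, assuming each table is a list of columns with header and entries fields, as in the preview response format (an assumption; inspect your own output to confirm the field names). The sample table is made up for illustration.

```python
from itertools import zip_longest


def cell_text(cell: list) -> str:
    """Join the text fragments that make up one header or entry cell."""
    return " ".join(part.get("text", "") for part in cell).strip()


def table_to_rows(table: dict) -> list:
    """Convert a column-oriented table into a list of {header: value} rows."""
    columns = table.get("columns", [])
    headers = [cell_text(col.get("header", [])) for col in columns]
    entries = [[cell_text(e) for e in col.get("entries", [])] for col in columns]
    # Columns may have different lengths; pad the short ones with empty strings.
    return [dict(zip(headers, row)) for row in zip_longest(*entries, fillvalue="")]


# Made-up table in the assumed column-oriented shape.
sample_table = {
    "id": "table_0",
    "columns": [
        {"header": [{"text": "Item"}], "entries": [[{"text": "Widget"}], [{"text": "Gadget"}]]},
        {"header": [{"text": "Total"}], "entries": [[{"text": "10.00"}], [{"text": "25.50"}]]},
    ],
}

for row in table_to_rows(sample_table):
    print(row)
```

Row-oriented dictionaries are usually easier to compare against the invoice or to load into a spreadsheet than the column-oriented original.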
10. Test your logic app
To test the logic app, use the sample invoices in the /Test folder of the sample data set from GitHub. Follow these steps:
1. From the Azure Logic Apps designer for your app, select Run from the toolbar at the top. The workflow is now active and waits to receive an email with the invoice attached.
2. Send an email with a sample invoice attached to the email address that you provided while creating the logic app. Make sure the email is delivered to the folder that you provided while configuring the logic app.
3. As soon as the email is delivered to the folder, the Logic Apps Designer shows a screen with the progress of each stage. In the screenshot below, you see that an email with an attachment is received and the workflow is in progress.
4. After all the stages of the workflow have finished running, the Logic Apps Designer shows a green check mark against every stage. In the designer window, select For each 2, and then select Compose.
From the OUTPUTS box, copy the output and paste it to any text editor.
5. Compare the JSON output with the sample invoice that you sent as an attachment in the email. Verify that the JSON data corresponds to the data in the table within the invoice.
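The comparison can also be scripted. A minimal sketch, assuming the copied output is available as a JSON string and that you type the expected cell values by hand. The helper collects every string stored under a text key anywhere in the JSON, so it does not depend on the exact response layout. The sample output and expected values are made up for this example.

```python
import json


def collect_text(node) -> set:
    """Recursively gather every string stored under a "text" key."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "text" and isinstance(value, str):
                found.add(value.strip())
            else:
                found |= collect_text(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_text(item)
    return found


def missing_values(output_json: str, expected: list) -> list:
    """Return the expected values that do not appear in the extracted output."""
    extracted = collect_text(json.loads(output_json))
    return [value for value in expected if value not in extracted]


# Made-up output standing in for what you copied from the OUTPUTS box.
copied_output = '[{"columns": [{"header": [{"text": "Total"}], "entries": [[{"text": "25.50"}]]}]}]'
print(missing_values(copied_output, ["Total", "25.50", "99.99"]))
```

An empty result means every expected value was found; anything listed is worth checking against the invoice by eye.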
The tutorial is complete!
Privacy & Security is Key.
Azure Form Recognizer is offered as a preview of an Azure service under the Online Service Terms. As with all the cognitive services, developers using the Form Recognizer service should be aware of Microsoft policies on customer data.
Process documents cost-effectively:
Use the pricing calculator to configure and estimate the costs of Azure products.
Here is a snapshot of pricing details for Form Recognizer:
What’s Next?
Next, learn how to build a training data set to create a similar scenario with your own forms.