How to extract text inside an image without writing any code
Guest post from Nick Proud, Software Architect of ThinkAutomation.
In the modern-day battle to transfer our information from paper to the digital realm, OCR (Optical Character Recognition) has become essential to the smooth running of countless business processes.
OCR is used for everything from recognising license plates to digitising literature, yet it is often seen as reserved for those proficient in one of the many programming languages on offer today. However, there is an easy way to achieve text recognition from images automatically, all without writing a single line of code.
ThinkAutomation has been around for some time. Originally known as Email2DB, it has grown significantly from its original offering as a way of automating the movement of email content to a database. Now, it is an all-purpose toolkit for the automation of business processes.
More recently, the application has ventured into the world of AI. ThinkAutomation leverages Microsoft Azure’s Computer Vision API to let people extract text from images with OCR, and to obtain information about the contents of an image, without human intervention.
Following the text extraction, users can do whatever they want using a set of ‘if this, then that’ style drag and drop actions. This includes sending an email or SMS message, running a database command, or even posting to Facebook.
So, let’s say you’ve been tasked with automating a process based on information received inside an image, such as a driving license. You are working for an undisclosed agency that is looking for someone: Sarah Morgan. We scan driving licenses daily, and if we find her name, we need to tell someone immediately.
Where Azure comes in
We may not have to write any code, but we still need to allocate a little cloud-computing resource to our new project to find Sarah Morgan. Don’t despair if you have never used Azure before though. It’s super simple to set up the resource we need for ThinkAutomation.
ThinkAutomation needs you to have an Azure Cognitive Services endpoint that it can contact and then facilitate the analysis of the images of driving licenses we are passing to the software.
To set this up, you will need a Microsoft Azure account. Microsoft offers a free trial, and for most resources the paid versions are extremely low cost, as well as being available on a ‘pay as you go’ basis. Take a look here at how to set up your Azure account and then the Azure resource you need.
If you followed the guidance in the two links above then you will now have the two elements required for ThinkAutomation to speak to Azure:
- A Cognitive Services endpoint, which is just a URL
- A unique subscription key, a value allowing you to authenticate to your Azure account
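ThinkAutomation handles the conversation with Azure for you, so none of this is required. For the curious, here is a rough sketch of how an endpoint and a subscription key are typically combined into an OCR request. It assumes the v3.2 `ocr` operation of the Computer Vision REST API; the function names are hypothetical, and how ThinkAutomation itself calls Azure is not described in this post.

```python
import urllib.request

# Assumption: the v3.2 "ocr" operation of the Computer Vision REST API.
OCR_PATH = "/vision/v3.2/ocr"


def build_ocr_request(endpoint, subscription_key, image_bytes):
    """Build (but do not send) a POST request carrying the raw image bytes."""
    url = endpoint.rstrip("/") + OCR_PATH
    headers = {
        # The subscription key authenticates the call to your Azure resource.
        "Ocp-Apim-Subscription-Key": subscription_key,
        # Tells the service the body is a binary image, not a JSON document.
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


def text_from_ocr_response(response):
    """Flatten the regions -> lines -> words structure of the reply into text."""
    lines = []
    for region in response.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(word["text"] for word in line.get("words", [])))
    return "\n".join(lines)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` and parsing the JSON reply before handing it to `text_from_ocr_response`.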
Let’s get ThinkAutomating!
So we’ve got Azure set up and ready to receive data. Now we need to get ThinkAutomation set up. Go ahead and download the software using the free trial and then install it on a Windows machine.
Next, we need to configure the software to pick up the images of driving licenses that we have scanned and saved to disk. ThinkAutomation uses Accounts as a way of receiving messages from multiple sources. Don’t let the terminology confuse you.
Messages could indeed be emails or SMS messages, but they could also be tweets, files from our hard drive, or results from a database query.
Each account can contain multiple triggers, which are just workflows of actions you want to take place automatically once a message is received.
Once an account has received a message, the triggers will execute, automatically, running actions you have instructed ThinkAutomation to perform. That could be to evaluate the received data further, send an email, call a web service, or pass the message on to some other software.
Create a ThinkAutomation account
We are going to create our first account, which will be a File Pickup account, scheduled to run on a timer. More specifically, it will check a location on our hard drive every three minutes for newly added image files. Then it will scrape any text it finds in them and save it to the message body, so that we can do stuff with it.
Start by clicking one of the giant plus icons to add a new account. This will open the account properties screen. Give the account a name.
I’m going to call my account ‘OCR Processor’… because that’s what it does.
Then, further down on this screen, we want to enable OCR. With OCR enabled, ThinkAutomation knows that whenever this Account performs a File Pickup on an image, it should try to extract text from it.
OCR on the Account level is currently only supported on file pickups.
When you see the OCR Settings button appear, click it.
Here you can add the two pieces of information you obtained from your newly created Azure resource. Once you have done this, click ‘OK.’
Set up file pickup
Great! We should now be connected to our Azure resource via ThinkAutomation. Now we just need to set up the application to automatically pick up image files from a directory. To do this, head over to the ‘File Pickup’ tab shown in the top ribbon.
As you can see from my example above, I am telling ThinkAutomation that I want to enable File Pickup for this account. This means that it will check for files automatically every X minutes.
I’ve specified that I want the application to look for files on the desktop which have an extension of type ‘.jpg.’ The * character simply means any name. If you wanted to pick up all file names with all extensions, you would use *.*
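If wildcard patterns are new to you, the short sketch below shows the same idea using Python’s standard `fnmatch` module. The file names are made up for illustration, and treating the match as case-insensitive is an assumption about how Windows file names behave, not something ThinkAutomation’s documentation is quoted on here.

```python
from fnmatch import fnmatchcase


def matching_files(file_names, pattern):
    """Return the names that match a wildcard pattern such as '*.jpg'."""
    # Assumption: matching ignores case, so both sides are lowercased first.
    return [
        name for name in file_names
        if fnmatchcase(name.lower(), pattern.lower())
    ]


# Hypothetical contents of the pickup folder.
files = ["license1.jpg", "scan.PNG", "notes.txt", "Sarah.JPG"]

print(matching_files(files, "*.jpg"))  # only the JPEG images
print(matching_files(files, "*.*"))    # every name that has an extension
```

Note that in `fnmatch`, `*.*` only matches names that actually contain a dot.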
Set up a trigger
Once this is completed, ThinkAutomation will ask you if you want to set up a trigger for this account. Click yes. You will then see a similar setup screen to the account screen. This is where we will configure what happens once a file is picked up.
Just like the account setup, we give the trigger a name. In this case, we are simply calling it ‘OCR.’ There are a lot of options on this screen for stating specifically how we want to filter incoming messages. Since we want to perform automated actions on all incoming messages, we can proceed straight to the Trigger Actions page. Click ‘Trigger Actions’ shown in the top ribbon of the screen.
The trigger action screen provides us with a toolbox of actions along the left-hand side of the screen, which we can drag into a canvas on the right. The actions will run sequentially. (Much like they would if they were lines of code.)
The above example shows a trigger I set up previously to achieve our goal of finding Sarah Morgan from our pile of driving license photos. Let’s go through how we create the above.
Go to the toolbox’s search bar in the bottom left of the screen and search for ‘if’. You should see that a trigger action called an If block appears in the list. We can use this to wrap an action in a condition so that it only happens if the condition is met.
Click and drag the if block action to the canvas and you should see a condition builder window open.
As you can see above, we have used the dropdown boxes in the condition builder to build our condition. We simply say that if the message body (in this case, the text extracted from the incoming image via OCR) contains ‘SARAH’ and if it contains the word ‘MORGAN’, then we class this as Sarah Morgan being found, and we can do something inside this if block.
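For comparison, the condition built in the dropdowns is roughly equivalent to the small check below. This is only a sketch: the function name is hypothetical, and uppercasing the text first is an assumption made so that the check does not depend on how the OCR result is capitalised.

```python
def sarah_morgan_found(message_body):
    """True when the OCR text contains both 'SARAH' and 'MORGAN'."""
    # Assumption: compare in upper case so 'Sarah' and 'SARAH' both count.
    text = message_body.upper()
    return "SARAH" in text and "MORGAN" in text
```

Both words must be present for the condition to pass, so a license for a different Sarah, or a different Morgan, would not match.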
Next, use the search bar again to find the ‘send an email’ trigger action. Drag it into the if-block on the canvas. The trigger action will then open up for you to configure.
Because the ‘send an email’ trigger action now sits inside our if block, the email will only be sent when the condition we’ve set is met.
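Expressed as code, an action wrapped in an if block behaves something like the sketch below. Everything here is illustrative: `Outbox` is a stand-in that records emails instead of sending them, and `run_trigger` and the recipient address are hypothetical names, not part of ThinkAutomation.

```python
from dataclasses import dataclass, field


@dataclass
class Outbox:
    """Stand-in for an email action: records messages instead of sending."""
    sent: list = field(default_factory=list)

    def send_email(self, to, subject, body):
        self.sent.append((to, subject, body))


def run_trigger(message_body, condition, outbox, recipient="alerts@example.com"):
    """Run the email action only when the condition holds for the message."""
    if condition(message_body):
        outbox.send_email(recipient, "Match found", message_body)
        return True
    return False


outbox = Outbox()
run_trigger("SARAH MORGAN", lambda body: "MORGAN" in body, outbox)  # email recorded
run_trigger("JOHN SMITH", lambda body: "MORGAN" in body, outbox)    # nothing sent
```

After both calls, `outbox.sent` holds a single entry, for the first message only.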
Let’s assume then that our condition was met: we save the image below to our desktop and ThinkAutomation picks it up.
The text in this image will be extracted, and we will be able to look at the message body by clicking ‘View Processed Messages’ and double-clicking the message found in the list.
As you can see, the image that was pulled in was sent to Azure for OCR and the text added to the message body. Also, because our condition was met, an email will have been sent to say that we found the person we were looking for.
Just one example
Here we can see a basic example of how we can easily perform OCR on images without having to write any code. For more examples or guidance on how you can set this up in ThinkAutomation, documentation is available here.
N.B. Nick also published this post here: https://www.automationmission.com/2020/03/02/how-to-extract-text-inside-an-image-without-writing-any-code/