Hey all, I’m just wondering how doable it would be to read digits from a picture and turn them into text. Any experience or pointers on where to look in this area would be a great help. I’ve been tasked with writing an app to read utility meters, and being able to use the iPhone’s camera to read a digital meter would be a great addition.
Also, does anyone have experience with reading barcodes and/or QR codes?
Thanks in advance!
Can you post the picture you want to read data from?
I used to work with bar codes long ago and I’ve forgotten more than I remember. Maybe this will help you. I’m sure searching the net will turn up more information. http://m.wikihow.com/Read-12-Digit-UPC-Barcodes
EDIT: Actually I had a program that created barcodes for test conditions. I would be given a list of UPC numbers needed for a test and I would create sheets of barcodes in bmp format that could be scanned.
I would say doable but not straightforward. It will depend a lot on how standardised the text is: always the same font? Same colour? Same background? Same orientation? The less variation the better, though this may make the system less practical.
.@Briarfox. Thanks for giving me a suggestion for something to code. I’ll take a picture of a barcode and see if I can read it from the picture. I think it should be easy to do. I’ll give you the code if I’m able to do it.
You guys are great, thanks for the info. This isn’t the exact meter but it’s pretty similar. If the meter is rolling I could see it being a problem; however, it could show the digits that it can read and ask for user input for the unreadable ones.
It seems to me you can break this into several steps:
- Locate the black area containing the numbers and figure out its size.
- Since the number of chars is constant, break up this area into individual chars like a sprite sheet.
- Recognise individual chars one by one. That is obviously the hard part.
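For what it's worth, the first two steps could be sketched roughly like this in Python. This is only a sketch under assumptions: the photo has already been thresholded into a 2D list where 1 marks a pixel belonging to the black display area, the display is upright, and the digits are evenly spaced. The names `bounding_box` and `split_into_cells` are made up for illustration.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right) of all truthy pixels.

    bottom and right are exclusive, so height = bottom - top.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows or not cols:
        raise ValueError("no display pixels found")
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def split_into_cells(box, n_digits):
    """Cut the box into n_digits side-by-side rectangles, like a sprite sheet."""
    top, left, bottom, right = box
    width = right - left
    cells = []
    for i in range(n_digits):
        # Integer arithmetic keeps the cells contiguous with no gaps or overlap.
        x0 = left + (i * width) // n_digits
        x1 = left + ((i + 1) * width) // n_digits
        cells.append((top, x0, bottom, x1))
    return cells
```

Each returned rectangle can then be cropped out and handed to whatever does the recognition in step 3.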
I might look on the net for step 3, but my first thought would be to break each char rectangle into a cell grid and calculate the proportion of white pixels in each “cell”. A 1 would have nothing in the side “cells”, ie is narrow, while a 7 is fat across the top and narrow below, etc.
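The cell-grid idea might look something like the following in Python. It is a minimal sketch, assuming each char has already been cropped to a 2D list of 0/1 values where 1 marks a digit pixel; the function name `cell_densities` and the default grid size are arbitrary choices for illustration.

```python
def cell_densities(glyph, grid_rows=5, grid_cols=3):
    """Return the proportion of digit pixels in each grid cell, row by row."""
    height, width = len(glyph), len(glyph[0])
    features = []
    for r in range(grid_rows):
        y0 = r * height // grid_rows
        y1 = (r + 1) * height // grid_rows
        for c in range(grid_cols):
            x0 = c * width // grid_cols
            x1 = (c + 1) * width // grid_cols
            area = (y1 - y0) * (x1 - x0)
            lit = sum(glyph[y][x] for y in range(y0, y1) for x in range(x0, x1))
            # A cell can be empty if the glyph is smaller than the grid.
            features.append(lit / area if area else 0.0)
    return features


# A crude "1": only the middle column is filled, so the side cells stay empty.
one = [[0, 1, 0] for _ in range(6)]
print(cell_densities(one, grid_rows=2, grid_cols=3))
# -> [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
```

The resulting list of proportions is the kind of "signature" that the later suggestions in this thread (nearest match, ANN, GA weightings) would take as input.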
You would want to collect maybe several hundred example photos and tune your algorithm with them, using perhaps 2/3 of them for tuning and 1/3 as a test set which the algorithm is required to solve, sight unseen. This is standard practice in AI work.
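The 2/3 tuning, 1/3 test split can be done in a few lines of Python. This is just a sketch: `split_samples` is a made-up name, and the fixed seed is an assumption so that the same split comes back on every run.

```python
import random


def split_samples(samples, tune_fraction=2 / 3, seed=0):
    """Shuffle the samples and split them into (tuning set, test set)."""
    shuffled = list(samples)  # copy so the caller's list is left alone
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * tune_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for several hundred meter photos.
photos = [f"meter_{i:03d}.jpg" for i in range(300)]
tuning, test = split_samples(photos)
print(len(tuning), len(test))  # -> 200 100
```

The important part is that the test set is never looked at while tuning, otherwise the final error rate will look better than it really is.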
As @Ignatz says, part 3 is achievable with something like an artificial neural network. You would “train” your ANNs using a number of different examples of each character then you should be able to use your trained ANN to recognise instances of previously unseen characters.
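To give a feel for what "training" means here, below is a Python sketch of a single-layer perceptron, which is about the simplest network there is. It is not a full ANN with hidden layers, and a real digit reader would likely need one plus far more examples. The assumptions are that each character has been reduced to a list of numbers (for example grid-cell proportions) and that labels are integers starting at 0; the function names are made up.

```python
def dot(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))


def predict(weights, features):
    """Return the index of the class whose weight row scores highest."""
    inputs = list(features) + [1.0]  # constant bias input
    scores = [dot(row, inputs) for row in weights]
    return scores.index(max(scores))


def train_perceptron(examples, n_classes, epochs=20):
    """examples is a list of (features, label) pairs; returns the weight rows."""
    n_inputs = len(examples[0][0]) + 1  # +1 for the bias
    weights = [[0.0] * n_inputs for _ in range(n_classes)]
    for _ in range(epochs):
        mistakes = 0
        for features, label in examples:
            guess = predict(weights, features)
            if guess != label:
                mistakes += 1
                inputs = list(features) + [1.0]
                # Pull the right class towards this example, push the wrong one away.
                for i, x in enumerate(inputs):
                    weights[label][i] += x
                    weights[guess][i] -= x
        if mistakes == 0:
            break  # every training example is classified correctly
    return weights


# Three toy "characters", each described by three numbers.
training = [([1, 0, 0], 0), ([0, 1, 0], 1), ([0, 0, 1], 2)]
net = train_perceptron(training, n_classes=3)
print(predict(net, [0.9, 0.2, 0.1]))  # a noisy, unseen version of class 0 -> 0
```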
@Ignatz thanks for breaking it up for me! @west I think I understand what you mean by ANNs. This should be fun!
So far I’ve just turned all non-digit pixels to 0 alpha and made the digits black. Next I’ll try to separate them. I’ll google ANNs and see if I can get the logic of it.
Once again thank you all. I sure do love this community. So very helpful!
.@Briarfox I have the code done that will decode a UPC barcode that I took a picture of. Do you have a picture so I can try one of your barcodes? The code is kind of messy and will probably need tweaking depending on the sharpness and contrast of your barcode. Once I get it working with your barcode, I’ll post the code and you can modify it from there for your needs.
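One thing that helps when decoding from a blurry or low-contrast photo: the 12th digit of a UPC-A code is a check digit, so a misread can usually be caught. The Python sketch below is not the decoder mentioned above, only the standard check-digit arithmetic; the function names are made up.

```python
def upc_check_digit(first_eleven):
    """Compute the UPC-A check digit from the first 11 digits (as a string)."""
    digits = [int(ch) for ch in first_eleven]
    if len(digits) != 11:
        raise ValueError("expected exactly 11 digits")
    odd_sum = sum(digits[0::2])   # 1st, 3rd, ... 11th digits
    even_sum = sum(digits[1::2])  # 2nd, 4th, ... 10th digits
    return (10 - (3 * odd_sum + even_sum) % 10) % 10


def is_valid_upc(code):
    """True if code is 12 digits and its last digit matches the check digit."""
    if len(code) != 12 or not (code.isascii() and code.isdigit()):
        return False
    return upc_check_digit(code[:11]) == int(code[11])


print(is_valid_upc("036000291452"))  # -> True
print(is_valid_upc("036000291453"))  # -> False, last digit misread
```

If the check fails, the app can rescan or ask the user rather than silently store a wrong number.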
@Briarfox - ANNs are black boxes that can be quite tricky to use effectively. An alternative is genetic algorithms (GAs), for which I’ve already written Codea code.
What they can do, for example, is find the best weightings to give each “cell” in your grid to recognise different letters. I have done a lot of playing with GAs and am happy to help out, if you’d like. The advantage is that your eventual algorithm is easy to understand.
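To illustrate the idea (this is not the Codea code mentioned above), here is a small Python sketch of a GA searching for one weight per grid cell so that a weighted sum separates two kinds of character. The assumptions: labels are +1 or -1, fitness is simply the number of samples classified correctly, and the population size, mutation strength and function names are arbitrary.

```python
import random


def fitness(weights, samples):
    """Count samples whose weighted sum has the same sign as their label."""
    hits = 0
    for features, label in samples:
        score = sum(w * f for w, f in zip(weights, features))
        if (score > 0) == (label > 0):
            hits += 1
    return hits


def evolve(samples, n_weights, pop_size=30, generations=40, seed=0):
    """Return (best weights, best fitness per generation). Needs n_weights >= 2."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, samples), reverse=True)
        history.append(fitness(population[0], samples))
        parents = population[:pop_size // 2]  # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights)       # one-point crossover
            child = mum[:cut] + dad[cut:]
            child[rng.randrange(n_weights)] += rng.gauss(0, 0.3)  # mutation
            children.append(child)
        population = parents + children
    best = max(population, key=lambda w: fitness(w, samples))
    return best, history + [fitness(best, samples)]
```

Because the final result is just a list of weights, one per cell, you can look at it and see which parts of the character the algorithm relies on.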
The first step in any case is to get lots and lots of photos of numbers, to get an idea of the likely variability in quality, size, clarity, etc.
What you want to do is an ‘OCR’ (optical character recognition) program.
They have been around for 30 years and they don’t work perfectly yet.
When you say ‘I’ve been tasked’, do you mean professionally? Or at school? For school it is not a big deal. It is quite different if it is professional: having reliable recognition software is difficult. You have to:
- normalize the image levels so that the brightness/contrast is always the same. The best solution for your case is probably 1/ flatten the image (remove the x gradient and y gradient) then 2/ equalize the histogram.
- normalize the scale/rotation to have always the same size & orientation for characters. This is easier if there is something easy to detect, like 2 identical screws, or two dots in your image. You detect them with a ‘matched filter’, basically a convolution with a template (much faster to do in Fourier domain, ie perform FFT). However, because the scale can be different, you must match with several samples and choose the one for which you get the most signal. Once you get the position of your 2 screws, you can ‘warp’ the image with a ‘bilinear’ interpolation algorithm.
- at this stage you must recognize your characters. You know where they are almost exactly, so you can extract a small window for each. Then you can make a dot product of this window with each of the templates of your figures (10 of them). You pick the highest result as the figure value. This will work much better if you use for templates not the figure images, but a processed version of them, intended to provide better discrimination (‘Synthetic Discriminant Filters’). But to compute them you will need again FFT and matrix inversion.
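The dot-product step in the last point could be sketched like this in Python. It uses plain template images, not Synthetic Discriminant Filters, and the mean subtraction and scaling are only a crude stand-in for the brightness/contrast normalization described in the first point. The assumptions: the window and the templates are 2D lists of the same size, and the names `normalise` and `classify` are made up. Only two tiny templates are shown instead of ten.

```python
import math


def normalise(window):
    """Flatten a 2D window, remove its mean and scale it to unit length."""
    flat = [p for row in window for p in row]
    mean = sum(flat) / len(flat)
    centred = [p - mean for p in flat]
    norm = math.sqrt(sum(c * c for c in centred))
    return [c / norm for c in centred] if norm else centred


def classify(window, templates):
    """Return (best label, all scores) using the dot product with each template."""
    w = normalise(window)
    scores = {label: sum(a * b for a, b in zip(w, normalise(t)))
              for label, t in templates.items()}
    return max(scores, key=scores.get), scores


templates = {
    "1": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "7": [[1, 1, 1], [0, 0, 1], [0, 0, 1]],
}
noisy_seven = [[0.9, 1.0, 0.8], [0.1, 0.0, 1.0], [0.0, 0.1, 0.9]]
print(classify(noisy_seven, templates)[0])  # -> 7
```

A score near 1 means a close match, so a low best score could be used to flag a character as unreadable.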
You could try a simpler way (but less efficient) by computing locally a set of standard ‘image invariants’ that are rotation/scale/contrast stable, and measure the distance between your character signature and the results.
I have put terms in ‘…’ so you can look for them on the web.
Maybe you could also search for ‘lua OCR’ on google, and get a ready-to-use algo?
Thanks guys! This is a program for work. It’s my family’s company and I wrote the billing software a while back. I’ve been asked to write an iPad app to handle utility readings. Figured I’d have some fun with it and look into using the iPhone camera to gather as much, if not all, of the reading as possible. You guys have introduced me to some new terms, so I’ll google for a bit and come back when I have questions.
@ignatz sure, I’d appreciate any info on GAs. Your tutorial on images helped me get started.
@dave1707 I do not have a specific barcode yet, I was going to create them when I knew I could read them via Codea. Thanks for working on the barcode, that’s really awesome.
If you type ‘OCR’ in the App Store you get a lot of apps, some of them free.
You could try them first.
@Jmv38, interesting stuff! I managed to understand most of this - but you lost me at ‘Synthetic Discriminant Filters’
This does raise a good question: does anyone know of any good ‘general purpose’ image processing libraries out there? Obviously, Lua would be great, but a C or Python one would be pretty easy to port across.
While OCR isn’t perfect, your problem domain is quite tight, ie it’s numbers only. Coming up with your own algorithm shouldn’t be too hard. The issues I see will be:
Do all the meters have a consistent font? If there is variation then it will be tricky.
Sometimes a number on the meter will be half turned, eg as it moves from 7 → 8. It could be anywhere in this rotation, so recognising these will be tricky.
I agree that this is a fairly tight problem domain, which means it may be possible to solve it yourself without requiring a PhD.
However, I imagine:
- the “noise” level in the photos could be quite high, eg cracked window glass, reflections, half-turned digits, etc.
- the error level will need to be extremely low if you don’t want to upset your clients.
I’ll message you about GAs, but I do suggest in any event that you get a substantial pilot sample of photos to give you an idea of the variability you are going to have to deal with.
@andymac3D - I think Jmv38 must have made up that term Synthetic Discriminant Filters.
(I hope he did because it doesn’t sound like something I would have a hope of understanding! :-? )
@Ignatz, in my experience GAs’ strength lies in optimization, and ANNs would be a better choice for pattern recognition. ANNs have been used many times for similar types of problems (OCR, handwriting recognition, car number plate detection, etc) and there are lots of papers and tutorials on them (though not in Codea).
For example, here is a tutorial about the use of ANNs in OCR for MATLAB, which uses one of the standard libraries the academic community uses to test its techniques:
@BriarFox - it’s an easy problem to describe on paper, but is likely to be very challenging!
If you want a really, really, really basic approach written in Codea, I had a go at a simple handwriting recognition program a while back. It uses similar principles, where an input picture is compared to a set of example cases and the closest match is chosen. It only uses a 5 by 5 grid but I got reasonable results: