First, take a look at this photo:
Notice anything strange? In fact this image has been tampered with to cover up a truck. The truck is completely hidden by foliage. Here's the original:
Wouldn't it be nice to be able to detect that automatically? It is possible. Here's an image automatically generated by my code showing what was moved. All of the red was moved to the blue (or the other way around).
I was motivated to work on this program by greed (or at least my never-ending love of having a little flutter on things). Best of the Best runs spot-the-ball competitions in airports to win very expensive cars. But they also run the same competition online. That meant I could get my hands on the actual image used... could I process it to discover where the ball had been removed? (In reality, this isn't the right way to win because the actual ball position is not governed by where it actually was, but where a judge thinks it was).
Would it be cheating if I could? Apparently not, the competition rules say I should use my skill and judgment in determining the ball position. Surely, skill covers my programming ability.
So, I went looking for tampering algorithms and eventually came across Detection of Copy-Move Forgery in Digital Images written by Jessica Fridrich at SUNY Binghamton. The paper describes an algorithm for detecting just the sort of changes I thought I was looking for.
Unfortunately, I know nothing about image processing. Fortunately, the paper is written in a very clear style and a bit of Internet research enabled me to track down the knowledge I didn't have. (Also, thanks to Jessica for sending me the original images she used to test my implementation).
In brief the algorithm does the following:
- Slide a 16x16 block across the entire image from left hand corner to bottom right hand corner. For each 16x16 block perform a discrete cosine transform (DCT) on it and then quantize the 16x16 block using an expanded version of the standard JPEG quantization matrix.
- Each quantized DCT transformed block is stored in a matrix with one row per (x,y) position in the original image (the (x,y) being the upper left hand corner of the 16x16 block being examined).
- The resulting matrix is lexicographically sorted and then rows that match in the matrix are identified. For each pair of matching rows (x1,y1) and (x2,y2) the shift vector (x1-x2,y1-y2) (normalized by swapping if necessary so that the first value is +ve) is computed and for each shift vector a count is kept of the number of times it is seen.
- Finally the shift vectors with a count > some threshold are examined, the corresponding pair of positions in the image are found and the 16x16 blocks they represent are highlighted.
Here's another picture showing a golfing image that's been touched up to remove something from the grass:
To get access to image data I used the FreeImage library and wrote a small C program that implements Jessica's algorithm. You can download the source here; it's released to you under the GNU GPL.
The program has two key parameters that affect how the image is processed: the quality factor and the threshold.
The quality factor is a number used to 'blur' the image (actually it changes the quantization): the higher the factor the more blurring and hence more 16x16 blocks are likely to seem the same to the algorithm. Increasing the quality factor will tend to increase the false matches.
The threshold is simply the number of blocks that have to appear to have been copied together. This prevents us from seeing a single 16x16 block as evidence of copying. Increasing the threshold means ever larger groups of blocks have to be identified together before they are identified as copying.
Back at Best of the Best I grabbed the image for Supercar Competition (SC-272), cut out a section that I thought the ball had to be in (just to speed up processing) and ran the algorithm. After some parameter tweaking the algorithm came up only with what look like false matches to me (along the bar where it's all one color):
And, of course, that's not where the judge thought the ball was. So, I guess I won't be driving home in the V8 Vantage, but what geek needs that when they've got a cool piece of software that detects copy/move forgery in images?
Which leaves me with one question: how are spot-the-ball images generated? Is this an algorithm problem, a problem because they use JPG (which is already transformed) for their images, or are these images generated in some other way?