Thursday, February 14, 2008

The sum of the first n odd numbers is always a square

I was staring at the checked pattern on the back of an airline seat the other day when I suddenly saw that the sum of the first n odd numbers is always a square. For example,

1
1 + 3 = 4
1 + 3 + 5 = 9
1 + 3 + 5 + 7 = 16

And, of course, it occurred to me that it would be nice to be able to prove it. There are lots of ways to do that. Firstly, this is just the sum of an arithmetic progression starting at a = 1 with a difference of d = 2. So the standard formula gives us:

sum_odd(n) = n(2a + (n-1)d)/2
= n(2 + (n-1)2)/2
= n(1 + n - 1)
= n^2

So, the sum of the first n odd numbers is n^2.

But using standard formulae is annoying, so how about trying a little induction.

sum_odd(1) = 1

sum_odd(n+1) = sum_odd(n) + (2n + 1)
= n^2 + 2n + 1
= (n+1)^2

But back to the airline seat. Here's what I saw (I added the numbering, Lufthansa isn't kind enough to do that for you :-):



The other thing I noticed was this:



You can view the square as the sum of two simpler progressions (the sum of the first n numbers and the sum of the first n-1 numbers):

1 + 3 + 5 + 7 =
1 + 2 + 3 + 4 +
1 + 2 + 3

And given that we know from Gauss the sum of the first n numbers if n(n+1)/2 we can easily calculate:

sum_odd(n) = sum(n) + sum(n-1)
= n(n+1)/2 + (n-1)n/2
= (n^2 + n + n^2 - n)/2
= n^2

What do you do on long flights?

Wednesday, February 13, 2008

Tonight, I'm going to write myself an Aston Martin

This is the story of my attempt to 'cheat' in an on-line spot-the-ball competition to win an Aston Martin. It's also the story of my failure, but you get free source code that implements automatic detection of image alteration using copy/paste or tools like the Clone Tool in Photoshop.

First, take a look at this photo:



Notice anything strange? In fact this image has been tampered with to cover up a truck. The truck is completely hidden by foliage. Here's the original:



Wouldn't it be nice to be able to detect that automatically? It is possible. Here's an image automatically generated by my code showing what was moved. All of the red was moved to the blue (or the other way around).



I was motivated to work on this program by greed (or at least my never-ending love of having a little flutter on things). Best of the Best runs spot-the-ball competitions in airports to win very expensive cars. But they also run the same competition online. That meant I could get my hands on the actual image used... could I process it to discover where the ball had been removed? (In reality, this isn't the right way to win because the actual ball position is not governed by where it actually was, but where a judge thinks it was).

Would it be cheating if I could? Apparently not, the competition rules say I should use my skill and judgment in determining the ball position. Surely, skill covers my programming ability.

So, I went looking for tampering algorithms and eventually came across Detection of Copy-Move Forgery in Digital Images written by Jessica Fridrich at SUNY Binghamton. The paper describes an algorithm for detecting just the sort of changes I thought I was looking for.

Unfortunately, I know nothing about image processing. Fortunately, the paper is written in a very clear style and a bit of Internet research enabled me to track down the knowledge I didn't have. (Also, thanks to Jessica for sending me the original images she used to test my implementation).

In brief the algorithm does the following:
  1. Slide a 16x16 block across the entire image from left hand corner to bottom right hand corner. For each 16x16 block perform a discrete cosine transform (DCT) on it and then quantize the 16x16 block using an expanded version of the standard JPEG quantization matrix.

  2. Each quantized DCT transformed block is stored in a matrix with one row per (x,y) position in the original image (the (x,y) being the upper left hand corner of the 16x16 block being examined).

  3. The resulting matrix is lexicographically sorted and then rows that match in the matrix are identified. For each pair of matching rows (x1,y1) and (x2,y2) the shift vector (x1-x2,y1-y2) (normalized by swapping if necessary so that the first value is +ve) is computed and for each shift vector a count is kept of the number of times it is seen.

  4. Finally the shift vectors with a count > some threshold are examined, the corresponding pair of positions in the image are found and the 16x16 blocks they represent are highlighted.

Here's another picture showing a golfing image that's been touched up to remove something from the grass:






To get access to image data I used the FreeImage library and wrote a small C program that implements Jessica's algorithm. You can download the source here; it's released to you under the GNU GPL.

The program has two key parameters that affect how the image is processed: the quality factor and the threshold.

The quality factor is a number used to 'blur' the image (actually it changes the quantization): the higher the factor the more blurring and hence more 16x16 blocks are likely to seem the same to the algorithm. Increasing the quality factor will tend to increase the false matches.

The threshold is simply the number of blocks that have to appear to have been copied together. This prevents us from seeing a single 16x16 block as evidence of copying. Increasing the threshold means ever larger groups of blocks have to be identified together before they are identified as copying.

Back at Best of the Best I grabbed the image for Supercar Competition (SC-272), cut out a section that I thought the ball had to be in (just to speed up processing) and ran the algorithm. After some parameter tweaking the algorithm came up only with what look like false matches to me (along the bar where it's all one color):



And, of course, that's not where the judge thought the ball was. So, I guess I won't be driving home in the V8 Vantage, but what geek needs that when they've got a cool piece of software that detects copy/move forgery in images?

Which leaves me with one question: how are spot-the-ball images generated? Is this an algorithm problem, a problem because they use JPG (which is already transformed) for their images, or are these images generated in some other way?

Tuesday, February 12, 2008

Interface to SQLite database in 23 lines of Arc

One thing that the first release of Arc was missing was access to any sort of database, but that's easily remedied. Here are 23 lines of Arc code that provide access to a SQLite database:

(= db! 'nil)

(def db+ (name (o host "localhost") (o port 49153))
(let (i o) (connect-socket host port)
(db> o name)
(if (db< i) (list i o))))

(def sql ((i o) q)
(db> o q)
(if (db< i) (readall i 200)))

(def db- (db)
(map close db))

(def db> (o s)
(write s o)
(writec #\return o)
(writec #\newline o)
(flush-socket o))

(def db< (i)
(= db! (read i))
(iso db! 200))

The three functions you need to care about are db+ (get a connection to a named SQLite database), db- (close a connection to a database) and sql (execute a SQL query and return a list (or lists) of rows. There's also db! which contains the status of the last command (200 for OK, or 500 followed by a string explaining the error).

Here's a little Arc session creating a database, putting some data in it and then querying it. The database called test didn't exist at the start of this session:

arc> (= db (db+ "test"))
(#<input-port> #<output-port>)
arc> (sql db "create table foo (id integer primary key, text varchar(255))")
nil
arc> (sql db "select * from foo")
nil
arc> (sql db "insert into foo (text) values ('first');")
nil
arc> (sql db "select * from foo")
(("1" "first"))
arc> (sql db "insert into foo (text) values ('something else')")
nil
arc> (sql db "select * from foo")
(("1" "first") ("2" "something else"))
arc> (db- db)
nil

To make this work I had to write a TCP server that wraps SQLite (it's just a small C program that you can get here). The C program listens on a port for connections from your Arc program and handles queries.

I did have to make a small patch to Arc itself (since arc0 doesn't contain any outgoing socket code). My patch adds the ability to make a TCP connection to a remote machine and to flush an output port (add this to your ac.scm):

(xdef 'connect-socket (lambda (host port)
(let-values ([(in out) (tcp-connect host port)]) (list in out))))
(xdef 'flush-socket (lambda (s) (flush-output s)))

(Apologies if I have abused Scheme there, I'm a Scheme n00b)

All this code is released under the same license as Arc itself.

Monday, February 11, 2008

My first Arc project: a simple Wiki

The only way to learn a programming language is to write something in it. So, I decided it was time to dig into Arc and my first project is a very simple Wiki.

Here's the source (wiki.arc):

; A wiki written in Arc (arc0)
;
; Copyright (c) 2008 John Graham-Cumming
;
; (load "wiki.arc")
; (wsv)
;
; Then go to http://localhost:8080/show

(load "web.arc")
(load "util.arc")

(= pagedir* "wiki/")

(def histfiles (page)
(sort > (map [coerce _ 'int] (rem [is "current" _] (dir (pagepath page))))))

(def nexthist (page)
(let h (histfiles page)
(if h (++ (car h)) 0)))

(def pagepath (page)
(string pagedir* (page 0) "/" (page 0) (page 1) "/" page ))

(def pagefile (page (o file))
(string (pagepath page) "/" (or file "current")))

(def slurp (page (o file))
(if
(let p (pagefile page file)
(if (file-exists p) (readfile p)))))

(def upperlen (word)
(len (keep upper word)))

(def is-wikilink (word)
(if (alphas word)
(if (~is (word 0) (downcase (word 0)))
(>= (upperlen word) 2))))

(mac url-show (page)
`(string "show?p=" ,page))

(mac url-edit (page)
`(string "edit?p=" ,page))

(mac link-show (page text)
`(link ,text (url-show ,page)))

(mac link-edit (page text)
`(link ,text (url-edit ,page)))

(def wikify (word)
(if (is-wikilink word)
(if (file-exists (pagefile word))
(link-show word word)
(pr word)(link-edit word "?"))
(pr word))
(ws))

(mac spew-raw (page)
`(spew ,page [pr _ " "]))

(mac spew-wiki (page (o file))
`(spew ,page [wikify (string _)] ,file))

(def spew (page f (o file))
(let p (pagepath page)
(if (dir-exists p)
(map f (flat (map tokens (slurp page file))))
(pr "This page does not yet exist."))))

(def squash (file body)
(writefile1 body file))

(def save-page (req)
(w/$ p
(w/$ t
(squash (pagefile p) t)
(squash (string (pagepath p) "/" (nexthist p)) t))
(url-show p)))

(mac mtime (f)
`(datetime (file-mtime ,f)))

(def last-modified (page)
(let f (pagefile page)
(if (file-exists f)
(pr "Last modified: " (mtime f)))))

(mac show-page (page)
`(whitepage
(tag h1 (link-show ,page ,page))
(spew-wiki ,page)
(br 2)
(hr)
(last-modified ,page)
(br)
(link-edit ,page "[edit]")
(ws)
(link "[history]" (string "history?p=" ,page))))

(mac edit-page (page)
`(whitepage
(tag h1 (pr (string "Editing " ,page)))
(arform save-page
(textarea "t" 25 80 (spew-raw ,page))
(hidden "p" ,page)
(br)
(submit "Save"))
(link-show ,page "[cancel]")
(br 2)))

(def revision (page rev)
(tag li
(pr "Revision: " )
(link rev (string "revision?p=" page "&r=" rev))
(pr " modified " (mtime (string (pagepath page) "/" rev)))))

(mac history-page (page)
`(whitepage
(tag h1 (pr (string "History of " ,page)))
(tag ul (map [revision ,page _] (histfiles ,page)))
(hr)
(link-show ,page (string "Back to " ,page))))

(mac revision-page (page rev)
`(whitepage
(tag h1 (pr "Showing revision " ,rev " of " ,page))
(spew-wiki ,page ,rev)
(br 2)
(hr)
(last-modified ,page)
(br)
(link-show ,page (string "Back to " ,page))))

(defop show req
(w/$ p
(if p
(show-page ($ "p"))
(show-page "HomePage"))))

(defop edit req
(w/$ p
(ensure-dir (pagepath p))
(edit-page p)))

(defop history req
(history-page ($ "p")))

(defop revision req
(revision-page ($ "p") ($ "r")))

(def wsv ()
(ensure-dir pagedir*)
(asv))

It loads two helpers. The first contains common utilities that aren't really Wiki-related (util.arc):

(def alpha (c)
(or (<= #\a c #\z) (<= #\A c #\Z)))

(def alphas (str)
(is (keep alpha str) str))

(def upper (c)
(is (upcase c) c))

(def datetime ((o time (seconds)))
(let val (tostring
(system (string "date -u -r " time " \"+%Y-%m-%d %H:%M\"")))
(subseq val 0 (- (len val) 1))))


And the second contains enhancement to Arc's web/HTML handling (web.arc):

(mac hidden (name val)
`(gentag input type 'hidden name ,name value ,val))

(mac hr ()
`(gentag hr))

(mac ws ()
`(pr " "))

(mac $ (r)
`(arg req ,r))

(mac w/$ (r . body)
`(with (,r ($ (string ',r))) ,@body))

In web.arc there are a couple of bits of syntax to make accessing form/URL arguments easier: ($ "p") (which gets the value of the argument p) and (w/$ p ...) which sets a variable called p to the value of the argument p and then evaluates the rest of the expression.

All this is released under the same license as Arc. (Since I have never programmed in Arc before, and it's been almost 20 years since I stopped coding in LISP or ML, I'd appreciate constructive comments).

PPP3 (final version) in Java and C

Steve Gibson has released the final version of his PPP system: PPPv3 and so I've updated my code to be compatible.

Two versions of PPPv3 are available:

Both are released, as before, under the BSD license.

Friday, February 08, 2008

The Arc Challenge Explained

When I first looked at the Arc Challenge code my reaction, like that of many people, was WTH? Here's the code:

(defop said req
(aform [w/link (pr "you said: " (arg _ "foo"))
(pr "click here")]
(input "foo")
(submit)))

Within the context of the Arc web/app server this creates a page called /said which has a form on it:

<form method=post action="x">
<input type=hidden name="fnid" value="JtCw8ju328">
<input type=text name="foo" value="" size=10>
<input type=submit value="submit">
</form>

That form accepts a single parameter called foo and redirects to /x.

When clicking submit the user is taken to a page with a single link on it:

<a href="x?fnid=bHJpJ5G1DH">click here</a>

Following that link brings up a page showing what you typed in the first; here's the output when I typed hello in the form:

you said: hello

So, how does that work?

Firstly, the defop defines an 'operation' (which is just a page within the web server). In this case the page is called said and hence is bound to /said. There's a single argument, called req, which will contain the HTTP request when said is called by the server.

When said is called it uses aform to create an HTML form. To see this more clearly I've removed the clever part (and replaced it with X):

(aform X
(input "foo")
(submit))

So aform creates a form with an simple HTML input with the name foo and a submit button. The clever bit is what happens when the form is submitted.

By default the form submits to the page /x. This is hard-coded in the source of the Arc server. It makes use of a neat feature of the Arc server: fnids. When the form was generated a hidden field was inserted with a unique 'function id' (the fnid). This fnid is used by the /x URL to lookup a function to call when /x is activated. (Note this example uses URLs/hidden form fields for the fnid, there's no reason why it couldn't be stored in a cookie).

The function called is actually the first argument to aform which has been stored away to be called when necessary. Here's the function definition:

[w/link (pr "you said: " (arg _ "foo")) (pr "click here")]

[ ... _ ... ] is special Arc syntax for a function with a single argument called _. So the first argument to aform is a function definition, and that function is assigned a unique fnid and that fnid can be used to lookup that function and call it. The single argument consists of the HTTP request used to activate the function.

The w/link macro creates a page consisting of the words click here linked to another page. The link is, once again, done using a function and fnid. The function called when the link is clicked is:

(pr "you said: " (arg _ "foo"))

w/link's first argument is an expression that will be evaluated within the context of a function (which is entirely hidden inside the server) and used to output the page. It retrieves the foo argument from the HTTP request at the time of the initial POST.

What's neat here is the mapping between functions and fnids so that pages are just functions and the lookup of the right page to go to is handled automatically.