Result from the Canny Edge detection algorithm

The search for Eve

Artificial Intelligence.  The next great breakthrough in technology, and my ticket to fame and riches.  Well, not quite.  First I have to invent it.
Previous attempts didn’t go so well.  I concluded that AI requires at least two sources of input –  a confluence of two systems.  One system is language.  The other system is one of the five senses.  In this case, vision.  Without combining vision with language, inventing AI is akin to teaching a blind, deaf paraplegic newborn Chinese.
Step one: identify shapes/objects.  I started with a still image.  It was fairly easy to get my webcam to save a photo to my computer.  Here’s what I got:
Handsome feller, ain’t I?
[Image: gray-me-1]


And then I needed a line detection algorithm.  And here I was stumped, because I couldn’t find anything ‘out of the box’, like I had for the webcam photo.  Lots of technical articles, but no actual code.  Finally I found something:
$conv_x = ($pixel_up_right+($pixel_right*2)+$pixel_down_right)-($pixel_up_left+($pixel_left*2)+$pixel_down_left);
$conv_y = ($pixel_up_left+($pixel_up*2)+$pixel_up_right)-($pixel_down_left+($pixel_down*2)+$pixel_down_right);  // no differnce in speed noticed

$angle = 90 + rad2deg( atan($conv_x / $conv_y));
//euclidean distance
$gray = sqrt($conv_x*$conv_x+$conv_y+$conv_y);
Which turned out to be wrong.  See the + instead of the * in the last line?  It adds $conv_y to itself instead of squaring it.
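For comparison, here is a minimal sketch of what the corrected calculation might look like for a single pixel.  It assumes the image is already a 2D array of grayscale values indexed as `$lum[$x][$y]`; the function name `sobel_at` and the returned array keys are made up for this example and are not from the code I found.

```php
<?php
// Sobel step for one interior pixel, with both terms squared in the magnitude.
// Assumption: $lum[$x][$y] holds grayscale values and ($x, $y) is not on the border.
function sobel_at(array $lum, int $x, int $y): array
{
    $up_left    = $lum[$x - 1][$y - 1];
    $up         = $lum[$x][$y - 1];
    $up_right   = $lum[$x + 1][$y - 1];
    $left       = $lum[$x - 1][$y];
    $right      = $lum[$x + 1][$y];
    $down_left  = $lum[$x - 1][$y + 1];
    $down       = $lum[$x][$y + 1];
    $down_right = $lum[$x + 1][$y + 1];

    // Right column minus left column, middle row counted twice.
    $conv_x = ($up_right + 2 * $right + $down_right) - ($up_left + 2 * $left + $down_left);
    // Top row minus bottom row, middle column counted twice.
    $conv_y = ($up_left + 2 * $up + $up_right) - ($down_left + 2 * $down + $down_right);

    // Euclidean length of the gradient: x squared plus y squared.
    $magnitude = sqrt($conv_x * $conv_x + $conv_y * $conv_y);

    return ['conv_x' => $conv_x, 'conv_y' => $conv_y, 'magnitude' => $magnitude];
}
```

With the `+` typo, a pixel whose only change is left-to-right still comes out roughly right, which is probably why the bug is easy to miss; it only shows up once `$conv_y` is large.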

Grr, how do I turn off preformatted?

So it would have been faster if I had invented my own algorithm.  First I had to analyze what the Canny did.  So I scoured the Wikipedia page, which led to finite differential operators and Big O notation.  I walked through the algorithm on paper and thought I could do better.
So I did.  Here’s my scribblings:
I coded it thusly:
return  (abs($pixel_up_right-$pixel_down_left)+abs($pixel_down_right-$pixel_up_left)+abs($pixel_up-$pixel_down)+abs($pixel_right-$pixel_left));
It’s very simple – how do I turn off preformatted code again… Ok there we go.
It’s just finding the difference on each side of the pixel.  That’s it.
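Here’s a rough sketch of how that one-liner might be wrapped in a loop over a whole image.  The array layout (`$lum[$x][$y]`), the function name `difference_edges`, and the clamp to 255 are assumptions for the sake of the example, not necessarily what my actual program does.

```php
<?php
// Apply the "difference on each side" operator to every interior pixel.
// Assumption: $lum[$x][$y] is a rectangular 2D array of grayscale values (0-255).
function difference_edges(array $lum): array
{
    $width  = count($lum);
    $height = count($lum[0]);
    $edges  = [];

    // Border pixels are skipped because they do not have all eight neighbors.
    for ($x = 1; $x < $width - 1; $x++) {
        for ($y = 1; $y < $height - 1; $y++) {
            $sum = abs($lum[$x + 1][$y - 1] - $lum[$x - 1][$y + 1])  // up-right vs down-left
                 + abs($lum[$x + 1][$y + 1] - $lum[$x - 1][$y - 1])  // down-right vs up-left
                 + abs($lum[$x][$y - 1] - $lum[$x][$y + 1])          // up vs down
                 + abs($lum[$x + 1][$y] - $lum[$x - 1][$y]);         // right vs left

            // Clamp so the result still fits in one grayscale pixel (an assumption).
            $edges[$x][$y] = min(255, $sum);
        }
    }

    return $edges;
}
```

A flat patch gives 0, and any change in brightness across the pixel gives a positive number, whichever direction it runs in.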
I couldn’t believe it was that simple, OR that it was better than the Sobel operator.  So I ran some tests.
Here’s the Sobel operation:


And here’s my operator:


Mine’s a tiny bit better, imho.

The grid in the pictures turns out to be a result of using the JPG format.  Advice for others:  don’t use JPG.  Use PNG.

Turns out the Sobel operator does the same thing I do.  It just weights the horizontal and vertical neighbors more heavily (they count double), presumably because those pixels are closer than the diagonals.  Either way, the explanation on Wikipedia (and everywhere else) is unnecessarily overcomplicated.

Now, in defense of the Canny edge detector, it does do something the Sobel operator doesn’t – it detects the angle of the line.  To do that, you do need two separate measurements (horizontal and vertical) and have to divide one by the other, meaning that using the overcomplicated Sobel operator makes more sense.  Here’s what the Canny, with edges colored, gets me (red for up and down, green for sideways, blues for diagonals):

Funky, eh?  But ultimately pointless.  Who cares what color the line is, or its angle?  Why is everyone using this Canny edge detection operator?
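For anyone who does care about the angle, here’s a minimal sketch of how the two measurements might be turned into one of the direction buckets.  The function name `edge_direction` and the 22.5-degree bucket boundaries are assumptions for illustration; it uses `atan2` rather than dividing, so a zero in either measurement doesn’t blow up.

```php
<?php
// Bucket a gradient into the direction the *line* runs in.
// Assumption: $conv_x is the left-right difference and $conv_y the up-down one.
function edge_direction(float $conv_x, float $conv_y): string
{
    // Angle of the gradient, folded into 0..180 because a line has no
    // "forwards" or "backwards".
    $angle = fmod(rad2deg(atan2($conv_y, $conv_x)) + 180.0, 180.0);

    // Brightness changing left-to-right means the line itself runs up and down.
    if ($angle < 22.5 || $angle >= 157.5) {
        return 'vertical';    // red in the picture's color scheme
    }
    // Brightness changing top-to-bottom means the line runs sideways.
    if ($angle >= 67.5 && $angle < 112.5) {
        return 'horizontal';  // green
    }
    return 'diagonal';        // the blues
}
```

Note that a completely flat pixel (both measurements zero) falls into the first bucket, so in practice you would probably check the edge strength first and only ask for a direction when there is an edge at all.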

Reading other people’s PhD and master’s theses, I discovered the next step people usually perform is an edge thinner.  Again, I couldn’t find anything specific, so I wrote my own.  It counts the edge pixels around a pixel, checks that they are all grouped together, and deletes the pixel if it’s next to another dark pixel.

if($pixel == $DARK and $neighbors <= 6 and $neighbors >= 3 and $transitions == 2){
$lumarray[$x][$y] = $LIGHT;}

Here’s the result:


Again, simple, and if I had just written my own from the start I wouldn’t have wasted so much time looking for something online.
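The thinning test above depends on two counts, `$neighbors` and `$transitions`.  Here’s a sketch of one way they might be computed, assuming `$transitions` means the number of dark/light changes you hit walking once around the eight surrounding pixels (so a single clump of dark neighbors gives exactly 2).  The helper name `ring_stats` and the `DARK`/`LIGHT` constants are made up for this example.

```php
<?php
// Assumed pixel values for this sketch.
const DARK = 0;
const LIGHT = 255;

// Count dark neighbors and dark/light changes around an interior pixel.
// Assumption: $lumarray[$x][$y] holds only DARK or LIGHT values.
function ring_stats(array $lumarray, int $x, int $y): array
{
    // The eight neighbors, walked clockwise starting from straight up.
    $offsets = [[0, -1], [1, -1], [1, 0], [1, 1], [0, 1], [-1, 1], [-1, 0], [-1, -1]];

    $ring = [];
    foreach ($offsets as [$dx, $dy]) {
        $ring[] = ($lumarray[$x + $dx][$y + $dy] == DARK);
    }

    // How many of the eight neighbors are dark.
    $neighbors = count(array_filter($ring));

    // How many times the walk flips between dark and light, wrapping around
    // from the last neighbor back to the first.
    $transitions = 0;
    for ($i = 0; $i < 8; $i++) {
        if ($ring[$i] !== $ring[($i + 1) % 8]) {
            $transitions++;
        }
    }

    return ['neighbors' => $neighbors, 'transitions' => $transitions];
}
```

If the dark neighbors are split into two separate clumps, the walk flips four times instead of two, and deleting the pixel would cut the line in half.  That is the case the `$transitions == 2` check is there to rule out.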

At this point, I paused.  The result of this line thinner is technically accurate, but something seems to have been lost.  Take a look yourself.  It’s lost its vitality.  All the information is the same from the computer’s point of view, but from the human point of view, it doesn’t look right.
So what changed?  My best guess is that human vision relies on lighting, and gradients.  I decided to stick with the pre-thinned version to do my testing on.

Three days have passed at this point – I’m losing steam.  Time to go for the whole hog: invent an algorithm that traces lines.  It wasn’t too hard.  Here’s the result of the program:

‘traced’ lines are in green.

But several problems appeared.  First, see how the line gets all wonky as it moves to the bottom right?  It doesn’t know when to stop.  Often it detects the border of the picture as a line.
Second, it finds things that aren’t lines, like shadows in the folds of my shirt.  It constantly thinks the corner of the room is part of my head.
How to solve these problems?  I created an algorithm to test for these conditions, and kept going, but then it occurred to me that I was doing exactly what I vowed NOT to do: hard code every little detail.
Sure, I could have kept going, and in fact I did, only to run into more problems (e.g. a straight line isn’t actually straight – one pixel off, and the computer thinks it’s a corner instead of a line).

The problem I’m running into is the same problem that dominates the whole of AI – I need a general purpose algorithm.  Something that says ‘hey, this isn’t right, back up and try something different’.

I said, ok, that doesn’t sound too hard.  Analyze our variables, and if something is unusual, throw some kind of exception.
Here’s a graph I made examining one of my variables.  You can see the results here.
It seems informative – the values follow a trend, and anything outside the trend is probably wrong.  Similar results appear for other variables.  All I have to do is detect the peaks of the graph.  Easy, right?
Wrong.  There is no reliable method to find all the significant points of a graph.  Take the graph in the link – is that blip at values 190-192 significant?  Darn right it is.  What about the small peak at 65-69?  Nope, turns out it is not.  How can a computer decide this?  It can’t.  I’d have to end up hard-coding a ‘what is significant’ algorithm.
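To make the problem concrete, here’s a sketch of about the simplest peak finder there is.  The function name `find_peaks` and the `$min_rise` parameter are invented for this example, and it is not the code I actually ran.  The point is that the answer depends entirely on the number you pass in.

```php
<?php
// Naive peak finder: a point counts as a peak if it sticks up above BOTH of
// its neighbors by at least $min_rise (assumed to be greater than zero).
// $min_rise is exactly the hard-coded "what is significant" decision.
function find_peaks(array $values, float $min_rise): array
{
    $peaks = [];
    $last = count($values) - 1;

    for ($i = 1; $i < $last; $i++) {
        // Height above the taller of the two neighbors.
        $rise = $values[$i] - max($values[$i - 1], $values[$i + 1]);
        if ($rise >= $min_rise) {
            $peaks[] = $i;
        }
    }

    return $peaks;
}
```

Pick a small `$min_rise` and every blip is a peak; pick a large one and the narrow blips that matter vanish along with the ones that don’t.  There is no setting that knows which is which.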

Which is where we reach the end of my tale.  I need to invent an ‘algorithm-making algorithm’.  For inspiration, I refer you to ‘the game of life’!