Thursday, September 27, 2007

Blondie24: Playing at the Edge of AI

Imagine this scenario: You have recently befriended a visitor from Mars. Martians have no knowledge of human culture, let alone any experience with our board games. But somehow your new pal has heard of checkers and desperately wants to learn how to play. So much so, in fact, that you question his "friendship" as he demands success while waving around his ray gun menacingly.

An uneasy situation, to be sure. But what if, in addition to teaching this visitor far from home how to play checkers from scratch, and eventually making him an expert player, the only information you were allowed to give him was where the game pieces could sit on the board and how they could be moved? After a series of games, you would tell him how many games he had won, but not which games they were.

Could you do it?

David B. Fogel faced this very same challenge, albeit without the Martians and ray guns. Instead, Fogel aimed to have a computer learn the game of checkers in this very same way. We all know how dense computers are -- they certainly won't do anything other than exactly what you tell them -- so the task remains daunting.

In Blondie24: Playing at the Edge of AI, Fogel describes how he used evolutionary computing algorithms to have a computer teach itself checkers. In the process, he manages to give a good introduction and general review of the state of artificial intelligence, to provide interesting insight into what it truly means for software to act intelligent, and to detail how he and his colleague Kumar Chellapilla managed to evolve a neural network that could play checkers competitively with the experts.

If you have studied AI at all in the past, then you probably remember spending a lot of time on subjects like decision trees, expert systems, and search tree algorithms such as A*. Indeed, topics like these seem to be at the forefront of the field, and capturing human knowledge and reusing it in an effective way constitutes, for many, the essence of artificial intelligence.

But does the use of our own knowledge really demonstrate intelligence? As Fogel points out, not really. These kinds of programs are simply acting like we do. They make no contribution to understanding what intelligence really is, nor how it develops in the first place.

Fogel offers an alternative. He believes that mimicking what nature has done so well is the key to having software learn for itself; that is, we should be focusing on evolution.

And that's exactly what Fogel uses to create a checkers program that can win against very high-ranking players. He sets up a neural network with a somewhat arbitrary configuration, using the piece placement on the squares of a checkerboard as inputs. Several networks are created with random weights between neurons; this becomes the first generation in the evolutionary process.

The networks play against each other based on the most basic rules of how pieces can move from particular squares and when the game is over. They don't know when they lose or win a game; instead, they are assigned a final score for the series of games they have played. Then the half of the generation with the lowest scores is thrown out. The better half is copied, and each copy is slightly mutated at random. This new set of neural networks becomes the next generation, and the process begins again.
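The loop described above can be sketched in a few lines of Python. Note that this is a toy illustration, not Fogel's actual setup: the "game" here is a stand-in function that returns a winner, whereas the real system played full games of checkers using neural networks as board evaluators.

```python
import random

POP_SIZE = 10    # number of players per generation
N_WEIGHTS = 32   # one weight per playable square (illustrative only)

def random_weights():
    return [random.uniform(-1, 1) for _ in range(N_WEIGHTS)]

def play_game(weights_a, weights_b):
    """Stand-in for a game of checkers between two evaluators;
    returns 1 if A wins, -1 if B wins, 0 for a draw. No checkers
    is actually played here -- this just makes the loop runnable."""
    a = sum(weights_a) + random.gauss(0, 1)
    b = sum(weights_b) + random.gauss(0, 1)
    return (a > b) - (a < b)

def evolve(generations=20):
    population = [random_weights() for _ in range(POP_SIZE)]
    for _ in range(generations):
        # Each player earns one aggregate score over all its games;
        # it is never told which individual games it won or lost.
        scores = [0] * POP_SIZE
        for i in range(POP_SIZE):
            for j in range(i + 1, POP_SIZE):
                result = play_game(population[i], population[j])
                scores[i] += result
                scores[j] -= result
        # Throw out the half with the lowest scores...
        ranked = sorted(range(POP_SIZE), key=lambda k: scores[k],
                        reverse=True)
        survivors = [population[k] for k in ranked[:POP_SIZE // 2]]
        # ...and refill the population with slightly mutated copies.
        children = [[w + random.gauss(0, 0.05) for w in p]
                    for p in survivors]
        population = survivors + children
    return population
```

Nothing in the loop encodes any checkers knowledge; all the selection pressure comes from the aggregate scores.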

I'm definitely simplifying things here (for one thing, the neural networks don't play checkers, but rather evaluate a particular board configuration), but suffice it to say that these neural networks, after many generations of evolution, could actually play checkers very well. This is quite remarkable if you think about it -- just from random mutation and survival of the fittest, we start with something more or less random that knows nothing about checkers, and eventually get something that can beat some of the best. Nowhere have we instilled our own human expertise or knowledge into the program like the creators of Chinook did, or those of Deep Blue in the case of chess.

If you are the least bit curious about the details of how Fogel and Chellapilla achieved this, I highly recommend reading his book. I couldn't help feeling at least as enthralled with the progress of the artificial checkers player as I have with any fictional protagonist. Fogel does a wonderful job of providing the necessary background of the field, and repeats the key points just enough so the reader doesn't get lost.

More importantly, I hope you can share in the excitement surrounding the possibilities of evolutionary computing. Though it's been around for decades, I'm not sure it's been used to anywhere near its potential. As hardware speeds up and solutions to complicated problems can be evolved more efficiently, we might find ourselves able to create something closer and closer to true intelligence.

If you ask me, this sounds a bit less frustrating (or dangerous) than trying to do the same with our extraterrestrial companions mentioned earlier...

Tuesday, September 25, 2007

Creating Mazes from Images

This semester, I'm taking a systems engineering course for my Masters. It's called Entertainment Technology, and aims to cover some pretty interesting stuff, from image processing and compositing, to using accelerometers like those in the Wii remotes, to watermarking and other content protection schemes.

In a couple of weeks, we will be giving a presentation to our classmates on a technical paper of our choice. Mine will be Image-Guided Maze Construction by Jie Xu and Craig Kaplan from the University of Waterloo. Since the topic is pretty cool, I thought I'd share what it's all about!

Artists have long been creating beautifully artistic mazes. Take Christopher Berg as a modern example. He's been at it since the fifth grade, starting with countless varieties of amoebas as the basis for his mazes. Check out some of his more recent marvels, found on the Amazeing Art website:



Xu and Kaplan state early on that Berg was an inspiration for them. Unfortunately, the techniques that Berg uses to draw his mazes don't translate well to computational algorithms. The computer (with some help from the maze designer) must create lines that will artistically convey the tone, texture, and form of the original image. But in addition to this, there must be spaces between the lines that actually make the drawing a maze!

In an interactive system designed by Xu and Kaplan, a maze designer is able to identify various areas of the image they want to turn into a maze. Each of these regions can have a particular texture applied to it; that is, the maze can follow the general direction of lines in the original image, spiral in on itself, have random direction, or be completely user-defined. Once the mazes in these regions are created, they are interconnected by breaking holes in the "walls" between the regions. In addition, the designer can plan out a general solution path to ensure both an interesting and challenging puzzle.

To see how to construct a maze algorithmically, you can first imagine a grid laid out within the area where the maze will be created (a planar subdivision). The faces of the grid (which isn't necessarily orthogonal) are called cells, and the edges between two adjacent cells are called cell walls.

Next, consider the dual of the grid, the cell graph, where each cell is represented by a vertex, and cells that share a wall are connected by an edge. Now if you want to get a maze from the grid that doesn't contain any cycles, you simply need to find a spanning tree of this dual and erase the walls between cells connected by the spanning tree in the cell graph. A random spanning tree could be generated, or certain walls could be biased such that they are more likely to be broken first.
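Here's what that looks like in practice on a plain square grid, where the cell graph is implicit in the row/column indices. This sketch builds the spanning tree with a randomized depth-first search (one common choice; any spanning-tree algorithm works), and the set it returns is exactly the walls to erase. Biasing the random choice of neighbor is where you would bias which walls break first.

```python
import random

def generate_maze(rows, cols, seed=None):
    """Return the set of walls to remove: each entry is a frozenset
    of the two adjacent cells that end up connected."""
    rng = random.Random(seed)
    visited = {(0, 0)}
    removed_walls = set()

    def neighbors(cell):
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                yield (nr, nc)

    # Randomized depth-first search: the edges we cross form a
    # spanning tree of the cell graph.
    stack = [(0, 0)]
    while stack:
        cell = stack[-1]
        unvisited = [n for n in neighbors(cell) if n not in visited]
        if unvisited:
            nxt = rng.choice(unvisited)  # bias this choice to bias the maze
            removed_walls.add(frozenset((cell, nxt)))
            visited.add(nxt)
            stack.append(nxt)
        else:
            stack.pop()
    return removed_walls
```

Because a spanning tree on R*C vertices has exactly R*C - 1 edges, exactly that many walls get removed, and the resulting maze has no cycles: every pair of cells is connected by exactly one path.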

Finally, notice that the four maze textures described above can be used to create the grid from which to generate a maze.

In the case of directional mazes, the walls of the maze should flow in a general direction that resembles what is happening in the image the maze is based on. This is done using vector fields, along which "streamlines" are traced. Then, streamlines perpendicular to these are added, thus creating a grid. The distance between the streamlines can be adjusted, helping depict the desired tone.
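To make the streamline idea concrete, here is a toy example: starting from a seed point, repeatedly step a short distance in the direction of a vector field. The circular field below is made up purely for illustration; in the paper, the field is derived from the underlying image.

```python
import math

def field(x, y):
    """A made-up vector field: unit-length circular flow around
    the origin. In the paper this would come from the image."""
    length = math.hypot(x, y) or 1.0
    return (-y / length, x / length)

def trace_streamline(x, y, step=0.1, n_steps=100):
    """Trace a streamline with simple Euler steps: at each point,
    move a small distance along the field direction."""
    points = [(x, y)]
    for _ in range(n_steps):
        dx, dy = field(x, y)
        x, y = x + step * dx, y + step * dy
        points.append((x, y))
    return points
```

Tracing a second family of lines through the perpendicular field (dy, -dx) crosses the first family, which is how a grid of cells emerges; spacing the seed points closer or farther apart controls the tone.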

For regions marked to have spiral textures, the computer will create shrinking concentric rings of the boundary shape. Radial lines are added at regular intervals to complete the grid.

The random texture is created first as a square grid, and then relaxed using an algorithm created by Singh and Pedersen. User-defined textures obviously must be created by the maze designer.

Here are some examples of what Xu and Kaplan's system has created, taken from their maze design project page:


Monday, August 20, 2007

Summer of Code: Final Report

Today is the official end of the Google Summer of Code program. I've been both excited and nervous about the culmination of three months' worth of work on this judgment day. But no sense in worrying any longer; I will find out soon enough if I am to receive the coveted t-shirt and very useful final payment.

Here, I would like to share a summary of everything that I've done this summer for Inkscape. A written summary is made even more crucial by the nature of what I did in the second half of the program. While I laid down an important foundation for changes that can now be made much more easily in the near future, there is not as much for the user to see quite yet.

Many of you already know my midterm impressions. As a quick reminder, during the first half of the summer, I implemented support for the <tref> element, as laid out in the SVG specification. This will let you use any character data from within the document in your own <text>, and as a bonus, any character markup from the original text will be ignored so you can easily add your own. You can't quite do this with <use>.

The work done in the second half is notably more complicated to explain. The end goal was initially twofold:
  1. Many fonts used in professional design cannot be represented within the constraints of CSS. Take a font with a Shadow variant, such as Gill Sans, as an example -- there is no way to describe Gill Sans Shadow with CSS, as you can see in the CSS2 Spec. By adding our own Inkscape-specific information, we hoped to be able to use fonts like these at least within Inkscape.
  2. When it comes to listing font families and faces (aka styles) in our UI, we use Pango's answer to the question "which part of this font name is the family?" Unfortunately, this does not always lead to an ideal situation. What often happens is that several different faces that should be grouped together in the same family aren't. For example, you might see "Futura Bk BT", "Futura Hv BT", and "Futura Lt BT" listed as separate families, but would rather they were grouped together. We wanted to customize this process so we could control how families and styles were grouped together.
Ironically, neither of these two improvements is visible to the user in my latest code. It turns out that the first goal is almost certainly impossible to accomplish given the current state of Pango. The second goal is actually very much in reach, and can easily be implemented once the Summer of Code projects are complete.

What do I mean by the "current state of Pango" you ask? Well, as it turns out, the structure that is fundamental to our use of Pango, the PangoFontDescription, will only describe fonts that fit into the confines of CSS. So until the good folks at Pango decide to expand on what Pango can represent, we're stuck with what it already supports. (And yes, a little later on, I do plan on speaking with these good folks to find out why it is the way it is, and try to offer a helping hand to enhance it if that's what makes sense.)

But even if the PangoFontDescription did support what CSS can't, that doesn't really help us, right? After all, our SVG files use CSS to describe the fonts we're using anyway. That is true, and that's why I am adding a new attribute to an object's style string: -inkscape-font-specification. Here we will store a string that can fully represent a font no matter what its face is. The CSS will be updated as best it can be and used if there is no font specification available.
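As a rough illustration of the fallback logic described above, here is a sketch of how a style string carrying the new attribute might be interpreted. The property name -inkscape-font-specification is from the post; the parsing code itself is purely illustrative and is not Inkscape's actual implementation.

```python
def parse_style(style):
    """Split a CSS-like 'prop:value;prop:value' string into a dict."""
    props = {}
    for part in style.split(";"):
        if ":" in part:
            name, _, value = part.partition(":")
            props[name.strip()] = value.strip()
    return props

def effective_font(style):
    """Prefer the full Inkscape font specification when present,
    otherwise fall back to what plain CSS can describe."""
    props = parse_style(style)
    spec = props.get("-inkscape-font-specification")
    if spec:
        return spec
    return " ".join(filter(None, (props.get("font-style"),
                                  props.get("font-weight"),
                                  props.get("font-family"))))

style = ("font-family:Gill Sans;font-weight:normal;"
         "-inkscape-font-specification:Gill Sans Shadow")
```

With the example string above, the full specification "Gill Sans Shadow" wins, while a style string without the attribute degrades gracefully to its CSS font properties.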

As for the second goal of cleaning up the family and style lists, the framework that makes this possible is in place, and only the code for converting the full font specification strings into the desired family and style for the UI needs to be set up. A fellow Summer of Code student has written code that may be appropriate for this, so after the deadline his code can be evaluated and re-purposed for this use. Everything should work well with only some possible minor tweaking.

So what does the user get to enjoy as a result of this work? Gone are the days of duplicate styles in the UI and styles that seem selectable but that just can't be used. Those fonts that Pango cannot handle aren't even listed, and those that just needed a little extra help now work. That means that every style listed in the Text and Styles dialog is now selectable! Hurray!

In conclusion, though it may not seem terribly exciting on the surface, this portion of my project has unlocked the door for some really great changes to be made in the very near future. Look forward to a more desirable grouping of families first, and hopefully we can get Pango to do for us what we really want next: support ALL fonts!

Wednesday, August 8, 2007

Looking Forward to School

The summer is flying by! What with Summer of Code and trying to plan my wedding (which is on August 24th), I haven't even had a chance to ride my motorcycle more than a few times this season let alone keep up with this blog.

Still, as the calendar flips over to August, thoughts of "back to school" start swimming in my head. Oddly enough, I always look forward to hitting the books again each September. This year is no different in that regard, but there is a new excitement in the fact that I will be returning to Carleton as a graduate student.

I've already registered for the five courses I will be taking in the first two semesters. I figured I'd share the title and description and see if you had any comments on the topics covered (love it or hate it, never heard of it, etc).

Computational Aspects of Geographic Information Systems

Computational perspective of geographic information systems (GIS). Data representations and their operations on raster and vector devices: e.g., quadtrees, grid files, digital elevation models, triangular irregular network models. Analysis and design of efficient algorithms for solving GIS problems: visibility queries, point location, facility location.

Advanced Topics in Computer Systems (from Systems and Computer Engineering)

Recent and advanced topics in computer systems. The course will generally focus on one or more of the following areas: specification, design, implementation, and modeling/analysis. Students may be expected to contribute to lectures or seminars on selected topics.

Applied Computational Geometry

Computer-based representation and manipulation of geometric objects. Design and analysis of efficient algorithms for solving geometric problems in applied fields such as Computer-Aided Design and Manufacturing, Cartography, Materials Science, and Geometric Network Design.

Topics in Medical Computing

Introductory course on data structures, algorithms, techniques, and software development related to medical computing (in particular spatial modeling). Topics may include: computational geometry algorithms for cancer treatment, medical imaging, spatial data compression algorithms, dynamic programming for DNA analysis.

Selected Topics in Computer Science: Computer Graphics

[No description available.]

Monday, July 9, 2007

Summer of Code: Halfway Through

Midterm evaluations begin today for Google's Summer of Code program, so what better time than now to reflect on the program so far? Here I hope to give some insight into my progress thus far, why it is thus, and where I feel I stand. This will inevitably provide a good idea of my impressions on open source up to this point, albeit indirectly.

Back at the end of May, the official program start, my plan was to write up a tutorial on how to use Eclipse on Windows to debug Inkscape, tackle a quick bug from a list my mentor provided, then move on to some of the work that was contained in my application proposal. I never expected to have that bug take me to the middle of the program. I am certainly not new to large software projects, so it has been interesting to experience such a steep learning curve.

Despite my relatively slow progress, I can say that I have learned much in the process. I have discovered both how some of Inkscape’s general architecture works, and the basic workings of the code that handles text. I feel very well equipped to tackle my next task with an efficiency and grace that would not have been possible had I jumped into it head first back in May.

More importantly, I have a genuine desire to stick around after the summer. There would be no sense in whipping up some fast and dirty code just to get paid by Google and then move on. Therefore, I try to look at the time spent thus far as an investment.

(By the way, the bug I’m referring to is really more of a small feature. I’m adding support for the tref element, as you may have read about in an earlier post about some frustrations encountered.)

There are several factors that I feel have hindered my progress. These are things that I have personally taken some time to get used to, not necessarily inherent flaws in the way Inkscape is built or managed. In fact, some of these issues are a direct result of working with Windows when the code is much better suited to development in Linux.

Builds Take Forever

Even if I change just one line of code, I have to wait ten minutes to rebuild. This is incredibly frustrating and a huge time waster. Often, I can’t move on to the next thing while waiting, because I don’t know what the next thing will be until I have tried the change I’m building in the first place. Especially when just learning the code, I find myself needing to experiment with different one-line changes constantly. Perhaps I wouldn’t need to do this so often if the next issue – debugging – were non-existent.

Debugging Stinks

Stepping through code, watching variables, breaking on assertions – this is the sort of thing that can really speed up the learning process for a new piece of software. I wasn’t too enthusiastic about debugging with gdb on the command line, so I very fortunately managed to set up Eclipse to debug using the CDT plugin. This essentially gives a GUI front end to gdb. (I’ll note that this took some fighting back in March when I was getting my Summer of Code application together. Others were also having a hard time setting it up; this is why I spent the first few days of the program writing a tutorial about it.)

Now, I don’t know if it’s gdb or simply Eclipse’s use of it, but debugging in this manner is very flaky. In debug mode, I can’t open files (though this has been proven to happen on the command line as well). Breakpoints often won’t work unless they are set before the application is launched. While stepping through code, the lines being executed seem to jump backwards a line or two unexpectedly. The debugger doesn’t break on crashes and can’t be arbitrarily suspended. Sometimes the application just doesn’t feel like launching. And these are just the issues I can remember!

Documentation is Thin

Not a big surprise here, though more up to date documentation would be nice. Assuming that I just haven’t been too dumb to find it. ;)

Online Communication Can Be Tricky

I have to admit that the fully online nature of the project has been difficult to adjust to. I tend to try to figure things out myself in the early stages of a new project. I find this to be a very effective philosophy and helps me learn the basics in a way that they will stick in my mind long-term. Still, there are times when a quick question to a team member can clear things up very quickly. The only problem is, when questions can only be asked online, asking isn’t always fast.

I’ve often preferred communication via the written word. It gives you an opportunity to think out what you want to say. However, there are times when sitting in front of a screen with another person is invaluable. If I don’t know the correct or recognizable terminology for something, for example, I can just point at the code instead of describe it. We can watch the behaviour of the running software at the same time and talk about it realtime. We can even sketch little diagrams to explain what we’re talking about. Some of this sort of thing can be done online, but it is impersonal and often cumbersome.

Another thing I really miss is weekly team meetings. Sure, they don’t seem so great while you are waiting for them to end, but hearing about other people’s problems and the technical discussion that often follows can be really beneficial. Some of this comes through in emails sent to the team, as it does in an online-only community, but it seems you get something extra with in-person meetings.

***

As with all things, it’s much easier to state what’s wrong than what’s right. There are some aspects of this kind of work that I really like. Working from home has its challenges, but it is darned nice to avoid the half hour commute to work in the morning. It’s also pretty amazing to see how well people can make friends online, without possibly ever meeting each other. There is a real sense of community.

I was fortunate enough to meet some developers in person at Libre Graphics Meeting, which made the camaraderie more real for me. I also try to make myself visible in the community, participating in pertinent mailing list discussions and helping users on the IRC channel.

Given all of this, where do I stand? I’m not sure how well I’ve performed compared to the other Summer of Code students in terms of actual code produced, but I have felt fairly productive in terms of learning. I’m hoping I will feel as fast the second half of the summer as I felt slow during the first half.

No matter what happens, if I can finish the summer with a prestigious Summer of Code t-shirt and a fun open source project to work on, I will be a happy programmer.