I see headlines for articles about how computer science needs more women so often that the notion has become clichéd to me. But the cold, hard truth recently made the problem seem so much more real, and got me thinking about some potential solutions.
According to an article written at the end of December 2005, the highest proportion of female computer science graduates was 37 percent (this occurred in 1985). During the few years leading up to the article, this number was around 27 percent. These statistics are confirmed by the graph at the end of this article written in 2006.
It turns out that our undergraduate program has only about a 10% female population (based on observations). The story can't be much different at other universities. Kind of a big drop, isn't it?
Why? How can we stop it??
Time and time again people try to come up with the answers. I've even tried to do the same, albeit for open source rather than education. So why do the numbers seem to only go down, if anything?
I really don't have any more answers than the countless others who have written on the topic. All I can do is attempt to improve the outlook that young females have toward computer science, and try to improve the atmosphere for those already here.
To accomplish the latter, I have used my contact for the Google Ambassador Program to secure some sweet Google swag for an all-women's social gathering. From the same program I also have some money left to provide free food. The idea is to provide an informal, fun forum for girls in the School of Computer Science to get together and do whatever happens to work - gripe about things we don't like about being female in computer science, discuss ideas for finding work or going to grad school after convocation, and so on. Kind of a solidarity thing, I suppose. I've even invited a successful alumna of our program to talk about her experiences with life after Carleton.
I have recently discovered many programs in our area that help encourage young women to consider science and technology in their career options. As just one example, Pathmakers has post-secondary female student volunteers make presentations to elementary and high school students to help them explore these opportunities. Our undergraduate advisor is already looking into participating in one of their bigger upcoming events in December.
In addition to what's already out there, I was recently inspired to submit a proposal to create and teach one of the mini-courses that our local universities offer to grade eight and high school students. I actually took some of these courses when I was in high school, but didn't realize I could teach one as a grad student. I wanted to provide a course just for girls so they wouldn't be intimidated by the probability of being in a class with all (potentially geeky) boys. The course would somehow give a flavour of some of the main topics in computer science, showing the girls that the area can be interesting and fun. I decided that computer games are a perfect medium for doing this, and plan to submit my proposal next week. I will post the course description here once I'm done.
Whether these programs and ideas will make much difference in the numbers of incoming female computer science students one day, I cannot say. Perhaps I will never know. But given the dismal percentages we face today, I will feel better knowing that I have tried.
Friday, October 26, 2007
Saturday, October 20, 2007
Watching Sports Your Way
One option for my Master's thesis is to explore the use of computer vision in a real industrial production system. Thinking in terms of sportscasts, on the simplest end, one could automatically track players and objects (remember that annoying -- and short-lived -- blue highlighted hockey puck?). On the more ambitious end, broadcasters might be able to give viewers novel and exciting viewpoints, such as a particular play from a particular player's perspective.
Imagine my disappointment in seeing that the latter had seemingly been solved already (assuming the former is relatively simple in comparison). I was recently directed to a copy of a paper called Virtual Viewpoint Replay for a Soccer Match by View Interpolation From Multiple Cameras by Naho Inamoto and Hideo Saito, where I found not only Inamoto and Saito's work on the topic, but of course, many references to related research, as well.
Ok, so maybe it's not really such a surprise that work on this area has been done in the past -- it would be naive to think otherwise [:)]. But at first glance, this paper seems to cover a method that might be well suited to the broadcast industry, thereby coming much closer to the realm of my possible thesis topic. But as we will see, there is still much room for improvement, and therefore still work available if I do end up taking this direction.
What It's All About
The basic premise of this research is to be able to watch a sports game (soccer in the case of this particular paper) from angles that weren't explicitly captured on camera. Computer vision techniques would be employed to determine what an intermediate camera between two real ones would be seeing. A user of an on-demand video system could potentially guide his own camera and see the entire game from that viewpoint, or a broadcaster could simply present a new and interesting shot of a particular play for the audience.
In the past, the methods used to synthesize these new shots fit into three categories (according to Inamoto and Saito):
- Model Based Approach. Using the geometry of the scene, a 3-D reconstruction with a synthesized texture could be created and then reprojected into the desired novel view. The accuracy of the results depends on how well the model can be constructed, which in turn depends on having many calibrated cameras. This method is not well suited to large areas like a soccer stadium.
- Transfer Based Approach. Instead of a 3-D model, morphing techniques can be employed to obtain the new viewpoint between two images, or a trifocal tensor can be used for image transfer. Either of these requires dense correspondences between the known views, hence the two known images must be static or only slightly varying.
- Approach Using the Plenoptic Function. The plenoptic function "describes all the radiant energy that is perceived by an observer at any point in space and time." With this function, it is possible to allow users to arbitrarily pan and tilt a virtual camera, and the resulting shots will be based on a collection of sample images. Since the plenoptic function has seven dimensions, though, it requires data reduction or compression in practical use. This makes it less suitable for the soccer example because it would be impossible to describe all the radiant energy in the stadium scene.
Inamoto and Saito's proposed method generates images of arbitrary viewpoints "by view interpolation among real camera images." Multiple cameras capture a soccer match, and when a user chooses a new virtual viewpoint, the two or three cameras closest to it are used to find correspondences and perform view interpolation. Because the field and the soccer stadium itself are fixed, they can be considered static, and processed separately from the dynamic objects (players, the ball, etc). Furthermore, much of the processing can be done ahead of time, including the determination of the camera geometry.
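To get a feel for what "view interpolation" between two cameras means, here is a very rough Python sketch. It is not the method from the paper: it only blends the image positions of points that have already been matched between two real cameras, and the function name, the weight w, and the sample coordinates are all made up for illustration.

```python
def interpolate_points(points_a, points_b, w):
    """Blend matched image points from two cameras.

    points_a, points_b: lists of (x, y) pixel positions, where entry i in
    each list is assumed to show the same physical point (a correspondence).
    w: position of the virtual camera between the real ones;
       0.0 reproduces camera A and 1.0 reproduces camera B.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    if len(points_a) != len(points_b):
        raise ValueError("each point needs a match in the other view")
    return [
        ((1 - w) * xa + w * xb, (1 - w) * ya + w * yb)
        for (xa, ya), (xb, yb) in zip(points_a, points_b)
    ]


# Hypothetical correspondences for two points on one player.
camera_a = [(100.0, 50.0), (200.0, 80.0)]
camera_b = [(140.0, 54.0), (260.0, 90.0)]

# A virtual camera halfway between the two real ones.
halfway = interpolate_points(camera_a, camera_b, 0.5)
print(halfway)  # [(120.0, 52.0), (230.0, 85.0)]
```

A real system would also have to find the correspondences in the first place and fill in the pixels between them, which is where most of the work lies.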
Several videos that show the results of this method can be found on this research web page. In addition to simply creating new viewpoints for the user, the videos demonstrate that the method allows for effects like that seen in The Matrix, where the camera spins around stationary players.
The performance of a developed system using this method was tested on a Pentium 4 3.2 GHz desktop with 2 GB of memory and an ATI Radeon 9800 graphics card. On average, the system could process 3.7 frames every second. The processing time is linear in the number of dynamic objects contained in a scene at a given time.
What It Means For Me
Here I'd like to discuss some aspects of the reported research that I'm not sure would be entirely suitable for the production industry, and could thus be interesting areas for me to explore for my thesis. Some of these I have not mentioned above, but are still part of the discussed research, and more information can be found in the associated paper.
It's worth noting that, being Canadian, we wanted to explore the application of virtual viewpoints in the context of hockey games rather than soccer. How cool would it be to put yourself in Daniel Alfredsson's skates as he flies to the net and puts a puck in top shelf?! Keep this context in mind as you read the issues I raise below.
First, the proposed method requires that several cameras be fixed and calibrated ahead of time. The first problem with this is not too serious: there is a requirement for the stadium or arena to invest in more cameras. This is due to the fact that almost all cameras at a sports event are constantly moving to follow the action or to switch their focus to another point of interest. It is possible that this is a reasonable investment, provided that the results are worth the extra money. As will be discussed in a moment, this certainly isn't the case at this stage.
The second issue with fixed cameras could, in my opinion, be slightly more problematic. The manual work required to calibrate these cameras requires broadcasting technicians to spend time identifying corresponding points between the cameras, and identifying the static regions (such as the field or stadium in the case of soccer) in the image captured by every camera. This work takes an hour for just four cameras, and you may need more cameras than that to get a high quality result. Now, this extra work may not be relevant after all if you consider that once the cameras are calibrated, they don't need to be calibrated again because they are fixed. I'm just wondering whether there is the possibility of these cameras being taken away at the end of each game, thus forcing this task to be redone before the next game when the cameras are replaced. The possibility of human error each time could seriously affect the results for that game; for example, being slightly off on the point correspondences throws off the whole geometry of the cameras!
Next, consider a requirement that allows soccer players to be identified on the field. It is essential that the color of the ball and the players (including the uniform) be different from the field. It would seem that most soccer teams have jerseys that avoid this problem, but this is certainly not the case in hockey. Even my favorite hockey team wears white on away games, so it would be difficult to distinguish between the jersey and the white ice surface. I also believe there could be new issues when using this method for hockey related to the reflectivity of the ice. Surely the light absorbing properties of ice and turf are different, so shadows might need to be handled differently, and reflections might get in the way.
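The jersey problem is easy to see with a toy example. The sketch below assumes players are separated from the playing surface by a simple color-distance threshold, which is my own simplification and not necessarily how the paper does it. The RGB values and the threshold are invented for illustration.

```python
def color_distance(c1, c2):
    """Euclidean distance between two RGB colors."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5


def is_foreground(pixel, background, threshold=60.0):
    """Treat a pixel as a player or ball if it is far enough from the surface color."""
    return color_distance(pixel, background) > threshold


# Made-up colors, purely for illustration.
green_turf = (40, 140, 60)
red_jersey = (200, 30, 40)
white_ice = (235, 240, 245)
white_jersey = (245, 245, 245)

print(is_foreground(red_jersey, green_turf))   # True: stands out from the turf
print(is_foreground(white_jersey, white_ice))  # False: blends into the ice
```

With any threshold loose enough to tolerate lighting changes on the ice, the white jersey simply vanishes into the background.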
Finally, the critical issues outlined in the paper must be addressed before such a system could be used for real broadcasting. For example, players can occasionally disappear from the scene when they have not been filmed in at least two of the cameras. Problems also occur when four or five players overlap. While this may not happen quite as often on a wide open soccer field, the smaller hockey rink would have this happen all the time!
Conclusion
It's pretty clear that the issues outlined here, as well as generally improving the speed and quality of the resulting virtual video, provide a lot of potential research for an enthusiastic Masters student such as myself. I'll let you all know what I end up doing for my thesis as soon as I know!
Sunday, October 14, 2007
Read the question
I recently finished grading my very first batch of assignments as a teaching assistant. Previously I had worked as a special kind of "general help" teaching assistant that wasn't assigned to any particular course, so I didn't need to mark anything before this semester.
Most students fared rather well on their first attempt at understanding functional programming using Scheme. In fact, I only gave one failure, though it was clear this person either forgot to do the assignment until 30 seconds before it was due, or simply planned to drop the course soon anyway. A grand 1% is what they earned. By and large, the class did a good job.
But they could have done better if only they had read the questions more carefully!
Everybody forgot something. Maybe it was the substitution model they were supposed to show for their multiply procedure. Or maybe they missed the fact that they were supposed to program in an iterative style. Perhaps they ignored the fact that 25% of the grade was specifically allotted for testing and documentation.
This kind of thing is common, but I never realized how many students it affected until I marked all of their work.
Ah well -- my many comments will hopefully serve as a reminder to be more careful in the next few assignments, and if nothing else, I now know to reread my own assignments a few more times before rushing to answer!
Wednesday, October 3, 2007
Heap Trick
I learned a neat trick regarding heaps today. Here's how it goes:
- Take the number of nodes in the heap and write that number down in binary.
- Erase the most significant bit.
- Starting at the root, go to the left child when you see a zero, and go to the right child when you see a one.
- When you run out of binary digits, you will be at the last node.
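The steps above can be sketched in a few lines of Python. The function names are my own, and I'm assuming the usual array-backed binary heap with the root at index 0, where the children of index i sit at 2i + 1 and 2i + 2.

```python
def path_to_last_node(n):
    """Return the moves ('L' or 'R') from the root to the last node of an n-node heap."""
    if n < 1:
        raise ValueError("the heap must contain at least one node")
    # bin(n) looks like '0b110'; skipping three characters drops the '0b'
    # prefix and the most significant bit in one go.
    bits = bin(n)[3:]
    return ["L" if bit == "0" else "R" for bit in bits]


def last_node_index(n):
    """Follow the path through an array-backed heap (root at index 0)."""
    index = 0
    for move in path_to_last_node(n):
        index = 2 * index + 1 if move == "L" else 2 * index + 2
    return index


# Six nodes: 6 is 110 in binary, dropping the leading 1 leaves 10,
# so go right, then left.
print(path_to_last_node(6))  # ['R', 'L']
print(last_node_index(6))    # 5
```

In an array-backed heap the last node is just index n - 1, so the trick is most useful when the heap is stored as linked nodes and you have to walk down from the root.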