Monday 24 September 2012

GPGPU Computation: What it all means

GPUs, or Graphics Processing Units, are specialized chips that come with almost all electronic gadgets today and were introduced primarily for games. In the late 1990s, game developers realized that most of the operations they needed to make their games 'look' good were actually quite simple (basic addition, subtraction or logical operations) and, more importantly, could be done in parallel. CPUs of the time had very few cores (typically just one), each capable of performing complicated instructions. This was good for general operations, but if the gaming industry was to expand, it needed a new device: something that could do just basic operations but with many threads running in parallel. The call for these devices was finally answered when Nvidia introduced the GeForce 256 in 1999, calling it the first Graphics Processing Unit (GPU). In 2002, ATI followed with its Radeon 9700. Currently, the GPU market is almost completely controlled by ATI (now part of AMD) and Nvidia, with Intel being the major supplier of integrated graphics.
For many years after that (and even today), gamers around the world would boast about their new graphics cards, blurting out clock frequencies and RAM sizes and comparing manufacturers. As the complexity of games increased, so did the complexity of the GPUs needed to run them. On the software side, tools like DirectX and OpenGL developed to simplify the game development process. And so by 2005, GPUs had made their mark on the game industry, and some pretty cool software existed to harness their potential.
And then around 2006, a new trend was established. A trend which has gained significant popularity over the years and is slowly replacing conventional approaches to many problems.
The trend is called general-purpose computing on graphics processing units, or GPGPU computing. The term was introduced by Mark Harris around 2002 and refers to the approach of using GPUs for non-graphics applications. Basically, the problem that spawned GPUs, namely the need for hardware with a large number of parallel cores, was also observed in the fields of science and mathematics. Most of the sciences deal in one way or another with nature, and in general most of the phenomena in nature can be simulated in parallel. In many complex mathematical operations and Monte Carlo simulations, the individual calculations are independent, so the more of them you can do in parallel the better. And so the same revolution happened and is happening in these fields as well. More and more academics from around the world are starting to see GPUs as viable alternatives to solve their problems. And rightly so.
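To make the point about independent calculations concrete, here is a minimal sketch of a Monte Carlo estimate of pi. It is plain Python running on the CPU, not GPU code, and the names count_hits and estimate_pi are made up for this illustration. The thing to notice is that no chunk of samples needs anything from any other chunk, which is exactly the property that lets a GPU hand each chunk to its own thread.

```python
import random


def count_hits(seed, n_samples):
    """Count random points in the unit square that land inside the quarter circle."""
    rng = random.Random(seed)  # private generator: chunks share no state
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(n_chunks, samples_per_chunk):
    """Estimate pi from n_chunks independent batches of random samples."""
    # Each count_hits call is independent of the others. Here they run one
    # after another; on a GPU each chunk could be given to a separate thread.
    partial = [count_hits(seed, samples_per_chunk) for seed in range(n_chunks)]
    total_samples = n_chunks * samples_per_chunk
    return 4.0 * sum(partial) / total_samples


if __name__ == "__main__":
    print(estimate_pi(n_chunks=64, samples_per_chunk=10_000))
```

Because the partial counts are only combined at the very end, the chunks could be computed in any order (or all at once) without changing the result.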
I've talked about how GPUs are ideal for parallel processing, but some of you might be asking: why not just run more threads on CPUs? To answer that, let me give you a flavor of 'how' parallel GPUs are. Consider one of the newer GPUs, say the Nvidia GTX 680. The GTX 680 has 1536 CUDA cores grouped into 8 streaming multiprocessors (SMs) of 192 cores each. I'll describe the Nvidia execution model and the concept of SMs in more detail in my next post. For now, let's just assume we have 1536 cores to run our code on. On the face of it, this is far more than any current generation CPU can give us, but there is more. GPUs schedule threads in groups called warps; for Nvidia, the warp size is 32, and the 32 threads of a warp execute the same instruction in lockstep. Each SM also keeps far more threads resident than it has cores and switches between warps to hide memory latency. On the GTX 680, each SM can hold up to 2048 resident threads, so to utilize the full potential of this GPU you could have 8*2048 = 16384 threads in flight at a time! Clearly FAR more parallel than your average CPU.
I'd like to elaborate on the GPU execution model and maybe introduce a few more technical terms in my next post, but for now I'll wrap up by answering another question that may be bothering you: what's the catch? Many scientific endeavors which have shifted to GPU computing have reported large speed-ups, in some cases of orders of magnitude (which is no small feat by any means), but these problems have to be of a certain type. These kinds of problems are often called 'embarrassingly parallel' problems because the computations involved have very little inter-dependence and number in the tens of thousands or more. In my illustration above I calculated around 16k threads in flight at a time. The catch is that you need to make sure your problem is big enough that you have 16k threads to run! Another thing that hurts GPU performance is conditional statements. For reasons which I will get into later, they can often drastically reduce the speed-up that you get out of your GPU. So GPU computation can give you a gigantic improvement in run time as long as the problem that you are shifting meets certain minimum conditions. Quite often, though, you can improve the speeds that you get out of the GPU by approaching the problem differently. The single-threaded approach to problem solving is generally how we are programmed to think, but I fear the trends in computer science will render this way of thinking less and less useful in future.
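As a rough illustration of why conditional statements can hurt, here is a toy cost model in plain Python. It assumes the threads of a warp move through the code together, so a warp whose threads disagree on a branch has to run both sides one after the other. The function names and the cost numbers are invented for this sketch; it is a simplification, not a measurement of any real GPU.

```python
WARP_SIZE = 32  # Nvidia warp size mentioned above


def warps_needed(n_threads, warp_size=WARP_SIZE):
    """Number of warps needed to cover n_threads (the last one may be partly idle)."""
    return (n_threads + warp_size - 1) // warp_size


def warp_cost(takes_if, cost_if, cost_else):
    """Time steps one warp spends on an if/else under the lockstep assumption.

    takes_if holds one boolean per thread in the warp. If every thread agrees,
    only one side of the branch runs. If they disagree, both sides run in turn
    while the threads not on that side sit idle.
    """
    cost = 0
    if any(takes_if):
        cost += cost_if
    if not all(takes_if):
        cost += cost_else
    return cost


def kernel_cost(flags, cost_if, cost_else, warp_size=WARP_SIZE):
    """Sum the branch cost over all the warps that cover the per-thread flags."""
    total = 0
    for start in range(0, len(flags), warp_size):
        total += warp_cost(flags[start:start + warp_size], cost_if, cost_else)
    return total


if __name__ == "__main__":
    # 64 threads, half of which take the 'if' side.
    grouped = [True] * 32 + [False] * 32          # each warp agrees internally
    interleaved = [i % 2 == 0 for i in range(64)]  # every warp is split
    print(kernel_cost(grouped, 10, 10))      # 20
    print(kernel_cost(interleaved, 10, 10))  # 40
```

In this model the same amount of useful work costs twice as much when the branch outcomes are interleaved within each warp, which hints at why rearranging the data or the problem can recover some of the lost speed.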
So a different, more parallel way of thinking is the need of the hour. There are many reasons for this; perhaps that is the topic for another post. For now, I've introduced what GPGPU computing is, and we'll get into a few more details and possibly touch upon CUDA and OpenCL (two of the main languages for programming GPUs) in my next post.
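As a small postscript on what a 'more parallel way of thinking' can look like, here is a sketch in plain Python that adds up a list in two ways. The function names are made up for this example, and nothing here runs on a GPU; the point is only the shape of the computation. The serial version needs one step per element, while the pairwise version needs only about log2(n) rounds, because every addition inside a round is independent of the others.

```python
def serial_sum(values):
    """The single-threaded habit: one accumulator, one element at a time."""
    total = 0
    for v in values:
        total += v
    return total


def tree_sum(values):
    """Pairwise (tree) reduction. Returns (sum, number_of_rounds)."""
    data = list(values)
    rounds = 0
    while len(data) > 1:
        # Every pair in this round could be added at the same time
        # by separate threads, since no pair depends on another.
        paired = [data[i] + data[i + 1] for i in range(0, len(data) - 1, 2)]
        if len(data) % 2 == 1:
            paired.append(data[-1])  # odd element carries over to the next round
        data = paired
        rounds += 1
    return (data[0] if data else 0), rounds


if __name__ == "__main__":
    numbers = list(range(1, 9))
    print(serial_sum(numbers))  # 36
    print(tree_sum(numbers))    # (36, 3)
```

Both versions give the same answer; the second one simply exposes work that many threads could share.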

Thursday 23 August 2012

There and Back Again

In my last post I wrote about a fortunate series of events which saw me landing up at CERN. I would like to dedicate this post to my experiences there.
CERN was not at all what I expected. The main site, located in Meyrin (a fifteen-minute drive from the airport), is quite a large enclosure shaped somewhat like the bow of a ship. Once inside, I was somewhat surprised to see the buildings. Quite a few of them are pretty old and so are no more than concrete blocks. The buildings are not numbered serially, so first timers are often confounded when they see building 500 next to building 2 behind building 32. But it doesn't bother you as much once you get used to it. One structure that really does catch your eye as you enter is the Globe, a spherical building located just by the entrance.
CERN is located right at the border of Switzerland and France. The border cuts through CERN, and so a typical day involved unknowingly crossing the border a number of times. In fact, because of this, Switzerland bought a piece of land in France to build a restaurant! (The prices for the same food might have been different otherwise.) There is usually no check when you cross the border, so it isn't really inconvenient or anything, but it sure was interesting!
When I first arrived at CERN, I went to the building of the IT Department to complete some formalities before going to my workplace, which was in another building. The IT building is really cool! On the ground floor there is a small museum with many objects in a glass cabinet. All of them have some historic significance. There are some old chip wafers, ancient machines, magnetic tapes and so on. The most interesting one by far is an old black computer with a worn-out piece of paper stuck to it, on which you can make out a slightly faded message. The machine was the first server used by Tim Berners-Lee, and on it is written 'This machine is a server. DO NOT TURN IT OFF!'. Imagine that! The internet shuts down because the cleaning lady wanted to conserve power!
CERN has a rich and incredible history. Even before the Higgs announcement, before the LHC, they were at the forefront of science and technology for quite a while. Walk around for a bit and this becomes more and more apparent. Right from the plaque in front of a room in the building next to mine, which reads 'The Internet was invented here. Once the office of Tim Berners-Lee.', to the lawn outside the restaurant with the first bubble chambers sitting under the sun for all to see.
Every year CERN invites around 200 students from around the world to work on an internship there over the summer. These include scientists and engineers from various fields. One of the highlights of my trip was mingling with this diverse international crowd. You have to understand that this was my first trip outside the subcontinent, and it was overwhelming to see so many people from so many different cultures and backgrounds all in one place. There were people from China, Japan, Brazil, Sweden, Finland, Norway, Denmark, Macedonia, Lithuania, Greece, France, Italy, Romania and... well, you get the picture. It is just brilliant to see so many people from so many parts of the world just getting along with each other so easily. Makes you wonder why politicians crib so much.
At this point I think I'd better mention the 'elephant in the room' so to speak. The question most people want to ask when they find out you've been to CERN. Did I see the LHC?
First of all, the LHC is not this huge tourist attraction with a water park built inside it or anything. The whole ring is underground and no one is allowed near it when it is running because of the amount of radiation it emits. The only places where you might get close to it are the sites of the detectors. The four major detectors that collect data at CERN are ATLAS, CMS, LHCb and ALICE. The 'big' ones which announced the results on the 4th of July were ATLAS and CMS. These are more general purpose as compared to LHCb and ALICE, which are smaller in size but meant for more specific purposes.
On the night of the 3rd of July, from around 11 p.m. or so, you might have observed a queue slowly forming outside the main auditorium. It was the iPad release of the scientific world. By 4 a.m. on the 4th of July the queue was huge. At 6 it was apparent that the limit had been reached. When the announcement was made at 9 a.m. the entire auditorium was full and abuzz with excitement. Unfortunately, I was not in the auditorium at 9 that morning. I was in another building far away, watching the live webcast while attending an Intel TBB workshop where no one paid quite as much attention as they should have. Besides, I don't think I had it in me to wait through the night. I don't know much about particle physics either, so I doubt I would have got much out of it anyway.
The day of the big announcement was, believe it or not, just like usual. There were no processions on the street; no champagne parties; no one shouting 'Hallelujah' from the rooftops. It was business as usual. I'm not sure if the people actually involved in the discovery had a big party planned or anything, but the entire student community (who pretty much had nothing to do with anything) went out to the pub in a town nearby and celebrated the Higgs.
A typical week saw me working diligently all day at CERN and then sitting peacefully under a large tree on the lawn with the other students from the hostel. The hostel itself was located in France, next to a French village called St. Genis. I use the term village here because that is what I was told it was. But you would often find expensive cars zooming down the streets as you passed by the large villas in the corner of town. Its proximity to our hostel was perfect because it meant we had a supermarket 10 minutes away, and pretty much anything you needed you could find. On the weekends there was always something to do. You could hike up 'Le Reculet' in the Jura mountain range, go canyoning in a valley not far off, or go paragliding if the weather allowed it. If the physical activities started to get tiring, I would go down to Geneva for a music festival or just to stroll around the clear blue lake.
The best part of being in Switzerland in the summer time is that in July you can go down to Montreux to catch the annual 'Montreux Jazz Festival'. (In case you were wondering where you heard the name, listen to 'Smoke on the Water' by Deep Purple again.)
Montreux is a beautiful place to see. The city is architecturally impressive and is built in levels that step down to the lake. Standing by the lake, you could look up and see all the buildings at various steps going higher and higher until finally the green of the mountains takes over and climbs all the way up into the sky. With the summer sun shining down, we settled down on the grass in the park to listen to some incredible music of different types and genres played by some of the best bands from across the world. The music in the park and the jazz cafe are free, and you can hang around all day without getting even remotely tired. At night some well known artists play good jazz in two large halls located just as you enter. These are paid concerts and can cost anywhere from 60 to 300 Swiss francs. The two days that I went the concerts were sold out, which was understandable considering the artists playing were Bob Dylan and then Hugh Laurie the next day. Maybe next time.
For me, this was by far the most incredible summer ever! I am extremely grateful to all the people at the CERN Openlab and the people in charge of Google Summer of Code 2012. If not for their quick replies and their enormous patience in answering all my questions and requests, this summer might have just passed me by like any other. Instead, now I can tell people that for two years they were searching for the Higgs, and a fortnight after I went there, we found it.
Too much of a coincidence, don't you think?  ;)




Sunday 19 August 2012

An Unexpected Journey

If you had asked me four months ago what my plans for the summer were, I probably would have shrugged and mumbled something about applying for an internship somewhere. The fact is, I had no idea where I was applying or what I was going to do. Things might have remained the same and another summer might have just passed me by had it not been for one morning in mid-March when I decided to apply for the Google Summer of Code (GSoC) 2012. Little did I know that my plans were about to change, in a big way.

GSoC is an initiative by Google (I'm sure you could have figured that one out for yourself) to provide a platform for interested students to contribute code to open source organizations. The thing that makes GSoC different is that Google pays you for it. The catch is that you have to be a student pursuing a degree and above 18 years of age. The best part is that it does not matter what degree you are pursuing, which allows electronics engineers like myself to work on computer science projects. Basically, every year Google puts out a list of organizations that you can work with, which you can peruse at your convenience. Once you find a project that tickles your fancy, just send an e-mail to one of the mentors for that org and see how it goes from there. The most important thing that I learned while applying was that it is not about how much you know, but more about how much you are willing to learn and how dedicated and genuinely interested you are in your project. It really is an amazing initiative and if you're interested you should check out this link.

GSoC can be the most interesting, challenging and at times frustrating thing you have ever done. But to any computer geek with something to prove, GSoC offers a portal to develop code that can actually make a difference. The organizations that one can apply to are all incredible open source projects. There were quite a few I had never heard of and so it was interesting to look at such a multitude of good open source orgs that were looking for developers.
One of the perks of GSoC is that you can work from home. In fact you can work from anywhere in the world so long as you've got a work permit for that country. So I expected to spend my summer, like everyone else, developing software from within the cozy confines of my house. I expected to learn a lot about programming and code development over the summer. I did learn a lot about that, but I also took away much more. I got a chance to meet new people, experience new cultures, see new cities and landscapes. I got to try out new and exciting things and have all kinds of crazy adventures more than 4000 miles from home.

You see, the organization with which I worked this summer invited me to work at their headquarters at my own expense. It is located on the Swiss-French border, straddled by Meyrin, Switzerland on one side and a quaint French village called St. Genis on the other. The organization is quite well known and has in fact gained even more popularity because of a startling discovery this summer. This summer I worked with a toolkit called 'Geant4', which is maintained and used by the European Organization for Nuclear Research (CERN). Every time I think about that last statement, I have to pinch myself to make sure it wasn't all just a very elaborate dream. Either way, it has been one of the best dreams of my life.

Every year almost 200 students from across Europe and elsewhere are selected to go to CERN as summer students. This mix of engineers and physicists is joined by around 15 students from another program called the CERN Openlab. The Openlab is an initiative by 5 companies, including HP and Intel, to bring students working in IT to intern at CERN for the summer. This year there was one addition. The people at the Openlab graciously invited me to work on my project at CERN. I accepted the offer without a moment's hesitation and have never looked back.

This post was mainly to tell you about GSoC and the (unexpected) benefits of the program. I will devote another post to my experience at CERN itself, so stay tuned for that.