Karan Kurani. Software Engineer at Funzio, was M Eng CS student @ Cornell.
Love coding, listening to music, writing, traveling and photography.
[Photo taken in Nov 2011 @ Big Sur, CA]
- Built from scratch, scaled and added new features to the Analytics System at Funzio. This system was adopted by Gree after the acquisition and currently powers the entire analytics stack at Gree US.
- The analytics system includes batch processing and real time streaming of hundreds of Gigabytes per week. This includes collecting both performance and application specific data.
- Also worked on several game features on the server side as and when required.
Led the open source initiative within the company and the first two open sourced projects from Gree US studio -
- Open source version of an alternate front end for Graphite: https://github.com/gree/Orion . More description at http://karankurani.tumblr.com/post/28790569032/being-at-funzio-part-5-of-n
- Open source version of a log shipping system used internally. Project Relay - https://github.com/gree/Relay . More description at - http://karankurani.tumblr.com/post/42854935307/being-at-funzio-part-7-of-n
- The components involve use of Node.js, HTML/Client Side Javascript, Python, Java, PHP, SQL and NoSQL (MongoDB).
- Initiated, managed and led the company hackathons at Funzio and later at Gree. The 1st hackathon had approximately 20 participants while the 4th had around 200.
Student researcher in the following projects:
1) Multiview Ensemble Learning for Poverty Mapping (http://blogs.cornell.edu/ml4ics/category/ics-ml-projects/poverty-mapping/)
- Application of Machine Learning Techniques to Economics.
- Using Multi view clustering on Census and Survey to get improved estimates of the poverty level in a particular area.
- Part of the team for the research work and implementation of the machine learning side of the project.
2) Social Network Discovery of Computational Sustainability Community
(http://blogs.cornell.edu/ml4ics/category/ics-ml-projects/social-network-discovery/)
- Winner of Big Data Award at BOOM '11 (http://www.cis.cornell.edu/boom/2011sp/)
- Applied autonomous techniques to discover people who are doing work in the field of computational sustainability.
- Using research publication data to discover such people.
- Use combination of topic generation models, similarity measures, network analysis and statistics to rank papers/authors.
1) Course project in Advanced Language Technologies - User and Link Annotation for Online Social Networks
- Used a combination of network analysis and a generative topic model to automatically assign relevant topics/words to a particular user and the links/relationships between two users.
2) Course Project in HCI - Dual Wield
(https://chrome.google.com/webstore/detail/clkmkjiacaelbbhhklpkkjfgdpjekhpp?hc=search&hcp=main)
- A unique way to visualize wikipedia.
- Primary use in dual screen systems with the extension being aware of wikipedia pages surfed on the second screen.
- Coded up in a few days' worth of coding. Very rough and in early stages of development.
Worked with 6 team members to develop a flood forecasting system which makes hourly forecasts for a period of up to 48 hours. The aim is to implement this system on off-the-shelf workstations commercially available.
Co-authored and presented the lead paper in Disaster Management at Geomatics 2010 held at Indian Space & Research Organization – Space Applications Centre regarding the first prototype.
Used .NET framework and its parallel extensions library (.NET 4.0) to improve the performance of system by over 70%. The performance depends on the number of processors in the machine.
IEEE Student Member since 2007, IEEE Computer Society Member since January 2010.
Organized and co-managed Technodyssey ’08 and Technodyssey ’09 – A National Tech Fest held annually with participation of around 300 students and an organizing team of 20.
Managed 2 other institute level events.
Organized Innovision ‘09 (An annual techfest) which had 12 events and participation of over 300 students. Led a team which consisted of 35 people along with several volunteers. It was organized with the liaison of 3 other clubs – CSI, ACES & AMS.
Part of the initiating team for a social service activity by the club. Brought underprivileged children from the slums of Ahmedabad to our labs in the university to help them become computer literate. This activity has been so successful that it is now carried out by the CS department every year.
Managed 5 other institute level events and Colloseum ’08 (Annual Techfest).
Chameleon - All India Radio
Delicate and soothing, Leona Prue’s voice lends a new sound to this album.
Miami Showdown - Digitalism
A nice balance with strong “electronic riffs” makes this song a music worm. On repeat all day every day.
Tây Ninh - Tanimura Midnight
A really obscure band I came across while traveling through the pipes of the interwebs. Light, soothing beats make it an excellent sunday morning song.
Diamonds Are Forever - Kanye West
A song about blood diamonds from Sierra Leone.
Youth of the Nation - P.O.D
One hit wonder. But a really hard hitting song.
Song influenced by the high school shootings. http://en.wikipedia.org/wiki/Youth_of_the_Nation
Lights (Remix) - Ellie Goulding, Bassnectar
This song will worm into your head. Been on repeat since the last few days.
Reset - MuteMath
A blend of radiohead and muse influences their music. This is a very addictive soundtrack.
Moving Mountains - Ode We Will Bury Ourselves
From their album Pneuma. Blend of various subgenres lead to a unique sound.
This is part of a series of posts about moving halfway across the world to bootstrap my startup, Shoutt.
In my earlier post, Moving To India - Cutting Costs, I talked about my aim to find an apartment under $500 including utilities.
The Hunt
Hyderabad is a growing metropolitan. In the…
I moved to India on April 4th to work full-time on my startup - Shoutt. I was born in Hyderabad and migrated to the US 5 years ago for my Masters. Hyderabad and India have both changed quite a bit in that time. Or maybe it was me.
Since I got back, I’ve been constantly shocked and surprised…
//PROJECT RELAY
After all the trouble we went through by using Flume, we realized that Flume did not satisfy our requirements. So we decided to build an in house alternative which did not require all the bells and whistles. This also meant that it would be simpler to implement. It would be super simple, easy to maintain and configure with a few parameters… all the good stuff that every engineering project wants to have. So was it as simple, easy to maintain etc as we originally wanted it to be? … erm, maybe. We went through many iterations before we finally stabilized into a design which we were satisfied with.
Our initial design looked something like this:
We decided to implement it in NodeJS, since it had inherent advantages in handling concurrent network requests. Our app server connected to the service, which accepted TCP connections. It sent the data over the connection and, once it was done sending, closed the connection. Our service consisted of a bunch of “collectors” which accepted data from any app server that connected to them. A collector would initially log all the incoming data into a file and post-process it at periodic intervals to copy the data into Vertica. This design meant that the game code was responsible for handling all failure cases, like one of the collectors being down, the connection being too slow etc. In a full failure scenario, when all the collectors were down, the app server would simply log the unsent data to a local file, which was later copied into Vertica by our existing daily batch import job.
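To make the flow concrete, here is a rough Python sketch of the app server side of that design. It is not the code we shipped: the function name `send_event`, the JSON line format and the one second timeout are all assumptions made up for illustration.

```python
import json
import socket


def send_event(event, collectors, fallback_path, connect=socket.create_connection):
    """Try each collector in turn; if none accepts the event, log it locally.

    Returns the (host, port) that took the event, or None if the event was
    appended to the fallback file for the daily batch import to pick up.
    """
    line = (json.dumps(event) + "\n").encode("utf-8")
    for address in collectors:
        try:
            # One short-lived TCP connection per send, closed when done.
            with connect(address, timeout=1.0) as conn:
                conn.sendall(line)
            return address
        except OSError:
            continue  # collector down or too slow: try the next one
    # Full failure scenario: every collector was unreachable.
    with open(fallback_path, "ab") as f:
        f.write(line)
    return None
```

Note that all of the failure handling lives in the caller, which is exactly the coupling between game code and analytics that this first design suffered from.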
In about a week, we developed it and shipped it to production for a couple of our low traffic games. I then named it Relay since it was all about message passing from one server to another. Cool name eh?
Over the next week we debugged it, handled all the bugs and once it was sufficiently stable, started rolling it out to the other games. We had rolled it out to all the games, and the system scaled beautifully… until it broke down as we rolled it out to the last game.
//DEVIL IS IN THE DETAILS
One of the best decisions we made was to make the game code fall back to the old behavior in case Relay failed. And it worked wonderfully: we never lost a single bit of data while we were making the transition. The data might have come in a bit late (batch imported once a day), but it would eventually be available.
We rolled out the code to the low traffic games first and then to the higher traffic games, which had more data. We also had knobs which would control what set of events were sent through Relay, Flume or the daily import job. Our aim was to make ALL the traffic go through Relay eventually. And we started by just pumping as much as we could through Relay to see where it broke.
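A knob like that can be as simple as a lookup table. The sketch below is only an illustration in Python: the event names and the `ROUTES` table are hypothetical, and our real configuration is not shown in this post.

```python
# Hypothetical knob: event type -> pipeline. Anything unlisted stays on
# the old daily import, which is the safe default.
ROUTES = {
    "login": "relay",
    "purchase": "relay",
    "error_report": "flume",
    "battle_log": "daily_import",
}


def route_for(event_type, routes=ROUTES, default="daily_import"):
    """Return the name of the pipeline an event type should travel through."""
    return routes.get(event_type, default)


def partition(events, routes=ROUTES):
    """Group events by the pipeline they should be sent through."""
    by_pipeline = {}
    for event in events:
        by_pipeline.setdefault(route_for(event["type"], routes), []).append(event)
    return by_pipeline
```

Moving an event type onto Relay is then a one line change to the table, and moving it back is just as cheap if something breaks.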
So what happened when we rolled it out to the last game? The collector part of the process slowed down to a crawl because of the amount of data coming in. It simply couldn’t keep up with processing the incoming stream.
The collector had two parts: a listener and a copier.
The listener handled the incoming TCP connections, accepted the data and logged it immediately to a file. The copier processed this file at regular intervals and copied it into Vertica. Keep in mind that both parts were part of a single process.
The bottleneck was the second half of the collector. When it started to process the next file, the process would go into a massive slowdown and stop accepting TCP connections until it finished processing the file. It took quite a while to figure out why it was slowing down so much. It turned out that the garbage collector hogged all the processing while deallocating and reallocating memory. Even though NodeJS is asynchronous, it runs your JavaScript on a single thread, i.e. it can only execute one piece of code at a time. So if any one thing is executing, nothing else executes in parallel. But any time a task is waiting for I/O, NodeJS goes off and handles something else, which gives it incredible speed and scalability for applications which are continuously waiting for I/O or are event based.
NodeJS also did not have a clean way to iterate through a file line by line. So what I did was just load the entire thing in memory and then process it. That was also (obviously) causing the process to be memory intensive. We fixed it later by using the ‘lazy’ module, which helps in processing things sequentially without all the crazy hoops that NodeJS otherwise requires you to jump through.
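The idea behind the fix is to stream the file instead of slurping it. Here is a minimal sketch of that idea in Python, where file objects already read lazily. It does not show the API of the ‘lazy’ module, and the `iter_batches` name and batch size are assumptions for illustration.

```python
def iter_batches(path, batch_size=1000):
    """Yield lists of at most `batch_size` lines without loading the whole file."""
    batch = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:  # the file object hands out one line at a time
            batch.append(line.rstrip("\n"))
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch  # final, possibly shorter, batch
```

Memory use is now bounded by the batch size rather than by the size of the file.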
Once we verified that the second part was causing the problem, it took me about 15 mins to split them up into two distinct nodejs processes and deploy them out separately. To do this fast, all I did was make two programs with one having the “copying to vertica” part disabled and vice versa. The code was still exactly the same and it allowed us to test it out really quickly.
So the new Relay setup looked like this:
//DECOUPLING
This setup almost satisfied what we wanted, but not quite. We also wanted graceful handling of failures of various components of the system. This meant that the failure of any collector, copier or app server should not have any effect on other parts of the system. With the current setup, if the collectors went down, the game would start throwing a lot of internal errors when it was unable to connect to a collector (it would not affect the game itself but would lead to a slowdown in performance). This was still unacceptable because we wanted to completely decouple the application code from analytics.
After working on it a little bit more we got the final design as shown:
In the new setup the app server would simply log to a local file in a particular format. We introduced a relay sender on each app server, which ran continuously. It picked up the logged data and sent it to the collector. This completely decoupled the analytics system from the game and really helps us maintain it without touching the game code at all. Further, each module - sender, listener and copier - is a completely separate entity, so they can be used/maintained/upgraded/swapped in and out independently.
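The core of a sender like this is remembering how far into the log file it has already read. The following is a rough Python sketch of that one step, not the actual Relay sender: the `read_new_lines` helper and the byte offset bookkeeping are assumptions made up for illustration.

```python
def read_new_lines(path, offset):
    """Return (complete_lines, new_offset) for data appended after `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Only ship complete lines; a half-written last line waits for the next pass.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end
```

A sender loop would call this periodically, ship the returned lines to a collector, and only then store the new offset, so that a crash in between resends lines rather than losing them.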
Once we had this design, we realized that we could use the individual components separately in other parts of the system which had similar needs. These new use cases also prodded us to redesign Relay to be much more configurable and generic.
We think that other people may also find this system useful, and as part of our open source initiative, we have open sourced Relay at https://github.com/gree/Relay . It has the generic parts of the sender and listener, and they can be used independently. They are also configurable via just a few command line params.
This project has shown me the true value of modularizing your code more than any project for a software engineering class that purports to teach the same principles.
//INTERNS! ZOMG!
About a month and a half after I joined, Ram announced that the engineering team was getting interns. Four of ‘em. They were all undergrads from University of Waterloo and were coming here for a semester as part of their co-op program. This was the first time engineering was getting interns in the company. Ram walked over to me and told me that I was gonna mentor 2 interns. *GULP* Here I was, a fresh grad a couple of months into his first job, and I was supposed to mentor two other people?! This was a big responsibility to say the least. These interns were gonna have their first taste of industry through me. *Double GULP* All sorts of doubts and nervousness hit me for the next few hours. With these thoughts going on in my head I started training the interns.
It turned out that Waterloo has an excellent co-op program. So the students in the program study for one semester and work in the industry for the next sem. This meant that they gain a lot of industry experience by the time they graduate. The interns I got, Matt and Patrick had already interned at a couple of other companies before they joined Funzio. So (thankfully) I wasn’t the first guy who was mentoring them.
Waterloo students are incredibly smart and are also very fast learners. Both Patrick and Matt quickly picked up the basics of our processes and were comfortable with our code base in a week or so. When I had joined Funzio, my first piece of code was out on production within 2 days. That experience had given me an incredible feeling of doing something substantial, however small, and also a good amount of confidence. It instantly made me feel part of the team. I have been doing the same for every new hire (full time or intern) since then.
They quickly went from doing bug fixes to building full fledged tools and features. Of course there were a few hiccups along the way and I had to teach them best practices for coding style, scalability, networks and databases, but once they got a concept they applied it everywhere. They were incredibly inquisitive and were always hungry to learn more. Seeing them grow from slightly nervous students to confident software engineers was the best thing ever. Matt moved on to the iOS team and Patrick stayed on to help me with analytics. We built and improved various tools that have since become critical to the functioning of Funzio as a company.
Our first batch had a stellar performance. We had incredible fun working with them, and based on this success we doubled our next batch to 8 and had the same experience again. Both Matt and Patrick returned to Funzio/Gree for a second term and so did the majority of the interns from our second batch. Their second term was even better than the first since all of us were familiar with each other by then. Right now, the contribution our interns make is equivalent to that of any full time employee and I hope that the trend continues in the company through the years.
//PROJECT ORION - (UPDATED)
We use Graphite (http://graphite.wikidot.com/) for tracking metrics w.r.t. time. To put it in one word, it’s… awesome. It is highly scalable, you don’t need to do anything to add new metrics and querying data out of it is super easy. Graphite also has a default front end and, though it works, the rendered graphs aren’t… pretty. Also, at the time we were using it (July 2011), storing dashboards and accessing them through the default interface was very painful if not impossible. They have improved it a lot since then.
We had a custom front end where we could store dashboards which could be accessed by their URLs. Adding new dashboards was very painful, since it required code changes every time. It wasn’t a scalable method to add more dashboards, and we needed custom dashboards for every new game that we released. Plus, we wanted to have a dashboard in which the graphs would update live without needing to refresh the page to get new data. We also had some additional features which I will describe below.
Patrick C, an intern from Uni of Waterloo and I started off working on the project. We named it Orion - just because it sounded cool. In two weeks we had a working prototype out which was visible to everyone in the company. Here is what a typical dashboard looked like:
Looking at the links on the right, there are three sections. The left box is the heading under which links are grouped. The top group on the right holds internal links, which load the various dashboards asynchronously. We had several external dashboards that were also accessed frequently. The bottom half on the right holds external links (external to Orion) that open in a new window.
Next are the graphs which are rendered on the screen.
Here we have an example of the total number of server warns thrown by the application servers for our game CrimeCity - iOS. Internally, we define warns as something we want to keep an eye out for, and these warns usually spike when something goes awry with our deploys (although it doesn’t break the game completely). We used Highcharts (http://www.highcharts.com/) to render the graphs. It has a very modular structure and it’s easy to plug in your data. Whenever the mouse hovers over a graph, it highlights the details about that point in time, like the exact value at that time. It also allows you to temporarily disable some metrics on the graphs by simply clicking on the legend on the top right.
In Orion we can render as many graphs as we want to on a single page. If there are too many graphs you can have them in columnar structure and also mix and match the arrangement. Also, each graph can render as many metrics (graph lines in each graph) as we want.
This makes the tool highly flexible (almost as flexible as the original graphite interface). And it doesn’t require a technical person to add/remove/maintain dashboards. And accessing the most commonly used dashboards was never more than a couple of clicks away.
All these features combined made Orion a hub to keep track of all engineering metrics on production. It’s also used to keep track of anything that we need to measure in realtime, like number of installs, logins, number of transactions etc.
Lastly, we also wanted to have the ability to go back in time and look at the history of a metric. So we added a datetime picker to reload the entire dashboard for that time period dynamically. This also generated a unique URL so you can share this particular snapshot with anyone else.
I have not touched on how to create or edit the dashboards. The interface looks like the following:
Game name is the same as the overall tabs under which the dashboard is grouped. The dashboard name is pretty self explanatory.
You can pick the platform, game, metric and the specification you want. The metric text box has autocomplete since there are far too many metrics to create a dropdown of. The other period is used to plot the same metric over a different period of time so we can do a quick comparison of how it’s performing relative to x days ago. In the above screenshot we are plotting installs in the last 4 hours, and the same time period one day ago, to see the day over day comparison.
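For the curious, Graphite itself can produce this kind of comparison through its render API and its `timeShift` function. The Python sketch below only illustrates that mechanism and is not lifted from Orion; the host name and metric path are hypothetical.

```python
from urllib.parse import urlencode


def render_url(base, metric, window="4h", other_period="1d"):
    """Build a Graphite render URL plotting `metric` now and `other_period` ago."""
    params = [
        ("target", metric),
        # timeShift() draws the same series moved back by the given period.
        ("target", 'timeShift(%s, "%s")' % (metric, other_period)),
        ("from", "-" + window),
        ("format", "json"),
    ]
    return base.rstrip("/") + "/render?" + urlencode(params)


# Hypothetical host and metric name, for illustration only.
url = render_url("http://graphite.example.com", "games.crimecity.ios.installs")
```

Fetching that URL returns both series as JSON, which a charting library can then draw as two lines on the same graph.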
Adding an external link also has a similar interface where you just pick the game name/tab under which you want the link to appear and paste the link.
//BUILDING IT
We built the entire product from scratch in about 3 weeks. This was of course not the only project we were working on at that time. Also, neither of us was an expert in UI coding and we picked up the various tools/skills on the fly while building it.
The code was messy internally. As the product was changed to meet additional requirements (the feature to have the other period was introduced later), the code got messier, with a bunch of special conditions to check for the format of how we stored the data. It was a mishmash of jQuery, jQuery libraries, vanilla JavaScript, PHP, MySQL, CSS and HTML. We also had a NodeJS component to interact with, which was never used. Tying it all up into a single coherent product was challenging to say the least. I learnt a lot about how to design software systems from scratch. We stumbled a lot, crashed it multiple times, threw days’ worth of code out of the window…
But it all worked.
Nothing is as exhilarating as shipping your product and then seeing it in action. Orion became the de facto place everyone went to check the health of our systems. It became deeply integrated into our software release processes and monitoring systems. Whenever I walk around in the office, I see multiple screens with Orion on them. That is the best feeling ever.
A few months went by and we had Danny Bowman (https://twitter.com/dbow1234) onboard. He is an awesome UI engineer and he (along with Patrick C who had returned to intern with us for a second time) embarked on a project to revamp the entire product. It has cool new features like the ability to copy graphs/metrics, OAuth for user authentication and access, mobile optimized dashboard, ability to embed graphs anywhere else and more.
It currently looks like this:
We are in the process of open sourcing this product. It still needs work to be a complete flexible alternative to the native graphite dashboard and I hope with open sourcing everyone else can look at it and improve it further.
UPDATE - The project is now available at https://github.com/gree/Orion
4 - Flume and Fire Fighting
[ Note: This is a fairly long post. It also leans more on the technical side but I have tried to keep it as simple as possible. I felt the need to go into this much detail because the experience described below has affected me tremendously in the way I approach work and design systems. Of the many projects I have done at Funzio this one is the most memorable and the most hard hitting. ]
Around the time I joined, Funzio had decided to adopt Vertica as the analytics backend system. It is a highly scalable data warehousing system that supports SQL. It required minimal administration and support compared to other systems like Hadoop. I did not know anything about it. Ram gave me the 700 page manual and told me to figure out how to hook our game into it. Some other people had already started working on it before I was put onto this task.
//FLUME
We had decided to use Flume, an open source tool which streams logs from one machine to another. It had a plugin architecture which let us hook in our own code and write to whatever type of destination we wanted. It was written in Java, and Vertica had a JDBC connector which we used to build the plugin.
The game servers logged the analytics data locally to a file. Flume had the ability to watch a file and track what changes were made to it. So as we logged more data into the file, a Flume agent process would grab the changes and send them to a Flume collector process over the internal network. The collector would collect the additional log statements from multiple Flume agents and merge them. The collector also processed the data via our plugin and then dumped the transformed data into Vertica. Flume also had a master process which maintained the collectors and agents and dynamically rerouted traffic as and when needed. This is called an ETL (Extract, Transform and Load) pipeline in analytics and is a standard way of processing data. Reiterating, Flume had 3 parts to it -
- Agents, which watched the log file on each game server and sent the changes on.
- Collectors, which merged data from multiple agents, ran our plugin and loaded the result into Vertica.
- A master, which maintained the agents and collectors and rerouted traffic.
Flume has a feature called end to end reliability streaming. What this meant is that it kept track of each and every record that was generated from the agents and saw that it was loaded into Vertica. If the data was lost somewhere midway it would resend the data from the agent. Since analytics data is critical we had it enabled. In a few days we had the whole setup up and running ready to take traffic.
The process of releasing our games is in two phases. We do a “soft launch” first where we release our new game in 1 or 2 countries with very little marketing. We get a few thousand users and see how the game performs. We closely monitor the in game behaviour of the players, find the bugs and problems in the game mechanics, tweak the game and iterate. Once we are satisfied with its performance, we do a worldwide launch and go for a full marketing push.
//LAUNCH DAY
During the soft launch phase Flume and Vertica held up without any hiccups; it was only a few tens of thousands of records per day. We added more tables to track more stuff and the system accepted the new data without any problems. Then on launch day this happened…
The game was prepared for this traffic, we had 35 servers ready to handle it. We had also put more collectors (4 of them) in anticipation of the volume of analytics data increase. I thought that it was enough and that we can always add more collectors to handle more traffic. It turned out (as usual) I was wrong.
The first sign of a problem was subtle. As the day went by, we saw that the numbers were slightly off (by less than 0.5%). I thought that it was due to the lag between the time a record was generated and the time it was actually put into Vertica, which always used to catch up. But during the second day the numbers were off by a lot and we saw that the data was streaming in much more slowly (a lag of a few hours). This pointed to a serious problem in the pipeline.
We added more collectors and it seemed to catch up. But another problem surfaced. The numbers now seemed too high across the board; for example, the number of players logging in was significantly higher than the data we had from other sources said it should be. W-T-F was going on? Ram had had the foresight to tell me to generate a unique identifier (uuid) for every record that was created. Delving more into the data, we found out that there were tons of duplicate records in Vertica. Okay, I thought, since Vertica was a SQL database, we could put a primary key constraint on uuid and it wouldn’t accept duplicate records. I cleaned out the records, which took a mind numbing 5 hours of manual, mechanical labour, added the constraint on the field and got on with other tasks. Fairly straightforward right?
Wrong. Although the documentation listed primary key as a keyword, we missed the fine print that said that although the primary key constraint was supported… it wasn’t enforced. All Vertica could do was detect that the constraint had been violated and tell us AFTER the duplicates were inserted. I finally uncovered this fact a couple of days later. Back to cleaning up the duplicates for another day. We had to keep the database clean since it was used extensively for analyzing our game. We could not afford to have dirty data since Funzio is a very data driven company and all our decisions were based on that data.
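If the database will not reject duplicates at insert time, the uuid can still be used to filter them out before a load. Here is a minimal Python sketch of that idea. It is not what we did at the time (we cleaned up by hand), and the `drop_duplicates` helper and record layout are assumptions for illustration.

```python
def drop_duplicates(records, seen=None):
    """Yield each record whose uuid has not been seen yet.

    `seen` is a set of uuids that is updated in place, so it can be shared
    across batches to catch duplicates that arrive in different loads.
    """
    seen = set() if seen is None else seen
    for record in records:
        if record["uuid"] in seen:
            continue  # already loaded once: skip the resent copy
        seen.add(record["uuid"])
        yield record
```

The catch is that the set of seen uuids has to live somewhere and be bounded somehow, which is a design problem of its own.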
//MANIFESTATIONS
While all this was going on, the pipeline was slowing down due to the increased amount of data we were tracking along with the consistently high traffic. A few of the agent nodes had errored out and we had to restart them to bring them back up. This issue was chewing up more and more of my time, which could’ve been spent on supporting the game itself. A few days after the launch, it was 9 PM and I had just reached home. Alex, our sole business analyst at that time, noticed that the data pipeline had slowed to a crawl; it was backed up by around 8 hours. Ram looked at Flume’s monitoring page and saw that the agent processes were going into an error state with alarming frequency. This issue had become unmanageable and we had to tackle it ASAP. Time to go nuclear.
I was back at the office with Ram and Patrick (our analytics intern) at 10:30 PM. We shut down the Flume pipeline entirely so as to not exacerbate the issue and went to the drawing board. 30 minutes later we came up with a short term plan to deal with it. We split the traffic into two streams: one was high value but low volume data that was streamed in real time, and the other was high volume data that did not need to be available in real time. For the second stream, we set up a batch processing script that would copy it into Vertica once a day. We had the new system live on production in the next few hours and breathed a sigh of relief as we saw that traffic was flowing smoothly and was stable. We also understood the internals of Flume well enough to set up a monitoring system which alerted us the moment it saw signs of Flume getting backed up because of traffic.
I have not yet talked about why the duplicates were generated in the first place. We knew that we were logging a record only once, but it was getting inserted more than once. Why? The culprit was Flume again. Remember that we had enabled end to end reliability to make sure each record was getting through. When too much traffic went through Flume, the records were not getting inserted at a fast enough rate from the collectors into Vertica. The records would go through eventually, but not before the agent timed out, decided that the record was lost… and sent that record again to the collectors. This caused more traffic to be generated, which led to more timeouts, and the agents went into a death spiral, finally ending in an error state and crashing the entire system. This behaviour was not documented anywhere but was a manifestation of how Flume’s protocol worked. It took me more than a week to figure this out.
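The feedback loop is easier to see in a toy model. The Python sketch below is a deliberately simplified simulation and not Flume’s actual protocol: the tick based queue, the `simulate` function and all of its parameters are assumptions made up for illustration.

```python
from collections import deque


def simulate(new_per_tick, capacity, timeout, ticks):
    """Return (records_created, rows_inserted, duplicate_rows)."""
    queue = deque()   # copies of records waiting at the collector
    unacked = {}      # record id -> tick at which it was last sent
    inserted = {}     # record id -> number of rows inserted for it
    next_id = 0
    for tick in range(ticks):
        # Agents emit new records.
        for _ in range(new_per_tick):
            queue.append(next_id)
            unacked[next_id] = tick
            next_id += 1
        # Agents resend anything still unacknowledged after the timeout.
        for rid, sent_at in list(unacked.items()):
            if tick - sent_at >= timeout:
                queue.append(rid)  # the earlier copy is still queued too
                unacked[rid] = tick
        # The collector inserts at most `capacity` queued copies per tick.
        for _ in range(min(capacity, len(queue))):
            rid = queue.popleft()
            inserted[rid] = inserted.get(rid, 0) + 1
            unacked.pop(rid, None)  # acknowledged: no further resends
    rows = sum(inserted.values())
    return next_id, rows, rows - len(inserted)
```

With `capacity` at or above `new_per_tick` the model never produces a duplicate. Once the collector falls even slightly behind, resent copies start competing with fresh records for the same limited capacity, so the backlog and the duplicate count both keep growing.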
//FIREFIGHTING A.K.A YOU AREN’T OUT OF WOODS YET
Flume used to get backed up for short periods of time randomly, or so it seemed. We started to notice that whenever it got backed up, Vertica would have some sort of CPU or I/O load issues. The problem here was clearly our plugin. It could not copy over data to Vertica at a fast enough rate and was very sensitive to Vertica’s load. We coded up a new plugin which was more efficient in the copy process and tested it in our development environment.
But we could not QA it. The reason for this was the way our AWS (Amazon Web Services) network was set up. All of our servers were on the same private network, which kept all our deployment processes and service communication simple and easily maintainable. Also, Flume would dynamically reroute traffic to different collectors depending on load. This was a very useful feature. But this meant that any Flume process in that network would register automatically with the production network and participate in streaming data. We quickly shut down our new QA network as soon as we saw production data go into our QA Vertica server and vice versa. Configuring a whole new network and keeping it semi isolated from the other servers on the network was too much of a pain and we did not have the time or resources to do so. There were 5 server engineers plus Ram supporting two games (Crime City FB and iOS) played by millions of users. I was the sole guy maintaining the analytics system and I was also simultaneously managing other projects.
There were several other issues which caused Flume to back up once in a while. I had to go clean up the mess manually afterwards and find out what had caused the problem. All through this, Ram had given me the following advice which I shall follow religiously forever.
“Never ever get into a phase where you are constantly fighting fires. It sucks up your time and you cannot ever build more tools on top of that system. Always strive to minimize the time you spend fighting fires.”
Easy to read, sounds obvious, but a very subtle principle. It’s really easy to not notice that what you are doing is firefighting, and to think instead that you are “learning the tools” and trying to design around them. You can easily get stuck at that level and never move forward to build more advanced tools/systems on top, and it pulls you away from the overall vision of your plan.
We eventually decided to get rid of Flume altogether. Transitioning away from it seamlessly, without interrupting the production systems, presented another set of challenges which I may detail in another post.
It’s also worth mentioning that there was nothing wrong with Flume per se. Although it was still a project in its infancy (it was version 0.6 or something close to it), it was just the way we used it to attack the problem that was wrong. Also, I did not have a complete understanding of either Flume or Vertica, which caused the above mentioned issues.
//TL;DR
The overall lessons/principles I learnt:
- Research your tools. It may not be possible to know everything about them at once but always strive to learn the internal workings of the tools you use.
- Have a kick ass monitoring system to be your eyes and ears at all times. Also, log everything to keep track of how your stuff is working.
- Make sure that your design is flexible enough to be changed at a moment’s notice. You will throw your old work out of the window multiple times over the lifetime of any project. Learn to be at peace with that. The project will inevitably look completely different from whatever you initially thought of.
- Minimize the time you spend on “firefighting”. Again, minimize the time you spend on “firefighting”. Once more just in case it did not sink in, minimize the time you spend on “firefighting”.
[Comment: This is my favourite piece of writing so far. Let me know if you concur.]
I walked on,
Leaving everything behind,
All my sorrows and worries,
All happiness, anger and frustration,
Towards the light that was before me,
So alluring and enchanting it looked,
Twinkling like a diamond changing its hues,
I walked on,
All the memories,
Forgetting all the wars I’d waged,
All the battles I’d won,
Throwing away my emotions into a void,
My passions and love slipping into oblivion,
I walked on,
Into the Light,
It beckoned me with its soft voice,
A lovelier sight never seen before my eyes,
It was everything,
The only glimmer in my eye,
My body floated away,
My soul walked on,
I lay bare before it,
Everything in stark view,
All my virtues and sins,
Every little joy and regret inside me,
Lay there and defined me,
I passed into it,
And I walked on,
Never regretting any moment,
It was me and I was it,
I accept it and it accepts me,
And from then on,
We walked together.
First Task
I walked into the office at around 10 am. It was surprisingly empty. I’d later find out that the engineers usually come in at around 10 30ish (I started coming in later at around 11). I talked with HR, signed a bunch of papers and then met Ram - the CTO of the company, also my manager, and Andy - the VP of engineering. I was brought onboard as a server guy and some of the first tasks were a couple of bug fixes for the game Crime City. I was the 4th guy on the server team.
The setup of the local development environment took more than a day. The stack consisted of:
- iOS Objective C code for the game client.
- PHP code hosted on Apache on the backend.
- Sharded MySQL server as a database.
- Memcached for caching.
- Linux on the production servers.
- Git as the source control versioning system.
- P4merge as a diffing and merging tool.
- MongoDB was used as a database for analytics on some technical metrics.
- Graphite had just been integrated as a time series database for analytics.
- I used PHPStorm, Textmate and a little bit of vi as IDE.
My first major task came about 3 days in. Charlie, our business analyst, had to give a presentation about the Lifetime Value of a user in our first game, Crime City - Facebook. His presentation was scheduled in 2 days. He required me to build a script to calculate the metrics he had specified. I was asked to do this in Python - a language I had never touched before in my life. The script was querying a bunch of tables in a mysql database and grabbing the relevant records for each player, slicing and dicing the data and printing out the output in a CSV format. This was to be done for all the players that had ever installed the game - a few million of them. It took me about a day to write the script and I kicked it off at midnight to run overnight.
The next morning I found out that I had incorrectly coded up a part of the script, by which point the run was about 75% complete. It was killed and restarted. The fix had some additional queries which slowed it down considerably. Loading the whole data into memory was out of the question since the total data was around 50 GB in size. The estimated time to run the script was 16 hours and the presentation was due in 12 hours. Shit. I called up Ram at 11:30 pm and with his advice (“Try using threads and run parallel queries - Oh, and also make sure you don’t bring down the database.”), I completely rewrote the script and kicked it off at 1 am, hoping it was fast enough and praying that it would not bring down the database. It finished in 5 hours flat.
At the end of the first week I had written a couple of bug fixes which were already in production affecting millions of users, learnt a new language, and written a fairly advanced script in that language that processed tens of millions of records to give data which was used to decide what steps the company was gonna take. Fun times.
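Ram’s advice can be sketched roughly as follows. This is not the script I actually wrote: `fetch_metrics` is a hypothetical stand-in for the real per-batch MySQL queries, and the column names, batch size and worker count are made-up values. The idea is to split the players into batches, run a bounded number of queries in parallel, and write rows out as they arrive instead of building one big in-memory structure first.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def chunked(ids, size):
    """Yield consecutive slices of `ids`, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_metrics(player_ids):
    """Hypothetical stand-in for the real per-batch database queries."""
    return [(player_id, player_id * 2) for player_id in player_ids]


def export_metrics(player_ids, out, batch_size=1000, max_workers=8):
    """Query batches in parallel and write the rows to `out` as CSV."""
    writer = csv.writer(out)
    writer.writerow(["player_id", "metric"])
    # max_workers caps how many queries hit the database at the same time,
    # which is the "don't bring down the database" part of the advice.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in batch order, so the CSV stays sorted.
        # Finished batches wait in memory until their turn to be written.
        for rows in pool.map(fetch_metrics, chunked(player_ids, batch_size)):
            writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    export_metrics(list(range(10)), buffer, batch_size=3, max_workers=2)
    print(buffer.getvalue())
```

Threads work here despite Python’s global interpreter lock because the time is spent waiting on the database, not computing. The worker count is the knob to turn down if the database starts to struggle.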
I struggled,
Flailing my arms around,
Desperately trying to reach him,
He just sat there smiling,
Staring into my eyes,
My chest heaved,
Gulping,
I screamed at him for help,
He stared at me,
Reminding me of all the deceitful words,
That I’d spewed out all my life,
I couldn’t stop,
It rose up,
Slowly and surely,
It was now up to my chin,
I couldn’t stop,
I won’t stop,
Hoping more of it would save me,
Today was the day he had waited for,
He sat there and smiled,
Patiently,
Waiting for me to drown,
In the sea of treachery and betrayal,
That I had created.
First day/Hardware setup
About a week before I joined, I received an email from HR about what kind of workstation I would prefer. Wait…what?! I get to pick my own fricking workstation? Perk number one of a well funded startup. My choices were:
- A PC or A Mac (I chose the Mac).
- Screen size (13, 15 or 17 inch) - I chose the 15 inch.
- For the main monitor, a dell or an apple display (I picked the apple display).
After finishing up the paperwork I was taken to the storage space where they kept all the hardware. All my hardware was lying packed and sealed there. I picked up all of it, went to my desk and started setting it up. The feeling of setting it up all by myself was surreal and weirdly felt like an initiation ceremony. All the hardware is still registered under my name.
My initial setup looked as follows:
- A 15” beefed up macbook pro (8 GB RAM, 500 GB SSD HDD, 512 MB video graphics card, 2.3 GHz Intel Core i7). The laptop cost around $4500.
- Apple Cinema 27” display. $1000
- Apple Magic Mouse. $70
- Apple Bluetooth Keyboard. $70
I’d gone all apple fanboy crazy just because I had the opportunity to do so. As I spent more time with the hardware I found out that the cool looking things are not the best. It needs to be ergonomic, especially since being a developer you are spending most of your non-sleeping moments on a computer.
I researched how to best maintain my posture and adjusted my gear accordingly. About two months after starting, with lots of experimenting, I’d swapped out the keyboard and the mouse. I use an ergonomic Logitech gaming mouse and Microsoft’s split ergonomic keyboard. There is an additional full keyboard from Apple connected because others hated the split configuration and they often end up “driving” my workstation when debugging an issue or while pair programming. My current setup is photographed below:
I have raised the monitor so that I don’t have to bend my neck while looking at the screen. A good heuristic is that the monitor’s top should be in line with the top of your head. My chair is also configured to my needs. My arm rests are raised up to the desk level so that my arms lie flat on the table from the elbow onwards, and the keyboard is designed to curve in so that the wrists do not bend, since wrist bending is the primary cause of carpal tunnel syndrome.
A couple of people in the office currently use standing desks and I am thinking of giving them a shot at some point.
Lastly, an important part of a dev’s setup is the set of headphones, especially in an open office environment. I have a preference for over ear/circumaural headphones. I started off with a trusty pair of HDD 220 which I had bought as a student. I couldn’t wear them for a long period of time since they pressed hard against my spectacle frames on the side. A few months in, I decided that a comfortable pair of cans would be a good investment and I chose HD 558 for their comfort factor.
Funzio always took care of its developers and one of the most visible signs of how well a company takes care of its employees is the kind of tools (both hardware and software) it provides. Funzio is at or above the industry standard in this department.
“Shit.”
Passive by A Perfect Circle was playing. It was my favorite “F You” song. I sat there. Finally everything as it was meant to be. It was dark. I’d pulled down the blinds so that I can do this in peace. I closed my eyes and listened to the song…
…
Leaning over you here
Cold and catatonic
I catch a brief reflection
Of what you could and might have been
It’s your RIGHT and your ability
To become my perfect enemy
…
I thought about my very short and uneventful life. How I’d conformed during most of it. How in the last 2 years I shed that attitude and behaved like an angst ridden teenager even though I was 25 now. The world was forlorn and empty. I had seen it, experienced it in all its ugly glory. The music stopped. I opened my eyes. It started again.
“Shit.”
I smiled at the appropriate selection of the randomized playlist I’d put on my laptop. The Great Gig in the Sky. I stared at the rope hanging from the ceiling. It has been a trusty companion over the last 4 years and I was sure it won’t let me down (I smiled at my own pun) today. The voice of Clare Torry came on. It screamed at me. I sighed in pleasure. I stood up, went over to the chair and put my head in the noose. It felt so good. Finally something to hang on to. The voice of Clare subsided. I stood there for some moments. Waiting. The seductive sound came back on. Slowly at first, it rose in its intensity, I closed my eyes. I was soaking in its ecstasy, it started to drift off. I wanted to drift off with her. I kicked out the stool. It was straining hard. I never thought it would be so painful. Then it broke.
“Shit.”
The damn rope could hold on to the boxing bag for 4 years but it gave out on me. I lay there smiling. It seems my laptop was observing me. It was playing A Bittersweet Symphony. Life, indeed, has something else in store for me.
…
But tonight I’m on my knees yeah
I need to hear some sounds that recognize the pain in me, yeah
I let the melody shine, let it cleanse my mind, I feel free now
But the airways are clean and there’s nobody singing to me now
…
I climbed onto my bed. And I slept feeling content for the first time in many years.
// Hello Funzio
Life is serendipitous. How I ended up at Funzio is one such tale that exemplifies it. In the next few weeks, I will chronicle my journey in the valley.
It was mid April 2011. I had just come back to Ithaca from California. I’d finished interviewing with TellApart. It was a very bad one. I had not answered any of their questions properly and they probably thought I was a fool wasting their time (based on my performance, I agreed with them). I’d performed so poorly that they had cut off the interview midway, only 2 people spoke with me before I was politely sent off back home. So all in all it wasn’t a very good week for me. I had wasted their time and my time traveling all the way to California and back.
I had fallen in love with Silicon Valley the moment I reached there. I had walked around SF a little bit and just seeing all the tech around was an exhilarating experience. As I was walking down the street, I could hear people discussing scaling, optimizations, website design etc. I knew I wanted to end up here.
It was 2 days since I had come back to Ithaca. I was about to doze off for the afternoon when I saw an incoming call. The number’s location was SF, California. “Great, here is the rejection call from TellApart”, I thought. I picked up the phone.
R: “Hello, is this Karan Kurani?”
Me: “Yes.”
R: “Hi, my name is R. Is this a good time to talk?”
Me: “Yes.”
R: “I am looking at your resume right now. We think you have a very good profile and would like to talk to you about Funzio.”
Me (very confused): “But I have not applied to Funzio. I don’t even know what Funzio does.”
R: “Yes. But we are looking for exceptional candidates and would like to talk to you about Funzio. Can I have a few mins of your time?”
Me (feeling weird because I had never been cold called before): “Sure.”
R: “We are a social gaming startup. We make facebook games. Our first game is called Crime City which is out on facebook. It has over 7 million monthly active users. We are just a 20 person company and we are looking to grow fast. I would like you to talk with our engineers here.”
Me: “Erm. I’m not so sure about it.”
R: “You seem to be unsure about it. Have you been to california?”
Me: “Yeah. I was there a week ago.”
R: “That’s great! What did you think of it?”
Me: “It was fun. I really liked it there.”
R: “Wouldn’t you like to have the opportunity to work in a startup in california in the heart of silicon valley?”
That one statement sold me in an instant. We talked a little bit more, during which R told me more about the company and set up additional calls.
I went through two phone screens before I was called on site. The first phone screen was with an engineer - let’s call him A, and the second was with Ram - the CTO of the company. This was the first time that I had directly interviewed with a C level executive of a company - a positive indicator that the company was still very small at this point. The interviews were technical, but the questions they asked were very different from those of any other company. The interview style was very free flowing, even on the phone. It just seemed very easy to talk with Ram and A. After the phone screens, I became more positive in my outlook of Funzio (although I thought it was a funny/dorky name and I can’t get used to it - even today).
I went through the on site interview. It was 4 hours of grueling technical questions, one of the toughest sets of interviews I had encountered. Funzio had a very high bar for its engineers. I was later on the other side of the table as the interviewer (more on that in later posts) and saw how high that bar was indeed.
I had chatted with all the co founders of the company by the end of the onsite interview. I had also dug around the internet to have a look at their professional track record. I really wanted to work at Funzio at that point. The interview process was very fast and they called me up about 5 mins after I walked out of their office telling me they thought I was a good fit and would like to make me an offer. I was elated and I knew that I was gonna accept it even before I had seen it.
As I walked into the BART station I knew I was gonna be part of something special.
A seductive veil over my eyes,
Flowing and flapping in the wind,
Constantly changing my perspective,
Deceiving, Confusing and Blocking,
Its strength invisible,
It can make a Vulcan weep with joy,
Or it can be Medusa itself,
Rendering lifeless my very core,
It makes me jump over a cliff,
Or Sit in the same place for days on end,
While my mind wanders away,
Prancing around in the Castles built in air,
It is the only thing that I want in the end,
I want it back from everyone I give it to,
For it is free to give and receive,
But the most priceless thing a man can ever give.
- Karan Kurani
Ever since I have read Design of Everyday Things by Don Norman, I have started noticing discrepancies in usability everywhere (especially doors). I wanted to write about a very irritating usability issue in the design of Caltrain stations.
I moved to the Bay Area in mid June. I did not have a car back then and commuted to work using Caltrain and BART. I got on at the Hillsdale Caltrain station and transitioned to BART at Millbrae.
I used the Clipper card since I used the service regularly and it’s immensely easier than buying tickets using cash everyday. Caltrain has this tag on and tag off system where you tag on with your Clipper card at the station where you board the train and it’ll temporarily charge you the maximum amount for the train ride. When you get off, you just tag off at the station and the difference is refunded back to your card.
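The tag on / tag off accounting can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not how Clipper is actually implemented: the `ClipperCard` class and the fare amounts are hypothetical.

```python
MAX_FARE_CENTS = 1300  # hypothetical maximum one-way fare


class ClipperCard:
    """Minimal model of the charge-maximum-then-refund scheme."""

    def __init__(self, balance_cents):
        self.balance_cents = balance_cents
        self.tagged_on = False

    def tag_on(self):
        # Boarding: hold the maximum fare, since the destination is unknown.
        if self.tagged_on:
            raise ValueError("already tagged on")
        self.balance_cents -= MAX_FARE_CENTS
        self.tagged_on = True

    def tag_off(self, actual_fare_cents):
        # Arriving: refund the difference between the hold and the real fare.
        if not self.tagged_on:
            raise ValueError("not tagged on")
        self.balance_cents += MAX_FARE_CENTS - actual_fare_cents
        self.tagged_on = False
```

In this model, a rider who cannot reach a machine to tag off never gets the refund, which is why the placement of the machines matters so much.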
They have these machines which you use to tag on and tag off scattered across the station. And they are literally scattered, it’s as if someone was throwing stones for fun and decided to put up a tag machine wherever one landed.
Let’s have a look at the Hillsdale Station first. I am drawing from memory and I have not used Caltrain for over a month. All the points/lines are approximate but it is enough to show my issue. Behold my awesome paint skillz.
The green lines are Exits/Entrance to the station and the red spots are where the tagging machines are. There is a parking lot all along the west edge of the platform and another one on the northwest. There is also a SamTrans stop near the south west exit.
The issues: As you can see, there are too few tagging machines and worse, they are all concentrated near the north end of the station. It leads to a myriad of issues. Anyone who is unfortunate enough to board/get off the train near the south end has to go right up to the north end, tag, and then go wherever they want to (like the car I had parked near the south end). The north part of the train gets overcrowded since everyone waits there after tagging on. People who arrive late rush to the north end, running hither and thither in order to not miss the train. And so on. It gets worse if the station is crowded for an event like a Giants game.
Let’s look at the Millbrae station (Caltrain section).
I may have missed a tag machine on the west end since I never went past the north half of the platform. The blue lines represent exits to the BART section. The blue line on the west end is a staircase which leads to the other side of the platform.
Again, the tagging machines are haphazardly laid out and there aren’t enough of them. Here the problem is intensified while trying to catch the BART train. Sometimes the Caltrain is late and the BART train is about to leave in 30 seconds. Whichever end of the platform I get off at, I need to first tag off at a machine and then reach the BART entrance. If I am on the north end I have to run towards the south part since the tag machines are there. And it’s not just me who is trying to catch the train, which leads to overcrowding near the tag machines. Oh, and there is also a Clipper recharge machine right near the north BART entrance, so if I want to recharge my Clipper card, I need to go back north after tagging off.
These are only a handful of issues I have pointed out. The solution I propose for the problem is following the most common sense usability rule - Consistency. Have a tag machine near every entrance/exit. If the entrance is wide (let’s say more than 20 feet), have a tag machine on either end. If it’s even larger then have a tag machine every 30 feet. Do not have machines where there is no entrance/exit (like the one on the south end of the east side on the Millbrae platform) since people need to use the entrance and exit to get in and out anyway.
I don’t know whether there are the same problems with other stations but the 4th and King Caltrain Station in SF does follow the consistency rule and so it’s very convenient for passengers to use the tag machines.
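The placement rule is simple enough to write down as a function. This is one possible reading of it, sketched in Python. The function name is hypothetical, and the exact cut-offs are my assumptions: up to 20 feet gets one machine in the middle, up to 30 feet gets one on either end, and anything wider gets one every 30 feet plus one at the far end.

```python
import math


def tag_machine_offsets(entrance_width_ft):
    """Return machine positions, in feet from one edge of an entrance."""
    width = float(entrance_width_ft)
    if width <= 20:
        # Narrow entrance: a single machine in the middle is enough.
        return [width / 2]
    if width <= 30:
        # Wide entrance: one machine on either end.
        return [0.0, width]
    # Very wide entrance: one machine every 30 feet from the first edge,
    # plus one at the far edge so neither end is left without a machine.
    offsets = [float(x) for x in range(0, math.ceil(width), 30)]
    offsets.append(width)
    return offsets
```

For example, `tag_machine_offsets(70)` places machines at 0, 30, 60 and 70 feet. The point is not the exact numbers but that the layout follows from the entrances, so a passenger can predict where the machines are at any station.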
I have been using google+ for some time now. And their interface is best described as elegant. They have all the hallmarks of interface tricks used mostly by Apple. All the little things, like circles rolling out when I delete them. It gives me a feeling of… the only word I can use to describe it is joy, when I am using it.
They have used the mouse hover event very very very nicely. They have managed to get the extremely fine balance of using mouse hover so that it is not irritating but useful.
Like when I am managing circles, if I hover on a person, their corresponding circles are highlighted. It tells me instantly whether I have added them and in which circles they are included. Nothing more. The next one is when I hover over a person’s name: a small box fades in which shows me, again, the circle information about that person. If I hover over the circles box I can add them to or remove them from circles with - finally - a single click. All of this can be done with no more than a single click.
Mouse clicks are expensive, much more expensive than hovering over something. This leads me to a rule of thumb. I don’t know whether it is already a design principle or not, but here is what I think. Hover should be used for reviewing something, while clicks should lead to an action. And it’s good if the action happens without leading you to a new page. It makes a web page more of an app rather than a page. In Google Plus, every time you hover, you get some information about the entity. But when you want to do something with the info, i.e. do some deliberate action, you click. Whether something goes on hover or on click is determined by whether it shows info or is an action to be done by the user.
In the last couple of weeks both Facebook and Gmail have revamped their chat interfaces… for the worse. I will list out first the short-comings of Gchat/Google Talk and then facebook chat.
Gchat:
Before Google+ was introduced, here is how I used google chat. I would click on a person’s name and a small box would open on the bottom left with it active. I would start chatting. If I clicked on more people their boxes would open right beside the existing ones. I used to switch between various chat boxes by using tab and shift+tab. But it got broken once google+ was released. Now every time I want to switch between different chats, I have to drag the mouse and click. It’s very irritating, especially when I am switching between multiple boxes on multiple tabs on multiple windows on multiple monitors. Earlier I could get to the chat using a few strokes on my keyboard, but now that flow is completely broken.
Another thing I have noticed is that the chat windows in google+ and gmail are synchronized… just not completely. One thing in particular is that if I open a chat box in one tab (say gmail), it does not open in google+ at once. If I type in something and send it to the other person, it does not get updated on both chat boxes on both windows. Nor does it update when I receive a response. But when I do close a chat box in one of the two, it gets closed on the other one too! It is very disorienting and some uniformity in the synchronization behavior would go a long way in improving the user experience.
Facebook:
Facebook… aah facebook. Your user experience has become more and more irritating. Is it the frequent user interface changes? Or TMI due to every random person you have met once in your life friending you? Or something else? I do not know. But using facebook is no more the joyful experience it was before. Anyways. On to the chat interface. They updated it to include video chat and made a few changes to the UI. And they are all detrimental to the experience.
First, the chat button, when I click on it, it shows the list of people online. Earlier, the chat button used to be there so that I could click on it again to close it. It was a very quick way to check who is online and then close it again so it doesn’t use the screen space. Now, the chat button disappears once clicked, and in its place the search box is selected. If I want to close the chat list, there is no visual cue on how to do so. You can close it by clicking on the top bar! Again, breaks the flow.
Next is the experience with the chat boxes themselves. Earlier, multiple chat boxes could be open simultaneously so I could glance at the conversations I was having with all the people at once. That is no longer the case: if I click on one chat box, all the others close! So there is a lot of clicking involved in just viewing friends’ responses. And they have also broken the tab and shift+tab support they had earlier to switch between chat boxes.
End of my rant for today. I hope they improve the issues with the user interface I mentioned and make chatting experience pain free again.
Lessons learnt. Cornell provides among the best academic education in the world. So I won’t go into that. I want to talk more about the non academic lessons I learnt here, which will help me in my life forever.
Diversity:
Cornell has an incredibly diverse student body. Forget about the whole campus, in every department, you will encounter and work with people from all over the world. I have personally worked on projects/assignments with people from US, India, Malaysia, Bulgaria, China and Greece. Apart from that, I am friends with people from Turkey, UK, Japan and Kazakhstan among others. Unfortunately, I have seen that people tend to cluster with people of same nationality. They have missed cultivating relationships with some of the smartest and most amazing people in the world. I have often heard jokes that start with “An American, Chinese and Indian guy walk into a bar… ” I have been that Indian multiple times. In the spring semester, every time after HCI class, an Indian, Malaysian and Turkish guy went for lunch - and just talked. Hanging out with different people has opened my mind in a very inexpressible way. You gotta do it to feel it.
Team work:
Picking the people you work with is the most important decision you will make anywhere. And that includes your school work too. I have been fortunate enough to work with extremely good people. Without my team mates it would have been impossible for me to perform the way I did. Assuming you did the same amount of work, having a good team mate makes the difference between getting a C and an A+. Apart from the work performance, I have learnt a lot from my team mates too. They have their strengths and you have yours. The practical experience of dividing and managing work with good team mates is amazing and kinda spoils you. If you have a good chemistry, then most of the time, issues do not rise up since you know them well. And when they do, you can communicate transparently without any fear and resolve it quickly.
Faculty and Mentors:
As I mentioned in the earlier posts, Cornell has one of the most amazing faculties ever. The professors are more than just teachers. They will guide you through your time there and, if you want to, even after graduation. The network you develop will yield benefits throughout your life and you get the opportunity to talk and work with the best people from the field.
Hard Work:
In the earlier posts, I have chronicled how much work I had to handle while studying there. The M Eng program makes us do 30 credits of work in 9 months. It is incredibly intense and truly shows you how much work you are capable of doing. This is the one thing every person who passes the program learns to do. You just need to pass the program to gain the confidence to handle any kind of work that will be thrown at you at any point of time in your life. It doesn’t matter whether you like the work or not, you will gain the ability to do it when required. It is the best thing you will learn when you pass out of not just Cornell, but any good university in the US.
Living in a Foreign Land:
I had lived away from my hometown in India for 6 years before I came here to study. Although I was prepared for the loneliness you feel being away from home, immersing yourself into another culture is an entirely different - and exhilarating - experience. It’s tough at first, but one gets used to it pretty soon.
This ends my set of blog posts about my experience at Cornell. There are many experiences and lessons which I did not write about, but which will remain with me forever. The time at Cornell has been the best time I have had till now. I have started work in the industry in San Francisco at Funzio as a software engineer. I hope my experience is as enriching as the one I had at Cornell.
My Semester End Exams were coming up. Sem End were the same as mid terms but a longer version. The project presentations and paper write ups had gone well. I was prepared for them and the exams went well. I went off to India for the winter break. Before I left, I had already received a grade for the M Eng project. An A+, I was extremely happy and thought of it as a good start for the final grades. The grades started coming in and I managed to get straight As. I ended up with a GPA of 4.069 for the first semester. Whoa! I had never imagined that I would perform this well. It instilled me with a confidence that I can do anything with sufficient hard work and concentration.
After a month back at home, I was back in Ithaca. The winters are one of the characteristics of the university without which Cornell wouldn’t simply be Cornell. It was the first time in my life that I was living for an extended period of time in snow. And Ithaca has bad winters. It turned out that this winter was one of the worst ones. It was weird to wear three to four layers of clothing whenever I had to step outside. Trudging around in the snow was a pain, but the campus was extremely beautiful when it snowed and covered everything in pristine white. Ithaca also had a good snow ploughing infrastructure, so most of the days the roads and paths were cleared in the morning. When I had landed at Ahmedabad, it was about 12 am at night, and it was winter there. My folks and a friend had come to pick me up. All of them were bundled up wearing jackets and caps. In the car, I started the A/C because I felt hot. Everyone started yelling at me for being crazy. I guess I was a true Cornellian now.
The second semester was a relaxed version of the first one. I knew what to expect and how to manage the insane amount of workload we get. And it wasn’t just me. Everyone in the program was much more relaxed and stress free for the second semester. I even managed to fly out to California twice during the semester. I loved the place and I finally got a job there. I went through the same process as described in the previous blog posts, first few relaxed weeks, work load increasing, getting into study rut, working your ass off, getting stuck in snow and then finally graduating at the end of May.
The graduation day had a surprise in store for me. I went to the ceremony tent and I was told that I was sitting in the front row. It was known that the people sitting in the front row were awarded. I could not believe I had bagged the Award for Academic Excellence CS M Eng 2011. It was the best moment ever! Its been an incredible 9 month journey and its been the best time of my life so far. I will truly miss everyone and everything there. I will summarize about the general lessons learnt from my experience in the next couple of posts.
The Social Network Discovery (SND) project required a lot of background work since we were starting it from scratch. I was part of a team of 5 people. I was responsible for keeping track of day to day activities and overseeing the implementation details. I was fortunate enough to have great team mates and a good team fit. Two were PhD students who had tons of research experience and knew how to attack the problem. My friend Jason had over 6 years of industry experience and was the software engineering guy. Theo was the project supervisor. He kept the project on track and guided me throughout my time at Cornell. He is a great advisor and I have learnt very important lessons from him pertaining to research, academics and life in general.
A couple of weeks before my mid terms and after I had agreed to work on the SND project, my laptop conked out. Come what may, I could not get it working. I sent it in for repairs and it turned out that the motherboard had burnt out. To make matters worse, Lenovo (in all their infinite wisdom) did not have a single replacement motherboard available. You can skip the rest of this paragraph if you don’t want to concern yourself with a rant bashing Lenovo. This model was in production and they were selling it when I sent it in for repairs. But Lenovo did not have a single replacement motherboard available in the whole country! They said that they needed to get it shipped from China… and it was gonna take them a month to do so. Yep, they would not remove a motherboard from a new laptop which was in their stock and return mine asap. Nor would they send me a temporary replacement. Lenovo has caused me a lot of pain and anguish. I am gonna write one post detailing it.
So back to my situation, I was stuck without a laptop for a month. For a CS student at Cornell, being without a laptop means you are gonna spend the better part of your life in the lab. Everything you do, you are gonna do on the computer: write assignments and programs, read research papers, submit assignments, keep track of your projects, emails etc. Since I did not have a laptop, I started spending all my waking moments in the lab. Because the SND project was on fast track, I had to get up to speed on the research in the area very quickly. This was very tough. My schedule every day for the next month was similar to this one:
Wake up - Lab/Classes/Assignments up to lunch - lunch - Lab/Classes/Assignments up to dinner - dinner - Lab/Classes/Assignments until you are very sleepy - Go home and sleep - repeat
All this while, I was also interviewing with companies. I had scheduled a final round with a company on November 22 (a Monday). It turned out that I had to deliver results for the SND project on Wednesday morning. It was crunch time. I remember we were close to getting results out on Saturday night. But then we found a bug in our program on Saturday afternoon. I was busy debugging it through Saturday and Sunday. I saw that I would not get the results without some help. I called Jason at 9 on Sunday night and he came over. On a Sunday night, the lab was full; everyone had something to submit. We worked our way through. By 1:30 am we had the final program up and running. I went home, ate a snack, took a shower and then dressed up for the interview. My flight was at 4 am. Straight from the airport, I reached the company premises at 11:30 am and went through 4 interviews and a test. Finally at 7 pm, I reached the hotel and fell on my bed. It was the most exhausting weekend I ever experienced at Cornell.
Time was running out and I needed to choose my M Eng project soon. Every semester we got access to a list of projects that were posted by professors. The variety of projects was huge and after a while, it became a problem of choice because all of them looked good. They ranged from systems and theory/algorithms to Machine Learning, HCI and so on. Some of them were straightforward implementations for specific requirements. Some were research related. I had done some research in my undergrad and had liked it. So I shortlisted projects which were research based. I first contacted Dr Theodoros Damoulas (Theo). He was looking for people to do a project which was researching applications of Machine Learning to Poverty Mapping (http://blogs.cornell.edu/ml4ics/category/ics-ml-projects/poverty-mapping/). The project was a part of ICS (Institute for Computational Sustainability) which aimed at establishing a completely new sub area of Computational Sustainability. (Details: http://www.cis.cornell.edu/ics/)
I was excited to be part of such a movement. I joined the project and started getting myself up to speed on the problem. It took me 2 weeks to just understand the basics of the problem. Research is hard and not for the faint of heart. Doing research is like walking in the dark, feeling around the room until you know the layout completely. The successful researchers were the ones who were really passionate about solving the problem. It needed a lot of discipline and dedication to not give up.
My mid terms were coming up. Unlike in Nirma, the mid term times were decided by the course instructor and there was no formal week during which mid terms happened. I had mid terms in two courses. One was open notes and the other was open laptop. Contrary to what people think, open book exams are tougher because you are expected to tackle the given problems by applying what you have learnt in a very creative manner. And if you do not know the concepts and start looking for them in your book/notes, then time flies and you cannot complete the paper. So you need to be very well prepared to take these types of exams.
Open laptop meant that we could use our laptops to solve the given problems. We were not allowed to access the internet though. In fact, one or two problems were designed such that you needed to write small programs to solve them quickly. Writing those scripts on the fly while taking the exam was unlike any experience I’d had before. It made me use my brain instead of just exercising my memory.
A week before mid terms, Theo called me to his office. He mentioned that they were interested in doing a new project which was high priority. It was related to mining research databases to find new researchers who were interested in doing work related to Computational Sustainability. I was told that it would be on fast track and the project would be on a very tight deadline. They asked if I was interested. It sounded fun to me and I took it. This project plus a combination of (unforeseen) circumstances made the next month the toughest time of my life.