’Tis the season of Quantum Computing

Exciting news broke last month as Rigetti Computing posted a new paper on arXiv, “Unsupervised Machine Learning on a Hybrid Quantum Computer”. So I set about understanding what this startup does and why it thinks it can take on the megalithic IBM, Microsoft and Google in the quantum computing space.

A new challenger appears

Rigetti is not a household name like its competitors unless you are in the quantum computing research space, but it might become one soon. The founder is Chad Rigetti, who got his PhD from Yale and worked at IBM for a while. Chad’s previous work, and the focus of this company, is building quantum computers using superconducting circuits.

Superconducting qubits are not the only way to build quantum computers (alternatives include trapped ions, optics, etc.), but they seem to be the architecture currently leading the race. My earlier post about Google’s quantum computing result was based on the D-Wave quantum annealer, which is also built with superconducting qubits.

The paper details the performance of their flagship 19-qubit quantum computer. They then go on to implement an algorithm called QAOA – the Quantum Approximate Optimization Algorithm. I went through the QAOA paper and, to be honest, was a bit disappointed. The algorithm breaks a really hard problem (the NP-complete MAXCUT problem) into two parts: one part is efficiently solvable on a quantum computer, while the second is still open for future work. I wanted to write more about the algorithm but decided not to, since I felt there is a lot of room for doubt regarding its actual speed advantage on practical applications.
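For context, here is a sketch of what MAXCUT asks (this is my own toy illustration, not the QAOA circuit from the paper): given a graph, split its vertices into two groups so that as many edges as possible run between the groups. Brute force works on tiny graphs but scales exponentially with the number of vertices, which is what makes the problem hard.

```python
from itertools import product

def max_cut_brute_force(n_vertices, edges):
    """Try every 2-coloring of the vertices; exponential in n_vertices."""
    best_cut, best_partition = -1, None
    for assignment in product([0, 1], repeat=n_vertices):
        # An edge is 'cut' when its endpoints land in different groups.
        cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
        if cut > best_cut:
            best_cut, best_partition = cut, assignment
    return best_cut, best_partition

# A 5-vertex ring: being an odd cycle, at most 4 of its 5 edges can be cut.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
cut, partition = max_cut_brute_force(5, edges)  # cut == 4
```

QAOA attacks the same objective with a shallow parameterized quantum circuit instead of this exhaustive loop; the open question is how well the approximation holds up as the graphs grow.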

My best guess as to why the Rigetti folks chose this algorithm is that they wanted to showcase their computer and picked a novel algorithm to demonstrate rather than chase the leaders in the field. I think it worked. Their publication generated significant press buzz and brought the company into the public consciousness. For people in the know, the paper highlights a more important fact – that Rigetti’s 19-qubit quantum computer can perform both entangling and measurement-based quantum operations and run custom algorithms.

Let a thousand startups bloom

It is an exciting time for quantum computing; major results are coming through faster than I can read and write about them. I still haven’t gone through IBM’s big result from 2 months ago.

The ongoing research war among D-Wave, Google, IBM and Microsoft is riveting and churning out some interesting results. These companies are attracting top talent from universities, which are themselves getting a big recruitment and funding boost. Now that VCs are interested, startups like Rigetti, Quantum Circuits and hbar are beginning to appear and challenge the big dogs. I am really rooting for QC to break through the cost/effectiveness barrier and become mainstream. My career would come to an ideal full circle if I end up as a QAI/QML engineer someday.


Moreover, there is plenty of money and glory to be had. The race is still on to reach quantum supremacy – the point where a quantum computer beats the best classical computer and algorithm on a given problem. Once quantum computers are better than classical computers at a certain task, that advantage is unlikely ever to be reversed, barring some truly incredible finding on the same order of awesomeness as P=NP. How profitable that task will be remains to be seen.

Beware of quantum winter

However, to ring a somber note, I am beginning to feel that the hype is slowly creeping up beyond what can actually be delivered in the near term. John Preskill, a top Caltech researcher, wrote a detailed essay suggesting that quantum supremacy may not be the correct goal.


Based on my research experience I have to agree, and I think there is much more at stake than just some minor disappointment. There is a real chance that quantum computing might go through something similar to what artificial intelligence went through during the “AI winter” from roughly 1970 to 2000.

Back in the 1950s–60s, academic research in AI/neural networks became extremely popular. Billions of dollars were pumped into the AI market from academia, defense and industry. However, the hype grew too fast, and the chip technology and computing power were just not there to support it. As practical applications became hard to come by and companies failed, mass disillusionment followed. As a result, not only did the press and the general public tire of AI, but defense, venture capital and academic funding also dried up. Even though computing technology kept advancing, it is generally believed that the lack of funding and the mass exodus of researchers in the 70s–90s greatly delayed the eventual resurgence of AI/ML.

The technological challenges facing practical quantum computers – noise and decoherence, magnetic and optical trap stability, crystal and detector efficiency – are being steadily overcome year after year. Most academics I know believe that the technology will surely get there, but it is not there yet. Now, thanks to the hype created by D-Wave, Google, IBM, Rigetti etc., it seems that the world is ready for quantum computers… today.

The first signs of “QC mania” have already started and money is pouring in. There are only two ways this can end well –

  • Either people, especially investors and managers, should temper their excitement and allow the technologies to mature before demanding returns on their dollar. Needless to say, I don’t have much faith in this option.
  • The other option is that the researchers in all the companies and universities are able to race and outperform the market hype and we do indeed get beyond proof-of-concept in the next 5-10 years. I am hoping this happens.

If neither happens, we might end up creating a QC winter and push the timeline for practical quantum computers from 15 years to 50 years, and that would be a real tragedy. Because I would have retired by then 🙂


How many pixels does it take to screw a neural network?

A fascinating paper was put up on arXiv last month, “One pixel attack for fooling deep neural networks”.

Many computer scientists and casual users might agree that the most delightful success of deep neural networks (DNNs) thus far has been image recognition. Though I am personally not much of a photographer/social media autobiographer, I have heard colleagues and friends rave about the capabilities of apps like Google Photos. Apparently, without any human help, Photos lets you search through your pics for arbitrary queries. There are other cool features, but I think this one already packs enough punch! Is ML finally reaching the elusive AI distinction, where it can “understand” abstract concepts such as ‘sunsets’ or ‘hiking’ and find the right context in images despite the myriad variations of locations, positions, lighting, quality and people?

Understanding DNNs

A lot of research has been pumped into understanding what makes DNNs (or convolutional neural networks, CNNs, as used for images) such a powerful learning structure. There have been a few interesting papers which attempt to extract human-readable explanations of the predictions made by a neural network. I personally find interpretability a very interesting, if not the most interesting, part of AI/ML research, so I will delve into that tangent in future posts. Suffice it to say here that the CNNs/DNNs currently being used in cutting-edge research are very hard to interpret.

Unsurprisingly, “tech” companies have few qualms about using things they don’t fully understand. And for good reason. Somehow, DNNs just work! The physicist in me demands: but do they? How do we know? One good way to test these inscrutable beasts is to probe them with data and measure the response. ML practitioners might recognize this exercise as testing on a holdout data set.
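For readers unfamiliar with the jargon, here is a minimal sketch of holdout testing (my own toy example on a synthetic data set, not anything from the papers discussed): fit on one half of the data, then measure error on the half the model never saw.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3x + 1 plus a little noise.
x = rng.uniform(-1, 1, 200)
y = 3 * x + 1 + rng.normal(0, 0.1, 200)

# Split into a training half and a holdout half.
x_train, x_test = x[:100], x[100:]
y_train, y_test = y[:100], y[100:]

# Fit a straight line by least squares on the training half only.
A = np.vstack([x_train, np.ones_like(x_train)]).T
slope, intercept = np.linalg.lstsq(A, y_train, rcond=None)[0]

# Holdout test: error on data the model never saw during fitting.
test_mse = np.mean((slope * x_test + intercept - y_test) ** 2)
```

A low holdout error is taken as evidence the model generalizes, which is exactly the assumption I want to poke at below.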

In my opinion, as a philosophy of testing, cross-validation or testing on a holdout set is far from satisfactory. The origins of cross-validation lie in statistics, from the straightforward rationale that predictions made by a model built on one data set can be tested on a different data set to check whether the model generalizes. However, models in statistics are highly reductionist and always interpretable. Sure, cross-validation was a good test, but the true test of a statistical model has always been what it is able to explain. The statistical model is often simply a means to quantify cause and effect where both the cause and the effect are independently well understood.

Classic ML models such as logistic regression and SVMs were born in statistics journals, with highly interpretable parameters. Yet once the emphasis shifted from explanation to prediction, it made sense to create arbitrary features and parameters with scant justification as long as it improved the predictive accuracy. With the explosion of so-called black box models like Random Forests and DNNs in the last few years, the last explainable scaffolds were dropped and we took a collective leap of faith putting all our eggs in the data testing basket. In fact when I began learning about ML a few years ago, many top practitioners were proclaiming that as long as you had a sufficiently large and accurate data set and a sufficiently complex and trainable model, almost anything could be learnt. As a theorist studying the math at the time, I agreed wholeheartedly. Now as I am building practical models myself, I find that “sufficient” is easier said than done.

To draw an analogy from physics, if we are treating the model as an object that we need to understand, testing with a validation data set is akin to throwing data at the model and measuring what bounces back. In that sense it is not very different from scattering or tomography techniques such as CT scans. Now if your model has the complexity of a flat wall, throwing a few objects at a few different positions and speeds and measuring what comes back will tell you everything there is to know about the wall – its height, width, texture and strength. But if your model has millions of parameters and looks more like the Taj Mahal, then understanding its complete shape and structure is not easy. “Sufficient” is not a helpful quantification of the number of objects you need to throw at the Taj Mahal to understand it. Most statistical models are simple, like walls, while DNNs are anything but.

The answer is ‘one’

And that is quite apparent in this paper. The authors show that changing a single pixel in a 1024-pixel image can fool the VGG-16 model.


Three specific results of this paper stand out to me –

  1. The authors were able to fool the model for more than 66% of the images. To be completely fair, in a few of the above pictures I can almost see why the model might be confused, especially the misclassifications between dog and cat. However, some mistakes don’t seem to make any sense from a human’s perspective, e.g. horse to automobile and airplane to dog.
  2. On average the model was more than 97% sure of its predictions when it was being fooled. This fact undoes a lot of the credit I mentioned in the previous point. A single pixel can almost never convince a human to change a prediction so decisively. Even when fooled most humans would have attached a much higher degree of uncertainty to their assessment.
  3. Some images can be made to look like any class. In the example below, I can hardly make out the original image which is supposed to be a dog. But apparently VGG can be convinced with near 100% certainty that this image is an airplane, bird, truck or frog, all with just one well placed pixel. Examples like the one below are baffling and really bring to focus the question, “What exactly has the model learnt?”
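To make the mechanics concrete, here is a toy sketch of the search such an attack performs. The model below is a deliberately simple two-class linear classifier I made up for illustration – it is emphatically not VGG-16 – and the exhaustive loop stands in for the differential evolution the authors actually use.

```python
import numpy as np

# Toy stand-in for an image classifier: a 2-class linear model over a
# flattened 8x8 grayscale image, with hand-picked weights so the search
# is easy to follow. (Hypothetical model, not the paper's VGG-16.)
W = np.zeros((2, 64))
W[0, :] = 1.0    # class 0 responds to overall brightness
W[1, 0] = 40.0   # class 1 responds strongly to the top-left pixel

def predict(img):
    return int(np.argmax(W @ img.ravel()))

image = np.full((8, 8), 0.5)   # predicted class 0 (logits 32.0 vs 20.0)

def one_pixel_attack(img, original_class):
    """Search over (row, col, value) for one pixel change that flips
    the prediction; the paper uses differential evolution instead."""
    for r in range(8):
        for c in range(8):
            for value in (0.0, 1.0):  # try extreme intensities
                perturbed = img.copy()
                perturbed[r, c] = value
                if predict(perturbed) != original_class:
                    return r, c, value
    return None

attack = one_pixel_attack(image, predict(image))  # -> (0, 0, 1.0)
```

The point of the sketch is only the attack surface: a single coordinate in a huge input space is enough to cross a decision boundary, and the attacker never needs the model’s internals, only its predictions.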


So do we AI or not?

So at this point, some might wonder: is it time to dismiss all AI/ML as smoke and mirrors? A more nuanced question is: are many neural networks horrendously overfit, to the point where they have simply memorized massive datasets? I don’t want to profess any authority here; I personally have not worked with VGG or the other state-of-the-art image recognition models, so I do not have a very detailed understanding of how well these models generalize. Yet even in my experience, CNNs and DNNs are much more fragile than I would like. The models are incredibly powerful, achieving very high accuracy over millions of examples using seemingly robust validation and testing protocols. Nevertheless, a relatively small systematic change in the input data can potentially render a model almost entirely useless. In many cases it is easy to include the new data in the training examples so that the model then performs well over both types of data. Yet the opacity of the learning, and the inability to transfer learning without retraining, do not inspire a lot of confidence in the model’s ability to represent human-level abstractions.

Now this could be interpreted as a classic case of the much-touted “moving goalposts” problem in Artificial Intelligence. Chess was once thought to be the line that separated true intelligence from mere calculation. However, we eventually realized that the chess-playing program was not very intelligent, just a very good calculator, even if no human could beat it at chess. We have already heard that image recognition models are now at nearly human-level capabilities. Am I claiming that image recognition could also turn out to be a problem solved with programming rather than anything resembling intelligence?

No, I am actually claiming that image recognition is not solved, at least not with the finality of chess. The world of the chess program is finite: 64 squares, 32 pieces and a dozen rules. The world of images is not finite. As this paper handily demonstrates, one likely reason we think we have reached human-level image recognition is that our testing data sets are narrow and biased. Our understanding of the model is limited to its response to our testing data, and the bizarre gaps between human and machine responses are revealed when the model is presented with unusual or adversarial data. We presume we know which images are the hardest to understand and test our model on those, while not realizing that a single defective pixel might already be too much to handle. And at this level of model complexity, I am not sure it is even possible to detect all the various failure modes of a model simply by throwing cleverly designed testing data at it. Fortunately, it seems to me that some top AI researchers are also moving away from the claim that intelligence can be completely validated with “reasonable” amounts of data, though there is nothing close to a consensus.

I don’t think the goalposts have moved yet; solving image recognition along with all the other advances in language processing, sequential learning and reinforcement learning could still put us on the right track to understanding intelligence. The VGG model is no longer the state-of-the-art, many other (deeper) architectures perform much better. However, the metric of their success is still accuracy on a testing data set. I think we need a more robust method of determining if an intelligence problem has been “solved” and my guess is, the “solution” will require more ingenuity than just deeper neural networks.


First mover (dis)advantage in industry and academia

A month or so ago, Google held a press conference to announce its latest result in quantum computing using superconducting qubits. The full technical paper is up on Physical Review X and arXiv, with an easier-to-understand summary on their research blog. Shortly after that, my friend and former classmate Shantanu Debnath was part of a new breakthrough result from the University of Maryland: a programmable trapped-ion quantum computer. I find it interesting that the latest advances in QC/QI are coming from both industry and academia.

I went into academia with a very staunch belief that research needed to be unbiased and unfettered by monetary constraints. Of course everything still costs money, but academia does create an approximation of the ideal. The absence of direct monetary incentives, plus tenure for professors, encourages curiosity-based research and risk-taking. There is definitely some element of ‘selling’ your research, which many researchers might find a bit distasteful. However, the ‘buyers’ are often other researchers: peer reviewers, funding agency program managers and journal editors. Even the worst of this bunch belongs to the top 0.01% of the world in their respective fields.

Having jumped ship to industry now, I am trying to observe how research in industry works. At first glance, it seems that all of the above criteria are violated. Research in industry is always motivated by monetary gains, either for a company or an individual. Researchers also usually do not have the security of tenure. Their position is under much greater threat from market forces, management decisions and economic factors beyond their control. Finally, even though they might publish their work in peer reviewed journals, the measure of success of these researchers is ultimately tied to the success of the company. In that sense, the true ‘buyer’ of their work is not a specialized group, rather it is the general public. ‘Selling’ research to the general public is far more difficult and requires vastly more packaging.

The interplay of all these forces makes for a fascinating arc through history. Almost always, academic research spends at least 10–30 years pursuing a given direction before anyone in industry gets wind of it. First-mover advantage is definitely huge in academia. The most agile research groups, those able to invest early in a new technology, can reap the rewards for decades to come. The most relevant example is NIST Maryland, which pioneered laser cooling technology back in the 1990s. Today the collaboration of NIST and the University of Maryland has some of the best technology in laser cooling and optical traps, which still makes them leaders in related fields decades later.

Once the academics have chipped away at a difficult and risky problem and brought it within striking distance, a few bold souls from the private sector will attempt to seize the first mover advantage. The strong monetary and competitive incentives push industrial research rapidly in the next few years. Yet, as I am learning more about the history of innovation in the Bay area, it seems that when it comes to world changing technologies, being the prime mover in the private sector is not necessarily an advantage.

GO Corporation, General Magic and, to some extent, Fairchild Semiconductor are examples of companies that could not last long enough to see the fruits of the technologies they themselves pioneered. In every case, companies that came around 10–20 years later perfected the product once the technology was more mature and the general public more receptive. Intel, born out of a dying Fairchild, went on to become one of the most dominant corporations ever. Apple used GO and General Magic’s vision, and some of their founders, to create the iPhone, probably the single most successful product in history.

I think that in academia, being too early is not as disastrous. For one, academic research does not have to turn a profit year after year to sustain itself. Secondly, since the adjudicators of success are often other scientists, it is possible to convince them of the value of the work even if experimental confirmation and technological validation are many years in the future *cough* neural networks, string theory *cough*.

But being too early is death in the marketplace. Tony Fadell (the man behind the iPod and iPhone) has explained in many an interview how Apple literally sculpted the market by first selling the iPod and then adding one feature at a time (video, podcasts, iTunes) until the market was ripe for the iPhone. The general public is not much into beta testing new technologies, so the first few iterations of consumer technologies are often destined to fail.

It is tempting to extrapolate this “first-mover disadvantage” to today’s hubbub around quantum computing. The seeds of quantum computing were sown in the 1970s, and it has been a hotbed of academic research over the last couple of decades. In the last few years, QC has rocketed into mainstream news, with every major tech company angling to have a stake in the future.

Realistically, I think that a consumer product utilizing quantum computing is still some distance away. Of course, as with most research, unforeseen challenges and unexpected discoveries may drastically alter the timeline. But based just on my personal assessment of past developments, the estimated time-to-consumer for QC should be a minimum of 10 years, maybe 20. That also seems exactly like the time-frame that pioneers of previous revolutions took to file for bankruptcy so that the next generation of companies could stand on their shoulders and change the world. The bold claims of General Magic in 1993 to build an anytime, anywhere handheld communications device seem like a close parallel to D-Wave announcing the world’s first commercial quantum computer in 2011.

So am I saying that all the companies that are around today investing heavily in quantum computing will be gone before you can solve the travelling salesman problem on your phone? 🙂

Google’s latest quantum computing results explored!

The D-Wave quantum computer has been a focal point of great controversy in the academic and research communities. Back when I was doing my PhD in Quantum Optics and Quantum Information – and was thus technically in the race to build a quantum computer – I have to admit I was mildly skeptical of D-Wave’s claim of having made the first commercial quantum computer.

This attitude was a result of two facts: first, for a long time D-Wave staunchly maintained that its research was proprietary and did not allow open peer review of its work. Secondly, many accomplished and reputable physicists have openly stated, both in the press and to me personally, that in their opinion the D-Wave is not really a quantum computer.

On Dec 8th, Google issued a big press release in collaboration with D-Wave. The bold claim is that the D-Wave ‘quantum annealer’ does in fact beat the equivalent classical algorithm by an astonishing factor of 10^8. The team was nice enough to release a paper on arXiv explaining the theory and experiment behind the statement. The paper has yet to be peer reviewed, so I decided to check it out myself.

The title of the paper is “The computational value of finite range tunneling”.

Quantum tunneling is a phenomenon that allows a system to cross an energy barrier without actually climbing over it, instead tunneling through. The most common analogy is throwing a ball at a wall and having it come out on the other side. This phenomenon arises from the fact that, according to quantum physics, objects are not rigid entities but clouds of probability density. What we physically recognize as the ball is simply the region with the highest probability of finding the ball. However, there is a non-zero probability of finding the ball outside this region, say on the other side of the wall.

For large macroscopic objects like balls and walls, this probability is vanishingly small and we would never see this happening even if we tried it for a billion billion years. However when we go down to the microscopic scale, these quantum effects can become quite substantial and quantum tunneling has been observed for particles like electrons and protons.
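Just how vanishingly small? A rough back-of-the-envelope comparison using the standard WKB estimate for a rectangular barrier makes the scale gap vivid (my own illustrative numbers, not figures from the paper):

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s

def tunneling_probability(mass_kg, barrier_J, width_m):
    """Crude WKB estimate for a rectangular barrier:
    T ~ exp(-2 * d * sqrt(2 * m * (V - E)) / hbar)."""
    kappa = math.sqrt(2 * mass_kg * barrier_J) / HBAR
    return math.exp(-2 * kappa * width_m)

# An electron facing a 1 eV barrier 1 nm wide tunnels with small but
# very real probability (on the order of 1e-5)...
electron = tunneling_probability(9.109e-31, 1.602e-19, 1e-9)

# ...while for a 100 g ball and a 10 cm wall, the exponent is so huge
# that the probability underflows straight to zero.
ball = tunneling_probability(0.1, 1.0, 0.1)
```

The exponential dependence on mass and barrier width is the whole story: shrink the system to a single particle and tunneling becomes routine; scale it up to a ball and it is gone.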

The paper, as the title states, aims to establish just how useful this quantum effect can be for computing.

And according to the paper, it is very useful when applied to annealing problems. Which raises the question:

What is simulated annealing?

Some computational problems can be posed as optimization problems, i.e. the parameters of the question are embedded into a mathematical function in such a way that the answer to the question is the minimum value of the function.

However, finding the minimum of an arbitrary function is not always easy, especially if the function has multiple local minima. In such cases, sometimes it is preferable to find a good approximate answer rather than the exact answer which might take a long time. This is where the concepts of annealing come into play.

The system is assigned a ‘temperature’, which governs the probability of the system jumping to another state. Then the system is allowed to ‘cool’, i.e. the probability of these jumps is slowly decreased over time. As a result, the system spends more time in states which have a low ‘energy’, given by the optimization function. Therefore, as the system ‘cools’, it is more likely to reach a state close to the global minimum than any other.

The origin of this process comes from metallurgy where worked metals are heated and allowed to recrystallize slowly into a low energy, low defect state. Simulated Annealing (SA) is simply this process being simulated on a computer.

The probability that simulated annealing jumps out of a local minimum decreases with temperature, but also decreases exponentially with the height of the wall surrounding the well. Moreover, narrow wells have a smaller probability of the system landing in them. Therefore SA performs very well when the optimization function has many shallow, wide wells, but very poorly when the function has deep, narrow wells.
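The procedure described above can be sketched in a few lines. This is a minimal Metropolis-style toy on a one-dimensional double well with made-up parameters, not D-Wave’s hardware or the paper’s benchmark:

```python
import math
import random

random.seed(42)

def f(x):
    """Double-well objective: a shallow local minimum near x = +1 and
    the global minimum near x = -1.035 (value roughly -0.305)."""
    return (x * x - 1) ** 2 + 0.3 * x

def simulated_annealing(x, T=2.0, cooling=0.999, steps=20000):
    """Propose a random step; always accept downhill moves; accept
    uphill moves with probability exp(-delta / T); slowly lower T."""
    best_x, best_f = x, f(x)
    for _ in range(steps):
        candidate = x + random.gauss(0, 0.3)
        delta = f(candidate) - f(x)
        if delta < 0 or random.random() < math.exp(-delta / T):
            x = candidate
            if f(x) < best_f:
                best_x, best_f = x, f(x)
        T *= cooling
    return best_x, best_f

# Start in the *wrong* well at x = +1; while the temperature is high the
# walk can still hop the central barrier and settle in the deeper well.
best_x, best_f = simulated_annealing(1.0)
```

Note the `exp(-delta / T)` acceptance rule: it is exactly the exponential barrier-height penalty discussed above, and it is the term quantum tunneling sidesteps.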

But guess what works really well if you want to jump through really tall walls? Quantum tunneling. Using a quantum system for annealing, or Quantum Annealing (QA), eases the computational cost of SA by adding an additional method by which our (quantum) ‘ball’ can get from the shallow well to the deeper well. The exponential nature of the cost means that even a small tunneling probability can mean a huge gain in computational power.

That covers the introductory sections of the paper. The bulk of the paper describes the specifics of the function, and the quantum setup they used. The main points can be summarized thus –

  • The quantum system used in the D-Wave is a grid of qubits. A qubit (quantum bit) is the quantum analog of a classical bit: a two-level system that can be in state 0, state 1, or a superposition of both.
  • Different algorithms were tested on the same problem and compared on the number of steps required to reach the answer with 99% confidence. The test candidates were QA, SA, an algorithm called Quantum Monte Carlo (QMC) and other classical algorithms.
  • The problem was constructed specifically to cause the worst case performance of SA. Therefore these results do not represent the performance of QA vs SA on an average problem.
  • Under these conditions, it was found that QA can require as many as 10^8 times fewer steps to arrive at the solution than SA or QMC.
  • SA is very bad at problems that contain many deep and narrow wells separated by tall walls, because the jumping probability decreases exponentially with the height of the barriers. Here QA can solve the problem much, much faster.
  • QA is bad at problems where the wells are far apart, because the tunneling probability decreases exponentially with the separation between the wells. Here QA offers no additional benefit over SA.
  • The tests were repeated for 180, 296, 489, 681 and 945 variables to see how the solving time scaled with the number of variables.
[Figure: Tunneling provides a large advantage to QA over SA when finding the minimum between A, B and C; however, in the long-range transition from C to D, QA loses its edge over SA.]

So back to the original question. Is the D-Wave a quantum computer?

Well, the D-Wave does indeed display some distinctly quantum phenomena, such as entanglement and tunneling. However, the paper finds that the D-Wave has the same scaling as QMC. I will not elaborate on QMC here, except to say that it can be run efficiently on a classical computer. The D-Wave is reported to be much faster than QMC, but by a large polynomial factor, not an exponential one. Therefore the D-Wave does not offer a quantum speedup over classical computers and, strictly speaking, cannot be called a quantum computer.

However whether or not the D-Wave is a true quantum computer is more of an academic question (and a question about the PR strategy of D-Wave).

Optimization problems are a very important class of problems, especially in the context of machine learning and artificial intelligence, which is probably why Google has pumped so much time and money into testing the D-Wave. It seems that at the very least this investment will yield new insights into the nature of algorithms and quantum computing; if we are lucky, it might yield a generation of quantum annealers several orders of magnitude faster than today’s computers, despite not being ‘true’ quantum computers.

Practically speaking, if the D-Wave indeed has a large advantage over a classical computer on certain problems, then all one cares about is whether those problems are worth solving. The paper suggests a couple of algorithmic problems which might utilize the power of this computer and which the authors hope to expand on in the future. They also expect the next generation of quantum annealers to be even more powerful. Whether the D-Wave can become more than a controversial curiosity will depend on their ability to make good on that promise.

From Academia to Corporate

I just finished my first few weeks at Google Inc. As quick as my actual transition from grad school to a corporate gig has been, the emotional rollercoaster has been ramping up for quite a while. And now that I have finally embraced the dark side, I am often asked – why? Why did I spend 6+ years doing an MS-PhD in Physics and then leave academia to work in an unrelated field as a Quantitative Analyst at Google? Some just ask why anyone would do a PhD at all.

I started seriously thinking about writing this post when my friend (and entrepreneur) Vaibhav Devanathan asked me, “given the choice again, knowing what you know now, would you choose to do a PhD?”

Hmm… would I?

Would I choose to join my lab, which (now I know) has to be punishingly cold and dark for the sake of state-of-the-art optics equipment? I used to think academia was a utopia, where brilliant people sit around together drinking coffee and solving equations on blackboards when they are not winning Nobel prizes. It is partly that, but it can also get very lonely when your funding is cut, or your experiments are not working, or when everything works, just not well enough. Now that I know all that, would I still dare to choose that life? But most of all, would I still do a PhD, knowing that while all my friends are earning five times as much, I am spending 4+ years earning a degree that could actually decrease my value in the job market?

It is not an easy question to answer. Especially when it goes beyond hypotheticals and prospective students ask me whether they should apply to grad school, it behooves me to be honest. So I decided to write down my thoughts, starting from why I decided to do a PhD in the first place to how I feel about that decision now.

School (Woe be upon thee)

Middle school, as far as I could tell, is where most students begin hating school. In Indian school education, minutiae are often conflated with rigor, and the lack of choice with discipline. The ‘cool’ teachers would tell us then that high school would be better. That’s when we could choose our subjects. Some of us went ‘Yay, no more social studies!’, while others went ‘Yay, no more science!’.

When we got to high school however, we were hit by more unnecessary discipline, painfully dull curricula and criminally hypocritical exams. And yet again, I had a ‘cool’ class teacher who told us every day to pay our dues and get into a top university, and that will be a magical land of sincere learning.

Thinking back, I realize he never actually said that, but it is what I understood at the time. And so, like most people, I worked hard to get into the best college I could.

Undergraduate (stars in their eyes)

I coasted through most of my undergrad as a mediocre student. Perhaps I should have studied more, learned more and fought harder for grades. One of the reasons I did not (besides the fact that I was surrounded by really smart people and grading was relative) was that I thought I had done it; I had made it to the promised land where grades did not matter.

And to be fair, some things were better. No unnecessary discipline and some professors really tried. Yet for some reason, it was not enough to make me commit.

In my final year, however, I had to confront reality. My four years at IITB were magical, but I still hadn’t found what I was looking for. They had, however, given me hope that there could be better things out there.

So I applied to grad school, not even sitting in for campus placements.

Grad School (And the faithful shall be rewarded)

And grad school delivered. For 6 years, I was surrounded by brilliant people who, despite our disparate backgrounds, life experiences and present circumstances, somehow shared with me one uniting sentiment. They were here because not even 16 years of education was going to stop them from learning.

The attitude towards learning and discovery that some of my colleagues, friends and advisors had was nothing short of inspirational. They were here because they wanted to learn about everything ‘we’ know.  Not how much you or I or the pope or the president knew, or how much you could fit in a sheet in a three hour slog, but what ‘we’ as a civilization, collectively through all the millennia of accrued knowledge, knew.

Of course not everyone thought this way, but enough did. There was plenty of bureaucracy, apathy and of course the pitiful salary, but these scarcely bothered me. It was relatively easy for me to lose myself in the joys of research, enough that for a while I really felt like I could be a graduate student for life.

But you can’t!

In the real world, graduate students have to, you know, graduate.

However, the more I learnt about academic life beyond grad school, the more disillusioned I became. Most professors spend all their time teaching, mentoring students and canvassing for their research, all the while navigating the bureaucracy of universities, journals and funding agencies. But even that sounded almost pleasant compared to the 8-12 years between graduating with a PhD and becoming a tenured professor.

The demands of publications, citations and funding dollars upon postdocs and untenured professors reminded me of the tangential success metrics of high school exams. But what infuses this nightmare with a Kafkaesque hilarity is that, unlike high school, academia’s success metrics are often at the mercy of social, political and scientific trends and vagaries.

I wrestled with my doubts and fears for over a year. Eventually, about 6 months ago, I decided I would not stay in academia.

I decided that the academic utopia I experienced was simply a sheltered atmosphere that could not last long. Realizing I would never be able to reconcile my unreasonable expectations with that reality, I decided to quit while I was ahead.

Was it all for naught?

I don’t think so.

Of course, there are the tangible benefits. I have a shiny new degree and about 2-3 years’ worth of coursework in math, science and programming. I also have about 2-3 years’ worth of what can only be called work experience, for lack of a better phrase. This work experience is different from industry, I am sure, but different is not useless… I hope.

I personally do not believe switching from a Physics PhD to Google is a loss. This is the first time in 10 years that I am not surrounded by physics and physicists. It is a gamble; I also do not know how corporate life will suit me. So far, I am enjoying my work at Google. Of course, I do not know enough about job markets to know whether my future prospects have broadened or narrowed; time will tell, I suppose.

And then there is the intangible. For many years I have been chasing this ideal place of learning that I imagined as a boy reading books about science. I got a brief (=4years :P) glimpse into that ideal world. I believe that vision might just save me from the crushing cynicism that I often see around me about education or life in general.

While it is true that we must all grow up to accept the real world, I also believe that we must do more than accept, that we must seek to somehow make every day better than the day before. The optimism to work towards that abstract, Sisyphean goal is hard to find unless we have some direction, a glimpse into our notion of an ideal world.

Others might have their own unique experience; it might be the research team that wants to create the future, or the company that cares for its employees or the startup idea written on a paper napkin or just that parent/teacher/spouse/friend who showed you that our world is full of possibilities and wonder. For me it was grad school. And though they cannot last, I believe these brief glimpses are necessary to remind us what we are working towards, for without them our world is bleak indeed.

So should you do a PhD?

Go to the job fair at your college/university. If ‘student’ sounds better to you than any of the titles or positions on offer there, you should think about graduate school.

If you spent your school/UG years being frustrated (or disinterested) with academics but ended up reading about those subjects from books outside the curriculum, you should consider graduate school.

In other words, if your goal in starting a PhD after UG is something like getting a good job, or winning the Nobel Prize, or that your friends/classmates are doing it, you will probably neither finish it nor benefit from it. I would advise not planning your whole future before you start your PhD, and if you do, staying flexible. Research especially, and life in general, often does not go according to plan.

However, if the PhD is a goal in itself, because you either love the subject or love learning and being a student, then grad school is the place for you. Only then can you be happy with your doctorate, regardless of whether you stay in academia or not.

Ultimately, these reasons are not enough. 3-5 years is a big commitment and must be made with the utmost care. As with everything else, only you can make the completely informed decision.

All I can say is that if 6 years ago I knew everything I know now, and had to choose… I would definitely still choose to do a PhD.


PS : This post is simply my opinion based on my subjective experience. Most of these experiences depend not only on factors such as university and field of study but also on who you have around as parents, friends, advisor or significant other, all of which I think I was quite lucky with.

Spoken like a Nobel laureate!!

I have heard quite a few keynote lectures from Nobel Prize winners in Physics or Chemistry. At least ten that I can remember, possibly more.

I am being flippant on purpose. Nobel laureate lectures aren’t all that. When I went to my first such lecture, I was all agog. I expected to hear a profound speech that would inspire generations of scientists to a lofty ideal. After all, aren’t the great Nobel lectures of the past the motivational quotes of today?

But my excitement faded pretty soon. Every Nobel laureate lecture was just a professor droning on about his research: the science and the results, the theory and the experiment.

Nothing more. Nothing less.

Don’t get me wrong, every one of them was a brilliant scientist, who had built their research painstakingly and made a landmark contribution to science. I do not mean to undermine their scientific contribution at all. But I do not agree with their choice of keynote lecture topic.

A lecture at a conference is usually an opportunity for a scientist to showcase her research and publicize her results. This is the mechanism by which a scientist gathers traction, attracts collaborators and funding, and achieves visibility and credibility.

What does a Nobel laureate achieve?

Nobel laureates do not need visibility for their research. By virtue of the Prize, their research has gotten more exposure than anyone else’s. By virtue of the Prize, hundreds of introductory and advanced articles have been written, and many hours of airtime have been dedicated to publicizing and explaining their research.

By virtue of the Prize, Nobel laureates have been given a platform. They have been elevated and given an opportunity to be heard by an audience their peers will never be afforded. Instead of using this power, and responsibility, judiciously, most laureates squander it by just explaining their research. That is not even selfish; it’s just utterly useless.

I heard Eric Betzig at CLEO 2015 in San Jose last month. He won the 2014 Nobel Prize in Chemistry for super-resolution microscopy. His lecture was the first, and only, one I have heard that was worth its Nobel salt.


Eric spent more time mentioning the advantages and drawbacks of other contemporary work than his own. He promoted research that is currently going on, compared it with his work and pointed out why it might be more or less useful.

Even the time he spent on his own research, he used to tell his journey, his story. Every scientist can understand the equations, but starry-eyed grad students and young professors alike go to a Nobel lecture to hear how… how were the discoveries made, what was the process, what was the struggle, how can they do it? His story of life in the last days of Bell Labs, his frustrations with academia and his failure as a businessman painted a fascinating picture of a life of learning and hinted at the qualities required to be successful in research.

He also expressed his opinions on how research can or should be done, and on some of the pitfalls that befall academics and businessmen. None of it felt like a sermon; it was just part of the story of Eric Betzig’s journey and the lessons he learnt.

Here is ‘a’ lecture by Eric Betzig, the best Nobel laureate lecture I have ever heard. It’s not the CLEO one, but the material is almost the same.

He did say ‘Fuck you’ at CLEO, though; this talk only has him saying ‘goddamn’ and ‘bitch’. How much do you have to achieve to be able to swear nonchalantly at a formal conference? Thank you, Eric, for putting it to the test and re-establishing the worth of the Nobel Prize.

Taking on the Indian education juggernaut

Education is one of those things that someone is always complaining about… like taxes, politics and the Indian cricket team. Among the educated middle class especially, there is almost unanimous, and amusingly self-denying, agreement – everyone seems to believe that education is the solution to a majority of our problems, yet they also believe that the current system of education needs radical reform.

The first question is what such reform should entail. The upper crust of Indian school and undergraduate students scores pretty high in tests compared to the rest of the world, especially in STEM fields. However, there is, perhaps, too much emphasis on learning by rote. The vast majority of students who are not in the upper crust never use the knowledge they learn in school. Rather, education ends up being a race for degrees, and the holistic aim of producing informed and intelligent citizens is lost.

In contrast, countries like the US err at the other extreme. The US school system does away with rote learning entirely, to such an extent that a student’s entire knowledge is conceptual, never actually tested or even put on paper. This method is very effective in higher education, when students are mature, motivated and capable of testing themselves, whereas the discipline of Asian middle and high schools is demonstrably much better for younger students.

So there are no magic bullets. Given such a nuanced problem, the more important question is who can or should reform education. Should it be the government?


An idealist might believe that the people are the government and that each citizen must make choices that enable the change they want to see in society. A cynic might believe that the government is a toothless organization, a purely regulatory body constructed only to maintain the status quo and give society stability, incapable of innovation. Neither can realistically expect the government to magically do anything by itself. Only the lazy and intellectually dishonest can truly lay any sizeable responsibility for major social reform onto the government while they do nothing.

I claim that innovation and social change have to come from the people. Many take the onus upon themselves personally and become teachers and professors who impact the system, one student at a time. I have wanted to be a professor myself for the longest time, and I probably will be… eventually.

However, the democratization of information sharing has given talented, motivated entrepreneurs, even those with no capital or leverage, a shot at making a bigger difference. The explosion of startup culture has suddenly placed societal change within the framework and reach of the rational common man.

Some might question the faith that I express in entrepreneurs. How can we count on fickle dreamers for social progress? A company that fundamentally redefines the way we think and learn comes along once in a generation; how can that be trusted as an engine of change?

It’s true, entrepreneurs are not reliable. But harbingers of change never are. Most of the explorers who left Europe looking for India probably just died trying. We scarcely remember them and focus, rather unjustly, on the one person who finally did make it. But the future does not hinge on that one person. Rather, as more and more explorers and entrepreneurs dare to bet their lives and careers on a vision, they pressure society to notice them, and indeed follow them, until eventually change is inevitable. And sure, sometimes they might not find their destination and land up in a whole other continent. But in the process they will have found a new way of doing things, a new world, which will live and die by its merits. And if it survives, it has the potential to change the future.


This Friday marked the launch of LaughGuru. It is an education startup aimed at making learning fun, and more effective, for middle school children in India. I had a small part in its development from its inception, but it is primarily the hard work of Vaibhav Devanathan, my classmate from IITB, and the great team he has built around himself. He is a true explorer, having left lucrative offers from McKinsey and Harvard Business School to pursue a dream.

LaughGuru site landing page

Only time will tell how big an impact it will have. In the meantime, all an explorer can do is follow the compass in his head, read the stars and tackle what lies ahead. One rosy sunset, or one perfect storm, at a time.

Who can communicate science better : Scientists or Writers?

Nobel Prize winner Steven Weinberg recently wrote a piece in the Guardian which talks about the history of science and science communication. The post elicited a sharp response from Philip Ball on his blog. The two bring up an important point that is becoming increasingly relevant today : who communicates science better, scientists or writers?

The increased relevance, which presumably both would agree on, comes from the fact that it is unhealthy for a society if the general public becomes too divorced from the current state of science. After all, science and technology are crucial drivers of societal progress, and a populace that does not understand why they are important cannot devote its resources appropriately.


I have been interested in science communication for a long time. This blog is mostly practice for just that. To that end, I follow many popular science writers (usually journalists or writers who have a passion for understanding and popularizing science) and scientist writers (scientists who at some point in their career devote more time to communicating science).

Even though both these groups have a common objective, scientists are often quite critical of writers. And I can see why. I have clicked many a sensational headline that fizzled into an article that either did not justify the heading or just seemed like a new and misleading spin on an old idea. However, I also know a few writers who do a decent job of bringing science news to popular media such as Twitter, where scientists lack presence.

I don’t understand how Philip can disagree with Weinberg when he says that “mathematics is the main obstacle in explaining cutting edge science to the general public”. I think the disagreement arises because when Weinberg says science, he really means physics. Though that is definitely misleading and maybe incorrect, it is hardly unexpected, since physics is Weinberg’s area of expertise. He might assume that his frustrations with science communication in physics are shared by experts in other fields. Perhaps they are not felt as acutely in, say, biology, since biology is not as abstractly mathematical as physics. Words can do a lot of justice to concepts in biology.

However, I think abstract math, be it string theory or wave-functions, is harder to explain in English. I personally think quantum physics cannot be explained without mathematics. If I say in English that objects can pass through walls, that is as meaningless as saying I am the King of Mars. Only with the mathematics can I explain how quantum objects can pass through walls in a manner that doesn’t require the listener to simply take my word for it. “Explain” rather than just “tell”.


Philip however says that science writers can explain anything without math. Perhaps he is right, perhaps that is why we need science writers.

XKCD : The Greatest God

While Randall Munroe’s giant posters (like the LOTR one) have become the more popular format, for me the true genius of XKCD has always been its ability to break down a really complex concept into a single frame and a few words. The latest comic exemplifies it –

A God who holds the record for eating the most skateboards is greater than the God who does not hold that record

The premise of an all-powerful God rapidly spirals out of control if you just follow the logic. If God is all-powerful, wouldn’t an incomprehensible God indifferent to humanity be greater than a petulant, jealous, vengeful one? Or even a merciful and just one? After all, those qualities are just as human.

A truly all-powerful God would just ‘be’, without a care for who disobeys him. In fact, a God who can be disobeyed must be lesser than a God who cannot be defied. At which point there is no real distinction between God and the fabric of reality, the inescapable laws of math, logic and physics.

Conversely, a God who does exist with all those human qualities cannot be all-powerful. Such a God is more like an advanced civilization or entity whose knowledge and technology seem magical and supernatural to our primitive minds. In which case, isn’t it reasonable to wonder if we humans could build a drill to pierce the heavens? (Yes, you should watch Gurren Lagann.)

First World Problems : Should Robots Have Gender?

Even with all the clamor around the documentary India’s Daughter, the aftermath of rape in India and women’s rights in Muslim countries, I did not write anything about it. Primarily because there is a surfeit of smart women who can and did talk about the issue themselves; they don’t need a middle class, Tambrahm dude preaching about what he imagines it’s like to be oppressed by the hypocritical patriarchy. I think I contributed more by liking their posts than by writing one myself.

It would have been funny if it were not true.

Having said that, I do consider myself quite an expert in philosophical first world problems and their hypothetical impact on society. So when Slate ran an article exploring the motivations, nuances and consequences of assigning gender to robots, I jumped at the chance to talk about gender in a totally noncommittal, never-been-affected-by-it, you-go-gals-I’ll-wait-here context.

The key point of the article is this: whether we want to or not, we unconsciously assign humanity, and gender, to our environment, usually based on prevailing societal stereotypes. For example, robots with angular construction and darker colors seem more masculine to people, as do ones used in strength-intensive functions such as lifting or construction. Robots with lighter colors and curvy designs, intended for calmer functions such as healthcare and teaching, are usually identified by people as female. (Guess which one is male and which one female. And go watch Wall-E.) This obvious stereotyping occurs even if the robots do not have an interactive voice, or even a “face”.

The article goes on to describe NASA’s answer to the DARPA Robotics Challenge: the Valkyrie DRC Robonaut. Built with the intention of replacing humans for tasks that are too dangerous, the robot has been given a female name and characteristics, and the article praises NASA for it. However, when asked specifically whether the Valkyrie was intended to be female, NASA declined to assign its robots a gender.

In an ideal world, I think what NASA did was correct. Robots do not have gender, and it is a slippery slope if you start assigning them one. If certain characteristics strike us as masculine or feminine, that can stay in the eye of the beholder. There need not be any explicit assignment of gender by a robot’s creator, just as buildings and cars are not required to be male or female, despite people’s individual preferences.

Unfortunately, we do not live in an ideal world.

Emotions and symbolism exert more power over our actions and beliefs than we would like to admit. As the Slate article says, “if robots are given female form only for designing sexbots and maids, and all the heavy lifting is done by male robots, what will it say about the humans who use these bots”?

Is it possible to prevent this from happening? I don’t think so. A private robot manufacturer will be free to design and label his product as he pleases. I think sexbots and cleaningbots will be given the female form, simply because they might sell more. And even if we could legislate that all robots should be sexless, would it be the right thing to do? It can be argued that a feminine design for a healthcare bot could actually be beneficial for a patient’s emotional and psychological recovery. Given such grey areas, it might be more practical to admit that, whether we like it or not, many robots will be assigned genders, be it to augment their function or just to augment their sales.

And in this imperfect world, I have to agree with the article: we might need (and I can’t believe I am saying this) “strong female robot role models”, for the same reason we have had to ‘promote’ women in science; to prevent prejudices and stereotypes from denying rights and opportunities to those who deserve them.