
Wednesday, July 15, 2020

A reality check on Artificial Intelligence

The Economist has a much-needed survey which seeks to take a realistic view of the promise of Artificial Intelligence (AI).

The claims, boosted by the deeply conflicted advice of consultants, have been eye-popping,
PwC, a professional-services firm, predicts that artificial intelligence (AI) will add $16trn to the global economy by 2030. The total of all activity—from banks and biotech to shops and construction—in the world’s second-largest economy was just $13trn in 2018. PwC’s claim is no outlier. Rival prognosticators at McKinsey put the figure at $13trn. Others go for qualitative drama, rather than quantitative. Sundar Pichai, Google’s boss, has described developments in AI as “more profound than fire or electricity”.
But the outcomes to date have been nowhere near as impressive,
A survey carried out by Boston Consulting Group and MIT polled almost 2,500 bosses and found that seven out of ten said their AI projects had generated little impact so far. Two-fifths of those with “significant investments” in AI had yet to report any benefits at all... Another survey, this one by PwC, found that the number of bosses planning to deploy AI across their firms was 4% in 2020, down from 20% the year before. The number saying they had already implemented AI in “multiple areas” fell from 27% to 18%.
The article points to two sets of limitations (a small code sketch follows the excerpt below),
Those hoping to make use of AI's potential must confront two sets of problems. The first is practical. The machine-learning revolution has been built on three things: improved algorithms, more powerful computers on which to run them, and—thanks to the gradual digitisation of society—more data from which they can learn. Yet data are not always readily available. It is hard to use AI to monitor covid-19 transmission without a comprehensive database of everyone’s movements, for instance. Even when data do exist, they can contain hidden assumptions that can trip the unwary. The newest AI systems’ demand for computing power can be expensive... The second set of problems runs deeper, and concerns the algorithms themselves. Machine learning uses thousands or millions of examples to train a software model (the structure of which is loosely based on the neural architecture of the brain). The resulting systems can do some tasks, such as recognising images or speech, far more reliably than those programmed the traditional way with hand-crafted rules, but they are not “intelligent” in the way that most people understand the term. They are powerful pattern-recognition tools, but lack many cognitive abilities that biological brains take for granted. They struggle with reasoning, generalising from the rules they discover, and with the general-purpose savoir faire that researchers, for want of a more precise description, dub “common sense”. The result is an artificial idiot savant that can excel at well-bounded tasks, but can get things very wrong if faced with unexpected input. Without another breakthrough, these drawbacks put fundamental limits on what AI can and cannot do.
Self-driving cars, which must navigate an ever-changing world, are already delayed, and may never arrive at all. Systems that deal with language, like chatbots and personal assistants, are built on statistical approaches that generate a shallow appearance of understanding, without the reality. That will limit how useful they can become. Existential worries about clever computers making radiologists or lorry drivers obsolete—let alone, as some doom-mongers suggest, posing a threat to humanity’s survival—seem overblown.
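To make the article's contrast concrete, here is a minimal, hypothetical Python sketch (scikit-learn, with made-up spam-filter examples) of the difference between a program built on hand-crafted rules and a model that extracts its rules statistically from labelled examples:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# The "traditional" approach: a rule hand-crafted by a programmer.
def rule_based_spam_check(message: str) -> bool:
    return "free money" in message.lower() or "winner" in message.lower()

# The machine-learning approach: learn the pattern from labelled examples.
messages = [
    "Claim your free money now", "You are a winner, click here",
    "Lunch at noon tomorrow?", "Minutes from yesterday's meeting",
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam (toy training data)

vectorizer = CountVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(messages), labels)

# Both can classify new messages, but the learned model only knows the
# statistical patterns present in its (tiny) training set.
test = ["Free money for the winner", "Agenda for the board meeting"]
print([rule_based_spam_check(t) for t in test])
print(model.predict(vectorizer.transform(test)))

The point of the sketch is only that the learned model's "knowledge" is whatever correlations sit in its training data, which is exactly why the availability and quality of data matter so much.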
Sample the problems with self-driving cars, which were among the earliest promises of AI,
Self-driving cars work in the same way as other applications of machine learning. Computers crunch huge piles of data to extract general rules about how driving works. The more data, at least in theory, the better the systems perform. Tesla’s cars continuously beam data back to headquarters, where it is used to refine the software. On top of the millions of real-world miles logged by its cars, Waymo claims to have generated well over a billion miles-worth of data using ersatz driving in virtual environments. The problem, says Rodney Brooks, an Australian roboticist who has long been sceptical of grand self-driving promises, is that deep-learning approaches are fundamentally statistical, linking inputs to outputs in ways specified by their training data. That leaves them unable to cope with what engineers call “edge cases”—unusual circumstances that are not common in those training data. Driving is full of such oddities. Some are dramatic: an escaped horse in the road, say, or a light aircraft making an emergency landing on a highway (as happened in Canada in April). Most are trivial, such as a man running out in a chicken suit. Human drivers usually deal with them without thinking. But machines struggle.
One study, for instance, found that computer-vision systems were thrown when snow partly obscured lane markings. Another found that a handful of stickers could cause a car to misidentify a “stop” sign as one showing a speed limit of 45mph. Even unobscured objects can baffle computers when seen in unusual orientations: in one paper a motorbike was classified as a parachute or a bobsled... Mary “Missy” Cummings, the director of Duke University’s Humans and Autonomy Laboratory, says that humans are better able to cope with such oddities because they can use “top-down” reasoning about the way the world works to guide them in situations where “bottom-up” signals from their senses are ambiguous or incomplete. AI systems mostly lack that capacity and are, in a sense, working with only half a brain. Though they are competent in their comfort zone, even trivial changes can be problematic. In the absence of the capacity to reason and generalise, computers are imprisoned by the same data that make them work in the first place.
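The "edge case" failure mode is easy to reproduce at toy scale. The following hypothetical sketch (scikit-learn, invented data) trains a model on a narrow band of ordinary inputs and then feeds it something far outside that band; the model still returns a confident answer, because it has no mechanism for recognising that an input is unlike anything it was trained on:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data: driving speeds (km/h) labelled safe (0) or unsafe (1),
# all drawn from ordinary conditions between 20 and 120 km/h.
rng = np.random.default_rng(0)
speeds = rng.uniform(20, 120, size=200).reshape(-1, 1)
labels = (speeds.ravel() > 80).astype(int)

model = LogisticRegression().fit(speeds, labels)

# An "edge case" far outside the training distribution. The model happily
# extrapolates and reports near-certainty, rather than flagging the input
# as something it has never seen before.
print(model.predict_proba(np.array([[900.0]])))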
This is a fundamental problem,
Christopher Manning, of Stanford University’s AI Lab, points out that biological brains learn from far richer data-sets than machines. Artificial language models are trained solely on large quantities of text or speech. But a baby, he says, can rely on sounds, tone of voice or tracking what its parents are looking at, as well as a rich physical environment to help it anchor abstract concepts in the real world. This shades into an old idea in AI research called “embodied cognition”, which holds that if minds are to understand the world properly, they need to be fully embodied in it, not confined to an abstracted existence as pulses of electricity in a data-centre.
There is an argument that much of the progress in AI itself has been due to the advances in computing power,
Last year Richard Sutton, an AI researcher at the University of Alberta and DeepMind, published an essay called “The Bitter Lesson”, arguing that the history of AI shows that attempts to build human understanding into computers rarely work. Instead most of the field’s progress has come courtesy of Moore’s law, and the ability to bring ever more brute computational force to bear on a problem.
Sample this about the limitations of the data itself, in the case of facial recognition,
Bias is another source of problems. Last year America’s National Institute of Standards and Technology tested nearly 200 facial-recognition algorithms and found that many were significantly less accurate at identifying black faces than white ones. The problem may reflect a preponderance of white faces in their training data. A study from IBM, published last year, found that over 80% of faces in three widely used training sets had light skin.
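The kind of disparity NIST measured can be surfaced with a very simple audit: score the system separately for each demographic group in a labelled test set. A hypothetical sketch (pandas, with invented data and column names):

import pandas as pd

# Hypothetical evaluation log: one row per test image, recording whether
# the algorithm matched the face correctly, plus a skin-tone label.
results = pd.DataFrame({
    "skin_tone": ["light", "light", "light", "dark", "dark", "dark"],
    "correct":   [True,    True,    True,    False,  True,   False],
})

# Accuracy broken down by group: a large gap points to biased performance,
# often traceable to an unbalanced training set.
print(results.groupby("skin_tone")["correct"].mean())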
Then there is also the issue of getting the data ready for analysis, which itself takes up an inordinately large share of the effort.
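As a flavour of what that preparation involves, here is a small, hypothetical pandas sketch (invented records) of the routine deduplication, renaming and reformatting that usually precedes any modelling:

import pandas as pd

# Hypothetical raw records: inconsistent column names, duplicates,
# unparseable dates and missing values.
raw = pd.DataFrame({
    "Patient_ID": [101, 101, 102, None],
    "Visit_Date": ["2020-03-01", "2020-03-01", "not recorded", "2020-05-12"],
})

clean = (
    raw.drop_duplicates()
       .rename(columns=str.lower)  # normalise column names
       .assign(visit_date=lambda d: pd.to_datetime(d["visit_date"],
                                                   errors="coerce"))
       .dropna(subset=["patient_id", "visit_date"])  # drop unusable rows
)
print(clean)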

And this about problems facing AI in medicine,
The first is about getting data into a coherent, usable format... Then there are the challenges of privacy and regulation. Laws guarding medical records tend to be fierce, and regulators are still wrestling with the question of how exactly to subject AI systems to clinical trials. Finally there is the question of “explainability”. Because AI systems learn from examples rather than following explicit rules, working out why they reach particular conclusions can be tricky. Researchers call this the “black box” problem. As AI spreads into areas such as medicine and law, solving it is becoming increasingly important.
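On the "explainability" point, one common workaround is to probe a trained model from the outside rather than open the black box, for instance with permutation importance: shuffle one input feature at a time and measure how much held-out accuracy drops. A minimal sketch on synthetic data (scikit-learn; not any particular medical system):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a tabular dataset (purely illustrative).
X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and record how much the test score falls:
# the bigger the drop, the more the model relied on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for i, drop in enumerate(result.importances_mean):
    print(f"feature {i}: mean score drop {drop:.3f}")

Such post-hoc probes rank the inputs a model leans on, but they do not make its internal reasoning transparent, which is partly why regulators remain uneasy.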
While AI may have its limitations, it has undoubted promise, and many significant benefits are already being reaped. It is now being acknowledged that AI will be the latest general-purpose technology, following in the footsteps of the steam engine, antibiotics, electricity, motor vehicles, and the internet.

However, the preparedness of the Indian state and Indian corporates to make meaningful use of the promise of AI may be poor. Prosenjit Datta writes that AI may be the latest area for techno-colonialism, as India runs the risk of falling behind on AI and depending on others,
The issue is that both the government and companies are largely focused on AI applications, not research and development (R&D). And even in applications, much of the work is at the mid and lower ends of the spectrum, tweaking existing solutions and innovations available with technology giants such as Microsoft, Google, Amazon or IBM. When it comes to actually setting the future direction of AI, we are nowhere... Currently, the race is really between the US, China and the EU, with the US in a slender lead. Even Russia, where President Vladimir Putin recognised and flagged the dangers of falling behind in the AI race, is lagging. India has not even entered the race yet...
In India, neither the government nor the industry has focused on research. The government may have woken up to the benefits of AI but it still does not have a cogent long-term plan. The industry is only focusing on developing solutions based on platforms available. AI as a technology has still not reached maturity. That will come perhaps a decade down the line as another technology, Quantum Computing, moves out of the labs and into the real world. So, there is still hope if we formulate a long-term plan just as we do for other infrastructure plans. It will mean squeezing expenditure elsewhere to find money for R&D and also giving incentives to attract research talent and getting the biggest corporations involved.
Such simplistic and low-cost efforts at supporting AI are no substitute for hard and patient investments in R&D.

Update 1 (05.08.2020)

GPT-3, the AI language model released by OpenAI (which Elon Musk co-founded), has generated immense excitement. It has been hailed as transformative. Analysts have been gushing over how, at 175 billion parameters, it has ten times as many as Microsoft's previous largest model.

However, perceptive observers have questioned the assumptions behind the hype. See this (a minimal sketch of the prediction loop it describes follows the excerpt),
At its core, GPT-3 is an extremely sophisticated text predictor. A human gives it a chunk of text as input, and the model generates its best guess as to what the next chunk of text should be. It can then repeat this process—taking the original input together with the newly generated chunk, treating that as a new input, and generating a subsequent chunk—until it reaches a length limit. How does GPT-3 go about generating these predictions? It has ingested effectively all of the text available on the Internet. The output it generates is language that it calculates to be a statistically plausible response to the input it is given, based on everything that humans have previously published online... Having trained on a dataset of half a trillion words, GPT-3 is able to identify and dazzlingly riff on the linguistic patterns contained therein.

But GPT-3 possesses no internal representation of what these words actually mean. It has no semantically-grounded model of the world or of the topics on which it discourses. It cannot be said to understand its inputs and outputs in any meaningful way. Why does this matter? Because it means that GPT-3 lacks the ability to reason abstractly; it lacks true common sense. When faced with concepts, content, or even phrasing that the Internet’s corpus of existing text has not prepared it for, it is at a loss.
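The "text predictor" loop described above can be sketched in a few lines. GPT-3's weights are not publicly available, so this hypothetical example uses GPT-2 (a much smaller model of the same family) through the Hugging Face transformers library as a stand-in; the structure of the loop is the point, not the output quality: predict the statistically most likely next token, append it, and repeat until a length limit is reached.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Artificial intelligence will"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Repeatedly pick the statistically most likely next token, append it to the
# input, and feed the longer sequence back in, until a length limit is hit.
with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits      # scores for every vocabulary token
        next_token = logits[0, -1].argmax()   # greedy choice of the next token
        input_ids = torch.cat([input_ids, next_token.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))

Nothing in this loop consults a model of the world; the only quantity being maximised is statistical plausibility given the training corpus, which is exactly the limitation described above.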
See also this on some of its failings. This video (see from 4:13) has a very good explanation of the point about GPT-3 being ultimately the most powerful backward-looking text predictor.
