Driverless cars illustrate the limits of today’s AI
From the Technology Quarterly series "AI and its limits: steeper than expected", part six

Road block

IN MARCH Starsky Robotics, a self-driving lorry firm based in San Francisco, closed down. Stefan Seltz-Axmacher, its founder, gave several reasons for its failure. Investors’ interest was already cooling, owing to a run of poorly performing tech-sector IPOs and a recession in the trucking business. His firm’s focus on safety, he wrote, did not go down well with impatient funders, who preferred to see a steady stream of whizzy new features. But the biggest problem was that the technology was simply not up to the job. “Supervised machine learning doesn’t live up to the hype. It isn’t actual artificial intelligence akin to C-3PO [a humanoid robot from the “Star Wars” films]. It’s a sophisticated pattern-matching tool.”

Policing social media, detecting fraud and defeating humans at ancient games are all very well. But building a vehicle that can drive itself on ordinary roads is—along with getting computers to conduct plausible conversations—one of the grand ambitions of modern AI. Some imagined driverless cars could do away with the need for car ownership by letting people summon robotaxis at will. They believe they would be safer, too. Computers never tire, and their attention never wanders. According to the WHO, over a million people a year die in car accidents caused by fallible human drivers. Advocates hoped to cut those numbers drastically.

And they would do it soon. In 2015 Elon Musk, the boss of Tesla, an electric-car maker, predicted the arrival of “complete autonomy” by 2018. Cruise, a self-driving firm acquired by General Motors in 2016, had planned to launch self-driving taxis in San Francisco by 2019. Chris Urmson, then the boss of Waymo, a Google subsidiary widely seen as the market leader, said in 2015 that he hoped his son, then 11 years old, would never need a driving licence.

But progress has lagged. In 2018 a self-driving car being tested by Uber, a ride-hailing service, became the first to kill a pedestrian when it hit a woman pushing a bicycle across a road in Arizona. Users of Tesla’s “Autopilot” software must, despite its name, keep their hands on the wheel and their eyes on the road (several who seem to have failed to do so have been killed in crashes). The few firms that carry passengers, such as Waymo in America and WeRide in China, are geographically limited and rely on human safety drivers. Mr Urmson, who has since left Waymo, now thinks that adoption will be slower and more gradual.

Black swans and bitter lessons

Self-driving cars work in the same way as other applications of machine learning. Computers crunch huge piles of data to extract general rules about how driving works. The more data, at least in theory, the better the systems perform. Tesla’s cars continuously beam data back to headquarters, where it is used to refine the software. On top of the millions of real-world miles logged by its cars, Waymo claims to have generated well over a billion miles-worth of data using ersatz driving in virtual environments.
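
In outline, "pattern matching" of this kind can be sketched in a few lines of Python: a nearest-neighbour rule that copies the action taken in the most similar training example. The feature names and numbers below are invented purely for illustration and are not drawn from any real driving system.

```python
# A minimal sketch of supervised learning as pattern matching: a
# 1-nearest-neighbour rule over toy "sensor" readings. All features,
# labels and values here are hypothetical, chosen for illustration.

def nearest_neighbour(train, query):
    """Return the label of the training example closest to the query."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    features, label = min(train, key=lambda ex: dist(ex[0], query))
    return label

# Toy training data: (distance_to_car_ahead_m, speed_kph) -> action
training_data = [
    ((50.0, 60.0), "cruise"),
    ((10.0, 60.0), "brake"),
    ((80.0, 30.0), "accelerate"),
]

# A new situation is handled by recalling the closest past situation.
print(nearest_neighbour(training_data, (12.0, 58.0)))  # → brake
```

More data means more remembered situations to match against, which is why, in theory at least, the systems improve as the miles pile up.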

The problem, says Rodney Brooks, an Australian roboticist who has long been sceptical of grand self-driving promises, is that deep-learning approaches are fundamentally statistical, linking inputs to outputs in ways specified by their training data. That leaves them unable to cope with what engineers call “edge cases”—unusual circumstances that are not common in those training data. Driving is full of such oddities. Some are dramatic: an escaped horse in the road, say, or a light aircraft making an emergency landing on a highway (as happened in Canada in April). Most are trivial, such as a man running out in a chicken suit. Human drivers usually deal with them without thinking. But machines struggle.
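
The statistical point can be made with a toy model. A least-squares fit captures its training range well, then extrapolates blindly outside it, with no sense that anything is amiss. The lane-keeping numbers below are invented for illustration.

```python
# Sketch: a statistical rule works inside its training range and fails
# silently outside it. The "lane offset -> steering" data are made up.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Toy training data: lane offset (m) -> steering correction (degrees),
# all observed within +/- 1 metre of the lane centre.
offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]
steering = [-8.0, -4.0, 0.0, 4.0, 8.0]
a, b = fit_line(offsets, steering)

in_range = a * 0.3 + b    # a familiar situation: a sensible 2.4 degrees
edge_case = a * 50.0 + b  # a freak reading: an absurd 400-degree command
print(in_range, edge_case)
```

The model never says "I have not seen this before"; it simply applies the learned rule, however inappropriate.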

One study, for instance, found that computer-vision systems were thrown when snow partly obscured lane markings. Another found that a handful of stickers could cause a car to misidentify a “stop” sign as one showing a speed limit of 45mph. Even unobscured objects can baffle computers when seen in unusual orientations: in one paper a motorbike was classified as a parachute or a bobsled. Fixing such issues has proved extremely difficult, says Mr Seltz-Axmacher. “A lot of people thought that filling in the last 10% would be harder than the first 90%”, he says. “But not that it would be ten thousand times harder.”
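
The sticker attacks can be caricatured with a nearest-centroid classifier: for an input near the decision boundary, a small, targeted nudge flips the label. The "stop"/"speed limit" feature vectors below are hypothetical, standing in for the image features a real vision system would learn.

```python
# A toy version of the adversarial-sticker effect: a small perturbation
# of an input near the boundary flips a nearest-centroid classifier.
# The two-dimensional "sign features" are invented for illustration.

def classify(x, centroids):
    """Return the class whose centroid is closest to x."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

centroids = {
    "stop": (1.0, 0.0),
    "speed_limit_45": (0.0, 1.0),
}

sign = (0.6, 0.45)                          # a genuine stop sign, near the boundary
stickered = (sign[0] - 0.1, sign[1] + 0.1)  # a sticker nudges the features slightly

print(classify(sign, centroids))       # → stop
print(classify(stickered, centroids))  # → speed_limit_45
```

A human sees the same stickered sign and still reads "stop"; the statistical model has no concept of a sign, only distances between feature vectors.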

Mary “Missy” Cummings, the director of Duke University’s Humans and Autonomy Laboratory, says that humans are better able to cope with such oddities because they can use “top-down” reasoning about the way the world works to guide them in situations where “bottom-up” signals from their senses are ambiguous or incomplete. AI systems mostly lack that capacity and are, in a sense, working with only half a brain. Though they are competent in their comfort zone, even trivial changes can be problematic. In the absence of the capacity to reason and generalise, computers are imprisoned by the same data that make them work in the first place. “These systems are fundamentally brittle,” says Dr Cummings.

This narrow intelligence is visible in areas beyond just self-driving cars. Google’s “Translate” system usually does a decent job at translating between languages. But in 2018 researchers noticed that, when asked to translate 18 repetitions of the word “dog” into Yoruba (a language spoken in parts of Nigeria and Benin) and then back into English, it came up with the following: “Doomsday Clock is at three minutes to twelve. We are experiencing characters and dramatic developments in the world, which indicate that we are increasingly approaching the end times and Jesus’ return.”

Gary Marcus, a professor of psychology at New York University, says that, besides its comedy value, the mistranslation highlights how Google’s system does not understand the basic structure of language. Concepts like verbs or nouns are alien, let alone the notion that nouns refer to physical objects in a real world. Instead, it has constructed statistical rules linking strings of letters in one language with strings of letters in another, without any understanding of the concepts to which those letters refer. Language processing, he says, is therefore still baffled by the sorts of questions a toddler would find trivial.
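
Translation as pure string-matching can be sketched with a tiny phrase table: strings in one language are mapped to strings in another, with no notion of nouns, verbs or the things they name. The table below is invented for illustration; real systems learn millions of such statistical associations rather than a hand-written dictionary.

```python
# A sketch of translation as string-to-string association. The phrase
# table and the greedy longest-match rule are hypothetical simplifications
# of what a statistical translation system learns from data.

phrase_table = {
    "the dog": "le chien",
    "dog": "chien",
    "runs": "court",
}

def translate(sentence):
    """Greedy longest-match lookup; unknown strings pass through unchanged."""
    words = sentence.split()
    out, i = [], 0
    while i < len(words):
        two = " ".join(words[i:i + 2])
        if two in phrase_table:
            out.append(phrase_table[two]); i += 2
        elif words[i] in phrase_table:
            out.append(phrase_table[words[i]]); i += 1
        else:
            out.append(words[i]); i += 1
    return " ".join(out)

print(translate("the dog runs"))  # → le chien court
print(translate("dog dog dog"))   # → chien chien chien
```

Nothing in the table knows that a dog is an animal, or that eighteen repetitions of a word are not a sentence; strings go in and strings come out.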

How much those limitations matter varies from field to field. An automated system does not have to be better than a professional human translator to be useful, after all (Google’s system has since been tweaked). But it does set an upper bound on how useful chatbots or personal assistants are likely to become. And for safety-critical applications like self-driving cars, says Dr Cummings, AI’s limitations are potentially show-stopping.

Researchers are beginning to ponder what to do about the problem. In a conference talk in December Yoshua Bengio, one of AI’s elder statesmen, devoted his keynote address to it. Current machine-learning systems, said Dr Bengio, “learn in a very narrow way, they need much more data to learn a new task than [humans], they need humans to provide high-level concepts through labels, and they still make really stupid mistakes”.

Beyond deep learning

Different researchers have different ideas about how to try to improve things. One idea is to widen the scope, rather than the volume, of what machines are taught. Christopher Manning, of Stanford University’s AI Lab, points out that biological brains learn from far richer data-sets than machines. Artificial language models are trained solely on large quantities of text or speech. But a baby, he says, can rely on sounds, tone of voice or tracking what its parents are looking at, as well as a rich physical environment to help it anchor abstract concepts in the real world. This shades into an old idea in AI research called “embodied cognition”, which holds that if minds are to understand the world properly, they need to be fully embodied in it, not confined to an abstracted existence as pulses of electricity in a data-centre.

Biology offers other ideas, too. Dr Brooks argues that the current generation of AI researchers “fetishise” models that begin as blank slates, with no hand-crafted hints built in by their creators. But “all animals are born with structure in their brains,” he says. “That’s where you get instincts from.”

Dr Marcus, for his part, thinks machine-learning techniques should be combined with older, “symbolic AI” approaches. These emphasise formal logic, hierarchical categories and top-down reasoning, and were most popular in the 1980s. Now, with machine-learning approaches in the ascendancy, they are a backwater.
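
The flavour of that older tradition can be conveyed in a few lines: explicit hierarchical categories with inherited properties, queried by top-down reasoning rather than statistics. The tiny taxonomy below is invented for illustration.

```python
# A minimal sketch of symbolic AI: hand-built "is-a" hierarchies and
# property inheritance, of the kind popular in the 1980s. The taxonomy
# here is a hypothetical toy, not any particular knowledge base.

is_a = {"dog": "mammal", "mammal": "animal", "car": "vehicle"}
properties = {
    "animal": {"alive"},
    "mammal": {"has_fur"},
    "vehicle": {"has_wheels"},
}

def has_property(thing, prop):
    """Walk up the is-a hierarchy, inheriting properties from ancestors."""
    while thing is not None:
        if prop in properties.get(thing, set()):
            return True
        thing = is_a.get(thing)
    return False

print(has_property("dog", "alive"))  # → True: dog -> mammal -> animal
print(has_property("car", "alive"))  # → False
```

A system like this can answer a toddler's question ("is a dog alive?") from two hand-written facts, where a statistical model would need thousands of examples; the catch, as symbolic AI's critics point out, is that someone has to write all those facts down.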

But others argue for persisting with existing approaches. Last year Richard Sutton, an AI researcher at the University of Alberta and DeepMind, published an essay called “The Bitter Lesson”, arguing that the history of AI shows that attempts to build human understanding into computers rarely work. Instead most of the field’s progress has come courtesy of Moore’s law, and the ability to bring ever more brute computational force to bear on a problem. The “bitter lesson” is that “the actual contents of [human] minds are tremendously, irredeemably complex…They are not what should be built in [to machines].”

Away from the research labs, expectations around driverless cars are cooling. Some Chinese firms are experimenting with building digital guide rails into urban infrastructure, in an attempt to lighten the cognitive burden on the cars themselves. Incumbent carmakers, meanwhile, now prefer to talk about “driver-assistance” tools such as automatic lane-keeping or parking systems, rather than full-blown autonomous cars. A new wave of startups has deliberately smaller ambitions, hoping to build cars that drive around small, limited areas such as airports or retirement villages, or vehicles which trundle slowly along pavements, delivering packages under remote human supervision. “There’s a scientific reason we’re not going to get to full self-driving with our current technology,” says Dr Cummings. “This less ambitious stuff—I think that’s much more realistic.”■
