Businesses are finding AI hard to adopt
Algorithms and armies

From the Technology Quarterly series "AI and its limits: steeper than expected", part three

“FACEBOOK: THE INSIDE STORY”, Steven Levy’s recent book about the American social-media giant, paints a vivid picture of the firm’s size, not in terms of revenues or share price but in the sheer amount of human activity that thrums through its servers. Some 1.73bn people use Facebook every day, writing comments and uploading videos. An operation on that scale is so big, writes Mr Levy, “that it can only be policed by algorithms or armies”.

In fact, Facebook uses both. Human moderators work alongside algorithms trained to spot posts that violate either an individual country’s laws or the site’s own policies. But algorithms have many advantages over their human counterparts. They do not sleep, or take holidays, or complain about their performance reviews. They are quick, scanning thousands of messages a second, and untiring. And, of course, they do not need to be paid.

And it is not just Facebook. Google uses machine learning to refine search results, and target advertisements; Amazon and Netflix use it to recommend products and television shows to watch; Twitter and TikTok to suggest new users to follow. The ability to provide all these services with minimal human intervention is one reason why tech firms’ dizzying valuations have been achieved with comparatively small workforces.

Firms in other industries would love that kind of efficiency. Yet the magic is proving elusive. A survey carried out by Boston Consulting Group and MIT polled almost 2,500 bosses and found that seven out of ten said their AI projects had generated little impact so far. Two-fifths of those with “significant investments” in AI had yet to report any benefits at all.

Perhaps as a result, bosses seem to be cooling on the idea more generally. Another survey, this one by PwC, found that the number of bosses planning to deploy AI across their firms was 4% in 2020, down from 20% the year before. The number saying they had already implemented AI in “multiple areas” fell from 27% to 18%. Euan Cameron at PwC says that rushed trials may have been abandoned or rethought, and that the “irrational exuberance” that has dominated boardrooms for the past few years is fading.

There are several reasons for the reality check. One is prosaic: businesses, particularly big ones, often find change difficult. One parallel from history is with the electrification of factories. Electricity offers big advantages over steam power in terms of both efficiency and convenience. Most of the fundamental technologies had been invented by the end of the 19th century. But electric power nonetheless took more than 30 years to become widely adopted in the rich world.

Reasons specific to AI exist, too. Firms may have been misled by the success of the internet giants, which were perfectly placed to adopt the new technology. They were already staffed by programmers, and were already sitting on huge piles of user-generated data. The uses to which they put AI, at least at first—improving search results, displaying adverts, recommending new products and the like—were straightforward and easy to measure.

Not everyone is so lucky. Finding staff can be tricky for many firms. AI experts are scarce, and command luxuriant salaries. “Only the tech giants and the hedge funds can afford to employ these people,” grumbles one senior manager at an organisation that is neither. Academia has been a fertile recruiting ground.

A more subtle problem is that of deciding what to use AI for. Machine intelligence is very different from the biological sort. That means that gauging how difficult machines will find a task can be counter-intuitive. AI researchers call the problem Moravec’s paradox, after Hans Moravec, a Canadian roboticist, who noted that, though machines find complex arithmetic and formal logic easy, they struggle with tasks like co-ordinated movement and locomotion which humans take completely for granted.

For example, almost any human can staff a customer-support helpline. Very few can play Go at grandmaster level. Yet Paul Henninger, an AI expert at KPMG, an accountancy firm, says that building a customer-service chatbot is in some ways harder than building a superhuman Go machine. Go has only two possible outcomes—win or lose—and both can be easily identified. Individual games can play out in zillions of unique ways, but the underlying rules are few and clearly specified. Such well-defined problems are a good fit for AI. By contrast, says Mr Henninger, “a single customer call after a cancelled flight has…many, many more ways it could go”.

What to do? One piece of advice, says James Gralton, engineering director at Ocado, a British warehouse-automation and food-delivery firm, is to start small, and pick projects that can quickly deliver obvious benefits. Ocado’s warehouses are full of thousands of robots that look like little filing cabinets on wheels. Swarms of them zip around a grid of rails, picking up food to fulfil orders from online shoppers.

Ocado’s engineers used simple data from the robots, like electricity consumption or torque readings from their wheel motors, to train a machine-learning model to predict when a damaged or worn robot was likely to fail. Since broken-down robots get in the way, removing them for pre-emptive maintenance saves time and money. And implementing the system was comparatively easy.
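
Ocado has not published its model, but the approach described above — predicting failures from simple sensor readings — can be sketched as a small supervised classifier. Everything below (the feature names, the synthetic data, the 0.5 risk threshold) is an illustrative assumption, not Ocado's actual system.

```python
import math

# Hypothetical sketch of predictive maintenance: score each robot's
# failure risk from two sensor features (power draw, wheel torque).
# Data and feature names are invented for illustration.

def train_failure_model(readings, failed, epochs=2000, lr=0.1):
    """Fit a tiny logistic-regression model by gradient descent.

    readings: list of (power_draw, wheel_torque) pairs, roughly normalised.
    failed:   list of 0/1 labels (1 = robot broke down soon after).
    """
    w, b, n = [0.0, 0.0], 0.0, len(readings)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for (x1, x2), y in zip(readings, failed):
            p = 1 / (1 + math.exp(-(w[0] * x1 + w[1] * x2 + b)))
            err = p - y
            gw[0] += err * x1
            gw[1] += err * x2
            gb += err
        w[0] -= lr * gw[0] / n
        w[1] -= lr * gw[1] / n
        b -= lr * gb / n
    return w, b

def failure_probability(model, reading):
    """Predicted probability that a robot with this reading will fail."""
    (w, b), (x1, x2) = model, reading
    return 1 / (1 + math.exp(-(w[0] * x1 + w[1] * x2 + b)))

# Synthetic training data: worn robots draw more power and torque.
healthy = [(0.2, 0.3), (0.3, 0.2), (0.25, 0.35), (0.1, 0.2)]
worn = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.95), (0.95, 0.7)]
model = train_failure_model(healthy + worn, [0] * 4 + [1] * 4)

# Pull a robot for pre-emptive maintenance when its risk crosses 0.5.
risk = failure_probability(model, (0.9, 0.85))
print(f"failure risk: {risk:.2f}")
```

The appeal of the problem is visible even in the sketch: the inputs are cheap telemetry the robots already emit, and the label (did it break down or not) is unambiguous.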

The robots, warehouses and data all existed already. And the outcome is clear, too, which makes it easy to tell how well the AI model is working: either the system reduces breakdowns and saves money, or it does not. That kind of “predictive maintenance”, along with things like back-office automation, is a good example of what PwC approvingly calls “boring AI” (though Mr Gralton would surely object).

There is more to building an AI system than its accuracy in a vacuum. It must also do something that can be integrated into a firm’s work. During the late 1990s Mr Henninger worked on Fair Isaac Corporation’s (FICO) “Falcon”, a credit-card fraud-detection system aimed at banks and credit-card companies that was, he says, one of the first real-world uses for machine learning. As with predictive maintenance, fraud detection was a good fit: the data (in the form of credit-card transaction records) were clean and readily available, and decisions were usefully binary (either a transaction was fraudulent or it wasn’t).

The widening gyre

But although Falcon was much better at spotting dodgy transactions than banks’ existing systems, he says, it did not enjoy success as a product until FICO worked out how to help banks do something with the information the model was generating. “Falcon was limited by the same thing that holds a lot of AI projects back today: going from a working model to a useful system.” In the end, says Mr Henninger, it was the much more mundane task of creating a case-management system—flagging up potential frauds to bank workers, then allowing them to block the transaction, wave it through, or phone clients to double-check—that persuaded banks that the system was worth buying.
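
Mr Henninger's point — that the mundane case-management wrapper, not the model, made Falcon sellable — can be made concrete. The thresholds and action names below are hypothetical, not FICO's actual design; the point is that the model only produces a score, and a separate layer must turn that score into something a bank worker can act on.

```python
# Hypothetical case-management layer around a fraud model: the model
# scores transactions; this wrapper maps scores to the actions the
# article describes (block, phone the client, flag for review, or
# wave through). Thresholds are invented for illustration.

def route_transaction(fraud_score):
    """Map a model's fraud score (0..1) to a workflow action."""
    if fraud_score >= 0.95:
        return "block"            # near-certain fraud: stop it outright
    if fraud_score >= 0.60:
        return "phone_client"     # suspicious: double-check with the customer
    if fraud_score >= 0.30:
        return "flag_for_review"  # borderline: queue for a human analyst
    return "wave_through"         # looks legitimate: let it pass

# Build the review queue a bank worker would actually see.
queue = []
for txn_id, score in [("t1", 0.98), ("t2", 0.72), ("t3", 0.41), ("t4", 0.05)]:
    action = route_transaction(score)
    if action == "flag_for_review":
        queue.append(txn_id)
    print(txn_id, action)
```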

Because they are complicated and open-ended, few problems in the real world are likely to be completely solvable by AI, says Mr Gralton. Managers should therefore plan for how their systems will fail. Often that will mean throwing difficult cases to human beings to judge. That can limit the expected cost savings, especially if a model is poorly tuned and makes frequent wrong decisions.
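
The trade-off Mr Gralton describes — human fallback eating into the savings when a model is poorly tuned — is simple arithmetic. The case volumes and per-case costs below are invented for illustration.

```python
# Illustrative arithmetic, not real data: how the share of cases a
# model must escalate to humans limits the saving from automation.

def automation_saving(cases, human_cost, machine_cost, escalation_rate):
    """Saving versus an all-human baseline when some cases are escalated."""
    automated = cases * (1 - escalation_rate)
    escalated = cases * escalation_rate
    # Escalated cases incur both the machine's attempt and the human's time.
    with_ai = automated * machine_cost + escalated * (machine_cost + human_cost)
    return cases * human_cost - with_ai

# 100,000 cases; a human costs $5 per case, the machine $0.10.
well_tuned = automation_saving(100_000, 5.0, 0.10, escalation_rate=0.05)
poorly_tuned = automation_saving(100_000, 5.0, 0.10, escalation_rate=0.40)
print(f"well tuned:   ${well_tuned:,.0f} saved")
print(f"poorly tuned: ${poorly_tuned:,.0f} saved")
```

Under these made-up numbers, a model that escalates 40% of cases still saves money, but far less than one that escalates 5% — which is why a poorly tuned model can quietly gut a project's business case.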

The tech giants’ experience of the covid-19 pandemic, which has been accompanied by a deluge of online conspiracy theories, disinformation and nonsense, demonstrates the benefits of always keeping humans in the loop. Because human moderators see sensitive, private data, they typically work in offices with strict security policies (bringing smartphones to work, for instance, is usually prohibited).

In early March, as the disease spread, tech firms sent their content moderators home, where such security is tough to enforce. That meant an increased reliance on the algorithms. The firms were frank about the impact. More videos would end up being removed, said YouTube, “including some that may not violate [our] policies”. Facebook admitted that less human supervision would likely mean “longer response times and more mistakes”. AI can do a lot. But it works best when humans are there to hold its hand. ■
