The race is on for artificial intelligence. Here’s who is winning.

On Saturday, Louisville, Kentucky hosted the 143rd running of the Kentucky Derby. More than 150,000 people watched the spectacle in person, and millions more followed on television and streaming media. The winner received a $1.4 million prize and the opportunity for more winnings in later races this year.

A bigger race is raging within the technology sector over who can commoditize machine learning as a service. Prebuilt machine learning models are worth billions of dollars, and this competition pits the largest technology companies on the planet against one another.

Events such as the Kentucky Derby actually have many races going on during the same day. The race to dominate machine learning is similar: it is really several races run at once. For this article, I’m going to focus on how the race for image recognition is shaping up.

The Cloud Contenders

Right now there are options from each of the major Public Cloud vendors. Amazon, Google, and Microsoft get a prime position based on their storage hosting services. Their offerings will determine the market direction. Image recognition may become a feature built into big cloud-based image storage systems. This move would eliminate prebuilt models as a separate product.

Testing out the current offerings

To “race” the providers against one another, I used the photo below from Wikipedia. To make the article more readable, I reduced the precision on each of the responses below to three digits.
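As a rough sketch of that rounding step, the helper below trims a score to three significant digits. The function name `round_sig` and the sample values are assumptions made for illustration; they do not come from any provider's SDK or from the actual responses.

```python
from math import floor, log10


def round_sig(value: float, digits: int = 3) -> float:
    """Round `value` to the given number of significant digits."""
    if value == 0:
        return 0.0
    # The position of the leading digit decides how many decimals to keep.
    exponent = floor(log10(abs(value)))
    return round(value, digits - 1 - exponent)


# Hypothetical raw scores in the two scales the providers use (0-100 and 0-1).
print(round_sig(97.98765))
print(round_sig(0.9371234))
```

Working in significant digits rather than decimal places keeps percentage-style scores and fractional scores at comparable precision.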

Amazon

Amazon has the largest Public Cloud footprint in the industry. Six months ago they released their MVP of Rekognition. This service builds on their Cloud platform as it integrates into S3 and Lambda. Here is what their models determine from the race photo.

[{'Confidence': 98.0, 'Name': 'Animal'}, {'Confidence': 98.0, 'Name': 'Horse'}, {'Confidence': 98.0, 'Name': 'Mammal'}, {'Confidence': 90.8, 'Name': 'Equestrian'}, {'Confidence': 90.8, 'Name': 'Person'}, {'Confidence': 52.7, 'Name': 'Colt Horse'}]
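For readers who want to reproduce a response like this, here is a minimal sketch of a label request for an image already stored in S3. It assumes a client that behaves like `boto3.client("rekognition")`; the helper name `detect_labels_in_s3` and the bucket and key in the usage comment are placeholders, and this is not necessarily how the response above was produced.

```python
def detect_labels_in_s3(client, bucket: str, key: str, min_confidence: float = 50.0):
    """Return (name, confidence) pairs for an image stored in S3.

    `client` is expected to behave like boto3.client("rekognition").
    """
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    # Each label carries a name and a confidence on a 0-100 scale.
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


# Usage (assumes boto3 is installed and AWS credentials are configured;
# the bucket and key are placeholders):
#   import boto3
#   client = boto3.client("rekognition")
#   print(detect_labels_in_s3(client, "my-bucket", "derby.jpg"))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub, without touching AWS.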

Google

Google has a large Cloud business, including object storage. Their history with image recognition in search is also a massive advantage. Their Cloud Vision API provides a thorough response for the race image.

[{ "description": "horse", "score": 0.937 }, { "description": "western riding", "score": 0.889 }, { "description": "jockey", "score": 0.881 }, { "description": "racing", "score": 0.861 }, { "description": "stallion", "score": 0.810 }, { "description": "mare", "score": 0.810 }, { "description": "western pleasure", "score": 0.806 }, { "description": "sports", "score": 0.776 }, { "description": "horse racing", "score": 0.775 }, { "description": "english riding", "score": 0.731 }, { "description": "horse trainer", "score": 0.722 }, { "description": "equestrian sport", "score": 0.708 }, { "description": "equestrianism", "score": 0.705 }, { "description": "animal sports", "score": 0.685 }, { "description": "barrel racing", "score": 0.648 }, { "description": "eventing", "score": 0.614 }, { "description": "horse like mammal", "score": 0.590 }, { "description": "reining", "score": 0.546 }]

Google goes even further by adding text recognition. When scanning the image, it read the text on the scoreboard. See the yellow boxes in the top left of the image below.

Google returns the recognized text in a machine-readable format (JSON). This is a powerful feature that others don’t offer yet.

Microsoft

Microsoft also has the combination of a large Cloud and Search business. Their offering has been on the market for more than a year. Their Computer Vision API recognized the image and provided the following results.

[{ "name": "grass", "confidence": 0.999 }, { "name": "fence", "confidence": 0.999 }, { "name": "outdoor", "confidence": 0.995 }, { "name": "horse", "confidence": 0.985 }, { "name": "ground", "confidence": 0.974 }, { "name": "sport", "confidence": 0.821 }, { "name": "horse racing", "confidence": 0.519 }]
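The three responses use different field names and, in Amazon's case, a different scale (0-100 rather than 0-1), which makes a side-by-side comparison awkward. The sketch below shows one way to normalize them, using short excerpts of the responses quoted above. The `normalize` helper is an illustrative assumption, not part of any provider's SDK.

```python
def normalize(labels, name_key, score_key, scale=1.0):
    """Map provider-specific label dicts to {lowercase name: score in 0-1}."""
    return {item[name_key].lower(): round(item[score_key] / scale, 3) for item in labels}


# Excerpts of the responses quoted in this article.
amazon = [{"Confidence": 98.0, "Name": "Horse"}, {"Confidence": 90.8, "Name": "Person"}]
google = [{"description": "horse", "score": 0.937}, {"description": "jockey", "score": 0.881}]
microsoft = [{"name": "grass", "confidence": 0.999}, {"name": "horse", "confidence": 0.985}]

results = {
    "amazon": normalize(amazon, "Name", "Confidence", scale=100.0),  # Amazon reports 0-100
    "google": normalize(google, "description", "score"),
    "microsoft": normalize(microsoft, "name", "confidence"),
}

# Labels that every provider returned, with each provider's score.
common = set.intersection(*(set(r) for r in results.values()))
for label in sorted(common):
    print(label, {provider: r[label] for provider, r in results.items()})
```

On these excerpts the only shared label is "horse"; exact string matching like this would miss near-synonyms such as "sport" and "sports".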

The Long-Shots

This race has more entrants than the three major Public Cloud providers. IBM has Watson and strong capabilities in AI, and they have enabled this capability within Bluemix. Here’s what I got when I tried the public demo with the photo.

The service limits the size of the images it accepts. This may be a usability gap that deters customers. I found a similar photo on Wikipedia that was within the 2MB threshold. The quality of the recognition was similar to the others.

[ { "class": "horse racing", "score": 0.922 },{ "class": "racing", "score": 0.928 },{ "class": "sport", "score": 0.928 },{ "class": "jockey (horse rider)", "score": 0.622 },{ "class": "traveler", "score": 0.622 },{ "class": "person", "score": 0.622 },{ "class": "racehorse", "score": 0.53 },{ "class": "mammal", "score": 0.53 },{ "class": "animal", "score": 0.53 },{ "class": "green color", "score": 0.876 }]

Start-ups provide creative alternatives in this race. One example is Clarifai, which raised $30M last year. Their API showed strong recognition on the same image used for the tech giants.

horse, 0.999
equine, 0.992
race, 0.990
track, 0.989
fast, 0.984
jockey, 0.983
thoroughbred, 0.981
competition, 0.966
gambling, 0.951
filly, 0.942
mare, 0.936
turf, 0.924
whip, 0.902
best, 0.897
stallion, 0.882
athlete, 0.869
saddle, 0.865
racehorse, 0.864
rider, 0.864
blinker, 0.858

This highlights the potential for a newcomer to break into this race. The startup could ride the rails of an existing Cloud hosting provider, giving it economies of scale.

Who is the winner?

The race is very competitive, with Google currently in the lead. Software developers integrating image recognition into their digital products are also winners. I recently built an Alexa game that uses it to play scavenger hunt. This was done with just a few lines of code, and no effort to train models.
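To give a sense of how little code the recognition side of such a game needs, here is a hypothetical sketch of the core check: did the player's photo contain the item they were sent to find? It is not the actual game code. It assumes labels in the Rekognition-style shape quoted earlier in this article, and the name `found_item` and the 80.0 threshold are illustrative choices.

```python
def found_item(labels, target: str, min_confidence: float = 80.0) -> bool:
    """Return True if `target` appears among the labels with enough confidence.

    `labels` is a list of {"Name": ..., "Confidence": ...} dicts,
    with confidence on a 0-100 scale.
    """
    wanted = target.strip().lower()
    return any(
        label["Name"].lower() == wanted and label["Confidence"] >= min_confidence
        for label in labels
    )


labels = [
    {"Confidence": 98.0, "Name": "Horse"},
    {"Confidence": 90.8, "Name": "Person"},
    {"Confidence": 52.7, "Name": "Colt Horse"},
]
print(found_item(labels, "horse"))       # True
print(found_item(labels, "colt horse"))  # False: 52.7 is below the 80.0 threshold
```

The prebuilt model does the hard part; the game logic is only a comparison against the labels it returns.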

The current price point is around $1 per thousand images. At this level, image recognition will be incorporated into many different products. The race to become the most consumed service is on!
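As a quick worked example of what that price point means at different volumes, the sketch below applies a flat rate of $1 per thousand images. The flat rate is the approximate figure from this article; real price lists may include tiers or free quotas, which this sketch ignores.

```python
PRICE_PER_THOUSAND_USD = 1.00  # approximate figure from the article, not a quoted price list


def estimated_cost(image_count: int, price_per_thousand: float = PRICE_PER_THOUSAND_USD) -> float:
    """Estimate the charge in USD for classifying `image_count` images."""
    return image_count / 1000 * price_per_thousand


for count in (1_000, 50_000, 2_000_000):
    print(f"{count:>9,} images -> ${estimated_cost(count):,.2f}")
```

At this rate, two million images come to about $2,000.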