by Thanesh Sunthar

How Crowd Curation Improved Our Search Quality by 27%

The bigger your platform gets, the more vital search becomes. And if you run a content-heavy platform like ours, it’s even more critical that you get search right.

Retrieving relevant information from millions (potentially billions) of records is not a trivial task. The problem of search is so complex that it has its own discipline dedicated to solving it, called Information Science.

The world’s six most-visited websites all feature a prominent search bar in their navigation panel. Facebook, YouTube, and Amazon have chosen to place the search bar right next to their logo, highlighting how important search has become for these platforms. Google, the world’s number one website, was initially built around this single problem — search!

Search is the primary way people discover content on a platform. Few people will really put in the time to learn your platform’s hierarchy. In every category, there are many other platforms competing for your users’ time, so these hierarchies are constantly evolving, anyway.

When was the last time you used a menu bar? Or even used advanced search filters? Unless you force users to use these, they naturally tend to stay away from them. So if it isn’t easy enough for users to discover your content through search, they’ll lose interest and move on.

Curating Search Results

When search results aren’t relevant to the user, they won’t take the next action and click on any of the results. Curation helps increase the relevance of search results.

My company, Flipp, takes weekly circulars from retailers and makes them searchable. Here’s the difference between non-curated and manually curated results when you search “cake” on Flipp:

Manual curation is the process of a human actually checking each and every search term, then manually arranging the sort order of the results. It’s clear that our manually curated version shows a much cleaner, more relevant set of search results to the user.

You can automate some aspects of manual curation, but it’s still a resource-intensive task.

Enter Crowd Curation

While manual curation is a great way to get started, it’s not a scalable solution. We need a better approach.

This is where crowd curation comes into play. It uses the wisdom of the crowd to sort the order of search results.

One simple approach is to see what items users are clicking on the most, then bump them up to the top of your search results. Here’s an example of the search results for the query “bbq” both before and after crowd curation:

As you can see, measuring the click count on an item and sorting results based on that yields much better search results. But because items change on a daily basis in our app, our search results require a periodic tune-up. We keep search results fresh so that expired deals disappear and newer, more “newsworthy” deals rise to the top.
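
The click-count approach described above can be sketched in a few lines of Python. This is only an illustration: the function name, the item names, and the shape of the click log are assumptions, not Flipp's actual code or data.

```python
from collections import Counter

def rank_by_clicks(results, click_log):
    """Reorder results so the most-clicked items come first.

    results: item ids in their original (non-curated) order.
    click_log: iterable of item ids, one entry per recorded click.
    Both inputs are hypothetical shapes chosen for this sketch.
    """
    clicks = Counter(click_log)
    # sorted() is stable, so items with equal click counts keep
    # their original relative order. Unclicked items count as 0.
    return sorted(results, key=lambda item: clicks[item], reverse=True)

# Invented example data for the query "bbq".
results = ["bbq sauce", "bbq chips", "gas bbq grill", "bbq tongs"]
click_log = [
    "gas bbq grill", "bbq tongs", "gas bbq grill",
    "bbq sauce", "gas bbq grill", "bbq tongs",
]
print(rank_by_clicks(results, click_log))
```

Sorting on raw counts like this is the simplest possible version. It ignores how often each item was shown and how old it is, which is where the challenges discussed next come from.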

We have to ensure that we don’t penalize new flyer items relative to older items, which have received more impressions and have therefore collected more clicks. This creates other interesting challenges for our dev team.
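
One common way to avoid favoring items simply because they have been shown more often is to rank on clicks per impression, smoothed toward a prior so that a handful of early impressions does not produce extreme values. The sketch below shows that general technique. The article does not say how Flipp handles this, and `prior_ctr` and `prior_weight` are made-up tuning values.

```python
def smoothed_ctr(clicks, impressions, prior_ctr=0.05, prior_weight=100):
    """Click-through rate pulled toward prior_ctr when impressions are few.

    prior_ctr and prior_weight are hypothetical: they act like
    prior_weight imaginary impressions that converted at prior_ctr.
    """
    return (clicks + prior_ctr * prior_weight) / (impressions + prior_weight)

# Invented numbers: an older item with many impressions and a new one with few.
old_item = smoothed_ctr(clicks=400, impressions=10_000)
new_item = smoothed_ctr(clicks=12, impressions=100)

# The new item has far fewer total clicks but a better rate per impression,
# so it is not buried under the older item.
print(old_item, new_item)
```

With raw click counts the older item would win 400 to 12. Normalizing by impressions lets the newer item compete on how well it converts rather than on how long it has been visible.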

Search is also slightly different on mobile platforms. Because the screen size is smaller, we have to also consider what is actually displayed in the viewport.

A user is more likely to click on an item shown at the top (above the fold) than on items further down the list that they have to scroll to (below the fold). If the user does make the effort to scroll down to find an item, that also has to be taken into account when we improve the sort order of our search results.
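
One way to credit clicks that required scrolling is to compare each item's clicks with the clicks its screen positions would be expected to earn anyway, a technique often called "clicks over expected clicks". The sketch below is an assumption about how such a correction could look, not a description of Flipp's algorithm. The baseline rates in `position_ctr` are invented.

```python
from collections import defaultdict

def clicks_over_expected(events, position_ctr):
    """Ratio of actual clicks to the clicks expected from position alone.

    events: (item, position, clicked) tuples, one per impression;
            position 0 is the top of the list.
    position_ctr: hypothetical baseline click-through rate per position;
                  every rate must be greater than zero.
    """
    clicks = defaultdict(int)
    expected = defaultdict(float)
    for item, position, clicked in events:
        clicks[item] += int(clicked)
        expected[item] += position_ctr[position]
    return {item: clicks[item] / expected[item] for item in expected}

# Invented data: "grill" sits above the fold, "tongs" below it.
position_ctr = [0.30, 0.15, 0.05]
events = (
    [("grill", 0, True)] * 3 + [("grill", 0, False)] * 7
    + [("tongs", 2, True)] * 2 + [("tongs", 2, False)] * 8
)
scores = clicks_over_expected(events, position_ctr)
print(scores)
```

Here "tongs" collects fewer clicks than "grill", but it earns them from a position where clicks are rare, so its ratio is higher. A score above 1 means the item outperforms its position.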

Measuring Search Quality

The most important measure of a search engine is the quality of its search results. Here, the gap between searches and clicks is widening, and search is getting worse:

We use Click Through Rate — the ratio of users who click on a specific item versus the total users who view those search results — to measure the effectiveness of our search engine.

We also use Discounted Cumulative Gain to measure the quality of our ranking algorithms.
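
For readers unfamiliar with the metric, Discounted Cumulative Gain sums the relevance of each result, discounted by the logarithm of its rank, so relevant items count for more near the top. The sketch below uses the standard textbook formula. How Flipp assigns relevance grades is not stated in the article, so the grades here are invented.

```python
import math

def dcg(relevances):
    """Discounted Cumulative Gain for relevance grades listed in ranked order."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )

def ndcg(relevances):
    """DCG divided by the DCG of the ideal ordering, giving a value in [0, 1]."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Invented relevance grades (3 = best) for the same three results in two orders.
good_order = [3, 2, 1]
bad_order = [1, 2, 3]
print(ndcg(good_order), ndcg(bad_order))
```

The normalized form makes rankings comparable across queries with different numbers of relevant results: a perfect ordering always scores 1.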

One simple way to visualize “uplift” (the improvement in results) is to measure the additional clicks generated at each rank of the search results. Using this measure, we found that crowd curation produced a 27% uplift in clicks on the first result.

Most of the clicks shifted towards the top ranks, proving that we had improved the quality and relevancy of our search results.
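
Uplift per rank can be computed as the relative change in clicks at each position. The sketch below assumes the baseline and curated counts come from comparable amounts of search traffic. The numbers are invented for illustration and are not Flipp's data.

```python
def uplift_by_rank(baseline_clicks, curated_clicks):
    """Relative change in clicks at each rank; lists are ordered top rank first.

    Returns None for a rank whose baseline is zero, since the
    relative change is undefined there.
    """
    return [
        (after - before) / before if before else None
        for before, after in zip(baseline_clicks, curated_clicks)
    ]

# Invented click counts for ranks 1 to 3, before and after curation.
baseline = [200, 120, 80]
curated = [250, 110, 70]
uplift = uplift_by_rank(baseline, curated)
print(uplift)
```

A positive value at the top ranks combined with negative values further down is the pattern described above: clicks move toward the first results.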

And yes, our algorithm also weighs how long an item has been available in search.

For example, if a circular dropped on Wednesday, the “newsworthiness” of items from that circular would degrade as we move through the week, giving more importance to items from flyers dropped more recently than Wednesday. We balance this with the number of clicks.

In other words:

Item Rank = Formula(Age of Item, Clicks)

This way, we’re able to somewhat mitigate winner-takes-all effects.
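
The article does not disclose the actual formula, so the following is only one plausible shape for it: clicks are dampened with a logarithm and multiplied by a freshness factor that halves every few days. The function name, the logarithmic damping, and `half_life_days` are all assumptions made for this sketch.

```python
import math

def item_score(clicks, age_days, half_life_days=3.0):
    """Hypothetical ranking score combining clicks and item age.

    log1p dampens very large click counts, and the freshness factor
    halves every half_life_days (an invented tuning value).
    """
    freshness = 0.5 ** (age_days / half_life_days)
    return math.log1p(clicks) * freshness

# Invented example: an older item with many clicks and a fresh one with fewer.
older = item_score(clicks=500, age_days=6)
newer = item_score(clicks=60, age_days=1)
print(older, newer)
```

In this example the fresh item outranks the older one despite having far fewer clicks, which is the kind of balance between age and clicks the formula above describes.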

At Flipp, we want the user experience to be magical. We’re a data-driven company that constantly looks at data to find new ways to improve the lives of millions of users. Search is just one area where we are applying this principle, but it’s an important one.

I’m Thanesh, a senior product manager at Flipp. I published another version of this on the Flipp engineering blog. If you’re interested in reinventing the way people buy things, check out our current job postings.
