Recommending a Movie Using Collaborative Filtering

ALSO, ARE RECOMMENDER SYSTEMS INFLUENCING OUR TASTE?

An excerpt on creating a movie recommender system similar to the ones used by OTT platforms.

INTRODUCTION

Formally, a recommender system is a system that seeks to predict or filter items according to a user’s preferences. Demand for good recommender systems is soaring, especially since the onset of the Covid-19 lockdowns, which forced everyone to stay home and watch movies of their favourite genre, actor, director... you get the idea. This is where a recommender system plays an important role: it offers users content they are likely to watch, rather than making them search for something that interests them, which would hurt the user experience.

The essence of a recommender system lies in its recommendation engine. There are two types of recommendation engine:

  1. Content-based filtering engine: It provides recommendations by matching the description of a movie against a user profile built from the interests the user has provided. It has an explicit understanding of why an item is recommended. You might have seen this in apps that ask about your preferences as soon as you sign up. This is what those questions are for.

  2. Collaborative filtering engine: It makes automatic predictions about a user's interests by collecting preference or taste information from the activity of the current user along with many other users with similar activity (hence "collaborating"). The underlying assumption is that if person A has the same opinion as person B on one issue, A is more likely to share B’s opinion on a different issue than the opinion of a randomly chosen person. It needs no explicit understanding of the items it recommends. You might have noticed that when you open a particular movie on an OTT platform, an array of movies appears under the heading “people who watched this movie also watched”. That list is produced by this kind of engine.
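The "A agrees with B" assumption behind collaborative filtering can be illustrated with a minimal neighbourhood-style sketch. The three users, four movies and all rating values below are invented for illustration, and cosine similarity is only one possible way to measure agreement; this is not the SVD approach used later in the article.

```python
import numpy as np

# Rows are users A, B, C; columns are four movies. 0 means "not rated".
# All values are made up for illustration.
ratings = np.array([
    [5.0, 4.0, 1.0, 0.0],  # A has not rated the last movie
    [5.0, 5.0, 1.0, 4.0],  # B
    [1.0, 2.0, 5.0, 1.0],  # C
])

def cosine_similarity(u, v):
    """Cosine similarity computed only over movies both users rated."""
    both = (u > 0) & (v > 0)
    if not both.any():
        return 0.0
    a, b = u[both], v[both]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_ab = cosine_similarity(ratings[0], ratings[1])
sim_ac = cosine_similarity(ratings[0], ratings[2])

# Estimate A's rating for the last movie as a similarity-weighted
# average of B's and C's ratings for that movie.
predicted = (sim_ab * ratings[1, 3] + sim_ac * ratings[2, 3]) / (sim_ab + sim_ac)

print(f"sim(A, B) = {sim_ab:.3f}, sim(A, C) = {sim_ac:.3f}")
print(f"estimated rating of A for the unseen movie: {predicted:.2f}")
```

Because A's past ratings agree more with B's than with C's, B's opinion of the unseen movie counts for more in the estimate.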

Equipped with these basics, let's dive into creating a movie recommender system using collaborative filtering.

We start by importing the required libraries. We will be using scikit-surprise, which contains an SVD (Singular Value Decomposition) algorithm. SVD allows us to extract and untangle information, which is really helpful in creating a recommender system.

This topic involves a lot of statistical data analysis. Resources to learn more about scikit-surprise and SVD:

The first thing to do before creating a model is to observe the data. This gives us a lot of insight into what type of data it is and how we can get the most out of it.

As we observe the data, we see that timestamp is a redundant column, so it is best to remove it.

It is always good practice to check for NaNs in your dataset. Luckily, we don’t have any.
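The inspection and cleaning steps above might look roughly like this in pandas. The column names (userId, movieId, rating, timestamp) and the five rows are assumptions made for illustration; in practice the frame would be loaded from the ratings file.

```python
import pandas as pd

# Toy stand-in for the ratings table; names and values are assumed.
ratings = pd.DataFrame({
    "userId": [1, 1, 2, 2, 3],
    "movieId": [10, 20, 10, 30, 20],
    "rating": [4.0, 3.5, 5.0, 2.0, 4.5],
    "timestamp": [964982703, 964981247, 964982224, 964983815, 964982931],
})

# Observe the data first.
print(ratings.head())

# The timestamp column is not needed for the model, so drop it.
ratings = ratings.drop(columns=["timestamp"])

# Count missing values per column; all zeros means no NaNs.
nan_counts = ratings.isna().sum()
print(nan_counts)
```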

Now comes the main part of this model: exploratory data analysis.

To start, we look at the number of movies and users in the dataset.

Now we find the sparsity of the data. Not every user rates every movie, so sparsity tells us what percentage of all possible user-movie ratings is missing, that is, the number of missing values divided by the total number of values. Sparsity for this data is 98%. Usually, the lower the sparsity, the better, but in the case of collaborative filtering, anything below 99% is manageable.

Sparsity (%) = (Number of missing values / Total number of values) * 100
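As a rough sketch, the user and movie counts and the sparsity formula can be computed as follows. The column names and the toy rows are assumptions, and the calculation assumes each user rates a given movie at most once.

```python
import pandas as pd

# Toy ratings table; names and values are assumed for illustration.
ratings = pd.DataFrame({
    "userId": [1, 1, 2, 2, 3],
    "movieId": [10, 20, 10, 30, 20],
    "rating": [4.0, 3.5, 5.0, 2.0, 4.5],
})

n_users = ratings["userId"].nunique()
n_movies = ratings["movieId"].nunique()

# Every (user, movie) pair is one cell of the full rating matrix.
total_values = n_users * n_movies
missing_values = total_values - len(ratings)
sparsity = missing_values / total_values * 100

print(f"{n_users} users, {n_movies} movies, sparsity = {sparsity:.1f}%")
```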

Now we visualize the distribution of ratings.

Most of the ratings are between 3 and 5, and the ratings range from 0.5 to 5.
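One way to summarise the rating distribution before plotting it is sketched below. The ten rating values are invented for illustration and do not reproduce the article's dataset.

```python
import pandas as pd

# Made-up ratings on a 0.5 to 5 scale.
ratings = pd.Series(
    [0.5, 3.0, 3.5, 4.0, 4.0, 5.0, 2.0, 4.5, 3.0, 1.0], name="rating"
)

# How many times each rating value occurs, ordered by rating.
distribution = ratings.value_counts().sort_index()

# Fraction of ratings that fall between 3 and 5 (inclusive).
share_3_to_5 = ratings.between(3.0, 5.0).mean()

print(distribution)
print(f"share of ratings in [3, 5]: {share_3_to_5:.0%}")
```

With matplotlib installed, `distribution.plot(kind="bar")` would turn the same counts into a bar chart.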

FEATURE ENGINEERING

Now comes the next essential part of the system: feature engineering. I believe feature engineering is as important as building the model, as it allows the model to understand the data better and converge better.

Here we reduce the dimensions by removing data that carries too little information, such as movies with fewer than 3 ratings or users who rated fewer than 3 movies, as it is difficult to recommend anything with so little data to analyse.
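A minimal sketch of this filtering step, assuming a pandas frame with userId, movieId and rating columns (names and values invented for illustration):

```python
import pandas as pd

# Toy data: movie 40 has a single rating, and user 4 rated only that movie.
ratings = pd.DataFrame({
    "userId":  [1, 1, 1, 2, 2, 2, 3, 3, 3, 4],
    "movieId": [10, 20, 30, 10, 20, 30, 10, 20, 30, 40],
    "rating":  [4.0, 3.5, 5.0, 4.5, 3.0, 2.0, 3.5, 4.5, 2.5, 5.0],
})

MIN_RATINGS = 3  # threshold taken from the text above

# Keep only movies that received at least MIN_RATINGS ratings.
movie_counts = ratings.groupby("movieId")["rating"].transform("count")
ratings = ratings[movie_counts >= MIN_RATINGS]

# Then keep only users who still have at least MIN_RATINGS ratings.
user_counts = ratings.groupby("userId")["rating"].transform("count")
filtered = ratings[user_counts >= MIN_RATINGS]

print(filtered)
```

This is a single pass: removing users can push a movie back below the threshold, so on real data the two filters may need to be repeated until nothing changes.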

Now let's start creating the model.

We create a Surprise dataset for training using the Reader class that we imported, providing the expected rating scale, which we found during our exploratory data analysis. The data itself is loaded through the Dataset class that we imported.

Since we are using the whole dataset for training, we create an anti-testset, which consists of all the user-movie pairs that have no rating, and which we can pass to the model to get predictions.

We create our SVD model, which untangles the information for us and completes the recommender model.

We then evaluate our model with two metrics, Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). MAE is the average of the absolute differences between the predicted and the actual ratings. RMSE is the square root of the average of the squared differences, so it penalizes large errors more heavily.
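To make the two metrics concrete, here is a small hand-written sketch. The four actual and predicted ratings are invented, and the helper functions are illustrative rather than the ones Surprise uses internally.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: sqrt of the average squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up ratings and model estimates.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]

print("MAE :", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

RMSE comes out larger than MAE here because the two errors of size 1 weigh more once squared.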

Predicting

The prediction gives us a movie id for user id 1.

This finishes our recommender system’s job.

Now... let's discuss something debatable.

Are recommender systems influencing our taste in movies and taking control away from us?

Photo by Juan Rumimpunu on Unsplash

My father, who has no connection to computer science, asked me this one fine morning. He was going through his favourite video streaming service and observed that he was seeing videos related to only a few areas. It made him feel that his choices were being influenced by the service and that he was unable to come across anything new.

I explained this to him in my own words, based on my own understanding:

He had been watching the same kind of videos over and over every day, thus creating a profile that says he is interested only in that particular topic. That was why he was shown videos from that topic only.

But does that mean you have no control over it?

The answer is NO.

You still have control. If the engine recommends a topic you are not interested in, just let the engine know that you are not interested. Yes, you have that option. Expand your viewing horizons with diverse content. A recommender system is there just to help you, not to control you. In the end, it is up to the viewer whether to watch or not.

Let's share our views on this and spread some knowledge. Let's learn and grow as a community, because all we are left with is people, memories and knowledge.

Thank you.

Translated from: https://medium.com/swlh/recommending-a-movie-using-collaborative-filtering-6dab1b8f4472
