Item-based and user-based collaborative filtering approaches for a recommendation system, with simple code.
Overview of Recommendation Systems
There are many approaches to building a recommendation system, each serving a different purpose. My previous article covered simple and content-based recommenders. Those are non-personalised recommenders, but that does not make them less useful than the others. They are very popular for recommending the top music of the week or music of a similar genre.
This article focuses on collaborative filtering. This method compares your taste with that of similar people or with similar items, and then recommends a list of items you are probably interested in, based on that consumption similarity. It focuses only on the ratings.
There are two main types of filtering in this method: item-based filtering and user-based filtering. Item-based filtering suggests items that are similar to what you have already liked. User-based filtering suggests items that people similar to you have liked but that you have not yet consumed.
Using the Amazon movie data, we will apply item-based and user-based filtering to find similar items to recommend and to identify users with similar taste.
Analysis Overview
For both item-based and user-based filtering, we need to clean the data and arrange it into a matrix so that it can be used for analysis. All ratings need to be numeric and normalized, and cosine similarity is used to calculate the similarity between items or between users.
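As a minimal sketch of the similarity measure mentioned above: the cosine similarity of two rating vectors is their dot product divided by the product of their lengths. The helper name `cosine` and the three toy vectors are made up for illustration; only `cosine_similarity` from scikit-learn is used in the article itself.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


def cosine(a, b):
    """Cosine similarity of two 1-D vectors (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Return 0 when either vector has zero length
    return float(a @ b / denom) if denom else 0.0


# Toy rating vectors, not taken from the Amazon data
x = [1.0, 0.5, 0.0]
y = [0.8, 0.4, 0.0]  # same direction as x, so similarity is close to 1
z = [0.0, 0.0, 1.0]  # shares no rated item with x, so similarity is 0

manual_xy = cosine(x, y)
manual_xz = cosine(x, z)

# scikit-learn computes all pairwise similarities between rows at once
pairwise = cosine_similarity([x, y, z])
```

Because only the direction of the vectors matters, two users who rate the same movies in the same proportions count as similar even if one of them rates everything lower.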
Data Overview
The dataset contains 4,848 users and 206 movies.
Implementation
Now, let's import all the tools we are going to use for the analysis, load the data into a DataFrame, and clean it.
import pandas as pd
import numpy as np
import scipy as sp
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

amazon = pd.read_csv('.../Amazon.csv')
amazon.info()
amazon.head()


Then, we need to rearrange the data into matrix format, with user_id as the row index and name as the column index.
amazon = amazon.melt(id_vars=['user_id'], var_name='name', value_name='rating')
amazon_pivot = amazon.pivot_table(index=['user_id'], columns=['name'], values='rating')
amazon_pivot.head()
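To make the reshaping step easier to follow, here is a small sketch on made-up data. It assumes the CSV is in wide form, with one `user_id` column and one column per movie, which is what the `melt` call above implies; the toy users and ratings are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format data: one row per user, one column per movie
wide = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "Movie1": [5.0, np.nan, 3.0],
    "Movie2": [np.nan, 4.0, 1.0],
})

# melt: one row per (user, movie) pair, with the rating in its own column
long = wide.melt(id_vars=["user_id"], var_name="name", value_name="rating")

# pivot_table: back to a user x movie matrix, indexed by user_id
pivot = long.pivot_table(index="user_id", columns="name", values="rating")
```

Note that `pivot_table` averages duplicate (user, movie) pairs by default, and movies that were not rated stay as NaN in the matrix.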

From here, we need to normalize the rating values so that they fall within a comparable range. Then, we turn the NaN values into 0 and keep only the users who have rated at least one movie.
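# Min-max scale each user's ratings (row-wise) to the range [0, 1]
amazon_normalized = amazon_pivot.apply(lambda x: (x-np.min(x))/(np.max(x)-np.min(x)), axis=1)
# Treat missing ratings as 0
amazon_normalized.fillna(0, inplace=True)
# Transpose so that rows are movies and columns are users
amazon_normalized = amazon_normalized.T
# Keep only the users (columns) that have at least one non-zero value
amazon_normalized = amazon_normalized.loc[:, (amazon_normalized !=0).any(axis=0)]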
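The sketch below replays this normalization on a toy matrix to show two side effects worth knowing about. The function name `min_max_rows` and the data are made up for illustration. With row-wise min-max scaling, a user's lowest rating becomes 0, the same value used for "not rated", and a user whose ratings are all identical (for example, a single rating) ends up as all zeros and is filtered out.

```python
import numpy as np
import pandas as pd

# Hypothetical user x movie ratings; NaN means "not rated"
ratings = pd.DataFrame(
    {
        "Movie1": [5.0, 2.0, 3.0],
        "Movie2": [np.nan, 4.0, 1.0],
        "Movie3": [np.nan, 5.0, 2.0],
    },
    index=pd.Index(["u1", "u2", "u3"], name="user_id"),
)


def min_max_rows(df):
    """Scale each row to [0, 1]; rows with a zero range become NaN."""
    row_min = df.min(axis=1)
    row_range = df.max(axis=1) - row_min
    return df.sub(row_min, axis=0).div(row_range, axis=0)


normalized = min_max_rows(ratings).fillna(0)

# Same filter as in the article: movies as rows, keep users with any non-zero value
by_movie = normalized.T
kept = by_movie.loc[:, (by_movie != 0).any(axis=0)]
```

Here `u1` rated only one movie, so the row range is zero and the user drops out of `kept`. If that is not the intended behaviour, mean-centering or leaving single-rating users unscaled are possible alternatives, although the article does not use them.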

We are nearly there. Now we need to put the data into a sparse matrix.
amazon_sparse = sp.sparse.csr_matrix(amazon_normalized.values)
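A sparse matrix is a natural fit here because most users rate only a few of the movies, so most entries are 0. The toy array below is made up; it simply illustrates that the CSR format stores only the non-zero entries and can be converted back to a dense array without loss.

```python
import numpy as np
from scipy import sparse

# Hypothetical movie x user matrix with mostly zero entries
dense = np.array([
    [0.0, 0.5, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.25],
])

csr = sparse.csr_matrix(dense)

# Number of stored (non-zero) entries and the share of the matrix they fill
stored = csr.nnz
density = stored / (csr.shape[0] * csr.shape[1])

# Converting back gives the original dense array
round_trip = csr.toarray()
```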
Let's look at item-based filtering recommendation.
item_similarity = cosine_similarity(amazon_sparse)
item_sim_df = pd.DataFrame(item_similarity, index=amazon_normalized.index, columns=amazon_normalized.index)
item_sim_df.head()

Both the rows and the columns now represent movies, and the matrix is ready for the recommendation calculation.
def top_movie(movie_name):
    # Skip position 0, which is expected to be the movie itself (similarity 1)
    for item in item_sim_df.sort_values(by = movie_name, ascending = False).index[1:11]:
        print('Similar movie:{}'.format(item))

top_movie("Movie102")

These are the movies that are similar to Movie102.
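As a possible variation, the lookup can return the similarity scores instead of printing them, and it can remove the queried movie by name rather than by position. Skipping the first sorted row assumes the movie itself sorts first, which may not hold if another movie has an identical rating vector. The function name `similar_items` and the toy matrix are hypothetical.

```python
import pandas as pd


def similar_items(sim_df, name, n=10):
    """Return the n items most similar to `name`, excluding `name` itself."""
    return sim_df[name].drop(labels=name).nlargest(n)


# Hypothetical item-item similarity matrix
names = ["MovieA", "MovieB", "MovieC"]
toy_sim = pd.DataFrame(
    [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.4],
     [0.1, 0.4, 1.0]],
    index=names,
    columns=names,
)

result = similar_items(toy_sim, "MovieB", n=2)
```

With the article's data this would be called as `similar_items(item_sim_df, "Movie102")`.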
Let's look at user-based filtering recommendation. Who has similar taste to me?
user_similarity = cosine_similarity(amazon_sparse.T)
user_sim_df = pd.DataFrame(user_similarity, index = amazon_normalized.columns, columns = amazon_normalized.columns)
user_sim_df.head()

def top_users(user):
    # Sort once, then skip position 0 (expected to be the user themselves)
    ranked = user_sim_df.sort_values(by=user, ascending=False)[user]
    for other_user, sim in ranked.iloc[1:11].items():
        print('User #{0}, Similarity value: {1:.2f}'.format(other_user, sim))

top_users('A140XH16IKR4B0')

These are examples of how to implement item-based and user-based filtering recommendation systems. Some of the code is from https://www.kaggle.com/ajmichelutti/collaborative-filtering-on-anime-data
I hope you enjoy it!
Translated from: https://medium.com/analytics-vidhya/recommend-amazon-movie-a-collaborative-approach-9b3db8f48ad6