Recommend Amazon Movies: A Collaborative Approach


Item-based and user-based collaborative filtering for a recommendation system, with simple code.

Overview of Recommendation Systems

There are many recommendation methods, and each serves a different purpose. My previous article covered simple and content-based recommendation. Those are non-personalised recommenders, but that doesn't make them less useful than the others. They are very popular for tasks such as recommending the top music of the week or music of a similar genre.

This article focuses on collaborative filtering. The method compares your taste with that of similar people or with similar items. It then recommends a list of items based on that consumption similarity, suggesting what you are probably interested in. These methods work from the ratings alone.

There are two main variants of this method: item-based filtering and user-based filtering. Item-based filtering suggests items that are similar to ones you have already liked. User-based filtering suggests items that people similar to you have liked but that you have not yet consumed.
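
To make the user-based idea concrete, here is a minimal sketch on a toy matrix. The users, movies, and ratings are made up, and scoring by a similarity-weighted average is one common choice assumed for illustration, not something the walkthrough below implements. Unrated movies are stored as 0, which is a simplification.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are movies, 0 means "not rated".
movies = ["Movie1", "Movie2", "Movie3", "Movie4"]
ratings = np.array([
    [5.0, 4.0, 0.0, 0.0],  # me
    [5.0, 4.0, 5.0, 1.0],  # a user whose taste is close to mine
    [1.0, 0.0, 1.0, 5.0],  # a user with different taste
])

def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

me = ratings[0]
others = ratings[1:]

# Similarity between me and every other user.
sims = np.array([cosine(me, other) for other in others])

# Score each movie by the similarity-weighted average of the other users' ratings.
scores = sims @ others / sims.sum()

# Recommend only the movies I have not rated, best score first.
unseen = [i for i in range(len(movies)) if me[i] == 0]
recommended = [movies[i] for i in sorted(unseen, key=lambda i: scores[i], reverse=True)]
print(recommended)
```

In this toy data Movie3 is ranked ahead of Movie4, because the user most similar to "me" rated Movie3 highly and Movie4 poorly.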

Using the Amazon movie data, we will apply both item-based and user-based filtering to find similar items to recommend and to identify users with similar taste.

Analysis Overview

For both item-based and user-based filtering, we need to clean the data and arrange it into a matrix that can be used for the analysis. All ratings must be numeric and normalized, and cosine similarity is used to measure how similar items or users are.
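
As a quick illustration of cosine similarity, the sketch below computes it by hand with NumPy on a made-up ratings matrix. The helper name `cosine_matrix` and the data are hypothetical; the walkthrough itself uses scikit-learn's `cosine_similarity` instead.

```python
import numpy as np

# Hypothetical normalized ratings: rows are movies, columns are users.
ratings = np.array([
    [1.0, 0.0, 0.5],  # movie_a
    [1.0, 0.0, 0.5],  # movie_b, rated exactly like movie_a
    [0.0, 1.0, 0.0],  # movie_c, rated only by a different user
])

def cosine_matrix(m):
    """Pairwise cosine similarity between the rows of m (assumes no all-zero row)."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    unit = m / norms          # scale every row to length 1
    return unit @ unit.T      # dot products of unit vectors are cosines

sim = cosine_matrix(ratings)
print(np.round(sim, 2))
```

Rows with identical rating patterns (movie_a and movie_b) get a similarity of 1, while movie_c, which shares no raters with them, gets 0.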

Data Overview

The dataset contains 4,848 users and 206 movies.

Implementation

Now, let's import the tools we will use for the analysis, load the data into a DataFrame, and clean it.

import pandas as pd
import numpy as np
import scipy as sp
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

# Load the ratings and take a first look at the columns and dtypes.
amazon = pd.read_csv('.../Amazon.csv')
amazon.info()
amazon.head()

Next, we rearrange the data into a matrix whose rows are indexed by user_id and whose columns are indexed by name (the movie name).

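# Wide to long: one row per (user_id, name) pair with its rating.
amazon = amazon.melt(id_vars=['user_id'], var_name='name', value_name='rating')
# Long to matrix: users as rows, movies as columns.
amazon_pivot = amazon.pivot_table(index=['user_id'], columns=['name'], values='rating')
amazon_pivot.head()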
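
The melt and pivot steps can be tried on a tiny table first. The sketch below assumes the raw file is wide, with one row per user and one column per movie, which is what the `melt` call above implies; the three users and two movies are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical wide table: one row per user, one column per movie,
# NaN where the user gave no rating.
wide = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "Movie1": [5.0, np.nan, 4.0],
    "Movie2": [np.nan, 3.0, 1.0],
})

# melt: wide -> long, one row per (user, movie) pair.
long = wide.melt(id_vars=["user_id"], var_name="name", value_name="rating")

# pivot_table: long -> matrix indexed by user_id with one column per movie.
matrix = long.pivot_table(index="user_id", columns="name", values="rating")

print(long)
print(matrix)
```

The long table has six rows (three users times two movies), and the pivoted matrix keeps NaN wherever a user did not rate a movie.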

From here, we normalize the ratings so that every user's values fall in the same range (0 to 1). Then we replace the NaN values with 0 and keep only the users who rated at least one movie.

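# Min-max scale each user's ratings (one row per user) to the range [0, 1].
amazon_normalized = amazon_pivot.apply(lambda x: (x-np.min(x))/(np.max(x)-np.min(x)), axis=1)

# Treat missing ratings as 0.
amazon_normalized.fillna(0, inplace=True)
# Transpose so that movies are rows and users are columns.
amazon_normalized = amazon_normalized.T
# Keep only the users (columns) that have at least one non-zero value.
amazon_normalized = amazon_normalized.loc[:, (amazon_normalized !=0).any(axis=0)]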
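
To see what this normalization does to individual users, here is a small sketch of the same steps on made-up data; the frame `pivot` and the helper `min_max` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hypothetical user-by-movie ratings; NaN means "not rated".
pivot = pd.DataFrame(
    {
        "Movie1": [5.0, 2.0, np.nan],
        "Movie2": [1.0, 2.0, np.nan],
        "Movie3": [3.0, np.nan, np.nan],
    },
    index=["u1", "u2", "u3"],
)

def min_max(row):
    """Scale one user's ratings to [0, 1]; NaN entries stay NaN."""
    return (row - row.min()) / (row.max() - row.min())

normalized = pivot.apply(min_max, axis=1).fillna(0)

# Movies as rows, users as columns; keep only users with a non-zero entry.
normalized = normalized.T
normalized = normalized.loc[:, (normalized != 0).any(axis=0)]
print(normalized)
```

Only u1 survives in this toy data. u3 rated nothing, and u2 gave every movie the same rating, so the division is 0/0 and the row becomes all zeros after `fillna`. Note also that each user's lowest-rated movie maps to 0, the same value used for unrated movies.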

We are nearly there. Now we need to put the values into a sparse matrix.

amazon_sparse = sp.sparse.csr_matrix(amazon_normalized.values)

Let's look at the item-based filtering recommendation.

item_similarity = cosine_similarity(amazon_sparse)
item_sim_df = pd.DataFrame(item_similarity, index=amazon_normalized.index, columns=amazon_normalized.index)
item_sim_df.head()

Every row and every column now corresponds to a movie, so the matrix is ready for the recommendation step.

def top_movie(movie_name):
    # Sort movies by similarity to movie_name; position 0 is the movie itself,
    # so take positions 1 to 10.
    for item in item_sim_df.sort_values(by = movie_name, ascending = False).index[1:11]:
        print('Similar movie:{}'.format(item))

top_movie("Movie102")

These are the ten movies most similar to Movie102.

Let's look at the user-based filtering recommendation. Who has taste similar to mine?

# Transpose so that users are rows, then compare every pair of users.
user_similarity = cosine_similarity(amazon_sparse.T)
user_sim_df = pd.DataFrame(user_similarity, index = amazon_normalized.columns, columns = amazon_normalized.columns)
user_sim_df.head()

def top_users(user):
    # Sort once by similarity to the given user; position 0 is the user itself.
    ranked = user_sim_df.sort_values(by=user, ascending=False)
    sim_values = ranked.loc[:, user].tolist()[1:11]
    sim_users = ranked.index[1:11]
    for other_user, sim in zip(sim_users, sim_values):
        print('User #{0}, Similarity value: {1:.2f}'.format(other_user, sim))

top_users('A140XH16IKR4B0')

These examples show how to implement item-based and user-based filtering for a recommendation system. Some of the code comes from https://www.kaggle.com/ajmichelutti/collaborative-filtering-on-anime-data

I hope you enjoyed it!

Translated from: https://medium.com/analytics-vidhya/recommend-amazon-movie-a-collaborative-approach-9b3db8f48ad6
