Spotify data analysis: analysis of my Spotify streaming history

Spotisis /spo-ti-sis/ noun The analysis of one’s Spotify streaming history using Python.

I was reading through a lot of data-science guides and project ideas when I came across an article in which the author compared his song choices with his friend’s. I wanted to do something similar, so I set out to analyse my own streaming history and compare it with what the world listens to.

Through this, I aim to find out more about my music preferences and how they differ from the world’s general picks.

I never really put much thought into my music preferences before this project. They always depended on my mood, and when someone asked me what type of music I like, I had no answer, because it varied from one hour to the next.

I’ve split this project into 2 sections:

Part A is the analysis of my music streaming history.

  • Timeline of my streaming history
  • Day preference
  • Favorite artist
  • Favorite songs
  • Spirit of the songs
  • Diversity

Part B is the comparison of my 50 most-streamed songs with the top 50 songs streamed in 2019.

The data

Spotify allows every user to request a download of all their streaming history, so Part A depends entirely on that. Spotify also has an amazing Developer Platform through which the public can use the available data for their own projects. Along with my personal data, I used the audio features option, which breaks down a song and gives it a ‘score’ for a number of different attributes. The attributes are as follows:

  • Acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence that the track is acoustic.

  • Danceability: A description of how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

  • Energy: A measure from 0.0 to 1.0 that represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy.

  • Instrumentalness: Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. The closer the instrumentalness value is to 1.0, the greater the likelihood that the track contains no vocal content.

  • Liveness: Detects the presence of an audience in the recording.

  • Loudness: The overall loudness of a track in decibels (dB). Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.

  • Speechiness: Detects the presence of spoken words in a track.

  • Valence: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track.

  • Tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.

  • Mode: Indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor by 0.

  • Key: The estimated overall key of the track.

The dataset was a little messy, so I used Pandas to clean it up to suit the needs of each section. The entire code can be found via the GitHub link at the end of this article.
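
As a rough illustration of that clean-up step, here is a minimal sketch. It assumes the export's records carry the fields endTime, artistName, trackName and msPlayed; the sample rows and the clean_history helper are hypothetical and are not taken from the original code.

```python
import pandas as pd

# Hypothetical sample rows; a real export would be loaded from the downloaded JSON files.
sample_records = [
    {"endTime": "2020-02-24 08:15", "artistName": "Lauv", "trackName": "Song A", "msPlayed": 204000},
    {"endTime": "2020-02-24 08:15", "artistName": "Lauv", "trackName": "Song A", "msPlayed": 204000},  # exact duplicate
    {"endTime": "2019-06-28 17:40", "artistName": "Maroon 5", "trackName": "Memories", "msPlayed": 189000},
]


def clean_history(records):
    """Drop exact duplicates, parse timestamps and convert milliseconds to minutes."""
    df = pd.DataFrame(records).drop_duplicates()
    df["endTime"] = pd.to_datetime(df["endTime"])
    df["minutesPlayed"] = df["msPlayed"] / 60_000
    # Sort chronologically so later time-based steps see the plays in order.
    return df.sort_values("endTime").reset_index(drop=True)


history = clean_history(sample_records)
print(history)
```

Keeping the minutes in their own column leaves the raw msPlayed values untouched, which makes it easy to apply different filters per section later on.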

For Part B, I used this dataset from Kaggle.

Before we begin, I just want to say something… Don’t come at me for my music choice!

Part A

1. Timeline of my streaming history

I know that I spend a lot of time listening to music, but I didn’t know I spent that much time! The data dates back to late June of 2019 and varies a lot from day to day.

Image for post

On February 24th 2020, I spent a staggering 535 minutes (almost 9 hours) on Spotify, the most in the past year! There’s no definite answer as to why the gap between the highest and lowest daily values (the lowest was a matter of seconds) is so large, but I did sign up for Spotify Premium around that time, so maybe that was the reason? Push the promos harder, you guys ;)
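
A daily timeline like this can be built by summing the play time per calendar day. The sketch below uses made-up plays, and the column names endTime and msPlayed are assumptions about the export format.

```python
import pandas as pd

# Made-up plays for illustration; not the real listening data.
plays = pd.DataFrame({
    "endTime": pd.to_datetime(["2020-02-23 21:00", "2020-02-24 09:00", "2020-02-24 18:30"]),
    "msPlayed": [180_000, 240_000, 120_000],
})

# Sum the milliseconds played per calendar day, then convert to minutes.
daily_minutes = plays.set_index("endTime")["msPlayed"].resample("D").sum() / 60_000

# The day with the highest total corresponds to the peak of the timeline.
busiest_day = daily_minutes.idxmax()
print(busiest_day.date(), daily_minutes.max())
```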

2. Day preference

Does the day of the week affect how long I spend listening to music?

Image for post

I usually listen to music while walking to and from college, so I would have predicted that I’d spend more time listening on weekdays. But Sunday is my chill day, so it makes sense that it was when I spent the most time listening to music.
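
One way to get the per-weekday totals is to group the plays by the weekday name of their timestamp. This is a sketch with made-up plays; endTime and msPlayed are assumed column names.

```python
import pandas as pd

# Made-up plays: two on a Sunday (2020-02-23) and one on a Monday (2020-02-24).
plays = pd.DataFrame({
    "endTime": pd.to_datetime(["2020-02-23 10:00", "2020-02-23 20:00", "2020-02-24 08:30"]),
    "msPlayed": [300_000, 240_000, 180_000],
})

plays["weekday"] = plays["endTime"].dt.day_name()
minutes_by_weekday = plays.groupby("weekday")["msPlayed"].sum() / 60_000

# Put the days in calendar order and show days without plays as zero.
week_order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
minutes_by_weekday = minutes_by_weekday.reindex(week_order, fill_value=0.0)
print(minutes_by_weekday)
```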

3. Favorite artists

Do I have a favorite artist?

According to the data, I actually do. There were two factors I considered: the number of times I played an artist’s songs and the total amount of time I spent listening to them.

When looking through the data, I found that some of the songs were played for only a few seconds, which reduced the accuracy of the results.
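
The article does not say how those very short plays were handled. One possible approach, sketched below, is to drop plays under a cutoff before ranking artists by both factors. The 30-second threshold, the sample rows and the column names are all assumptions made for illustration.

```python
import pandas as pd

# Made-up plays; the 5-second Lauv entry stands in for a skipped song.
plays = pd.DataFrame({
    "artistName": ["Lauv", "Lauv", "Maroon 5", "Shawn Mendes", "Lauv"],
    "msPlayed": [200_000, 5_000, 184_000, 210_000, 190_000],
})

MIN_MS = 30_000  # assumed cutoff: ignore plays shorter than 30 seconds
valid = plays[plays["msPlayed"] >= MIN_MS]

# Factor 1: number of plays per artist.
by_count = valid["artistName"].value_counts().head(15)

# Factor 2: total minutes per artist.
by_minutes = (
    (valid.groupby("artistName")["msPlayed"].sum() / 60_000)
    .sort_values(ascending=False)
    .head(15)
)
print(by_count)
print(by_minutes)
```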

The graphs below show the top 15 artists under both categories.

Image for post
Image for post

Lauv, Shawn Mendes, One Direction and Justin Bieber held the top 4 positions in both graphs, whereas the rest appeared in a different order.

4. Which songs were played most?

Were they by the same 15 artists?

Yes, they were, and Lauv took 5 of the 15 spots!

Image for post

I realised that some of the top 15 artists (based on the amount of time spent listening to their songs) were on the list because of one or two songs that I played on repeat.

For example, Memories by Maroon 5 was the most played song (played for a total of 184 minutes). Compared with the total time I spent listening to the group (430 minutes), the difference was about 246 minutes. In percentage terms, more than 40% of the time I spent listening to Maroon 5 went to Memories alone.
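
The arithmetic behind those figures, using only the two numbers quoted above (the variable names are just for illustration):

```python
memories_minutes = 184  # total minutes spent on Memories
maroon5_minutes = 430   # total minutes spent on Maroon 5 overall

# Minutes left for every other Maroon 5 song.
other_songs_minutes = maroon5_minutes - memories_minutes

# Share of the Maroon 5 listening time that went to Memories.
memories_share = memories_minutes / maroon5_minutes * 100

print(f"{other_songs_minutes} minutes on other songs, {memories_share:.1f}% on Memories")
# prints: 246 minutes on other songs, 42.8% on Memories
```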

It’s a good song. Admit it.

5. Spirit of the song

Do I listen to positive songs?

Using the valence attribute from Spotify’s audio features, I tried to find out the general spirit of the top 50 songs I listen to. Valence ranges from 0 to 1, with 1 meaning the track conveys the most positiveness.

For the sake of classification:

  • low spirit: 0 ≤ valence < 0.5
  • neutral: 0.5 ≤ valence < 0.6
  • high spirit: 0.6 ≤ valence ≤ 1
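
These thresholds translate directly into a small helper. The function name spirit is hypothetical; the boundaries are the ones listed above.

```python
def spirit(valence):
    """Classify a valence score (0 to 1) into the three 'spirit' buckets."""
    if not 0.0 <= valence <= 1.0:
        raise ValueError("valence must be between 0 and 1")
    if valence < 0.5:
        return "low spirit"
    if valence < 0.6:
        return "neutral"
    return "high spirit"


# A valence of 0.18 falls in the lowest bucket.
print(spirit(0.18))
```

Note how narrow the neutral band is (0.5 up to 0.6) compared with the low-spirit band (0 up to 0.5), so most of the scale maps to "low spirit" by construction.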

(I named it ‘spirit’ because ‘positive’ and ‘negative’ didn’t feel right.)

I was pretty unsure about this one and was utterly surprised by the results.

Image for post

So I listen to more low-spirit songs?? That doesn’t make sense!

When I cross-referenced the song names with their valence scores, I realised that this may not have been the most accurate representation. Ed Sheeran’s Photograph had a valence of 0.18, so it was categorised as ‘low spirit’. Although it’s not a super high-spirited song, it’s not that low either!

6. Diversity of songs

How do the audio features of the songs compare to one another?

The spirit of the songs made me curious about how the songs varied from one another in terms of their audio features, so I compared my top 3 most played songs. I believe that my song choices are highly diverse.

Those who are familiar with these songs know just how much they vary from one another. They give off such different vibes, but I needed the data to prove it.

Image for post

There is A LOT of difference, most noticeable in the loudness and acousticness attributes.
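
One caveat when reading such a comparison: loudness is measured in decibels (typically -60 to 0 dB), while most other attributes run from 0 to 1. The article does not say whether loudness was rescaled before plotting; the sketch below shows one possible min-max rescaling, with a hypothetical helper name and an assumed floor of -60 dB.

```python
def scale_loudness(loudness_db, floor_db=-60.0):
    """Map a loudness in dB from [floor_db, 0] onto [0, 1], clipping values outside that range."""
    scaled = (loudness_db - floor_db) / (0.0 - floor_db)
    return min(1.0, max(0.0, scaled))


# Illustrative inputs only: the quietest typical value, a fairly loud track, and the ceiling.
for db in (-60.0, -6.0, 0.0):
    print(db, scale_loudness(db))
```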

The next part builds on this diversity.

Part B

Is my music too diverse? How does it fare when compared to the global top 50?

Image for post

Apart from the mode, everything is different! I prefer less groovy, instrumental-based songs with lower energy levels, while the global hits suggest people lean towards fast-paced, energetic songs that they can dance to.
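
A comparison like this boils down to averaging each audio feature over both song lists and looking at the gap. The sketch below uses placeholder numbers only (they are NOT the article's data), and the two small tables stand in for the personal top 50 and the Kaggle top 50.

```python
import pandas as pd

FEATURES = ["danceability", "energy", "instrumentalness", "valence"]

# Placeholder values for illustration; two rows stand in for 50 songs each.
my_top = pd.DataFrame({
    "danceability": [0.50, 0.60],
    "energy": [0.40, 0.50],
    "instrumentalness": [0.10, 0.30],
    "valence": [0.30, 0.40],
})
global_top = pd.DataFrame({
    "danceability": [0.70, 0.80],
    "energy": [0.60, 0.70],
    "instrumentalness": [0.00, 0.02],
    "valence": [0.50, 0.60],
})

# Average every feature per list and put the two side by side.
comparison = pd.DataFrame({
    "mine": my_top[FEATURES].mean(),
    "global": global_top[FEATURES].mean(),
})
comparison["difference"] = comparison["mine"] - comparison["global"]
print(comparison)
```

A negative difference means the personal list scores lower on that feature than the global list.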

Image for post

The difference between my music’s average tempo and the global average is 4 BPM (beats per minute). According to research, songs at 120 BPM are considered fast paced. My preference seems to be a little slower, though not by much.

Conclusion

This project was a blast to do. I thoroughly enjoyed learning more about my music preferences and comparing them with the global hits. Now that I am backed by the data, I can say that my music is highly diversified and that I do have a favourite artist: Lauv (considering the amount of time I’ve spent listening to his songs, it wouldn’t be justified to say otherwise!).

Following this article, I would like to continue by applying some machine learning knowledge to create a recommender system based on my music preferences.

Feel free to comment and view the entire code on my GitHub!

Big thanks to Vlad Gheorghe for his brilliant explanation (a huge lifesaver!)

Translated from: https://medium.com/swlh/analysis-of-my-spotify-streaming-history-57a6088c3d3
