Future of a Movie Studio

Data visualization is a key step in any data science project. During exploratory data analysis, visualizing data lets us locate outliers and identify distributions, helping us control for possible biases in our data early on. Coupled with simple statistical tests, it can also answer many questions and help us prioritize areas to focus on.

Here, I will go through some exploratory data analysis and data visualization steps in Python using the Matplotlib and Seaborn libraries. The goal of the project is to analyze movie trends of the past decade and make suggestions for developing a new movie studio brand for a well-established corporation.

Approach

We explored the data with these two primary goals in mind.

  1. Building a global brand: we don’t just make movies, we make good movies that appeal to a global audience.

  2. Establishing a sustainable long-term plan: making a sustainable business plan, not just a movie production plan.

Data Structure

Figure: Our data frame structure

This is the basic structure of our cleaned Pandas data frame. We sourced our data from The Movie Database (TMDB), IMDb, and The Numbers. I recommend using the TMDB API for the preliminary movie data.

Exploration

Initial Setup

Distribution of Gross Revenue

Let’s start by looking at the distribution of overall gross revenue, both domestic and worldwide. Seaborn’s distplot plots a histogram along with a KDE (kernel density estimate) plot.

We can see that the distribution is strongly right-skewed, which is a common pattern for income data. Taking a log transformation of this data helps us see more clearly what’s happening in the dense area.
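A minimal sketch of the log transformation, again on synthetic data. I use base-10 logs here so the axis reads in orders of magnitude; the choice of base is my assumption, not something the analysis depends on.

```python
import numpy as np
import pandas as pd

# Synthetic, right-skewed revenue values (made up for illustration).
rng = np.random.default_rng(0)
df = pd.DataFrame({"glob_gross": rng.lognormal(mean=17.8, sigma=1.3, size=500)})

# log10 compresses the long right tail; values must be strictly positive.
df["log_glob_gross"] = np.log10(df["glob_gross"])

# Skewness before and after the transformation.
print(df["glob_gross"].skew(), df["log_glob_gross"].skew())
```

If the data contains zero revenues, np.log1p is a common alternative, since the log of zero is undefined.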

Not surprisingly, the global market seems to yield higher revenues on average. Let’s look at the relationship between budget and revenue.

Budget to Revenue

Now we want to visualize the relationship between production budget and gross revenue, two continuous variables, using scatter plots. There are many ways to achieve this. Here, I overlaid scatter plots to look at the global and domestic gross revenues together.
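One simple way to overlay the two scatter plots is to draw both on the same Matplotlib axes. This is only a sketch on synthetic data; the dom_gross column name is an assumption.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic budgets and revenues (made up for illustration).
rng = np.random.default_rng(0)
budget = rng.uniform(1e6, 2e8, size=300)
df = pd.DataFrame({
    "budget": budget,
    "dom_gross": budget * rng.lognormal(0.0, 0.8, size=300),
    "glob_gross": budget * rng.lognormal(0.7, 0.8, size=300),
})

fig, ax = plt.subplots(figsize=(7, 5))
# Two scatter calls on the same axes produce the overlay.
ax.scatter(df["budget"], df["glob_gross"], alpha=0.4, label="Worldwide gross")
ax.scatter(df["budget"], df["dom_gross"], alpha=0.4, label="Domestic gross")
ax.set_xlabel("Production budget (USD)")
ax.set_ylabel("Gross revenue (USD)")
ax.legend()
# plt.show()  # uncomment to display in an interactive session
```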

It seems that a high budget does not always lead to high revenue, especially in the domestic market. Also, some movies yield high revenues on relatively low budgets when they target the global market. Let’s take a closer look at which genres might yield the highest return on investment.

Distribution of Genre

We can look at the percentage of each genre in our dataset using a bar plot.
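A small sketch of how the percentages could be computed and plotted with pandas. The toy genre counts are made up so that action comes out at 30%; they are not the real dataset.

```python
import pandas as pd

# Toy data: 10 films, 3 of them action (made up for illustration).
df = pd.DataFrame({
    "genre": ["Action"] * 3 + ["Drama"] * 2 + ["Animation"] * 2
             + ["Horror"] * 2 + ["Comedy"],
})

# normalize=True returns fractions; multiply by 100 for percentages.
genre_pct = df["genre"].value_counts(normalize=True).mul(100)

ax = genre_pct.plot(kind="bar")
ax.set_ylabel("Share of films (%)")
print(genre_pct.round(1))
```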

We see that about 30% of our data is action movies.

Revenue to Cost Ratio of Each Genre

Which genres have the highest return per investment?
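One way to answer this is to compute a revenue-to-budget ratio per film and average it by genre. The sketch below uses made-up numbers; the roi column name is my own label for the ratio.

```python
import pandas as pd

# Toy data (values are made up for illustration).
df = pd.DataFrame({
    "genre": ["Horror", "Horror", "Action", "Action", "Animation", "Animation"],
    "budget": [5e6, 10e6, 150e6, 200e6, 100e6, 120e6],
    "glob_gross": [40e6, 50e6, 450e6, 500e6, 500e6, 480e6],
})

# Ratio of global gross revenue to production budget, per film.
df["roi"] = df["glob_gross"] / df["budget"]

# Average ratio per genre, highest first.
roi_by_genre = df.groupby("genre")["roi"].mean().sort_values(ascending=False)
print(roi_by_genre)
```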

Based on the ratio of global gross revenue to budget, horror films on average yield the highest return on investment. But this does not necessarily mean that horror movies bring in the most profit. Horror movies might simply cost less to produce, which yields a higher percentage return per unit of cost. We can compare the budgets of each genre using a box plot.
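A box plot of budget by genre could be sketched with Seaborn as follows. The budgets are made up; the point is only the shape of the call.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Toy budgets per genre (values are made up for illustration).
df = pd.DataFrame({
    "genre": ["Horror"] * 4 + ["Action"] * 4 + ["Animation"] * 4,
    "budget": [3e6, 5e6, 10e6, 20e6,
               90e6, 150e6, 200e6, 250e6,
               80e6, 100e6, 120e6, 175e6],
})

fig, ax = plt.subplots(figsize=(7, 4))
# One box per genre, showing the median, quartiles and outliers of the budget.
sns.boxplot(data=df, x="genre", y="budget", ax=ax)
ax.set_ylabel("Production budget (USD)")

# The same comparison as numbers.
medians = df.groupby("genre")["budget"].median()
print(medians)
# plt.show()  # uncomment to display in an interactive session
```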

Average Production Budget of Each Genre

As we suspected, horror movies usually require only a small budget to start out. On the other hand, action, animation and some family films tend to have higher budgets. So which genres yield the most profit? (Here I’m using the term “profit” loosely to mean global gross revenue minus the production budget. In reality, we cannot know the total cost of a movie’s production, distribution and marketing, so we cannot fully validate this measure.)
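Using that loose definition, profit per genre can be sketched in a few lines of pandas. The numbers are made up for illustration.

```python
import pandas as pd

# Toy data (values are made up for illustration).
df = pd.DataFrame({
    "genre": ["Horror", "Horror", "Action", "Action", "Animation", "Animation"],
    "budget": [5e6, 10e6, 150e6, 200e6, 100e6, 120e6],
    "glob_gross": [40e6, 50e6, 450e6, 500e6, 500e6, 480e6],
})

# "Profit" here is global gross revenue minus production budget only.
df["profit"] = df["glob_gross"] - df["budget"]

# Average profit per genre, highest first.
profit_by_genre = df.groupby("genre")["profit"].mean().sort_values(ascending=False)
print(profit_by_genre)
```

In this toy data, horror has the best revenue-to-budget ratio but the lowest absolute profit, which is the distinction the paragraph above draws.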

Profit of Each Genre

(code is similar to above)

In fact, the genre that usually yields the highest profit is animation, followed by family and action. We can also look at the relationship between production budget and gross revenue for each genre by plotting a linear model plot.
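A linear model plot can be sketched with Seaborn’s lmplot, which fits one regression line per hue level. The data below is synthetic, and I turn off the confidence band (ci=None) only to keep the example fast.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic budgets and revenues for two genres (made up for illustration).
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "genre": np.repeat(["Action", "Animation"], n),
    "budget": rng.uniform(2e7, 2.5e8, size=2 * n),
})
noise = rng.normal(0, 8e7, size=2 * n)
df["glob_gross"] = 2.5 * df["budget"] + noise

# lmplot draws a scatter plot plus a fitted regression line per genre.
g = sns.lmplot(data=df, x="budget", y="glob_gross", hue="genre",
               ci=None, height=4, aspect=1.4)
g.set_axis_labels("Production budget (USD)", "Global gross revenue (USD)")
```

Passing col="genre" instead of hue="genre" would draw one panel per genre, which can be easier to read when there are many genres.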

Looking at the linear model plot, it’s clear that, with very few exceptions, horror movies are low-cost and do not bring in much revenue. Also, the high average profit for adventure films seems to come from a handful of rare successes. The feasible money-makers appear to be action and animation. Action shows a stronger correlation between budget and gross revenue, while animation seems to allow some big successes on relatively low budgets.

We can simply compute correlations for each genre to confirm this.

    for g in df['genre'].unique():
        genre_df = df[df['genre'] == g]  # rows for this genre only
        # Pearson correlation between budget and global gross revenue
        corr = genre_df['budget'].corr(genre_df['glob_gross'])
        print(f"{g}: {round(corr, 2)}")

    # Action: 0.74
    # Animation: 0.60
    # Slightly higher correlation between global gross revenue and budget for action films.

But profit is not everything. As a brand-new studio, we want to build a reputation and raise our brand image to the level of other established studio brands. This requires making reputable, award-worthy movies as well as popular movies that go viral. Let’s see which genres tend to earn this status.

Ratings

A majority of horror movies don’t get high average ratings on IMDb, while biography and drama films tend to do well. We should investigate which types of biography or drama films are worth investing in. On the other hand, the all-time winner seems to be animation, which often yields both high revenue and high ratings. The only downside is that award opportunities for animated films are relatively slim.

Popularity

We can see that action, adventure and animation are the most popular genres, based on the TMDB popularity score, while comedy, horror and biography films tend to be less so. For building a global brand presence and high profit, action, adventure and animation are good areas to target. We will look at these three genres first.

Superhero Action Films

One thing that stood out in our dataset was that 3 of the top 5 most profitable action movies were superhero movies from Marvel. The superhero film market has skyrocketed in the past decade and will be a difficult wall to break through as a new studio, since most of these films are sequels built on deep-rooted fandoms. So I decided to filter these superhero films based on the names of writers and directors by adding a new column, ‘superhero’.
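The flagging step might look roughly like the sketch below. Everything in it is hypothetical: the writers and directors column names, the film rows, and the short list of names to match against are placeholders, not the list actually used in the project.

```python
import re
import pandas as pd

# Toy data with hypothetical column names and made-up film rows.
df = pd.DataFrame({
    "title": ["Film A", "Film B", "Film C"],
    "writers": ["Stan Lee, Jack Kirby", "Jane Doe", "John Roe"],
    "directors": ["John Roe", "Jane Doe", "Sam Poe"],
})

# Hypothetical list of names associated with superhero films.
superhero_names = ["Stan Lee", "Jack Kirby"]
pattern = "|".join(re.escape(name) for name in superhero_names)

# Combine both credit columns, then flag rows that mention any listed name.
credits = df["writers"].fillna("") + ", " + df["directors"].fillna("")
df["superhero"] = credits.str.contains(pattern, case=False, regex=True)
print(df[["title", "superhero"]])
```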

A swarm plot is a good way to look at the distribution of a continuous variable across two categorical variables. Here, we can see that a big chunk of the high-profit action movies are indeed superhero films. Although it is not depicted here, most of the successful non-superhero films are sequels (for both action and animation). It might be worthwhile to add a sequel feature for deeper analysis.
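A swarm plot with one categorical axis and a second categorical variable as the color could be sketched like this. The data is synthetic, and the rule that only action films get the superhero flag is an assumption of the toy example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic profits for two genres (made up for illustration).
rng = np.random.default_rng(1)
n = 60
df = pd.DataFrame({
    "genre": rng.choice(["Action", "Animation"], size=n),
    "profit": rng.gamma(shape=2.0, scale=1.5e8, size=n),
})
# Toy assumption: only action films can carry the superhero flag.
df["superhero"] = (df["genre"] == "Action") & (rng.random(n) < 0.3)

fig, ax = plt.subplots(figsize=(7, 4))
# x is the first categorical variable, hue the second; y is continuous.
sns.swarmplot(data=df, x="genre", y="profit", hue="superhero", ax=ax)
ax.set_ylabel("Profit (USD)")
# plt.show()  # uncomment to display in an interactive session
```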

Action, Animation, Adventure

We can see here that animation on average tends to be more successful globally and domestically.

Award Winning Films

So far we have established that, given a high budget, animation is perhaps a less risky genre to invest in. But we also want to invest in non-animation films to improve our chances of winning awards and establishing a reputation. Earlier we saw that biography and drama films tend to be rated highly.

This plot shows that a higher rating is generally associated with higher profit, but not by much. Also, some drama films seem to follow a different trend. We should look more closely at the sub-genres of drama films.

A strip plot is a scatter plot for categorical values that adds a bit of horizontal jitter, making it easier to see the density of values. It’s hard to observe strong trends here, as there are too many categories and not enough observations, other than that many of the drama films have a sub-genre of romance.
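For reference, a strip plot with explicit jitter can be sketched as below. The sub_genre and rating column names and all values are assumptions made for the example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic ratings for drama sub-genres (made up for illustration).
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "sub_genre": rng.choice(["Romance", "Crime", "History", "War"],
                            size=80, p=[0.4, 0.2, 0.2, 0.2]),
    "rating": rng.normal(6.8, 0.7, size=80).clip(1, 10),
})

fig, ax = plt.subplots(figsize=(7, 4))
# jitter spreads points horizontally so overlapping ratings stay visible.
sns.stripplot(data=df, x="sub_genre", y="rating", jitter=0.25, alpha=0.6, ax=ax)
ax.set_ylabel("IMDb rating")
# plt.show()  # uncomment to display in an interactive session
```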

A simple t-test showed a statistically significant difference in average IMDb rating between drama and biography films (p < 0.01), but not in profit or budget. So we should focus on making a biography film instead.
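A two-sample t-test of this kind can be run with SciPy. The sketch below uses Welch’s variant (equal_var=False), which is my assumption since the text only says “simple t-test”. The ratings are synthetic, so the direction and size of the difference here are made up.

```python
import numpy as np
from scipy import stats

# Synthetic ratings for the two genres (made up for illustration).
rng = np.random.default_rng(42)
biography = rng.normal(loc=7.2, scale=0.5, size=100)
drama = rng.normal(loc=6.6, scale=0.6, size=100)

# Welch's t-test: compares the two means without assuming equal variances.
t_stat, p_value = stats.ttest_ind(biography, drama, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```

The same call can be repeated on the profit and budget columns to check whether those differ between the two genres.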

Monthly Trend

Lastly, we used line plots to look at the best time to release a movie in order to maximize profit.
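A monthly trend line can be built by grouping on the release month and plotting the average. The release_date column name and the six films below are made up for the sketch.

```python
import pandas as pd

# Toy release dates and revenues (values are made up for illustration).
df = pd.DataFrame({
    "release_date": pd.to_datetime([
        "2015-01-16", "2015-05-01", "2016-05-06",
        "2016-06-17", "2017-11-22", "2017-12-15",
    ]),
    "glob_gross": [50e6, 900e6, 700e6, 600e6, 300e6, 400e6],
})

# Average revenue per calendar month; reindex so all 12 months appear.
df["release_month"] = df["release_date"].dt.month
monthly = (df.groupby("release_month")["glob_gross"]
             .mean()
             .reindex(range(1, 13)))

ax = monthly.plot(marker="o")
ax.set_xlabel("Release month")
ax.set_ylabel("Average global gross (USD)")
print(monthly.idxmax())
```

Months with no releases stay as gaps (NaN) rather than being drawn as zero revenue.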

Looking at the annual trend, we can see that movies released from April to June tend to yield the highest revenue. This would be a great time to release our globally appealing animation.

Highly acclaimed movies are released close to the end of the year, during “Oscar season”, to maximize their exposure to critics. We recommend releasing our award-worthy biography films during this time to elevate our brand to the level of other established studios.

Conclusion

We reviewed movie data from the past decade to propose a few recommendations and guidelines for starting a movie studio. Horror movies yield the highest percentage return on investment and require only a small budget to start out. But horror is not a good genre to start with, as it is usually not popular or highly rated and does not bring in high revenue. To maximize profit and develop a global presence, we encourage investing in animated films. To target awards and elevate the brand’s reputation, we suggest making biography films. We also suggest an annual plan that synergizes production of these two separate lines of films (profitable animation and award-worthy biography).

For a more in-depth look at the process, you can check out the GitHub page here. This project was done in collaboration with my colleague Paul Torres.

Original article: https://medium.com/swlh/future-of-a-movie-studio-29a65fcf48c
