Data visualization is a key step in any data science project. During exploratory data analysis, visualizing data allows us to locate outliers and identify distributions, helping us control for possible biases in our data early on. Coupled with simple statistical tests, it can also answer many of our questions and help us prioritize areas to focus on.
Here, I will go through some exploratory data analysis and data visualization steps in Python using the Matplotlib and Seaborn libraries. The goal of the project is to analyze movie trends of the past decade in order to make suggestions for developing a new movie studio brand for a well-established corporation.
Approach
We explored the data with these two primary goals in mind.
Building a global brand: we don’t just make movies; we make good movies that appeal to a global audience.
Establishing a sustainable long-term plan: making a sustainable business plan, not just a movie production plan.
Data Structure

This is the basic structure of our cleaned pandas DataFrame. We sourced our data from The Movie Database (TMDB), IMDB, and The Numbers. I recommend using the TMDB API for the preliminary movie data.
Exploration
Initial Setup
Distribution of Gross Revenue
Let’s start by looking at the distribution of overall gross revenue, both domestic and worldwide. Seaborn’s distplot plots a histogram along with a KDE (kernel density estimate) curve.

We can see that the distribution is strongly right-skewed, which is a common pattern for income data. Taking the log transformation of this data can help us visualize what’s happening in the dense area more clearly.
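The effect of the log transformation can also be checked numerically. The sketch below uses synthetic log-normal values in place of the real revenue column and compares skewness before and after. The choice of base 10 is an assumption made for readability, not something the article specifies.

```python
import numpy as np
import pandas as pd

# Synthetic, right-skewed revenue values standing in for the real data.
rng = np.random.default_rng(42)
df = pd.DataFrame({"glob_gross": rng.lognormal(mean=17.5, sigma=1.3, size=1000)})

# With log10, a value of 6 means $1M and 9 means $1B.
df["log_glob_gross"] = np.log10(df["glob_gross"])

print("skewness before:", round(df["glob_gross"].skew(), 2))
print("skewness after: ", round(df["log_glob_gross"].skew(), 2))
```

Rows with zero revenue would need to be filtered out first, since the logarithm of zero is not finite.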

Not surprisingly, the global market seems to yield higher revenues on average. Let’s look at the relationship between budget and revenue.
Budget to Revenue
Now we want to use scatter plots to visualize the relationship between production budget and gross revenue, which are two continuous variables. There are many ways to achieve this. Here, I used overlaid scatter plots to look at global and domestic gross revenue together.
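One way to overlay the two series is to call `scatter` twice on the same Matplotlib axes. This is a sketch on synthetic data; `dom_gross` is a hypothetical column name, and the article's own plotting code may have differed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic budgets and revenues (not the article's data).
rng = np.random.default_rng(1)
budget = rng.uniform(1e6, 2e8, size=200)
df = pd.DataFrame({
    "budget": budget,
    "dom_gross": budget * rng.lognormal(0.0, 0.6, size=200),   # hypothetical column name
    "glob_gross": budget * rng.lognormal(0.7, 0.6, size=200),
})

fig, ax = plt.subplots(figsize=(8, 6))
# Two scatter calls on the same axes overlay the series; alpha keeps overlaps visible.
ax.scatter(df["budget"], df["glob_gross"], alpha=0.4, label="Global gross")
ax.scatter(df["budget"], df["dom_gross"], alpha=0.4, label="Domestic gross")
ax.set_xlabel("Production budget (USD)")
ax.set_ylabel("Gross revenue (USD)")
ax.legend()
```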

It seems that a high budget does not always lead to high revenue, especially in the domestic market. Also, some movies yield high revenues on relatively low budgets when they target the global market. Let’s take a closer look at which genres might yield the highest return on investment.
Distribution of Genre
We can look at the percentage of each genre in our dataset using a bar plot.
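A short sketch of the percentage calculation behind such a bar plot, using toy genre labels. The proportions are invented for illustration and are not the article's figures.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy genre labels; the counts are made up.
df = pd.DataFrame({"genre": ["Action"] * 3 + ["Drama"] * 4 + ["Horror"] * 2 + ["Animation"]})

# normalize=True returns fractions; multiply by 100 for percentages.
genre_pct = df["genre"].value_counts(normalize=True).mul(100).sort_values(ascending=False)

fig, ax = plt.subplots()
ax.bar(genre_pct.index, genre_pct.values)
ax.set_ylabel("Share of movies (%)")
```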

We see that about 30% of our data is action movies.
Revenue to Cost Ratio of Each Genre
Which genres have the highest return per investment?

Based on the ratio of global gross revenue to budget, horror films on average yield the highest return per investment. But this does not necessarily mean that horror movies bring in the most profit. Horror movies might cost less to make, thus yielding a higher percentage return on cost. We can compare the budgets of each genre using a box plot.
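The ratio and the budget comparison can be sketched as follows. The rows are toy values, not the article's data, and averaging the per-movie ratios is one assumed reading of "on average". The column name `gross_to_budget` is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy rows; the numbers are invented for illustration.
df = pd.DataFrame({
    "genre": ["Horror", "Horror", "Action", "Action", "Animation", "Animation"],
    "budget": [5e6, 10e6, 150e6, 200e6, 90e6, 120e6],
    "glob_gross": [40e6, 30e6, 450e6, 400e6, 500e6, 300e6],
})

# Revenue-to-cost ratio per movie, then the mean per genre.
df["gross_to_budget"] = df["glob_gross"] / df["budget"]
ratio_by_genre = df.groupby("genre")["gross_to_budget"].mean().sort_values(ascending=False)
print(ratio_by_genre)

# A box plot of budgets shows why a high ratio need not mean a high absolute return.
fig, ax = plt.subplots()
sns.boxplot(data=df, x="genre", y="budget", ax=ax)
```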
Average Production Budget of Each Genre

As we suspected, horror movies usually require only a small budget to start out. On the other hand, action, animation and some family films tend to have higher budgets. Which genre, then, yields the most profit? (Here I’m using the term “profit” loosely to mean global gross revenue minus the production budget. In reality, we cannot know the total cost of a movie’s production, distribution and marketing, so this measure cannot be fully validated.)
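Under the loose definition above, profit is a single column subtraction. The sketch below uses toy values, and `profit` is a hypothetical column name.

```python
import pandas as pd

# Toy rows; the numbers are invented for illustration.
df = pd.DataFrame({
    "genre": ["Horror", "Horror", "Action", "Action", "Animation", "Animation"],
    "budget": [5e6, 10e6, 150e6, 200e6, 90e6, 120e6],
    "glob_gross": [40e6, 30e6, 450e6, 400e6, 500e6, 300e6],
})

# "Profit" here is global gross minus production budget only.
df["profit"] = df["glob_gross"] - df["budget"]
profit_by_genre = df.groupby("genre")["profit"].mean().sort_values(ascending=False)
print(profit_by_genre)
```

With real data, the median may be a more robust summary than the mean, because a few blockbusters can dominate a genre's average.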
Profit of Each Genre

(The code is similar to the above.)
In fact, the genre that usually yields the highest profit is animation, followed by family and action. We can also look at the relationship between production budget and gross revenue for each genre by plotting a linear model plot.
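A linear model plot with one panel per genre can be sketched with Seaborn's `lmplot`. The data is synthetic, and the slopes and noise level are arbitrary assumptions chosen only to produce a visible trend.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic budget/revenue pairs per genre (not the article's data).
rng = np.random.default_rng(7)
frames = []
for genre, slope in [("Action", 3.0), ("Animation", 3.5), ("Horror", 2.0)]:
    budget = rng.uniform(5e6, 2e8, size=60)
    gross = slope * budget + rng.normal(0, 5e7, size=60)
    frames.append(pd.DataFrame({"genre": genre, "budget": budget, "glob_gross": gross}))
df = pd.concat(frames, ignore_index=True)

# One scatter-plus-regression panel per genre; col_wrap sets the panels per row.
grid = sns.lmplot(data=df, x="budget", y="glob_gross", col="genre", col_wrap=3, height=3.5)
grid.set_axis_labels("Production budget (USD)", "Global gross (USD)")
```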

Looking at the linear model plot, it’s clear that, with very few exceptions, horror movies are low-cost and do not bring in much revenue. Also, the high average profit for adventure films seems to come from a handful of rare successes. The feasible money-makers seem to be action and animation. Action shows a stronger correlation between budget and gross revenue, while animation seems to allow some big successes on relatively lower budgets.
We can simply compute correlations for each genre to confirm this.
for g in df['genre'].unique():
    # Restrict to one genre, then correlate budget with global gross.
    genre_df = df[df['genre'] == g]
    corr = genre_df['budget'].corr(genre_df['glob_gross'])
    print(f"{g}: {round(corr, 2)}")
# Action: 0.74
# Animation: 0.60
# Slightly higher correlation between global gross revenue and budget for action films.
But profit is not everything. As a brand-new studio, we want to build a reputation and raise our brand image to the level of established studio brands. This requires making reputable, award-worthy movies as well as popular movies that go viral. Let’s see which genres tend to earn this status.
Ratings

The majority of horror movies don’t get high average ratings on IMDB, while biography and drama films tend to do well. We should investigate which types of biography or drama films are worth investing in. On the other hand, the all-time winner seems to be animation, which often yields both high revenue and high ratings. The only downside is that award opportunities for animation are relatively slim.
Popularity

We can see that action, adventure and animation are the most popular genres, based on the TMDB popularity score, while comedy, horror and biography films tend to be less so. For building a global brand presence and high profit, action, adventure and animation are good areas to target. We will look at these three genres first.
Superhero Action Films
One thing that stood out in our dataset was that three of the five most profitable action movies were Marvel superhero movies. The superhero film market has skyrocketed in the past decade and will be a difficult wall for a new studio to break through, since most of these films are sequels backed by deep-rooted fandoms. So I decided to flag these superhero films based on the names of their writers and directors by adding a new column, ‘superhero’.
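A minimal sketch of such a flag, assuming the credits are stored as text in hypothetical `writer` and `director` columns. The titles and the name list are placeholders; the article does not give the actual names used.

```python
import re

import pandas as pd

# Toy rows; titles, credits and column names are hypothetical placeholders.
df = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "writer": ["Writer X, Jane Doe", "John Roe", "Jane Doe"],
    "director": ["Sam Poe", "Director Y", "Sam Poe"],
})

# Placeholder list of people associated with superhero films.
superhero_names = ["Writer X", "Director Y"]
pattern = "|".join(re.escape(name) for name in superhero_names)

# Flag a movie if any listed name appears among its writers or directors.
credits = df["writer"].fillna("") + "; " + df["director"].fillna("")
df["superhero"] = credits.str.contains(pattern, case=False, regex=True)
print(df[["title", "superhero"]])
```

Matching on names is a heuristic: it will also flag the non-superhero work of the same people, so a manual check of the flagged titles is worthwhile.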

A swarm plot is a good way to look at the distribution of a continuous value across two categorical variables. Here, we can see that a big chunk of the high-profit action movies are indeed superhero films. Also, although not depicted here, most of the successful non-superhero films are sequels (for both action and animation). It might be worthwhile to add a sequel flag as a feature for deeper analysis.
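A swarm plot of this shape can be sketched with Seaborn's `swarmplot`, placing genre on the x-axis and using the superhero flag as the hue. The data is synthetic, and `profit` and `film_type` are hypothetical column names.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic profits for two genres; 10 of the action films are flagged as superhero.
rng = np.random.default_rng(5)
n = 40
df = pd.DataFrame({
    "genre": ["Action"] * n + ["Animation"] * n,
    "profit": rng.normal(2e8, 1e8, size=2 * n),
    "superhero": [True] * 10 + [False] * (n - 10) + [False] * n,
})
# Readable legend labels instead of True/False.
df["film_type"] = np.where(df["superhero"], "Superhero", "Other")

fig, ax = plt.subplots(figsize=(8, 5))
# x is the first category, hue the second; each point is one movie.
sns.swarmplot(data=df, x="genre", y="profit", hue="film_type", ax=ax)
ax.set_ylabel("Profit (USD)")
```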

Action, Animation, Adventure

We can see here that animation on average tends to be more successful globally and domestically.

Award Winning Films
So far we have established that, given a high budget, animation is perhaps a less risky genre to invest in. But we also want to invest in non-animation films to improve our chances of winning awards and establishing our reputation. Earlier we saw that biography and drama films tend to be highly rated.

This plot shows that a higher rating is generally associated with higher profit, but not by much. Also, some drama films seem to follow a different trend. We should look further into the sub-genres of drama films.

A strip plot is a scatter plot for categorical values that adds a bit of horizontal jitter, making it easier to visualize the density of values. It’s hard to observe strong trends here, as there are too many categories and not enough observations, other than that many of the drama films have romance as a sub-genre.
A simple t-test showed a statistically significant difference in average IMDB rating between drama and biography films (p < 0.01), but no significant difference in profit or budget. So we should focus on making a biography film instead.
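A two-sample t-test of this kind can be sketched with `scipy.stats.ttest_ind`. The ratings below are synthetic, `rating` is a hypothetical column name, and the article does not say which variant of the test was used, so Welch's version (`equal_var=False`) is an assumption.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic ratings for two genres (not the article's data).
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "genre": ["Drama"] * 80 + ["Biography"] * 80,
    "rating": np.concatenate([rng.normal(6.4, 0.8, 80), rng.normal(7.2, 0.8, 80)]),
})

drama = df.loc[df["genre"] == "Drama", "rating"]
biography = df.loc[df["genre"] == "Biography", "rating"]

# Welch's t-test does not assume equal variances in the two groups.
t_stat, p_value = stats.ttest_ind(biography, drama, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

The same call can be repeated on the profit and budget columns to compare the two genres on those measures.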
Monthly Trend
Lastly, we used line plots to look at the best time to release a movie to maximize profit.
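A sketch of the monthly aggregation behind such a line plot, using toy release dates and revenues. `release_date` and `release_month` are hypothetical column names, and averaging revenue per month is an assumed choice of summary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy rows; dates and revenues are invented for illustration.
df = pd.DataFrame({
    "release_date": pd.to_datetime([
        "2015-01-16", "2015-05-01", "2016-05-06",
        "2016-06-17", "2017-12-15", "2018-12-21",
    ]),
    "glob_gross": [50e6, 900e6, 700e6, 600e6, 400e6, 300e6],
})

# Average revenue by calendar month, pooled across years; months with no releases stay NaN.
df["release_month"] = df["release_date"].dt.month
monthly = df.groupby("release_month")["glob_gross"].mean().reindex(range(1, 13))

fig, ax = plt.subplots()
ax.plot(monthly.index, monthly.values, marker="o")
ax.set_xlabel("Release month")
ax.set_ylabel("Mean global gross (USD)")
```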

Looking at the annual trend, we can see that movies released from April to June tend to yield the highest revenue. This would be a great time to release our globally appealing animation.
Highly acclaimed movies are released close to the end of the year, during “Oscar season”, to maximize their exposure to critics. We recommend releasing our award-worthy biography films during this time to elevate our brand to the level of other established studios.
Conclusion
We reviewed movie data from the past decade to propose a few recommendations and guidelines for starting a movie studio. Horror movies yield the highest percentage return per investment and require only a small budget to start out. But horror is not a good genre to start with, as it is usually neither popular nor highly rated, and does not bring in high revenue. To maximize profit and develop a global presence, we encourage investing in animation films. To target awards as well, and so elevate the brand’s reputation, we suggest making biography films. We suggest an annual plan that coordinates the production of two separate lines of films (profitable animation and award-worthy biography).
For a more in-depth look at the process, you can check out the GitHub page here. This project was done in collaboration with my colleague Paul Torres.
Translated from: https://medium.com/swlh/future-of-a-movie-studio-29a65fcf48c