Women only said 27% of the words in 2016’s biggest movies
by Amber Thomas
Movie trailers in 2016 promised viewers so many strong female characters. Jyn Erso. Dory. Harley Quinn. Judy Hopps. Wonder Woman. I felt like this could be the year for gender equality in Hollywood’s biggest films.
I was wrong.
And I don’t make this statement lightly.
As a scientist, I turn to data to answer questions I have about the world. And I’ve got the data to back up my claim. In fact, you can have the data, code, and resulting data visualization that I made trying to better understand this topic. But first, let me tell you how I became so interested.
It all started when I went to see Rogue One: A Star Wars Story. All promotional materials for the movie indicated that Jyn Erso (played by Felicity Jones) was the main character. I mean, just look at the poster.
When your picture is several times larger than everyone else’s, you’re probably the main character.
What I didn’t notice at first was that Jyn is the only woman on that poster.
I went into the movie theater expecting to see men and women fighting side by side. I left feeling certain that I could count every female character from the movie on one hand. While Jyn was the main character, I was profoundly aware that she was often the only woman in any scene.
It felt strangely familiar to have a lead female character be so outnumbered. Then I realized that Jyn and Princess Leia suffered the same inequality 39 years apart. I was overwhelmed with a need to know exactly how female representation in Star Wars movies has changed. But it seemed unfair to compare movies made today with movies made decades ago.
So instead, I decided to look for female equality across the Top 10 Worldwide Highest Grossing Films of 2016. They were:
Captain America: Civil War
Finding Dory
Zootopia
The Jungle Book
The Secret Life of Pets
Batman v Superman: Dawn of Justice
Rogue One: A Star Wars Story
Deadpool
Fantastic Beasts and Where to Find Them
Suicide Squad
With so many powerful women in these films, some of them must be gender-equal, right?
The Data
Now that I decided what I wanted to investigate, I needed to figure out how to do it. Similar data exploration projects have focused on dialogue or screen-time equality. Both seemed like good options, but I wanted the ability to report on equality at the movie and character level.
In the end, I decided to explore the movies’ dialogue. This choice gave me the ability to focus on characters with an active role in the story and to cut non-speaking characters from my analysis.
Luckily for me, dedicated movie fans often transcribe a movie’s dialogue and make it freely available online. If I couldn’t find a transcript, I used closed-caption files instead. For those, I re-watched the movie and manually assigned characters to their spoken lines.
This process was a labor of love. It was time-consuming, but I have no regrets.
Analysis
Once I had all of the transcripts, I just needed to read the .txt files into R and separate the characters from their lines. For the Rogue One transcript, that process looked like this:
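As a rough sketch of this step (not the original code), assume each transcript line has the form `NAME: dialogue`. The sample lines are placeholders, and the real transcripts may be laid out differently. In practice the lines would come from a file via `readLines()`.

```r
# Placeholder transcript lines; the layout "NAME: dialogue" is an assumption.
raw <- c(
  "JYN: This is a placeholder line.",
  "CASSIAN: Another placeholder.",
  "JYN: A line with a colon: like this."
)
# With a real file this would be something like: raw <- readLines("transcript.txt")

# Everything before the first colon is the speaker; the rest is the dialogue.
dialogue <- data.frame(
  Character = trimws(sub(":.*$", "", raw)),
  Words     = trimws(sub("^[^:]*:", "", raw)),
  stringsAsFactors = FALSE
)

print(dialogue)
```

Splitting only at the first colon keeps any later colons inside the spoken line intact.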
Now that I had a data frame with both Character and Words columns, I had to assign genders to each Character. To remain consistent with my categorizations, I came up with a few simple rules:
- When possible, assign gender according to the pronouns that other characters use. For example, if a character is referred to by others as “he” or “him”, then he is categorized as “male”.
- If no pronoun is used throughout the movie but the character is named or credited (on IMDb), use the gender of the actor or actress. Note that the gender of an actor or actress was assumed based on publicly available information as of January 2017.
- If no pronoun is used for the character and the character is not named or credited, refer to the closed captions. Sometimes they will identify the character that spoke.
- If all else fails, make an educated guess based on the character’s voice.
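The rules above form a fallback chain: use the first cue that is available, in priority order. A minimal sketch of that idea, assuming a hypothetical evidence table in which `NA` means a cue was unavailable (the column names and rows are illustrative, not taken from the original data):

```r
library(dplyr)

# Hypothetical evidence table; NA means that cue was not available.
evidence <- data.frame(
  Character      = c("Character A", "Character B", "Character C", "Character D"),
  pronoun_gender = c("female", NA, NA, NA),       # rule 1: pronouns used by others
  actor_gender   = c("male", "male", NA, NA),     # rule 2: credited actor or actress
  caption_gender = c(NA, NA, "female", NA),       # rule 3: inferred from closed captions
  voice_guess    = c("female", "male", "female", "male"),  # rule 4: educated guess
  stringsAsFactors = FALSE
)

# coalesce() returns the first non-missing value across its arguments,
# so earlier rules take priority over later ones.
labelled <- evidence %>%
  mutate(Gender = coalesce(pronoun_gender, actor_gender, caption_gender, voice_guess))

print(labelled[, c("Character", "Gender")])
```

Character A illustrates why the order matters: the pronoun cue wins even though the actor cue disagrees.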
I’ll be the first to say that these methods are not perfect. In fact, here are some caveats:
- If a male character was voiced by a woman (or vice versa) and other characters never referred to that character with pronouns, the character may be incorrectly labelled. (I don’t think this happened, but anything is possible.)
- Voices that are not associated with a physical embodiment of a character (e.g., the voice of a computer) were categorized according to the gender of their voice actor/actress.
I can never really know the gender of any character, but I’m using the cues and information that I have at my disposal.
Again, I am far from infallible, so if you caught a mistake on my part, please let me know.
So now I just needed to count the number of words spoken by each character. Again, I was able to do this in R using the dplyr and stringi packages.
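A minimal sketch of such a count, assuming a data frame with `Character`, `Gender`, and `Words` columns (the rows are placeholders, and this is not the original code). It uses `stri_count_words()` from stringi to count words per line and dplyr to total them.

```r
library(dplyr)
library(stringi)

# Placeholder rows; one row per spoken line.
dialogue <- data.frame(
  Character = c("JYN", "CASSIAN", "JYN", "STORMTROOPER"),
  Gender    = c("female", "male", "female", "male"),
  Words     = c("This is a placeholder line.", "Another placeholder.",
                "Two more words.", "Wait, stop!"),
  stringsAsFactors = FALSE
)

# Count words in each line, then total them per character.
word_counts <- dialogue %>%
  mutate(n_words = stri_count_words(Words)) %>%
  group_by(Character, Gender) %>%
  summarise(total_words = sum(n_words), .groups = "drop") %>%
  arrange(desc(total_words))

# Share of all words spoken by each gender.
gender_share <- word_counts %>%
  group_by(Gender) %>%
  summarise(words = sum(total_words)) %>%
  mutate(share = words / sum(words))

print(word_counts)
print(gender_share)
```

Grouping by both `Character` and `Gender` carries the gender label through to the per-character totals, which makes the gender-level share a one-step summary.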
It’s worth noting that I included every speaking character in this analysis. So yes, every stormtrooper who shouts a simple “Wait, stop!” before getting shot is included.
Data Visualization
I had my data. Unfortunately, tables upon tables of word counts and character names don’t give anyone much insight. Like any good data exploration project, it was time to visualize my results. I had to work through a few iterations before I found the best one.
Scatterplots and bar charts both masked characters with small roles.
A simple bubble chart was better but it became difficult to identify individual characters. It was also challenging to understand movie-level statistics.
In the end, I decided to learn enough d3.js to make an interactive graphic. Here, each bubble represents a character, and the bubble’s area is scaled based on the number of words spoken. Female and male bubbles can be separated for better insight. The stacked bars below indicate movie-level information.
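Scaling a bubble’s area (rather than its radius) by word count means the radius has to grow with the square root of the count. A minimal sketch with hypothetical word counts and an assumed maximum radius; the visualization’s own d3.js code is not shown here, though d3 offers a square-root scale (`d3.scaleSqrt`) for this purpose.

```r
# Hypothetical word counts for three characters.
words <- c(A = 100, B = 400, C = 900)

# Assumed radius (in pixels) for the character with the most words.
max_radius <- 30

# Area is proportional to radius squared, so radius scales with sqrt(words).
radius <- max_radius * sqrt(words / max(words))
area   <- pi * radius^2

print(radius)
print(area / area["A"])  # area ratios match the word-count ratios
```

Scaling the radius directly would exaggerate large roles: a character with 9 times the words would get 81 times the area.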
Go ahead, check out the full interactive version.
Interested in exploring the raw word-count data for yourself? I’ve made all of the data and code used to generate these visualizations open source. It’s available here:
ProQuestionAsker/2016MovieDialogue (github.com)
Takeaways
Ok, so the analysis is done. I’ve got a fancy (and fun-to-play-with) visualization. What did I find?
I recommend taking a quick second to look at something “a-Dory-ble” before going on, because this post is about to get real depressing real fast.
Aw, so cute. Feeling good?
All right, here we go.
This is a static version of what the visualization for all 10 movies looks like:
(If you’d like to check out the interactive visualization, go here.)
There are a couple of things here that I need to point out:
Not one of the top 10 movies of 2016 had a speaking cast that was 50% female.
Finding Dory was the closest to this level of equality with 43% female characters. To be equal, the movie would have needed 8 more female speaking roles.
Rogue One was the worst. Only 9% of its speaking characters were female. Of those 10 characters, 1 was a computer voice, 1 appeared on screen for no more than 5 seconds, and 1 was a CGI cameo that said 1 word.
Only 1 of 2016’s top 10 movies had at least 50% of its dialogue spoken by female characters.
Finding Dory comes out on top here too with 53% female dialogue. But, 76% of that dialogue came from Dory alone.
Trailing at the end was The Jungle Book with only 10% of its dialogue spoken by a female character. Keep in mind, this is after casting Scarlett Johansson as the voice of the historically-male snake, Kaa.
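The “8 more” figure for Finding Dory follows from simple arithmetic: with f female and m male speaking roles, adding x female roles reaches parity when f + x = m, so x = m - f. A minimal sketch, using hypothetical counts chosen only to be consistent with the reported 43% and 8 (the actual counts are not given here):

```r
# Extra female speaking roles needed for a 50/50 speaking cast,
# holding the number of male roles fixed.
roles_needed_for_parity <- function(female, male) {
  max(male - female, 0)
}

# Hypothetical counts, not taken from the data set.
female <- 25
male   <- 33

share_before <- female / (female + male)
extra        <- roles_needed_for_parity(female, male)
share_after  <- (female + extra) / (female + extra + male)

print(round(share_before, 2))
print(extra)
print(share_after)
```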
Here are a few more:
- Finding Dory and Zootopia were the only 2 movies in 2016’s top 10 in which a female character had the most dialogue.
- Female characters were outnumbered 5:1 in the final battle of Captain America: Civil War. Throughout the movie, they only contributed 16% of the dialogue.
- Batman spoke 2.4 times more than Superman and 6 times more than Wonder Woman in Batman v Superman.
- 78% of the female-spoken lines in Rogue One came from Jyn Erso.
- While Harley Quinn was a highly advertised character in Suicide Squad, she only spoke 42% as many words as Floyd/Deadshot (played by Will Smith). Notably, Amanda Waller (played by Viola Davis) spoke frequently, falling just 222 words (16%) short of Deadshot’s word count.
I started this project because I had a feeling that Rogue One’s cast and dialogue were not equally divided between male and female characters. I was shocked (and saddened) to find that almost none of the top 10 movies from last year were gender equal.
We can do better.
Added: If you’re looking for more studies and data explorations like this, check out:
Inequality in 800 popular films from 2007–2015 (includes gender, race/ethnicity, sexual orientation, and disability)
This exploration of 2,000 randomly selected movie scripts from the 1980s to the 2010s
This research on the 200 biggest movies from 2014 and 2015
Female representation in 2014’s biggest movies
This Twitter thread about gender equality in 2016’s animated films
TL;DR Version: Women represent (on average) 30–35% of speaking roles across each of these investigations.
Added: Have questions or comments about my methodology or conclusions? Check out my follow-up article featuring the most frequently asked questions.
I analyzed the dialogue in 2016’s biggest movies and it started a lot of conversations. (medium.com)
If you liked this article and want to see more like it, please click the green heart below and share away on your social media network of choice.
I am currently spending my time working on personal projects and data visualizations like this while I look for a data science job. So, if you have a fun project idea (or a job inquiry) you’d like to discuss with me, please reach out to me on Twitter or by email.
Thank you!
Translated from: https://www.freecodecamp.org/news/women-only-said-27-of-the-words-in-2016s-biggest-movies-955cb480c3c4/