Women only said 27% of the words in 2016’s biggest movies

by Amber Thomas

Movie trailers in 2016 promised viewers so many strong female characters. Jyn Erso. Dory. Harley Quinn. Judy Hopps. Wonder Woman. I felt like this could be the year for gender equality in Hollywood’s biggest films.

I was wrong.

And I don’t make this statement lightly.

As a scientist, I turn to data to answer questions I have about the world. And I’ve got the data to back up my claim. In fact, you can have the data, code, and resulting data visualization that I made trying to better understand this topic. But first, let me tell you how I became so interested.

It all started when I went to see Rogue One: A Star Wars Story. All promotional materials for the movie indicated that Jyn Erso (played by Felicity Jones) was the main character. I mean, just look at the poster.

When your picture is several times larger than everyone else’s, you’re probably the main character.

What I didn’t notice at first was that Jyn is the only woman on that poster.

I went into the movie theater expecting to see men and women fighting side by side. I left feeling certain that I could count every female character from the movie on one hand. While Jyn was the main character, I was profoundly aware that she was often the only woman in any scene.

It felt strangely familiar to have a lead female character be so outnumbered. Then I realized that Jyn and Princess Leia suffered the same inequality 39 years apart. I was overwhelmed with a need to know exactly how female representation in Star Wars movies has changed. But it seemed unfair to compare movies made today with movies made decades ago.

So instead, I decided to look for female equality across the Top 10 Worldwide Highest Grossing Films of 2016. They were:

  • Captain America: Civil War
  • Finding Dory
  • Zootopia
  • The Jungle Book
  • The Secret Life of Pets
  • Batman V. Superman: Dawn of Justice
  • Rogue One: A Star Wars Story
  • Deadpool
  • Fantastic Beasts and Where to Find Them
  • Suicide Squad

With so many powerful women in these films, some of them must be gender-equal, right?

The Data

Now that I had decided what I wanted to investigate, I needed to figure out how to do it. Similar data exploration projects have focused on dialogue or screen-time equality. Both seemed like good options, but I wanted the ability to report on equality at the movie and character level.

In the end, I decided to explore the movies’ dialogue. This choice gave me the ability to focus on characters with an active role in the story and to cut non-speaking characters from my analysis.

Luckily for me, dedicated movie fans often transcribe a movie’s dialogue and make it freely available online. If I couldn’t find a transcript, I used closed-caption files instead. For those, I re-watched the movie and manually assigned characters to their spoken lines.

This process was a labor of love. It was time-consuming, but I have no regrets.

Analysis

Once I had all of the transcripts, I just needed to read the .txt files into R and separate the characters from their lines. For the Rogue One transcript, that process looked like this:

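The R snippet itself does not appear in this copy of the article. As a stand-in, here is a minimal Python sketch of the same idea: read transcript text and split each line into a character and that character's words. It is not the author's code. The `NAME: spoken words` layout, the `parse_transcript` helper, and the sample lines are all assumptions made for illustration; real transcripts vary and usually need per-movie cleanup.

```python
import re

# Assumed (hypothetical) transcript layout: "CHARACTER: words they say".
LINE_PATTERN = re.compile(r"^\s*([^:]+?)\s*:\s*(.+)$")


def parse_transcript(text):
    """Split transcript text into rows with Character and Words fields."""
    rows = []
    for raw_line in text.splitlines():
        match = LINE_PATTERN.match(raw_line)
        if match is None:
            continue  # skip blank lines and lines with no "NAME:" prefix
        character, words = match.groups()
        rows.append({"Character": character.upper(), "Words": words.strip()})
    return rows


# Placeholder dialogue, not taken from any real movie transcript.
sample = "ALICE: Hello there.\n\nBOB: Hi, how are you?\n"
rows = parse_transcript(sample)
```

Each row mirrors the Character and Words columns described in the text, so later steps can group by character.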

Now that I had a data frame with both Character and Words columns, I had to assign genders to each Character. To remain consistent with my categorizations, I came up with a few simple rules:

  1. When possible, assign gender according to the pronouns that other characters use. For example, if a character is referred to by others as “he” or “him”, then he is categorized as “male”.
  2. If there is no pronoun used throughout the movie but the character is named or credited (on IMDB), use the gender of the actor or actress. Note that the gender of an actor or actress was assumed based on publicly available information as of January 2017.
  3. If no pronoun is used for the character and the character is not named or credited, refer to the closed captions. Sometimes they will identify the character that spoke.
  4. If all else fails, make an educated guess based on the character’s voice.

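The four rules form a priority order: use the first source of evidence that exists and fall through to the next one otherwise. A small Python sketch of that cascade follows. The function name and its parameters are hypothetical, and it assumes the evidence for each rule has already been collected by hand, since the article describes a manual process.

```python
def assign_gender(pronoun_gender=None, actor_gender=None,
                  caption_gender=None, voice_guess=None):
    """Return (gender, rule_number) from the first rule that has evidence."""
    evidence = [
        (1, pronoun_gender),  # rule 1: pronouns other characters use
        (2, actor_gender),    # rule 2: gender of the credited actor or actress
        (3, caption_gender),  # rule 3: speaker identified via closed captions
        (4, voice_guess),     # rule 4: educated guess from the voice
    ]
    for rule, value in evidence:
        if value is not None:
            return value, rule
    return "unknown", None  # no evidence at all


# Pronouns take priority even when the voice actor's gender differs.
label, rule_used = assign_gender(pronoun_gender="male", actor_gender="female")
```

Returning the rule number alongside the label keeps a record of how each character was categorized, which makes later corrections easier.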

I’ll be the first to say that these methods are not perfect. In fact, here are some caveats:

  1. If a male character was voiced by an actress (or vice versa) and other characters never referred to the character with pronouns, the character may be incorrectly labelled. (I don’t think this happened, but anything is possible.)
  2. Voices that are not associated with a physical embodiment of a character (e.g., the voice of a computer) were categorized according to the gender of their voice actor/actress.
  3. I can never really know the gender of any character, but I’m using the cues and information that I have at my disposal.

Again, I am far from infallible, so if you caught a mistake on my part, please let me know.

So now I just needed to count the number of words spoken by each character. Again, I was able to do this in R using the dplyr and stringi packages.

It’s worth noting that I included every speaking character in this analysis. So yes, every stormtrooper who shouts a simple “Wait, stop!” before getting shot is included.

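The counting step can be sketched in Python as well. This is an illustration rather than the author's dplyr and stringi code: it assumes rows with Character and Words fields and treats a word as any whitespace-separated token, which may differ slightly from how the original analysis tokenized text. The `count_words` helper and the ALICE rows are hypothetical.

```python
from collections import defaultdict


def count_words(rows):
    """Sum the number of whitespace-separated words spoken per character."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["Character"]] += len(row["Words"].split())
    return dict(totals)


# Hypothetical rows; every speaking character counts, however small the part.
sample_rows = [
    {"Character": "ALICE", "Words": "Hello there."},
    {"Character": "STORMTROOPER", "Words": "Wait, stop!"},
    {"Character": "ALICE", "Words": "How are you?"},
]
word_counts = count_words(sample_rows)
```

Here the two-word stormtrooper line still produces an entry, matching the decision to include every speaking character.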

Data Visualization

I had my data. Unfortunately, tables upon tables of word counts and character names don’t give anyone much insight. As in any good data exploration project, it was time to visualize my results. I had to work through a few iterations before I found the best one.

Scatterplots and bar charts both masked characters with small roles.

A simple bubble chart was better, but it became difficult to identify individual characters. It was also challenging to understand movie-level statistics.

In the end, I decided to learn enough d3.js to make an interactive graphic. Here, each bubble represents a character, and the bubble’s area is scaled based on the number of words spoken. Female and male bubbles can be separated for better insight. The stacked bars below indicate movie-level information.

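Scaling a bubble's area (rather than its radius) to the word count means the radius has to grow with the square root of the count. A minimal Python sketch of that relationship is below. The `bubble_radius` helper and its `area_per_word` parameter are hypothetical and only illustrate the math, not the d3.js code behind the graphic.

```python
import math


def bubble_radius(word_count, area_per_word=1.0):
    """Radius of a circle whose area equals word_count * area_per_word."""
    return math.sqrt(word_count * area_per_word / math.pi)


# Four times the words gives four times the area, so only twice the radius.
small = bubble_radius(100)
large = bubble_radius(400)
ratio = large / small
```

If the radius were scaled directly instead, a character with four times the words would get a bubble with sixteen times the area, which would exaggerate the gap.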

Go ahead, check out the full interactive version.

Interested in exploring the raw word-count data for yourself? I’ve made all of the data and code used to generate these visualizations open source. It’s available here:

ProQuestionAsker/2016MovieDialogue (github.com)

Takeaways

Ok, so the analysis is done. I’ve got a fancy (and fun-to-play-with) visualization. What did I find?

I recommend taking a quick second to look at something “a-Dory-ble” before going on, because this post is about to get real depressing real fast.

Aw, so cute. Feeling good?

All right, here we go.

This is a static version of what the visualization for all 10 movies looks like:

(If you’d like to check out the interactive visualization, go here.)

There are a couple of things here that I need to point out:

Not one of 2016’s top 10 movies had a speaking cast that was 50% female.

Finding Dory was the closest to this level of equality, with 43% female characters. To be equal, the movie would have needed 8 more female speaking roles.

Rogue One was the worst. Only 9% of its speaking characters were female. Of those 10 characters, 1 was a computer voice, 1 appeared on screen for no more than 5 seconds, and 1 was a CGI cameo that said 1 word.

Only 1 of 2016’s top 10 movies had at least 50% of its dialogue spoken by female characters.

Finding Dory comes out on top here too, with 53% female dialogue. But 76% of that dialogue came from Dory alone.

Trailing at the end was The Jungle Book with only 10% of its dialogue spoken by a female character. Keep in mind, this is after casting Scarlett Johansson as the voice of the historically-male snake, Kaa.

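The movie-level figures above are simple ratios over the per-character table: the share of speaking characters who are female, the share of all words they speak, and how many more female roles would balance the cast. The Python sketch below shows the arithmetic on a made-up four-person cast. The `female_shares` helper and its data are hypothetical, and it assumes every character is labeled either "female" or "male".

```python
def female_shares(characters):
    """characters: list of (name, gender, word_count) tuples."""
    speaking = [c for c in characters if c[2] > 0]
    female = [c for c in speaking if c[1] == "female"]
    total_words = sum(c[2] for c in speaking)
    female_words = sum(c[2] for c in female)
    return {
        "character_share": len(female) / len(speaking),
        "word_share": female_words / total_words,
        # extra female roles needed to match the number of male roles
        "roles_to_parity": max(0, len(speaking) - 2 * len(female)),
    }


# Made-up cast for illustration only; not data from any of the ten movies.
cast = [("A", "female", 300), ("B", "male", 500),
        ("C", "male", 150), ("D", "male", 50)]
shares = female_shares(cast)
```

In this toy cast, one of four speaking characters is female (25%), she speaks 30% of the words, and two more female roles would be needed to match the three male ones.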

Here are a few more:

  • Finding Dory and Zootopia were the only 2 movies in 2016’s top 10 in which a female character had the most dialogue.
  • Female characters were outnumbered in Captain America: Civil War’s final battle 5:1. Throughout the movie, they only contributed 16% of the dialogue.
  • Batman spoke 2.4 times more than Superman and 6 times more than Wonder Woman in Batman V. Superman.
  • 78% of the female-spoken lines in Rogue One came from Jyn Erso.
  • While Harley Quinn was a highly advertised character in Suicide Squad, she only spoke 42% as many words as Floyd/Deadshot (played by Will Smith). Notably, Amanda Waller (played by Viola Davis) spoke frequently, totaling just 222 words (16%) short of Deadshot’s word count.

I started this project because I had a feeling that Rogue One’s cast and dialogue were not equally divided between male and female characters. I was shocked (and saddened) to find that almost none of the top 10 movies from last year were gender equal.

We can do better.

Added: If you’re looking for more studies and data explorations like this, check out:

  • Inequality in 800 popular films from 2007–2015 (includes gender, race/ethnicity, sexual orientation, and disability)
  • This exploration of 2000 randomly selected movie scripts from the 1980s to the 2010s
  • This research on the 200 biggest movies from 2014 & 2015
  • Female representations in 2014’s biggest movies
  • This Twitter thread about gender equality in 2016’s animated films

TL;DR Version: Women represent (on average) 30–35% of speaking roles across each of these investigations.

Added: Have questions or comments about my methodology or conclusions? Check out my follow-up article featuring the most frequently asked questions.

I analyzed the dialogue in 2016’s biggest movies and it started a lot of conversations. (medium.com)

If you liked this article and want to see more like it, please click the green heart below and share away on your social media network of choice.

I am currently spending my time working on personal projects and data visualizations like this while I look for a data science job. So, if you have a fun project idea (or a job inquiry) you’d like to discuss with me, please reach out to me on Twitter or by email.

Thank you!

Translated from: https://www.freecodecamp.org/news/women-only-said-27-of-the-words-in-2016s-biggest-movies-955cb480c3c4/
