Maoyan Movie Reviews
Ryan Bellgardt’s 2018 movie, The Jurassic Games, tells the story of ten death row inmates who must compete for survival in a virtual reality game, where they fight not only each other but also dinosaurs that can kill them both in the game and for real. Starring mostly B-list Hollywood actors such as Perrey Reeves and Ryan Merriman, the movie clearly sets off all the alarms of a low-budget flick. Nevertheless, most critics thought it was a very good effort: Rotten Tomatoes considered it “fresh”, giving it a rare rating of 83%. Writing on the same website, Sam Kurd of Cultured Vultures felt that the movie, “while not original or ground-breaking, [was] a lot of fun and worth-watching”. Other critics on the same platform rated it favorably, so that the movie ended up with an average rating of 7.2 out of 10. However, if Hollywood, or critics for that matter, expected regular movie-goers to love the movie, they were quite wrong.
On IMDb, it ended up with an overall rating of 3.8 out of 10 from over 2,000 votes. The chatter is that, though the movie tells a pretty fun story, its special effects are horrendous. For a B-movie that possibly could not afford top-rated Hollywood CGI, it would seem understandable to give the director a pass. Unfortunately, the IMDb crowd was not so forgiving. “Low-budget movie”, “sloppy characters” and “low grade CGI” are some of the phrases thrown about in the reviews on that website. It seems that what the critics could see past, the audience could not.
While critics and crowd may have disagreed over The Jurassic Games, they do agree on a handful of other movies. Aaron Schneider’s Greyhound and Mark Lamprell’s Never Too Late, for example, both scored high ratings on IMDb and Rotten Tomatoes. These contrasting situations put a question before us. Should we expect critics and crowd to agree, or to disagree, when they score or review a movie?
This question surrounding the truth value of crowds is not a recent one. On a spring morning in 1906, Francis Galton, an English statistician and polymath, attended a weight-judging competition at the annual West of England Fat Stock and Poultry Exhibition at Plymouth. This was a farmers’ fair where all sorts of crop and animal products were displayed and sold. A fat ox had been selected for slaughter, and participants were given a card on which to write their names, addresses and estimates of what the ox would weigh after it was slaughtered and “dressed”. Those with successful guesses would receive a prize. While most may have considered their participation trivial and of no consequence, Galton thought the combined results would make for a good experiment. He collated the results and ran a statistical analysis on them. He found that the “middlemost” estimate was very close to the actual weight of the slaughtered ox: it was correct to within 1% of the actual value. While the estimate was 1,207 lb, the actual weight of the dressed ox was 1,198 lb. In effect, while most of the participants in the guessing competition may have guessed wrongly, their combined effort produced a result very close to the actual value.
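As a quick check of the “within 1%” claim, the relative error of the middlemost estimate can be worked out from the two weights quoted above. The symbols below are only labels introduced for this calculation:

```latex
\[
\text{relative error}
  = \frac{\lvert w_{\text{estimate}} - w_{\text{actual}} \rvert}{w_{\text{actual}}}
  = \frac{\lvert 1207 - 1198 \rvert}{1198}
  = \frac{9}{1198}
  \approx 0.0075
  \approx 0.75\%
\]
```

So the crowd’s middlemost guess missed by 9 lb, or roughly three quarters of one percent.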
Though crowd behavior can sometimes be fickle or irrational, in certain cases, such as with Galton’s experiment, it provides interesting global estimates. In some situations, a diverse and independently sampled opinion of a select crowd could in fact reflect the “truth”. This logic has been successfully exploited in election polls, internet search engines, stock market predictions, and online knowledge repositories such as Wikipedia. Recently, we concluded a project where we examined this theory in relation to movie ratings.
IMDb and Rotten Tomatoes are two of the biggest movie aggregators online. Both collect ratings and other details on movies and TV shows, making these accessible to a global audience. While the former collects its movie ratings mainly from the crowd, the latter uses a score based strictly on the opinion of critics in the movie industry. These two contrasting techniques of judging a movie pit the crowd against the critics and make for an interesting comparison of the two opinions. Would the “wisdom of crowds” produce a rating for a movie just as good as that from seasoned experts? We examined the data to see what insights it holds.
We collected 44,000 movies from IMDb and 9,638 movies from Rotten Tomatoes, and identified 3,100 unique movies that appear in both sets. Using this data, we found a few revealing pieces of information. There is a strong positive correlation between movie ratings on Rotten Tomatoes and on IMDb. Perhaps this is unsurprising. Most movies with high ratings on Rotten Tomatoes should also have high ratings on IMDb, even if the ratings are not identical. Good movies are good movies, in any case. However, we found that, on average, critics and crowd do not agree all the time.
We scaled the movie ratings on both sites, then divided their difference into three bins. This is a little like the approach Jules Wanderer used in his paper, “In Defense of Popular Taste: Film Ratings among Professionals and Lay Audiences”. We defined a spread value, which is the tolerance we allow in the difference between the two ratings. For example, if critics rate a movie 0.8 and the crowd rates the same movie 0.75, we say that critics and crowd agree to within 0.05 on that movie’s rating. We find, as expected, that the agreement between the two depends substantially on the spread value. When we allow no more than 0.1 in spread, both sides agree on only 28% of the movies in the data set. This is quite low. In addition, it appears there has never really been consensus between critics and crowd over the years when we are strict with our tolerance, or spread. More often than not, the ratings provided by critics and crowd are more than 0.1 apart. If anything, it is more probable that the Tomatometer score of a movie is lower than its IMDb score. In effect, while critics appear to penalize certain movies by giving them lower scores, the crowd seems to give these same movies a higher rating.
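The scaling and binning step described above can be sketched in a few lines of Python. This is only an illustration and not the code used in the project: the function names, the bin labels and the choice of dividing each rating by its maximum are assumptions, since the article does not say exactly how the ratings were scaled or what the three bins were called.

```python
def scale(rating, max_rating):
    """Scale a rating to the 0..1 range (assumed: divide by the maximum)."""
    return rating / max_rating


def bin_difference(critic, crowd, spread=0.1):
    """Place the critic-minus-crowd difference into one of three bins.

    The bin labels are hypothetical names chosen for this sketch.
    """
    diff = critic - crowd
    if abs(diff) <= spread:
        return "agree"
    return "critics_lower" if diff < 0 else "critics_higher"


def agreement_rate(pairs, spread=0.1):
    """Fraction of (critic, crowd) pairs that fall in the 'agree' bin."""
    agreed = sum(1 for critic, crowd in pairs
                 if bin_difference(critic, crowd, spread) == "agree")
    return agreed / len(pairs)


if __name__ == "__main__":
    # The Jurassic Games: 83% on Rotten Tomatoes, 3.8 out of 10 on IMDb.
    print(bin_difference(scale(83, 100), scale(3.8, 10)))
    # The 0.8 versus 0.75 example from the text, with a spread of 0.1.
    print(bin_difference(0.8, 0.75))
```

With a spread of 0.1, The Jurassic Games lands in the bin where critics rate the movie higher than the crowd, while the 0.8 versus 0.75 example counts as agreement. Running `agreement_rate` over all matched movies would give the kind of percentage reported above.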

Wanderer, whose paper examined the degree to which professional critics agree with lay movie-goers, found a much higher level of agreement than we did: both sides agreed in 53% of 5,644 instances. We put this difference down to the lay audiences examined in the two cases. Wanderer examined an audience of Consumers Union members, who were more likely to belong to the upper-middle class in America. These were members of a social circle with a median income of around $12,800, compared with the average US family income of about $7,400 at that time. Our audience, IMDb users, is more likely to belong to the larger group with the lower median income. Therefore, while Wanderer puts his audience in the same social class as the critics, we think our audience may be in a lower social class.
This could explain the additional differences we observed in subsequent analyses of the data. For example, we found that while the crowd is more likely to rate a movie higher when it features a top actor, critics seem unbothered. Conversely, when a movie is directed by a top director, critics appear to favor it more than the crowd does. The boxplots below show these details.


As an added step, we examined if we could predict movie ratings offered by the crowd given what we know about the movie and its ratings from critics. Here, we go from the critics’ mind to the crowd’s. While this is not entirely related to our subject of discussion, it makes for an interesting experiment. Using an array of machine learning tools, we obtained a decent mean squared error value of 0.37. The resulting model allows us to formulate a mathematical relationship between a movie’s attributes on Rotten Tomatoes and its IMDb score. The movie’s Tomatometer rating and runtime are some of the most significant predictors.
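To make the idea of predicting the crowd’s rating from the critics’ rating concrete, here is a minimal sketch of a least-squares fit and the mean squared error used to judge it. It is not the model from the project, which used several machine learning tools and more predictors, such as runtime. It uses a single predictor only, and the numbers in the usage section are made-up placeholders, not data from IMDb or Rotten Tomatoes.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical placeholder values, chosen only to show the mechanics.
    tomatometer = [20, 40, 60, 80]   # critics' score, in percent
    imdb = [5.0, 6.0, 7.0, 8.0]      # crowd's score, out of 10

    slope, intercept = fit_line(tomatometer, imdb)
    predicted = [slope * x + intercept for x in tomatometer]
    print(slope, intercept, mean_squared_error(imdb, predicted))
```

A lower mean squared error means the predicted crowd ratings sit closer to the actual ones. In practice the error should be measured on movies held out from the fit, not on the same movies used to fit the line as in this toy example.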

In conclusion, though we could establish a strong correlation between ratings on Rotten Tomatoes and IMDb, we found no substantial agreement between the ratings offered by critics and crowd. In this case, we are unable to learn from the “wisdom of the crowd”. Perhaps if this crowd were a more select group of ardent movie-goers, a property we can’t claim for IMDb users, we might have seen a different outcome in our analysis and more similarity between the two ratings. We suspect that differing biases on the two sides, and an audience more diverse than focused, may have skewed the ratings such that there is no evident consensus. Galton’s audience, after all, may all have been farmers or have understood farming; why else would they be lurking at a farmers’ fair?
Translated from: https://medium.com/swlh/is-the-opinion-of-the-crowd-on-movies-just-as-good-as-that-of-critics-eb3d084bf4a2