How Not to Know Ourselves

By Angela Xiao Wu, assistant professor at New York University

This blog post comes out of a paper by Angela Xiao Wu and Harsh Taneja that offers a new take on social sciences’ ongoing embrace of platform log data by questioning their measurement conditions. The distinct nature of platform datafication is foregrounded in comparison with the longer tradition of third-party audience measurement.

Surfing a wave of societal awe and excitement about “Big Data,” platforms formed a habit of releasing “data science” insights on what we search, like, express, purchase, obsess over, attempt to hide, and prefer to forget. These colorful graphics and juicy taglines (most notably from OKCupid and PornHub, whose data lay claim to the quirks and desires of our intimate lives) are always popular novelties to behold, ponder, and reference. If knowing ourselves through platform data is a practice of our age, it is certainly not confined to platforms themselves. Aspiring data scientists, curious programmers, vigilant data journalists, analysts of civic organizations and political campaigns, and (last but not least) academic social scientists such as myself make up the growing field that is figuring out who we are, what we do, and how we sway in the swathes of platform data.

Such data can be impressive due to their unprecedented granularity and volume, as well as the fact that they are seemingly “unobtrusive” recordings of our activities when no one is watching. These apparent strengths of data for social research are outweighed by a problem in what we call the “measurement conditions”: platform data are platforms’ records of their own behavioral experimentation. Trying to know ourselves through platform data tends to yield partial and contorted accounts of human behavior that conceal platform interventions. Moreover, though increasingly produced by non-corporate actors, such knowledge accounts and narratives tend to be amenable to platform money-making and image-building.

To be clear, for years many have contested the ascendance of platform data as a staple in quantitative social sciences alongside conventional data collection methods, such as surveys and experiments. These contestations focus on issues about the data’s representativeness, privacy concerns, and precarious access at the mercy of platform companies. The “measurement conditions” problem, however, is entirely different. In our newly published paper, Harsh Taneja and I call for attention to the circumstances under which these data come about: what purpose does the measurement initially serve? As historians have told us, measurement — or converting parts of the social world into quantities according to some enduring instrument — is not an end in itself, but a means for managing events and coordinating actions. Measurement is thus a product of the social and institutional context (i.e., “measurement conditions”) in which it is called upon and carried out.

A closer look at the measurement conditions of platforms allows us to rethink the nature of platform log data: they are essentially “administrative data” that platforms generate to realize their own organizational goals, which go little beyond enlarging advertising income, harvesting intermediary fees, and attracting venture capital. These companies track user engagement with their platforms to evaluate and showcase “product performance.” Such data analytics are integral to the iterative process whereby platforms tinker with their digital architectures in attempts to shape usage in ways that maximize profits.

In other words, platform log data are not “unobtrusive” recordings of human behavior out in the wild. Rather, their measurement conditions determine that they are accounts of putative user activity: “putative” in the sense that platforms are often incentivized to keep bots and other fake accounts around, because, from their standpoint, it’s always a numbers game with investors, marketers, and the actual, oft-insecure users. With calculated neglect come calibrated nudges: platform user activity, in the first place, is induced, coaxed, and experimented on by the platform environment. From multilayered graphical organization to complex algorithmic recommendation, it is from all these platform arrangements that user activity arises. Conversely, it is to make decisions about these arrangements that platform companies measure usage.

Of course, when large batches of platform log data become available for inquisitive parties to crunch, platforms keep the other part of the iterative process (the shifting platform arrangements aimed at nudging usage) in the dark. Thus, it is difficult to tell to what extent the patterns emerging from platform data are about “us,” rather than testimonies to the effects of platform nudges. When we are experimental subjects oblivious to platforms’ treatments of us, taking our induced behaviors as “natural” means regarding these platforms as benign, transparent vehicles for our inherent intentions, and thus obscuring their prevailing power.

Consider peeking into our innate preferences (by race, geography, and daily rhythms!) based on “patterns” that emerge from PornHub’s log data, when the site’s visual design, temporal pacing, and content curation are all about eliciting and extending the user’s state of pleasure and pleasure seeking; or using Twitter data to study the insurgent online protests during Occupy Wall Street when, due to unknown algorithmic workings, the very term failed to trend; or using Uber’s rides data to study commuting habits when Uber wields its driving force with strategies such as price surging in the name of (predicted but unverifiable) high demand; or using YouTube data or, more fantastically, Netflix data to discern media preferences when these platforms’ entire business rests on herding sequences of viewing. (Each of these platform strategies has been creatively uncovered by critical scholars.)

When we wind up finding human nature in platform data, we take administrative records from insulated digital experiments as expressions of humanity in our society. The data envelop a platform-shaped hole that may elude the scrutiny of even the most sophisticated computational techniques. Such a data analytic pitfall, increasingly common in data science showcases, journalistic reporting, and academic research, effectively obscures platforms’ intervention in human behavior. And platforms’ intervention in human behavior is at once the center of platform business models and the secret that platforms strive to hide.

What are the human actions and predispositions that initially spark our curiosity? What is the kind of self-knowledge that we would cherish as a foundation for enriching our sociality, our civil and public institutions, and our democratic process? Readily resorting to platform data analytics for such knowledge risks taking platform environments as our entire world. Instead, when dealing with platform data we should aspire to “put the platforms in perspective,” foregrounding rather than obscuring their interventions in how we behave.

In this collective effort, non-corporate critical actors may find useful some of the strategies discussed in our paper.

Angela Xiao Wu is an assistant professor in Media, Culture and Communication at New York University researching information technology, knowledge production, and political cultures.

Translated from: https://points.datasociety.net/how-not-to-know-ourselves-5227c185569
