Posts in Gender Bias
Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification

Joy Buolamwini, joyab@mit.edu, MIT Media Lab, 75 Amherst St., Cambridge, MA 02139

Timnit Gebru, timnit.gebru@microsoft.com, Microsoft Research, 641 Avenue of the Americas, New York, NY 10011

Recent studies demonstrate that machine learning algorithms can discriminate based on classes like race and gender. In this work, we present an approach to evaluate bias present in automated facial analysis algorithms and datasets with respect to phenotypic subgroups. Using the dermatologist-approved Fitzpatrick Skin Type classification system, we characterize the gender and skin type distribution of two facial analysis benchmarks, IJB-A and Adience. We find that these datasets are overwhelmingly composed of lighter-skinned subjects (79.6% for IJB-A and 86.2% for Adience) and introduce a new facial analysis dataset which is balanced by gender and skin type. We evaluate three commercial gender classification systems using our dataset and show that darker-skinned females are the most misclassified group (with error rates of up to 34.7%). The maximum error rate for lighter-skinned males is 0.8%. The substantial disparities in the accuracy of classifying darker females, lighter females, darker males, and lighter males in gender classification systems require urgent attention if commercial companies are to build genuinely fair, transparent and accountable facial analysis algorithms.

Keywords: Computer Vision, Algorithmic Audit, Gender Classification

Full Research: http://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf
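
As a rough illustration of the kind of intersectional audit the paper describes, the sketch below computes error rates per gender and skin-type subgroup from labelled predictions. The records and field names are hypothetical placeholders, not data from the paper's benchmark.

```python
# Minimal sketch of an intersectional audit: error rates per gender x skin-type
# subgroup, in the spirit of the methodology described above. The records are
# hypothetical placeholders, not data from the paper's benchmark.
from collections import defaultdict

# Each record: (true_gender, predicted_gender, skin_type), where skin_type is
# "lighter" (Fitzpatrick I-III) or "darker" (Fitzpatrick IV-VI).
records = [
    ("female", "male",   "darker"),
    ("female", "female", "lighter"),
    ("male",   "male",   "darker"),
    ("male",   "male",   "lighter"),
    # ... one entry per image in the benchmark
]

totals = defaultdict(int)
errors = defaultdict(int)
for true_gender, predicted_gender, skin_type in records:
    group = (skin_type, true_gender)
    totals[group] += 1
    if predicted_gender != true_gender:
        errors[group] += 1

for group in sorted(totals):
    rate = errors[group] / totals[group]
    print(f"{group[0]:>7} {group[1]:>6}: error rate {rate:.1%} ({totals[group]} images)")
```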

Read More
Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data

Decisions based on algorithmic, machine learning models can be unfair, reproducing biases in historical data used to train them. While computational techniques are emerging to address aspects of these concerns through communities such as discrimination-aware data mining (DADM) and fairness, accountability and transparency machine learning (FATML), their practical implementation faces real-world challenges. For legal, institutional or commercial reasons, organisations might not hold the data on sensitive attributes such as gender, ethnicity, sexuality or disability needed to diagnose and mitigate emergent indirect discrimination-by-proxy, such as redlining. Such organisations might also lack the knowledge and capacity to identify and manage fairness issues that are emergent properties of complex sociotechnical systems. This paper presents and discusses three potential approaches to deal with such knowledge and information deficits in the context of fairer machine learning. Trusted third parties could selectively store data necessary for performing discrimination discovery and incorporating fairness constraints into model-building in a privacy-preserving manner. Collaborative online platforms would allow diverse organisations to record, share and access contextual and experiential knowledge to promote fairness in machine learning systems. Finally, unsupervised learning and pedagogically interpretable algorithms might allow fairness hypotheses to be built for further selective testing and exploration.
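
As a loose illustration of the trusted-third-party idea (a sketch under assumptions, not an implementation from the paper), the code below shows an organisation that never holds sensitive attributes querying a third party for group-level selection rates only; the class and parameter names are hypothetical.

```python
# Hypothetical sketch of the "trusted third party" idea: the organisation never
# sees sensitive attributes; it sends (user_id, decision) pairs to a third party
# that holds the sensitive data and returns only aggregate disparity statistics.
from collections import defaultdict

class TrustedThirdParty:
    """Holds sensitive attributes and answers only aggregate queries."""

    def __init__(self, sensitive_attributes):
        # user_id -> group label (e.g. a protected characteristic)
        self._attributes = sensitive_attributes

    def selection_rates(self, decisions, min_group_size=5):
        """Return the positive-decision rate per group, suppressing tiny groups."""
        positives = defaultdict(int)
        totals = defaultdict(int)
        for user_id, decision in decisions:
            group = self._attributes.get(user_id)
            if group is None:
                continue
            totals[group] += 1
            positives[group] += int(decision)
        return {
            g: positives[g] / totals[g]
            for g in totals
            if totals[g] >= min_group_size  # crude disclosure control
        }

# The organisation learns only group-level rates, e.g. to check demographic parity.
ttp = TrustedThirdParty({1: "group_a", 2: "group_b", 3: "group_a"})
print(ttp.selection_rates([(1, True), (2, False), (3, True)], min_group_size=1))
```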

Read More
GENDER BIAS IN ADVERTISING

In 2017, discussions around gender and media have reached a fever pitch. Following a bruising year at the ballot box, fourth-wave feminism has continued to expand. From the Women’s March to high-profile sexual harassment trials to the increasing number of female protagonists gaining audience recognition in an age of “peak TV,” women are ensuring that their concerns are heard and represented.

We’ve seen movements for gender equality in Hollywood, in Silicon Valley — and even on Madison Avenue. In response to longstanding sexism in advertising, industry leaders such as Madonna Badger are highlighting how objectification of women in advertising can lead to unconscious biases that harm women, girls and society as a whole.

Agencies are creating marquee campaigns to support women and girls. The Always #LikeAGirl campaign, which debuted in 2014, ignited a wave of me-too “femvertising” campaigns: #GirlsCan from Cover Girl, “This Girl Can” from Sport England and the UK’s National Lottery, and a spot from H&M that showcased women in all their diversity, set to “She’s a Lady.” Cannes Lions got in on the act in 2015, introducing the Glass Lion: The Lion for Change, an award to honor ad campaigns that address gender inequality or prejudice.

But beyond the marquee case studies, is the advertising industry making strides toward improving representation of women overall? How do we square the surge in “femvertising” with insights from J. Walter Thompson’s Female Tribes initiative, which found in 2016 that, according to 85% of women, the advertising world needs to catch up with the real world?

Read More
Evidence That Gendered Wording in Job Advertisements Exists and Sustains Gender Inequality

Women remain underrepresented in male-dominated fields such as engineering, the natural sciences, and business. Research has identified a range of individual factors, such as beliefs and stereotypes, that contribute to these disparities, but less is documented about institutional factors that perpetuate gender inequality within the social structure itself (e.g., public policy or law). Such institutional factors can also shape people's perceptions of and attitudes toward women in these fields, as well as the individual factors themselves.

Read More
The Effects of Cognitive Biases and Imperfectness in Long-term Robot-Human Interactions: Case Studies using Five Cognitive Biases on Three Robots

The research presented in this paper demonstrates a model for aiding human-robot companionship based on the principle of 'human' cognitive biases applied to a robot. The aim of this work was to study how cognitive biases can affect human-robot companionship over the long term. In the current paper, we show comparative results of experiments using five biased algorithms on three different robots: ERWIN, MyKeepon and MARC.

Read More
GloVe: Global Vectors for Word Representation

Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful sub-structure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.
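
The training objective behind this model is a weighted least-squares regression over the nonzero co-occurrence counts. The minimal sketch below evaluates that loss on a toy vocabulary; the random vectors and handful of counts are chosen purely to show the shapes involved, not learned from a corpus.

```python
# Sketch of the GloVe weighted least-squares objective, evaluated only on the
# nonzero entries of the word-word co-occurrence matrix (stored here as a dict).
import numpy as np

def weight(x, x_max=100.0, alpha=0.75):
    """Co-occurrence weighting f(x): caps the influence of very frequent pairs
    and down-weights rare ones."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_loss(cooccur, W, W_context, b, b_context):
    """J = sum_ij f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2"""
    loss = 0.0
    for (i, j), x_ij in cooccur.items():          # nonzero entries only
        diff = W[i] @ W_context[j] + b[i] + b_context[j] - np.log(x_ij)
        loss += weight(x_ij) * diff ** 2
    return loss

# Tiny toy setup (random vectors, a few co-occurrence counts) just to show shapes.
rng = np.random.default_rng(0)
V, d = 5, 8                                        # vocabulary size, vector dim
W, W_context = rng.normal(size=(V, d)), rng.normal(size=(V, d))
b, b_context = np.zeros(V), np.zeros(V)
cooccur = {(0, 1): 12.0, (1, 3): 3.0, (2, 4): 50.0}
print(glove_loss(cooccur, W, W_context, b, b_context))
```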

Read More
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender neutral words are shown to be linearly separable from gender definition words in the word embedding. Using these properties, we provide a methodology for modifying an embedding to remove gender stereotypes, such as the association between the words receptionist and female, while maintaining desired associations such as between the words queen and female. We define metrics to quantify both direct and indirect gender biases in embeddings, and develop algorithms to "debias" the embedding. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving its useful properties such as the ability to cluster related concepts and to solve analogy tasks. The resulting embeddings can be used in applications without amplifying gender bias.
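
The geometric neutralization step at the heart of this methodology can be illustrated with a short sketch: project gender-neutral word vectors off the gender direction and re-normalize. The paper identifies that direction from several definitional pairs via PCA; for brevity the sketch below uses a single she-he difference, and the vectors are random placeholders rather than trained embeddings.

```python
# Sketch of the "neutralize" step: remove the component of a gender-neutral
# word vector that lies along a gender direction, then re-normalize.
# Vectors here are toy placeholders, not trained embeddings.
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

def neutralize(word_vec, gender_direction):
    """Project out the gender component and re-normalize."""
    g = unit(gender_direction)
    debiased = word_vec - (word_vec @ g) * g
    return unit(debiased)

rng = np.random.default_rng(1)
embedding = {w: unit(rng.normal(size=50)) for w in ["she", "he", "receptionist", "queen"]}

gender_direction = embedding["she"] - embedding["he"]
embedding["receptionist"] = neutralize(embedding["receptionist"], gender_direction)

# After neutralization, the gender-neutral word has (near-)zero projection on the
# gender direction, while gender-definitional words like "queen" are left untouched.
print(embedding["receptionist"] @ unit(gender_direction))
```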

Read More