
Book Review: Meredith Broussard’s Artificial Unintelligence: How Computers Misunderstand the World

I began Meredith Broussard’s Artificial Unintelligence: How Computers Misunderstand the World with a set of haunting questions: “Will AI eventually replace my job?” and “Should we be worried about a future where AI dominates the world?” The reading experience was enlightening and inspiring, and one of the most direct takeaways was that, as a woman, I should not fear studying STEM, despite the societal notion during my childhood that my brain might not be equipped to process complex logic and math. Broussard opens the book with personal experiences, such as taking apart toys and computers to learn how they work, offering a woman’s perspective on the male-dominated STEM field, and criticizing our current definitions of artificial intelligence. Each chapter stands alone as an independent piece, yet together they form a cohesive narrative that guides readers from the limitations of AI to the concept of “technochauvinism” and how this misplaced faith in technological superiority can lead to a harmful future.

The book opens with a critique of technochauvinism, which she describes memorably in the first chapter: “The notion that computers are more ‘objective’ or ‘unbiased’ because they distill questions and answers down to mathematical evaluation; and an unwavering faith that if the world just used more computers, and used them properly, social problems would disappear and we’d create a digitally enabled utopia” (8). I suggest juxtaposing this statement with one from Chapter 7, where she writes, “Therefore, in machine learning, sometimes we have to make things up to make the functions run smoothly” (104). Reading the two together, I take Broussard to be arguing that clean, repeatable, large-scale structure and mathematical reasoning do not guarantee a valid model in every context. When abstract aspects of human life, such as emotions, memories, values, and ethics, are reduced to one-dimensional numeric representations pointing in the wrong direction, data and ML algorithms, however “unreasonably effective” (119) at calculation, intensify bias, discrimination, and inequity.
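
To make “making things up” concrete, here is a minimal imputation sketch, assuming a toy dataset of my own invention rather than any example from the book:

```python
import pandas as pd

# Toy survey data with missing values; columns and numbers are hypothetical.
df = pd.DataFrame({
    "age": [34, 29, None, 41],
    "satisfaction": [4, 5, 3, None],  # a 1-5 scale standing in for a feeling
})

# A common fix: fill each gap with the column mean so downstream
# functions (regressions, distance metrics) run without errors.
# The filled values look like data but were never observed,
# which is exactly the maneuver Broussard is flagging.
df_imputed = df.fillna(df.mean(numeric_only=True))
print(df_imputed)
```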

The book is scathing in its criticism of the technochauvinist male leaders of the tech industry; the sixth chapter contains both direct praise and direct criticism. She recalls the mindset and working methods that tech entrepreneurs and engineers have used to guide the world since the 1950s: bold innovation and disregard for order are two sides of the same coin. What intrigues me is how feminist voices can lead to specific interventions in such an environment. The book was written in 2018, and as of 2023, with the increasingly rapid development of AI, we are witnessing a “glass cliff” phenomenon among female tech leaders. Consider the news about newly appointed Twitter CEO Linda Yaccarino and the concept of the glass cliff: women are often promoted to leadership roles during a crisis and are therefore set up for failure.

In the book’s first chapter, Broussard emphasizes what counts as failure and why we should not fear failure in scientific learning. I find it fascinating to connect Chapter 6 with the recent news about the “glass cliff,” which reminds us to consider the definition of failure dialectically. The glass cliff facing women leaders further alerts us that failure can be interpreted one-sidedly from data; a woman’s intervention in a technochauvinist environment may itself be recorded as a failure. This raises questions about how we can move beyond data and computational methods to consider feminist interventions in technological development.

Regarding feminist interventions, the example of self-driving robot cars in Chapter 8 provides a unique insight. In class we discussed Judith Fetterley’s concept of resistant reading, and at the beginning of the course we discussed scalar definitions of gender. The difference between the two kinds of driving algorithms Broussard describes reminds me of both discussions, and I will try to draw the connections here.

  1. Both resistant reading and scalar definitions of gender seek alternative interpretations and respect non-mainstream experiences.
  2. A robot car model from CMU that draws on the Karel problem (an educational programming exercise in which a robot navigates a grid world) challenges the preexisting assumption that a robot car must mimic human perception. Instead, the idea is to use the machine as a machine: collect data by building 3D maps, do the calculations quickly, and feed the results back into driving along a grid-like path (see the sketch after this list). This is just like how humans invented airplanes: we did not build a mechanical bird but discovered another model of flying.
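
Here is the sketch promised above: a minimal grid-world route search in the spirit of the Karel problem, using breadth-first search. It is my own toy illustration of the “grid-like path” idea, not CMU’s actual system:

```python
from collections import deque

def shortest_path(grid, start, goal):
    """Breadth-first search on a grid; 0 = open cell, 1 = obstacle."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        (r, c), path = queue.popleft()
        if (r, c) == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), path + [(nr, nc)]))
    return None  # no route exists

# A small map built in advance (the "3D map" reduced to a grid).
grid = [[0, 0, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(shortest_path(grid, (0, 0), (2, 3)))
```

The machine never “perceives” the road the way a human does; it just computes over a structure built for it in advance.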

If you think about 1 and 2 together: the workflow described above breaks with the idea of reconstructing human intelligence in a machine, or of replacing humans with machines, and offers an alternative way of using machines and algorithms. This narrower definition of machine learning and AI is also a restrained use of the machine. I believe this case shows how to innovate without excessively breaking the rules or advocating technochauvinism: it respects humanism yet still achieves innovative results. It could also serve as an abstract example of feminist intervention.

Finally, let us return to the example of textbook distribution in Chapter 5. Broussard believes we have overestimated the role of algorithms and artificial intelligence in textbook distribution, leaving many students without access to books. The sophistication of a textbook database model cannot prevent chaotic input data, and the disorder of textbook distribution and management cannot be solved by large models. So she visited several schools and discovered the management chaos behind the big data. Her fieldwork reminds me of the data-cleaning phase we discussed earlier, in which we fill in data to maintain a controllable, clean structure: on-site investigation to identify the problem might be considered another form of data cleaning. Although it seems to bring chaos into the big-data model, her visits accurately identified the root cause. If human problems are not solved, technology ultimately cannot solve societal problems.

Overall, Broussard raises many challenging questions, shares her perspective as a woman in STEM, and presents an argument for a balanced approach to technology.

Abstract for the Roundtable

Utilizing Feminist Text Analysis in Historical Imagination and Cultural Narrative

Is history a factual account of the past? How should we understand the interpretive and narrative frameworks constructed by historians? This presentation explores the potential of a feminist approach to historical imagination, re-enactment, and cultural narratives, challenging the notion of objectivity in history. Drawing on R.G. Collingwood’s seminal book The Idea of History and Joan Wallach Scott’s landmark work Gender and the Politics of History, I examine how feminist perspectives enrich our understanding of the past.

Moving to the realm of text analysis and computational studies, I examine how we reconstruct the past through the discovery and presentation of patterns, trends, themes, and topics, particularly when faced with scanty documentary evidence. Reflecting on work by scholars such as Catherine D’Ignazio, Judith Fetterley, Lauren Klein, and Laura Mandell, I discuss the definition of “evidence” when implementing computational methods for inquiries into women’s history. I consider questions such as: What counts as evidence in textual or other forms of historical data? Could distorted understandings arise in constructing the past, and how might computational methods intensify or mitigate them? Should we project modern values onto the past when designing computational approaches, or should we avoid doing so?

  • Collingwood, R. G. The Idea of History. Oxford: Oxford University Press, 1946.
  • D’Ignazio, Catherine, and Lauren F. Klein. Data Feminism. Cambridge, MA: The MIT Press, 2020.
  • Fetterley, Judith. The Resisting Reader: A Feminist Approach to American Fiction. Bloomington: Indiana University Press, 1978.
  • Mandell, Laura. “Gender and Cultural Analytics: Finding or Making Stereotypes?” In Debates in the Digital Humanities 2019, edited by Matthew K. Gold and Lauren F. Klein. Minneapolis: University of Minnesota Press, 2019.
  • Scott, Joan Wallach. Gender and the Politics of History. New York: Columbia University Press, 1988.

Blog Post: AI, Correlation, Homophily, and Topic Modeling

I have read the first three chapters of Artificial Unintelligence. Broussard discusses the problems with applying computer technology to every aspect of our lives and the limitations of artificial intelligence. She gives the name “technochauvinism” to the idea that computers can always solve every human problem. Her AlphaGo example is straightforward: computers work very well, or “intelligently,” within a highly structured system, such as games recorded in the Smart Game Format. However, AI can produce algorithmic injustice and discrimination because of the data and instructions its creators feed it.

Chun discusses the concepts of “correlation,” “homophily,” and “network science” in her article “Queerying Homophily” in Pattern Discrimination. I also found her explanation in the interview Discriminating Data: Wendy Chun in Conversation with Lisa Nakamura illuminating: she explains how both eugenics and big data treat correlation as a predictor of the future, which actually ends up closing the future. I take “closing the future” to mean closing off alternative futures that cannot be predicted from current and past data. I understand homophily as something amplified when people share their previous knowledge, experience, behavior, and preferences; echo chambers and filter bubbles are generated by this homophily-driven data collection and analysis, further reinforcing segregation in our society.
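
To make “closing the future” concrete, here is a minimal homophily sketch of my own (not Chun’s): a recommender that only surfaces what the most similar user already consumed, so nothing outside past patterns can ever be predicted.

```python
import numpy as np

# Rows are users, columns are items; 1 = consumed. Hypothetical data.
history = np.array([
    [1, 1, 0, 0],  # user 0
    [1, 1, 1, 0],  # user 1
    [0, 0, 1, 1],  # user 2
])

def recommend(user, history):
    """Recommend the unseen items of the most similar ('homophilous') user."""
    sims = history @ history[user]   # overlap with every user's history
    sims[user] = -1                  # ignore self-similarity
    neighbor = int(np.argmax(sims))  # the most similar user
    return np.where((history[neighbor] == 1) & (history[user] == 0))[0]

# User 0 is only ever pointed at what user 1 already did;
# item 3 stays invisible because no similar user touched it.
print(recommend(0, history))  # -> [2]
```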

I want to connect the two readings above with this week’s topic modeling articles. My question for topic modeling is how to avoid oversimplifying context, figurative language, and nuance when discovering topics. If an alternative reading is possible, how can that kind of representation be included when deciding the topics? For example, while Latent Dirichlet Allocation (LDA) can give you a probability for each topic in a document, is it possible to express something like impossible topics, the farthest options, or other kinds of correlations? A minimal sketch of what LDA actually returns follows.
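
This is my own toy illustration with scikit-learn, not from any of the assigned readings; the documents and parameter choices are invented:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the ship sailed across the sea",
    "the sea was calm for the ship",
    "the model learned topics from text",
    "text data trains the topic model",
]

# Bag-of-words counts: context and figurative language are already gone.
counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Each row sums to 1: a probability distribution over topics.
print(lda.transform(counts).round(2))
```

Every entry is a nonnegative probability, so the model can mark a document as mostly not about a topic only with a near-zero share; there is no native way to express an “impossible” topic or a farthest option.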

The readings for this week remind me of a conversation my friend and I had a long time ago. We think math is a very romantic field because it starts with an agreement between two people that 1 + 1 = 2; without such a mutual agreement, the world would change accordingly. For example, I read that someone tried to teach ChatGPT that 1 + 1 = 3, and the machine eventually accepted it after a bit of struggle. To summarize my post, I believe we need to revisit concepts like “homophily” and “correlation” to keep space for agreement and mutual understanding, going beyond the aim of finding superficial similarities.

Response Blog Post 1

Can AI accurately determine gender when we conduct different kinds of studies? Can AI help improve gender and racial equality? Allen Jiang’s article is clear and compelling, showing an approach to detecting gender with a machine-learning model trained on Twitter data. I appreciate Jiang’s methodology and explanation but still doubt the framing questions, “Can you guess the gender better than a machine” and “What are business cases for classifying gender.” From my perspective, pregnancy, for example, is not a topic specific to one gender, and optimizing advertising costs could also be achieved with gender detection that accommodates non-binary customers. My doubts reflect the definitions of gender I took from the week 2 readings, Language and Gender by Penelope Eckert and Sally McConnell-Ginet and We Should All Be Feminists by Chimamanda Ngozi Adichie, among others. Sex is a different concept from gender, and applying a spectrum of gender in a scalar structure would challenge Jiang’s model, particularly its data organization, as the sketch below suggests.
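
As a toy sketch of that data-organization challenge (entirely my own invention, not Jiang’s data): a binary label column slots directly into a two-class classifier, while a spectrum forces the task itself to be reframed.

```python
import pandas as pd

# Binary labels fit a standard two-class classifier as-is.
binary = pd.DataFrame({"handle": ["a", "b"], "gender": ["F", "M"]})

# A spectrum breaks that setup: a continuous score (or several
# dimensions) calls for regression or multi-label prediction,
# with different evaluation metrics than two-class accuracy.
spectrum = pd.DataFrame({"handle": ["a", "b"], "gender_score": [0.8, 0.3]})

print(binary.dtypes)    # object label -> classification task
print(spectrum.dtypes)  # float score  -> regression task
```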

Specifically, Jiang primarily selects Twitter accounts belonging to famous people. In the data collection and cleaning process, how could Jiang avoid a “celebrity effect”? I can think of remedies such as improving data diversity. I am also puzzled that Jiang included follower data: could followers be a feature that genuinely affects gender detection, or are they a confounding variable? I previously raised a question about the significance of repetition in experiments for reducing error, which assumes a single correct result as the goal. This article reminds me of the importance of cross-validation for checking a model’s performance across different scenarios. The key question I propose is how to reduce dependence on a single model built from one dataset with specific features.
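
Here is a minimal sketch of the cross-validation check I have in mind, using scikit-learn on synthetic data; every feature and parameter here is a hypothetical stand-in, not Jiang’s actual pipeline:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for "profile features -> gender label".
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# 5-fold cross-validation: train on 4/5 of the data and test on the
# held-out fold, rotating, so the score is not tied to one split.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean(), scores.std())
```

If the fold scores vary widely, the model’s performance depends heavily on which slice of the data it saw, which is exactly the dependence on a single dataset that worries me.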

The Gendered Language in Teacher Reviews visualization focuses on a more diverse group sharing the same occupation, “teacher.” Revisiting the process, I discovered that Ben Schmidt writes, “Gender was auto-assigned using Lincoln Mullen’s gender package. There are plenty of mistakes–probably one in sixty people are tagged with the wrong gender because they’re a man named ‘Ashley,’ or something.” The data sources for Lincoln Mullen’s package are “names and dates of birth, using either the Social Security Administration’s data set of first names by year of birth or Census Bureau data from 1789 to 1940” (https://lincolnmullen.com/blog/gender-package-now-on-cran/). Unfortunately, Schmidt is no longer maintaining the teaching reviews site, but updating this data visualization remains crucial, because the association between names and gender conventions is constantly changing; using data ending in 1940 to train a model for detecting gender in contemporary society may not yield ideal results.
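
The failure mode Schmidt describes is easy to see in a sketch. Name-based assignment is essentially a lookup against historical name-gender frequencies, so any man named “Ashley” gets mis-tagged. The numbers and function below are invented for illustration; the real gender package is an R package that queries SSA and Census tables.

```python
# Hypothetical historical frequencies: the share of people with each
# name recorded as female in some past period (invented numbers).
FEMALE_SHARE = {"ashley": 0.95, "john": 0.01, "leslie": 0.55}

def assign_gender(name, threshold=0.5):
    """Tag a name 'female' or 'male' from historical frequency alone."""
    share = FEMALE_SHARE.get(name.lower())
    if share is None:
        return "unknown"
    return "female" if share >= threshold else "male"

# A man named Ashley is tagged 'female': the lookup can only report
# what the name statistics say, not the person's actual gender.
print(assign_gender("Ashley"))  # -> female
```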

I went on to read two articles that helped me think about the question of bias in AI. One is an interview with Dr. Alex Hanna about her work at the Distributed AI Research Institute on AI technology, AI bias, and AI’s constraints. I recommend this interview, especially her answers to the questions about independent AI research with purposes not funded by tech companies (https://www.sir.advancedleadership.harvard.edu/articles/understanding-gender-and-racial-bias-in-ai).

There is also another fascinating, and quite alarming, reading: “Natural Selection Favors AIs over Humans” (https://arxiv.org/abs/2303.16200).

The author, Dan Hendrycks, is the director of the Center for AI Safety. He discusses how natural selection and Darwinian logic apply to artificial agents. If biased natural selection is also incorporated into such large-scale computational studies and models, would the consequences be unbearable for us as humans? And if natural selection operates among AIs, where does that leave the human position?