Monthly Archives: May 2023

Abstract: Is analyzing gender using computational text analysis ethical?

Often in the tech world, we hear of algorithms that can accurately predict the gender of the person who wrote a particular document, tweet, or other text. Is this inherently unethical? The following questions can serve as a jumping-off point for arriving at a decision.

In every project, there is the potential for biases to be introduced. Some may ask how this could be possible if an algorithm is doing all the work, but that assumption is mistaken. It is important to realize that there are people behind every algorithm: it is trained on data provided by people whose thoughts, feelings, and opinions can be carried over into the training material. Does the training data perpetuate gender stereotypes or other biases?

Another element to consider is privacy. When collecting information about the genders of authors, how is that data being used within the project? Was consent obtained from the individuals providing the data? Was this communicated to the participants? If the data were exposed, would it cause harm? Would it be possible to anonymize the data and still produce significant results?

It is also important to consider social and political context when attempting to analyze gender using computational text analysis. Do the results perpetuate power dynamics between socially constructed gender roles? If so, they could reinforce what has been ingrained in our society. However, constructs change over time. Have historical and cultural contexts been taken into account to eliminate misunderstandings of the results? Since gender does not stand on its own, was an intersectional approach taken within the experiment? Other social categories, such as race, social class, and sexuality, are highly intertwined with it.

Proposal – Atilio Barreda II

Proposal:

In light of Nan Z. Da’s “The Computational Case against Computational Literary Studies,” which exposes the limitations of traditional computational methods in literary studies, my research project will focus on understanding the humanistic and philosophical concepts embedded in techniques like cosine similarity, Euclidean distance, and Latent Dirichlet Allocation (LDA). I aim to deconstruct the foundations of these methods and examine the potential for developing text analysis approaches that are sensitive to humanistic and philosophical dimensions.
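The geometric intuitions behind two of the measures named above can be made concrete in a minimal sketch (pure Python; the term-frequency vectors are invented for illustration, not drawn from any particular corpus):

```python
import math

def cosine_similarity(a, b):
    # Angle-based: ignores vector length, so a short and a long
    # document with the same word proportions look identical.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Magnitude-sensitive: raw counts matter, so document length
    # directly affects the result.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy term-frequency vectors over the vocabulary ["sea", "ship", "home"].
doc1 = [2, 1, 0]
doc2 = [4, 2, 0]   # same proportions as doc1, but twice the length
doc3 = [0, 1, 3]

print(cosine_similarity(doc1, doc2))   # effectively 1.0: same "direction"
print(euclidean_distance(doc1, doc2))  # nonzero: lengths differ
print(cosine_similarity(doc1, doc3))
```

Already at this scale, the choice between the two measures encodes an interpretive assumption (is document length meaningful, or only word proportions?), which is exactly the kind of embedded humanistic decision the project proposes to examine.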

To achieve this, I will begin by examining the underlying assumptions and principles of these methods, assessing their ability to capture the intricacies of literary works. I will draw on examples from Da’s critique and other instances within computational literary studies to identify common pitfalls and limitations of these techniques.

Next, I will explore interdisciplinary methodologies that can complement and improve upon traditional computational methods. By incorporating insights from fields such as linguistics, philosophy, and literary theory, I hope to begin developing a more robust and nuanced analytical framework.

Feminist Instructions

“It wasn’t a match, I say. It was a lesson.” Claudia Rankine, Citizen: An American Lyric (Graywolf, 2014)

In her introduction to Living a Feminist Life, scholar and activist Sara Ahmed adopts bell hooks’ definition of feminist work: “the movement to end sexism, sexual exploitation and sexual oppression” (hooks, 2000, cited in Ahmed, 2016). I in turn take Ahmed’s description of “a scene of feminist instruction” as a starting point for an imagined feminist text analysis. Ahmed writes,

we hear histories in words; we reassemble histories by putting them into words . . . . attending to the same words across different contexts, allowing them to create ripples or new patterns like texture on a ground. I make arguments by listening for resonances . . . . The repetition is the scene of a feminist instruction.

Hence, text analysis, with its focus on repeated words as quantifiable data points that reveal the workings of a text or texts, would by its very nature seem to be feminist.

And yet, as Koen Leurs and Sayan Bhattacharyya demonstrate, it’s not at all that simple.

In “Text Analysis for Thought in the Black Atlantic,” Sayan Bhattacharyya points out that “many methods of text analysis prove problematic, because they make an unwarranted assumption about the stability and constancy of the relation between words and their meanings across time.” Proposing Glissant’s notion of “archipelagic thinking in space (and its counterpart in time)” as a way of “pay[ing] attention to variation within, as well as to the specificity of, word-concepts” (Bhattacharyya 80), Bhattacharyya traces a genealogy of Glissant’s metaphor back through the writings of Aimé Césaire and C. L. R. James, and thus suggests that the Digital Black Atlantic, as “the body of interdisciplinary scholarship that examines connections between African diasporic communities and technology” (Risam and Baker Josephs, Introduction), can, like Paul Gilroy’s eponymous challenge to Eurocentric white supremacist studies, “perform a similar decentering of the epistemological assumptions that underlie digital humanities in general by problematizing its tools” (Bhattacharyya 82). In particular, “[b]y taking the relationships between words (expressed as co-occurrences of words), rather than the words themselves, as the basic unit of representation” (Bhattacharyya 81), word vectors “are not only a convenient technology to capture semantic relationships but also are . . . productive for problematizing concepts in the text and even for raising epistemological questions about the status of concepts themselves in relation to the text” (Bhattacharyya 81).

Likewise, in “Feminist Data Studies: Using Digital Methods for Ethical, Reflexive and Situated Socio-Cultural Research,” Koen Leurs points out that “Digital data is performative and context-specific” (Leurs 143); as a result, a would-be feminist data researcher needs to “consider . . . text, users and materiality from a relational perspective” (Leurs 133). Asking, “[h]ow can we draw on user-generated data to understand agency vis-à-vis structures of individuality and collectives across intersecting axes of difference?” as well as “[h]ow can we strategically mobilise digital methods in a non-exploitative way to illuminate everyday power struggles, agency and meaning-making?” (Leurs 133), Leurs offers a case study and “road map” for the self-interrogating, “research participant-centered” (132), “alternative data-analysis practice” (139) that might better align with feminist and post-colonial ethics.

While Leurs himself demonstrates how Facebook TouchGraph’s visualizations of users’ relationships, even when jointly created, can generate alienation, hostility, and confusion in participants, necessitating adaptive understandings of data and collaboration, my presentation will focus in particular on his section entitled “Dependencies and relationalities” to explore whether what I tentatively term “relational textual analysis” might afford an epistemological as well as material model for a feminist textual analytic practice.

Ahmed, Sara. Living a Feminist Life. Duke University Press, 2016. Project MUSE, muse.jhu.edu/book/69122.

Bhattacharyya, Sayan. “Text Analysis for Thought in the Black Atlantic.” The Digital Black Atlantic, edited by Roopika Risam and Kelly Baker Josephs, pp. 77–83.

Leurs, Koen. “Feminist Data Studies: Using Digital Methods for Ethical, Reflexive and Situated Socio-Cultural Research.” Feminist Review, The Feminist Review Collective, 2017, pp. 130–154.

Abstract for roundtable

Data-Driven Feminist Text Analysis: Exploring the Significance of Computational Methods and Digital Humanities Tools in Literary and Cultural Studies

This roundtable examines the role of feminist text analysis in literary and cultural studies, with a particular focus on the use of data and code-based tools to support this approach. Drawing on established feminist theories and practices of text analysis, we argue that feminist text analysis is a crucial lens for understanding how gender and power dynamics shape the production and reception of literature and other cultural artifacts.

We explore the ways in which computational methods and digital humanities tools can support feminist text analysis, including text mining, machine learning, and other data-driven approaches. For example, machine learning algorithms can be trained to identify and classify gendered language and stereotypes in texts, which can then be used to quantify and analyze patterns of gender bias and discrimination, enabling feminist text analysts to more efficiently and effectively identify and critique problematic representations of gender in literature and other cultural artifacts.
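A deliberately simplified sketch of the quantification step described above (pure Python; the word lists and sentences are invented for illustration, and a real study would use a trained classifier and a validated lexicon rather than a few hand-picked terms):

```python
# Illustrative stand-in lexicons, not a validated resource.
MASCULINE = {"he", "him", "his", "man", "men", "king"}
FEMININE = {"she", "her", "hers", "woman", "women", "queen"}

def gendered_term_counts(text):
    """Count masculine- and feminine-coded tokens in one text."""
    tokens = text.lower().split()
    masc = sum(1 for t in tokens if t in MASCULINE)
    fem = sum(1 for t in tokens if t in FEMININE)
    return {"masculine": masc, "feminine": fem}

# Tiny invented corpus standing in for a collection of documents.
corpus = [
    "the king spoke and the men followed him",
    "she told her story and the women listened",
]

# Aggregate counts across the corpus to surface overall patterns.
totals = {"masculine": 0, "feminine": 0}
for doc in corpus:
    counts = gendered_term_counts(doc)
    for key in totals:
        totals[key] += counts[key]
print(totals)
```

A supervised classifier would replace the fixed lexicon with learned features, but the reporting step, aggregate counts read against their context, is the same, and it is precisely where feminist critique of the categories themselves enters.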

Considering the challenges and limitations of these tools is also crucial, including the potential for bias and the need for critical awareness of what they cannot capture. To support this argument, we present examples of feminist text analyses that have successfully navigated these challenges, including studies of the representation of gender in children’s books, the use of the word “hysterical” on Twitter, and the gendering of job titles in academia. These examples demonstrate the potential of feminist text analysis to uncover patterns of gender bias and inequality, and to contribute to the promotion of gender equality and social justice. Ultimately, we argue that feminist text analysis is an essential approach to literary and cultural studies, one that can help us create more inclusive and equitable representations of gender in our culture.

Roundtable Abstract: The Days of the LLM Are Numbered

A recently leaked Google document declares, “We Have No Moat, And Neither Does OpenAI.” This is quite a big claim against OpenAI, the mastermind behind society’s enchanting creative LLM, ChatGPT. The document mostly refers to the significant rise of boutique indie open-source model developers and tweakers, who have leapfrogged toward deploying models that are only fractionally less effective than renowned LLMs but have significantly lower computational cost. Even OpenAI CEO Sam Altman has acknowledged that the era of large language models is over. In this paper, some of the open-source language models that are posing significant challenges to LLMs will be analyzed. The analysis will look at the underlying technology and results of these models and answer questions such as what these open-source models are doing differently to challenge the norms and why they have lower computational requirements.

Abstract: Investigating Indices of Impermanence

The human response to uncertainty, impermanence, and the unknowable is centrally located at the emergence of cultural and social practice. Discomfort tied to the fluctuating state of the knowable can lead to an outsized cultural emphasis on rigid systems of dissecting, parsing, naming, and sorting, whether at the level of culturally defined gender roles or of linguistic analysis. Digital text analysis, despite its rootedness in concepts of precision, empirical process, and logic, offers outcomes akin to a blurry photograph of a subject in motion. It does provide evidence of materiality, but the details, origin, trajectory, and ongoing development of its subject fail to manifest under its lens. Feminism does not ask us to disregard digital text analysis because of its limitations; it asks us to consider its process and outcomes as circumscribed evidence of an iteration of ongoing knowledge creation, impacted by the interventions of researchers, authors, and editors.

As exemplified by Standpoint Theory, which asks us to recognize knowledge as stemming from social position and therefore unfixed and subjective, and by the practice of acknowledging what is not named, not performed, and not visible in representations of information and experience, feminism pushes back against the concept of a finite and universally experienced perception of the world. This work argues that these feminist practices, inherently linked to Barthes’s discussion of a “work” as an iteration or “fragment of substance” in relation to a “text” as an evolving formulation or “methodological field,” are critical in examining the limitations of digital text analysis in documenting the complex, transient, and embedded knowledge referenced in the literary works it seeks to investigate. Although text analysis may capture evidence of subjectivity and social performance, unearthing the depth of the underlying “methodological field” from which the work was derived requires a complex contextual framework outside the purview of current digital text analysis tools.

  • Eckert, Penelope, and Sally McConnell-Ginet. Language and Gender. 2nd Edition. Cambridge: Cambridge UP, 2013. pp. 1–36.
  • D’Ignazio, Catherine, and Lauren Klein. “Chapter Two: On Rational, Scientific, Objective Viewpoints from Mythical, Imaginary, Impossible Standpoints” and “Chapter 6: The Numbers Don’t Speak for Themselves.” Data Feminism. Cambridge, MA: MIT Press, 2020.
  • Barthes, Roland. The Rustle of Language. (R. Howard, Trans.). Farrar, Straus and Giroux, 1986.
  • McGann, Jerome J. “Introduction: Texts and Textualities,” “The Textual Condition,” and “How to Read a Book.” The Textual Condition. Princeton, NJ: Princeton UP, 1991. Print. Princeton Studies in Culture/Power/History.

Abstract for Roundtable

In Data Feminism, D’Ignazio and Klein identify that data is not objective and reinforces existing social inequalities. Consequently, studying the hidden biases within a text is an important step in building a feminist analysis.

Intersectional feminist theories inform us that social inequalities are better reflected “not by a single axis of social division, but by many axes that work together and influence each other” (Collins & Bilge 2016, p. 2).

Following this idea, to properly critique text analysis, a feminist model should be bi-directional and multi-dimensional (spatial rather than scalar), encoding in itself the context in which words are used as it relates to the social divisions at play.

Models like LDA can decode topics but are not context-aware or spatial. Earlier word-embedding models like word2vec are spatial but not context-aware.

The word-embedding model BERT can be suitable in this case. BERT is context-aware and can capture the sense in which a word is used. Being a multi-dimensional model, it allows intersectional analysis to be performed, uncovering the relationships between different contextual uses of words. With its sentiment-analysis and opinion-mining capabilities, we can uncover the attitudes expressed in a text concerning different social identities. Given the model’s customizability, it can also be trained on a specific domain.

However, BERT is notoriously computationally intensive. For feminist scholars, that is an issue in terms of both environmental impact and accessibility. To achieve compression, we can use BERT to create context-aware word embeddings but apply knowledge distillation and pruning to reduce computational intensity, optimizing for maximum accuracy.
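The pruning step mentioned above can be sketched in miniature (pure Python on a toy weight matrix of my own invention; real BERT pruning operates on millions of parameters with deep-learning libraries, and the sparsity level here is arbitrary):

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights.

    A toy version of magnitude pruning: weights closest to zero
    contribute least to the output, so dropping them trades a
    little accuracy for a lighter model.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    cutoff_index = int(len(flat) * sparsity)
    threshold = flat[cutoff_index] if cutoff_index < len(flat) else float("inf")
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

# Toy 2x4 weight matrix standing in for one layer of a model.
layer = [
    [0.9, -0.05, 0.4, 0.01],
    [-0.7, 0.02, -0.3, 0.6],
]
pruned = magnitude_prune(layer, sparsity=0.5)
zeros = sum(1 for row in pruned for w in row if w == 0.0)
print(pruned)   # half of the weights have been zeroed out
print(zeros)
```

Knowledge distillation, by contrast, trains a smaller "student" model to reproduce the larger model's outputs rather than deleting weights; both techniques shrink the computational footprint, which is the accessibility and environmental point made above.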

Abstract for roundtable – Co-opting feminism in politics

The increased participation of women in politics, not only in government but as active voters, has contributed to tipping the balance in favor of those who best represent their interests. The last two Democratic presidential winners, for example, saw a higher turnout of female voters than male voters, which contributed to their success. This trend, however, has been captured by those crafting narratives around political campaigns and used dishonestly, co-opting the language of feminist movements to further agendas that may directly or indirectly promote gender inequality, a phenomenon referred to as “purplewashing.” The argument surrounding this problem lies in the need to implement language-recognition models able to identify the co-opting of feminist language in political discourse, allowing for the dissemination of a more transparent ideology, as well as the need to ensure diversity in the data sets, utilize both quantitative and qualitative methods, and examine intersectionality across gender, race, class, and sexuality, in order to reduce bias and guarantee that data is understood within a contextual framework. While the risk of perpetuating biases with computerized tools needs to be acknowledged, there are also opportunities that may prove crucial for cutting ties with traditional politics and challenging male-dominated structures.

Abstract for Round Table

Artificial Intelligence (AI) systems are developing rapidly, with advanced Deep Neural Networks (DNNs) that act like biological neurons. Therefore, similarities between humans and AI are nothing but expected. For this to happen, AI needs to be trained on and exposed to real-world data. The problem of bias arises here. The datasets on which models are trained are not sufficiently diverse (for example, in facial recognition systems), and they are gender-biased as well. The worst part is that a model can show high accuracy and still be biased, which usually goes unnoticed because of the prevailing supremacy of certain groups of people. Data Feminism’s “Data is Power” chapter likewise discusses failed systems in the computational world caused by an unequal distribution of power that benefits a small group of people at the expense of everyone else.

We can see various examples of AI models adopting gender bias and replicating outdated views (at least, not how we want our society to progress). For example, if the training dataset does not include enough contributions from women, there will be holes in the AI’s knowledge as well. So if an AI wired with such bias becomes standardized, that is a big problem. If AI fails to understand the fundamental power differentials between women and men, then is feminist text analysis possible using deep neural networks without any bias? Maybe, if feminist approaches are introduced at the initial phase of training an AI model, there is still some hope. However, my stance lies on the opposite side as well: if biases are unavoidable in real life, then how is it possible for them not to be an unavoidable aspect of new technologies? After all, AI is created by humans, modeled on the human brain, and trained on data created by humans, which makes it more complex. The solution I see here is a need for diverse data, for which the binary system needs to be wiped out.

Abstract for the Roundtable

Utilizing Feminist Text Analysis in Historical Imagination and Cultural Narrative

Is history a factual account of the past? How do we understand the interpretive and narrative frameworks constructed by historians? This presentation explores the potential of a feminist approach to historical imagination, re-enactment, and cultural narratives, challenging the notion of objectivity in history. Drawing from R. G. Collingwood’s seminal book The Idea of History and Joan Wallach Scott’s landmark work Gender and the Politics of History, I examine how feminist perspectives enrich our understanding of the past.

Moving to the realm of text analysis and computational studies, I aim to examine how we reconstruct the past through the discovery and presentation of patterns, trends, themes, and topics, particularly when faced with scanty documentary evidence. Reflecting on the works of scholars like Catherine D’Ignazio, Judith Fetterley, Lauren Klein, and Laura Mandell, I discuss the definition of “evidence” when implementing computational methods for inquiries into women’s history. I consider questions like: what counts as evidence from textual or other forms of historical data? Could distorted understandings occur in constructing the past, and how might computational methods intensify or mitigate this issue? Should we project our modern values onto the past when designing computational approaches, or should we avoid doing so?

  • Collingwood, R. G. The Idea of History. Oxford: Oxford University Press, 1946.
  • D’Ignazio, Catherine, and Lauren F. Klein. Data Feminism. Cambridge, MA: The MIT Press, 2020.
  • Fetterley, Judith. The Resisting Reader: A Feminist Approach to American Fiction. Bloomington: Indiana University Press, 1978.
  • Mandell, Laura. “Gender and Cultural Analytics: Finding or Making Stereotypes?” Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. Minneapolis: University of Minnesota Press, 2019.
  • Scott, Joan Wallach. Gender and the Politics of History. New York: Columbia University Press, 1988.