If, as Sara Ahmed writes, feminism is “a building project” (Living a Feminist Life, 14), then a critical point of feminist intervention in the textual analysis pipeline is in the selection, handling, and use of textual data. The text tokens — and the methods we use to parse, tag, and untangle them — are the nails and hammers, respectively, in that building project.
Every stage of identifying and preparing data (i.e., prior to its analysis) offers opportunities for feminist intervention. When working with an existing dataset (a readymade), we as researchers can interrogate its provenance, genealogy, potential for misuse, and more. Asking these questions can clue us in to the assumptions that shaped its collection.
If we collect data ourselves, we should look to domain experts to help us circumvent our own biases and situate knowledge within its local context; otherwise, we risk harming the communities the data claims to represent. When cleaning data, we have an opportunity to preserve difference and nuance by resisting the tendency to aggressively standardize data to make it “easier” to analyze.
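One way to picture cleaning that preserves difference is to keep each token's original spelling alongside any standardized form, rather than overwriting it. The following is a minimal sketch under stated assumptions, not a recommended pipeline: the `Token` class, the `light_normalize` and `tokenize_preserving` functions, and the sample sentence are all hypothetical and introduced only for illustration.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical token type that keeps the source spelling next to a normalized form."""
    original: str    # the text exactly as it appeared in the source
    normalized: str  # a lightly standardized form, intended only for matching


def light_normalize(text: str) -> str:
    # NFC composes Unicode characters without stripping accents;
    # casefold() removes case distinctions for comparison purposes.
    return unicodedata.normalize("NFC", text).casefold()


def tokenize_preserving(text: str) -> list[Token]:
    # Whitespace splitting is a deliberate simplification for this sketch.
    return [Token(original=w, normalized=light_normalize(w)) for w in text.split()]


# Illustrative sentence, not drawn from any real dataset.
tokens = tokenize_preserving("Mx. Peña and the womxn's collective")
```

Because the original form travels with each token, honorifics, diacritics, and community-specific spellings remain available to later analysis even when matching is done on the normalized form.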
We also have opportunities to rewrite or redistribute data in ways that actively oppose oppression — gender-based or otherwise — by revising or adding metadata, documenting and explaining datasets to promote more informed use, and even constructing alternative datasets that subvert dominant ideologies. These actions can not only bring our research efforts in line with a feminist aim, but also create the conditions for other researchers to do the same.
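As one illustration of documenting a dataset to promote more informed use, a short descriptive record could be saved alongside the data files. This is a sketch only: the `DatasetNote` class, its field names, and the `to_sidecar_json` helper are hypothetical, and the values are placeholders rather than descriptions of any real dataset.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetNote:
    """Hypothetical documentation record; the field names are illustrative only."""
    title: str
    collected_by: str
    collection_context: str
    known_gaps: list[str] = field(default_factory=list)
    discouraged_uses: list[str] = field(default_factory=list)


def to_sidecar_json(note: DatasetNote) -> str:
    # Serialize the note as JSON text so it can be stored next to the data.
    return json.dumps(asdict(note), indent=2, ensure_ascii=False)


# Placeholder values; a real note would be written with the people
# who know the material and its context.
note = DatasetNote(
    title="Example letters corpus",
    collected_by="(name of collecting person or institution)",
    collection_context="(when, where, and why the texts were gathered)",
    known_gaps=["(whose voices are missing or underrepresented)"],
    discouraged_uses=["(uses that could harm the people represented)"],
)
sidecar = to_sidecar_json(note)
```

Which fields belong in such a record is itself an interpretive decision; the ones shown here are only a starting point.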