I’ve often wondered why anyone would need to build an algorithm that predicts the gender or ethnicity of the author of a text. To me, it feels a bit creepy and teeters on the verge of a Big Brother reality. One of the assigned readings related to this was a Medium article entitled “AI Ethics Identifying Your Ethnicity and Gender” by Allen Jiang.
Blog articles are meant to be approachable to all audiences, and even though this is a highly divisive topic, I thought at first that the author did a good job of explaining AI and how it can be used to understand ethnicity and gender. Jiang gave a few business case examples of why one would do this, including, but not limited to, better customer experience and better customer segmentation.
However, one of the first sentences, as discussed in class, reads as follows: “This is an analogous question to: if we had complete discretion, would we teach our children to recognize someone’s ethnicity.” The author uses this comparison as the rationale for this type of AI model. On the surface, one could glance at this and continue reading without question. But after taking a minute to think, it becomes clear that this analogy is not valid.
The author is equating the human experience to that of computers. We teach children to be accepting of all people, even when they may look different from themselves. We don’t ask a child to point out their friend’s race or ethnicity. While we need to be aware of our surroundings in order to be sensitive at a higher level, that is not what this AI model is being used for. Additionally, humans learn in a completely different manner than computers. Human learning is based on experiences and emotions. Computers don’t have emotions or experiences. Computers are programmed to make decisions based on decision trees, dictionaries and other methods, and this is what the author is equating to learning. This is procedural knowledge, not experiential.
Even if the author were to use the data collected (assigning gender to text from celebrity tweets) for business purposes, what would be the impact? One impact could be reinforcing stereotypes and gender binaries. The results from the experiment could also mislead the business, leaving it without a true understanding of its customers’ needs, wants and preferences. Additionally, looking at the results from the experiment, the accuracy rate is only 72%. That is just 22 percentage points above the 50% one would expect from simply guessing whether a tweet was written by a male or a female. Ultimately, a poor model leads to a poor proxy.
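To make the comparison between the 72% accuracy and the 50% guessing rate concrete, here is a minimal sketch of the arithmetic in Python. The function name `compare_to_baseline` is my own and purely illustrative, and the 50% baseline assumes the tweets are split evenly between the two labels, which is what the guessing comparison implies.

```python
def compare_to_baseline(accuracy: float, baseline: float) -> dict:
    """Compare a model's accuracy with a guessing baseline.

    Both arguments are fractions between 0 and 1.
    """
    gap = accuracy - baseline
    return {
        # Absolute gap, in percentage points
        "point_gap": round(gap * 100, 1),
        # Gap relative to the baseline itself
        "relative_gain": round(gap / baseline, 2),
        # Share of predictions that are still wrong
        "error_rate": round(1 - accuracy, 2),
    }


# Figures from the article: 72% accuracy versus a 50% guessing rate
# (the 50% figure assumes an even split between the two labels).
result = compare_to_baseline(accuracy=0.72, baseline=0.50)
print(result)
```

With these figures, the model sits 22 percentage points above guessing and still mislabels more than one tweet in four.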
Perhaps one day I’ll come across a compelling argument for why AI would be helpful in detecting gender or ethnicity, but today is not that day.