Husband and wife: analysing gender issues through literary big data

Some time ago a friend pointed out to me the peculiar distribution of the word “gay” in English literature: relatively common in the 1800s, then in decline, then in massive recovery after the 1970s. Of course, the word is used here with two different meanings: the first (“light-hearted, carefree”) was more common in earlier centuries, while the second (“homosexual”) went mainstream in the latter part of the 20th century. All of this can be easily visualised using Google Ngrams.

This made me rather curious, because gender issues have often been written about in literature. The ways in which family scenes are depicted could also serve as a proxy for understanding the relationship between the genders, especially the strict, unchanging view of that relationship often upheld by traditionalists in our society.

So I charted four words: “man”, “woman”, “husband” and “wife”. The result is enlightening.

You see, it’s not just that “man” dominates. This can be explained in many ways, especially by the common use of “man” as a synonym of “human being”. The sudden growth in the latter part of the 1700s points to several phenomena of those years, from the Enlightenment to the French Revolution.

Some data points:

  • “husband” is rarely used, compared with “man”; the ratio is about 1 to 10
  • by contrast, “woman” and “wife” follow similar trends and are much closer to each other in frequency
  • “wife” was used more than “woman” until the late 1800s
  • “woman” becomes increasingly more common than “wife” after the 1970s.
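
The comparisons in the list above could be checked numerically once yearly frequencies for each word have been exported from Google Ngrams. What follows is only a minimal sketch: the `ratio` and `last_crossover` helpers are hypothetical names introduced for illustration, the series are assumed to be plain year-to-frequency dictionaries, and the toy numbers are invented placeholders, not real Ngram values.

```python
from typing import Dict, Optional

# Assumption: each word is represented as {year: relative frequency}.
Series = Dict[int, float]


def ratio(a: Series, b: Series) -> Series:
    """Year-by-year ratio a/b over the years both series share.

    Years where b is zero are skipped to avoid dividing by zero.
    """
    return {y: a[y] / b[y] for y in sorted(a.keys() & b.keys()) if b[y] > 0}


def last_crossover(a: Series, b: Series) -> Optional[int]:
    """First year from which a stays above b until the end of the shared range.

    Returns None if a is not above b in the final shared year.
    """
    crossover = None
    for y in sorted(a.keys() & b.keys()):
        if a[y] > b[y]:
            if crossover is None:
                crossover = y  # start of a new run where a leads
        else:
            crossover = None  # the run was interrupted, start over
    return crossover


# Placeholder numbers, invented purely to exercise the functions.
# They are NOT real Ngram frequencies.
toy_woman = {1800: 1.0, 1850: 2.0, 1900: 4.0, 1950: 5.0, 2000: 9.0}
toy_wife = {1800: 2.0, 1850: 3.0, 1900: 3.0, 1950: 3.0, 2000: 3.0}

print(ratio(toy_woman, toy_wife))
print(last_crossover(toy_woman, toy_wife))
```

With real exported data, the same two calls applied to “husband” and “man” would give the ratio mentioned in the first bullet, and applied to “woman” and “wife” would give the year in which one word overtakes the other for good.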

Isn’t that a rather accurate description of what happens not just in the English literary corpus but, more widely, in society?