A dataset with all of the Marvel characters (there are 16,376 of them!) provides fertile ground to explore society's biases. The data lists character attributes such as gender, hair color, eye color, and sexual orientation. Not only that, but it also lists whether each of these fictional characters is labeled as 'good' or 'bad.'

After learning about this dataset, I quickly loaded it into a Jupyter notebook and started zipping through cells. I found well-established trends, such as men being far more heavily represented than women, so I kept exploring.

Digging in further, I found a strong correlation between lighter phenotypes (blond hair and blue eyes) and whether a character is portrayed as good or bad. Since the dataset has no skin color attribute, hair and eye color served as a stand-in for lighter complexions.

Over half of Marvel characters with lighter hair and eyes are ‘good,’ whereas just over half of all other characters are ‘bad’

A chart showing that characters with light hair and eyes are portrayed as good much more often than characters with darker hair and eyes: 56% of characters with light hair + eyes are good and 29% are bad, while 33% of all other characters are good and 51% are bad (remaining characters are neutral).
Data: fivethirtyeight
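For readers who want to reproduce this kind of comparison, here is a minimal sketch of the grouping step in plain Python. It is not the notebook code behind the chart. The field names `HAIR`, `EYE`, and `ALIGN`, the value strings such as "Blond Hair", and the rule that "light" means blond hair plus blue eyes are all assumptions, and the rows are toy data made up for illustration, not figures from the dataset.

```python
from collections import Counter, defaultdict

# Toy rows for illustration only; these are NOT real figures.
# Field names and value strings are assumptions about the CSV layout.
rows = [
    {"HAIR": "Blond Hair", "EYE": "Blue Eyes", "ALIGN": "Good Characters"},
    {"HAIR": "Blond Hair", "EYE": "Blue Eyes", "ALIGN": "Good Characters"},
    {"HAIR": "Blond Hair", "EYE": "Blue Eyes", "ALIGN": "Bad Characters"},
    {"HAIR": "Blond Hair", "EYE": "Blue Eyes", "ALIGN": "Neutral Characters"},
    {"HAIR": "Black Hair", "EYE": "Brown Eyes", "ALIGN": "Bad Characters"},
    {"HAIR": "Black Hair", "EYE": "Brown Eyes", "ALIGN": "Bad Characters"},
    {"HAIR": "Brown Hair", "EYE": "Blue Eyes", "ALIGN": "Good Characters"},
    {"HAIR": "Red Hair", "EYE": "Green Eyes", "ALIGN": "Neutral Characters"},
]


def is_light(row):
    """Assumed definition of the 'light hair + eyes' group."""
    return row["HAIR"] == "Blond Hair" and row["EYE"] == "Blue Eyes"


def alignment_shares(rows):
    """Return {group: {alignment: share of that group's characters}}."""
    counts = defaultdict(Counter)
    for row in rows:
        group = "light" if is_light(row) else "other"
        counts[group][row["ALIGN"]] += 1

    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        shares[group] = {align: n / total for align, n in counter.items()}
    return shares


shares = alignment_shares(rows)
for group, by_align in shares.items():
    for align, share in sorted(by_align.items()):
        print(f"{group:>5}  {align:<20} {share:.0%}")
```

Dividing by each group's own total matters here: the two groups are very different sizes, so raw counts would not be comparable.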

A mentor suggested I look at whether the trend of lighter complexions being portrayed as "good" held over time. I turned back to the data and brought in the year of first appearance for each character.

The results show that the Marvel canon started out with a wide gap in who was portrayed as good: characters with lighter hair and eye colors were overwhelmingly portrayed as good. As time went on, however, that gap narrowed.

Over time, the bias towards introducing new characters with lighter hair and eyes as ‘good’ has diminished

A chart showing that the difference between how many characters with light hair and eyes are portrayed as good, and how many characters with dark hair and eyes are portrayed as good has lessened over time. It plots two lines by year of first appearance, 1940 to 2010, on a scale of 0 to 100%: the percent of new characters with light hair + eyes who are labeled Good, and the percent of all other new characters who are labeled Good. Annotation: gap has almost disappeared over time.
Data: fivethirtyeight
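A trend like this can be computed by bucketing characters by the period in which they first appeared and taking the share labeled good within each group. The sketch below buckets by decade, which is my own choice and may not match how the chart was built. The `Year` and `ALIGN` field names and the precomputed `light` flag are assumptions, and the rows are invented toy data, not results from the dataset.

```python
from collections import defaultdict

# Toy rows for illustration only; these are NOT real figures.
# "light" is an assumed, precomputed flag for light hair + eyes.
rows = [
    {"Year": 1941, "light": True, "ALIGN": "Good Characters"},
    {"Year": 1945, "light": True, "ALIGN": "Good Characters"},
    {"Year": 1943, "light": False, "ALIGN": "Bad Characters"},
    {"Year": 1948, "light": False, "ALIGN": "Neutral Characters"},
    {"Year": 2001, "light": True, "ALIGN": "Good Characters"},
    {"Year": 2005, "light": True, "ALIGN": "Bad Characters"},
    {"Year": 2003, "light": False, "ALIGN": "Good Characters"},
    {"Year": 2008, "light": False, "ALIGN": "Bad Characters"},
    {"Year": None, "light": False, "ALIGN": "Good Characters"},  # no year
]


def good_share_by_decade(rows):
    """Return {decade: {group: share of new characters labeled good}}."""
    # decade -> group -> [good_count, total_count]
    tally = defaultdict(lambda: {"light": [0, 0], "other": [0, 0]})
    for row in rows:
        if row["Year"] is None:
            continue  # skip characters with no first-appearance year
        decade = row["Year"] // 10 * 10
        group = "light" if row["light"] else "other"
        tally[decade][group][1] += 1
        if row["ALIGN"] == "Good Characters":
            tally[decade][group][0] += 1

    result = {}
    for decade in sorted(tally):
        result[decade] = {
            group: (good / total if total else None)
            for group, (good, total) in tally[decade].items()
        }
    return result


trend = good_share_by_decade(rows)
for decade, by_group in trend.items():
    print(decade, by_group)
```

A decade with no characters in one group yields `None` for that group rather than a misleading zero.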

The gap was closing, but this made me wonder if legacy characters still had an outsized influence. Unfortunately, I didn't have data on the number of appearances for each character by year, but I was able to bring in the overall number of appearances for each character. Again, there was a stark difference in representation.

‘Good’ characters with light hair and eyes appear over three times as often as all other ‘good’ characters

A chart showing that 'good' characters with light hair and eyes appear many more times than 'good' characters with dark hair and eyes: good characters with light hair + eyes have 89 average appearances, while all other good characters have 27 average appearances.
Data: fivethirtyeight
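The comparison of average appearances can be sketched the same way: keep only the good characters, split them into the two groups, and take the mean of each. The `APPEARANCES` and `ALIGN` field names and the `light` flag are assumptions, and the rows below are toy data invented for illustration, so the output has nothing to do with the 89 and 27 reported above.

```python
from statistics import mean

# Toy rows for illustration only; these are NOT real figures.
# "light" is an assumed, precomputed flag for light hair + eyes.
rows = [
    {"light": True, "ALIGN": "Good Characters", "APPEARANCES": 100},
    {"light": True, "ALIGN": "Good Characters", "APPEARANCES": 50},
    {"light": True, "ALIGN": "Bad Characters", "APPEARANCES": 400},
    {"light": False, "ALIGN": "Good Characters", "APPEARANCES": 30},
    {"light": False, "ALIGN": "Good Characters", "APPEARANCES": 10},
    {"light": False, "ALIGN": "Good Characters", "APPEARANCES": None},
    {"light": False, "ALIGN": "Neutral Characters", "APPEARANCES": 5},
]


def mean_appearances_of_good(rows):
    """Return {group: mean appearance count among good characters}."""
    by_group = {"light": [], "other": []}
    for row in rows:
        if row["ALIGN"] != "Good Characters":
            continue  # only good characters are compared
        if row["APPEARANCES"] is None:
            continue  # skip characters with no appearance count
        group = "light" if row["light"] else "other"
        by_group[group].append(row["APPEARANCES"])
    # Leave out a group entirely if it has no usable rows.
    return {group: mean(vals) for group, vals in by_group.items() if vals}


averages = mean_appearances_of_good(rows)
print(averages)
```

A mean can be pulled up by a handful of characters with very high appearance counts, so checking the median alongside it (for example with `statistics.median`) would be a reasonable follow-up.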