Now with 68% more manliness! Or something.

December 3, 2010

It is honestly one of the wonders of the information age, the way that you can follow a link here and a link there and end up in all sorts of strange Interplaces. Today I was reading a thread at Pharyngula about creationist site Answers in Genesis using history-sniffing scripts to see what other sites you’ve visited. (The trick was originally developed by porn sites trying to see if you were also visiting their competitors. You’re safe if you’re on Opera, Chrome, or Safari, and the Firefox loophole was closed a few versions back, so if you’re all upgraded you should be fine.) From there I followed a link someone left in the comments about other uses of the script, which brought me to this old post at Mike On Ads:

One of the things that I always wanted to do but never got around to was to analyze a user’s browsing history to estimate age and gender. Of course the idea is definitely not new, in fact Xerox (of all companies??) has a patent on the whole process and I’m certain plenty of networks already do something of the sort… but what the heck, let’s have some fun!

So what I did is I modified the SocialHistory JS so that it polled the browser to find out which of the Quantcast top 10k sites were visited. I then apply the ratio of male to female users for each site and with some basic math determine a guestimate of your gender.

He built a widget that looks at your browsing history and tries to guess what sex you are based on the userbase of those sites – as in, someone who visits a lot of sites whose readership is heavily male is statistically more likely to be male, and so on. (The page itself goes into more detail on the maths.)

I am less than thrilled with the persistent use of the word ‘gender’ where he means ‘sex’; it is also a shame that all the Quantcast data is presented strictly in binary terms, male/female. (I doubt that the inclusion of genderqueer/non-binary people would alter the data very much – currently, at least, they’re in a tiny minority; but that’s not really the point, you know? The point is to not randomly deny people’s existence.) Working with what there is, though, still turns up some interesting tidbits.

Here are the relevant sites from my browser history, copied-and-pasted from the widget:

Site Male-Female Ratio
google.com 0.98
youtube.com 1
facebook.com 0.83
imdb.com 1.06
wordpress.com 0.98
nasa.gov 1.38
biblegateway.com 1.02
guardian.co.uk 1.33
ebay.co.uk 1.17
zazzle.com 0.75
icanhascheezburger.com 1.04
kongregate.com 1.41
behindthename.com 0.77
google.co.uk 1.35

(For comparison purposes, Quantcast starts classing site readerships as “heavily male” around the 1.85 mark, which is 65/35, and as “heavily female” at around 0.53, which is 35/65. In the first few hundred sites listed, the heaviest skew is pogo.com, which is 71% female for a score of 0.40.)
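If you want to flip between the two ways of writing these figures, the arithmetic is simple enough to sketch in a few lines of Python. The function names here are my own invention, not anything from Quantcast or the widget; the only thing taken from the data is that a score is the male share divided by the female share.

```python
def ratio_to_split(ratio):
    """Turn a male-to-female ratio into a (male %, female %) pair."""
    male = 100 * ratio / (1 + ratio)
    return male, 100 - male


def split_to_ratio(male_pct, female_pct):
    """Turn a percentage split back into a male-to-female ratio."""
    return male_pct / female_pct


if __name__ == "__main__":
    # The two "heavily skewed" thresholds mentioned above, plus parity.
    for ratio in (1.85, 0.53, 1.0):
        male, female = ratio_to_split(ratio)
        print(f"{ratio:.2f} -> {male:.0f}/{female:.0f}")
```

So a score above 1 means more men than women, a score below 1 the reverse, and 1 exactly is an even split.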

And here is what it decided, based on those sites:

Likelihood of you being FEMALE is 32%
Likelihood of you being MALE is 68%

So there you have it: more than twice as likely to be male. (Who knew?)
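For anyone wondering what the “basic math” might look like, here is a minimal sketch in Python. It rests on an assumption of mine rather than anything read from the widget’s source: that each site’s male-to-female ratio is treated as an independent piece of evidence, so the ratios are multiplied together as odds, starting from even odds. The names (`SITE_RATIOS`, `male_probability`, `prior_odds`) are hypothetical.

```python
from math import prod

# Ratios copied from the list above (male readers divided by female readers).
SITE_RATIOS = {
    "google.com": 0.98,
    "youtube.com": 1.0,
    "facebook.com": 0.83,
    "imdb.com": 1.06,
    "wordpress.com": 0.98,
    "nasa.gov": 1.38,
    "biblegateway.com": 1.02,
    "guardian.co.uk": 1.33,
    "ebay.co.uk": 1.17,
    "zazzle.com": 0.75,
    "icanhascheezburger.com": 1.04,
    "kongregate.com": 1.41,
    "behindthename.com": 0.77,
    "google.co.uk": 1.35,
}


def male_probability(ratios, prior_odds=1.0):
    """Combine per-site ratios into a probability of being male.

    Assumption: sites are independent, so their ratios multiply as odds.
    prior_odds=1.0 means starting from a 50/50 guess.
    """
    odds = prior_odds * prod(ratios)
    return odds / (1 + odds)


if __name__ == "__main__":
    p_male = male_probability(SITE_RATIOS.values())
    print(f"MALE {p_male:.0%}, FEMALE {1 - p_male:.0%}")
```

Run on the fourteen ratios in my list, this lands on roughly 68/32, which at least agrees with what the widget reported. Note that every site counts once, however often you visit it, and that one strongly skewed site can swing the result a long way.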

The thing is, I think a human searching my browsing history would be able to conclude in a matter of minutes that I’m a woman. My full history is fairly mixed: on the one hand there’s a ton of ladybusiness in there – Feministe, Tiger Beatdown, Girls With Slingshots, Pandagon. On the other there’s the Guardian, Doonesbury, xkcd, and Pharyngula, which likely skew heavily male. I suspect that a person presented with that mishmash – that is, a person whose guesses are going to be informed by cultural attitudes, as contrasted with a computer which happily/sadly does not know sexism exists – would interpret that kind of mix as “woman interested in masculine-coded things” rather than “man interested in feminine-coded things”, because the former is a lot more socially acceptable than the latter.

Obviously a lot of my favourite blog destinations are, if not small in absolute terms, definitely low-traffic next to Google.com and its 162 million US users a month, and so don’t show up on the Quantcast 10K list from which the script draws. That said, I’m still interested in the data it did get; I have, after all, been to all these places in my recent perambulations of the web, and I’m intrigued by some of their stats.

Firstly, Google. Being a Brit, it’s Google.co.uk that’s my default destination; the only reason Google.com is showing on my history at all is because I wanted to see if they’d put the St Andrew’s Day doodle up on any versions other than the UK one (they hadn’t). But look at those stats. Google US has a male-to-female ratio of 0.98 – not much to talk about; but Google UK’s ratio is 1.35, a distinct male skew. I am entirely baffled by this. Do British women just not use the Google? I AM CONFUSED.

Interesting that Facebook and WordPress both skew very slightly female. File it under the continuing categorisation of both socialisation and personal journaling habits as ladyish things to do. Would be interesting to see how others compare. (A start: according to Quantcast, Tumblr is 51/49 men to women – 1.04; Livejournal is estimated at 42/58 – 0.72.)

Can’t say I’m surprised that the Guardian skews male in its readership given how much of a Dude Paper they can occasionally be. Better about it than pretty much all the others in the country, I hasten to add, but still; the old boys’ network in newspaper publishing is going very strong.

Not especially surprised by Ebay, either.

Behindthename.com is a baby names site; I use it most often for character names, so with NaNoWriMo just gone past I’ve been consulting it fairly frequently. Given that All Things Baby are still constructed as the province of women, a 0.77 man/woman ratio seems about right.

And now the big hitters: NASA and Kongregate, look at you with your 1.38 and your 1.41. I was on NASA looking up this business about arsenic-using bacteria (not as exciting as I’d hoped, but still kind of cool) and Kongregate is my go-to site when I need a mindless Flash game to help cool down my brain.

For comparison, J was told he was 9% likely to be female and 91% likely to be male. A strange thing from his list was the differing profiles of two very similar Flash game sites: Kongregate, as previously noted, scores 1.41, but Armor Games (which hosts many of the same titles) only 0.89. The standout figure on his list was the Escapist website – an online gaming magazine – which has a score of 2.08, or approximately 67/33; twice as many male readers as female. He informs me, having done some maths, that omitting the Escapist from the list more than doubles the likelihood of his being female, according to the algorithm – a jump from 9% to 20%. (Clearly it’s just uber-manly.)

I would be very interested to know how other people score, and how the percentages the algorithm throws up correspond – or fail to correspond – to your actual sex. (I would expect a fair amount of hilarious inaccuracy, especially as it doesn’t weight links according to how many times you visit them.) Also any interesting/surprising figures it produces with regard to the sex ratio on particular sites. Anyone? Here’s the link again if you do want to join in.

5 Comments
  1. December 4, 2010 12:43 pm

    My score was fairly similar to yours – 23% odds of being female, 77% odds of being male. One thing that’s interesting is that science sites are all over the place. New Scientist and the NOAA skewed heavily male (1.44 and 1.41 respectively), the Science and Nature journals were pretty much neutral (1.04 for both) and the American Medical Association and the NHS both skewed female (0.72 and 0.8).

    I know different fields of science do have different gender ratios, but I’m genuinely surprised that the NOAA – environmental and climate science – was visited by so many more men than women, while the AMA site – medical journals, mostly – was visited mostly by women.

    The newspapers give some pretty interesting results too: the Guardian gets 1.33, as you mention, BBC news gets 1.44, the Telegraph and the Independent get 1.5, the Times 1.57, the Sun 1.7 and the Financial Times a whopping 2.08, making it as male-biased as the Escapist (ignoring the FT almost doubles my chance of being a woman from 23% to 38%). At the other end, though, the New York Times gets 1.17, bringing it close to parity.

    By far the lowest score I got (0.3) was for a site called Blingee, which lets you put hilariously awful animated gifs onto any image. I was only visiting it ironically, so I’m not sure it should count, but then I think a good chunk of its userbase does the same.

  2. December 4, 2010 1:49 pm

    @atomicspin Bwahaha suuuure you were visiting Blingee ironically!

    I’m 78% likely to be female. I am totally not all respectable and into science or other masculine-coded things though, so I guess fair play to them. I mean, fanfiction.net. It’s pretty much written that I am a girl.

    Wordreference is also full of the ladies, apparently. Languages and all, I presume.

    Ultimate-guitar.com however, is for the boys.

    You know, this is not something I’m super proud of, but I always get kind of disappointed when these things guess, always correctly, that I’m female. And I’m not entirely sure how much of it has to do with “damn I’m predictable and therefore not SPECIAL and A MAVERICK” and how much has to do with “bah, I am into girly things!”

    I suspect that the answer would make me more than a little ashamed.

  3. Paul Skinner
    December 4, 2010 2:12 pm

    Amusingly I get different results (by quite a margin) on each of my computers.

    Likelihood of you being FEMALE is 39%
    Likelihood of you being MALE is 61%

    and

    Likelihood of you being FEMALE is 7%
    Likelihood of you being MALE is 93%

    The Pirate Bay and Mac Rumors [sic] seem to have quite a massive male-over-female bias: 2.13 and 2.08 respectively.

  4. December 5, 2010 12:56 pm

    Likelihood of you being FEMALE is 21%
    Likelihood of you being MALE is 79%

    Site Male-Female Ratio
    blogspot.com 1.08
    photobucket.com 0.85
    slate.com 1.11
    guardian.co.uk 1.33
    ebay.co.uk 1.17
    amazon.co.uk 1.11
    independent.co.uk 1.5
    joystiq.com 1.44

    And it doesn’t count PubMed, E2 or any number of other awesome sites. My lack of social networking probably skews me towards male though.

  5. Schala
    January 6, 2011 7:14 am

    majorgeeks.com 1.9 – a very male ratio

    the thing says 30% female 70% male

    Other interesting sites:
    photobucket.com 0.85
    paypal.com 1.04
    imageshack.us 0.87
    google.ca 1.33
    hotmail.com 0.83
    wikipedia.org 1.08
