The password conundrum
By Alon Nir
0. Intro
Sometimes interesting opportunities can emerge from unfavorable situations. The tense diplomatic atmosphere between Israel and Turkey over the past couple of months brought on a cyber-attack from the Turkish side. A major Israeli apartment-listing website was hacked and so was Pizza Hut’s local website. The credentials of over 100,000 user accounts (roughly 2% of internet users in the country) were revealed and published on dubious Turkish forums. Naturally, it wasn’t long before these lucrative spreadsheets, containing usernames, email addresses and passwords, became so widespread that anyone with basic googling abilities could find them. One person who got his hands on these files comes from a profession no less defamed than computer-hacking. You guessed it – an economist. Me.
I took a look at the spreadsheets and after some basic analysis I was left with a few interesting insights. The data reveal the extent to which people fail to think creatively and incorporate even a touch of randomness in their username and password selection.
1. Popular Passwords
My analysis focused on a spreadsheet containing 31,588 users of the above-mentioned apartment-listing website. I chose it because that website was (evidently) particularly lenient in the registration process and didn’t even restrict the choice of password. One-digit password? Not a problem. This leniency probably isn’t the best security policy the website could adopt, but it is valuable for our analysis as it shows what people will do when they are unconstrained.
15,820 of these 31,588 registered users (slightly over 50%!) used their email address as their username. Furthermore, 675 people (over 2%) picked their phone number as a username. Neither of these choices is considered very secure, since phone numbers, and email addresses in particular, are easy to find out.
While these two bits of information give an indication of a lack of creativity on the users’ part, the really interesting discovery is their selection of passwords.
The most common password was 123456 (584 users), with 1234 as the runner-up (569) and 12345 coming in third (388). All in all, 1786 passwords (5.65%!) consisted of consecutive increasing numerals. This means that one person in 18 didn’t muster the cognitive capacity to generate a password more intricate than 1234 and the like.
788 people (roughly 2.5%, or one in forty people) chose a password identical to their username.
417 people (1.32%) chose a password made up of identical digits (e.g. 1111).
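For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the shares from the counts quoted above. The counts and the total of 31,588 come straight from the text; the helper name `share` and the dictionary labels are hypothetical, chosen only for illustration.

```python
# Counts quoted in the text, out of 31,588 registered users.
TOTAL_USERS = 31_588

counts = {
    "email address as username": 15_820,
    "phone number as username": 675,
    "consecutive increasing digits": 1_786,
    "password identical to username": 788,
    "identical digits": 417,
}


def share(count, total=TOTAL_USERS):
    """Return (percentage of total, N such that the share is 'one in N')."""
    return 100 * count / total, total / count


for label, count in counts.items():
    pct, one_in = share(count)
    print(f"{label}: {pct:.2f}% (about one in {one_in:.0f})")
```

Running it should reproduce the rounded figures in the text, such as 5.65% (about one in 18) for the consecutive-digit passwords.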
Keyboard patterns were ubiquitous, horizontal in particular: on top of the 1786 ‘123X’ passwords mentioned above, 123 passwords began with ‘qwe’ (including 25 instances of ‘qwerty’), 41 with ‘asd’, and 31 with ‘zxc’. It’s interesting to see how the frequency of these patterns falls as we go to a lower line on the keyboard. A similar distribution appears when looking at vertical lines on the keyboard, though the frequency is substantially lower (‘1qaz’ and ‘qaz’ make up 83 observations combined).
48 passwords began with ‘abc’ (e.g. ‘abcdef’, ‘abc123’, etc.).
And finally, 69 passwords had variants of the actual word ‘password’, with no fewer than 29 exact matches.
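The categories above can be sketched as a small classifier. This is only an illustration under stated assumptions, not the procedure actually used for the analysis: the function names, the category labels, the order in which rules are checked, and the exact rule definitions (for instance, counting any run of three or more digits that each increase by one as "consecutive") are all assumptions. The sample accounts are made up and are not taken from the leaked data.

```python
from collections import Counter

# Assumption: the three horizontal keyboard-row prefixes named in the text.
KEYBOARD_PREFIXES = ("qwe", "asd", "zxc")


def is_increasing_run(pw):
    """True if pw is 3+ ASCII digits, each one greater than the previous."""
    return (
        len(pw) >= 3
        and pw.isascii()
        and pw.isdigit()
        and all(int(b) - int(a) == 1 for a, b in zip(pw, pw[1:]))
    )


def classify(username, password):
    """Return the first matching category label, or 'other'."""
    pw = password.lower()
    if pw == username.lower():
        return "same as username"
    if len(pw) >= 2 and pw.isdigit() and len(set(pw)) == 1:
        return "identical digits"
    if is_increasing_run(pw):
        return "consecutive digits"
    if pw.startswith(KEYBOARD_PREFIXES):
        return "keyboard row"
    if pw.startswith("abc"):
        return "starts with abc"
    if "password" in pw:
        return "contains 'password'"
    return "other"


# Made-up accounts for illustration only.
sample = [
    ("john", "123456"),
    ("jane", "1111"),
    ("dave", "qwerty"),
    ("ruth", "abc123"),
    ("omer", "ruth2011"),
    ("noa", "noa"),
]

tally = Counter(classify(user, pw) for user, pw in sample)
for category, n in tally.most_common():
    print(f"{category}: {n}")
```

Because each password gets only the first label it matches, the tallies from a sketch like this need not line up exactly with the counts reported above, which may overlap or rest on different rules.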
What’s more interesting is the fact that, using this information, diligent mischief-makers hacked into tens of thousands of email and Facebook accounts, which indicates that a high percentage of the people in our sample use the same (trivial) password for different websites. It also refutes arguments that people carelessly entered an easy password because they didn’t care much for their account on that particular website.
Lastly, I looked at the spreadsheet with Pizza Hut’s users’ credentials, hoping something would catch my eye and help me gain a better insight into password selection. I didn’t have to look for long, as roughly 200 people (out of around 70K accounts, I must mention) chose ‘pizza’ or something similar as their password. This got me thinking, and I suspect that password selection might be influenced by cognitive availability. I went back to the original data and found that 89 users had the name of the website as part of their password. Other passwords (though much rarer) were nouns like ‘coffee’ and brand names like ‘samsung’, ‘acer’, ‘cocacola’, ‘nokia’ and others, all of which can be attributed to physical objects just in front of the user’s eyes or in his hand. Add this to the 2,000 or so patterned passwords I mentioned earlier (visual availability on the keyboard), and you get a plausible explanation in my view.
2. Explaining the Findings
So what can account for this password picking behavior? A few possible explanations come to mind. Let me describe them by using John, an imaginary typical internet user, as an example.
One explanation is that when John first embarks on setting up a new account at a website, he knows he’ll get a blank profile when the registration process is done. Since his profile will contain no information, the ‘price’ (in terms of lost information, contacts, time, etc.) John will pay if an amicable hacker takes over his account is close to zero. Hence, he is reluctant to strain himself making up, and remembering, a truly unique password.
The problem starts a while later, when John’s email box is already full of valuable correspondence, and his Facebook page is populated with hundreds of friends (including a few flirtatious Janes). At that point a password that is harder to guess is of true value, but John irrationally sticks to the password he already has. It can be because he’s a terrible procrastinator, forgetful, or even due to the sunk cost fallacy – in his mind John already “paid” the price (in terms of mental resources and time spent) for setting a password, and that’s why he refrains from going through the process again. As time passes the incentive to change the password becomes greater and greater, but try telling that to John.
3. Conclusion
When I first got my hands on the coveted files, I did not expect to find password selection so interesting, but the more I go over the data, and the more I think about it, the more I see the value of studying the way people choose passwords. As you can see from the above analysis, it’s an example of a decision-making problem where behavioral and cognitive processes, and biases, come into play. Since it’s a one-shot game and the decision is kept strictly confidential (until, of course, someone exploits a security breach), the setting is rather simplified and the observed behavior is of research value. In a way, we have here a form of natural experiment; it’s just unfortunate the way the data were obtained.
If you have any other theories explaining password selection, please feel free to share them in the comments below. Don’t worry, no registration is required ….