Why passwords have never been weaker—and crackers have never been stronger

In late 2010, Sean Brooks received three e-mails over a span of 30 hours warning that his accounts on LinkedIn, Battle.net, and other popular websites were at risk. He was tempted to dismiss them as hoaxes—until he noticed they included specifics that weren't typical of mass-produced phishing scams. The e-mails said that his login credentials for various Gawker websites had been exposed by hackers who rooted the sites' servers, then bragged about it online; if Brooks used the same e-mail and password for other accounts, they would be compromised too.
The warnings Brooks and millions of other people received that December weren't fabrications. Within hours of anonymous hackers penetrating Gawker servers and exposing cryptographically protected passwords for 1.3 million of its users, botnets were cracking the passwords and using them to commandeer Twitter accounts and send spam. Over the next few days, the sites advising or requiring their users to change passwords expanded to include Twitter, Amazon, and Yahoo.
"The danger of weak password habits is becoming increasingly well-recognized," said Brooks, who at the time blogged about the warnings as the Program Associate for the Center for Democracy and Technology. The warnings, he told me, "show [that] these companies understand how a security breach outside their systems can create a vulnerability within their networks."
The ancient art of password cracking has advanced further in the past five years than it did in the previous several decades combined. At the same time, the dangerous practice of password reuse has surged. The result: security provided by the average password in 2012 has never been weaker.

A new world

The average Web user maintains 25 separate accounts but uses just 6.5 passwords to protect them, according to a landmark study from 2007. As the Gawker breach demonstrated, such password reuse, combined with the frequent use of e-mail addresses as user names, means that once hackers have plucked login credentials from one site, they often have the means to compromise dozens of other accounts, too.
Newer hardware and modern techniques have also contributed to the rise in password cracking. Now used increasingly for computing, graphics processors allow password-cracking programs to work thousands of times faster than they did just a decade ago on similarly priced PCs that used traditional CPUs alone. A PC running a single AMD Radeon HD7970 GPU, for instance, can try on average an astounding 8.2 billion password combinations each second, depending on the algorithm used to scramble them. A decade ago, such speeds were possible only with pricey supercomputers.
The advances don't stop there. PCs equipped with two or more $500 GPUs can achieve speeds two, three, or more times faster, and free password cracking programs such as oclHashcat-plus will run on many of them with little or no tinkering. Hackers running such gear also work in tandem in online forums, which allow them to pool resources and know-how to crack lists of 100,000 or more passwords in just hours.
Most importantly, a series of leaks over the past few years containing more than 100 million real-world passwords has provided crackers with important new insights about how people in different walks of life choose passwords on different sites or in different settings. The ever-growing list of leaked passwords allows programmers to write rules that make cracking algorithms faster and more accurate; password attacks have become cut-and-paste exercises that even script kiddies can perform with ease.
"It has been night and day, the amount of improvement," said Rick Redman, a penetration tester for security consultants KoreLogic and organizer of the Crack Me If You Can password contest at the past three Defcon hacker conferences. "It's been an exciting year for password crackers because of the amount of data. Cracking 16-character passwords is something I could not do four or five years ago, and it's not because I have more computers now."
This $12,000 computer, dubbed Project Erebus v2.5 by creator d3ad0ne, contains eight AMD Radeon HD7970 GPU cards. Running version 0.10 of oclHashcat-lite, it requires just 12 hours to brute force the entire keyspace for any eight-character password containing upper- or lower-case letters, digits or symbols. It aided Team Hashcat in winning this year's Crack Me If You Can contest.
At any given time, Redman is likely to be running thousands of cryptographically hashed passwords through a PC containing four of Nvidia's GeForce GTX 480 graphics cards. It's an "older machine," he conceded, but it still gives him the ability to cycle through as many as 6.2 billion combinations every second. He typically uses a dictionary file containing about 26 million words, combined with programming rules that greatly extend its effectiveness by adding numbers, punctuation, and other characters to each list entry. Depending on the job, he sometimes uses a 60 million-strong word list and something known as "rainbow tables," which are described later in this article.
As a penetration tester who gets paid to pierce the defenses of Fortune 500 companies, Redman tries to spot weaknesses before criminal hackers exploit them on his customers' networks. One of the key ways he stays ahead is by downloading hash lists that are dumped almost every day on pastebin.com and other sites to see if any belong to the organizations he is contracted to protect.
Recently, he recovered a 13-character password that he had spent several months trying to crack. To protect the account holder, he declined to reveal the precise combination of characters and instead made up the imaginary passphrase "Sup3rThinkers" (minus the quotation marks) to illustrate his breakthrough. "Sup3rThinkers" follows a number of patterns that have become common: it opens with a common, five-letter word that begins with a capitalized letter and substitutes a 3 for an E, followed by a common, seven-letter word that also begins with a capital letter. While the speed of his system didn't hurt, cracking the password was largely the result of the collective codebreaking expertise developed online over the past few years.
The most important single contribution to cracking knowledge came in late 2009, when an SQL injection attack against online games service RockYou.com exposed 32 million plaintext passwords used by its members to log in to their accounts. The passcodes, which came to 14.3 million once duplicates were removed, were posted online; almost overnight, the unprecedented corpus of real-world credentials changed the way whitehat and blackhat hackers alike cracked passwords.

Hashing it out

As in many password breaches, almost none of the 1.3 million Gawker credentials exposed in December 2010 contained human-readable passcodes. Instead, they had been converted into what are known as "hash values" by passing them through a one-way cryptographic function that creates an effectively unique sequence of characters for each plaintext input. When passed through the MD5 algorithm, for instance, the string "password" (minus the quotes) translates into "5f4dcc3b5aa765d61d8327deb882cf99".
Even minor changes to the plaintext input—say, "password1" or "Password"—result in vastly different hash values ("7c6a180b36896a0a8c02787eeafb0e4c" and "dc647eb65e6711e155375218212b3964" respectively). When processed by the SHA1 algorithm, the inputs "password", "password1", and "Password" result in "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8", "e38ad214943daad1d64c102faec29de4afe9da3d", and "8be3c943b1609fffbfc51aad666d0a04adf83c9d" respectively.
In theory, once a string has been converted into a hash value, it's impossible to revert it to plaintext using cryptographic means. Password cracking, then, is the practice of running plaintext guesses through the same cryptographic function used to generate a compromised hash. When the two hash values match, the password has been identified.
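The guess-and-compare loop described above can be sketched in a few lines of Python. This is a minimal illustration using the standard hashlib module, not a depiction of any particular cracking tool; the function names and the three-word guess list are made up for the example.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 hash of a string as lowercase hex."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def sha1_hex(text):
    """Return the SHA1 hash of a string as lowercase hex."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def crack(target_hash, guesses, hash_fn=md5_hex):
    """Hash each guess and return the first one that matches the target."""
    for guess in guesses:
        if hash_fn(guess) == target_hash:
            return guess
    return None  # no guess produced the target hash

# The MD5 value quoted in the article for "password".
leaked = "5f4dcc3b5aa765d61d8327deb882cf99"

# Hypothetical three-entry word list, for illustration only.
print(crack(leaked, ["letmein", "dragon", "password"]))  # password
```

Nothing is ever "decrypted" here: the hash function only runs forward, and the attacker's success depends entirely on the quality of the guesses.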
The RockYou dump was a watershed moment, but it turned out to be only the start of what's become a much larger cracking phenomenon. By putting 14 million of the most common passwords into the public domain, it allowed people attacking cryptographically protected password leaks to almost instantaneously crack the weakest passwords. That made it possible to devote more resources to cracking the stronger ones.
Within days of the Gawker breach, for instance, a large percentage of the password hashes had been converted to plaintext, a feat that gave crackers an even larger corpus of real-world passwords to inform future attacks. That collective body of passwords has only snowballed since then, and it grows ever larger with each passing breach. Just six days after the leak of 6.5 million LinkedIn password hashes in June, more than 90 percent of them were cracked. In the past year alone, Redman said, more than 100 million passwords have been published online, either in plaintext or in ciphertext that can be readily cracked.
"Now, it's like once a quarter you get another RockYou," Redman said.
A screenshot from oclHashcat as it cracks a list of password hashes leaked online.

We will RockYou

In the RockYou aftermath, everything changed. Gone were word lists compiled from Webster's and other dictionaries that were then modified in hopes of mimicking the words people actually used to access their e-mail and other online services. In their place went a single collection of letters, numbers, and symbols—including everything from pet names to cartoon characters—that would seed future password attacks.
"So it's no longer this theoretical word list of Klingon planets and stuff like that," Redman said of the RockYou list. "It's literally 'dragon' and 'princess' and stuff like that, and [the list] may crack 60 percent of a newly compromised website. Now you have 60 percent of the work done and you haven't done any thinking at all. You've just used your previous knowledge."
Almost as important as the precise words used to access millions of online accounts, the RockYou breach revealed the strategic thinking people often employed when they chose a passcode. For most people, the goal was to make the password both easy to remember and hard for others to guess. Not surprisingly, the RockYou list confirmed that nearly all capital letters come at the beginning of a password; almost all numbers and punctuation show up at the end. It also revealed a strong tendency to use first names followed by years, such as Julia1984 or Christopher1965.
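One way to surface such habits is to reduce each password to its "shape" (which character class sits in which position) and count how often each shape occurs. The sketch below is a toy illustration of that idea, not the method of any specific tool; the five-entry sample list is hypothetical, although several entries reuse examples from this article.

```python
from collections import Counter

def char_class(ch):
    """Map one character to a class code: u=upper, l=lower, d=digit, s=other."""
    if ch.isupper():
        return "u"
    if ch.islower():
        return "l"
    if ch.isdigit():
        return "d"
    return "s"

def shape(password):
    """Describe a password by the class of each of its characters."""
    return "".join(char_class(ch) for ch in password)

# Hypothetical sample; a real analysis would read millions of leaked entries.
sample = ["Julia1984", "Christopher1965", "dragon", "princess", "Maria2001"]

counts = Counter(shape(p) for p in sample)
print(shape("Julia1984"))     # ulllldddd
print(counts.most_common(1))  # [('ulllldddd', 2)]
```

On a large leak, the most common shapes tell a cracker which patterns are worth trying first.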
"Sup3rThinkers" wasn't included in the list of RockYou passwords, making it part of the 40 percent of hashes that require Redman to apply cracking techniques that go beyond a simple word-list attack. Fortunately for him, the RockYou corpus included both "sup3r" and "thinkers" as separate passwords. That allowed him to recover the password in question by appending each word in his list to every other word in the list. The technique is simple enough to do, although it increases the number of required guesses dramatically—from about 26 million, assuming the dictionary Redman uses most often, to about 676 trillion.
Other complex passwords require similar manipulations to be cracked. The RockYou list, and the hundred-million-plus passwords that have collectively been exposed in its aftermath, brought to light a plethora of other techniques people employ to protect simple passcodes from traditional dictionary attacks. One is adding numbers or non-alphanumeric characters such as "!!!" to them, usually at the end, but sometimes at the beginning. Another, known as "mangling," transforms words such as "super" or "princess" into "sup34" and "prince$$." Still others append a mirror image of the chosen word, so "book" becomes "bookkoob" and "password" becomes "passworddrowssap."
Passwords such as "mustacheehcatsum" (that's "mustache" spelled forward and then backward) may give the appearance of strong security, but they're easily cracked by isolating their patterns, then writing rules that augment the words contained in the RockYou dump and similar lists. For Redman to crack "Sup3rThinkers", he employed rules that directed his software to try not just "super" but also "Super", "sup3r", "Sup3r", "super!!!" and similar modifications. It then tried each of those words in combination with "thinkers", "Thinkers", "think3rs", and "Think3rs".
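The word-combining and rule steps can be sketched roughly as follows. This is a simplified illustration, not Redman's actual rule set: the three rules (swap e for 3, capitalize the first letter, append "!!!"), the mirror helper, and the two-word list are assumptions chosen to match the examples in the text.

```python
from itertools import product

def variants(word):
    """Apply a few hand-written rules to one dictionary word."""
    out = set()
    for base in (word, word.replace("e", "3")):  # swap every e for a 3
        for cased in (base, base.capitalize()):  # capitalize the first letter
            out.add(cased)
            out.add(cased + "!!!")               # append punctuation
    return out

def mirror(word):
    """Append the word's mirror image, e.g. book -> bookkoob."""
    return word + word[::-1]

def combine(words):
    """Pair every word with every word, so N words yield N * N guesses."""
    for left, right in product(words, repeat=2):
        yield left + right

# Hypothetical two-word stand-in for a 26-million-entry dictionary.
wordlist = ["super", "thinkers"]
expanded = sorted(set().union(*(variants(w) for w in wordlist)))
candidates = set(combine(expanded))

print("Sup3rThinkers" in candidates)  # True
print(mirror("mustache"))             # mustacheehcatsum
print(26_000_000 ** 2)                # 676000000000000 (about 676 trillion)
```

The last line shows why combining is costly: pairing a 26-million-word list with itself squares the number of guesses, even before any rules are applied.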
Such cracking techniques have existed for a decade, but they work far better now that the crackers possess a more intimate understanding of the ways people choose passwords.
"It's vastly different than it was [before] because of these massive password lists," said Rob Graham, CEO of penetration testing firm Errata Security. "We never had a really large password list to work from. Now that we do, we're learning how to remove the entropy from them. The state of the art of cracking is much more subtle in that before we were guessing in the dark."

A little finesse

That subtlety takes all sorts of forms. One promising technique is to use programs such as the open-source Passpal to reduce cracking time by identifying patterns exhibited in a statistically significant percentage of intercepted passwords. For example, as noted above, many website users have a propensity to append years to proper names, words, or other strings of text that contain a single capital letter at the beginning. Using brute-force techniques to crack the password Julia1984 would require 62^9 (roughly 13.5 quadrillion) possible combinations, a "keyspace" that's calculated by adding the number of possible letters (52) to the number of digits (10) and raising the sum to the power of nine (which in this example is the maximum number of password characters a cracker is targeting). Using an AMD Radeon HD7970, it would still take about 19 days to cycle through all the possibilities.
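The brute-force arithmetic can be checked directly. The sketch below assumes the 8.2 billion guesses per second cited earlier for a single HD7970; real throughput depends on the hashing algorithm, so the result is a back-of-the-envelope figure.

```python
# Assumed rate: the per-GPU figure quoted earlier in the article.
GUESSES_PER_SECOND = 8.2e9

LETTERS = 52  # upper- and lower-case letters
DIGITS = 10
LENGTH = 9    # characters in the target password

keyspace = (LETTERS + DIGITS) ** LENGTH
seconds = keyspace / GUESSES_PER_SECOND
days = seconds / 86_400  # seconds in a day

print(keyspace)        # 13537086546263552
print(round(days, 1))  # 19.1
```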
Using features built into password-cracking apps such as Hashcat and Extreme GPU Bruteforcer, the same password can be recovered in about 90 seconds by performing what's known as a mask attack. It works by intelligently reducing the keyspace to only those guesses likely to match a given pattern. Rather than trying aaaaa0000, ZZZZZ9999, and every possible combination in between, it tries a lower- or upper-case letter only for the first character, and tries only lower-case characters for the next four characters. It then appends all possible four-digit numbers to the end. The result is a drastically reduced keyspace of about 237.6 billion, or 52 * 26 * 26 * 26 * 26 * 10 * 10 * 10 * 10.
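A mask can be modeled as one character set per position, with the keyspace being the product of the set sizes. The sketch below illustrates that idea in plain Python; it does not use the mask syntax of Hashcat or any other tool, and the helper names are invented for the example.

```python
import string
from itertools import product
from math import prod

# One character set per position: any letter, four lower-case letters,
# then four digits, matching the pattern described above.
MASK = (
    [string.ascii_letters]
    + [string.ascii_lowercase] * 4
    + [string.digits] * 4
)

def mask_keyspace(mask):
    """Number of candidates the mask generates."""
    return prod(len(charset) for charset in mask)

def mask_candidates(mask):
    """Lazily yield every string that fits the mask."""
    for combo in product(*mask):
        yield "".join(combo)

def fits(mask, password):
    """True if the password would be generated by the mask."""
    return len(password) == len(mask) and all(
        ch in charset for ch, charset in zip(password, mask)
    )

print(mask_keyspace(MASK))          # 237627520000
print(next(mask_candidates(MASK)))  # aaaaa0000
print(fits(MASK, "Julia1984"))      # True
```

Compared with the full nine-character keyspace of 62^9, this mask is smaller by a factor of more than 50,000, which is where the time savings come from.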
An even more powerful technique is a hybrid attack. It combines a word list, like the one used by Redman, with rules to greatly expand the number of passwords those lists can crack. Rather than brute-forcing the five letters in Julia1984, hackers simply compile a list of first names for every single Facebook user and add them to a medium-sized dictionary of, say, 100 million words. While the attack requires more combinations than the mask attack above, specifically about 1 trillion (100 million * 10^4) possible strings, it's still a manageable number that takes only about two minutes using the same AMD 7970 card. The payoff, however, is more than worth the additional effort, since it will quickly crack Christopher2000, thomas1964, and scores of others.
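In outline, a hybrid attack of this kind takes each dictionary word and appends every possible four-digit suffix. The sketch below is a simplified illustration with a hypothetical three-name list; the timing estimate again assumes the 8.2 billion guesses per second cited earlier, which varies by algorithm.

```python
def hybrid(words, digits=4):
    """Yield each word followed by every zero-padded numeric suffix."""
    for word in words:
        for n in range(10 ** digits):
            yield f"{word}{n:0{digits}d}"

# Hypothetical stand-in for a 100-million-entry dictionary of names and words.
names = ["Julia", "Christopher", "thomas"]
candidates = set(hybrid(names))

print(len(candidates))                  # 30000
print("Christopher2000" in candidates)  # True
print("thomas1964" in candidates)       # True

# Scale of the full attack described above.
total = 100_000_000 * 10 ** 4
seconds = total / 8.2e9
print(total)           # 1000000000000
print(round(seconds))  # 122
```

Unlike the mask attack, the word portion here can be any length, which is why a single run covers both Julia1984 and Christopher2000.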
"The hybrid is my favorite attack," said Atom, the pseudonymous developer of Hashcat, whose team won this year's Crack Me if You Can contest at Defcon. "It's the most efficient. If I get a new hash list, let's say 500,000 hashes, I can crack 50 percent just with hybrid."
With half the passwords in a given breach recovered, cracking experts like Atom can use Passpal and other programs to isolate patterns that are unique to the website from which they came. They then write new rules to crack the remaining unknown passwords. More often than not, however, no amount of sophistication and high-end hardware is enough to quickly crack some hashes exposed in a server breach. To ensure they keep up with changing password choices, crackers will regularly brute-force crack some percentage of the unknown passwords, even when they contain as many as nine or more characters.
"It's very expensive, but you do it to improve your model and keep up with passwords people are choosing," said Moxie Marlinspike, another cracking expert. "Then, given that knowledge, you can go back and build rules and word lists to effectively crack lists without having to brute force all of them. When you feed your successes back into your process, you just keep learning more and more and more and it does snowball."