Analysis of a billion accounts obtained from various user database leaks

Statistics have been published based on an analysis of a collection of a billion accounts obtained from various leaks of databases containing authentication credentials. Also available are samples with data on how often typical passwords are used, and lists of the 1 thousand, 10 thousand, 100 thousand, 1 million and 10 million most popular passwords, which can be used to speed up the cracking of password hashes.
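
As a rough illustration of why a popularity-ordered list speeds up hash cracking, here is a minimal sketch. It assumes unsalted SHA-256 hashes purely for simplicity; the hash scheme, the function names `sha256_hex` and `crack_with_wordlist`, and the five-entry list are illustrative assumptions, not part of the published data set. Candidates are tried from most to least popular, so common passwords are found after very few attempts.

```python
import hashlib


def sha256_hex(password):
    """Hex digest of an unsalted SHA-256 hash (an assumption for this sketch)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def crack_with_wordlist(target_hex, candidates):
    """Try candidates in the given order.

    Returns (password, attempts) on a match, or (None, attempts) if the
    list is exhausted without finding one.
    """
    attempts = 0
    for candidate in candidates:
        attempts += 1
        if sha256_hex(candidate) == target_hex:
            return candidate, attempts
    return None, attempts


# Hypothetical wordlist, ordered by popularity (most popular first).
top_passwords = ["123456", "123456789", "password", "qwerty", "12345678"]

if __name__ == "__main__":
    print(crack_with_wordlist(sha256_hex("qwerty"), top_passwords))
    print(crack_with_wordlist(sha256_hex("Xk3f9LmQ2Z"), top_passwords))
```

A password that is not in the list costs a full pass over it, which is why the coverage figures for the larger lists matter in practice.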

Some generalizations and findings:

  • Of the billion records collected, 257 million were discarded as corrupted data (chaotic data in the wrong format) or test accounts. After all the filtering, 169 million passwords and 293 million logins were identified from the billion records.
  • The most popular password, “123456”, is used about 7 million times (0.722% of all passwords). It is followed, at a noticeable distance, by the passwords 123456789, password, qwerty, and 12345678.
  • The share of the thousand most popular passwords is 6.607% of all passwords, the share of the million most popular passwords is 36.28%, and the share of 10 million is 54%.
  • The average password length is 9.4822 characters.
  • 12.04% of passwords contain special characters.
  • 28.79% of passwords consist of letters only.
  • 26.16% of passwords include only lowercase characters.
  • 13.37% of passwords consist only of numbers.
  • 34.41% of passwords end with numbers, but only 4.522% of all passwords begin with a number.
  • Only 8.83% of passwords are unique; the rest occur two or more times. The average length of a unique password is 9.7965 characters. Only some of these passwords are a chaotic set of characters, devoid of meaning, and only 7.082% include special characters. 20.02% of unique passwords consist only of letters and 15.02% only of lowercase letters, with an average length of 9.36 characters.
  • A set of high-quality, high-entropy passwords was identified that were similar in style (10 characters, a random combination of digits and upper- and lowercase letters, no special characters, capital letters at the beginning and end) and were reused. The reuse rate was quite low (some of these passwords were repeated 10 times), but still higher than expected for passwords of this level.
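
Figures of the kind listed above could be computed roughly as follows. This is a sketch under stated assumptions, not the method used in the original analysis: "special characters" is taken to mean anything outside ASCII letters and digits, "unique" is taken to mean a password seen exactly once and is measured against the number of distinct passwords, and the function `password_stats` and the ten-entry sample are made up for illustration.

```python
from collections import Counter
import string

LOWER = set(string.ascii_lowercase)
LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)
ALNUM = LETTERS | DIGITS


def consists_of(password, allowed):
    """True if the password is non-empty and uses only characters in `allowed`."""
    return bool(password) and all(c in allowed for c in password)


def password_stats(passwords):
    """Compute percentage statistics over a non-empty list of passwords.

    The list holds one entry per account, so repeated passwords appear
    repeatedly.
    """
    total = len(passwords)
    counts = Counter(passwords)

    def pct(predicate):
        # Percentage of all entries (with repeats) matching the predicate.
        return 100.0 * sum(1 for p in passwords if predicate(p)) / total

    def top_share(n):
        # Percentage of all entries covered by the n most frequent passwords.
        return 100.0 * sum(c for _, c in counts.most_common(n)) / total

    return {
        "top1_share": top_share(1),
        "top2_share": top_share(2),
        "avg_length": sum(len(p) for p in passwords) / total,
        "special": pct(lambda p: any(c not in ALNUM for c in p)),
        "letters_only": pct(lambda p: consists_of(p, LETTERS)),
        "lowercase_only": pct(lambda p: consists_of(p, LOWER)),
        "digits_only": pct(lambda p: consists_of(p, DIGITS)),
        "ends_with_digit": pct(lambda p: bool(p) and p[-1] in DIGITS),
        "starts_with_digit": pct(lambda p: bool(p) and p[0] in DIGITS),
        # Assumption: share of distinct passwords that occur exactly once.
        "seen_once": 100.0 * sum(1 for c in counts.values() if c == 1) / len(counts),
    }


# Hypothetical sample, not taken from the leaked data.
sample = [
    "123456", "123456", "123456", "password", "password",
    "qwerty", "Summer2020", "p@ssw0rd", "7777777", "Xk3f9LmQ2Z",
]

if __name__ == "__main__":
    for name, value in password_stats(sample).items():
        print(f"{name}: {value:.2f}")
```

Whether percentages are taken over all entries or over distinct passwords changes the results considerably, so the choice should be stated alongside any figures produced this way.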

Source: opennet.ru
