Winfried's Blog

Wednesday, January 12, 2011 #

Calculating password strength

Introduction

To determine if a password is strong enough, two things need to be known:

  • the amount of randomness of the password
  • the amount of randomness that is needed

Randomness of the password:

Human-chosen text is only slightly random. For text typed in lowercase ASCII characters, the randomness is estimated at 2-3 bits per character (see RFC 1750). A uniformly random choice among the 26 lowercase ASCII characters would carry 4.7 bits per character ( log(26)/log(2) = 4.7 ). So merely because a human has chosen the characters, roughly 40% to 60% of the randomness is lost.

As a rule of thumb, the randomness of a human-chosen password can therefore be estimated at half the theoretical maximum per character:

b = log(c)/(log(2)*2)

where:
c is the size of the character set chosen from
and
b is the amount of random bits per character
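
As a minimal sketch of this formula in Python (the function name `bits_per_char` is my own choice for illustration, not something taken from the original script):

```python
import math

def bits_per_char(charset_size: int) -> float:
    """Estimated random bits per character of a human-chosen password.

    Implements b = log(c) / (log(2) * 2): half of the theoretical
    maximum of log2(c) bits for a character set of size c.
    """
    if charset_size < 2:
        raise ValueError("the character set needs at least 2 characters")
    return math.log2(charset_size) / 2

# Lowercase letters only: half of the theoretical 4.7 bits per character.
print(round(bits_per_char(26), 2))  # 2.35
```

The result for lowercase text, about 2.35 bits, falls inside the 2-3 bits per character range mentioned above.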

Randomness needed:

RFC 1750 gives a nice example of calculating the randomness needed. It makes the following assumptions:

  • One guess of the password takes 6 seconds (for example because of a delay in the login system).
  • An attempt to brute-force the password will be detected within a month.
  • A chance of guessing the password of 1 in 1000 per attempt is acceptable.

At one guess every 6 seconds, this translates to roughly 500,000 tries before the attack is detected. Given the acceptable chance of 1 in 1,000, the password needs to be chosen at random out of 500,000,000 possibilities, which is equivalent to 29 bits of random data.

More generally, the equation is:

b = log((d*86400)/(s*c))/log(2)

where:
b is the number of bits needed
d is the number of days before an attack is detected
s is the delay in seconds between login attempts
c is the acceptable chance of a successful attack (as a fraction, e.g. 0.001 for 1 in 1,000)

Some examples of this calculation:

Detection (days)   Delay (sec)   Chance (1 in)   Randomness needed (bits)
1                  60            1,000           21
7                  10            1,000           26
7                  30            10,000          28
31                 5             1,000           29
7                  60            1,000,000       34
14                 10            100,000         34
21                 20            1,000,000       37
31                 1             1,000,000       42
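
The equation can be sketched in Python as follows. The function name `bits_needed` is hypothetical, and rounding up to whole bits is my assumption; it matches the values listed in the examples above.

```python
import math

def bits_needed(days: float, delay_s: float, chance: float) -> int:
    """Bits of randomness needed: b = log((d*86400)/(s*c)) / log(2).

    days    -- d, days before an attack is detected
    delay_s -- s, delay in seconds between login attempts
    chance  -- c, acceptable chance of success as a fraction (0.001 = 1 in 1,000)
    """
    attempts = days * 86400 / delay_s          # guesses possible before detection
    # Assumption: round up, since a fraction of a bit cannot be required.
    return math.ceil(math.log2(attempts / chance))

print(bits_needed(31, 5, 1 / 1000))        # 29
print(bits_needed(7, 30, 1 / 1_000_000))   # 35
```

The first call is close to the RFC 1750 example (about half a million attempts, 1 in 1,000), which also ends up at 29 bits.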

Two scenarios seem realistic to me:

  1. low-security system
    A system like this is not monitored very intensively, and you don't want to impose a long delay after a wrong password. This results in a detection time of 31 days, a delay of 5 seconds and an acceptable chance of 1 in 1,000. Such a system needs 29 bits of randomness.
  2. high-security system
    A system like this is monitored much more intensively, so detection within 7 days is likely. It is also acceptable to increase the login delay to 30 seconds or more after many failed attempts. The acceptable chance is much lower: 1 in 1,000,000. Under these conditions, 35 bits of randomness are needed.

So how long should the password be?

Please do this calculation for yourself! Take a look at your local circumstances, acceptable risks etc. Choosing the length of a password also means finding a balance between security and nagging your users. Having said that, I can answer this question for the two scenarios described above:

low-security server:

When using:                          Characters needed:
numeric only                         18
lower only                           13
upper only                           13
symbols only                         12
lower + numeric                      12
upper + numeric                      12
upper + lower                        11
lower + symbols                      10
upper + symbols                      10
lower + upper + numeric              10
lower + upper + symbols              10
lower + upper + numeric + symbols    9

high-security server:

When using:                          Characters needed:
numeric only                         22
lower only                           15
upper only                           15
symbols only                         15
lower + numeric                      14
upper + numeric                      14
upper + lower                        13
lower + symbols                      12
upper + symbols                      12
lower + upper + numeric              12
lower + upper + symbols              11
lower + upper + numeric + symbols    11
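
The lengths in these two tables follow from dividing the randomness needed by the estimated bits per character. Below is a rough sketch of that arithmetic; the names `CHARSET_SIZES` and `chars_needed` are made up for illustration, the class sizes are the ones listed further down in this post, and rounding up is my assumption.

```python
import math

# Character class sizes as listed in the "Wrapping it up in a script" section.
CHARSET_SIZES = {"numeric": 10, "lower": 26, "upper": 26, "symbols": 32}

def chars_needed(bits_required: float, classes: list[str]) -> int:
    """Minimum password length for the given character classes."""
    size = sum(CHARSET_SIZES[name] for name in classes)
    bits_per_char = math.log2(size) / 2        # b = log(c)/(log(2)*2)
    # Assumption: round up to a whole number of characters.
    return math.ceil(bits_required / bits_per_char)

print(chars_needed(29, ["numeric"]))                              # 18
print(chars_needed(29, ["lower", "upper", "numeric", "symbols"])) # 9
print(chars_needed(35, ["lower"]))                                # 15
```

One caveat: for symbols only at 35 bits this arithmetic gives exactly 14 characters (14 x 2.5 = 35 bits), where the table lists 15. That may be a rounding artifact in the original calculation.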

Dictionaries

This story is nice, but anybody who has ever played with a password cracker like John the Ripper knows that there are lists of very commonly used passwords. Using these dictionaries increases the chance of success dramatically. According to the calculations above, the password "August2008" has a randomness of approximately 32 bits, which is equivalent to a chance of 1 in 4,000,000,000 of guessing the password in one guess. Against a dictionary attack, a chance of 1 in 1,000 would be more realistic. So when a password contains combinations that are likely to be part of a dictionary attack, its estimated randomness should be reduced. However, this is not hard science, just as composing the dictionaries themselves is a combination of statistics, experience and intuition. It is also language dependent: commonly used names and the names of the months differ from language to language.

Wrapping it up in a script

Note: the script described here is part of HelpIM

First of all, the password is checked against a blacklist of 'forbidden' combinations. Any part of the password that matches one of these regexes is discarded from the further calculation of the strength.

The randomness of what is left of the password can now be calculated by multiplying its length by a factor that depends on the characters used. To calculate this factor, we first need to know the size of the character set the password is chosen from:

When using:   Adds to character set size:
space         1
numeric       10
lower         26
upper         26
symbols       32

Now the amount of random bits per character can be calculated by the formula mentioned earlier:

b = log(c)/(log(2)*2)

The randomness of the password can now be calculated by multiplying the length of (what is left of) the password by the randomness per character.

This score is compared with the minimum amount of randomness required for the site. Based on this, a percentage and a color are calculated for a nice strength bar.
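
The steps above could look roughly like this in Python. This is only a sketch of the approach, not the actual HelpIM script: the blacklist patterns, the function names and the color thresholds are all assumptions made for illustration.

```python
import math
import re
import string

MONTHS = ("january february march april may june july august "
          "september october november december").split()

# Hypothetical blacklist: a real one needs carefully chosen dictionaries.
BLACKLIST = [
    re.compile("|".join(MONTHS), re.IGNORECASE),   # English month names
    re.compile(r"(19|20)\d\d"),                    # years such as 1999 or 2008
    re.compile(r"password", re.IGNORECASE),
]

def strip_blacklisted(password: str) -> str:
    """Discard every part of the password that matches a blacklist regex."""
    for pattern in BLACKLIST:
        password = pattern.sub("", password)
    return password

def charset_size(text: str) -> int:
    """Sum the sizes of the character classes that occur in the text."""
    classes = [(" ", 1), (string.digits, 10), (string.ascii_lowercase, 26),
               (string.ascii_uppercase, 26), (string.punctuation, 32)]
    return sum(size for chars, size in classes
               if any(ch in chars for ch in text))

def estimate_bits(password: str) -> float:
    """Length of what is left times b = log(c)/(log(2)*2)."""
    rest = strip_blacklisted(password)
    size = charset_size(rest)
    if size < 2:                 # nothing usable left after stripping
        return 0.0
    return len(rest) * math.log2(size) / 2

def strength_bar(password: str, bits_required: float = 29):
    """Return (percentage, color); the color thresholds are arbitrary."""
    percent = min(100, round(100 * estimate_bits(password) / bits_required))
    color = "red" if percent < 50 else "orange" if percent < 100 else "green"
    return percent, color

print(strength_bar("August2008"))   # (0, 'red')
print(strength_bar("kq7#Vd2!xp"))   # (100, 'green')
```

With this illustrative blacklist, "August2008" consists entirely of a month name and a year, so nothing is left to score. Non-ASCII characters are simply ignored by `charset_size` here.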

Problems with this way of calculating:

  • This way of calculating is very dependent on the quality of the dictionaries used. Choose them with care!
  • Diacritical and other non-ASCII characters are not accounted for, although they make the password much stronger (but are quite clumsy to use in a password, unless you use a localized keyboard that contains them).

posted @ 12:10 PM