Calculating the entropy of a password

2017-08-04 15:11:17

Let's say I'm generating a random string of length $n>0$ from a finite, non-empty alphabet $A$. According to Wikipedia, the formula for entropy is:

$$-\sum_{i=1}^n P(x_i) \log_b P(x_i)$$

$P$ is the probability mass function; we assume each character is drawn uniformly at random from $A$, so $P(x_i) = \frac{1}{|A|}$. Is it therefore the case that the entropy of my whole string is:

$$ \frac{n}{|A|} \log_b |A|$$

That means that if I have a string 16 characters long, selected uniformly from lowercase and uppercase Latin letters and digits (i.e., $|A|=62$), then my string only has an entropy of around 1.5 bits... Is that right? It seems low, but I may be misinterpreting how this is all supposed to work.

EDIT: I had originally assumed that the entropy was $n \log_b |A|$, so my 16-character password would have an entropy of around 95 bits, but the formula above has this additional divisor. Which is correct?
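
For reference, a quick numeric comparison of the two candidate formulas (a minimal Python sketch; the names `n` and `A` are just my labels):

    import math

    n = 16   # password length
    A = 62   # alphabet size: 26 lowercase + 26 uppercase + 10 digits

    # Candidate 1: my reading of the Wikipedia formula, with the extra divisor
    print((n / A) * math.log2(A))   # ~1.54 bits

    # Candidate 2: my original assumption, n * log2(|A|)
    print(n * math.log2(A))         # ~95.27 bits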


  • The $n$ in the formula isn't the $n$ you're working with. What you are supposed to do is sum $-P(x)\log_2 P(x)$ over all possible outcomes $x$ of the random variable. Here there are $62^{16}$ possibilities, all with probability $62^{-16}$, so you get $62^{16}\cdot 62^{-16}\log_2(62^{16})=16\log_2 62\approx 95$.

    In fact, whenever all possibilities are equally likely, the entropy is just $\log_2$ of the number of possibilities; the sketch below checks this by brute force.

    (The formula is for a random variable with $n$ possibilities $x_1,\ldots,x_n$, not a random variable $x$ with $n$ components.)
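
    You can verify this for a case small enough to enumerate every outcome (a minimal Python sketch; the `shannon_entropy` helper is mine, purely for illustration):

        import math
        from itertools import product

        def shannon_entropy(probs):
            # Shannon entropy in bits: -sum of p * log2(p) over all outcomes.
            return -sum(p * math.log2(p) for p in probs if p > 0)

        # Small enough to enumerate: alphabet of 4 symbols, strings of length 3,
        # i.e. 4**3 = 64 equally likely outcomes.
        outcomes = list(product("abcd", repeat=3))
        uniform = [1 / len(outcomes)] * len(outcomes)

        print(shannon_entropy(uniform))   # 6.0
        print(3 * math.log2(4))           # 6.0 -- the closed form n * log2|A|

        # The closed form for the actual 16-character, 62-symbol password:
        print(16 * math.log2(62))         # ~95.27 bits

    The brute-force sum and the closed form agree; the $62^{16}$-string case is the same computation, just with too many outcomes to enumerate.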

    2017-08-04 16:31:42