In this article, I'd like to present two methods of calculating binomial distribution. The first method does not store any computed values and relies on recursion to do all calculations with every iteration. The second method stores values in an array (actually in a hash table but this is just an implementation detail). A quick refresher on binomial distribution follows.

Example #1: For the sake of example, let's consider an experiment in which we flip a fair coin five times. How many heads could we get? Well, we can have 0, 1, 2, 3, 4, or 5 heads. What is the probability of having 0, 1, 2, 3, 4, or 5 heads? To answer this question we need to know the number of possible outcomes. Let's write a few of them:

  • T T T T T (all tails)
  • T T T T H (4 tails and 1 head)
  • T T T H T (4 tails and 1 head)
  • T T T H H (3 tails and two heads) etc.

We can deduce that the number of outcomes is `2^n` where `n` is the number of flips. Here, `n = 5` hence the number of all possible outcomes is `2^5= 32`. Also, from combinatorics, we know that the number of ways to choose a k-element subset from an n-element set is given by the formula for binomial coefficients:

`C(n,k) = (n!) / ((n-k)! k!)`

where, in our example, `n` is the number of flips (`n = 5`) and `k` is the number of heads (0, 1, 2, 3, 4, or 5).

We are ready to calculate probabilities for each outcome.

  • P (0 heads) = C (5, 0) / 32 = 1/32
  • P (1 head) = C (5, 1) / 32 = 5/32
  • P (2 heads) = C (5, 2) / 32 = 10/32
  • P (3 heads) = C (5, 3) / 32 = 10/32
  • P (4 heads) = C (5, 4) / 32 = 5/32
  • P (5 heads) = C (5, 5) / 32 = 1/32

The above list represents a binomial distribution of probability for flipping a coin five times. Note that the distribution is symmetrical as the probability of getting a head and a tail is the same, 0.5. What would happen if probabilities for the two possible outcomes were different? Let's see it in another example.

Example #2: Let's assume that the probability of scoring in a shoot 'em up space game is p = 0.7. It means that the probability of missing is p - 1 = 0.3. Let's say we have six attempts to destroy an alien spaceship. Here are a few possible outcomes:

  • M M M M M M (all missed)
  • M M M M M S (one scored)
  • M M M M S M (one scored)
  • M M M M S S (two scored)
  • M M M S M M (one scored) etc.

In order to calculate probabilities for each outcome we will use the following formula:

`P(n,k) = C(n,k) * p^k (1-p)^(n-k)`

where, in our example, `n` is the number of attempts (`n = 6`), `k` is the number of successful shots (0, 1, 2, 3, 4, 5, or 6), and `P(n,k)` represents the probability of exactly `k` successful shots in `n` attempts.

Let's calculate probabilities:

  • P (6,0) = C (6,0) 0.70 0.36 = 0.001
  • P (6,1) = C (6,1) 0.71 0.35 = 0.01
  • P (6,2) = C (6,2) 0.72 0.34 = 0.06
  • P (6,3) = C (6,3) 0.73 0.33 = 0.185
  • P (6,4) = C (6,4) 0.74 0.32 = 0.324
  • P (6,5) = C (6,5) 0.75 0.31 = 0.303
  • P (6,6) = C (6,6) 0.76 0.30 = 0.118

This is the binomial distribution of scoring in six attempts with the probability of scoring 0.7. Note that this distribution is not symmetrical as the probability of scoring is different than the probability of missing.

The formula `P(n,k) = C(n,k) * p^k (1-p)^(n-k)` reduces to `P(n,k) = (C(n,k)) / (2^n)` if the probabilities of both outcomes are equal i.e., if they are 0.5 (just like in the first example).

Now, we are ready for some coding. Both methods calculate a single probability value for a given `n` and `k` using recursion.

Method #1: The first method (Binomial) does not store any intermediate results i.e., it calculates each probability without utilizing already computed values:

double Binominal(int N, int k, double p)
{
    if ((N == 0) && (k == 0))
        return 1.0;

    if ((N < 0) || (k < 0))
        return 0.0;

    double b1 = Binominal(N - 1, k, p);
    double b2 = Binominal(N - 1, k - 1, p);

    return (1 - p) * b1 + p * b2;
}

Method #2: The second method (BinomialWithMemory) stores intermediate results in a dictionary (which is basically a hash table): 

Dictionary<string, double> a = new Dictionary<string, double>();
double BinominalWithMemory(int n, int k, double p)
{
    if ((n == 0) && (k == 0))
    {
        return 1.0;
    }

    if ((n < 0) || (k < 0))
    {
        return 0.0;
    }

    double v1;
    string key1 = (n - 1).ToString() + "_" + k.ToString();
    if (!(a.TryGetValue(key1, out v1)))
    {
        v1 = BinominalWithMemory(n - 1, k, p);
        a.Add(key1, v1);
    }

    double v2;
    string key2 = (n - 1).ToString() + "_" + (k - 1).ToString();
    if (!(a.TryGetValue(key2, out v2)))
    {
        v2 = BinominalWithMemory(n - 1, k - 1, p);
        a.Add(key2, v2);
    }

    return (1 - p) * v1 + p * v2;
}

The difference in running times between both methods is staggering:

n,k Result P(n,k) Method #1 Method #2
10, 5 0.2636718750 < 1 ms < 1 ms
20, 10 0.0099222753 0.034 s < 1 ms
30, 15 0.0019305450 34 s < 1 ms
50, 25 0.0000844919 weeks < 1 ms
100, 50 0.0000000451 > years? 0.002 s

p = 0.25

Clearly, storing calculated values is a better strategy than blindly calculating them with each recursive iteration.

This is the complete code: 

using System;
using System.Collections.Generic;
using System.Diagnostics;

// An implementation of binominal distribution 
// based on saving computed values in a dictionary.
//
// Binomial distribution is a discrete version of normal distribution (a bell curve).

namespace ConsoleTest
{
    class Program
    {
        private static Stopwatch stopwatch = new Stopwatch();

        static void Main(string[] args)
        {
            if (args.Length != 3 && args.Length != 4)
            {
                Console.WriteLine(
                    "Syntax: binominal N, k, p, r\n" +
                    "        N - the number of attempts i.e., the number of all possible outcomes\n" +
                    "        k - the number of particular outcomes\n" +
                    "        p - the probability of the particular outcome\n" +
                    "        x (optional) - indicates that the computed values should not be stored\n");
                Console.WriteLine("Examples:\n" +
                    "binominal 5 3 0.5\n" +
                    "        5 attempts, e.g., 5 flipping of a fair coin\n" +
                    "        3 outcomes, it means that in 5 coin flips we had 3, for example, heads\n" +
                    "        0.5 is the probability of getting, in our example, a head\n\n" +
                    "binominal 6 2 0.7 x\n" +
                    "        6 attempts, e.g., 6 throws into a basket\n" +
                    "        2 scored, it means that in 6 throws we scored 2 times\n" +
                    "        0.7 is the probability of scoring\n" +
                    "        x indicates that the computed values should not be stored");
                return;
            }

            int N;
            if (!Int32.TryParse(args[0], out N))
            {
                Console.WriteLine("The first argument N (the number of attempts) should be an integer.");
                return; 
            }

            int k;
            if (!Int32.TryParse(args[1], out k))
            {
                Console.WriteLine("The second argument k (the number of a particular outcome) should be an integer.");
                return;
            }

            double p;
            if (!Double.TryParse(args[2], out p))
            {
                Console.WriteLine("The third argument p (the probability of the particular outcome) should be an integer.");
                return;
            }

            stopwatch.Reset();
            stopwatch.Start();

            double b = 0.0;
            string algorithm = "";

            if (args.Length == 4 && args[3] == "x")
            {
                b = Binominal(N, k, p);
                algorithm = "Binominal";
            }
            else
            {
                b = BinominalWithMemory(N, k, p);
                algorithm = "BinominalWithMemory";
            }

            double timeElapsed = stopwatch.ElapsedMilliseconds / 1000.0;
            Console.WriteLine("{0}({1}, {2}, {3:F2}) = {4:F10}    ({5:F3} sec.)", algorithm, N, k, p, b, timeElapsed);
        }

        static double Binominal(int N, int k, double p)
        {
            if ((N == 0) && (k == 0))
                return 1.0;

            if ((N < 0) || (k < 0))
                return 0.0;

            double b1 = Binominal(N - 1, k, p);
            double b2 = Binominal(N - 1, k - 1, p);

            return (1 - p) * b1 + p * b2;
        }

        static Dictionary<string, double> a = new Dictionary<string, double>();
        static double BinominalWithMemory(int n, int k, double p)
        {
            if ((n == 0) && (k == 0))
            {
                return 1.0;
            }

            if ((n < 0) || (k < 0))
            {
                return 0.0;
            }

            double v1;
            string key1 = (n - 1).ToString() + "_" + k.ToString();
            if (!(a.TryGetValue(key1, out v1)))
            {
                v1 = BinominalWithMemory(n - 1, k, p);
                a.Add(key1, v1);
            }

            double v2;
            string key2 = (n - 1).ToString() + "_" + (k - 1).ToString();
            if (!(a.TryGetValue(key2, out v2)))
            {
                v2 = BinominalWithMemory(n - 1, k - 1, p);
                a.Add(key2, v2);
            }

            return (1 - p) * v1 + p * v2;
        }
    }
}

You can download the entire project from here.