Python calculate standard deviation of list

This article explains four different ways to calculate the standard deviation of a list of numbers in Python, with examples and explanations.

Standard Deviation
Standard deviation measures the variation or dispersion of the values of a data set from its mean or average. For a list, it measures how far its elements are spread from the mean value of the elements.
Python provides different ways to calculate standard deviation, and we can also calculate it by applying its formula. All of this is covered in this article.
Method 1: Using stdev()
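Python's statistics module has a stdev() function which takes a data set as argument and returns the square root of the sample variance, also called the sample standard deviation. Example,

import statistics

# create list (example values, chosen to match the output shown below)
l = [1, 2, 3, 4, 5]
d = statistics.stdev(l)
print('Standard deviation =',d)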

This prints

Standard deviation = 1.5811388300841898

This method has been available since Python 3.4.
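One detail worth knowing is that stdev() needs at least two values, because the sample formula divides by n - 1. The sketch below shows one way this case could be handled. The helper name safe_stdev is hypothetical and is not part of the statistics module.

```python
import statistics

def safe_stdev(values):
    """Return the sample standard deviation, or None for fewer than two values.

    safe_stdev is a hypothetical helper used only for illustration.
    """
    try:
        return statistics.stdev(values)
    except statistics.StatisticsError:
        # stdev() raises StatisticsError when given fewer than two data points
        return None

print('Five values:', safe_stdev([1, 2, 3, 4, 5]))
print('One value  :', safe_stdev([7]))
```

Returning None is just one possible design; letting the exception propagate is equally reasonable when a short list indicates a bug.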
Method 2: Using pstdev()
Python statistics module has a pstdev() function which calculates standard deviation over the entire population or data set. Example,

import statistics

# create list (example values, chosen to match the output shown below)
l = [1, 2, 3, 4, 5]
print('Standard deviation =',statistics.pstdev(l))

Output is

Standard deviation = 1.4142135623730951

stdev() treats the list as a sample and divides the sum of squared differences by (n - 1) to calculate variance, while pstdev() treats the list as the entire population and divides by n. Hence, there is a difference between the results of stdev() and pstdev().
Since stdev() divides by a smaller number, its value is higher than that of pstdev().
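The relationship between the two functions can be checked directly. The following is a minimal sketch, assuming an illustrative list of five numbers. It shows that the two results differ only by the factor sqrt(n / (n - 1)).

```python
import math
import statistics

# illustrative data, assumed here; any list with two or more numbers works
data = [1, 2, 3, 4, 5]
n = len(data)

sample_sd = statistics.stdev(data)       # divides by n - 1
population_sd = statistics.pstdev(data)  # divides by n

# rescale the population value by sqrt(n / (n - 1)) to get the sample value
rescaled = population_sd * math.sqrt(n / (n - 1))

print('stdev  =', sample_sd)
print('pstdev =', population_sd)
print('match  =', math.isclose(sample_sd, rescaled))
```

As the list grows, n / (n - 1) approaches 1, so the gap between the two results shrinks.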

Method 3: Using numpy library
Python’s numpy (shorthand for Numerical Python) library contains mathematical functions to work on large data sets. It has a function std() which takes a data set argument and returns its standard deviation. Example,

import numpy as np

# create list (example values, chosen to match the output shown below)
l = [1, 2, 3, 4, 5]
print('Standard deviation =',np.std(l))

Output is

Standard deviation = 1.4142135623730951

There is a difference in the values returned by stdev() of the statistics module and by numpy because stdev() divides by (n-1), while numpy's std() divides by n by default.

Notice that the results of numpy and pstdev() are identical, since both divide by the total number of list elements.
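numpy can also produce the sample value. Its std() function accepts a ddof ("delta degrees of freedom") argument, and the divisor used is n - ddof. The sketch below assumes numpy is installed and uses an illustrative list.

```python
import statistics
import numpy as np

# illustrative data, assumed here
data = [1, 2, 3, 4, 5]

population_sd = np.std(data)       # default ddof=0, divides by n
sample_sd = np.std(data, ddof=1)   # divides by n - 1

print('numpy ddof=0 =', population_sd, '| pstdev =', statistics.pstdev(data))
print('numpy ddof=1 =', sample_sd, '| stdev  =', statistics.stdev(data))
```

With ddof=1 the numpy result lines up with statistics.stdev(), and with the default it lines up with statistics.pstdev().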
Method 4: Using formula
Mathematically, standard deviation is the square root of variance. Variance is calculated using the formula below
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$
where
x_i is the value of an observation, i.e. a single list element,
\bar{x} is the average or mean of the list elements,
n is the total number of list elements,
S^2 is the variance, and
Σ is the summation.

So, if you look carefully at the formula, it subtracts the list average from each list element, squares the result, adds the squares up and divides the total by the list element count.
This value is the variance. Finally, take the square root of the variance to get the standard deviation.
We can apply this formula in a Python program to calculate the standard deviation of list elements. Example,

# create list (example values, chosen to match the output shown below)
l = [1, 2, 3, 4, 5]
# calculate average of list
mean = sum(l) / len(l)
# apply formula
variance = sum((x - mean)**2 for x in l) / len(l)
# square root of variance
std_dev = variance ** 0.5
print('Standard deviation =',std_dev)

To calculate the mean or average, Python's built-in sum() and len() functions are used.

To calculate variance, we iterate over the list, subtract the mean from each element and square the difference. All these operations are performed in the line below

sum((x - mean)**2 for x in l)

This syntax is called a generator expression. It works like a Python list comprehension, but without the square brackets.
If you are not familiar with this syntax, then replace it with a Python for loop as shown below.

element_sum = 0
for x in l:
    element_sum += (x-mean)**2
variance = element_sum / len(l)

Divide the sum of squared differences by the length of the list to get the variance. Finally, calculate the standard deviation by taking the square root of the variance, which is done by raising it to the power of 0.5.
The output of this code is

Standard deviation = 1.4142135623730951

Note that this method does not require any external library or module; it uses only Python's built-in functions and operators.
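The formula approach can also be wrapped in a reusable function. The following is a minimal sketch, not a replacement for the library functions. The name std_dev and its sample parameter are assumptions made for illustration, and the statistics module is imported only to cross-check the results.

```python
import math
import statistics

def std_dev(values, sample=False):
    """Standard deviation from the formula (hypothetical helper).

    sample=False divides by n (population), sample=True divides by n - 1.
    """
    n = len(values)
    divisor = n - 1 if sample else n
    if divisor < 1:
        raise ValueError('not enough values to calculate standard deviation')
    mean = sum(values) / n
    # sum of squared differences from the mean
    squared_diffs = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_diffs / divisor)

# illustrative data, assumed here
data = [1, 2, 3, 4, 5]
print('population:', std_dev(data), 'vs', statistics.pstdev(data))
print('sample    :', std_dev(data, sample=True), 'vs', statistics.stdev(data))
```

The sample flag covers both divisors in one function, so the same code can be compared against either pstdev() or stdev().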

That is all on different methods to calculate standard deviation in Python. Hope the article was useful.

