# How to Calculate Standard Deviation in Python

    from math import sqrt


    def standard_deviation(lst, population=True):
        """Calculates the standard deviation for a list of numbers."""
        num_items = len(lst)
        mean = sum(lst) / num_items
        differences = [x - mean for x in lst]
        sq_differences = [d ** 2 for d in differences]
        ssd = sum(sq_differences)

        # Note: it would be better to return a value and then print it outside
        # the function, but this is just a quick way to print out the values along
        # the way.
        if population is True:
            print('This is POPULATION standard deviation.')
            variance = ssd / num_items
        else:
            print('This is SAMPLE standard deviation.')
            variance = ssd / (num_items - 1)
        sd = sqrt(variance)
        # You could return sd here.

        print('The mean of {} is {}.'.format(lst, mean))
        print('The differences are {}.'.format(differences))
        print('The sum of squared differences is {}.'.format(ssd))
        print('The variance is {}.'.format(variance))
        print('The standard deviation is {}.'.format(sd))
        print('--------------------------')


    s = [98, 127, 133, 147, 170, 197, 201, 211, 255]
    standard_deviation(s)
    standard_deviation(s, population=False)

Output:

    This is POPULATION standard deviation.
    The mean of [98, 127, 133, 147, 170, 197, 201, 211, 255] is 171.0.
    The differences are [-73.0, -44.0, -38.0, -24.0, -1.0, 26.0, 30.0, 40.0, 84.0].
    The sum of squared differences is 19518.0.
    The variance is 2168.6666666666665.
    The standard deviation is 46.56894530335282.
    --------------------------
    This is SAMPLE standard deviation.
    The mean of [98, 127, 133, 147, 170, 197, 201, 211, 255] is 171.0.
    The differences are [-73.0, -44.0, -38.0, -24.0, -1.0, 26.0, 30.0, 40.0, 84.0].
    The sum of squared differences is 19518.0.
    The variance is 2439.75.
    The standard deviation is 49.393825525059306.
    --------------------------

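The comments in the function suggest returning the value and printing it outside. A minimal sketch of that variant follows; the name `sd_value` is hypothetical and chosen only for this example. It also cross-checks the result against `statistics.pstdev` and `statistics.stdev` from the standard library, using `math.isclose` because the last digits may differ slightly between implementations.

```python
from math import isclose, sqrt
import statistics


def sd_value(lst, population=True):
    """Return the standard deviation of lst instead of printing it.

    sd_value is a hypothetical name used only for this sketch.
    """
    num_items = len(lst)
    mean = sum(lst) / num_items
    ssd = sum((x - mean) ** 2 for x in lst)
    # Population variance divides by n; sample variance divides by n - 1.
    divisor = num_items if population else num_items - 1
    return sqrt(ssd / divisor)


s = [98, 127, 133, 147, 170, 197, 201, 211, 255]

# Printing now happens outside the function.
print('Population SD:', sd_value(s))
print('Sample SD:', sd_value(s, population=False))

# Cross-check against the standard library.
print(isclose(sd_value(s), statistics.pstdev(s)))
print(isclose(sd_value(s, population=False), statistics.stdev(s)))
```

Returning the value keeps the calculation reusable: the caller decides whether to print it, store it, or compare it with another result.
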
Sun, 2016-07-17 18:42

#### losing precision

I was writing my own program to calculate the (sample) standard deviation and ended up with similar code:

    for i in lst:
        aux += (i - mean) ** 2

However, this seems to lose precision, and your solution gives more accurate results:

    differences = [x - mean for x in lst]
    sq_differences = [d ** 2 for d in differences]
    ssd = sum(sq_differences)

Why does separating the operations give different results?

Fri, 2016-08-12 21:30

#### I'm not sure. Could you post

I'm not sure. Could you post your complete function? You could put a print statement in there to see what the values are, or run both versions on pythontutor.com to compare.
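
One way to make that comparison is to run both forms side by side on the same data. The sketch below is only a guess at the setup, since the complete function was not posted: it assumes `aux` starts at `0.0` and that `mean` is computed with true division. `math.fsum` is included as a higher-accuracy reference sum.

```python
from math import fsum

s = [98, 127, 133, 147, 170, 197, 201, 211, 255]
mean = sum(s) / len(s)

# Form 1: accumulate in a loop (assumes aux starts at 0.0).
aux = 0.0
for i in s:
    aux += (i - mean) ** 2

# Form 2: build the lists first, then sum them.
differences = [x - mean for x in s]
sq_differences = [d ** 2 for d in differences]
ssd = sum(sq_differences)

# Reference: fsum tracks partial sums to limit accumulated round-off.
reference = fsum(sq_differences)

print(aux, ssd, reference)
print('Loop minus sum():', aux - ssd)
```

With this data all three values come out to 19518.0, so the two forms agree here. If they disagree on other data, things worth checking include how `aux` is initialised, whether `mean` was computed with integer division (as `/` does on integers in Python 2), and the Python version, since newer releases may use a more accurate float summation inside the built-in `sum` than a plain loop does.
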