I have two programs that behave nearly identically: they both take in any numbers you give them and can tell you the arithmetic mean and how many numbers were given. However, when you don’t give them any numbers, one says the arithmetic mean is 0.0, and the other says it’s NaN (“Not a Number”). Which of these answers, if any, is more correct, and why?
Note: Although I use “programs” as a metaphor here, this isn’t a programming question; I could just as easily have said “computers”, “machines”, “wise men”, etc., and my question would be the same.
From a statistical point of view, the average of no sample points should not exist. The reason is simple. The average is an indication of the centre of mass of the distribution. With no observations, there is no way to prefer one location over another as their centre of mass, since the empty set is translation invariant: shifting every point in it by any amount gives back the empty set.
More mathematically, taking the average is a linear operation, which means that if you add a constant c to each observation, the average a becomes a+c. Now if you add c to each observation in the empty set, you get the empty set again, so the average would have to satisfy a+c=a for all c, which is impossible for any nonzero c.
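The shift argument can be checked directly against the two programs from the question. The following is only a sketch: `mean_zero`, `mean_nan`, and `shift_consistent` are hypothetical names invented for illustration, and treating NaN as "no answer" (so that NaN before and after a shift counts as consistent) is an assumption of this example rather than part of the argument itself.

```python
import math

def mean_zero(xs):
    """Hypothetical program A: reports 0.0 when given no numbers."""
    return sum(xs) / len(xs) if xs else 0.0

def mean_nan(xs):
    """Hypothetical program B: reports NaN when given no numbers."""
    return sum(xs) / len(xs) if xs else math.nan

def shift_consistent(mean, xs, c):
    """Check whether mean(xs shifted by c) equals mean(xs) + c.

    Assumption: NaN is read as "no answer", so getting NaN both
    before and after the shift counts as consistent.
    """
    before = mean(xs)
    after = mean([x + c for x in xs])
    if math.isnan(before) or math.isnan(after):
        return math.isnan(before) and math.isnan(after)
    return math.isclose(after, before + c)

# Both programs respect the shift rule on ordinary input.
print(shift_consistent(mean_zero, [1.0, 2.0, 3.0], 5.0))
print(shift_consistent(mean_nan, [1.0, 2.0, 3.0], 5.0))

# On empty input, only the NaN-returning program stays consistent.
print(shift_consistent(mean_zero, [], 5.0))
print(shift_consistent(mean_nan, [], 5.0))
```

Under these assumptions, the program that returns 0.0 breaks the rule a+c=a+c on empty input (it would need 0.0 to equal 5.0), while the program that returns NaN does not claim any location at all, which matches the conclusion that the average of no points should not exist.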