8 Statistics Functions

This module exports functions that compute statistics, meaning summary values for collections of samples, and functions for managing sequences of weighted or unweighted samples.

Most of the functions that compute statistics accept a sequence of nonnegative reals that correspond one-to-one with sample values. These are used as weights, or equivalently as counts, pseudocounts or unnormalized probabilities. While this makes it easy to work with weighted samples, it introduces some subtleties in bias correction. In particular, central moments must be computed without bias correction by default. See Expected Values for a discussion.

8.1 Expected Values

8.2 Running Expected Values

8.3 Correlation

8.4 Counting and Binning

8.5 Order Statistics

8.6 Simulations

8.1 Expected Values

Functions documented in this section that compute higher central moments, such as variance, stddev and skewness, can optionally apply bias correction to their estimates. For example, when variance is given the argument #:bias #t, it multiplies the result by (/ n (- n 1)), where n is the number of samples.
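The correction factor can be checked by hand. The following is a minimal sketch, assuming the module is loaded from untyped Racket with (require math/statistics); the names v, v-corrected and v-by-hand are illustrative only.

```racket
#lang racket
(require math/statistics)

(define xs '(1 2 3 4 4))
(define n (length xs))

;; Uncorrected variance (the default, equivalent to #:bias #f).
(define v (variance xs))

;; Bias-corrected variance as computed by the library.
(define v-corrected (variance xs #:bias #t))

;; The same correction applied by hand: multiply by n / (n - 1).
(define v-by-hand (* (/ n (- n 1)) v))

(list v v-corrected v-by-hand)
```

The second and third elements of the resulting list should be equal, since #:bias #t only rescales the uncorrected estimate.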

The meaning of “bias correction” becomes less clear with weighted samples, however. Often, the weights represent counts, so when moment-estimating functions receive #:bias #t, they interpret it as “use the sum of ws for n.” In the following example, the sample 4 is first counted twice and then given weight 2; therefore n = 5 in both cases:

> (variance '(1 2 3 4 4) #:bias #t)
- : Real [more precisely: Nonnegative-Real]
17/10
> (variance '(1 2 3 4) '(1 1 1 2) #:bias #t)
- : Real [more precisely: Nonnegative-Real]
17/10

However, sample weights often do not represent counts. For these cases, the #:bias keyword can be followed by a real-valued pseudocount, which is used for n:

> (variance '(1 2 3 4) '(1/2 1/2 1/2 1) #:bias 5)
- : Real [more precisely: Nonnegative-Real]
17/10

Because the magnitude of the bias correction for weighted samples cannot be known without user guidance, the bias argument defaults to #f in all cases.
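The weighted variance examples above give the same result because they share the same weighted mean, the same uncorrected variance and the same n. A rough sketch of that arithmetic follows, assuming the uncorrected weighted variance is the weighted mean of squared deviations. The helper weighted-variance/bias is hypothetical and not part of the library; it exists only to show where n enters.

```racket
#lang racket

;; Hypothetical helper, not part of math/statistics: weighted variance
;; with the n / (n - 1) correction, where n is supplied by the caller.
(define (weighted-variance/bias xs ws n)
  (define total (apply + ws))
  ;; Weighted mean of the samples.
  (define m (/ (apply + (map * ws xs)) total))
  ;; Uncorrected weighted variance: weighted mean of squared deviations.
  (define m2 (/ (apply + (map (λ (w x) (* w (sqr (- x m)))) ws xs))
                total))
  (* m2 (/ n (- n 1))))

;; Weights as counts: n is the sum of the weights.
(define as-counts (weighted-variance/bias '(1 2 3 4) '(1 1 1 2) 5))

;; Weights that are not counts: n is the pseudocount 5.
(define as-pseudocount (weighted-variance/bias '(1 2 3 4) '(1/2 1/2 1/2 1) 5))

(list as-counts as-pseudocount)
```

Scaling all weights by the same factor leaves the weighted mean and the uncorrected variance unchanged, which is why only the choice of n distinguishes the two calls.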

procedure
(mean xs [ws]) → Real
  xs : (Sequenceof Real)
  ws : (U #f (Sequenceof Real)) = #f

When ws is #f (the default), returns the sample mean of the values in xs. Otherwise, returns the weighted sample mean of the values in xs with corresponding weights ws.
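As an illustration, the weighted sample mean is the sum of each value times its weight, divided by the sum of the weights. The sketch below compares that hand computation with the library result, assuming the module is loaded with (require math/statistics); by-hand and from-library are illustrative names.

```racket
#lang racket
(require math/statistics)

(define xs '(1 2 3 4 5))
(define ws '(1 1 1 1 10))

;; Weighted mean by hand: sum of w*x divided by the sum of the weights.
(define by-hand (/ (apply + (map * ws xs)) (apply + ws)))

;; The same quantity from the library.
(define from-library (mean xs ws))

(list by-hand from-library)
```

Exact weights are used here so that the two results can be compared without floating-point rounding.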

Examples:

> (mean '(1 2 3 4 5))
- : Real
3
> (mean '(1 2 3 4 5) '(1 1 1 1 10.0))
- : Real
4.285714285714286
> (define d (normal-dist))
> (mean (sample d 10000))
- : Real
0.01294430251641271
> (define arr (array-strict (build-array #(5 1000) (λ (_) (sample d)))))
> (array-map mean (array->list-array arr 1))
- : #(struct:Array
      (Indexes Index (Boxof Boolean) (-> Void) (-> Indexes Real))
      #<syntax:build/user/8.17/pkgs/math-lib/math/private/array/typed-array-struct.rkt:56:13 prop:equal+hash>
      #<syntax:build/user/8.17/pkgs/math-lib/math/private/array/typed-array-struct.rkt:55:13 prop:custom-write>
      #<syntax:build/user/8.17/pkgs/math-lib/math/private/array/typed-array-struct.rkt:54:13 prop:custom-print-quotable>)
(array
#[0.0036331457336001077
   -0.009641388971146933
   0.010030179214147998
   -0.022560383059775733
   0.00354538446434749])

procedure
(variance xs [ws #:bias bias]) → Nonnegative-Real
  xs : (Sequenceof Real)
  ws : (U #f (Sequenceof Real)) = #f
  bias : (U #t #f Real) = #f
procedure
(stddev xs [ws #:bias bias]) → Nonnegative-Real
  xs : (Sequenceof Real)
  ws : (U #f (Sequenceof Real)) = #f
  bias : (U #t #f Real) = #f
procedure
(skewness xs [ws #:bias bias]) → Real
  xs : (Sequenceof Real)
  ws : (U #f (Sequenceof Real)) = #f
  bias : (U #t #f Real) = #f
procedure
(kurtosis xs [ws #:bias bias]) → Nonnegative-Real
  xs : (Sequenceof Real)
  ws : (U #f (Sequenceof Real)) = #f
  bias : (U #t #f Real) = #f

If ws is #f, these compute the sample variance, standard deviation, skewness and excess kurtosis of the samples in xs. If ws is not #f, they compute weighted variations of the same.

Examples:

> (stddev '(1 2 3 4 5))
- : Real [more precisely: Nonnegative-Real]
1.4142135623730951
> (stddev '(1 2 3 4 5) '(1 1 1 1 10))
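The relationship between variance and stddev can be sketched as follows, assuming the module is loaded with (require math/statistics). The names v and s are illustrative only.

```racket
#lang racket
(require math/statistics)

(define xs '(1 2 3 4 5))

;; Uncorrected variance and standard deviation (bias defaults to #f).
(define v (variance xs))
(define s (stddev xs))

;; The standard deviation is the square root of the variance.
(list v s (sqrt v))
```

The same relationship is expected to hold when both functions receive the same ws and #:bias arguments.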
