Empirical Distribution

Given a set of data \(D = \{ x_1, \ldots, x_N \}\), we define the empirical distribution (also called the empirical measure) as:

\[ p_{\text{emp}} (A) \triangleq \frac{1}{N} \sum_{i=1}^N \delta_{x_i}(A)\]

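Where \(\delta_x(A)\) is the Dirac measure, defined as:

\[ \delta_x(A) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases} \]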
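
Reading \(\delta_{x_i}(A)\) as an indicator that is 1 when \(x_i \in A\) and 0 otherwise, \(p_{\text{emp}}(A)\) is just the fraction of samples that fall in \(A\). A minimal sketch of that computation follows; the function name `empirical_measure`, the membership test `in_A`, and the sample values are assumptions made for illustration, not part of the notes.

```python
from typing import Callable, Sequence


def empirical_measure(samples: Sequence[float], in_A: Callable[[float], bool]) -> float:
    """Return p_emp(A) = (1/N) * sum_i delta_{x_i}(A).

    `in_A` is a membership test for the set A, so delta_{x_i}(A) is 1
    when in_A(x_i) is True and 0 otherwise.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    # Count the samples that land in A, then normalise by N.
    return sum(1 for x in samples if in_A(x)) / n


# Hypothetical data, chosen only for illustration.
data = [0.5, 1.5, 2.5, 3.5]

# A = [1, 3]: two of the four samples fall inside it.
p = empirical_measure(data, lambda x: 1.0 <= x <= 3.0)
```

With these four samples, two lie in \([1, 3]\), so `p` is 0.5. Taking \(A\) to be the whole real line gives 1, as expected of a probability measure.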

We could also associate weights with each sample:

\[ p(x) = \sum_{i=1}^N w_i \delta_{x_i}(x) \]

where \(\sum_i w_i = 1\) and \(0 \le w_i \le 1\).
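
The weighted form can be sketched the same way: the mass assigned to a set \(A\) is the sum of the weights of the samples inside it. This is an illustrative sketch only; `weighted_empirical_measure`, `in_A`, and the data and weights are hypothetical names and values, and the code evaluates the distribution on a set rather than at a point.

```python
import math
from typing import Callable, Sequence


def weighted_empirical_measure(
    samples: Sequence[float],
    weights: Sequence[float],
    in_A: Callable[[float], bool],
) -> float:
    """Return sum_i w_i * delta_{x_i}(A) for weights on the simplex."""
    if len(samples) != len(weights):
        raise ValueError("samples and weights must have the same length")
    if any(w < 0.0 or w > 1.0 for w in weights):
        raise ValueError("each weight must satisfy 0 <= w_i <= 1")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    # Add up the weights of the samples that fall in A.
    return sum(w for x, w in zip(samples, weights) if in_A(x))


# Hypothetical data and weights, chosen only for illustration.
data = [0.5, 1.5, 2.5, 3.5]
weights = [0.1, 0.2, 0.3, 0.4]

# A = [1, 3] contains the samples 1.5 and 2.5, with weights 0.2 and 0.3.
p_weighted = weighted_empirical_measure(data, weights, lambda x: 1.0 <= x <= 3.0)
```

Setting every weight to \(1/N\) recovers the unweighted empirical distribution.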

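The Dirac delta is needed only for continuous variables, where it places a point mass at each sample. For discrete variables, the empirical distribution is simply the (weighted) relative frequency of each observed value.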
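
For a discrete variable, the empirical distribution can be written directly as a probability mass function of relative frequencies, with no delta function involved. A small sketch, assuming unweighted samples; `empirical_pmf` and the example values are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def empirical_pmf(samples: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Map each observed value to its relative frequency count / N."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    return {value: count / n for value, count in counts.items()}


# Hypothetical discrete observations (for example, die rolls).
rolls = [1, 3, 3, 6]
pmf = empirical_pmf(rolls)
```

Here the value 3 appears twice in four observations, so it receives mass 0.5, while 1 and 6 each receive 0.25.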
