In statistics, an estimator or point estimate is a statistic (that is, a measurable function of the data) that is used to infer the value of an unknown parameter in a statistical model. The parameter being estimated is sometimes called the estimand. It can be either finite-dimensional (in parametric and semi-parametric models), or infinite-dimensional (semi-nonparametric and non-parametric models). If the parameter is denoted $\theta$ then the estimator is typically written by adding a “hat” over the symbol: $\widehat{\theta}$. Being a function of the data, the estimator is itself a random variable; a particular realization of this random variable is called the estimate. Sometimes the words “estimator” and “estimate” are used interchangeably.
The definition places virtually no restrictions on which functions of the data can be called “estimators”. We judge the attractiveness of different estimators by looking at their properties, such as unbiasedness, mean square error, consistency, asymptotic distribution, etc. The construction and comparison of estimators are the subjects of estimation theory. In the context of decision theory, an estimator is a type of decision rule, and its performance may be evaluated through the use of loss functions.
When the word “estimator” is used without a qualifier, it refers to point estimation. The estimate in this case is a single point in the parameter space. Other types of estimators also exist: interval estimators, where the estimates are subsets of the parameter space; density estimators that deal with estimating the pdfs of random variables, and whose estimates are functions; etc.
Suppose we have a fixed parameter $\theta$ that we wish to estimate. Then an estimator is a function that maps a sample design to a set of sample estimates. An estimator of $\theta$ is usually denoted by the symbol $\widehat{\theta}$. A sample design can be thought of as an ordered pair $(X, P)$, where $X$ is a set of samples (or outcomes) and $P$ is the probability density function. The probability density function maps the set $X$ to the closed interval $[0,1]$ and has the property that the sum (or integral) of the values of $P(x)$, over all $x$ in $X$, is equal to 1. For any given subset $A$ of $X$, the sum or integral of $P(x)$ over all $x$ in $A$ is $P(A)$.
For all the properties below, the value $\theta$, the estimation formula $\widehat{\theta}$, the set of samples $X$, and the probabilities $P$ of the collection of samples can be considered fixed. Yet since some of the definitions vary by sample (for the same set of samples and probabilities), we must keep the sample $x$ in the notation. Hence, the estimate for a given sample $x$ is denoted as $\widehat{\theta}(x)$.
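As an illustration of this mapping, the following minimal Python sketch treats the sample mean as an estimator of an unknown location parameter. The simulated distribution, the parameter value, and the names theta_hat and theta are assumptions chosen for the example, not part of the definition above.

import numpy as np

def theta_hat(sample):
    # Point estimator: maps a sample (x_1, ..., x_n) to a single number.
    return np.mean(sample)

rng = np.random.default_rng(0)
theta = 5.0                                     # the true parameter, unknown in practice
x = rng.normal(loc=theta, scale=2.0, size=100)  # one realization of the sample
estimate = theta_hat(x)                         # the estimate: a realization of the estimator
print(estimate)

Running theta_hat on a different realization of the sample yields a different estimate, which is why the estimator itself is regarded as a random variable.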
We have the following definitions and attributes:
A consistent sequence of estimators is a sequence of estimators that converge in probability to the quantity being estimated as the index (usually the sample size) grows without bound. In other words, increasing the sample size increases the probability of the estimator being close to the population parameter.
Mathematically, a sequence of estimators $\{t_n; n \ge 0\}$ is a consistent estimator for parameter $\theta$ if and only if, for every $\epsilon > 0$, no matter how small, we have
$$\lim_{n\to\infty} \Pr\left\{\,\left|t_n - \theta\right| < \epsilon\,\right\} = 1.$$
The consistency defined above may be called weak consistency. The sequence is strongly consistent if it converges almost surely to the true value.
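A hedged Monte Carlo sketch of weak consistency in Python follows; it assumes the estimator is the sample mean and the data are i.i.d. normal, with illustrative parameter values. The empirical frequency with which $|t_n - \theta| < \epsilon$ rises toward 1 as $n$ grows.

import numpy as np

rng = np.random.default_rng(1)
theta, eps, replications = 5.0, 0.1, 2000

for n in (10, 100, 1000, 10000):
    samples = rng.normal(loc=theta, scale=2.0, size=(replications, n))
    t_n = samples.mean(axis=1)                     # the estimator evaluated on each replication
    coverage = np.mean(np.abs(t_n - theta) < eps)  # empirical Pr(|t_n - theta| < eps)
    print(n, coverage)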
An estimator that converges to a multiple of a parameter can be made into a consistent estimator by multiplying the estimator by a scale factor, namely the true value divided by the asymptotic value of the estimator. This occurs frequently in estimation of scale parameters by measures of statistical dispersion.
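For instance, under the assumption of normally distributed data, the mean absolute deviation converges to $\sigma\sqrt{2/\pi}$ rather than to $\sigma$ itself, so multiplying by the correction factor $\sqrt{\pi/2}$ (the true value divided by the asymptotic value) yields a consistent estimator of $\sigma$. A minimal Python sketch of this rescaling, with illustrative parameter values:

import numpy as np

rng = np.random.default_rng(2)
sigma = 3.0
x = rng.normal(loc=0.0, scale=sigma, size=1_000_000)

mad = np.mean(np.abs(x - x.mean()))   # mean absolute deviation, converges to sigma * sqrt(2/pi)
sigma_hat = np.sqrt(np.pi / 2) * mad  # rescaled estimator, consistent for sigma under normality
print(sigma_hat)                      # close to 3.0 for large samples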
An asymptotically normal estimator is a consistent estimator whose distribution around the true parameter $\theta$ approaches a normal distribution with standard deviation shrinking in proportion to $1/\sqrt{n}$ as the sample size $n$ grows. Using $\xrightarrow{D}$ to denote convergence in distribution, $t_n$ is asymptotically normal if
$$\sqrt{n}\,(t_n - \theta) \xrightarrow{D} N(0, V)$$
for some $V$, which is called the asymptotic variance of the estimator.
The central limit theorem implies asymptotic normality of the sample mean as an estimator of the true mean. More generally, maximum likelihood estimators are asymptotically normal under fairly weak regularity conditions — see the asymptotics section of the maximum likelihood article.
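A minimal simulation sketch of this behaviour in Python, assuming i.i.d. exponential data with mean $\theta$ (the distribution, sample size, and names are illustrative): by the central limit theorem the scaled deviations $\sqrt{n}\,(t_n - \theta)$ should be approximately $N(0, V)$ with $V$ equal to the population variance.

import numpy as np

rng = np.random.default_rng(3)
theta, n, replications = 0.5, 5000, 2000
V = theta ** 2                        # variance of an exponential distribution with mean theta

samples = rng.exponential(scale=theta, size=(replications, n))
scaled = np.sqrt(n) * (samples.mean(axis=1) - theta)
print(scaled.mean(), scaled.std())    # approximately 0 and sqrt(V) = 0.5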
Two naturally desirable properties of estimators are unbiasedness and minimal mean squared error (MSE). These cannot in general both be satisfied simultaneously: a biased estimator may have lower MSE than any unbiased estimator, because its variance can be enough smaller than that of any unbiased estimator to outweigh the bias, making it preferable despite the bias; see estimator bias.
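A hedged numerical sketch of this trade-off in Python, assuming normally distributed data: among variance estimators of the form $c\sum_i (x_i - \bar{x})^2$, dividing by $n-1$ gives the unbiased estimator, while dividing by $n+1$ introduces bias but minimizes MSE in this model, so the simulated MSE of the latter comes out lower.

import numpy as np

rng = np.random.default_rng(4)
sigma2, n, replications = 4.0, 10, 200_000

samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(replications, n))
ss = ((samples - samples.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

mse_unbiased = np.mean((ss / (n - 1) - sigma2) ** 2)  # unbiased estimator of the variance
mse_biased = np.mean((ss / (n + 1) - sigma2) ** 2)    # biased estimator with smaller MSE
print(mse_unbiased, mse_biased)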
Among unbiased estimators, there often exists one with the lowest variance, called the minimum variance unbiased estimator (MVUE). In some cases an unbiased efficient estimator exists which, in addition to having the lowest variance among unbiased estimators, attains the Cramér–Rao bound, an absolute lower bound on the variance of unbiased estimators of the parameter.
Concerning such "best unbiased estimators", see also Cramér–Rao bound, Gauss–Markov theorem, Lehmann–Scheffé theorem, Rao–Blackwell theorem.
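As a worked example (assuming i.i.d. $N(\mu, \sigma^2)$ observations with $\sigma^2$ known), the Fisher information about $\mu$ is $I_n(\mu) = n/\sigma^2$, so the Cramér–Rao bound reads
$$\operatorname{Var}(\widehat{\mu}) \ge \frac{1}{I_n(\mu)} = \frac{\sigma^2}{n},$$
and the sample mean $\bar{X}$, with $\operatorname{Var}(\bar{X}) = \sigma^2/n$, attains it; in this model $\bar{X}$ is therefore an efficient, minimum variance unbiased estimator of $\mu$.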
See: Robust estimator, Robust statistics