Key focus: Understand, step by step, the least squares estimator for parameter estimation, with a hands-on example that fits a curve using least squares estimation.
Background:
Classical estimation techniques such as Maximum Likelihood Estimation (MLE), Minimum Variance Unbiased Estimation (MVUE) and the Best Linear Unbiased Estimator (BLUE) require assumptions about, or knowledge of, the statistics of the data (for example the second order statistics, i.e. the covariance) before they can be applied. The least squares estimator discussed here does not require any statistical model to begin with. It only requires a signal model in linear form.
Linear models are ubiquitously used in various fields for studying the relationship between two or more variables. Linear models include regression analysis models, ANalysis Of VAriance (ANOVA) models, variance component models etc. Here, one variable is considered as a dependent (response) variable which can be expressed as a linear combination of one or more independent (explanatory) variables.
Studying the dependence between variables is fundamental to linear models. To apply the concepts to a real application, the following procedure is required:
- Problem identification
- Model selection
- Statistical performance analysis
- Criticism of the model based on statistical analysis
- Conclusions and recommendations
The following text elaborates on linear models as applied to parameter estimation using Ordinary Least Squares (OLS).
Linear Regression Model
A regression model relates a dependent (response) variable y to a set of k independent explanatory variables {x1, x2 ,…, xk} using a function f. When the relationship is not exact, an error term e is introduced.
$$y = f(x_1, x_2, \ldots, x_k) + e \qquad (1)$$
If the function f is not a linear function, the above model is referred to as a Non-Linear Regression Model. If f is linear, equation (1) is expressed as a linear combination of the independent variables xk weighted by the unknown parameter vector θ = {θ1, θ2,…, θk } that we wish to estimate.
$$y = x_1\theta_1 + x_2\theta_2 + \cdots + x_k\theta_k + e \qquad (2)$$
Equation (2) is referred to as the Linear Regression model. When N such observations are made,
$$y_i = x_{i1}\theta_1 + x_{i2}\theta_2 + \cdots + x_{ik}\theta_k + e_i, \qquad i = 1, 2, \ldots, N \qquad (3)$$
where,
yi – response variable
xij – independent variables – known, and collected in the observation matrix X with rank k
θj – set of parameters to be estimated
ei – disturbances/measurement errors – modeled as a noise vector e with PDF N(0, σ²I)
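It is convenient to express all the variables in matrix form when N observations are made.
$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}, \quad X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1k} \\ x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{Nk} \end{bmatrix}, \quad \theta = \begin{bmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_k \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_N \end{bmatrix} \qquad (4)$$
Denoting equation (3) using (4),
$$y = X\theta + e$$
Except for X, which is an N⨉k matrix, all other variables are column vectors.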
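The matrix form can be made concrete with a small Matlab sketch. The sizes, the parameter values and the noise level below are assumptions chosen only for illustration; they do not come from the text above.

```matlab
% Minimal sketch: build y = X*theta + e for N = 5 observations and k = 2
% explanatory variables. All numeric values here are illustrative assumptions.
N  = 5;
x1 = ones(N,1);          % constant regressor (always 1)
x2 = (1:N)';             % a second, known explanatory variable
X  = [x1 x2];            % N-by-k observation matrix
theta = [2; 0.5];        % "true" parameters (assumed, normally unknown)
rng(0);                  % fix the seed so that the run is repeatable
e  = 0.1*randn(N,1);     % noise vector drawn from N(0, sigma^2 I), sigma = 0.1
y  = X*theta + e;        % stacked observations, an N-by-1 column vector
disp(size(X));           % dimensions of the observation matrix
disp(rank(X));           % rank of X (should equal k)
```

Each row of X holds the explanatory variables for one observation, so the i-th row of the product X*theta reproduces the right-hand side of the scalar model for that observation.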
Ordinary Least Squares Estimation (OLS)
In OLS, all errors are weighted equally, as opposed to Weighted Least Squares, where some errors are considered more significant than others.
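If $\hat{\theta}$ is a k⨉1 vector of estimates of θ, then the estimated model can be written as
$$y = X\hat{\theta} + e$$
Thus the error vector e can be computed from the observed data vector y and the estimated $\hat{\theta}$ as
$$e = y - X\hat{\theta}$$
Here, the errors are assumed to follow a multivariate normal distribution with zero mean and covariance σ²I (that is, each error has standard deviation σ).
To determine the least squares estimator, we write the sum of squares of the residuals (as a function of $\hat{\theta}$) as
$$S(\hat{\theta}) = \sum_{i=1}^{N} e_i^2 = e^T e = (y - X\hat{\theta})^T (y - X\hat{\theta}) = y^T y - y^T X\hat{\theta} - \hat{\theta}^T X^T y + \hat{\theta}^T X^T X \hat{\theta}$$
The least squares estimator is obtained by minimizing $S(\hat{\theta})$. Setting the derivative with respect to $\hat{\theta}$ to zero gives
$$\frac{\partial S}{\partial \hat{\theta}} = -2X^T y + 2X^T X\hat{\theta} = 0 \quad \Rightarrow \quad X^T X\hat{\theta} = X^T y$$
Thus, the least squares estimate of θ is given by
$$\hat{\theta} = (X^T X)^{-1} X^T y$$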
where the operator T denotes Hermitian Transpose (conjugate transpose).
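As a numerical sanity check of the minimization argument, the sketch below solves the normal equations X'X θ = X'y on synthetic data and confirms that the gradient of the sum of squared residuals vanishes at the solution. The data sizes, the parameter values and the names theta_true, grad and S are assumptions made for this illustration.

```matlab
% Minimal sketch: verify that the normal-equation solution minimizes e'*e.
% All data below are synthetic and chosen only for illustration.
rng(1);                                % repeatable random numbers
n = 50; k = 3;
X = [ones(n,1) randn(n,k-1)];          % n-by-k observation matrix
theta_true = [1; -2; 0.5];             % assumed parameters used to make data
y = X*theta_true + 0.2*randn(n,1);     % noisy observations

theta_hat = (X'*X)\(X'*y);             % solve X'X*theta = X'y
e    = y - X*theta_hat;                % residual vector
grad = -2*X'*e;                        % gradient of S at theta_hat
S    = @(t) (y - X*t)'*(y - X*t);      % sum of squared residuals

d = 1e-3*randn(k,1);                   % a small random perturbation
fprintf('norm of gradient   : %g\n', norm(grad));
fprintf('S(theta_hat)       : %g\n', S(theta_hat));
fprintf('S(theta_hat + d)   : %g\n', S(theta_hat + d));
```

The gradient norm should be at the level of rounding error, and any perturbation away from theta_hat should not decrease S, which is what the zero-derivative condition predicts when X'X is positive definite.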
Summary of computations
- Step 1: Choice of variables. Choose the variable to be explained (y) and the explanatory variables { x1, x2 ,…, xk } where x1 is often considered a constant (optional) that always takes the value 1 – this is to incorporate a DC component in the model.
- Step 2: Collect data. Collect n observations of y for a set of known values of { x1, x2 ,…, xk }. Example: { x1, x2 ,…, xk } is the pilot data in OFDM, using which we would like to estimate the channel impulse response θ, and y is the received vector of samples. Store the observed data y in an n⨉1 vector and the data on the explanatory variables in the n⨉k matrix X.
- Step 3: Compute the estimates. Compute the least squares estimates by the formula
$$\hat{\theta} = (X^T X)^{-1} X^T y$$
The superscript T indicates Hermitian Transpose (conjugate transpose) operation.
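The three steps can be sketched in Matlab as follows. Complex-valued data are used so that the role of the Hermitian transpose is visible; in Matlab the operator ' is the conjugate transpose, while .' is the plain transpose. The random matrix X below is only a stand-in for known pilot data, and theta_true, w and the sizes are hypothetical values chosen for this example, not a model of any particular OFDM system.

```matlab
% Minimal sketch of Steps 1-3 with complex data (all values are assumptions).
rng(2);                                        % repeatable random numbers
n = 64; k = 4;

% Steps 1 and 2: known explanatory data X and observed vector y
X = (randn(n,k) + 1i*randn(n,k))/sqrt(2);      % stand-in for known pilot data
theta_true = [1; 0.5-0.3i; -0.2i; 0.1];        % assumed unknown parameters
w = 0.05*(randn(n,1) + 1i*randn(n,1));         % complex measurement noise
y = X*theta_true + w;                          % n-by-1 observed vector

% Step 3: least squares estimate
theta_hat = (X'*X)\(X'*y);   % ' is the Hermitian (conjugate) transpose
theta_qr  = X\y;             % backslash solves the same least squares problem

disp([theta_true theta_hat]);                  % compare assumed and estimated
```

Solving with the backslash operator avoids forming an explicit inverse and is generally the better-conditioned choice; both forms are shown here only to connect the code to the formula.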
Key Points
- We do not need a probabilistic assumption but only a deterministic signal model.
- It has a broader range of applications.
- The least squares estimator is unbiased, provided the linear model is correct and the errors have zero mean.
- The disturbance variance can be estimated from the residuals as $\hat{\sigma}^2 = e^T e/(n-k)$, where k variables are to be estimated and n observations are available.
- To keep the variance low, the number of observations must be greater than the number of variables to estimate.
- The observation matrix X should have full column rank (rank k) – that is, its columns should be linearly independent, which is usually the case with real data. This makes sure that (XTX) is invertible.
- Least Squares Estimator can be used in block processing mode with overlapping segments – similar to Welch’s method of PSD estimation.
- Useful in time-frequency analysis.
- Adaptive filters are utilized for non-stationary applications.
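Two of the points above, the rank condition on X and the estimate of the disturbance variance, can be illustrated with a short sketch. It assumes the usual unbiased variance estimate e'e/(n-k); the data, the noise level sigma and the names s2 and Xbad are assumptions made for this example.

```matlab
% Minimal sketch: rank check on X and residual-based variance estimate.
% Synthetic data; sigma and the parameters are assumed for illustration.
rng(3);                                  % repeatable random numbers
n = 200; k = 2; sigma = 0.5;
X = [ones(n,1) linspace(-5,5,n)'];       % n-by-k observation matrix
y = X*[5.3; 1.2] + sigma*randn(n,1);     % noisy observations

if rank(X) < k                           % X'X is singular if this holds
    error('X does not have full column rank');
end

theta_hat = X\y;                         % least squares estimate
e  = y - X*theta_hat;                    % residuals
s2 = (e'*e)/(n - k);                     % estimate of the disturbance variance
fprintf('estimated variance %g (true %g)\n', s2, sigma^2);

% A rank-deficient example: the third column is a multiple of the second
Xbad = [X 2*X(:,2)];
fprintf('rank(Xbad) = %d for %d columns\n', rank(Xbad), size(Xbad,2));
```

With a rank-deficient matrix such as Xbad, the parameters are not uniquely determined, which is why the rank check comes before the estimate.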
LSE applied to curve fitting
A Matlab snippet that implements the least squares estimate to fit a curve is given below.
x = -5:.1:5; % set of x- values - known explanatory variables
y = 5.3 + 1.2* x; % Straight line without noise
e = randn(size(y)); % zero-mean, unit-variance Gaussian noise
y = y + e; % adding random noise to get the observed variable
%Linear model - Y=Xa+e where a - parameters to be estimated
X = [ ones(length(x),1) x']; %first column treated as all ones since x_1=1
y = y'; %column vector for proper dimension during multiplication
a = (X'*X)\(X'*y) % Least Squares Estimator (normal equations) - equivalent code X\y
plot ( x , y , 'o'); %original data
hold on;
plot( x , a(1)+ a(2)*x , 'r-' ); %Fitted line
legend('observed samples',['y=' num2str(a(1)) '+' num2str(a(2)) 'x'])
title('Least Squares Estimate for Curve Fitting');
xlabel('X values');
ylabel('Y values');
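The same approach extends to curves that are nonlinear in x but still linear in the parameters. As a hedged sketch, the example below fits a quadratic by adding a column of squared x values to X. The coefficients 1.0, -0.7 and 0.3 and the unit-variance noise are assumptions chosen for illustration; polyfit is used only as a cross-check.

```matlab
% Minimal sketch: least squares fit of a quadratic y = a1 + a2*x + a3*x^2.
% The "true" coefficients and noise level are assumed for illustration.
rng(4);                                   % repeatable random numbers
x = (-5:0.1:5)';                          % known explanatory variable
y = 1.0 - 0.7*x + 0.3*x.^2 + randn(size(x));   % noisy observations

X = [ones(size(x)) x x.^2];               % model is linear in a1, a2, a3
a = X\y;                                  % least squares estimate
y_fit = X*a;                              % fitted curve at the sample points

p = polyfit(x, y, 2);                     % returns [a3 a2 a1], for comparison

plot(x, y, 'o'); hold on;                 % observed samples
plot(x, y_fit, 'r-');                     % fitted quadratic
legend('observed samples', 'least squares fit');
```

Only the construction of X changes between the straight-line and the quadratic fit; the estimator itself is identical.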
Simulation Results
Hello Sir
I want to do channel equalization and I am using the zero forcing equalizer.
I am using this code.
enbtx=dlmread('input.txt');
uerx_cap=dlmread('output.txt');
enbtx=enbtx(:,1)+1i*enbtx(:,2);
enbtx_norm=enbtx/max(abs(enbtx));
uerx_cap=uerx_cap(:,1)+1i*uerx_cap(:,2);
uerx_cap_norm=uerx_cap/max(abs(uerx_cap));
x=enbtx_norm; % I/P
y=uerx_cap_norm; %o/p
X=fft(x,);
Y=fft(y,);
H=Y*pinv(X); % channel estimation
H_zf=pinv(H); % making 1/H(z)
As channel is estimated then I take new data which is passed by the same channel
z is the new data taken
Z=fft(z);
Y_eq=H_zf*Y;
y_eq=ifft(Y_eq);
But for the new input output the equalizer is not working
Kindly help me, I am stuck in it.
With warm regards
can u please tell me how to do same estimation of parameter in linear model using Maximum likelihood? as soon as possible…in MLE u have solved only x=A+wn but I want to know for x = H*s(n)+w
For your question on x=H*s(n)+w, I assume your goal is to estimate the channel – ‘H’. This problem is very specific to the application and the nature of the channel (channel model dependent).
To apply MLE for channel estimation, you need to first understand the channel model. Then develop a statistical model that represents the mix of received signal, noise and interference (if any).
An excellent example would be pilot estimation algorithms in OFDM systems. Some of them can be found here.
http://www.freescale.com/files/dsp/doc/app_note/AN3059.pdf
thank you so much.