Keywords: maximum likelihood estimation, statistical method, probability distribution, MLE, models, practical applications, finance, economics, natural sciences.
Introduction
Maximum Likelihood Estimation (MLE) is a statistical method used to estimate the parameters of a probability distribution by finding the set of values that maximizes the likelihood function of the observed data. In other words, MLE finds the parameter values under which the observed data would be most probable.
The likelihood function gives the probability (or probability density) of the observed data, viewed as a function of the parameters of the probability distribution. The MLE method seeks the set of parameter values that maximizes this likelihood function.
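As a compact statement of this idea, and assuming the observations \(x_1, \dots, x_n\) are independent and identically distributed with density or mass function \(f(x; \theta)\) (this notation is introduced here only for illustration), the likelihood and the resulting estimate can be written as:

\[ L(\theta) = \prod_{i=1}^{n} f(x_i; \theta), \qquad \hat{\theta}_{\text{MLE}} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} \sum_{i=1}^{n} \log f(x_i; \theta) \]

Because the logarithm is monotonically increasing, maximizing the log-likelihood gives the same estimate as maximizing the likelihood itself, and the sum is usually easier to handle than the product.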
For example, suppose we have a set of data that we believe to be normally distributed, but we do not know the mean or variance of the distribution. We can use MLE to estimate these parameters by finding the mean and variance that maximize the likelihood function of the observed data.
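For the normal distribution, the maximizing values are known in closed form: the sample mean and the average squared deviation from it. The following is a minimal sketch under assumed conditions; the true mean (2.0), the standard deviation (1.5), the sample size, and the variable names are all chosen here purely for illustration.

```python
import numpy as np

# Hypothetical data: 5000 samples from a normal distribution with
# mean 2.0 and standard deviation 1.5 (illustrative values only).
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=2.0, scale=1.5, size=5000)

# Closed-form maximum likelihood estimates for a normal distribution:
# the sample mean, and the mean squared deviation (divide by n, not n - 1).
mu_hat = samples.mean()
var_hat = np.mean((samples - mu_hat) ** 2)

print(f"Estimated mean:     {mu_hat:.3f}")
print(f"Estimated variance: {var_hat:.3f}")
```

Note that the maximum likelihood estimate of the variance divides by n rather than n - 1, so it is slightly biased for small samples.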
The MLE method is widely used in statistical inference, hypothesis testing, and model fitting in many areas, including economics, finance, engineering, and the natural sciences. MLE is a powerful and flexible method that can be applied to a wide range of statistical models, making it a valuable tool in data analysis and modeling.
Difference between MLE and MLD
Maximum likelihood estimation (MLE) and maximum likelihood decoding (MLD) are two different concepts used in different contexts.
Maximum likelihood estimation is a statistical method used to estimate the parameters of a probability distribution based on a set of observed data. The goal is to find the set of parameter values that maximize the likelihood function of the observed data. MLE is commonly used in statistical inference, hypothesis testing, and model fitting.
On the other hand, maximum likelihood decoding (MLD) is a method used in digital communications and signal processing to decode a received signal that has been transmitted through a noisy channel. The goal is to find the transmitted message that is most likely to have produced the received signal, based on a given probabilistic model of the channel.
In maximum likelihood decoding, the receiver calculates the likelihood of each possible transmitted message, given the received signal and the channel model. The maximum likelihood decoder then selects the transmitted message that has the highest likelihood as the decoded message.
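The selection rule described above can be sketched for a binary symmetric channel as follows. The helper `ml_decode`, the length-5 repetition codebook, and the flip probability 0.1 are hypothetical and serve only to illustrate the rule; this is not a production decoder.

```python
import numpy as np

def ml_decode(received, codebook, p):
    """Return the index of the codeword with the highest likelihood over a BSC."""
    received = np.asarray(received)
    n = received.size
    best_index, best_loglik = -1, -np.inf
    for index, codeword in enumerate(codebook):
        # Number of positions where the received word differs from this codeword
        d = int(np.sum(received != np.asarray(codeword)))
        # log P(received | codeword) for independent bit flips with probability p
        loglik = d * np.log(p) + (n - d) * np.log(1 - p)
        if loglik > best_loglik:
            best_index, best_loglik = index, loglik
    return best_index

# Hypothetical codebook: a length-5 repetition code carrying one message bit
codebook = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]
received = [1, 0, 1, 1, 0]  # differs from the all-ones codeword in two positions
decoded = ml_decode(received, codebook, p=0.1)
print("Decoded message bit:", decoded)
```

For a flip probability below 0.5, the most likely codeword is the one closest to the received word in Hamming distance, which is why the all-ones codeword wins here.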
While both MLE and MLD involve the concept of maximum likelihood, they are used in different contexts. MLE is used in statistical estimation, while MLD is used in digital communications and signal processing for decoding.
MLE applied to communication systems
Maximum likelihood estimation (MLE) is an important tool for determining the parameters of the statistical model assumed for a communication channel.
In reality, a communication channel can be quite complex, and a model is needed to simplify calculations at the decoder. The model should closely approximate the actual channel. Many standard statistical models can be employed for this task, including Gaussian, binomial, exponential, geometric, and Poisson. A model is chosen based on empirical data.
Each of these models is characterized by its own parameters. These parameters must be determined so that the chosen model closely matches the communication channel at hand.
Suppose a binomial model is chosen (based on observation of data) for the error events over a particular channel. It is then essential to determine the probability of success (\(p\)) of the binomial model.
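For the binomial case the estimate can be derived by hand. As a sketch, assume \(k\) error events are observed in \(n\) independent trials (the symbols \(n\) and \(k\) are introduced here for illustration). The likelihood and its logarithm are:

\[ L(p) = \binom{n}{k} p^{k} (1-p)^{n-k}, \qquad \log L(p) = \log\binom{n}{k} + k \log p + (n-k)\log(1-p) \]

Setting the derivative with respect to \(p\) to zero gives the estimate:

\[ \frac{d}{dp}\log L(p) = \frac{k}{p} - \frac{n-k}{1-p} = 0 \quad\Rightarrow\quad \hat{p} = \frac{k}{n} \]

In words, the maximum likelihood estimate of \(p\) is simply the observed fraction of error events.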
If a Gaussian model (normal distribution) is chosen for a particular channel, then the mean (\(\mu\)) and variance (\(\sigma^{2}\)) must be estimated so that they can be used when computing the conditional probability \(p(y \text{ received} \mid x \text{ sent})\).
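One way this could look in practice is sketched here, assuming an additive Gaussian noise channel and a set of known pilot symbols. The pilot setup, the noise level, and the helper `conditional_pdf` are assumptions made for illustration, not part of any particular system.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical setup: known pilot symbols x sent over an additive Gaussian noise channel
x_pilot = rng.choice([-1.0, 1.0], size=2000)
y_pilot = x_pilot + rng.normal(loc=0.0, scale=0.5, size=x_pilot.size)

# Maximum likelihood estimates of the noise mean and variance from y - x
noise = y_pilot - x_pilot
mu_hat = noise.mean()
var_hat = np.mean((noise - mu_hat) ** 2)

def conditional_pdf(y, x):
    """Gaussian density p(y | x) using the estimated noise mean and variance."""
    return np.exp(-(y - x - mu_hat) ** 2 / (2 * var_hat)) / np.sqrt(2 * np.pi * var_hat)

y = 0.8  # an example received value
print(f"p(y={y} | x=+1) = {conditional_pdf(y, 1.0):.4f}")
print(f"p(y={y} | x=-1) = {conditional_pdf(y, -1.0):.4f}")
```

A decoder would compare these conditional densities across the candidate transmitted symbols and pick the largest.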
Similarly, for a Poisson model it is necessary to estimate the mean number of events within a given interval of time or space (\(\lambda\)).
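For the Poisson model, the maximum likelihood estimate of \(\lambda\) is the sample mean of the observed counts. The sketch below checks this against a simple grid search over the log-likelihood; the true rate (3.0), the sample size, and the grid are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Hypothetical data: event counts per interval, drawn with a true rate of 3.0
counts = rng.poisson(lam=3.0, size=4000)

# Closed-form maximum likelihood estimate: the sample mean
lambda_hat = counts.mean()

# Cross-check with a grid search over the log-likelihood.
# The term -sum(log(x_i!)) does not depend on lambda and is dropped.
grid = np.linspace(0.5, 6.0, 551)  # step of 0.01
loglik = counts.sum() * np.log(grid) - counts.size * grid
grid_best = grid[np.argmax(loglik)]

print(f"Sample mean estimate: {lambda_hat:.3f}")
print(f"Grid search estimate: {grid_best:.2f}")
```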
Maximum likelihood estimation is a method to determine these unknown parameters associated with the corresponding chosen models of the communication channel.
Python code example for MLE
The following program is an implementation of maximum likelihood estimation (MLE) for the binary symmetric channel (BSC) using the binomial probability mass function (PMF).
The goal of MLE is to estimate the value of an unknown parameter (in this case, the error probability \(p\)) based on observed data. The BSC is a simple channel model where each transmitted bit is flipped (with probability \(p\)) independently of other bits during transmission. The goal of the following program is to estimate the error probability \(p\) of the BSC based on a given binary data sequence.
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln
import matplotlib.pyplot as plt

def BSC_MLE(data):
    """
    Maximum likelihood estimation (MLE) for the Binary Symmetric Channel (BSC).
    This function estimates the error probability p of the BSC based on the observed data.
    Each entry of data is treated as an error indicator (1 = bit flipped, 0 = no error).
    """
    n = len(data)
    k = np.sum(data)

    # Negative log of the binomial probability mass function (PMF)
    def binom_PMF(p):
        # minimize passes p as a 1-D array; keep it away from 0 and 1 so log() stays finite
        p = np.clip(p[0], 1e-10, 1 - 1e-10)
        # log of the binomial coefficient via gammaln, which avoids overflow for large n
        log_coeff = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)
        logprob = log_coeff + k*np.log(p) + (n-k)*np.log(1-p)
        return -logprob

    # Minimizing the negative log PMF is the same as maximizing the binomial PMF.
    # x0 is the initial guess for p. For BSC x0=0.5
    # L-BFGS-B is a bounded variant of BFGS, used here so that p stays inside (0, 1)
    res = minimize(binom_PMF, x0=[0.5], method='L-BFGS-B', bounds=[(1e-10, 1 - 1e-10)])
    p_est = res.x[0]

    # Plot the observed data as a histogram, with the estimate as a dashed line
    plt.hist(data, bins=2, density=True, alpha=0.5)
    plt.axvline(p_est, color='r', linestyle='--')
    plt.xlabel('Bit value')
    plt.ylabel('Frequency')
    plt.title('Observed data')
    plt.show()
    return p_est

data = np.random.randint(2, size=1000)
p_est = BSC_MLE(data)
print('Estimated error probability: {:.4f}'.format(p_est))
The program first defines a function called BSC_MLE that takes a binary data sequence as input and returns the estimated error probability p_est. Inside BSC_MLE, the function binom_PMF computes the negative logarithm of the binomial PMF, which gives the probability of observing a certain number of errors (i.e., bit flips) in the data sequence for a specific error probability p. This negative log-likelihood is then minimized using the minimize function from the scipy.optimize module, which is equivalent to finding the value of p that maximizes the likelihood of observing the data.
The program then generates a random binary data sequence of length 1000 using the np.random.randint() function and calls the BSC_MLE function to estimate the error probability based on the observed data. Finally, the program prints the estimated error probability. Try changing the sequence length (for example, to 100) and observe how the estimated error probability varies.
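Because the random sequence in the program above contains roughly equal numbers of zeros and ones, its estimate lands near 0.5. A hedged sketch of a more channel-like experiment follows: it simulates a BSC with a known crossover probability and applies the closed-form binomial estimate (fraction of errors). The values `true_p` and `n_bits` and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

true_p = 0.1    # hypothetical crossover probability of the simulated BSC
n_bits = 10000  # hypothetical number of transmitted bits

# Transmit random bits and flip each one independently with probability true_p
tx_bits = rng.integers(0, 2, size=n_bits)
flips = rng.random(n_bits) < true_p
rx_bits = np.bitwise_xor(tx_bits, flips.astype(tx_bits.dtype))

# The error pattern marks the positions where the received bit differs from the sent bit
errors = tx_bits != rx_bits

# Closed-form binomial MLE: number of errors divided by number of bits
p_hat = errors.mean()
print(f"True p = {true_p}, estimated p = {p_hat:.4f}")
```

With more transmitted bits the estimate tends to settle closer to the true crossover probability.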