# Maximum Likelihood Estimation

Every observation contains some degree of noise, which makes our measurements uncertain. When we draw conclusions from noisy observations, we have to separate the dynamics of the signal from the noise. This is the point where estimation starts. Any time we analyse noisy observations to make decisions, we are estimating parameters. Parameters are mainly used to simplify the description of a dynamic system.

Noise, by definition, is a sequence of data whose dynamics we cannot formulate. This is why every model includes a residual term, commonly assumed Gaussian, that is separated from the deterministic dynamics.

The main question that arises, then, is how accurately we can estimate the parameters when the dynamics are polluted by noise. The best solution means the value of the parameter that is optimal in some well-defined sense.

To answer this question, we first have to find a measure that reflects the amount of noise in the parameter estimate. The next step is to minimize this measure, which is proportional to the estimation error.

Maximum likelihood is a particularly sensitive measure of the effect of noise. In ML estimation, we maximize the probability of the observations given the parameter value. Put another way, ML selects the parameter value under which the observed data are most probable.

The likelihood function that we maximize in ML is the probability (density) of the data, viewed as a function of the parameters. The main challenge in ML is finding this distribution. Once we have it, we can find the optimal values by setting its derivatives with respect to the parameters equal to zero.
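As a minimal sketch of this zero-derivative step, consider Gaussian samples with unknown mean: differentiating the log-likelihood with respect to the mean and setting it to zero gives the sample mean as the closed-form ML estimate. The numbers below (true mean 3.0, standard deviation 1.5) are illustrative assumptions, not from the article:

```python
import numpy as np

# Hypothetical example: ML estimate of the mean of Gaussian samples.
# For x_i ~ N(mu, sigma^2), setting d/dmu of the log-likelihood to zero
# yields mu_hat = (1/N) * sum(x_i), i.e. the sample mean.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=1.5, size=10_000)  # noisy observations

mu_hat = x.mean()  # closed-form ML solution from the zero-derivative condition
print(mu_hat)      # close to the true mean 3.0
```

The same pattern applies whenever the likelihood has a tractable derivative; otherwise the maximization is done numerically.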

In most cases, the likelihood function is Gaussian, which in turn reduces the problem to mean-squared-error minimization. The solution is then the least-squares solution, proposed long ago by the great mathematician Carl Friedrich Gauss.
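To illustrate this equivalence, the sketch below fits a line to data corrupted by Gaussian noise: maximizing the Gaussian likelihood of the residuals is the same as minimizing the squared error, so the ML fit is the least-squares fit. The model `y = a*t + b` and its coefficients are illustrative assumptions:

```python
import numpy as np

# With Gaussian noise, maximizing the likelihood of y = a*t + b + noise
# is equivalent to minimizing the squared error, so the ML estimate is
# the least-squares solution.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
y = 2.0 * t + 0.5 + rng.normal(scale=0.1, size=t.size)  # noisy line

A = np.column_stack([t, np.ones_like(t)])            # design matrix
(a_hat, b_hat), *_ = np.linalg.lstsq(A, y, rcond=None)
print(a_hat, b_hat)  # close to the true values 2.0 and 0.5
```

Here `np.linalg.lstsq` solves the least-squares problem directly; under the Gaussian-noise assumption this is exactly the ML estimate.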

In other cases, we simplify the joint distribution by assuming that the observations are IID (independent and identically distributed). The joint density is then the product of the marginal PDFs.
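Under the IID assumption, taking the log turns that product of marginals into a sum, which is what we maximize in practice. The sketch below does this numerically for Gaussian samples with unknown mean and standard deviation, using a coarse grid search purely for illustration (the true values 1.0 and 2.0 are assumed, not from the article):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=1.0, scale=2.0, size=2_000)  # IID observations

def gaussian_loglik(x, mu, sigma):
    # Sum of log marginal N(mu, sigma^2) densities == log of the joint IID density.
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2))

# Coarse grid search over (mu, sigma) just to illustrate the maximization.
mus = np.linspace(0.0, 2.0, 101)
sigmas = np.linspace(1.0, 3.0, 101)
ll = np.array([[gaussian_loglik(x, m, s) for s in sigmas] for m in mus])
i, j = np.unravel_index(ll.argmax(), ll.shape)
mu_hat, sigma_hat = mus[i], sigmas[j]
print(mu_hat, sigma_hat)  # near the true values 1.0 and 2.0
```

In real problems one would use a numerical optimizer rather than a grid, but the objective, the sum of log marginal densities, is the same.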

Generally, the main bottleneck in ML is assigning a joint PDF to the observations and then maximizing it with respect to the parameters.
