Markov chains are useful in computing the probability of events that are observable. However, in many real world applications, the events that we are interested in are usually hidden, that is we don’t observe them directly. These hidden events need to be inferred. For example, given a sentence in a natural language we only observe the words and characters directly. The parts-of-speech from the sentence are hidden, they have to be inferred. This bring us to the following topic– the hidden Markov models.
Hidden Markov models enables us to visualize both observations and the associated hidden events. Let’s consider an example for understanding the concept.
The cheating casino and the gullible gambler
Consider a dishonest casino that deceives it player by using two types of die : a fair die (F) and a loaded die (L). For a fair die, each of the faces has the same probability of landing facing up. For the loaded die, the probabilities of the faces are skewed as given next
When the gambler throws the die, numbers land facing up. These are our observations at a given time t (denoted as Ot = {1,2,3,4,5,6}). At any given time t, whether these number are rolled from a fair die (state St = F) or a loaded die (St = L), is unknown to an observer and therefore they are the hidden events.
Emission probabilities
The probabilities associated with the observations are the observation likelihoods, also called emission probabilities (B).
Initial probabilities
The initial probability of starting (at time = 0) with any of fair die or loaded die (hidden event) is 50%.
Transition probabilities
A gullible gambler switches from the fair die to loaded die with 10% probability. He switches back from loaded die to fair die with 5% probability.
The probabilities of transitioning from one hidden event to another is described by the transition probability matrix (A). The elements of the probability transition matrix, are the transition probabilities (pij) of moving from one hidden state i to another hidden state j.
The transition probabilities from time t-1 to t, for the hidden events are
Therefore, the transition probability matrix is
Based on the given information so far, a probability model is constructed. This is the Hidden Markov Model (HMM) for the given problem.
Assumptions
We saw, in previous article, that the Markov models come with assumptions. Similarly, HMMs models also have such assumptions.
1. Assumption on probability of hidden states
In the model given here, the probability of a given hidden state depends only on the previous hidden state. This is a typical first order Markov chain assumption.
2. Assumption on Output
The probability of any observation (output) depends on the hidden state that produce it and not on any other hidden state or output observations.
Problems and Algorithms
Let’s briefly discuss the different problems and the related algorithms for HMMs. The algorithms will be explained in detail in the future articles.
In the dishonest casino, the gambler rolls the following numbers:
1. Evaluation
Given the model of the dishonest casino, what is the probability of obtaining the above sequence ? This is a typical evaluation problem in HMMs. Forward algorithm is applied for such evaluation problems.
2. Decoding
What is the most likely sequence of die (hidden states) given the above sequence ? Such problems are addressed by Viterbi decoding.
What is the probability of fourth die being loaded, given the above sequence ? Forward-backward algorithm to our rescue.
3. Learning
Learning problems involve parametrization of the model. In learning problems, we attempt to find the various parameters (transition probabilities, emission probabilities) of the HMM, given the observation. Baum-Welch algorithm helps us to find the unknown parameters of a HMM.
Some real-life examples
Here are some real-life examples of HMM applications:
- Speech recognition: HMMs are widely used in speech recognition systems to model the variability of speech sounds. In this application, the observable events are the acoustic features of the speech signal, while the hidden states represent the phonemes or words that generate the speech signal.
- Handwriting recognition: HMMs can be used to recognize handwritten characters by modeling the temporal variability of the pen strokes. In this application, the observable events are the coordinates of the pen on the writing surface, while the hidden states represent the letters or symbols that generate the handwriting.
- Stock price prediction: HMMs can be used to model the behavior of stock prices and predict future price movements. In this application, the observable events are the daily price movements, while the hidden states represent the different market conditions that generate the price movements.
- Gene prediction: HMMs can be used to identify genes in DNA sequences. In this application, the observable events are the nucleotides in the DNA sequence, while the hidden states represent the different regions of the genome that generate the sequence.
- Natural language processing: HMMs are used in many natural language processing tasks, such as part-of-speech tagging and named entity recognition. In these applications, the observable events are the words in the text, while the hidden states represent the grammatical structures or semantic categories that generate the text.
- Image and video analysis: HMMs can be used to analyze images and videos, such as for object recognition and tracking. In this application, the observable events are the pixel values in the image or video, while the hidden states represent the object or motion that generates the pixel values.
- Bio-signal analysis: HMMs can be used to analyze physiological signals, such as electroencephalograms (EEGs) and electrocardiograms (ECGs). In this application, the observable events are the signal measurements, while the hidden states represent the physiological states that generate the signal.
- Radar signal processing: HMMs can be used to process radar signals and detect targets in noisy environments. In this application, the observable events are the radar measurements, while the hidden states represent the presence or absence of targets.
Rate this post:
Books by the author
Similar topics
This is one of many helpful articles. Many thanks.
One correction however; the singular of dice is die. One can roll a die or more than one dice.
Thanks for the feedback. I will correct the mistake.
Many thanks for all your posts
Could you please make more posts on the applications of Markov chains and hidden Markov models ? At least one example of a certain problem and how the markov chain can be helpful in this aspect.