This article explores how Entropy can be employed as a tool for uncertainty estimation in image segmentation tasks. We will walk through what Entropy is, and how to implement it with Python.
While working at Cambridge University as a Research Scientist in Neuroimaging and AI, I faced the challenge of performing image segmentation on intricate brain datasets using the latest Deep Learning techniques, especially the nnU-Net. During this endeavor, I observed a significant gap: the overlooking of uncertainty estimation. Yet, uncertainty is crucial for reliable decision-making.
Before delving into the specifics, feel free to check out my GitHub repository, which contains all the code snippets discussed in this article.
In the world of computer vision and machine learning, image segmentation is a central problem. Whether it’s in medical imaging, self-driving cars, or robotics, accurate segmentations are vital for effective decision-making. However, one often overlooked aspect is the measure of uncertainty associated with these segmentations.
Why should we care about uncertainty in image segmentation?
In many real-world applications, an incorrect segmentation could result in dire consequences. For example, if a self-driving car misidentifies an object or a medical imaging system incorrectly labels a tumor, the consequences could be catastrophic. Uncertainty estimation gives us a measure of how ‘sure’ the model is about its prediction, allowing for better-informed decisions.
We can also use Entropy as a measure of uncertainty to improve the learning of our neural networks. This area is known as ‘active learning’. It will be explored in further articles, but the main idea is to identify the zones where the model is most uncertain and focus on them. For example, we could have a CNN that performs medical image segmentation on the brain but does very poorly on subjects with tumours. We could then concentrate our efforts on acquiring more labels of this type.
Entropy is a concept borrowed from thermodynamics and information theory, which quantifies the amount of uncertainty or randomness in a system. In the context of machine learning, entropy can be used to measure the uncertainty of model predictions.
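Mathematically, for a discrete random variable X with probability mass function P(x), the entropy H(X) is defined as:
$$H(X) = -\sum_{x} P(x) \log_2 P(x)$$
Or in the continuous case, for a probability density function p(x):
$$H(X) = -\int p(x) \log_2 p(x) \, dx$$
With a base-2 logarithm, entropy is measured in bits.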
The higher the entropy, the greater the uncertainty, and vice versa.
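As a minimal sketch, the discrete definition can be written in a few lines of NumPy. The function name `shannon_entropy` is illustrative (it is not taken from the repository), and the sketch assumes the input is already a valid probability distribution.

```python
import numpy as np

def shannon_entropy(probs):
    """Entropy in bits of a discrete distribution given as a 1D array of probabilities."""
    probs = np.asarray(probs, dtype=float)
    # Terms with p = 0 contribute nothing (p * log p -> 0 as p -> 0), so drop them
    nonzero = probs[probs > 0]
    return float(-np.sum(nonzero * np.log2(nonzero)))

print(shannon_entropy([0.7, 0.2, 0.1]))           # a skewed distribution
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # uniform over 4 outcomes: 2.0
```

Dropping the zero-probability terms is one way to avoid log(0); adding a small constant inside the logarithm is another common choice.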
A classic example to fully grasp the concept:
Situation 1: A biased coin
Imagine a biased coin, which lands on heads with probability p=0.9 and on tails with probability 1-p=0.1.
Its entropy is:
$$H = -(0.9 \log_2 0.9 + 0.1 \log_2 0.1) \approx 0.469 \text{ bits}$$
Situation 2: A balanced coin
Now let’s imagine a balanced coin, which lands on heads and tails with equal probability p=0.5.
Its entropy is:
$$H = -(0.5 \log_2 0.5 + 0.5 \log_2 0.5) = 1 \text{ bit}$$
The entropy is larger, which is consistent with what we said before: more uncertainty = more entropy.
It is interesting to note that p=0.5 is exactly where the entropy of a coin reaches its maximum of 1 bit:
$$H(p) = -p \log_2 p - (1-p) \log_2 (1-p)$$
Intuitively, remember that a uniform distribution is the case with maximal entropy. If every outcome is equally probable, then this corresponds to the maximal uncertainty.
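The two coin situations can be checked numerically. The sketch below assumes base-2 logarithms (entropy in bits); `binary_entropy` is an illustrative name, not code from the repository.

```python
import numpy as np

def binary_entropy(p):
    """Entropy in bits of a coin that lands on heads with probability p."""
    p = np.asarray(p, dtype=float)
    # Clip so that p = 0 or p = 1 does not produce log2(0)
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

print(round(float(binary_entropy(0.9)), 3))  # biased coin: 0.469
print(float(binary_entropy(0.5)))            # balanced coin: 1.0

# Sweep p over a grid and locate the maximum
grid = np.linspace(0.01, 0.99, 99)
best_p = grid[np.argmax(binary_entropy(grid))]
print(round(float(best_p), 2))               # 0.5
```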
To link this to image segmentation, consider that in deep learning, the final softmax layer usually provides the class probabilities for each pixel. One can easily compute the entropy for each pixel based on these softmax outputs.
But how does it work?
When a model is confident about a particular pixel belonging to a specific class, the softmax layer shows high probability (~1) for that class, and very small probabilities (~0) for the other classes.
Conversely, when the model is uncertain, the softmax output is more evenly spread across multiple classes.
The probabilities are much more diffuse, close to the uniform case discussed earlier, because the model cannot decide which class the pixel belongs to.
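To make the contrast concrete, here is a small sketch comparing the entropy of a confident and an uncertain softmax output for a single pixel. The logit values are made up for illustration, and `softmax` and `entropy_bits` are helper names assumed for this example.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / np.sum(exp)

def entropy_bits(probs):
    """Entropy in bits; the small epsilon avoids log2(0)."""
    return float(-np.sum(probs * np.log2(probs + 1e-12)))

# Hypothetical logits for one pixel over 4 classes
confident_pixel = softmax(np.array([10.0, 0.0, 0.0, 0.0]))
uncertain_pixel = softmax(np.array([1.0, 0.9, 1.1, 1.0]))

print(np.round(confident_pixel, 3), entropy_bits(confident_pixel))
print(np.round(uncertain_pixel, 3), entropy_bits(uncertain_pixel))
```

The confident pixel has an entropy close to 0, while the uncertain one is close to the 2-bit maximum for 4 classes.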
If you have made it this far, great! You should now have a good intuition of how entropy works.
Let’s illustrate this with a hands-on example using medical imaging, specifically T1 brain scans of fetuses. All the code and images for this case study are available in my GitHub repository.
1. Computing Entropy with Python
As we said before, we are working with the softmax output tensor produced by our neural network. This approach is model-free: it only uses the probabilities of each class.
Let’s clarify something important about the dimensions of the tensors we are working with.
If you are working with 2D images, the shape of your softmax output should be (Classes, Height, Width). For 3D volumes, as in the function below, it is (Classes, Height, Width, Depth).
This means that for each pixel (or voxel), we have a vector of size Classes, which gives the probability that the pixel belongs to each class.
Therefore the entropy should be computed along the first dimension:
```python
import numpy as np

def compute_entropy_4D(tensor):
    """
    Compute the entropy on a 4D tensor with shape (number_of_classes, 256, 256, 256).

    Parameters:
        tensor (np.ndarray): 4D tensor of shape (number_of_classes, 256, 256, 256)

    Returns:
        tuple: (entropy, total_entropy), where entropy is a 3D np.ndarray of shape
        (256, 256, 256) with the entropy value of each voxel, and total_entropy is
        the sum of the entropy over all voxels.
    """
    # First, normalize the tensor along the class axis so that it represents probabilities
    sum_tensor = np.sum(tensor, axis=0, keepdims=True)
    tensor_normalized = tensor / sum_tensor

    # Calculate entropy in bits; the small constant avoids log(0)
    entropy_elements = -tensor_normalized * np.log2(tensor_normalized + 1e-12)
    entropy = np.sum(entropy_elements, axis=0)

    # Reverse the order of the spatial axes (presumably to match the orientation used for display)
    entropy = np.transpose(entropy, (2, 1, 0))

    total_entropy = np.sum(entropy)
    return entropy, total_entropy
```
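For 2D images the same idea applies to a tensor of shape (Classes, Height, Width). The sketch below is a 2D variant with a usage example on random stand-in data rather than a real network output; `entropy_map_2d` and the toy shapes are assumptions made for illustration.

```python
import numpy as np

def entropy_map_2d(probs):
    """Per-pixel entropy in bits for class probabilities of shape (classes, height, width)."""
    # Sum -p * log2(p) over the class axis (axis 0); epsilon avoids log2(0)
    return -np.sum(probs * np.log2(probs + 1e-12), axis=0)

# Stand-in for a network output: random logits for 3 classes on a 4x5 image
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4, 5))

# Softmax over the class axis so every pixel's probabilities sum to 1
exp = np.exp(logits - logits.max(axis=0, keepdims=True))
probs = exp / exp.sum(axis=0, keepdims=True)

entropy = entropy_map_2d(probs)
print(entropy.shape)                       # (4, 5): one entropy value per pixel
print(entropy.max() <= np.log2(3) + 1e-9)  # True: bounded by log2(number of classes)
```

The upper bound log2(Classes) is reached only when a pixel's probabilities are uniform, which matches the intuition from the coin example.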
2. Visualizing Entropy-based Uncertainty
Now let’s visualize the uncertainties by using a heatmap, on each slice of our image segmentation.
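A heatmap of one slice could be drawn along these lines. This is a sketch with Matplotlib on a random stand-in volume, not the plotting code from the repository; the colormap, the slicing axis, and the output file name are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Stand-in entropy volume; in practice this would be the per-voxel entropy map
rng = np.random.default_rng(0)
entropy_volume = rng.random((32, 32, 32))

# Take the middle slice along the last axis and draw it as a heatmap
slice_index = entropy_volume.shape[2] // 2
entropy_slice = entropy_volume[:, :, slice_index]

fig, ax = plt.subplots(figsize=(5, 5))
image = ax.imshow(entropy_slice, cmap="hot")
fig.colorbar(image, ax=ax, label="Entropy (bits)")
ax.set_title(f"Entropy heatmap, slice {slice_index}")
ax.axis("off")
fig.savefig("entropy_slice.png", dpi=150)
plt.close(fig)
```

Looping over `slice_index` gives one heatmap per slice, which can be placed next to the corresponding segmentation for comparison.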
Let’s look at another example:
The results look great! They are consistent with what we expect: the zones of high entropy lie along the contours of the shapes. The model has little doubt about the points in the middle of each region; it is rather the boundary between regions that is difficult to pin down.
This uncertainty can be used in plenty of different ways:
1. As medical experts work more and more with AI as a tool, being aware of the uncertainty of the model is crucial. This means that medical experts could spend more time on the zones where more fine-grained attention is required.
2. In the context of Active Learning or Semi-Supervised Learning, we can leverage Entropy-based Uncertainty to focus on the examples with maximal uncertainty, and improve the efficiency of learning (more about this in coming articles).
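As a rough sketch of the second point, unlabeled images can be ranked by their mean entropy so that the most uncertain ones are labeled first. The function names and the toy pool below are hypothetical, and mean entropy is only one possible selection criterion.

```python
import numpy as np

def mean_entropy(probs):
    """Mean per-pixel entropy in bits for probabilities of shape (classes, ...)."""
    pixel_entropy = -np.sum(probs * np.log2(probs + 1e-12), axis=0)
    return float(pixel_entropy.mean())

def most_uncertain(prob_maps, k):
    """Return the indices of the k probability maps with the highest mean entropy."""
    scores = np.array([mean_entropy(p) for p in prob_maps])
    # argsort is ascending, so reverse it to put the most uncertain first
    return np.argsort(scores)[::-1][:k].tolist()

# Toy pool of unlabeled images: 2 classes on 8x8 images, with increasing confidence
pool = []
for p_foreground in (0.5, 0.7, 0.9, 0.99):
    foreground = np.full((8, 8), p_foreground)
    pool.append(np.stack([foreground, 1.0 - foreground]))

print(most_uncertain(pool, k=2))  # [0, 1]: the two least confident images
```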
- Entropy is an extremely powerful concept to measure the randomness or uncertainty of a system.
- It is possible to leverage Entropy in Image Segmentation. This approach is model-free and only uses the softmax output tensor.
- Uncertainty estimation is overlooked, but it is crucial. Good Data Scientists know how to make good models. Great Data Scientists know where their models fail and use this to improve learning.