Beyond Precision and Recall: A Deep Dive into the Tversky Index

Exploring an alternative metric

Mikhail Klassen
Towards Data Science
Photo by Ricardo Arce on Unsplash

In the world of data science, metrics are the compass that guides our models to success. While many are familiar with the classic measures of precision and recall, there is actually a wide array of other options worth exploring.

In this article, we’ll dive into the Tversky index. This metric, a generalization of the Dice and Jaccard coefficients, can be extremely useful when you need to balance precision and recall against each other. When implemented as a loss function for neural networks, it can be a powerful way to deal with class imbalances.
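As a preview, here is a minimal sketch of the idea in Python, assuming the standard formulation of the Tversky index in terms of true-positive, false-positive, and false-negative counts. The function name and default weights below are illustrative choices, not a fixed API.

```python
def tversky_index(tp: float, fp: float, fn: float,
                  alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky index computed from confusion-matrix counts.

    alpha weights the false positives and beta weights the false negatives.
    With alpha = beta = 0.5 this reduces to the Dice coefficient (F1 score);
    with alpha = beta = 1.0 it reduces to the Jaccard index.
    """
    return tp / (tp + alpha * fp + beta * fn)
```

Increasing beta relative to alpha penalizes false negatives more heavily (and vice versa), which is the lever that makes a Tversky-style loss appealing when one class is rare.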

A quick refresher on precision and recall

Imagine you are a detective tasked with capturing criminals in your town. In truth, there are 10 criminals roaming the streets.

In your first month, you bring in 8 suspects you assume to be criminals. Only 4 of them end up being guilty, while the other 4 are innocent.

If you were a machine learning model, you’d be evaluated on your precision and recall.

Precision asks: “of all those you caught, how many were criminals?”

Recall asks: “of all the criminals in the town, how many did you catch?”

Precision is a metric that captures how accurate your positive predictions are, without regard to how many true positives you miss (false negatives). Recall measures how many of the true positives you capture, irrespective of how many false positives you pick up along the way.

How does your detective work rate against these metrics?

  • precision = 4 / (4 + 4) = 0.5
  • recall = 4 / (4 + 6) = 0.4
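To make the arithmetic concrete, here is a small sketch in Python that reproduces these two numbers from the counts in the detective example; the helper names are my own, chosen for illustration.

```python
def precision(tp: int, fp: int) -> float:
    # Of everyone you caught, what fraction were actually criminals?
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    # Of all the criminals in town, what fraction did you catch?
    return tp / (tp + fn)

# Detective example: 4 guilty suspects caught, 4 innocent suspects caught,
# 6 criminals still at large.
print(precision(tp=4, fp=4))  # 0.5
print(recall(tp=4, fn=6))     # 0.4
```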

Balancing precision and recall: the F1 metric

In an ideal world, your classifier has both high precision and high recall. As a single measure of how well your classifier does on both, the F1 score takes the harmonic mean of the two:

F1 = 2 × (precision × recall) / (precision + recall)

This metric is also sometimes called the Dice similarity coefficient (DSC).
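Continuing the detective example, here is a quick sketch of the F1 computation, reusing the precision and recall values from above.

```python
def f1_score(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

print(f1_score(0.5, 0.4))  # ~0.444
```

Because the harmonic mean punishes imbalance between the two inputs, a detective with excellent recall but poor precision (or vice versa) still ends up with a mediocre F1 score.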

Measuring similarity another way…
