Your Features Are Important? It Doesn’t Mean They Are Good | by Samuele Mazzanti | Aug, 2023

“Feature Importance” is not enough. You also need to look at “Error Contribution” if you want to know which features are beneficial for your model.

[Image by Author]

The concept of “feature importance” is widely used in machine learning as the most basic type of model explainability. For example, it is used in Recursive Feature Elimination (RFE) to iteratively drop the least important feature of the model.

However, there is a misconception about it.

The fact that a feature is important doesn’t imply that it is beneficial for the model!

Indeed, when we say that a feature is important, this simply means that the feature makes a large contribution to the predictions made by the model. But we should consider that this contribution may be wrong.

Take a simple example: a data scientist accidentally leaves the Customer ID among the model’s features. The model uses Customer ID as a highly predictive feature. As a consequence, this feature will have a high feature importance even though it is actually worsening the model, because it cannot work well on unseen data.

To make things clearer, we will need to make a distinction between two concepts:

  • Prediction Contribution: what part of the predictions is due to the feature; this is equivalent to feature importance.
  • Error Contribution: what part of the prediction errors is due to the presence of the feature in the model.
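To make the two definitions concrete, here is a minimal sketch. It assumes that each prediction can be written as a baseline plus a sum of per-feature contributions (for instance, SHAP-style values). That additive decomposition, the function names, and the toy numbers are all assumptions made for illustration, not something stated above. Error Contribution is approximated here as the average change in absolute error when a feature's contribution is removed from the prediction.

```python
import numpy as np

def prediction_contribution(contribs):
    # Mean absolute additive contribution of each feature (one value per column).
    return np.abs(contribs).mean(axis=0)

def error_contribution(contribs, baseline, y_true):
    # Full prediction: baseline plus the sum of all per-feature contributions.
    y_pred = baseline + contribs.sum(axis=1)
    abs_error = np.abs(y_true - y_pred)
    # Prediction with a single feature's contribution removed, for each feature.
    y_pred_without = y_pred[:, None] - contribs
    abs_error_without = np.abs(y_true[:, None] - y_pred_without)
    # Positive: the feature increases the error on average. Negative: it reduces it.
    return (abs_error[:, None] - abs_error_without).mean(axis=0)

# Hypothetical toy data: 3 samples, 2 features (values chosen for illustration only).
contribs = np.array([
    [ 2.0, -1.0],
    [-2.0,  1.0],
    [ 1.0,  3.0],
])
baseline = 10.0
y_true = np.array([12.0, 8.0, 11.0])

pred_contrib = prediction_contribution(contribs)
err_contrib = error_contribution(contribs, baseline, y_true)
print("Prediction Contribution:", pred_contrib)
print("Error Contribution:", err_contrib)
```

In this toy data both features have the same Prediction Contribution, yet the first has a negative Error Contribution (it reduces the error) and the second a positive one (it increases the error). That is the gap between "important" and "beneficial".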

In this article, we will see how to calculate these quantities and how to use them to get valuable insights about a predictive model (and to improve it).

Suppose we built a model to predict the income of people based on their job, age, and nationality. Now we use the model to make predictions on three people.

Thus, we have the ground truth, the model prediction, and the resulting error:
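As a minimal sketch of this setup, the snippet below uses made-up incomes (hypothetical values in thousands of dollars, not the article's numbers) and assumes the error is defined as prediction minus ground truth.

```python
# Hypothetical values for three people (illustration only).
people = ["person_1", "person_2", "person_3"]
ground_truth = [60.0, 30.0, 45.0]
prediction = [55.0, 38.0, 45.0]

# Assumed convention: error = prediction - ground truth.
error = [p - t for p, t in zip(prediction, ground_truth)]
abs_error = [abs(e) for e in error]

for name, t, p, e in zip(people, ground_truth, prediction, error):
    print(f"{name}: ground truth={t}, prediction={p}, error={e}")
```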
