Exploratory Data Analysis: Unraveling the Story Within Your Dataset | by Deepak Chopra | Talking Data Science | Jul, 2023

The secret art of exploring — Understanding, cleaning, and unveiling the hidden insights within your dataset

Deepak Chopra | Talking Data Science
Towards Data Science

For a data enthusiast, exploring a new dataset is an exciting endeavour. It gives us a deeper understanding of the data and lays the foundation for successful analysis. Getting a good feel for a new dataset is not always easy, and it takes time. However, a thorough exploratory data analysis (EDA) goes a long way toward understanding your dataset: how things are connected and what needs to be done to process it properly.

In fact, you will probably spend 80% of your time on data preparation and exploration and only 20% on actual data modelling. For other types of analysis, exploration might take an even larger proportion of your time.

Exploratory Data Analysis, simply put, refers to the art of exploring data. It is the process of investigating data from different angles to enhance your understanding, explore patterns, establish relationships between variables and, if required, enhance the data itself.
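To make "investigating data from different angles" a little more concrete, here is a minimal sketch of a first pass in pandas. The toy dataset, its column names and the `first_look` helper are all invented for illustration; they are assumptions, not something taken from a real project, and a real exploration would go well beyond these few views.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset, invented purely for illustration.
df = pd.DataFrame({
    "age": [23, 35, 41, 29, np.nan, 52],
    "income": [32000, 54000, 61000, 45000, 38000, 80000],
    "segment": ["a", "b", "b", "a", "a", "c"],
})


def first_look(frame: pd.DataFrame) -> dict:
    """Collect a few basic views of a DataFrame for a first pass."""
    numeric = frame.select_dtypes(include="number")
    return {
        "shape": frame.shape,             # (rows, columns)
        "dtypes": frame.dtypes,           # data type of each column
        "missing": frame.isna().sum(),    # missing values per column
        "summary": numeric.describe(),    # count, mean, std, quartiles
        "correlation": numeric.corr(),    # pairwise Pearson correlation
    }


report = first_look(df)
print(report["shape"])
print(report["missing"])
print(report["correlation"])
```

Each entry looks at the same data from a different angle: its size, its types, its gaps, its distributions and the relationships between numeric variables.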

It’s like going on a ‘blind’ date with your dataset, sitting across the table from this enigmatic collection of numbers and text, yearning to understand it before embarking on a serious relationship. Just like a blind date, EDA allows you to uncover the hidden facets of your dataset. You observe patterns, detect outliers, and explore the nuances before making any significant commitments. It’s all about getting acquainted with the numbers, ensuring you’re on solid ground before drawing conclusions.
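As one illustration of what detecting outliers can look like in practice, the sketch below uses the interquartile range (IQR) rule, a common convention that flags values lying more than 1.5 IQRs outside the quartiles. This is only one of several possible approaches, and the `iqr_outliers` helper and the sample values are hypothetical.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]


# Hypothetical measurements with one suspicious entry.
data = pd.Series([10, 12, 11, 13, 12, 11, 95])
print(iqr_outliers(data))  # flags only the value 95
```

A flagged value is a prompt for a closer look, not automatically an error: it may be a typo, or it may be the most interesting point in the dataset.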

We’ve all been there: knowingly or unknowingly, whether delving into statistical tools or sifting through reports, we’ve all explored some kind of data at some point!

As data scientists, we are supposed to understand the data best. We must become the experts when it comes to understanding and interpreting it. Whether the work involves experimentation frameworks or something simpler, the outcome is only as good as the data on which it is based.
