Exploratory data analysis (EDA) is the first, and arguably the most important, step in any machine learning problem. EDA gives us an overview of the data: we get a bird's-eye view and then look for interesting patterns and relationships that later help with predictive or statistical analysis.
This is my third EDA project. Previously, I explored two completely different kinds of data:
- IPL data
- Movies data
Data science is mostly about logic. If we understand the logic, the coding part is no rocket science. The best way to improve our logical thinking is to explore more and more datasets: the more we explore, the better we get at reasoning about data. EDA is nothing but a logical framework for our thinking. Most data science projects do not hand data scientists specific questions to answer. Instead, data scientists explore the data and come up with logical questions that can help meet business needs, and EDA is how they uncover the different aspects of the data.
In line with these thoughts, I have tried my best to find some interesting patterns in the employee data. The dataset consists of roughly 1500 rows and 9 columns. The data was nicely structured, with no missing values or other data quality issues, so no cleaning or preprocessing was required. I imported the data along with the necessary packages in Python and started exploring it. The findings for each question are written in bullet points right below the corresponding graph or table.
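A check like "no missing values" is easy to confirm in code before starting the exploration. The snippet below is only a minimal sketch of how that could look with pandas, not the exact code used for this analysis. The helper `quality_report`, the inline `SAMPLE_CSV`, and its column names are all hypothetical stand-ins. The sample deliberately contains one blank cell so that the check has something to report, whereas the real employee data had none.

```python
import io

import pandas as pd

# Hypothetical stand-in for the employee data. The column names and values
# are illustrative only; the last row has a blank MonthlyIncome on purpose.
SAMPLE_CSV = """Age,Department,MonthlyIncome
34,Sales,5200
41,Research,6100
29,Sales,
"""


def quality_report(df: pd.DataFrame) -> dict:
    """Summarize the shape, missing values, and duplicate rows of a DataFrame."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        # Number of missing (NaN) cells in each column.
        "missing_per_column": df.isna().sum().to_dict(),
        # Number of rows that exactly repeat an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
    }


# In a real notebook this would be pd.read_csv(<path to the dataset>).
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
report = quality_report(df)
print(report)
```

On a clean dataset, every count in `missing_per_column` and the `duplicate_rows` value would be zero, which is what justifies skipping the cleaning step.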