What is Data?
Data is a collection of raw facts and figures. It can describe an object or a person. The meaning obtained by processing data is called information. Thus, data and information are two different things.
- Data → Facts and Figures
- Information → Processed data which is understood better
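To make the distinction concrete, here is a minimal Python sketch. The temperature readings are made-up values and the `summarize` helper is a hypothetical name used only for illustration: the list holds the raw data, and the summary computed from it is the information.

```python
from statistics import mean

# Raw data: daily temperature readings in degrees Celsius (made-up values).
readings = [21.5, 23.0, 19.5, 24.0, 22.0]


def summarize(values):
    """Turn raw figures into a small summary (the 'information')."""
    return {
        "count": len(values),
        "average": mean(values),
        "lowest": min(values),
        "highest": max(values),
    }


info = summarize(readings)
print(info)
```

The numbers alone say little, while the summary answers questions such as "what was the typical temperature?" at a glance.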
Credits of Cover Image - Photo by Hunter Harritt on Unsplash
Data Analysis - Meaning
With data and information defined, the term "Data Analysis" becomes clearer. Data Analysis combines Statistics, Mathematics, and Computer Science to process raw data into insightful, valuable information. You might ask: Statistics and Mathematics make sense, but why Computer Science? Programming knowledge helps with analyzing data in several ways, including -
- Process Automation
- Handling Large Datasets
- Querying Databases
- Creating Models
- Data Visualization
- Dashboard Development
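As one illustration of the list above, querying a database can be sketched with Python's built-in `sqlite3` module. The `sales` table, its columns, and the rows are all made up for this example; a real project would connect to an existing database instead.

```python
import sqlite3

# Build a throwaway in-memory database with made-up sales rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
rows = [("north", 120.0), ("south", 80.0), ("north", 60.0), ("south", 40.0)]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Query: total sales per region, sorted by region name.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

for region, total in totals:
    print(f"{region}: {total}")
```

The same few lines work unchanged whether the table has four rows or four million, which is where programming starts to pay off over manual work.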
Tools / Languages
Of course, we cannot analyze data with just paper and pencil. We need a platform that supports all three - Stats, Math, and Programming.
Tools involved in Data Analysis -
- Python
- R
- Julia
- MATLAB
Note - There are many languages and tools available, but here I talk about Python. If you want to see the full list, refer to this article.
Packages
To get started in the field of data, learning Python is beneficial in many ways. Python has a wide variety of packages that have been developed over the years. From data collection to data modeling, Python has you covered.
List of Packages or libraries:
- NumPy
- Pandas
- Statsmodels
- Matplotlib
- OpenCV
- Scikit-Learn
- PyTorch
- TensorFlow
- Plotly
- PySpark, etc.
These packages are used extensively for data-related problems. There is no need to learn all of them up front, as long as one is curious enough to understand the problem and implement a method. But the deeper one goes, the more a deeper knowledge of these packages becomes a must.
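As a small taste of what one of these packages looks like in practice, here is a minimal NumPy sketch. It assumes NumPy is installed, and the exam scores are made-up values used only for illustration.

```python
import numpy as np

# Made-up exam scores: one row per student, one column per subject.
scores = np.array([
    [70, 80, 90],
    [60, 75, 85],
    [90, 95, 100],
])

subject_means = scores.mean(axis=0)  # average per subject (down each column)
student_means = scores.mean(axis=1)  # average per student (across each row)

print("Per subject:", subject_means)
print("Per student:", student_means)
```

Computing both sets of averages takes one call each, with no explicit loops, which is typical of how these libraries shorten everyday analysis tasks.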
End
Well, that's all for now. This article is part of the series Exploratory Data Analysis, where I share tips and tutorials to help you get started. We will learn step-by-step how to explore and analyze data. As a prerequisite, knowing the fundamentals of programming would be helpful. The link to this series can be found here.