Take a moment to bow down to the soul of data science: Statistics!
We all know that statistics is crucial for working with any sort of data efficiently. Used correctly, it helps us draw useful insights from data, no matter how complex it looks at first sight. Still, it is easy to get confused about where to start learning statistical concepts, because there is a plethora of resources to choose from.
In this blog, I will lay out a practical plan to learn statistics for data science. This plan worked wonders for me, so I believe it may be of some use to you too. Treat it as a rough sketch, and feel free to customize it as per your needs.
For your convenience, I have divided this plan into 3 groups: Basic, Intermediate, and Advanced. Once you have learned the basic and intermediate concepts, you will be in a position to start working on real-world data. If you aim higher and want to become a data wizard, the advanced concepts should definitely be in your arsenal. Remember, when it comes to learning, the sky is the limit!
Group I: Basics
Basics: These are the concepts that most of us learned in high school. However, it's likely that some of us have lost touch with them, so I strongly recommend paying close attention here: these topics form the foundation of statistics as a subject (hence the name: basic).
1. What is Statistics? Why Statistics?
2. Descriptive and Inferential statistics
3. Samples and Population.
4. Mean, Median, and Mode (and how they’re used to measure central tendency).
5. Variance and Standard Deviation (Measures of Dispersion).
6. Sampling Methods.
7. Variables and their types.
8. Frequency Distribution.
9. Cumulative Frequency.
10. Histogram Analysis.
11. Percentiles and Quantiles.
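To make a few of these basics concrete, here is a minimal sketch using only Python's standard library. The exam scores are made up for illustration, and the variance shown is the sample variance (dividing by n - 1); treat it as one possible way to explore these ideas, not a fixed recipe.

```python
import statistics
from collections import Counter
from itertools import accumulate

# Hypothetical sample: exam scores for nine students (made up for illustration).
scores = [50, 60, 60, 70, 70, 70, 80, 80, 90]

# Measures of central tendency.
mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value

# Measures of dispersion (sample versions, dividing by n - 1).
variance = statistics.variance(scores)
std_dev = statistics.stdev(scores)

# Frequency distribution and cumulative frequency.
freq = Counter(scores)                                  # value -> count
values = sorted(freq)                                   # distinct values, ascending
cumulative = list(accumulate(freq[v] for v in values))  # running total of counts

# Quartiles: the three cut points that split the data into four equal parts.
q1, q2, q3 = statistics.quantiles(scores, n=4)

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", variance, "std dev:", round(std_dev, 3))
for v, c in zip(values, cumulative):
    print(f"score {v}: frequency {freq[v]}, cumulative {c}")
print("quartiles:", q1, q2, q3)
```

Try changing a single score to an extreme value and watch how the mean moves while the median barely does.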
Group II: Intermediate
Intermediate: Now that you feel comfortable and confident with the basics, you can dive into the intermediate topics listed below for a clearer picture of statistics as a tool. A few of the topics in this list might be new to many people, but believe me, picking up new topics should be a cakewalk if you are well-versed in the fundamentals.
1. Five Number Summary.
2. Inter-quartile Range.
3. Boxplot Analysis.
4. Probability Density Function.
5. Gaussian Distribution and Empirical Rule.
6. Concept of Z-Score.
7. Standardization and Normalization.
8. Central Limit Theorem.
9. Chebyshev’s Inequality.
10. Covariance.
11. Pearson Correlation Coefficient.
12. QQ Plot.
13. Bernoulli Distribution.
14. Log-Normal Distribution.
15. Power Law Distribution.
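A few of the intermediate topics can be tried out in a handful of lines. The sketch below, again with made-up data and only the standard library, covers the five-number summary, the inter-quartile range, z-scores, and the empirical rule. Two assumptions to flag: "normalization" is taken here to mean min-max scaling to [0, 1], which is one common usage of the term, and the z-scores use the population standard deviation.

```python
import statistics
from statistics import NormalDist

# Hypothetical data (made up for illustration).
data = [50, 60, 60, 70, 70, 70, 80, 80, 90]

# Five-number summary: minimum, Q1, median, Q3, maximum.
q1, median, q3 = statistics.quantiles(data, n=4)
five_number_summary = (min(data), q1, median, q3, max(data))

# Inter-quartile range: the spread of the middle half of the data.
iqr = q3 - q1

# Standardization: z = (x - mean) / standard deviation.
mu = statistics.mean(data)
sigma = statistics.pstdev(data)  # population standard deviation (an assumption)
z_scores = [(x - mu) / sigma for x in data]

# Normalization, assumed here to mean min-max scaling to the range [0, 1].
lo, hi = min(data), max(data)
normalized = [(x - lo) / (hi - lo) for x in data]

# Empirical rule: share of a normal distribution lying within
# 1, 2, and 3 standard deviations of the mean.
std_normal = NormalDist()  # mean 0, standard deviation 1
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

print("five-number summary:", five_number_summary)
print("IQR:", iqr)
print("z-scores:", [round(z, 2) for z in z_scores])
print("normalized:", [round(v, 2) for v in normalized])
for k, share in within.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
```

The last loop shows where the familiar 68-95-99.7 figures come from.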
Optional: This is a good time to start implementing projects based on what you’ve learned so far. Doing so will help you relate to these concepts at an “application level” and should motivate you to learn further.
Group III: Advanced
Advanced: If you have made it this far, you are a true data lover. More power to you, captain!
As the name suggests, these topics are for seasoned learners (i.e., those who are already comfortable with the basic and intermediate topics).
1. Box-Cox Transform.
2. Confidence Interval.
3. Type I and Type II Errors.
4. One-Tailed and Two-Tailed Tests.
5. Hypothesis testing.
6. P-value.
7. T-Test.
8. Z-Test.
9. ANOVA Test.
10. Chi-square Test.
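To see how several of these advanced ideas fit together, here is a rough sketch of a two-tailed, one-sample z-test with a matching confidence interval, using only the standard library. Every number in it is hypothetical: the sample, the null-hypothesis mean `mu_0`, the "known" population standard deviation `sigma`, and the significance level `alpha` are all assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical setup (all numbers made up for illustration):
# H0: the population mean is 70. H1: it is not 70 (two-tailed).
sample = [74, 78, 69, 81, 77, 72, 80, 75, 79]
mu_0 = 70     # mean under the null hypothesis
sigma = 12    # population standard deviation, assumed known
alpha = 0.05  # significance level = accepted Type I error rate

n = len(sample)
x_bar = mean(sample)
standard_error = sigma / sqrt(n)

# Test statistic: how many standard errors the sample mean lies from mu_0.
z = (x_bar - mu_0) / standard_error

# Two-tailed p-value: probability of a |z| at least this large if H0 is true.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_null = p_value < alpha

# (1 - alpha) confidence interval for the population mean.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
ci = (x_bar - z_crit * standard_error, x_bar + z_crit * standard_error)

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
print(f"{100 * (1 - alpha):.0f}% confidence interval: ({ci[0]:.2f}, {ci[1]:.2f})")
```

A z-test is used here only because it needs nothing beyond the normal distribution. With a small sample and an unknown population standard deviation, a t-test would be the more usual choice.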