There are many documents, online courses, blogs, and websites about data science (I know it is the sexiest job of the 21st century). I have been learning data science for my work and studies. I am in “Sponge Mode,” which means “soaking in as much theory and knowledge as possible to give yourself a strong foundation,” but sometimes I feel “drowned” by these “oceans of resources” (even before being sunk by “Big” data). I realize that study materials are copious. We have more than enough resources to study. The real problem is time, determination, and consistency.
Many books present the same knowledge. This note is about some valuable sources I have used to learn. It helps me review them and refer back when necessary. I hope you will find it helpful if you are also a beginner. If you are a data expert, especially in the air transport industry or business, please comment and give me some recommendations for improvement. It would be great to have a mentor.
(I will update this note whenever I find new and more useful sources).
I started self-studying data science during my master’s course to support my air transport management research. I have an economics background, and my mathematics and statistics skills are not strong. When I started, my coding skill was zero, and data science felt very complicated to learn. However, studying is a journey, and challenges make it more interesting. In October 2022, I will start my PhD at the University of Nottingham, where I will learn more about computer science and applied data to solve real problems. I believe that this self-learning experience strengthened my scholarship application.
There are many free online courses suitable for social science students that you can start with before moving on to more complicated data science topics.
- Using Big Data to Solve Economic and Social Problems from Harvard University. This course does not focus on coding, but it shows you how to read data. It is an amazing course.
- The Analytics Edge from the MIT Sloan School of Management
My data science study route covers four areas: Mathematics (Linear Algebra, Calculus, and Gradient Descent), Statistics and Probability, Machine Learning, and Programming Languages (R and Python). Besides these, I also study SQL for accessing and manipulating databases.
Because data science covers so many problems and so many things to learn, I had to define my study objective. I want to apply algorithms and programming skills to process data and generate business insights. Therefore, I will not cover other technologies here (such as speech recognition, advanced image recognition, and building chatbots…).
1. Mathematics
I studied mathematics quite hard from primary school onward. Still, at university I focused more on management subjects, and I forgot most of the complicated math I had learned. However, that knowledge can be recovered quickly. I found this Specialization on Coursera useful for reviewing the mathematics needed for data science. It is at beginner level.
Mathematics for Machine Learning Specialization by Imperial College London
This is a free course with videos, readings, exams, and assignments covering Linear Algebra and Calculus. Coursera is my favourite online learning platform. I like its interface design and course organisation.
You can also read about my experience applying for Coursera Financial Aid, which can help fund your study, in the blog post below. I wrote it in Vietnamese, but you can use the small button on the right side of the window to read the automatic English translation of all posts on my blog.
Honestly, for my learning style, I prefer reading books because it is easier for me to acquire, digest, process, understand, and remember knowledge that way. Thank God, there are so many high-quality free books to read. My favourite source is Springer (free books), although I cannot read them all.
Here are some of the books that are useful for learning mathematics (they provide deeper knowledge than the online course).
2. Statistics and Probability
Honestly, I don’t remember how many statistics classes I have attended since high school. I learned, passed the exams, and forgot. As an undergraduate, I was even scared of the statistics and econometrics subjects. How did I overcome my fear of statistics? I found this book by chance in the library and was impressed by its name. It was my saviour. It is a fantastic book that explains every concept clearly and thoroughly.
Taking the Fear Out of Data Analysis by Adamantios Diamantopoulos
I am taking the “Statistics with Python Specialization” by the University of Michigan on Coursera. Besides explaining inferential statistical analysis and statistical models, it also provides guidelines for using Python in analysis and visualization, along with many reading materials, websites, and online reference tools.
Besides, if you want to review statistics with Python again (it is pretty normal to study, then forget), please look at these free online books by Allen B. Downey. They come with GitHub repositories where you can see code examples and solutions to the exercises.
Think Stats – 2nd edition (an introduction to Probability and Statistics for Python programmers.)
Think Bayes (an introduction to Bayesian statistics using Python)
3. Machine Learning
Many people recommend the Machine Learning course by Andrew Ng (Stanford University) as a good source for studying Machine Learning. This course covers most of the topics in Machine Learning, starting from the beginner level. I have finished the older version of this course (Machine Learning using Octave), and I will take the new version soon.
I have read two books to learn Machine Learning using R and Python, both published by Packt.
These books explain the theory and then give examples with code. They are easy to understand.
4. Programming Language
I studied R first when I did my master’s dissertation. After that, I found that Python is more accessible and reads quite like human language. I still find R excellent, especially for data visualization, so I decided to study both. I think programming languages are tools, and we should first understand the theory and knowledge needed to build algorithms.
Before learning about programming, I discovered “Computational Thinking” first. I highly recommend this course. It is excellent. Although I found it challenging to pass all the assignments, I gained a lot from it.
Computational Thinking for Problem Solving by the University of Pennsylvania
Computational thinking is the process of approaching a problem systematically. We divide it into small problems, create algorithms, and use the Python programming language to help the computer understand and solve the problems. After this course, I realised that Python is quite similar to English. 🙂
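Here is a minimal sketch of that process. It is my own illustration, not something taken from the course. The made-up question is "which route carried the most passengers?", and the records, route codes, and function names are all hypothetical. The problem is split into two small problems, each with its own simple algorithm, and then the pieces are combined.

```python
from collections import defaultdict

# Hypothetical sample data: (route, passengers) records, invented for illustration.
records = [
    ("HAN-SGN", 180),
    ("HAN-DAD", 150),
    ("HAN-SGN", 200),
    ("SGN-DAD", 120),
    ("HAN-DAD", 170),
]


def total_by_route(rows):
    """Small problem 1: add up the passengers for each route."""
    totals = defaultdict(int)
    for route, passengers in rows:
        totals[route] += passengers
    return dict(totals)


def busiest_route(totals):
    """Small problem 2: pick the route with the largest total."""
    return max(totals, key=totals.get)


def solve(rows):
    """Combine the two small solutions to answer the original question."""
    totals = total_by_route(rows)
    return busiest_route(totals), totals


route, totals = solve(records)
print(route, totals[route])
```

Each function can be checked on its own before the pieces are put together, which is the main benefit of breaking the problem down first.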
I found this interactive book helpful for Python beginners. The author is excellent at explaining.
Another book by Allen B. Downey is also helpful for beginners. The book introduces data science problems and ways to express them in Python, from basic to advanced. You will also find some challenging exercises in this book. Access the online version and the solutions and code examples via this link.
If you are looking for a more structured path to study, you can look through the learning paths from IBM via https://cognitiveclass.ai/learn. They are totally free, and you earn certificates and badges from IBM to record your achievements.
I have finished the IBM Data Science Professional Certificate on Coursera (9 courses). This program is suitable for beginners who want to explore data science using Python.
I think that taking courses can provide knowledge, but it cannot make you a master. We must practice on real problems to improve our skills and avoid forgetting. Learning is a long journey, but a journey of a thousand miles begins with a single step.
I hope this note will be helpful to you, and I will update more references in the future.
Please subscribe to receive notifications for new posts.
You can donate to support me in maintaining it.