After I got accepted into the program, one of my first worries was how I was ever going to learn R and Python, the two most common languages used in this program. I decided to list a few of the resources that I have used. The list is by no means comprehensive, but it should give you enough information to decide which of them you want to use.
There are three resources at the top of my list: DataCamp, Udemy, and Lynda. I have done courses on all three. Each of the beginning courses in the program includes a curriculum for learning these languages, based on some assigned texts. These are very informative and useful, but personally I would recommend having some skills going into the classes, so taking some of these online courses before the MSPA class starts would be helpful. That way you are not learning the course content and a new programming language at the same time. For students who don’t have any programming experience prior to the class, though, they do say it is doable.
In addition, I have found it useful to do my coding in Jupyter Notebooks. See my previous blog post “Using Jupyter Notebooks to Learn R, Python” for more information.
DataCamp is a great resource for learning both R and Python, and it is my favorite. There are a few free courses, but for most courses you will need to subscribe either monthly ($29) or yearly ($300), which gives you unlimited access to all of their courses during that time. It is well worth the money. There are multiple courses for R and Python on importing and cleaning data, data manipulation, data visualization, probability and statistics, machine learning, and more. I like the interactive structure where you watch a video, go over some questions, and then write code in the site's interactive coding environment and get feedback. This is the perfect format for me.
Udemy is an online marketplace with over 42,000 courses for sale. The trick is to wait for the fairly frequent sales, when you can buy these courses at steep discounts. I have a fair number of these, covering R and Python, Excel, Python and MongoDB, SQL, Apache Spark and Python, Data Visualization with Python and Matplotlib, Data Science, Python and Pandas, MapReduce and Hadoop, Machine Learning, Time Series analysis in R, Linear Modeling in R, a bunch of statistical applications in R, Power BI, Apache Spark with Scala, etc. The listed price on some of these courses is quite steep, but if you wait for a sale, you can get them very cheaply. The quality of the instruction does vary, so I would check out the reviews first.
Lynda is a resource with more than 4,000 courses that you normally have to pay for. Once you have been accepted into the MSPA program and have a student account set up, however, you get access to Lynda.com for free, and I would take advantage of this resource. There are instructions about how to sign up if you are a student.
Some other resources:
The Data Science Learning Club, from “The Becoming a Data Scientist Podcast”, is a great site that will walk you through the steps of setting up your environment; finding, importing, and exploring a dataset; creating visuals for exploratory data analysis; Naive Bayes classification; k-means clustering; linear regression; model evaluation; and many more. It is worth checking out, as is listening to the podcasts.
The free online textbook, How to Think Like a Computer Scientist, runs through Python coding very thoroughly, and is worth checking out.
Udacity has individual courses as well as “Nanodegree” programs. Each Nanodegree program has course requirements, and once you complete all the courses, you receive your Nanodegree. There are Nanodegree programs in Artificial Intelligence, Predictive Analytics for Business, Machine Learning Engineer, Intro to Programming, and Data Analyst. These would complement the program, but would be difficult to get through while you are enrolled in it.