I currently have about a month off between my last course, Math for Modelers, and my next course, Statistical Analysis, in Northwestern University’s Master of Science in Predictive Analytics program. I have started to catch up on some things on my to-do list.
The first of these tasks is getting to know R better. I have a little previous R experience, but only at a very basic level. I am currently working my way through DataCamp’s excellent series of R programming courses. I completed the “Introduction to R” course, which consists of 6 chapters covering the basics, vectors, matrices, factors, data frames, and lists. It was very informative. I am almost finished with the “Intermediate R” course, which has 5 chapters covering conditionals and flow control, loops, functions, the apply family, and utilities. I highly recommend these courses for people starting to learn R; they are very nicely done.
I am also using Jupyter Notebooks as I go through these courses. I only just started using them, and I wish I had found them much earlier: I would have done my Python work in them as I went through the Math for Modelers course. I recently added the R kernel, and I have been running the code and taking notes on R in notebooks as I progress through the courses. I wish Northwestern University would consider using notebooks for courses that involve programming.
I am also starting to explore KNIME. I was first introduced to KNIME earlier this year when I visited Dr. Randall Moorman’s predictive monitoring lab at the University of Virginia. They were using KNIME on a very elaborate project, and I was impressed with the functionality of the platform. KNIME is an “open-source, enterprise-grade analytics platform” that lets users “discover the potential hidden in their data, mine for fresh insights, or predict new futures”. I am very early in my exploration of this platform, but I like what I have seen so far and am excited to work on a project using it. I will post further updates as I learn more about it.
Lastly, a few words about what I am listening to and reading. I am currently listening to the audio version of “The Master Algorithm” by Pedro Domingos. This is a must-read book for practitioners of predictive analytics and anyone who is interested in machine learning. I am also reading the print version of “Superforecasting: The Art and Science of Prediction” by Philip E. Tetlock and Dan Gardner, which is an excellent read as well. I will try to review both in more detail when I am done.