I love using Jupyter Notebooks to learn R and Python. I only wish I had discovered them when I first started to learn Python. The notebooks are a great way to take notes, run code, see the output of the code, and then visualize the output. The notebooks can be organized by language (i.e., Python vs. R), and also by the course you are taking or the book you are working your way through. You can then go back and view your notes and code for future reference.
Project Jupyter grew out of the IPython Project in 2014, and IPython notebooks are now Jupyter notebooks. Jupyter Notebooks are described as “a web application for interactive data science and scientific computing”. These notebooks support over 40 programming languages, and you can create notebooks with a Python kernel or an R kernel, amongst others. They are great for learning programming languages, and several academic institutions use them in their CS courses. They are also great for “reproducibility”, the ability to reproduce the findings that other people report. When you publish a notebook on GitHub, Dropbox, or the Jupyter Notebook Viewer, others can see exactly what was performed and run the code themselves.
Here is how I use Jupyter Notebooks. I create a new Jupyter notebook whenever I start something new, whether it is an official course in my Northwestern University Master of Science in Predictive Analytics, a web-based course like the ones I have been taking from DataCamp and Udemy, or a book that I am working my way through.
You first have to start Jupyter Notebook by typing “jupyter notebook” in your shell (I use Windows PowerShell). This opens a browser page titled “Home”.
If I want to open an existing notebook, I scroll down to the notebook of interest and open it. Here is a screenshot showing some of my notebooks.
If I want to start a new notebook, I go to the top, select “New”, and then choose either a Python or an R notebook, which opens up a new notebook. Jupyter comes with the Python kernel installed; for R, you go to IRkernel on GitHub to install the R kernel.
You type commands or text into “cells” and can run the cells individually or all together. The two cell types I use most are “Markdown” and “Code”. Markdown cells are used for taking notes and inserting text, so you do have to learn a few easy Markdown commands for headers and the like. Code cells are used to enter and run the code.
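To make the two cell types more concrete, here is a minimal sketch of how a notebook stores a Markdown cell and a Code cell. A notebook file is JSON, so the sketch uses only Python's built-in json module. The note text, the variable names, and the choice of format version 4.4 are my own assumptions for illustration, not something you normally write by hand.

```python
import json

# A Markdown cell holds notes. In Markdown, "#" starts a header and "-" a bullet.
markdown_cell = {
    "cell_type": "markdown",
    "metadata": {},
    "source": "# Chapter 1 notes\n\n- Lists are mutable\n- Tuples are not",
}

# A Code cell holds code to run. Its outputs list is filled in once the cell runs.
code_cell = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {},
    "outputs": [],
    "source": "numbers = [1, 2, 3]\nsum(numbers)",
}

# The notebook itself is a list of cells plus some version information
# (format version 4.4 is assumed here).
notebook = {
    "cells": [markdown_cell, code_cell],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Serialize to the JSON text that would be stored in an .ipynb file.
notebook_text = json.dumps(notebook, indent=1)

for cell in notebook["cells"]:
    print(cell["cell_type"], "cell:", cell["source"].splitlines()[0])
```

In everyday use Jupyter writes this file for you whenever you save; the sketch is only meant to show that notes and code live side by side in the same document.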
Once you have entered your code, you can run the cell in several ways. The most convenient is to press “Shift-Enter”, which runs the code in that cell and moves to the next cell, adding a new blank cell if you are at the end of the notebook.
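As a small illustration, this is the kind of thing you might type into a single Code cell before pressing Shift-Enter. The quiz scores are made-up example data, and the variable names are just placeholders.

```python
# Contents of one Code cell (the scores are hypothetical example data).
from statistics import mean

quiz_scores = [82, 90, 77, 95]
average_score = mean(quiz_scores)

# In a notebook, the value of the last expression in a cell is shown below it.
# print() is used here so the same code also shows its result as a plain script.
print(average_score)
```

Because variables stay in memory after a cell runs, a later cell can reuse quiz_scores or average_score without redefining them.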
Notebooks are great for creating and saving visualizations, as you can make minor changes and then compare plots. Here are a few examples.
There are a few things that don’t run smoothly yet, like installing packages in R. I have found the easiest way to add a package is to install it using RStudio, and then use the library command in Jupyter to load it into the notebook. Alternatively, you could use the following command each time:
install.packages("package name", repos = c("https://rweb.crmda.ku.edu/cran/")) # You can select your own CRAN mirror and insert it into the repos argument.
Overall, I love using Jupyter to take notes, run code while learning, and organize my learning so I can easily find it later. I see its huge potential for sharing data and making results easy to reproduce. Give it a try!