JupyterLab – Exciting Improvement on Jupyter Notebooks

At SciPy 2016, Brian Granger and Jason Grout presented JupyterLab, now in a pre-alpha release. This was the most exciting and monumental news of the conference for me. Fernando Perez has written a blog post about JupyterLab, and a video of the presentation is available on YouTube.

The blog post describes what today’s “Jupyter Notebook” already offers, most of which I have not used. Besides the notebooks themselves, this includes “a file manager, a text editor, a terminal emulator, a monitor for running Jupyter processes, an IPython cluster manager, and a pager to display help”. The new functionality allows you to “arrange a notebook next to a graphical console, atop a terminal that is monitoring the system, while keeping the file manager on the left”. Users of RStudio will be happy to see this. (I wonder whether they are going to create a package manager like the one in RStudio.)

Here are a few screenshots of what it looks like.


You can download JupyterLab now and help “test and refine the system”. Instructions to do this are here.

Jupyter Notebook, matplotlib figure display options, and pandas.set_option() optimization tips.

I prefer to do my coding in a Jupyter Notebook, as my previous posts have mentioned. However, I have not run across any good documentation on how to optimize the notebook for either a Python or an R kernel, so I am going to mention a few helpful hints I have found. Here is the link to the Project Jupyter site.

First, a basic comment on how to create a notebook where you want it. You need to navigate to the directory where you want the notebook to be created. I use the Windows PowerShell command-line shell. When you open it, you are in your home directory. Use the “dir” command to see what is in that directory, and then use the “cd” (change directory) command to navigate to the directory you want to end up in. If it is a longer path, or one that contains spaces, you should enclose it in quotes. If you need to create a new directory, use the “md” or “mkdir” command. For example, my long path is “….\Jupyter Notebooks\Python Notebooks”, and while at SciPy 2016 I created a new folder, “….\Jupyter Notebooks\Python Notebooks\SciPy16”, to which I added a folder for each tutorial I attended.
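If you would rather set up the folders from Python than from the shell, the same layout can be sketched with pathlib. This is only an illustration: the temporary base directory stands in for the elided parent path above, and the tutorial folder names are hypothetical.

```python
import tempfile
from pathlib import Path

# Assumption: a temporary directory stands in for the real parent folder,
# which is not spelled out in the post.
base = Path(tempfile.mkdtemp())

# Same folder names as in the post; parents=True creates every missing level,
# much like "mkdir" does for a nested path in PowerShell.
notebook_dir = base / "Jupyter Notebooks" / "Python Notebooks" / "SciPy16"
notebook_dir.mkdir(parents=True, exist_ok=True)

# Hypothetical tutorial names, one folder per tutorial attended.
for tutorial in ["pandas_tutorial", "another_tutorial"]:
    (notebook_dir / tutorial).mkdir(exist_ok=True)

created = sorted(p.name for p in notebook_dir.iterdir())
print(notebook_dir)
print(created)
```

Because pathlib joins the parts for you, spaces in folder names need no quoting here.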

Once you get into the final directory, type “jupyter notebook”, and the Jupyter “Home” page will open in your browser. If your notebook already exists, you can select it here. If it doesn’t yet exist, select “New” in the upper right, select your notebook type (for me, R or Python 3), and it will launch the notebook. (This notebook is from a pandas tutorial I attended at SciPy 2016, “Analyzing and Manipulating Data with Pandas” by Jonathan Rocher, an excellent presentation if you want to watch the video of it being created.)


Once you click on the “pandas_tutorial”, this Jupyter notebook will open up.


A nice feature is that if you clone a GitHub repository into that folder and start Jupyter Notebook there, all the files that go with that repository are immediately available for use.

Importing data in a Jupyter Notebook.

If you are tired of hunting down the path for a data set, there is an easy way to get it into the directory of the Jupyter notebook. Go to the “Home” page and select “Upload”, and you will be taken to the file upload dialog. Navigate to where you stored the data set on your computer, select it, and it will be uploaded to the directory shown on the Home page. You can then easily load it into any Jupyter notebook in that directory.
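Once the file sits next to the notebook, a bare file name is enough to load it. Here is a minimal sketch, assuming the data set is a CSV file and the notebook lives in the folder you uploaded to. The file name is hypothetical, and the sketch writes a tiny stand-in file so that it runs on its own.

```python
from pathlib import Path

import pandas as pd

# Hypothetical file name; it stands in for whatever you uploaded.
DATA_FILE = "example_upload.csv"

# Write a tiny stand-in file so the sketch is self-contained.
Path(DATA_FILE).write_text("name,value\na,1\nb,2\n")

# The kernel's working directory is the notebook's folder,
# so no long path is needed.
print(Path.cwd())
df = pd.read_csv(DATA_FILE)
print(df)
```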


Matplotlib figure display options.

If you don’t specify how to display your figures in the Jupyter notebook, a separate window will open and display the graph when you create a figure using matplotlib. This window is nice because it is interactive: you can zoom in on the graph, save it, put labels in, etc. There is a way to get the same interactivity inside the Jupyter notebook.

The first option I learned about was:

%matplotlib inline

This displays the graph in the notebook, but it is no longer interactive.

However, if you use:

%matplotlib notebook

the figures will show up in the notebook and still be interactive. I learned this during the pandas tutorial at SciPy 2016.

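You can also set your figure size by defining a tuple and passing it as the figsize argument when you create a figure:

LARGE_FIGSIZE = (12, 8)  # width and height in inches, for example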
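A constant like this only takes effect once it is handed to matplotlib. Here is a minimal sketch of one way to use it, assuming you create figures with plt.subplots(); the plotted data is made up for illustration.

```python
import matplotlib.pyplot as plt

LARGE_FIGSIZE = (12, 8)  # (width, height) in inches

# In a notebook, run one of the %matplotlib magics in an earlier cell
# so the figure is shown inside the notebook.
fig, ax = plt.subplots(figsize=LARGE_FIGSIZE)

# Made-up data, just to have something to draw.
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])
ax.set_xlabel("x")
ax.set_ylabel("x squared")
```

Keeping the size in one named constant makes it easy to reuse across every figure in the notebook.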


Some pandas optimization hints



You can use pandas.set_option() to set a large number of options. For example:

pandas.set_option("display.max_rows", 16)

With this setting, at most 16 rows of data will be displayed. There are many options, so just run “pandas.set_option?” in a notebook cell to see what is available.
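Here is a short sketch of how the display options fit together, using a made-up 100-row DataFrame. It uses pandas.get_option() to read a setting back, pandas.option_context() to change one temporarily, and pandas.reset_option() to restore the default.

```python
import pandas as pd

# Made-up data: 100 rows, far more than we want printed.
df = pd.DataFrame({"n": range(100)})

# Limit how many rows pandas prints, then read the setting back.
pd.set_option("display.max_rows", 16)
current = pd.get_option("display.max_rows")
print(current)

# Long frames are now shown in truncated form.
truncated = repr(df)
print(truncated)

# Change an option only inside a block; it reverts afterwards.
with pd.option_context("display.max_rows", 6):
    print(df)

# Restore the default when you are done.
pd.reset_option("display.max_rows")
```

pandas.describe_option() prints the documentation for the available options if you prefer that to the “?” help.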

If you have other useful Jupyter notebook tips, I would love to hear about them.