Quick Tutorial for Jupyter
Jupyter is a powerful data analysis notebook that runs in a browser. Previously, I went over how to set one up as part of the Anaconda distribution.
In this post we will explore the main buttons and options available in Jupyter.
Creating a new notebook
Once launched, you may see some sample notebooks (depending on your installation), as well as a set of buttons.
Let’s create a new folder in the current directory to house our notebooks. Click on New and select Folder.
The new folder will show up as “Untitled Folder,” so check the box next to its name, click Rename, and call it My Projects.
Click on the folder’s title to enter it. Here we can create a new notebook in any language of our choosing. We’ll be creating one using Python 3.
Aside: Jupyter is not limited to Python. It uses “kernels,” and installing an additional kernel adds support for another programming language. For instance, the IRKernel adds support for R. Once you are more comfortable, you can create notebooks in other languages the same way.
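If you are curious which kernels your installation already has, here is a minimal sketch. It assumes the jupyter_client package (which normally comes with Jupyter) is importable, and it simply returns an empty result if it is not. The helper name installed_kernels is made up for this example.

```python
# Assumption: jupyter_client is installed alongside Jupyter.
# If it is missing, the helper below just reports no kernels.
try:
    from jupyter_client.kernelspec import KernelSpecManager
except ImportError:
    KernelSpecManager = None


def installed_kernels():
    """Return a dict mapping each installed kernel name to its resource directory."""
    if KernelSpecManager is None:
        return {}
    return KernelSpecManager().find_kernel_specs()


kernels = installed_kernels()
for name, path in sorted(kernels.items()):
    print(name, "->", path)
```

On a typical Anaconda installation you would expect to see at least a Python 3 kernel listed, but the exact names and paths depend on your setup.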
Your First Python Program
The Jupyter notebook is useful in that you can start writing code immediately, without first setting up project files or running anything from the command line.
Let’s write a simple program that prints out your name.
Type the following code into the box:
my_name = "Howard"
print("Hello", my_name)
Then press Ctrl-Enter to run it.
Making Edits
So my name is Howard, but yours is probably not. To quickly modify the program and see the new output, change Howard to your name and then press Ctrl-Enter again.
You should see an immediate change in the output. Jupyter allows you to rapidly change code and see its effects.
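As a small variation to experiment with, the greeting can be wrapped in a function so that the name is the only thing you edit between runs. This is just a sketch; the function name greet is made up for this example.

```python
def greet(name):
    """Build the greeting text for the given name."""
    return f"Hello {name}"


my_name = "Howard"  # change this to your own name, then press Ctrl-Enter
print(greet(my_name))
```

Each time you edit my_name and rerun the box, the output underneath is replaced with the new greeting.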
You can also click on the “Untitled” title of the notebook to change it to something else.
Your First Data Plot
Jupyter also allows you to integrate visualizations directly into the notebook. Let’s make one right now using a package called matplotlib.
Create a new “paragraph” (Jupyter calls these cells), either by clicking the + button or by typing straight into the empty box below your first paragraph.
Then, type in this code:
%matplotlib inline
from matplotlib import pyplot as plt

# These numbers I just made up for this exercise.
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
cxr_vol_in_thousands = [10.2, 18.3, 31.9, 92.5, 180.6, 343.7, 485.3]

# create a line chart, years on x-axis, volume on y-axis
plt.plot(years, cxr_vol_in_thousands, color='blue', marker='o', linestyle='solid')
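The chart above has no title or axis labels. Here is a sketch of the same plot with labels added, written as plain Python so it also runs outside a notebook (the %matplotlib inline line only works inside Jupyter). The label text is my own wording, and writing the image to an in-memory buffer is just one way to confirm that the figure renders.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window or notebook is needed
from matplotlib import pyplot as plt

# Same made-up numbers as in the notebook example.
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
cxr_vol_in_thousands = [10.2, 18.3, 31.9, 92.5, 180.6, 343.7, 485.3]

fig, ax = plt.subplots()
ax.plot(years, cxr_vol_in_thousands, color="blue", marker="o", linestyle="solid")

# Labels are illustrative wording, not taken from any real dataset.
ax.set_title("CXR volume by decade (made-up data)")
ax.set_xlabel("Year")
ax.set_ylabel("Volume (thousands)")

# Render the figure to PNG bytes in memory instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
print("Rendered", len(png_bytes), "bytes")
```

Inside the notebook itself you can skip the backend and buffer lines: keep %matplotlib inline and add plt.title, plt.xlabel and plt.ylabel calls after plt.plot.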
As you type longer pieces of code like this one, you will notice some of Jupyter’s subtler editing features, such as syntax coloring and automatic bracket closing.
Later posts will go over how to use the specific commands. This post focuses on getting you familiar with the basic functions in Jupyter.
Notice that the notebook keeps all the old code you entered and incorporates both the new code and its graphical output all in one place.
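The notebook keeps more than the text of your old code: variables defined in an earlier box stay available in later ones. As a sketch, a new box could reuse the two lists from the plot to compute the change from one decade to the next. The lists are repeated here only so the snippet runs on its own, and growth_pct is a name made up for this example.

```python
# In the notebook these two lists already exist from the plotting box;
# they are repeated here so the snippet is self-contained.
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
cxr_vol_in_thousands = [10.2, 18.3, 31.9, 92.5, 180.6, 343.7, 485.3]

# Percentage change from each decade to the next, rounded to one decimal place.
growth_pct = [
    round((later - earlier) / earlier * 100, 1)
    for earlier, later in zip(cxr_vol_in_thousands, cxr_vol_in_thousands[1:])
]

# Pair each percentage with the year at the end of its decade.
for year, pct in zip(years[1:], growth_pct):
    print(year, f"{pct}%")
```

If you rerun the plotting box with different numbers, rerun this one too, since each box only sees the values from the last time the earlier box was run.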
Conclusion
Jupyter is a powerful web-based notebook that is indispensable for the data explorer and data scientist. I highly recommend using it for rapid data exploration. This blog focuses on radiology informatics and data science, so it will touch on these tools and the programming language only as needed, in favor of digging into the data (the fun part). If you want to dig deeper into Jupyter, consider the full documentation.