Creating Dynamic Visualizations in Python Using Bokeh: A Step-by-Step Tutorial

Bokeh is a robust Python library for creating interactive, visually appealing web visualizations without requiring JavaScript. It excels in interactivity, integration with web frameworks, customization, and performance. This tutorial covers installation, basic and advanced plotting, styling, data management, and layout configurations, providing a comprehensive guide for data scientists and developers.

Bokeh renders its output in modern web browsers, so a plot built in Python can be shared as an ordinary web page. It covers a wide range of graphics, from simple plots to complex dashboards, and its flexibility and ease of use make it a good fit for data scientists and developers who want to publish dynamic visualizations online.

Why Use Bokeh?

Bokeh stands out among other Python visualization libraries for several reasons:

  1. Interactivity: Bokeh’s plots are interactive by default. You can easily zoom, pan, and hover over plots to gain more insights from your data.
  2. Integration: Bokeh integrates seamlessly with popular web frameworks like Django and Flask, allowing you to embed visualizations into web applications.
  3. Customization: Bokeh provides extensive customization options for styling your plots, including changing colors, adding legends, and modifying axis properties.
  4. Performance: Bokeh is optimized for handling large datasets efficiently, making it suitable for real-time data streaming and dashboard creation.

Getting Started with Bokeh

To get started with Bokeh, you need to install the library. You can do this using pip or conda:

pip install bokeh

Or, if you prefer using Conda:

conda install bokeh

Once installed, Bokeh can be used in standalone Python scripts, Jupyter Notebooks, or even in web applications.
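As a quick check that the installation works, the minimal sketch below builds a small figure, writes it to a standalone HTML page, and also produces the script and div fragments that a web template (for example in Flask or Django) could include. The names check_plot and html_path and the output file name are illustrative assumptions, not part of the original example code.

```python
import os
import tempfile

import bokeh
from bokeh.embed import components
from bokeh.io import save
from bokeh.plotting import figure
from bokeh.resources import CDN

print("Bokeh version:", bokeh.__version__)

# A tiny figure, just to confirm that plotting works
check_plot = figure(width=300, height=300, title="Installation check")
check_plot.line([1, 2, 3], [4, 6, 5])

# Standalone use: write a self-contained HTML page (hypothetical file name)
html_path = os.path.join(tempfile.gettempdir(), "bokeh_install_check.html")
saved_path = save(check_plot, filename=html_path, resources=CDN,
                  title="Installation check")

# Web-application use: HTML fragments to place inside a page template
script, div = components(check_plot)
```

Opening the saved file in a browser should show the line plot with Bokeh's default toolbar. In a Jupyter Notebook, calling output_notebook() from bokeh.io before show() displays plots inline instead.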

Downloading the Example Code

For those who want to follow along with the examples in this tutorial, the complete code is available in my GitHub repository. The repository includes an environment.yml file that lists all the Python packages used. You can create a separate Conda environment and install all the dependencies with the following command:

conda env create -f environment.yml

Topics Covered

This tutorial covers several important topics related to creating and customizing plots in Bokeh. Below is a detailed walkthrough of each topic:

1. Setting the Dimension of the Figure and Plotting Markers

Let’s start by creating a simple scatter plot. The following code demonstrates how to set the dimensions of a Bokeh figure and plot basic markers:

import numpy as np
from bokeh.plotting import figure
from bokeh.io import output_file, save, show

# Data
x = [1, 2, 3, 4, 5]
y = [6, 7, 8, 9, 10]

# Creating a figure (width/height replace plot_width/plot_height, which Bokeh 3.x removed)
p = figure(width=300, height=300)

# Plotting markers
circ = p.circle(x, y)

# Display the plot
show(p)

This basic plot demonstrates the core functionality of Bokeh. However, Bokeh offers extensive customization options to make your plots more informative and visually appealing.

2. Styling the Markers and Figure Background

Bokeh allows you to style markers and figure backgrounds with various properties, including colors, line styles, and legends. Here’s an example:

# Generate random data
x = np.random.rand(10) * 10
y = np.random.rand(10) * 10

# Create a figure (width/height replace plot_width/plot_height, which Bokeh 3.x removed)
p = figure(width=300, height=300)

# Add different markers with custom styles
p.circle(x, y, radius=0.1, fill_color='red', fill_alpha=0.7, line_dash="dashed", legend_label='circle')
p.triangle(np.random.rand(10) * 10, np.random.rand(10) * 10, legend_label='triangle')
p.line(np.random.rand(10) * 10, np.random.rand(10) * 10, line_color='black', legend_label='line')

# Style the background
p.background_fill_color = 'lightgray'

# Display the plot
show(p)
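Recent Bokeh releases (3.4 onward, as far as I know) emit deprecation warnings for marker-specific methods such as triangle() and square(), and recommend scatter() with a marker name instead. The sketch below is one way to draw the same kind of styled markers through scatter(). The variable names and the seeded random data are assumptions made for illustration.

```python
import numpy as np
from bokeh.plotting import figure, show

rng = np.random.default_rng(0)  # seeded so the plot is reproducible

marker_plot = figure(width=300, height=300)

# One scatter() call per marker shape; the shape is selected by name
shapes = ["circle", "triangle", "square"]
fills = ["red", "green", "blue"]
for shape, fill in zip(shapes, fills):
    xs = rng.random(10) * 10
    ys = rng.random(10) * 10
    marker_plot.scatter(xs, ys, marker=shape, size=10, fill_color=fill,
                        fill_alpha=0.7, legend_label=shape)

marker_plot.background_fill_color = "lightgray"

# show(marker_plot)  # uncomment to open the plot in a browser
```

A separate figure variable is used here so that the plot p from the example above stays available for the sections that follow.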

3. Setting and Styling the Title of the Figure

Titles can be crucial for understanding the context of a plot. Bokeh allows for detailed customization of the title text:

# Set title properties
p.title.text = "Random Scatter"
p.title.text_color = "black"
p.title.text_font = "times"
p.title.text_font_size = "25px"
p.title.align = "center"

# Display the plot
show(p)

4. Styling the Axes of the Figure

Axes are another critical part of any plot, and Bokeh provides a wide range of options to style them:

from bokeh.models import Range1d

# Style the axes
p.axis.minor_tick_line_color = "blue"
p.xaxis.minor_tick_line_color = "red"
p.yaxis.major_label_orientation = "vertical"
p.xaxis.visible = True
p.xaxis.axis_label = "X-axis"
p.yaxis.axis_label = "Y-axis"

p.axis.axis_label_text_color = "blue"
p.axis.major_label_text_color = "orange"

# Set axes ranges and tickers
p.x_range = Range1d(start=0, end=10, bounds=(-10, 20))
p.y_range = Range1d(start=0, end=10)
p.xaxis[0].ticker.desired_num_ticks = 2
p.yaxis[0].ticker.desired_num_ticks = 10

# Display the plot
show(p)

5. Styling the Grid and Legends

Grids and legends help in interpreting the data more easily. Bokeh allows you to customize these elements extensively:

# Style the grid
p.xgrid.grid_line_color = "gray"
p.xgrid.grid_line_alpha = 0.3
p.ygrid.grid_line_color = "gray"
p.ygrid.grid_line_alpha = 0.3
p.grid.grid_line_dash = [5, 3]

# Style the legend
p.legend.location = 'top_left'
p.legend.background_fill_alpha = 0.6
p.legend.border_line_color = None
p.legend.margin = 10
p.legend.padding = 18
p.legend.label_text_color = 'olive'
p.legend.label_text_font = 'times'

# Display the plot
show(p)

6. Introduction to ColumnDataSource

ColumnDataSource is a central part of Bokeh that allows you to manage and manipulate data in a more structured way. Here’s an example using a well-known dataset:

# Note: recent Bokeh releases may need the separate bokeh_sampledata package for this import
from bokeh.sampledata.iris import flowers
from bokeh.models import ColumnDataSource

# Mapping species to colors
colormap = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}
flowers['color'] = [colormap[x] for x in flowers['species']]
flowers['size'] = flowers['sepal_width'] * 5

# Creating ColumnDataSources
setosa = ColumnDataSource(flowers[flowers["species"] == "setosa"])
versicolor = ColumnDataSource(flowers[flowers["species"] == "versicolor"])
virginica = ColumnDataSource(flowers[flowers["species"] == "virginica"])

# Create a figure
p = figure(height=500, width=500)

# Add scatter plots for each species
p.circle(x="petal_length", y="petal_width", size='size', fill_alpha=0.2,
         color="color", legend_label='Setosa', source=setosa)

p.circle(x="petal_length", y="petal_width", size='size', fill_alpha=0.2,
         color="color", legend_label='Versicolor', source=versicolor)

p.circle(x="petal_length", y="petal_width", size='size', fill_alpha=0.2,
         color="color", legend_label='Virginica', source=virginica)

# Customize the legend and axes
p.legend.location = 'top_left'
p.xaxis.axis_label = "Petal Length"
p.yaxis.axis_label = "Petal Width"
p.title.text = "Petal Plot"
p.legend.click_policy = 'hide'

# Display the plot
show(p)
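If the iris sample data is not available, the same idea can be tried with a plain dictionary. This is a minimal sketch with made-up measurements: the column names length, width, and color are assumptions, not columns of the iris dataset.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Hypothetical measurements standing in for the iris columns
data = {
    "length": [1.4, 4.7, 6.0, 1.3, 5.1],
    "width": [0.2, 1.4, 2.5, 0.2, 1.9],
    "color": ["red", "green", "blue", "red", "blue"],
}
source = ColumnDataSource(data=data)

cds_plot = figure(width=300, height=300)

# With source= given, these strings name columns rather than literal values
cds_plot.scatter(x="length", y="width", color="color", size=10, source=source)

print(sorted(source.column_names))
```

Each key of the dictionary becomes a column, and all columns need the same length. Glyph methods then refer to columns by name, which is what the iris example does with "petal_length", "size", and "color".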

7. Configuring Toolbars for the Plot

Toolbars in Bokeh can be customized with a variety of tools, such as pan, zoom, and hover:

from bokeh.models import PanTool, ResetTool, HoverTool, WheelZoomTool, BoxZoomTool, SaveTool

# Add tools to the plot
p.tools = [PanTool(), ResetTool(), WheelZoomTool(), BoxZoomTool(), SaveTool()]
hover = HoverTool(tooltips=[("Species", "@species"), ("Sepal Width", "@sepal_width")])
p.add_tools(hover)

# Customize the toolbar location and logo
p.toolbar_location = 'above'
p.toolbar.logo = None

# Display the plot
show(p)
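Tools can also be requested by name when the figure is created, and passing tooltips to figure() adds a hover tool without constructing HoverTool by hand. The following is a small sketch of that shorter route. The data and the variable names are illustrative assumptions.

```python
from bokeh.models import HoverTool
from bokeh.plotting import figure

tool_plot = figure(
    width=300,
    height=300,
    tools="pan,wheel_zoom,box_zoom,reset,save",  # tools selected by name
    tooltips=[("x", "@x"), ("y", "@y")],         # a hover tool is added for these
    toolbar_location="above",
)
tool_plot.scatter([1, 2, 3], [4, 6, 5], size=10)
tool_plot.toolbar.logo = None

# Collect the hover tools that ended up on the toolbar
hover_tools = [t for t in tool_plot.toolbar.tools if isinstance(t, HoverTool)]
```

The explicit tool objects used above remain the better choice when a tool needs its own configuration, such as restricting a zoom tool to one dimension.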

8. Bokeh Layouts – Column, Row, Gridplot

Bokeh provides layout options to arrange multiple plots in columns, rows, or grids:

from bokeh.layouts import column, row, gridplot

# Create additional plots
p2 = figure(height=500, width=500)
p2.circle(x="sepal_length", y="sepal_width", size='size', fill_alpha=0.2,
          color="color", legend_label='Setosa', source=setosa)

p2.circle(x="sepal_length", y="sepal_width", size='size', fill_alpha=0.2,
          color="color", legend_label='Versicolor', source=versicolor)

p2.circle(x="sepal_length", y="sepal_width", size='size', fill_alpha=0.2,
          color="color", legend_label='Virginica', source=virginica)

# Column layout
show(column(p, p2))

# Row layout
show(row(p, p2))

# Grid layout
layout1 = gridplot([[p, p2]], toolbar_location='right')
show(layout1)

9. Bokeh Widgets – Tabs, Panel

Bokeh allows you to add interactive widgets like tabs and panels to your plots:

# Bokeh 3.x renamed Panel to TabPanel (on Bokeh 2.x, import Panel instead)
from bokeh.models import Tabs, TabPanel

# Create two panels
panel1 = TabPanel(child=p, title='Petal')
panel2 = TabPanel(child=p2, title='Sepal')

# Assign the panels to Tabs
tabs = Tabs(tabs=[panel1, panel2])

# Show the tabbed layout
show(tabs)

10. Selecting Data Points in the Bokeh Figure

Bokeh supports interactive selection tools that allow users to select and highlight specific data points:

select_tools = ['box_select', 'lasso_select', 'poly_select', 'tap', 'reset']

# Create a new plot with selection tools
p = figure(height=500, width=500, x_axis_label='Petal Length', y_axis_label="Petal Width",
           title="Petal Plot", toolbar_location='above', tools=select_tools)

# Add scatter plots with selection color options
p.circle(x="petal_length", y="petal_width", size='size', fill_alpha=0.2,
         color="color", legend_label='Setosa', source=setosa, selection_color='deepskyblue',
         nonselection_color='lightgray', nonselection_alpha=0.3)

# Repeat for other species
p.circle(x="petal_length", y="petal_width", size='size', fill_alpha=0.2,
         color="color", legend_label='Versicolor', source=versicolor, selection_color='deepskyblue',
         nonselection_color='lightgray', nonselection_alpha=0.3)

p.circle(x="petal_length", y="petal_width", size='size', fill_alpha=0.2,
         color="color", legend_label='Virginica', source=virginica, selection_color='deepskyblue',
         nonselection_color='lightgray', nonselection_alpha=0.3)

# Customize legend and toolbar
p.legend.location = 'top_left'
p.legend.click_policy = 'hide'
p.toolbar.logo = None

# Display the plot
show(p)

11. Linking Plots

Bokeh allows you to link multiple plots together so that they share axes and respond to each other’s interactions:

plot_options = dict(width=250, height=250, tools='pan,wheel_zoom,reset')

# Create data
x = list(range(11))
y0, y1, y2 = x, [10-i for i in x], [abs(i-5) for i in x]

# Create linked plots
s1 = figure(**plot_options, title='s1: linked x and y with s2')
s1.circle(x, y0, size=10, color="navy")

s2 = figure(x_range=s1.x_range, y_range=s1.y_range, **plot_options, title='s2: linked x and y with s1')
s2.triangle(x, y1, size=10, color="firebrick")

s3 = figure(x_range=s1.x_range, **plot_options, title='s3: linked x with s1 and s2')
s3.square(x, y2, size=10, color="olive")

# Arrange in a grid and display
p = gridplot([[s1, s2, s3]])
show(p)
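Sharing ranges links panning and zooming. Selections can be linked in a similar way by giving several plots the same ColumnDataSource, so that points selected in one plot are highlighted in the others. The sketch below assumes two plots and illustrative names (shared, left, right); it is one possible setup rather than the only one.

```python
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

x_vals = list(range(11))
shared = ColumnDataSource(data=dict(
    x=x_vals,
    y0=x_vals,
    y1=[10 - i for i in x_vals],
))

opts = dict(width=250, height=250, tools="box_select,lasso_select,reset")

left = figure(title="left", **opts)
left_r = left.scatter("x", "y0", size=10, color="navy", source=shared)

# Sharing x_range links the x-axis; sharing the source links selections
right = figure(title="right", x_range=left.x_range, **opts)
right_r = right.scatter("x", "y1", size=10, color="firebrick", source=shared)

linked_grid = gridplot([[left, right]])

# from bokeh.plotting import show; show(linked_grid)  # uncomment to display
```

Because both renderers read from one data source, a box or lasso selection made in either plot applies to the same rows in both.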

Conclusion

Bokeh is an incredibly versatile tool for creating interactive and dynamic visualizations in Python. Whether you’re building simple plots or complex dashboards, Bokeh provides the flexibility and power you need to bring your data to life. With the extensive customization options, you can tailor your visualizations to meet the specific needs of your projects, all without writing a single line of JavaScript.

By following the examples and topics covered in this tutorial, you’ll gain a strong foundation in using Bokeh to create compelling visualizations that can be easily integrated into web applications or shared online.

References

For more in-depth information and examples, Bokeh’s official documentation is an excellent resource, complete with executable notebooks that provide a hands-on learning experience.

Utpal Kumar

Geophysicist | Geodesist | Seismologist | Open-source Developer
I am a geophysicist with a background in computational geophysics, currently working as a postdoctoral researcher at UC Berkeley. My research focuses on seismic data analysis, structural health monitoring, and understanding deep Earth structures. I have had the opportunity to work on diverse projects, from investigating building characteristics using smartphone data to developing 3D models of the Earth's mantle beneath the Yellowstone hotspot.

In addition to my research, I have experience in cloud computing, high-performance computing, and single-board computers, which I have applied in various projects. This includes working with platforms like AWS, Docker, and Kubernetes, as well as supercomputing environments such as STAMPEDE2, ANVIL, Savio and PERLMUTTER (and CORI). My work involves developing innovative solutions for structural health monitoring and advancing real-time seismic response analysis. I am committed to applying these skills to further research in computational seismology and structural health monitoring.
