Interactive visualizations with Plotly

Plotly is a popular library for creating interactive visualizations in Python. Here’s an example of how to create an interactive scatter plot using Plotly:

import plotly.graph_objects as go
import pandas as pd

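# Load the iris dataset
iris_df = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')

# The CSV has no numeric species column, so derive an integer code per species for coloring
iris_df['species_id'] = iris_df['species'].astype('category').cat.codes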

# Create a scatter plot of sepal length vs. sepal width
fig = go.Figure(
    data=go.Scatter(
        x=iris_df['sepal_length'],
        y=iris_df['sepal_width'],
        mode='markers',
        marker=dict(
            size=10,
            color=iris_df['species_id'],
            colorscale='Viridis',
            opacity=0.8
        ),
        text=iris_df['species']
    )
)

# Customize the layout
fig.update_layout(
    title='Iris Dataset',
    xaxis_title='Sepal Length',
    yaxis_title='Sepal Width'
)

# Display the plot
fig.show()

This code loads the iris dataset with Pandas’ read_csv() function, then creates a scatter plot of sepal length vs. sepal width using Plotly’s go.Scatter trace. Setting the mode parameter to 'markers' draws individual points rather than connected lines, and the marker parameter customizes their appearance: the size is set to 10, the color of each point is taken from the numeric species_id column and mapped onto the 'Viridis' color scale, and the opacity is set to 0.8. The text parameter attaches the species name to each point, and that name appears when the user hovers over the point.

The code then customizes the layout of the plot with the figure’s update_layout() method, which here sets the plot title and the axis labels.

Finally, the plot is displayed using the show() function.

The resulting plot is an interactive scatter plot that allows the user to hover over each point to see the species label and to zoom and pan the plot as needed. This kind of interactive visualization can be useful for exploring data and identifying patterns and relationships between variables.

Here’s another example of how to create an interactive visualization using Plotly. This time we’ll create a bar chart that shows the number of movies released each year.

import plotly.express as px
import pandas as pd

# Load the movie dataset
movies_df = pd.read_csv('https://raw.githubusercontent.com/sundeepblue/movie_rating_prediction/master/movie_metadata.csv')

# Group the data by release year and count the number of movies in each year
movies_per_year = movies_df.groupby('title_year')['movie_title'].count()

# Create a bar chart of the data using Plotly
fig = px.bar(
    x=movies_per_year.index,
    y=movies_per_year.values,
    labels={
        'x': 'Year',
        'y': 'Number of Movies'
    }
)

# Customize the layout
fig.update_layout(
    title='Number of Movies Released Each Year',
    xaxis_tickangle=-45
)

# Display the plot
fig.show()

This code loads the movie dataset using the read_csv() function from Pandas. It then groups the data by release year using Pandas’ groupby() function, and counts the number of movies in each year using the count() function. The resulting data is a Pandas Series.
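The grouping step is easier to follow on a tiny table. The rows below are invented for illustration (the real dataset has many more rows and columns), and the toy_ names are hypothetical.

```python
import pandas as pd

# Hypothetical rows with the same two columns the example relies on
toy_movies_df = pd.DataFrame({
    "movie_title": ["A", "B", "C", "D", "E"],
    "title_year": [2009, 2009, 2010, 2012, 2012],
})

# Count the non-null titles within each release year
toy_movies_per_year = toy_movies_df.groupby("title_year")["movie_title"].count()

print(toy_movies_per_year)
```

The result is a Series whose index holds the distinct years and whose values hold the counts. By default, groupby() leaves out rows where the grouping key is missing, so movies without a title_year do not appear in the result.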

The code then creates a bar chart of the data using Plotly Express’s bar() function. The x parameter is set to the index of the Pandas Series (i.e., the release years), and the y parameter is set to the values of the Pandas Series (i.e., the number of movies in each year). Because the data is passed as plain arrays rather than as DataFrame columns, Plotly Express refers to them as 'x' and 'y', and the labels parameter maps those names to readable axis labels.

The code then customizes the layout of the plot with the update_layout() method. Here it sets the plot title and, through xaxis_tickangle=-45, tilts the x-axis tick labels by 45 degrees to improve readability.

Finally, the plot is displayed using the show() function.

The resulting plot is an interactive bar chart that shows the number of movies released each year and allows the user to hover over each bar to see the exact number of movies for each year. This kind of interactive visualization can be useful for exploring trends and patterns in large datasets.
