
Segmented Funnel charts in Python using Plotly

Funnel Charts are often used to represent data in different stages of a business process. You can learn more about them in our previous post, Funnel charts in Python using Plotly.

In this post, we will learn about creating Segmented Funnel Charts.

Unlike a basic funnel chart, which plots a single data series, a segmented funnel chart splits each stage into several series (segments).

We will use a sample dataset of quarterly product sales from a fictional e-commerce firm. The funnel chart shows the number of users at each stage of the process (visit, sign-up, selection, purchase), and each stage is broken down by the segment (channel) that contributed those users.

Here is the dataset for this post, segment-funnel-dataset.csv.

The IPython Notebook with the source code is available here.

              Ad  Media  Affiliates  Referrals  Direct
Visit       9806  13105        6505       2517   24321
Sign-up     3065   6096        3011       1710   11453
Selection   1765   3592        2234       1555    8603
Purchase    1507   2403        1610       1005    5798
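If the CSV file is not at hand, the table above can be rebuilt directly in pandas. This is a minimal sketch that assumes the phase names serve as the row index and the five channels as columns, which is what the rest of the code relies on (it reads phase names from df.index and segment names from df.columns).

```python
import pandas as pd

# values copied from the table above; phases are rows, channels are columns
df = pd.DataFrame(
    {
        'Ad': [9806, 3065, 1765, 1507],
        'Media': [13105, 6096, 3592, 2403],
        'Affiliates': [6505, 3011, 2234, 1610],
        'Referrals': [2517, 1710, 1555, 1005],
        'Direct': [24321, 11453, 8603, 5798],
    },
    index=['Visit', 'Sign-up', 'Selection', 'Purchase'],
)

print(df.shape)  # (4, 5)
```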

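# must come before any other statement when everything runs as one script (Python 2)
from __future__ import division

# plotly.plotly is the online (Chart Studio) module; in plotly 4 and later
# it is provided by the separate chart_studio package as chart_studio.plotly
import plotly.plotly as py
import plotly.graph_objs as go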

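# campaign data (download the file mentioned above)
# the code below expects the phase names (Visit, Sign-up, ...) as the index of df
# and one column per segment
import pandas as pd
df = pd.read_csv('segment-funnel-dataset.csv')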

# color for each segment
colors = ['rgb(63,92,128)', 'rgb(90,131,182)', 'rgb(255,255,255)', 'rgb(127,127,127)', 'rgb(84,73,75)']

We can calculate the total number of users in each phase with the DataFrame.iterrows() method, which yields (index, row) pairs; summing each row gives the total for that phase.

total = [sum(row[1]) for row in df.iterrows()]

The number of phases and segments comes from the shape attribute of the DataFrame, which is a (rows, columns) tuple.

n_phase, n_seg = df.shape
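As a sketch of an equivalent approach, the same totals can be obtained with DataFrame.sum(axis=1), which adds up each row without an explicit loop. The small frame below is an illustrative subset (the first two phases and two channels of the dataset); the variable names mirror the ones used in this post.

```python
import pandas as pd

# illustrative subset of the dataset: two phases, two channels
df = pd.DataFrame(
    {'Ad': [9806, 3065], 'Media': [13105, 6096]},
    index=['Visit', 'Sign-up'],
)

# row-wise sum: one total per phase
total = df.sum(axis=1).tolist()

# shape is (number of rows, number of columns)
n_phase, n_seg = df.shape

print(total)            # [22911, 9161]
print(n_phase, n_seg)   # 2 2
```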

The plot has a fixed width. The first phase spans that full width, and the width of every other phase is scaled by its total number of users relative to the first phase.

plot_width = 600
unit_width = plot_width / total[0]

phase_w = [int(value * unit_width) for value in total]
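To make the scaling concrete, here is a small worked sketch using the row totals of the dataset in this post (each total is the sum of one row of the table). It assumes Python 3 division; the share list is an extra added only for illustration.

```python
# phase totals for the dataset in this post (sum of each row)
total = [56254, 25335, 17749, 12323]

plot_width = 600
unit_width = plot_width / total[0]   # plot units per user

# int() truncates, so the later phases come out 270, 189 and 131 units wide;
# the first phase is the full plot width, up to floating-point rounding
phase_w = [int(value * unit_width) for value in total]

# illustration only: fraction of the initial visitors present in each phase
share = [round(value / total[0], 3) for value in total]

print(phase_w[1:])  # [270, 189, 131]
print(share)        # [1.0, 0.45, 0.316, 0.219]
```

Because each width is truncated to an integer, a phase can end up a unit or so narrower than its exact proportional width.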

# height of a section and vertical gap between sections
section_h = 100
section_d = 10

# shapes of the plot
shapes = []

# plot traces data
data = []

# y-coordinates of the phase labels
label_y = []

A phase in the chart will be a rectangle made of smaller rectangles representing different segments.

height = section_h * n_phase + section_d * (n_phase-1)

# rows of the DataFrame
df_rows = list(df.iterrows())

# iteration over all the phases
for i in range(n_phase):
    # phase name
    row_name = df.index[i]
    
    # width of each segment (smaller rectangles) will be calculated
    # according to their contribution in the total users of phase
    seg_unit_width = phase_w[i] / total[i]
    seg_w = [int(df_rows[i][1][j] * seg_unit_width) for j in range(n_seg)]
    
    # starting point of segment (the rectangle shape) on the X-axis
    xl = -1 * (phase_w[i] / 2)
    
    # iteration over all the segments
    for j in range(n_seg):
        # name of the segment
        seg_name = df.columns[j]
        
        # corner points of a segment used in the SVG path
        points = [xl, height, xl + seg_w[j], height, xl + seg_w[j], height - section_h, xl, height - section_h]
        path = 'M {0} {1} L {2} {3} L {4} {5} L {6} {7} Z'.format(*points)
        
        shape = {
                'type': 'path',
                'path': path,
                'fillcolor': colors[j],
                'line': {
                    'width': 1,
                    'color': colors[j]
                }
        }
        shapes.append(shape)
        
        # to support hover on shapes
        hover_trace = go.Scatter(
            x=[xl + (seg_w[j] / 2)],
            y=[height - (section_h / 2)],
            mode='markers',
            marker=dict(
                size=min(seg_w[j]/2, (section_h / 2)),
                color='rgba(255,255,255,1)'
            ),
            text="Segment : %s" % (seg_name),
            name="Value : %d" % (df[seg_name][row_name])
        )
        data.append(hover_trace)
        
        xl = xl + seg_w[j]

    label_y.append(height - (section_h / 2))

    height = height - (section_h + section_d)
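The path construction in the loop above can be isolated into a small helper, which makes the geometry easier to check on its own. This is only a sketch: rect_path and phase_paths are hypothetical names introduced here, not part of Plotly or of the original code, and the widths passed in are made-up numbers.

```python
def rect_path(x_left, y_top, width, height):
    """SVG path of a rectangle, visiting the corners clockwise from the top left."""
    x_right = x_left + width
    y_bottom = y_top - height
    points = [x_left, y_top, x_right, y_top, x_right, y_bottom, x_left, y_bottom]
    return 'M {0} {1} L {2} {3} L {4} {5} L {6} {7} Z'.format(*points)


def phase_paths(seg_widths, x_start, y_top, section_h):
    """Lay the segments of one phase side by side, left to right."""
    paths = []
    x_left = x_start
    for width in seg_widths:
        paths.append(rect_path(x_left, y_top, width, section_h))
        x_left += width   # the next segment starts where this one ends
    return paths


# two segments, 100 and 200 units wide, centered on x = 0
paths = phase_paths([100, 200], -150, 430, 100)
print(paths[0])  # M -150 430 L -50 430 L -50 330 L -150 330 Z
print(paths[1])  # M -50 430 L 150 430 L 150 330 L -50 330 Z
```

Each returned string can be used as the 'path' value of a shape dictionary like the one built in the loop.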

We will use text mode to draw the name of each phase and its total value.

# For phase names
label_trace = go.Scatter(
    x=[-350]*n_phase,
    y=label_y,
    mode='text',
    text=df.index.tolist(),
    textfont=dict(
        color='rgb(200,200,200)',
        size=15
    )
)

data.append(label_trace)
 
# For phase values (total)
value_trace = go.Scatter(
    x=[350]*n_phase,
    y=label_y,
    mode='text',
    text=total,
    textfont=dict(
        color='rgb(200,200,200)',
        size=15
    )
)

data.append(value_trace)
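Both label traces reuse label_y, the vertical center of each phase, while the x positions of -350 and 350 place the text just outside the widest phase, which spans roughly -300 to 300 for a plot width of 600. The sketch below recomputes label_y on its own for the four phases of this dataset, assuming the section height and gap used in this post and Python 3 division.

```python
n_phase = 4
section_h = 100   # height of one phase
section_d = 10    # gap between phases

# y-coordinate of the top edge of the first (topmost) phase
height = section_h * n_phase + section_d * (n_phase - 1)

label_y = []
for _ in range(n_phase):
    label_y.append(height - section_h / 2)   # vertical center of the phase
    height -= section_h + section_d          # move down to the next phase

print(label_y)  # [380.0, 270.0, 160.0, 50.0]
```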

We will style the plot by changing the background color of the plot and the plot paper, hiding the legend and tick labels, and removing the zeroline.

layout = go.Layout(
    title="<b>Segmented Funnel Chart</b>",
    titlefont=dict(
        size=20,
        color='rgb(230,230,230)'
    ),
    hovermode='closest',
    shapes=shapes,
    showlegend=False,
    paper_bgcolor='rgba(44,58,71,1)',
    plot_bgcolor='rgba(44,58,71,1)',
    xaxis=dict(
        showticklabels=False,
        zeroline=False,
    ),
    yaxis=dict(
        showticklabels=False,
        zeroline=False
    )
)

fig = go.Figure(data=data, layout=layout)
py.plot(fig)

You can inspect individual segments by hovering over them; the hover label shows the segment name and its value.

