7 Interactive Plots from the Pharmaceutical Industry

January 27, 2017, 7:19 am

Introduction

In a recent blog post we introduced 7 Interactive Bioinformatics Plots Made in Python and R.

Here I introduce 7 Interactive Plots from the Pharmaceutical Industry using the plotly R package. These plots are essential for any survival analysis study, where the interest is in time-to-event outcomes, as is often the case in the pharmaceutical industry (for example, in clinical trials that determine drug efficacy).

Install Packages

We first install the required packages using the pacman R package which will install (if not already present), load, and update packages:

library(pacman)
pacman::p_load(plotly)
pacman::p_load(GGally)
pacman::p_load(survival)
pacman::p_load(cowplot)
pacman::p_load(broom)
pacman::p_load_current_gh("sahirbhatnagar/casebase")
pacman::p_load(Epi)
"%ni%" <- Negate("%in%")

1. Kaplan-Meier Curve

The Kaplan-Meier estimator, also known as the product-limit estimator, is a non-parametric statistic used to estimate the survival function from lifetime data, for example the fraction of subjects still alive a certain amount of time after treatment. In clinical trials, the effect of an intervention is assessed by measuring the number of subjects who survived or were saved after that intervention over a period of time [ref].

It is standard practice to plot the Kaplan-Meier (KM) curve as a first step in your analysis. We use the lung data from the survival R package and the GGally R package to plot the KM curve with confidence bands (note that the red ticks mark censored observations; events, which in this example are deaths, appear as downward steps in the curve):

data(lung, package = "survival")
sf_lung <- survival::survfit(survival::Surv(time, status) ~ 1, data = lung)
p1 <- GGally::ggsurv(sf_lung, main = "Kaplan-Meier Curve for the NCCTG Lung Cancer Data")
plotly::ggplotly(p1)
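To make the product-limit idea concrete, here is a minimal Python sketch of what the estimator computes. It is not how survival::survfit is implemented; the function name kaplan_meier and the toy data are assumptions made for illustration only.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time for each subject
    events -- 1 if the subject had the event at that time, 0 if censored
    """
    # Distinct times at which at least one event occurred, in increasing order
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        # Subjects who had the event exactly at time t
        died = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        # Product-limit update: multiply by the conditional survival at t
        survival *= 1.0 - died / at_risk
        curve.append((t, survival))
    return curve


# Toy data (hypothetical, not the lung dataset): five subjects, two censored
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored subjects never lower the curve themselves; they only leave the risk set, which makes later drops larger.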

2. Stratified Kaplan-Meier Curve with Log-rank test

Often we want to test whether there is a difference between two or more survival curves. In this example, we want to see if there is a difference in survival time between men and women. We use the survival::survdiff function to perform a log-rank test and annotate the plot with the corresponding p-value:

lung <- transform(lung, sex = factor(sex, levels = 1:2, labels = c("Male","Female")))
sf_sex <- survival::survfit(Surv(time, status) ~ sex, data = lung)
pl_sex <- GGally::ggsurv(sf_sex, main = "Kaplan-Meier Curve for the NCCTG Lung Cancer Data Stratified by Sex")
log_rank_sex <- survival::survdiff(Surv(time, status) ~ sex, data = lung)


# annotate() draws the label once instead of once per row of the plot data
pl_sex_annotated <- pl_sex + ggplot2::annotate("text", x = 750, y = 0.9,
                                               label = sprintf("log-rank test p-value: %0.2g",
                                                               pchisq(log_rank_sex$chisq, df = 1, lower.tail = FALSE)))

plotly::ggplotly(pl_sex_annotated)
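For intuition about what the log-rank test compares, the following Python sketch computes the two-group statistic by hand: at every event time it compares the observed number of events in one group with the number expected if both groups shared the same hazard. This is a simplified illustration, not the survival::survdiff implementation, and the function name and toy data are assumptions.

```python
import math


def logrank_two_groups(times, events, groups):
    """Log-rank chi-square statistic and p-value for exactly two groups.

    times  -- follow-up time for each subject
    events -- 1 for an event, 0 for censoring
    groups -- 0 or 1 for each subject
    """
    observed = expected = variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        # Risk set sizes just before t, overall and in group 1
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        # Events at t, overall and in group 1
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e == 1 and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            # Hypergeometric variance of the group-1 event count at t
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chisq = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chisq / 2.0))
    return chisq, p_value


# Toy data (hypothetical): group 1 fails early, group 0 fails late
chisq, p_value = logrank_two_groups([1, 2, 3, 4, 5, 6],
                                    [1, 1, 1, 1, 1, 1],
                                    [1, 1, 1, 0, 0, 0])
```

The sketch assumes at least one event time contributes a positive variance; otherwise the division would fail.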

3. Lexis Plot

Lexis diagrams provide a graphical representation of the relationships between demographic events in time and persons at risk and they also assist in calculating rates. Every demographic event is characterized by two numbers: the time (e.g., year) at which it occurs and the age (or other duration measure) of the person to whom it occurs [ref]. Time is on the x-axis and age is on the y-axis. A line segment is drawn for each individual in the cohort representing how long they were under observation, while each point on the plot represents an event. No points are plotted if a subject did not experience an event. We use the Epi R package for the Epi::Lexis.diagram function to create an appropriately formatted dataset and the nickel data which is a cohort of nickel smelters in South Wales.

data(nickel, package = "Epi")
attach(nickel)
LL <- Lexis.diagram( age=c(10,100), date=c(1900,1990), 
                     entry.age=age1st, exit.age=ageout, birth.date=dob, 
                     fail=(icd %in% c(162,163)), lwd.life=1,
                     cex.fail=0.8, col.fail=c("green","red") )

LL[nickel$icd %in% c(162,163),"cause"] <- "lung"
LL[nickel$icd %in% c(160),"cause"] <- "nasal"
LL[nickel$icd %ni% c(160,162,163), "cause"] <- "other"

lex_plot <- ggplot(LL, aes(x=entry.date, xend=exit.date, y=entry.age, yend=exit.age)) + 
  xlab("Calendar time") +
  ylab("Age") + 
  labs(title = "Lexis Diagram of Nickel Smelting Workers in South Wales")+
  scale_y_continuous(breaks = seq(10,100,10)) +
  scale_x_continuous(breaks = seq(1900,1990,10)) +
  geom_segment(size=.4, colour="grey") +
  geom_point(aes(x = exit.date, y = exit.age, color = cause), 
             data = LL[LL$cause %in% c("lung","nasal"),]) + 
  scale_color_brewer(palette = "Set1") + theme(legend.position = "bottom") + 
  theme(legend.title = element_blank()) + background_grid(major = "xy", minor = "xy",colour.major = "grey")

ggplotly(lex_plot)
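The geometry of a Lexis diagram follows directly from the definition: a subject's calendar date is always birth date plus age, so every life line runs at 45 degrees. A small Python sketch (the function name lexis_segment and the example values are hypothetical) shows how the segment endpoints can be derived from a birth date and the ages at entry and exit:

```python
def lexis_segment(birth_date, entry_age, exit_age):
    """Return ((entry_date, entry_age), (exit_date, exit_age)) for one subject.

    All quantities are in years; dates are decimal calendar years.
    """
    entry_date = birth_date + entry_age
    exit_date = birth_date + exit_age
    return (entry_date, entry_age), (exit_date, exit_age)


# Hypothetical worker born mid-1900, followed from age 20 to age 55.25
start, end = lexis_segment(1900.5, 20.0, 55.25)

# Age and calendar time advance together, so the slope is 1
slope = (end[1] - start[1]) / (end[0] - start[0])
```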

4. Cox-Snell Residuals Plot

In any regression analysis, it is important to verify that the modeling assumptions are reasonable. We can do this by looking at the residuals. In parametric survival models we can plot each covariate against the Cox-Snell (CS) residuals (note that CS residuals can also be used for semi-parametric models such as the Cox model). I have provided a dataset in this GitHub Gist which contains survival times for leukemia patients. The times are in weeks from diagnosis and there are two covariates: white blood cell count (wbc) at diagnosis and a binary covariate AG that indicates a positive or negative (positive=1, negative=0) test related to white blood cell characteristics.

I first build a Weibull regression model with suitable covariates using the survival::survreg function. Then I calculate the CS residuals and plot them against each covariate. If the model assumptions are appropriate, the CS residuals should follow an exponential distribution with mean 1, i.e., the points should be randomly scattered around the horizontal line y = 1.

leuk <- read.table("https://gist.githubusercontent.com/sahirbhatnagar/0026f614c55d75521662c06db92e9332/raw/2d8f77fe76c07746fb67593214460e306f907a24/leuk",
                   header = TRUE)

# Fit a Weibull regression model (the survreg default), including an AG-by-wbc
# interaction term; x = TRUE keeps the design matrix needed for the residuals below
fit <- survreg(Surv(time, status) ~ AG + wbc + log(wbc)+ wbc*AG, data = leuk, x = TRUE)

#Cox-Snell Residuals for failure/censored observations
cox_snell_residuals <- as.numeric(exp((fit$y[,1] - fit$x %*% fit$coefficients)/fit$scale) + 1 - leuk$status)

#Cox-Snell Residuals: Leukemia Data
plotly::subplot(
  plot_ly(x = log(leuk$wbc), y = cox_snell_residuals, name = "log White Blood Cell Count"),  
  plot_ly(x = leuk$wbc, y = cox_snell_residuals, name = "White Blood Cell Count"),
  plot_ly(x = leuk$wbc*leuk$AG, y = cox_snell_residuals, name = "AG*White Blood Cell Count"),
  plot_ly(x = leuk$AG, y = cox_snell_residuals, name = "AG"),
  nrows = 2
) %>%
  layout(title = 'Cox-Snell Residuals for Weibull Regression Model of Leukemia Data')
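The residual expression in the R code is compact, so here it is spelled out for a single observation as a Python sketch. It assumes, as the R code does, that the model works on the log-time scale with a linear predictor and a scale parameter; the function name is hypothetical.

```python
import math


def cox_snell_weibull(time, status, linear_predictor, scale):
    """Cox-Snell residual for one observation of a Weibull regression model.

    time             -- observed follow-up time
    status           -- 1 if the event was observed, 0 if censored
    linear_predictor -- fitted x'beta on the log-time scale
    scale            -- fitted scale parameter of the model
    """
    # Estimated cumulative hazard at the observed time
    residual = math.exp((math.log(time) - linear_predictor) / scale)
    # Censored observations get +1, the expected remaining unit-exponential time
    return residual + (1 - status)


# Hypothetical values: the observed time sits exactly at exp(linear predictor)
event_residual = cox_snell_weibull(math.exp(2.0), 1, 2.0, 1.0)
censored_residual = cox_snell_weibull(math.exp(2.0), 0, 2.0, 1.0)
```

Adding 1 for censored subjects is the usual adjustment that keeps the residuals comparable to a unit exponential sample.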

5. Cox Model Coefficient Plot

In a Cox regression analysis, the estimated coefficients tell us the magnitude of each predictor's effect on survival time. A coefficient plot makes it easy to see these effects. We use the survival::coxph function to fit a Cox model, and the broom R package to extract the hazard ratios and 95% confidence intervals:

cfit <- survival::coxph(Surv(time, status) ~ age + sex + inst + wt.loss + log(ph.karno) +  meal.cal, data = lung)

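d <- broom::tidy(cfit, exponentiate = TRUE, conf.int = TRUE) %>% 
  arrange(desc(estimate)) %>%
  mutate(term = factor(term, levels = term))

# asymmetric error bars spanning the 95% confidence interval of each hazard ratio
# (std.error is on the log scale, so it cannot be used directly as a bar length)
plot_ly(d, x = ~estimate, y = ~term) %>%
  add_markers(error_x = ~list(type = "data", symmetric = FALSE,
                              array = conf.high - estimate,
                              arrayminus = estimate - conf.low)) %>%
  layout(title = 'Cox Model Hazard Ratio Estimates and 95% CI for Lung Data')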
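As a reminder of how a hazard ratio and its confidence interval relate to the raw Cox coefficient, here is a short Python sketch. It uses the standard normal-approximation interval on the log scale; the function name and the example numbers are hypothetical.

```python
import math


def hazard_ratio_ci(coef, std_error, z=1.96):
    """Return (hazard ratio, lower bound, upper bound) for one Cox coefficient.

    coef      -- estimated log hazard ratio
    std_error -- standard error of coef (on the log scale)
    z         -- normal quantile; 1.96 gives an approximate 95% interval
    """
    hazard_ratio = math.exp(coef)
    lower = math.exp(coef - z * std_error)
    upper = math.exp(coef + z * std_error)
    return hazard_ratio, lower, upper


# Hypothetical coefficient of 0 (no effect) with a standard error of 0.1
hr, low, high = hazard_ratio_ci(0.0, 0.1)
```

Because the interval is built on the log scale and then exponentiated, it is not symmetric around the hazard ratio.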

6. Population Time Plots

To visualize the incidence density of an event, we can look at population-time plots (somewhat similar to waterfall plots): on the X-axis we have time, and on the Y-axis we have the size of the risk set at a particular time point. Failure times associated with the event of interest can then be highlighted on the plot using red dots. We use the casebase R package by Sahir Bhatnagar and Maxime Turgeon to create these plots.

We can draw a few conclusions from this plot right away. First, we get a sense of how quickly the size of the risk set changes over time. Second, the incidence density is non-constant: most events occur before time 20. Finally, the risk set keeps shrinking after the last event has occurred; this could be due to either censoring or the competing event.

bmt <- read.csv("https://raw.githubusercontent.com/sahirbhatnagar/casebase/master/inst/extdata/bmtcrr.csv")

obj <- popTime(bmt, time = "ftime")
p6 <- plot(obj)

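# convert the plot to a plotly object before adding the layout, as in the stratified example
ggplotly(p6) %>%
  layout(title = 'Population Time Plot for Stem Cell Transplant Data')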
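The Y-axis of a population-time plot is just the number of subjects still being followed at each time point. A minimal Python sketch of that calculation (the function name and toy follow-up times are hypothetical, and this is not how casebase builds its plot):

```python
def risk_set_sizes(follow_up_times, grid):
    """Number of subjects still under observation at each time in grid."""
    return [sum(1 for t in follow_up_times if t >= g) for g in grid]


# Toy follow-up times (hypothetical, not the bmt data)
sizes = risk_set_sizes([1, 4, 4, 7, 10], [0, 2, 4, 5, 8, 11])
```

Plotting these sizes against the grid gives the shrinking grey area; the events are then overlaid as dots at their failure times.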

7. Stratified Population Time Plots

We can also create stratified population-time plots. This allows us to see if there is differential follow-up between two groups (in this example, disease type ALL vs. AML), and also if the incidence density differs between the groups, as indicated by the distribution of the red dots. We can also clearly see that the AML group has more follow-up time than the ALL group (based on the amount of grey area).

obj <- popTime(bmt, exposure = "D", time = "ftime")
p7 <- plot(obj)
ggplotly(p7) %>%
  layout(title = 'Disease Type Stratified Population Time Plot for Stem Cell Transplant Data')

NBA Player Movement using Plotly Animations

February 16, 2017, 8:28 am

If you have been looking for animation support in Plotly, it’s high time you explored it.

In this post, we will recreate a slice of the NBA game between Toronto Raptors and Charlotte Hornets using the Plotly animations.

Data Collection

NBA’s official site had a section for ‘player tracking movement’ data in the past. Currently, it’s offline due to some technical difficulties (according to the site).

We will be using a publicly available dump of that dataset from GitHub (source repository: linouk23/NBA-Player-Movements).

Let’s fetch the data for a game event.

import requests as req
res = req.get('https://gist.githubusercontent.com/pravj/ea6b8ac5c14d41b81d87c7863b01ee3a/raw/a6d92935ae90a61524266c2c8640190abb2aa935/NBA-CHA-TOR-event.json')

event = res.json()[0]

Basketball court using Plotly shapes

We will start with drawing the game court using Plotly shapes. You can read our post (NBA shots analysis using Plotly shapes) for more details on it. It’s based on the details provided in a similar blog by Savvas Tjortjoglou.

import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot, iplot_mpl, plot
init_notebook_mode()

An NBA court measures 94 by 50 feet, so we will use those dimensions for our court.

# point on the center of the court (dummy data)
midpoint_trace = go.Scatter(
    x = [47],
    y = [25]
)

# outer boundary
outer_shape = {
    'type': 'rect',
    'x0': 0,
    'y0': 0,
    'x1': 94,
    'y1': 50,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# left backboard
left_backboard_shape = {
    'type': 'line',
    'x0': 4,
    'y0': 22,
    'x1': 4,
    'y1': 28,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# right backboard
right_backboard_shape = {
    'type': 'line',
    'x0': 90,
    'y0': 22,
    'x1': 90,
    'y1': 28,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# left outer box
left_outerbox_shape = {
    'type': 'rect',
    'x0': 0,
    'y0': 17,
    'x1': 19,
    'y1': 33,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# left inner box
left_innerbox_shape = {
    'type': 'rect',
    'x0': 0,
    'y0': 19,
    'x1': 19,
    'y1': 31,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# right outer box
right_outerbox_shape = {
    'type': 'rect',
    'x0': 75,
    'y0': 17,
    'x1': 94,
    'y1': 33,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# right inner box
right_innerbox_shape = {
    'type': 'rect',
    'x0': 75,
    'y0': 19,
    'x1': 94,
    'y1': 31,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# left corner a
leftcorner_topline_shape = {
    'type': 'line',
    'x0': 0,
    'y0': 47,
    'x1': 14,
    'y1': 47,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# left corner b
leftcorner_bottomline_shape = {
    'type': 'line',
    'x0': 0,
    'y0': 3,
    'x1': 14,
    'y1': 3,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# right corner a
rightcorner_topline_shape = {
    'type': 'line',
    'x0': 80,
    'y0': 47,
    'x1': 94,
    'y1': 47,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# right corner b
rightcorner_bottomline_shape = {
    'type': 'line',
    'x0': 80,
    'y0': 3,
    'x1': 94,
    'y1': 3,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# half court
half_court_shape = {
    'type': 'line',
    'x0': 47,
    'y0': 0,
    'x1': 47,
    'y1': 50,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# left hoop
left_hoop_shape = {
    'type': 'circle',
    'x0': 6.1,
    'y0': 25.75,
    'x1': 4.6,
    'y1': 24.25,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# right hoop
right_hoop_shape = {
    'type': 'circle',
    'x0': 89.4,
    'y0': 25.75,
    'x1': 87.9,
    'y1': 24.25,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# left free throw circle
left_freethrow_shape = {
    'type': 'circle',
    'x0': 25,
    'y0': 31,
    'x1': 13,
    'y1': 19,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# right free throw circle
right_freethrow_shape = {
    'type': 'circle',
    'x0': 81,
    'y0': 31,
    'x1': 69,
    'y1': 19,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# center big circle
center_big_shape = {
    'type': 'circle',
    'x0': 53,
    'y0': 31,
    'x1': 41,
    'y1': 19,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# center small circle
center_small_shape = {
    'type': 'circle',
    'x0': 49,
    'y0': 27,
    'x1': 45,
    'y1': 23,
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# left arc shape
left_arc_shape = {
    'type': 'path',
    'path': 'M 14,47 Q 45,25 14,3',
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# right arc shape
right_arc_shape = {
    'type': 'path',
    'path': 'M 80,47 Q 49,25 80,3',
    'line': {
        'color': 'rgba(0,0,0,1)',
        'width': 1
    },
}

# list containing all the shapes
_shapes = [
    outer_shape,
    left_backboard_shape,
    right_backboard_shape,
    left_outerbox_shape,
    left_innerbox_shape,
    right_outerbox_shape,
    right_innerbox_shape,
    leftcorner_topline_shape,
    leftcorner_bottomline_shape,
    rightcorner_topline_shape,
    rightcorner_bottomline_shape,
    half_court_shape,
    left_hoop_shape,
    right_hoop_shape,
    left_freethrow_shape,
    right_freethrow_shape,
    center_big_shape,
    center_small_shape,
    left_arc_shape,
    right_arc_shape
]


layout = go.Layout(
    title = 'Basketball Court',
    shapes = _shapes
)

fig = go.Figure(data = [midpoint_trace], layout=layout)
iplot(fig)

This produces a chart of the court, which we will use as the background of our animation.
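The shape dictionaries above differ only in their type and coordinates, so they could also be generated by a small helper. The sketch below is one possible way to do that; court_shape is a hypothetical name, not part of Plotly, and it only covers the shapes defined by x0, y0, x1, y1 (not the path shapes).

```python
def court_shape(kind, x0, y0, x1, y1, color='rgba(0,0,0,1)', width=1):
    """Build a Plotly layout shape dictionary for one court marking.

    kind -- a Plotly shape type such as 'rect', 'line' or 'circle'
    """
    return {
        'type': kind,
        'x0': x0,
        'y0': y0,
        'x1': x1,
        'y1': y1,
        'line': {'color': color, 'width': width},
    }


# Same content as outer_shape and left_backboard_shape defined by hand above
outer = court_shape('rect', 0, 0, 94, 50)
left_backboard = court_shape('line', 4, 22, 4, 28)
```

The resulting dictionaries can be placed in the same list that is passed to the layout's shapes attribute.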

Data Processing

We will create a dictionary called team_details from the event data. It will contain basic information such as team name, team id, and each player's jersey number and name.

team_details = {}

visitor = event['visitor']
home = event['home']

# keep names and jersey numbers as text (no .encode) so the team names used as
# dictionary keys further down also match under Python 3
team_details[visitor['teamid']] = {'name': visitor['name'], 'players': {}}
team_details[home['teamid']] = {'name': home['name'], 'players': {}}

for player in visitor['players']:
    team_details[visitor['teamid']]['players'][player['playerid']] = (player['jersey'], '{0} {1}'.format(player['firstname'], player['lastname']))
    
for player in home['players']:
    team_details[home['teamid']]['players'][player['playerid']] = (player['jersey'], '{0} {1}'.format(player['firstname'], player['lastname']))
    
team_details[-1] = {'name': 'Ball', 'players': {-1: ('', 'Ball')}}

For this particular event, there are 600 total moments.

moments = event['moments']
print(len(moments))  # 600

A single grid can’t have all the 600 frames, so we will be using multiple grids. We will use 40 grids for all the 600 frames, each having data for 15 frames.

import time
import plotly.plotly as py
from plotly.grid_objs import Grid, Column

grids = []

for j in range(40):
    text_attrs = []
    columns = []

    for i in range(j*15, (j+1)*15):
        moment = moments[i]

        text_attrs = []
        attr_dict = {
            '{0}x'.format(visitor['name']): [],
            '{0}y'.format(visitor['name']): [],
            '{0}x'.format(home['name']): [],
            '{0}y'.format(home['name']): [],
            '{0}x'.format('Ball'): [],
            '{0}y'.format('Ball'): [],
        }

        for obj in moment[5]:
            attr_dict['{0}x'.format(team_details[obj[0]]['name'])].append(obj[2])
            attr_dict['{0}y'.format(team_details[obj[0]]['name'])].append(obj[3])

            text_attrs.append(team_details[obj[0]]['players'][obj[1]][0])

        columns.append(Column(attr_dict['{0}x'.format(visitor['name'])], '{0}x{1}'.format(visitor['name'], i)))
        columns.append(Column(attr_dict['{0}y'.format(visitor['name'])], '{0}y{1}'.format(visitor['name'], i)))
        columns.append(Column(attr_dict['{0}x'.format(home['name'])], '{0}x{1}'.format(home['name'], i)))
        columns.append(Column(attr_dict['{0}y'.format(home['name'])], '{0}y{1}'.format(home['name'], i)))
        columns.append(Column(attr_dict['Ballx'], '{0}x{1}'.format('Ball', i)))
        columns.append(Column(attr_dict['Bally'], '{0}y{1}'.format('Ball', i)))

    columns.append(Column(text_attrs[:1], '{0}text'.format('Ball')))
    columns.append(Column(text_attrs[1:6], '{0}text'.format(home['name'])))
    columns.append(Column(text_attrs[6:], '{0}text'.format(visitor['name'])))
    
    _grid = Grid(columns)
    grids.append(_grid)
    
    py.grid_ops.upload(grids[j], 'nba_grid'+str(time.time()), auto_open=False)
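The split of 600 frames into 40 grids of 15 frames is plain index arithmetic. As a sketch, assuming a hypothetical helper named frame_ranges, the same ranges used in the loops above can be computed for any frame count:

```python
def frame_ranges(total_frames, frames_per_grid):
    """Split frame indices 0..total_frames-1 into consecutive chunks."""
    return [
        range(start, min(start + frames_per_grid, total_frames))
        for start in range(0, total_frames, frames_per_grid)
    ]


ranges = frame_ranges(600, 15)

# The grid that holds a given frame is found by integer division
grid_index_of_frame_599 = 599 // 15
```

Using min() for the upper bound means the last chunk is simply shorter when the frame count is not a multiple of the chunk size.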

The animation will have 3 moving traces (groups): the home team, the visitor team, and the ball. We will use different colors and sizes for them.

groups = [visitor['name'], home['name'], 'Ball']
colormap = {visitor['name']: 'yellow', home['name']: 'orange', 'Ball': 'red'}
sizemap = {visitor['name']: 20, home['name']: 20, 'Ball': 10}

Now we will create the frames iterating over all the grids and groups.

grid_frames = []

for j in range(40):
    for i in range(j*15, (j+1)*15):
        frame_data = []
        for g in groups:
            frame_data.append({
                    'xsrc': grids[j].get_column_reference('{0}x{1}'.format(g, i)),
                    'ysrc': grids[j].get_column_reference('{0}y{1}'.format(g, i)),
                    'textsrc': grids[j].get_column_reference('{0}text'.format(g)),
                    'mode': 'markers+text',
                    'name': g,
                    'marker': {'size': sizemap[g], 'color': colormap[g]}
                })


        grid_frames.append(frame_data)

Finally, we can create the animation using the previously created shapes as a background.

sliders_dict = {
    'active': 0,
    'yanchor': 'top',
    'xanchor': 'left',
    'currentvalue': {
        'font': {'size': 15},
        'prefix': '<b>',
        'suffix': '</b>',
        'visible': True,
        'xanchor': 'left'
    },
    'transition': {'duration': 0, 'easing': 'cubic-in-out'},
    'pad': {'b': 10, 't': 50},
    'len': 1.0,
    'x': 0,
    'y': 0,
    'steps': []
}

# create figure
figure = {
    'data': grid_frames[0],
    'layout': {'title': '<b>Charlotte Hornets</b> vs <b>Toronto Raptors</b>',
               'shapes': _shapes,
               'xaxis': {'range': [0, 94], 'autorange': False, 'showgrid': False, 'showticklabels': False},
               'yaxis': {'range': [0, 50], 'autorange': False, 'showgrid': False, 'showticklabels': False},
               'updatemenus': [{
                   'buttons': [
                       {'args': [None, {'frame': {'redraw': True, 'duration': 75}, 'fromcurrent': True}],
                        'label': 'Resume',
                        'method': 'animate'},
                       {
                            'args': [[None], {'frame': {'duration': 0, 'redraw': False}, 'mode': 'immediate',
                            'transition': {'duration': 0}}],
                            'label': 'Pause',
                            'method': 'animate'
                        }
                   ],
                   'pad': {'r': 10, 't': 87},
                   'hovermode': 'closest',
                   'showactive': True,
                   'type': 'buttons',
                }]
              },
    'frames': []
}

# slider marks
marks = [0, 145, 175, 355, 599]

# label on slider marks
mark_labels = {
    0: 'Start',
    145: 'Lowry (7) misses shot',
    175: 'Defensive rebound',
    355: 'Batum (5) makes shot',
    599: 'End'}

# function to generate slider labels based on index
def frame_name(index):
    if index in marks:
        return mark_labels[index]
    else:
        return index + 1

left_grids = grid_frames
for i in range(len(left_grids)):
    figure['frames'].append({'data': left_grids[i], 'name': str(i+1)})
    
    if i in marks:
        slider_step = {'args': [
                [i+1],
                {'frame': {'duration': 0, 'redraw': False},
                 'mode': 'immediate', 'fromcurrent': True,
               'transition': {'duration': 0}}
             ],
             'label': frame_name(i),
             'method': 'animate'}
        sliders_dict['steps'].append(slider_step)

figure['layout']['slider'] = {
    'args': [
        'slider.value', {
            'duration': 0,
            'ease': 'cubic-in-out'
        }
    ],
    'initialValue': '1',
    'plotlycommand': 'animate',
    'values': ['1', '146', '176', '356', '600'],
    'visible': True
}

figure['layout']['sliders'] = [sliders_dict]

py.icreate_animations(figure)

The source code for this animation is available as an IPython Notebook on Plotly.

Plotcon May 2017 – Speakers and Topics

March 1, 2017, 2:17 am

Plotcon Oakland, California

Plotcon is happening in Oakland, California this time around, and we have an exciting lineup of speakers!

Want to attend? You can grab tickets here.

Steve Lianoglou

Steve is a computational biologist at Genentech. Steve holds a PhD in Computational Biology from Cornell and develops computational techniques that integrate heterogeneous sources of high throughput biological data to better understand mechanisms of gene regulation and how they affect cell fate determination and disease progression.

You can find him on Linkedin and Github

Maxime Beauchemin

Maxime Beauchemin joined Airbnb as a data engineer developing tools to help streamline and automate data-engineering processes and was an early adopter of Hadoop/Pig while at Yahoo in 2007. At Facebook, he developed analytics-as-a-service frameworks around engagement and growth-metrics computation, anomaly detection, and cohort analysis. He’s a father of three, and in his free time, he’s a digital artist. You can read more about his projects on his blog, Digital Artifacts.

You can find him on Linkedin and Github

Jess Stauth

Dr. Jessica Stauth is Quantopian’s Vice President of Quant Strategy. Quantopian is a crowd-sourced quantitative investment firm that inspires people from around the world to write investment algorithms. Jess and her team are in charge of selecting algorithms from the Quantopian community to be used in their portfolio. Previously, she worked as an equity quant analyst at StarMine Corporation and as a director of Quant Product Strategy for Thomson Reuters. Jess holds a PhD from UC Berkeley in Biophysics.

You can find her on Linkedin and Github

Holly Bik

Holly Bik is an Assistant Professor in the Department of Nematology at the University of California, Riverside. She holds a Ph.D. from the University of Southampton, UK. Her research uses environmental genomics, computational biology, and data visualization tools to explore the biodiversity of microbial species in diverse habitats, with an emphasis on nematode worms in deep-sea sediments. She also serves as the Associate Editor for Deep-Sea News and maintains an active presence on Twitter (@hollybik).

You can find her on Linkedin and Github

Nicolas Belmonte

Nicolas is the Head of Visualization at Uber. He and his team help enhance people’s ability to understand and communicate data at Uber through the design and implementation of interactive systems for data visualization and analysis. Prior to Uber, he was at Twitter, where he led interactive.twitter.com, a platform that generated interactive visualizations for News, Government, Brand Strategy, Sales, Sports and Music teams, and was responsible for Twitter's public-facing data visualization work.

You can find him on Linkedin and Github

Alan Jacobson

Mr. Jacobson earned his Bachelor’s degree in Mechanical Engineering from the University of New Hampshire and a Master’s degree at Virginia Tech. He started with the Ford Motor Company in 1991, and has been instrumental in helping bring to market products like the F-Series, Expedition, Navigator, and Taurus. Alan has been the recipient of the Automotive Hall of Fame’s Young Leadership & Excellence award and the Engineering Society of Detroit’s Young Engineer of the Year award.

You can find him on Linkedin

Sylvain Corlay

Sylvain Corlay is an applied mathematician specializing in stochastic analysis and optimal control. He holds a PhD in applied mathematics from University Paris VI. As an open source developer, Sylvain contributes to Project Jupyter in the area of interactive widgets for the notebook, and is a steering committee member of the project. Besides Jupyter, Sylvain contributes to a number of scientific computing open-source projects such as bqplot, xtensor and ipyleaflet.

Sylvain founded QuantStack in September 2016. Prior to founding QuantStack, Sylvain was a quant researcher at Bloomberg and an adjunct faculty member at the Courant Institute and Columbia University.

You can find him on Linkedin and Github

Chris Toomey

Chris Toomey is the Senior Tableau Architect and Technical Program Manager for Big Data at Zillow Group in Seattle. His work focuses on building infrastructure and tools to enable an “always on” view of data. Prior to Zillow, he spent two years consulting with Slalom and five years as a Research Scientist for Pacific Northwest National Laboratory and the U.S. Department of Energy.

You can find him on Linkedin and Github

Elizabeth Seiver

Elizabeth’s work at PLOS focuses on understanding their contributors: authors, editors and reviewers. She uses both quantitative and qualitative research techniques, from analyzing datasets to surveys to in-depth interviews. She works on a diverse set of teams and collaborates with folks in Product, Marketing, Editorial, and Advocacy areas. Her research informs decision-making so that the services are tailored to their contributors’ needs.

You can find her on Linkedin and Github

Carson Sievert

Carson Sievert is a freelance data scientist developing software and creating products that make data analysis more accessible, appealing, and efficient. During his PhD, he became maintainer of the R package plotly and was recognized with the John Chambers Statistical Software Award. He is also author and maintainer of numerous other R packages including: LDAvis, animint, pitchRx, and rdom.

You can find him on Linkedin and Github

Kan Nishida

Kan (Kanichiro) has built and delivered software products from scratch in Data Science, Machine Learning, Analytics, Data Visualization, BI, and Enterprise Reporting. Prior to that, he built a strong consulting team to deliver Analytics and BI solutions.

You can find him on Linkedin and Github

Todd Mostak

Todd is the CEO and Founder of MapD Technologies. Todd built the original prototype of MapD after tiring of the inability of conventional tools to allow for interactive exploration of big datasets while conducting his Harvard graduate research on the role of Twitter in the Arab Spring. He then joined MIT as a research fellow focusing on GPU databases before turning the MapD project into a startup.

You can find him on Linkedin and Github

Matthew Dillon

Matthew Dillon is a Research Software Engineer in the Caporaso Lab, a working group of the Pathogen and Microbiome Institute at Northern Arizona University (Flagstaff, AZ). He is a core developer of QIIME 2 – an NSF-funded project that will revolutionize microbiome bioinformatics.

You can find him on Github

Alex Johnson

Alex is Plotly’s CTO and the instigator of Plotly.js. He also consults for Microsoft Research and the Center for Quantum Devices at the Niels Bohr Institute, University of Copenhagen, focusing on data acquisition for quantum computing experiments. Alex holds a BS in Physics from Harvey Mudd College and a PhD in Physics from Harvard University. He has 20 peer-reviewed publications, including the 2006 paper of the year in Science and papers in other top journals such as Nature, Physical Review Letters, and Journal of Power Sources. He has previously worked in the fields of semiconductor optics, quantum information, magnetic resonance imaging, hydrogen fuel cells, and photovoltaic power.

You can find him on Linkedin and Github

Chris Parmer

Chris is a cofounder of Plotly, co-author of the Plotly Python library, and creator of Dash.

You can find him on Github

Ankit Rohatgi

Ankit works as a Software Developer in collaboration with a globally distributed team responsible for development, testing and documentation of ANSYS software used for structural mechanics, fluid flow and electromagnetic simulations. He works with a variety of programming languages, software development tools and platforms to enhance software such as ANSYS Mechanical, Workbench and AIM.

Ankit holds a Ph.D. in Chemical Engineering from the University of Notre Dame where he studied particle-fluid interaction and transport mechanisms in multiphase flows by means of many analytical, computational and experimental methods. He has also developed an opensource web application called WebPlotDigitizer that is used by thousands all over the world and has been cited in many peer-reviewed publications.

You can find him on Linkedin and Github

Pascal Bugnion

Pascal is a data engineer at ASI Data Science, a consultancy specializing in data science and analytics. He spends too much of his spare time contributing to open source software, including jupyter-gmaps, a Jupyter plugin for visualizing geographical data on Google Maps, and, more recently, a Scala client for Plotly. He is the author of ‘Scala for Data Science’ (Packt Publishing).

You can find him on Linkedin and Github

Jules Malin

Jules is a Manager of Product Analytics & Data Science at GoPro, responsible for building data and analytics pipelines that produce insights and enable informed decision making, with the goal of making better products and services.

You can find him on Linkedin and Github

Chris Holdgraf

Chris is a graduate student in Bob Knight’s cognitive neuroscience laboratory. He uses applied statistics and machine learning to study the brain, utilizing encoding and decoding models of electrophysiology signals to study how our experience with the auditory world affects the way that we process sounds. He’s a regular contributor to the MNE-python project for MEG and EEG data analysis in Python and to a handful of tools in the scientific python ecosystem.

When he’s not coding, writing, collecting data, or in a meeting, he tries to spend as much time as he can going for hikes, playing basketball, traveling, and eating barbecue.

You can find him on Linkedin and Github

Jeremy Freeman

Jeremy is the group leader at the Freeman Lab where he and his team are trying to understand how the brain works. They do so by developing technologies for data analysis and visualization that facilitate collaboration and reproducible science, and by designing experiments that monitor and manipulate neural activity in animals performing complex behaviours to learn principles of neural coding.

You can find him on Linkedin and Github

Mike Driscoll

Mike Driscoll founded Metamarkets in 2010 after spending more than a decade developing data analytics solutions for online retail, life sciences, digital media, insurance, and banking. Metamarkets provides an end-to-end analytics solution for leaders in programmatic marketing, including Twitter, LinkedIn and AOL. Prior to Metamarkets, Mike successfully founded and sold two companies: Dataspora, a life science analytics company, and CustomInk, an early pioneer in customized apparel. He began his career as a software engineer for the Human Genome Project. Mike holds an A.B. in Government from Harvard and a Ph.D. in Bioinformatics from Boston University.

You can find him on Linkedin

Alexandre Sobolevski

Alexandre is a software engineer at Plotly and an engineering graduate from McGill University who previously worked at Siemens. He is co-author of the Plotly Database Connector, enjoys developing tools for data scientists, and loves to pack a backpack to go on a trek every now and then.

You can find him on Linkedin and Github

Yves Hilpisch

Dr. Hilpisch holds a PhD in Mathematical Finance and is the founder and managing partner of The Python Quants GmbH. He has written many books, including Python for Finance, Derivatives Analytics with Python, and Listed Volatility & Variance Derivatives, and is a lecturer for Data Science at htw saar University of Applied Sciences and for Computational Finance at the CQF Program.

You can find him on Linkedin and Github

Mikola Lysenko

Mikola Lysenko is a founding member of BITS cooperative, a worker-owned technology company based out of the big island of Hawai’i. Previously at plot.ly, he helped develop the WebGL rendering systems for 3D and 2D plots. He has contributed to several open source projects, including stack.gl, scijs and regl and has written several hundred npm modules.

You can find him on LinkedIn and Github

Andrew Seier

Andrew Seier works at Plotly as a front-end and back-end developer. He co-wrote the Python API library for Plotly. He has an MS and BS in electrical engineering from the University of Vermont and previously worked for Sandia National Labs. He lives right here in sunny Oakland and gets out to climb whenever he has time.

You can find him on Linkedin

Kyle Kelley

Kyle Kelley is a programmer, sometimes mathematician, oftentimes ops. He works for Netflix, hoping to foster Developer Experience. He enjoys working on open source in ways that benefit communities, open source projects, and the companies that rely on and support the tooling. As part of this, he works on IPython and the Jupyter project.

You can find him on Linkedin and Github

Jorge Santos

Jorge heads the Product Management teams for Eikon – Thomson Reuters’ premium Desktop offering for Financial professionals, with more than 120,000 subscribers. It provides users with access to financial content, analytics, and advanced calculators, as well as functional and visual applications.

You can find him on Linkedin and Github

↧

Plotly for R workshop at Plotcon 2017

March 14, 2017, 11:25 pm

Carson Sievert, the lead developer of the Plotly package for R, will be hosting a workshop at https://plotcon.plot.ly/. Here’s an outline of the material he will be covering during the workshop.

More details here. The workshop will be based on Carson’s Plotly for R book.

Broad Topic	Details
A tale of 2 interfaces	Converting ggplot2 via `ggplotly()`; Directly interfacing to plotly.js with `plot_ly()`; Augmenting `ggplotly()` via `layout()` and friends; Accessing and leveraging ggplot2 internals; Accessing any plot spec via `plotly_json()`
The `plot_ly()` cookbook	Scatter traces; Maps; Bars & histograms; Boxplots; 2D frequencies; 3D plots
Arranging multiple views	Arranging htmlwidget objects; Merging plotly objects into a subplot; Navigating many views
Multiple linked views	Linking views with shiny; Linking views without shiny; Linking views “without shiny, inside shiny”
Animating views	Key frame animations; Linking animated views
Advanced topics	Adding custom behavior with the plotly.js API and `htmlwidgets::onRender()`; Translating ggplot2 geoms to plotly

↧

News and Updates Surrounding plotly for R

March 19, 2017, 10:26 am

The plotly R package will soon release version 4.6.0, which includes new features that have been over a year in the making. The NEWS file lists all the new features and changes. This webinar highlights the most important new features, including animations and multiple linked views.

Concrete examples with code that you can run yourself will be covered in this webinar; however, Carson will give you a more in-depth learning experience at his workshop at Plotcon, taking place in Oakland on May 4th.

Here’s an example of the animation capabilities supported by the plotly package:

library(plotly)
library(quantmod)
library(zoo)
library(dplyr)
library(reshape2)
library(PerformanceAnalytics)

stocklist = c("AAPL","GOOGL","MSFT","BRK-A","AMZN","FB","JNJ","XOM","JPM","WFC","BABA",
              "T","BAC","GE","PG","CHL","BUD","RDS-A","WMT","V","VZ","PFE","CVX","ORCL",
              "KO","HD","NVS","CMCSA","DIS","MRK","PM","CSCO","TSM","C","INTC","IBM","UNH",
              "HSBC","PEP","MO","UL","CX","AMGN","MA","CCV","TOT","BTI","SAP","MMM","MDT")

ddf <- getSymbols(Symbols = stocklist[1], auto.assign = F)
ddf <- ddf[,6]

pb <- txtProgressBar(min = 0, max = length(stocklist) - 1, style=3)

for(i in stocklist[-1]){
  df <- getSymbols(Symbols = i, auto.assign = F)  
  df <- df[,6]
  
  ddf <- merge(ddf, df)
  
  setTxtProgressBar(pb, which(stocklist[-1] == i))
}

month <- as.yearmon(index(ddf))
prices <- data.frame(ddf, month)
names(prices) <- c(stocklist, "Month")
prices <- melt(prices, id.vars = "Month")

# Calculate returns
CalcRet <- function(x, vec = F){
  ret <- (x[2:length(x)] - x[1:(length(x) - 1)]) / x[1:(length(x) - 1)]
  
  if(vec == T) {
    return(ret)
  }else{
    return(mean(ret))
  }
}

returns <- prices %>% 
  group_by(Month, variable) %>% 
  summarize(Return = CalcRet(value))

returns <- data.frame(returns, VAR = "Returns")
names(returns) <- c("Period", "Stock", "Value", "Variable")

# Calculate volatility
volatility <- prices %>% 
  group_by(Month, variable) %>% 
  summarize(Volatility = sd(CalcRet(value, vec = T)))

volatility <- data.frame(volatility, VAR = "Volatility")
names(volatility) <- c("Period", "Stock", "Value", "Variable")

# Create df for plotting
plot.df <- rbind(returns, volatility)
plot.df <- dcast(plot.df, Period + Stock ~ Variable, value.var = "Value")
plot.df$Year <- format(plot.df[,1], "%Y")

p <- plot_ly(plot.df, x = ~Volatility, y = ~Returns) %>% 
  
  add_markers(color = ~Stock, size = ~(Returns / Volatility),
              frame = ~Year,
              marker = list(opacity = 0.6, line = list(width = 1, color = "black"))) %>% 
  
  layout(title = "Monthly Return vs Volatility over last 10 years <br> for 50 US stocks over time",
         showlegend = F, 
         plot_bgcolor = "#e6e6e6",
         paper_bgcolor = "#e6e6e6") %>% 
  
  animation_opts(frame = 1000)

# Print the plot object to render the animation
p
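
To make the return and volatility steps above concrete, here is a minimal base-R sketch on made-up prices. The numbers and variable names are illustrative assumptions, not data from the post. It reproduces what `CalcRet()` computes for one stock within one month: simple period-over-period returns, their mean, and their standard deviation.

```r
# Hypothetical closing prices for one stock within one month (illustrative only)
prices <- c(100, 102, 101, 105)

# Simple returns: (P_t - P_{t-1}) / P_{t-1}
# diff() gives P_t - P_{t-1}; head(prices, -1) drops the last price
simple_returns <- diff(prices) / head(prices, -1)

# Mean return, i.e. what CalcRet(x) returns
mean_return <- mean(simple_returns)

# Volatility, i.e. what sd(CalcRet(x, vec = TRUE)) returns
volatility <- sd(simple_returns)

print(round(simple_returns, 4))
```

Using `diff()` and `head()` avoids the explicit index arithmetic in `CalcRet()` while giving the same values.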

About Carson

Carson Sievert is a freelance data scientist developing software and creating products that make data analysis more exciting and accessible. During his PhD, he became maintainer of the R package plotly and was recognized with the John Chambers Statistical Software Award. He is also author and maintainer of numerous other R packages including: LDAvis, animint, pitchRx, and rdom.

Follow Carson on Twitter

↧

Burtin’s Antibiotics visualization in Plotly and R

March 29, 2017, 3:35 am

In this post, we’ll try to re-create Burtin’s antibiotics visualization using Plotly. The post follows Mike Bostock’s original re-creation in Protovis. See here.

library(plotly)
library(reshape2)
library(dplyr)

# Data
df <- read.csv("https://cdn.rawgit.com/plotly/datasets/5360f5cd/Antibiotics.csv", stringsAsFactors = F)
N <- nrow(df)

# Melting for easier use later on
df$Seq <- 1:N
df <- melt(df, id.vars = c("Bacteria", "Gram", "Seq")) %>% arrange(Seq)

# Angle generation
theta.start <- (5/2)*pi - pi/10
theta.end <- pi/2 + pi/10
theta.range <- seq(theta.start, theta.end, length.out = N * 4 - 1)
dtheta <- diff(theta.range)[1]

# Fine adjustment for larger ticks (black)
inc <- 0.04

# Angles for larger ticks
big_ticks <- theta.range[1:length(theta.range) %% 4 == 0]
big_ticks <- c(theta.range[1] - dtheta, big_ticks, theta.range[length(theta.range)] + dtheta)

# Angles for smaller ticks
small_ticks <- theta.range[1:length(theta.range) %% 4 != 0]

# Set inner and outer radii
inner.radius <- 0.3
outer.radius <- 1

# Set colors
cols <- c("#0a3264","#c84632","#000000")
pos_col <- "rgba(174, 174, 184, .8)"
neg_col <- "rgba(230, 130, 110, .8)"

# Function to calculate radius given minimum inhibitory concentration
# Scaling function is sqrt(log(x))
radiusFUNC <- function(x){
    min <- sqrt(log(0.001 * 1e4))
    max <- sqrt(log(1000 * 1e4))
    a <- (outer.radius - inner.radius)/(min - max)
    b <- inner.radius - a * max
    rad <- a * sqrt(log(x * 1e4)) + b

    return(rad)
}

SEQ <- function(start, by, length){
    vec <- c()
    vec[1] <- start

    for(i in 2:length){
        vec[i] <- vec[i-1] + by
    }

    return(vec)
}

# Generate x and y coordinates for large ticks
radial_gridlines <- data.frame(theta = big_ticks,
                               x = (inner.radius - inc) * cos(big_ticks),
                               y = (inner.radius - inc) * sin(big_ticks),
                               xend = (outer.radius + inc) * cos(big_ticks),
                               yend = (outer.radius + inc) * sin(big_ticks))

# Generate x and y coordinates for antibiotics
antibiotics <- df
antibiotics$x <- inner.radius * cos(small_ticks)
antibiotics$y <- inner.radius * sin(small_ticks)
antibiotics$xend <- radiusFUNC(df$value) * cos(small_ticks)
antibiotics$yend <- radiusFUNC(df$value) * sin(small_ticks)
antibiotics$text <- with(antibiotics,
                         paste("<b>Bacteria:</b>", Bacteria, "<br>",
                               "<b>Antibiotic:</b>", variable, "<br>",
                               "<b>Min. conc:</b>", value, "<br>"))

# Generate x and y coordinates for white circles (grid)
rad <- c(100, 10, 1, 0.1, 0.01, 0.001)
rad <- c(inner.radius, radiusFUNC(rad))
theta <- seq(0, 2 * pi, length.out = 100)

circles <- lapply(rad, function(x){
    x.coord <- x * cos(theta)
    y.coord <- x * sin(theta)

    return(data.frame(label = which(rad == x),
                      x = x.coord,
                      y = y.coord))
})

circles <- do.call(rbind, lapply(circles, data.frame))

# Generate gram-negative polygon
theta <- seq(big_ticks[1], big_ticks[10], length.out = 100)

gram.neg <- data.frame(theta = theta,
                       x = inner.radius * cos(theta),
                       y = inner.radius * sin(theta))

theta <- rev(theta)

gram.neg <- rbind(gram.neg,
                  data.frame(theta = theta,
                             x = outer.radius * cos(theta),
                             y = outer.radius * sin(theta)))

# Generate gram-positive polygon
theta <- seq(big_ticks[10], big_ticks[length(big_ticks)], length.out = 100)

gram.pos <- data.frame(theta = theta,
                       x = inner.radius * cos(theta),
                       y = inner.radius * sin(theta))

theta <- rev(theta)

gram.pos <- rbind(gram.pos,
                  data.frame(theta = theta,
                             x = outer.radius * cos(theta),
                             y = outer.radius * sin(theta)))

# Text annotations - bacteria name
bacteria.df <- data.frame(bacteria = unique(antibiotics$Bacteria),
                                theta = big_ticks[-length(big_ticks)] - 0.17,
                                stringsAsFactors = F)

bacteria.df$x <- (outer.radius + inc) * cos(bacteria.df$theta)
bacteria.df$y <- (outer.radius + inc) * sin(bacteria.df$theta)
bacteria.df$textangle <- SEQ(-70, by = 21, length = nrow(bacteria.df))
bacteria.df$textangle[9:nrow(bacteria.df)] <-
    (bacteria.df$textangle[9:nrow(bacteria.df)] - 90) - 90

bacteria.df <- lapply(1:nrow(bacteria.df), function(x){

    list(x = outer.radius*cos(bacteria.df$theta[x]) ,
         y = outer.radius*sin(bacteria.df$theta[x]),
         xref = "x", yref = "y",
         xanchor = "center", yanchor = "middle",
         text = bacteria.df$bacteria[x],
         showarrow = F,
         font = list(size = 12, family = "arial"),
         textangle = bacteria.df$textangle[x])
})

# Title
bacteria.df[[17]] <- list(x = 0, y = 1,
                          xref = "paper", yref = "paper",
                          xanchor = "left", yanchor = "top",
                          text = "<b>Burtin’s Antibiotics</b>",
                          showarrow = F,
                          font = list(size = 30, family = "serif"))

# Text annotations - scale
scale.annotate <- data.frame(x = 0,
                             y = c(inner.radius, radiusFUNC(c(100, 10, 1, 0.1, 0.01, 0.001))),
                             scale = c("", "100", "10", "1", "0.1", "0.01","0.001"))

# Plot
p <- plot_ly(x = ~x, y = ~y, xend = ~xend, yend = ~yend,
        hoverinfo = "text",
        height = 900, width = 800) %>%

    # Gram negative sector
    add_polygons(data = gram.neg,
                 x = ~x, y = ~y,
                 line = list(color = "transparent"),
                 fillcolor = neg_col,
                 inherit = F,
                 hoverinfo = "none") %>%

    # Gram positive sector
    add_polygons(data = gram.pos,
                 x = ~x, y = ~y,
                 line = list(color = "transparent"),
                 fillcolor = pos_col,
                 inherit = F,
                 hoverinfo = "none") %>%

    # Antibiotics
    add_segments(data = antibiotics %>% filter(variable == "Penicillin"),
                 text = ~text,
                 line = list(color = cols[1], width = 7)) %>%

    add_segments(data = antibiotics %>% filter(variable == "Streptomycin"),
                 text = ~text,
                 line = list(color = cols[2], width = 7)) %>%

    add_segments(data = antibiotics %>% filter(variable == "Neomycin"),
                 text = ~text,
                 line = list(color = cols[3], width = 7)) %>%

    # Black large ticks
    add_segments(data = radial_gridlines, line = list(color = "black", width = 2),
                 hoverinfo = "none") %>%

    # White circles
    add_polygons(data = circles,
                 x = ~x, y = ~y,
                 split = ~label,
                 line = list(color = "#eeeeee", width = 1),
                 fillcolor = "transparent",
                 inherit = F,
                 hoverinfo = "none") %>%

    # Scale labels
    add_text(data = scale.annotate, x = ~x, y = ~y, text = ~scale,
             inherit = F,
             textfont = list(size = 10, color = "black"),
             hoverinfo = "none") %>%

    # Gram labels
    add_markers(x = c(-0.05, -0.05), y = c(-1.4, -1.5),
                marker = list(color = c(neg_col, pos_col), size = 15),
                inherit = F, hoverinfo = "none") %>%

    add_text(x = c(0.15, 0.15), y = c(-1.4, -1.5), text = c("Gram Negative", "Gram Positive"),
             textfont = list(size = 10, color = "black"),
             inherit = F, hoverinfo = "none") %>%

    # Antibiotic legend
    add_markers(x = c(-0.15, -0.15, -0.15), y = c(0.1, 0, -0.1),
                marker = list(color = c(cols), size = 15, symbol = "square"),
                inherit = F, hoverinfo = "none") %>%

    add_text(x = c(0.05, 0.05, 0.05), y = c(0.1, 0, -0.1),
             text = c("Penicillin", "Streptomycin", "Neomycin"),
             textfont = list(size = 10, color = "black"),
             inherit = F, hoverinfo = "none") %>%


    layout(showlegend = F,
           xaxis = list(showgrid = F, zeroline = F, showticklabels = F, title = "", domain = c(0.05, 0.95)),
           yaxis = list(showgrid = F, zeroline = F, showticklabels = F, title = "", domain = c(0.05, 0.95)),
           plot_bgcolor = "rgb(240, 225, 210)",
           paper_bgcolor = "rgb(240, 225, 210)",

           # Annotations
           annotations = bacteria.df)
p
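
The radius scaling is the part of the code above that is easiest to get backwards, so here is a minimal standalone sketch of it. The function name `mic_to_radius` and its default arguments are assumptions made for illustration; the logic mirrors `radiusFUNC()`: concentrations are multiplied by 1e4 so the logarithm stays positive, transformed with `sqrt(log(x))`, and then mapped linearly so that the smallest concentration lands on the outer radius and the largest on the inner radius.

```r
inner_radius <- 0.3
outer_radius <- 1

# Hypothetical standalone version of the post's radiusFUNC()
mic_to_radius <- function(mic, mic_min = 0.001, mic_max = 1000) {
  # Shift by 1e4 so log() is positive even for the smallest concentration
  s <- function(x) sqrt(log(x * 1e4))

  # Linear map: s(mic_min) -> outer_radius, s(mic_max) -> inner_radius
  slope <- (outer_radius - inner_radius) / (s(mic_min) - s(mic_max))
  intercept <- inner_radius - slope * s(mic_max)

  slope * s(mic) + intercept
}

# Smallest, middle and largest concentrations on the scale
radii <- mic_to_radius(c(0.001, 1, 1000))
print(radii)
```

A lower minimum inhibitory concentration therefore produces a longer bar, which matches the reading of the original chart: the longer the bar, the more effective the antibiotic.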

↧

Plotly charts in nteract notebooks using R

April 5, 2017, 10:21 pm

nteract is an open-source, desktop-based, interactive computing application similar to Jupyter notebooks. The Plotly R package can now be used within nteract.
Visit the links below for information on how to install nteract and an R kernel for nteract:

Visit Carson’s Plotly for R book for more details on plotly and its capabilities
Visit nteract releases to download nteract
Visit IRkernel to see details on how to install an R kernel for nteract

Note

Ensure pandoc.exe and pandoc-citeproc.exe are available in the same folder where R.exe resides.
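
One way to check this from R is sketched below. It assumes a Windows installation and assumes that `R.home("bin")` points at the folder containing R.exe, which may not hold for every setup; the variable names are illustrative.

```r
# Folder that holds the R executables (assumed to be where R.exe resides)
bin_dir <- R.home("bin")

# The two executables mentioned in the note above
tools <- c("pandoc.exe", "pandoc-citeproc.exe")

# TRUE for each file that is present in that folder
found <- file.exists(file.path(bin_dir, tools))
names(found) <- tools

print(found)
```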

Below are some examples of how to use plotly within nteract.

↧

Creating interactive SVG tables in R

October 14, 2017, 4:51 am

In this post we will explore how to make SVG tables in R using plotly. The tables are visually appealing and can be modified on the fly using simple drag and drop. Make sure you install the latest version of plotly, i.e. v4.7.1.9, from GitHub using devtools::install_github("ropensci/plotly").

The easiest way to create an SVG table in R using plotly is by manually specifying headers and individual cell values. The following code snippet highlights this:

#library(devtools)
#install_github("ropensci/plotly")

library(plotly)

p <- plot_ly(
  type = 'table',  # Specify type of plot as table
  
  # header is a list and every parameter shown below needs 
  # to be specified. Note that html tags can be used as well
  
  header = list(
    
    # First specify table headers
    # Note the enclosure within 'list'
    
    values = list(list('<b>EXPENSES</b>'),
                  list('<b>Q1</b>'),
                  list('<b>Q2</b>'), 
                  list('<b>Q3</b>'), 
                  list('<b>Q4</b>')),
    
    # Formatting 
    line = list(color = '#DFE8F3'),
    align = c('left','left','left','left','left'),
    font = list(color = c('#506784', '#506784', '#506784', '#506784', '#ab63fa'), size = 14)
  ),
  
  # Specify individual cells
  
  cells = list(
    
    # Now specify each cell content
    
    values = list(
      c('Salaries', 'Office', 'Merchandise', 'Legal', '<b>TOTAL</b>'),
      c(1200000, 20000, 80000, 2000, 1302000),
      c(1300000, 20000, 70000, 2000, 1392000),
      c(1300000, 20000, 120000, 2000, 1442000),
      c(1400000, 20000, 90000, 2000, 1512000)),
    
    # Formatting
    line = list(color = '#DFE8F3'),
    align = c('left', 'left', 'left', 'left', 'left'),
    font = list(color = c('#506784', '#506784', '#506784', '#506784', '#ab63fa'), size = 14),
    height = 48
    )) %>% 
  
  # Layout is needed to remove gridlines, axis zero lines and ticktext 
  # or else they will also show up
  
  layout(xaxis = list(zeroline = F, showgrid = F, showticklabels = F, domain = c(0, 0.5)),
         yaxis = list(zeroline = F, showgrid = F, showticklabels = F))

p

We can also write a helper function to create these tables using dataframes.

library(plotly)

createTable <- function(df, tableHeight = 50){
  
  # Create the value parameters
  # Headers
  nms <- lapply(names(df), function(x){
    return(paste0("<b>", x, "</b>"))    
  })
  
  nms <- append(nms, "<b>Rows</b>", after = 0)
  headerValues <- lapply(nms, function(x){return(list(x))})
  
  # Cell values
  names(df) <- NULL
  cellValues <- apply(df, 2, function(x){return(list(x))})
  cellValues <- lapply(cellValues, function(x){return(unlist(x))})
  
  cellValues <- append(cellValues, list(rownames(df)), after = 0)
  
  # Create the list to pass to plot_ly()
  header <- list(
    values = headerValues, 
    
    # Formatting 
    line = list(color = '#DFE8F3'),
    align = c('left', rep('center', ncol(df))),
    font = list(color = '#ffffff', size = 16),
    fill = list(color = '#999999')
  )
  
  cells <- list(
    values = cellValues,
    
    # Formatting
    line = list(color = '#DFE8F3'),
    align = c('left', rep('right', ncol(df))),
    font = list(color = c('#262626'), size = 14),
    fill = list(color = c("#d9d9d9", rep("#ffe6cc", ncol(df)))),
    height = tableHeight
  )
  
  # Create table in plotly
  p <- plot_ly(
    type = "table",
    header = header,
    cells = cells,
    width = 1200, 
    height = 1600) %>% 
    
    layout(xaxis = list(zeroline = F, showgrid = F, showticklabels = F),
           yaxis = list(zeroline = F, showgrid = F, showticklabels = F))
  
  return(p)
}

p <- createTable(mtcars)
p
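
The key step in the helper above is reshaping the data frame into the column-by-column lists that the table trace takes, as in the hand-written example earlier in this post. A minimal base-R sketch of just that reshaping is shown below, using a small slice of `mtcars`. The variable names are illustrative assumptions, and the code builds only the lists, not the plot.

```r
# Small example data frame: first three rows, two columns of mtcars
df <- head(mtcars[, c("mpg", "cyl")], 3)

# Cell values: one unnamed vector per column, with the row names first
cell_values <- c(list(rownames(df)), unname(as.list(df)))

# Header values: bold labels, with "Rows" for the row-name column
header_values <- as.list(paste0("<b>", c("Rows", names(df)), "</b>"))

str(cell_values)
```

Dropping the column names with `unname()` plays the same role as `names(df) <- NULL` in `createTable()`: it keeps the result a plain list of columns.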

Note that columns can easily be rearranged by dragging them around. You can find more information on individual attributes here.
Hope this post was useful – happy table making!

↧