Productionizing Shiny and Plumber with Pins

Producing an API that serves model results or a Shiny app that displays the results of an analysis requires a collection of intermediate datasets and model objects, all of which need to be saved. Depending on the project, they might need to be reused in another project later, shared with a colleague, used to shortcut computationally intensive steps, or safely stored for QA and auditing. Some of these should be saved in a data warehouse, data lake, or database, but write access to an appropriate database isn’t always available.


Building Interactive World Maps in Shiny

Florianne Verkroost is a PhD candidate at Nuffield College at the University of Oxford. With a passion for data science and a background in mathematics and econometrics, she applies her interdisciplinary knowledge to computationally address societal problems of inequality. In this post, I will show you how to create interactive world maps and how to present them in an R Shiny app. As the Shiny app cannot be embedded into this blog, I will direct you to the live app and show you in this post on my GitHub how to embed a Shiny app in your R Markdown files, which is a really cool and innovative way of preparing interactive documents.


Multiple Hypothesis Testing in R

In the first article of this series, we looked at understanding type I and type II errors in the context of an A/B test, and highlighted the issue of “peeking”. In the second, we illustrated a way to calculate always-valid p-values that were immune to peeking. We will now explore multiple hypothesis testing, or what happens when multiple tests are conducted on the same family of data. We will set things up as before, with the false positive rate (\alpha = 0.
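To make the multiple-testing problem concrete, here is a minimal worked sketch. It assumes m independent tests in which every null hypothesis is true, each run at the same level \alpha; the values \alpha = 0.05 and m = 10 are illustrative choices, not figures taken from the post.

```latex
% Family-wise error rate (FWER): probability of at least one false positive
% across m independent tests, each at level \alpha, when all nulls are true.
\[
  \mathrm{FWER} = 1 - (1 - \alpha)^{m}
\]
% Illustrative values (assumed, not from the post): \alpha = 0.05, m = 10.
\[
  1 - (1 - 0.05)^{10} \approx 1 - 0.599 = 0.401
\]
% Bonferroni correction: test each hypothesis at \alpha / m, which bounds the FWER by \alpha.
\[
  \mathrm{FWER}_{\text{Bonferroni}} \le m \cdot \frac{\alpha}{m} = \alpha
\]
```

Under these assumptions, ten uncorrected tests give roughly a 40% chance of at least one false positive, which is why some form of correction is needed.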


August 2019: "Top 40" R packages

Two hundred and twenty-seven new packages made it to CRAN in August. Quite a few were devoted to medical or genomic applications, and this is reflected in my “Top 40” selections, listed below in nine categories: Computational Methods, Data, Genomics, Machine Learning, Medicine and Pharma, Statistics, Time Series, Utilities, and Visualization. Computational Methods fmcmc v0.2-0: Provides a flexible Markov Chain Monte Carlo (MCMC) framework for implementing Metropolis-Hastings algorithms. There is a vignette on user-defined kernels and another on workflows.


Accelerate your plots with ggforce

In this post, I will walk you through some examples that show off the major features of the ggforce package. The main goal is to share a few ideas about customizing visualizations that you may find useful in your everyday work. The ggforce package is an extension to ggplot2 developed by Thomas Pedersen. Thanks to ggforce, you can enhance almost any ggplot by highlighting data groupings, and focusing attention on interesting features of the plot.


R/Medicine 2019 Workshops

R/Medicine 2019 kicked off on Thursday with two outstanding workshops. It was difficult to choose between the two, but fortunately both presenters developed rich sets of materials that are available online. Alison Hill delivered R Markdown for Medicine with an elegant HTML exposition masterfully created to cultivate beginners while still engaging experienced R Markdown users. In four sections, (1) R Markdown Anatomy, (2) Outputs and Tables, (3) Graphics for Communication, and (4) Data and Workflows, she developed aspects of R Markdown aimed at statisticians and clinicians writing medical documents, which should also delight a wide audience of R Markdown users.


How to Send Custom E-mails with R

A common business-oriented data science task is to programmatically craft and send custom emails. In this post, I will show how to accomplish this with R on the RStudio Connect platform (a paid product built for the enterprise) using the blastula package. blastula provides a set of functions for composing high-quality HTML e-mails that render across various e-mail clients, such as Gmail and Outlook, and also includes tooling for sending out those e-mails via SMTP, the standard protocol for electronic mail transmission between different e-mail providers.


July 2019 "Top 40" R Packages

One hundred seventy-six new packages made it to CRAN in July. Here are my “Top 40” picks organized into twelve categories: Data, Data Science, Finance, Genomics, Machine Learning, Mathematics, Medicine, Statistics, Time Series, Topological Data Analysis, Utilities and Visualization. Data eia v0.3.2: Provides API access to data from the US Energy Information Administration (EIA). Use of the API requires a free API key. See the Package Overview. litteR v0.4.1: Implements a user interface to analyze litter data: beach litter, riverain litter, floating litter, seafloor litter, etc.


Calculating Always-Valid p-values in R

In this post, we will develop a framework for always-valid inference based on the paper Always Valid Inference: Continuous Monitoring of A/B Tests (Johari, Pekelis, and Walsh, 2019). Using an always-valid p-value allows us to continuously monitor A/B tests, and potentially stop the test early in a valid way. In section 5 of the paper, the authors propose their method for calculating always-valid p-values: the mixture sequential probability ratio test (mSPRT), first introduced by Robbins (1970).


Tech Dividends, Part 2

In a previous post, we explored the dividend history of stocks included in the S&P 500, and we followed that by exploring the dividend history of some NASDAQ tickers. Today’s post is a short continuation of that tech dividend theme, with the aim of demonstrating how we can take our previous work and use it to quickly visualize research from the real world. In this case, the inspiration is the July 27th edition of Barron’s, which has an article called 8 Tech Stocks That Yield Steady Payouts.
