Dec 2017: "Top 40" New Package Picks

by Joseph Rickert

Sometimes it appears to me that the invisible hand economists speak of guides the market for new R packages. Eight of the 129 new packages that stuck to CRAN in December fall under Computational Methods, a category I have only recently begun using. All of them made it into the list below of my “Top 40” picks. One day, I would like to go back and reexamine the categories I have been using to see if package developers really do respond to some idea that is “in the air” or whether the variation in categories is just one more of my many hidden biases.

Computational Methods

alphashape3d v1.3: Provides functions to compute the alpha-shape (a generalization of the convex hull) of a finite set of points in the three-dimensional space.

deGradInfer v1.0.0: Implements efficient Bayesian parameter inference for systems of ordinary differential equations based on adaptive gradient matching. See Dondelinger et al. (2013) and Macdonald (2017). The vignette provides examples.

FixedPoint v0.1: Provides algorithms for finding fixed-point vectors, including iterative procedures for non-linear integral equations (Anderson 1965), epsilon extrapolation methods (Wynn 1962), and minimal polynomial methods (Cabay & Jackson 1976). The vignette provides a very nice introduction to the subject.
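For readers new to the idea, a fixed point of a function f is a value x with f(x) = x. The sketch below shows only the plainest iteration, x <- f(x), in Python; it is not the FixedPoint package's API, and it does not implement the Anderson, epsilon extrapolation, or minimal polynomial accelerations the package offers. The function name `fixed_point` and its parameters are hypothetical, chosen for illustration.

```python
import math


def fixed_point(f, x0, tol=1e-10, max_iter=1000):
    """Iterate x <- f(x) until two successive values differ by less than tol.

    Hypothetical helper for illustration; plain iteration with no acceleration.
    """
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


# cos(x) has a single real fixed point, and the plain iteration converges to it.
root = fixed_point(math.cos, 1.0)
print(root, math.cos(root))
```

Plain iteration can converge slowly or not at all, which is the motivation for acceleration schemes such as those listed above.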

grapherator v1.0.0: Aimed at research in single- and multi-objective combinatorial optimization, the package provides functions for step-wise generation of weighted graphs. There are vignettes on Graph Generation, Custom Generators and using pipes.

HMMEsolver v0.1.0: Implements a fast solver for Henderson Mixed Model Equation via row operations without computing a matrix inverse. See Kim (2017) for more details.

kexpmv v0.0.3: Implements functions from EXPOKIT to calculate matrix exponentials for both small dense matrices and large sparse matrices based on Krylov subspace methods. See Sidje RB (1998) for details.
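To give a feel for what computing the action of a matrix exponential means, here is a small Python sketch that approximates exp(A) v without ever forming exp(A). It uses a truncated Taylor series, not the Krylov subspace methods that EXPOKIT relies on, and it is not the kexpmv API; `matvec` and `expm_action` are hypothetical names for illustration.

```python
import math


def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def expm_action(A, v, terms=30):
    """Approximate exp(A) v by the truncated Taylor series sum_k A^k v / k!.

    Only matrix-vector products are used, so exp(A) itself is never built.
    Suitable for small, well-scaled matrices only.
    """
    result = list(v)
    term = list(v)  # holds A^k v / k! for the current k
    for k in range(1, terms):
        term = [x / k for x in matvec(A, term)]
        result = [r + t for r, t in zip(result, term)]
    return result


# For this A, exp(A) is a rotation matrix, so exp(A) [1, 0] = [cos 1, -sin 1].
A = [[0.0, 1.0], [-1.0, 0.0]]
w = expm_action(A, [1.0, 0.0])
print(w, [math.cos(1.0), -math.sin(1.0)])
```

A Taylor series like this loses accuracy when the entries of A are large, which is one reason production libraries turn to more careful methods.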

sparseEigen v0.1.0: Provides methods to compute sparse eigenvectors of a matrix with running time two to three orders of magnitude lower than existing methods. The methods are based on the paper by Benidis et al. (2016). The vignette includes performance benchmarks.

[Figure: Average Running Time]

TukeyRegion v0.1.2: Provides fast computation of Tukey regions, polytopes in the Euclidean space giving upper-level sets of the Tukey depth function for given data. For details, see Liu, Mosler, and Mozharovskyi (2017).

Data

mlbgameday v0.0.1: Implements methods for multi-core processing of Gameday data from Major League Baseball Advanced Media (http://gd2.mlb.com/components/game/mlb/). There are vignettes on Database Connections, Parallel Processing, Plotting Pitches, and Searching Games.

robis v1.0.0: Implements a client for the Ocean Biogeographic Information System (http://iobis.org). See README to get started.

seaaroundus v1.2.0: Provides access to Sea Around Us fish catch data. See README to get started.

tidyhydat v0.3.2: Provides functions to extract historical and real-time national ‘hydrometric’ data from Water Survey of Canada data sources. There is an Introduction and an Example.

Machine Learning

afCEC v1.0.2: Implements active function cross-entropy clustering, which partitions n-dimensional data into clusters by finding the parameters of the mixed generalized multivariate normal distribution that optimally approximates the scattering of the data in the n-dimensional space. For details, see P. Spurek et al. (2017).

dissever v0.2-2: Enables spatial down-scaling of coarse-grid mapping to fine-grid mapping using predictive covariates and a model fitted using the ‘caret’ package. The original dissever algorithm was published by Malone et al. (2012) and extended by Roudier et al. (2017). There is a short tutorial.

mlapi: Provides R6 abstract classes for building machine-learning models with a scikit-learn-like API. (scikit-learn is a popular module for the Python programming language.) There is a vignette.

Numero v1.0.3: Implements an unsupervised statistical framework for defining subgroups in complex datasets based on visual cues. See Makinen et al. (2011). The vignette shows how to use the package.

PPforest v0.1.0: Implements projection pursuit forest algorithm for supervised classification. The vignette provides details.

qCBA v0.3.1: Implements quantitative classification by association rules. See Kliegr (2017).

tfestimators v1.5: Implements an interface to TensorFlow Estimators, a high-level API that provides implementations of many different model types, including linear models and deep neural networks. There is an Introduction and vignettes on Custom Estimators, the Dataset API, Basic Components, Feature Columns, Input Functions, Parsing Utilities, Run Hooks, and TensorBoard Visualization.

Science

ePCR v0.9.9-4: Provides the top-performing ensemble-based Penalized Cox Regression (ePCR) framework developed during the DREAM 9.5 mCRPC Prostate Cancer Challenge. See Guinney J et al. (2017) for details.

simRVPedigree v0.1.0: Provides routines to simulate and manipulate pedigrees ascertained to contain multiple family members affected by a rare disease. See Nieuwoudt et al. (2017) for the science and the vignette to get started.

theseus v0.1.0: Provides analysis and visualization tools for the interpretation of microbial community composition data, especially those originating from amplicon sequencing. The vignette describes how to use the package.

Statistics

ForecastComb v1.1: Provides geometric and regression-based forecast combination methods under a unified user interface for the packages ForecastCombinations and GeomComb. For details see Hsiao C, Wan SK (2014), Hansen BE (2007), Elliott G, Gargano A, Timmermann A (2013) (doi:10.1016/j.jeconom.2013.04.017), and Clemen RT (1989).

hesim v0.1.0: Provides functions to develop and analyze health-economic simulation models, including random sampling functions for probabilistic sensitivity analyses (Claxton et al. 2005), individual patient simulations (Brennan et al. 2006), and cost-effectiveness analysis (Basu and Meltzer 2007; Ioannidis and Garber 2011). The vignette provides an overview.

PlackettLuce v0.2-1: Implements a generalization of the model jointly attributed to Plackett (1975) and Luce (1959) for modelling rankings data. The vignette introduces the model.

PUlasso v2.1: Implements an efficient algorithm for solving the Positive and Unlabeled problem in low- or high-dimensional settings with lasso or group lasso penalty. See Hyebin & Raskutti (2017) for details and the vignette for an introduction.

recurse v1.0.1: Computes revisitation metrics for trajectory data, such as the number of revisitations for each location as well as the time spent for that visit and the time since the previous visit. The vignette works through a case study of using the package to analyze revisitations in animal movement data.

samplesizeCMH v0.0.0: Provides functions to calculate the power and sample size for Cochran-Mantel-Haenszel tests, and for working with probability, odds, relative risk, and odds ratio values. There is an Introduction to the Cochran-Mantel-Haenszel Test, and vignettes on Power Calculations and Sample Size Calculations.
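The relationships among probability, odds, relative risk, and odds ratio are simple arithmetic, and a quick sketch may help keep them straight. The Python functions below are hypothetical helpers written for this illustration, not the samplesizeCMH interface, and they use only the standard definitions.

```python
def odds(p):
    """Convert a probability p (0 <= p < 1) to odds p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def prob_from_odds(o):
    """Convert non-negative odds back to a probability o / (1 + o)."""
    if o < 0:
        raise ValueError("odds must be non-negative")
    return o / (1 + o)


def relative_risk(p_exposed, p_control):
    """Ratio of the two event probabilities."""
    return p_exposed / p_control


def odds_ratio(p_exposed, p_control):
    """Ratio of the two odds."""
    return odds(p_exposed) / odds(p_control)


# An event with probability 0.5 in the exposed group and 0.25 in the control group.
print(relative_risk(0.5, 0.25))
print(odds_ratio(0.5, 0.25))
```

Note that the odds ratio and the relative risk generally differ; they are close only when the event is rare in both groups.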

skimr v1.0.1: Provides a function to display summary statistics at the console. There is an Introduction and vignettes on Fonts and defining summary objects.

ZOIP v0.1: Implements the ZOIP (Zeros Ones Inflated Proportional) distribution, a distribution for proportional data inflated with zeros and/or ones. See Jørgensen and Barndorff-Nielsen (1991), Ferrari and Cribari-Neto (2004), and Rigby and Stasinopoulos (2005) for details, and the vignette for a summary.

Time Series

OSTSC v0.0.1: Provides functions for oversampling imbalanced univariate time series classification data using integrated Enhanced Structure Preserving Oversampling (ESPO) and Adaptive Synthetic (ADASYN) methods. The vignette describes the method and provides examples.

Utilities

JuniperKernel v1.2.2.0: Implements the Jupyter kernel for R, providing an interface to libraries that exist in the Jupyter ecosystem for building widgets, plotting, and more.

labelVector v0.0.1: Supports labels for atomic vectors in a light-weight design that is suitable for use in other packages. The vignette provides details.

ncmeta v0.0.1: Provides functions to extract metadata from NetCDF data sources, which can be files, file handles, or servers. The package provides a framework for the in-development tidync project.

RPostgres v1.0-4: Implements a fully DBI-compliant, Rcpp-backed interface to PostgreSQL, an open-source relational database.

swatches v0.5.0: Provides functions to read and inspect Adobe Color (ACO), Adobe Swatch Exchange (ASE), GIMP Palette (GPL), OpenOffice palette (SOC) files, and KDE Palette (colors) files.

styler v1.0.0: Provides functions to pretty-print R code without changing the user’s formatting intent. There is an Introduction, and vignettes on Customization and Performance Improvements.

Visualizations

BioCircos v0.2.2: Implements interactive Circos-like visualizations of genomic data, to map information such as genetic variants, genomic fusions, and aberrations to a circular genome. The vignette shows how to use the package.

cubing v1.0-1: Provides functions for visualizing, animating, solving and analyzing the Rubik’s cube. See Rokicki (2008) for the underlying Kociemba solver. The vignette shows how to get started.
