This package contains an implementation of a function digest() for the creation of hash digests of arbitrary R objects (using the md5, sha-1, sha-256, crc32, xxhash and murmurhash algorithms), permitting easy comparison of R language objects, as well as a function hmac() to create hash-based message authentication codes.
Please note that this package is not meant to be deployed for cryptographic purposes for which more comprehensive (and widely tested) libraries such as OpenSSL should be used.
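The idea of hashing a serialized object and of keyed message authentication can be sketched in a few lines. The following is a minimal illustration in Python, not the package's R interface: the names object_digest and message_hmac are hypothetical, and pickle serialization is assumed here as a stand-in for R's object serialization.

```python
import hashlib
import hmac
import pickle


def object_digest(obj, algo="md5"):
    """Hash the serialized bytes of a picklable object (hypothetical helper)."""
    payload = pickle.dumps(obj, protocol=4)
    return hashlib.new(algo, payload).hexdigest()


def message_hmac(key, message, algo="sha256"):
    """Hash-based message authentication code for bytes key and message."""
    return hmac.new(key, message, algo).hexdigest()


# Equal objects give equal digests, so objects can be compared by digest.
same = object_digest([1, 2, 3]) == object_digest([1, 2, 3])
different = object_digest([1, 2, 3]) != object_digest([1, 2, 4])
tag = message_hmac(b"secret-key", b"payload")
```

As with the package itself, a sketch like this is meant for comparing objects, not for cryptographic deployment.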
Writing interfaces to command line software is cumbersome. The cmdfun package provides a framework for building function calls to seamlessly interface with shell commands by allowing lazy evaluation of command line arguments. It also provides methods for handling user-specific paths to tool installs or secrets like API keys. Its focus is to equally serve package builders who wish to wrap command line software, and to help analysts stay inside R when they might usually leave to execute non-R software.
This package provides functions and data sets for actuarial science: modeling of loss distributions; risk theory and ruin theory; simulation of compound models, discrete mixtures and compound hierarchical models; credibility theory. It boasts support for many additional probability distributions to model insurance loss amounts and loss frequency: 19 continuous heavy tailed distributions; the Poisson-inverse Gaussian discrete distribution; zero-truncated and zero-modified extensions of the standard discrete distributions. It also supports phase-type distributions commonly used to compute ruin probabilities.
This package implements a parsimonious evolutionary model to analyze and predict gene-functional annotations in phylogenetic trees as described in Vega Yon et al. (2021) <doi:10.1371/journal.pcbi.1007948>. Focusing on computational efficiency, aphylo makes it possible to estimate pooled phylogenetic models, including thousands (hundreds) of annotations (trees) in the same run. The package also provides the tools for visualization of annotated phylogenies, calculation of posterior probabilities (prediction) and goodness-of-fit assessment featured in Vega Yon et al. (2021).
This package contains various functions for optimal scaling. One function performs optimal scaling by maximizing an aspect (i.e. a target function such as the sum of eigenvalues, sum of squared correlations, squared multiple correlations, etc.) of the corresponding correlation matrix. Another function implements the LINEALS approach for optimal scaling by minimization of an aspect based on pairwise correlations and correlation ratios. The resulting correlation matrix and category scores can be used for further multivariate methods such as structural equation models.
Fitting Bayesian multiple and mixed-effect regression models for circular data based on the projected normal distribution. Both continuous and categorical predictors can be included. Sampling from the posterior is performed via an MCMC algorithm. Posterior descriptives of all parameters, model fit statistics and Bayes factors for hypothesis tests for inequality constrained hypotheses are provided. See Cremers, Mulder & Klugkist (2018) <doi:10.1111/bmsp.12108> and Nuñez-Antonio & Gutiérrez-Peña (2014) <doi:10.1016/j.csda.2012.07.025>.
Demonstrate the results of a statistical model object as a dynamic nomogram in an RStudio panel or web browser. The package provides two generic functions: DynNom, which displays statistical model objects as a dynamic nomogram; and DNbuilder, which builds the scripts required to publish a dynamic nomogram on a web server such as <https://www.shinyapps.io/>. The current version of DynNom supports stats::lm, stats::glm, survival::coxph, rms::ols, rms::Glm, rms::lrm, rms::cph, and mgcv::gam model objects.
Implementation of Forecastable Component Analysis ('ForeCA'), including main algorithms and auxiliary functions (summary, plotting, etc.) to apply ForeCA to multivariate time series data. ForeCA is a novel dimension reduction (DR) technique for temporally dependent signals. Contrary to other popular DR methods, such as PCA or ICA, ForeCA takes time dependency explicitly into account and searches for the most forecastable signal. The measure of forecastability is based on the Shannon entropy of the spectral density of the transformed signal.
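To make the entropy-based forecastability measure concrete, here is a rough Python sketch. It assumes a raw, unsmoothed periodogram as the spectral density estimate and defines forecastability as one minus the normalized spectral entropy. The function names are hypothetical and the package's own estimators may differ.

```python
import cmath
import math


def spectral_entropy(x):
    """Shannon entropy of the normalized periodogram, scaled to [0, 1]."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    # Raw periodogram at the nonzero Fourier frequencies k = 1 .. n-1.
    power = []
    for k in range(1, n):
        coef = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                   for t, c in enumerate(centered))
        power.append(abs(coef) ** 2)
    total = sum(power)
    if total == 0.0:
        raise ValueError("constant series has no spectral mass")
    probs = [p / total for p in power if p > 0.0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by the entropy of a flat (white-noise) spectrum.
    return entropy / math.log(len(power))


def forecastability(x):
    """High when spectral mass is concentrated, low for a flat spectrum."""
    return 1.0 - spectral_entropy(x)
```

Under these assumptions a pure sinusoid, whose spectral mass sits at a single frequency pair, scores close to 1, while white noise scores close to 0.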
This package provides functions for range estimation in birds based on Pennycuick (2008) and Pennycuick (1975). The Flight program, which complements Pennycuick (2008), requires manual entry of birds, which can be tedious when there are thousands of birds to estimate. Implemented are two ODE methods discussed in Pennycuick (1975) and the time-marching computation method "constant muscle mass" as in Pennycuick (1998). See Pennycuick (1975, ISBN:978-0-12-249405-5), Pennycuick (1998) <doi:10.1006/jtbi.1997.0572>, and Pennycuick (2008, ISBN:9780080557816).
This package implements an extension of the Small Area Estimation (SAE) model. The Geoadditive Small Area Model combines the geoadditive model with the SAE model by adding geospatial information to the SAE model. This package refers to J.N.K. Rao and Isabel Molina (2015, ISBN: 978-1-118-73578-7), Bocci, C., & Petrucci, A. (2016) <doi:10.1002/9781118814963.ch13>, and Ardiansyah, M., Djuraidah, A., & Kurnia, A. (2018) <doi:10.21082/jpptp.v2n2.2018.p101-110>.
Evaluates Higher Order Assortativity of complex networks defined through objects of class igraph from the package of the same name. The package also returns a result for directed and weighted graphs. References: Arcagni, A., Grassi, R., Stefani, S., & Torriero, A. (2017) <doi:10.1016/j.ejor.2017.04.028>; Arcagni, A., Grassi, R., Stefani, S., & Torriero, A. (2021) <doi:10.1016/j.jbusres.2019.10.008>; Arcagni, A., Cerqueti, R., & Grassi, R. (2023) <doi:10.48550/arXiv.2304.01737>.
It uses phenological and productivity-related variables derived from time series of vegetation indexes, such as the Normalized Difference Vegetation Index, to assess ecosystem dynamics and change, which might eventually lead to land degradation. The final result of the Land Productivity Dynamics indicator is a categorical map with 5 classes of land productivity dynamics, ranging from declining to increasing productivity. See www.sciencedirect.com/science/article/pii/S1470160X21010517/ for a description of the methods used in the package to calculate the indicator.
Fits regularization paths for linear regression, GLM, and Cox regression models using lasso or nonconvex penalties, in particular the minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD) penalty, with options for additional L2 penalties (the "elastic net" idea). Utilities for carrying out cross-validation as well as post-fitting visualization, summarization, inference, and prediction are also provided. For more information, see Breheny and Huang (2011) <doi:10.1214/10-AOAS388> or visit the ncvreg homepage <https://pbreheny.github.io/ncvreg/>.
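For readers unfamiliar with the two nonconvex penalties, their standard textbook definitions can be written out as follows. This is an illustrative Python sketch, not ncvreg code: the function names are hypothetical, and the default gamma values (3 for MCP, 3.7 for SCAD) are conventional choices assumed here.

```python
def mcp_penalty(beta, lam, gamma=3.0):
    """Minimax concave penalty: tapers off and is flat beyond gamma * lam."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def scad_penalty(beta, lam, gamma=3.7):
    """Smoothly clipped absolute deviation penalty (requires gamma > 2)."""
    b = abs(beta)
    if b <= lam:
        # Identical to the lasso penalty near zero.
        return lam * b
    if b <= gamma * lam:
        # Quadratic transition between the lasso and constant regions.
        return (2.0 * gamma * lam * b - b * b - lam * lam) / (2.0 * (gamma - 1.0))
    return 0.5 * lam * lam * (gamma + 1.0)
```

Both penalties match the lasso penalty lam * |beta| near zero and become constant for large coefficients, which is what reduces the bias on large effects relative to the lasso.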
Soft-margin support vector machines (SVMs) are a common class of classification models. The training of SVMs usually requires that the data be available all at once in a single batch; however, the stochastic majorization-minimization (SMM) algorithm framework allows for the training of SVMs on streamed data instead; see Nguyen, Jones & McLachlan (2018) <doi:10.1007/s42081-018-0001-y>. This package utilizes the SMM framework to provide functions for training SVMs with hinge loss, squared-hinge loss, and logistic loss.
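The three loss functions named above can be stated compactly in terms of the margin m = y * f(x), with labels y in {-1, +1}. The Python sketch below shows only these standard losses. It does not implement the SMM algorithm, and the function names are hypothetical.

```python
import math


def hinge_loss(margin):
    """Zero once the point is classified with margin at least 1."""
    return max(0.0, 1.0 - margin)


def squared_hinge_loss(margin):
    """Like the hinge loss, but penalizes violations quadratically."""
    return max(0.0, 1.0 - margin) ** 2


def logistic_loss(margin):
    """log(1 + exp(-margin)), computed in a numerically stable way."""
    if margin >= 0.0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```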
Estimate the receiver operating characteristic (ROC) curve, area under the curve (AUC) and optimal cut-off points for individual classification taking into account complex sampling designs when working with complex survey data. Methods implemented in this package are described in: A. Iparragirre, I. Barrio, I. Arostegui (2024) <doi:10.1002/sta4.635>; A. Iparragirre, I. Barrio, J. Aramendi, I. Arostegui (2022) <doi:10.2436/20.8080.02.121>; A. Iparragirre, I. Barrio (2024) <doi:10.1007/978-3-031-65723-8_7>.
Calculates total survey error (TSE) for one or more surveys, using both scale-dependent and scale-independent metrics. The package works directly from the data set, with no hand calculations required: just upload a properly structured data set (see TESTIND and its documentation), properly input column names (see the functions' documentation), and run your functions. For more on TSE, see: Weisberg, Herbert (2005, ISBN:0-226-89128-3); Biemer, Paul (2010) <doi:10.1093/poq/nfq058>; Biemer, Paul et al. (2017, ISBN:9781119041672); etc.
This package implements functions for varying coefficient meta-analysis methods. These methods do not assume effect size homogeneity. Subgroup effect size comparisons, general linear effect size contrasts, and linear models of effect sizes based on varying coefficient methods can be used to describe effect size heterogeneity. Varying coefficient meta-analysis methods do not require the unrealistic assumptions of the traditional fixed-effect and random-effects meta-analysis methods. For details see: Statistical Methods for Psychologists, Volume 5, <https://dgbonett.sites.ucsc.edu/>.
Vector binary tree provides a new data structure that makes data access and management more efficient. If the data has structured column names, it can read these names and factorize them through a specific split pattern, then build mappings among double list, vector binary tree, array and tensor, which makes batched data processing easy. The methods of array and tensor are also applicable. Detailed methods are described in Chen Zhang et al. (2020) <doi:10.35566/isdsa2019c8>.
Enables interaction with the National Weather Service application programming web-interface for fetching of real-time and forecast meteorological data. Users can provide latitude and longitude, Automated Surface Observing System identifier, or Automated Weather Observing System identifier to fetch recent weather observations and recent forecasts for the given location or station. Additionally, auxiliary functions exist to identify stations nearest to a point, convert wind direction from character to degrees, and fetch active warnings. Results are returned as simple feature objects whenever possible.
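As an illustration of the wind-direction conversion mentioned above, a 16-point compass label can be mapped to degrees with a simple lookup. This Python sketch is an assumption about the general technique, not the package's own function. The name wind_direction_to_degrees is hypothetical, and degrees are measured clockwise from north.

```python
# The 16 compass points, clockwise from north, 22.5 degrees apart.
COMPASS_POINTS = [
    "N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
    "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW",
]


def wind_direction_to_degrees(label):
    """Convert a compass label such as 'SSW' to degrees clockwise from north."""
    index = COMPASS_POINTS.index(label.strip().upper())
    return index * 22.5


east = wind_direction_to_degrees("E")        # 90.0
south_southwest = wind_direction_to_degrees("ssw")  # 202.5
```

An unrecognized label raises ValueError from list.index, which keeps malformed input from being silently mapped to a direction.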
This package provides probe-level data for 20 HGU133A and 20 HGU133B arrays which are a subset of arrays from a large ALL study. The data is for the MLL arrays. This data was published in Mary E. Ross, Xiaodong Zhou, Guangchun Song, Sheila A. Shurtleff, Kevin Girtman, W. Kent Williams, Hsi-Che Liu, Rami Mahfouz, Susana C. Raimondi, Noel Lenny, Anami Patel, and James R. Downing (2003), "Classification of pediatric acute lymphoblastic leukemia by gene expression profiling", Blood 102: 2951-2959.
CIMICE is a tool in the field of tumor phylogenetics whose goal is to build a Markov Chain (called Cancer Progression Markov Chain, CPMC) in order to model the evolution of tumor subtypes. The input of CIMICE is a Mutational Matrix, that is, a boolean matrix representing altered genes in a collection of samples. These samples are assumed to be obtained with single-cell DNA analysis techniques, and the tool is specifically written to use the peculiarities of this data for the CPMC construction.
This package provides gsubfn, which is like gsub but can take a replacement function or certain other objects instead of the replacement string. Matches and back references are input to the replacement function and replaced by the function output. gsubfn can be used to split strings based on content rather than delimiters and for quasi-perl-style string interpolation. The package also has facilities for translating formulas to functions and allowing such formulas in function calls instead of functions.
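The replacement-function idea can be illustrated with Python's re.sub, which also accepts a callable. This is an analogy rather than gsubfn's R interface: the helper sub_with_function is hypothetical, and passing the full match followed by all capture groups is a simplification assumed here.

```python
import re


def sub_with_function(pattern, replacement, text):
    """Replace each match with replacement(match_text, *capture_groups)."""
    def _apply(match):
        return str(replacement(match.group(0), *match.groups()))
    return re.sub(pattern, _apply, text)


# Transform each number found in the string.
doubled = sub_with_function(r"\d+", lambda m: int(m) * 2,
                            "3 apples and 10 pears")
# doubled == "6 apples and 20 pears"

# Quasi-perl-style interpolation: look up $name in a dictionary.
values = {"who": "world"}
greeting = sub_with_function(r"\$(\w+)", lambda m, name: values[name],
                             "Hello $who")
# greeting == "Hello world"
```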
The RISC-V Proxy Kernel, pk, is a lightweight application execution environment that can host statically-linked RISC-V ELF binaries. It is designed to support tethered RISC-V implementations with limited I/O capability and thus handles I/O-related system calls by proxying them to a host computer.
This package also contains the Berkeley Boot Loader, bbl, which is a supervisor execution environment for tethered RISC-V systems. It is designed to host the RISC-V Linux port.
An interface for performing all stages of ADMIXTOOLS analyses (<https://reich.hms.harvard.edu/software>) entirely from R. Wrapper functions (D, f4, f3, etc.) completely automate the generation of intermediate configuration files, run ADMIXTOOLS programs on the command-line, and parse output files to extract values of interest. This allows users to focus on the analysis itself instead of worrying about low-level technical details. A set of complementary functions for processing and filtering of data in the EIGENSTRAT format is also provided.