Simple Principal Components Analysis (PCA) and (Multiple) Correspondence Analysis (CA) based on the Singular Value Decomposition (SVD). This package provides S4 classes and methods to compute, extract, summarize and visualize results of multivariate data analysis. It also includes methods for partial bootstrap validation described in Greenacre (1984, ISBN: 978-0-12-299050-2) and Lebart et al. (2006, ISBN: 978-2-10-049616-7).
This package provides a genetic algorithm for finding variable subsets in high dimensional data with high prediction performance. The genetic algorithm can use ordinary least squares (OLS) regression models or partial least squares (PLS) regression models to evaluate the prediction power of variable subsets. By supporting different cross-validation schemes, the user can fine-tune the tradeoff between speed and quality of the solution.
Fits multivariate Ornstein-Uhlenbeck types of models to continuous trait data from species related by a common evolutionary history. See K. Bartoszek, J. Pienaar, P. Mostad, S. Andersson, T. F. Hansen (2012) <doi:10.1016/j.jtbi.2012.08.005>. The suggested PCMBaseCpp package (which significantly speeds up the likelihood calculations) can be obtained from <https://github.com/venelin/PCMBaseCpp/>.
Proxy forward modelling for sediment archived climate proxies such as Mg/Ca, d18O or Alkenones. The user provides a hypothesised "true" past climate, such as output from a climate model, and details of the sedimentation rate and sampling scheme of a sediment core. Sedproxy returns simulated proxy records. Implements the methods described in Dolman and Laepple (2018) <doi:10.5194/cp-14-1851-2018>.
This package provides a tool to rectangle a nested list, that is to convert it into a tibble. This is done automatically or according to a given specification. A common use case is for nested lists coming from parsing JSON files or the JSON response of REST APIs. It is supported by the vctrs package and therefore offers a wide support of vector types.
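The idea of rectangling can be illustrated with a small sketch. It is written in Python rather than R and does not reproduce this package's interface; the function name `rectangle` and the sample records are hypothetical. Nested keys are joined into column names, and fields missing from a record are filled with `None`.

```python
def rectangle(records, sep="."):
    """Flatten a list of nested dicts into rows that share one set of columns."""

    def flatten(value, prefix=""):
        # Recurse into dicts, joining keys with `sep`; anything else is a cell.
        if isinstance(value, dict):
            out = {}
            for key, sub in value.items():
                out.update(flatten(sub, f"{prefix}{sep}{key}" if prefix else key))
            return out
        return {prefix: value}

    rows = [flatten(rec) for rec in records]
    # Union of all column names, kept in first-seen order.
    columns = list(dict.fromkeys(col for row in rows for col in row))
    # Fill missing fields with None so every row has the same shape.
    return columns, [[row.get(col) for col in columns] for row in rows]


# Hypothetical parsed JSON response.
records = [
    {"id": 1, "user": {"name": "ada", "langs": ["R", "C"]}},
    {"id": 2, "user": {"name": "bob"}},
]
columns, rows = rectangle(records)
```

Lists are kept as single cells here, which loosely mirrors list-columns in a tibble; a fuller specification would say which fields to unnest.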
This package provides a collection of functions for automatically creating Stan code for transition diagnostic classification models (TDCMs) as they are defined by Madison and Bradshaw (2018) <DOI:10.1007/s11336-018-9638-5>. This package supports automating the creation of Stan code for TDCMs, fungible TDCMs (i.e., TDCMs with item parameters constrained to be equal across all items), and multi-threaded TDCMs.
Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
This package lets you interface to the Nocedal et al. L-BFGS-B.3.0 limited-memory BFGS minimizer with bounds on parameters. It registers an R-compatible C interface to L-BFGS-B.3.0 that uses the same function types and optimization as the optim() function. This package also adds more stopping criteria and allows more tolerances to be adjusted.
This package provides tools for manipulating systems of linear (in)equations: variable elimination (Gaussian elimination, Fourier-Motzkin elimination), the Moore-Penrose pseudoinverse, reduction to reduced row echelon form, value substitution, projecting a vector onto the convex polytope described by a system of (in)equations, simplifying systems by removing spurious columns and rows and collapsing implied equalities, testing whether a matrix is totally unimodular, and computing variable ranges implied by linear (in)equalities.
This package provides a low-level spell checker and morphological analyzer based on the famous hunspell library. The package can analyze or check individual words as well as parse text, LaTeX, HTML or XML documents. For a more user-friendly interface use the spelling package which builds on this package to automate checking of files, documentation and vignettes in all common formats.
Wraps some of the matrix exponentiation utilities from EXPOKIT (<http://www.maths.uq.edu.au/expokit/>), a FORTRAN library that is widely recommended for matrix exponentiation (Sidje RB, 1998. "Expokit: A Software Package for Computing Matrix Exponentials." ACM Trans. Math. Softw. 24(1): 130-156). EXPOKIT includes functions for exponentiating both small, dense matrices, and large, sparse matrices (in sparse matrices, most of the cells have value 0). Rapid matrix exponentiation is useful in phylogenetics when we have a large number of states (as we do when we are inferring the history of transitions between the possible geographic ranges of a species), but is probably useful in other ways as well. NOTE: In case FORTRAN checks temporarily get rexpokit archived on CRAN, see archived binaries at GitHub in: nmatzke/Matzke_R_binaries (binaries install without compilation of source code).
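To make the phylogenetic use case concrete: for a transition rate matrix Q, the transition probabilities over a branch of length t are the entries of exp(Qt). The sketch below is a naive pure-Python illustration using a truncated Taylor series with scaling and squaring. It is not EXPOKIT's algorithm and not this package's interface; the names `mat_mul` and `expm` and the two-state rate matrix are assumptions made for the example.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(matrix, terms=20):
    """Approximate exp(matrix) by a Taylor series with scaling and squaring."""
    n = len(matrix)
    # Scale the matrix down so the truncated series converges quickly.
    norm = max(sum(abs(x) for x in row) for row in matrix)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scaled = [[x / 2 ** squarings for x in row] for row in matrix]

    # Truncated Taylor series: I + A + A^2/2! + ... + A^terms/terms!
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(res_row, term_row)]
                  for res_row, term_row in zip(result, term)]

    # Undo the scaling: exp(A) = exp(A / 2^s) raised to the power 2^s.
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical two-state rate matrix (rows sum to zero) and branch length.
t = 0.7
Q = [[-1.0, 1.0], [1.0, -1.0]]
P = expm([[q * t for q in row] for row in Q])
```

Each row of P sums to 1, as expected for transition probabilities. For large or sparse matrices this dense approach scales poorly, which is the situation the wrapped library is meant for.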
Interacting with binary files can be difficult because R's types are a subset of what is generally supported by C. This package provides a suite of functions for reading and writing binary data (with files, connections, and raw vectors) using C type descriptions. These functions convert data between C types and R types while checking for values outside the type limits, NA values, etc.
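The kind of checking described here can be sketched with Python's standard `struct` module. This is an illustration of the general technique only, not this package's functions; the helper names `write_ints` and `read_ints` and the `C_TYPES` table are hypothetical.

```python
import struct

# A few fixed-width C integer types: struct format code, minimum, maximum.
C_TYPES = {
    "int8": ("b", -2 ** 7, 2 ** 7 - 1),
    "uint8": ("B", 0, 2 ** 8 - 1),
    "int16": ("h", -2 ** 15, 2 ** 15 - 1),
    "uint16": ("H", 0, 2 ** 16 - 1),
    "int32": ("i", -2 ** 31, 2 ** 31 - 1),
}


def write_ints(values, ctype, little_endian=True):
    """Encode integers as raw bytes, rejecting missing or out-of-range values."""
    code, lo, hi = C_TYPES[ctype]
    for v in values:
        if v is None:
            raise ValueError(f"missing value cannot be stored as {ctype}")
        if not lo <= v <= hi:
            raise ValueError(f"{v} is outside the {ctype} range [{lo}, {hi}]")
    order = "<" if little_endian else ">"
    return struct.pack(f"{order}{len(values)}{code}", *values)


def read_ints(raw, ctype, little_endian=True):
    """Decode raw bytes back into a list of integers of the given C type."""
    code, _, _ = C_TYPES[ctype]
    order = "<" if little_endian else ">"
    count = len(raw) // struct.calcsize(order + code)
    return list(struct.unpack(f"{order}{count}{code}", raw))


raw = write_ints([1, 65535], "uint16")
values = read_ints(raw, "uint16")
```

Validating before packing gives a descriptive error for values that would not fit the target type, instead of relying on the encoder's own failure.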
Nonparametric kernel density estimation, bandwidth selection, and other utilities for analyzing directional data. Implements the estimator in Bai, Rao and Zhao (1987) <doi:10.1016/0047-259X(88)90113-3>, the cross-validation bandwidth selectors in Hall, Watson and Cabrera (1987) <doi:10.1093/biomet/74.4.751> and the plug-in bandwidth selectors in García-Portugués (2013) <doi:10.1214/13-ejs821>.
This package provides functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.
R bindings for GeoSpark <http://geospark.datasyslab.org/> that extend the sparklyr <https://spark.rstudio.com/> R package to make distributed geocomputing easier. sf is a package that provides simple features <https://en.wikipedia.org/wiki/Simple_Features> access for R and is a leading geospatial data processing tool. The geospark R package brings the same simple features access as sf, but runs on the Spark distributed system.
Uses simple Bayesian conjugate prior update rules to calculate the win probability of each option, value remaining in the test, and percent lift over the baseline for various marketing objectives. References: Fink, Daniel (1997) "A Compendium of Conjugate Priors" <https://www.johndcook.com/CompendiumOfConjugatePriors.pdf>. Stucchio, Chris (2015) "Bayesian A/B Testing at VWO" <https://vwo.com/downloads/VWO_SmartStats_technical_whitepaper.pdf>.
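As an illustration of a conjugate update for one common objective, conversion rate, the sketch below combines a Beta prior with binomial data and estimates each option's win probability by sampling from the posteriors. It is a Python sketch of the general technique and not this package's interface; the function name `win_probability`, the uniform Beta(1, 1) prior, and the example counts are assumptions.

```python
import random


def win_probability(successes, trials, prior_a=1.0, prior_b=1.0,
                    draws=20000, seed=0):
    """Estimate, per option, the probability that it has the highest rate."""
    rng = random.Random(seed)
    # Conjugate update: Beta(a, b) prior + s successes in n trials
    # gives a Beta(a + s, b + n - s) posterior.
    posteriors = [(prior_a + s, prior_b + n - s)
                  for s, n in zip(successes, trials)]
    wins = [0] * len(posteriors)
    for _ in range(draws):
        samples = [rng.betavariate(a, b) for a, b in posteriors]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]


# Hypothetical test: option A converts 100 of 1000, option B 150 of 1000.
probs = win_probability([100, 150], [1000, 1000], draws=5000)
```

The same posterior draws could be reused to estimate lift over a baseline. Other objectives would need a different likelihood and prior pair.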
This package provides a correlation-based batch process for fast, accurate imputation for high dimensional missing data problems via chained random forests. See Waggoner (2023) <doi:10.1007/s00180-023-01325-9> for more on hdImpute, Stekhoven and Bühlmann (2012) <doi:10.1093/bioinformatics/btr597> for more on missForest, and Mayer (2022) <https://github.com/mayer79/missRanger> for more on missRanger.
This R package implements methods for estimation and inference under Incomplete Block Designs and Balanced Incomplete Block Designs within a design-based finite-population framework. Based on Koo and Pashley (2024) <arXiv:2405.19312>, it includes block-level estimators and extends to unit-level effects using Horvitz-Thompson and Hájek estimators. The package also provides asymptotic confidence intervals to support valid statistical inference.
The K-sample omnibus non-proportional hazards (KONP) tests are powerful non-parametric tests for comparing K (>=2) hazard functions based on right-censored data (Gorfine, Schlesinger and Hsu, 2020, <doi:10.1177/0962280220907355>). These tests are consistent against any differences between the hazard functions of the groups. The KONP tests are often more powerful than other existing tests, especially under non-proportional hazard functions.
LineUp is an interactive technique designed to create, visualize and explore rankings of items based on a set of heterogeneous attributes. This is an htmlwidget wrapper around the JavaScript library LineUp.js. It is designed to be used in R Shiny apps and R Markdown files. Due to an outdated webkit version of RStudio it won't work in the integrated viewer.
Data class for increased interoperability working with spatial-temporal data together with corresponding functions and methods (conversions, basic calculations and basic data manipulation). The class distinguishes between spatial, temporal and other dimensions to facilitate the development and interoperability of tools built for it. Additional features are name-based addressing of data and internal consistency checks (e.g. checking for the right data order in calculations).
Generates functional Magnetic Resonance Imaging (fMRI) time series or 4D data. Some high-level functions are created for fast data generation with only a few arguments and a diversity of functions to define activation and noise. For more advanced users it is possible to use the low-level functions and manipulate the arguments. See Welvaert et al. (2011) <doi:10.18637/jss.v044.i10>.
This framework is devoted to mining numerical association rules through the utilization of nature-inspired algorithms for optimization. Drawing inspiration from the NiaARM Python and the NiaARM Julia packages, this repository introduces the capability to perform numerical association rule mining in the R programming language. See Fister Jr., Iglesias, Galvez, Del Ser, Osaba and Fister (2018) <doi:10.1007/978-3-030-03493-1_9>.
Streamline the management, creation, and formatting of panel data from the Panel Study of Income Dynamics ('PSID') <https://psidonline.isr.umich.edu> using this user-friendly tool. Simply define variable names and input code book details directly from the PSID official website, and this toolbox will efficiently facilitate the data preparation process, transforming raw PSID files into a well-organized format ready for further analysis.