Estimates ordered probit switching regression models, which are Heckman-type selection models with an ordinal selection and continuous outcomes. Different model specifications are allowed for each treatment/regime. For more details on the method, see Wang & Mokhtarian (2024) <doi:10.1016/j.tra.2024.104072> or Chiburis & Lokshin (2007) <doi:10.1177/1536867X0700700202>.
This package provides a multiway method to decompose a tensor (array) of any order, as a generalisation of SVD also supporting non-identity metrics and penalisations. 2-way SVD with these extensions is also available. The package also includes some other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.
This work extends the state space model for Poisson count data (the Poisson-Gamma model) towards a semiparametric specification. As in generalized additive models (GAM), cubic splines are used for covariate smoothing. The semiparametric models are fitted by an iterative process that combines likelihood maximization with the backfitting algorithm.
Sparse redundancy analysis for high dimensional (biomedical) data. Directional multivariate analysis to express the maximum variance in the predicted data set by a linear combination of variables of the predictive data set. Implemented in a partial least squares framework; for more details, see Csala et al. (2017) <doi:10.1093/bioinformatics/btx374>.
This package provides functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.
Provides a novel topology-based pathway enrichment analysis that integrates the global position of the nodes and the topological properties of the pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Functions are also provided to obtain the latest pathway information needed to perform pathway enrichment analysis with this method.
Calculate point estimates of and valid confidence intervals for nonparametric, algorithm-agnostic variable importance measures in high and low dimensions, using flexible estimators of the underlying regression functions. For more information about the methods, please see Williamson et al. (Biometrics, 2020), Williamson et al. (JASA, 2021), and Williamson and Feng (ICML, 2020).
Frequentist sequential meta-analysis based on Trial Sequential Analysis (TSA) as programmed in Java by the Copenhagen Trial Unit (CTU). The primary function is the calculation of group sequential designs for meta-analysis, to be used for planning and analysis of both prospective and retrospective sequential meta-analyses to preserve type-I-error control under sequential testing. RTSA includes tools for sample size and trial size calculation for meta-analysis and core meta-analysis methods such as fixed-effect and random-effects models and forest plots. TSA is described in Wetterslev et al. (2008) <doi:10.1016/j.jclinepi.2007.03.013>. The methods for deriving the group sequential designs are based on Jennison and Turnbull (1999, ISBN:9780849303166).
Low-rank matrix decompositions are fundamental tools that are widely used for data analysis, dimension reduction, and data compression. Classically, highly accurate deterministic matrix algorithms are used for this task. However, the emergence of large-scale data has severely challenged our computational ability to analyze big data. Randomness has been demonstrated to be an effective strategy to quickly produce approximate answers to familiar problems such as the singular value decomposition (SVD). This package provides several randomized matrix algorithms such as the randomized singular value decomposition (rsvd), randomized principal component analysis (rpca), randomized robust principal component analysis (rrpca), randomized interpolative decomposition (rid), and the randomized CUR decomposition (rcur). In addition, several plot functions are provided.
This package provides an R API to the Open Source Geometry Engine (GEOS) library and a vector format with which to efficiently store GEOS geometries. High-performance functions to extract information from, calculate relationships between, and transform geometries are provided. Finally, facilities to import and export geometry vectors to other spatial formats are provided.
This package is for genomic regions processing using command line tools such as BEDTools, BEDOPS and Tabix. These tools offer scalable and efficient utilities to perform genome arithmetic, e.g., indexing, formatting and merging. The bedr package's API enhances access to these tools and offers additional utilities for genomic regions processing.
Content-preserving transformations of PDF files such as split, combine, and compress. This package interfaces directly to the qpdf C++ API and does not require any command line utilities. Note that qpdf does not read actual content from PDF files: to extract text and data you need the pdftools package.
This package contains tools for testing and analyzing time series data and for fitting popular time series models such as ARIMA, Moving Average and Holt-Winters. Most functions also provide clear, SAS-like output; for example, identify, estimate and forecast mirror the statements of the same names in PROC ARIMA in SAS.
Package binr (pronounced as "binner") provides algorithms for cutting numerical values exhibiting a potentially highly skewed distribution into evenly distributed groups (bins). This functionality can be applied for binning discrete values, such as counts, as well as for discretization of continuous values, for example, during generation of features used in machine learning algorithms.
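As a rough illustration of cutting skewed values into evenly populated bins, the following minimal sketch assigns bin labels by rank. It is not the binr algorithm: the function name `equal_frequency_bins` and the rule of keeping tied values in the same bin are assumptions made for this example.

```python
def equal_frequency_bins(values, n_bins):
    """Assign each value a bin label in 0..n_bins-1 so bins hold
    roughly equal counts. Tied values always share a bin (an
    assumption of this sketch), so bins may be uneven with many ties.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    n = len(values)
    # Indices of the values in ascending order.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    first_rank = 0  # rank at which the current distinct value first appears
    for rank, idx in enumerate(order):
        if rank > 0 and values[idx] != values[order[rank - 1]]:
            first_rank = rank
        # Map the rank proportionally onto the bin range.
        labels[idx] = first_rank * n_bins // n
    return labels


# A heavily right-skewed sample: equal-width bins would put almost
# everything in the first bin, while rank-based bins stay balanced.
skewed = [1, 1, 2, 3, 5, 8, 13, 21, 340, 5000]
labels = equal_frequency_bins(skewed, 5)
print(labels)
```

Because the labels depend only on ranks, the extreme values 340 and 5000 do not stretch the bin boundaries the way they would with equal-width cuts.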
An all-encompassing R toolkit designed to streamline the process of calling various bioinformatics software and then performing data analysis and visualization in R. With blit, users can easily integrate a wide array of bioinformatics command line tools into their workflows, leveraging the power of R for sophisticated data manipulation and graphical representation.
Learning the structure of graphical models from datasets with thousands of variables. More information about the research papers detailing the theory behind Chordalysis is available at <http://www.francois-petitjean.com/Research> (KDD 2016, SDM 2015, ICDM 2014, ICDM 2013). The R package development site is <https://github.com/HerrmannM/Monash-ChoR>.
Efficient methods for computing distance covariance and relevant statistics. See Székely et al. (2007) <doi:10.1214/009053607000000505>; Székely and Rizzo (2013) <doi:10.1016/j.jmva.2013.02.012>; Székely and Rizzo (2014) <doi:10.1214/14-AOS1255>; Huo and Székely (2016) <doi:10.1080/00401706.2015.1054435>.
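For orientation, the sample distance covariance can be computed directly from its definition with double-centered pairwise distance matrices. The sketch below is that naive quadratic-time version for univariate samples, not the efficient algorithms the package provides; the function names are hypothetical.

```python
import math


def _centered_distances(x):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    # The matrix is symmetric, so row means equal column means.
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance (V-statistic form)."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2


def distance_correlation(x, y):
    """Sample distance correlation; 0.0 if either sample is constant."""
    v_xy = distance_covariance_sq(x, y)
    v_xx = distance_covariance_sq(x, x)
    v_yy = distance_covariance_sq(y, y)
    if v_xx * v_yy == 0:
        return 0.0
    return math.sqrt(max(v_xy, 0.0) / math.sqrt(v_xx * v_yy))


x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2 * v + 1 for v in x]  # exact linear relationship
dcor_linear = distance_correlation(x, y)
dcor_constant = distance_correlation(x, [3.0] * len(x))
```

This version needs O(n^2) time and memory because it stores both distance matrices, which is the cost that faster methods are designed to avoid.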
Interface for Rcpp users to dlib <http://dlib.net>, which is a C++ toolkit containing machine learning algorithms and computer vision tools. It is used in a wide range of domains including robotics, embedded devices, mobile phones, and large high performance computing environments. This package allows R users to use dlib through Rcpp.
Automatic differentiation is achieved by using dual numbers without providing hand-coded gradient functions. The output value of a mathematical function is returned with the values of its exact first derivative (or gradient). For more details see Baydin, Pearlmutter, Radul, and Siskind (2018) <https://jmlr.org/papers/volume18/17-468/17-468.pdf>.
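The dual-number idea can be sketched in a few lines: each value carries its derivative along, and every arithmetic operation updates both. The `Dual` class and the `derivative` helper below are illustrative names for this sketch and are not the package's API; only addition, multiplication and sine are covered.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number val + der * eps with eps**2 == 0."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    """Sine extended to dual numbers via the chain rule."""
    x = _lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def derivative(f, x):
    """Return (f(x), f'(x)) by seeding the input derivative with 1."""
    out = f(Dual(x, 1.0))
    return out.val, out.der


# f(x) = x^2 + 3x, so f(2) = 10 and f'(2) = 7.
value, slope = derivative(lambda x: x * x + 3 * x, 2.0)
print(value, slope)
```

No gradient function is written by hand here: the derivative falls out of evaluating the ordinary expression on `Dual` inputs, and it is exact up to floating-point rounding rather than a finite-difference approximation.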
Enhances the R help system with a fuzzy search and preview interface, pseudo-postfix operators, and more. The `?.` pseudo-postfix operator and the `?` prefix operator display documents and contents (source or structure) of objects simultaneously to help users understand the objects. The `?p` pseudo-postfix operator displays package documents, and is shorter than help(package = foo).
This package provides a C++ API for routinely used numerical tools such as integration, root-finding, and optimization, where function arguments are given as lambdas. This facilitates Rcpp programming, enabling the development of R-like code in C++ where functions can be defined on the fly and use variables in the surrounding environment.
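The pattern of passing a function defined on the fly, which reads a variable from the surrounding scope, can be illustrated with a small root-finder. This sketch is in Python rather than C++, and `bisect_root` is a hypothetical helper written for this example, not part of the package.

```python
def bisect_root(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f in [lo, hi] by bisection.

    Requires f(lo) and f(hi) to have opposite signs (or one to be zero).
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        # Keep the half-interval on which the sign change occurs.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


# The lambda captures `target` from the enclosing scope, so the
# solver itself needs no extra parameter-passing machinery.
target = 2.0
root = bisect_root(lambda x: x * x - target, 0.0, 2.0)
print(root)
```

The solver only ever sees a one-argument callable; any additional data the objective needs travels inside the closure.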
GEE estimation of the parameters in mean structures with possible correlation between the outcomes. User-specified mean link and variance functions are allowed, along with observation weighting. The M in the name geeM is meant to emphasize the use of the Matrix package, which allows for an implementation based fully in R.
This package provides a light-weight, dependency-free, application programming interface (API) to access system-level Git <https://git-scm.com/downloads> commands from within R. Contains wrappers and defaults for common data science workflows as well as Zsh <https://github.com/ohmyzsh/ohmyzsh> plugin aliases. A generalized API syntax is also available.
Efficient sampling from high-dimensional truncated Gaussian distributions, or multivariate truncated normal (MTN). Techniques include zigzag Hamiltonian Monte Carlo as in Akihiko Nishimura, Zhenyu Zhang and Marc A. Suchard (2024) <doi:10.1080/01621459.2024.2395587>, and harmonic Monte Carlo in Ari Pakman and Liam Paninski (2014) <doi:10.1080/10618600.2013.788448>.