This package provides a scale-based normalization (SCBN) method to identify genes with differential expression between different species. It uses the available knowledge of conserved orthologous genes and a hypothesis testing framework to detect differentially expressed orthologous genes. The methods in this package are described in the article "A statistical normalization method and differential expression analysis for RNA-seq data between different species" by Yan Zhou, Jiadi Zhu, Tiejun Tong, Junhui Wang, Bingqing Lin, Jun Zhang (2018, pending publication).
This is an implementation of design methods for binomial reliability demonstration tests (BRDTs) with failure count data. The acceptance decision uncertainty of BRDT has been quantified and the impacts of the uncertainty on related reliability assurance activities such as reliability growth (RG) and warranty services (WS) are evaluated. This package is associated with the work from the published paper "Optimal Binomial Reliability Demonstration Tests Design under Acceptance Decision Uncertainty" by Suiyao Chen et al. (2020) <doi:10.1080/08982112.2020.1757703>.
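As a rough illustration of the quantity underlying a binomial demonstration test, the sketch below computes the probability of passing a test that allows at most c failures among n units, assuming each unit survives independently with the same reliability. The function name and the example plan are illustrative assumptions and are not taken from the package.

```python
from math import comb

def acceptance_probability(n, c, reliability):
    """Probability of observing at most c failures among n tested units.

    Assumes independent units that each survive with probability `reliability`.
    """
    p_fail = 1.0 - reliability
    return sum(
        comb(n, k) * p_fail**k * reliability**(n - k)
        for k in range(c + 1)
    )

# Hypothetical plan: 22 units on test, zero failures allowed.
print(acceptance_probability(22, 0, 0.90))
```

Evaluating this function over a range of reliabilities shows how likely a given plan is to accept a product of a given quality, which is the starting point for reasoning about acceptance decision uncertainty.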
This package provides a self-contained set of methods to aid clinical trial safety investigators, statisticians and researchers in the early detection of adverse events using groupings by body-system or system organ class. This work was supported by the Engineering and Physical Sciences Research Council (UK) (EPSRC) [award reference 1521741] and Frontier Science (Scotland) Ltd. The package title c212 refers to the original EPSRC-funded project, which was named CASE 2/12.
Neural networks have potential in forestry modelling. This package is designed to create and assess artificial-intelligence-based neural networks with varying architectures for predicting the volume of forest trees from two input features, height and diameter at breast height, which are the key factors in predicting volume. Developing and validating an efficient volume-prediction neural network model is therefore necessary. This package has been developed using the algorithm of Tabassum et al. (2022) <doi:10.18805/ag.D-5555>.
This package provides a lightweight, dependency-free data engine for R with a grammar for tabular and time-series manipulation. Built entirely on Base R, m61r offers a fluent, chainable API inspired by modern data tools while prioritizing memory efficiency and speed. It includes optimized versions of common data verbs such as filtering, mutation, grouped aggregation, and approximate temporal joins, making it an ideal choice for environments where external dependencies are restricted or where performance in pure R is required.
Ing and Lai (2011) <doi:10.5705/ss.2010.081> proposed a high-dimensional model selection procedure that comprises three steps: orthogonal greedy algorithm (OGA), high-dimensional information criterion (HDIC), and Trim. The first two steps, OGA and HDIC, are used to sequentially select input variables and determine stopping rules, respectively. The third step, Trim, is used to delete irrelevant variables remaining after the second step. This package fits a high-dimensional linear regression model via OGA+HDIC+Trim.
This package provides a set of functions for 1) stratum length analysis along a chosen direction, 2) fast estimation of continuous-lag spatial Markov chain model parameters and probability computation (also for large data sets), 3) drawing transition probability maps and transiograms, 4) simulation methods for categorical random fields. More details on the methodology are discussed in Sartore (2013) <doi:10.32614/RJ-2013-022> and Sartore et al. (2016) <doi:10.1016/j.cageo.2016.06.001>.
Fits the regularization path of regression models (linear and logistic) with additively combined penalty terms. All possible combinations with Least Absolute Shrinkage and Selection Operator (LASSO), Smoothly Clipped Absolute Deviation (SCAD), Minimax Concave Penalty (MCP) and Exponential Penalty (EP) are supported. This includes Sparse Group LASSO (SGL), Sparse Group SCAD (SGS), Sparse Group MCP (SGM) and Sparse Group EP (SGE). For more information, see Buch, G., Schulz, A., Schmidtmann, I., Strauch, K., & Wild, P. S. (2024) <doi:10.1002/bimj.202200334>.
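To make three of the penalties named above concrete, here is a minimal sketch of the LASSO, SCAD and MCP penalty values for a single coefficient, using their standard textbook definitions. The function names and the default shape parameters (a = 3.7, gamma = 3.0) are assumptions for illustration; the exponential penalty and the way the package combines penalties are not shown.

```python
def lasso_penalty(t, lam):
    """LASSO penalty: grows linearly in |t|."""
    return lam * abs(t)

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty: linear near zero, quadratic blend, then constant."""
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))
    return lam**2 * (a + 1) / 2

def mcp_penalty(t, lam, gamma=3.0):
    """MCP penalty: concave up to gamma * lam, then constant."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t**2 / (2 * gamma)
    return gamma * lam**2 / 2

# Compare the three penalties on a small and a large coefficient.
for coef in (0.5, 10.0):
    print(coef, lasso_penalty(coef, 1.0), scad_penalty(coef, 1.0), mcp_penalty(coef, 1.0))
```

Unlike the LASSO penalty, SCAD and MCP level off for large coefficients, so large effects are penalized less heavily than under a purely linear penalty.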
Fit Bayesian hierarchical models of animal abundance and occurrence via the rstan package, the R interface to the Stan C++ library. Supported models include single-season occupancy, dynamic occupancy, and N-mixture abundance models. Covariates on model parameters are specified using a formula-based interface similar to that of the unmarked package, while also allowing for estimation of random slope and intercept terms. References: Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>; Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
This package provides support for transformations of numeric aggregates between statistical classifications (e.g. occupation or industry categorisations) using the Crossmaps framework. Implements classes for representing transformations between a source and target classification as graph structures, and methods for validating and applying crossmaps to transform data collected under the source classification into data indexed using the target classification codes. Documentation about the Crossmaps framework is provided in the included vignettes and in Huang (2024, <doi:10.48550/arXiv.2406.14163>).
This package contains extensions to ggplot2.
Geoms: geom_table, geom_plot and geom_grob add insets to plots using native data coordinates, while geom_table_npc, geom_plot_npc and geom_grob_npc do the same using npc coordinates through new aesthetics npcx and npcy.
Statistics: select observations based on 2D density.
Positions: radial nudging away from a center point and nudging away from a line or curve.
rebar3 is an Erlang build tool that makes it easy to compile and test Erlang applications, port drivers and releases.
rebar3 is a self-contained Erlang script, so it's easy to distribute or even embed directly in a project. Where possible, rebar3 uses standard Erlang/OTP conventions for project structures, thus minimizing the amount of build configuration work. rebar3 also provides dependency management, enabling application writers to easily re-use common libraries from a variety of locations (git, hg, etc.).
The SEQC/MAQC-III Consortium has produced benchmark RNA-seq data for the assessment of RNA sequencing technologies and data analysis methods (Nat Biotechnol, 2014). Billions of sequence reads have been generated from ten different sequencing sites. This package contains the summarized read count data for ~2000 sequencing libraries. It also includes all the exon-exon junctions discovered from the study. TaqMan RT-PCR data for ~1000 genes and ERCC spike-in sequence data are included in this package as well.
An interactive shiny application for performing non-compartmental analysis (NCA) on pre-clinical and clinical pharmacokinetic data. The package builds on PKNCA for core estimators and provides interactive visualizations, CDISC outputs (ADNCA, PP, ADPP) and configurable TLGs (tables, listings, and graphs). Typical use cases include exploratory analysis, validation, reporting or teaching/demonstration of NCA methods. Methods and core estimators are described in Denney, Duvvuri, and Buckeridge (2015) "Simple, Automatic Noncompartmental Analysis: The PKNCA R Package" <doi:10.1007/s10928-015-9432-2>.
This package provides the ASUS procedure for estimating a high-dimensional sparse parameter in the presence of auxiliary data that encode side information on sparsity. It is a robust data combination procedure in the sense that, even when pooling non-informative auxiliary data, ASUS would be at least as efficient as competing soft-thresholding-based methods that do not use auxiliary data. For more information, please see the paper "Adaptive Sparse Estimation with Side Information" by Banerjee, Mukherjee and Sun (JASA 2020).
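For readers unfamiliar with the soft-thresholding estimators mentioned above, the following minimal sketch shows the basic operator applied to a noisy observation vector. The function name, the data, and the fixed threshold are illustrative assumptions; this is not the ASUS procedure itself and it uses no side information.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become exactly zero."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

# Hypothetical noisy observations of a sparse parameter vector.
observations = [3.2, -0.4, 0.9, -2.5]
print([soft_threshold(x, 1.0) for x in observations])
```

Small observations are set to zero and large ones are shrunk by the threshold, which is what makes the estimator suitable for sparse parameters.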
This package provides a comprehensive approach for identifying and estimating change points in multivariate time series through various statistical methods. Implements the multiple change point detection methodology from Ryan & Killick (2023) <doi:10.1080/00401706.2023.2183261> and a novel estimation methodology from Fotopoulos et al. (2023) <doi:10.1007/s00362-023-01495-0> generalized to fit the detection methodologies. Performs both detection and estimation of change points, providing visualization and summary information of the estimation process for each detected change point.
The Fill-Mask Association Test (FMAT) <doi:10.1037/pspa0000396> is an integrative, probability-based social computing method using Masked Language Models to measure conceptual associations (e.g., attitudes, biases, stereotypes, social norms, cultural values) as propositional semantic representations in natural language. Supported language models include BERT <doi:10.48550/arXiv.1810.04805> and its variants available at Hugging Face <https://huggingface.co/models?pipeline_tag=fill-mask>. Methodological references and installation guidance are provided at <https://psychbruce.github.io/FMAT/>.
This package provides a general, flexible framework for estimating parameters and computing the empirical sandwich variance estimator from a set of unbiased estimating equations (i.e., M-estimation in the vein of Stefanski & Boos (2002) <doi:10.1198/000313002753631330>). All examples from Stefanski & Boos (2002) are published in the corresponding Journal of Statistical Software paper "The Calculus of M-Estimation in R with geex" by Saul & Hudgens (2020) <doi:10.18637/jss.v092.i02>. Also provides an API to compute finite-sample variance corrections.
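The idea of M-estimation with a sandwich variance can be sketched for the simplest case, the sample mean, where the estimating function is psi(y, theta) = y - theta. The code below is an illustrative pure-Python sketch under that assumption; the function name is hypothetical and it does not use or reproduce the geex API.

```python
def m_estimate_mean(y):
    """M-estimate of the mean with its empirical sandwich variance.

    Estimating function: psi(y_i, theta) = y_i - theta.
    """
    n = len(y)
    # Root of sum(psi(y_i, theta)) = 0 is the sample mean.
    theta = sum(y) / n
    # "Bread" A: average of -d(psi)/d(theta), which equals 1 here.
    bread = 1.0
    # "Meat" B: average of psi squared, evaluated at the estimate.
    meat = sum((yi - theta) ** 2 for yi in y) / n
    # Sandwich variance of theta: A^-1 * B * A^-1 / n.
    variance = meat / (bread * bread) / n
    return theta, variance

estimate, variance = m_estimate_mean([1.0, 2.0, 3.0, 4.0])
print(estimate, variance)
```

For richer models the root generally has to be found numerically and the bread and meat become matrices, but the structure of the calculation stays the same.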
We provide an R tool for computation and nonparametric plug-in estimation of Highest Density Regions (HDRs) and general level sets in the directional setting. Concretely, circular and spherical HDRs can be reconstructed from a data sample following Saavedra-Nieves and Crujeiras (2021) <doi:10.1007/s11634-021-00457-4>. This library also contains two real datasets in the circular and spherical settings. The first one concerns a problem from animal orientation studies and the second one is related to earthquake occurrences.
Heterogeneous multi-task feature learning is a data integration method that conducts joint feature selection across multiple related data sets with different distributions. The algorithm can combine different types of learning tasks, including linear regression, Huber regression, adaptive Huber regression, and logistic regression. A modified version of the Bayesian Information Criterion (BIC) is provided to measure model performance. The package is based on Yuan Zhong, Wei Xu, and Xin Gao (2022) <https://www.fields.utoronto.ca/talk-media/1/53/65/slides.pdf>.
Rapid satellite data streams in operational applications have clear benefits for monitoring land cover, especially when information can be delivered as fast as surface conditions change. Over the past decade, remote sensing has become a key tool for monitoring and predicting environmental variables by using satellite data. This package presents the main applications in remote sensing for land surface monitoring and land cover mapping (soil, vegetation, water...). See Tomlinson, C.J., Chapman, L., Thornes, E., Baker, C. (2011) <doi:10.1002/met.287>.
Modified functions of the package pcalg and some additional functions to run the PC and the FCI (Fast Causal Inference) algorithm for constraint-based causal discovery in incomplete and multiply imputed datasets. Foraita R, Friemel J, Günther K, Behrens T, Bullerdiek J, Nimzyk R, Ahrens W, Didelez V (2020) <doi:10.1111/rssa.12565>; Andrews RM, Bang CW, Didelez V, Witte J, Foraita R (2021) <doi:10.1093/ije/dyae113>; Witte J, Foraita R, Didelez V (2022) <doi:10.1002/sim.9535>.
Computes pseudo-realizations from the posterior distribution of a Gaussian Process (GP) with the method described in Azzimonti et al. (2016) <doi:10.1137/141000749>. The realizations are obtained from simulations of the field at a few well-chosen points that minimize the expected distance in measure between the true excursion set of the field and the approximate one. Also implements an R interface for (the main function of) Distance Transform of sampled Functions (<https://cs.brown.edu/people/pfelzens/dt/index.html>).
Two stage curvature identification with machine learning for causal inference in settings where instrumental variable regression is not suitable because of potentially invalid instrumental variables. Based on Guo and Buehlmann (2022) "Two Stage Curvature Identification with Machine Learning: Causal Inference with Possibly Invalid Instrumental Variables" <doi:10.48550/arXiv.2203.12808>. The vignette is available in Carl, Emmenegger, Bühlmann and Guo (2025) "TSCI: Two Stage Curvature Identification for Causal Inference with Invalid Instruments in R" <doi:10.18637/jss.v114.i07>.