C++ classes for sparse matrix methods, including an implementation of the sparse LDL decomposition of symmetric matrices and solvers described by Timothy A. Davis (2016) <https://fossies.org/linux/SuiteSparse/LDL/Doc/ldl_userguide.pdf>. Provides a set of C++ classes for basic sparse matrix specification and linear algebra, and a class implementing the sparse LDL decomposition and solvers. See <https://github.com/samuel-watson/SparseChol> for details.
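A minimal base-R sketch of the factorization idea (not the package's own C++ API): for a symmetric positive-definite matrix, the L, D, and L' factors can be recovered from the Cholesky factor.

A <- matrix(c(4, 2, 2, 3), nrow = 2)      # symmetric positive-definite example
R <- chol(A)                              # upper-triangular factor, A = t(R) %*% R
D <- diag(diag(R)^2)                      # diagonal matrix D
L <- t(R) %*% diag(1 / diag(R))           # unit lower-triangular L
max(abs(L %*% D %*% t(L) - A))            # ~0, i.e. A = L D L'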
This package provides functions for fitting multi-state semi-Markov models to longitudinal data. A parametric maximum likelihood estimation method adapted to deal with the Exponential, Weibull and Exponentiated Weibull distributions is considered. Right-censoring can be taken into account, and both constant and time-varying covariates can be included using a Cox proportional hazards model. Reference: A. Krol and P. Saint-Pierre (2015) <doi:10.18637/jss.v066.i06>.
Phenotypic analysis of field trials using mixed models with and without spatial components. One of a series of statistical genetic packages for streamlining the analysis of typical plant breeding experiments developed by Biometris. Some functions are designed to be used in conjunction with the R package asreml for the ASReml software, which can be obtained upon purchase from VSN International (<https://vsni.co.uk/software/asreml-r/>).
This package provides probability density, cumulative distribution, quantile, and random number generation functions for the Vasicek distribution. In addition, two functions are available for fitting Generalized Additive Models for Location, Scale and Shape introduced by Rigby and Stasinopoulos (2005, <doi:10.1111/j.1467-9876.2005.00510.x>). Some functions are written in C++ using Rcpp, developed by Eddelbuettel and Francois (2011, <doi:10.18637/jss.v040.i08>).
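A hedged base-R sketch of the Vasicek distribution on (0, 1) in its mean/correlation parameterisation (the helper names below are illustrative, not necessarily the package's exports):

pvas <- function(x, p, rho) pnorm((sqrt(1 - rho) * qnorm(x) - qnorm(p)) / sqrt(rho))   # CDF
dvas <- function(x, p, rho) {                                                          # density
  z <- (sqrt(1 - rho) * qnorm(x) - qnorm(p)) / sqrt(rho)
  sqrt((1 - rho) / rho) * exp(0.5 * (qnorm(x)^2 - z^2))
}
rvas <- function(n, p, rho) pnorm((qnorm(p) + sqrt(rho) * rnorm(n)) / sqrt(1 - rho))   # random draws
pvas(0.10, p = 0.05, rho = 0.2)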
Imputation of missing numerical outcomes for a longitudinal trial with protocol deviations. The package uses distinct treatment arm-based assumptions for the unobserved data, following the general algorithm of Carpenter, Roger, and Kenward (2013) <doi:10.1080/10543406.2013.834911>, and the causal model of White, Royes and Best (2020) <doi:10.1080/10543406.2019.1684308>. Sensitivity analyses to departures from these assumptions can be performed using the delta method of Roger. The program uses the same algorithm as the mimix Stata package written by Suzie Cro, with additional code for the causal model and the delta method. The reference-based methods are jump to reference (J2R), copy increments in reference (CIR), copy reference (CR), and the causal model, all of which require a reference treatment arm to be specified. Other methods are missing at random (MAR) and last mean carried forward (LMCF). Individual-specific imputation methods (and their reference groups) can also be specified.
Implementations of several robust nonparametric two-sample tests for location or scale differences. The test statistics are based on robust location and scale estimators, e.g. the sample median or the Hodges-Lehmann estimators, as described in Fried & Dehling (2011) <doi:10.1007/s10260-011-0164-1>. The p-values can be computed via the permutation principle, the randomization principle, or by using the asymptotic distributions of the test statistics under the null hypothesis, which ensures (approximate) distribution independence of the test decision. To test for a difference in scale, the location tests are applied to transformed observations; see Fried (2012) <doi:10.1016/j.csda.2011.02.012>. Random noise on a small range can be added to the original observations in order to maintain the significance level on data from discrete distributions. The location tests assume homoscedasticity, and the scale tests require the location parameters to be zero.
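A small base-R sketch of the permutation principle with a robust statistic (difference of sample medians), illustrating the general idea rather than this package's interface:

perm_median_test <- function(x, y, B = 2000) {
  obs <- median(x) - median(y)                # observed location difference
  z <- c(x, y); n <- length(x)
  perm <- replicate(B, {
    idx <- sample(length(z), n)               # random relabelling of the pooled sample
    median(z[idx]) - median(z[-idx])
  })
  mean(abs(perm) >= abs(obs))                 # two-sided permutation p-value
}
set.seed(1)
perm_median_test(rnorm(20), rnorm(20, mean = 1))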
This package provides a random-effects stochastic model that allows quick detection of clonal dominance events from clonal tracking data collected in gene therapy studies. Starting from the Ito-type equation describing the dynamics of cell duplication, death and differentiation at the clonal level, its local linear approximation is taken as the base model. The parameters of the base model, which are inferred using a maximum likelihood approach, are assumed to be shared across the clones. Although this assumption makes inference easier, in some cases it can be too restrictive and does not account for possible scenarios of clonal dominance. Therefore, the base model is extended by introducing random effects for the clones. In this extended formulation, the dynamic parameters are estimated using a tailor-made expectation-maximization algorithm. Further details on the methods can be found in L. Del Core et al. (2022) <doi:10.1101/2022.05.31.494100>.
JASPAR is an open-access database containing manually curated, non-redundant transcription factor (TF) binding profiles for TFs across six taxonomic groups. In this 9th release, the CORE collection has been expanded with 341 new profiles (148 for plants, 101 for vertebrates, 85 for urochordates, and 7 for insects), which corresponds to a 19% expansion over the previous release. To search this database, please use the package TFBSTools (>= 1.31.2).
Set of functions for analyzing Atomic Force Microscope (AFM) force-distance curves. It allows one to locate the contact and unbinding points, perform baseline correction, estimate the Young's modulus, fit up to two exponential decay functions to a stress-relaxation / creep experiment, and obtain adhesion energies. These operations can be done either on a single F-d curve or on a set of F-d curves in batch mode.
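As a rough illustration of the curve-fitting step, a single exponential decay can be fitted to a simulated relaxation trace with base R nls(); the package itself fits up to two such terms and handles batches of curves:

t <- seq(0, 10, by = 0.05)
f <- 2 + 1.5 * exp(-t / 1.2) + rnorm(length(t), sd = 0.02)   # simulated force-relaxation trace
fit <- nls(f ~ f0 + a1 * exp(-t / tau1),
           start = list(f0 = 1, a1 = 1, tau1 = 1))
coef(fit)                                                    # close to f0 = 2, a1 = 1.5, tau1 = 1.2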
This package provides an accessible and robust implementation of core Bayesian Maximum Entropy (BME) methodologies for spatial prediction. It enables the systematic integration of heterogeneous data sources, including both hard data (precise measurements) and soft interval data (bounded or uncertain observations), while incorporating prior knowledge and supporting variogram-based spatial modeling. The BME methodology is described in Christakos (1990) <doi:10.1007/BF00890661> and Serre and Christakos (1999) <doi:10.1007/s004770050029>.
This package provides functions for the clustering of variables around Latent Variables, for 2-way or 3-way data. Each cluster of variables, which may be defined as a local or directional cluster, is associated with a latent variable. External variables measured on the same observations and/or additional information on the variables can be taken into account. A "noise" cluster or sparse latent variables can also be defined.
Circumplex models, which organize constructs in a circle around two underlying dimensions, are popular for studying interpersonal functioning, mood/affect, and vocational preferences/environments. This package provides tools for analyzing and visualizing circular data, including scoring functions for relevant instruments, a generalization of the bootstrapped structural summary method from Zimmermann & Wright (2017) <doi:10.1177/1073191115621795>, and functions for creating publication-ready tables and figures from the results.
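A closed-form sketch of the structural summary cosine model s_i = e + a*cos(theta_i - d) fitted to eight made-up octant scores (illustrative only; the package's version adds bootstrapping and publication-ready output):

theta  <- seq(0, 315, by = 45) * pi / 180     # octant angles in radians
scores <- c(0.8, 0.5, 0.1, -0.3, -0.6, -0.4, 0.0, 0.4)
e  <- mean(scores)                            # elevation
xc <- 2 * mean(scores * cos(theta))           # cosine component
yc <- 2 * mean(scores * sin(theta))           # sine component
a  <- sqrt(xc^2 + yc^2)                       # amplitude
d  <- atan2(yc, xc) %% (2 * pi)               # angular displacement
c(elevation = e, amplitude = a, displacement_deg = d * 180 / pi)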
This package provides a set of functions to easily generate and iterate complex networks. The functions can be used to generate realistic networks with a wide range of clustering, density, and average path length values. For more information, consult the research articles by Amiyaal Ilany and Erol Akcay (2016) <doi:10.1093/icb/icw068> and Ilany and Akcay (2016) <doi:10.1101/026120>, which have inspired many methods in this package.
Synthesizing joint distributions from marginal densities, focusing on controlling key statistical properties such as correlation for continuous data, mutual information for categorical data, and inducing Simpson's Paradox. Generate datasets with specified correlation structures for continuous variables, adjust mutual information between categorical variables, and manipulate subgroup correlations to intentionally create Simpson's Paradox. References: Joe (1997) <doi:10.1201/b13150>; Sklar (1959) <https://en.wikipedia.org/wiki/Sklar%27s_theorem>.
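A minimal Gaussian-copula sketch of the underlying idea, assuming nothing about the package's own interface: correlate latent normals, then push them through arbitrary marginal quantile functions.

set.seed(42)
n <- 5000; rho <- 0.7
z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(matrix(c(1, rho, rho, 1), 2))   # correlated normals
u <- pnorm(z)                          # copula scale: uniform marginals
x <- qexp(u[, 1], rate = 1)            # exponential marginal
y <- qgamma(u[, 2], shape = 2)         # gamma marginal
cor(x, y, method = "spearman")         # rank correlation close to the target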
Enables R users to run large language models locally using GGUF model files and the llama.cpp inference engine. Provides a complete R interface for loading models, generating text completions, and streaming responses in real time. Supports local inference without requiring cloud APIs or internet connectivity, ensuring complete data privacy and control. Based on the llama.cpp project by Georgi Gerganov (2023) <https://github.com/ggml-org/llama.cpp>.
This package provides the necessary functions to identify and extract a selection of already available barcode constructs (Cornils, K. et al. (2014) <doi:10.1093/nar/gku081>) and freely choosable barcode designs from next-generation sequencing (NGS) data. Furthermore, it offers the possibility to account for sequence errors and to calculate barcode similarities, and provides a variety of visualisation tools (Thielecke, L. et al. (2017) <doi:10.1038/srep43249>).
We provide extensions to the classical dataset "Example 4: Death by the kick of a horse in the Prussian Army" first used by Ladislaus von Bortkiewicz in his treatise on the Poisson distribution "Das Gesetz der kleinen Zahlen" <DOI:10.1017/S0370164600019453>. As well as an extended time series for the horse-kick death data, we also provide, in parallel, deaths by falling from a horse and by drowning.
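For orientation, the often-quoted aggregate horse-kick counts (deaths per corps-year over 200 corps-years) are well described by a Poisson distribution; a quick base-R check:

deaths <- 0:4
freq   <- c(109, 65, 22, 3, 1)                         # corps-years with 0..4 deaths
lambda <- sum(deaths * freq) / sum(freq)               # Poisson mean, about 0.61
cbind(observed = freq,
      expected = round(sum(freq) * dpois(deaths, lambda), 1))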
This package provides a generalization of the Synth package that is designed for data at a more granular level (e.g., micro-level). Provides functions to construct weights (including propensity score-type weights) and run analyses for synthetic control methods with micro- and meso-level data; see Robbins, Saunders, and Kilmer (2017) <doi:10.1080/01621459.2016.1213634> and Robbins and Davenport (2021) <doi:10.18637/jss.v097.i02>.
This package creates and manages a provenance graph corresponding to the provenance created by the rdtLite package, which collects provenance from R scripts. rdtLite is available on CRAN. The provenance format is an extension of the W3C PROV JSON format (<https://www.w3.org/Submission/2013/SUBM-prov-json-20130424/>). The extended JSON provenance format is described in <https://github.com/End-to-end-provenance/ExtendedProvJson>.
Parametric linkage analysis of monogenic traits in medical pedigrees. Features include singlepoint analysis, multipoint analysis via MERLIN (Abecasis et al. (2002) <doi:10.1038/ng786>), visualisation of log of the odds (LOD) scores and summaries of linkage peaks. Disease models may be specified to accommodate phenocopies, reduced penetrance and liability classes. paramlink2 is part of the pedsuite package ecosystem, presented in Pedigree Analysis in R (Vigeland, 2021, ISBN:9780128244302).
This package provides a collection of easy-to-use tools for regression analysis of survival data with a cure fraction, as proposed in Su et al. (2022) <doi:10.1177/09622802221108579>. The modeling framework is based on the Cox proportional hazards mixture cure model and the bounded cumulative hazard (promotion time cure) model. The pseudo-observations approach is utilized to assess covariate effects and is embedded in the variable selection procedure.
An entirely data-driven cell type annotation tool, which requires training data to learn the classifier but no biological knowledge to make subjective decisions. It consists of three steps: preprocessing of the training and test data, model fitting on the training data, and cell classification on the test data. See Xiangling Ji, Danielle Tsao, Kailun Bai, Min Tsao, Li Xing, and Xuekui Zhang (2022) <doi:10.1101/2022.02.19.481159> for more details.
Modelling the yield curve with some parametric models. The models implemented are: Nelson, C.R., and A.F. Siegel (1987) <doi:10.1086/296409>, Diebold, F.X. and Li, C. (2006) <doi:10.1016/j.jeconom.2005.03.005>, and Svensson, L.E. (1994) <doi:10.3386/w4871>. The package also includes term structure of interest rate data from the Federal Reserve Bank and the European Central Bank.
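As a sketch, the Nelson-Siegel curve in one common parameterisation can be written directly in R (this shows the functional form only, not the package's fitting routine; parameter values are made up):

ns_yield <- function(tau, beta0, beta1, beta2, lambda) {
  g <- (1 - exp(-tau / lambda)) / (tau / lambda)        # loading on the slope factor
  beta0 + beta1 * g + beta2 * (g - exp(-tau / lambda))  # level + slope + curvature
}
maturities <- c(0.25, 1, 2, 5, 10, 30)                  # years
ns_yield(maturities, beta0 = 0.04, beta1 = -0.02, beta2 = 0.01, lambda = 1.5)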
Casting metadata for REDCap database creation and handling of castellated data using repeated instruments and longitudinal projects in REDCap. Keeps a focused data export approach by allowing export of only the required data from the database. Can also be used for casting new REDCap databases based on datasets from other sources. Originally forked from the R part of REDCapRITS by Paul Egeler; see <https://github.com/pegeler/REDCapRITS>. REDCap (Research Electronic Data Capture) is a secure, web-based software platform designed to support data capture for research studies, providing 1) an intuitive interface for validated data capture; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for data integration and interoperability with external sources (Harris et al. (2009) <doi:10.1016/j.jbi.2008.08.010>; Harris et al. (2019) <doi:10.1016/j.jbi.2019.103208>).