Circumplex models, which organize constructs in a circle around two underlying dimensions, are popular for studying interpersonal functioning, mood/affect, and vocational preferences/environments. This package provides tools for analyzing and visualizing circular data, including scoring functions for relevant instruments and a generalization of the bootstrapped structural summary method from Zimmermann & Wright (2017) <doi:10.1177/1073191115621795> and functions for creating publication-ready tables and figures from the results.
This package provides functions for the clustering of variables around Latent Variables, for 2-way or 3-way data. Each cluster of variables, which may be defined as a local or directional cluster, is associated with a latent variable. External variables measured on the same observations or/and additional information on the variables can be taken into account. A "noise" cluster or sparse latent variables can also be defined.
Providing a set of functions to easily generate and iterate complex networks. The functions can be used to generate realistic networks with a wide range of different clustering, density, and average path length. For more information consult research articles by Amiyaal Ilany and Erol Akcay (2016) <doi:10.1093/icb/icw068> and Ilany and Erol Akcay (2016) <doi:10.1101/026120>, which have inspired many methods in this package.
Synthesizing joint distributions from marginal densities, focusing on controlling key statistical properties such as correlation for continuous data, mutual information for categorical data, and inducing Simpson's Paradox. Generate datasets with specified correlation structures for continuous variables, adjust mutual information between categorical variables, and manipulate subgroup correlations to intentionally create Simpson's Paradox. Joe (1997) <doi:10.1201/b13150> Sklar (1959) <https://en.wikipedia.org/wiki/Sklar%27s_theorem>.
This package provides the necessary functions to identify and extract a selection of already available barcode constructs (Cornils, K. et al. (2014) <doi:10.1093/nar/gku081>) and freely choosable barcode designs from next generation sequence (NGS) data. Furthermore, it offers the possibility to account for sequence errors, the calculation of barcode similarities and provides a variety of visualisation tools (Thielecke, L. et al. (2017) <doi:10.1038/srep43249>).
We provide extensions to the classical dataset "Example 4: Death by the kick of a horse in the Prussian Army" first used by Ladislaus von Bortkeiwicz in his treatise on the Poisson distribution "Das Gesetz der kleinen Zahlen", <DOI:10.1017/S0370164600019453>. As well as an extended time series for the horse-kick death data, we also provide, in parallel, deaths by falling from a horse and by drowning.
This package provides a generalization of the Synth package that is designed for data at a more granular level (e.g., micro-level). Provides functions to construct weights (including propensity score-type weights) and run analyses for synthetic control methods with micro- and meso-level data; see Robbins, Saunders, and Kilmer (2017) <doi:10.1080/01621459.2016.1213634> and Robbins and Davenport (2021) <doi:10.18637/jss.v097.i02>.
This package creates and manages a provenance graph corresponding to the provenance created by the rdtLite package, which collects provenance from R scripts. rdtLite is available on CRAN. The provenance format is an extension of the W3C PROV JSON format (<https://www.w3.org/Submission/2013/SUBM-prov-json-20130424/>). The extended JSON provenance format is described in <https://github.com/End-to-end-provenance/ExtendedProvJson>.
This package provides a collection of easy-to-use tools for regression analysis of survival data with a cure fraction proposed in Su et al. (2022) <doi:10.1177/09622802221108579>. The modeling framework is based on the Cox proportional hazards mixture cure model and the bounded cumulative hazard (promotion time cure) model. The pseudo-observations approach is utilized to assess covariate effects and embedded in the variable selection procedure.
Parametric linkage analysis of monogenic traits in medical pedigrees. Features include singlepoint analysis, multipoint analysis via MERLIN (Abecasis et al. (2002) <doi:10.1038/ng786>), visualisation of log of the odds (LOD) scores and summaries of linkage peaks. Disease models may be specified to accommodate phenocopies, reduced penetrance and liability classes. paramlink2 is part of the pedsuite package ecosystem, presented in Pedigree Analysis in R (Vigeland, 2021, ISBN:9780128244302).
An entirely data-driven cell type annotation tools, which requires training data to learn the classifier, but not biological knowledge to make subjective decisions. It consists of three steps: preprocessing training and test data, model fitting on training data, and cell classification on test data. See Xiangling Ji,Danielle Tsao, Kailun Bai, Min Tsao, Li Xing, Xuekui Zhang.(2022)<doi:10.1101/2022.02.19.481159> for more details.
Modelling the yield curve with some parametric models. The models implemented are: Nelson, C.R., and A.F. Siegel (1987) <doi: 10.1086/296409>, Diebold, F.X. and Li, C. (2006) <doi: 10.1016/j.jeconom.2005.03.005> and Svensson, L.E. (1994) <doi: 10.3386/w4871>. The package also includes the data of the term structure of interest rate of Federal Reserve Bank and European Central Bank.
Casting metadata for REDCap database creation and handling of castellated data using repeated instruments and longitudinal projects in REDCap'. Keeps a focused data export approach, by allowing to only export required data from the database. Also for casting new REDCap databases based on datasets from other sources. Originally forked from the R part of REDCapRITS by Paul Egeler. See <https://github.com/pegeler/REDCapRITS>. REDCap (Research Electronic Data Capture) is a secure, web-based software platform designed to support data capture for research studies, providing 1) an intuitive interface for validated data capture; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for data integration and interoperability with external sources (Harris et al (2009) <doi:10.1016/j.jbi.2008.08.010>; Harris et al (2019) <doi:10.1016/j.jbi.2019.103208>).
This package implements an approximate string matching version of R's native match function. It can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), qgrams (q- gram, cosine, jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided as well. Distances can be computed between character vectors while taking proper care of encoding or between integer vectors representing generic sequences.
The Racket BC (``before Chez'' or ``bytecode'') implementation was the default before Racket 8.0. It uses a compiler written in C targeting architecture-independent bytecode, plus a JIT compiler on most platforms. Racket BC has a different C API than the current default runtime system, Racket CS (based on ``Chez Scheme'').
This package is the normal implementation of Racket BC with a precise garbage collector, 3M (``Moving Memory Manager'').
The fonts provide uppercase formal script letters for use as symbols in scientific and mathematical typesetting (in contrast to the informal script fonts such as that used for the calligraphic symbols in the TeX maths symbol font). The fonts are provided as Metafont source, and as derived Adobe Type 1 format. LaTeX support, for using these fonts in mathematics, is available via one of the packages calrsfs and mathrsfs.
This package implements inferential methods to compare gene lists in terms of their biological meaning as expressed in the GO. The compared gene lists are characterized by cross-tabulation frequency tables of enriched GO items. Dissimilarity between gene lists is evaluated using the Sorensen-Dice index. The fundamental guiding principle is that two gene lists are taken as similar if they share a great proportion of common enriched GO items.
This package provides a set of tools for performing graph theory analysis of brain MRI data. It works with data from a Freesurfer analysis (cortical thickness, volumes, local gyrification index, surface area), diffusion tensor tractography data (e.g., from FSL) and resting-state fMRI data (e.g., from DPABI). It contains a graphical user interface for graph visualization and data exploration, along with several functions for generating useful figures.
This package implements the count splitting methodology from Neufeld et al. (2022) <doi:10.1093/biostatistics/kxac047> and Neufeld et al. (2023) <arXiv:2307.12985>. Intended for turning a matrix of single-cell RNA sequencing counts, or similar count datasets, into independent folds that can be used for training/testing or cross validation. Assumes that the entries in the matrix are from a Poisson or a negative binomial distribution.
Example data sets to run the example problems from causal inference textbooks. Currently, contains data sets for Huntington-Klein, Nick (2021 and 2025) "The Effect" <https://theeffectbook.net>, first and second edition, Cunningham, Scott (2021 and 2025, ISBN-13: 978-0-300-25168-5) "Causal Inference: The Mixtape", and Hernán, Miguel and James Robins (2020) "Causal Inference: What If" <https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/>.
Build graph/network structures using functions for stepwise addition and deletion of nodes and edges. Work with data available in tables for bulk addition of nodes, edges, and associated metadata. Use graph selections and traversals to apply changes to specific nodes or edges. A wide selection of graph algorithms allow for the analysis of graphs. Visualize the graphs and take advantage of any aesthetic properties assigned to nodes and edges.
Generates (U,W) mixture graphs where U is a line graph graphon and W is a dense graphon. Graphons are graph limits and graphon U can be written as sequence of positive numbers adding to 1. Graphs are sampled from U and W and joined randomly to obtain the mixture graph. Given a mixture graph, U can be inferred. Kandanaarachchi and Ong (2025) <doi:10.48550/arXiv.2505.13864>.
Reads annual financial reports including assets, liabilities, dividends history, stockholder composition and much more from Bovespa's DFP, FRE and FCA systems <http://www.b3.com.br/pt_br/produtos-e-servicos/negociacao/renda-variavel/empresas-listadas.htm>. These are web based interfaces for all financial reports of companies traded at Bovespa. The package is specially designed for large scale data importation, keeping a tabular (long) structure for easier processing.
This package provides tools for plotting gene clusters and transcripts by importing data from GenBank, FASTA, and GFF files. It performs BLASTP and MUMmer alignments [Altschul et al. (1990) <doi:10.1016/S0022-2836(05)80360-2>; Delcher et al. (1999) <doi:10.1093/nar/27.11.2369>] and displays results on gene arrow maps. Extensive customization options are available, including legends, labels, annotations, scales, colors, tooltips, and more.