Implementation of penalized regression with second-generation p-values for variable selection. The algorithm can handle linear regression, GLM, and Cox regression. S3 methods print(), summary(), coef(), predict(), and plot() are available for the algorithm. Technical details can be found in Zuo et al. (2021) <doi:10.1080/00031305.2021.1946150>.
This package provides a convenient framework for aggregating and disaggregating continuously varying parameters (for example, case fatality ratio as a function of age) for proper parametrization of lower-resolution compartmental models (for example, with broad age categories) and subsequent upscaling of model outputs to high resolution (for example, as needed when calculating age-sensitive measures like years of life lost).
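As a rough illustration of the aggregation step described above, the following Python sketch collapses a per-age parameter into broad age categories using a population-weighted mean. The function name `aggregate_parameter`, its arguments, and the choice of population weighting are assumptions made for this example; they are not taken from the package, which may weight differently.

```python
from typing import List, Sequence


def aggregate_parameter(
    values: Sequence[float],
    weights: Sequence[float],
    breaks: Sequence[int],
) -> List[float]:
    """Collapse per-age values into categories [breaks[i], breaks[i+1]).

    values[k] is the parameter at age k, weights[k] is the (hypothetical)
    population at age k. Each category gets the weighted mean of its ages.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    result = []
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        total_weight = sum(weights[lo:hi])
        if total_weight <= 0:
            raise ValueError(f"no population weight in ages [{lo}, {hi})")
        weighted_sum = sum(v * w for v, w in zip(values[lo:hi], weights[lo:hi]))
        result.append(weighted_sum / total_weight)
    return result


# Illustrative numbers only: a ratio that rises with age, four single-year ages.
ratio_by_age = [0.0, 0.0, 1.0, 1.0]
population_by_age = [1, 1, 1, 3]
by_category = aggregate_parameter(ratio_by_age, population_by_age, [0, 2, 4])
```

Weighting by population keeps the expected number of events in each broad category equal to the sum over its single-year ages, which is one common reason to aggregate this way.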
This package implements the "Residual (Sur)Realism" algorithm described by Stefanski (2007) <doi:10.1198/000313007X190079> to generate datasets that reveal hidden images or messages in their residual plots. It offers both predefined datasets and tools to embed custom text or images into residual structures, allowing users to create intriguing visual demonstrations for teaching model diagnostics.
This package provides a collection of functions for data analysis with two-by-two contingency tables. The package provides tools to compute measures of effect (odds ratio, risk ratio, and risk difference), calculate impact numbers and attributable fractions, and perform hypothesis testing. Statistical analysis methods are oriented towards epidemiological investigation of relationships between exposures and outcomes.
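The three measures of effect named above follow directly from the four cell counts of a two-by-two table. The Python sketch below uses the standard textbook definitions; the class `TwoByTwo` and the function names are hypothetical and do not reflect the package's interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of an exposure-by-outcome table (hypothetical layout)."""
    a: int  # exposed, outcome present
    b: int  # exposed, outcome absent
    c: int  # unexposed, outcome present
    d: int  # unexposed, outcome absent


def risk_exposed(t: TwoByTwo) -> float:
    return t.a / (t.a + t.b)


def risk_unexposed(t: TwoByTwo) -> float:
    return t.c / (t.c + t.d)


def odds_ratio(t: TwoByTwo) -> float:
    # Cross-product ratio: odds of outcome in exposed over odds in unexposed.
    return (t.a * t.d) / (t.b * t.c)


def risk_ratio(t: TwoByTwo) -> float:
    return risk_exposed(t) / risk_unexposed(t)


def risk_difference(t: TwoByTwo) -> float:
    return risk_exposed(t) - risk_unexposed(t)


# Illustrative counts only, not data from any study.
table = TwoByTwo(a=30, b=70, c=10, d=90)
measures = (odds_ratio(table), risk_ratio(table), risk_difference(table))
```

The sketch omits confidence intervals and zero-cell handling, both of which a real analysis would need.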
Interface to easily access data via the United States Department of Agriculture (USDA)'s Livestock Mandatory Reporting ('LMR') Data API at <https://mpr.datamart.ams.usda.gov/>. The downloaded data can be saved for later off-line use. Also provides relevant information and metadata for each of the input variables needed for sending the data inquiry.
Integrated tools to support rigorous and well-documented data harmonization based on Maelstrom Research guidelines. The package includes functions to assess and prepare input elements, apply specified processing rules to generate harmonized datasets, validate data processing and identify processing errors, and document and summarize harmonized outputs. The harmonization process is defined and structured by two key user-generated documents: the DataSchema (specifying the list of harmonized variables to generate across datasets) and the Data Processing Elements (specifying the input elements and processing algorithms to generate harmonized variables in DataSchema formats). The package was developed to address key challenges of retrospective data harmonization in epidemiology (as described in Fortier et al. (2017) <doi:10.1093/ije/dyw075>) but can be used for any data harmonization initiative.
A Bayesian credible interval is interpreted with respect to posterior probability, and this interpretation is far more intuitive than that of a frequentist confidence interval. However, standard highest-density intervals can be wide due to between-subjects variability and tend to hide within-subject effects, rendering their relationship with the Bayes factor less clear in within-subject (repeated-measures) designs. This issue can be addressed by using within-subject intervals in within-subject designs. This package integrates four such methods: the Wei-Nathoo-Masson (2023) <doi:10.3758/s13423-023-02295-1>, the Loftus-Masson (1994) <doi:10.3758/BF03210951>, the Nathoo-Kilshaw-Masson (2018) <doi:10.1016/j.jmp.2018.07.005>, and the Heck (2019) <doi:10.31234/osf.io/whp8t> interval estimates.
This package provides a data-driven test for the assumptions of quantile normalization using raw data such as objects that inherit from eSet (e.g. ExpressionSet, MethylSet). Group-level information about each sample (such as Tumor / Normal status) must also be provided because the test assesses whether there are global differences in the distributions between the user-defined groups.
Sending functions to remote processes can be wasteful of resources because they carry their environments with them. With this package, it is easy to create functions that are isolated from their environment. These isolated functions, also called crates, print to the console with their total size and can be easily tested locally before being sent to a remote.
This package translates microarray expression data into metadata of reduced dimension. It provides various sample-centered and group-centered visualizations, sample similarity analyses and functional enrichment analyses. The underlying SOM algorithm combines feature clustering, multidimensional scaling and dimension reduction, along with strong visualization capabilities. It enables extraction and description of functional expression modules inherent in the data.
This package provides a single-sample pathway perturbation testing method for RNA-seq data. The method propagates changes in gene expression down gene-set topologies to compute single-sample directional pathway perturbation scores that reflect the potential direction of change. Perturbation scores can be used to test the significance of pathway perturbation at both individual-sample and treatment levels.
Makes summary tables for descriptive statistics and selects explanatory variables automatically in various regression models. Supports linear models, generalized linear models and Cox proportional hazards models. Generates publication-ready tables and plots summarizing the results of regression analysis. The tables and plots can be exported in "HTML", "pdf('LaTeX')", "docx('MS Word')" and "pptx('MS PowerPoint')" documents.
Offers several functions for Configural Frequencies Analysis (CFA), which is a useful statistical tool for the analysis of multiway contingency tables. CFA was introduced by G. A. Lienert as 'Konfigurations Frequenz Analyse - KFA'. Lienert, G. A. (1971). Die Konfigurationsfrequenzanalyse: I. Ein neuer Weg zu Typen und Syndromen. Zeitschrift für Klinische Psychologie und Psychotherapie, 19(2), 99-115.
Dependent censoring regression models for multivariate survival data. These models are based on extensions of the frailty models, capable of accommodating the dependence between failure and censoring times, with Weibull and piecewise exponential marginal distributions. Theoretical details regarding the models implemented in the package can be found in Schneider et al. (2019) <doi:10.1002/bimj.201800391>.
This package provides various statistical methods for evaluating Individualized Treatment Rules using data from randomized experiments. The provided metrics include Population Average Value (PAV), Population Average Prescription Effect (PAPE), and Area Under Prescription Effect Curve (AUPEC). It also provides tools to analyze Individualized Treatment Rules under budget constraints. For details, see Imai and Li (2019) <arXiv:1905.05389>.
This package provides a system to facilitate designing comparative (and non-comparative) experiments using the grammar of experimental designs <https://emitanaka.org/edibble-book/>. An experimental design is treated as an intermediate, mutable object that is built progressively by fundamental experimental components like units, treatments, and their relation. The system aids in experimental planning, management and workflow.
This package provides a collection of utility functions for working with Year Month Day objects. Includes functions for fast parsing of numeric and character input based on algorithms described in Hinnant, H. (2021) <https://howardhinnant.github.io/date_algorithms.html> as well as a branchless calculation of leap years by Jerichaux (2025) <https://stackoverflow.com/a/79564914>.
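To give a sense of the date algorithms referenced above, here is a Python sketch of the `days_from_civil` calculation from Hinnant's page, which converts a proleptic Gregorian year, month and day into a count of days since 1970-01-01. It is a transcription for illustration, not the package's code, and it does not attempt to reproduce the branchless leap-year formula attributed to Jerichaux.

```python
def days_from_civil(year: int, month: int, day: int) -> int:
    """Days since 1970-01-01 for a proleptic Gregorian date.

    Follows the structure of Hinnant's days_from_civil. Python's floor
    division already rounds toward negative infinity, so the explicit
    negative-year adjustment of the C++ original is not needed here.
    """
    # Treat January and February as months 11 and 12 of the previous year,
    # so that the leap day falls at the very end of the shifted year.
    y = year - (1 if month <= 2 else 0)
    era = y // 400                      # 400-year cycle index
    year_of_era = y - era * 400         # in [0, 399]
    shifted_month = month - 3 if month > 2 else month + 9  # March = 0
    day_of_year = (153 * shifted_month + 2) // 5 + day - 1  # in [0, 365]
    day_of_era = (
        year_of_era * 365
        + year_of_era // 4
        - year_of_era // 100
        + day_of_year
    )                                   # in [0, 146096]
    # 146097 days per 400-year era; 719468 shifts the epoch to 1970-01-01.
    return era * 146097 + day_of_era - 719468


epoch_offset = days_from_civil(1970, 1, 1)
```

Shifting the year to start in March is what removes the need for a month-length lookup table: the expression `(153 * m + 2) // 5` yields the cumulative day count at the start of each shifted month.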
Extends fitdist() (from 'fitdistrplus') by adding the Anderson-Darling test ad.test() (from 'ADGofTest') and the Kolmogorov-Smirnov test ks.test(), trying the distributions from the 'stats' package by default. A second function fits mixed distributions, which are split by unsupervised learning with the Mclust() function (from 'mclust').
Read Swiss time series data from the KOF Data API, <https://datenservice.kof.ethz.ch>. The API provides macroeconomic time series data, mostly about Switzerland. The package itself is a set of wrappers around the KOF Datenservice API. The kofdata package is able to consume public information as well as data that requires an API token.
Access business registration data from the Dutch Chamber of Commerce (Kamer van Koophandel, KvK) through their official API <https://developers.kvk.nl/>. Search for companies by name, location, or registration number. Retrieve detailed business profiles, establishment information, and company name histories. Built on httr2 for robust API interaction with automatic pagination, error handling, and usage tracking.
The mFilter package implements several time series filters useful for smoothing and extracting trend and cyclical components of a time series. The routines are commonly used in economics and finance, but they should also be of interest in other areas. Currently, Christiano-Fitzgerald, Baxter-King, Hodrick-Prescott, Butterworth, and trigonometric regression filters are included in the package.
Statistical framework for comparing sets of trees using hypothesis testing methods. Designed for transmission trees, phylogenetic trees, and directed acyclic graphs (DAGs), the package implements chi-squared tests to compare edge frequencies between sets and PERMANOVA to analyse topological dissimilarities with customisable distance metrics, following Anderson (2001) <doi:10.1111/j.1442-9993.2001.01070.pp.x>.
Software to aid in modeling and analyzing mass-spectrometry-based proteome melting data. Quantitative data is imported and normalized and thermal behavior is modeled at the protein level. Methods exist for normalization, modeling, visualization, and export of results. For a general introduction to MS-based thermal profiling, see Savitski et al. (2014) <doi:10.1126/science.1255784>.
This package performs network meta-analysis using integrated nested Laplace approximations ('INLA'), as described in Guenhan, Held, and Friede (2018) <doi:10.1002/jrsm.1285>. Includes methods to assess the heterogeneity and inconsistency in the network. Contains more than ten different network meta-analysis datasets. The 'INLA' package can be obtained from <https://www.r-inla.org>.