This package provides a method that estimates an IV-optimal individualized treatment rule. An individualized treatment rule is said to be IV-optimal if it minimizes the maximum risk with respect to the putative IV and the set of IV identification assumptions. Please refer to <arXiv:2002.02579> for more details on the methodology and some of the theory underpinning the method. The function IV-PILE() uses functions from the package locClass. Package locClass can be accessed and installed from the R-Forge repository via the following link: <https://r-forge.r-project.org/projects/locclass/>. Alternatively, one can install the package by entering the following in R: install.packages("locClass", repos="<http://R-Forge.R-project.org>").
This package provides a modern and flexible Neo4J driver, allowing you to query data on a Neo4J server and handle the results in R. It is modern in the sense that it can be easily integrated into a data analysis workflow, especially by providing an API that works smoothly with other data analysis and graph packages. It is flexible in the way it returns the results, staying as close as possible to the way Neo4J returns data. That way, you have control over how the results are computed. At the same time, the result is not too complex, so that the "heavy lifting" of data wrangling is not left to the user.
This package provides helper functions to compute linear predictors, time-dependent ROC curves, and Harrell's concordance index for Cox proportional hazards models as described in Therneau (2024) <https://CRAN.R-project.org/package=survival>, Therneau and Grambsch (2000, ISBN:0-387-98784-3), Hung and Chiang (2010) <doi:10.1002/cjs.10046>, Uno et al. (2007) <doi:10.1198/016214507000000149>, Blanche, Dartigues, and Jacqmin-Gadda (2013) <doi:10.1002/sim.5958>, Blanche, Latouche, and Viallon (2013) <doi:10.1007/978-1-4614-8981-8_11>, Harrell et al. (1982) <doi:10.1001/jama.1982.03320430047030>, Peto and Peto (1972) <doi:10.2307/2344317>, Schemper (1992) <doi:10.2307/2349009>, and Uno et al. (2011) <doi:10.1002/sim.4154>.
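As an illustration of the concordance index mentioned above, the following is a minimal Python sketch of Harrell's C for right-censored data. It is not the package's code: the function name harrell_c and its arguments are hypothetical, and pairs with tied observed times are simply skipped, which is one of several possible conventions.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if the subject was censored
    risk_scores: higher value = higher predicted risk (e.g. a Cox linear predictor)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # time corresponds to an observed event (assumption: ties skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A score that perfectly ranks subjects by failure order gives a value of 1, a constant score gives 0.5, and a perfectly reversed ranking gives 0.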
Gcrma adjusts for background intensities in Affymetrix array data, which include optical noise and non-specific binding (NSB). The main function gcrma converts background-adjusted probe intensities to expression measures using the same normalization and summarization methods as the Robust Multiarray Average (RMA). Gcrma uses probe sequence information to estimate probe affinity to NSB. The sequence information is summarized in a more complex way than the simple GC content: the base types (A, T, G or C) at each position along the probe determine the affinity of each probe. The parameters of the position-specific base contributions to the probe affinity are estimated in an NSB experiment in which only NSB but no gene-specific binding is expected.
Inference based on models with or without spatially-correlated random effects, multivariate responses, or non-Gaussian random effects (e.g., Beta). Variation in residual variance (heteroscedasticity) can itself be represented by a mixed-effect model. Both classical geostatistical models (Rousset and Ferdy 2014 <doi:10.1111/ecog.00566>), and Markov random field models on irregular grids (as considered in the INLA package, <https://www.r-inla.org>), can be fitted, with distinct computational procedures exploiting the sparse matrix representations for the latter case and other autoregressive models. Laplace approximations are used for likelihood or restricted likelihood. Penalized quasi-likelihood and other variants discussed in the h-likelihood literature (Lee and Nelder 2001 <doi:10.1093/biomet/88.4.987>) are also implemented.
This package provides a Cairo graphics device that can be used to create high-quality vector (PDF, PostScript and SVG) and bitmap output (PNG, JPEG, TIFF), and high-quality rendering in displays (X11 and Win32). Since it uses the same back-end for all output, copying across formats is WYSIWYG. Files are created without depending on X11 or other external programs. This device supports alpha channel (semi-transparent drawing), and resulting images can contain transparent and semi-transparent regions. It is ideal for use in server environments (file output) and as a replacement for other devices that don't have Cairo's capabilities such as alpha support or anti-aliasing. Backends are modular such that any subset of backends is supported.
This package provides an interface to the MATLAB toolbox Flexible Statistical Data Analysis (FSDA), which is a comprehensive and computationally efficient software package for robust statistics in regression, multivariate and categorical data analysis. The current R version implements tools for regression (forward search, S- and MM-estimation, least trimmed squares (LTS) and least median of squares (LMS)), for multivariate analysis (forward search, S- and MM-estimation), and for cluster analysis and cluster-wise regression. The distinctive feature of our package is the possibility of monitoring the statistics of interest as a function of breakdown point, efficiency or subset size, depending on the estimator. This is accompanied by a rich set of graphical features, such as dynamic brushing and linking, which are particularly useful for exploratory data analysis.
This package provides a model that gives researchers a powerful tool for the classification and study of native corn by aiding in the identification of racial complexes, which are fundamental to Mexico's agriculture and culture. The package was developed from data collected by "Proyecto Global de Maíces Nativos México", which has conducted exhaustive surveys across the country to document the qualitative and quantitative characteristics of different types of native maize. The trained model uses a robust and diverse dataset, enabling it to achieve 80% accuracy in classifying maize racial complexes. The characteristics included in the analysis comprise geographic location, grain and cob colors, and various physical measurements, such as lengths and widths.
This package provides a comprehensive suite of tools for managing, processing, and analyzing data from the IFCB (Imaging FlowCytobot). I R FlowCytobot ('iRfcb') supports quality control, geospatial analysis, and preparation of IFCB data for publication in databases like <https://www.gbif.org>, <https://www.obis.org>, <https://emodnet.ec.europa.eu/en>, <https://shark.smhi.se/en/>, and <https://www.ecotaxa.org>. The package integrates with the MATLAB ifcb-analysis tool, which is described in Sosik and Olson (2007) <doi:10.4319/lom.2007.5.204>, and provides features for working with raw, manually classified, and machine learning-classified image datasets. Key functionalities include image extraction, particle size distribution analysis, taxonomic data handling, and biomass concentration calculations, all essential for plankton research.
The age is estimated by calculating the Dirichlet Normal Energy (DNE) on the whole auricular surface and the apex of the auricular surface. It involves three estimation methods: principal component discriminant analysis (PCQDA) and principal component logistic regression analysis (PCLR) methods, principal component regression analysis with Southeast Asian (A_PCR), and principal component regression analysis with multipopulation (M_PCR). The package is created with the data from the Louis Lopes Collection in Lisbon, the 21st Century Identified Human Remains Collection in Coimbra, the CAL Milano Cemetery Skeletal Collection in Milan, and the skeletal collection at Khon Kaen University (KKU) Human Skeletal Research Centre (HSRC), housed in the Department of Anatomy in the Faculty of Medicine at KKU in Khon Kaen.
Fits Stable Isotope Mixing Models (SIMMs) and is meant as a longer-term replacement for the previous widely used package SIAR. SIMMs are used to infer dietary proportions of organisms consuming various food sources from observations on the stable isotope values taken from the organisms' tissue samples. However, SIMMs can also be used in other scenarios, such as in sediment mixing or the composition of fatty acids. The main functions are simmr_load() and simmr_mcmc(). The two vignettes contain a quick start and a full listing of all the features. The methods used are detailed in the papers Parnell et al. 2010 <doi:10.1371/journal.pone.0009672> and Parnell et al. 2013 <doi:10.1002/env.2221>.
Run mixed-effects models that include weights at every level. The WeMix package fits a weighted mixed model, also known as a multilevel, mixed, or hierarchical linear model (HLM). The weights could be inverse selection probabilities, such as those developed for an education survey where schools are sampled probabilistically, and then students inside of those schools are sampled probabilistically. Although mixed-effects models are already available in R, WeMix is unique in implementing methods for mixed models using weights at multiple levels. Both linear and logit models are supported. Models may have up to three levels. Random effects are estimated using the PIRLS algorithm from lme4pureR (Walker and Bates (2013) <https://github.com/lme4/lme4pureR>).
Monte Carlo simulation framework for different randomized clinical trial designs with a special emphasis on estimators based on covariate adjustment. The package implements regression-based covariate adjustment (Rosenblum & van der Laan (2010) <doi:10.2202/1557-4679.1138>) and a one-step estimator (Van Lancker et al. (2024) <doi:10.48550/arXiv.2404.11150>) for trials with continuous, binary and count outcomes. The estimation of the minimum sample size required to reach a specified statistical power for a given estimator uses bisection to find an initial rough estimate, followed by stochastic approximation (Robbins and Monro (1951) <doi:10.1214/aoms/1177729586>) to improve the estimate, and finally, a grid search to refine the estimate in the neighborhood of the current best solution.
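The bisection stage of the sample-size search described above can be sketched as follows in Python. This is only an illustration under stated assumptions, not the package's implementation: the names approx_power and min_sample_size are hypothetical, and a closed-form normal-approximation power for a two-sided two-sample z-test stands in for the Monte Carlo power estimate, so the stochastic approximation and grid search stages are not shown.

```python
from statistics import NormalDist


def approx_power(n_per_arm, effect=0.5, sd=1.0, alpha=0.05):
    """Normal-approximation power of a two-sided two-sample z-test.

    Hypothetical stand-in for a simulated power estimate; it is
    deterministic and increases with n_per_arm.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect / (sd * (2 / n_per_arm) ** 0.5)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def min_sample_size(power_fn, target=0.8, start=2):
    """Smallest n with power_fn(n) >= target, assuming power_fn is monotone."""
    lo = hi = start
    # Double the upper bound until it reaches the target power.
    while power_fn(hi) < target:
        lo, hi = hi, hi * 2
    # Bisect: hi always meets the target; lo (if below hi) does not.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if power_fn(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi
```

With a noisy, simulation-based power function the monotonicity assumption holds only approximately, which is why a rough bisection estimate would then need refinement by other means.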
This package provides a Low Rank Correction Variational Bayesian algorithm for high-dimensional multi-source heterogeneous quantile linear models. More details have been written up in a paper submitted to the journal Statistics in Medicine, and the details of variational Bayesian methods can be found in Ray and Szabo (2021) <doi:10.1080/01621459.2020.1847121>. It simultaneously performs parameter estimation and variable selection. The algorithm supports two model settings: (1) local models, where variable selection is only applied to homogeneous coefficients, and (2) global models, where variable selection is also performed on heterogeneous coefficients. Two forms of parameter estimation are output: one is the standard variational Bayesian estimation, and the other is the variational Bayesian estimation corrected with low-rank adjustment.
Data and utilities for estimating pediatric blood pressure percentiles by sex, age, and optionally height (stature) as described in Martin et al. (2022) <doi:10.1001/jamanetworkopen.2022.36918>. Blood pressure percentiles for children under one year of age come from Gemelli et al. (1990) <doi:10.1007/BF02171556>. Estimates of blood pressure percentiles for children at least one year of age are informed by data from the National Heart, Lung, and Blood Institute (NHLBI) and the Centers for Disease Control and Prevention (CDC) <doi:10.1542/peds.2009-2107C> or from Lo et al. (2013) <doi:10.1542/peds.2012-1292>. The flowchart for selecting the informing data source comes from Martin et al. (2022) <doi:10.1542/hpeds.2021-005998>.
This package provides tools for creating and working with survey replicate weights, extending functionality of the survey package from Lumley (2004) <doi:10.18637/jss.v009.i08>. Implements bootstrap methods for complex surveys, including the generalized survey bootstrap as described by Beaumont and Patak (2012) <doi:10.1111/j.1751-5823.2011.00166.x>. Methods are provided for applying nonresponse adjustments to both full-sample and replicate weights as described by Rust and Rao (1996) <doi:10.1177/096228029600500305>. Implements methods for sample-based calibration described by Opsomer and Erciulescu (2021) <https://www150.statcan.gc.ca/n1/pub/12-001-x/2021002/article/00006-eng.htm>. Diagnostic functions are included to compare weights and weighted estimates from different sets of replicate weights.
Implementation of SPECS, your favourite Single-Equation Penalized Error-Correction Selector developed in Smeekes and Wijler (2021) <doi:10.1016/j.jeconom.2020.07.021>. SPECS provides a fully automated estimation procedure for large and potentially (co)integrated datasets. The dataset in levels is converted to a conditional error-correction model, either by the user or by means of the functions included in this package, and various specialised forms of penalized regression can be applied to the model. Automated options for initializing and selecting a sequence of penalties, as well as the construction of penalty weights via an initial estimator, are available. Moreover, the user may choose from a number of pre-specified deterministic configurations to further simplify the model building process.
An efficient tool for fitting nested mixture models based on a shared set of atoms via Markov Chain Monte Carlo and variational inference algorithms. Specifically, the package implements the common atoms model (Denti et al., 2023), its finite version (similar to D'Angelo et al., 2023), and a hybrid finite-infinite model (D'Angelo and Denti, 2024). All models implement univariate nested mixtures with Gaussian kernels equipped with a normal-inverse gamma prior distribution on the parameters. Additional functions are provided to help analyze the results of the fitting procedure. References: Denti, Camerlenghi, Guindani, Mira (2023) <doi:10.1080/01621459.2021.1933499>, D'Angelo, Canale, Yu, Guindani (2023) <doi:10.1111/biom.13626>, D'Angelo, Denti (2024) <doi:10.1214/24-BA1458>.
The _CAGEr_ package identifies transcription start sites (TSS) and their usage frequency from CAGE (Cap Analysis Gene Expression) sequencing data. It normalises raw CAGE tag counts, clusters TSSs into tag clusters (TC) and aggregates them across multiple CAGE experiments to construct consensus clusters (CC) representing the promoterome. CAGEr provides functions to profile expression levels of these clusters by cumulative expression and rarefaction analysis, and outputs the plots in ggplot2 format for further facetting and customisation. After clustering, CAGEr performs analyses of promoter width and detects differential usage of TSSs (promoter shifting) between samples. CAGEr also exports its data as genome browser tracks, and as R objects for downstream expression analysis by other Bioconductor packages such as DESeq2, CAGEfightR, or seqArchR.
Construction and smart selection of Gaussian process models for analysis of computer experiments with emphasis on treatment of functional inputs that are regularly sampled. This package offers: (i) flexible modeling of functional-input regression problems through the fairly general Gaussian process model; (ii) built-in dimension reduction for functional inputs; (iii) heuristic optimization of the structural parameters of the model (e.g., active inputs, kernel function, type of distance). An in-depth tutorial on the use of funGp is provided in Betancourt et al. (2024) <doi:10.18637/jss.v109.i05>, and metamodeling background is provided in Betancourt et al. (2020) <doi:10.1016/j.ress.2020.106870>. The algorithm for structural parameter optimization is described in <https://hal.science/hal-02532713>.
Includes the ga.lts() function that estimates LTS (Least Trimmed Squares) parameters using genetic algorithms and C-steps. ga.lts() constructs a genetic algorithm to form a basic subset and iterates C-steps as defined in Rousseeuw and van Driessen (2006) to calculate the cost value of the LTS criterion. OLS (Ordinary Least Squares) regression is known to be sensitive to outliers: a single outlying observation can change the values of the estimated parameters. LTS is a resistant estimator even when the number of outliers is up to half of the data. This package is for estimating the LTS parameters with lower bias and variance in a reasonable time. Version >=1.3 includes the function medmad for fast outlier detection in linear regression.
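To make the C-step idea concrete, here is a minimal Python sketch for simple linear regression (one predictor plus intercept). It is not the ga.lts() implementation: the genetic algorithm that proposes starting subsets is omitted, and the function names ols_line, c_step and lts_from_subset are hypothetical. A C-step fits OLS on the current subset and then keeps the h observations with the smallest squared residuals; repeating this until the subset stops changing reaches a local minimum of the LTS criterion.

```python
def ols_line(points):
    """Closed-form OLS fit y = a + b*x for a list of (x, y) pairs.

    Assumes the x values are not all identical.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    return my - b * mx, b


def c_step(points, subset, h):
    """One concentration step: fit on subset, keep the h best-fitting points."""
    a, b = ols_line([points[i] for i in subset])
    order = sorted(
        range(len(points)),
        key=lambda i: (points[i][1] - a - b * points[i][0]) ** 2,
    )
    return sorted(order[:h])


def lts_from_subset(points, subset, h, max_iter=50):
    """Iterate C-steps from a starting subset until it stops changing."""
    subset = sorted(subset)
    for _ in range(max_iter):
        new_subset = c_step(points, subset, h)
        if new_subset == subset:
            break
        subset = new_subset
    a, b = ols_line([points[i] for i in subset])
    # LTS cost: sum of squared residuals over the retained h points.
    cost = sum((points[i][1] - a - b * points[i][0]) ** 2 for i in subset)
    return a, b, cost, subset
```

Because the result depends on the starting subset, a search over many starting subsets (for example by a genetic algorithm, as the package description states) is needed to approach the global LTS solution.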
Structure and formatting requirements for clinical trial table and listing outputs vary between pharmaceutical companies. junco provides additional tooling for use alongside the rtables, rlistings and tern packages when creating table and listing outputs. While motivated by the specifics of Johnson and Johnson Clinical and Statistical Programming's table and listing shells, junco provides functionality that is general and reusable. Major features include a) alternative and extended statistical analyses beyond what tern supports for use in standard safety and efficacy tables, b) a robust production-grade Rich Text Format (RTF) exporter for both tables and listings, c) structural support for spanning column headers and risk difference columns in tables, and d) robust font-aware automatic column width algorithms for both listings and tables.
The MARSS package provides maximum-likelihood parameter estimation for constrained and unconstrained linear multivariate autoregressive state-space (MARSS) models, including partially deterministic models. MARSS models are a class of dynamic linear model (DLM) and vector autoregressive (VAR) model. Fitting is available via Expectation-Maximization (EM), BFGS (using optim), and TMB (using the marssTMB companion package). Functions are provided for parametric and innovations bootstrapping, Kalman filtering and smoothing, model selection criteria including bootstrap AICb, confidence intervals via the Hessian approximation or bootstrapping, and all conditional residual types. See the user guide for examples of dynamic factor analysis, dynamic linear models, outlier and shock detection, and multivariate AR-p models. Online workshops (lectures, eBook, and computer labs) are available at <https://atsa-es.github.io/>.
Identifying maturation stages across young athletes is paramount for talent identification. Furthermore, the concept of biobanding, or grouping athletes based on their biological development instead of their chronological age, has been widely researched. The goal of this package is to help professionals working in the field of strength & conditioning and talent ID obtain common maturation metrics, as well as quickly visualize this information via several plotting options. For the methods behind the computed maturation metrics implemented in this package, refer to Khamis, H. J., & Roche, A. F. (1994) <https://pubmed.ncbi.nlm.nih.gov/7936860/>, Mirwald, R.L et al., (2002) <https://pubmed.ncbi.nlm.nih.gov/11932580/> and Cumming, Sean P. et al., (2017) <doi:10.1519/SSC.0000000000000281>.