This package performs genomic prediction of hybrid performance using eight GS methods: GBLUP, BayesB, RKHS, PLS, LASSO, Elastic net, XGBoost and LightGBM. GBLUP: genomic best linear unbiased prediction, RKHS: reproducing kernel Hilbert space, PLS: partial least squares regression, LASSO: least absolute shrinkage and selection operator, XGBoost: extreme gradient boosting, LightGBM: light gradient boosting machine. It also provides fast cross-validation and a mating design scheme for the training population (Xu S et al (2016) <doi:10.1111/tpj.13242>; Xu S (2017) <doi:10.1534/g3.116.038059>).
Computes the minimum sample size required for the development of a new multivariable prediction model using the criteria proposed by Riley et al. (2018) <doi:10.1002/sim.7992>. pmsampsize can be used to calculate the minimum sample size for the development of models with continuous, binary or survival (time-to-event) outcomes. Riley et al. (2018) <doi:10.1002/sim.7992> lay out a series of criteria the sample size should meet. These aim to minimise overfitting and to ensure precise estimation of key parameters in the prediction model.
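As an illustration of one such criterion, the sketch below assumes the shrinkage criterion for binary or time-to-event outcomes takes the form n = p / ((S - 1) ln(1 - R2_CS / S)), where p is the number of candidate parameters, S the targeted shrinkage factor and R2_CS the anticipated Cox-Snell R-squared. The function name and the input values are hypothetical and are not part of pmsampsize; the other criteria are not covered.

```python
import math


def shrinkage_criterion_n(n_parameters, r2_cs, shrinkage=0.9):
    """Minimum n under the assumed shrinkage criterion (hypothetical helper)."""
    if not 0 < r2_cs < shrinkage < 1:
        raise ValueError("need 0 < r2_cs < shrinkage < 1")
    # Assumed form: n = p / ((S - 1) * ln(1 - R2_CS / S)).
    n = n_parameters / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage))
    return math.ceil(n)


# Illustrative inputs only: 24 candidate parameters, anticipated R2_CS of 0.288.
print(shrinkage_criterion_n(24, 0.288))
```

The final sample size would be the largest value across all criteria, so this single number is a lower bound at best.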
This is an implementation of the algorithm described in Section 3 of Hosszejni and Frühwirth-Schnatter (2022) <doi:10.48550/arXiv.2211.00671>. The algorithm is used to verify that the counting rule CR(r,1) holds for the sparsity pattern of the transpose of a factor loading matrix. As detailed in Section 2 of the same paper, if CR(r,1) holds, then the idiosyncratic variances are generically identified. If CR(r,1) does not hold, then we do not know whether the idiosyncratic variances are identified or not.
This is a package for parsing Affymetrix files (CDF, CEL, CHP, BPMAP, BAR). It provides methods for fast and memory-efficient parsing of Affymetrix files using the Affymetrix Fusion SDK. Both ASCII- and binary-based files are supported. Currently, there are methods for reading chip definition files (CDF) and cell intensity files (CEL). These files can be read either in full or in part. For example, probe signals from a few probesets can be extracted very quickly from a set of CEL files into a convenient list structure.
This package provides an interface for working with large matrices stored in files, not in computer memory. It supports multiple non-character data types (double, integer, logical and raw) of various sizes (e.g. 8 and 4 byte real values). Access to parts of the matrix is done by indexing, exactly as with usual R matrices. It supports very large matrices; the package has been tested on multi-terabyte matrices. It allows for more than 2^32 rows or columns, and allows for quick addition of extra columns to a filematrix.
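The idea of indexing a matrix that lives in a file can be sketched in a few lines. This is a minimal illustration in Python, not the package's implementation: the class name, the column-major layout and the 8-byte little-endian doubles are all assumptions made for the example.

```python
import mmap
import os
import struct
import tempfile


class FileBackedMatrix:
    """Hypothetical fixed-size matrix of 8-byte doubles kept in a file."""

    _ITEM = struct.Struct("<d")  # assumed element type: little-endian double

    def __init__(self, path, nrow, ncol):
        if nrow < 1 or ncol < 1:
            raise ValueError("matrix must have at least one row and column")
        self.nrow, self.ncol = nrow, ncol
        size = nrow * ncol * self._ITEM.size
        with open(path, "wb") as fh:
            fh.truncate(size)  # zero-filled backing file
        self._fh = open(path, "r+b")
        self._map = mmap.mmap(self._fh.fileno(), size)

    def _offset(self, i, j):
        if not (0 <= i < self.nrow and 0 <= j < self.ncol):
            raise IndexError("index out of range")
        # Assumed column-major layout: whole columns are contiguous on disk.
        return (j * self.nrow + i) * self._ITEM.size

    def __getitem__(self, index):
        i, j = index
        return self._ITEM.unpack_from(self._map, self._offset(i, j))[0]

    def __setitem__(self, index, value):
        i, j = index
        self._ITEM.pack_into(self._map, self._offset(i, j), float(value))

    def close(self):
        self._map.close()
        self._fh.close()


with tempfile.TemporaryDirectory() as tmp:
    m = FileBackedMatrix(os.path.join(tmp, "demo.bin"), nrow=3, ncol=2)
    m[2, 1] = 4.5
    print(m[2, 1], m[0, 0])
    m.close()
```

With a column-major layout, appending a column only extends the file, which is one plausible reason adding columns can be cheap.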
This package provides a set of tools for the statistical analysis of data using:
normal linear models;
generalized linear models;
negative binomial regression models as an alternative to Poisson regression models in the presence of overdispersion;
beta-binomial and random-clumped binomial regression models as an alternative to binomial regression models in the presence of overdispersion;
zero-inflated and zero-altered regression models to deal with excess zeros in count data;
generalized nonlinear models;
generalized estimating equations for cluster-correlated data.
Rolling and expanding window approaches to assessing abundance-based early warning signals, non-equilibrium resilience measures, and machine learning. See Dakos et al. (2012) <doi:10.1371/journal.pone.0041010>, Deb et al. (2022) <doi:10.1098/rsos.211475>, Drake and Griffen (2010) <doi:10.1038/nature09389>, Ushio et al. (2018) <doi:10.1038/nature25504> and Weinans et al. (2021) <doi:10.1038/s41598-021-87839-y> for methodological details. Graphical presentations of the outputs are also provided for clear and publishable figures. Visit the EWSmethods website for more information and tutorials.
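The difference between the two window types can be shown with variance, a commonly used early warning indicator. This is a minimal sketch in Python; the function names are hypothetical and do not correspond to the package's interface.

```python
from statistics import variance


def rolling_variance(series, window):
    """Sample variance over a fixed-width window slid along the series."""
    if window < 2 or window > len(series):
        raise ValueError("window must be between 2 and len(series)")
    return [variance(series[i:i + window])
            for i in range(len(series) - window + 1)]


def expanding_variance(series, burn_in=2):
    """Sample variance over a window that grows from the start of the series."""
    if burn_in < 2 or burn_in > len(series):
        raise ValueError("burn_in must be between 2 and len(series)")
    return [variance(series[:k]) for k in range(burn_in, len(series) + 1)]


abundance = [1.0, 2.0, 3.0, 4.0]  # made-up abundance series
print(rolling_variance(abundance, window=2))
print(expanding_variance(abundance))
```

A rolling window forgets old observations, while an expanding window keeps them all, so the two respond differently to a late change in variability.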
Calculates fundamental IO matrices (Leontief, Wassily W. (1951) <doi:10.1038/scientificamerican1051-15>); within period analysis via various rankings and coefficients (Sonis and Hewings (2006) <doi:10.1080/09535319200000013>, Blair and Miller (2009) <ISBN:978-0-521-73902-3>, Antras et al (2012) <doi:10.3386/w17819>, Hummels, Ishii, and Yi (2001) <doi:10.1016/S0022-1996(00)00093-3>); across period analysis with impact analysis (Dietzenbacher, van der Linden, and Steenge (2006) <doi:10.1080/09535319300000017>, Sonis, Hewings, and Guo (2006) <doi:10.1080/09535319600000002>); and a variety of table operators.
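The central object in input-output analysis is the Leontief inverse L = (I - A)^-1, where A is the matrix of technical coefficients, so that total output is x = L d for final demand d. The sketch below works this out for a two-sector economy in closed form. The function names and the coefficient values are made up for illustration and are not taken from the package.

```python
def leontief_inverse_2x2(a):
    """Closed-form (I - A)^-1 for a 2x2 technical coefficient matrix A."""
    (a11, a12), (a21, a22) = a
    m11, m12, m21, m22 = 1 - a11, -a12, -a21, 1 - a22  # entries of I - A
    det = m11 * m22 - m12 * m21
    if det == 0:
        raise ValueError("I - A is singular")
    return [[m22 / det, -m12 / det],
            [-m21 / det, m11 / det]]


def total_output(leontief, final_demand):
    """Total output x = L d needed to satisfy final demand d."""
    return [sum(l * d for l, d in zip(row, final_demand)) for row in leontief]


A = [[0.2, 0.3],
     [0.4, 0.1]]  # made-up technical coefficients
L = leontief_inverse_2x2(A)
print(total_output(L, [60.0, 30.0]))
```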
Life and Fertility Tables are appropriate to study the dynamics of arthropod populations. This package provides utilities for constructing Life Tables and Fertility Tables, related demographic parameters, and some simple graphs of interest. It also offers functions to transform the obtained data into a known format for better manipulation. This document is based on the article by Maia, Luiz, and Campanhola "Statistical Inference on Associated Fertility Life Table Parameters Using Jackknife Technique Computational Aspects" (April 2000, Journal of Economic Entomology, Volume 93, Issue 2) <doi:10.1603/0022-0493-93.2.511>.
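The demographic parameters usually derived from a fertility life table can be sketched as follows, assuming the standard textbook definitions: net reproductive rate R0 = sum(lx * mx), mean generation time T = sum(x * lx * mx) / R0, and the approximation rm = ln(R0) / T for the intrinsic rate of increase. The function name and data are hypothetical, and the jackknife inference described in the article is not reproduced.

```python
import math


def fertility_table_parameters(ages, lx, mx):
    """Approximate demographic parameters from age-specific survival and fertility.

    ages: age x of each class; lx: proportion surviving to age x;
    mx: mean number of female offspring per female at age x.
    """
    if not (len(ages) == len(lx) == len(mx)):
        raise ValueError("ages, lx and mx must have the same length")
    r0 = sum(l * m for l, m in zip(lx, mx))
    if r0 <= 0:
        raise ValueError("no reproduction recorded")
    t = sum(x * l * m for x, l, m in zip(ages, lx, mx)) / r0
    rm = math.log(r0) / t  # approximation, not the exact Euler-Lotka solution
    return {"R0": r0, "T": t, "rm": rm, "lambda": math.exp(rm)}


# Made-up cohort with three age classes.
print(fertility_table_parameters([1, 2, 3], [1.0, 0.8, 0.5], [0.0, 2.0, 4.0]))
```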
Extended tools for analyzing telemetry data using generalized hidden Markov models. Features of momentuHMM (pronounced "momentum") include data pre-processing and visualization, fitting HMMs to location and auxiliary biotelemetry or environmental data, biased and correlated random walk movement models, hierarchical HMMs, multiple imputation for incorporating location measurement error and missing data, user-specified design matrices and constraints for covariate modelling of parameters, random effects, decoding of the state process, visualization of fitted models, model checking and selection, and simulation. See McClintock and Michelot (2018) <doi:10.1111/2041-210X.12995>.
This is the very popular mine sweeper game! The game requires you to find the tiles that contain mines using clues from unmasking neighboring tiles. Each tile that does not contain a mine shows the number of mines in its adjacent tiles. If you unmask all tiles that do not contain mines, you win the game; if you unmask any tile that contains a mine, you lose the game. For further game instructions, please run `help(run_game)` and check details. This game runs in X11-compatible devices with `grDevices::x11()`.
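The clue shown on each safe tile can be computed as sketched below. This is an illustration in Python of the counting rule described above, with a hypothetical function name and board; it is not the package's code.

```python
def neighbor_mine_counts(mines):
    """For each tile, count mines in the up-to-8 adjacent tiles.

    mines: rectangular grid of booleans (True = mine).
    Returns a grid with None on mine tiles and the clue number elsewhere.
    """
    rows, cols = len(mines), len(mines[0])
    counts = []
    for r in range(rows):
        row = []
        for c in range(cols):
            if mines[r][c]:
                row.append(None)
                continue
            # The tile itself is not a mine, so including it adds nothing.
            row.append(sum(
                mines[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ))
        counts.append(row)
    return counts


board = [[True, False, False],
         [False, False, False],
         [False, False, True]]
for line in neighbor_mine_counts(board):
    print(line)
```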
This package implements a Bayesian phase I repeated measurement design that accounts for multidimensional toxicity endpoints and longitudinal efficacy measures from multiple treatment cycles. The package provides flags to fit a variety of model-based phase I designs, including 1-stage models with or without individualized dose modification, 3-stage models with or without individualized dose modification, etc. Functions are provided to recommend dosage selection based on the data collected in the available patient cohorts and to simulate trial characteristics given design parameters. See Yin, Jun, et al. (2017) <doi:10.1002/sim.7134>.
Temporal disaggregation methods are used to disaggregate and interpolate a low frequency time series to a higher frequency series, where either the sum, the mean, the first or the last value of the resulting high frequency series is consistent with the low frequency series. Temporal disaggregation can be performed with or without one or more high frequency indicator series. Contains the methods of Chow-Lin, Santos-Silva-Cardoso, Fernandez, Litterman, Denton and Denton-Cholette, summarized in Sax and Steiner (2013) <doi:10.32614/RJ-2013-028>. Supports most R time series classes.
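The consistency requirement can be illustrated with the simplest possible scheme: pro-rata distribution, where each low frequency total is split in proportion to a high frequency indicator so that the pieces sum back to the total. This is deliberately not one of the regression-based or Denton-type methods listed above, and the function name is hypothetical.

```python
def prorata_disaggregate(low_freq, indicator, periods_per_low):
    """Split each low frequency total in proportion to the indicator values.

    The high frequency values within each low frequency period sum to
    that period's total (sum consistency).
    """
    if len(indicator) != len(low_freq) * periods_per_low:
        raise ValueError("indicator length must match the low frequency series")
    high_freq = []
    for k, total in enumerate(low_freq):
        chunk = indicator[k * periods_per_low:(k + 1) * periods_per_low]
        chunk_sum = sum(chunk)
        if chunk_sum == 0:
            raise ValueError("indicator sums to zero within a period")
        high_freq.extend(total * value / chunk_sum for value in chunk)
    return high_freq


# Two made-up annual totals split into quarters.
annual = [100.0, 200.0]
quarterly_indicator = [1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 4.0]
print(prorata_disaggregate(annual, quarterly_indicator, 4))
```

Pro-rata distribution tends to produce jumps between low frequency periods, which is part of what the more elaborate methods aim to avoid.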
The standard index of DNA methylation (beta) is computed from methylated and unmethylated signal intensities. Betas calculated from raw signal intensities perform well, but using 11 methylomic datasets we demonstrate that quantile normalization methods produce marked improvement. The commonly used procedure of normalizing betas is inferior to the separate normalization of M and U, and it is also advantageous to normalize Type I and Type II assays separately. This package provides 15 flavours of betas and three performance metrics, with methods for objects produced by the methylumi and minfi packages.
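For reference, beta is conventionally computed as M / (M + U + offset), where M and U are the methylated and unmethylated intensities. The sketch below assumes the commonly used offset of 100, which stabilises the ratio at low intensities; the function name is hypothetical and the normalization flavours discussed above are not shown.

```python
def methylation_beta(methylated, unmethylated, offset=100.0):
    """Beta value from raw intensities; the offset of 100 is an assumed default."""
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


# Made-up intensities for a mostly methylated and a mostly unmethylated probe.
print(methylation_beta(4500.0, 400.0))
print(methylation_beta(300.0, 5600.0))
```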
Calculates distances from point locations to features. The usual approach for e.g. resource selection function analyses is to generate a complete distance-to-features surface, then sample it with your observed and random points. Since these raster-based approaches can be pretty costly with large areas, and often lead to memory issues in R, the distanceto package opts to compute these distances using efficient, vector-based approaches. As a helper, there's a decidedly low-res raster-based approach for visually inspecting your region's distance surface. But the workhorse is distance_to.
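A vector-based distance avoids building a raster at all: each point is measured directly against the feature geometry. The sketch below shows the idea in planar coordinates for line-segment features. The function names are hypothetical and this is not how distance_to is implemented.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment's line and clamp to the segment's extent.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_feature_distance(points, segments):
    """Distance from each point to its nearest segment."""
    return [min(point_segment_distance(p, a, b) for a, b in segments)
            for p in points]


roads = [((-1.0, 0.0), (1.0, 0.0)), ((5.0, 5.0), (5.0, 9.0))]  # made-up features
print(nearest_feature_distance([(0.0, 1.0), (2.0, 0.0)], roads))
```

Only the sampled points are evaluated, so memory use scales with the number of points and features rather than with the extent of the study area.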
Provides estimation and data generation tools for some new multivariate frailty models. This version includes the gamma, inverse Gaussian, weighted Lindley, Birnbaum-Saunders, truncated normal, mixture of inverse Gaussian, mixture of Birnbaum-Saunders and generalized exponential as distributions for the frailty terms. For the basal model, the package considers a parametric approach based on the exponential, Weibull and piecewise exponential distributions, as well as a semiparametric approach. For details, see Gallardo and Bourguignon (2025) <doi:10.1002/bimj.70044> and Gallardo et al. (2024) <doi:10.1007/s11222-024-10458-w>.
This package provides a Bayesian model selection approach for generalized linear mixed models. Currently, GLMMselect can be used for Poisson GLMM and Bernoulli GLMM. GLMMselect can select fixed effects and random effects simultaneously. Covariance structures for the random effects are a product of an unknown scalar and a known positive semi-definite matrix. GLMMselect can be widely used in areas such as longitudinal studies, genome-wide association studies, and spatial statistics. GLMMselect is based on Xu, Ferreira, Porter, and Franck (202X), Bayesian Model Selection Method for Generalized Linear Mixed Models, Biometrics, under review.
There are occasions where you need a piece of HTML with integrated styles. A prime example of this is HTML email. This transformation involves moving the CSS and associated formatting instructions from the style block in the head of your document into the body of the HTML. Many prominent email clients require integrated styles in HTML email; otherwise a received HTML email will be displayed without any styling. This package will quickly and precisely perform these CSS transformations when given HTML text and it does so by using the JavaScript juice library.
Allows for fitting of maximum likelihood models using Markov chains on phylogenetic trees for analysis of discrete character data. Examples of such discrete character data include restriction sites, gene family presence/absence, intron presence/absence, and gene family size data. Hypothesis-driven user-specified substitution rate matrices can be estimated. Allows for biologically realistic models combining constrained substitution rate matrices, site rate variation, site partitioning, branch-specific rates, non-stationary prior root probabilities, correction for sampling bias, etc. See Dang and Golding (2016) <doi:10.1093/bioinformatics/btv541> for more details.
Computes A-, MV-, D- and E-optimal or near-optimal block designs for two-colour cDNA microarray experiments using the linear fixed effects and mixed effects models where the interest is in a comparison of all possible elementary treatment contrasts. The algorithms used in this package are based on the treatment exchange and array exchange algorithms of Debusho, Gemechu and Haines (2018) <doi:10.1080/03610918.2018.1429617>. The package also provides an optional method of using the graphical user interface (GUI) R package tcltk to ensure that it is user friendly.
Spatial homogeneous regions (SHRs) in tissues are domains that are homogeneous with respect to cell type composition. We present a method for identifying SHRs using spatial transcriptomics data, and demonstrate that it is efficient and effective at finding SHRs for a wide variety of tissue types. concordex relies on analysis of k-nearest-neighbor (kNN) graphs. The tool is also useful for analysis of non-spatial transcriptomics data, and can elucidate the extent of concordance between partitions of cells derived from clustering algorithms and transcriptomic similarity as represented in kNN graphs.
Covered uses modern Ruby features to generate comprehensive coverage, including support for templates which are compiled into Ruby. It has the following features:
Incremental coverage -- if you run your full test suite, and then run a subset, it will still report the correct coverage - so you can incrementally work on improving coverage.
Integration with RSpec, Minitest, Travis & Coveralls - no need to configure anything - out of the box support for these platforms.
It supports coverage of views -- templates compiled to Ruby code can be tracked for coverage reporting.
Considering an (n x m) data matrix X, this package is based on the method proposed by Gower, Groenen, and Velden (2010) <doi:10.1198/jcgs.2010.07134>, and utilizes the resulting matrices from the extended version of the NIPALS decomposition to determine n triangles whose areas are used to visually estimate the elements of a specific column of X. After a 90-degree rotation of the sample points, the triangles are drawn using the following points: 1. the origin of the axes; 2. the sample points; 3. the vector endpoint representing some variable.
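The area of each such triangle follows from the coordinates of its three vertices. The sketch below computes it with the cross-product formula; the function name and coordinates are hypothetical, and neither the NIPALS decomposition nor the rotation step is shown.

```python
def triangle_area(sample_point, variable_endpoint, origin=(0.0, 0.0)):
    """Area of the triangle spanned by the origin, a sample point and a
    variable's vector endpoint (half the absolute 2D cross product)."""
    x1 = sample_point[0] - origin[0]
    y1 = sample_point[1] - origin[1]
    x2 = variable_endpoint[0] - origin[0]
    y2 = variable_endpoint[1] - origin[1]
    return 0.5 * abs(x1 * y2 - x2 * y1)


# Made-up coordinates for one sample point and one variable endpoint.
print(triangle_area((4.0, 0.0), (0.0, 3.0)))
```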
This package provides tools for visualization of, and inference on, the calibration of prediction models on the cumulative domain. This provides a method for evaluating calibration of risk prediction models without having to group the data or use tuning parameters (e.g., loess bandwidth). This package implements the methodology described in Sadatsafavi and Patkau (2024) <doi:10.1002/sim.10138>. The core of the package is cumulcalib(), which takes in vectors of binary responses and predicted risks. The plot() and summary() methods are implemented for the results returned by cumulcalib().
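The basic idea of assessing calibration on the cumulative domain can be sketched as follows: order observations by predicted risk and accumulate the prediction errors, with no grouping or smoothing involved. This is a simplified illustration with a hypothetical function name; the standardisation and inference described in the paper are not reproduced, and it is not the cumulcalib() implementation.

```python
def cumulative_prediction_error(responses, predicted_risks):
    """Running mean-scaled sum of (response - risk), ordered by predicted risk.

    A well calibrated model should yield a curve that stays close to zero.
    """
    if len(responses) != len(predicted_risks) or not responses:
        raise ValueError("need two non-empty vectors of equal length")
    n = len(responses)
    ordered = sorted(zip(predicted_risks, responses))
    curve, running = [], 0.0
    for risk, outcome in ordered:
        running += (outcome - risk) / n
        curve.append(running)
    return curve


# Made-up binary responses and predicted risks.
print(cumulative_prediction_error([0, 1, 1], [0.2, 0.9, 0.5]))
```

The final value of the curve equals the mean response minus the mean predicted risk, so it doubles as a check of calibration in the large.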