Offers a gene-based meta-analysis test with filtering to detect gene-environment interactions (GxE) with association data, proposed by Wang et al. (2018) <doi:10.1002/gepi.22115>. It first conducts a meta-filtering test to filter out unpromising SNPs by combining all samples in the consortia data. It then runs a test of omnibus-filtering-based GxE meta-analysis (ofGEM) that combines the strengths of the fixed- and random-effects meta-analysis with meta-filtering. It can also analyze data from multiple ethnic groups.
Generic interface for the PX-Web/PC-Axis API. The PX-Web/PC-Axis API is used by organizations such as Statistics Sweden and Statistics Finland to disseminate data. The R package can interact with all PX-Web/PC-Axis APIs to fetch information about the data hierarchy, extract metadata and extract and parse statistics to R data.frame format. PX-Web is a solution to disseminate PC-Axis data files in dynamic tables on the web. Since 2013 PX-Web contains an API to disseminate PC-Axis files.
This package performs elementary probability calculations on finite sample spaces, which may be represented by data frames or lists. This package is meant to rescue some widely used functions from the archived prob package (see <https://cran.r-project.org/src/contrib/Archive/prob/>). Functionality includes setting up sample spaces, counting tools, defining probability spaces, performing set algebra, calculating probability and conditional probability, tools for simulation and checking the law of large numbers, adding random variables, and finding marginal distributions. Characteristic functions for all base R distributions are included.
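The calculations described above can be illustrated with a minimal sketch. It is not the package's API: the helper names `sample_space` and `prob` are hypothetical, and the example assumes a uniform measure over two rolls of a fair die.

```python
from fractions import Fraction
from itertools import product


def sample_space(sides=6, rolls=2):
    """Enumerate the equally likely outcomes of rolling a die `rolls` times."""
    return list(product(range(1, sides + 1), repeat=rolls))


def prob(space, event, given=None):
    """Return P(event | given) under the uniform measure on `space`."""
    if given is not None:
        # Conditioning restricts the sample space to outcomes where `given` holds.
        space = [w for w in space if given(w)]
    if not space:
        raise ValueError("conditioning event has probability zero")
    favourable = sum(1 for w in space if event(w))
    return Fraction(favourable, len(space))


space = sample_space()
p_sum7 = prob(space, lambda w: sum(w) == 7)
p_sum8_given_first_even = prob(
    space, lambda w: sum(w) == 8, given=lambda w: w[0] % 2 == 0
)
```

Exact fractions are used instead of floats so that results can be compared with hand calculations.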
This package provides statistical methods for interval-censored (grouped) data. The package supports the estimation of linear and linear mixed regression models with interval-censored dependent variables. Parameter estimates are obtained by a stochastic expectation maximization algorithm. Furthermore, the package enables the direct (without covariates) estimation of statistical indicators from interval-censored data via an iterative kernel density algorithm. Survey and Organisation for Economic Co-operation and Development (OECD) weights can be included in the direct estimation (see Walter, P. (2019) <doi:10.17169/refubium-1621>).
We propose a general ensemble classification framework, the RaSE algorithm, for the sparse classification problem. In the RaSE algorithm, for each weak learner, some random subspaces are generated and the optimal one is chosen to train the model on the basis of some criterion. To adapt to the problem, a novel criterion, the ratio information criterion (RIC), is proposed based on Kullback-Leibler divergence. Besides minimizing RIC, multiple other criteria can be applied, for instance, minimizing the extended Bayesian information criterion (eBIC), the training error, the validation error, the cross-validation error, or the leave-one-out error. There are various choices of base classifier, for instance, linear discriminant analysis, quadratic discriminant analysis, k-nearest neighbour, logistic regression, decision trees, random forest, and support vector machines. The RaSE algorithm can also be applied to feature ranking, providing the importance of each feature based on its selected percentage in multiple subspaces. The RaSE framework can be extended to a general prediction framework, including both classification and regression. The selected percentages of variables can be used for variable screening. The latest version adds a variable screening function for both regression and classification problems.
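The feature-ranking idea (importance as the share of selected subspaces that contain a feature) can be sketched as follows. This is only an illustration of the counting step, not the package's implementation: the function name `selected_percentages` and the example subspaces are hypothetical, and the subspaces are assumed to have already been chosen by some criterion.

```python
from collections import Counter


def selected_percentages(subspaces, n_features):
    """Fraction of selected subspaces in which each feature index appears."""
    if not subspaces:
        raise ValueError("at least one selected subspace is required")
    counts = Counter()
    for subspace in subspaces:
        # A feature is counted at most once per subspace.
        counts.update(set(subspace))
    return [counts[j] / len(subspaces) for j in range(n_features)]


# Hypothetical subspaces selected by four weak learners over four features.
subspaces = [[0, 2], [0, 1], [0, 2, 3], [2]]
percentages = selected_percentages(subspaces, n_features=4)

# Rank features from most to least frequently selected (ties keep index order).
ranking = sorted(range(4), key=lambda j: -percentages[j])
```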
This package provides a compendium of new geometries, coordinate systems, statistical transformations, scales and fonts for ggplot2, including splines, 1d and 2d densities, univariate average shifted histograms, a new map coordinate system based on the PROJ.4-library along with geom_cartogram() that mimics the original functionality of geom_map(), formatters for "bytes", a stat_stepribbon() function, increased plotly compatibility and the StateFace open source font from ProPublica. Further new functionality includes lollipop charts, dumbbell charts, the ability to encircle points and coordinate-system-based text annotations.
This package works as a prelude replacement for Haskell, providing more functionality and types out of the box than the standard prelude (such as common data types like ByteString and Text), as well as removing common "gotchas", like partial functions and lazy I/O. The guiding principle here is:
If something is safe to use in general and has no expected naming conflicts, expose it.
If something should not always be used, or has naming conflicts, expose it from another module in the hierarchy.
mastR is an R package designed for automated screening of signatures of interest for specific research questions. The package is developed for generating refined lists of signature genes from multiple group comparisons based on the results from edgeR and limma differential expression (DE) analysis workflow. It also takes into account the background noise of tissue-specificity, which is often ignored by other marker generation tools. This package is particularly useful for the identification of group markers in various biological and medical applications, including cancer research and developmental biology.
The delta measure of agreement was originally proposed by Martín & Femia (2004) <DOI:10.1348/000711004849268>. Since then it has been considered as an agreement measure in different fields, since its behavior is usually better than that of the usual kappa index by Cohen (1960) <DOI:10.1177/001316446002000104>. The main issue with delta is that, unlike kappa, it cannot be computed by hand. The current algorithm is based on Version 5 of the delta Windows program, which can be found at <https://www.ugr.es/~bioest/software/delta/cmd.php?seccion=downloads>.
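For contrast, the kappa index is simple enough to compute directly from a square agreement table, as the sketch below shows. It covers only Cohen's kappa, not delta, and the function name and the 2x2 table are made up for illustration.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square table of counts (rows: rater A, columns: rater B)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: share of cases on the main diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    row_marginals = [sum(table[i]) / n for i in range(k)]
    col_marginals = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance if the two raters were independent.
    p_expected = sum(r * c for r, c in zip(row_marginals, col_marginals))
    return (p_observed - p_expected) / (1 - p_expected)


# Illustrative counts: 50 cases classified by two raters into two categories.
table = [[20, 5], [10, 15]]
kappa = cohen_kappa(table)
```

Here the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is 0.4.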
This package implements the method of Hofmeyr, D.P. (2021) <DOI:10.1109/TPAMI.2019.2930501> for fast evaluation of univariate kernel smoothers based on recursive computations. Applications to the basic problems of density and regression function estimation are provided, as well as some projection pursuit methods for which the objective is based on non-parametric functionals of the projected density, or conditional density of a response given projected covariates. The package is accompanied by an instructive paper in the Journal of Statistical Software <doi:10.18637/jss.v101.i03>.
Estimates parameters in Mixture Transition Distribution (MTD) models, a class of high-order Markov chains. The set of relevant pasts (lags) is selected using either the Bayesian Information Criterion or the Forward Stepwise and Cut algorithms. Other model parameters (e.g. transition probabilities and oscillations) can be estimated via maximum likelihood estimation or the Expectation-Maximization algorithm. Additionally, hdMTD includes a perfect sampling algorithm that generates samples of an MTD model from its invariant distribution. For theory, see Ost & Takahashi (2023) <http://jmlr.org/papers/v24/22-0266.html>.
Facilitates access to the International Union for Conservation of Nature (IUCN) Red List of Threatened Species, a comprehensive global inventory of species at risk of extinction. This package streamlines the process of determining conservation status by matching species names with Red List data, providing tools to easily query and retrieve conservation statuses. Designed to support biodiversity research and conservation planning, this package relies on data from the iucnrdata package, available on GitHub <https://github.com/PaulESantos/iucnrdata>. To install the data package, use pak::pak('PaulESantos/iucnrdata').
Volume prediction is one of the challenging tasks in forestry research. This package is a comprehensive toolset designed for the fitting and validation of various linear and nonlinear allometric equations (Linear, Log-Linear, Inverse, Quadratic, Cubic, Compound, Power and Exponential) used in the prediction of conifer tree volume. This package is particularly useful for forestry professionals, researchers, and resource managers engaged in assessing and estimating the volume of coniferous trees. This package has been developed using the algorithm of Sharma et al. (2017) <doi:10.13140/RG.2.2.33786.62407>.
An option is one of the financial derivatives, and its pricing is an important problem in practice. The process of stock prices is represented as geometric Brownian motion [Black (1973) <doi:10.1086/260062>] or as a jump diffusion process [Kou (2002) <doi:10.1287/mnsc.48.8.1086.166>]. In this package, algorithms and visualizations are implemented using the Monte Carlo method to calculate European option prices for three equations: geometric Brownian motion, jump diffusion processes, and a model in which jumps among companies affect each other.
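The Monte Carlo approach for the geometric Brownian motion case can be sketched as below. This is not the package's code: the function names and parameter values are hypothetical, only a European call is priced, and the jump diffusion models are not covered. The Black-Scholes closed form is included purely as a sanity check.

```python
import math
import random


def mc_european_call(s0, strike, rate, sigma, maturity, n_paths=100_000, seed=0):
    """Monte Carlo price of a European call under geometric Brownian motion."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    total_payoff = 0.0
    for _ in range(n_paths):
        # Terminal stock price simulated under the risk-neutral measure.
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total_payoff += max(s_t - strike, 0.0)
    # Discount the average payoff back to time zero.
    return math.exp(-rate * maturity) * total_payoff / n_paths


def black_scholes_call(s0, strike, rate, sigma, maturity):
    """Closed-form Black-Scholes price, used here only to check the simulation."""
    def norm_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity)
    )
    d2 = d1 - sigma * math.sqrt(maturity)
    return s0 * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


# Illustrative parameters: at-the-money call, one year to maturity.
mc_price = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0)
bs_price = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

With enough paths the simulated price should fall within a few standard errors of the closed-form value; the two agree only approximately because the estimate is random.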
This package provides accessible, interactive visualizations through the MAIDR (Multimodal Access and Interactive Data Representation) system. Converts ggplot2 and Base R plots into accessible HTML/SVG formats with keyboard navigation, screen reader support, and sonification capabilities. Supports bar charts (simple, grouped, stacked), histograms, line plots, scatter plots, box plots, violin plots, heat maps, density/smooth curves, faceted plots, multi-panel layouts (including patchwork), and multi-layered plot combinations. Enables data exploration for users with visual impairments through multiple sensory modalities. For more details see the MAIDR project <https://maidr.ai/>.
Convenient tools for exchanging files securely from within R. By encrypting the content, safe passage of files (shipment) can be provided by common but insecure carriers such as ftp and email. Because the approach is based on asymmetric cryptography, no management of shared secrets is needed to make a secure shipment as long as authentic public keys are available. Public keys used for secure shipments may also be obtained from external providers as part of the overall process. Transportation of files will require that relevant services such as ftp and email servers are available.
Handling of behavioural data from the Ethoscope platform (Geissmann, Garcia Rodriguez, Beckwith, French, Jamasb and Gilestro (2017) <DOI:10.1371/journal.pbio.2003026>). Ethoscopes (<https://giorgiogilestro.notion.site/Ethoscope-User-Manual-a9739373ae9f4840aa45b277f2f0e3a7>) are an open source/open hardware framework made of interconnected Raspberry Pis (<https://www.raspberrypi.org>) designed to quantify the behaviour of multiple small animals in a distributed and real-time fashion. The default tracking algorithm records primary variables such as xy coordinates, dimensions and speed. This package is part of the rethomics framework <https://rethomics.github.io/>.
Comprehensive analysis and forecasting of univariate time series using automatic time series models of many kinds. Harvey AC (1989) <doi:10.1017/CBO9781107049994>. Pedregal DJ and Young PC (2002) <doi:10.1002/9780470996430>. Durbin J and Koopman SJ (2012) <doi:10.1093/acprof:oso/9780199641178.001.0001>. Hyndman RJ, Koehler AB, Ord JK, and Snyder RD (2008) <doi:10.1007/978-3-540-71918-2>. Gómez V, Maravall A (2000) <doi:10.1002/9781118032978>. Pedregal DJ, Trapero JR and Holgado E (2024) <doi:10.1016/j.ijforecast.2023.09.004>.
Bayesian approaches for analyzing multivariate data in ecology. Estimation is performed using Markov Chain Monte Carlo (MCMC) methods via JAGS. Three types of models may be fitted: 1) With explanatory variables only, boral fits independent column Generalized Linear Models (GLMs) to each column of the response matrix; 2) With latent variables only, boral fits a purely latent variable model for model-based unconstrained ordination; 3) With explanatory and latent variables, boral fits correlated column GLMs with latent variables to account for any residual correlation between the columns of the response matrix.
P-values and no/lowest observed (adverse) effect concentration values derived from the closure principle computational approach test (Lehmann, R. et al. (2015) <doi:10.1007/s00477-015-1079-4>) are provided. The package contains functions to generate intersection hypotheses according to the closure principle (Bretz, F., Hothorn, T., Westfall, P. (2010) <doi:10.1201/9781420010909>), an implementation of the computational approach test (Ching-Hui, C., Nabendu, P., Jyh-Jiuan, L. (2010) <doi:10.1080/03610918.2010.508860>) and the combination of both, that is, the closure principle computational approach test.
This package provides a collection of acceleration schemes for proximal gradient methods for estimating penalized regression parameters described in Goldstein, Studer, and Baraniuk (2016) <arXiv:1411.3406>. Schemes such as Fast Iterative Shrinkage and Thresholding Algorithm (FISTA) by Beck and Teboulle (2009) <doi:10.1137/080716542> and the adaptive stepsize rule introduced in Wright, Nowak, and Figueiredo (2009) <doi:10.1109/TSP.2009.2016892> are included. You provide the objective function and proximal mappings, and it takes care of the issues like stepsize selection, acceleration, and stopping conditions for you.
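The division of labour described above (the user supplies the gradient of the smooth part and a proximal mapping, the solver handles acceleration) can be sketched with a plain FISTA loop. This is a simplified illustration with a fixed step size, not the package's interface: the function names, the toy lasso-type objective, and the fixed iteration count are all assumptions.

```python
import math


def soft_threshold(v, t):
    """Proximal mapping of t * ||x||_1, applied element-wise."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]


def fista(grad, prox, x0, step, n_iter=200):
    """Minimise f(x) + g(x) given grad f and prox of g, with a fixed step size."""
    x = list(x0)
    y = list(x0)
    t = 1.0
    for _ in range(n_iter):
        g = grad(y)
        # Proximal gradient step taken from the extrapolated point y.
        x_new = prox([yi - step * gi for yi, gi in zip(y, g)], step)
        # Momentum weight update of Beck and Teboulle (2009).
        t_new = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        y = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


# Toy problem: f(x) = 0.5 * sum(d_i * (x_i - b_i)^2), g(x) = lam * ||x||_1.
d = [1.0, 4.0]
b = [3.0, -0.5]
lam = 1.0
step = 1.0 / max(d)  # 1 / L, where L is the Lipschitz constant of grad f

solution = fista(
    grad=lambda x: [di * (xi - bi) for di, xi, bi in zip(d, x, b)],
    prox=lambda v, s: soft_threshold(v, s * lam),
    x0=[0.0, 0.0],
    step=step,
)
```

Because the toy objective is separable, the minimiser is known in closed form (each coordinate is `b_i` soft-thresholded by `lam / d_i`, giving 2 and -0.25), which makes the sketch easy to verify.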
This package provides functions for plotting, and animating, the output of importance samplers, sequential Monte Carlo samplers (SMC) and ensemble-based methods. The package can be used to plot and animate histograms, densities, scatter plots and time series, and to plot the genealogy of an SMC or ensemble-based algorithm. These functions all rely on algorithm output to be supplied in tidy format. A function is provided to transform algorithm output from matrix format (one Monte Carlo point per row) to the tidy format required by the plotting and animating functions.
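The matrix-to-tidy conversion can be pictured with the sketch below. It is not the package's function: the name `matrix_to_tidy` and the column names `point`, `variable` and `value` are assumptions chosen for illustration, and the package's actual tidy layout may carry additional columns.

```python
def matrix_to_tidy(matrix, names=None):
    """Turn a matrix with one Monte Carlo point per row into long-format records."""
    if not matrix:
        return []
    n_cols = len(matrix[0])
    if names is None:
        names = [f"x{j + 1}" for j in range(n_cols)]
    if len(names) != n_cols:
        raise ValueError("one name is required per matrix column")
    tidy = []
    for i, row in enumerate(matrix, start=1):
        if len(row) != n_cols:
            raise ValueError("all rows must have the same length")
        # One record per (point, variable) pair.
        for name, value in zip(names, row):
            tidy.append({"point": i, "variable": name, "value": value})
    return tidy


# Two Monte Carlo points in a two-dimensional space (illustrative values).
tidy = matrix_to_tidy([[0.1, 2.0], [0.3, 4.0]], names=["theta", "phi"])
```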
An ensemble of algorithms that enable the clustering of networks and data matrices (such as counts, categorical or continuous) with different types of generative models. Model selection and clustering are performed in combination by optimizing the Integrated Classification Likelihood (which is equivalent to minimizing the description length). Several models are available, such as: Stochastic Block Model, degree corrected Stochastic Block Model, Mixtures of Multinomial, Latent Block Model. The optimization is performed thanks to a combination of greedy local search and a genetic algorithm (see <arXiv:2002.11577> for more details).
This package implements methods developed by Ding, Feller, and Miratrix (2016) <doi:10.1111/rssb.12124> <doi:10.48550/arXiv.1412.5000>, and Ding, Feller, and Miratrix (2018) <doi:10.1080/01621459.2017.1407322> <doi:10.48550/arXiv.1605.06566> for testing whether there is unexplained variation in treatment effects across observations, and for characterizing the extent of the explained and unexplained variation in treatment effects. The package includes wrapper functions implementing the proposed methods, as well as helper functions for analyzing and visualizing the results of the test.