This is a package for normalization, testing for differential variability and differential methylation and gene set testing for data from Illumina's Infinium HumanMethylation arrays. The normalization procedure is subset-quantile within-array normalization (SWAN), which allows Infinium I and II type probes on a single array to be normalized together. The test for differential variability is based on an empirical Bayes version of Levene's test. Differential methylation testing is performed using RUV, which can adjust for systematic errors of unknown origin in high-dimensional data by using negative control probes. Gene ontology analysis is performed by taking into account the number of probes per gene on the array, as well as taking into account multi-gene associated probes.
Providing tools for microRNA (miRNA) text mining. miRetrieve summarizes miRNA literature by extracting, counting, and analyzing miRNA names, thus aiming at gaining biological insights into a large amount of text within a short period of time. To do so, miRetrieve uses regular expressions to extract miRNAs and tokenization to identify meaningful miRNA associations. In addition, miRetrieve uses the latest miRTarBase version 8.0 (Hsi-Yuan Huang et al. (2020) "miRTarBase 2020: updates to the experimentally validated microRNA–target interaction database" <doi:10.1093/nar/gkz896>) to display field-specific miRNA-mRNA interactions. The most important functions are available as a Shiny web application under <https://miretrieve.shinyapps.io/miRetrieve/>.
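As a rough illustration of regex-based miRNA extraction and counting, the following Python sketch uses a simplified, hypothetical pattern. It is not the pattern miRetrieve itself uses, and the names MIRNA_PATTERN, normalize and count_mirnas are made up for this example.

```python
import re
from collections import Counter

# Simplified, hypothetical pattern: matches names such as "miR-21", "miR-146a",
# "let-7a" or "microRNA-155". Real miRNA nomenclature has more variants.
MIRNA_PATTERN = re.compile(
    r"\b(?:(?:miR|miRNA|microRNA)-?\d+[a-z]?|let-?7[a-z]?)\b",
    re.IGNORECASE,
)


def normalize(name):
    """Map spelling variants to one form, e.g. 'microRNA-21' -> 'miR-21'."""
    match = re.match(r"(?i)(let|miR|miRNA|microRNA)-?(\d+[a-z]?)", name)
    prefix, number = match.group(1).lower(), match.group(2).lower()
    return ("let-" if prefix == "let" else "miR-") + number


def count_mirnas(texts):
    """Count in how many texts each normalized miRNA name occurs."""
    counts = Counter()
    for text in texts:
        # A set ensures each name is counted at most once per text.
        names = {normalize(m) for m in MIRNA_PATTERN.findall(text)}
        counts.update(names)
    return counts
```

Counting each name once per text (rather than once per mention) is one possible design choice; it keeps a single abstract that repeats a name many times from dominating the totals.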
Alternative implementation of the beautiful MissForest algorithm used to impute mixed-type data sets by chaining random forests, introduced by Stekhoven, D.J. and Buehlmann, P. (2012) <doi:10.1093/bioinformatics/btr597>. Under the hood, it uses the lightning fast random forest package 'ranger'. Between the iterative model fitting steps, we offer the option of using predictive mean matching. This firstly avoids imputation with values not already present in the original data (like a value 0.3334 in a 0-1 coded variable). Secondly, predictive mean matching tries to raise the variance in the resulting conditional distributions to a realistic level. This allows, e.g., multiple imputation by repeating the call to missRanger(). Out-of-sample application is supported as well.
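The idea of predictive mean matching can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the package's implementation: the function name, its parameters and the choice of k nearest donors are hypothetical, and the predictions are assumed to come from some already fitted model.

```python
import random


def predictive_mean_matching(pred_missing, pred_observed, y_observed, k=3, rng=None):
    """Replace each prediction for a missing value by an observed donor value.

    pred_missing:  model predictions for the cases with missing y
    pred_observed: model predictions for the cases with observed y
    y_observed:    the observed y values (same order as pred_observed)
    k:             number of nearest donors to sample from
    """
    rng = rng or random.Random()
    imputed = []
    for p in pred_missing:
        # Rank observed cases by how close their prediction is to p.
        order = sorted(
            range(len(pred_observed)),
            key=lambda i: abs(pred_observed[i] - p),
        )
        donors = order[:k]
        # The imputed value is the *observed* value of a randomly chosen donor,
        # so only values already present in the data can be imputed.
        imputed.append(y_observed[rng.choice(donors)])
    return imputed
```

Because a donor is drawn at random among the k closest candidates, repeated calls can yield different imputations, which is what makes the approach usable for multiple imputation.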
This package provides the users with the ability to quickly create linked micromap plots for a collection of geographic areas. Linked micromap plots are visualizations of geo-referenced data that link statistical graphics to an organized series of small maps or graphic images. The Help description contains examples of how to use the micromapST function. Contained in this package are border group datasets to support creating linked micromap plots for the 50 U.S. states and District of Columbia (51 areas), the 20 U.S. SEER Registries, the 105 counties in the state of Kansas, the 62 counties of New York, the 24 counties of Maryland, the 29 counties of Utah, the 32 administrative areas in China, the 218 administrative areas in the UK and Ireland (for testing only), the 25 districts in the city of Seoul, South Korea, and the 52 countries on the African continent. A border group dataset contains the boundaries of the data-level areas, a second layer of boundaries, a top or third layer boundary, a parameter list of run options, and a cross-indexing table between area names, abbreviations, numeric identifiers and alias matching strings for the specific geographic area. By specifying a border group, the package can create linked micromap plots for any geographic region. Users can create and provide their own border group dataset for any area beyond those contained within the package with the BuildBorderGroup function. In April of 2022, it was announced that the 'maptools', 'rgdal', and 'rgeos' R packages would be retired in mid to late 2023 and removed from CRAN. The BuildBorderGroup function was dependent on these packages; the micromapST functions were not impacted by their retirement. The upgrade of the BuildBorderGroup function to the sf R package was completed and released with version 3.0.0 on August 10, 2023. References: Carr and Pickle, Visualizing Data Patterns with Micromaps, Chapman and Hall/CRC Press, 2010. Pickle, Pearson, and Carr (2015), micromapST: Exploring and Communicating Geospatial Patterns in US State Data, Journal of Statistical Software, 63(3), 1-25, <https://www.jstatsoft.org/v63/i03/>. Copyrighted 2013, 2014, 2015, 2016, 2022, 2023, 2024, and 2025 by Carr, Pearson and Pickle.
This package provides a downstream bioinformatics tool to construct and assist curation of microhaplotypes from short read sequences.
This package provides functions and tools for analysing consumer demand with the Almost Ideal Demand System (AIDS) suggested by Deaton and Muellbauer (1980).
66 data sets that were imported using read.table() where appropriate but more commonly after converting to a csv file for importing via read.csv().
This package provides tools for econometric production analysis with the Symmetric Normalized Quadratic (SNQ) profit function, e.g. estimation, imposing convexity in prices, and calculating elasticities and shadow prices.
Single imputation based on Ensemble Conditional Trees (i.e. the Cforest algorithm; Strobl, C., Boulesteix, A. L., Zeileis, A., & Hothorn, T. (2007) <doi:10.1186/1471-2105-8-25>).
Play and record games of minesweeper using a graphics device that supports event handling. Replay recorded games and save GIF animations of them. Based on classic minesweeper as detailed by Crow P. (1997) <https://minesweepergame.com/math/a-mathematical-introduction-to-the-game-of-minesweeper-1997.pdf>.
Apply tests of multiple comparisons based on studentized midrange and range distributions. The tests are: Tukey Midrange ('TM' test), Student-Newman-Keuls Midrange ('SNKM' test), Means Grouping Midrange ('MGM' test) and Means Grouping Range ('MGR' test). The first two tests were published by Batista and Ferreira (2020) <doi:10.1590/1413-7054202044008020>. The last two were published by Batista and Ferreira (2023) <doi:10.28951/bjb.v41i4.640>.
Estimates models that extend the standard GLM to take misclassification into account. The models require side information from a secondary data set on the misclassification process, i.e. some sort of misclassification probabilities conditional on some common covariates. A detailed description of the algorithm can be found in Dlugosz, Mammen and Wilke (2015) <https://www.zew.de/publikationen/generalised-partially-linear-regression-with-misclassified-data-and-an-application-to-labour-market-transitions>.
Facilitate the description, transformation, exploration, and reproducibility of metabarcoding analyses. MiscMetabar is mainly built on top of the 'phyloseq', 'dada2' and 'targets' R packages. It helps to build reproducible and robust bioinformatics pipelines in R. MiscMetabar makes ecological analysis of alpha and beta-diversity easier, more reproducible and more powerful by integrating a large number of tools. Important features are described in Taudière A. (2023) <doi:10.21105/joss.06038>.
Evolutionary black box optimization algorithms building on the 'bbotk' package. miesmuschel offers both ready-to-use optimization algorithms and their fundamental building blocks, which can be used to manually construct specialized optimization loops. The Mixed Integer Evolution Strategies as described by Li et al. (2013) <doi:10.1162/EVCO_a_00059> can be implemented, as well as the multi-objective optimization algorithm NSGA-II by Deb, Pratap, Agarwal, and Meyarivan (2002) <doi:10.1109/4235.996017>.
An R interface to the MinIO Client. The MinIO Client ('mc') provides a modern alternative to UNIX commands like 'ls', 'cat', 'cp', 'mirror', 'diff', 'find' etc. It supports filesystems and Amazon "S3" compatible cloud storage service ("AWS" Signature v2 and v4). This package provides convenience functions for installing the MinIO client and running any operations, as described in the official documentation, <https://min.io/docs/minio/linux/reference/minio-mc.html?ref=docs-redirect>. This package provides a flexible and high-performance alternative to 'aws.s3'.
Supply functions for the creation and handling of missing data as well as tools to evaluate missing data methods. Nearly all possibilities of generating missing data discussed by Santos et al. (2019) <doi:10.1109/ACCESS.2019.2891360> and some additional ones are implemented. Functions are supplied to compare parameter estimates and imputed values to true values to evaluate missing data methods. Evaluations of these types are done, for example, by Cetin-Berber et al. (2019) <doi:10.1177/0013164418805532> and Kim et al. (2005) <doi:10.1093/bioinformatics/bth499>.
Offers a convenient pipeline to test and compare various missing data imputation algorithms on simulated and real data. These include simpler methods, such as mean and median imputation and random replacement, but also include more sophisticated algorithms already implemented in popular R packages, such as 'mi', described by Su et al. (2011) <doi:10.18637/jss.v045.i02>; 'mice', described by van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>; 'missForest', described by Stekhoven and Buhlmann (2012) <doi:10.1093/bioinformatics/btr597>; 'missMDA', described by Josse and Husson (2016) <doi:10.18637/jss.v070.i01>; and 'pcaMethods', described by Stacklies et al. (2007) <doi:10.1093/bioinformatics/btm069>. The central assumption behind missCompare is that structurally different datasets (e.g. larger datasets with a large number of correlated variables vs. smaller datasets with non-correlated variables) will benefit differently from different missing data imputation algorithms. missCompare takes measurements of your dataset and sets up a sandbox to try a curated list of standard and sophisticated missing data imputation algorithms and compares them assuming custom missingness patterns. missCompare will also impute your real-life dataset for you after the selection of the best performing algorithm in the simulations. The package also provides various post-imputation diagnostics and visualizations to help you assess imputation performance.
The estimation of the parameters in mixed Poisson models.
This package provides tools for calculating Laspeyres, Paasche, and Fisher price and quantity indices.
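For reference, the three index formulas can be written out in a short Python sketch. The function names and argument order are illustrative assumptions and do not reflect the package's interface; p0, p1 are base- and current-period prices and q0, q1 the corresponding quantities.

```python
import math


def laspeyres(p0, p1, q0):
    """Laspeyres price index: current vs. base prices, weighted by base quantities."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))


def paasche(p0, p1, q1):
    """Paasche price index: current vs. base prices, weighted by current quantities."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))


def fisher(p0, p1, q0, q1):
    """Fisher price index: geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))
```

The corresponding quantity indices follow by swapping the roles of prices and quantities, e.g. `laspeyres(q0, q1, p0)` gives the Laspeyres quantity index.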
This package provides utilities for reading and processing microdata from Spanish official statistics with R.
Testing CRAN and Bioconductor mirror speed by recording download time of src/base/COPYING (for CRAN) and packages/release/bioc/html/ggtree.html (for Bioconductor).
Developed to deal with multi-locus genotype data, this package is especially designed for panels that include different types of markers. Basic genetic parameters such as allele frequency, genotype frequency, heterozygosity and the Hardy-Weinberg test can be obtained for mixed genetic data. In addition, a new test for mutual independence that is compatible with mixed genetic data is provided.
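To make the basic parameters concrete, here is a minimal Python sketch for the simplest case, a single biallelic codominant locus with genotype counts for AA, Aa and aa. It only illustrates the textbook definitions; the function name and return format are hypothetical, and the package's handling of mixed marker types is not reproduced here.

```python
def hardy_weinberg(n_AA, n_Aa, n_aa):
    """Allele frequency, heterozygosity and HWE chi-square for one biallelic locus."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele a
    observed = (n_AA, n_Aa, n_aa)
    # Genotype counts expected under Hardy-Weinberg equilibrium.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Pearson chi-square statistic; classes with zero expectation are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return {
        "p": p,
        "q": q,
        "observed_het": n_Aa / n,
        "expected_het": 2 * p * q,
        "chi2": chi2,
    }
```

For a biallelic locus the statistic is usually compared with a chi-square distribution with one degree of freedom.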
In many agricultural, engineering, industrial, post-harvest and processing experiments, the number of factor level changes, and hence the total number of changes, is of serious concern. Such experiments may include hard-to-change factors, for which it is physically very difficult to change levels, or may require normalization time to reach adequate operating conditions. For this reason, run orders that offer the minimum number of factor level changes and at the same time minimize the possible influence of systematic trend effects on the experimentation have been sought. Factorial designs with minimum changes in factor levels may be preferred in such situations, as these minimally changed run orders minimize the cost of the experiments. For method details see Bhowmik, A., Varghese, E., Jaggi, S. and Varghese, C. (2017) <doi:10.1080/03610926.2016.1152490>. This package is used to construct all possible minimally changed factorial run orders for different experimental setups, along with different statistical criteria to measure the performance of these designs. It consists of the function minFactDesign().
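The notion of "number of factor level changes" can be illustrated with a small Python sketch for two-level full factorials. It counts the changes in a given run order and builds one minimally changed order via a reflected Gray code. This is only an illustration of the counting criterion under the assumption of two-level factors; it is not the construction method of minFactDesign(), it ignores the trend-effect criteria, and all function names are hypothetical.

```python
from itertools import product


def count_level_changes(run_order):
    """Total number of factor level changes between consecutive runs."""
    return sum(
        sum(a != b for a, b in zip(prev, nxt))
        for prev, nxt in zip(run_order, run_order[1:])
    )


def standard_order(n_factors):
    """All runs of a two-level full factorial in lexicographic order."""
    return list(product((0, 1), repeat=n_factors))


def gray_code_order(n_factors):
    """Same runs, ordered so that consecutive runs differ in exactly one factor."""
    runs = []
    for i in range(2 ** n_factors):
        g = i ^ (i >> 1)  # reflected binary Gray code of i
        runs.append(tuple((g >> bit) & 1 for bit in range(n_factors)))
    return runs
```

For three factors, the lexicographic order needs 11 level changes, while the Gray-code order needs 7, one per transition between the 8 runs, which is the smallest possible total.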
This package provides a framework for analyzing broth microdilution assays in various 96-well plate designs, visualizing results and providing descriptive and (simple) inferential statistics (i.e. summary statistics and sign test). The functions are designed to add metadata to 8 x 12 tables of absorption values, creating a tidy data frame. Users can choose between clean-up procedures via function parameters (which covers most cases) or user prompts (in cases with complex experimental designs). Users can also choose between two validation methods, i.e. exclusion of absorption values above a certain threshold or manual exclusion of samples. A function for visual inspection of samples, which displays their absorption values across time points for specific group combinations, helps with this decision. In addition, the package includes functions to subtract the background absorption (usually at time T0) and to calculate the growth performance compared to a baseline. Core functions of this package (i.e. background subtraction, sample validation and statistics) were inspired by the manual calculations that were applied in Tewes and Muller (2020) <doi:10.1038/s41598-020-67600-7>.