This package provides R bindings for selected components of the Computational Morphometry Toolkit ('CMTK') for image registration and point transformation. A subset of the C++ source code required for point transforms is bundled with cmtkr'. This allows direct calls into the CMTK library, avoiding command-line invocations and providing order-of-magnitude speed improvements. Additional CMTK functionality may be wrapped in future releases. CMTK is described in Rohlfing T and Maurer CR (2003) <doi:10.1109/titb.2003.808506>.
This package provides a fold change rank based method is presented to search for genes with changing expression and to detect recurrent chromosomal copy number aberrations. This method may be useful for high-throughput biological data (micro-array, sequencing, ...). Probabilities are associated with genes or probes in the data set and there is no problem of multiple tests when using this method. For array-based comparative genomic hybridization data, segmentation results are obtained by merging the significant probes detected.
Fits linear regression, logistic and multinomial regression models, Poisson regression, Cox model via Global Adaptive Generative Adjustment Algorithm. For more detailed information, see Bin Wang, Xiaofei Wang and Jianhua Guo (2022) <arXiv:1911.00658>. This paper provides the theoretical properties of Gaga linear model when the load matrix is orthogonal. Further study is going on for the nonorthogonal cases and generalized linear models. These works are in part supported by the National Natural Foundation of China (No.12171076).
This package provides tools for studying genotype-phenotype maps for bi-allelic loci underlying quantitative phenotypes. The 0.1 version is released in connection with the publication of Gjuvsland et al (2013) and implements basic line plots and the monotonicity measures for GP maps presented in the paper. Reference: Gjuvsland AB, Wang Y, Plahte E and Omholt SW (2013) Monotonicity is a key feature of genotype-phenotype maps. Frontier in Genetics 4:216 <doi:10.3389/fgene.2013.00216>.
Genomic biology is not limited to the confines of the canonical B-forming DNA duplex, but includes over ten different types of other secondary structures that are collectively termed non-B DNA structures. Of these non-B DNA structures, the G-quadruplexes are highly stable four-stranded structures that are recognized by distinct subsets of nuclear factors. This package provide functions for predicting intramolecular G quadruplexes. In addition, functions for predicting other intramolecular nonB DNA structures are included.
This package provides tools to adjust estimates of learning for guessing-related bias in educational and survey research. Implements standard guessing correction methods and a sophisticated latent class model that leverages informative pre-post test transitions to account for guessing behavior. The package helps researchers obtain more accurate estimates of actual learning when respondents may guess on closed-ended knowledge items. For theoretical background and empirical validation, see Cor and Sood (2018) <https://gsood.com/research/papers/guess.pdf>.
This package implements the multivariate adaptive shrinkage (mash) method of Urbut et al (2019) <DOI:10.1038/s41588-018-0268-8> for estimating and testing large numbers of effects in many conditions (or many outcomes). Mash takes an empirical Bayes approach to testing and effect estimation; it estimates patterns of similarity among conditions, then exploits these patterns to improve accuracy of the effect estimates. The core linear algebra is implemented in C++ for fast model fitting and posterior computation.
This package provides Scilab n1qn1'. This takes more memory than traditional L-BFGS. The n1qn1 routine is useful since it allows prespecification of a Hessian. If the Hessian is near enough the truth in optimization it can speed up the optimization problem. The algorithm is described in the Scilab optimization documentation located at <https://www.scilab.org/sites/default/files/optimization_in_scilab.pdf>. This version uses manually modified code from f2c to make this a C only binary.
Allows fitting of step-functions to univariate serial data where neither the number of jumps nor their positions is known by implementing the multiscale regression estimators SMUCE, simulataneous multiscale changepoint estimator, (K. Frick, A. Munk and H. Sieling, 2014) <doi:10.1111/rssb.12047> and HSMUCE, heterogeneous SMUCE, (F. Pein, H. Sieling and A. Munk, 2017) <doi:10.1111/rssb.12202>. In addition, confidence intervals for the change-point locations and bands for the unknown signal can be obtained.
Generate data objects from XML versions of the Swiss Register of Plant Protection Products. An online version of the register can be accessed at <https://www.psm.admin.ch/de/produkte>. There is no guarantee of correspondence of the data read in using this package with that online version, or with the original registration documents. Also, the Federal Food Safety and Veterinary Office, coordinating the authorisation of plant protection products in Switzerland, does not answer requests regarding this package.
Non-negative Matrix Factorization(NMF) is a powerful tool for identifying the key features of microbial communities and a dimension-reduction method. When we are interested in the differences between the structures of two groups of communities, supervised NMF(Yun Cai, Hong Gu and Tobby Kenney (2017),<doi:10.1186/s40168-017-0323-1>) provides a better way to do this, while retaining all the advantages of NMF -- such as interpretability, and being based on a simple biological intuition.
Storm is a distributed real-time computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing real-time computation. . Storm includes a "Multi-Language" (or "Multilang") Protocol to allow implementation of Bolts and Spouts in languages other than Java. This R extension provides implementations of utility functions to allow an application developer to focus on application-specific functionality rather than Storm/R communications plumbing.
Maximum likelihood estimation of univariate Gaussian Mixture Autoregressive (GMAR), Student's t Mixture Autoregressive (StMAR), and Gaussian and Student's t Mixture Autoregressive (G-StMAR) models, quantile residual tests, graphical diagnostics, forecast and simulate from GMAR, StMAR and G-StMAR processes. Leena Kalliovirta, Mika Meitz, Pentti Saikkonen (2015) <doi:10.1111/jtsa.12108>, Mika Meitz, Daniel Preve, Pentti Saikkonen (2023) <doi:10.1080/03610926.2021.1916531>, Savi Virolainen (2022) <doi:10.1515/snde-2020-0060>.
Univariate and multivariate methods to analyze randomized response (RR) survey designs (e.g., Warner, S. L. (1965). Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60, 63â 69, <doi:10.2307/2283137>). Besides univariate estimates of true proportions, RR variables can be used for correlations, as dependent variable in a logistic regression (with or without random effects), or as predictors in a linear regression (Heck, D. W., & Moshagen, M. (2018). RRreg: An R package for correlation and regression analyses of randomized response data. Journal of Statistical Software, 85(2), 1â 29, <doi:10.18637/jss.v085.i02>). For simulations and the estimation of statistical power, RR data can be generated according to several models. The implemented methods also allow to test the link between continuous covariates and dishonesty in cheating paradigms such as the coin-toss or dice-roll task (Moshagen, M., & Hilbig, B. E. (2017). The statistical analysis of cheating paradigms. Behavior Research Methods, 49, 724â 732, <doi:10.3758/s13428-016-0729-x>).
Single-cell RNA-seq (scRNA-seq) is widely used to investigate the composition of complex tissues since the technology allows researchers to define cell-types using unsupervised clustering of the transcriptome. However, due to differences in experimental methods and computational analyses, it is often challenging to directly compare the cells identified in two different experiments. scmap is a method for projecting cells from a scRNA-seq experiment onto the cell-types or individual cells identified in a different experiment.
This package contains a small DNS daemon especially made to handle queries of DNSBL, a simple way to publish IP addresses and/or (domain) names which are somehow notable. Such lists are frequently used to refuse e-mail service to clients known to send unwanted (spam) messages.
rbldnsd is not a general-purpose nameserver. It answers to a limited variety of queries. This makes it extremely fast---greatly outperforming both BIND and djbdns---whilst using relatively little memory.
Fit a latent embedding multivariate regression (LEMUR) model to multi-condition single-cell data. The model provides a parametric description of single-cell data measured with treatment vs. control or more complex experimental designs. The parametric model is used to (1) align conditions, (2) predict log fold changes between conditions for all cells, and (3) identify cell neighborhoods with consistent log fold changes. For those neighborhoods, a pseudobulked differential expression test is conducted to assess which genes are significantly changed.
The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions to inspect molecular features underlying each factor, vizualisation, imputation etc are available.
spiky implements methods and model generation for cfMeDIP (cell-free methylated DNA immunoprecipitation) with spike-in controls. CfMeDIP is an enrichment protocol which avoids destructive conversion of scarce template, making it ideal as a "liquid biopsy," but creating certain challenges in comparing results across specimens, subjects, and experiments. The use of synthetic spike-in standard oligos allows diagnostics performed with cfMeDIP to quantitatively compare samples across subjects, experiments, and time points in both relative and absolute terms.
This is an open-source implementation of the Congruent Matching Profile Segments (CMPS) method (Chen et al. 2019)<doi:10.1016/j.forsciint.2019.109964>. In general, it can be used for objective comparison of striated tool marks, and in our examples, we specifically use it for bullet signatures comparisons. The CMPS score is expected to be large if two signatures are similar. So it can also be considered as a feature that measures the similarity of two bullet signatures.
This package provides a simplified interface to the Central Data Repository REST API service made available by the United States Federal Financial Institutions Examination Council ('FFIEC'). Contains functions to retrieve reports of Condition and Income (Call Reports) and Uniform Bank Performance Reports ('UBPR') in list or tidy data frame format for most FDIC insured institutions. See <https://cdr.ffiec.gov/public/Files/SIS611_-_Retrieve_Public_Data_via_Web_Service.pdf> for the official REST API documentation published by the FFIEC'.
Perform optimal transport based tests in factorial designs as introduced in Groppe et al. (2025) <doi:10.48550/arXiv.2509.13970> via the FDOTT() function. These tests are inspired by ANOVA and its nonparametric counterparts. They allow for testing linear relationships in factorial designs between finitely supported probability measures on a metric space. Such relationships include equality of all measures (no treatment effect), interaction effects between a number of factors, as well as main and simple factor effects.
The goal of GHCNr is to provide a fast and friendly interface with the Global Historical Climatology Network daily (GHCNd) database, which contains daily summaries of weather station data worldwide (<https://www.ncei.noaa.gov/products/land-based-station/global-historical-climatology-network-daily>). GHCNd is accessed through the web API <https://www.ncei.noaa.gov/access/services/data/v1>. GHCNr main functionalities consist of downloading data from GHCNd, filter it, and to aggregate it at monthly and annual scales.
This package implements spatial optimization algorithms across several problem families including contiguity-constrained regionalization, discrete facility location, market share analysis, and least-cost corridor and route optimization over raster cost surfaces. Facility location problems also accept user-supplied network travel-time matrices. Uses a Rust backend via extendr for graph and routing algorithms, and the HiGHS solver via the highs package for facility location mixed-integer programs. Method-level references are provided in the documentation of the individual functions.