Israeli baby names provided by Israel's Central Bureau of Statistics. The package contains only names used for at least 5 children in at least one gender and sector ("Jewish", "Muslim", "Christian", "Druze" and "Other"). Data was downloaded from: <https://www.cbs.gov.il/he/publications/LochutTlushim/2020/%D7%A9%D7%9E%D7%95%D7%AA-%D7%A4%D7%A8%D7%98%D7%99%D7%99%D7%9D.xlsx>.
An implementation of several functions for feature extraction in categorical time series datasets. Specifically, some features related to marginal distributions and serial dependence patterns can be computed. These features can be used to feed clustering and classification algorithms for categorical time series, among others. The package also includes some interesting datasets containing biological sequences. Practitioners from a broad variety of fields could benefit from the general framework provided by ctsfeatures.
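As a rough illustration of the two feature families mentioned above, the following Python sketch computes marginal proportions and lagged joint proportions for a categorical series. The function names are hypothetical and are not part of ctsfeatures; the package itself may define its features differently.

```python
from collections import Counter


def marginal_proportions(series):
    """Relative frequency of each category in the series."""
    n = len(series)
    counts = Counter(series)
    return {cat: counts[cat] / n for cat in sorted(counts)}


def lagged_joint_proportions(series, lag=1):
    """Relative frequency of each (value at t, value at t + lag) pair.

    These joint proportions are the raw material for many serial
    dependence measures; here they are returned unprocessed.
    """
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    pairs = list(zip(series[:-lag], series[lag:]))
    counts = Counter(pairs)
    return {pair: c / len(pairs) for pair, c in counts.items()}


if __name__ == "__main__":
    example = list("AABAB")  # illustrative data only
    print(marginal_proportions(example))
    print(lagged_joint_proportions(example, lag=1))
```

Feature vectors built from such proportions could then be passed to any standard clustering or classification routine.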
The gasanalyzer R package offers methods for importing, preprocessing, and analyzing data related to photosynthetic characteristics (gas exchange, chlorophyll fluorescence and isotope ratios). It translates variable names into a standard format, and can recalculate derived, physiological quantities using imported or predefined equations. The package also allows users to assess the sensitivity of their results to different assumptions used in the calculations. See also Tholen (2024) <doi:10.1093/aobpla/plae035>.
This package provides a latent variable model based on factor analytic and mixture of experts models, designed to infer food intake from multiple biomarkers data. The model is framed within a Bayesian hierarchical framework, which provides flexibility to adapt to different biomarker distributions and facilitates inference on food intake from biomarker data alone, along with the associated uncertainty. Details are in D'Angelo et al. (2020) <arXiv:2006.02995>.
This package provides matrix Gaussian mixture models, matrix transformation mixture models and their model-based clustering results. The parsimonious models of the mean matrices and variance covariance matrices are implemented with a total of 196 variations. For more information, please check: Xuwen Zhu, Shuchismita Sarkar, and Volodymyr Melnykov (2021), "MatTransMix: an R package for matrix model-based clustering and parsimonious mixture modeling", <doi:10.1007/s00357-021-09401-9>.
Greedy Bayesian algorithm to fit the noisy stochastic block model to an observed sparse graph. The package also provides a graph inference procedure to recover a Gaussian Graphical Model (GGM) from real data. This procedure comes with a control of the false discovery rate. The method is described in the article "Enhancing the Power of Gaussian Graphical Model Inference by Modeling the Graph Structure" by Kilian, Rebafka, and Villers (2024) <arXiv:2402.19021>.
Analyze repertory grids, a qualitative-quantitative data collection technique devised by George A. Kelly in the 1950s. Today, grids are used across various domains ranging from clinical psychology to marketing. The package contains functions to quantitatively analyze and visualize repertory grid data (e.g. Fransella, Bell, & Bannister, 2004, ISBN: 978-0-470-09080-0). The package is part of the <https://openrepgrid.org/> project.
A probability tree allows one to compute probabilities of complex events, such as genotype probabilities in intermediate generations of inbreeding through recurrent self-fertilization (selfing). This package implements functionality to compute probability trees for two- and three-marker genotypes in the F2 to F7 selfing generations. The conditional probabilities are derived automatically and in symbolic form. The package also provides functionality to extract and evaluate the relevant probabilities.
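To make the idea of a selfing probability tree concrete, here is a minimal Python sketch for the much simpler single-marker case, assuming a heterozygous (Aa) F1 parent. The function name is hypothetical and is not the package's interface; the package itself handles two- and three-marker genotypes symbolically.

```python
from fractions import Fraction


def selfing_genotype_probs(generation):
    """Exact single-locus genotype probabilities in selfing generation F<generation>.

    Assumes the F1 is heterozygous (Aa). Each round of selfing is one level
    of the probability tree: homozygotes breed true, while a heterozygote
    yields AA, Aa and aa offspring with probabilities 1/4, 1/2 and 1/4.
    """
    if generation < 2:
        raise ValueError("generation must be 2 or greater")
    probs = {"AA": Fraction(0), "Aa": Fraction(1), "aa": Fraction(0)}  # F1
    for _ in range(generation - 1):
        het = probs["Aa"]
        probs = {
            "AA": probs["AA"] + het / 4,
            "Aa": het / 2,
            "aa": probs["aa"] + het / 4,
        }
    return probs


if __name__ == "__main__":
    for gen in range(2, 8):  # F2 to F7
        print(f"F{gen}", selfing_genotype_probs(gen))
```

Using `Fraction` keeps the probabilities exact, which loosely mirrors the idea of deriving them in closed form rather than numerically.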
An implementation of the selectboost algorithm (Bertrand et al. 2020, Bioinformatics, <doi:10.1093/bioinformatics/btaa855>), which is a general algorithm that improves the precision of any existing variable selection method. This algorithm is based on highly intensive simulations and takes into account the correlation structure of the data. It can either produce a confidence index for variable selection or it can be used in an experimental design planning perspective.
This package contains selected data from two publications, Campbell et al. (2016) <DOI:10.1080/14486563.2015.1028486> and Pacioni et al. (2017) <DOI:10.1071/PC17002>. The data is provided both as raw outputs from the population viability analysis software Vortex and packaged as R objects. The R package vortexR uses the raw data provided here to illustrate its functionality of parsing raw Vortex output into R objects.
An approach to filter out and/or identify phytoplankton cells from all particles measured via flow cytometry, using pigment and cell complexity information. It does this using a sequence of one-dimensional gates on pre-defined channels measuring certain pigmentation and complexity. The package is especially tuned for cyanobacteria, but will work fine for phytoplankton communities where there is at least one cell characteristic that differentiates every phytoplankton in the community.
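The notion of a sequence of one-dimensional gates can be sketched in Python as below. The channel names, thresholds and function name are assumptions made for illustration only; they do not come from the package, which derives its gates from the data.

```python
def apply_gates(particles, gates):
    """Keep only the particles that pass every one-dimensional gate, in order.

    particles: list of dicts mapping channel name -> measured value.
    gates: list of (channel, lower, upper) tuples; a bound of None is open.
    """
    kept = particles
    for channel, lower, upper in gates:
        kept = [
            p for p in kept
            if (lower is None or p[channel] >= lower)
            and (upper is None or p[channel] <= upper)
        ]
    return kept


if __name__ == "__main__":
    # Hypothetical measurements on two hypothetical channels.
    particles = [
        {"pigment": 5.0, "complexity": 1.0},
        {"pigment": 0.5, "complexity": 1.0},  # too little pigment
        {"pigment": 6.0, "complexity": 9.0},  # too complex
    ]
    gates = [("pigment", 2.0, None), ("complexity", None, 3.0)]
    print(apply_gates(particles, gates))
```

Because each gate looks at a single channel, the gates can be inspected and tuned one at a time.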
This R package helps the user identify k-mers (e.g. di- or tri-nucleotides) present periodically in a set of genomic loci (typically regulatory elements). The functions of this package provide a straightforward approach to find periodic occurrences of k-mers in DNA sequences, such as regulatory elements. It is not aimed at identifying motifs separated by a conserved distance; for this type of analysis, please visit the MEME website.
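One simple way to look for periodic k-mer occurrences is to tabulate the distances between every pair of occurrences within each sequence; a recurring spacing then shows up as a peak in the tabulation. The Python sketch below illustrates that idea only. The function names are hypothetical, and this is not presented as the package's own algorithm.

```python
from collections import Counter


def kmer_positions(seq, kmer):
    """Start positions of every (possibly overlapping) occurrence of kmer."""
    positions = []
    start = seq.find(kmer)
    while start != -1:
        positions.append(start)
        start = seq.find(kmer, start + 1)
    return positions


def distance_histogram(seqs, kmer, max_dist=50):
    """Count pairwise distances between kmer occurrences within each sequence."""
    counts = Counter()
    for seq in seqs:
        pos = kmer_positions(seq, kmer)
        for i in range(len(pos)):
            for j in range(i + 1, len(pos)):
                d = pos[j] - pos[i]
                if d <= max_dist:
                    counts[d] += 1
    return counts


if __name__ == "__main__":
    # Toy sequence with "TT" placed every 10 bases (illustrative data only).
    seqs = ["TTAAAAAAAATTAAAAAAAATT"]
    print(distance_histogram(seqs, "TT"))
```

In the toy sequence the spacing of 10 and its multiple 20 dominate the histogram, which is the kind of signal a periodicity analysis looks for.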
LLVM is a compiler infrastructure designed for compile-time, link-time, runtime, and idle-time optimization of programs from arbitrary programming languages. It currently supports compilation of C and C++ programs, using front-ends derived from GCC 4.0.1. A new front-end for the C family of languages is in development. The compiler infrastructure includes mirror sets of programming tools as well as libraries with equivalent functionality.
Rofi-pass provides a way to manipulate information stored using password-store through the rofi interface:
open URLs of entries with hotkey;
type any field from entry;
auto-typing of user and/or password fields;
auto-typing username based on path;
auto-typing of more than one field, using the autotype entry;
bookmarks mode (open stored URLs in browser, default: Alt+x).
This package provides tools for constructing a matched design with multiple comparison groups. Further specifications of refined covariate balance restriction and exact match on covariate can be imposed. Matches are approximately optimal in the sense that the cost of the solution is at most twice the optimal cost, Crama and Spieksma (1992) <doi:10.1016/0377-2217(92)90078-N>, Karmakar, Small and Rosenbaum (2019) <doi:10.1080/10618600.2019.1584900>.
Developer oriented utility functions designed to be used as the building blocks of R packages that work with ArcGIS Location Services. It provides functionality for authorization, Esri JSON construction and parsing, as well as other utilities pertaining to geometry and Esri type conversions. To support ArcGIS Pro users, authorization can be done via arcgisbinding. Installation instructions for arcgisbinding can be found at <https://developers.arcgis.com/r-bridge/installation/>.
The company, Algorithmia, houses the largest marketplace of online algorithms. This package essentially holds a bunch of REST wrappers that make it very easy to call algorithms in the Algorithmia platform and access files and directories in the Algorithmia data API. To learn more about the services they offer and the algorithms in the platform visit <http://algorithmia.com>. More information for developers can be found at <https://algorithmia.com/developers>.
Generate ground truth cases for object localization algorithms. Cycle through a list of images, select points around which to generate bounding boxes and assign classifiers. Output the coordinates, and images annotated with boxes and labels. For an example study that uses bounding boxes for image localization and classification see Ibrahim, Badr, Abdallah, and Eissa (2012) "Bounding Box Object Localization Based on Image Superpixelization" <doi:10.1016/j.procs.2012.09.119>.
It fits linear regression models for censored spatial data. It provides different estimation methods, such as the SAEM (Stochastic Approximation of Expectation Maximization) algorithm and a seminaive method that uses Kriging prediction to estimate the response at censored locations and predict new values at unknown locations. It also offers graphical tools for assessing the fitted model. More details can be found in Ordonez et al. (2018) <doi:10.1016/j.spasta.2017.12.001>.
This package provides tools for fitting Bayesian Distributed Lag Models (DLMs) to longitudinal response data that is a count or binary. Count data is fit using negative binomial regression and binary is fit using quantile regression. The contribution of the lags is fit via b-splines. In addition, the package infers the predictor inclusion uncertainty. Multinomial models are not supported. Based on Dempsey and Wyse (2025) <doi:10.48550/arXiv.2403.03646>.
Generates risk estimates and comorbidity flags from ICD-9-CM codes available in administrative medical datasets. The package supports the Charlson Comorbidity Index, the Elixhauser Comorbidity classification, the Revised Cardiac Risk Index, and the Risk Stratification Index. Methods are table-based, fast, and use the plyr package, so parallelization is possible for large jobs. Also includes a sample of real ICD-9 data for 100 patients from a publicly available dataset.
Routines for PLS-based genomic analyses, implementing PLS methods for classification with microarray data and prediction of transcription factor activities from combined ChIP-chip analysis. The >=1.2-1 versions include two new classification methods for microarray data: GSIM and Ridge PLS. The >=1.3 versions include a new classification method combining variable selection and compression in logistic regression context: logit-SPLS; and an adaptive version of the sparse PLS.
This package provides tools for analysing the agreement of two or more rankings of the same items. Examples are importance rankings of predictor variables and risk predictions of subjects. Benchmarks for agreement are computed based on random permutation and bootstrap. See Ekstrøm CT, Gerds TA, Jensen AK (2018). "Sequential rank agreement methods for comparison of ranked lists." _Biostatistics_, *20*(4), 582-598 <doi:10.1093/biostatistics/kxy017> for more information.
Asio is a cross-platform C++ library for network and low-level I/O programming that provides developers with a consistent asynchronous model using a modern C++ approach. It is also included in Boost but requires linking when used with Boost. Standalone it can be used header-only (provided a recent compiler). Asio is written and maintained by Christopher M. Kohlhoff, and released under the Boost Software License, Version 1.0.