This package implements an explicit exploration strategy for evolutionary algorithms in order to search more effectively when solving optimization problems. Along with this exploration strategy, a set of four different Estimation of Distribution Algorithms (EDAs) is also implemented for solving optimization problems in continuous domains. The explicit exploration strategy implemented in this package is described in Salinas-Gutiérrez and Muñoz Zavala (2023) <doi:10.1016/j.asoc.2023.110230>.
This package provides methods and tools for the analysis of Genome Wide Identity-by-Descent ('gwid') mapping data, focusing on testing whether there is a higher occurrence of Identity-By-Descent (IBD) segments around potential causal variants in cases compared to controls, which is crucial for identifying rare variants. To enhance its analytical power, gwid incorporates a Sliding Window Approach, allowing for the detection and analysis of signals from multiple Single Nucleotide Polymorphisms (SNPs).
Penalized regression for generalized linear models for measurement error problems (a.k.a. errors-in-variables). The package contains a version of the lasso (L1-penalization) which corrects for measurement error (Sorensen et al. (2015) <doi:10.5705/ss.2013.180>). It also contains an implementation of the Generalized Matrix Uncertainty Selector, which is a version of the (Generalized) Dantzig Selector for the case of measurement error (Sorensen et al. (2018) <doi:10.1080/10618600.2018.1425626>).
The Iterative Cumulative Sum of Squares (ICSS) algorithm by Inclan and Tiao (1994) <https://www.jstor.org/stable/2290916> detects multiple change points, i.e. structural break points, in the variance of a sequence of independent observations. For series of moderate size (i.e. 200 observations and beyond), the ICSS algorithm offers results comparable to those obtained by a Bayesian approach or by likelihood ratio tests, without the heavy computational burden required by these approaches.
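To illustrate the idea behind ICSS, the sketch below computes the centered cumulative sum of squares D_k = C_k / C_T - k / T and tests the single most likely variance break. It is a minimal illustration of one step only, not the full iterative algorithm and not the package's code: the function names are hypothetical, and the default critical value 1.358 is assumed to be the asymptotic 95% value for sqrt(T/2) * max|D_k|.

```python
import math


def centered_cusum_of_squares(series):
    """Return D_k = C_k / C_T - k / T for k = 1..T (C_k = cumulative sum of squares)."""
    n = len(series)
    total = sum(x * x for x in series)
    if n == 0 or total == 0:
        raise ValueError("series must be non-empty and not identically zero")
    d_values = []
    running = 0.0
    for k, x in enumerate(series, start=1):
        running += x * x
        d_values.append(running / total - k / n)
    return d_values


def most_likely_variance_break(series, critical_value=1.358):
    """Locate the k maximizing |D_k| and compare sqrt(T/2) * |D_k| with the critical value.

    Returns (k, statistic, significant) with k 1-based. The default
    critical value is an assumed asymptotic 95% threshold.
    """
    d_values = centered_cusum_of_squares(series)
    n = len(series)
    best = max(range(n), key=lambda i: abs(d_values[i]))
    statistic = math.sqrt(n / 2) * abs(d_values[best])
    return best + 1, statistic, statistic > critical_value
```

The full ICSS procedure would apply this step repeatedly to the sub-series on either side of each detected point and then re-check the candidate points; that iteration is omitted here.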
We connect the multi-class Neyman-Pearson classification (NP) problem to the cost-sensitive learning (CS) problem, and propose two algorithms (NPMC-CX and NPMC-ER) to solve the multi-class NP problem through cost-sensitive learning tools. Under certain conditions, the two algorithms are shown to satisfy multi-class NP properties. More details are available in the paper "Neyman-Pearson Multi-class Classification via Cost-sensitive Learning" (Ye Tian and Yang Feng, 2021).
This package provides functions to access and download data from various NASA APIs <https://api.nasa.gov/#browseAPI>, including: Astronomy Picture of the Day (APOD), Mars Rover Photos, Earth Polychromatic Imaging Camera (EPIC), Near Earth Object Web Service (NeoWs), Earth Observatory Natural Event Tracker (EONET), and NASA Earthdata CMR Search. Most endpoints require a NASA API key for access. Data is retrieved, cleaned for analysis, and returned in a dataframe-friendly format.
In Switzerland, the landscape of municipalities is changing rapidly mainly due to mergers. The Swiss Municipal Data Merger Tool automatically detects these mutations and maps municipalities over time, i.e. municipalities of an old state to municipalities of a new state. This functionality is helpful when working with datasets that are based on different spatial references. The package's idea and use case is discussed in the following article: <doi:10.1111/spsr.12487>.
This package performs variable selection/feature reduction under a clustering or classification framework. In particular, it can be used in an automated fashion using mixture model-based methods ('teigen' and 'mclust' are currently supported). It can account for mixtures of non-Gaussian distributions via the Manly transform (via 'ManlyMix'). See Andrews and McNicholas (2014) <doi:10.1007/s00357-013-9139-2> and Neal and McNicholas (2023) <doi:10.48550/arXiv.2305.16464>.
rifi analyses data from rifampicin time series created by microarray or RNAseq. rifi is a transcriptome data analysis tool for the holistic identification of transcription and decay associated processes. The decay constants and the delay of the onset of decay are fitted for each probe/bin. Subsequently, probes/bins of equal properties are combined into segments by dynamic programming, independent of an existing genome annotation. This allows the detection of transcript segments of different stability or transcriptional events within one annotated gene. In addition to the classic decay constant/half-life analysis, rifi detects processing sites, transcription pausing sites, internal transcription start sites in operons, and sites of partial transcription termination in operons; it also identifies areas of likely transcriptional interference by the collision mechanism and gives an estimate of the transcription velocity. All data are integrated to give an estimate of continuous transcriptional units, i.e. operons. Comprehensive output tables and visualizations of the full genome result and the individual fits for all probes/bins are produced.
This package provides functions for the analysis of income distributions for subgroups of the population as defined by a set of variables like age, gender, region, etc. This entails a Kolmogorov-Smirnov test for a mixture distribution as well as functions for moments, inequality measures, entropy measures and polarisation measures of income distributions. This package thus aids the analysis of income inequality by offering tools for the exploratory analysis of income distributions at the disaggregated level.
This package provides an implementation of methods for the estimation of quantitative maps from Multi-Parameter Mapping (MPM) acquisitions, including adaptive smoothing methods in the framework of the ESTATICS model. The smoothing method is described in Mohammadi et al. (2017) <doi:10.20347/WIAS.PREPRINT.2432>. Usage of the package is also described in Polzehl and Tabelow (2019), Magnetic Resonance Brain Imaging, Chapter 6, Springer, Use R! Series <doi:10.1007/978-3-030-29184-6_6>.
Racket is a general-purpose programming language in the Scheme family, with a large set of libraries and a compiler based on Chez Scheme. Racket is also a platform for language-oriented programming, from small domain-specific languages to complete language implementations.
The main Racket distribution comes with many bundled packages, including the DrRacket IDE, libraries for GUI and web programming, and implementations of languages such as Typed Racket, R5RS and R6RS Scheme, Algol 60, and Datalog.
This package provides functions to fit Accurate Generalized Linear Models (AGLM), visualize them, and predict for new data. AGLM is defined as a regularized GLM which applies feature transformations based on a discretization of numerical features and specific coding methodologies for dummy variables. For more information on AGLM, see Suguru Fujita, Toyoto Tanaka, Kenji Kondo and Hirokazu Iwasawa (2020) <https://www.institutdesactuaires.com/global/gene/link.php?doc_id=16273&fg=1>.
This package implements the bound constrained optimal sample size allocation (BCOSSA) framework described in Bulus & Dong (2021) <doi:10.1080/00220973.2019.1636197> for power analysis of multilevel regression discontinuity designs (MRDDs) and multilevel randomized trials (MRTs) with continuous outcomes. Minimum detectable effect size (MDES) and power computations for MRDDs allow polynomial functional form specification for the score variable (with or without interaction with the treatment indicator). See Bulus (2021) <doi:10.1080/19345747.2021.1947425>.
Written to help undergraduate as well as graduate students get started with R for basic econometrics without the need to import specific functions and datasets from many different sources. Primarily, the package is meant to accompany the German textbook Auer, L.v., Hoffmann, S., Kranz, T. (2024, ISBN: 978-3-662-68263-0), whose exercises cover all the topics of the textbook Auer, L.v. (2023, ISBN: 978-3-658-42699-6).
Detection and attribution of climate change using methods including optimal fingerprinting via generalized total least squares or an estimating equation approach (Li et al., 2025, <doi:10.1175/JCLI-D-24-0193.1>; Ma et al., 2023, <doi:10.1175/JCLI-D-22-0681.1>). Provides shrinkage estimators for the covariance matrix following Ledoit and Wolf (2004, <doi:10.1016/S0047-259X(03)00096-4>) and Ledoit and Wolf (2017, <doi:10.2139/ssrn.2383361>).
This package implements a Markov Chain Monte Carlo algorithm to approximate exact conditional inference for logistic regression models. Exact conditional inference is based on the distribution of the sufficient statistics for the parameters of interest given the sufficient statistics for the remaining nuisance parameters. Using model formula notation, users specify a logistic model and model terms of interest for exact inference. See Zamar et al. (2007) <doi:10.18637/jss.v021.i03> for more details.
This package provides functions for quantifying evolution and selection on complex traits. The package implements effective handling and analysis algorithms scaled for genome-wide data and calculates a composite statistic, denoted Ghat, which is used to test for selection on a trait. The package provides a number of simple examples for handling and analysing the genome data and visualising the output and results. See Beissinger et al. (2018) <doi:10.1534/genetics.118.300857>.
This package provides functions to compute the Generalized Dynamic Principal Components introduced in Peña and Yohai (2016) <DOI:10.1080/01621459.2015.1072542>. The implementation includes an automatic procedure proposed in Peña, Smucler and Yohai (2020) <DOI:10.18637/jss.v092.c02> for the identification of both the number of lags to be used in the generalized dynamic principal components as well as the number of components required for a given reconstruction accuracy.
IRT-M is a semi-supervised approach based on Bayesian Item Response Theory that produces theoretically identified underlying dimensions from input data and a constraints matrix. The methodology is fully described in Morucci et al. (2024), "Measurement That Matches Theory: Theory-Driven Identification in Item Response Theory Models". Details are available at <https://www.cambridge.org/core/journals/american-political-science-review/article/measurement-that-matches-theory-theorydriven-identification-in-item-response-theory-models/395DA1DFE3DCD7B866DC053D7554A30B>.
For fitting Bayesian joint latent class and regression models using Gibbs sampling. See the documentation for the model. The technical details of the model implemented here are described in Elliott, Michael R., Zhao, Zhangchen, Mukherjee, Bhramar, Kanaya, Alka, Needham, Belinda L., "Methods to account for uncertainty in latent class assignments when using latent classes as predictors in regression models, with application to acculturation strategy measures" (2020) In press at Epidemiology <doi:10.1097/EDE.0000000000001139>.
This package provides functions to fit linear mixed models based on convolutions of the generalized Laplace (GL) distribution. The GL mixed-effects model includes four special cases with normal random effects and normal errors (NN), normal random effects and Laplace errors (NL), Laplace random effects and normal errors (LN), and Laplace random effects and Laplace errors (LL). The methods are described in Geraci and Farcomeni (2020, Statistical Methods in Medical Research) <doi:10.1177/0962280220903763>.
Includes functions and data used in the book "Presenting Statistical Results Effectively", Andersen and Armstrong (2022, ISBN: 978-1446269800). Several functions aid in data visualization - creating compact letter displays for simple slopes and kernel density estimates with normal density overlay. Other functions aid in post-model evaluation - heatmap fit statistics for binary predictors, several variable importance measures, compact letter displays and simple-slope calculation. Finally, the package makes available the example datasets used in the book.
The typicality and eccentricity data analysis (TEDA) framework was put forward by Angelov (2013) <DOI:10.14313/JAMRIS_2-2014/16>. It has been further developed into multiple different techniques since, and provides a non-parametric way of determining how similar an observation, from a process that is not purely random, is to other observations generated by the process. This package provides code to use the batch and recursive TEDA methods that have been published.
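As a rough illustration of the batch idea, the sketch below computes eccentricity and typicality for one-dimensional data. It assumes the Euclidean-distance form commonly given in the TEDA literature, eccentricity = 1/k + (x - mean)^2 / (k * variance) with the biased variance, and typicality = 1 - eccentricity. The function names are hypothetical, and whether the package computes these quantities in exactly this way is an assumption.

```python
def batch_eccentricity(observations):
    """Eccentricity of each observation in a 1-D batch (assumed Euclidean form)."""
    k = len(observations)
    if k < 2:
        raise ValueError("need at least two observations")
    mean = sum(observations) / k
    # Biased (population) variance, as assumed in the formula above.
    variance = sum((x - mean) ** 2 for x in observations) / k
    if variance == 0:
        raise ValueError("observations must not all be identical")
    return [1 / k + (x - mean) ** 2 / (k * variance) for x in observations]


def batch_typicality(observations):
    """Typicality is taken here as the complement of eccentricity."""
    return [1 - e for e in batch_eccentricity(observations)]
```

Under this form the eccentricities of a batch sum to 2, so an observation far from the rest takes a disproportionately large share of that total and a correspondingly low typicality.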