This package is an R program for the subset-based analysis of heterogeneous traits and disease subtypes. ASSET allows the user to search through all possible subsets of z-scores to identify the subset of traits giving the best meta-analyzed z-score. Further, it returns a p-value adjusting for the multiple-testing involved in the search. It also allows for searching for the best combination of disease subtypes associated with each variant.
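The subset search described above can be illustrated with a small brute-force sketch. It is not ASSET's implementation: the function name `best_subset_z` and the per-trait `weights` argument are assumptions made for illustration, and the multiple-testing-adjusted p-value is not computed here.

```python
from itertools import combinations
from math import sqrt


def best_subset_z(z_scores, weights):
    """Exhaustively search non-empty subsets for the largest |meta z-score|.

    For a subset S the combined statistic is
    sum(w_i * z_i for i in S) / sqrt(sum(w_i ** 2 for i in S)),
    i.e. a weighted fixed-effects combination (weights are hypothetical inputs).
    """
    n = len(z_scores)
    best_subset, best_z = None, 0.0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            numerator = sum(weights[i] * z_scores[i] for i in subset)
            denominator = sqrt(sum(weights[i] ** 2 for i in subset))
            z = numerator / denominator
            if best_subset is None or abs(z) > abs(best_z):
                best_subset, best_z = subset, z
    return best_subset, best_z


subset, z = best_subset_z([3.0, 2.5, -0.2, 0.1], [1.0, 1.0, 1.0, 1.0])
print(subset, round(z, 3))
```

Because the maximum is taken over 2^k - 1 subsets, the raw z-score of the winning subset is optimistic, which is why an adjusted p-value is needed.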
The atena package quantifies expression of TEs (transposable elements) from RNA-seq data through different methods, including ERVmap, TEtranscripts and Telescope. A common interface is provided to use each of these methods, which consists of building a parameter object, calling the quantification function with this object and getting a SummarizedExperiment object as an output container of the quantified expression profiles. The implementation allows quantifying TEs and gene transcripts in an integrated manner.
The BADER package is intended for the analysis of RNA sequencing data. The algorithm fits a Bayesian hierarchical model for RNA sequencing count data. BADER returns the posterior probability of differential expression for each gene between two groups A and B. The joint posterior distribution of the variables in the model can be returned in the form of posterior samples, which can be used for further down-stream analyses such as gene set enrichment.
Analyzes autocorrelation and partial autocorrelation using surrogate methods and bootstrapping, and computes the acceleration constants for the vectorized moving block bootstrap provided by this package. It generates percentile, bias-corrected, and accelerated intervals and estimates partial autocorrelations using Durbin-Levinson. This package calculates the autocorrelation power spectrum, computes cross-correlations between two time series, computes bandwidth for any time series, and performs autocorrelation frequency analysis. It also calculates the periodicity of a time series.
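As a rough illustration of the Durbin-Levinson step mentioned above, the following sketch computes partial autocorrelations from a sequence of autocorrelations. The function names are hypothetical and the code is not taken from the package. The bootstrap and interval machinery is omitted.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag (assumes x is not constant)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def durbin_levinson_pacf(rho):
    """Partial autocorrelations from autocorrelations rho[0..K], rho[0] == 1."""
    pacf = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, len(rho)):
        numerator = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        denominator = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = numerator / denominator
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical ACF of an AR(1) process with coefficient 0.6.
print(durbin_levinson_pacf([0.6 ** k for k in range(4)]))
```

For an AR(1) process the partial autocorrelation is the AR coefficient at lag 1 and zero (up to rounding) at higher lags, which gives a quick sanity check of the recursion.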
Provides a tool to easily build customized data flows to pre-process large volumes of information from different sources. To this end, bdpar allows users to (i) easily use and create new functionalities and (ii) develop new data source extractors according to their needs. Additionally, the package provides by default a predefined data flow to extract and pre-process the most relevant information (tokens, dates, ...) from some textual sources (SMS, Email, YouTube comments).
An implementation of the k-means-- algorithm proposed by Chawla and Gionis, 2013 in their paper, "k-means-- : A unified approach to clustering and outlier detection. SIAM International Conference on Data Mining (SDM13)", <doi:10.1137/1.9781611972832.21> and using ordering described by Howe, 2013 in the thesis, "Clustering and anomaly detection in tropical cyclones". Useful for creating (potentially) tighter clusters than standard k-means and simultaneously finding outliers inexpensively in multidimensional space.
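A minimal sketch of the k-means-- idea follows: at every iteration the points farthest from their nearest center are set aside as outliers, and centers are recomputed from the remaining points only. The function name, the explicit `initial_centers` argument and the `-1` outlier label are assumptions for illustration. The ordering refinement attributed to Howe is not reproduced.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_minus_minus(points, initial_centers, n_outliers, max_iter=100):
    """Cluster points while excluding the n_outliers farthest points."""
    centers = [tuple(c) for c in initial_centers]
    labels, outliers = [], set()
    for _ in range(max_iter):
        # (squared distance, center index) of the nearest center per point.
        nearest = [
            min((squared_distance(p, c), j) for j, c in enumerate(centers))
            for p in points
        ]
        # Rank points by distance to their nearest center, farthest first.
        ranked = sorted(range(len(points)), key=lambda i: nearest[i][0], reverse=True)
        outliers = set(ranked[:n_outliers])
        labels = [-1 if i in outliers else nearest[i][1] for i in range(len(points))]
        # Recompute each center from its non-outlier members only.
        new_centers = []
        for j, old in enumerate(centers):
            members = [points[i] for i, lab in enumerate(labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center unchanged
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels, sorted(outliers)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
centers, labels, outliers = kmeans_minus_minus(points, [(0, 0), (10, 10)], n_outliers=1)
print(labels, outliers)
```

Excluding the flagged points from the mean update is what keeps a distant point such as `(50, 50)` from dragging a center away from its cluster.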
Functions, data sets and examples for the book: Yves Croissant (2025) "Microeconometrics with R", Chapman and Hall/CRC The R Series <doi:10.1201/9781003100263>. The package includes a set of estimators for models used in microeconometrics, especially for count data and limited dependent variables. Test functions include score test, Hausman test, Vuong test, Sargan test and conditional moment test. A small subset of the data set used in the book is also included.
Algorithms for solving various Maximum Weight Connected Subgraph Problems, including variants with budget constraints, cardinality constraints, weighted edges and signals. The package represents an R interface to highly efficient solvers based on the relax-and-cut approach (Álvarez-Miranda E., Sinnl M. (2017) <doi:10.1016/j.cor.2017.05.015>), mixed-integer programming (Loboda A., Artyomov M., and Sergushichev A. (2016) <doi:10.1007/978-3-319-43681-4_17>) and simulated annealing.
An adaptation of Non-dominated Sorting Genetic Algorithm III for multi-objective feature selection tasks. Non-dominated Sorting Genetic Algorithm III is a genetic algorithm that optimizes several objectives simultaneously by applying a non-dominated sorting technique. It uses a reference-point-based selection operator to explore the solution space and preserve diversity. See the original paper by K. Deb and H. Jain (2014) <DOI:10.1109/TEVC.2013.2281534> for a detailed description.
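The non-dominated sorting step can be sketched as follows, assuming every objective is to be minimized. This is a simple quadratic-time version written for illustration, with hypothetical function names. The reference-point selection operator of NSGA-III is not shown.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(objectives):
    """Split solutions into successive Pareto fronts (lists of indices)."""
    remaining = set(range(len(objectives)))
    fronts = []
    while remaining:
        # A solution is in the current front if nothing left dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(objectives[j], objectives[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (classification error, number of features) pairs.
candidates = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
print(non_dominated_sort(candidates))
```

In a feature selection setting the objectives might be prediction error and the number of selected features, so the first front holds the best available trade-offs between accuracy and model size.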
Do Markov chain Monte Carlo (MCMC) simulation of Potts models (Potts, 1952, <doi:10.1017/S0305004100027419>), which are the multi-color generalization of Ising models (so, as a special case, also simulates Ising models). Use the Swendsen-Wang algorithm (Swendsen and Wang, 1987, <doi:10.1103/PhysRevLett.58.86>) so MCMC is fast. Do maximum composite likelihood estimation of parameters (Besag, 1975, <doi:10.2307/2987782>, Lindsay, 1988, <doi:10.1090/conm/080>).
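A single Swendsen-Wang update can be sketched as below. The sketch assumes a rectangular grid with free boundaries and a density proportional to exp(beta * number of like-colored neighbor pairs). The function name and arguments are hypothetical, and the package's own parameterization and boundary options may differ.

```python
import math
import random


def swendsen_wang_sweep(grid, n_colors, beta, rng):
    """One Swendsen-Wang update of a Potts configuration (list of rows)."""
    n_rows, n_cols = len(grid), len(grid[0])
    parent = list(range(n_rows * n_cols))  # union-find forest over sites

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Bond like-colored neighbors with probability 1 - exp(-beta).
    p_bond = 1.0 - math.exp(-beta)
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < n_cols and grid[r][c] == grid[rr][cc]:
                    if rng.random() < p_bond:
                        a, b = find(r * n_cols + c), find(rr * n_cols + cc)
                        if a != b:
                            parent[b] = a

    # Give every bonded cluster a fresh color drawn uniformly at random.
    new_color = {}
    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            root = find(r * n_cols + c)
            if root not in new_color:
                new_color[root] = rng.randrange(n_colors)
            row.append(new_color[root])
        result.append(row)
    return result


rng = random.Random(1)
state = [[rng.randrange(3) for _ in range(6)] for _ in range(6)]
for _ in range(10):
    state = swendsen_wang_sweep(state, n_colors=3, beta=0.8, rng=rng)
```

Recoloring whole clusters at once is what lets the chain move quickly between configurations that single-site updates would take many steps to connect.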
The primary purpose of this package is to calculate Pst values to assess differentiation among populations from a set of quantitative traits. The bootstrap method provides confidence intervals and distribution histograms of Pst. Variations of Pst as a function of the parameter c/h^2 are studied as well. Finally, the package offers different transformations, especially to eliminate any variation resulting from allometric growth (calculation of residuals from linear regressions, Reist standardizations or Aitchison transformation).
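The dependence of Pst on c/h^2 can be illustrated with the form commonly given in the literature, Pst = (c/h^2) * Vb / ((c/h^2) * Vb + 2 * Vw), where Vb and Vw are the between-population and within-population phenotypic variance components. The sketch below assumes those components are already estimated. The function names are hypothetical, and the package's own estimation and bootstrap steps are not reproduced.

```python
def pst(var_between, var_within, c_over_h2=1.0):
    """Pst for given variance components and a given c/h^2 ratio."""
    scaled = c_over_h2 * var_between
    return scaled / (scaled + 2.0 * var_within)


def pst_curve(var_between, var_within, ratios):
    """Pst evaluated over a range of c/h^2 values, as (ratio, Pst) pairs."""
    return [(r, pst(var_between, var_within, r)) for r in ratios]


for ratio, value in pst_curve(2.0, 1.0, [0.25, 0.5, 1.0, 2.0, 4.0]):
    print(f"c/h^2 = {ratio:<4} Pst = {value:.3f}")
```

Since c/h^2 is rarely known, examining how Pst changes across plausible ratios shows how sensitive a conclusion about differentiation is to that assumption.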
Supervised and unsupervised multivariate methods, supplemented by GUI and some visualizations, to perform various analyses in the field of computational stylistics, authorship attribution, etc. For further reference, see Eder et al. (2016), <https://journal.r-project.org/archive/2016/RJ-2016-007/index.html>. You are also encouraged to visit the Computational Stylistics Group's website <https://computationalstylistics.github.io/>, where a reasonable amount of information about the package and related projects is provided.
This package contains functions that fit linear mixed-effects models for high-dimensional data (p>>n) with penalty for both the fixed effects and random effects for variable selection. The details of the algorithm can be found in Luoying Yang's PhD thesis (Yang and Wu 2020). The algorithm implementation is based on the R package lmmlasso. Reference: Yang L, Wu TT (2020). Model-Based Clustering of Longitudinal Data in High-Dimensionality. Unpublished thesis.
Allows users to list data structures using path-based navigation. Provides intuitive methods for storing, accessing, and manipulating nested data through simple path strings. Key features include strict mode validation, path existence checking, recursive operations, and automatic parent-level creation. Designed for use cases requiring organized storage of complex nested data while maintaining simple access patterns. Particularly useful for configuration management, nested settings, and any application where data naturally forms a tree-like structure.
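The path-based access pattern can be sketched with nested dictionaries. Everything here is hypothetical: the function names, the `/` separator and the `strict` flag are illustrative choices and are not the package's API.

```python
def set_path(store, path, value, sep="/"):
    """Store value at path, creating missing parent levels automatically."""
    keys = path.split(sep)
    node = store
    for key in keys[:-1]:
        if key not in node:
            node[key] = {}
        elif not isinstance(node[key], dict):
            raise ValueError(f"'{key}' in '{path}' is a leaf, not a container")
        node = node[key]
    node[keys[-1]] = value


def has_path(store, path, sep="/"):
    """Check whether every level of path exists."""
    node = store
    for key in path.split(sep):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def get_path(store, path, sep="/", strict=False, default=None):
    """Fetch the value at path; in strict mode a missing path raises KeyError."""
    node = store
    for key in path.split(sep):
        if not isinstance(node, dict) or key not in node:
            if strict:
                raise KeyError(path)
            return default
        node = node[key]
    return node


config = {}
set_path(config, "db/primary/host", "localhost")
print(get_path(config, "db/primary/host"), has_path(config, "db/replica"))
```

Strict mode trades convenience for safety: a mistyped path fails loudly instead of silently returning a default.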
This package implements the Variable importance Explainable Elastic Shape Analysis pipeline for explainable machine learning with functional data inputs. Converts training and testing data functional inputs to elastic shape analysis principal components that account for vertical and/or horizontal variability. Computes feature importance to identify important principal components and visualizes variability captured by functional principal components. See Goode et al. (2025) <doi:10.48550/arXiv.2501.07602> for technical details about the methodology.
This package provides the facility to calculate the Brainerd-Robinson similarity coefficient for the rows of an input table, and to calculate the significance of each coefficient based on a permutation approach; a heatmap is produced to visually represent the similarity matrix. Optionally, hierarchical agglomerative clustering can be performed and the silhouette method is used to identify an optimal number of clusters; the results of the clustering can be optionally used to sort the heatmap.
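For reference, the Brainerd-Robinson coefficient between two rows is 200 minus the sum of absolute differences between their row percentages, so it ranges from 0 (no categories shared) to 200 (identical composition). A minimal sketch with hypothetical function names follows. The permutation test, heatmap and clustering steps are not shown.

```python
def row_percentages(row):
    """Convert counts to percentages of the row total (total must be > 0)."""
    total = sum(row)
    return [100.0 * value / total for value in row]


def brainerd_robinson(table):
    """Pairwise Brainerd-Robinson similarity matrix for the rows of table."""
    pct = [row_percentages(row) for row in table]
    n = len(pct)
    return [
        [200.0 - sum(abs(a - b) for a, b in zip(pct[i], pct[j])) for j in range(n)]
        for i in range(n)
    ]


counts = [[10, 10], [20, 20], [30, 0]]
for row in brainerd_robinson(counts):
    print(row)
```

The first two rows have the same proportions despite different totals, so their similarity is 200, which shows that the coefficient compares composition rather than abundance.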
Provides a series of functions to conduct a meta-analysis of factor analysis based on co-occurrence matrices. The tool can be used to resolve debates about factor structure (i.e. the inner structure of a construct or scale) in several disciplines, such as psychology, psychiatry, management, education, and so on. References: Shafer (2005) <doi:10.1037/1040-3590.17.3.324>; Shafer (2006) <doi:10.1002/jclp.20213>; Loeber and Schmaling (1985) <doi:10.1007/BF00910652>.
This package implements a basis function or functional data analysis framework for several techniques of multivariate analysis in a continuous-time setting. Specifically, it introduces continuous-time analogues of several classical techniques of multivariate analysis, such as principal component analysis, canonical correlation analysis, Fisher linear discriminant analysis, K-means clustering, and so on. Details are in Biplab Paul, Philip T. Reiss and Erjia Cui (2023) "Continuous-time multivariate analysis" <doi:10.48550/arXiv.2307.09404>.
This package provides functions designed to connect disease-related differential proteins and co-expression networks. It provides basic statistical analyses, including the t test and ANOVA. Network construction is not offered by the package; you can use the WGCNA package for that, which is described in Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. It also provides module analysis, including PCA, two enrichment analyses, planar maximally filtered graph extraction and hub analysis.
Instrumental variable estimation for linear models by two-stage least-squares (2SLS) regression or by robust-regression via M-estimation (2SM) or MM-estimation (2SMM). The main ivreg() model-fitting function is designed to provide a workflow as similar as possible to standard lm() regression. A wide range of methods is provided for fitted ivreg model objects, including extensive functionality for computing and graphing regression diagnostics in addition to other standard model tools.
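The two stages of 2SLS can be sketched for the simplest case of one endogenous regressor, one instrument and an intercept. This is an illustration of the estimator only, with hypothetical function names. It does not mirror ivreg(), and it omits the robust variants and standard errors.

```python
def ols_simple(x, y):
    """Intercept and slope of a least-squares fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x and one instrument z."""
    # Stage 1: project the endogenous regressor on the instrument.
    a0, a1 = ols_simple(z, x)
    x_hat = [a0 + a1 * v for v in z]
    # Stage 2: regress the outcome on the fitted values from stage 1.
    return ols_simple(x_hat, y)


# Toy data: u confounds x and y but is uncorrelated with the instrument z.
z = [1.0, 2.0, 3.0, 4.0]
u = [1.0, -1.0, -1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]

print("OLS :", ols_simple(x, y))
print("2SLS:", two_stage_least_squares(y, x, z))
```

In this toy data the plain OLS slope is pulled away from the true value of 2 by the confounder, while the 2SLS slope recovers it. Standard errors taken from a naive second-stage fit would be incorrect, which is one reason to use a dedicated implementation.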
This package implements the method to analyse weighted mobility networks or distribution networks as outlined in: Block, P., Stadtfeld, C., & Robins, G. (2022) <doi:10.1016/j.socnet.2021.08.003>. The purpose of the model is to analyse the structure of mobility, incorporating exogenous predictors pertaining to individuals and locations known from classical mobility analyses, as well as modelling emergent mobility patterns akin to structural patterns known from the statistical analysis of social networks.
This package contains basic tools for performing multiple-output quantile regression and computing regression quantile contours by means of directional regression quantiles. In the location case, one can thus obtain halfspace depth contours in two to six dimensions. Hallin, M., Paindaveine, D. and Šiman, M. (2010) Multivariate quantiles and multiple-output regression quantiles: from L1 optimization to halfspace depth. Annals of Statistics 38, 635-669. For more references about the method, see the help pages.
Provides tools to study network evolution models and different blockmodeling approaches. Various functions enable generating (temporal) networks with a selected blockmodel type, taking into account selected local network mechanisms. The development of this package is financially supported by the Slovenian Research Agency (www.arrs.gov.si) within the research program P5-0168 and the research project J5-2557 (Comparison and evaluation of different approaches to blockmodeling dynamic networks by simulations with application to Slovenian co-authorship networks).
Model age schedules of mortality, nqx, suitable for a life table. This package implements the SVD-Comp mortality model indexed by either child or child/adult mortality. Given input value(s) of either 5q0 or (5q0, 45q15), the qx() function generates single-year 1qx or 5-year 5qx conditional age-specific probabilities of dying. See Clark (2016) <doi:10.48550/arXiv.1612.01408> and Clark (2019) <doi:10.1007/s13524-019-00785-3>.
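The nqx notation can be made concrete with the standard life-table identity nqx = 1 - prod(1 - 1q_a) over the n single-year ages starting at x. The sketch below applies that identity to a hypothetical single-year schedule. It does not implement the SVD-Comp model, and the function name is an assumption.

```python
def nqx(single_year_qx, start_age, n):
    """Probability of dying between start_age and start_age + n.

    single_year_qx[a] is 1q_a, the probability of dying between exact ages
    a and a + 1 for someone alive at age a.
    """
    ages = single_year_qx[start_age:start_age + n]
    if len(ages) < n:
        raise ValueError("schedule too short for the requested age interval")
    survival = 1.0
    for q in ages:
        survival *= 1.0 - q  # survive each single year in turn
    return 1.0 - survival


# Hypothetical flat schedule, for illustration only.
schedule = [0.01] * 60
child_mortality = nqx(schedule, 0, 5)    # 5q0
adult_mortality = nqx(schedule, 15, 45)  # 45q15
print(round(child_mortality, 5), round(adult_mortality, 5))
```

The two summary indices named in the description, 5q0 and 45q15, are instances of the same calculation over ages 0 to 4 and 15 to 59 respectively.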