Focuses on data processing and visualization in hydrology and climate forecasting. Main function includes data extraction, data downscaling, data resampling, gap filler of precipitation, bias correction of forecasting data, flexible time series plot, and spatial map generation. It is a good pre- processing and post-processing tool for hydrological and hydraulic modellers.
Efficient sampling from high-dimensional truncated Gaussian distributions, or multivariate truncated normal (MTN). Techniques include zigzag Hamiltonian Monte Carlo as in Akihiko Nishimura, Zhenyu Zhang and Marc A. Suchard (2024) <doi:10.1080/01621459.2024.2395587>, and harmonic Monte in Ari Pakman and Liam Paninski (2014) <doi:10.1080/10618600.2013.788448>.
This package implements approaches of non-parametric smooth test to compare simultaneously K(K>1) copulas and non-parametric clustering of multivariate populations with arbitrary sizes. See Yves I. Ngounou Bakam and Denys Pommeret (2022) <arXiv:2112.05623>
and Yves I. Ngounou Bakam and Denys Pommeret (2022) <arXiv:2211.06338>
.
An implementation of Multi-Task Logistic Regression (MTLR) for R. This package is based on the method proposed by Yu et al. (2011) which utilized MTLR for generating individual survival curves by learning feature weights which vary across time. This model was further extended to account for left and interval censored data.
This package provides mailmerge methods for reading spreadsheets of addresses and other relevant information to create standardized but customizable letters. Provides a method for mapping US ZIP codes, including those of letter recipients. Provides a method for parsing and processing html code from online job postings of the American Political Science Association.
This package provides a complete toolkit to process the Munich ChronoType
Questionnaire (MCTQ) for its three versions (standard, micro, and shift). MCTQ is a quantitative and validated tool to assess chronotypes using peoples sleep behavior, originally presented by Till Roenneberg, Anna Wirz-Justice, and Martha Merrow (2003, <doi:10.1177/0748730402239679>).
Uses a kernel smoothing approach to calculate Mutual Information for comparisons between all types of variables including continuous vs continuous, continuous vs discrete and discrete vs discrete. Uses a nonparametric bias correction giving Bias Corrected Mutual Information (BCMI). Implemented efficiently in Fortran 95 with OpenMP
and suited to large genomic datasets.
Estimates ordinal probit switching regression models - a Heckman type selection model with an ordinal selection and continuous outcomes. Different model specifications are allowed for each treatment/regime. For more details on the method, see Wang & Mokhtarian (2024) <doi:10.1016/j.tra.2024.104072> or Chiburis & Lokshin (2007) <doi:10.1177/1536867X0700700202>.
This work is an extension of the state space model for Poisson count data, Poisson-Gamma model, towards a semiparametric specification. Just like the generalized additive models (GAM), cubic splines are used for covariate smoothing. The semiparametric models are fitted by an iterative process that combines maximization of likelihood and backfitting algorithm.
This package provides a multiway method to decompose a tensor (array) of any order, as a generalisation of SVD also supporting non-identity metrics and penalisations. 2-way SVD with these extensions is also available. The package includes also some other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.
This package provides functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.
Sparse redundancy analysis for high dimensional (biomedical) data. Directional multivariate analysis to express the maximum variance in the predicted data set by a linear combination of variables of the predictive data set. Implemented in a partial least squares framework, for more details see Csala et al. (2017) <doi:10.1093/bioinformatics/btx374>.
We described a novel Topology-based pathway enrichment analysis, which integrated the global position of the nodes and the topological property of the pathways in Kyoto Encyclopedia of Genes and Genomes Database. We also provide some functions to obtain the latest information about pathways to finish pathway enrichment analysis using this method.
The goal of tosr is to create the Tree of Science from Web of Science (WoS
) and Scopus data. It can read files from both sources at the same time. More information can be found in Valencia-Hernández (2020) <https://revistas.unal.edu.co/index.php/ingeinv/article/view/77718>.
This package provides a convenient R interface to the National Health Service (NHS) Technology Reference Update Distribution (TRUD) API'. Retrieve available releases for items that you are subscribed to and download these with ease. For more information on the API, see <https://isd.digital.nhs.uk/trud/users/guest/filters/0/api>.
Calculate point estimates of and valid confidence intervals for nonparametric, algorithm-agnostic variable importance measures in high and low dimensions, using flexible estimators of the underlying regression functions. For more information about the methods, please see Williamson et al. (Biometrics, 2020), Williamson et al. (JASA, 2021), and Williamson and Feng (ICML, 2020).
This package provides a multivariate inferential analysis method for detecting differentially expressed genes in gene expression data. It uses artificial components, close to the data's principal components but with an exact interpretation in terms of differential genetic expression, to identify differentially expressed genes while controlling the false discovery rate (FDR).
This package fits models for genotypic disequilibria, as described in Huttley and Wilson (2000), Weir (1996) and Weir and Wilson (1986). Contrast terms are available that account for first order interactions between loci. It also implements, for a single locus in a single population, a conditional exact test for Hardy-Weinberg equilibrium.
This package provides an efficient algorithm for solving ultra-sparse regularized regression models using a variational Bayes algorithm with a spike prior. The algorithm is solved on a path, with coordinate updates, and is capable of generating very sparse models. Very general model diagnostics for controlling type-1 errors are also provided.
This package contains some tools for testing, analyzing time series data and fitting popular time series models such as ARIMA, Moving Average and Holt Winters, etc. Most functions also provide nice and clear outputs like SAS does, such as identify, estimate and forecast, which are the same statements in PROC ARIMA in SAS.
An all-encompassing R toolkit designed to streamline the process of calling various bioinformatics software and then performing data analysis and visualization in R. With blit', users can easily integrate a wide array of bioinformatics command line tools into their workflows, leveraging the power of R for sophisticated data manipulation and graphical representation.
Learning the structure of graphical models from datasets with thousands of variables. More information about the research papers detailing the theory behind Chordalysis is available at <http://www.francois-petitjean.com/Research> (KDD 2016, SDM 2015, ICDM 2014, ICDM 2013). The R package development site is <https://github.com/HerrmannM/Monash-ChoR>
.
Estimate common causal parameters using double/debiased machine learning as proposed by Chernozhukov et al. (2018) <doi:10.1111/ectj.12097>. ddml simplifies estimation based on (short-)stacking as discussed in Ahrens et al. (2024) <doi:10.1177/1536867X241233641>, which leverages multiple base learners to increase robustness to the underlying data generating process.
Interface for Rcpp users to dlib <http://dlib.net> which is a C++ toolkit containing machine learning algorithms and computer vision tools. It is used in a wide range of domains including robotics, embedded devices, mobile phones, and large high performance computing environments. This package allows R users to use dlib through Rcpp'.