Decision tree algorithm with a major feature added: users can define an ordering on the partitioning process, resulting in Branch-Exclusive Splits Trees (BEST). See Cedric Beaulac and Jeffrey S. Rosenthal (2019) <arXiv:1804.10168>.
This package provides tools for extraction and analysis of various n-grams (k-mers) derived from biological sequences (proteins or nucleic acids). Contains QuiPT (quick permutation test) for fast feature-filtering of the n-gram data.
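As a rough illustration of what n-gram (k-mer) extraction means here, the following minimal Python sketch counts contiguous k-mers in a single sequence. It is not the package's R interface; the function name `count_kmers` and the example sequence are assumptions made for illustration only.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every contiguous k-mer (n-gram of length k) in a sequence.

    Hypothetical helper for illustration; not part of any package API.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Slide a window of width k along the sequence.
    # Sequences shorter than k yield an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Example: 3-mers of a short nucleotide sequence.
counts = count_kmers("ACGTACGT", 3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

Counts like these form the feature table that a filtering step would then reduce.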
Compare baseline characteristics between two or more groups. The variables being compared can be factor or numeric variables. The function automatically determines the type and distribution of each variable, then produces descriptive statistics and bivariate analyses.
This package provides a minimal interface for applying annotators from the Stanford CoreNLP Java library. Methods are provided for tasks such as tokenisation, part of speech tagging, lemmatisation, named entity recognition, coreference detection and sentiment analysis.
This package contains the function calendR() for creating fully customizable monthly and yearly calendars (colors, fonts, formats, ...) and even heatmap calendars. In addition, it allows saving the calendars in ready to print A4 format PDF files.
This package provides a unified interface for simplifying cloud storage interactions, including uploading, downloading, reading, and writing files, with functions for both Google Drive (<https://www.google.com/drive/>) and Amazon S3 (<https://aws.amazon.com/s3/>).
Clusters germline and somatic locus data, which contain the total read depth and B allele read depth, using a Bayesian model (Dirichlet process). The clustering model can handle both SNV and CNA mutations.
Data cleaning functions for classes logical, factor, numeric, character, currency and Date to make data cleaning fast and easy. Relying on very few dependencies, it provides smart guessing, but with user options to override anything if needed.
Every research team has its own script for calculating hemodynamic indexes. This package makes it possible to insert a long-format data frame, add periods of interest (trigger-periods), and delete artifacts with deleter-files.
This package provides methods and utilities for testing, identifying, selecting and mutating objects as categorical or continuous types. These functions work on both atomic vectors and recursive objects: data.frames, data.tables, tibbles, lists, etc.
Monthly download stats of CRAN and Bioconductor packages. Download stats for CRAN packages come from the RStudio CRAN mirror; see <https://cranlogs.r-pkg.org:443>. Bioconductor package download stats are at <https://bioconductor.org/packages/stats/>.
Prediction methods where explanatory information is coded as a matrix of distances between individuals. Distances can be input directly as a distance matrix, a squared distance matrix, or an inner-products matrix, or computed from observed predictors.
Enables the user to build a citation network/graph from bibliographic data and, based on modularity and heterocitation metrics, assess the degree of awareness/cross-fertilization between two corpora/communities. This toolset is optimized for Scopus data.
Estimate Barton & Lord's (1981) <doi:10.1002/j.2333-8504.1981.tb01255.x> four parameter IRT model with lower and upper asymptotes using the Bayesian formulation described by Culpepper (2016) <doi:10.1007/s11336-015-9477-6>.
This package provides functions for automatically performing a reanalysis series on a data set using CNA, and for calculating the fit-robustness of the resulting models, as described in Parkkinen and Baumgartner (2021) <doi:10.1177/0049124120986200>.
The free algebra in R with non-commuting indeterminates. Uses disordR discipline (Hankin, 2022, <doi:10.48550/ARXIV.2210.03856>). To cite the package in publications please use Hankin (2022) <doi:10.48550/ARXIV.2211.04002>.
Detecting spatial associations via spatial stratified heterogeneity, accounting for spatial dependencies, interpretability, complex interactions, and robust stratification. In addition, it supports the spatial stratified heterogeneity family described in Lv et al. (2025) <doi:10.1111/tgis.70032>.
Simulation and analysis of graded response data with different types of estimators. Also, an interactive shiny application is provided with graphics for characteristic and information curves. Samejima (2018) <doi:10.1007/978-1-4757-2691-6_5>.
Manage project dependencies from your DESCRIPTION file. Create a reproducible virtual environment with minimal additional files in your project. Provides tools to add, remove, and update dependencies as well as install existing dependencies with a single function.
This package provides a set of tools to foster the development of reproducible analytical workflows by simplifying the download of data and metadata from DataONE (<https://www.dataone.org>) and easily importing this information into R.
Our approach uses a mixture of multilayer stochastic block models to group co-membership matrices with similar information into components and to partition observations into different clusters. See De Santiago (2023, ISBN: 978-2-87587-088-9).
Simulate Mediterranean forest functioning and dynamics using cohort-based description of vegetation [De Caceres et al. (2015) <doi:10.1016/j.agrformet.2015.06.012>; De Caceres et al. (2021) <doi:10.1016/j.agrformet.2020.108233>].
Derives the most frequent hierarchies along with their probability of occurrence. One can also define complex hierarchy criteria and calculate their probability. Methodology based on Papakonstantinou et al. (2021) <DOI:10.21203/rs.3.rs-858140/v1>.
Classification, regression, and clustering with the k-nearest neighbors algorithm. Implements several distance and similarity measures, covering continuous and logical features. Outputs ranked neighbors. Most features of this package are directly based on the PMML specification for KNN.
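To make "ranked neighbors" concrete, here is a minimal Python sketch of k-nearest-neighbor classification with Euclidean distance and majority voting. It is an assumption-laden illustration, not the package's R interface or the PMML procedure: the names `ranked_neighbors` and `knn_classify`, the toy data, and the choice of Euclidean distance are all hypothetical.

```python
import math
from collections import Counter


def ranked_neighbors(query, points, k):
    """Return the k nearest (index, distance) pairs, closest first.

    Hypothetical helper; Euclidean distance is assumed for illustration.
    """
    distances = [(i, math.dist(query, p)) for i, p in enumerate(points)]
    # Sort by distance; Python's sort is stable, so ties keep input order.
    distances.sort(key=lambda pair: pair[1])
    return distances[:k]


def knn_classify(query, points, labels, k):
    """Predict a label by majority vote among the k nearest neighbors."""
    neighbors = ranked_neighbors(query, points, k)
    votes = Counter(labels[i] for i, _ in neighbors)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "b", "b"]
query = (0.4, 0.2)

nearest = ranked_neighbors(query, points, 2)
prediction = knn_classify(query, points, labels, 3)
print(nearest, prediction)
```

For regression, the vote would be replaced by an average of the neighbors' target values. Other distance or similarity measures could be swapped in for `math.dist`.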