The package CellBarcode performs Cellular DNA Barcode analysis. It can handle all kinds of DNA barcodes, as long as the barcode is within a single sequencing read and has a pattern that can be matched by a regular expression. CellBarcode can handle barcodes with flexible lengths, with or without UMI (unique molecular identifier). This tool can also be used for pre-processing some amplicon data such as CRISPR gRNA screening, immune repertoire sequencing, and metagenome data.
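The regular-expression idea can be sketched in a few lines of Python. This is not the CellBarcode API: the read layout, the flanking sequences, and the function name `extract_barcodes` are assumptions made up for illustration. The sketch captures a UMI and a variable-length barcode from each read and counts each barcode once per distinct UMI.

```python
import re
from collections import Counter

# Hypothetical read layout (an assumption, not taken from CellBarcode):
# 5' flank "ACGT", an 8-nt UMI, the anchor "CTCGAG",
# a 10-14 nt barcode, then the 3' flank "GAATTC".
PATTERN = re.compile(r"ACGT([ACGT]{8})CTCGAG([ACGT]{10,14})GAATTC")


def extract_barcodes(reads):
    """Count barcodes, counting each (UMI, barcode) pair only once."""
    seen_pairs = set()
    counts = Counter()
    for read in reads:
        match = PATTERN.search(read)
        if match is None:
            continue  # read does not contain the expected pattern
        umi, barcode = match.groups()
        if (umi, barcode) in seen_pairs:
            continue  # same molecule seen again (duplicate UMI)
        seen_pairs.add((umi, barcode))
        counts[barcode] += 1
    return counts


reads = [
    "ACGT" + "AAAACCCC" + "CTCGAG" + "TTGGTTGGTTGG" + "GAATTC",
    "ACGT" + "AAAACCCC" + "CTCGAG" + "TTGGTTGGTTGG" + "GAATTC",  # duplicate UMI
    "ACGT" + "GGGGTTTT" + "CTCGAG" + "TTGGTTGGTTGG" + "GAATTC",
    "ACGT" + "AAAACCCC" + "CTCGAG" + "ACACACACAC" + "GAATTC",
    "TTTTTTTTTTTTTTTT",  # no barcode pattern
]
barcode_counts = extract_barcodes(reads)
```

Here the capture groups play the roles of UMI and barcode; a real analysis would also need to handle sequencing errors, which this sketch ignores.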
Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.
SpectralTAD is an R package designed to identify Topologically Associated Domains (TADs) from Hi-C contact matrices. It uses a modified version of spectral clustering that uses a sliding window to quickly detect TADs. The function works on a range of different formats of contact matrices and returns a bed file of TAD coordinates. The method does not require users to adjust any parameters to work and gives them control over the number of hierarchical levels to be returned.
SpatialCPie is an R package designed to facilitate cluster evaluation for spatial transcriptomics data by providing intuitive visualizations that display the relationships between clusters in order to guide the user during cluster identification and other downstream applications. The package is built around a shiny "gadget" to allow the exploration of the data with multiple plots in parallel and an interactive UI. The user can easily toggle between different cluster resolutions in order to choose the most appropriate visual cues.
A tandem repeat in DNA is two or more adjacent, approximate copies of a pattern of nucleotides. Tandem Repeats Finder is a program to locate and display tandem repeats in DNA sequences. In order to use the program, the user submits a sequence in FASTA format. The output consists of two files: a repeat table file and an alignment file. Submitted sequences may be of arbitrary length. Repeats with pattern size in the range from 1 to 2000 bases are detected.
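To make the definition concrete, here is a minimal Python sketch that finds exact tandem repeats only. It is not the algorithm used by Tandem Repeats Finder, which also detects approximate copies; the function name, the `max_unit` and `min_copies` parameters, and the example sequence are assumptions chosen for illustration.

```python
import re


def find_exact_tandem_repeats(sequence, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for exact tandem repeats.

    Simplification: only perfect copies are found, the scan is
    left to right, and reported repeats never overlap.
    """
    # A unit of 1..max_unit bases (shortest tried first), followed by
    # at least (min_copies - 1) further exact copies of that unit.
    pattern = re.compile(
        rf"([ACGT]{{1,{max_unit}}}?)\1{{{min_copies - 1},}}"
    )
    repeats = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        repeats.append((match.start(), unit, copies))
    return repeats


repeats = find_exact_tandem_repeats("GGATTATTATTCCACACACAG")
for start, unit, copies in repeats:
    print(f"position {start}: {copies} copies of {unit}")
```

A backreference can only match identical copies, so mismatches and indels between copies are outside the scope of this sketch.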
Programming oncology-specific Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team (2021), <https://www.cdisc.org/standards/foundational/adam>). The package is an extension package of the admiral package.
This package provides functions to compute distances between probability measures or any other data object that can be posed in this way, entropy measures for samples of curves, distances and depth measures for functional data, and the Generalized Mahalanobis Kernel distance for high dimensional data. For further details about the metrics please refer to Martos et al (2014) <doi:10.3233/IDA-140706>; Martos et al (2018) <doi:10.3390/e20010033>; Hernandez et al (2018, submitted); Martos et al (2018, submitted).
The four-gamete test is based on the infinite-sites model which assumes that the probability of the same mutation occurring twice (recurrent or parallel mutations) and the probability of a mutation back to the original state (reverse mutations) are close to zero. Without these types of mutations, the only explanation for observing the four dilocus genotypes (example below) is recombination (Hudson and Kaplan 1985, Genetics 111:147-164). Thus, the presence of all four gametes is also called phylogenetic incompatibility.
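As a minimal sketch of the test itself, assuming two biallelic sites coded as 0/1 across a set of phased haplotypes (the function name and the coding are illustrative assumptions, not taken from any package):

```python
def four_gamete_test(site_a, site_b):
    """Return True if all four two-site gametes are observed.

    site_a and site_b hold the alleles (e.g. "0"/"1") carried by the
    same haplotypes, in the same order. Both sites are assumed to be
    biallelic, so at most four distinct gametes can occur.
    """
    if len(site_a) != len(site_b):
        raise ValueError("both sites must cover the same haplotypes")
    gametes = set(zip(site_a, site_b))
    return len(gametes) == 4


# Haplotypes carry gametes 00, 01, 10, 11: incompatible sites.
incompatible = four_gamete_test("0011", "0101")
# Only gametes 00, 10, 11 occur: consistent with no recombination.
compatible = four_gamete_test("0011", "0001")
```

Under the infinite-sites assumptions stated above, a `True` result implies at least one recombination event between the two sites.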
This package provides a set of tools supporting more flexible heatmaps. The graphics are grid-like, using the old graphics system. The main function is heatmap.n2(), which is a wrapper around the various functions constructing individual parts of the heatmap, like sidebars, picket plots, legends etc. The function supports zooming and splitting, i.e., having (unlimited) small heatmaps underneath each other in one plot deriving from the same data set, e.g., clustered and ordered by a supervised clustering method.
This package provides functionality to support data preparation and exploration for palaeobiological analyses, improving code reproducibility and accessibility. The wider aim of palaeoverse is to bring the palaeobiological community together to establish agreed standards. The package currently includes functionality for data cleaning, binning (time and space), exploration, summarisation and visualisation. Reference datasets (i.e. Geological Time Scales <https://stratigraphy.org/chart>) and auxiliary functions are also provided. Details can be found in: Jones et al. (2023) <doi:10.1111/2041-210X.14099>.
Download, navigate and analyse the Student-Life dataset. The Student-Life dataset contains passive and automatic sensing data from the phones of a class of 48 Dartmouth College students. It was collected over a 10-week term. Additionally, the dataset contains ecological momentary assessment results along with pre-study and post-study mental health surveys. The intended use is to assess mental health, academic performance and behavioral trends. The raw dataset and additional information are available at <https://studentlife.cs.dartmouth.edu/>.
Supporting data for the EpiMix R package. It includes: - HM450_lncRNA_probes.rda - HM450_miRNA_probes.rda - EPIC_lncRNA_probes.rda - EPIC_miRNA_probes.rda - EpigenomeMap.rda - LUAD.sample.annotation - TCGA_BatchData - MET.data - mRNA.data - microRNA.data - lncRNA.data - Sample_EpiMixResults_lncRNA - Sample_EpiMixResults_miRNA - Sample_EpiMixResults_Regular - Sample_EpiMixResults_Enhancer - lncRNA expression data of tumors from TCGA that are stored in the ExperimentHub.
Some elementary matrix algebra tools are implemented to manage block matrices or partitioned matrices, i.e. "matrix of matrices" (http://en.wikipedia.org/wiki/Block_matrix). The block matrix is here defined as a new S3 object. In this package, some methods for "matrix" objects are rewritten for "blockmatrix" objects. New methods are implemented. This package was created to solve equation systems with block matrices for the analysis of environmental vector time series. Bugs/comments/questions/collaboration of any kind are warmly welcomed.
Simplifying the creation of print-ready maps, this package offers a user-friendly interface derived from ggplot2 for handling OpenStreetMap data. It streamlines the map-making process, allowing users to focus on the story their maps tell. Transforming raw geospatial data into informative visualizations is made easy with simple features sf geometries. Whether for urban planning, environmental studies, or impactful public presentations, this tool facilitates straightforward and effective map creation. Enhance the dissemination of spatial information with high-quality, narrative-driven visualizations!
This package provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.
This package provides functions to calculate predicted values and the difference between the two cases with confidence intervals for lm() [linear model], glm() [generalized linear model], glm.nb() [negative binomial model], polr() [ordinal logistic model], vglm() [generalized ordinal logistic model], multinom() [multinomial model], tobit() [tobit model], svyglm() [survey-weighted generalized linear models] and lmer() [linear multilevel models] using Monte Carlo simulations or bootstrap. Reference: Bennet A. Zelner (2009) <doi:10.1002/smj.783>.
We provide an R tool for teaching in Social Sciences. It allows the computation of index numbers. An index number is a measure of the evolution of a fixed magnitude for a single product or for several products. It is very useful in Social Sciences. Among others, we obtain simple index numbers (in chain or in series), index numbers for more than one product, and weighted index numbers such as the Laspeyres index (Laspeyres, 1864), the Paasche index (Paasche, 1874) or the Fisher index (Lapedes, 1978).
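For readers unfamiliar with these indices, the standard textbook price-index formulas can be sketched in Python as follows. The function names and the example prices and quantities are illustrative assumptions, not the package's interface; results are returned as ratios rather than multiplied by 100.

```python
import math


def laspeyres(p0, q0, pt):
    """Laspeyres price index: base-period quantities as weights."""
    return sum(p * q for p, q in zip(pt, q0)) / sum(
        p * q for p, q in zip(p0, q0)
    )


def paasche(p0, pt, qt):
    """Paasche price index: current-period quantities as weights."""
    return sum(p * q for p, q in zip(pt, qt)) / sum(
        p * q for p, q in zip(p0, qt)
    )


def fisher(p0, q0, pt, qt):
    """Fisher index: geometric mean of Laspeyres and Paasche."""
    return math.sqrt(laspeyres(p0, q0, pt) * paasche(p0, pt, qt))


# Made-up data for two products in a base period (0) and a period t.
p0, q0 = [1.0, 2.0], [10, 5]
pt, qt = [1.5, 2.0], [8, 6]

l_index = laspeyres(p0, q0, pt)   # 25 / 20
p_index = paasche(p0, pt, qt)     # 24 / 20
f_index = fisher(p0, q0, pt, qt)  # sqrt(l_index * p_index)
```

The only difference between the Laspeyres and Paasche forms is which period's quantities serve as weights; the Fisher index averages the two.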
Because the journal abbreviations in Bib files exported by reference management software (such as Zotero or Mendeley) are not detailed enough, the journalabbr package abbreviates only the journal field of a Bib file and then outputs a new Bib file, which other software (such as texstudio) can use to generate references with abbreviated journal names. The abbreviation table is from JabRef. A Shiny application is also provided to generate thebibliography, a reference format that can be used directly for LaTeX paper writing based on Rmd files.
Allows users to download and analyze official data on Brazil's federal budget through the SPARQL endpoint provided by the Integrated Budget and Planning System ('SIOP'). This package enables access to detailed information on budget allocations and expenditures of the federal government, making it easier to analyze and visualize these data. Technical information on the Brazilian federal budget is available (Portuguese only) at <https://www1.siop.planejamento.gov.br/mto/>. The SIOP endpoint is available at <https://www1.siop.planejamento.gov.br/sparql/>.
This package provides a complete suite of tools for interacting with the Survey Solutions GraphQL API <https://demo.mysurvey.solutions/graphql/>. This package encompasses all currently available queries and mutations, including the latest features for map uploads. It is built on the modern httr2 package, offering a streamlined and efficient interface without relying on external GraphQL client packages. In addition to core API functionalities, the package includes a range of helper functions designed to facilitate the use of available query filters.
Estimates the predicted 10-year cardiovascular (CVD) risk score (in probability) for women military service members and veterans by inputting patient profiles. The proposed women CVD risk score improves the accuracy of the existing American College of Cardiology/American Heart Association CVD risk assessment tool in predicting long-term CVD risk for VA women, particularly in young and racial/ethnic minority women. See the reference: Jeon-Slaughter, H., Chen, X., Tsai, S., Ramanan, B., & Ebrahimi, R. (2021) <doi:10.1161/JAHA.120.019217>.
This package implements four major subtype classifiers for high-grade serous (HGS) ovarian cancer as described by Helland et al. (PLoS One, 2011), Bentink et al. (PLoS One, 2012), Verhaak et al. (J Clin Invest, 2013), and Konecny et al. (J Natl Cancer Inst, 2014). In addition, the package implements a consensus classifier, which consolidates and improves on the robustness of the proposed subtype classifiers, thereby providing reliable stratification of patients with HGS ovarian tumors of clearly defined subtype.
Software which provides numerous functionalities for detecting and removing group-level effects from high-dimensional scientific data which, when combined with additional assumptions, allow for causal conclusions, as described in our manuscripts Bridgeford et al. (2024) <doi:10.1101/2021.09.03.458920> and Bridgeford et al. (2023) <doi:10.48550/arXiv.2307.13868>. Also provides a number of useful utilities for generating simulations and balancing covariates across multiple groups/batches of data via matching and propensity trimming for more than two groups.
Diagnose, visualize, and aggregate event report level data to the event level. Users provide an event report level dataset, specify their aggregation rules, and the package produces a dataset aggregated at the event level. Also includes the Modes and Agents of Election-Related Violence in Côte d'Ivoire and Kenya (MAVERICK) dataset, an event report level dataset that records all documented instances of electoral violence from the first multiparty election to 2022 in Côte d'Ivoire (1995-2022) and Kenya (1992-2022).