This package provides a collection of datasets essential for functional genomic analysis. It includes gene names, gene positions, and cytoband information sourced from Ensembl, together with a phenotype association graph prepared from the GWAS Catalog. Data are available for both the GRCh37 and GRCh38 builds. These datasets facilitate a wide range of genomic studies, including the identification of genetic variants, exploration of genomic features, and post-GWAS functional analysis.
Generalized LassO applied to knot selection in multivariate B-splinE Regression (GLOBER) implements a novel approach for estimating functions in a multivariate nonparametric regression model based on an adaptive knot selection for B-splines using the Generalized Lasso. For further details we refer the reader to the paper Savino, M. E. and Lévy-Leduc, C. (2023), <arXiv:2306.00686>.
An approach to analyzing Likert response items, with an emphasis on visualizations. The stacked bar plot is the preferred method for presenting Likert results. Tabular results are also implemented along with density plots to assist researchers in determining whether Likert responses can be used quantitatively instead of qualitatively. See the likert(), summary.likert(), and plot.likert() functions to get started.
The proposed method predicts the longitudinal mean response trajectory with a kernel-based estimator. The estimator assigns weights based on subject-wise similarity between predictor trajectories, measured in the L2 metric space, as well as on time proximity. Users can also perform variable selection to identify functional predictors with predictive significance, using the proposed multiplicative model with multivariate Gaussian kernels.
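As a rough illustration of the weighting idea described above, the following Python sketch computes a kernel-weighted mean response at a target time. It assumes Gaussian kernels for both the trajectory distance and the time gap, trajectories sampled on a common equally spaced grid, and a discrete approximation of the L2 distance. The function names and the bandwidths `h_x` and `h_t` are hypothetical and are not taken from the package.

```python
import math

def l2_distance(traj_a, traj_b, dt=1.0):
    """Discrete approximation of the L2 distance between two trajectories
    sampled on the same equally spaced grid (spacing dt)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(traj_a, traj_b)) * dt)

def gaussian_kernel(u):
    """Unnormalized Gaussian kernel."""
    return math.exp(-0.5 * u * u)

def predict_mean_response(new_traj, t_new, subjects, h_x=1.0, h_t=1.0):
    """Kernel-weighted average of observed responses.

    subjects: list of (predictor_trajectory, [(time, response), ...]).
    Each observation is weighted by the product of a kernel on the
    trajectory distance and a kernel on the time gap (assumed form).
    """
    num = 0.0
    den = 0.0
    for traj, observations in subjects:
        w_x = gaussian_kernel(l2_distance(new_traj, traj) / h_x)
        for t, y in observations:
            w = w_x * gaussian_kernel((t - t_new) / h_t)
            num += w * y
            den += w
    if den == 0.0:
        raise ValueError("all weights are zero; try larger bandwidths")
    return num / den

# Toy data: two subjects close to the new trajectory, one far away.
subjects = [
    ([0.0, 1.0, 2.0], [(0.0, 1.0), (1.0, 2.0)]),
    ([0.1, 1.1, 2.1], [(0.0, 1.2), (1.0, 2.2)]),
    ([5.0, 6.0, 7.0], [(0.0, 9.0), (1.0, 10.0)]),
]
prediction = predict_mean_response([0.0, 1.0, 2.0], 1.0, subjects, h_x=0.5, h_t=0.5)
```

In this toy setup the distant subject receives a negligible weight, so the prediction is driven by the two similar subjects and by the observations nearest in time.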
Complete analytical environment for the construction and analysis of matrix population models and integral projection models. Includes the ability to construct historical matrices, which are 2d matrices comprising 3 consecutive times of demographic information. Estimates both raw and function-based forms of historical and standard ahistorical matrices. It also estimates function-based age-by-stage matrices and raw and function-based Leslie matrices.
Implementation of hypothesis testing procedures described in Hansen (1992) <doi:10.1002/jae.3950070506>, Carrasco, Hu, & Ploberger (2014) <doi:10.3982/ECTA8609>, Dufour & Luger (2017) <doi:10.1080/07474938.2017.1307548>, and Rodriguez Rondon & Dufour (2024) <https://grodriguezrondon.com/files/RodriguezRondon_Dufour_2024_MonteCarlo_LikelihoodRatioTest_MarkovSwitchingModels_20241015.pdf> that can be used to identify the number of regimes in Markov switching models.
Generate maximum projection (MaxPro) designs for quantitative and/or qualitative factors. Details of the MaxPro criterion can be found in: (1) Joseph, Gul, and Ba. (2015) "Maximum Projection Designs for Computer Experiments", Biometrika, 102, 371-380, and (2) Joseph, Gul, and Ba. (2018) "Designing Computer Experiments with Multiple Types of Factors: The MaxPro Approach", Journal of Quality Technology, to appear.
This package automates plot generation following common molecular dynamics analyses. This includes straightforward plots, such as RMSD (Root-Mean-Square-Deviation) and RMSF (Root-Mean-Square-Fluctuation), but also more sophisticated ones such as dihedral angle maps, hydrogen bonds, cluster bar plots and DSSP (Definition of Secondary Structure of Proteins) analysis. It can currently load GROMOS, GROMACS and AMBER formats.
Additive proportional odds model for ordinal data using Laplace P-splines. The combination of Laplace approximations and P-splines enables fast and flexible inference in a Bayesian framework. Specific approximations are proposed to account for the asymmetry in the marginal posterior distributions of non-penalized parameters. For more details, see Lambert and Gressani (2023) <doi:10.1177/1471082X231181173> (preprint: <arXiv:2210.01668>).
Tool for producing Pen's parade graphs, useful for visualizing inequalities in income, wages or other variables, as proposed by Pen (1971, ISBN: 978-0140212594). Income or another economic variable is captured by the vertical axis, while the population is arranged in ascending order of income along the horizontal axis. Pen's income parades provide an easy-to-interpret visualization of economic inequalities.
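The construction described above can be sketched in a few lines of Python. This is only an illustration of how the plotted coordinates are obtained, not the package's implementation: the function name `pens_parade` is hypothetical, the sample incomes are made up, and the horizontal position is assumed to be the cumulative population share.

```python
def pens_parade(incomes):
    """Return (population_share, income) points for a Pen's parade.

    Individuals are sorted in ascending order of income; the i-th person
    (1-based) is placed at horizontal position i / n.
    """
    if not incomes:
        raise ValueError("incomes must not be empty")
    ordered = sorted(incomes)
    n = len(ordered)
    return [((i + 1) / n, value) for i, value in enumerate(ordered)]

# Made-up incomes for four individuals.
points = pens_parade([30_000, 12_000, 250_000, 45_000])
for share, income in points:
    print(f"{share:.2f}  {income}")
```

Plotting income against population share for these points gives the characteristic shape of the parade: a long, low stretch followed by a steep rise at the right-hand end.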
The goal of SAFEPG is to predict climate-related extreme losses by fitting a frequency-severity model. It improves predictive performance by introducing a sign-aligned regularization term, which ensures consistent signs for the coefficients across the frequency and severity components. This enhancement not only increases model accuracy but also enhances its interpretability, making it more suitable for practical applications in risk assessment.
This package provides tools for modeling non-continuous linear responses of ecological communities to environmental data. The analysis proceeds in three steps: (1) data ordering (function OrdData()), (2) split-moving-window analysis (function SMW()) and (3) piecewise redundancy analysis (function pwRDA()). Relevant references include Cornelius and Reynolds (1991) <doi:10.2307/1941559> and Legendre and Legendre (2012, ISBN: 9780444538697).
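To give an idea of what a split-moving-window analysis (step 2) does, here is a minimal Python sketch. It assumes the samples are already ordered along the gradient, slides an even-sized window along them, splits each window into two halves, and records the Euclidean distance between the half-window column means. The function name, the choice of dissimilarity, and the toy data are assumptions made for illustration; they do not reproduce the package's SMW() function.

```python
import math

def split_moving_window(rows, window_size):
    """Dissimilarity profile along ordered samples.

    rows: list of equal-length numeric lists, already ordered along the
    gradient. Returns (midpoint_index, dissimilarity) pairs, where the
    midpoint index is the first row of the second half-window.
    """
    if window_size % 2 != 0 or window_size < 2:
        raise ValueError("window_size must be an even number >= 2")
    half = window_size // 2
    n_cols = len(rows[0])

    def column_means(block):
        return [sum(r[c] for r in block) / len(block) for c in range(n_cols)]

    profile = []
    for start in range(len(rows) - window_size + 1):
        first = column_means(rows[start:start + half])
        second = column_means(rows[start + half:start + window_size])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)))
        profile.append((start + half, dist))
    return profile

# Toy community table: composition changes abruptly between rows 2 and 3.
samples = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
profile = split_moving_window(samples, window_size=2)
breakpoint_index = max(profile, key=lambda p: p[1])[0]
```

Peaks in the resulting profile mark candidate discontinuities along the gradient. Here the largest dissimilarity falls at row index 3, where the composition switches.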
This package provides functions to perform split robust least angle regression. The approach first uses the least angle regression algorithm, with robust estimates of the correlation between predictors, to split the variables into the models of an ensemble. An elastic net estimator is then applied to the selected predictors in each model using the imputed data from the detect deviating cell (DDC) method.
This package infers the V genotype of an individual from immunoglobulin (Ig) repertoire sequencing data (AIRR-Seq, Rep-Seq). Includes detection of any novel alleles. This information is then used to correct existing V allele calls from among the sample sequences. Citations: Gadala-Maria, et al (2015) <doi:10.1073/pnas.1417683112>, Gadala-Maria, et al (2019) <doi:10.3389/fimmu.2019.00129>.
This package provides a Tcl/Tk Graphical User Interface (GUI) to display images that can be zoomed and panned using the mouse and keyboard shortcuts. tkImgR reads and writes different image formats (PPM/PGM, PNG and GIF) using the standard Tcl/Tk distribution (>=8.6), but other formats (JPEG, TIFF, CR2) can be handled using the tkImg package for Tcl/Tk.
This package provides functions for defining and conducting a time series prediction process including pre(post)processing, decomposition, modelling, prediction and accuracy assessment. The generated models and their prediction errors can be used for benchmarking other time series prediction methods and for creating a demand for the refinement of such methods. For this purpose, benchmark data from prediction competitions may be used.
An implementation of three procedures developed by John Tukey: FUNOP (FUll NOrmal Plot), FUNOR-FUNOM (FUll NOrmal Rejection-FUll NOrmal Modification), and vacuum cleaner. Combined, they provide a way to identify, treat, and analyze outliers in two-way (i.e., contingency) tables, as described in his landmark paper "The Future of Data Analysis", Tukey, John W. (1962) <https://www.jstor.org/stable/2237638>.
The vcfpp.h (<https://github.com/Zilong-Li/vcfpp>) provides an easy-to-use C++ API of htslib, offering full functionality for manipulating Variant Call Format (VCF) files. The vcfppR package serves as the R bindings of the vcfpp.h library, enabling rapid processing of both compressed and uncompressed VCF files. Explore a range of powerful features for efficient VCF data manipulation.
HiCool provides an R interface to process and normalize Hi-C paired-end fastq reads into .(m)cool files. .(m)cool is a compact, indexed HDF5 file format specifically tailored for efficiently storing HiC-based data. On top of processing fastq reads, HiCool provides a convenient reporting function to generate shareable reports summarizing Hi-C experiments and including quality controls.
This package provides tools for estimation, testing and regression modeling of subdistribution functions in competing risks, as described in Gray (1988), A class of K-sample tests for comparing the cumulative incidence of a competing risk, Ann. Stat. 16:1141-1154, and Fine JP and Gray RJ (1999), A proportional hazards model for the subdistribution of a competing risk, JASA, 94:496-509.
This package provides tools to get text from images of text using the Abbyy Cloud Optical Character Recognition (OCR) API. With abbyyR, one can easily OCR images, barcodes, forms, and documents with machine readable zones (e.g. passports), and get the results in a variety of formats including plain text and XML. To learn more about the Abbyy OCR API, see http://ocrsdk.com/.
This package provides infrastructure for the management of survey data, including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) SPSS and Stata files. Further, the package produces tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to LaTeX and HTML.
uom (Units of measurement) is a crate that does automatic type-safe zero-cost dimensional analysis. You can create your own systems or use the pre-built International System of Units (SI) which is based on the International System of Quantities (ISQ) and includes numerous quantities (length, mass, time, ...) with conversion factors for even more numerous measurement units (meter, kilometer, foot, mile, ...).
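The idea behind dimensional analysis with unit conversion can be illustrated with a small Python sketch. It checks dimensions at run time only, so it is neither zero-cost nor type-safe in the sense described above, and it does not use or mirror the uom API. The `Quantity` class, the helper functions, and the three-dimension layout (length, mass, time) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    """A value stored in SI base units, tagged with dimension exponents
    in the order (length, mass, time)."""
    value: float
    dim: tuple

    def __add__(self, other):
        # Addition is only meaningful for matching dimensions.
        if self.dim != other.dim:
            raise TypeError("cannot add quantities of different dimensions")
        return Quantity(self.value + other.value, self.dim)

    def __truediv__(self, other):
        # Division subtracts dimension exponents.
        dim = tuple(a - b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value / other.value, dim)

LENGTH = (1, 0, 0)
TIME = (0, 0, 1)

def meters(x):
    return Quantity(x, LENGTH)

def kilometers(x):
    # Conversion factor: 1 km = 1000 m.
    return Quantity(x * 1000.0, LENGTH)

def seconds(x):
    return Quantity(x, TIME)

distance = kilometers(1.5) + meters(500.0)   # stored as 2000.0 m
speed = distance / seconds(100.0)            # dimension: length / time
```

Adding `meters(1.0)` to `seconds(1.0)` raises a `TypeError` in this sketch, since the two quantities have different dimensions.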