Heterogeneous multi-task feature learning is a data integration method for conducting joint feature selection across multiple related data sets with different distributions. The algorithm can combine different types of learning tasks, including linear regression, Huber regression, adaptive Huber regression, and logistic regression. A modified Bayesian Information Criterion (BIC) is provided to measure model performance. The package is based on Yuan Zhong, Wei Xu, and Xin Gao (2022) <https://www.fields.utoronto.ca/talk-media/1/53/65/slides.pdf>.
We provide an R tool for computation and nonparametric plug-in estimation of Highest Density Regions (HDRs) and general level sets in the directional setting. Specifically, circular and spherical HDRs can be reconstructed from a data sample following Saavedra-Nieves and Crujeiras (2021) <doi:10.1007/s11634-021-00457-4>. This library also contains two real datasets in the circular and spherical settings: the first concerns a problem from animal orientation studies, and the second is related to earthquake occurrences.
Rapid satellite data streams in operational applications have clear benefits for monitoring land cover, especially when information can be delivered as fast as surface conditions change. Over the past decade, remote sensing has become a key tool for monitoring and predicting environmental variables using satellite data. This package presents the main applications of remote sensing for land surface monitoring and land cover mapping (soil, vegetation, water, and so on). Tomlinson, C.J., Chapman, L., Thornes, E., Baker, C. (2011) <doi:10.1002/met.287>.
A statistical library offering a method to encapsulate and query the probability space of a dataset using Probability Boxes (p-boxes). Its distinctive feature lies in the ease with which users can navigate and analyze marginal, joint, and conditional probabilities while taking into account the underlying correlation structure of the data, using copula theory and models. A comprehensive explanation is available in the paper "pbox: Exploring Multivariate Spaces with Probability Boxes", to be published in the Journal of Statistical Software.
Computes pseudo-realizations from the posterior distribution of a Gaussian Process (GP) with the method described in Azzimonti et al. (2016) <doi:10.1137/141000749>. The realizations are obtained from simulations of the field at a few well-chosen points that minimize the expected distance in measure between the true excursion set of the field and the approximate one. Also implements an R interface for (the main function of) the Distance Transform of Sampled Functions (<https://cs.brown.edu/people/pfelzens/dt/index.html>).
The SEQC/MAQC-III Consortium has produced benchmark RNA-seq data for the assessment of RNA sequencing technologies and data analysis methods (Nat Biotechnol, 2014). Billions of sequence reads have been generated from ten different sequencing sites. This package contains the summarized read count data for ~2000 sequencing libraries. It also includes all the exon-exon junctions discovered from the study. TaqMan RT-PCR data for ~1000 genes and ERCC spike-in sequence data are included in this package as well.
This package provides an object-oriented modeling language for disciplined convex programming (DCP) as described in Fu, Narasimhan, and Boyd (2020, <doi:10.18637/jss.v094.i14>). It allows the user to formulate convex optimization problems in a natural way following mathematical convention and DCP rules. The system analyzes the problem, verifies its convexity, converts it into a canonical form, and hands it off to an appropriate solver to obtain the solution. Interfaces to solvers on CRAN and elsewhere are provided.
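The citation matches the CVXR package, so as a hedged illustration of this DCP workflow (the problem data below are simulated, not from the package's documentation), a small non-negative least-squares fit might look like:

```r
# Sketch of the DCP workflow: define variables, an objective, and
# constraints; the system verifies convexity and hands off to a solver.
# Assumes the CVXR interface; the data are simulated for illustration.
library(CVXR)

set.seed(1)
A <- matrix(rnorm(30), nrow = 10)   # 10 observations, 3 predictors
b <- rnorm(10)

x <- Variable(3)                              # decision variable
objective <- Minimize(sum_squares(A %*% x - b))
constraints <- list(x >= 0)                   # non-negativity constraint
problem <- Problem(objective, constraints)

result <- solve(problem)                      # convexity check + solver hand-off
result$getValue(x)                            # fitted non-negative coefficients
```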
Automated methods to assemble population PK (pharmacokinetic) and PKPD (pharmacodynamic) datasets for analysis in NONMEM (non-linear mixed effects modeling) by Bauer (2019) <doi:10.1002/psp4.12404>. The package includes functions to build datasets from SDTM (Study Data Tabulation Model) <https://www.cdisc.org/standards/foundational/sdtm>, ADaM (Analysis Data Model) <https://www.cdisc.org/standards/foundational/adam>, or other dataset formats. The package will combine population datasets, add covariates, and create documentation to support regulatory submission and internal communication.
This package provides a method for quantifying resilience after a stress event. A set of functions calculate the area of resilience created by the departure from baseline (robustness, measured on the y-axis) and the time taken to return to baseline (rapidity, measured on the x-axis) after a stress event, using the Cartesian coordinates of the data. This package can calculate areas of resilience, growth, and cases in which resilience is not achieved (e.g., diminished performance without return to baseline).
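As a rough sketch of the idea only (the helper below is hypothetical, not the package's actual interface), the area carved out below baseline can be approximated from the (x, y) coordinates by trapezoidal integration:

```r
# Minimal sketch: approximate the area between an observed trajectory and
# its baseline via the trapezoidal rule. Hypothetical helper for
# illustration, not the package's API.
resilience_area <- function(x, y, baseline) {
  dev <- baseline - y                  # departure below baseline (robustness)
  # trapezoids over consecutive x intervals (rapidity enters through x)
  sum(diff(x) * (head(dev, -1) + tail(dev, -1)) / 2)
}

# Example: performance drops after a stressor at x = 0 and recovers by x = 5
x <- 0:5
y <- c(10, 6, 7, 8, 9, 10)             # observed values
resilience_area(x, y, baseline = 10)   # area of the resilience region
```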
An open-source implementation of the Congruent Matching Cells method for cartridge case identification as proposed by Song (2013) <https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=911193>, as well as an extension of the method proposed by Tong et al. (2015) <doi:10.6028/jres.120.008>. Provides a wide range of pre-, inter-, and post-processing options when working with cartridge case scan data and their associated comparisons. See the cmcR package website for more details and examples.
Facilitates the identification of counterfactual queries in structural causal models via the ID* and IDC* algorithms by Shpitser, I. and Pearl, J. (2007, 2008) <arXiv:1206.5294>, <https://jmlr.org/papers/v9/shpitser08a.html>. Provides a simple interface for defining causal diagrams and counterfactual conjunctions. Construction of parallel worlds graphs and counterfactual graphs is carried out automatically based on the counterfactual query and the causal diagram. See Tikka, S. (2023) <doi:10.32614/RJ-2023-053> for a tutorial on the package.
Access chemical, hazard, bioactivity, and exposure data from the Computational Toxicology and Exposure ('CTX') APIs <https://api-ccte.epa.gov/docs/>. ccdR was developed to streamline the process of accessing the information available through the CTX APIs without requiring prior knowledge of how to use APIs. Most data is also available on the CompTox Chemical Dashboard ('CCD') <https://comptox.epa.gov/dashboard/> and other resources found at the EPA Computational Toxicology and Exposure Online Resources <https://www.epa.gov/comptox-tools>.
Estimates average treatment effects using model average double robust (MA-DR) estimation. The MA-DR estimator is defined as a weighted average of double robust estimators, where each double robust estimator corresponds to a specific choice of the outcome model and the propensity score model. The MA-DR estimator extends the desirable double robustness property by achieving consistency under the much weaker assumption that either the true propensity score model or the true outcome model lies within a specified, possibly large, class of models.
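To make the averaging idea concrete (a schematic sketch in base R; the helpers are hypothetical and not the package's API), each candidate pair of models yields a standard augmented-inverse-probability-weighted (AIPW) double robust estimate, and MA-DR combines them with weights:

```r
# Schematic sketch of one double-robust (AIPW) ATE estimate and a
# model-averaged combination. Hypothetical helpers, not the package's API.
aipw_ate <- function(y, a, ps, mu1, mu0) {
  # ps: fitted propensity scores; mu1, mu0: fitted outcome-model predictions
  mean(a * (y - mu1) / ps + mu1) -
    mean((1 - a) * (y - mu0) / (1 - ps) + mu0)
}

ma_dr <- function(estimates, weights) {
  # MA-DR: weighted average over candidate model pairs
  sum(weights * estimates) / sum(weights)
}
```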
The NOIA model, as described extensively in Alvarez-Castro & Carlborg (2007), is a framework facilitating the estimation of genetic effects and genotype-to-phenotype maps. This package provides the basic tools to perform linear and multilinear regressions from real populations (provided the phenotype and genotype of every individual), estimating the genetic effects from different reference points, the genotypic values, and the decomposition of genetic variances in a multi-locus, two-allele system. This package is presented in Le Rouzic & Alvarez-Castro (2008).
Imports Variant Call Format (VCF) files into R. It can detect whether a sample contains contamination from the same species. In the first stage of the approach, a change-point detection method is used to identify copy number variations for filtering. Next, features are extracted from the data for a support vector machine model. For the log-likelihood calculation, the deviation parameter is estimated by the maximum likelihood method. Using a radial basis function kernel support vector machine, the contamination of a sample can be detected.
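A loose sketch of the final classification step only (using e1071 for the RBF-kernel SVM; the feature names, labels, and data below are invented stand-ins, not the package's actual feature set):

```r
# Sketch: an RBF-kernel SVM on features extracted from a sample.
# Uses e1071; everything below is simulated for illustration.
library(e1071)

set.seed(42)
train <- data.frame(
  feature1 = rnorm(100),                  # e.g., an allele-fraction summary
  feature2 = rnorm(100),                  # e.g., a log-likelihood deviation
  contaminated = factor(rep(c("yes", "no"), each = 50))
)

fit <- svm(contaminated ~ ., data = train, kernel = "radial")
predict(fit, newdata = train[1:5, ])      # predicted contamination status
```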
This package provides tools for building decision and cost-effectiveness analysis models. It enables users to write these models concisely, simulate outcomes (including probabilistic analyses) efficiently using optimized vectorized processes and parallel computing, and produce results. The package employs a Grammar of Modeling approach, inspired by the Grammar of Graphics, to streamline model construction. For an interactive graphical user interface, see DecisionTwig at <https://www.dashlab.ca/projects/decision_twig/>. Comprehensive tutorials and vignettes are available at <https://hjalal.github.io/twig/>.
Extracts coordinates of an event location from text based on dictionaries of landmarks, roads, and areas. Only the location of the event of interest is returned, and other location references are ignored; for example, if determining the location of a road traffic crash from the text "crash near [location 1] heading towards [location 2]", only the coordinates of "location 1" would be returned. It also accounts for differences in spelling between how a user references a location and how the location is captured in the location dictionaries.
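A bare-bones illustration of the dictionary lookup (purely hypothetical data and logic; the package's real interface and matching, e.g., its handling of spelling variants, are more sophisticated):

```r
# Minimal sketch of dictionary-based event geocoding: find the first
# landmark mentioned in the text and return its coordinates.
landmarks <- data.frame(
  name = c("central market", "north bridge"),
  lon  = c(36.82, 36.85),
  lat  = c(-1.28, -1.25)
)

locate_event <- function(text, dict) {
  hits <- sapply(dict$name, grepl, x = tolower(text), fixed = TRUE)
  if (!any(hits)) return(NULL)
  dict[which(hits)[1], c("lon", "lat")]  # first match = event location
}

locate_event("crash near central market heading towards north bridge",
             landmarks)
```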
This package contains extensions to ggplot2. Geometries: geom_table, geom_plot, and geom_grob add insets to plots using native data coordinates, while geom_table_npc, geom_plot_npc, and geom_grob_npc do the same using npc coordinates through the new aesthetics npcx and npcy. Statistics: select observations based on 2D density. Positions: radial nudging away from a center point and nudging away from a line or curve.
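For instance (a minimal sketch assuming the ggpp extension package, which provides these geometries; the inset values are the mean mpg by cyl in mtcars), an inset table can be placed with npc coordinates:

```r
# Sketch: place a small summary table as an inset positioned in npc
# coordinates (0 to 1 relative to the plotting area). Assumes ggpp.
library(ggplot2)
library(ggpp)
library(tibble)

summary_tb <- tibble(cyl = c(4, 6, 8),
                     mean_mpg = c(26.7, 19.7, 15.1))

ggplot(mtcars, aes(disp, mpg)) +
  geom_point() +
  # the label aesthetic carries a list column of data frames
  geom_table_npc(data = tibble(x = 0.95, y = 0.95, tb = list(summary_tb)),
                 aes(npcx = x, npcy = y, label = tb))
```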
rebar3 is an Erlang build tool that makes it easy to compile and test Erlang applications, port drivers, and releases. rebar3 is a self-contained Erlang script, so it's easy to distribute or even embed directly in a project. Where possible, rebar3 uses standard Erlang/OTP conventions for project structures, thus minimizing the amount of build configuration work. rebar3 also provides dependency management, enabling application writers to easily re-use common libraries from a variety of locations (git, hg, etc.).
Simulate, estimate, and forecast a wide range of regression-based dynamic models for bounded time series, covering the most commonly applied models in the literature. The main calculations are done in FORTRAN, which translates into very fast algorithms. The main references are Bayer et al. (2017) <doi:10.1016/j.jhydrol.2017.10.006>, Pumi et al. (2019) <doi:10.1016/j.jspi.2018.10.001>, Pumi et al. (2021) <doi:10.1111/sjos.12439>, and Pumi et al. (2022) <arXiv:2211.02097>.
This package provides functions to compute and plot Coverage Probability Excursion (CoPE) sets for real-valued functions on a 2-dimensional domain. CoPE sets are obtained from repeated noisy observations of the function on the entire domain. They are designed to bound the excursion set of the target function at a given level from above and below with a predefined probability. The target function can be a parameter in a spatially-indexed linear regression. Support by NIH grant R01 CA157528 is gratefully acknowledged.
Statistical downscaling and bias correction (model output statistics) method based on cumulative distribution function (CDF) transformation. See Michelangeli, Vrac, and Loukos (2009), "Probabilistic downscaling approaches: Application to wind cumulative distribution functions", Geophysical Research Letters, 36, L11708, <doi:10.1029/2009GL038401>; and Vrac, Drobinski, Merlo, Herrmann, Lavaysse, Li, and Somot (2012), "Dynamical and statistical downscaling of the French Mediterranean climate: uncertainty assessment", Nat. Hazards Earth Syst. Sci., 12, 2769-2784, <https://www.nat-hazards-earth-syst-sci.net/12/2769/2012/>, <doi:10.5194/nhess-12-2769-2012>.
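The core CDF-transform idea can be sketched in a few lines of base R (an illustration of simple quantile mapping under simplifying assumptions, not the package's interface or its full method):

```r
# Sketch of CDF-based bias correction (quantile mapping): push each model
# value through the model CDF, then back through the observed quantile
# function. Simplified illustration only.
set.seed(1)
obs   <- rnorm(1000, mean = 5, sd = 2)   # observed reference series
model <- rnorm(1000, mean = 7, sd = 3)   # biased model output

correct_cdf <- function(x, model_ref, obs_ref) {
  u <- ecdf(model_ref)(x)                        # probability under model CDF
  quantile(obs_ref, probs = u, names = FALSE)    # observed quantile at u
}

corrected <- correct_cdf(model, model, obs)
c(mean(corrected), sd(corrected))        # roughly matches the obs moments
```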
Auto-, cross-, and multidimensional recurrence quantification analysis. Different methods for computing recurrence are provided (cross vs. multidimensional, or the profile, i.e., looking only at the diagonal recurrent points), along with in-depth measures of the whole cross-recurrence plot and functions for optimization and plotting. Please refer to Coco et al. (2021) <doi:10.32614/RJ-2021-062>, Coco and Dale (2014) <doi:10.3389/fpsyg.2014.00510>, and Wallot (2018) <doi:10.1080/00273171.2018.1512846> for further details about the methods.
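A small cross-recurrence example with the crqa() function (the signals are simulated and the parameter values are arbitrary choices for illustration, not recommended settings):

```r
# Sketch: cross-recurrence quantification of two noisy, phase-shifted
# sine waves. Parameter values are illustrative only.
library(crqa)

set.seed(1)
t   <- seq(0, 4 * pi, length.out = 200)
ts1 <- sin(t) + rnorm(200, sd = 0.1)
ts2 <- sin(t + 0.5) + rnorm(200, sd = 0.1)

res <- crqa(ts1, ts2,
            delay = 1, embed = 2, rescale = 0, radius = 0.2,
            normalize = 0, mindiagline = 2, minvertline = 2,
            tw = 0, method = "crqa")
res$RR   # recurrence rate
res$DET  # determinism
```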
This package contains a set of functions that can be used to apply formats to data frames or vectors. The package aims to provide functionality similar to that of SAS® formats. Formats are assigned to the format attribute on data frame columns. Then, when the fdata() function is called, a new data frame is created with the column data formatted as specified. The package also contains a value() function to create a user-defined format, similar to a SAS® user-defined format.
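A sketch of that workflow, matching the fmtr-style interface the description suggests (exact behavior may differ; the data are made up):

```r
# Sketch: define a user-defined format with value()/condition(), attach it
# to a column's format attribute, and apply it with fdata(). Assumes the
# fmtr-style interface described above.
library(fmtr)

df <- data.frame(sex = c("M", "F", "M"))

# user-defined format, analogous to a SAS(R) user-defined format
fmt_sex <- value(condition(x == "M", "Male"),
                 condition(x == "F", "Female"),
                 condition(TRUE, "Unknown"))

attr(df$sex, "format") <- fmt_sex   # assign the format to the column
fdata(df)                           # new data frame with sex formatted
```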