Calculate ocean wave height summary statistics and process data from bottom-mounted pressure sensor data loggers. Derived primarily from MATLAB functions provided by U. Neumeier at <http://neumeier.perso.ch/matlab/waves.html>. Wave number calculation based on the algorithm in Hunt, J. N. (1979, ISSN:0148-9895) "Direct Solution of Wave Dispersion Equation", American Society of Civil Engineers Journal of the Waterway, Port, Coastal, and Ocean Division, Vol 105, pp 457-459.
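The wave number step can be sketched as follows. This is a minimal illustration, assuming the commonly quoted form of Hunt's rational approximation to the linear dispersion relation; the function name `wave_number`, its parameters, and the coefficient values as typed here are assumptions, not the package's own code, and should be checked against the paper.

```python
import math

# Coefficients of Hunt's (1979) approximation as commonly quoted.
# Assumption: verify against the original paper before relying on them.
HUNT_D = (0.6666666666, 0.3555555555, 0.1608465608,
          0.0632098765, 0.0217540484, 0.0065407983)


def wave_number(period_s, depth_m, g=9.81):
    """Approximate wave number k (rad/m) for a wave period and water depth.

    Hypothetical helper for illustration only.
    """
    omega = 2.0 * math.pi / period_s
    y = omega ** 2 * depth_m / g  # dimensionless frequency
    series = 1.0 + sum(d * y ** n for n, d in enumerate(HUNT_D, start=1))
    kh = math.sqrt(y ** 2 + y / series)  # direct (non-iterative) estimate of k*h
    return kh / depth_m


k = wave_number(period_s=8.0, depth_m=10.0)
omega_sq = (2.0 * math.pi / 8.0) ** 2
# Residual of the linear dispersion relation omega^2 = g * k * tanh(k * h)
residual = omega_sq - 9.81 * k * math.tanh(k * 10.0)
```

A small residual relative to `omega_sq` indicates that the direct estimate satisfies the dispersion relation closely without any iteration, which is the appeal of this kind of approximation.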
Validate data in data frames, tibble objects, Spark DataFrames, and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.
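The idea of consecutive validation steps with failure-rate thresholds can be sketched as below. This is only an illustration of the concept in Python; `Step`, `run_plan`, `warn_at`, and `stop_at` are hypothetical names and do not correspond to the package's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Step:
    """One validation step: a label and a per-row check (hypothetical type)."""
    label: str
    check: Callable[[Row], bool]


def run_plan(rows: List[Row], steps: List[Step],
             warn_at: float = 0.1, stop_at: float = 0.5) -> List[dict]:
    """Run each step over all rows and map its failure rate to an action."""
    report = []
    for step in steps:
        failed = sum(1 for row in rows if not step.check(row))
        fraction = failed / len(rows) if rows else 0.0
        if fraction >= stop_at:
            action = "stop"
        elif fraction >= warn_at:
            action = "warn"
        else:
            action = "pass"
        report.append({"step": step.label, "failed": failed,
                       "fraction": fraction, "action": action})
    return report


rows = [
    {"id": 1, "price": 9.5},
    {"id": 2, "price": -1.0},
    {"id": 3, "price": 4.0},
    {"id": 4, "price": None},
]
steps = [
    Step("id is a positive integer",
         lambda r: isinstance(r["id"], int) and r["id"] > 0),
    Step("price is non-negative",
         lambda r: r["price"] is not None and r["price"] >= 0),
]
report = run_plan(rows, steps)
```

Keeping the thresholds separate from the checks means the same plan can be reused with stricter or looser reporting actions.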
Gene regulatory networks model the underlying gene regulation hierarchies that drive gene expression and observed phenotypes. Epiregulon infers TF activity in single cells by constructing a gene regulatory network (regulons). This is achieved through integration of scATAC-seq and scRNA-seq data and incorporation of public bulk TF ChIP-seq data. Links between regulatory elements and their target genes are established by computing correlations between chromatin accessibility and gene expression.
The Hashery is a tight collection of Hash-like classes. Included are the auto-sorting Dictionary class, the efficient LRUHash, the flexible OpenHash and the convenient KeyHash. Nearly every class is a subclass of the CRUDHash, which defines a CRUD (Create, Read, Update and Delete) model on top of Ruby's standard Hash, making it possible to subclass and augment to fit any specific use case.
Wrapper for the widely used SUNDIALS software (SUite of Nonlinear and DIfferential/ALgebraic Equation Solvers), and more precisely for its CVODES solver. It aims to solve ordinary differential equations (ODE) and, optionally, the pending forward sensitivity problem. The wrapper is made R friendly by allowing custom parameters to be passed to the user's callback functions. Such functions can be written either in R or in C++ (RcppArmadillo flavor). In the case of C++, performance is greatly improved, so this option is highly advisable when performance matters. If provided, the Jacobian matrix can be calculated in either dense or sparse format. In the latter case, the rmumps package is used to solve the corresponding linear systems. Root finding and pending event management are optional and can be specified as R or C++ functions too. This makes them a very flexible tool for controlling the ODE system during the time course simulation. The SUNDIALS library was published in Hindmarsh et al. (2005) <doi:10.1145/1089014.1089020>.
This package provides tools to create, validate, and export BioCompute Objects described in King et al. (2019) <doi:10.17605/osf.io/h59uh>. Users can encode information in data frames, and compose BioCompute Objects from the domains defined by the standard. A checksum validator and a JSON schema validator are provided. This package also supports exporting BioCompute Objects as JSON, PDF, HTML, or Word documents, and exporting to cloud-based platforms.
Represents generalized geometric ellipsoids with the "(U,D)" representation. It allows degenerate and/or unbounded ellipsoids, together with methods for linear and duality transformations, and for plotting. Thus ellipsoids are naturally extended to include lines, hyperplanes, points, cylinders, etc. This permits exploration of a variety of statistical issues that can be visualized using ellipsoids as discussed by Friendly, Fox & Monette (2013), Elliptical Insights: Understanding Statistical Methods Through Elliptical Geometry <doi:10.1214/12-STS402>.
Multivariate outlier detection is performed using invariant coordinates where the package offers different methods to choose the appropriate components. ICS is a general multivariate technique with many applications in multivariate analysis. ICSOutlier offers a selection of functions for automated detection of outliers in the data based on a fitted ICS object or by specifying the dataset and the scatters of interest. The current implementation targets data sets with only a small percentage of outliers.
Keras Tuner <https://keras-team.github.io/keras-tuner/> is a hypertuning framework made for humans. It aims at making the life of AI practitioners, hypertuner algorithm creators and model designers as simple as possible by providing them with a clean and easy to use API for hypertuning. Keras Tuner makes moving from a base model to a hypertuned one quick and easy by only requiring you to change a few lines of code.
Integration of the units and errors packages for a complete quantity calculus system for R vectors, matrices and arrays, with automatic propagation, conversion, derivation and simplification of magnitudes and uncertainties. Documentation about units and errors is provided in the papers by Pebesma, Mailund & Hiebert (2016, <doi:10.32614/RJ-2016-061>) and by Ucar, Pebesma & Azcorra (2018, <doi:10.32614/RJ-2018-075>), included in those packages as vignettes; see citation("quantities") for details.
Given independent and identically distributed observations X(1), ..., X(n) from a Generalized Pareto distribution (GPD) with shape parameter gamma in [-1,0], offers several estimators of gamma. The estimators are based on the principle of replacing the order statistics by quantiles of a distribution function based on a log-concave density function. This procedure is justified by the fact that the GPD density is log-concave for gamma in [-1,0].
The goal of surveynnet is to extend the functionality of nnet, which already supports survey weights, by enabling it to handle clustered and stratified data. It achieves this by incorporating design effects through the use of effective sample sizes as outlined by Chen and Rust (2017), <doi:10.1093/jssam/smw036>, and performed by deffCR in the package PracTools (Valliant, Dever, and Kreuter (2018), <doi:10.1007/978-3-319-93632-1>).
The goal of siteymlgen is to make it easy to organise the building of your R Markdown website. The init() function placed within the first code chunk of the index.Rmd file of an R project directory will initiate the generation of an automatically written _site.yml file. siteymlgen recommends a specific naming convention for your R Markdown files. This naming will ensure that your navbar layout is ordered according to a hierarchy.
This package provides functions to calculate exact critical values, statistical power, expected time to signal, and required sample sizes for performing exact sequential analysis. All these calculations can be done for either Poisson or binomial data, for continuous or group sequential analyses, and for different types of rejection boundaries. In the case of group sequential analyses, the group sizes do not have to be specified in advance and the alpha spending can be set arbitrarily.
R data pipelines commonly require reading and writing data to versioned directories. Each directory might correspond to one step of a multi-step process, where that version corresponds to particular settings for that step and a chain of previous steps that each have their own versions. This package creates a configuration object that makes it easy to read and write versioned data, based on YAML configuration files loaded and saved to each versioned folder.
Representation-dependent gene-level operations for genetic and evolutionary algorithms with real-coded genes are collected in this package. The common feature of the gene operations is that all of them are useful for derivative-free optimization algorithms. At the moment the package implements initialization, mutation, crossover, and replication operations for differential evolution as described in Price, Kenneth V., Storn, Rainer M. and Lampinen, Jouni A. (2005) <doi:10.1007/3-540-31306-0>.
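To make the mutation and crossover operations concrete, here is a minimal sketch of one textbook differential evolution scheme (often written DE/rand/1/bin) for real-coded genes. The function name `de_rand_1_bin` and its parameters `f` and `cr` are illustrative assumptions; the package's own functions and variants are not reproduced here.

```python
import random


def de_rand_1_bin(population, target_index, f=0.8, cr=0.9, rng=random):
    """Build one trial vector with DE/rand/1 mutation and binomial crossover.

    Hypothetical helper: `population` is a list of equal-length lists of floats
    and must contain at least four members.
    """
    target = population[target_index]
    dim = len(target)
    # Pick three distinct members, none of which is the target itself.
    others = [i for i in range(len(population)) if i != target_index]
    a, b, c = (population[i] for i in rng.sample(others, 3))
    # Mutation: base vector plus a scaled difference of two other vectors.
    mutant = [a[j] + f * (b[j] - c[j]) for j in range(dim)]
    # Binomial crossover: each gene comes from the mutant with probability cr;
    # one randomly chosen gene is always taken from the mutant.
    forced = rng.randrange(dim)
    return [mutant[j] if (j == forced or rng.random() < cr) else target[j]
            for j in range(dim)]


rng = random.Random(42)
population = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(6)]
trial = de_rand_1_bin(population, target_index=0, rng=rng)
```

In a full optimizer the trial vector would then compete with the target vector, and the better of the two would survive into the next generation.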
As a successor of the packages BatchJobs and BatchExperiments, this package provides a parallel implementation of the Map function for high performance computing systems managed by various schedulers. A multicore and socket mode allow the parallelization on a local machine, and multiple machines can be hooked up via SSH to create a makeshift cluster. Moreover, the package provides an abstraction mechanism to define large-scale computer experiments in a well-organized and reproducible way.
This package provides some easy-to-use functions to extract and visualize the output of multivariate data analyses, including PCA (Principal Component Analysis), CA (Correspondence Analysis), MCA (Multiple Correspondence Analysis), FAMD (Factor Analysis of Mixed Data), MFA (Multiple Factor Analysis) and HMFA (Hierarchical Multiple Factor Analysis) functions from different R packages. It also contains functions for simplifying some clustering analysis steps and provides ggplot2-based elegant data visualization.
Fit Bayesian models in Stan <doi:10.18637/jss.v076.i01> with checkpointing, that is, the ability to stop the MCMC sampler at will, and then pick right back up where the MCMC sampler left off. Custom Stan models can be fitted, or the popular package brms <doi:10.18637/jss.v080.i01> can be used to generate the Stan code. This package is fully compatible with the R packages brms, posterior, cmdstanr, and bayesplot.
This package provides a system for extracting news from Chilean media through web scraping. The package allows for news searches using search phrases and date filters, and returns the results in a structured format, ready for analysis. Additionally, it includes functions to clean the extracted data, visualize it, and store it in databases. All of this can be done automatically, facilitating the collection and analysis of relevant information from Chilean media.
Data package for dartR. Provides data sets to run examples in dartR. This was necessary due to the size limit imposed by CRAN. The data in dartR.data is needed to run the examples provided in the dartR functions. All available data sets are either based on actual data (but reduced in size) or simulated, to allow the fast execution of examples and demonstration of the functions.
Functionalities for calculating the local score and its statistical relevance (p-value) in a sequence of given distribution (S. Mercier and J.-J. Daudin (2001) <https://hal.science/hal-00714174/>; S. Karlin and S. Altschul (1990) <https://pmc.ncbi.nlm.nih.gov/articles/PMC53667/>; S. Mercier, D. Cellier and F. Charlot (2003) <https://hal.science/hal-00937529v1/>; A. Lagnoux, S. Mercier and P. Valois (2017) <doi:10.1093/bioinformatics/btw699>).
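As a rough illustration of the quantity involved, the local score of a sequence of numerical scores is usually defined as the largest sum over any contiguous segment, with zero as a floor. The sketch below computes it in a single pass; `local_score` is a hypothetical name, and the p-value calculations described in the cited papers are not shown.

```python
def local_score(scores):
    """Return the maximum contiguous segment sum, floored at zero.

    Hypothetical helper: `running` restarts from zero whenever the partial
    sum would become negative, and `best` records the highest value reached.
    """
    best = 0
    running = 0
    for s in scores:
        running = max(0, running + s)
        best = max(best, running)
    return best


example = [1, -2, 3, 4, -1]
score = local_score(example)  # the best segment here is [3, 4]
```

For a sequence whose scores are all negative the function returns 0, which matches the convention that an empty segment is always allowed.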
Statistical methods for whole-trial and time-domain analysis of single cell neural response to multiple stimuli presented simultaneously. The package is based on the paper by C Glynn, ST Tokdar, A Zaman, VC Caruso, JT Mohl, SM Willett, and JM Groh (2021) "Analyzing second order stochasticity of neural spiking under stimuli-bundle exposure", which is in press for publication by the Annals of Applied Statistics. A preprint may be found at <arXiv:1911.04387>.
Calculation of parametric and nonparametric confidence intervals for the difference or ratio of location parameters, nonparametric confidence intervals for the Behrens-Fisher problem, and confidence intervals for the difference, ratio and odds-ratio of binomial proportions for comparison of independent samples. Common wrapper functions to split data sets and apply confidence intervals or tests to these subsets. A by-statement allows calculation of CI separately for the levels of further factors. CI are not adjusted for multiplicity.