Fetches zonal statistics of weather indicators calculated for each municipality in Brazil using data from the 'BR-DWGD' and 'TerraClimate' projects. Zonal statistics such as the mean, maximum, minimum, standard deviation, and sum were computed from the data cells that intersect the boundaries of each municipality and stored in Parquet files. This procedure was carried out for all Brazilian municipalities, all available dates, and every indicator available in the weather products ('BR-DWGD' and 'TerraClimate'). This package queries the precomputed statistics in those Parquet files online and returns easy-to-use data.frames.
This package provides a user-friendly function crrcbcv() to compute bias-corrected variances for competing risks regression models using proportional subdistribution hazards with small-sample clustered data. Four types of bias correction are included: the MD-type bias correction by Mancl and DeRouen
(2001) <doi:10.1111/j.0006-341X.2001.00126.x>, the KC-type bias correction by Kauermann and Carroll (2001) <doi:10.1198/016214501753382309>, the FG-type bias correction by Fay and Graubard (2001) <doi:10.1111/j.0006-341X.2001.01198.x>, and the MBN-type bias correction by Morel, Bokossa, and Neerchal (2003) <doi:10.1002/bimj.200390021>.
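A hedged sketch of a call; only the function name crrcbcv comes from the description, so every argument name here is an assumption:

    # Hypothetical call: MD-type bias-corrected variance for a
    # clustered competing-risks model (argument names assumed)
    library(crrcbcv)
    v <- crrcbcv(ftime = dat$time, fstatus = dat$status,
                 cov = cbind(dat$x1, dat$x2), cluster = dat$id,
                 bias.correction = "MD")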
This package provides a comprehensive, dynamic, configuration-driven logging package for R. While there are several excellent logging solutions already in the R ecosystem, I always feel constrained in some way by each of them. Every project is designed differently to solve its domain-specific problem, and ultimately the utility of a logging solution is its ability to adapt to this design. This is the raison d'être for 'dyn.log': to provide a modular design, template mechanics, and a configuration-based integration model, so that the logger can integrate deeply into your design, even though it knows nothing about it.
Cleans and simplifies person names to assist in database matching when unambiguous unique keys are unavailable. Detects and corrects the most common typographical errors, optionally simplifies terms prone to omission in records, and applies phonetic simplification to the words using a custom variation of the 'metaphoneBR' algorithm. Mation (2025) <doi:10.6082/uchicago.15104>.
Forms queries to submit to the Cleveland Federal Reserve Bank website's financial stress index data site. Provides query functions for both the composite stress index and the components data. By default the download includes daily time series data starting September 25, 1991. The functions return an object of class 'easing' or 'cfsi' that contains a list of items related to the query and its graphical presentation. The list includes the time series data as an 'xts' object. The package provides four lattice time series plots to render the time series data in a manner similar to the bank's own presentation.
Estimation of time-dependent ROC curves and the area under the time-dependent ROC curve (AUC) in the presence of censored data, with or without competing risks. Confidence intervals for AUCs and tests for comparing the AUCs of two rival markers measured on the same subjects can be computed, using the iid-representation of the AUC estimator. Plot functions for time-dependent ROC curves and AUC curves are provided. Time-dependent Positive Predictive Values (PPV) and Negative Predictive Values (NPV) can also be computed. See Blanche et al. (2013) <doi:10.1002/sim.5958> and references therein for details of the methods implemented in the package.
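As an illustration, a typical call looks like the following (signature as in the 'timeROC' documentation; verify argument names against the installed version):

    # Time-dependent ROC/AUC for a censored outcome
    library(timeROC)
    roc <- timeROC(T      = dat$time,    # follow-up time
                   delta  = dat$status,  # event indicator (0 = censored)
                   marker = dat$marker,  # marker under evaluation
                   cause  = 1,           # event of interest
                   times  = c(365, 730), # evaluation times (days)
                   iid    = TRUE)        # iid-representation for CIs/tests
    confint(roc)            # confidence intervals for the AUCs
    plot(roc, time = 365)   # ROC curve at one year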
The dependencies of CRAN packages can be analysed in a network fashion. For each package we can obtain the packages that it depends on, imports, suggests, etc. By iterating this procedure over a number of packages, we can build, visualise, and analyse the dependency network, enabling us to have a bird's-eye view of the CRAN ecosystem. One aspect of interest is the number of reverse dependencies of the packages, or equivalently the in-degree distribution of the dependency network. This can be fitted by a power law and/or an extreme value mixture distribution <doi:10.1111/stan.12355>, for which functions are provided.
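The reverse-dependency counts (in-degrees) referred to above can already be obtained with base R; a small sketch using tools::package_dependencies():

    # In-degree (reverse-dependency count) of a few CRAN packages
    db <- available.packages(repos = "https://cloud.r-project.org")
    revdeps <- tools::package_dependencies(
      c("Rcpp", "ggplot2", "jsonlite"),
      db = db, which = c("Depends", "Imports"), reverse = TRUE
    )
    sapply(revdeps, length)  # one in-degree per package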
This package implements Cramer-von Mises statistics for testing fit to (1) fully specified discrete distributions, as described in Choulakian, Lockhart and Stephens (1994) <doi:10.2307/3315828>; (2) discrete distributions with unknown parameters that must be estimated from the sample data, see Spinelli and Stephens (1997) <doi:10.2307/3315735> and Lockhart, Spinelli and Stephens (2007) <doi:10.1002/cjs.5550350111>; and (3) grouped continuous distributions with unknown parameters, see Spinelli (2001) <doi:10.2307/3316040>. Maximum likelihood estimation (MLE) is used to estimate the parameters. The package computes the Cramer-von Mises statistics, Anderson-Darling statistics, and Watson-Stephens statistics and their p-values.
The 'hydReng' package provides a set of functions for hydraulic engineering tasks and natural hazard assessments. It includes basic hydraulics (wetted area, wetted perimeter, flow, flow velocity, flow depth, and maximum flow) for open channels with arbitrary geometry under uniform flow conditions. For structures such as circular pipes, weirs, and gates, the package includes calculations for pressure flow, backwater depth, and overflow over a weir crest. Additionally, it provides formulas for calculating bedload transport. The formulas used can be found in standard literature on hydraulics, such as Bollrich (2019, ISBN:978-3-410-29169-5) or Hager (2011, ISBN:978-3-642-77430-0).
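Not the package's API, but a plain-R illustration of the kind of uniform-flow computation involved, using the Manning-Strickler formula Q = kst * A * R^(2/3) * sqrt(S) for a rectangular channel:

    # Uniform flow in a rectangular channel (illustrative only)
    uniform_flow <- function(b, h, kst, S) {
      A <- b * h                    # wetted area [m^2]
      P <- b + 2 * h                # wetted perimeter [m]
      R <- A / P                    # hydraulic radius [m]
      kst * A * R^(2/3) * sqrt(S)   # discharge Q [m^3/s]
    }
    uniform_flow(b = 5, h = 1.2, kst = 35, S = 0.002)  # approx. 8 m^3/s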
Extensive penalized variable selection methods have been developed in the past two decades for analyzing high-dimensional omics data, such as gene expressions, single nucleotide polymorphisms (SNPs), copy number variations (CNVs), and others. However, lipidomics data have rarely been investigated with high-dimensional variable selection methods. This package incorporates our recently developed penalization procedures to conduct interaction analysis for high-dimensional lipidomics data with repeated measurements. The core module of this package is developed in C++. The development of this software package and the associated statistical methods has been partially supported by an Innovative Research Award from the Johnson Cancer Research Center, Kansas State University.
Pearson and Spearman correlation coefficients are commonly used to quantify the strength of bivariate associations of genomic variables. For example, correlations of gene-level DNA copy number and gene expression measurements may be used to assess the impact of DNA copy number changes on gene expression in tumor tissue. 'MVisAGe' enables users to quickly compute and visualize these correlations in order to assess the effect of regional genomic events such as changes in DNA copy number or DNA methylation level. Please see Walter V, Du Y, Danilova L, Hayward MC, Hayes DN (2018), Cancer Research <doi:10.1158/0008-5472.CAN-17-3464>.
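Not the 'MVisAGe' API, but the per-gene computation it describes reduces to correlating matched rows of two genes-by-samples matrices:

    # Gene-wise correlation of copy number vs. expression (plain R)
    set.seed(1)
    cn   <- matrix(rnorm(50), nrow = 5)   # 5 genes x 10 samples
    expr <- cn + matrix(rnorm(50), 5)     # expression tracking copy number
    gene_cor <- sapply(seq_len(nrow(cn)), function(g)
      cor(cn[g, ], expr[g, ], method = "pearson"))
    round(gene_cor, 2)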
When working with big data sets, RAM conservation is critically important. However, it is not always enough to just monitor the size of the objects created. The so-called "copy-on-modify" behavior characteristic of R means that some expressions or functions may require an unexpectedly large amount of RAM overhead. For example, replacing a single value in a matrix duplicates that matrix in the back-end, making this task require twice as much RAM as that used by the matrix itself. This package makes it easy to monitor the total and peak RAM used so that developers can quickly identify and eliminate RAM-hungry code.
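The copy-on-modify duplication described above can be observed directly with base R's tracemem():

    # Watching copy-on-modify with tracemem()
    m <- matrix(0, nrow = 1000, ncol = 1000)
    tracemem(m)      # start tracing copies of m
    m2 <- m          # no copy yet: both names share one allocation
    m2[1, 1] <- 1    # tracemem reports a duplication; RAM doubles here
    untracemem(m)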
This package provides sensory and consumer data mapping and analysis <doi:10.14569/IJACSA.2017.081266>. The mapping visualization offers several options: dimension-reduction methods and prediction models ranging from linear to non-linear regressions. A smoothed version of the map, computed with a locally weighted regression algorithm, is available. A selection process for map stability is provided. A 'shiny' application is included; it offers an easy GUI for the implemented functions as well as a tool for comparing fitted models using several criteria. Basic analyses such as characterization of products, panelists, and sessions, as well as consumer segmentation, are also available.
Self-organizing maps (SOMs) that gain the property of emergence through self-organization are called emergent SOMs (ESOMs). The result of the projection by an ESOM is a grid of neurons which can be visualised as a three-dimensional landscape in the form of the Umatrix. Further details can be found in the referenced publications (see URL). This package offers tools for calculating and visualising the ESOM as well as the Umatrix, Pmatrix, and UStarMatrix. All the functionality is also available through graphical user interfaces implemented in 'shiny'. Based on the recognized data structures, the method can be used to generate new data.
The main janitor functions can: perfectly format data.frame column names; provide quick counts of variable combinations (i.e., frequency tables and crosstabs); and isolate duplicate records. Other janitor functions nicely format the tabulation results. These tabulate-and-report functions approximate popular features of SPSS and Excel. This package follows the principles of the "tidyverse" and works well with the pipe function %>%. janitor was built with beginning-to-intermediate R users in mind and is optimized for user-friendliness. Advanced R users can already do everything covered here, but with janitor they can do it faster and save their thinking for the fun stuff.
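For instance, the three headline tasks correspond to clean_names(), tabyl(), and get_dupes():

    library(janitor)
    # Format column names: "First Name" -> first_name
    df <- data.frame(`First Name` = c("Ana", "Ana"), AGE = c(30, 30),
                     check.names = FALSE)
    df %>% clean_names()
    # Quick crosstab of two variables
    mtcars %>% tabyl(cyl, gear)
    # Isolate duplicate records
    df %>% get_dupes()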
Calculations of the most common metrics of automated advertising and plotting of them with trend and forecast. Calculations and descriptions of the metrics are taken from the support documentation of different RTB platforms. Plotting and forecasting are based on the packages 'forecast', described in Rob J Hyndman and George Athanasopoulos (2021) "Forecasting: Principles and Practice" <https://otexts.com/fpp3/> and Rob J Hyndman et al. "Documentation for 'forecast'" (2003) <https://pkg.robjhyndman.com/forecast/>, and 'ggplot2', described in Hadley Wickham et al. "Documentation for 'ggplot2'" (2015) <https://ggplot2.tidyverse.org/> and Hadley Wickham, Danielle Navarro, and Thomas Lin Pedersen (2015) "ggplot2: Elegant Graphics for Data Analysis" <https://ggplot2-book.org/>.
Bootstrap-based goodness-of-fit tests. The package allows rigorous statistical tests, based on the marked empirical process, of whether a chosen model family is correct. The implemented algorithms are described in Dikta and Scheer (2021) <doi:10.1007/978-3-030-73480-0> and can be applied to generalized linear models without any further implementation effort. As long as certain linearity conditions are fulfilled, the resampling schemes are also applicable beyond generalized linear models. This is reflected in the software architecture, which allows the resampling scheme to be reused by implementing only certain interfaces for models that are not natively supported by the package.
This package provides several novel exact hypothesis tests with minimal assumptions on the errors. The tests are exact, meaning that their p-values are correct for the given sample sizes (the p-values are not derived from asymptotic analysis). The test for stochastic inequality is for ordinal comparisons based on two independent samples and requires no assumptions on the errors. The other tests include tests for the mean and variance of a single sample and for comparing means in independent samples. All these tests only require that the data have known bounds (such as percentages, which lie in [0,100]). These bounds are part of the input.
High Dynamic Range (HDR) images support a large range in luminosity between the lightest and darkest regions of an image. To capture this range, data in HDR images is often stored as floating-point numbers and in formats that capture more data and channels than standard image types. This package supports reading and writing two types of HDR images: PFM (Portable Float Map) and OpenEXR images. HDR images can be converted to lower dynamic ranges (for viewing) using tone mapping. A number of tone-mapping algorithms are included, based on Reinhard (2002) "Photographic tone reproduction for digital images" <doi:10.1145/566654.566575>.
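Not necessarily this package's implementation, but the simplest global Reinhard (2002) operator compresses luminance L to L / (1 + L):

    # Global Reinhard tone-mapping operator (illustrative only)
    reinhard <- function(L) L / (1 + L)
    L_hdr <- c(0.01, 0.1, 1, 10, 100)   # HDR luminance values
    round(reinhard(L_hdr), 3)           # all compressed into [0, 1)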
Extracts features from amplification curve data of quantitative Polymerase Chain Reactions (qPCR) according to Pabinger et al. (2014) <doi:10.1016/j.bdq.2014.08.002> for machine learning purposes. Helper functions prepare the amplification curve data for processing as functional data (e.g., Hausdorff distance) or enable the plotting of amplification curve classes (negative, ambiguous, positive). The hookreg() and hookregNL() functions of Burdukiewicz et al. (2018) <doi:10.1016/j.bdq.2018.08.001> can be used to predict amplification curves with a hook effect-like curvature. The pcrfit_single() function can be used to extract features from an amplification curve.
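A hedged usage sketch; the function names come from the description, but the argument conventions here are assumptions:

    # One amplification curve: cycle numbers and fluorescence readings
    cycles <- 1:40
    fluo   <- 1 / (1 + exp(-(cycles - 20) / 2))  # toy sigmoid curve
    res_hook <- hookreg(x = cycles, y = fluo)    # test for hook effect
    feats    <- pcrfit_single(fluo)              # feature vector for ML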
Generalized Additive Mixed Modeling (GAMM; Lin & Zhang, 1999), as implemented in the R package 'mgcv', is a nonlinear regression analysis that is particularly useful for time-course data such as EEG, pupil dilation, gaze data (eye tracking), and articulography recordings, but also for behavioral data such as reaction times and response data. As time-course measures are sensitive to autocorrelation, GAMMs implement methods to reduce these autocorrelation problems. This package includes functions for evaluating GAMM models (e.g., model comparisons, determining regions of significance, inspection of the autocorrelational structure in residuals) and for interpreting GAMMs (e.g., visualization of complex interactions and contrasts).
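For example, a typical workflow fits the GAMM with 'mgcv' using an AR(1) residual model and then inspects the remaining autocorrelation (start_event() and acf_resid() are taken to be this package's helpers; verify names against the documentation):

    library(mgcv)
    library(itsadug)   # assumed to be this package
    # Mark the first sample of each time series for the AR(1) model
    dat <- start_event(dat, column = "Time", event = "Trial")
    m <- bam(y ~ s(Time), data = dat,
             rho = 0.9,                    # AR(1) coefficient
             AR.start = dat$start.event)
    acf_resid(m)   # autocorrelation left in the residuals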
This package implements Collective And Point Anomaly (CAPA) Fisch, Eckley, and Fearnhead (2022) <doi:10.1002/sam.11586>, Multi-Variate Collective And Point Anomaly (MVCAPA) Fisch, Eckley, and Fearnhead (2021) <doi:10.1080/10618600.2021.1987257>, Proportion Adaptive Segment Selection (PASS) Jeng, Cai, and Li (2012) <doi:10.1093/biomet/ass059>, and Bayesian Abnormal Region Detector (BARD) Bardwell and Fearnhead (2015) <doi:10.1214/16-BA998>. These methods are for the detection of anomalies in time series data. Further information regarding the use of this package along with detailed examples can be found in Fisch, Grose, Eckley, Fearnhead, and Bardwell (2024) <doi:10.18637/jss.v110.i01>.
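A hedged usage sketch (capa() and its accessors per the 'anomaly' package described in the JSS paper; check the current documentation):

    library(anomaly)
    set.seed(7)
    x <- c(rnorm(200), rnorm(20, mean = 5), rnorm(200))  # shifted segment
    res <- capa(x)               # detect collective and point anomalies
    collective_anomalies(res)    # estimated segment starts/ends
    point_anomalies(res)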
This package implements fast, exact bootstrap Principal Component Analysis and Singular Value Decompositions for high-dimensional data, as described in <doi:10.1080/01621459.2015.1062383> (see also <doi:10.48550/arXiv.1405.0922>). For data matrices that are too large to operate on in memory, users can input objects of class ff (see the 'ff' package), where the actual data is stored on disk. In response, this package implements a block matrix algebra procedure for calculating the principal components (PCs) and bootstrap PCs. Depending on options set by the user, the 'parallel' package can be used to parallelize the calculation of the bootstrap PCs.
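Not this package's interface: a naive plain-R sketch of the underlying idea, i.e., resampling observations and repeating the PCA (the package's exact method is far faster and handles out-of-memory data):

    # Naive bootstrap PCA for intuition only
    set.seed(42)
    Y <- matrix(rnorm(100 * 10), nrow = 100)   # samples x variables
    boot_pc1 <- replicate(200, {
      idx <- sample(nrow(Y), replace = TRUE)   # resample rows
      pc <- prcomp(Y[idx, ])$rotation[, 1]     # first PC loadings
      pc * sign(pc[1])                         # crude sign alignment
    })
    apply(boot_pc1, 1, sd)   # bootstrap SE of each PC1 loading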
Several implementations of non-parametric, stable, bootstrap-based techniques to determine the number of components for Partial Least Squares linear or generalized linear regression models, as well as sparse Partial Least Squares linear or generalized linear regression models. The package collects techniques that were published in a book chapter (Magnanensi et al. 2016, 'The Multiple Facets of Partial Least Squares and Related Methods', <doi:10.1007/978-3-319-40643-5_18>) and two articles (Magnanensi et al. 2017, 'Statistics and Computing', <doi:10.1007/s11222-016-9651-4>; Magnanensi et al. 2021, 'Frontiers in Applied Mathematics and Statistics', <doi:10.3389/fams.2021.693126>).