Estimation of production functions by the Olley-Pakes, Levinsohn-Petrin and Wooldridge methodologies. The package aims to reproduce the results obtained with Stata's user-written opreg <http://www.stata-journal.com/article.html?article=st0145> and levpet <http://www.stata-journal.com/article.html?article=st0060> commands. The first methodology was originally proposed by Olley, G.S. and Pakes, A. (1996) <doi:10.2307/2171831>, the second by Levinsohn, J. and Petrin, A. (2003) <doi:10.1111/1467-937X.00246>, and the third by Wooldridge (2009) <doi:10.1016/j.econlet.2009.04.026>.
Accurate and computationally efficient p-value calculation methods for a general family of Fisher type statistics (GFisher). The GFisher covers Fisher's combination, Good's statistic, Lancaster's statistic, weighted Z-score combination, etc. It allows a flexible weighting scheme, as well as an omnibus procedure that automatically adapts proper weights and degrees of freedom to given data. The new p-value calculation methods are based on novel ideas of moment-ratio matching and joint-distribution approximation. The technical details can be found in Hong Zhang and Zheyang Wu (2020) <arXiv:2003.01286>.
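As a point of reference for the family described above, here is a minimal sketch of the classical Fisher combination only, assuming independent p-values and equal weights. It is not the GFisher method and does not use the moment-ratio matching approach. The function names are hypothetical and the code relies only on the Python standard library.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function P(X > x) of a chi-square variable with even df.

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_combined_pvalue(pvalues):
    """Return (statistic, combined p-value) for Fisher's combination.

    Under the null hypothesis with independent p-values,
    T = -2 * sum(log p_i) follows a chi-square distribution with 2n df.
    """
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return stat, chi2_sf_even_df(stat, 2 * len(pvalues))


if __name__ == "__main__":
    stat, p = fisher_combined_pvalue([0.04, 0.20, 0.11])
    print(f"T = {stat:.4f}, combined p = {p:.4f}")
```

Weighted or dependent variants do not have this simple chi-square null distribution, which is the situation the approximation methods in the description address.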
This package provides a framework for analytically computing the asymptotic confidence intervals and maximum-likelihood estimates of a class of continuous-time Gaussian branching processes defined by Mitov V, Bartoszek K, Asimomitis G, Stadler T (2019) <doi:10.1016/j.tpb.2019.11.005>. The class of models includes the widely used Ornstein-Uhlenbeck and Brownian motion branching processes. The framework is designed to be flexible enough that users can easily specify their own sub-models, or re-parameterizations, and obtain the maximum-likelihood estimates and confidence intervals of their own custom models.
Presentation of distributions such as: two-piece power normal (TPPN), plasticizing component (PC), DS normal (DSN), expnormal (EN), Sulewski plasticizing component (SPC), easily changeable kurtosis (ECK) distributions. Density, distribution function, quantile function and random generation are presented. For details on this method see: Sulewski (2019) <doi:10.1080/03610926.2019.1674871>, Sulewski (2021) <doi:10.1080/03610926.2020.1837881>, Sulewski (2021) <doi:10.1134/S1995080221120337>, Sulewski (2022) <"New members of the Johnson family of probability distributions: properties and application">, Sulewski, Volodin (2022) <doi:10.1134/S1995080222110270>, Sulewski (2023) <doi:10.17713/ajs.v52i3.1434>.
It is often advantageous to test a hypothesis more than once in the context of propensity score analysis (Rosenbaum, 2012) <doi:10.1093/biomet/ass032>. The functions in this package facilitate bootstrapping for propensity score analysis (PSA). By default, bootstrapping is performed using two classification tree methods (using rpart and ctree functions), two matching methods (using Matching and MatchIt packages), and stratification with logistic regression. A framework is described for users to implement additional propensity score methods. Visualizations are emphasized for diagnosing balance, exploring the correlation relationships between bootstrap samples and methods, and summarizing results.
Based on the compound Poisson risk process that is perturbed by a Brownian motion, saddlepoint approximations to some measures of risk are provided. Various approximation methods for the probability of ruin are also included. Furthermore, exact values of both the risk measures and the probability of ruin are available if the individual claims follow a hypo-exponential distribution (i.e., if each claim can be represented as a sum of independent exponentially distributed random variables with different rate parameters). For more details see Gatto and Baumgartner (2014) <doi:10.1007/s11009-012-9316-5>.
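To make the hypo-exponential claim distribution concrete, the sketch below evaluates its survival function from the standard closed form for a sum of independent exponentials with pairwise distinct rates. It illustrates the distribution only, not the package's ruin or risk-measure calculations, and the function name is hypothetical.

```python
import math


def hypoexp_survival(t, rates):
    """P(X_1 + ... + X_n > t) for independent X_i ~ Exp(rates[i]).

    Uses S(t) = sum_i w_i * exp(-rate_i * t), where
    w_i = prod_{j != i} rate_j / (rate_j - rate_i).
    The rates must be positive and pairwise distinct.
    """
    if not rates or any(r <= 0 for r in rates):
        raise ValueError("rates must be positive")
    if len(set(rates)) != len(rates):
        raise ValueError("rates must be pairwise distinct")
    if t <= 0:
        return 1.0
    total = 0.0
    for i, ri in enumerate(rates):
        weight = 1.0
        for j, rj in enumerate(rates):
            if j != i:
                weight *= rj / (rj - ri)
        total += weight * math.exp(-ri * t)
    return total


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, hypoexp_survival(t, [1.0, 2.0, 3.5]))
```

The weights alternate in sign, so this formula can lose precision when two rates are nearly equal; a production implementation would need a more stable evaluation.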
Fit Cox non-proportional hazards models with time-varying coefficients. Both unpenalized procedures (Newton and proximal Newton) and penalized procedures (P-splines and smoothing splines) are included using B-spline basis functions for estimating time-varying coefficients. For penalized procedures, cross-validation, mAIC, TIC or GIC are implemented to select tuning parameters. Utilities for carrying out post-estimation visualization, summarization, point-wise confidence intervals and hypothesis testing are also provided. For more information, see Wu et al. (2022) <doi:10.1007/s10985-021-09544-2> and Luo et al. (2023) <doi:10.1177/09622802231181471>.
This package provides a framework for extracting semantic motifs around entities in textual data. It implements an entity-centered semantic grammar that distinguishes six classes of motifs: actions of an entity, treatments of an entity, agents acting upon an entity, patients acted upon by an entity, characterizations of an entity, and possessions of an entity. Motifs are identified by applying a set of extraction rules to a parsed text object that includes part-of-speech tags and dependency annotations, such as those generated by 'spacyr'. For further reference, see: Stuhler (2022) <doi:10.1177/00491241221099551>.
This package provides functions for computing moments and coefficients related to the Beta-Wishart and Inverse Beta-Wishart distributions. It includes functions for calculating the expectation of matrix-valued functions of the Beta-Wishart distribution, coefficient matrices C_k and H_k, expectation of matrix-valued functions of the inverse Beta-Wishart distribution, and coefficient matrices \tilde{C}_k and \tilde{H}_k. For more details, refer to Hillier and Kan (2024) <https://www-2.rotman.utoronto.ca/~kan/papers/wishmom.pdf>, "On the Expectations of Equivariant Matrix-valued Functions of Wishart and Inverse Wishart Matrices".
Many two-colour hybridizations suffer from a dye bias that is both gene-specific and slide-specific. The former depends on the content of the nucleotide used for labeling; the latter depends on the labeling percentage. The slide-dependency was hitherto not recognized, and made addressing the artefact impossible. Given a reasonable number of dye-swapped pairs of hybridizations, or of same vs. same hybridizations, both the gene- and slide-biases can be estimated and corrected using the GASSCO method (Margaritis et al., Mol. Sys. Biol. 5:266 (2009), doi:10.1038/msb.2009.21).
ANCOMBC is a package containing differential abundance (DA) and correlation analyses for microbiome data. Specifically, the package includes Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC) and Analysis of Composition of Microbiomes (ANCOM) for DA analysis, and Sparse Estimation of Correlations among Microbiomes (SECOM) for correlation analysis. Microbiome data are typically subject to two sources of biases: unequal sampling fractions (sample-specific biases) and differential sequencing efficiencies (taxon-specific biases). Methodologies included in the ANCOMBC package were designed to correct these biases and construct statistically consistent estimators.
This package provides quantitative variant callers for detecting subclonal mutations in ultra-deep (>=100x coverage) sequencing experiments. The deepSNV algorithm is used for a comparative setup with a control experiment of the same loci and uses a beta-binomial model and a likelihood ratio test to discriminate sequencing errors and subclonal SNVs. The shearwater algorithm computes a Bayes classifier based on a beta-binomial model for variant calling with multiple samples, which allows precise estimation of model parameters, such as local error rates and dispersion, and the use of prior knowledge, e.g. from variation databases such as COSMIC.
This package provides easy, intuitive and consistent access to information contained in various R models, like model formulas, model terms, information about random effects, data that was used to fit the model or data from response variables. The package mainly revolves around two types of functions: functions that find (the names of) information, starting with find_, and functions that get the underlying data, starting with get_. The package has a consistent syntax and works with many different model objects, where otherwise functions to access this information are missing.
Computes confidence intervals for the rate (or risk) difference ('RD') or rate ratio (or relative risk, 'RR') for binomial proportions or Poisson rates, or for odds ratio ('OR', binomial only). Also confidence intervals for a single binomial or Poisson rate, and intervals for matched pairs. Includes skewness-corrected asymptotic score ('SCAS') methods, which have been developed in Laud (2017) <doi:10.1002/pst.1813> from Miettinen & Nurminen (1985) <doi:10.1002/sim.4780040211> and Gart & Nam (1988) <doi:10.2307/2531848>. The same score produces hypothesis tests analogous to the test for binomial RD and RR by Farrington & Manning (1990) <doi:10.1002/sim.4780091208>, or the McNemar test for paired data. The package also includes MOVER methods (Method Of Variance Estimates Recovery) for all contrasts, derived from the Newcombe method but with options to use equal-tailed intervals in place of the Wilson score method, and generalised for Bayesian applications incorporating prior information. So-called exact methods for strictly conservative coverage are approximated using continuity corrections, and the amount of correction can be selected to avoid over-conservative coverage. Also includes methods for stratified calculations (e.g. meta-analysis), either assuming fixed effects (matching the CMH test) or incorporating stratum heterogeneity.
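The Wilson score interval for a single binomial proportion, which the description names as the building block of the Newcombe method, can be sketched as follows. This is the textbook formula only, not the package's SCAS or MOVER implementation, and the function name and arguments are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, level=0.95):
    """Wilson score confidence interval for a binomial proportion.

    Returns (lower, upper) at the given two-sided confidence level.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    # Two-sided normal quantile, e.g. about 1.96 for level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    lower, upper = wilson_interval(7, 40)
    print(f"7/40: [{lower:.4f}, {upper:.4f}]")
```

Unlike the simple Wald interval, the Wilson interval stays within [0, 1] and is not degenerate when the observed count is 0 or n.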
This package provides statistical methods for analyzing experimental evaluation of the causal impacts of algorithmic recommendations on human decisions developed by Imai, Jiang, Greiner, Halen, and Shin (2023) <doi:10.1093/jrsssa/qnad010> and Ben-Michael, Greiner, Huang, Imai, Jiang, and Shin (2024) <doi:10.48550/arXiv.2403.12108>. The data used for this paper, and made available here, are interim, based on only half of the observations in the study and (for those observations) only half of the study follow-up period. We use them only to illustrate methods, not to draw substantive conclusions.
Bayesian inference under the log-normality assumption must be performed very carefully. In fact, under the common priors for the variance, useful quantities in the original data scale (like mean and quantiles) do not have posterior moments that are finite (Fabrizi et al. 2012 <doi:10.1214/12-BA733>). This package allows users to easily carry out a proper Bayesian inferential procedure by fixing a suitable distribution (the generalized inverse Gaussian) as prior for the variance. Functions to estimate several kinds of means (unconditional, conditional and conditional under a mixed model) and quantiles (unconditional and conditional) are provided.
Cellular cooperation compromises the plating efficiency-based analysis of clonogenic survival data. This tool provides functions that enable a robust analysis of colony formation assay (CFA) data in the presence or absence of cellular cooperation. The implemented method has been described in Brix et al. (2020). (Brix, N., Samaga, D., Hennel, R. et al. "The clonogenic assay: robustness of plating efficiency-based analysis is strongly compromised by cellular cooperation." Radiat Oncol 15, 248 (2020). <doi:10.1186/s13014-020-01697-y>) Power regression for parameter estimation, calculation of survival fractions, uncertainty analysis and plotting functions are provided.
Enable researchers to adjust identification rates using the 1/(lineup size) method, generate the full receiver operating characteristic (ROC) curves, and statistically compare the area under the curves (AUC). References: Yueran Yang & Andrew Smith. (2020). "fullROC: An R package for generating and analyzing eyewitness-lineup ROC curves". <doi:10.13140/RG.2.2.20415.94885/1>, Andrew Smith, Yueran Yang, & Gary Wells. (2020). "Distinguishing between investigator discriminability and eyewitness discriminability: A method for creating full receiver operating characteristic curves of lineup identification performance". Perspectives on Psychological Science, 15(3), 589-607. <doi:10.1177/1745691620902426>.
We implemented multiple tests based on the restricted mean survival time (RMST) for general factorial designs as described in Munko et al. (2024) <doi:10.1002/sim.10017>. To this end, an asymptotic test, a groupwise bootstrap test, and a permutation test are provided, each using a Wald-type test statistic. The asymptotic and groupwise bootstrap tests take the asymptotically exact dependence structure of the test statistics into account to gain more power. Furthermore, confidence intervals for RMST contrasts can be calculated and plotted, and a stepwise extension that can improve the power of the multiple tests is available.
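For readers unfamiliar with the RMST itself, the sketch below estimates it for a single sample as the area under the Kaplan-Meier curve up to a truncation time tau. It illustrates the estimand only. The multiple testing procedures, bootstrap and permutation steps of the package are not reproduced, and the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) steps.

    `events[i]` is 1 for an observed event and 0 for a censored time.
    Only distinct times with at least one event produce a step.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def rmst(times, events, tau):
    """Restricted mean survival time: integral of the KM curve on [0, tau]."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Last piece, from the final step before tau up to tau.
    area += prev_s * (tau - prev_t)
    return area


if __name__ == "__main__":
    times = [2.0, 3.5, 3.5, 5.0, 7.0, 8.0]
    events = [1, 1, 0, 1, 0, 1]
    print(rmst(times, events, tau=6.0))
```

Without censoring and with tau at or beyond the largest observed time, this estimate reduces to the sample mean of the survival times.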
Fast implementation of mathematical operations and performance metrics for multi-objective optimization, including filtering and ranking of dominated vectors according to Pareto optimality, computation of the empirical attainment function, V.G. da Fonseca, C.M. Fonseca, A.O. Hall (2001) <doi:10.1007/3-540-44719-9_15>, hypervolume metric, C.M. Fonseca, L. Paquete, M. López-Ibáñez (2006) <doi:10.1109/CEC.2006.1688440>, epsilon indicator, inverted generational distance, and Vorob'ev threshold, expectation and deviation, M. Binois, D. Ginsbourger, O. Roustant (2015) <doi:10.1016/j.ejor.2014.07.032>, among others.
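Two of the operations listed above, filtering by Pareto optimality and the hypervolume metric, can be illustrated with a deliberately simple sketch. It assumes minimization of all objectives and, for the hypervolume, exactly two objectives. The function names are hypothetical and the quadratic-time filter is for exposition, not the fast algorithms the description refers to.

```python
def dominates(a, b):
    """True if point a Pareto-dominates point b under minimization."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(points):
    """Keep the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by the reference point `ref`.

    Two objectives, minimization. Points that are not strictly better
    than `ref` in both objectives contribute nothing and are ignored.
    """
    inside = [p for p in points if p[0] < ref[0] and p[1] < ref[1]]
    front = sorted(nondominated(inside))
    area = 0.0
    prev_y = ref[1]
    # Along the sorted front, x increases while y decreases, so each
    # point adds a rectangle between its y and the previous point's y.
    for x, y in front:
        area += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return area


if __name__ == "__main__":
    pts = [(1, 3), (2, 2), (3, 1), (3, 3), (2, 4)]
    print(nondominated(pts))
    print(hypervolume_2d(pts, ref=(4, 4)))
```

A larger hypervolume indicates a front that is closer to the ideal point and better spread, which is why it is commonly used to compare the outputs of multi-objective optimizers.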
Like similar profiling tools, the 'proffer' package automatically detects sources of slowness in R code. The distinguishing feature of 'proffer' is its utilization of 'pprof', which supplies interactive visualizations that are efficient and easy to interpret. Behind the scenes, the 'profile' package converts native Rprof() data to a protocol buffer that 'pprof' understands. For the documentation of 'proffer', visit <https://r-prof.github.io/proffer/>. To learn about the implementations and methodologies of 'pprof', protocol buffers, and 'profile', visit <https://github.com/google/pprof>, <https://protobuf.dev>, and <https://github.com/r-prof/profile>, respectively.
This package provides functions for stabilometric signal quantification. The input is a data frame containing the x, y coordinates of the center-of-pressure displacement. Jose Magalhaes de Oliveira (2017) <doi:10.3758/s13428-016-0706-4> "Statokinesigram normalization method"; T E Prieto, J B Myklebust, R G Hoffmann, E G Lovett, B M Myklebust (1996) <doi:10.1109/10.532130> "Measures of postural steadiness: Differences between healthy young and elderly adults"; L F Oliveira et al (1996) <doi:10.1088/0967-3334/17/4/008> "Calculation of area of stabilometric signals using principal component analysis".
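As an illustration of what quantifying a stabilometric signal can involve, the sketch below computes two simple time-domain measures from x, y center-of-pressure coordinates: the mean distance from the mean position and the total path length. These are generic measures of the kind discussed in the postural steadiness literature. Whether the package computes exactly these quantities, and under which names, is an assumption, and the function name is hypothetical.

```python
import math


def sway_measures(x, y):
    """Mean distance from the mean position and total path length.

    `x` and `y` are equally long sequences of center-of-pressure
    coordinates sampled over time, in the same length unit.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Average radial distance of the samples from their mean position.
    mean_distance = sum(
        math.hypot(a - mean_x, b - mean_y) for a, b in zip(x, y)
    ) / n
    # Sum of straight-line distances between consecutive samples.
    path_length = sum(
        math.hypot(x[i + 1] - x[i], y[i + 1] - y[i]) for i in range(n - 1)
    )
    return {"mean_distance": mean_distance, "path_length": path_length}


if __name__ == "__main__":
    xs = [0.0, 0.4, 0.9, 0.7, 0.2]
    ys = [0.0, 0.3, 0.1, -0.4, -0.2]
    print(sway_measures(xs, ys))
```

Dividing the path length by the recording duration gives a mean sway velocity, which makes recordings of different lengths comparable.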
This package provides infrastructure for handling running, cycling and swimming data from GPS-enabled tracking devices within R. The package provides methods to extract, clean and organise workout and competition data into session-based and unit-aware data objects of class trackeRdata (S3 class). The information can then be visualised, summarised, and analysed through flexible and extensible methods. Frick and Kosmidis (2017) <doi:10.18637/jss.v082.i07>, which is updated and maintained as one of the vignettes, provides detailed descriptions of the package and its methods, and real-data demonstrations of the package functionality.
Estimates the Vevea and Hedges (1995) weight-function model. By specifying arguments, users can also estimate the modified model described in Vevea and Woods (2005), which may be more practical with small datasets. Users can also specify moderators to estimate a linear model. The package functionality allows users to easily extract the results of these analyses as R objects for other uses. In addition, the package includes a function to launch both models as a Shiny application. Although the Shiny application is also available online, this function allows users to launch it locally if they choose.