Simplifies and largely automates practical voice analytics for social science research. This package offers an accessible and easy-to-use interface, including an interactive Shiny app, that simplifies the processing, extraction, analysis, and reporting of voice recording data in the behavioral and social sciences. The package includes batch processing capabilities to read and analyze multiple voice files in parallel, automates the extraction of key vocal features for further analysis, and automatically generates APA formatted reports for typical between-group comparisons in experimental social science research. A more extensive methodological introduction that inspired the development of the voiceR package is provided in Hildebrand et al. 2020 <doi:10.1016/j.jbusres.2020.09.020>.
Current gene set enrichment methods rely upon permutations for inference. These approaches are computationally expensive and have minimum achievable p-values based on the number of permutations, not on the actual observed statistics. We have derived three parametric approximations to the permutation distributions of two gene set enrichment test statistics. We are able to reduce the computational burden and granularity issues of permutation testing with our method, which is implemented in this package. npGSEA calculates gene set enrichment statistics and p-values without the computational cost of permutations. It is applicable in settings where one or many gene sets are of interest. There are also built-in plotting functions to help users visualize results.
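The granularity issue mentioned above can be illustrated with a minimal Python sketch. It is not taken from npGSEA: the function name `permutation_p_value`, the toy scores, and the mean-difference statistic are all assumptions chosen for illustration. It shows that a permutation p-value computed with the common add-one correction can never fall below 1 / (B + 1) for B permutations, however extreme the observed statistic is.

```python
import random


def permutation_p_value(scores, labels, n_perm, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)

    def mean_diff(lbls):
        group1 = [s for s, lab in zip(scores, lbls) if lab == 1]
        group0 = [s for s, lab in zip(scores, lbls) if lab == 0]
        return sum(group1) / len(group1) - sum(group0) / len(group0)

    observed = abs(mean_diff(labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the group labels
        if abs(mean_diff(shuffled)) >= observed:
            hits += 1
    # Add-one correction: the estimate is bounded below by 1 / (n_perm + 1).
    return (hits + 1) / (n_perm + 1)


# Toy data (invented for illustration): two clearly separated groups.
scores = [5.1, 4.9, 5.3, 5.0, 1.0, 1.2, 0.9, 1.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
n_perm = 99
p = permutation_p_value(scores, labels, n_perm)
floor = 1 / (n_perm + 1)
print(p, floor)
```

Lowering the floor requires more permutations, which is the computational cost that a parametric approximation avoids.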
Cochran-Mantel-Haenszel methods (Cochran (1954) <doi:10.2307/3001616>; Mantel and Haenszel (1959) <doi:10.1093/jnci/22.4.719>; Landis et al. (1978) <doi:10.2307/1402373>) are a suite of tests applicable to categorical data. A competitor to those tests is the procedure of Nonparametric ANOVA which was initially introduced in Rayner and Best (2013) <doi:10.1111/anzs.12041>. The methodology was then extended in Rayner et al. (2015) <doi:10.1111/anzs.12113>. This package employs functions related to both methodologies and serves as an accompaniment to the book: An Introduction to Cochran-Mantel-Haenszel and Non-Parametric ANOVA. The package also contains the data sets used in that text.
This package provides functions to describe sampling and diversity dynamics of fossil occurrence datasets (e.g. from the Paleobiology Database). The package includes methods to calculate range- and occurrence-based metrics of taxonomic richness, extinction and origination rates, along with traditional sampling measures. A powerful subsampling tool is also included that implements frequently used sampling standardization methods in a multiple-bin framework. Functions in the package also simplify the plotting of time series and occurrence data, as well as other calculations such as environmental affinities and extinction selectivity testing. Details can be found in: Kocsis, A.T.; Reddin, C.J.; Alroy, J. and Kiessling, W. (2019) <doi:10.1101/423780>.
User-friendly interface based on the R package gstat to fit exponential parametric models to empirical semi-variograms in order to model the spatial correlation structure of health data. Geo-located health outcomes of survey participants may be used to model spatial effects on health in an ego-centred approach. The package contains a range of functions to help explore the spatial structure of the data as well as visualize the fit of exponential models for various metaparameter combinations with respect to the number of lag intervals and maximal distance. Furthermore, the outcome of interest can be adjusted for covariates by fitting a linear regression in a preliminary step before the semi-variogram fitting process.
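For orientation, the exponential model fitted to an empirical semi-variogram can be written as a short Python function. This is a sketch of the standard exponential form with a nugget, not code from the package or from gstat; the function and parameter names (`exponential_variogram`, `nugget`, `partial_sill`, `range_param`) are illustrative assumptions.

```python
import math


def exponential_variogram(h, nugget, partial_sill, range_param):
    """Semi-variance at lag distance h under an exponential model.

    gamma(0) = 0 by definition; for h > 0 the curve starts at the nugget
    and approaches nugget + partial_sill as h grows.
    """
    if h < 0:
        raise ValueError("lag distance must be non-negative")
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))


# Illustrative parameter values (assumed, not estimated from any data).
for lag in (0, 25, 50, 150, 500):
    print(lag, round(exponential_variogram(lag, 0.1, 1.0, 50.0), 4))
```

Under this parameterization the curve reaches about 95% of the partial sill at three times the range parameter, which is often quoted as the practical range.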
Samples generalized random product graphs, a generalization of a broad class of network models. Given matrices X, S, and Y with non-negative entries, samples a matrix with expectation X S Y^T and independent Poisson or Bernoulli entries using the fastRG algorithm of Rohe et al. (2017) <https://www.jmlr.org/papers/v19/17-128.html>. The algorithm first samples the number of edges and then puts them down one-by-one. As a result it is O(m) where m is the number of edges, a dramatic improvement over element-wise algorithms that require O(n^2) operations to sample a random graph, where n is the number of nodes.
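The two-step idea described above (sample the number of edges, then place them one at a time) can be sketched in plain Python. This is an illustration written under assumptions, not the package's implementation: the names `sample_poisson` and `sample_edges` are hypothetical, matrices are plain lists of lists, and the result is a multigraph edge list corresponding to the Poisson variant.

```python
import itertools
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's method (fine for modest lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_edges(X, S, Y, seed=0):
    """Return a list of (row, column) edges; repeated pairs are multi-edges."""
    rng = random.Random(seed)
    k1, k2 = len(S), len(S[0])
    col_x = [sum(row[k] for row in X) for k in range(k1)]
    col_y = [sum(row[l] for row in Y) for l in range(k2)]

    # Expected edge mass of each block pair (k, l); the total is sum(X S Y^T).
    block_mass = [col_x[k] * S[k][l] * col_y[l]
                  for k in range(k1) for l in range(k2)]
    total = sum(block_mass)
    if total == 0:
        return []

    # Cumulative weights are built once so each draw is a binary search.
    block_cum = list(itertools.accumulate(block_mass))
    x_cum = [list(itertools.accumulate(row[k] for row in X)) for k in range(k1)]
    y_cum = [list(itertools.accumulate(row[l] for row in Y)) for l in range(k2)]

    edges = []
    for _ in range(sample_poisson(total, rng)):  # step 1: number of edges
        block = rng.choices(range(k1 * k2), cum_weights=block_cum)[0]
        k, l = divmod(block, k2)
        # Step 2: place the edge by drawing its endpoints within the block.
        i = rng.choices(range(len(X)), cum_weights=x_cum[k])[0]
        j = rng.choices(range(len(Y)), cum_weights=y_cum[l])[0]
        edges.append((i, j))
    return edges


# Toy inputs (assumed for illustration): 3 nodes, 2 latent blocks.
X = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
S = [[4.0, 1.0], [1.0, 4.0]]
edges = sample_edges(X, S, X, seed=1)
print(len(edges), edges[:5])
```

The work done after the one-off setup grows with the number of sampled edges rather than with the n^2 possible node pairs, which is the source of the speed-up described above.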
Fits the lifespan datasets of biological systems such as yeast, fruit flies, and other similar biological units with well-known finite mixture models introduced by Farewell V. (1982) <doi:10.2307/2529885> and Al-Hussaini et al. (2000) <doi:10.1080/00949650008812033>. Estimates the parameters of finite mixtures of parametric distributions fitted to a lifespan dataset. Performs the following tasks: 1) estimates the parameters of the finite mixture model with the expectation-maximization (EM) algorithm; 2) computes four goodness-of-fit measures: the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Kolmogorov-Smirnov (KS) statistic, and log-likelihood; 3) determines the initial values by k-means clustering.
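Two of the goodness-of-fit measures listed above follow directly from the maximized log-likelihood. The Python sketch below uses the textbook definitions of AIC and BIC; the function names and the example numbers are illustrative assumptions and do not come from the package.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fit: a two-component mixture with 5 free parameters
# (two means, two scales, one mixing weight) fitted to 200 lifespans.
print(aic(-512.3, 5), bic(-512.3, 5, 200))
```

Lower values indicate a better trade-off between fit and complexity; BIC penalizes extra parameters more heavily than AIC once the sample has more than about seven observations.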
This package provides a subgroup identification method for precision medicine based on quantitative objectives. This method can handle continuous, binary and survival endpoints for both prognostic and predictive cases. For the predictive case, the method aims at identifying a subgroup for which treatment is better than control by at least a pre-specified or auto-selected constant. For the prognostic case, the method aims at identifying a subgroup that is at least better than a pre-specified/auto-selected constant. The derived signature is a linear combination of predictors, and the selected subgroup consists of subjects with the signature > 0. The false discovery rate when no true subgroup exists is controlled at a user-specified level.
This package provides a RangedSummarizedExperiment object of read counts in genes for an RNA-Seq experiment on four human airway smooth muscle cell lines treated with dexamethasone. Details on the gene model and read counting procedure are provided in the package vignette. The citation for the experiment is: Himes BE, Jiang X, Wagner P, Hu R, Wang Q, Klanderman B, Whitaker RM, Duan Q, Lasky-Su J, Nikolos C, Jester W, Johnson M, Panettieri R Jr, Tantisira KG, Weiss ST, Lu Q. RNA-Seq Transcriptome Profiling Identifies CRISPLD2 as a Glucocorticoid Responsive Gene that Modulates Cytokine Function in Airway Smooth Muscle Cells. PLoS One. 2014 Jun 13;9(6):e99625. PMID: 24926665. GEO: GSE52778.
EPE's (Empresa de Pesquisa Energética) 4MD (Modelo de Mercado da Micro e Minigeração Distribuída - Micro and Mini Distributed Generation Market Model) model to forecast the adoption of Distributed Generation. Given the user's assumptions, it is possible to estimate how many consumer units will have distributed generation in Brazil over the next 10 years, for example. In addition, it is possible to estimate the installed capacity, the amount of investments that will be made in the country and the monthly energy contribution of this type of generation. <https://www.epe.gov.br/sites-pt/publicacoes-dados-abertos/publicacoes/PublicacoesArquivos/publicacao-689/topico-639/NT_Metodologia_4MD_PDE_2032_VF.pdf>.
This package performs end-to-end analysis of gene clusters, such as photosynthesis, carbon/nitrogen/sulfur cycling, carotenoid, antibiotic, or viral marker genes (e.g., capsid, polymerase, integrase), from genomes and metagenomes. It parses Basic Local Alignment Search Tool (BLAST) results in tab-delimited format produced by tools like NCBI BLAST+ and Diamond BLASTp, filters Open Reading Frames (ORFs) by length, detects contiguous clusters of reference genes, optionally extracts genomic coordinates, merges functional annotations, and generates publication-ready arrow plots. The package works seamlessly with or without the coding sequences input and skips plotting when no functional groups are found. For more details see Li et al. (2023) <doi:10.1038/s41467-023-42193-7>.
Scalable Bayesian clustering of categorical datasets. The package implements a hierarchical Dirichlet (Process) mixture of multinomial distributions. It is thus a probabilistic latent class model (LCM) and can be used to reduce the dimensionality of hierarchical data and cluster individuals into latent classes. It can automatically infer an appropriate number of latent classes or find k classes, as defined by the user. The model is based on a paper by Dunson and Xing (2009) <doi:10.1198/jasa.2009.tm08439>, but implements a scalable variational inference algorithm so that it is applicable to large datasets. It is described and tested in the accompanying paper by Ahlmann-Eltze and Yau (2018) <doi:10.1109/DSAA.2018.00068>.
The utility of this package is in simulating mixtures of Gaussian distributions with different levels of overlap between mixture components. Pairwise overlap, defined as a sum of two misclassification probabilities, measures the degree of interaction between components and can be readily employed to control the clustering complexity of datasets simulated from mixtures. These datasets can then be used for systematic performance investigation of clustering and finite mixture modeling algorithms. Other capabilities of 'MixSim' include computing the exact overlap for Gaussian mixtures, simulating Gaussian and non-Gaussian data, simulating outliers and noise variables, calculating various measures of agreement between two partitionings, and constructing parallel distribution plots for the graphical display of finite mixture models.
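The definition of pairwise overlap as a sum of two misclassification probabilities can be made concrete in the simplest special case. The Python sketch below assumes two univariate Gaussian components with a common standard deviation, where the classification boundary has a closed form; it is not the package's general algorithm, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pairwise_overlap(mu1, mu2, sigma, pi1=0.5, pi2=0.5):
    """Overlap of two equal-variance univariate Gaussian components.

    Each misclassification probability is the chance that a draw from one
    component has a larger weighted density under the other component.
    """
    if mu1 == mu2:
        raise ValueError("means must differ in this special case")
    if mu1 > mu2:  # order the components so that mu1 < mu2
        mu1, mu2, pi1, pi2 = mu2, mu1, pi2, pi1
    # Point where pi1 * phi1(x) equals pi2 * phi2(x).
    boundary = 0.5 * (mu1 + mu2) + sigma ** 2 * math.log(pi1 / pi2) / (mu2 - mu1)
    omega_2_given_1 = 1.0 - normal_cdf((boundary - mu1) / sigma)
    omega_1_given_2 = normal_cdf((boundary - mu2) / sigma)
    return omega_2_given_1 + omega_1_given_2


for separation in (1.0, 2.0, 4.0):
    print(separation, round(pairwise_overlap(0.0, separation, 1.0), 4))
```

Moving the means further apart shrinks the overlap, which is how a target overlap value translates into easier or harder clustering problems.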
Calculate and compare the prediction probability (PK) values for Anesthetic Depth Indicators. The PK values are widely used for measuring the performance of anesthetic depth indicators and were first proposed by the group of Dr. Warren D. Smith in the papers Warren D. Smith; Robert C. Dutton; Ty N. Smith (1996) <doi:10.1097/00000542-199601000-00005> and Warren D. Smith; Robert C. Dutton; Ty N. Smith (1996) <doi:10.1002/(SICI)1097-0258(19960615)15:11%3C1199::AID-SIM218%3E3.0.CO;2-Y>. The authors provided two Microsoft Excel files in xls format for calculating and comparing PK values. This package provides an easy-to-use API for calculating and comparing PK values in R.
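A pairwise sketch of the PK statistic in Python may help make the quantity concrete. It assumes the usual definition in terms of concordant pairs, discordant pairs, and pairs tied in the indicator only, with pairs tied in the observed depth left out; it does not reproduce the package's API or its standard-error calculations, and the function name is hypothetical.

```python
def prediction_probability(indicator, depth):
    """PK = (Pc + 0.5 * Ptx) / (Pc + Pd + Ptx), computed over all pairs."""
    if len(indicator) != len(depth):
        raise ValueError("inputs must have the same length")
    concordant = discordant = tied_indicator = 0
    n = len(indicator)
    for a in range(n):
        for b in range(a + 1, n):
            dy = depth[a] - depth[b]
            if dy == 0:
                continue  # pairs tied in observed depth carry no information
            dx = indicator[a] - indicator[b]
            if dx == 0:
                tied_indicator += 1
            elif (dx > 0) == (dy > 0):
                concordant += 1
            else:
                discordant += 1
    total = concordant + discordant + tied_indicator
    if total == 0:
        raise ValueError("no pairs with distinct observed depth")
    return (concordant + 0.5 * tied_indicator) / total


# Toy readings (invented): indicator values and ordinal depth levels.
indicator = [90, 85, 60, 62, 40, 35]
depth = [1, 1, 2, 2, 3, 3]
print(prediction_probability(indicator, depth))
```

A value of 1 means the indicator always orders pairs the same way as the observed depth, 0.5 means it does no better than chance, and 0 means it always orders them the opposite way.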
This package provides a process-oriented and trajectory-based Discrete-Event Simulation (DES) package for R. It is designed as a generic yet powerful framework. The architecture encloses a robust and fast simulation core written in C++ with automatic monitoring capabilities. It provides a rich and flexible R API that revolves around the concept of trajectory, a common path in the simulation model for entities of the same type. Documentation about simmer is provided by several vignettes included in this package, via the paper by Ucar, Smeets & Azcorra (2019, <doi:10.18637/jss.v090.i02>), and the paper by Ucar, Hernández, Serrano & Azcorra (2018, <doi:10.1109/MCOM.2018.1700960>); see citation("simmer") for details.
This is an R implementation of a constrained l1 minimization approach for estimating multiple Sparse Gaussian or Nonparanormal Graphical Models (SIMULE). The SIMULE algorithm can be used to estimate multiple related precision matrices. For instance, it can identify context-specific gene networks from multi-context gene expression datasets. By performing data-driven network inference from high-dimensional and heterogeneous data sets, this tool can help users effectively translate aggregated data into knowledge that takes the form of graphs among entities. Please run demo(simuleDemo) to learn the basic functions provided by this package. For further details, please read the original paper: Beilun Wang, Ritambhara Singh, Yanjun Qi (2017) <DOI:10.1007/s10994-017-5635-7>.
This package provides movies to help students to understand statistical concepts. The rpanel package <https://cran.r-project.org/package=rpanel> is used to create interactive plots that move to illustrate key statistical ideas and methods. There are movies to: visualise probability distributions (including user-supplied ones); illustrate sampling distributions of the sample mean (central limit theorem), the median, the sample maximum (extremal types theorem) and (the Fisher transformation of the) product moment correlation coefficient; examine the influence of an individual observation in simple linear regression; illustrate key concepts in statistical hypothesis testing. Also provided are dpqr functions for the distribution of the Fisher transformation of the correlation coefficient under sampling from a bivariate normal distribution.
This package provides a collection of tools for clinical trial data management and analysis in research and teaching. The collection is mainly intended for personal use, but any use beyond that is encouraged. This package has migrated functions from 'agdamsbo/daDoctoR', and new functions have been added. Version numbers follow month and year. See NEWS/Changelog for release notes. This package includes sampled data from the TALOS trial (Kraglund et al (2018) <doi:10.1161/STROKEAHA.117.020067>). The win_prob() function is based on work by Zou et al (2022) <doi:10.1161/STROKEAHA.121.037744>. The age_calc() function is based on work by Becker (2020) <doi:10.18637/jss.v093.i02>.
cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discover smaller scale, but highly correlated cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. Particularly, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analyses for co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.
The Langmuir and Freundlich adsorption isotherms are pivotal in characterizing adsorption processes, essential across various scientific disciplines. Proper interpretation of adsorption isotherms involves robust fitting of data to the models, accurate estimation of parameters, and efficiency evaluation of the models, both in linear and non-linear forms. For researchers and practitioners in the fields of chemistry, environmental science, soil science, and engineering, a comprehensive package that satisfies all these requirements would be ideal for accurate and efficient analysis of adsorption data, precise model selection and validation for rigorous scientific inquiry and real-world applications. Details can be found in Langmuir (1918) <doi:10.1021/ja02242a004> and Giles (1973) <doi:10.1111/j.1478-4408.1973.tb03158.x>.
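The two isotherm models and a linearized fit can be sketched in a few lines of Python. This is an illustration of the standard equations, not the package's interface: the function names, parameter names, and the synthetic data are assumptions, and only the log-log linear form of the Freundlich model is fitted here.

```python
import math


def langmuir(c, q_max, k_l):
    """Langmuir isotherm: q = q_max * K_L * C / (1 + K_L * C)."""
    return q_max * k_l * c / (1.0 + k_l * c)


def freundlich(c, k_f, n):
    """Freundlich isotherm: q = K_F * C ** (1 / n)."""
    return k_f * c ** (1.0 / n)


def fit_freundlich_linear(conc, q):
    """Least-squares fit of log q = log K_F + (1 / n) log C."""
    xs = [math.log(c) for c in conc]
    ys = [math.log(v) for v in q]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return math.exp(intercept), 1.0 / slope  # (K_F, n)


# Synthetic, noise-free data generated from known parameters (not measurements).
conc = [0.5, 1.0, 2.0, 4.0, 8.0]
q_obs = [freundlich(c, k_f=2.0, n=2.5) for c in conc]
k_f_hat, n_hat = fit_freundlich_linear(conc, q_obs)
print(round(k_f_hat, 6), round(n_hat, 6))
```

With noisy data the linearized and non-linear fits generally give different estimates, because the log transform changes the error structure. That difference is one reason for evaluating both forms.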
This package provides functions for estimating plant pathogen parameters from access period (AP) experiments. Separate functions are implemented for semi-persistently transmitted (SPT) and persistently transmitted (PT) pathogens. The common AP experiment exposes insect cohorts to infected source plants, healthy test plants, and intermediate plants (for PT pathogens). The package allows estimation of acquisition and inoculation rates during feeding, recovery rates, and latent progression rates (for PT pathogens). Additional functions support inference of epidemic risk from pathogen and local parameters, and also simulate AP experiment data. The functions implement probability models for epidemiological analysis, as derived in Donnelly et al. (2025), <doi:10.32942/X29K9P>. These models were originally implemented in the EpiPv GitHub package.
Calculates the fused extended two-way fixed effects (FETWFE) estimator for unbiased and efficient estimation of difference-in-differences in panel data with staggered treatment adoption. This estimator eliminates bias inherent in conventional two-way fixed effects estimators, while also employing a novel bridge regression regularization approach to improve efficiency and yield valid standard errors. Also implements extended TWFE (etwfe) and bridge-penalized ETWFE (betwfe). Provides S3 classes for streamlined workflow and supports flexible tuning (ridge and rank-condition guarantees), automatic covariate centering/scaling, and detailed overall and cohort-specific effect estimates with valid standard errors. Includes simulation and formatting utilities, extensive diagnostic tools, vignettes, and examples. See Faletto (2025) (<doi:10.48550/arXiv.2312.05985>).
This package provides a model building procedure to build a parsimonious geoadditive model from a large number of covariates. Continuous, binary and ordered categorical responses are supported. The model building is based on component-wise gradient boosting with linear effects, smoothing splines and a smooth spatial surface to model spatial autocorrelation. The resulting covariate set after gradient boosting is further reduced through backward elimination and aggregation of factor levels. The package provides a model-based bootstrap method to simulate prediction intervals for point predictions. A test data set of a soil mapping case study in Berne (Switzerland) is provided. For details, see Nussbaum, M., Walthert, L., Fraefel, M., Greiner, L., and Papritz, A. (2017) <doi:10.5194/soil-3-191-2017>.
This package provides an interface to the Maxar Geospatial Platform (MGP) Application Programming Interface <https://www.maxar.com/maxar-geospatial-platform>. It facilitates imagery searches using the MGP Streaming Application Programming Interface via the Web Feature Service (WFS) method, and supports image downloads through Web Map Service (WMS) and Web Map Tile Service (WMTS) Open Geospatial Consortium (OGC) methods. Additionally, it integrates with the Maxar Geospatial Platform Basemaps Application Programming Interface for accessing Maxar basemaps imagery and seamlines. The package also offers seamless integration with the Maxar Geospatial Platform Discovery Application Programming Interface, allowing users to search, filter, and sort Maxar content, while retrieving detailed metadata in formats like SpatioTemporal Asset Catalog (STAC) and GeoJSON.