This package performs distance sampling simulations. dsims repeatedly generates instances of a user-defined population within a given survey region. It then generates realisations of a survey design and simulates the detection process. The data are then analysed so that the results can be compared for accuracy and precision across all replications. This process allows users to optimise survey designs for their specific set of survey conditions. The effects of uncertainty in population distribution or parameters can be investigated across a number of simulations so that users can be confident that they have achieved a robust survey design before deploying vessels into the field. The distance sampling designs used in this package, from dssd, are detailed in Chapter 7 of Advanced Distance Sampling, Buckland et al. (2008, ISBN-13: 978-0199225873). General distance sampling methods are detailed in Introduction to Distance Sampling: Estimating Abundance of Biological Populations, Buckland et al. (2004, ISBN-13: 978-0198509271). Find out more about estimating animal/plant abundance with distance sampling at <https://distancesampling.org/>.
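As a sketch of the workflow just described, the outline below follows the function names in the dsims and dssd documentation (make.region(), make.density(), make.simulation(), run.simulation(), etc.); the argument values are illustrative placeholders, not recommendations.

```r
library(dssd)   # survey region and design
library(dsims)  # population, detectability, simulation

region   <- make.region()                          # default study region
density  <- make.density(region = region, x.space = 100, constant = 1)
pop      <- make.population.description(region = region, density = density, N = 300)
detect   <- make.detectability(key.function = "hn", scale.param = 25, truncation = 50)
design   <- make.design(region = region, transect.type = "line",
                        samplers = 20, truncation = 50)
analysis <- make.ds.analysis(dfmodel = ~1, key = "hn", truncation = 50)
sim <- make.simulation(reps = 99, design = design,
                       population.description = pop,
                       detectability = detect, ds.analysis = analysis)
results <- run.simulation(sim)   # replicate the survey and detection process
summary(results)                 # compare accuracy and precision across replicates
```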
This data package contains four datasets of quantitative PCR (qPCR) amplification curves that were used as supplementary data in the research article by Sisti et al. (2010), <doi:10.1186/1471-2105-11-186>. The primary dataset comprises a ten-fold dilution series spanning copy numbers from 3.14 × 10^7 to 3.14 × 10^2, with twelve replicates per concentration. These samples are based on a pGEM-T Promega plasmid containing a 104 bp fragment of the mitochondrial gene NADH dehydrogenase 1 (MT-ND1), amplified using the ND1/ND2 primer pair. The remaining three datasets contain qPCR results in the presence of specific PCR inhibitors: tannic acid, immunoglobulin G (IgG), and quercetin, respectively, to assess their effects on the amplification process. These datasets are useful for researchers interested in PCR kinetics. The original raw data file is available as Additional File 1: <https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-11-186/MediaObjects/12859_2009_3643_MOESM1_ESM.XLS>.
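Since the dataset object names are not listed above, the snippet below does not load the package; it simply simulates sigmoid-shaped amplification curves to illustrate what a ten-fold dilution series looks like (each dilution delays the take-off by about log2(10) ≈ 3.3 cycles).

```r
cycles <- 1:40
amp_curve <- function(mid) 1 / (1 + exp(-(cycles - mid) / 1.5))  # logistic curve
mids <- 15 + (0:5) * log2(10)   # ten-fold dilutions shift Cq by ~3.32 cycles
curves <- sapply(mids, amp_curve)
matplot(cycles, curves, type = "l", lty = 1,
        xlab = "Cycle", ylab = "Normalised fluorescence",
        main = "Simulated ten-fold dilution series")
```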
The Mahalanobis-Taguchi (MT) system is a collection of multivariate analysis methods developed for the field of quality engineering. The MT system consists of two families, distinguished by purpose. One is the family of Mahalanobis-Taguchi (MT) methods (in the broad sense) for diagnosis (see Woodall, W. H., Koudelik, R., Tsui, K. L., Kim, S. B., Stoumbos, Z. G., and Carvounis, C. P. (2003) <doi:10.1198/004017002188618626>), and the other is the family of Taguchi (T) methods for forecasting (see Kawada, H., and Nagata, Y. (2015) <doi:10.17929/tqs.1.12>). The MT package contains three basic methods from the family of MT methods and one basic method from the family of T methods. The MT method (in the narrow sense), the Mahalanobis-Taguchi Adjoint (MTA) method, and the Recognition-Taguchi (RT) method belong to the family of MT methods, and the two-sided Taguchi (T1) method belongs to the family of T methods. In addition, the Ta and Tb methods, which are improved versions of the T1 method, are included.
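The diagnostic core shared by the MT-family methods is the Mahalanobis distance of a new observation from a "unit space" of normal observations; the base-R sketch below illustrates that idea only and is not this package's interface.

```r
set.seed(1)
unit_space <- matrix(rnorm(200), ncol = 4)   # 50 normal observations, 4 variables
mu <- colMeans(unit_space)
S  <- cov(unit_space)
new_case <- c(3, -2, 2.5, 0)                 # observation to be diagnosed
d2 <- mahalanobis(rbind(new_case), center = mu, cov = S)
d2 / ncol(unit_space)                        # scaled MD; values near 1 are "normal"
```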
This package provides a collection of functions to test and estimate Seemingly Unrelated Regression (SUR) models with spatial structure, by maximum likelihood and three-stage least squares. The package estimates the most common spatial specifications, that is, SUR with Spatial Lag of X regressors (SUR-SLX), SUR with Spatial Lag Model (SUR-SLM), SUR with Spatial Error Model (SUR-SEM), SUR with Spatial Durbin Model (SUR-SDM), SUR with Spatial Durbin Error Model (SUR-SDEM), SUR with Spatial Autoregressive terms and Spatial Autoregressive Disturbances (SUR-SARAR), SUR-SARAR with Spatial Lag of X regressors (SUR-GNM), and SUR with Spatially Independent Model (SUR-SIM). The methodology of these models can be found in the following references: Minguez, R., Lopez, F.A., and Mur, J. (2022) <doi:10.18637/jss.v104.i11>; Mur, J., Lopez, F.A., and Herrera, M. (2010) <doi:10.1080/17421772.2010.516443>; Lopez, F.A., Mur, J., and Angulo, A. (2014) <doi:10.1007/s00168-014-0624-2>.
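A minimal fitting sketch, assuming the spsurml() interface described in the JSS paper (equations separated by | in the formula, the spatial specification chosen via type); the lattice data here are simulated placeholders.

```r
library(spsur)
library(spdep)                                  # spatial weights
set.seed(1)
lw  <- nb2listw(cell2nb(5, 5))                  # weights for a 5 x 5 lattice
dat <- data.frame(y1 = rnorm(25), y2 = rnorm(25),
                  x1 = rnorm(25), x2 = rnorm(25), x3 = rnorm(25))
fit <- spsurml(formula = y1 | y2 ~ x1 + x2 | x1 + x3,
               data = dat, listw = lw, type = "slm")  # SUR-SLM by ML
summary(fit)
```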
An implementation of high-dimensional time series analysis methods, including the factor model for vector time series proposed by Lam and Yao (2012) <doi:10.1214/12-AOS970> and Chang, Guo and Yao (2015) <doi:10.1016/j.jeconom.2015.03.024>, the martingale difference test proposed by Chang, Jiang and Shao (2023) <doi:10.1016/j.jeconom.2022.09.001>, principal component analysis for vector time series proposed by Chang, Guo and Yao (2018) <doi:10.1214/17-AOS1613>, cointegration analysis proposed by Zhang, Robinson and Yao (2019) <doi:10.1080/01621459.2018.1458620>, the unit root test proposed by Chang, Cheng and Yao (2022) <doi:10.1093/biomet/asab034>, the white noise test proposed by Chang, Yao and Zhou (2017) <doi:10.1093/biomet/asw066>, CP-decomposition for matrix time series proposed by Chang et al. (2023) <doi:10.1093/jrsssb/qkac011> and Chang et al. (2024) <doi:10.48550/arXiv.2410.05634>, and statistical inference for spectral density matrices proposed by Chang et al. (2022) <doi:10.48550/arXiv.2212.13686>.
A prediction model is calibrated if, roughly, for any percentage x we can expect that x subjects out of 100 experience the event among all subjects that have a predicted risk of x%. A calibration plot provides a simple, yet useful, way of assessing the calibration assumption. The Wally plot consists of a sequence of usual calibration plots. Among the plots contained within the sequence, one is the actual calibration plot obtained from the data; the others are obtained from similar simulated data under the calibration assumption. The sequence gives the investigator a direct visual impression of the shape and sampling variability to be expected under the calibration assumption. The original calibration plot from the data is included at a random position among the simulated calibration plots, similarly to a police lineup. If the original calibration plot is not easily identified, then the calibration assumption is not contradicted by the data. The method handles the common situations in which the data contain censored observations and occurrences of competing events.
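The lineup logic can be sketched in a few lines of base R (a conceptual illustration, not this package's plotting code): one panel shows the real calibration curve, the rest are drawn under perfect calibration.

```r
set.seed(42)
p <- runif(500)                                  # predicted risks
y_real <- rbinom(500, 1, p^1.2)                  # outcomes, mildly miscalibrated
panel <- function(y) {
  bins <- cut(p, seq(0, 1, 0.1))
  plot(tapply(p, bins, mean), tapply(y, bins, mean),
       xlim = 0:1, ylim = 0:1, xlab = "Predicted", ylab = "Observed")
  abline(0, 1, lty = 2)
}
par(mfrow = c(2, 2))
slot <- sample(4, 1)                             # random position of the real plot
for (i in 1:4) if (i == slot) panel(y_real) else panel(rbinom(500, 1, p))
```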
The Satellite Application Facility on Climate Monitoring (CM SAF) is a ground segment of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and one of EUMETSAT's Satellite Application Facilities. The CM SAF contributes to the sustainable monitoring of the climate system by providing essential climate variables related to the energy and water cycle of the atmosphere (<https://www.cmsaf.eu>). It is a joint cooperation of eight National Meteorological and Hydrological Services. The cmsaf R package includes a shiny-based interface for easy application of the cmsafops and cmsafvis packages - the CM SAF R Toolbox. The Toolbox offers an easy way to prepare, manipulate, analyse and visualize CM SAF NetCDF-formatted data. Other CF-conformant NetCDF data with time, longitude and latitude dimensions should also work, but error-free operation cannot be guaranteed. CM SAF climate data records are provided for free via <https://wui.cmsaf.eu/safira>. Detailed information and test data are provided on the CM SAF webpage (<http://www.cmsaf.eu/R_toolbox>).
Glycolipid mass spectrometry technology has the potential to accurately identify individual bacterial species from polymicrobial samples. To develop bacterial identification algorithms (e.g. machine learning) using this glycolipid technology, it is necessary to generate a large number of varied in-silico polymicrobial mass spectra that are similar to real mass spectra. MGMS2 (Membrane Glycolipid Mass Spectrum Simulator) generates such in-silico mass spectra, accounting for errors in m/z (mass-to-charge ratio), variances of intensity values, occasional missing signature ions, and noise peaks. It estimates summary statistics of monomicrobial mass spectra for each strain or species and simulates polymicrobial glycolipid mass spectra using those summary statistics. References: Ryu, S.Y., Wendt, G.A., Chandler, C.E., Ernst, R.K. and Goodlett, D.R. (2019) <doi:10.1021/acs.analchem.9b03340> "Model-based Spectral Library Approach for Bacterial Identification via Membrane Glycolipids"; Gibb, S. and Strimmer, K. (2012) <doi:10.1093/bioinformatics/bts447> "MALDIquant: a versatile R package for the analysis of mass spectrometry data".
NuPoP is an R package for nucleosome positioning prediction. It is built upon a duration hidden Markov model proposed in Xi et al. (2010) and Wang et al. (2008). The core of the package is written in Fortran. In addition to the R package, a stand-alone Fortran software tool is also available at https://github.com/jipingw. The Fortran code provides the complete functionality of the R package. Note: NuPoP has two separate functions for prediction of nucleosome positioning, one for MNase map-trained models and the other for chemical map-trained models. The latter is implemented for four species - yeast, S. pombe, mouse and human - trained on the basis of our recent publications. We note that there is another package, nuCpos, by another group for prediction of nucleosome positioning trained with chemical maps. A report comparing recent versions of NuPoP with nuCpos can be found at https://github.com/jiping/NuPoP_doc. More information can be found, and will be posted, at https://github.com/jipingw/NuPoP.
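A minimal call sketch, assuming the predNuPoP() interface from the Bioconductor documentation; the FASTA path and the species/model codings are assumptions to verify against ?predNuPoP.

```r
library(NuPoP)
# predict nucleosome positioning for a FASTA sequence with an MNase map-trained
# duration HMM; the species and model codes below are assumed, not checked
predNuPoP("mysequence.fasta", species = 7, model = 4)
# results are written to a text file in the working directory;
# readNuPoP() can then be used to load them back into R (assumed helper)
```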
With satin functions, visualisation, data extraction and further analyses of satellite-derived ocean data - such as producing climatologies from several images, or anomalies - can be easily done. Reading functions can import a user-defined geographical extent of data stored in netCDF files. Currently supported ocean data sources include NASA's Oceancolor web page <https://oceancolor.gsfc.nasa.gov/>, sensors VIIRS-SNPP, MODIS-Terra, MODIS-Aqua, and SeaWiFS. Available variables from this source include chlorophyll concentration, sea surface temperature (SST), and several others. Data sources specific to SST that can also be imported include Pathfinder AVHRR <https://www.ncei.noaa.gov/products/avhrr-pathfinder-sst> and GHRSST <https://www.ghrsst.org/>. In addition, ocean productivity data produced by Oregon State University <http://sites.science.oregonstate.edu/ocean.productivity/> can be handled after conversion from HDF4 to HDF5 format. Many other ocean variables can be processed by importing netCDF data files from two of the European Union's Copernicus Marine Service databases <https://marine.copernicus.eu/>, namely Global Ocean Physical Reanalysis and Global Ocean Biogeochemistry Hindcast.
A fast and consistent tool for the arrangement of microdata and the visualization of official figures and statistics from the National University of Colombia <https://unal.edu.co>. It includes a library of graphical functions, both static and interactive, offering numerous types of charts with a highly configurable and simple syntax. Among these are HTML tables, series, bar and pie charts, maps, etc., all supported by JavaScript libraries. It provides the capability to transition from the interactive to the dynamic world and from one library to another without changing function or syntax.
Covariate measurement error correction is implemented by means of regression calibration by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331), efficient regression calibration by Spiegelman D, Carroll RJ & Kipnis V (2001) <doi:10.1002/1097-0258(20010115)20:1%3C139::AID-SIM644%3E3.0.CO;2-K> and maximum likelihood estimation by Bartlett JW, De Stavola BL & Frost C (2009) <doi:10.1002/sim.3713>. Outcome measurement error correction is implemented by means of the method of moments by Buonaccorsi JP (2010, ISBN:1420066560) and efficient method of moments by Keogh RH, Carroll RJ, Tooze JA, Kirkpatrick SI & Freedman LS (2014) <doi:10.1002/sim.7011>. Standard error estimation of the corrected estimators is implemented by means of the Delta method by Rosner B, Spiegelman D & Willett WC (1990) <doi:10.1093/oxfordjournals.aje.a115715> and Rosner B, Spiegelman D & Willett WC (1992) <doi:10.1093/oxfordjournals.aje.a116453>, the Fieller method described by Buonaccorsi JP (2010, ISBN:1420066560), and the Bootstrap by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331).
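Regression calibration, the first correction listed above, can be sketched in base R (a conceptual illustration, not this package's interface): the error-free covariate is regressed on its error-prone substitute in a validation subset, and the outcome model then uses the calibrated predictions.

```r
set.seed(7)
n <- 400
x <- rnorm(n); z <- rnorm(n)
xstar <- x + rnorm(n, sd = 0.8)                 # substitute measured with error
y <- 1 + 0.5 * x + 0.3 * z + rnorm(n)
d <- data.frame(y, x, xstar, z)
cal  <- lm(x ~ xstar + z, data = d[1:100, ])    # calibration model (validation set)
xhat <- predict(cal, newdata = d)
coef(lm(y ~ xhat + z, data = d))["xhat"]        # corrected slope, close to 0.5
coef(lm(y ~ xstar + z, data = d))["xstar"]      # naive slope, attenuated
```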
This package implements techniques to estimate the unknown quantities related to two-component admixture models, where the two components can belong to any distribution (note that in the case of multinomial mixtures, the two components must belong to the same family). Estimation methods depend on the assumptions made on the unknown component density; see Bordes and Vandekerkhove (2010) <doi:10.3103/S1066530710010023>, Patra and Sen (2016) <doi:10.1111/rssb.12148>, and Milhaud, Pommeret, Salhi, Vandekerkhove (2024) <doi:10.3150/23-BEJ1593>. In practice, one can estimate both the mixture weight and the unknown component density in a wide variety of frameworks. On top of that, hypothesis tests can be performed in one and two-sample contexts to test the unknown component density (see Milhaud, Pommeret, Salhi and Vandekerkhove (2022) <doi:10.1016/j.jspi.2021.05.010>, and Milhaud, Pommeret, Salhi, Vandekerkhove (2024) <doi:10.3150/23-BEJ1593>). Finally, clustering of unknown mixture components is also feasible in a K-sample setting (see Milhaud, Pommeret, Salhi, Vandekerkhove (2024) <https://jmlr.org/papers/v25/23-0914.html>).
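To make the model concrete, the simulation below generates data from a two-component admixture f(x) = p f_unknown(x) + (1 - p) f_known(x) with the known component N(0,1); it illustrates the setting only and does not use this package's estimators.

```r
set.seed(3)
n <- 1000; p <- 0.3
z <- rbinom(n, 1, p)                              # latent component membership
x <- ifelse(z == 1, rnorm(n, mean = 3, sd = 0.5), # unknown component
                    rnorm(n))                     # known component N(0,1)
hist(x, breaks = 40, freq = FALSE, main = "Two-component admixture")
curve((1 - p) * dnorm(x), add = TRUE, lty = 2)    # known component's share
```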
The original idea was presented in Varghese et al. (2020, 74(1):35-42), "Bayesian State-space Implementation of Schaefer Production Model for Assessment of Stock Status for Multi-gear Fishery". Marine fisheries governance and management practices are essential to ensure the sustainability of marine resources. A widely accepted resource management strategy is to derive sustainable fish harvest levels based on the status of the marine fish stock. Fish stock assessment models that describe the biomass dynamics using time series data on fish catch and fishing effort are generally used for this purpose. In a complex multi-species marine fishery, where different species are caught by a number of fishing gears and each gear harvests a number of species, it is difficult to obtain the fishing effort corresponding to each fish species. Since the capacity of the gears varies, the effort made to catch a resource cannot be taken as the sum of the efforts expended by the different fishing gears. This necessitates standardisation of fishing effort on a unit base.
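A common unit-base standardisation scales each gear's effort by its relative fishing power (its CPUE divided by that of a standard gear); the arithmetic sketch below illustrates this general idea, which is not necessarily the paper's exact scheme.

```r
catch  <- c(trawl = 120, gillnet = 45, line = 15)   # tonnes (illustrative)
effort <- c(trawl = 300, gillnet = 250, line = 150) # gear-specific effort units
cpue   <- catch / effort
power  <- cpue / cpue["trawl"]                      # trawl chosen as standard gear
std_effort <- effort * power                        # effort in trawl-equivalent units
sum(std_effort)                                     # total standardised effort
```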
An efficient unified nonconvex penalized estimation algorithm for Gaussian (linear), binomial Logit (logistic), Poisson, multinomial Logit, and Cox proportional hazard regression models. The unified algorithm is implemented based on the convex-concave procedure and can be applied to most existing nonconvex penalties. The algorithm also supports the convex penalty least absolute shrinkage and selection operator (LASSO). Supported nonconvex penalties include smoothly clipped absolute deviation (SCAD), minimax concave penalty (MCP), truncated LASSO penalty (TLP), clipped LASSO (CLASSO), sparse ridge (SRIDGE), modified bridge (MBRIDGE) and modified log (MLOG). For high-dimensional data (data sets with many variables), the algorithm selects relevant variables, producing a parsimonious regression model. See Kim, D., Lee, S. and Kwon, S. (2018) <arXiv:1811.05061>; Lee, S., Kwon, S. and Kim, Y. (2016) <doi:10.1016/j.csda.2015.08.019>; Kwon, S., Lee, S. and Kim, Y. (2015) <doi:10.1016/j.csda.2015.07.001>. (This research was funded by the Julian Virtue Professorship from the Center for Applied Research at Pepperdine Graziadio Business School and by the National Research Foundation of Korea.)
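A minimal sketch, assuming the y.vec/x.mat interface of ncpen() shown in the package documentation; the data are simulated so that only the first three predictors are relevant.

```r
library(ncpen)
set.seed(1)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- as.vector(x[, 1:3] %*% c(2, -1, 1.5) + rnorm(100))
fit <- ncpen(y.vec = y, x.mat = x,
             family = "gaussian", penalty = "scad")  # SCAD along a lambda path
head(coef(fit))   # sparse estimates: noise predictors shrunk to exactly zero
```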
An implementation of a single-index regression for optimizing individualized dose rules from an observational study. To model interaction effects between baseline covariates and a treatment variable defined on a continuum, we employ two-dimensional penalized spline regression on an index-treatment domain, where the index is defined as a linear combination of the covariates (a single index). An unspecified main effect for the covariates is allowed, which can also be modeled through a parametric model. A unique contribution of this work is the parsimonious single-index parametrization defined specifically for the interaction effect term. We refer to Park, Petkova, Tarpey, and Ogden (2020) <doi:10.1111/biom.13320> (for the case of a discrete treatment) and Park, Petkova, Tarpey, and Ogden (2021) "A single-index model with a surface-link for optimizing individualized dose rules" <arXiv:2006.00267v2> for details of the method. The model can take a member of the exponential family as a response variable and can also take an ordinal categorical response. The main function of this package is simsl().
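A minimal sketch, assuming simsl() accepts the outcome y, continuous dose A, and covariate matrix X as separate arguments (to be checked against ?simsl); the data are simulated with a dose-by-single-index interaction.

```r
library(simsl)
set.seed(2)
n <- 400
X <- matrix(rnorm(n * 5), n, 5)
A <- runif(n, 0, 2)                                       # continuous dose
y <- 0.5 * X[, 1] + A * (X[, 1] - 0.5 * X[, 2]) + rnorm(n)
fit <- simsl(y = y, A = A, X = X)
fit$beta.coef   # estimated single-index coefficients (assumed component name)
```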
Constructs genotype x environment interaction (GxE) models where G is a weighted sum of genetic variants (genetic score) and E is a weighted sum of environments (environmental score), using the alternating optimization algorithm by Jolicoeur-Martineau et al. (2017) <arXiv:1703.08111>. This approach has greatly enhanced predictive power over traditional GxE models, which include only a single genetic variant and a single environmental exposure. Although this approach was originally developed for GxE modelling, it is flexible and does not require the use of genetic and environmental variables. It can also handle more than two latent variables (rather than just G and E) and three-way interactions or more. The LEGIT model produces highly interpretable results and is very parameter-efficient; thus it can be used even with small sample sizes (n < 250). Tools to determine the type of interaction (vantage sensitivity, diathesis-stress or differential susceptibility), with any number of genetic variants or environments, are available <arXiv:1712.04058>. The software can now produce mixed-effects LEGIT models through the lme4 package.
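A minimal sketch, assuming the LEGIT(data, genes, env, formula) interface from the package documentation, where G and E in the formula stand for the latent genetic and environmental scores; the data are simulated.

```r
library(LEGIT)
set.seed(5)
n <- 300
genes <- data.frame(g1 = rbinom(n, 2, 0.3), g2 = rbinom(n, 2, 0.4))
env   <- data.frame(e1 = rnorm(n), e2 = rnorm(n))
G_true <- 0.6 * genes$g1 + 0.4 * genes$g2
E_true <- 0.7 * env$e1 + 0.3 * env$e2
dat <- data.frame(y = 1 + 0.5 * G_true * E_true + rnorm(n))
fit <- LEGIT(data = dat, genes = genes, env = env, formula = y ~ G * E)
summary(fit)   # weights of each variant/exposure plus the main GxE model
```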
Calculate unified measures that quantify the effect of a covariate on a binary dependent variable (e.g., for meta-analyses). This can be particularly important if the estimation results are obtained with different models/estimators (e.g., linear probability model, logit, probit, ...) and/or with different transformations of the explanatory variable of interest (e.g., linear, quadratic, interval-coded, ...). The calculated unified measures are: (a) semi-elasticities of linear, quadratic, or interval-coded covariates and (b) effects of linear, quadratic, interval-coded, or categorical covariates when a linear or quadratic covariate changes between distinct intervals, the reference category of a categorical variable or the reference interval of an interval-coded variable needs to be changed, or some categories of a categorical covariate or some intervals of an interval-coded covariate need to be grouped together. Approximate standard errors of the unified measures are also calculated. All methods that are implemented in this package are described in the vignette "Extracting and Unifying Semi-Elasticities and Effect Sizes from Studies with Binary Dependent Variables" that is included in this package.
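The semi-elasticity described in (a) can be computed by hand for a logit model, since dP/d ln x = beta_x * x * p * (1 - p); the base-R sketch below evaluates it at the sample means and illustrates the quantity itself, not this package's interface.

```r
set.seed(9)
d <- data.frame(x = rlnorm(500), z = rnorm(500))
d$y <- rbinom(500, 1, plogis(-1 + 0.4 * d$x + 0.2 * d$z))
fit <- glm(y ~ x + z, family = binomial, data = d)
p_bar <- plogis(sum(coef(fit) * c(1, mean(d$x), mean(d$z))))
semi_ela <- coef(fit)["x"] * mean(d$x) * p_bar * (1 - p_bar)
semi_ela / 100   # change in P(y = 1) for a 1% increase in x
```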
In some situations where researchers would like to demonstrate causal effects, it is hard to obtain a sample size that would allow for a well-powered randomized controlled trial. Single case designs are experimental designs that can be used to demonstrate causal effects with only one participant or with only a few participants. The scdtb package provides a suite of tools for analyzing data from studies that use single case designs. The nap() function can be used to compute the nonoverlap of all pairs as outlined by the What Works Clearinghouse (2022) <https://ies.ed.gov/ncee/wwc/Handbooks>. The package also offers the mixed_model_analysis() and cross_lagged() functions, which implement mixed effects models and cross-lagged analyses as described in Maric & van der Werff (2020) <doi:10.4324/9780429273872-9>. The randomization_test() function implements randomization tests based on methods presented in Onghena (2020) <doi:10.4324/9780429273872-8>. The scdtb() shiny application can be used to upload single case design data and access various scdtb tools for plotting and analysis.
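A call sketch using the functions named above; since their signatures are not shown here, the argument names below are hypothetical placeholders to check against the package help pages.

```r
# library(scdtb)
# dat: one participant's outcome per session, with a phase column (A/B)
# nap(dat, outcome = "score", phase = "phase")          # nonoverlap of all pairs
# mixed_model_analysis(dat, outcome = "score", time = "session")
# randomization_test(dat, outcome = "score", phase = "phase")
# scdtb()                                               # launch the shiny app
```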
A major challenge in estimating treatment decision rules from a randomized clinical trial dataset with covariates measured at baseline lies in detecting a relatively small amount of treatment effect modification-related variability (i.e., the treatment-by-covariates interaction effects on treatment outcomes) against a relatively large non-treatment-related variability (i.e., the main effects of covariates on treatment outcomes). The class of Single-Index Models with Multiple-Links is a novel single-index model specifically designed to estimate a single index (a linear combination) of the covariates associated with the treatment effect modification-related variability, while allowing a nonlinear association with the treatment outcomes via flexible link functions. The models provide a flexible regression approach to developing treatment decision rules based on patients' data measured at baseline. We refer to Park, Petkova, Tarpey, and Ogden (2020) <doi:10.1016/j.jspi.2019.05.008> and Park, Petkova, Tarpey, and Ogden (2020) <doi:10.1111/biom.13320> (which allows an unspecified X main effect) for details of the method. The main function of this package is simml().
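A minimal sketch paralleling the simsl() example above, assuming simml() takes (y, A, X) with a discrete treatment A (to be checked against ?simml); the data are simulated with a treatment-by-index interaction.

```r
library(simml)
set.seed(4)
n <- 400
X <- matrix(rnorm(n * 5), n, 5)
A <- rbinom(n, 1, 0.5) + 1                                # two treatment arms
y <- X[, 3] + (A - 1.5) * (X[, 1] - X[, 2]) + rnorm(n)
fit <- simml(y = y, A = A, X = X)
fit$beta.coef   # index coefficients defining the decision rule (assumed name)
```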
Third-order response surface designs (M. Hemavathi, Shashi Shekhar, Eldho Varghese, Seema Jaggi, Bikas Sinha & Nripes Kumar Mandal (2022) <DOI:10.1080/03610926.2021.1944213>, "Theoretical developments in response surface designs: an informative review and further thoughts") are classified into two types, viz., designs which are suitable for sequential experimentation and designs for non-sequential experimentation (M. Hemavathi, Eldho Varghese, Shashi Shekhar & Seema Jaggi (2022) <DOI:10.1080/02664763.2020.1864817>, "Sequential asymmetric third order rotatable designs (SATORDs)"). The sequential experimentation approach involves conducting the trials step by step, whereas in the non-sequential experimentation approach the entire set of runs is executed in one go. This package contains functions named STORDs() and NSTORDs() for generating sequential/non-sequential TORDs, given in Das, M. N., and V. L. Narasimham (1962) <DOI:10.1214/aoms/1177704374>, "Construction of rotatable designs through balanced incomplete block designs", along with the randomized layout. It also contains a function named Pred3.var() for generating the variance of the predicted response as well as the moment matrix based on a third-order response surface model.
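A call sketch for the three functions named above; the argument shown (the number of factors, v) is a guess at the interface, so verify against the help pages before use.

```r
# library(TORDs)          # package name assumed from the functions above
# STORDs(v = 4)           # sequential TORD with randomized layout
# NSTORDs(v = 4)          # non-sequential TORD
# Pred3.var(v = 4)        # variance of predicted response and moment matrix
```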
Allows users to create and deploy workflows with multiple functions in Function-as-a-Service (FaaS) cloud computing platforms. The FaaSr package makes it simpler for R developers to use FaaS platforms by providing the following functionality: 1) parsing and validating a JSON-based payload compliant with the FaaSr schema, supporting multiple FaaS platforms; 2) invoking user functions written in R in a Docker container (derived from rocker), using a list generated from the parser as argument; 3) downloading/uploading of files from/to S3 buckets using simple primitives; 4) logging to files in S3 buckets; 5) triggering downstream actions, supporting multiple FaaS platforms; 6) generating FaaS-specific API calls to simplify registering a user's workflow with a FaaS platform. Supported FaaS platforms: Apache OpenWhisk <https://openwhisk.apache.org/>, GitHub Actions <https://github.com/features/actions>, and Amazon Web Services (AWS) Lambda <https://aws.amazon.com/lambda/>. Supported cloud data storage for persistent storage: Amazon Web Services (AWS) Simple Storage Service (S3) <https://aws.amazon.com/s3/>.
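The shape of a user function under items 2)-5) can be sketched as below; the helper names follow the FaaSr README (faasr_get_file(), faasr_put_file(), faasr_log()), but their exact signatures should be treated as assumptions.

```r
# a user function that FaaSr invokes inside the rocker-derived container,
# with arguments taken from the parsed JSON payload
compute_sum <- function(folder, input, output) {
  faasr_log("compute_sum: started")                        # 4) log to S3
  faasr_get_file(remote_folder = folder, remote_file = input,
                 local_file = "in.csv")                    # 3) download input
  df <- read.csv("in.csv")
  df$total <- rowSums(df)
  write.csv(df, "out.csv", row.names = FALSE)
  faasr_put_file(local_file = "out.csv", remote_folder = folder,
                 remote_file = output)                     # 3) upload result
  # 5) downstream actions are triggered from the workflow JSON, not from here
}
```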
Conduct multi-locus genome-wide association studies under the framework of the multi-locus random-SNP-effect mixed linear model (mrMLM). First, each marker on the genome is scanned; Bonferroni correction is replaced by a less stringent selection criterion for the significance test. Then, all the markers that are potentially associated with the trait are included in a multi-locus genetic model, their effects are estimated by empirical Bayes, and all the nonzero effects are further identified by a likelihood ratio test for significant QTL. The program may be run on desktop or laptop computers. If marker genotypes in the association mapping population are almost homozygous, the methods in this software are very effective; if there are many heterozygous marker genotypes, the IIIVmrMLM software is recommended. See Wen YJ, Zhang H, Ni YL, Huang B, Zhang J, Feng JY, Wang SB, Dunwell JM, Zhang YM, Wu R (2018) <doi:10.1093/bib/bbw145>, and Li M, Zhang YW, Zhang ZC, Xiang Y, Liu MH, Zhou YH, Zuo JF, Zhang HQ, Chen Y, Zhang YM (2022) <doi:10.1016/j.molp.2022.02.012>.
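A hypothetical call sketch: the actual mrMLM() signature takes genotype and phenotype inputs plus a choice of methods, but the argument names below are placeholders to check against ?mrMLM.

```r
# library(mrMLM)
# mrMLM(fileGen = "genotype.csv",     # marker genotypes (placeholder name)
#       filePhe = "phenotype.csv",    # trait values (placeholder name)
#       method  = "mrMLM")            # one of the package's multi-locus methods
```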
This package provides a scalable and fast method for estimating joint Species Distribution Models (jSDMs) for big community data, including eDNA data. The package estimates a full (i.e. non-latent) jSDM with different response distributions (including the traditional multivariate probit model). It allows the user to perform variation partitioning (VP) / ANOVA on the fitted models to separate the contributions of environmental, spatial, and biotic associations. In addition, the total R-squared can be further partitioned per species and site to reveal the internal metacommunity structure; see Leibold et al. <doi:10.1111/oik.08618>. The internal structure can then be regressed against environmental and spatial distinctiveness, richness, and traits to analyze metacommunity assembly processes. The package includes support for accounting for spatial autocorrelation and the option to fit responses using deep neural networks instead of a standard linear predictor. As described in Pichler & Hartig (2021) <doi:10.1111/2041-210X.13687>, scalability is achieved by using a Monte Carlo approximation of the joint likelihood implemented via PyTorch and reticulate, which can be run on CPUs or GPUs.
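A minimal sketch, assuming the sjSDM(Y, env, ...) interface from the package documentation (a working PyTorch installation via reticulate is required); the community data are simulated placeholders.

```r
library(sjSDM)
set.seed(6)
X <- matrix(rnorm(100 * 3), 100, 3)                  # environmental covariates
Y <- matrix(rbinom(100 * 10, 1, 0.4), 100, 10)       # presence/absence, 10 species
fit <- sjSDM(Y = Y, env = X,
             family = binomial("probit"),            # multivariate probit
             iter = 20L)                             # short run for illustration
# anova(fit)  # variation partitioning: environment / space / associations
```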