Data-driven approach for arriving at person-specific time series models from within a Graphical Vector Autoregression (VAR) framework. The method first identifies which relations replicate across the majority of individuals to separate signal from noise. These group-level relations are then used as the starting point of the search for person-specific (or individual-level) relations. All estimates are obtained uniquely for each individual in the final models. The method for the graphicalVAR approach is found in Epskamp, Waldorp, Mottus & Borsboom (2018) <doi:10.1080/00273171.2018.1454823>.
This package implements the Maki (2012) cointegration test that allows for an unknown number of structural breaks. The test detects cointegration relationships in the presence of up to five structural breaks in the intercept and/or slope coefficients. Four different model specifications are supported: level shifts, level shifts with trend, regime shifts, and trend with regime shifts. The method is described in Maki (2012) "Tests for cointegration allowing for an unknown number of breaks" <doi:10.1016/j.econmod.2012.05.006>.
Set of tools for descriptive analysis of metaproteomics data generated from high-throughput mass spectrometry instruments. These tools cluster peptide and protein abundances, expressed as spectral counts, and manipulate them in groups of metaproteins. This information can be represented using multiple visualization functions to portray the global metaproteome landscape and to differentiate samples or conditions in terms of abundance of metaproteins, taxonomic levels and/or functional annotation. The provided tools make it possible to implement flexible analytical pipelines that can be easily applied to metaproteomics studies.
This package provides tools to print a compact, readable directory tree for a folder or project. The package can automatically detect common project roots (e.g., RStudio .Rproj files) and formats output for quick inspection of code and data organization. It supports typical tree customizations such as limiting depth, excluding files using ignore patterns, and producing clean, aligned text output suitable for console use, reports, and reproducible documentation. A snapshot helper can also render the tree output to a PNG image for sharing in issues, teaching material, or project documentation.
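The tree customizations described above (depth limit, ignore patterns, aligned text output) can be sketched in a few lines of Python. This is only an illustration of the idea, not the package's implementation: the function name render_tree and its parameters max_depth and ignore are hypothetical, and the ignore patterns are assumed to be shell-style globs matched against entry names.

```python
import fnmatch
import os


def render_tree(root, max_depth=None, ignore=()):
    """Return the directory tree under `root` as a list of text lines.

    Hypothetical helper: `max_depth` limits how many levels are listed and
    `ignore` holds glob patterns for entry names to skip.
    """
    lines = [os.path.basename(os.path.abspath(root)) or root]

    def walk(path, prefix, depth):
        if max_depth is not None and depth > max_depth:
            return
        # Drop ignored names, then list directories before files.
        entries = [e for e in sorted(os.listdir(path))
                   if not any(fnmatch.fnmatch(e, pat) for pat in ignore)]
        entries.sort(key=lambda e: not os.path.isdir(os.path.join(path, e)))
        for i, name in enumerate(entries):
            last = i == len(entries) - 1
            lines.append(prefix + ("└── " if last else "├── ") + name)
            full = os.path.join(path, name)
            if os.path.isdir(full):
                # Continue the vertical guide only if siblings follow.
                walk(full, prefix + ("    " if last else "│   "), depth + 1)

    walk(root, "", 1)
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(".", max_depth=2, ignore=("*.log", ".git"))))
```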
Reads/writes binary genotype files compatible with PLINK <https://www.cog-genomics.org/plink/1.9/input#bed> into/from an R matrix; traverses genotype data one window of variants at a time, like apply() or a for loop; reads/writes genotype relatedness/kinship matrices created by PLINK <https://www.cog-genomics.org/plink/1.9/distance#make_rel> or GCTA <https://cnsgenomics.com/software/gcta/#MakingaGRM> into/from a square R matrix. It is best used for bringing data produced by PLINK and GCTA into an R workflow.
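To make the binary genotype format more concrete, the Python sketch below decodes the variant-major PLINK 1 .bed layout as described in the PLINK 1.9 documentation linked above: a three-byte header followed by two bits per genotype, four samples per byte. It is an illustrative sketch, not the package's code. The function name decode_bed is hypothetical, and reporting genotypes as counts of the first .bim allele is an assumption made here for the example.

```python
import math

# Magic number (0x6c 0x1b) followed by 0x01 for variant-major order.
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])

# Two-bit code -> count of the first .bim allele (an assumed convention);
# None marks a missing genotype.
CODE_TO_DOSAGE = {0b00: 2, 0b01: None, 0b10: 1, 0b11: 0}


def decode_bed(data, n_samples, n_variants):
    """Decode .bed bytes into one list of genotypes per variant."""
    if data[:3] != BED_MAGIC:
        raise ValueError("not a variant-major PLINK 1 .bed payload")
    bytes_per_variant = math.ceil(n_samples / 4)
    if len(data) != 3 + bytes_per_variant * n_variants:
        raise ValueError("payload size does not match the given dimensions")
    genotypes = []
    for v in range(n_variants):
        start = 3 + v * bytes_per_variant
        block = data[start:start + bytes_per_variant]
        row = []
        for s in range(n_samples):
            # Four samples per byte, lowest-order bit pair first.
            code = (block[s // 4] >> (2 * (s % 4))) & 0b11
            row.append(CODE_TO_DOSAGE[code])
        genotypes.append(row)
    return genotypes
```

The sample and variant counts are not stored in the .bed file itself, which is why they are passed in; in practice they come from the accompanying .fam and .bim files.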
Given a sample with additive measurement error, the package estimates the deconvolution density - that is, the density of the underlying distribution of the sample without measurement error. The method maximises the log-likelihood of the estimated density, plus a quadratic smoothness penalty. The distribution of the measurement error can be either a known family, or can be estimated from a "pure error" sample. For known error distributions, the package supports Normal, Laplace or Beta distributed error. For unknown error distribution, a pure error sample independent from the data is used.
This package provides functions for the evaluation of surrogate endpoints when both the surrogate and the true endpoint are failure time variables. The approaches implemented are: (1) the two-step approach (Burzykowski et al., 2001) <DOI:10.1111/1467-9876.00244> with a copula model (Clayton, Plackett, Hougaard) at the first step and a linear regression of log-hazard ratios at the second step (either adjusted or not for measurement error); (2) mixed proportional hazard models estimated via mixed Poisson GLM (Rotolo et al., 2017) <DOI:10.1177/0962280217718582>.
Quickly and flexibly calculates weights for survey data, in order to correct for survey non-response or other sampling issues. Uses rake weighting, a common technique also known as rim weighting or iterative proportional fitting. This technique allows for weighting on multiple variables, even when the joint distribution of those variables is not known. Interacts with Thomas Lumley's 'survey' package, as described in Lumley, Thomas (2011, ISBN:978-1-118-21093-2). Adds additional functionality, more adaptable syntax, and error-checking to the base weighting functionality in 'survey'.
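Rake weighting (iterative proportional fitting) can be illustrated with a short Python sketch: the weights are repeatedly rescaled, one variable at a time, until the weighted share of each category matches its target. This is a minimal sketch of the general technique, not the package's algorithm. The function name rake, its arguments, and the convergence rule are assumptions made for the example, and every category is assumed to appear in both the data and the targets.

```python
def rake(rows, targets, max_iter=100, tol=1e-8):
    """Return one weight per respondent, averaging to 1.

    rows:    list of dicts mapping variable name -> category
    targets: {variable: {category: target proportion}}
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_change = 0.0
        for var, target in targets.items():
            total = sum(weights)
            # Current weighted share of each category of this variable.
            share = {}
            for w, row in zip(weights, rows):
                share[row[var]] = share.get(row[var], 0.0) + w / total
            # Rescale so this variable's margins hit the targets exactly.
            for i, row in enumerate(rows):
                factor = target[row[var]] / share[row[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # all margins already (nearly) on target
            break
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]


if __name__ == "__main__":
    sample = [{"sex": "f", "age": "young"}, {"sex": "f", "age": "old"},
              {"sex": "m", "age": "young"}, {"sex": "m", "age": "old"},
              {"sex": "f", "age": "young"}]
    goal = {"sex": {"f": 0.5, "m": 0.5}, "age": {"young": 0.4, "old": 0.6}}
    print(rake(sample, goal))
```

Only the marginal targets for each variable are needed, which is why the technique works when the joint distribution is unknown.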
Statistics students often have problems understanding the relation between a random variable's true scale and its z-values. To allow instructors to better visualize histograms for these students, the package provides histograms with two horizontal axes containing z-values and the true scale of the variable. The function TeachHistDens() provides a density histogram with two axes. TeachHistCounts() and TeachHistRelFreq() are variations for count and relative frequency histograms, respectively. TeachConfInterv() and TeachHypTest() help instructors to visualize confidence levels and the results of hypothesis tests.
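The relation between the two horizontal axes is the usual standardization z = (x - mean) / sd. The Python sketch below shows how tick marks on a z-value axis map back to the variable's true scale. It is an illustration only: the function names are hypothetical, and the use of the sample standard deviation is an assumption, not a statement about what the package's functions compute.

```python
from statistics import mean, stdev


def to_z(values):
    """Standardize values: z = (x - mean) / sd, using the sample sd."""
    m, s = mean(values), stdev(values)
    return [(x - m) / s for x in values]


def z_axis_ticks(values, z_ticks=(-3, -2, -1, 0, 1, 2, 3)):
    """Pair each z-value tick with its position on the original scale."""
    m, s = mean(values), stdev(values)
    return [(z, m + z * s) for z in z_ticks]


if __name__ == "__main__":
    heights = [158, 163, 170, 171, 175, 180, 186]
    for z, x in z_axis_ticks(heights):
        print(f"z = {z:+d}  ->  x = {x:.1f}")
```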
This package provides a collection of functions to make R a more effective viewscape analysis tool for calculating viewscape metrics based on computing the viewable area for a given viewpoint or multiple viewpoints and a digital elevation model. The method of calculating viewscape metrics implemented in this package is based on the work of Tabrizian et al. (2020) <doi:10.1016/j.landurbplan.2019.103704>. The algorithm for computing the viewshed is based on the work of Franklin & Ray (1994) <https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=555780f6f5d7e537eb1edb28862c86d1519af2be>.
Chromatin segmentation analysis transforms ChIP-seq data into signals over the genome. These signals represent the observed states in a multivariate Markov model used to predict the chromatin's underlying states. ChromHMM, written in Java, integrates histone modification datasets to learn the chromatin states de novo. The goal of this package is to call ChromHMM from within R, capture the output files in an S4 object and interface to other relevant Bioconductor analysis tools. In addition, segmenter provides functions to test, select and visualize the output of the segmentation.
TEKRABber provides a user-friendly pipeline for comparing orthologs and transposable elements (TEs) between two species. It uses the orthology confidence between the two species from BioMart to normalize expression counts and detect differentially expressed orthologs/TEs. It then provides one-to-one correlation analysis for selected orthologs and TEs. There is also an app function for a first look at the results. Users can prepare ortholog/TE RNA-seq expression data with the tools of their choice and run TEKRABber on it, provided the data follow the structure described in the vignettes.
This package implements a method for identifying and removing the cell-cycle effect from scRNA-Seq data. The method is described in Barron M. and Li J. (2016), "Identifying and removing the cell-cycle effect from single-cell RNA-Sequencing data" <doi:10.1038/srep33892>. Unlike previous methods, ccRemover implements a mechanism that formally tests whether a component is cell-cycle related or not, and thus while it often thoroughly removes the cell-cycle effect, it preserves other features/signals of interest in the data.
Alpha and beta diversity for taxonomic (TD), functional (FD), and phylogenetic (PD) dimensions based on rasters. Spatial and temporal beta diversity can be partitioned into replacement and richness difference components. It also calculates standardized effect size for FD and PD alpha diversity and the average individual traits across multilayer rasters. The layers of the raster represent species, while the cells represent communities. Methodological details can be found in Cardoso et al. 2022 <https://CRAN.R-project.org/package=BAT> and Heming et al. 2023 <https://CRAN.R-project.org/package=SESraster>.
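As an illustration of partitioning taxonomic beta diversity into replacement and richness difference, the Python sketch below applies one common Jaccard-based partition to two communities. Each community is the presence/absence vector obtained by reading one cell across all species layers. The function names and the choice of this particular partition are assumptions made for the example; the package's own computations may differ.

```python
def community(layers, row, col):
    """Presence/absence vector of one cell across all species layers.

    `layers` is a list of 2-D grids (one per species), mirroring a
    multilayer raster where layers are species and cells are communities.
    """
    return [layer[row][col] for layer in layers]


def beta_partition(cell_a, cell_b):
    """Return (total, replacement, richness difference) for two communities."""
    a = sum(1 for x, y in zip(cell_a, cell_b) if x and y)      # shared
    b = sum(1 for x, y in zip(cell_a, cell_b) if x and not y)  # only in A
    c = sum(1 for x, y in zip(cell_a, cell_b) if y and not x)  # only in B
    n = a + b + c
    if n == 0:  # both communities empty: no dissimilarity to partition
        return 0.0, 0.0, 0.0
    total = (b + c) / n               # Jaccard dissimilarity
    replacement = 2 * min(b, c) / n   # species substituted for one another
    richness_diff = abs(b - c) / n    # net gain or loss of species
    return total, replacement, richness_diff


if __name__ == "__main__":
    # Three hypothetical species layers on a 1 x 2 grid.
    layers = [[[1, 1]], [[1, 0]], [[0, 1]]]
    print(beta_partition(community(layers, 0, 0), community(layers, 0, 1)))
```

By construction the replacement and richness difference components sum to the total dissimilarity.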
Data quality assessments guided by a data quality framework introduced by Schmidt and colleagues, 2021 <doi:10.1186/s12874-021-01252-7> target the data quality dimensions integrity, completeness, consistency, and accuracy. The scope of applicable functions rests on the availability of extensive metadata which can be provided in spreadsheet tables. Either standardized (e.g. as html5 reports) or individually tailored reports can be generated. For an introduction to the specification of corresponding metadata, please refer to the package website <https://dataquality.qihs.uni-greifswald.de/VIN_Annotation_of_Metadata.html>.
This package implements the methods of McGrath et al. (2020) <doi:10.1177/0962280219889080> and Cai et al. (2021) <doi:10.1177/09622802211047348> for estimating the sample mean and standard deviation from commonly reported quantiles in meta-analysis. These methods can be applied to studies that report the sample median, sample size, and one or both of (i) the sample minimum and maximum values and (ii) the first and third quartiles. The corresponding standard error estimators described by McGrath et al. (2023) <doi:10.1177/09622802221139233> are also included.
Kiener distributions K1, K2, K3, K4 and K7 to characterize distributions with left and right, symmetric or asymmetric fat tails in finance, neuroscience and other disciplines. Two algorithms to estimate the distribution parameters, quantiles, value-at-risk and expected shortfall. IMPORTANT: Standardization has been changed in versions >= 2.0.0 to get sd = 1 when kappa = Inf rather than 2*pi/sqrt(3) in versions <= 1.8.6. This affects parameter g (other parameters stay unchanged). Do not update if you need consistent comparisons with previous results for the g parameter.
Guided partial least squares (guided-PLS) is the combination of partial least squares by singular value decomposition (PLS-SVD) and guided principal component analysis (guided-PCA). This package provides implementations of PLS-SVD, guided-PLS, and guided-PCA for supervised dimensionality reduction. The guided-PCA function (new in v1.1.0) automatically handles mixed data types (continuous and categorical) in the supervision matrix and provides detailed contribution analysis for interpretability. For the details of the methods, see the reference section of GitHub README.md <https://github.com/rikenbit/guidedPLS>.
Make R scripts reproducible by ensuring that every time a given script is run, the same versions of the packages it uses are loaded (instead of whichever versions the user running the script happens to have installed). This is achieved by using the command groundhog.library() instead of the base command library(), and including a date in the call. The date is used to call on the same version of the package every time (the most recent version available at that date). Load packages from CRAN, GitHub, or Gitlab.
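The date-based rule described above (use the most recent version available at a given date) can be sketched in Python as follows. This only illustrates the selection logic, not how groundhog.library() is implemented. The function name version_on and the release history in the example are hypothetical.

```python
from datetime import date


def version_on(releases, when):
    """Return the latest version released on or before `when`.

    releases: list of (version string, release date) pairs
    """
    eligible = [(released, version) for version, released in releases
                if released <= when]
    if not eligible:
        raise LookupError("no version had been released by that date")
    # The most recent eligible release date wins.
    return max(eligible)[1]


if __name__ == "__main__":
    # Hypothetical release history for an imaginary package.
    history = [("1.0.0", date(2020, 1, 15)),
               ("1.1.0", date(2021, 6, 1)),
               ("2.0.0", date(2023, 3, 10))]
    print(version_on(history, date(2022, 1, 1)))
```

Because the date is fixed in the script, rerunning it later resolves to the same version even after newer releases appear.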
This package provides a collection of functions for working with time series data, including functions for drawing, decomposing, and forecasting. Includes capabilities to compare multiple series and fit both additive and multiplicative models. Used by 'iNZight', a graphical user interface providing easy exploration and visualisation of data for students of statistics, available in both desktop and online versions. Holt (1957) <doi:10.1016/j.ijforecast.2003.09.015>, Winters (1960) <doi:10.1287/mnsc.6.3.324>, Cleveland, Cleveland, & Terpenning (1990) "STL: A Seasonal-Trend Decomposition Procedure Based on Loess".
Quantify the causal effect of a binary exposure on a binary outcome with adjustment for multiple biases. The functions can simultaneously adjust for any combination of uncontrolled confounding, exposure/outcome misclassification, and selection bias. The underlying method generalizes the concept of combining inverse probability of selection weighting with predictive value weighting. Simultaneous multi-bias analysis can be used to enhance the validity and transparency of real-world evidence obtained from observational, longitudinal studies. Based on the work from Paul Brendel, Aracelis Torres, and Onyebuchi Arah (2023) <doi:10.1093/ije/dyad001>.
Optimal scaling of a data vector, relative to a set of targets, is obtained through a least-squares transformation subject to appropriate measurement constraints. The targets are usually predicted values from a statistical model. If the data are nominal level, then the transformation must be identity-preserving. If the data are ordinal level, then the transformation must be monotonic. If the data are discrete, then tied data values must remain tied in the optimal transformation. If the data are continuous, then tied data values can be untied in the optimal transformation.
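For the ordinal, discrete case described above (monotonic transformation, tied data values stay tied), the least-squares solution can be sketched in Python as a weighted monotone regression using the pool-adjacent-violators algorithm. This is a minimal sketch of that standard technique under stated assumptions, not the package's code: the function names are hypothetical, and targets are averaged within each tie block before the monotone fit.

```python
def pava(values, weights):
    """Weighted least-squares non-decreasing fit (pool adjacent violators)."""
    blocks = []  # each block holds [mean, total weight, number of items]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the non-decreasing order is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for m, _, c in blocks:
        fitted.extend([m] * c)
    return fitted


def ordinal_scale_discrete(data, targets):
    """Optimally scaled values for ordinal data, keeping ties tied."""
    levels = sorted(set(data))
    # The least-squares value for a tie block is the mean of its targets.
    groups = {lv: [t for d, t in zip(data, targets) if d == lv]
              for lv in levels}
    means = [sum(groups[lv]) / len(groups[lv]) for lv in levels]
    counts = [len(groups[lv]) for lv in levels]
    fitted = dict(zip(levels, pava(means, counts)))
    return [fitted[d] for d in data]


if __name__ == "__main__":
    print(ordinal_scale_discrete([1, 2, 2, 3], [1.0, 3.0, 5.0, 2.0]))
```

For nominal data the same first step alone applies: each category is replaced by the mean of its targets, with no ordering constraint between categories.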
Computation and visualization of Taxicab Correspondence Analysis, Choulakian (2006) <doi:10.1007/s11336-004-1231-4>. Classical correspondence analysis (CA) is a statistical method to analyse 2-dimensional tables of positive numbers and is typically applied to contingency tables (Benzecri, J.-P. (1973). L'Analyse des Donnees. Volume II. L'Analyse des Correspondances. Paris, France: Dunod). Classical CA is based on the Euclidean distance. Taxicab CA is like classical CA but is based on the Taxicab or Manhattan distance. For some tables, Taxicab CA gives more informative results than classical CA.
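The two distances contrasted above differ only in how coordinate differences are combined: the Euclidean distance squares them, while the Taxicab (Manhattan) distance sums their absolute values. The short Python sketch below shows the two definitions side by side. It illustrates the distances only, not the correspondence analysis itself, and the function names are chosen for this example.

```python
import math


def euclidean(p, q):
    """Square root of the sum of squared coordinate differences (L2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def taxicab(p, q):
    """Sum of absolute coordinate differences (L1, Manhattan)."""
    return sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    p, q = (0.0, 0.0), (3.0, 4.0)
    print(euclidean(p, q), taxicab(p, q))
```

Because differences are not squared, a single large coordinate difference contributes proportionally less to the Taxicab distance than to the Euclidean one.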
This package provides functions for importing external vector images and drawing them as part of R plots. This package is different from the grImport package because, where that package imports PostScript format images, this package imports SVG format images. Furthermore, this package imports a specific subset of SVG, so external images must be preprocessed using a package like rsvg to produce SVG that this package can import. SVG features that are not supported by R graphics, such as gradient fills, can be imported and then exported via the gridSVG package.