This data package provides two main variables: signature and etiology. The signature variable contains the latest mutational signature profiles released on COSMIC <https://cancer.sanger.ac.uk/signatures/> for 3 mutation types: single base substitutions in the context of preceding and following bases, doublet base substitutions, and small insertions and deletions. The etiology variable provides the known or hypothesized causes of signatures. cosmicsig stands for COSMIC signatures. Please run ?cosmicsig for more information.
Calculate with spectral properties of light sources, materials, cameras, eyes, and scanners. Build complex systems from simpler parts using a spectral product algebra. For light sources, compute CCT, CRI, SSI, and IES TM-30 reports. For object colors, compute optimal colors and Logvinenko coordinates. Work with the standard CIE illuminants and color matching functions, and read spectra from text files, including CGATS files. Estimate a spectrum from its response. A user guide and 9 vignettes are included.
This package provides a systematic biology tool for repurposing drugs via a subpathway (SP) crosstalk network. The operation modes include 1) calculating centrality scores of SPs in the context of gene expression data to reflect the influence of SP crosstalk, 2) evaluating drug-disease reverse association based on disease- and drug-induced SPs weighted by the SP crosstalk, and 3) identifying cancer candidate drugs through perturbation analysis. There are also several functions for visualizing the results.
This package provides a collection of gold price data in various currencies: USD, EUR, JPY, GBP, CAD, CHF, INR, CNY, TRY, SAR, IDR, AED, THB, VND, EGP, KRW, RUB, ZAR, and AUD. The data comes from the World Gold Council and is available at daily, weekly, monthly (average and end of period), quarterly (average and end of period), and yearly (average and end of period) frequencies.
This package provides routines to estimate the Mixture Transition Distribution Model based on the Raftery (1985) <http://www.jstor.org/stable/2345788> and Nicolau (2014) <doi:10.1111/sjos.12087> specifications, for multivariate data. Additionally, it provides a function for the estimation of a new model for multivariate non-homogeneous Markov chains. This new specification, Generalized Multivariate Markov Chains (GMMC), was proposed by Carolina Vasconcelos and Bruno Damasio and considers (continuous or discrete) covariates exogenous to the Markov chain.
This package implements key features of Gephi for network visualization, including ForceAtlas2 (with LinLog mode), network scaling, and network rotations. It also includes easy network visualization tools such as edge and node color assignment for recreating Gephi-style graphs in R. The package references layout algorithms developed by Jacomy, M., Venturini T., Heymann S., and Bastian M. (2014) <doi:10.1371/journal.pone.0098679> and Noack, A. (2009) <doi:10.48550/arXiv.0807.4052>.
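As a rough illustration of what a network rotation amounts to, the sketch below rotates a node layout about the origin. The function name rotate_layout, the dictionary layout format, and the counterclockwise convention are assumptions made for this example; they are not taken from the package.

```python
import math


def rotate_layout(positions, degrees):
    """Rotate 2-D node positions counterclockwise about the origin.

    positions: dict mapping node name -> (x, y) coordinates (hypothetical format).
    degrees: rotation angle in degrees.
    """
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    # Standard 2-D rotation matrix applied to every node.
    return {node: (x * c - y * s, x * s + y * c)
            for node, (x, y) in positions.items()}


layout = {"a": (1.0, 0.0), "b": (0.0, 2.0)}
rotated = rotate_layout(layout, 90)
```

Rotating the whole layout leaves all pairwise distances unchanged, so only the orientation of the drawing differs.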
Graphical visualization tools for analyzing the data produced by irace. The iraceplot package enables users to analyze the performance and the parameter space data sampled by the configuration during the search process. It provides a set of functions that generate different plots to visualize the configurations sampled during the execution of irace and their performance. The functions just require the log file generated by irace and, in some cases, they can be used with user-provided data.
An implementation of the correction methods proposed by Shu and Yi (2017) <doi:10.1177/0962280217743777> for the inverse probability weighted (IPW) estimation of the average treatment effect (ATE) with misclassified binary outcomes. A logistic regression model is assumed for the treatment model in all implemented correction methods, and for the outcome model in the implemented doubly robust correction method. The misclassification probability given a true value of the outcome is assumed to be the same for all individuals.
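A minimal sketch of the idea, assuming the propensity scores have already been estimated (for example by a logistic regression) and that the outcome's sensitivity and specificity are known. Under the stated assumption of equal misclassification probabilities for all individuals, the expected observed outcome is a linear function of the expected true outcome, so the naive ATE is shrunk by the factor sensitivity + specificity - 1. The function names below are hypothetical, and the package's own estimators may differ in detail.

```python
def ipw_ate(treatment, outcome, propensity):
    """Naive IPW estimate of the ATE from 0/1 treatments, 0/1 outcomes,
    and estimated propensity scores P(T = 1 | X)."""
    n = len(outcome)
    mu1 = sum(t * y / e for t, y, e in zip(treatment, outcome, propensity)) / n
    mu0 = sum((1 - t) * y / (1 - e)
              for t, y, e in zip(treatment, outcome, propensity)) / n
    return mu1 - mu0


def correct_for_misclassification(naive_ate, sensitivity, specificity):
    """Rescale a naive ATE computed on misclassified binary outcomes.

    E[Y*] = (1 - specificity) + (sensitivity + specificity - 1) * E[Y],
    so the difference of means is attenuated by that multiplicative factor.
    """
    factor = sensitivity + specificity - 1
    if factor <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return naive_ate / factor


naive = ipw_ate([1, 1, 0, 0], [1, 1, 1, 0], [0.5, 0.5, 0.5, 0.5])
corrected = correct_for_misclassification(naive, sensitivity=0.9, specificity=0.9)
```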
Estimate the mean of a Gaussian vector by choosing among a large collection of estimators, following the method developed by Y. Baraud, C. Giraud and S. Huet (2014) <doi:10.1214/13-AIHP539>. In particular, it solves the problem of variable selection by choosing the best predictor among predictors emanating from different methods such as lasso, elastic-net, adaptive lasso, pls, and randomForest. Moreover, it can be applied to choose the tuning parameter in a Gauss-lasso procedure.
Perform a mail merge (mass email) using the message defined in markdown, the recipients in a csv file, and gmail as the mailing engine. With this package you can parse markdown documents as the body of email, and the yaml header to specify the subject line of the email. Any braces in the email will be encoded with glue::glue(). You can preview the email in the RStudio viewer pane, and send (draft) email using gmailr.
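The brace-substitution step of a mail merge can be sketched as follows. Python's str.format stands in here for glue::glue(), and the function name merge_messages and the column names are hypothetical; parsing the markdown and yaml header and sending through gmail are not shown.

```python
import csv
import io


def merge_messages(template, csv_text):
    """Fill a brace-delimited template once per row of recipient data.

    template: message body containing {column_name} placeholders.
    csv_text: recipients as CSV text with a header row.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    # Each row is a dict keyed by column name, so it can feed str.format directly.
    return [template.format(**row) for row in rows]


recipients = "email,first_name,score\nada@example.com,Ada,90\nbob@example.com,Bob,75\n"
body = "Hi {first_name}, your score is {score}."
messages = merge_messages(body, recipients)
```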
This package provides tools for training, selecting, and evaluating maximum entropy (and standard logistic regression) distribution models. It includes tools for user-controlled transformation of explanatory variables, selection of variables by nested model comparison, and flexible model evaluation and projection. It follows principles based on the maximum-likelihood interpretation of maximum entropy modeling, and uses infinitely-weighted logistic regression for model fitting. The package is described in Vollering et al. (2019; <doi:10.1002/ece3.5654>).
This package provides a simple R interface to the OPUS Miner algorithm (implemented in C++) for finding the top-k productive, non-redundant itemsets from transaction data. The OPUS Miner algorithm uses the OPUS search algorithm to efficiently discover the key associations in transaction data, in the form of self-sufficient itemsets, using either leverage or lift. See <http://i.giwebb.com/index.php/research/association-discovery/> for more information in relation to the OPUS Miner algorithm.
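For readers unfamiliar with the two interest measures, the sketch below computes the standard support-based definitions of leverage and lift for a pair of itemsets. The function names are hypothetical, and OPUS Miner's own itemset-level definitions and search strategy may differ; this only illustrates the quantities being ranked.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def leverage(a, b, transactions):
    """Observed joint support minus the support expected under independence."""
    joint = support(set(a) | set(b), transactions)
    return joint - support(a, transactions) * support(b, transactions)


def lift(a, b, transactions):
    """Ratio of observed joint support to the support expected under independence.

    Assumes both a and b occur in at least one transaction.
    """
    joint = support(set(a) | set(b), transactions)
    return joint / (support(a, transactions) * support(b, transactions))


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
lev = leverage({"bread"}, {"milk"}, baskets)
lft = lift({"bread"}, {"milk"}, baskets)
```

A leverage above 0 (or a lift above 1) indicates that the items co-occur more often than independence would predict.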
Simplifies the manufacturing, analysis and display of pressure volume and leaf drying curves. From the progression of the curves, turgor loss point, osmotic potential, and apoplastic fraction, as well as minimum conductance and stomatal closure, can be derived. Methods adapted from Bartlett, Scoffoni, Sack (2012) <doi:10.1111/j.1461-0248.2012.01751.x> and Sack, Scoffoni, PrometheusWikiContributors (2011) <http://prometheuswiki.org/tiki-index.php?page=Minimum+epidermal+conductance+%28gmin%2C+a.k.a.+cuticular+conductance%29>.
Topological data analysis is a powerful tool for finding non-linear global structure in whole datasets. The main tool of topological data analysis is persistent homology, which computes a topological shape descriptor of a dataset called a persistence diagram. TDApplied provides useful and efficient methods for analyzing groups of persistence diagrams with machine learning and statistical inference, and these functions can also interface with other data science packages to form flexible and integrated topological data analysis pipelines.
This package aims to integrate GWAS-derived SNPs and coexpression networks to mine candidate genes associated with a particular phenotype. For that, users must define a set of guide genes, which are known genes involved in the studied phenotype. Additionally, the mined candidates can be given a score that favors candidates that are hubs and/or transcription factors. The scores can then be used to rank and select the top n most promising genes for downstream experiments.
This package provides functions for animations in statistics, covering topics in probability theory, mathematical statistics, multivariate statistics, non-parametric statistics, sampling survey, linear models, time series, computational statistics, data mining and machine learning. These functions may be helpful in teaching statistics and data analysis. Also provided in this package are a series of functions to save animations to various formats, e.g. GIF, HTML pages, PDF, and videos. PDF animations can be inserted into Sweave / knitr easily.
This package provides a set of functions for organising and analysing datasets from experiments run using Eyelink eye-trackers. Organising functions help to clean and prepare eye-tracking datasets for analysis, and mark up key events such as display changes and responses made by participants. Analysing functions help to create means for a wide range of standard measures (such as mean fixation durations), which can then be fed into the appropriate statistical analyses and graphing packages as necessary.
An implementation of the Fizz Buzz algorithm, as defined e.g. in <https://en.wikipedia.org/wiki/Fizz_buzz>. It provides the standard algorithm with 3 replaced by Fizz and 5 replaced by Buzz, with the option of specifying start and end numbers, step size and the numbers being replaced by fizz and buzz, respectively. This package gives interviewers the optional answer of "I use fizzbuzzR::fizzbuzz()" when interviewing rather than having to write an algorithm themselves.
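The generalised algorithm described above can be sketched as follows. The parameter names (start, end, step, mod1, mod2) are assumptions chosen for this illustration and are not the signature of fizzbuzzR::fizzbuzz().

```python
def fizzbuzz(start=1, end=100, step=1, mod1=3, mod2=5):
    """Return the Fizz Buzz sequence from start to end (inclusive) as strings.

    Multiples of mod1 become "Fizz", multiples of mod2 become "Buzz",
    and multiples of both become "FizzBuzz".
    """
    result = []
    for n in range(start, end + 1, step):
        word = ""
        if n % mod1 == 0:
            word += "Fizz"
        if n % mod2 == 0:
            word += "Buzz"
        # Fall back to the number itself when neither divisor applies.
        result.append(word or str(n))
    return result


standard = fizzbuzz(1, 15)
custom = fizzbuzz(start=2, end=10, step=2, mod1=4, mod2=6)
```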
An implementation of the fair data adaptation with quantile preservation described in Plecko & Meinshausen (JMLR 2020, 21(242), 1-44). The adaptation procedure uses the specified causal graph to pre-process the given training and testing data in such a way as to remove the bias caused by the protected attribute. The procedure uses tree ensembles for quantile regression. Instructions for using the methods are further elaborated in the corresponding JSS manuscript, see <doi:10.18637/jss.v110.i04>.
Tool for importing and processing data from the Lattes curriculum platform (<http://lattes.cnpq.br/>). The Brazilian government keeps an extensive base of curricula for academics from all over the country, with over 5 million registrations. The academic life of Brazilian researchers, and of researchers affiliated with Brazilian universities, is documented in Lattes. Some of the information that can be obtained: professional formation, research area, publications, academic advisories, projects, etc. The getLattes package works with Lattes data exported to XML format.
Identifies chromatin interaction modules by constructing a Hi-C contact network based on statistically significant interactions, followed by network clustering. The method enables comparison of module connectivity across two Hi-C datasets and is capable of detecting cell-type-specific regulatory modules. By integrating network analysis with chromatin conformation data, this approach provides insights into the spatial organization of the genome and its functional implications in gene regulation. Author: Sora Yoon (2025) <https://github.com/ysora/HiCociety>.
This package provides tools for the estimation of Heckman selection models with robust variance-covariance matrices. It includes functions for computing the bread and meat matrices, as well as clustered standard errors for generalized Heckman models; see Fernando de Souza Bastos, Wagner Barreto-Souza, and Marc G. Genton (2022, ISSN: <https://www.jstor.org/stable/27164235>). The package also offers cluster-robust inference with sandwich estimators, and tools for handling issues related to eigenvalues in covariance matrices.
This package provides tools for parsing NOAA Integrated Surface Data ('ISD') files, described at <https://www.ncdc.noaa.gov/isd>. Data includes, for example, wind speed and direction, temperature, cloud data, sea level pressure, and more. Includes data from approximately 35,000 stations worldwide, though the best coverage is in North America, Europe, and Australia. Data is stored as variable length ASCII character strings, with most fields optional. Included are tools for parsing entire files or individual lines of data.
This package provides a hybrid of the K-means algorithm and a Majorization-Minimization method to introduce a robust clustering. The reference paper is: Julien Mairal (2015) <doi:10.1137/140957639>. The two most important functions in package MajKMeans are cluster_km() and cluster_MajKm(). cluster_km() clusters data without Majorization-Minimization, and cluster_MajKm() clusters data with the Majorization-Minimization method. Both of these functions calculate the sum of squares (SS) of clustering.
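To make the sum of squares concrete, the sketch below computes the usual within-cluster SS for a given assignment of points to clusters. It assumes SS means the total squared Euclidean distance of each point to its cluster centroid; the function name within_cluster_ss is hypothetical and is not part of MajKMeans, and the clustering step itself is not shown.

```python
def within_cluster_ss(points, labels):
    """Total squared Euclidean distance from each point to its cluster centroid.

    points: sequence of equal-length coordinate tuples.
    labels: cluster label for each point, in the same order.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid is the coordinate-wise mean of the cluster's members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


pts = [(0, 0), (2, 0), (10, 10), (10, 12)]
ss = within_cluster_ss(pts, [0, 0, 1, 1])
```

A lower SS for the same number of clusters indicates tighter clusters, which is how two clusterings of the same data can be compared.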