The number of studies involving correlated traits and the availability of tools to handle this type of data have increased considerably in the last decade. With such a demand, we need tools for testing hypotheses related to single and multi-trait (correlated) phenotypes based on many genetic settings. Thus, we implemented various options for simulation of pleiotropy and linkage disequilibrium under additive, dominance and epistatic models. The simulation currently takes a marker data set as an input and then uses it for simulating multiple traits as described in Fernandes and Lipka (2020) <doi:10.1186/s12859-020-03804-y>.
For high-dimensional data whose main feature is a large number, p, of variables but a small sample size, the null hypothesis that the marginal distributions of p variables are the same for two groups is tested. We propose a test statistic motivated by the simple idea of comparing, for each of the p variables, the empirical characteristic functions computed from the two samples. If one rejects this global null hypothesis of no differences in distributions between the two groups, a set of permutation p-values is reported to identify which variables are not equally distributed in both groups.
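The per-variable comparison described above can be sketched for a single variable as follows. This is only an illustration under stated assumptions: the statistic (a sum of squared differences between the two empirical characteristic functions over a small grid of points), the grid itself, and the names ecf, ecf_distance and permutation_pvalue are hypothetical and are not taken from the package.

```python
import cmath
import random

def ecf(sample, t):
    """Empirical characteristic function of a 1-D sample evaluated at t."""
    return sum(cmath.exp(1j * t * x) for x in sample) / len(sample)

def ecf_distance(x, y, grid):
    """Sum over the grid of squared moduli of the ECF differences (assumed statistic)."""
    return sum(abs(ecf(x, t) - ecf(y, t)) ** 2 for t in grid)

def permutation_pvalue(x, y, grid=(0.5, 1.0, 1.5, 2.0), n_perm=200, seed=0):
    """Permutation p-value for equality of the two distributions of one variable."""
    rng = random.Random(seed)
    observed = ecf_distance(x, y, grid)
    pooled = list(x) + list(y)
    exceed = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        if ecf_distance(pooled[:len(x)], pooled[len(x):], grid) >= observed:
            exceed += 1
    # Add-one correction so the p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)

group_a = [0.1 * i for i in range(10)]
group_b = [3.0 + 0.1 * i for i in range(10)]
p_shifted = permutation_pvalue(group_a, group_b)
```

In a high-dimensional setting one would presumably call such a function once per variable and collect the resulting p-values.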
Romande ADF is a serif font family with oldstyle figures, designed as a substitute for Times, Tiffany or Caslon. The family currently includes upright, italic and small-caps shapes in each of regular and demi-bold weights and an italic script in regular. The support package renames the fonts according to the Karl Berry font name scheme and defines four families. Two of these primarily provide access to the ``standard'' or default characters while the ``alternate'' families support alternate characters, additional ligatures and the long ``s''. The included package files provide access to these features in LaTeX as explained in the documentation.
Intra-miR-ExploreR, an integrative miRNA target prediction bioinformatics tool, identifies targets by combining expression and biophysical interactions of a given microRNA (miR). Using the tool, we have identified targets for 92 intragenic miRs in D. melanogaster, using available microarray expression data from the Affymetrix 1 and Affymetrix 2 microarray platforms, providing a global perspective of intragenic miR targets in Drosophila. Predicted targets are grouped according to biological functions using the DAVID Gene Ontology tool and are ranked based on a biologically relevant scoring system, enabling the user to identify functionally relevant targets for a given miR.
Diversification is one of the most important concepts in portfolio management. This framework offers scholars, practitioners and policymakers a useful toolbox to measure diversification. Specifically, it provides diversification measures from the recent literature. These measures are based on the works of Rudin and Morgan (2006) <doi:10.3905/jpm.2006.611807>, Choueifaty and Coignard (2008) <doi:10.3905/JPM.2008.35.1.40>, Vermorken et al. (2012) <doi:10.3905/jpm.2012.39.1.067>, Flores et al. (2017) <doi:10.3905/jpm.2017.43.4.112>, Calvet et al. (2007) <doi:10.1086/524204>, and Candelon, Fuerst and Hasse (2020).
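As an illustration of one such measure, the diversification ratio of Choueifaty and Coignard (2008) divides the weighted average of asset volatilities by the volatility of the portfolio. The sketch below is a minimal pure-Python version; the function name diversification_ratio and the example inputs are hypothetical and do not reflect the package's interface.

```python
import math

def diversification_ratio(weights, cov):
    """Weighted average asset volatility divided by portfolio volatility."""
    n = len(weights)
    vols = [math.sqrt(cov[i][i]) for i in range(n)]
    weighted_vol = sum(w * s for w, s in zip(weights, vols))
    portfolio_var = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return weighted_vol / math.sqrt(portfolio_var)

# Two assets with 20% volatility each, held in equal weights.
uncorrelated = [[0.04, 0.0], [0.0, 0.04]]
fully_correlated = [[0.04, 0.04], [0.04, 0.04]]
dr_uncorrelated = diversification_ratio([0.5, 0.5], uncorrelated)
dr_correlated = diversification_ratio([0.5, 0.5], fully_correlated)
```

The ratio equals 1 when the assets are perfectly correlated and grows as correlations fall, which is why it can serve as a diversification measure.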
This app offers an accessible GUI for generalised blockmodeling of single-relation, one-mode networks. The user can execute blockmodeling without having to write a line of code by using the app's visual aids. Moreover, there are several ways to visualise networks and their partitions. Finally, the results can be exported as if they were produced by writing code. The development of this package is financially supported by the Slovenian Research Agency (www.arrs.gov.si) within the research project J5-2557 (Comparison and evaluation of different approaches to blockmodeling dynamic networks by simulations with application to Slovenian co-authorship networks).
Adds variable-selection functions for Beta regression models (both mean and phi submodels) so they can be used within the SelectBoost algorithm. Includes stepwise AIC, BIC, and corrected AIC on betareg() fits, gamlss-based LASSO/Elastic-Net, a pure glmnet iterative re-weighted least squares-based selector with an optional standardization speedup, and C++ helpers for iterative re-weighted least squares working steps and precision updates. Also provides a fastboost_interval() variant for interval responses, comparison helpers, and a flexible simulator simulation_DATA.beta() for interval-valued data. For more details see Bertrand and Maumy (2023) <doi:10.7490/f1000research.1119552.1>.
This package implements functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the spatstat family of packages. Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.
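To make the first of these methods concrete, the sketch below counts points in a regular grid of quadrats and computes the chi-squared statistic against equal expected counts. It assumes a rectangular window and equal-area quadrats; the function names and arguments are hypothetical and are not the spatstat interface.

```python
def quadrat_counts(points, nx, ny, x_limits=(0.0, 1.0), y_limits=(0.0, 1.0)):
    """Count points falling in each cell of an nx-by-ny grid over a rectangle."""
    x0, x1 = x_limits
    y0, y1 = y_limits
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        # Points on the upper boundary are assigned to the last cell.
        i = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        j = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        counts[j][i] += 1
    return counts

def chi_squared_statistic(counts):
    """Pearson chi-squared statistic against equal expected counts per quadrat."""
    flat = [c for row in counts for c in row]
    expected = sum(flat) / len(flat)
    return sum((c - expected) ** 2 / expected for c in flat)

spread = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
clumped = [(0.1, 0.1), (0.2, 0.1), (0.1, 0.2), (0.2, 0.2)]
stat_spread = chi_squared_statistic(quadrat_counts(spread, 2, 2))
stat_clumped = chi_squared_statistic(quadrat_counts(clumped, 2, 2))
```

A large statistic relative to the chi-squared reference distribution suggests departure from a completely random pattern.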
A well known identifiability issue in factor analytic models is the invariance with respect to orthogonal transformations. This problem burdens the inference under a Bayesian setup, where Markov chain Monte Carlo (MCMC) methods are used to generate samples from the posterior distribution. The package applies a series of rotation, sign and permutation transformations (Papastamoulis and Ntzoufras (2022) <DOI:10.1007/s11222-022-10084-4>) to raw MCMC samples of factor loadings, which are provided by the user. The post-processed output is identifiable and can be used for MCMC inference on any parametric function of factor loadings. Comparison of multiple MCMC chains is also possible.
This package provides a statistical disclosure control tool to protect tables by suppression using the Gaussian elimination secondary suppression algorithm (Langsrud, 2024) <doi:10.1007/978-3-031-69651-0_6>. A suggestion is to start by working with the functions SuppressSmallCounts() and SuppressDominantCells(). These functions use primary suppression functions for the minimum frequency rule and the dominance rule, respectively. Novel functionality for suppression of disclosive cells is also included. General primary suppression functions can be supplied as input to the general workhorse function, GaussSuppressionFromData(). Suppressed frequencies can be replaced by synthetic decimal numbers as described in Langsrud (2019) <doi:10.1007/s11222-018-9848-9>.
Derive instances of Arbitrary for QuickCheck, with various options to customize implementations.
Automating the arbitrary boilerplate also ensures that when a type changes to have more or fewer constructors, then the generator either fixes itself to generate that new case (when using the uniform distribution) or causes a compilation error so you remember to fix it (when using an explicit distribution).
This package also offers a simple (optional) strategy to ensure termination for recursive types: make Test.QuickCheck.Gen's size parameter decrease at every recursive call; when it reaches zero, sample directly from a trivially terminating generator given explicitly (genericArbitraryRec and withBaseCase) or implicitly (genericArbitrary').
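The size-decreasing strategy described above can be sketched as follows. This is a Python analogue for illustration only, not the package's Haskell API; gen_tree and depth are hypothetical names, and the even choice between a leaf and an inner node is an arbitrary assumption.

```python
import random

def gen_tree(size, rng):
    """Generate a random binary tree as nested tuples; leaves are small integers."""
    if size <= 0:
        # Size budget exhausted: fall back to the trivially terminating generator.
        return rng.randint(0, 9)
    if rng.random() < 0.5:
        return rng.randint(0, 9)
    # Every recursive call receives a strictly smaller size.
    return (gen_tree(size - 1, rng), gen_tree(size - 1, rng))

def depth(tree):
    """Depth of a tree produced by gen_tree (a single leaf has depth 0)."""
    if isinstance(tree, tuple):
        return 1 + max(depth(tree[0]), depth(tree[1]))
    return 0

trees = [gen_tree(5, random.Random(seed)) for seed in range(50)]
```

Because the size shrinks on each recursive call and the base case never recurses, generation always terminates and the depth of the result is bounded by the initial size.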
This package provides functions, to be used in conjunction with the Sequential package, that allow for planning of observational database studies that will be analyzed with exact sequential analysis. This package supports Poisson- and binomial-based data. The primary function, seq_wrapper(...), accepts parameters for simulation of a simple exposure pattern and for the Sequential package setup and analysis functions. The exposure matrix is used to simulate the true and false positive and negative populations (Green (1983) <doi:10.1093/oxfordjournals.aje.a113521>, Brenner (1993) <doi:10.1093/oxfordjournals.aje.a116805>). Functions are then run from the Sequential package on these populations, which allows for the exploration of outcome misclassification in data.
DuplexDiscovereR is a package designed for analyzing data from RNA cross-linking and proximity ligation protocols such as SPLASH, PARIS, LIGR-seq, and others. DuplexDiscovereR accepts input in the form of chimerically or split-aligned reads. It includes procedures for alignment classification, filtering, and efficient clustering of individual chimeric reads into duplex groups (DGs). Once DGs are identified, the package predicts RNA duplex formation and their hybridization energies. Additional metrics, such as p-values for the random ligation hypothesis or mean DG alignment scores, can be calculated to rank the final set of RNA duplexes. Data from multiple experiments or replicates can be processed separately and further compared to check the reproducibility of the experimental method.
For any spending function specified by the user, this package provides corresponding boundaries for interim testing using the adaptively weighted log-rank test developed by Yang and Prentice (2010 <doi:10.1111/j.1541-0420.2009.01243.x>). The package uses a re-sampling method to obtain stopping boundaries at the interim looks. The output consists of stopping boundaries and observed values of the test statistics at the interim looks, along with nominal p-values defined as the probability of the test exceeding the specific observed test statistic value or critical value, regardless of the test behavior at other looks. The asymptotic validity of the stopping boundaries is established in Yang (2018 <doi:10.1002/sim.7958>).
Simulation and visualization of depth-dependent integrated visual fields (DD-IVF). Visual fields are measured monocularly at a single depth, yet real-life activities involve predominantly binocular vision at multiple depths. The package provides functions to simulate and visualize binocular visual field impairment in a depth-dependent fashion from monocular visual field results based on Ping Liu, Allison McKendrick, Anna Ma-Wyatt, Andrew Turpin (2019) <doi:10.1167/tvst.9.3.8>. At each location and depth plane, sensitivities are linearly interpolated from corresponding locations in the monocular visual fields and returned as the higher value of the two. Its utility is demonstrated by evaluating DD-IVF defects associated with 12 glaucomatous archetypes of the 24-2 visual field pattern in the included shiny apps.
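The interpolate-then-take-the-higher-value step can be sketched in one dimension as follows. The geometry that maps a location and depth plane to corresponding positions in each eye is not reproduced here, so those positions are passed in directly; the function names, the clamping at the edges of the measured range and the example values are all assumptions made for illustration.

```python
def interpolate(xs, values, x):
    """Linearly interpolate values measured at strictly increasing positions xs."""
    if x <= xs[0]:
        return values[0]   # assumption: clamp outside the measured range
    if x >= xs[-1]:
        return values[-1]
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            slope = (values[i + 1] - values[i]) / (x1 - x0)
            return values[i] + slope * (x - x0)

def binocular_sensitivity(xs_left, left, x_left, xs_right, right, x_right):
    """Higher of the two monocular sensitivities at corresponding positions."""
    return max(
        interpolate(xs_left, left, x_left),
        interpolate(xs_right, right, x_right),
    )

# Hypothetical sensitivities (dB) measured at eccentricities 0 and 10 degrees.
combined = binocular_sensitivity(
    [0.0, 10.0], [20.0, 30.0], 5.0,
    [0.0, 10.0], [30.0, 26.0], 5.0,
)
```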
Monte Carlo simulation is a stochastic method for computing trajectories of photons in media. Surface backscattering is computed for semi-infinite media by summarizing the photon flux leaving the surface. The simulation models the optical measurement of diffuse reflectance using an incident light beam. The semi-infinite medium is assumed to have a flat surface. The medium, typically biological tissue, is described by four optical parameters: absorption coefficient, scattering coefficient, anisotropy factor and refractive index. The medium is assumed to be homogeneous. Computational parameters of the simulation include the number of photons, radius of the incident light beam, lowest photon energy threshold, intensity profile (halo) radius, and spatial resolution of the intensity profile. More information and validation can be found in the Open Access paper by Laszlo Baranyai (2020) <doi:10.1016/j.mex.2020.100958>.
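Two sampling steps that are common in photon transport simulations of this kind can be sketched as follows: drawing the distance to the next interaction from the absorption and scattering coefficients, and drawing the scattering angle from the anisotropy factor. The use of the Henyey-Greenstein phase function is an assumption here, since the description does not name the phase function, and the function names and example values are hypothetical.

```python
import math
import random

def free_path(mu_a, mu_s, u):
    """Distance to the next interaction for a uniform draw u in (0, 1].

    mu_a and mu_s are the absorption and scattering coefficients
    in inverse length units.
    """
    return -math.log(u) / (mu_a + mu_s)

def scattering_cosine(g, u):
    """Cosine of the deflection angle for a uniform draw u in [0, 1].

    Assumes the Henyey-Greenstein phase function with anisotropy factor g.
    """
    if abs(g) < 1e-9:
        return 2.0 * u - 1.0  # isotropic scattering
    frac = (1.0 - g * g) / (1.0 - g + 2.0 * g * u)
    cos_theta = (1.0 + g * g - frac * frac) / (2.0 * g)
    return max(-1.0, min(1.0, cos_theta))  # guard against rounding error

rng = random.Random(1)
step = free_path(mu_a=0.1, mu_s=10.0, u=1.0 - rng.random())
cos_theta = scattering_cosine(g=0.9, u=rng.random())
```

A full simulation would repeat these steps for each photon, reduce its energy at every interaction, and record the photons that leave through the surface.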
This package provides pre-fit and post-fit methods for detecting separation and infinite maximum likelihood estimates in generalized linear models with categorical responses. The pre-fit methods apply to binomial-response generalized linear models such as logit, probit and cloglog regression, and can be directly supplied as fitting methods to the glm() function. The post-fit methods apply to models with categorical responses, including binomial-response generalized linear models and multinomial-response models, such as baseline category logits and adjacent category logits models; for example, the models implemented in the brglm2 package. The post-fit methods successively refit the model with an increasing number of iteratively reweighted least squares iterations, and monitor the ratio of the estimated standard error for each parameter to what it was in the first iteration.
A polycross is the pollination by natural hybridization of a group of genotypes, generally selected, grown in isolation from other compatible genotypes in such a way as to promote random open pollination. A particular practical application of the polycross method occurs in the production of a synthetic variety resulting from cross-pollinated plants. Laying out these experiments in appropriate designs, known as polycross designs, would not only save experimental resources but also gather more information from the experiment. Different experimental situations may arise in polycross nurseries, which may require different polycross designs (Varghese et al. (2015) <doi:10.1080/02664763.2015.1043860>, "Experimental designs for open pollination in polycross trials"). This package contains a function named PD() which generates nine types of polycross designs suitable for various experimental situations.
This r-physicalactivity package provides a function wearingMarking for classification of monitored wear and nonwear time intervals in accelerometer data collected to assess physical activity. The package also contains functions for making plots of accelerometer data and obtaining summaries of various information, including daily monitor wear time and the mean monitor wear time during valid days. The revised package version 0.2-1 improved the speed and robustness of the functions and added better support for time zones and daylight saving. In addition, several functions were added:
the markDelivery function can classify days for ActiGraph delivery by mail; the markPAI function can categorize physical activity intensity level based on user-defined cut-points of accelerometer counts.
It also supports importing ActiGraph (AGD) files with readActigraph and queryActigraph functions.
Enables the interactive visualization of dimensional reduction, clustering, and cell properties for scRNA-Seq results. It generates an interactive HTML page using either a numeric matrix, SummarizedExperiment, SingleCellExperiment or Seurat objects as input. The input data can be projected into two-dimensional representations by applying dimensionality reduction methods such as PCA, MDS, t-SNE, UMAP, and NMF. Displaying multiple dimensionality reduction results within the same interface, with interconnected graphs, provides different perspectives that facilitate accurate cell classification. The package also integrates unsupervised clustering techniques, whose results can be viewed interactively in the graphical interface. In addition to visualization, this interface allows manual selection of groups, labeling of cell entities based on processed meta-information, generation of new graphs displaying gene expression values for each cell, sample identification, and visual comparison of samples and clusters.
An R interface to the Julia package NeuralEstimators.jl. The package facilitates the user-friendly development of neural Bayes estimators, which are neural networks that map data to a point summary of the posterior distribution (Sainsbury-Dale et al., 2024, <doi:10.1080/00031305.2023.2249522>). These estimators are likelihood-free and amortised, in the sense that, once the neural networks are trained on simulated data, inference from observed data can be made in a fraction of the time required by conventional approaches. The package also supports amortised Bayesian or frequentist inference using neural networks that approximate the posterior or likelihood-to-evidence ratio (Zammit-Mangion et al., 2025, Sec. 3.2, 5.2, <doi:10.48550/arXiv.2404.12484>). The package accommodates any model for which simulation is feasible by allowing users to define models implicitly through simulated data.
This package provides a comprehensive suite of functions for processing, analyzing, and visualizing textual data from tweets. Users can clean tweets, analyze their sentiments, visualize data, and examine the correlation between sentiments and environmental data such as weather conditions. Main features include text processing, sentiment analysis, data visualization, correlation analysis, and synthetic data generation. Text processing involves cleaning and preparing tweets by removing textual noise and irrelevant words. Sentiment analysis extracts and analyzes sentiments from tweet texts. Data visualization creates various charts, such as word clouds and sentiment polarity graphs, for visual representation of data. Correlation analysis calculates the correlation between tweet sentiments and environmental variables such as weather conditions. Additionally, random tweets can be generated for testing and evaluating the performance of analyses, helping users to analyze and interpret Twitter data for research and commercial purposes.
Point mutations occurring in a genome can be divided into 96 categories based on the base being mutated, the base it is mutated into and its two flanking bases. Therefore, for any patient, it is possible to represent all the point mutations occurring in that patient's tumor as a vector of length 96, where each element represents the count of mutations for a given category in the patient. A mutational signature represents the pattern of mutations produced by a mutagen or mutagenic process inside the cell. Each signature can also be represented by a vector of length 96, where each element represents the probability that this particular mutagenic process generates a mutation in the corresponding one of the 96 above-mentioned categories. In this R package, we provide a set of functions to extract and visualize the mutational signatures that best explain the mutation counts of a large number of patients.
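The count of 96 follows from a common convention, assumed here, in which each mutation is reported with a pyrimidine (C or T) as the mutated base, giving 6 substitution types, each with 4 possible bases on either side (6 x 4 x 4 = 96). The sketch below enumerates the categories and builds a count vector; the label format and the function names are hypothetical and are not the package's own.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Six substitution types with a pyrimidine as the mutated base.
SUBSTITUTIONS = [(ref, alt) for ref in "CT" for alt in BASES if alt != ref]
CATEGORIES = [
    f"{five}[{ref}>{alt}]{three}"
    for ref, alt in SUBSTITUTIONS
    for five, three in product(BASES, repeat=2)
]

def categorize(five, ref, alt, three):
    """Map a mutation and its flanking bases to one of the 96 category labels."""
    if ref in "AG":
        # Report the mutation on the opposite strand: complement every base
        # and swap the flanks, since the strand is read in reverse.
        five, three = three.translate(COMPLEMENT), five.translate(COMPLEMENT)
        ref, alt = ref.translate(COMPLEMENT), alt.translate(COMPLEMENT)
    return f"{five}[{ref}>{alt}]{three}"

def count_vector(mutations):
    """Vector of length 96 with the number of mutations in each category."""
    index = {label: i for i, label in enumerate(CATEGORIES)}
    counts = [0] * len(CATEGORIES)
    for five, ref, alt, three in mutations:
        counts[index[categorize(five, ref, alt, three)]] += 1
    return counts

# The second mutation is the reverse complement of the first.
counts = count_vector([("A", "C", "T", "G"), ("C", "G", "A", "T")])
```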
Treemaps are a visually appealing graphical representation of numerical data using a space-filling approach. A plane or map is subdivided into smaller areas called cells. The cells in the map are scaled according to an underlying metric, which makes it possible to grasp the hierarchical organization and relative importance of many objects at once. This package contains two different implementations of treemaps, Voronoi treemaps and Sunburst treemaps. The Voronoi treemap function subdivides the plot area into polygonal cells according to the highest hierarchical level, then continues to subdivide those parental cells on the next lower hierarchical level, and so on. The Sunburst treemap is a computationally less demanding treemap that does not require iterative refinement, but simply generates circle sectors that are sized according to predefined weights. The Voronoi tessellation is based on functions from Paul Murrell (2012) <https://www.stat.auckland.ac.nz/~paul/Reports/VoronoiTreemap/voronoiTreeMap.html>.
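The sizing of Sunburst sectors can be sketched as a proportional split of an angular range. This is only an illustration of the idea; the function name sector_angles and the use of degrees are assumptions and do not describe the package's implementation.

```python
def sector_angles(weights, start=0.0, end=360.0):
    """Split the angular range [start, end] into sectors proportional to weights."""
    total = sum(weights)
    sectors = []
    angle = start
    for w in weights:
        span = (end - start) * w / total
        sectors.append((angle, angle + span))
        angle += span
    return sectors

# Top level: three cells with weights 1, 1 and 2.
top_level = sector_angles([1, 1, 2])
# Next level: children of the third cell are laid out inside its sector.
children = sector_angles([3, 1], start=top_level[2][0], end=top_level[2][1])
```

Each hierarchical level can be handled by calling the same function on the parent's angular range, so no iterative refinement is needed.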