Enter the query into the form above. You can look for specific version of a package by using @ symbol like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is a page number and limit is a number of items on a single page. Pagination information (such as a number of pages and etc) is returned
in response headers.
If you'd like to join our channel webring send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
Data cleaning scripts typically contain a lot of if this change that type of statements. Such statements are typically condensed expert knowledge. With this package, such data modifying rules are taken out of the code and become in stead parameters to the work flow. This allows one to maintain, document, and reason about data modification rules as separate entities.
Data quality assessments guided by a data quality framework introduced by Schmidt and colleagues, 2021 <doi:10.1186/s12874-021-01252-7> target the data quality dimensions integrity, completeness, consistency, and accuracy. The scope of applicable functions rests on the availability of extensive metadata which can be provided in spreadsheet tables. Either standardized (e.g. as html5 reports) or individually tailored reports can be generated. For an introduction into the specification of corresponding metadata, please refer to the package website <https://dataquality.qihs.uni-greifswald.de/VIN_Annotation_of_Metadata.html>.
This package provides a novel framework for detecting, quantifying, and analyzing dormant patterns in multivariate data. Dormant patterns are statistical relationships that exist in data but remain inactive until specific trigger conditions emerge. This concept, inspired by biological dormancy (seeds, pathogens) and geological phenomena (dormant faults), provides tools to identify latent risks, hidden correlations, and potential phase transitions in complex systems. The package introduces methods for quantifying dormancy depth, trigger sensitivity, and awakening risk - enabling analysts to discover patterns that conventional methods miss because they focus only on currently active relationships.
This package provides a wide collection of univariate discrete data sets from various applied domains related to distribution theory. The functions allow quick, easy, and efficient access to 100 univariate discrete data sets. The data are related to different applied domains, including medical, reliability analysis, engineering, manufacturing, occupational safety, geological sciences, terrorism, psychology, agriculture, environmental sciences, road traffic accidents, demography, actuarial science, law, and justice. The documentation, along with associated references for further details and uses, is presented.
Analysis, visualisation and simulation of digital polymerase chain reaction (dPCR) (Burdukiewicz et al. (2016) <doi:10.1016/j.bdq.2016.06.004>). Supports data formats of commercial systems (Bio-Rad QX100 and QX200; Fluidigm BioMark) and other systems.
Create quick and easy dot-and-whisker plots of regression results. It takes as input either (1) a coefficient table in standard form or (2) one (or a list of) fitted model objects (of any type that has methods implemented in the parameters package). It returns ggplot objects that can be further customized using tools from the ggplot2 package. The package also includes helper functions for tasks such as rescaling coefficients or relabeling predictor variables. See more methodological discussion of the visualization and data management methods used in this package in Kastellec and Leoni (2007) <doi:10.1017/S1537592707072209> and Gelman (2008) <doi:10.1002/sim.3107>.
This package provides a distributed framework for simulating and estimating skew factor models under various skewed and heavy-tailed distributions. The methods support distributed data generation, aggregation of local estimators, and evaluation of estimation performance via mean squared error, relative error, and sparsity measures. The distributed principal component (PC) estimators implemented in the package include IPC (Independent Principal Component),'PPC (Project Principal Component), SPC (Sparse Principal Component), and other related distributed PC methods. The methodological background follows Guo G. (2023) <doi:10.1007/s00180-022-01270-z>.
This package provides functions for the calculation and plotting of synchrony in tree growth from tree-ring width chronologies (TRW index). It combines variance-covariance (VCOV) mixed modelling with functions that quantify the degree to which the TRW chronologies contain a common temporal signal. It also implements temporal trends in spatial synchrony using a moving window. These methods can also be used with other kind of ecological variables that have temporal autocorrelation corrected.
The df2yaml aims to simplify the process of converting dataframe to YAML <https://yaml.org/>. The dataframe with multiple key columns and one value column will be converted to the multi-level hierarchy.
Supports import/export for a number of datetime string standards and R datetime classes often including lossless re-export of any original reduced precision including ISO 8601 <https://en.wikipedia.org/wiki/ISO_8601> and pdfmark <https://opensource.adobe.com/dc-acrobat-sdk-docs/library/pdfmark/> datetime strings. Supports local/global datetimes with optional UTC offsets and/or (possibly heterogeneous) time zones with up to nanosecond precision.
This package implements the doubly robust distribution balancing weighting proposed by Katsumata (2024) <doi:10.1017/psrm.2024.23>, which improves the augmented inverse probability weighting (AIPW) by estimating propensity scores with estimating equations suitable for the pre-specified parameter of interest (e.g., the average treatment effects or the average treatment effects on the treated) and estimating outcome models with the estimated inverse probability weights. It also implements the covariate balancing propensity score proposed by Imai and Ratkovic (2014) <doi:10.1111/rssb.12027> and the entropy balancing weighting proposed by Hainmueller (2012) <doi:10.1093/pan/mpr025>, both of which use covariate balancing conditions in propensity score estimation. The point estimate of the parameter of interest and its uncertainty as well as coefficients for propensity score estimation and outcome regression are produced using the M-estimation. The same functions can be used to estimate average outcomes in missing outcome cases.
Various kinds of designs for (industrial) experiments can be created. The package uses, and sometimes enhances, design generation routines from other packages. So far, response surface designs from package rsm', Latin hypercube samples from packages lhs and DiceDesign', and D-optimal designs from package AlgDesign have been implemented.
In tumor tissue, underlying genomic instability can lead to DNA copy number alterations, e.g., copy number gains or losses. Sporadic copy number alterations occur randomly throughout the genome, whereas recurrent alterations are observed in the same genomic region across multiple independent samples, perhaps because they provide a selective growth advantage. Here we use cyclic shift permutations to identify recurrent copy number alterations in a single cohort or recurrent copy number differences in two cohorts based on a common set of genomic markers. Additional functionality is provided to perform downstream analyses, including the creation of summary files and graphics. DiNAMIC.Duo builds upon the original DiNAMIC package of Walter et al. (2011) <doi:10.1093/bioinformatics/btq717> and leverages the theory developed in Walter et al. (2015) <doi:10.1093/biomet/asv046>. An article describing DiNAMIC.Duo by Walter et al. (2022) can be found at <doi: 10.1093/bioinformatics/btac542>.
This package provides functions to describe sampling and diversity dynamics of fossil occurrence datasets (e.g. from the Paleobiology Database). The package includes methods to calculate range- and occurrence-based metrics of taxonomic richness, extinction and origination rates, along with traditional sampling measures. A powerful subsampling tool is also included that implements frequently used sampling standardization methods in a multiple bin-framework. The plotting of time series and the occurrence data can be simplified by the functions incorporated in the package, as well as other calculations, such as environmental affinities and extinction selectivity testing. Details can be found in: Kocsis, A.T.; Reddin, C.J.; Alroy, J. and Kiessling, W. (2019) <doi:10.1101/423780>.
Computations of Fisher's z-tests concerning different kinds of correlation differences. The diffpwr family entails approaches to estimating statistical power via Monte Carlo simulations. Important to note, the Pearson correlation coefficient is sensitive to linear association, but also to a host of statistical issues such as univariate and bivariate outliers, range restrictions, and heteroscedasticity (e.g., Duncan & Layard, 1973 <doi:10.1093/BIOMET/60.3.551>; Wilcox, 2013 <doi:10.1016/C2010-0-67044-1>). Thus, every power analysis requires that specific statistical prerequisites are fulfilled and can be invalid if the prerequisites do not hold. To this end, the bootcor family provides bootstrapping confidence intervals for the incorporated correlation difference tests.
Allows you to define rules which can be used to verify a given dataset. The package acts as a thin wrapper around more powerful data packages such as dplyr', data.table', arrow', and DBI ('SQL'), which do the heavy lifting.
This package provides the dose transition pathways (DTP) to project in advance the doses recommended by a model-based design for subsequent patients (stay, escalate, deescalate or stop early) using all the accumulated toxicity information; See Yap et al (2017) <doi: 10.1158/1078-0432.CCR-17-0582>. DTP can be used as a design and an operational tool and can be displayed as a table or flow diagram. The dtpcrm package also provides the modified continual reassessment method (CRM) and time-to-event CRM (TITE-CRM) with added practical considerations to allow stopping early when there is sufficient evidence that the lowest dose is too toxic and/or there is a sufficient number of patients dosed at the maximum tolerated dose.
This package provides a collection of tools that support data diagnosis, exploration, and transformation. Data diagnostics provides information and visualization of missing values, outliers, and unique and negative values to help you understand the distribution and quality of your data. Data exploration provides information and visualization of the descriptive statistics of univariate variables, normality tests and outliers, correlation of two variables, and the relationship between the target variable and predictor. Data transformation supports binning for categorizing continuous variables, imputes missing values and outliers, and resolves skewness. And it creates automated reports that support these three tasks.
When visualising changes between two values over time, a strict linear interpolation can look jarring and unnatural. By applying a non-linear easing to the transition, the motion between values can appear smoother and more natural. This package includes functions for applying such non-linear easings to colors and numeric values, and is useful where smooth animated movement and transitions are desired.
Implement some deep learning architectures and neural network algorithms, including BP,RBM,DBN,Deep autoencoder and so on.
It is a novel tool used to identify the candidate drugs against a particular disease based on the drug target set enrichment analysis. It assumes the most effective drugs are those with a closer affinity in the protein-protein interaction network to the specified disease. (See Gómez-Carballa et al. (2022) <doi: 10.1016/j.envres.2022.112890> and Feng et al. (2022) <doi: 10.7150/ijms.67815> for disease expression profiles; see Wishart et al. (2018) <doi: 10.1093/nar/gkx1037> and Gaulton et al. (2017) <doi: 10.1093/nar/gkw1074> for drug target information; see Kanehisa et al. (2021) <doi: 10.1093/nar/gkaa970> for the details of KEGG database.).
Read Word documents containing bibliographic references, search for corresponding DOIs using the Crossref API, and append the retrieved DOIs directly to the references. Supports parallel processing for faster retrieval and produces a new Word document with numbered references including DOIs.
This package provides a flexible container to transport and manipulate complex sets of data. These data may consist of multiple data files and associated meta data and ancillary files. Individual data objects have associated system level meta data, and data files are linked together using the OAI-ORE standard resource map which describes the relationships between the files. The OAI- ORE standard is described at <https://www.openarchives.org/ore/>. Data packages can be serialized and transported as structured files that have been created following the BagIt specification. The BagIt specification is described at <https://datatracker.ietf.org/doc/html/draft-kunze-bagit-08>.
Gives access to data visualisation methods that are relevant from the data scientist's point of view. The flagship idea of DataVisualizations is the mirrored density plot (MD-plot) for either classified or non-classified multivariate data published in Thrun, M.C. et al.: "Analyzing the Fine Structure of Distributions" (2020), PLoS ONE, <DOI:10.1371/journal.pone.0238835>. The MD-plot outperforms the box-and-whisker diagram (box plot), violin plot and bean plot and geom_violin plot of ggplot2. Furthermore, a collection of various visualization methods for univariate data is provided. In the case of exploratory data analysis, DataVisualizations makes it possible to inspect the distribution of each feature of a dataset visually through a combination of four methods. One of these methods is the Pareto density estimation (PDE) of the probability density function (pdf). Additionally, visualizations of the distribution of distances using PDE, the scatter-density plot using PDE for two variables as well as the Shepard density plot and the Bland-Altman plot are presented here. Pertaining to classified high-dimensional data, a number of visualizations are described, such as f.ex. the heat map and silhouette plot. A political map of the world or Germany can be visualized with the additional information defined by a classification of countries or regions. By extending the political map further, an uncomplicated function for a Choropleth map can be used which is useful for measurements across a geographic area. For categorical features, the Pie charts, slope charts and fan plots, improved by the ABC analysis, become usable. More detailed explanations are found in the book by Thrun, M.C.: "Projection-Based Clustering through Self-Organization and Swarm Intelligence" (2018) <DOI:10.1007/978-3-658-20540-9>.