Enter the query into the form above. You can look for specific version of a package by using @ symbol like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is a page number and limit is a number of items on a single page. Pagination information (such as a number of pages and etc) is returned
in response headers.
If you'd like to join our channel webring send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
Data sets associated with modeling examples in Craig Starbuck's book, "The Fundamentals of People Analytics: With Applications in R".
Includes functions for keyword search of pdf files. There is also a wrapper that includes searching of all files within a single directory.
Based on different statistical definitions of discrimination, several methods have been proposed to detect and mitigate social inequality in machine learning models. This package aims to provide an alternative to fairness treatment in predictive models. The ROC method implemented in this package is described by Kamiran, Karim and Zhang (2012) <https://ieeexplore.ieee.org/document/6413831/>.
Simulates pooled sequencing data under a variety of conditions. Also allows for the evaluation of the average absolute difference between allele frequencies computed from genotypes and those computed from pooled data. Carvalho et al., (2022) <doi:10.1101/2023.01.20.524733>.
This package provides functions to setup a personal R package that attaches given libraries and exports personal helper functions.
Estimate commonly used population genomic statistics and generate publication quality figures. PopGenHelpR uses vcf, geno (012), and csv files to generate output.
Construct parser combinator functions, higher order functions that parse input. Construction of such parsers is transparent and easy. Their main application is the parsing of structured text files like those generated by laboratory instruments. Based on a paper by Hutton (1992) <doi:10.1017/S0956796800000411>.
This package provides functions to aid the identification of probable/possible duplicates in Plant Genetic Resources (PGR) collections using passport databases comprising of information records of each constituent sample. These include methods for cleaning the data, creation of a searchable Key Word in Context (KWIC) index of keywords associated with sample records and the identification of nearly identical records with similar information by fuzzy, phonetic and semantic matching of keywords.
The semiparametric accelerated failure time (AFT) model is an attractive alternative to the Cox proportional hazards model. This package provides a suite of functions for fitting one popular rank-based estimator of the semiparametric AFT model, the regularized Gehan estimator. Specifically, we provide functions for cross-validation, prediction, coefficient extraction, and visualizing both trace plots and cross-validation curves. For further details, please see Suder, P. M. and Molstad, A. J., (2022) Scalable algorithms for semiparametric accelerated failure time models in high dimensions, Statistics in Medicine <doi:10.1002/sim.9264>.
This package provides data set and function for exploration of Multiple Indicator Cluster Survey (MICS) 2017-18 Maternal Mortality questionnaire data for Punjab, Pakistan. The results of the present survey are critically important for the purposes of Sustainable Development Goals (SDGs) monitoring, as the survey produces information on 32 global Sustainable Development Goals (SDGs) indicators. The data was collected from 53,840 households selected at the second stage with systematic random sampling out of a sample of 2,692 clusters selected using probability proportional to size sampling. Six questionnaires were used in the survey: (1) a household questionnaire to collect basic demographic information on all de jure household members (usual residents), the household, and the dwelling; (2) a water quality testing questionnaire administered in three households in each cluster of the sample; (3) a questionnaire for individual women administered in each household to all women age 15-49 years; (4) a questionnaire for individual men administered in every second household to all men age 15-49 years; (5) an under-5 questionnaire, administered to mothers (or caretakers) of all children under 5 living in the household; and (6) a questionnaire for children age 5-17 years, administered to the mother (or caretaker) of one randomly selected child age 5-17 years living in the household (<http://www.mics.unicef.org/surveys>).
Design parameters of the optimal two-period multiarm platform design (controlling for either family-wise error rate or pair-wise error rate) can be calculated using this package, allowing pre-planned deferred arms to be added during the trial. More details about the design method can be found in the paper: Pan, H., Yuan, X. and Ye, J. (2022) "An optimal two-period multiarm platform design with new experimental arms added during the trial". Manuscript submitted for publication. For additional references: Dunnett, C. W. (1955) <doi:10.2307/2281208>.
Location- and scale-invariant Box-Cox and Yeo-Johnson power transformations allow for transforming variables with distributions distant from 0 to normality. Transformers are implemented as S4 objects. These allow for transforming new instances to normality after optimising fitting parameters on other data. A test for central normality allows for rejecting transformations that fail to produce a suitably normal distribution, independent of sample number.
This package provides propensity score weighting methods to control for confounding in causal inference with dichotomous treatments and continuous/binary outcomes. It includes the following functional modules: (1) visualization of the propensity score distribution in both treatment groups with mirror histogram, (2) covariate balance diagnosis, (3) propensity score model specification test, (4) weighted estimation of treatment effect, and (5) augmented estimation of treatment effect with outcome regression. The weighting methods include the inverse probability weight (IPW) for estimating the average treatment effect (ATE), the IPW for average treatment effect of the treated (ATT), the IPW for the average treatment effect of the controls (ATC), the matching weight (MW), the overlap weight (OVERLAP), and the trapezoidal weight (TRAPEZOIDAL). Sandwich variance estimation is provided to adjust for the sampling variability of the estimated propensity score. These methods are discussed by Hirano et al (2003) <DOI:10.1111/1468-0262.00442>, Lunceford and Davidian (2004) <DOI:10.1002/sim.1903>, Li and Greene (2013) <DOI:10.1515/ijb-2012-0030>, and Li et al (2016) <DOI:10.1080/01621459.2016.1260466>.
Simulation of models Poisson-Tweedie.
This package provides a novel pseudo-value regression approach for the differential co-expression network analysis in expression data, which can incorporate additional clinical variables in the model. This is a direct regression modeling for the differential network analysis, and it is therefore computationally amenable for the most users. The full methodological details can be found in Ahn S et al (2023) <doi:10.1186/s12859-022-05123-w>.
Several functions are provided to implement a MBPLSDA : components search, optimal model components number search, optimal model validity test by permutation tests, observed values evaluation of optimal model parameters and predicted categories, bootstrap values evaluation of optimal model parameters and predicted cross-validated categories. The use of this package is described in Brandolini-Bunlon et al (2019. Multi-block PLS discriminant analysis for the joint analysis of metabolomic and epidemiological data. Metabolomics, 15(10):134).
Easy and efficient access to the API provided by Prevedere', an industry insights and predictive analytics company. Query and download indicators, models and workbenches built with Prevedere for further analysis and reporting <https://www.prevedere.com/>.
Learn optimal policies via doubly robust empirical welfare maximization over trees. Given doubly robust reward estimates, this package finds a rule-based treatment prescription policy, where the policy takes the form of a shallow decision tree that is globally (or close to) optimal.
Read Protein Data Bank (PDB) files, performs its analysis, and presents the result using different visualization types including 3D. The package also has additional capability for handling Virus Report data from the National Center for Biotechnology Information (NCBI) database. Nature Structural Biology 10, 980 (2003) <doi:10.1038/nsb1203-980>. US National Library of Medicine (2021) <https://www.ncbi.nlm.nih.gov/datasets/docs/reference-docs/data-reports/virus/>.
An implementation of a hybrid method of person-oriented method and perturbation on the model. Pompom is the initials of the two methods. The hybrid method will provide a multivariate intraindividual variability metric (iRAM). The person-oriented method used in this package refers to uSEM (unified structural equation modeling, see Kim et al., 2007, Gates et al., 2010 and Gates et al., 2012 for details). Perturbation on the model was conducted according to impulse response analysis introduced in Lutkepohl (2007). Kim, J., Zhu, W., Chang, L., Bentler, P. M., & Ernst, T. (2007) <doi:10.1002/hbm.20259>. Gates, K. M., Molenaar, P. C. M., Hillary, F. G., Ram, N., & Rovine, M. J. (2010) <doi:10.1016/j.neuroimage.2009.12.117>. Gates, K. M., & Molenaar, P. C. M. (2012) <doi:10.1016/j.neuroimage.2012.06.026>. Lutkepohl, H. (2007, ISBN:3540262393).
The main function, plot_GMM, is used for plotting output from Gaussian mixture models (GMMs), including both densities and overlaying mixture weight component curves from the fit GMM. The package also include the function, plot_cut_point, which plots the cutpoint (mu) from the GMM over a histogram of the distribution with several color options. Finally, the package includes the function, plot_mix_comps, which is used in the plot_GMM function, and can be used to create a custom plot for overlaying mixture component curves from GMMs. For the plot_mix_comps function, usage most often will be specifying the "fun" argument within "stat_function" in a ggplot2 object.
Anomaly detection method based on the paper "Truth will out: Departure-based process-level detection of stealthy attacks on control systems" from Wissam Aoudi, Mikel Iturbe, and Magnus Almgren (2018) <DOI:10.1145/3243734.3243781>. Also referred to the following implementation: <https://github.com/rahulrajpl/PyPASAD>.
Returns almost all features that has been extracted from Position Specific Scoring Matrix (PSSM) so far, which is a matrix of L rows (L is protein length) and 20 columns produced by PSI-BLAST which is a program to produce PSSM Matrix from multiple sequence alignment of proteins see <https://www.ncbi.nlm.nih.gov/books/NBK2590/> for mor details. some of these features are described in Zahiri, J., et al.(2013) <DOI:10.1016/j.ygeno.2013.05.006>, Saini, H., et al.(2016) <DOI:10.17706/jsw.11.8.756-767>, Ding, S., et al.(2014) <DOI:10.1016/j.biochi.2013.09.013>, Cheng, C.W., et al.(2008) <DOI:10.1186/1471-2105-9-S12-S6>, Juan, E.Y., et al.(2009) <DOI:10.1109/CISIS.2009.194>.
Support for parallel computation with progress bar, and option to stop or proceed on errors. Also provides logging to console and disk, and the logging persists in the parallel threads. Additional functions support function call automation with delayed execution (e.g. for executing functions in parallel).