Intended to be used by the United States Copyright Office Product Management Division Business Analysts. Includes algorithms for the United States Copyright Office Product Management Division SR Audit Data dataset. The algorithm takes in the SR Audit Data Excel file and reformats the spreadsheet so that the values and variables fit the format of the online database. Support functions in this package include clean_str(), which cleans instances of the variable AUDIT_LOG; clean_data_to_excel(), which cleans and outputs the reorganized SR Audit Data dataset in Excel format; clean_data_to_dataframe(), which cleans and stores the reorganized SR Audit Data dataset in a data frame; format_from_excel(), which reads in the Excel file output by clean_data_to_excel() and formats and returns the data as a dictionary that uses FIELD types as keys and NON-FIELD types as the values of those keys; format_from_dataframe(), which reads in the data frame output by clean_data_to_dataframe() and formats and returns the data in the same dictionary structure; and support_function(), which takes in the dictionary output by either format_from_dataframe() or format_from_excel() and returns the data as a data frame formatted according to the original U.S. Copyright Office SR Audit Data online database. The main function of this package is clean_format_all(), which takes in an Excel file and writes the formatted data to new Excel and text files following the format of the U.S. Copyright Office SR Audit Data online database.
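A minimal usage sketch of the workflow described above. Only the function names come from the description; the input file name and the exact argument lists are assumptions, so consult the package documentation for the real signatures.

```r
# Hypothetical input workbook name; clean_format_all() is assumed to take
# the path to the raw SR Audit Data Excel file and write the formatted
# Excel and text files described above.
clean_format_all("SR_Audit_Data.xlsx")

# The intermediate steps could presumably also be run individually
# (argument names here are assumptions):
cleaned <- clean_data_to_dataframe("SR_Audit_Data.xlsx")
fields  <- format_from_dataframe(cleaned)
result  <- support_function(fields)
```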
Quickly score raw data output from an Implicit Association Test (IAT; Greenwald, McGhee, & Schwartz, 1998) <doi:10.1037/0022-3514.74.6.1464>. IAT scores are calculated as specified by Greenwald, Nosek, and Banaji (2003) <doi:10.1037/0022-3514.85.2.197>. The output of this function is a data frame that consists of four rows containing the following information: (1) the overall IAT effect size for the participant's dataset, (2) the effect size calculated for odd trials only, (3) the effect size calculated for even trials only, and (4) the proportion of trials with reaction times under 300 ms (which is important for exclusion purposes). Items (2) and (3) allow for a measure of the internal consistency of the IAT. Specifically, you can use the subsetted IAT effect sizes for odd and even trials to calculate Cronbach's alpha across participants in the sample. The input function consists of three arguments. First, indicate the name of the dataset to be analyzed. This is the only required input. Second, indicate the number of trials in your entire IAT (the default is set to 220, which is typical for most IATs). Last, indicate whether congruent trials (e.g., flowers and pleasant) or incongruent trials (e.g., guns and pleasant) were presented first for this participant (the default is set to congruent). Data files should consist of six columns organized in order as follows: block (0-6), trial (0-19 for training blocks, 0-39 for test blocks), category (dependent on your IAT), the type of item within that category (dependent on your IAT), a dummy variable indicating whether the participant was correct or incorrect on that trial (0 = correct, 1 = incorrect), and the participant's reaction time (in milliseconds). A sample dataset (titled 'sampledata') is included in this package to practice with.
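A minimal sketch of the six-column layout described above, using a tiny made-up block of trials purely as a placeholder. The description does not name the scoring function, so the call at the end is hypothetical and commented out.

```r
# Six columns in the order described: block, trial, category, item,
# error (0 = correct, 1 = incorrect), and reaction time in milliseconds.
# The values below are illustrative placeholders, not real IAT data.
iat_raw <- data.frame(
  block    = c(0, 0, 0),
  trial    = c(0, 1, 2),
  category = c("flowers", "guns", "flowers"),
  item     = c("rose", "pistol", "tulip"),
  error    = c(0, 1, 0),
  rt       = c(612, 845, 733)
)

# Hypothetical call (the function name and argument names are assumptions):
# scores <- score_iat(iat_raw, Trials = 220, First = "congruent")
```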
Statistical methods for the modeling and monitoring of time series of counts, proportions and categorical data, as well as for the modeling of continuous-time point processes of epidemic phenomena. The monitoring methods focus on aberration detection in count data time series from public health surveillance of communicable diseases, but applications could just as well originate from environmetrics, reliability engineering, econometrics, or social sciences. The package implements many typical outbreak detection procedures such as the (improved) Farrington algorithm, or the negative binomial GLR-CUSUM method of Hoehle and Paul (2008) <doi:10.1016/j.csda.2008.02.015>. A novel CUSUM approach combining logistic and multinomial logistic modeling is also included. The package contains several real-world data sets, the ability to simulate outbreak data, and to visualize the results of the monitoring in a temporal, spatial or spatio-temporal fashion. A recent overview of the available monitoring procedures is given by Salmon et al. (2016) <doi:10.18637/jss.v070.i10>. For the retrospective analysis of epidemic spread, the package provides three endemic-epidemic modeling frameworks with tools for visualization, likelihood inference, and simulation. hhh4() estimates models for (multivariate) count time series following Paul and Held (2011) <doi:10.1002/sim.4177> and Meyer and Held (2014) <doi:10.1214/14-AOAS743>. twinSIR() models the susceptible-infectious-recovered (SIR) event history of a fixed population, e.g., epidemics across farms or networks, as a multivariate point process as proposed by Hoehle (2009) <doi:10.1002/bimj.200900050>. twinstim() estimates self-exciting point process models for a spatio-temporal point pattern of infective events, e.g., time-stamped geo-referenced surveillance data, as proposed by Meyer et al. (2012) <doi:10.1111/j.1541-0420.2011.01684.x>. A recent overview of the implemented space-time modeling frameworks for epidemic phenomena is given by Meyer et al. (2017) <doi:10.18637/jss.v077.i11>.
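A minimal sketch of fitting an endemic-epidemic count model with hhh4(). It assumes the 'measlesWeserEms' example data shipped with the package (an 'sts' object) and uses a deliberately simple control list; it is an illustrative specification, not a recommended analysis.

```r
library(surveillance)
data("measlesWeserEms")  # example sts object shipped with the package

# Intercept-only endemic and autoregressive components with a negative
# binomial observation model.
fit <- hhh4(measlesWeserEms,
            control = list(end    = list(f = ~1),
                           ar     = list(f = ~1),
                           family = "NegBin1"))
summary(fit)
```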
Algorithms for checking the accuracy of a clustering result with known classes, computing cluster validity indices, and generating plots for comparing them. The package is compatible with K-means, fuzzy C-means, EM clustering, and hierarchical clustering (single, average, and complete linkage). The details of the indices in this package can be found in: J. C. Bezdek, M. Moshtaghi, T. Runkler, C. Leckie (2016) <doi:10.1109/TFUZZ.2016.2540063>, T. Calinski, J. Harabasz (1974) <doi:10.1080/03610927408827101>, C. H. Chou, M. C. Su, E. Lai (2004) <doi:10.1007/s10044-004-0218-1>, D. L. Davies, D. W. Bouldin (1979) <doi:10.1109/TPAMI.1979.4766909>, J. C. Dunn (1973) <doi:10.1080/01969727308546046>, F. Haouas, Z. Ben Dhiaf, A. Hammouda, B. Solaiman (2017) <doi:10.1109/FUZZ-IEEE.2017.8015651>, M. Kim, R. S. Ramakrishna (2005) <doi:10.1016/j.patrec.2005.04.007>, S. H. Kwon (1998) <doi:10.1049/EL:19981523>, S. H. Kwon, J. Kim, S. H. Son (2021) <doi:10.1049/ell2.12249>, G. W. Milligan (1980) <doi:10.1007/BF02293907>, M. K. Pakhira, S. Bandyopadhyay, U. Maulik (2004) <doi:10.1016/j.patcog.2003.06.005>, M. Popescu, J. C. Bezdek, T. C. Havens, J. M. Keller (2013) <doi:10.1109/TSMCB.2012.2205679>, S. Saitta, B. Raphael, I. Smith (2007) <doi:10.1007/978-3-540-73499-4_14>, A. Starczewski (2017) <doi:10.1007/s10044-015-0525-8>, Y. Tang, F. Sun, Z. Sun (2005) <doi:10.1109/ACC.2005.1470111>, N. Wiroonsri (2024) <doi:10.1016/j.patcog.2023.109910>, N. Wiroonsri, O. Preedasawakul (2023) <doi:10.48550/arXiv.2308.14785>, C. H. Wu, C. S. Ouyang, L. W. Chen, L. W. Lu (2015) <doi:10.1109/TFUZZ.2014.2322495>, X. Xie, G. Beni (1991) <doi:10.1109/34.85677>, Rousseeuw (1987) <doi:10.1016/0377-0427(87)90125-7>, Kaufman and Rousseeuw (2009) <doi:10.1002/9780470316801>, and C. Alok (2010).
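A minimal sketch of the kind of clustering results such indices are computed on. The description does not name the package's own functions, so the index call at the end is a commented-out placeholder.

```r
# Cluster the numeric columns of iris with k-means for several values of k.
x <- scale(iris[, 1:4])
fits <- lapply(2:6, function(k) kmeans(x, centers = k, nstart = 25))

# Hypothetical index call -- replace with the validity-index function
# actually exported by the package:
# scores <- sapply(fits, function(f) some_validity_index(x, f$cluster))
```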
The main goal of the R package treeDbalance is to provide functions for the computation of several measurements of 3D node imbalance and their respective 3D tree imbalance indices, as well as to introduce the new phylo3D format for rooted 3D tree objects. Moreover, it encompasses an example dataset of 3D models of 63 beans in phylo3D format. Please note that this R package was developed alongside the project described in the manuscript 'Measuring 3D tree imbalance of plant models using graph-theoretical approaches' by M. Fischer, S. Kersting, and L. Kühn (2023) <arXiv:2307.14537>, which provides precise mathematical definitions of the measurements. Furthermore, the package contains several helpful functions, for example, some auxiliary functions for computing the ancestors, descendants, and depths of the nodes, which ensure that the computations can be done in linear time. Most functions of treeDbalance require as input a rooted tree in the phylo3D format, an extended phylo format (as introduced in the R package ape 1.9 in November 2006). Such a phylo3D object must have at least two new attributes next to those required by the phylo format: 'node.coord', the coordinates of the nodes, as well as 'edge.weight', the literal weight or volume of the edges. Optional attributes are 'edge.diam', the diameter of the edges, and 'edge.length', the length of the edges. For visualization purposes one can also specify 'edge.type', which ranges from normal cylinder to bud to leaf, as well as 'edge.color' to change the color of the edge depiction. This project was supported by the joint research project DIG-IT! funded by the European Social Fund (ESF), reference: ESF/14-BM-A55-0017/19, and the Ministry of Education, Science and Culture of Mecklenburg-Western Pomerania, Germany, as well as by the project ArtIGROW, which is a part of the WIR!-Alliance ArtIFARM (Artificial Intelligence in Farming) funded by the German Federal Ministry of Education and Research (FKZ: 03WIR4805).
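A minimal sketch of building the phylo3D structure described above by extending an ordinary ape phylo object with the two mandatory attributes. The package's own constructor or validator functions are not named in the description, and the assumption that 'node.coord' is stored as one x/y/z row per node is mine; the coordinate and weight values are made-up placeholders.

```r
library(ape)

# An ordinary rooted tree to start from (3 leaves, 2 internal nodes).
tree <- read.tree(text = "((A:1,B:1):1,C:2);")

# Attach the two mandatory phylo3D attributes described above:
# 'node.coord' (assumed one x/y/z row per node) and 'edge.weight'
# (one value per edge).
n_nodes <- Ntip(tree) + tree$Nnode
tree$node.coord  <- matrix(runif(3 * n_nodes), ncol = 3,
                           dimnames = list(NULL, c("x", "y", "z")))
tree$edge.weight <- runif(nrow(tree$edge))

# Optional attributes such as 'edge.diam' or 'edge.length' could be added
# the same way, e.g. tree$edge.diam <- runif(nrow(tree$edge)).
```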
Descriptive statistics are essential for publishing articles. This package can perform descriptive statistics according to different data types. If the data is a continuous variable, the mean and standard deviation or the median and quartiles are automatically output; if the data is a categorical variable, the number and percentage are automatically output. In addition, if you enter two variables into this package, the two variables will be described and their relationship will be tested automatically according to their data types. For example, if one of the two input variables is a categorical variable, the other variable will be described hierarchically based on the categorical variable, and the statistical differences between groups will be compared using appropriate statistical methods. For more than two groups, a post hoc test is applied. For more information on the methods we used, please see the following references: Libiseller, C. and Grimvall, A. (2002) <doi:10.1002/env.507>, Patefield, W. M. (1981) <doi:10.2307/2346669>, Hope, A. C. A. (1968) <doi:10.1111/J.2517-6161.1968.TB00759.X>, Mehta, C. R. and Patel, N. R. (1983) <doi:10.1080/01621459.1983.10477989>, Mehta, C. R. and Patel, N. R. (1986) <doi:10.1145/6497.214326>, Clarkson, D. B., Fan, Y. and Joe, H. (1993) <doi:10.1145/168173.168412>, Cochran, W. G. (1954) <doi:10.2307/3001616>, Armitage, P. (1955) <doi:10.2307/3001775>, Szabo, A. (2016) <doi:10.1080/00031305.2017.1407823>, David, F. B. (1972) <doi:10.1080/01621459.1972.10481279>, Joanes, D. N. and Gill, C. A. (1998) <doi:10.1111/1467-9884.00122>, Dunn, O. J. (1964) <doi:10.1080/00401706.1964.10490181>, Copenhaver, M. D. and Holland, B. S. (1988) <doi:10.1080/00949658808811082>, Chambers, J. M., Freeny, A. and Heiberger, R. M. (1992) <doi:10.1201/9780203738535-5>, Shaffer, J. P. (1995) <doi:10.1146/annurev.ps.46.020195.003021>, Myles, H. and Douglas, A. W. (1973) <doi:10.2307/2063815>, Rahman, M. and Tiwari, R. (2012) <doi:10.4236/health.2012.410139>, Thode, H. J. (2002) <doi:10.1201/9780203910894>, Jonckheere, A. R. (1954) <doi:10.2307/2333011>, Terpstra, T. J. (1952) <doi:10.1016/S1385-7258(52)50043-X>.
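A minimal base-R sketch of the type-dependent summary logic described above. It illustrates the described behavior only; the function name is mine, not the package's API, which the description does not name.

```r
# Summarize one variable according to its type, following the logic above:
# continuous -> mean and standard deviation; categorical -> counts and
# percentages.
describe_one <- function(x) {
  if (is.numeric(x)) {
    c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE))
  } else {
    tab <- table(x)
    data.frame(level = names(tab),
               n     = as.vector(tab),
               pct   = round(100 * as.vector(tab) / sum(tab), 1))
  }
}

describe_one(iris$Sepal.Length)  # continuous variable
describe_one(iris$Species)       # categorical variable
```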
The data that is generated from independent and consecutive 'GillespieSSA' runs for a generic biochemical network is formatted as rows, each of which constitutes an observation. The first column of each row is the computed timestep for each run. Subsequent columns are used for the number of molecules of each participating molecular species or "metabolite" of a generic biochemical network. In this way, 'TemporalGSSA' is a wrapper for the R package 'GillespieSSA'. The number of observations must be at least 30, so that the generated data is statistically meaningful. 'TemporalGSSA' transforms this raw data into a simulation time-dependent and metabolite-specific trial. Each such trial is defined as a set of linear models (n >= 30) between a timestep and the number of molecules for a metabolite. Each linear model is characterized by coefficients such as the slope, arbitrary constant, etc. The user must enter an integer from 1 to 4, specifying the statistical modality used to compute a representative timestep (mean, median, random, all). These arguments are mandatory and are checked: the numeric indicator "0" indicates suitability, whereas "1" prompts the user to revise and re-enter their data. An optional logical argument controls the output to the console; the default "TRUE" gives curtailed output, whereas "FALSE" gives verbose output. The coefficients of each linear model are averaged (mean slope, mean constant) and are incorporated into a metabolite-specific linear regression model that is used to impute the dependent variable; the independent variable is the representative timestep chosen previously. The generated data is the imputed molecule number for an in silico experiment with (n >= 30) observations. These steps can be replicated with multiple sets of observations. The generated "technical replicates" can be statistically evaluated (mean, standard deviation) and constitute simulation time-dependent molecule numbers for each metabolite. For SSA-generated datasets with varying simulation times, 'TemporalGSSA' will generate a simulation time-dependent trajectory for each metabolite of the biochemical network under study. The relevant publication with the mathematical derivation of the algorithm is (2022, Journal of Bioinformatics and Computational Biology) <doi:10.1142/S0219720022500184>. The algorithm has been deployed in the following publications (2021, Heliyon) <doi:10.1016/j.heliyon.2021.e07466> and (2016, Journal of Theoretical Biology) <doi:10.1016/j.jtbi.2016.07.002>.
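A minimal sketch of the per-run linear-model step described above (fit one linear model per run, average slopes and intercepts, impute a molecule number at a representative timestep). It illustrates the described procedure for a single metabolite and does not use the package's exported functions; the simulated runs are placeholders, not real SSA output.

```r
# 'runs' stands in for >= 30 consecutive GillespieSSA runs, reduced here to
# one metabolite: columns are timestep and molecule count.
set.seed(1)
runs <- lapply(1:30, function(i) {
  t <- sort(runif(50, 0, 10))
  data.frame(time = t, count = round(100 + 5 * t + rnorm(50, sd = 3)))
})

# One linear model per run; keep each run's intercept and slope.
coefs <- t(sapply(runs, function(d) coef(lm(count ~ time, data = d))))

# Average the coefficients and impute the molecule number at a
# representative timestep (here the mean of the final timesteps, standing
# in for the mean/median/random/all modalities described above).
t_rep   <- mean(sapply(runs, function(d) max(d$time)))
imputed <- mean(coefs[, "(Intercept)"]) + mean(coefs[, "time"]) * t_rep
imputed
```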
Generates random strings and byte strings matching a regex.
Fast monotone priority queues.
Sampling from random number distributions.
This package provides safe bindings for gettext.
This package provides core APIs for Rayon.
This package provides an HTTP Range header parser.
Electrical properties of resistor networks using matrix methods.
Lazy static regular expressions checked at compile time.
Row encodings for the Polars DataFrame library.
This package provides GNU Gettext FFI bindings for Rust.
Use Rust's regex library with the grep crate.
This package provides Rustls bindings for non-Rust languages.
This package provides a tool for sampling from random number distributions.
Rspec-core provides the RSpec test runner and example groups.