Forced-choice (FC) response has gained increasing popularity and interest for its resistance to faking when well-designed (Cao & Drasgow, 2019 <doi:10.1037/apl0000414>). To established well-designed FC scales, typically each item within a block should measure different trait and have similar level of social desirability (Zhang et al., 2020 <doi:10.1177/1094428119836486>). Recent study also suggests the importance of high inter-item agreement of social desirability between items within a block (Pavlov et al., 2021 <doi:10.31234/osf.io/hmnrc>). In addition to this, FC developers may also need to maximize factor loading differences (Brown & Maydeu-Olivares, 2011 <doi:10.1177/0013164410375112>) or minimize item location differences (Cao & Drasgow, 2019 <doi:10.1037/apl0000414>) depending on scoring models. Decision of which items should be assigned to the same block, termed item pairing, is thus critical to the quality of an FC test. This pairing process is essentially an optimization process which is currently carried out manually. However, given that we often need to simultaneously meet multiple objectives, manual pairing becomes impractical or even not feasible once the number of latent traits and/or number of items per trait are relatively large. To address these problems, autoFC
is developed as a practical tool for facilitating the automatic construction of FC tests (Li et al., 2022 <doi:10.1177/01466216211051726>), essentially exempting users from the burden of manual item pairing and reducing the computational costs and biases induced by simple ranking methods. Given characteristics of each item (and item responses), FC measures can be constructed either automatically based on user-defined pairing criteria and weights, or based on exact specifications of each block (i.e., blueprint; see Li et al., 2024 <doi:10.1177/10944281241229784>). Users can also generate simulated responses based on the Thurstonian Item Response Theory model (Brown & Maydeu-Olivares, 2011 <doi:10.1177/0013164410375112>) and predict trait scores of simulated/actual respondents based on an estimated model.
In shotgun proteomics, shared peptides (i.e., peptides that might originate from different proteins sharing homology, from different proteoforms due to alternative mRNA
splicing, post-translational modifications, proteolytic cleavages, and/or allelic variants) represent a major source of ambiguity in protein identifications. The net4pg package allows to assess and handle ambiguity of protein identifications. It implements methods for two main applications. First, it allows to represent and quantify ambiguity of protein identifications by means of graph connected components (CCs). In graph theory, CCs are defined as the largest subgraphs in which any two vertices are connected to each other by a path and not connected to any other of the vertices in the supergraph. Here, proteins sharing one or more peptides are thus gathered in the same CC (multi-protein CC), while unambiguous protein identifications constitute CCs with a single protein vertex (single-protein CCs). Therefore, the proportion of single-protein CCs and the size of multi-protein CCs can be used to measure the level of ambiguity of protein identifications. The package implements a strategy to efficiently calculate graph connected components on large datasets and allows to visually inspect them. Secondly, the net4pg package allows to exploit the increasing availability of matched transcriptomic and proteomic datasets to reduce ambiguity of protein identifications. More precisely, it implement a transcriptome-based filtering strategy fundamentally consisting in the removal of those proteins whose corresponding transcript is not expressed in the sample-matched transcriptome. The underlying assumption is that, according to the central dogma of biology, there can be no proteins without the corresponding transcript. Most importantly, the package allows to visually inspect the effect of the filtering on protein identifications and quantify ambiguity before and after filtering by means of graph connected components. As such, it constitutes a reproducible and transparent method to exploit transcriptome information to enhance protein identifications. All methods implemented in the net4pg package are fully described in Fancello and Burger (2022) <doi:10.1186/s13059-022-02701-2>.
Inflammation can affect many micronutrient biomarkers and can thus lead to incorrect diagnosis of individuals and to over- or under-estimate the prevalence of deficiency in a population. Biomarkers Reflecting Inflammation and Nutritional Determinants of Anemia (BRINDA) is a multi-agency and multi-country partnership designed to improve the interpretation of nutrient biomarkers in settings of inflammation and to generate context-specific estimates of risk factors for anemia (Suchdev (2016) <doi:10.3945/an.115.010215>). In the past few years, BRINDA published a series of papers to provide guidance on how to adjust micronutrient biomarkers, retinol binding protein, serum retinol, serum ferritin by Namaste (2020), soluble transferrin receptor (sTfR
), serum zinc, serum and Red Blood Cell (RBC) folate, and serum B-12, using inflammation markers, alpha-1-acid glycoprotein (AGP) and/or C-Reactive Protein (CRP) by Namaste (2020) <doi:10.1093/ajcn/nqaa141>, Rohner (2017) <doi:10.3945/ajcn.116.142232>, McDonald
(2020) <doi:10.1093/ajcn/nqz304>, and Young (2020) <doi:10.1093/ajcn/nqz303>. The BRINDA inflammation adjustment method mainly focuses on Women of Reproductive Age (WRA) and Preschool-age Children (PSC); however, the general principle of the BRINDA method might apply to other population groups. The BRINDA R package is a user-friendly all-in-one R package that uses a series of functions to implement BRINDA adjustment method, as described above. The BRINDA R package will first carry out rigorous checks and provides users guidance to correct data or input errors (if they occur) prior to inflammation adjustments. After no errors are detected, the package implements the BRINDA inflammation adjustment for up to five micronutrient biomarkers, namely retinol-binding-protein, serum retinol, serum ferritin, sTfR
, and serum zinc (when appropriate), using inflammation indicators of AGP and/or CRP for various population groups. Of note, adjustment for serum and RBC folate and serum B-12 is not included in the R package, since evidence shows that no adjustment is needed for these micronutrient biomarkers in either WRA or PSC groups (Young (2020) <doi:10.1093/ajcn/nqz303>).
Optimization of conditional inference trees from the package party for classification and regression. For optimization, the model space is searched for the best tree on the full sample by means of repeated subsampling. Restrictions are allowed so that only trees are accepted which do not include pre-specified uninterpretable split results (cf. Weihs & Buschfeld, 2021a). The function PrInDT()
represents the basic resampling loop for 2-class classification (cf. Weihs & Buschfeld, 2021a). The function RePrInDT()
(repeated PrInDT()
) allows for repeated applications of PrInDT()
for different percentages of the observations of the large and the small classes (cf. Weihs & Buschfeld, 2021c). The function NesPrInDT()
(nested PrInDT()
) allows for an extra layer of subsampling for a specific factor variable (cf. Weihs & Buschfeld, 2021b). The functions PrInDTMulev()
and PrInDTMulab()
deal with multilevel and multilabel classification. In addition to these PrInDT()
variants for classification, the function PrInDTreg()
has been developed for regression problems. Finally, the function PostPrInDT()
allows for a posterior analysis of the distribution of a specified variable in the terminal nodes of a given tree. In version 2, additionally structured sampling is implemented in functions PrInDTCstruc()
and PrInDTRstruc()
. In these functions, repeated measurements data can be analyzed, too. Moreover, multilabel 2-stage versions of classification and regression trees are implemented in functions C2SPrInDT()
and R2SPrInDT()
as well as interdependent multilabel models in functions SimCPrInDT()
and SimRPrInDT()
. Finally, for mixtures of classification and regression models functions Mix2SPrInDT()
and SimMixPrInDT()
are implemented. These extensions of PrInDT
are all described in Buschfeld & Weihs (2025Fc). References: -- Buschfeld, S., Weihs, C. (2025Fc) "Optimizing decision trees for the analysis of World Englishes and sociolinguistic data", Cambridge Elements. -- Weihs, C., Buschfeld, S. (2021a) "Combining Prediction and Interpretation in Decision Trees (PrInDT
) - a Linguistic Example" <doi:10.48550/arXiv.2103.02336>
; -- Weihs, C., Buschfeld, S. (2021b) "NesPrInDT
: Nested undersampling in PrInDT
" <doi:10.48550/arXiv.2103.14931>
; -- Weihs, C., Buschfeld, S. (2021c) "Repeated undersampling in PrInDT
(RePrInDT
): Variation in undersampling and prediction, and ranking of predictors in ensembles" <doi:10.48550/arXiv.2108.05129>
.
Sometimes data for analysis are obtained using more convenient or less expensive means yielding "surrogate" variables for what could be obtained more accurately, albeit with less convenience; or less conveniently or at more expense yielding "reference" variables, thought of as being measured without error. Analysis of the surrogate variables measured with error generally yields biased estimates when the objective is to make inference about the reference variables. Often it is thought that ignoring the measurement error in surrogate variables only biases effects toward the null hypothesis, but this need not be the case. Measurement errors may bias parameter estimates either toward or away from the null hypothesis. If one has a data set with surrogate variable data from the full sample, and also reference variable data from a randomly selected subsample, then one can assess the bias introduced by measurement error in parameter estimation, and use this information to derive improved estimates based upon all available data. Formulaically these estimates based upon the reference variables from the validation subsample combined with the surrogate variables from the whole sample can be interpreted as starting with the estimate from reference variables in the validation subsample, and "augmenting" this with additional information from the surrogate variables. This suggests the term "augmented" estimate. The meerva package calculates these augmented estimates in the regression setting when there is a randomly selected subsample with both surrogate and reference variables. Measurement errors may be differential or non-differential, in any or all predictors (simultaneously) as well as outcome. The augmented estimates derive, in part, from the multivariate correlation between regression model parameter estimates from the reference variables and the surrogate variables, both from the validation subset. Because the validation subsample is chosen at random any biases imposed by measurement error, whether non-differential or differential, are reflected in this correlation and these correlations can be used to derive estimates for the reference variables using data from the whole sample. The main functions in the package are meerva.fit which calculates estimates for a dataset, and meerva.sim.block which simulates multiple datasets as described by the user, and analyzes these datasets, storing the regression coefficient estimates for inspection. The augmented estimates, as well as how measurement error may arise in practice, is described in more detail by Kremers WK (2021) <arXiv:2106.14063>
and is an extension of the works by Chen Y-H, Chen H. (2000) <doi:10.1111/1467-9868.00243>, Chen Y-H. (2002) <doi:10.1111/1467-9868.00324>, Wang X, Wang Q (2015) <doi:10.1016/j.jmva.2015.05.017> and Tong J, Huang J, Chubak J, et al. (2020) <doi:10.1093/jamia/ocz180>.
Embedded scripting for Rust.
Endian-aware primitives for Rust
Read Acoustic HAC format.
Persistent data structures with structural sharing
Zero-copy deserialization framework for Rust
Rust library for accessing USB devices.
Zero-copy deserialization framework for Rust
ROFF (man page format) generation library.
This package provides a generic connection pool.
This package provides a genetic algorithm plus derivative optimizer.
This package provided safe, fast, small crypto using Rust.
This package provided safe, fast, small crypto using Rust.
This package provided safe, fast, small crypto using Rust.
This package provided safe, fast, small crypto using Rust.
Rkyv is a zero-copy deserialization framework for Rust.
Rkyv is a zero-copy deserialization framework for Rust.
This package provides Safe, fast, small crypto using Rust.
This package provides a Rust library for accessing USB devices.
This create provides a value variant for rust-rmp
.