Based on individual market shares of all participants in a market or space, the package offers a set of structural and concentration measures frequently - and not so frequently - used in research and in practice. Measures can be calculated in groups or individually, and the result, whether a single measure or a vector in table format, is intended to help practitioners make more informed decisions. Methods used in this package are from: 1. Chang, E. J., Guerra, S. M., de Souza Penaloza, R. A. & Tabak, B. M. (2005) "Banking concentration: the Brazilian case". 2. Cobham, A. and A. Sumner (2013). "Is It All About the Tails? The Palma Measure of Income Inequality". 3. Garcia Alba Idunate, P. (1994). "Un Indice de dominancia para el analisis de la estructura de los mercados". 4. Ginevicius, R. and S. Cirba (2009). "Additive measurement of market concentration" <doi:10.3846/1611-1699.2009.10.191-198>. 5. Herfindahl, O. C. (1950), "Concentration in the steel industry" (PhD thesis). 6. Hirschmann, A. O. (1945), "National power and structure of foreign trade". 7. Melnik, A., O. Shy, and R. Stenbacka (2008), "Assessing market dominance" <doi:10.1016/j.jebo.2008.03.010>. 8. Palma, J. G. (2006). "Globalizing Inequality: Centrifugal and Centripetal Forces at Work". 9. Shannon, C. E. (1948). "A Mathematical Theory of Communication". 10. Simpson, E. H. (1949). "Measurement of Diversity" <doi:10.1038/163688a0>.
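As a minimal illustration of the kind of measure involved (this is plain base R, not the package's own API, whose function names are not listed above), the Herfindahl-Hirschman index is simply the sum of squared market shares:

```r
# Base-R sketch of the Herfindahl-Hirschman index computed from market shares.
shares <- c(0.40, 0.25, 0.20, 0.10, 0.05)  # shares of all participants, summing to 1
hhi <- sum(shares^2)                       # 1/n for equal shares, 1 for a monopoly
hhi                                        # 0.275 for this market
```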
This package provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development (DPIRD) of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science (DES). It also retrieves the Australian Government Bureau of Meteorology (BOM) precis and coastal forecasts, and downloads and imports radar and satellite imagery files. DPIRD weather data are accessed through public APIs provided by DPIRD, <https://www.dpird.wa.gov.au/online-tools/apis/>, providing access to weather station data from the DPIRD weather station network. Australia-wide weather data are based on data from the Australian Bureau of Meteorology (BOM) and accessed through SILO (Scientific Information for Land Owners), Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. DPIRD data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. BOM data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence (PAL) as appropriate; see <http://www.bom.gov.au/other/copyright.shtml> for further details.
This package contains functions to compute p-values for the one-sample and two-sample Kolmogorov-Smirnov (KS) tests and the two-sample Kuiper test for any fixed critical level and arbitrary (possibly very large) sample sizes. For the one-sample KS test, this package implements a novel, accurate and efficient method named Exact-KS-FFT, which allows the pre-specified cumulative distribution function under the null hypothesis to be continuous, purely discrete or mixed. In the two-sample case, it is assumed that both samples come from an unspecified (unknown) continuous, purely discrete or mixed distribution, i.e. ties (repeated observations) are allowed, and exact p-values of the KS and the Kuiper tests are computed. Note that the two-sample Kuiper test is often used when data samples are on the line or on the circle (circular data). To cite this package in publications: (for the use of the one-sample KS test) Dimitrina S. Dimitrova, Vladimir K. Kaishev, and Senren Tan. Computing the Kolmogorov-Smirnov Distribution When the Underlying CDF is Purely Discrete, Mixed, or Continuous. Journal of Statistical Software. 2020; 95(10): 1--42. <doi:10.18637/jss.v095.i10>. (for the use of the two-sample KS and Kuiper tests) Dimitrina S. Dimitrova, Yun Jia and Vladimir K. Kaishev (2024). The R functions KS2sample and Kuiper2sample: Efficient Exact Calculation of P-values of the Two-sample Kolmogorov-Smirnov and Kuiper Tests. submitted.
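A hedged usage sketch follows. The two-sample function names are taken from the citation above, but the simple (x, y) call pattern and the package name used in library() are assumptions that should be checked against the package documentation.

```r
# Sketch only: exact p-values for two small samples with ties (repeated values).
library(KSgeneral)        # assumed package name, matching the cited JSS paper
x <- c(1, 1, 2, 3, 5, 5)
y <- c(1, 2, 2, 4, 4, 6)
KS2sample(x, y)           # exact p-value of the two-sample Kolmogorov-Smirnov test
Kuiper2sample(x, y)       # exact p-value of the two-sample Kuiper test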
This package provides functions for Gaussian and non-Gaussian (bivariate) spatial and spatio-temporal data analysis: a) (fast) simulation of random fields, b) inference for random fields using standard likelihood and a likelihood approximation method called weighted composite likelihood based on pairs, and c) prediction using (local) best linear unbiased prediction. Weighted composite likelihood can be very efficient for estimating massive datasets. Both regression and spatial (temporal) dependence analysis can be performed jointly. Flexible covariance models for spatial and spatio-temporal data on Euclidean domains and spheres are provided. There are also many useful functions for plotting and performing diagnostic analysis. Different non-Gaussian random fields can be considered in the analysis, among them random fields with marginal distributions such as Skew-Gaussian, Student-t, Tukey-h, Sinh-Arcsinh, Two-piece, Weibull, Gamma, Log-Gaussian, Binomial, Negative Binomial and Poisson. See the URL for the papers associated with this package, for instance Bevilacqua and Gaetan (2015) <doi:10.1007/s11222-014-9460-6>, Bevilacqua et al. (2016) <doi:10.1007/s13253-016-0256-3>, Vallejos et al. (2020) <doi:10.1007/978-3-030-56681-4>, Bevilacqua et al. (2020) <doi:10.1002/env.2632>, Bevilacqua et al. (2021) <doi:10.1111/sjos.12447>, Bevilacqua et al. (2022) <doi:10.1016/j.jmva.2022.104949>, and Morales-Navarrete et al. (2023) <doi:10.1080/01621459.2022.2140053>, as well as a large class of examples and tutorials.
The presence of outliers in a dataset can substantially bias the results of statistical analyses. To correct for outliers, micro edits are manually performed on all records. A set of constraints and decision rules is typically used to aid the editing process. However, straightforward decision rules might overlook anomalies arising from disruption of linear relationships. Computationally efficient methods are provided to identify historical, tail, and relational anomalies at the data-entry level (Sartore et al., 2024; <doi:10.6339/24-JDS1136>). A score statistic is developed for each anomaly type, using a distribution-free approach motivated by the Bienaymé-Chebyshev inequality, and fuzzy logic is used to detect cellwise outliers resulting from different types of anomalies. Each data entry is individually scored, and individual scores are combined into a final score to determine anomalous entries. As alternatives to fuzzy logic, a Bayesian bootstrap and a Bayesian test based on empirical likelihoods are also provided, as studied by Sartore et al. (2024; <doi:10.3390/stats7040073>). These algorithms allow for a more nuanced approach to outlier detection, as they can identify outliers at the data-entry level that are not obviously distinct from the rest of the data. --- This research was supported in part by the U.S. Department of Agriculture, National Agriculture Statistics Service. The findings and conclusions in this publication are those of the authors and should not be construed to represent any official USDA or US Government determination or policy.
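The scoring idea can be illustrated outside the package (the package's own functions are not named above): the Bienaymé-Chebyshev inequality bounds the probability of an observation lying k standard deviations from the mean by 1/k^2, which yields a distribution-free tail score per data entry.

```r
# Illustrative base-R tail score motivated by the Bienaymé-Chebyshev inequality;
# this is a sketch of the principle, not the package's actual scoring functions.
x <- c(10, 12, 11, 13, 12, 11, 95)   # one suspicious data entry
k <- abs(x - mean(x)) / sd(x)        # distance from the mean in standard deviations
score <- pmin(1, 1 / k^2)            # Chebyshev bound; small values flag anomalies
round(score, 3)
```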
This package provides a tuneable and interpretable method for relaxing the instrumental variables (IV) assumptions to infer treatment effects in the presence of unobserved confounding. For a treatment-associated covariate to be a valid IV, it must be (a) unconfounded with the outcome and (b) have a causal effect on the outcome that is exclusively mediated by the exposure. There is no general test of the validity of these IV assumptions for any particular pre-treatment covariate. However, if different pre-treatment covariates give differing causal effect estimates when treated as IVs, then we know at least some of the covariates violate these assumptions. budgetIVr exploits this fact by taking as input a minimum budget of pre-treatment covariates assumed to be valid IVs and identifying the set of causal effects that are consistent with the user's data and budget assumption. The following generalizations of this principle can be used in this package: (1) a vector of multiple budgets can be assigned alongside corresponding thresholds that model degrees of IV invalidity; (2) budgets and thresholds can be chosen using specialist knowledge or varied in a principled sensitivity analysis; (3) treatment effects can be nonlinear and/or depend on multiple exposures (at a computational cost). The methods in this package require only summary statistics. Confidence sets are constructed under the "no measurement error" (NOME) assumption from the Mendelian randomization literature. For further methodological details, please refer to Penn et al. (2024) <doi:10.48550/arXiv.2411.06913>.
Assess the agreement in method comparison studies by tolerance intervals and errors-in-variables (EIV) regressions. The Ordinary Least Squares regressions (OLSv and OLSh), the Deming Regression (DR), and the (Correlated)-Bivariate Least Squares regressions (BLS and CBLS) can be used with unreplicated or replicated data. BLS() and CBLS() are the two main functions to estimate a regression line, while XY.plot() and MD.plot() are the two main graphical functions to display, respectively, an (X,Y) plot or an (M,D) plot with the BLS or CBLS results. Four hyperbolic statistical intervals are provided: the Confidence Interval (CI), the Confidence Bands (CB), the Prediction Interval and the Generalized Prediction Interval. Assuming no proportional bias, the (M,D) plot (Bland-Altman plot) may be simplified by calculating univariate tolerance intervals (beta-expectation (type I) or beta-gamma content (type II)). Major updates since the last version (1.0.0) are: the title has been shortened, and the new functions BLS.fit() and CBLS.fit() have been added as shortcuts for, respectively, BLS() and CBLS(). References: B.G. Francq, B. Govaerts (2016) <doi:10.1002/sim.6872>, B.G. Francq, B. Govaerts (2014) <doi:10.1016/j.chemolab.2014.03.006>, B.G. Francq, B. Govaerts (2014) <http://publications-sfds.fr/index.php/J-SFdS/article/view/262>, B.G. Francq (2013), PhD Thesis, UCLouvain, Errors-in-variables regressions to assess equivalence in method comparison studies, <https://dial.uclouvain.be/pr/boreal/object/boreal%3A135862/datastream/PDF_01/view>.
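A schematic workflow is sketched below; BLS(), XY.plot() and MD.plot() are the function names given above, but the package name in library(), the argument names, and the way results are passed to the plotting functions are assumptions, so consult the package help pages for the exact interface.

```r
# Sketch, not a verified interface: fit a BLS line to two measurement methods
# and display the (X,Y) and (M,D) plots.
library(BivRegBLS)  # assumed package name
df <- data.frame(method1 = c(10.1, 12.3, 14.8, 16.0, 18.4),
                 method2 = c(10.5, 12.0, 15.1, 16.4, 18.0))
fit <- BLS(data = df, xcol = "method1", ycol = "method2")  # argument names assumed
XY.plot(fit)   # (X,Y) plot with the fitted line and hyperbolic intervals
MD.plot(fit)   # (M,D) (Bland-Altman type) plot
```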
This package provides a collection of user-friendly functions for assessing and visualizing fragility of individual studies (Walsh et al., 2014 <doi:10.1016/j.jclinepi.2013.10.019>; Lin, 2021 <doi:10.1111/jep.13428>), conventional pairwise meta-analyses (Atal et al., 2019 <doi:10.1016/j.jclinepi.2019.03.012>), and network meta-analyses of multiple treatments with binary outcomes (Xing et al., 2020 <doi:10.1016/j.jclinepi.2020.07.003>). The included functions are designed to: 1) calculate the fragility index (i.e., the minimal event status modifications that can alter the significance or non-significance of the original result) and fragility quotient (i.e., fragility index divided by sample size) at a specific significance level; 2) give the cases of event status modifications for altering the result's significance or non-significance and visualize these cases; 3) visualize the trend of statistical significance as event status is modified; 4) efficiently derive fragility indexes and fragility quotients at multiple significance levels, and visualize the relationship between these fragility measures and the significance levels; and 5) calculate fragility indexes and fragility quotients of multiple datasets (e.g., a collection of clinical trials or meta-analyses) and produce plots of their overall distributions. The outputs from these functions may inform the robustness of clinical results in terms of statistical significance and aid the interpretation of fragility measures. The usage of this package is illustrated in Lin et al. (2023 <doi:10.1016/j.ajog.2022.08.053>) and detailed in Lin and Chu (2022 <doi:10.1371/journal.pone.0268754>).
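The fragility index itself is easy to illustrate in base R for a single two-arm trial (this shows the general idea only, not this package's interface): flip event statuses in one arm, one at a time, until the significance of a two-sided test changes.

```r
# Base-R sketch of a fragility index for one 2x2 trial (concept only).
events <- c(trt = 5, ctrl = 18)     # observed events per arm
totals <- c(trt = 100, ctrl = 100)  # sample sizes per arm
e <- events
fi <- 0
while (fisher.test(rbind(e, totals - e))$p.value < 0.05 && e["trt"] < totals["trt"]) {
  e["trt"] <- e["trt"] + 1          # modify one event status in the low-event arm
  fi <- fi + 1
}
fi   # smallest number of event modifications that removes statistical significance
```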
This package implements a modification of the preventive vaccine efficacy trial design of Gilbert, Grove et al. (2011, Statistical Communications in Infectious Diseases), with application generally to individual-randomized clinical trials with multiple active treatment groups and a shared control group, and a study endpoint that is a time-to-event endpoint subject to right-censoring. The design accounts for the issues that the efficacy of the treatment/vaccine groups may take time to accrue while the multiple treatment administrations/vaccinations are given; that there is interest in assessing the durability of treatment efficacy over time; and that group sequential monitoring of each treatment group for potential harm, non-efficacy/efficacy futility, and high efficacy is warranted. The design divides the trial into two stages of time periods, where each treatment is first evaluated for efficacy in the first stage of follow-up and, if and only if it shows significant treatment efficacy in stage one, it is evaluated for longer-term durability of efficacy in stage two. The package produces plots and tables describing operating characteristics of a specified design, including unconditional power for intention-to-treat and per-protocol/as-treated analyses; trial duration; probabilities of the different possible trial monitoring outcomes (e.g., stopping early for non-efficacy); unconditional power for comparing treatment efficacies; and distributions of numbers of endpoint events occurring after the treatments/vaccinations are given, useful as input parameters for the design of studies of the association of biomarkers with a clinical outcome (surrogate endpoint problem). The code can be used for a single active treatment versus control design and for a single-stage design.
EQ-5D is a standard instrument (<https://euroqol.org/eq-5d-instruments/>) for measuring quality of life, often used in clinical and economic evaluations of health care technologies. Both adult versions of EQ-5D (EQ-5D-3L and EQ-5D-5L) contain a descriptive system and a visual analog scale. The descriptive system measures the patient's health in 5 dimensions: the 5L version has 5 levels per dimension and the 3L version has 3 levels. The descriptive system scores are usually converted to index values using country-specific value sets (which incorporate country preferences). This package converts descriptive system scores from either version into index values. The value sets for EQ-5D-3L are from the references mentioned at <https://euroqol.org/eq-5d-instruments/eq-5d-3l-about/valuation/>; value sets for a total of 31 countries are used for the EQ-5D-3L valuation (see the user guide for a complete list of references). The value sets for EQ-5D-5L are obtained from references mentioned at <https://euroqol.org/eq-5d-instruments/eq-5d-5l-about/valuation-standard-value-sets/> and other sources; value sets for a total of 17 countries are used for the EQ-5D-5L valuation (see the user guide for a complete list of references). The package can also be used to map 5L scores to 3L index values for 10 countries: Denmark, France, Germany, Japan, Netherlands, Spain, Thailand, UK, USA, and Zimbabwe. The value set and method for mapping are obtained from Van Hout et al. (2012) <doi:10.1016/j.jval.2012.02.008>.
The microplot function writes a set of R graphics files to be used as microplots (sparklines) in tables in 'LaTeX', 'HTML', 'Word', or 'Excel' files. For 'LaTeX', we provide methods for the Hmisc::latex() generic function to construct latex tabular environments which include the graphs. These can be used directly with the operating system pdflatex or latex command, or by using one of 'Sweave', 'knitr', 'rmarkdown', or 'Emacs org-mode' as an intermediary. For 'MS Word', the msWord() function uses the 'flextable' package to construct Word tables which include the graphs. There are several distinct approaches for constructing HTML files. The simplest is to use the msWord() function with argument filetype="html". Alternatively, use either 'Emacs org-mode' or the htmlTable::htmlTable() function to construct an HTML file containing tables which include the graphs. See the documentation for our as.htmlimg() function. For 'Excel' use on 'Windows', the file examples/irisExcel.xls includes VBA code which brings the individual panels into individual cells in the spreadsheet. Examples in the examples and demo subdirectories are shown with lattice graphics, 'ggplot2' graphics, and base graphics. Examples for 'LaTeX' include 'Sweave' (both 'LaTeX'-style and 'Noweb'-style), 'knitr', 'emacs org-mode', and 'rmarkdown' input files and their pdf output files. Examples for HTML include org-mode and Rmd input files and their webarchive HTML output files. In addition, the as.orgtable() function can display a data.frame in an org-mode document. The examples for 'MS Word' (with either filetype="docx" or filetype="html") work with all operating systems. The package does not require the installation of 'LaTeX' or 'MS Word' to be able to write .tex or .docx files.
Mobile monitoring, or "sensors on a mobile platform", is an increasingly popular approach to measure high-resolution pollution data at the street level. Coupled with location data, spatial visualisation of air-quality parameters helps detect localized areas of high air pollution, also called hotspots. In this approach, portable sensors are mounted on a vehicle and driven on predetermined routes to collect high-frequency data (1 Hz). mmaqshiny is for analysing, visualising and spatially mapping high-resolution air-quality data collected by specific devices installed on a moving platform. It handles 1 Hz data of PM2.5 (mass concentrations of particulate matter with size less than 2.5 microns), black carbon mass concentrations (BC), ultra-fine particle number concentrations and carbon dioxide, along with GPS coordinates and relative humidity (RH) data, collected by popular portable instruments (TSI DustTrak-8530, Aethlabs microAeth-AE51, TSI CPC3007, LICOR Li-830, Garmin GPSMAP 64s and Omega USB RH probe, respectively). It incorporates device-specific cleaning and correction algorithms. RH correction is applied to DustTrak PM2.5 following Chakrabarti et al. (2004) <doi:10.1016/j.atmosenv.2004.03.007>. Provision is given to add linear regression coefficients for correcting the PM2.5 data (if required). BC data are cleaned for vibration-generated noise by adopting the statistical procedure explained in Apte et al. (2011) <doi:10.1016/j.atmosenv.2011.05.028>, followed by a loading correction as suggested by Ban-Weiss et al. (2009) <doi:10.1021/es8021039>. For the number concentration data, provision is given for a dilution correction factor (if a diluter is used with the CPC3007; the default value is 1). The package joins the raw, cleaned and corrected data from the above instruments and outputs a downloadable csv file.
This package provides a tool to create hydroclimate scenarios, stress test systems and visualize system performance in scenario-neutral climate change impact assessments. Scenario-neutral approaches stress-test the performance of a modelled system by applying a wide range of plausible hydroclimate conditions (see Brown & Wilby (2012) <doi:10.1029/2012EO410001> and Prudhomme et al. (2010) <doi:10.1016/j.jhydrol.2010.06.043>). These approaches allow the identification of hydroclimatic variables that affect the vulnerability of a system to hydroclimate variation and change. This tool enables the generation of perturbed time series using a range of approaches including simple scaling of observed time series (e.g. Culley et al. (2016) <doi:10.1002/2015WR018253>) and stochastic simulation of perturbed time series via an inverse approach (see Guo et al. (2018) <doi:10.1016/j.jhydrol.2016.03.025>). It incorporates Richardson-type weather generator model configurations documented in Richardson (1981) <doi:10.1029/WR017i001p00182>, Richardson and Wright (1984), as well as latent variable type model configurations documented in Bennett et al. (2018) <doi:10.1016/j.jhydrol.2016.12.043>, Rasmussen (2013) <doi:10.1002/wrcr.20164>, Bennett et al. (2019) <doi:10.5194/hess-23-4783-2019> to generate hydroclimate variables on a daily basis (e.g. precipitation, temperature, potential evapotranspiration) and allows a variety of different hydroclimate variable properties, herein called attributes, to be perturbed. Options are included for the easy integration of existing system models both internally in R and externally for seamless 'stress-testing'. A suite of visualization options for the results of a scenario-neutral analysis (e.g. plotting performance spaces and overlaying climate projection information) are also included. Version 1.0 of this package is described in Bennett et al. (2021) <doi:10.1016/j.envsoft.2021.104999>. As further developments in scenario-neutral approaches occur the tool will be updated to incorporate these advances.
Calculate different glucose variability measures, including average measures of glycemia, measures of glycemic variability and measures of glycemic risk, from continuous glucose monitoring data. Boris P. Kovatchev, Erik Otto, Daniel Cox, Linda Gonder-Frederick, and William Clarke (2006) <doi:10.2337/dc06-1085>. Jean-Pierre Le Floch, Philippe Escuyer, Eric Baudin, Dominique Baudon, and Leon Perlemuter (1990) <doi:10.2337/diacare.13.2.172>. C.M. McDonnell, S.M. Donath, S.I. Vidmar, G.A. Werther, and F.J. Cameron (2005) <doi:10.1089/dia.2005.7.253>. Everitt, Brian (1998) <doi:10.1111/j.1751-5823.2011.00149_2.x>. Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) <doi:10.2307/2234167>. Dougherty, R. L., Edelman, A. and Hyman, J. M. (1989) <doi:10.1090/S0025-5718-1989-0962209-1>. Tukey, J. W. (1977) <doi:10.1016/0377-2217(86)90209-2>. F. John Service (2013) <doi:10.2337/db12-1396>. Edmond A. Ryan, Tami Shandro, Kristy Green, Breay W. Paty, Peter A. Senior, David Bigam, A.M. James Shapiro, and Marie-Christine Vantyghem (2004) <doi:10.2337/diabetes.53.4.955>. F. John Service, George D. Molnar, John W. Rosevear, Eugene Ackerman, Leal C. Gatewood, William F. Taylor (1970) <doi:10.2337/diab.19.9.644>. Sarah E. Siegelaar, Frits Holleman, Joost B. L. Hoekstra, and J. Hans DeVries (2010) <doi:10.1210/er.2009-0021>. Gabor Marics, Zsofia Lendvai, Csaba Lodi, Levente Koncz, David Zakarias, Gyorgy Schuster, Borbala Mikos, Csaba Hermann, Attila J. Szabo, and Peter Toth-Heyn (2015) <doi:10.1186/s12938-015-0035-3>. Thomas Danne, Revital Nimri, Tadej Battelino, Richard M. Bergenstal, Kelly L. Close, J. Hans DeVries, Satish Garg, Lutz Heinemann, Irl Hirsch, Stephanie A. Amiel, Roy Beck, Emanuele Bosi, Bruce Buckingham, Claudio Cobelli, Eyal Dassau, Francis J. Doyle, Simon Heller, Roman Hovorka, Weiping Jia, Tim Jones, Olga Kordonouri, Boris Kovatchev, Aaron Kowalski, Lori Laffel, David Maahs, Helen R. Murphy, Kirsten Nørgaard, Christopher G. Parkin, Eric Renard, Banshi Saboo, Mauro Scharf, William V. Tamborlane, Stuart A. Weinzimer, and Moshe Phillip. International consensus on use of continuous glucose monitoring. Diabetes Care, 2017 <doi:10.2337/dc17-1600>.
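A few of the most common measures are simple to compute directly, which may help clarify what the package summarizes (this base-R illustration is not the package's own interface, whose function names are not listed above):

```r
# Base-R illustration of common CGM summaries: mean glucose, SD, coefficient of
# variation, and per cent time in the 70-180 mg/dL range (concept only).
glucose <- c(95, 110, 150, 190, 170, 140, 120, 85, 60, 100)   # mg/dL readings
mean_g <- mean(glucose)
sd_g   <- sd(glucose)
cv_g   <- 100 * sd_g / mean_g                                 # glycemic variability (%)
tir    <- 100 * mean(glucose >= 70 & glucose <= 180)          # time in range (%)
c(mean = mean_g, sd = sd_g, cv = cv_g, tir = tir)
```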
The method of anticlustering partitions a pool of elements into groups (i.e., anticlusters) with the goal of maximizing between-group similarity or within-group heterogeneity. The anticlustering approach thereby reverses the logic of cluster analysis that strives for high within-group homogeneity and clear separation between groups. Computationally, anticlustering is accomplished by maximizing instead of minimizing a clustering objective function, such as the intra-cluster variance (used in k-means clustering) or the sum of pairwise distances within clusters. The main function anticlustering() gives access to optimal and heuristic anticlustering methods described in Papenberg and Klau (2021; <doi:10.1037/met0000301>), Brusco et al. (2020; <doi:10.1111/bmsp.12186>), Papenberg (2024; <doi:10.1111/bmsp.12315>), and Papenberg et al. (2025; <doi:10.1101/2025.03.03.641320>). The optimal algorithms require that an integer linear programming solver is installed. This package will install 'lpSolve' (<https://cran.r-project.org/package=lpSolve>) as a default solver, but it is also possible to use the package 'Rglpk' (<https://cran.r-project.org/package=Rglpk>), which requires the GNU linear programming kit (<https://www.gnu.org/software/glpk/glpk.html>), the package 'Rsymphony' (<https://cran.r-project.org/package=Rsymphony>), which requires the SYMPHONY ILP solver (<https://github.com/coin-or/SYMPHONY>), or the commercial solver Gurobi, which provides its own R package that is not available via CRAN (<https://www.gurobi.com/downloads/>). 'Rglpk', 'Rsymphony', 'gurobi' and their system dependencies have to be installed manually by the user because they are only suggested dependencies. Full access to the bicriterion anticlustering method proposed by Brusco et al. (2020) is given via the function bicriterion_anticlustering(), while kplus_anticlustering() implements the full functionality of the k-plus anticlustering approach proposed by Papenberg (2024). Some other functions are available to solve classical clustering problems. The function balanced_clustering() applies a cluster analysis under size constraints, i.e., creates equal-sized clusters. The function matching() can be used for (unrestricted, bipartite, or K-partite) matching. The function wce() can be used to optimally solve the (weighted) cluster editing problem, also known as correlation clustering, clique partitioning, or transitivity clustering.
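A brief hedged example of the main function follows; the built-in iris data and the objective shown (intra-cluster variance, one of the objectives described above) are illustrative choices, so see ?anticlustering for the full set of options.

```r
# Sketch: split the iris measurements into three similar, equal-sized groups.
library(anticlust)
groups <- anticlustering(iris[, 1:4], K = 3, objective = "variance")
table(groups)                      # three groups of 50 elements each
by(iris[, 1:4], groups, colMeans)  # group means should be nearly identical
```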
Create what we call elemental graphics for the display of ANOVA results. The term elemental derives from the fact that each function is aimed at construction of graphical displays that afford direct visualizations of data with respect to the fundamental questions that drive the particular ANOVA methods. This package represents a modification of the original granova package; the key change is to use 'ggplot2', Hadley Wickham's package based on Grammar of Graphics concepts (due to Wilkinson). The main function is granovagg.1w() (a graphic for one-way ANOVA); two other functions (granovagg.ds() and granovagg.contr()) construct graphics for dependent-sample and contrast-based analyses respectively. (The function granova.2w(), which entails dynamic displays of data, is not currently part of 'granovaGG'.) The 'granovaGG' functions are designed to display data for any number of groups, regardless of their sizes (however, very large data sets or numbers of groups can be problematic). For granovagg.1w() a specialized approach is used to construct data-based contrast vectors for which ANOVA data are displayed. The result is that the graphics use a straight line to facilitate clear interpretations while being faithful to the standard effect test in ANOVA. The graphic results are complementary to standard summary tables; indeed, numerical summary statistics are provided as side effects of the graphic constructions. granovagg.ds() and granovagg.contr() provide graphic displays and numerical outputs for dependent-sample and contrast-based analyses. The graphics based on these functions can be especially helpful for learning how the respective methods work to answer the basic question(s) that drive the analyses. This means they can be particularly helpful for students and non-statistician analysts. But these methods can also be of assistance for work-a-day applications of many kinds, as they can help to identify outliers, clusters or patterns, as well as highlight the role of non-linear transformations of data. In the case of granovagg.1w() and granovagg.ds() several arguments are provided to facilitate flexibility in the construction of graphics that accommodate diverse features of data, according to their corresponding display requirements. See the help files for individual functions.
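As a hedged example of the main function, a one-way graphic for the built-in InsectSprays data might look as follows; the response-plus-group call pattern is an assumption to check against ?granovagg.1w.

```r
# Sketch: elemental graphic for a one-way ANOVA of insect counts by spray type.
library(granovaGG)
granovagg.1w(InsectSprays$count, group = InsectSprays$spray)
```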
Mobile Motor Activity Research Consortium for Health (mMARCH) is a collaborative network of studies of clinical and community samples that employ common clinical, biological, and digital mobile measures across the involved studies. One of the main scientific goals of mMARCH sites is developing a better understanding of the inter-relationships between accelerometry-measured physical activity (PA), sleep (SL), and circadian rhythmicity (CR) and mental and physical health in children, adolescents, and adults. Currently, there is no consensus on a standard procedure for a data processing pipeline of raw accelerometry data, and few open-source tools to facilitate its development. The R package 'GGIR' is the most prominent open-source software package that offers great functionality and tremendous user flexibility to process raw accelerometry data. However, even with 'GGIR', processing done in a harmonized and reproducible fashion requires a non-trivial amount of expertise combined with a careful implementation. In addition, novel accelerometry-derived features of PA/SL/CR capturing multiscale, time-series, functional, distributional and other complementary aspects of accelerometry data are constantly being proposed and become available via non-GGIR R implementations. To address these issues, mMARCH developed a streamlined, harmonized and reproducible pipeline for loading and cleaning raw accelerometry data, extracting features available through 'GGIR' as well as through non-GGIR R packages, implementing several data and feature quality checks, merging all features of PA/SL/CR together, and performing multiple analyses including Joint and Individual Variation Explained (JIVE), an unsupervised machine learning dimension reduction technique that identifies latent factors capturing variation that is joint across, and individual to, each of the three domains of PA/SL/CR. In detail, the pipeline generates all necessary R/Rmd/shell files for data processing after running 'GGIR' on the accelerometer data. In module 1, all csv files in the GGIR output directory are read, transformed and then merged. In module 2, the GGIR output files are checked and summarized in one excel sheet. In module 3, the merged data are cleaned according to the number of valid hours on each night and the number of valid days for each subject. In module 4, the cleaned activity data are imputed by the average Euclidean norm minus one (ENMO) over all the valid days for each subject. Finally, a comprehensive report of data processing is created using Rmarkdown; the report includes a few exploratory plots and multiple commonly used features extracted from minute-level actigraphy data. Reference: Guo W, Leroux A, Shou S, Cui L, Kang S, Strippoli MP, Preisig M, Zipunnikov V, Merikangas K (2022). Processing of accelerometry data with GGIR in Motor Activity Research Consortium for Health (mMARCH). Journal for the Measurement of Physical Behaviour, 6(1): 37-44.
The geohabnet package is designed to perform a geographically or spatially explicit risk analysis of habitat connectivity. Xing et al. (2021) <doi:10.1093/biosci/biaa067> proposed the concept of cropland connectivity as a risk factor for plant pathogen or pest invasions. As the functions in geohabnet were initially developed with cropland connectivity in mind, users are encouraged to first become familiar with the concept by reading Xing et al. (2021). In a nutshell, a habitat connectivity analysis combines information from maps of host density, estimates the relative likelihood of pathogen movement between habitat locations in the area of interest, and applies network analysis to calculate the connectivity of habitat locations. The functions of geohabnet are built to conduct a habitat connectivity analysis relying on geographic parameters (spatial resolution and spatial extent), dispersal parameters (in two commonly used dispersal kernels: inverse power law and negative exponential models), and network parameters (link weight thresholds and network metrics). The functionality and main extensions provided by the functions in geohabnet to habitat connectivity analysis are: a) Capability to easily calculate the connectivity of locations in a landscape using a single function, such as sensitivity_analysis() or msean(). b) As backbone datasets, the geohabnet package supports the use of two publicly available global datasets to calculate cropland density. The backbone datasets in the geohabnet package include crop distribution maps from Monfreda, C., N. Ramankutty, and J. A. Foley (2008) <doi:10.1029/2007gb002947> "Farming the planet: 2. Geographic distribution of crop areas, yields, physiological types, and net primary production in the year 2000, Global Biogeochem. Cycles, 22, GB1022" and International Food Policy Research Institute (2019) <doi:10.7910/DVN/PRFF8V> "Global Spatially-Disaggregated Crop Production Statistics Data for 2010 Version 2.0, Harvard Dataverse, V4". Users can also provide any other geographic dataset that represents host density. c) Because the geohabnet package allows R users to provide maps of host density (as originally in Xing et al. (2021)), host landscape density (representing the geographic distribution of either crops or wild species), or habitat distribution (such as host landscape density adjusted by climate suitability) as inputs, we propose the term habitat connectivity. d) The geohabnet package allows R users to customize parameter values in the habitat connectivity analysis, facilitating context-specific (pathogen- or pest-specific) analyses. e) The geohabnet package allows users to automatically visualize maps of the habitat connectivity of locations resulting from a sensitivity analysis across all customized parameter combinations. The primary functions are msean() and sensitivity_analysis(). Most functions in geohabnet provide three main outcomes: i) a map of mean habitat connectivity across parameters selected by the user, ii) a map of variance of habitat connectivity across the selected parameters, and iii) a map of the difference between the ranks of habitat connectivity and habitat density. Each function can be used to generate these maps as final outcomes. Each function can also provide intermediate outcomes, such as the adjacency matrices built to perform the analysis, which can be used in other network analyses.
Refer to the article at <https://garrettlab.github.io/HabitatConnectivity/articles/analysis.html> for examples of each function and how to access each of these outcome types. To change parameter values, a file called parameters.yaml stores the parameters and their values; it can be accessed using get_parameters(), and new parameter values can be set with set_parameters(). Users can modify up to ten parameters.
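A hedged sketch of that workflow follows; the functions are those named above, but their exact arguments and return values are assumptions, so refer to the linked article and the package help pages.

```r
# Sketch of the parameter-driven workflow (arguments assumed, not verified).
library(geohabnet)
params <- get_parameters()   # access parameters.yaml (resolution, extent, kernels, metrics)
# ... edit the parameter values as needed for a pathogen- or pest-specific analysis ...
set_parameters(params)       # register the modified parameter values
sensitivity_analysis()       # produces the mean, variance and rank-difference maps
```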
The continuous wavelet transform enables the observation of transient/non-stationary cyclicity in time series. The goal of cyclostratigraphic studies is to define frequency/period in the depth/time domain. By conducting the continuous wavelet transform on cyclostratigraphic data series one can observe and extract cyclic signals/signatures from these signals. These results can then be visualized and interpreted, enabling one to identify/interpret cyclicity in the geological record, which can be used to construct astrochronological age-models and to identify and interpret cyclicity in past and present climate systems. The WaverideR R package builds upon existing literature and an existing codebase. The relevant articles can be grouped into four subjects: cyclostratigraphic data analysis, example data sets, the (continuous) wavelet transform, and astronomical solutions. References for the cyclostratigraphic data analysis articles are: Stephen Meyers (2019) <doi:10.1016/j.earscirev.2018.11.015>. Mingsong Li, Linda Hinnov, Lee Kump (2019) <doi:10.1016/j.cageo.2019.02.011>. Stephen Meyers (2012) <doi:10.1029/2012PA002307>. Mingsong Li, Lee R. Kump, Linda A. Hinnov, Michael E. Mann (2018) <doi:10.1016/j.epsl.2018.08.041>. Wouters, S., Crucifix, M., Sinnesael, M., Da Silva, A.C., Zeeden, C., Zivanovic, M., Boulvain, F., Devleeschouwer, X. (2022) <doi:10.1016/j.earscirev.2021.103894>. Wouters, S., Da Silva, A.-C., Boulvain, F., and Devleeschouwer, X. (2021) <doi:10.32614/RJ-2021-039>. Huang, Norden E., Zhaohua Wu, Steven R. Long, Kenneth C. Arnold, Xianyao Chen, and Karin Blank (2009) <doi:10.1142/S1793536909000096>. Cleveland, W. S. (1979) <doi:10.1080/01621459.1979.10481038>. Hurvich, C.M., Simonoff, J.S., and Tsai, C.L. (1998) <doi:10.1111/1467-9868.00125>. Golub, G., Heath, M. and Wahba, G. (1979) <doi:10.2307/1268518>. References for the example data articles are: Damien Pas, Linda Hinnov, James E. (Jed) Day, Kenneth Kodama, Matthias Sinnesael, Wei Liu (2018) <doi:10.1016/j.epsl.2018.02.010>. Steinhilber, Friedhelm, Abreu, Jacksiel, Beer, Juerg, Brunner, Irene, Christl, Marcus, Fischer, Hubertus, Heikkilä, U., Kubik, Peter, Mann, Mathias, Mccracken, K., Miller, Heinrich, Miyahara, Hiroko, Oerter, Hans, Wilhelms, Frank (2012) <doi:10.1073/pnas.1118965109>. Christian Zeeden, Frederik Hilgen, Thomas Westerhold, Lucas Lourens, Ursula Röhl, Torsten Bickert (2013) <doi:10.1016/j.palaeo.2012.11.009>. References for the (continuous) wavelet transform articles are: Morlet, Jean, Georges Arens, Eliane Fourgeau, and Dominique Giard (1982a) <doi:10.1190/1.1441328>. J. Morlet, G. Arens, E. Fourgeau, D. Giard (1982b) <doi:10.1190/1.1441329>. Torrence, C., and G. P. Compo (1998) <https://paos.colorado.edu/research/wavelets/bams_79_01_0061.pdf>. Gouhier TC, Grinsted A, Simko V (2021) <https://github.com/tgouhier/biwavelet>. Angi Roesch and Harald Schmidbauer (2018) <https://CRAN.R-project.org/package=WaveletComp>. Russell, Brian, and Jiajun Han (2016) <https://www.crewes.org/Documents/ResearchReports/2016/CRR201668.pdf>. Gabor, Dennis (1946) <http://genesis.eecg.toronto.edu/gabor1946.pdf>. J. Laskar, P. Robutel, F. Joutel, M. Gastineau, A.C.M. Correia, and B. Levrard (2004) <doi:10.1051/0004-6361:20041335>. Laskar, J., Fienga, A., Gastineau, M., Manche, H. (2011a) <doi:10.1051/0004-6361/201116836>. References for the astronomical solutions articles are: Laskar, J., Gastineau, M., Delisle, J.-B., Farres, A., Fienga, A. (2011b) <doi:10.1051/0004-6361/201117504>.
J. Laskar (2019) <doi:10.1016/B978-0-12-824360-2.00004-8>. Zeebe, Richard E. (2017) <doi:10.3847/1538-3881/aa8cce>. Zeebe, R. E. and Lourens, L. J. (2019) <doi:10.1126/science.aax0612>. Zeebe, R. E. and Lourens, L. J. (2022) <doi:10.1016/j.epsl.2022.117595>.
This package implements numerous methods for testing for, modelling, and correcting for heteroskedasticity in the classical linear regression model. The most novel contribution of the package is found in the functions that implement the as-yet-unpublished auxiliary linear variance models and auxiliary nonlinear variance models that are designed to estimate error variances in a heteroskedastic linear regression model. These models follow principles of statistical learning described in Hastie (2009) <doi:10.1007/978-0-387-21606-5>. The nonlinear version of the model is estimated using quasi-likelihood methods as described in Seber and Wild (2003, ISBN: 0-471-47135-6). Bootstrap methods for approximate confidence intervals for error variances are implemented as described in Efron and Tibshirani (1993, ISBN: 978-1-4899-4541-9), including also the expansion technique described in Hesterberg (2014) <doi:10.1080/00031305.2015.1089789>. The wild bootstrap employed here follows the description in Davidson and Flachaire (2008) <doi:10.1016/j.jeconom.2008.08.003>. Tuning of hyper-parameters makes use of a golden section search function that is modelled after the MATLAB function of Zarnowiec (2022) <https://www.mathworks.com/matlabcentral/fileexchange/25919-golden-section-method-algorithm>. A methodological description of the algorithm can be found in Fox (2021, ISBN: 978-1-003-00957-3). There are 25 different functions that implement hypothesis tests for heteroskedasticity. These include a test based on Anscombe (1961) <https://projecteuclid.org/euclid.bsmsp/1200512155>, Ramsey's (1969) BAMSET Test <doi:10.1111/j.2517-6161.1969.tb00796.x>, the tests of Bickel (1978) <doi:10.1214/aos/1176344124>, Breusch and Pagan (1979) <doi:10.2307/1911963> with and without the modification proposed by Koenker (1981) <doi:10.1016/0304-4076(81)90062-2>, Carapeto and Holt (2003) <doi:10.1080/0266476022000018475>, Cook and Weisberg (1983) <doi:10.1093/biomet/70.1.1> (including their graphical methods), Diblasi and Bowman (1997) <doi:10.1016/S0167-7152(96)00115-0>, Dufour, Khalaf, Bernard, and Genest (2004) <doi:10.1016/j.jeconom.2003.10.024>, Evans and King (1985) <doi:10.1016/0304-4076(85)90085-5> and Evans and King (1988) <doi:10.1016/0304-4076(88)90006-1>, Glejser (1969) <doi:10.1080/01621459.1969.10500976> as formulated by Mittelhammer, Judge and Miller (2000, ISBN: 0-521-62394-4), Godfrey and Orme (1999) <doi:10.1080/07474939908800438>, Goldfeld and Quandt (1965) <doi:10.1080/01621459.1965.10480811>, Harrison and McCabe (1979) <doi:10.1080/01621459.1979.10482544>, Harvey (1976) <doi:10.2307/1913974>, Honda (1989) <doi:10.1111/j.2517-6161.1989.tb01749.x>, Horn (1981) <doi:10.1080/03610928108828074>, Li and Yao (2019) <doi:10.1016/j.ecosta.2018.01.001> with and without the modification of Bai, Pan, and Yin (2016) <doi:10.1007/s11749-017-0575-x>, Rackauskas and Zuokas (2007) <doi:10.1007/s10986-007-0018-6>, Simonoff and Tsai (1994) <doi:10.2307/2986026> with and without the modification of Ferrari, Cysneiros, and Cribari-Neto (2004) <doi:10.1016/S0378-3758(03)00210-6>, Szroeter (1978) <doi:10.2307/1913831>, Verbyla (1993) <doi:10.1111/j.2517-6161.1993.tb01918.x>, White (1980) <doi:10.2307/1912934>, Wilcox and Keselman (2006) <doi:10.1080/10629360500107923>, Yuce (2008) <https://dergipark.org.tr/en/pub/iuekois/issue/8989/112070>, and Zhou, Song, and Thompson (2015) <doi:10.1002/cjs.11252>. 
Besides these heteroskedasticity tests, there are supporting functions that compute the BLUS residuals of Theil (1965) <doi:10.1080/01621459.1965.10480851>, the conditional two-sided p-values of Kulinskaya (2008) <doi:10.48550/arXiv.0810.2124>, and probabilities for the nonparametric trend statistic of Lehmann (1975, ISBN: 0-816-24996-1). For handling heteroskedasticity, in addition to the new auxiliary variance model methods, there is a function to implement various existing Heteroskedasticity-Consistent Covariance Matrix Estimators from the literature, such as those of White (1980) <doi:10.2307/1912934>, MacKinnon and White (1985) <doi:10.1016/0304-4076(85)90158-7>, Cribari-Neto (2004) <doi:10.1016/S0167-9473(02)00366-3>, Cribari-Neto et al. (2007) <doi:10.1080/03610920601126589>, Cribari-Neto and da Silva (2011) <doi:10.1007/s10182-010-0141-2>, Aftab and Chang (2016) <doi:10.18187/pjsor.v12i2.983>, and Li et al. (2017) <doi:10.1080/00949655.2016.1198906>.
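The auxiliary-regression principle behind several of these tests (e.g., Breusch-Pagan with Koenker's studentization) can be illustrated in a few lines of base R; this shows the underlying idea only, not this package's interface.

```r
# Concept sketch: regress squared OLS residuals on the regressors and use
# n * R-squared as an asymptotically chi-squared statistic (df = number of
# regressors in the auxiliary model, excluding the intercept).
set.seed(1)
x <- runif(200)
y <- 1 + 2 * x + rnorm(200, sd = 0.5 + x)    # error variance increases with x
fit <- lm(y ~ x)
aux <- lm(residuals(fit)^2 ~ x)              # auxiliary variance regression
lm_stat <- length(y) * summary(aux)$r.squared
pchisq(lm_stat, df = 1, lower.tail = FALSE)  # small p-value indicates heteroskedasticity
```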
Suite of tools for functional analysis.
Converts elements of roxygen documentation to 'markdown'.
Tinymist is an integrated language service for Typst.
Base S4-classes and functions for robust asymptotic statistics.