Query for enriched data such as country, region, city, latitude & longitude, ZIP code, time zone, Autonomous System, Internet Service Provider, domain, net speed, International direct dialing (IDD) code, area code, weather station data, mobile data, elevation, usage type, address type, advertisement category, fraud score, and proxy data with an IP address. You can also query a list of hosted domain names for the IP address. This package uses the IP2Location.io API to query this data. To get started with a free API key, sign up here <https://www.ip2location.io/sign-up?ref=1>.
Computes the optimal number of regions (or subdivisions) and their positions in serial structures without a priori assumptions, and visualizes the results. After reducing data dimensionality with the built-in function for data ordination, regions are fitted as segmented linear regressions along the serial structure. Models with an increasing number of regions and every possible boundary position are fitted iteratively, and the best model (number of regions and boundary positions) is selected with an information criterion. This package expands on the previous regions package (Jones et al., Science 2018) with improved computation and more fitting and plotting options.
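The fitting idea described above can be sketched in a few lines of Python. This is only an illustration, not the package's implementation: the function names are hypothetical, only one- and two-region models are compared, and the Gaussian form of BIC is an assumption, since the description says only that "an information criterion" is used.

```python
import math

def fit_line_rss(xs, ys):
    """Ordinary least squares for y = a + b*x; returns the residual sum of squares."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def best_split(xs, ys, min_size=3):
    """Try every boundary position for a two-region model; return (rss, boundary index)."""
    best = (math.inf, None)
    for k in range(min_size, len(xs) - min_size + 1):
        rss = fit_line_rss(xs[:k], ys[:k]) + fit_line_rss(xs[k:], ys[k:])
        if rss < best[0]:
            best = (rss, k)
    return best

def bic(rss, n, n_params):
    """Gaussian BIC up to an additive constant (assumed criterion, see lead-in)."""
    rss = max(rss, 1e-12)  # guard against log(0) on perfect fits
    return n * math.log(rss / n) + n_params * math.log(n)

def choose_regions(xs, ys):
    """Return (number of regions, boundary index or None) with the lower BIC."""
    n = len(xs)
    bic_one = bic(fit_line_rss(xs, ys), n, 2)
    rss_two, k = best_split(xs, ys)
    bic_two = bic(rss_two, n, 5)  # two lines (4 coefficients) plus 1 boundary
    return (1, None) if bic_one <= bic_two else (2, k)

# Toy serial structure: flat for the first 10 positions, then rising.
positions = list(range(20))
noise = [0.01 * ((i * 7) % 3 - 1) for i in positions]
values = [(1.0 if i < 10 else 5.0 + 2.0 * (i - 10)) + noise[i] for i in positions]
print(choose_regions(positions, values))
```

Extending the sketch to more regions would mean searching over several boundaries at once, which is where the iterative fitting mentioned in the description comes in.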
This package provides a tool to analyse ActiGraph accelerometer data and to implement the use of the PROactive Physical Activity in COPD (chronic obstructive pulmonary disease) instruments. Once analysis is completed, the app allows results to be exported to .csv files and a report of the measurement to be generated. All the configured inputs relevant for interpreting the results are recorded in the report. In addition to the existing R packages that are fully integrated with the app, the app uses some functions from the actigraph.sleepr package developed by Petkova (2021) <https://github.com/dipetkov/actigraph.sleepr/>.
These are useful tools and data sets for the study of quantitative peace science. The goal for this package is to include tools and data sets for doing original research that closely mimic what a user previously had to get from a software package that may not be well-sourced or well-supported. Those software bundles were useful to the extent that they encouraged replications of long-standing analyses by starting the data-generating process from scratch. However, a lot of the functionality can be done relatively quickly and more transparently in the R programming language.
This package provides functions to test for a treatment effect in terms of the difference in survival between a treatment group and a control group using surrogate marker information obtained at some early time point in a time-to-event outcome setting. Nonparametric kernel estimation is used to estimate the test statistic and perturbation resampling is used for variance estimation. More details are available in: Parast L, Cai T, Tian L (2019) "Using a Surrogate Marker for Early Testing of a Treatment Effect" Biometrics, 75(4):1253-1263. <doi:10.1111/biom.13067>.
This package provides a framework for developing n-gram models for text prediction. It provides data cleaning, data sampling, extracting tokens from text, model generation, model evaluation and word prediction. For information on how n-gram models work we referred to: "Speech and Language Processing" <https://web.archive.org/web/20240919222934/https%3A%2F%2Fweb.stanford.edu%2F~jurafsky%2Fslp3%2F3.pdf>. For optimizing R code and using R6 classes we referred to "Advanced R" <https://adv-r.hadley.nz/r6.html>. For writing R extensions we referred to "R Packages", <https://r-pkgs.org/index.html>.
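As a rough illustration of what an n-gram model for word prediction does, here is a minimal bigram (n = 2) sketch in Python. It is not the package's API: the function names are hypothetical, tokenization is a plain whitespace split, and no smoothing or back-off is applied.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    tokens = text.lower().split()
    model = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        model[prev][nxt] += 1
    return model

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

model = train_bigrams("the cat sat on the mat the cat ran")
print(predict_next(model, "the"))
```

A full framework would add the cleaning, sampling and evaluation steps listed in the description; the counting step above is the core of model generation.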
This package provides a tool to calculate Cardiovascular Risk Scores in large data frames. Cardiovascular risk scores are statistical tools used to assess an individual's likelihood of developing a cardiovascular disease based on various risk factors, such as age, gender, blood pressure, cholesterol levels, and smoking. Here we bring together the six most commonly used in the emergency department. Using RiskScorescvd, you can calculate all the risk scores in an extended dataset in seconds. PCE (ASCVD) described in Goff, et al (2013) <doi:10.1161/01.cir.0000437741.48606.98>. EDACS described in Mark DG, et al (2016) <doi:10.1016/j.jacc.2017.11.064>. GRACE described in Fox KA, et al (2006) <doi:10.1136/bmj.38985.646481.55>. HEART is described in Mahler SA, et al (2017) <doi:10.1016/j.clinbiochem.2017.01.003>. SCORE2/OP described in SCORE2 working group and ESC Cardiovascular risk collaboration (2021) <doi:10.1093/eurheartj/ehab309>. TIMI described in Antman EM, et al (2000) <doi:10.1001/jama.284.7.835>. SCORE2-Diabetes described in SCORE2-Diabetes working group and ESC Cardiovascular risk collaboration (2023) <doi:10.1093/eurheartj/ehab260>. SCORE2/OP with CKD add-on described in Kunihiro M et al (2022) <doi:10.1093/eurjpc/zwac176>.
Selected utilities, in particular geoms and stats functions, extending the ggplot2 package. This package imports functions from EnvStats <doi:10.1007/978-1-4614-8456-1> by Millard (2013), ggpp <https://CRAN.R-project.org/package=ggpp> by Aphalo et al. (2023) and ggstats <doi:10.5281/zenodo.10183964> by Larmarange (2023), and then exports them. This package also contains modified code from ggquickeda <https://CRAN.R-project.org/package=ggquickeda> by Mouksassi et al. (2023) for Kaplan-Meier lines and ticks additions to plots. All functions are tested to make sure that they work reliably.
Uses least squares optimisation to estimate the parameters of the best-fitting JohnsonSU distribution for a given dataset, with the possibility of the distributions corresponding to the limiting cases of the JohnsonSU distribution. The code for the Golden Section Search used in the optimisation has been adapted from E. Cai. This package has been created as an extension of my Master's thesis. E. Cai (2013, "Scripts and Functions: Using R to Implement the Golden Section Search Method for Numerical Optimization", <https://chemicalstatistician.wordpress.com/2013/04/22/using-r-to-implement-the-golden-bisection-method/>).
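For readers unfamiliar with the Golden Section Search mentioned above, a generic textbook version might look like the following Python sketch. It is not the code adapted from E. Cai, and the function name, tolerance and test function are illustrative assumptions; the method requires the objective to be unimodal on the search interval.

```python
import math

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)  # lower interior point
    d = a + inv_phi * (b - a)  # upper interior point
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2

# Example: a one-dimensional least squares objective with its minimum at x = 2.
x_min = golden_section_minimize(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
print(round(x_min, 4))
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation is needed per step.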
Meta-analyses can be compromised by studies' internal biases (e.g., confounding in nonrandomized studies) as well as by publication bias. This package conducts sensitivity analyses for the joint effects of these biases (per Mathur (2022) <doi:10.31219/osf.io/u7vcb>). These sensitivity analyses address two questions: (1) For a given severity of internal bias across studies and of publication bias, how much could the results change?; and (2) For a given severity of publication bias, how severe would internal bias have to be, hypothetically, to attenuate the results to the null or by a given amount?
Analytical methods to locate and characterise ecotones, ecosystems and environmental patchiness along ecological gradients. Methods are implemented for isolated sampling or for space/time series. It includes Detrended Correspondence Analysis (Hill & Gauch (1980) <doi:10.1007/BF00048870>), fuzzy clustering (De Cáceres et al. (2010) <doi:10.1080/01621459.1963.10500845>), biodiversity indices (Jost (2006) <doi:10.1111/j.2006.0030-1299.14714.x>), and network analyses (Epskamp et al. (2012) <doi:10.18637/jss.v048.i04>) - as well as tools to explore the number of clusters in the data. Functions to produce synthetic ecological datasets are also provided.
Fits mixed Poisson regression models (Poisson-Inverse Gaussian or Negative-Binomial) on data sets with response variables being count data. The models can have varying precision parameter, where a linear regression structure (through a link function) is assumed to hold on the precision parameter. The Expectation-Maximization algorithm for both these models (Poisson Inverse Gaussian and Negative Binomial) is an important contribution of this package. Another important feature of this package is the set of functions to perform global and local influence analysis. See Barreto-Souza and Simas (2016) <doi:10.1007/s11222-015-9601-6> for further details.
Implementation of analytical models for estimating streamflow depletion due to groundwater pumping, and other related tools. Functions are broadly split into two groups: (1) analytical streamflow depletion models, which estimate streamflow depletion for a single stream reach resulting from groundwater pumping; and (2) depletion apportionment equations, which distribute estimated streamflow depletion among multiple stream reaches within a stream network. See Zipper et al. (2018) <doi:10.1029/2018WR022707> for more information on depletion apportionment equations and Zipper et al. (2019) <doi:10.1029/2018WR024403> for more information on analytical depletion functions, which combine analytical models and depletion apportionment equations.
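To make the two groups of functions more concrete, the sketch below pairs one classic analytical model with one simple apportionment rule. Both choices are assumptions made for illustration: the Glover and Balmer (1954) solution is a well-known analytical depletion model, and inverse-distance weighting is one common way to apportion depletion, but the description does not say which models or function names the package uses. All names and parameter values here are hypothetical.

```python
import math

def glover_depletion_fraction(t, d, S, T):
    """Fraction of the pumping rate supplied by stream depletion at time t.

    Glover and Balmer (1954) form: Qf = erfc(sqrt(S * d^2 / (4 * T * t)))
    t: time since pumping started, d: well-to-stream distance,
    S: storage coefficient (dimensionless), T: transmissivity (length^2 / time).
    """
    return math.erfc(math.sqrt(S * d ** 2 / (4.0 * T * t)))

def apportion_inverse_distance(distances, exponent=2.0):
    """Split depletion among stream reaches in proportion to 1 / distance^exponent."""
    weights = [1.0 / dist ** exponent for dist in distances]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical well: reaches at 100 m and 200 m, S = 0.1, T = 100 m^2/day, t = 10 days.
fractions = apportion_inverse_distance([100.0, 200.0])
depletion = [
    f * glover_depletion_fraction(t=10.0, d=dist, S=0.1, T=100.0)
    for f, dist in zip(fractions, [100.0, 200.0])
]
print(fractions, depletion)
```

Multiplying the apportionment fraction of each reach by an analytical model evaluated for that reach is the general idea behind combining the two groups into an analytical depletion function.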
Facilitates basic and equation-based analyses of some important soil properties related to soil chemical environment and nutrient availability to plants. Freundlich H (1907). <doi:10.1515/zpch-1907-5723>. Datta SP, Bhadoria PBS (1999). <doi:10.1002%2F%28SICI%291522-2624%28199903%29162%3A2%3C183%3A%3AAID-JPLN183%3E3.0.CO%3B2-A> "Boron adsorption and desorption in some acid soils of West Bengal, India". Langmuir I (1918). <doi:10.1021/ja02242a004> "The adsorption of gases on plane surfaces of glass, mica, and platinum". Khasawneh FE (1971). <doi:10.2136/sssaj1971.03615995003500030029x> "Solution ion activity and plant growth".
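The Freundlich and Langmuir references above correspond to two standard adsorption isotherm equations, which can be sketched in Python as follows. This is an illustration of the equations only, not of the package's functions: the names, the log-log fitting approach and the sample values are assumptions.

```python
import math

def freundlich(c, kf, n):
    """Freundlich isotherm: sorbed amount q = Kf * C^(1/n)."""
    return kf * c ** (1.0 / n)

def langmuir(c, q_max, kl):
    """Langmuir isotherm: sorbed amount q = q_max * KL * C / (1 + KL * C)."""
    return q_max * kl * c / (1.0 + kl * c)

def fit_freundlich(concs, sorbed):
    """Estimate (Kf, n) by linear regression of log10(q) on log10(C)."""
    xs = [math.log10(c) for c in concs]
    ys = [math.log10(q) for q in sorbed]
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return 10 ** intercept, 1.0 / slope  # slope = 1/n, intercept = log10(Kf)

# Synthetic equilibrium data generated from Kf = 2, n = 2 (illustrative values).
concs = [1.0, 2.0, 4.0, 8.0]
sorbed = [freundlich(c, kf=2.0, n=2.0) for c in concs]
print(fit_freundlich(concs, sorbed))
```

The Langmuir form levels off at `q_max` as the solution concentration grows, while the Freundlich form keeps increasing, which is the main practical difference between the two.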
This package provides tools for simulating spatially dependent predictors (continuous or binary), which are used to generate scalar outcomes in a (generalized) linear model framework. Continuous predictors are generated using traditional multivariate normal distributions or Gauss Markov random fields with several correlation function approaches (e.g., see Rue (2001) <doi:10.1111/1467-9868.00288> and Furrer and Sain (2010) <doi:10.18637/jss.v036.i10>), while binary predictors are generated using a Boolean model (see Cressie and Wikle (2011, ISBN: 978-0-471-69274-4)). Parameter vectors exhibiting spatial clustering can also be easily specified by the user.
This package provides a set of fast and convenient functions to help conduct accessibility analyses. Given a pre-computed travel cost matrix and a land use dataset (containing the location of jobs, healthcare and population, for example), the package allows one to calculate accessibility levels and accessibility poverty and inequality. The package covers the majority of the most commonly used accessibility measures (such as cumulative opportunities, gravity-based and floating catchment areas methods), as well as the most frequently used inequality and poverty metrics (such as the Palma ratio, the concentration and Theil indices and the FGT family of measures).
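Two of the measures named above, cumulative opportunities and a gravity-based measure, can be sketched from a travel cost matrix and a land use table as follows. This is a Python illustration rather than the package's interface: the function names, the dictionary-based data layout, the negative exponential decay function and the numbers are all assumptions.

```python
import math

def cumulative_opportunities(costs, opportunities, origin, cutoff):
    """Sum the opportunities reachable from `origin` within `cutoff` travel cost."""
    return sum(
        opportunities[dest]
        for (orig, dest), cost in costs.items()
        if orig == origin and cost <= cutoff
    )

def gravity_access(costs, opportunities, origin, beta):
    """Gravity-based access: opportunities discounted by exp(-beta * cost)."""
    return sum(
        opportunities[dest] * math.exp(-beta * cost)
        for (orig, dest), cost in costs.items()
        if orig == origin
    )

# Hypothetical travel times in minutes between zones, and jobs per zone.
costs = {
    ("A", "A"): 5, ("A", "B"): 20, ("A", "C"): 45,
    ("B", "A"): 20, ("B", "B"): 5, ("B", "C"): 30,
}
jobs = {"A": 100, "B": 300, "C": 500}

print(cumulative_opportunities(costs, jobs, "A", cutoff=30))
print(gravity_access(costs, jobs, "A", beta=0.05))
```

The cumulative measure applies a hard threshold, whereas the gravity-based measure lets distant opportunities count for progressively less; inequality metrics such as the Palma ratio would then be computed over the resulting accessibility levels.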
Extends the base classes and methods of the caret package to integrate base learners. The user can specify the number of different base learners and the final learner, along with the train-validation-test data partition split ratio. The predictions on unseen new data are the result of ensemble meta-learning <https://machinelearningmastery.com/stacking-ensemble-machine-learning-with-python/> over the heterogeneous learners, which aims to reduce the generalization error of the predictive models. It significantly lowers the barrier for practitioners to apply heterogeneous ensemble learning techniques to their everyday predictive problems without specialist expertise.
Perform Nonlinear Mixed-Effects (NLME) Modeling using Certara's NLME-Engine. Access the same Maximum Likelihood engines used in the Phoenix platform, including algorithms for parametric methods, individual, and pooled data analysis <https://www.certara.com/app/uploads/2020/06/BR_PhoenixNLME-v4.pdf>. The Quasi-Random Parametric Expectation-Maximization Method (QRPEM) is also supported <https://www.page-meeting.org/default.asp?abstract=2338>. Execution is supported both locally and on remote machines. Remote execution includes support for Linux Sun Grid Engine (SGE), Terascale Open-source Resource and Queue Manager (TORQUE) grids, Linux and Windows multicore, and individual runs.
Computation of a cubic B-spline basis for arbitrary knots. It also provides the 1st and 2nd derivatives, as well as the integral of the basis elements. It is used by the author to fit penalized B-spline models, see e.g. Jullion, A. and Lambert, P. (2006) <doi:10.1016/j.csda.2006.09.027>, Lambert, P. and Eilers, P.H.C. (2009) <doi:10.1016/j.csda.2008.11.022> and, more recently, Lambert, P. (2021) <doi:10.1016/j.csda.2021.107250>. It is inspired by the algorithm developed by de Boor, C. (1977) <doi:10.1137/0714026>.
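For orientation, B-spline basis functions on arbitrary knots can be evaluated with the Cox-de Boor recursion, sketched below in Python. This is the textbook recursion, not necessarily the algorithm this package implements, and the function name and knot vectors are illustrative assumptions; derivatives and integrals of the basis are not covered.

```python
def bspline_basis(i, k, knots, x):
    """Value at x of the i-th B-spline of degree k on `knots` (Cox-de Boor recursion)."""
    if k == 0:
        # Degree 0: indicator of the half-open knot interval [t_i, t_{i+1}).
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + k] - knots[i]
    if d1 > 0:  # skip terms with zero-width support (repeated knots)
        left = (x - knots[i]) / d1 * bspline_basis(i, k - 1, knots, x)
    d2 = knots[i + k + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + k + 1] - x) / d2 * bspline_basis(i + 1, k - 1, knots, x)
    return left + right

# Cubic basis on a clamped knot vector: 10 knots give 10 - 4 = 6 basis functions.
knots = [0, 0, 0, 0, 1, 2, 3, 3, 3, 3]
row = [bspline_basis(i, 3, knots, 1.5) for i in range(len(knots) - 4)]
print(row, sum(row))
```

Inside the knot range the basis functions are non-negative and sum to one, which is a quick sanity check for any implementation.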
An R interface to version 0.3 of the ROPTLIB optimization library (see <https://www.math.fsu.edu/~whuang2/> for more information). Optimize real-valued functions over manifolds such as Stiefel, Grassmann, and Symmetric Positive Definite matrices. For details see Martin et al. (2020) <doi:10.18637/jss.v093.i01>. Note that the optional ldr package used in some of this package's examples can be obtained from either JSS <https://www.jstatsoft.org/index.php/jss/article/view/v061i03/2886> or from the CRAN archives <https://cran.r-project.org/src/contrib/Archive/ldr/ldr_1.3.3.tar.gz>.
When analyzing data, plots are a helpful tool for visualizing data and interpreting statistical models. This package provides a set of simple tools for building plots incrementally, starting with an empty plot region, and adding bars, data points, regression lines, error bars, gradient legends, density distributions in the margins, and even pictures. The package builds further on R graphics by simply combining functions and settings in order to reduce the amount of code the user has to write. As a result, the package does not use formula input or special syntax, but can be used in combination with default R plot functions.
Spike and slab regression with a variety of residual error distributions corresponding to Gaussian, Student T, probit, logit, SVM, and a few others. Spike and slab regression is Bayesian regression with prior distributions containing a point mass at zero. The posterior updates the amount of mass on this point, leading to a posterior distribution that is actually sparse, in the sense that if you sample from it many coefficients are actually zeros. Sampling from this posterior distribution is an elegant way to handle Bayesian variable selection and model averaging. See <DOI:10.1504/IJMMNO.2014.059942> for an explanation of the Gaussian case.
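The "point mass at zero" idea can be illustrated by drawing coefficients from a spike-and-slab prior, as in the Python sketch below. It shows the prior only, not the posterior sampler the package provides; the function name, the Gaussian slab, and the inclusion probability of 0.2 are assumptions chosen for illustration.

```python
import random

def sample_spike_slab(n_coef, inclusion_prob, slab_sd, rng):
    """Draw coefficients that are exactly 0 (spike) or Gaussian (slab)."""
    return [
        rng.gauss(0.0, slab_sd) if rng.random() < inclusion_prob else 0.0
        for _ in range(n_coef)
    ]

rng = random.Random(42)  # fixed seed so the draw is reproducible
draw = sample_spike_slab(10_000, inclusion_prob=0.2, slab_sd=1.0, rng=rng)
zero_share = sum(1 for beta in draw if beta == 0.0) / len(draw)
print(f"share of exact zeros: {zero_share:.3f}")
```

Because excluded coefficients are exactly zero rather than merely small, each draw is sparse, which is what makes sampling from the corresponding posterior usable for variable selection and model averaging.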
NEON data packages can be accessed through the NEON Data Portal <https://www.neonscience.org> or through the NEON Data API (see <https://data.neonscience.org/data-api> for documentation). Data delivered from the Data Portal are provided as monthly zip files packaged within a parent zip file, while individual files can be accessed from the API. This package provides tools that aid in discovering, downloading, and reformatting data prior to use in analyses. This includes downloading data via the API, merging data tables by type, and converting formats. For more information, see the readme file at <https://github.com/NEONScience/NEON-utilities>.
Gain seamless access to origin-destination (OD) data from the Spanish Ministry of Transport, hosted at <https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad>. This package simplifies the management of these large datasets by providing tools to download zone boundaries, handle associated origin-destination data, and process it efficiently with the duckdb database interface. Local caching minimizes repeated downloads, streamlining workflows for researchers and analysts. Extensive documentation is available at <https://ropenspain.github.io/spanishoddata/index.html>, offering guides on creating static and dynamic mobility flow visualizations and transforming large datasets into analysis-ready formats.