This package provides methods for working with nominal dates, times, and durations. Base R has sophisticated facilities for handling time, but these can give unexpected results if, for example, timezone is not handled properly. This package provides a more casual approach to support cases which do not require rigorous treatment. It systematically deconstructs the concepts origin and timezone, and de-emphasizes the display of seconds. It also converts among nominal durations such as seconds, hours, days, and weeks. See ?datetime and ?duration for examples. Adapted from metrumrg <http://r-forge.r-project.org/R/?group_id=1215>.
This package provides a flexible container to transport and manipulate complex sets of data. These data may consist of multiple data files and associated meta data and ancillary files. Individual data objects have associated system level meta data, and data files are linked together using the OAI-ORE standard resource map which describes the relationships between the files. The OAI-ORE standard is described at <https://www.openarchives.org/ore/>. Data packages can be serialized and transported as structured files that have been created following the BagIt specification. The BagIt specification is described at <https://tools.ietf.org/html/draft-kunze-bagit-08>.
Dates are commonly represented in many different formats: the order of day, month, and year can differ, different separators ("-", "/", or whitespace) can be used, months can be given as numbers, names, or abbreviations, and years can be given as two digits or four. datefixR takes dates in all these different formats and converts them to R's built-in date class. If datefixR cannot standardize a date, for example because it is too malformed, then the user is told which date cannot be standardized and the corresponding ID for the row. datefixR also allows the imputation of missing days and months with user-controlled behavior.
Efficiently and flexibly preprocess data using a set of data filtering, deletion, and interpolation tools. These data preprocessing methods are developed based on the principles of completeness, accuracy, threshold method, and linear interpolation and through the setting of constraint conditions, time completion & recovery, and fast & efficient calculation and grouping. Key preprocessing steps include deletion of variables and observations, outlier removal, and interpolation of missing values (NA), which depend on the degree of incompleteness and dispersion of the raw data. Compared with ordinary methods, they clean data more accurately, keep more samples, and add no outliers after interpolation. Auto-identification of consecutive NA via run-length based grouping is used in observation deletion, outlier removal, and NA interpolation; thus, new outliers are not generated in interpolation. A conditional extremum is proposed to realize point-by-point weighted outlier removal that saves non-outliers from being removed. In addition, time series interpolation that uses reference values within short periods further ensures reliable interpolation. These methods are based on and improved from the reference: Liang, C.-S., Wu, H., Li, H.-Y., Zhang, Q., Li, Z. & He, K.-B. (2020) <doi:10.1016/j.scitotenv.2020.140923>.
For working with the DataRobot predictive modeling platform's API <https://www.datarobot.com/>.
This package provides RStudio addins and R functions that make copy-pasting vectors and tables to text painless.
This package provides a thin wrapper around the Datorama API. Ideal for analyzing marketing data from <https://datorama.com>.
An interactive editor built on rhandsontable to allow the interactive viewing, entering, filtering and editing of data in R <https://dillonhammill.github.io/DataEditR/>.
Utilities for mixed frequency data. In particular, use to aggregate and normalize tabular mixed frequency data, index dates to end of period, and seasonally adjust tabular data.
Convert any date type to the ISO standard. The package recognizes dates in a given data frame and transforms them to ISO format. Only one date format can be applied within one data frame column.
This package provides access to Dataverse APIs <https://dataverse.org/> (versions 4-5), enabling data search, retrieval, and deposit. For Dataverse versions <= 3.0, use the archived dvn package <https://cran.r-project.org/package=dvn>.
Create tree structures from hierarchical data, and traverse the tree in various orders. Aggregate, cumulate, print, plot, convert to and from data.frame and more. This is useful for decision trees, machine learning, finance, conversion from and to JSON, and many other applications.
Quality control and formatting tools developed for the Copernicus Data Rescue Service. The package includes functions to handle the Station Exchange Format (SEF), various statistical tests for climate data at daily and sub-daily resolution, as well as functions to plot the data. For more information and documentation see <https://datarescue.climate.copernicus.eu/st_data-quality-control>.
This package provides an interactive viewer for data.frame and tibble objects using shiny <https://shiny.posit.co/> and DT <https://rstudio.github.io/DT/>. It supports complex filtering, column selection, and automatic generation of reproducible dplyr <https://dplyr.tidyverse.org/> code for data manipulation. The package is designed for ease of use in data exploration and reporting workflows.
The goal of dataspice is to make it easier for researchers to create basic, lightweight, and concise metadata files for their datasets. These basic files can then be used to make useful information available during analysis, create a helpful dataset "README" webpage, and produce more complex metadata formats to aid dataset discovery. Metadata fields are based on the Schema.org and Ecological Metadata Language standards.
This package provides functions to pipe data from R to DataGraph, a graphing and analysis application for macOS. Create a live connection using either .dtable or .dtbin files that can be read by DataGraph. It can save a data frame, a collection of data frames, sequences of data frames, and individual vectors. For more information see <https://community.visualdatatools.com/datagraph/knowledge-base/r-package/>.
Includes functions that researchers or practitioners may use to clean raw data and to convert html, xlsx, and txt data files into other formats. It can also be used to manipulate text variables, extract numeric variables from text variables, and perform other variable cleaning processes. It originated from an author's project which focuses on creative performance in an online education environment. The resulting paper of that study will be published soon.
Data quality assessments guided by a data quality framework introduced by Schmidt and colleagues, 2021 <doi:10.1186/s12874-021-01252-7> target the data quality dimensions integrity, completeness, consistency, and accuracy. The scope of applicable functions rests on the availability of extensive metadata which can be provided in spreadsheet tables. Either standardized (e.g. as html5 reports) or individually tailored reports can be generated. For an introduction into the specification of corresponding metadata, please refer to the package website <https://dataquality.qihs.uni-greifswald.de/VIN_Annotation_of_Metadata.html>.
Download and import time series from <http://www.dataseries.org>, a comprehensive and up-to-date collection of open data from Switzerland.
This package provides functions to import multiple files of multiple data file types (.xlsx, .xls, .csv, .txt) from a given directory into R data frames.
This is a lightweight package to easily manipulate, clean, transform, and prepare your data for analysis. It also forms the data wrangling backend for the packages in the easystats ecosystem.
This package provides a convenient API interface to access immunological data within the CAVD DataSpace (<https://dataspace.cavd.org>), a data sharing and discovery tool that facilitates exploration of HIV immunological data from pre-clinical and clinical HIV vaccine studies.
The R package data.table is an extension of data.frame providing functions for fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group, column listing and fast file reading.
Flexible and efficient cleaning of data with interactivity. datacleanr facilitates best practices in data analyses and reproducibility with built-in features and by translating interactive/manual operations to code. The package is designed for interoperability, and so seamlessly fits into reproducible analysis pipelines in R.