Toys / Webring for GNU Guix channels

r-udpipe 0.8.16

Propagated dependencies: r-rcpp@1.1.1-1.1 r-matrix@1.7-5 r-data-table@1.18.4

Channel: guix-cran

Location: guix-cran/packages/u.scm (guix-cran packages u)

Home page: https://bnosac.github.io/udpipe/en/index.html

Licenses: FSDG-compatible

Build system: r

Synopsis: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

Description:

This natural language processing toolkit provides language-agnostic tokenization, parts of speech tagging, lemmatization and dependency parsing of raw text. Next to text parsing, the package also allows you to train annotation models based on data of treebanks in CoNLL-U format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser, namely functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.

r-udpipe 0.8.11

Propagated dependencies: r-data-table@1.18.4 r-matrix@1.7-5 r-rcpp@1.1.1-1.1

Channel: guix-science

Location: guix-science/packages/cran.scm (guix-science packages cran)

Home page: https://bnosac.github.io/udpipe/en/index.html

Licenses: MPL 2.0

Build system: r

Synopsis: R bindings for UDPipe NLP toolkit

Description:

This natural language processing toolkit provides language-agnostic tokenization, parts of speech tagging, lemmatization and dependency parsing of raw text. Next to text parsing, the package also allows you to train annotation models based on data of treebanks in CoNLL-U format as provided at https://universaldependencies.org/format.html. The techniques are explained in detail in the paper 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at doi:10.18653/v1/K17-3009. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser, namely functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.

Total packages: 2