Enter the query into the form above. You can look for specific version of a package by using @ symbol like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is a page number and limit is a number of items on a single page. Pagination information (such as a number of pages and etc) is returned
in response headers.
If you'd like to join our channel webring send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources.
RAJA offers portable, parallel loop execution by providing building blocks that extend the generally-accepted parallel for idiom. RAJA relies on standard C++14 features.
CHAI is a C++ libary providing an array object that can be used transparently in multiple memory spaces. Data is automatically migrated based on copy-construction, allowing for correct data access regardless of location. CHAI can be used standalone, but is best when paired with the RAJA library, which has built-in CHAI integration that takes care of everything.
Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources.
Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources.
Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources.
Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources.
CAMP collects a variety of macros and metaprogramming facilities for C++ projects. It's in the direction of projects like metal (a major influence) but with a focus on wide compiler compatibility across HPC-oriented systems.
Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources.
This package provides several cubic spline interpolation methods of H. Akima for irregular and regular gridded data are available through this package, both for the bivariate case and univariate case. Linear interpolation of irregular gridded data is also covered. A bilinear interpolator for regular grids was also added for comparison with the bicubic interpolator on regular grids.
The rfacts package is an R interface to the Fixed and Adaptive Clinical Trial Simulator FACTS. It programmatically invokes FACTS to run clinical trial simulations. It aggregates simulation output data into tidy data frames. These capabilities provide end-to-end automation for large-scale simulation pipelines, and they enhance computational reproducibility.
This package provides a C++ header-only library that wraps the NVIDIA CUDA Deep Neural Network library (cuDNN) C backend API. This entry point to the same API is less verbose (without loss of control), and adds functionality on top of the backend API, such as errata filters and autotuning.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides the CUDA C++ developers with building blocks that make it easier to write safe and efficient code. It unifies three essential former CUDA C++ libraries into a single repository:
Thrust (former repo)
CUB (former repo)
libcudacxx (former repo)
The NVIDIA Management Library Headers (NVML) is a C-based API for monitoring and managing various states of the NVIDIA GPU devices. It provides a direct access to the queries and commands exposed via nvidia-smi.
This package enables the creation of profiling and tracing tools that target CUDA applications and give insight into the CPU and GPU behavior of CUDA applications. It provides the following APIs:
the Activity API,
the Callback API,
the Event API,
the Metric API,
the Profiling API,
the PC Sampling API,
the Checkpoint API.
This package provides the CUDA Deep Neural Network library.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
NCCL (pronounced "Nickel") is a stand-alone library of standard communication routines for NVIDIA GPUs, implementing all-reduce, all-gather, reduce, broadcast, reduce-scatter, as well as any send/receive based communication pattern. It has been optimized to achieve high bandwidth on platforms using PCIe, NVLink, NVswitch, as well as networking using InfiniBand Verbs or TCP/IP sockets. NCCL supports an arbitrary number of GPUs installed in a single node or across multiple nodes, and can be used in either single- or multi-process (e.g., MPI) applications.
CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-matrix multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN.
CUTLASS decomposes these ``moving parts'' into reusable, modular software components abstracted by C++ template classes. Primitives for different levels of a conceptual parallelization hierarchy can be specialized and tuned via custom tiling sizes, data types, and other algorithmic policy. The resulting flexibility simplifies their use as building blocks within custom kernels and applications.
This package provides a set of GPU-accelerated basic linear algebra subroutines used for handling sparse matrices that perform significantly faster than CPU-only alternatives. Depending on the specific operation, the library targets matrices with sparsity ratios in the range between 70%-99.9%.
This package provides a library of functions for performing CUDA accelerated 2D image and signal processing.
The primary library focuses on image processing and is widely applicable for developers in these areas. NPP will evolve over time to encompass more of the compute heavy tasks in a variety of problem domains. The NPP library is written to maximize flexibility, while maintaining high performance.
This package provides cuFFT, the NVIDIA® CUDA® Fast Fourier Transform (FFT) product. It consists of two separate libraries: cuFFT and cuFFTW. The cuFFT library is designed to provide high performance on NVIDIA GPUs. The cuFFTW library is provided as a porting tool to enable users of FFTW to start using NVIDIA GPUs with a minimum amount of effort.
The FFT is a divide-and-conquer algorithm for efficiently computing discrete Fourier transforms of complex or real-valued data sets. It is one of the most important and widely used numerical algorithms in computational physics and general signal processing. The cuFFT library provides a simple interface for computing FFTs on an NVIDIA GPU, which allows users to quickly leverage the floating-point power and parallelism of the GPU in a highly optimized and tested FFT library. The cuFFTW library provides the FFTW3 API to facilitate porting of existing FFTW applications.
This package provides a set of APIs which can be used at runtime to combine multiple CUDA objects into one CUDA fat binary (fatbin). The APIs accept inputs in multiple formats, either device cubins, PTX, or LTO-IR. The output is a fatbin that can be loaded by cuModuleLoadData of the CUDA Driver API. The functionality in this library is similar to the fatbinary offline tool in the CUDA toolkit, with the following advantages:
Support for runtime fatbin creation.
The clients get fine grain control over the input process.
Supports direct input from memory, rather than requiring inputs be written to files.