Enter your query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items on a single page. Pagination information (such as the number of pages) is returned
in response headers.
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-matrix multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN.
CUTLASS decomposes these "moving parts" into reusable, modular software components abstracted by C++ template classes. Primitives for different levels of a conceptual parallelization hierarchy can be specialized and tuned via custom tiling sizes, data types, and other algorithmic policies. The resulting flexibility simplifies their use as building blocks within custom kernels and applications.
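For illustration, here is a minimal sketch of a single-precision device-level GEMM using CUTLASS's device::Gemm template (the function name, the column-major layouts, and the default tiling policy are assumptions for this example, not part of the package's documentation):

#include "cutlass/gemm/device/gemm.h"

// Single-precision GEMM: C = alpha * A * B + beta * C, all operands column-major.
using Gemm = cutlass::gemm::device::Gemm<
    float, cutlass::layout::ColumnMajor,   // ElementA, LayoutA
    float, cutlass::layout::ColumnMajor,   // ElementB, LayoutB
    float, cutlass::layout::ColumnMajor>;  // ElementC, LayoutC

cutlass::Status run_sgemm(int M, int N, int K, float alpha,
                          float const *A, int lda, float const *B, int ldb,
                          float beta, float *C, int ldc) {
  Gemm gemm_op;
  Gemm::Arguments args({M, N, K},           // problem size
                       {A, lda}, {B, ldb},  // tensor references for A and B
                       {C, ldc}, {C, ldc},  // source C and destination D
                       {alpha, beta});      // epilogue scalars
  return gemm_op(args);                     // launches the GEMM kernel
}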
This package provides a cross-platform API for annotating source code with contextual information for developer tools.
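A rough sketch of how an NVTX range annotation is used (the header path varies between toolkit versions, and process_batch is a placeholder name):

#include <nvtx3/nvToolsExt.h>   // NVTX v3 header; older setups use <nvToolsExt.h> and link -lnvToolsExt

void process_batch() {
  nvtxRangePushA("process_batch");  // open a named range, visible in Nsight timelines
  // ... the work being annotated ...
  nvtxRangePop();                   // close the innermost open range
}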
This package provides the CUDA Deep Neural Network library.
This package provides the CUDA Deep Neural Network library.
NCCL (pronounced "Nickel") is a stand-alone library of standard communication routines for NVIDIA GPUs, implementing all-reduce, all-gather, reduce, broadcast, reduce-scatter, as well as any send/receive based communication pattern. It has been optimized to achieve high bandwidth on platforms using PCIe, NVLink, NVswitch, as well as networking using InfiniBand Verbs or TCP/IP sockets. NCCL supports an arbitrary number of GPUs installed in a single node or across multiple nodes, and can be used in either single- or multi-process (e.g., MPI) applications.
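A hedged sketch of the single-process, multi-GPU all-reduce pattern from the NCCL documentation (per-device buffers and streams are assumed to be set up elsewhere, and error checking is omitted):

#include <nccl.h>
#include <cuda_runtime.h>
#include <vector>

// Sum-reduce `count` floats across all visible GPUs in one process.
void allreduce_sum(int ngpus, float **sendbuf, float **recvbuf, size_t count,
                   cudaStream_t *streams) {
  std::vector<ncclComm_t> comms(ngpus);
  ncclCommInitAll(comms.data(), ngpus, nullptr);  // nullptr: use devices 0..ngpus-1

  ncclGroupStart();                               // group the per-GPU calls
  for (int i = 0; i < ngpus; ++i)
    ncclAllReduce(sendbuf[i], recvbuf[i], count, ncclFloat, ncclSum,
                  comms[i], streams[i]);
  ncclGroupEnd();

  for (int i = 0; i < ngpus; ++i) {               // wait for completion, then clean up
    cudaSetDevice(i);
    cudaStreamSynchronize(streams[i]);
    ncclCommDestroy(comms[i]);
  }
}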
This package decodes (demangles) low-level identifiers that have been mangled by CUDA C++, turning them into user-readable names. For every input alphanumeric word, the output of cu++filt is either the demangled name if the name decodes to a CUDA C++ name, or the original name itself.
This package provides a GPU-accelerated library of primitives for deep neural networks, with highly tuned implementations for standard routines such as forward and backward convolution, attention, matmul, pooling, and normalization.
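As an example, applying a ReLU activation with cuDNN might look roughly like this (a sketch assuming a float NCHW tensor already resident on the device; every cudnnStatus_t should be checked in real code):

#include <cudnn.h>

void relu_forward(const float *d_in, float *d_out, int n, int c, int h, int w) {
  cudnnHandle_t handle;
  cudnnCreate(&handle);

  cudnnTensorDescriptor_t desc;                   // same shape for input and output
  cudnnCreateTensorDescriptor(&desc);
  cudnnSetTensor4dDescriptor(desc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT, n, c, h, w);

  cudnnActivationDescriptor_t act;
  cudnnCreateActivationDescriptor(&act);
  cudnnSetActivationDescriptor(act, CUDNN_ACTIVATION_RELU, CUDNN_NOT_PROPAGATE_NAN, 0.0);

  const float alpha = 1.0f, beta = 0.0f;          // y = alpha * relu(x) + beta * y
  cudnnActivationForward(handle, act, &alpha, desc, d_in, &beta, desc, d_out);

  cudnnDestroyActivationDescriptor(act);
  cudnnDestroyTensorDescriptor(desc);
  cudnnDestroy(handle);
}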
This package accepts CUDA C++ source code in character string form and creates handles that can be used to obtain the CUDA PTX, for further instrumentation with the CUDA Toolkit. It helps shrink compilation overhead and simplify application deployment.
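A minimal sketch of that NVRTC flow (the target-architecture option and names are assumptions; real code should check each nvrtcResult and fetch the program log on failure):

#include <nvrtc.h>
#include <string>

// Compile a CUDA C++ source string to PTX at run time.
std::string compile_to_ptx(const char *source) {
  nvrtcProgram prog;
  nvrtcCreateProgram(&prog, source, "kernel.cu", 0, nullptr, nullptr);

  const char *opts[] = {"--gpu-architecture=compute_70"};  // assumed target
  nvrtcCompileProgram(prog, 1, opts);

  size_t ptx_size = 0;
  nvrtcGetPTXSize(prog, &ptx_size);
  std::string ptx(ptx_size, '\0');
  nvrtcGetPTX(prog, &ptx[0]);

  nvrtcDestroyProgram(&prog);
  return ptx;  // can be passed to cuModuleLoadData or linked further
}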
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides headers for the NVIDIA Management Library (NVML), a C-based API for monitoring and managing various states of NVIDIA GPU devices. It provides direct access to the queries and commands exposed via nvidia-smi.
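A small sketch of an NVML query, roughly mirroring what nvidia-smi reports (error handling omitted):

#include <nvml.h>
#include <cstdio>

int main() {
  nvmlInit_v2();

  nvmlDevice_t dev;
  nvmlDeviceGetHandleByIndex_v2(0, &dev);          // first GPU

  unsigned int temp = 0;
  nvmlDeviceGetTemperature(dev, NVML_TEMPERATURE_GPU, &temp);
  std::printf("GPU 0 temperature: %u C\n", temp);

  nvmlShutdown();
  return 0;
}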
This package provides a set of APIs which can be used at runtime to link together GPU device code. It supports Link Time Optimization.
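A hedged sketch of linking PTX into a cubin with the nvJitLink API introduced in CUDA 12 (the -arch value and module name are assumptions; error checking is omitted):

#include <nvJitLink.h>
#include <vector>

std::vector<char> link_ptx(const void *ptx, size_t ptx_size) {
  nvJitLinkHandle handle;
  const char *options[] = {"-arch=sm_80"};        // assumed target architecture
  nvJitLinkCreate(&handle, 1, options);

  nvJitLinkAddData(handle, NVJITLINK_INPUT_PTX, ptx, ptx_size, "module");
  nvJitLinkComplete(handle);                      // perform the link

  size_t cubin_size = 0;
  nvJitLinkGetLinkedCubinSize(handle, &cubin_size);
  std::vector<char> cubin(cubin_size);
  nvJitLinkGetLinkedCubin(handle, cubin.data());

  nvJitLinkDestroy(&handle);
  return cubin;                                   // loadable with cuModuleLoadData
}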
This package provides a set of APIs which can be used at runtime to combine multiple CUDA objects into one CUDA fat binary (fatbin). The APIs accept inputs in multiple formats: device cubins, PTX, or LTO-IR. The output is a fatbin that can be loaded by cuModuleLoadData of the CUDA Driver API. The functionality in this library is similar to the fatbinary offline tool in the CUDA toolkit, with the following advantages:
Support for runtime fatbin creation.
Fine-grained client control over the input process.
Support for direct input from memory, rather than requiring inputs to be written to files.
This package provides cuFFT, the NVIDIA® CUDA® Fast Fourier Transform (FFT) product. It consists of two separate libraries: cuFFT and cuFFTW. The cuFFT library is designed to provide high performance on NVIDIA GPUs. The cuFFTW library is provided as a porting tool to enable users of FFTW to start using NVIDIA GPUs with a minimum amount of effort.
The FFT is a divide-and-conquer algorithm for efficiently computing discrete Fourier transforms of complex or real-valued data sets. It is one of the most important and widely used numerical algorithms in computational physics and general signal processing. The cuFFT library provides a simple interface for computing FFTs on an NVIDIA GPU, which allows users to quickly leverage the floating-point power and parallelism of the GPU in a highly optimized and tested FFT library. The cuFFTW library provides the FFTW3 API to facilitate porting of existing FFTW applications.
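For illustration, a minimal in-place 1-D complex-to-complex forward transform with cuFFT (a sketch; the single-batch, in-place layout is an assumption and error checking is omitted):

#include <cufft.h>
#include <cuda_runtime.h>

void fft_forward(cufftComplex *d_data, int nx) {
  cufftHandle plan;
  cufftPlan1d(&plan, nx, CUFFT_C2C, /*batch=*/1);

  // In-place forward transform; CUFFT_INVERSE would run the inverse transform.
  cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);

  cudaDeviceSynchronize();
  cufftDestroy(plan);
}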
This package provides a minimal low-level profiling API for CUDA.
This package provides CUDA C++ developers with building blocks that make it easier to write safe and efficient code. It unifies three formerly separate CUDA C++ libraries into a single repository (see the Thrust sketch after this list):
Thrust
CUB
libcudacxx
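A minimal Thrust sketch (one of the three bundled libraries) computing a parallel sum on the device:

#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/reduce.h>

int main() {
  thrust::device_vector<int> v(1000);
  thrust::sequence(v.begin(), v.end(), 1);          // fill with 1, 2, ..., 1000 on the GPU
  int sum = thrust::reduce(v.begin(), v.end(), 0);  // parallel reduction on the device
  return sum == 500500 ? 0 : 1;                     // expected: 1000 * 1001 / 2
}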
This binary extracts information from standalone cubin files and presents it in human-readable format. The output of nvdisasm includes CUDA assembly code for each kernel, a listing of ELF data sections, and other CUDA-specific sections. Output style and options are controlled through nvdisasm command-line options. nvdisasm also performs control-flow analysis to annotate jump/branch targets and make the output easier to read.
This package provides a <bits/floatn.h> header to override that of glibc and disable float128 support. This is required to allow the use of nvcc with CUDA 8.0 and glibc 2.26+. Otherwise, nvcc fails like this:
/gnu/store/…-glibc-2.26.105-g0890d5379c/include/bits/floatn.h(61): error: invalid argument to attribute "__mode__"
/gnu/store/…-glibc-2.26.105-g0890d5379c/include/bits/floatn.h(73): error: identifier "__float128" is undefined
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides the CUDA Direct Sparse Solver library.