Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items on a single page. Pagination information (such as the number of pages) is returned in the response headers.
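As a minimal sketch of how the request above could be built from a script, the Python below assembles the query string and returns the response headers alongside the body. The base URL is a placeholder you must supply, the function names are illustrative, and the pagination header names are not listed here, so the sketch returns all headers rather than guessing at them.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, query, page=1, limit=20):
    """Build the /api/packages URL; base_url is a placeholder for the site's address."""
    params = urlencode({"search": query, "page": page, "limit": limit})
    return f"{base_url.rstrip('/')}/api/packages?{params}"


def search_packages(base_url, query, page=1, limit=20):
    """Send the GET request and return (raw body bytes, response headers).

    The header names that carry pagination details are an unknown here,
    so every header is returned for the caller to inspect.
    """
    with urlopen(build_search_url(base_url, query, page, limit)) as response:
        headers = dict(response.headers.items())
        body = response.read()
    return body, headers
```

Note that a versioned query such as gcc@10 is percent-encoded by urlencode, so the @ symbol is sent as %40.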
If you'd like to join our channel webring send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
This package provides a set of APIs which can be used at runtime to combine multiple CUDA objects into one CUDA fat binary (fatbin). The APIs accept inputs in multiple formats: device cubins, PTX, or LTO-IR. The output is a fatbin that can be loaded by cuModuleLoadData of the CUDA Driver API. The functionality in this library is similar to the fatbinary offline tool in the CUDA toolkit, with the following advantages:
Support for runtime fatbin creation.
Clients get fine-grained control over the input process.
Supports direct input from memory, rather than requiring inputs be written to files.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-matrix multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN.
CUTLASS decomposes these ``moving parts'' into reusable, modular software components abstracted by C++ template classes. Primitives for different levels of a conceptual parallelization hierarchy can be specialized and tuned via custom tiling sizes, data types, and other algorithmic policy. The resulting flexibility simplifies their use as building blocks within custom kernels and applications.
This package provides a GPU-accelerated library of primitives for deep neural networks, with highly tuned implementations for standard routines such as forward and backward convolution, attention, matmul, pooling, and normalization.
This package enables the creation of profiling and tracing tools that target CUDA applications and give insight into the CPU and GPU behavior of CUDA applications. It provides the following APIs:
the Activity API,
the Callback API,
the Event API,
the Metric API,
the Profiling API,
the PC Sampling API,
the Checkpoint API.
This package provides tooling to configure the NVSwitch memory fabrics to form one memory fabric among all participating GPUs, and monitors the NVLinks that support the fabric. See docs for more information.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides a high-level library based on the cuBLAS and cuSPARSE libraries. It consists of two modules corresponding to two sets of APIs: the cuSolver API on a single GPU, and the cuSolverMG API on a single-node multi-GPU system. Each of these can be used independently or in concert with other toolkit libraries. The intent of cuSolver is to provide useful LAPACK-like features, such as common matrix factorization and triangular solve routines for dense matrices, a sparse least-squares solver and an eigenvalue solver. In addition, cuSolver provides a new refactorization library useful for solving sequences of matrices with a shared sparsity pattern.
CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-matrix multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN.
CUTLASS decomposes these ``moving parts'' into reusable, modular software components abstracted by C++ template classes. Primitives for different levels of a conceptual parallelization hierarchy can be specialized and tuned via custom tiling sizes, data types, and other algorithmic policy. The resulting flexibility simplifies their use as building blocks within custom kernels and applications.
This package provides a set of APIs which can be used at runtime to link together GPU device code. It supports Link Time Optimization.
This package provides a minimal low-level profiling API for CUDA.
This package provides a binary that prunes host object files and libraries to only contain device code for the specified targets.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
OpenCL (Open Computing Language) is a multi-vendor open standard for general-purpose parallel programming of heterogeneous systems that include CPUs, GPUs and other processors. This package provides the API to use OpenCL on NVIDIA GPUs.
This package provides an interactive profiler for CUDA and NVIDIA OptiX that offers detailed performance metrics and API debugging via a user interface and command-line tool. Users can run guided analysis and compare results with a customizable and data-driven user interface, as well as post-process and analyze results in their own workflows.
This package decodes (demangles) low-level identifiers that have been mangled by CUDA C++ into user-readable names. For every input alphanumeric word, the output of cu++filt is either the demangled name, if the word decodes to a CUDA C++ name, or the original word itself.
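The word-by-word behaviour described above can be illustrated with a toy filter in Python. This is only a sketch of the pass-through rule, not the tool's implementation: the lookup table stands in for a real demangler, the word pattern is an assumption, and the single entry uses the standard Itanium C++ ABI mangling of a function foo taking no arguments.

```python
import re

# Hypothetical stand-in for a real demangler: one known mangled name.
KNOWN_NAMES = {"_Z3foov": "foo()"}

# Assumed definition of a "word": letters, digits, and underscores.
WORD = re.compile(r"[A-Za-z0-9_]+")


def filt(text, demangle=KNOWN_NAMES.get):
    """Replace each word that demangles; leave every other word unchanged."""
    def replace(match):
        word = match.group(0)
        decoded = demangle(word)
        return decoded if decoded is not None else word

    return WORD.sub(replace, text)
```

Calling filt("call _Z3foov then bar") rewrites only the mangled word and passes the rest of the line through untouched.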
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides Python low-level bindings for NVIDIA CUDA toolkit.
This package provides a command-line tool to profile CUDA kernels. It enables the collection of a timeline of CUDA-related activities on both CPU and GPU, including kernel execution, memory transfers, memory sets, and CUDA API calls, as well as events or metrics for CUDA kernels.
This package provides the CUDA Deep Neural Network library.
This package provides the CUDA compiler and the CUDA run-time support libraries for NVIDIA GPUs, all of which are proprietary.
This package provides CUDA C++ developers with building blocks that make it easier to write safe and efficient code. It unifies three essential former CUDA C++ libraries into a single repository:
Thrust (former repo)
CUB (former repo)
libcudacxx (former repo)