Features
- Blazing Fast Performance: Sorts about 1 billion (2^30) 32-bit key/value pairs in 0.17 seconds (5.88 GKeys/s) on an NVIDIA RTX 4090. This is slightly slower than the CUDA implementation, but there is still room for optimization.
- Open Source: Completely free to use and modify under the MIT license.
- SYCL Compatibility: Leverages SYCL for cross-platform compatibility and future-proofing.
- Efficient Resource Usage: Optimized to make the most of GPU resources without requiring proprietary CUDA libraries.
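
For reference, the GKeys/s figure above is simply the number of keys divided by the elapsed time. The snippet below is a minimal sketch of that arithmetic; `gkeys_per_sec` is a hypothetical helper written for this README and is not part of the project. It assumes a POSIX shell with `awk` available. The quoted 5.88 GKeys/s corresponds to 10^9 keys in 0.17 s; exactly 2^30 keys in the same time would come out slightly higher.

```shell
# Throughput in GKeys/s = number of keys / elapsed seconds / 1e9.
# gkeys_per_sec is a hypothetical helper for illustration only.
gkeys_per_sec() {
    awk -v keys="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", keys / secs / 1e9 }'
}

gkeys_per_sec 1000000000 0.17   # 10^9 keys -> prints 5.88
gkeys_per_sec 1073741824 0.17   # 2^30 keys -> prints 6.32
```

The same helper can be used to compare your own timings: pass the key count set in main.cpp and the measured sort time in seconds.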
Installation
Clone the repository and build it with CMake:
git clone https://github.com/M-Gjerde/SYCLOneSweep
cd SYCLOneSweep
mkdir build
cd build
cmake ..
make
Usage
Run the built executable:
./onesweep_sycl
To change the number of keys or other settings, edit main.cpp and rebuild.