CuVec#
Unifying Python/C++/CUDA memory: Python buffered array ↔ C++11 `std::vector` ↔ CUDA managed memory.
Why#
Data should be manipulated using the existing functionality and design paradigms of each programming language. Python code should be Pythonic. CUDA code should be... CUDActic? C code should be... er, Clean.
However, in practice converting between data formats across languages can be a pain.
Other libraries which expose functionality to convert/pass data formats between these different language spaces tend to be bloated, unnecessarily complex, and relatively unmaintainable. By comparison, `cuvec` uses the latest functionality of Python, C/C++11, and CUDA to keep its code (and yours) as succinct as possible. "Native" containers are exposed so your code follows the conventions of your language. Want something which works like a `numpy.ndarray`? Not a problem. Want to convert it to a `std::vector`? Or perhaps a raw `float *` to use in a CUDA kernel? Trivial.
- Less boilerplate code (fewer bugs, easier debugging, and faster prototyping)
- Fewer memory copies (faster execution)
- Lower memory usage (do more with less hardware)
Non objectives#
Anything to do with mathematical functionality. The aim is to expose functionality, not (re)create it.
Even something as simple as setting element values is left to the user and/or pre-existing features - for example:
- Python: `arr[:] = value`
- NumPy: `arr.fill(value)`
- CuPy: `cupy.asarray(arr).fill(value)`
- C++: `std::fill(vec.begin(), vec.end(), value)`
- C & CUDA: `memset(vec.data(), value, sizeof(T) * vec.size())`
Install#
pip install cuvec
Requirements:

- Python 3.7 or greater (e.g. via Anaconda or Miniconda, or via `python3-dev`)
- (optional) CUDA SDK/Toolkit (including drivers for an NVIDIA GPU)
    - note that if the CUDA SDK/Toolkit is installed after CuVec, then CuVec must be re-installed to enable CUDA support
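For example, one way to force the re-install (a sketch using standard `pip` flags):

pip install --force-reinstall --no-cache-dir cuvec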
Usage#
Creating#
CPython API:
import cuvec.cpython as cuvec
arr = cuvec.zeros((1337, 42), "float32") # like `numpy.ndarray`
# print(sum(arr))
# some_numpy_func(arr)
# some_cpython_api_func(arr.cuvec)
# some_pybind11_func(arr.cuvec)
# import cupy; cupy_arr = cupy.asarray(arr)

pybind11 API:
import cuvec.pybind11 as cuvec
arr = cuvec.zeros((1337, 42), "float32") # like `numpy.ndarray`
# print(sum(arr))
# some_numpy_func(arr)
# some_cpython_api_func(arr.cuvec)
# some_pybind11_func(arr.cuvec)
# import cupy; cupy_arr = cupy.asarray(arr)

SWIG API:
import cuvec.swig as cuvec
arr = cuvec.zeros((1337, 42), "float32") # like `numpy.ndarray`
# print(sum(arr))
# some_numpy_func(arr)
# some_cpython_api_func(arr.cuvec)
# some_pybind11_func(arr.cuvec)
# import cupy; cupy_arr = cupy.asarray(arr)
#include "Python.h"
#include "cuvec_cpython.cuh"
PyObject *obj = (PyObject *)PyCuVec_zeros<float>({1337, 42});
// don't forget to Py_DECREF(obj) if not returning it.
/// N.B.: convenience functions provided by "cuvec_cpython.cuh":
// PyCuVec<T> *PyCuVec_zeros(std::vector<Py_ssize_t> shape);
// PyCuVec<T> *PyCuVec_zeros_like(PyCuVec<T> *other);
// PyCuVec<T> *PyCuVec_deepcopy(PyCuVec<T> *other);
/// returns `NULL` if `self is None`, or
/// `getattr(self, 'cuvec', self)` otherwise:
// PyCuVec<T> *asPyCuVec(PyObject *self);
// PyCuVec<T> *asPyCuVec(PyCuVec<T> *self);
/// conversion functions for `PyArg_Parse*()`
/// e.g.: `PyArg_ParseTuple(args, "O&", &asPyCuVec_f, &obj)`:
// int asPyCuVec_b(PyObject *o, PyCuVec<signed char> **self);
// int asPyCuVec_B(PyObject *o, PyCuVec<unsigned char> **self);
// int asPyCuVec_c(PyObject *o, PyCuVec<char> **self);
// int asPyCuVec_h(PyObject *o, PyCuVec<short> **self);
// int asPyCuVec_H(PyObject *o, PyCuVec<unsigned short> **self);
// int asPyCuVec_i(PyObject *o, PyCuVec<int> **self);
// int asPyCuVec_I(PyObject *o, PyCuVec<unsigned int> **self);
// int asPyCuVec_q(PyObject *o, PyCuVec<long long> **self);
// int asPyCuVec_Q(PyObject *o, PyCuVec<unsigned long long> **self);
// int asPyCuVec_e(PyObject *o, PyCuVec<__half> **self);
// int asPyCuVec_f(PyObject *o, PyCuVec<float> **self);
// int asPyCuVec_d(PyObject *o, PyCuVec<double> **self);
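For example, the helpers above can be combined into a module function; a minimal sketch (the name `myfunc` is hypothetical and error handling is elided):

/// a hedged sketch of a CPython API function accepting & returning CuVecs
static PyObject *myfunc(PyObject *self, PyObject *args) {
  PyCuVec<float> *src = NULL;
  if (!PyArg_ParseTuple(args, "O&", &asPyCuVec_f, &src)) return NULL;
  PyCuVec<float> *dst = PyCuVec_zeros_like(src); // new reference
  if (!dst) return NULL;
  // ... process src->vec into dst->vec (e.g. launch a CUDA kernel) ...
  return (PyObject *)dst;
}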
#include "cuvec.cuh"
NDCuVec<float> ndv({1337, 42});
#include "cuvec.cuh"
NDCuVec<float> *ndv = new NDCuVec<float>({1337, 42});
#include "cuvec.cuh"
CuVec<float> vec(1337 * 42); // like std::vector<float>
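Since `CuVec` behaves like `std::vector`, standard algorithms work directly on the underlying managed memory; a minimal sketch:

#include "cuvec.cuh"
#include <algorithm>

int main() {
  CuVec<float> vec(1337 * 42);             // CUDA managed memory
  std::fill(vec.begin(), vec.end(), 1.5f); // host-side fill, no explicit copies
  return 0;
}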
Converting#
The following involve no memory copies.
CPython API:
# import cuvec.cpython as cuvec, my_custom_lib
# arr = cuvec.zeros((1337, 42), "float32")
my_custom_lib.some_cpython_api_func(arr)
import cuvec.cpython as cuvec, my_custom_lib
arr = cuvec.asarray(my_custom_lib.some_cpython_api_func())
/// input: `PyObject *obj` (obtained from e.g.: `PyArg_Parse*()`, etc)
/// output: `CuVec<type> vec`, `std::vector<Py_ssize_t> shape`
CuVec<float> &vec = ((PyCuVec<float> *)obj)->vec; // like std::vector<float>
std::vector<Py_ssize_t> &shape = ((PyCuVec<float> *)obj)->shape;

pybind11 API:
# import cuvec.pybind11 as cuvec, my_custom_lib
# arr = cuvec.zeros((1337, 42), "float32")
my_custom_lib.some_pybind11_api_func(arr.cuvec)
import cuvec.pybind11 as cuvec, my_custom_lib
arr = cuvec.asarray(my_custom_lib.some_pybind11_api_func())
/// input: `NDCuVec<type> *ndv`
/// output: `CuVec<type> vec`, `std::vector<size_t> shape`
CuVec<float> &vec = ndv->vec; // like std::vector<float>
std::vector<size_t> &shape = ndv->shape;

SWIG API:
# import cuvec.swig as cuvec, my_custom_lib
# arr = cuvec.zeros((1337, 42), "float32")
my_custom_lib.some_swig_api_func(arr.cuvec)
import cuvec.swig as cuvec, my_custom_lib
arr = cuvec.retarray(my_custom_lib.some_swig_api_func())
/// input: `NDCuVec<type> *ndv`
/// output: `CuVec<type> vec`, `std::vector<size_t> shape`
CuVec<float> &vec = ndv->vec; // like std::vector<float>
std::vector<size_t> &shape = ndv->shape;

C++ & CUDA:
/// input: `CuVec<type> vec`
/// output: `type *arr`
float *arr = vec.data(); // pointer to `cudaMallocManaged()` data
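Since the pointer refers to CUDA managed (unified) memory, it can be passed straight to a kernel; a minimal sketch (the kernel `d_scale` is hypothetical):

__global__ void d_scale(float *arr, size_t N, float k) {
  size_t i = threadIdx.x + blockDim.x * (size_t)blockIdx.x;
  if (i < N) arr[i] *= k;
}
// ...
d_scale<<<(vec.size() + 255) / 256, 256>>>(vec.data(), vec.size(), 2.0f);
cudaDeviceSynchronize(); // managed memory: sync before the host reads results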
Examples#
Here's a before and after comparison of a Python ↔ CUDA interface.
(Side-by-side Python, C++, and CUDA listings; see the full example sources linked below.)
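To give a flavour of the "after" style, here is a hedged sketch following the Usage section above (`my_custom_lib` and `increment2d` are hypothetical names):

import cuvec.cpython as cuvec, my_custom_lib
src = cuvec.zeros((1337, 42), "float32")             # CUDA managed memory
dst = cuvec.asarray(my_custom_lib.increment2d(src))  # wrap the returned CuVec, no copies
assert dst.shape == src.shape                        # works like `numpy.ndarray`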
For a full reference, see:

- cuvec.example_cpython source: example_cpython.cu
- cuvec.example_pybind11 source: example_pybind11.cu
- cuvec.example_swig sources: example_swig.i & example_swig.cu
See also NumCu, a minimal stand-alone Python package built using CuVec.
External Projects#
Python objects (`arr`, returned by `cuvec.zeros()`, `cuvec.asarray()`, or `cuvec.copy()`) contain all the attributes of a `numpy.ndarray`. Additionally, `arr.cuvec` implements the buffer protocol, while `arr.__cuda_array_interface__` (and `arr.__array_interface__`) provide compatibility with other libraries such as Numba, CuPy, PyTorch, PyArrow, and RAPIDS.

When using the SWIG alternative module, `arr.cuvec` is a wrapper around `NDCuVec<type> *`.
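For instance, a minimal interoperability sketch (hedged; it assumes CuPy and NumPy are installed):

import cuvec.cpython as cuvec
import cupy, numpy

arr = cuvec.zeros((1337, 42), "float32")
d = cupy.asarray(arr)   # zero-copy device view via `__cuda_array_interface__`
h = numpy.asarray(arr)  # zero-copy host view (ndarray attributes/buffer protocol)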
`cuvec` is a header-only library so simply do one of:
#include "cuvec_cpython.cuh" // CPython API
#include "cuvec.cuh" // C++/CUDA API
You can find the location of the headers using:
python -c "import cuvec; print(cuvec.include_path)"
For reference, see:

- cuvec.example_cpython source: example_cpython.cu
- cuvec.example_pybind11 source: example_pybind11.cu
`cuvec` is a header-only library so simply `%include "cuvec.i"` in a SWIG interface file. You can find the location of the headers using:
python -c "import cuvec; print(cuvec.include_path)"
For reference, see `cuvec.example_swig`'s sources: example_swig.i and example_swig.cu.
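A minimal interface-file sketch (heavily hedged: module & function names are hypothetical, and the real typemap/template plumbing is shown in the bundled example_swig.i):

// mymod.i
%module mymod
%include "cuvec.i"
%{
NDCuVec<float> *myfunc(NDCuVec<float> *src); // implemented in your .cu sources
%}
NDCuVec<float> *myfunc(NDCuVec<float> *src);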
This is likely unnecessary (see the "C++ & CUDA" examples above for simpler `#include` instructions).
The raw C++/CUDA libraries may be included in external projects using `cmake`. Simply build the project and use `find_package(AMYPADcuvec)`.
# print installation directory (after `pip install cuvec`)...
python -c "import cuvec; print(cuvec.cmake_prefix)"
# ... or build & install directly with cmake
cmake -S cuvec -B build && cmake --build build
cmake --install build --prefix /my/install/dir
At this point any external project may include `cuvec` as follows (after setting `-DCMAKE_PREFIX_PATH=<installation prefix from above>`):
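A minimal `CMakeLists.txt` sketch (hedged: project/target names are hypothetical, the imported target is assumed to be `AMYPAD::cuvec` per the package name above, and CMake ≥ 3.8 is assumed for first-class CUDA language support):

cmake_minimum_required(VERSION 3.8 FATAL_ERROR)
project(myproj LANGUAGES CXX CUDA)
find_package(AMYPADcuvec COMPONENTS cuvec REQUIRED)
add_library(myext SHARED myext.cu)                 # your extension sources
target_link_libraries(myext PRIVATE AMYPAD::cuvec)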