GPU Developer's Guide

Copyright©️ 2025 CryptoLab, Inc. All rights reserved.



CudaTools

1.1 Basic Information

  • Namespace: HEaaN::CudaTools

  • Purpose: Provides basic utility functions for CUDA device management and monitoring

1.2 API Reference

1.2.1 Device Availability Check

bool isAvailable()
  • Function: Checks the availability of CUDA devices

  • Return Value: true if a CUDA device is available, false otherwise

  • Usage Example:

#include <HEaaN/HEaaN.hpp>
...
if(!HEaaN::CudaTools::isAvailable()) {
    return -1;
}

1.2.2 Device Synchronization

void cudaDeviceSynchronize()
  • Function: Waits until all CUDA device operations are completed

  • When to Use: Call it when the host must wait for GPU work to finish (e.g., before timing code or reading results); it blocks the host thread until all previously launched CUDA kernels have completed execution.

1.2.3 Device Information Management

int cudaGetDevice()
int cudaDeviceCount()
void cudaSetDevice(int device_id)
  • Functions:

    • cudaGetDevice() : Returns the ID of the currently active CUDA device

    • cudaDeviceCount() : Returns the number of available CUDA devices in the system

    • cudaSetDevice(int device_id) : Sets the active CUDA device

  • Usage Example:

int device_count = HEaaN::CudaTools::cudaDeviceCount();
if (device_count > 0) {
    HEaaN::CudaTools::cudaSetDevice(0); // Use the first GPU
}
  • Note: Currently gpu-run allocates a single GPU to run your binary. Using a device_id other than 0 will lead to a runtime error.

1.2.4 Profiling Tools

void nvtxPush(const char *msg)
void nvtxPop()
  • Function: Sets NVTX markers for NVIDIA profilers (e.g., Nsight Systems)

  • Purpose: Performance profiling of CUDA code

  • Usage Example:

HEaaN::CudaTools::nvtxPush("CUDA function start");
// ... perform cuda operation ...
HEaaN::CudaTools::nvtxPop();

1.2.5 Memory Information

std::pair<u64, u64> getCudaMemoryInfo()
  • Function: Retrieves memory information of the current CUDA device

  • Return Value: {free_memory, total_memory} in bytes

  • Usage Example:

auto [free, total] = HEaaN::CudaTools::getCudaMemoryInfo();



For detailed information, please refer to the official documentation of NVIDIA CUDA Runtime API.
