OpenCL Platform Devices Information Using C++ and CMake

In OpenCL, a device is made up of multiple compute units (CUs), and each CU is divided into processing elements (PEs). The host submits kernels to the device by enqueuing commands, and the device executes them on its processing elements, which perform the actual computations.
GPUs are optimized to run converged tasks, that is, tasks in which all the processing elements execute the same sequence of statements. GPU memory access must also be optimized to avoid memory bank conflicts, which force the processing elements to access memory sequentially and have a large performance impact on the program. If the processing elements access memory in a coalesced fashion, the data can be retrieved in a single transaction.
The host program, which usually runs on a general-purpose CPU, queries the available platforms and retrieves the devices of each platform. It then reads the kernels and submits them to the devices. Buffers are required to transfer data from the host to the device global memory and to retrieve the results once the computations are done.
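
To make the effect of coalesced access more concrete, here is a minimal sketch of a toy model, not a description of any real GPU. It assumes that memory is fetched in fixed-size segments and that each distinct segment touched by a group of work-items costs one transaction. The function name count_segments and the 64-byte segment size used in the usage note are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <set>

// Toy model: work-item i reads element (i * stride) of a float array.
// Returns how many distinct segments of `segment_bytes` bytes are touched,
// treating each distinct segment as one memory transaction.
std::size_t count_segments(std::size_t work_items, std::size_t stride,
                           std::size_t segment_bytes) {
    std::set<std::size_t> segments;
    for (std::size_t i = 0; i < work_items; ++i) {
        const std::size_t byte_address = i * stride * sizeof(float);
        segments.insert(byte_address / segment_bytes);
    }
    return segments.size();
}
```

With 16 work-items and an assumed 64-byte segment, count_segments(16, 1, 64) reports a single segment when the reads are contiguous, while count_segments(16, 16, 64) reports 16 segments because every work-item lands in a different one.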

It is the responsibility of the host to be aware of each device's capabilities and to make sure the device supports the operations it will be asked to perform. The function clGetDeviceInfo can be used to query which operations and native types a device supports. Some devices implement extensions that may or may not be approved by the Khronos Group, and these therefore carry a distinguishing prefix or suffix in their names (for example, Khronos-approved extensions start with cl_khr_).
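
As an illustration of such a capability check, the sketch below looks for one extension name in the space-separated extension list that a device reports. It works on a plain string so it can be tried without an OpenCL runtime; the helper name has_extension is an assumption made for this example and is not part of OpenCL.

```cpp
#include <sstream>
#include <string>

// Returns true if `name` appears as a whole token in a space-separated
// extension list, such as the string reported for CL_DEVICE_EXTENSIONS.
bool has_extension(const std::string& extensions, const std::string& name) {
    std::istringstream stream{extensions};
    std::string token;
    while (stream >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

Given the extension string of a device, a call such as has_extension(extensions, "cl_khr_fp64") tells the host whether double-precision support is advertised. Comparing whole tokens avoids false positives when one extension name is a prefix of another.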

This program retrieves the platform or platforms of the system and all their devices to show available information.

Device C Construct

The C function to query information about a device has the following signature:

// Provided by CL_VERSION_1_0
cl_int clGetDeviceInfo(
    cl_device_id device,
    cl_device_info param_name,
    size_t param_value_size,
    void* param_value,
    size_t* param_value_size_ret);

The parameters of the function are the following:

  • device may be a device returned by clGetDeviceIDs or a sub-device created by clCreateSubDevices. If device is a sub-device, the specific information for the sub-device will be returned. The information that can be queried using clGetDeviceInfo is specified in the Device Queries table.
  • param_name is an enumeration constant that identifies the device information being queried. It can be one of the following values as specified in the Device Queries table.
  • param_value is a pointer to memory location where appropriate values for a given param_name, as specified in the Device Queries table, will be returned. If param_value is NULL, it is ignored.
  • param_value_size specifies the size in bytes of memory pointed to by param_value. This size must be greater than or equal to the size of the return type specified in the Device Queries table. If param_value is NULL, it is ignored.
  • param_value_size_ret returns the actual size in bytes of data being queried by param_name. If param_value_size_ret is NULL, it is ignored.
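
The size parameters described above allow a two-call pattern: first ask only for the size, then allocate a buffer of that size and fetch the value. The sketch below shows the pattern against mock_get_info, a hypothetical stand-in that is not part of OpenCL and only mimics the size and value convention so the example builds without an OpenCL runtime. With the real API the same two calls would go to clGetDeviceInfo, with the device and param_name arguments added.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for clGetDeviceInfo (NOT an OpenCL function).
// Returns 0 on success, -1 if a buffer is given but is too small.
int mock_get_info(std::size_t param_value_size, void* param_value,
                  std::size_t* param_value_size_ret) {
    static const char name[] = "Example Device";  // size includes the '\0'
    if (param_value != nullptr) {
        if (param_value_size < sizeof(name)) {
            return -1;
        }
        std::memcpy(param_value, name, sizeof(name));
    }
    if (param_value_size_ret != nullptr) {
        *param_value_size_ret = sizeof(name);
    }
    return 0;
}

// Two-call pattern: query the required size, then fetch the value.
std::string query_name() {
    std::size_t size = 0;
    if (mock_get_info(0, nullptr, &size) != 0 || size == 0) {
        return {};
    }
    std::vector<char> buffer(size);
    if (mock_get_info(buffer.size(), buffer.data(), nullptr) != 0) {
        return {};
    }
    return std::string{buffer.data()};  // reads up to the '\0' terminator
}
```

Sizing the buffer from the first call avoids guessing a fixed length for string values such as the device name.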

In C the code would perform the following operations:

cl_int i, err;
char name_data[48];
[...]
err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 1, NULL, &num_devices);
devices = (cl_device_id*) malloc(sizeof(cl_device_id) * num_devices);
clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, num_devices, devices, NULL);
for(i = 0; i < num_devices; i++) {
    err = clGetDeviceInfo(devices[i], CL_DEVICE_NAME, sizeof(name_data), name_data, NULL);
    [...]
}

In the C++ wrapper, a vector is used to retrieve the devices:

public: cl_int Platform::getDevices(cl_device_type type, std::vector<Device> *devices) const

The devices of a platform can be retrieved with the following lines:

// collect the first platform available
cl::Platform platform{platforms.front()};
// initialize vector to store the devices from the platform
std::vector<cl::Device> devices;
// collect the GPUs devices from the platform
platform.getDevices(CL_DEVICE_TYPE_GPU, &devices);

Now that we have the devices, getInfo<>() can be used to query the device info.

Note:

This program does not collect all the information that can be obtained with getInfo<>(). It tries to demonstrate what a device is and how the host program can collect information about it.
The clinfo command can be used from the terminal to collect all the information of the device.

main.cpp

#include <cstdlib>   // exit
#include <iostream>
#include <vector>
#include <CL/opencl.hpp>

int main() {
    // initialize a vector to store the platforms
    std::vector<cl::Platform> platforms{};

    // collect the platforms available
    cl::Platform::get(&platforms);

    // make sure at least we have a platform
    if (platforms.empty()){
        std::cerr << "No platforms found!" << std::endl;
        exit(1);
    }

    std::cout << "Number of platforms found: " << platforms.size() << std::endl;
    // iterate over each platform
    for (const cl::Platform& platform : platforms) {
        std::cout << "\tPlatform name: " << platform.getInfo<CL_PLATFORM_NAME>() << std::endl;
        std::cout << "\tPlatform vendor: " << platform.getInfo<CL_PLATFORM_VENDOR>() << std::endl;

        // collect the devices
        std::vector<cl::Device> devices;
        platform.getDevices(CL_DEVICE_TYPE_ALL, &devices);

        std::cout << "\tPlatform devices: " << devices.size() << std::endl;
        for (const cl::Device& device : devices) {
            std::cout << "\n\t\tDevice name: " << device.getInfo<CL_DEVICE_NAME>() << std::endl;
            std::cout << "\t\tDevice vendor: " << device.getInfo<CL_DEVICE_VENDOR>() << std::endl;
            std::cout << "\t\tDevice version: " << device.getInfo<CL_DEVICE_VERSION>() << std::endl;
            std::cout << "\t\tDevice profile: " << device.getInfo<CL_DEVICE_PROFILE>() << std::endl;
            std::cout << "\t\tDevice type: " << device.getInfo<CL_DEVICE_TYPE>() << std::endl;
            std::cout << "\t\tDevice available: " << device.getInfo<CL_DEVICE_AVAILABLE>() << std::endl;
            std::cout << "\t\tDevice compiler available: " << device.getInfo<CL_DEVICE_COMPILER_AVAILABLE>() << std::endl;
            std::cout << "\t\tMax clock frequency: " << device.getInfo<CL_DEVICE_MAX_CLOCK_FREQUENCY>() << std::endl;
            std::cout << "\t\tMax compute units: " << device.getInfo<CL_DEVICE_MAX_COMPUTE_UNITS>() << std::endl;
            std::cout << "\t\tMax work-items dimensions: " << device.getInfo<CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS>() << std::endl;
            std::cout << "\t\tMax work-group size: " << device.getInfo<CL_DEVICE_MAX_WORK_GROUP_SIZE>() << std::endl;
            std::cout << "\t\tPreferred work-group size multiple: " << device.getInfo<CL_DEVICE_PREFERRED_WORK_GROUP_SIZE_MULTIPLE>() << std::endl;
            std::cout << "\t\tDevice extensions: " << device.getInfo<CL_DEVICE_EXTENSIONS>() << std::endl;
            std::cout << "\t\t..." << std::endl;
        }

        std::cout << "\n----------\n" << std::endl;
    }

    std::cout << "Done!" << std::endl;
    return 0;
}

CMakeLists.txt Code

cmake_minimum_required(VERSION 3.28)
project(OpenCL_PlatformDevicesInformation LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 17)

add_executable(${PROJECT_NAME} main.cpp)

find_package(OpenCL REQUIRED)
target_link_libraries(${PROJECT_NAME} OpenCL::OpenCL)

target_compile_definitions(${PROJECT_NAME} PRIVATE CL_HPP_TARGET_OPENCL_VERSION=300)

Program Output

This is the output of my current system:

Number of platforms found: 2
	Platform name: Clover
	Platform vendor: Mesa
	Platform devices: 1

		Device name: AMD Radeon RX 580 Series (polaris10, LLVM 15.0.6, DRM 3.49, 6.1.0-26-amd64)
		Device vendor: AMD
		Device version: OpenCL 1.1 Mesa 22.3.6
		Device profile: FULL_PROFILE
		Device type: 4
		Device available: 1
		Device compiler available: 1
		Max clock frequency: 1370
		Max compute units: 36
		Max work-items dimensions: 3
		Max work-group size: 256
		Max work-group size: 1
		Device extensions: cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning
		...

----------

	Platform name: rusticl
	Platform vendor: Mesa/X.org
	Platform devices: 0

----------

Done!

Process finished with exit code 0
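
The device type in the output above is printed as the raw bitfield value 4. As a small sketch of how that number could be turned into a readable label, the helper below decodes the bits. The constants are local copies that are assumed to mirror the CL_DEVICE_TYPE_* values in CL/cl.h, so the example builds without the OpenCL headers; the name device_type_to_string is hypothetical.

```cpp
#include <cstdint>
#include <string>

// Local copies of the device type bits, assumed to mirror the
// CL_DEVICE_TYPE_* macros in CL/cl.h. A real program should use the macros.
constexpr std::uint64_t kTypeDefault     = 1u << 0;
constexpr std::uint64_t kTypeCpu         = 1u << 1;
constexpr std::uint64_t kTypeGpu         = 1u << 2;
constexpr std::uint64_t kTypeAccelerator = 1u << 3;
constexpr std::uint64_t kTypeCustom      = 1u << 4;

// Builds a label such as "GPU" or "CPU | GPU" from a device type bitfield.
std::string device_type_to_string(std::uint64_t type) {
    std::string result;
    auto append = [&](std::uint64_t bit, const char* label) {
        if (type & bit) {
            if (!result.empty()) {
                result += " | ";
            }
            result += label;
        }
    };
    append(kTypeDefault, "DEFAULT");
    append(kTypeCpu, "CPU");
    append(kTypeGpu, "GPU");
    append(kTypeAccelerator, "ACCELERATOR");
    append(kTypeCustom, "CUSTOM");
    if (result.empty()) {
        return "UNKNOWN";
    }
    return result;
}
```

Under these assumptions, device_type_to_string(4) yields "GPU", which matches the Radeon device listed in the output.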

If you spot any typos, have questions, or need assistance with the build, feel free to contact me at: antonimercer@lthjournal.com

Technologies used

Debian 12, Linux, OpenCL, C++, AMD, Nvidia, Intel, GPU
