ROCm / rocm-examples

A collection of examples for the ROCm software stack

282 stars
86 forks
11 issues
C++, CMake, Cuda

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing ROCm/rocm-examples in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

Source files are only loaded when you start an analysis to optimize performance.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/ROCm/rocm-examples)

Repository Overview (README excerpt)

ROCm Examples

This repository is a collection of examples that helps new users start using ROCm and provides more advanced examples for experienced users. The examples are structured in several categories:

• HIP-Basic showcases some basic functionality without any additional dependencies
• HIP-Doc contains the example code shown in the HIP documentation
• Libraries contains examples for ROCm libraries that provide higher-level functionality
• Applications showcases some common applications that use HIP for acceleration
• AI contains instructions on how to use ROCm for AI
• Tutorials contains the code accompanying the HIP Tutorials that can be found in the HIP documentation.

For a full overview of the examples, see the section Repository Contents.

Prerequisites

Linux

• CMake (at least version 3.21)
• A number of examples also support building via GNU Make, available through the distribution's package manager
• ROCm (at least version 7.x.x)
• For example-specific prerequisites, see the example subdirectories.

Windows

• Visual Studio 2019 or 2022 with the "Desktop Development with C++" workload
• HIP SDK for Windows
• The Visual Studio ROCm extension needs to be installed to build with the solution files.
• CMake (optional, to build with CMake; requires at least version 3.21)
• Ninja (optional, to build with CMake)
• Perl (for hipify-related scripts)

Building the example suite

Linux

These instructions assume that the prerequisites for every example are installed on the system.

CMake

See CMake build options for an overview of build options.

• …
• …
• … (on ROCm) or … (on CUDA)
• …
• …

Make

Beware that only a subset of the examples supports building via Make.

• …
• …
• … (on ROCm) or … (on CUDA)

Linux with Docker

Alternatively, instead of installing the prerequisites on the system, the Dockerfiles in this repository can be used to build images that provide all required prerequisites. Note that the ROCm kernel GPU driver still needs to be installed on the host system.
The following instructions showcase building the Docker image and the full example suite inside the container using CMake:

• …
• …
• … (on ROCm) or … (on CUDA)
• … (on ROCm) or … (on CUDA)
• …
• …
• … (on ROCm) or … (on CUDA)
• The built executables can be found and run in the directory: …

Windows

Visual Studio

The repository has Visual Studio project files for all examples and individually for each example.

• Project files for Visual Studio are named after the example with the … suffix added, e.g. … for the device sum example.
• The project files can be built from Visual Studio or from the command line using MSBuild.
• Use the build solution command in Visual Studio to build.
• To build from the command line, execute: …
• To build in Release mode, pass the … option to MSBuild.
• The executables will be created in a subfolder named "Debug" or "Release" inside the project folder.
• The HIP-specific project settings, such as the targeted GPU architectures, can be set on the … tab of the project properties.
• The top-level solution files come in two flavors: … and … . The former contains all examples, while the latter contains the examples that support both ROCm and CUDA.

CMake

First, clone the repository and go to the source directory.

There are two ways to build the project using CMake: with the Visual Studio Developer Command Prompt (recommended) or with a standard Command Prompt. See CMake build options for an overview of build options.

Visual Studio Developer Command Prompt

Select Start, search for "x64 Native Tools Command Prompt for VS 2019", and open the resulting Command Prompt. Ninja must be selected as the generator, and Clang as the C++ compiler.

Standard Command Prompt

Run the standard Command Prompt. When using the standard Command Prompt to build the project, the Resource Compiler (RC) path must be specified. The RC is a tool used to build Windows-based applications; its default path is … . Finally, the generator must be set to Ninja.

CMake build options

The following options are available when building with CMake.

| Option | Relevant to | Default value | Description |
|:-------|:------------|:-----------------|:------------|
| … | HIP / CUDA | … | GPU language to compile for. Set to … to compile for NVIDIA GPUs and to … for AMD GPUs. |
| … | HIP | Compiler default | HIP device architectures to target, e.g. … to target architectures gfx908 and gfx1030. |
| … | CUDA | Compiler default | CUDA architecture to compile for, e.g. … to target compute capability 50 and 72. |

Repository Contents

• AI showcases the functionality for executing quantized models using Torch-MIGraphX.
• Applications groups a number of examples ... .
  • bitonic_sort: Showcases how to order an array of $n$ elements using a GPU implementation of the bitonic sort.
  • convolution: A simple GPU implementation for the calculation of discrete convolutions.
  • floyd_warshall: Showcases a GPU implementation of the Floyd-Warshall algorithm for finding shortest paths in certain types of graphs.
  • histogram: Histogram over a byte array with memory bank optimization.
  • monte_carlo_pi: Monte Carlo estimation of $\pi$ using hipRAND for random number generation and hipCUB for evaluation.
  • prefix_sum: Showcases a GPU implementation of a prefix sum with a 2-kernel scan algorithm.
• Common contains common utility functionality shared between the examples.
• HIP-Basic hosts self-contained recipes showcasing HIP runtime functionality.
  • assembly_to_executable: Program and accompanying build systems that show how to manually compile and link a HIP application from host and device code.
  • bandwidth: Program that measures memory bandwidth from host to device, device to host, and device to device.
  • bit_extract: Program that showcases how to use the HIP built-in bit extract.
  • device_globals: Showcases how to set global variables on the device from the host.
  • device_query: Program that showcases how properties from the devic…
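
The bitonic_sort entry under Applications describes ordering an array of $n$ elements with a bitonic sort. As a rough illustration of that sorting network, here is a minimal host-only C++ sketch. It is not the repository's HIP code: the function name `bitonic_sort`, the `int` element type, and the restriction to power-of-two sizes are all assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Host-only sketch of the classic bitonic sorting network (ascending order).
// Assumption: data.size() is a power of two. The repository's GPU example
// may handle sizes, types, and ordering differently.
void bitonic_sort(std::vector<int>& data)
{
    const std::size_t n = data.size();
    assert(n != 0 && (n & (n - 1)) == 0 && "size must be a power of two");

    // k is the length of the bitonic sequences merged in this stage.
    for(std::size_t k = 2; k <= n; k *= 2)
    {
        // j is the distance between the two elements of a compare-exchange.
        for(std::size_t j = k / 2; j > 0; j /= 2)
        {
            // The iterations of this loop touch disjoint pairs, which is why
            // a GPU version could assign one thread to each index i.
            for(std::size_t i = 0; i < n; ++i)
            {
                const std::size_t partner = i ^ j;
                if(partner <= i)
                {
                    continue; // each pair is handled once, by its lower index
                }

                // Blocks of length k alternate between ascending and descending;
                // in the final stage (k == n) every pair is sorted ascending.
                const bool ascending = (i & k) == 0;
                const bool out_of_order
                    = ascending ? data[i] > data[partner] : data[i] < data[partner];
                if(out_of_order)
                {
                    std::swap(data[i], data[partner]);
                }
            }
        }
    }
}
```

Calling `bitonic_sort` on a `std::vector<int>` with eight elements sorts it in place. The two outer loops must run sequentially, while the innermost loop is the part that maps naturally onto parallel GPU threads.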