Compiler Explorer: An Essential Kernel Playground for CUDA Developers

Have you ever wondered exactly what the CUDA compiler generates when you write GPU kernels? Ever wanted to share a minimal CUDA example with a colleague effortlessly, without the need for them to install a specific CUDA toolkit version first? Or perhaps you’re completely new to CUDA and looking for an easy way to start without needing to install anything or even having a GPU on hand?

Thanks to Compiler Explorer, a widely used open source tool often called godbolt, you can do all of this—instantly and interactively—right from your browser. This post explains what Compiler Explorer offers and why it should be part of your CUDA development toolkit.

What is Compiler Explorer?

Compiler Explorer is a web-based tool to help developers write source code, compile it live using various compilers, and immediately see generated outputs such as assembly, intermediate representation (IR), PTX, and more.

Screenshot of the Compiler Explorer browser window. The left pane is the input source code and the right pane has multiple windows showing the generated PTX, generated SASS, and the program execution output.
Figure 1. With the Compiler Explorer interface for CUDA development, you can write code, choose a compiler and flags, compile, view host assembly and device PTX, and run your code on a real remote GPU, all in your browser

Initially created by Matt Godbolt to help developers understand compiler optimizations, Compiler Explorer has evolved into a powerful, multilanguage tool used across the entire systems programming world. The godbolt.org site currently supports over 70 languages—including C++, C, Rust, Python, and CUDA—and handles more than a million compilations per week, according to its Grafana dashboard.

While the names ‘Compiler Explorer’ and ‘godbolt’ are often used interchangeably, they’re not quite the same thing. Compiler Explorer is the name of the application. Godbolt is the surname of its creator, who also maintains the most widely used public instance of the tool at godbolt.org. Anyone can run Compiler Explorer locally or host their own instance, but “godbolt” mentions usually refer to the godbolt.org instance.

One of the key benefits of Compiler Explorer is that you can use it to both compile and run code. On the CPU side, this feature has long been available and is widely appreciated for its ability to simplify learning, validation, and debugging workflows. 

Compiler Explorer has supported compiling CUDA code for years, making it easy to prototype CUDA C++ code and inspect PTX or SASS output without installing a full toolkit. But it initially lacked one of the benefits CPU workflows enjoyed: the ability to actually run CUDA code.

Through close collaboration between NVIDIA engineers and the Compiler Explorer maintainers, that gap was closed in 2022. Compiler Explorer expanded to support executing CUDA code directly on NVIDIA GPUs—bringing full write-compile-run capability to CUDA developers in the browser.

Powerful features for CUDA developers

This section covers five practical ways Compiler Explorer helps CUDA developers.

1. Running CUDA C++ code in your browser

Compiler Explorer makes it possible to write, compile, and run small CUDA programs directly in your browser on NVIDIA GPUs. This dramatically lowers the barrier to prototyping, debugging, and learning. You can go from idea to result in seconds.

This can speed up your workflow. Highlights include:

  • Work without needing a local setup 
  • Get real-time feedback: write → compile → run → output
  • Prototype, validate, and teach with ease

Example workflows include:

  • Prototyping CUDA C++ code quickly without needing a full local environment
  • Reproducing and debugging issues
  • Using interactive execution for teaching and learning CUDA
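
As a minimal sketch of the kind of program you might paste into the editor, the following launches a small kernel that prints from each thread. The names `hello_kernel` and `run_hello` are illustrative and not taken from the linked samples, and the launch configuration of 2 blocks of 4 threads is an arbitrary choice.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread prints its own block and thread index.
__global__ void hello_kernel() {
    printf("Hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

// Hypothetical helper: launches the kernel and returns 0 on success, 1 on any CUDA error.
int run_hello() {
    hello_kernel<<<2, 4>>>();  // 2 blocks of 4 threads each

    // cudaGetLastError reports launch failures; cudaDeviceSynchronize waits for
    // the kernel to finish, reports execution errors, and flushes device printf.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}

int main() {
    return run_hello();
}
```

When execution is enabled, the output pane should show eight greeting lines. The order of lines from different blocks is not guaranteed.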

2. Easy sharing and collaboration

Compiler Explorer simplifies sharing minimal CUDA examples for debugging, collaboration, or education.

  • Write a CUDA snippet, click Share, and get a permanent URL—your recipient sees exactly the same code, compiler settings, and outputs
  • Report compiler bugs or performance questions
  • Teach CUDA concepts

Example workflows include:

  • Debugging a compiler optimization issue by quickly sending colleagues exact code examples
  • Sharing a minimal, runnable example to illustrate CUDA best practices or common pitfalls to teammates or students

3. Experimenting with libraries

Compiler Explorer isn’t limited to vanilla CUDA C++ code. You can also use CUDA libraries like CCCL or MatX out of the box. Because it’s open source, you’re free to extend it further. If you have your own CUDA library you’d like to integrate, it’s entirely possible to do so by contributing to the compiler-explorer GitHub repo.

Screenshot of web browser page showing an example of adding libraries to a project.
Figure 2. Add libraries to your project on godbolt.org
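
To illustrate library use, here is a small sketch built on Thrust, which ships as part of CCCL. It assumes the Thrust headers are available to the selected compiler, either through the toolkit or through the library picker. The function name `sum_first_n` is hypothetical and chosen only for this example.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>

// Hypothetical helper: sums the integers 1..n on the GPU using Thrust.
long long sum_first_n(int n) {
    thrust::device_vector<long long> values(n);
    thrust::sequence(values.begin(), values.end(), 1LL);       // fills 1, 2, ..., n
    return thrust::reduce(values.begin(), values.end(), 0LL);  // parallel sum on the device
}

int main() {
    const int n = 1000;
    std::printf("sum of 1..%d = %lld\n", n, sum_first_n(n));  // prints 500500 for n = 1000
    return 0;
}
```

Because the containers and algorithms manage device memory and kernel launches, an example like this stays short enough to share in a single link.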

4. Inspecting PTX and SASS assembly side by side

Viewing the assembly generated from your CUDA C++ code can reveal crucial insights into performance optimization. To learn more about PTX, see Understanding PTX, the Assembly Language of CUDA GPU Computing.

There are many ways to inspect GPU assembly with tools like cuobjdump, nvdisasm, or NVIDIA Nsight Compute. Compiler Explorer often proves to be the most convenient, especially for quick experiments, teaching moments, or early-stage debugging.

Compiler Explorer correlates each line of your source code with the corresponding generated instructions. This mapping is visually color-coded in the UI, making it easy to trace how each line of CUDA C++ code translates to PTX and SASS instructions in the output pane. For example, in Figure 1, you can see how the printf call on line 4 is color coded to match the generated PTX instructions on the right.

With Compiler Explorer, you can easily:

  • Write CUDA C++ kernels directly in your browser
  • See side-by-side views of PTX and SASS 

Example workflows include:

  • Observing how changing loop unrolling pragmas affects assembly
  • Tracking how __restrict__ impacts the generated load instructions
  • Verifying generation of vector load/store instructions

5. Comparing compiler versions and flags instantly

Wondering how different CUDA compiler versions or specific compiler flags impact your generated GPU code? Compiler Explorer makes this easy:

  • Select from multiple CUDA compiler versions
  • See results immediately without managing multiple local installations
  • Confirm or disprove changes in compiler behavior over different versions instantly

Example workflows include:

  • Verifying whether recent CUDA toolkit releases better optimize your device code
  • Tracking how -use_fast_math affects generated assembly
  • Checking resource usage like registers per thread with --resource-usage
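
A kernel like the sketch below is one way to try these flags. It combines `sinf`, `expf`, and a division, which are operations that `-use_fast_math` allows the compiler to replace with faster, less precise versions. The names `damped_wave` and `max_abs_error` are hypothetical, the input range and `decay` value are arbitrary, and error checking is omitted for brevity.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Transcendental functions and division are typical places where
// -use_fast_math changes the generated code.
__global__ void damped_wave(const float* __restrict__ t, float* __restrict__ out,
                            float decay, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = sinf(t[i]) * expf(-t[i] / decay);
    }
}

// Hypothetical helper: runs the kernel on n sample points and returns the
// largest absolute difference from a host-side reference.
float max_abs_error(int n) {
    const float decay = 2.0f;
    std::vector<float> h_t(n), h_out(n, 0.0f);
    for (int i = 0; i < n; ++i) h_t[i] = 0.01f * i;

    float *d_t = nullptr, *d_out = nullptr;
    const size_t bytes = n * sizeof(float);
    cudaMalloc(&d_t, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_t, h_t.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 128;
    damped_wave<<<(n + threads - 1) / threads, threads>>>(d_t, d_out, decay, n);
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel

    float worst = 0.0f;
    for (int i = 0; i < n; ++i) {
        const float ref = std::sin(h_t[i]) * std::exp(-h_t[i] / decay);
        worst = std::fmax(worst, std::fabs(h_out[i] - ref));
    }
    cudaFree(d_t);
    cudaFree(d_out);
    return worst;
}

int main() {
    std::printf("max abs error vs host: %g\n", max_abs_error(256));
    return 0;
}
```

Compiling once with and once without `-use_fast_math` and comparing the assembly panes should show different instruction sequences for the math calls, and the printed error may change slightly. Adding `--resource-usage` should list register and memory usage for the kernel in the compiler output.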

Get started with Compiler Explorer

Compiler Explorer has become an indispensable part of the modern CUDA development workflow, simplifying everything from debugging and performance analysis to teaching and sharing code. For this reason, it was selected as the inaugural recipient of the NVIDIA FOSS Fund, a program that supports open source tools that have real impact for developers. The team is proud to support the project and hopes it becomes a core tool in your own CUDA toolkit.

If you’re not already using Compiler Explorer, try it for:

  • Prototyping and exploring CUDA concepts
  • Debugging performance regressions or verifying compiler behavior
  • Sharing minimal reproducible examples easily

Check out these samples to get started: Hello World, Vector Add, and CCCL. To learn more, visit godbolt.org, contribute to the compiler-explorer GitHub repo, and join Compiler Explorer on Discord.

Acknowledgments

Special thanks to Matt Godbolt, creator of Compiler Explorer, and to the entire community for building and maintaining such a powerful resource for developers. In particular, thanks to Patrick Quist, whose long-term collaboration and technical work have been instrumental in expanding CUDA support on Compiler Explorer.
