
Add support for building the ggml-virtgpu image #2467

Draft
kpouget wants to merge 2 commits into containers:main from kpouget:remoting

Conversation


kpouget (Collaborator) commented Feb 26, 2026

This PR adds the ability to build the remoting image in RamaLama. This image relies on the recently introduced ggml-virtgpu llama.cpp backend, which executes GGML operations outside of the virtual machine. This is particularly useful on macOS, where a Linux VM is mandatory for running containers. The ggml-virtgpu backend, together with its host-side library ggml-virtgpu-backend, allows the container to leverage native ggml-metal GPU acceleration.

See this blog post for the steps to reproduce:

https://developers.redhat.com/articles/2025/09/18/reach-native-speed-macos-llamacpp-container-inference#try_api_remoting_with_ramalama

and this repository for the latest builds:

https://github.com/crc-org/llama.cpp/releases

The source code is now merged in llama.cpp:

https://github.com/ggml-org/llama.cpp/tree/master/ggml/src/ggml-virtgpu
https://github.com/ggml-org/llama.cpp/blob/master/docs/backend/VirtGPU.md

The virglrenderer changes are in an MR pending review:

https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1590

I marked this PR as a draft until ggml-org/llama.cpp#19846 is merged and I can relaunch the build to validate it against the latest rebase.

Summary by Sourcery

Add a new remoting container image that uses the ggml-virtgpu backend and optional Vulkan-based API remoting support, including required build and runtime dependencies.

New Features:

  • Introduce a remoting container image definition wired to the ggml-virtgpu backend and optional Vulkan backend.
  • Allow configuring an API remoting backend (currently Vulkan) for llama.cpp via the RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND build argument.

Enhancements:

  • Extend the shared build script to handle the remoting target, including virglrenderer cloning/building and ggml-virtgpu-related CMake flags.
  • Refine runtime dependency installation to be conditional per-container and controlled by debug mode for all targets.

Build:

  • Update the container build orchestration to recognize the remoting target and pass through the new RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND build argument.
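The pass-through of the new build argument can be sketched as a small bash helper. This is a sketch only: the helper name collect_build_args is hypothetical, while RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND is the variable this PR introduces.

```shell
#!/usr/bin/env bash
# Sketch only: pass selected environment variables through to the
# container build as --build-arg flags, skipping any that are unset.
collect_build_args() {
    local args=() var
    for var in "$@"; do
        if [ -n "${!var:-}" ]; then
            args+=("--build-arg" "$var=${!var}")
        fi
    done
    echo "${args[@]}"
}

RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND=vulkan
collect_build_args RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND
# prints: --build-arg RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND=vulkan
```

The assembled flags would then be appended to the podman build (or docker build) command line.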

@gemini-code-assist (Contributor) commented

Summary of Changes

Hello @kpouget, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request introduces comprehensive support for building a new ggml-virtgpu container image within the Ramalama project. This image is designed to facilitate the use of the ggml-virtgpu backend for llama.cpp in containerized environments, particularly benefiting macOS users by enabling native GPU acceleration (ggml-metal) through a Linux virtual machine and the virglrenderer library. The changes involve adding a new Containerfile, updating build scripts to manage dependencies and configurations, and integrating the virglrenderer build process.

Highlights

  • New remoting Container Image: A new Containerfile has been added to define the build process for the remoting image, which integrates ggml-virtgpu components for virtual GPU acceleration.
  • Enhanced Build Script for remoting: The build_llama.sh script was modified to include specific dependency installations and build configurations tailored for the remoting target, supporting both the ggml-virtgpu and its backend.
  • Virglrenderer Integration: A new function, clone_and_build_virglrenderer, was introduced to handle the compilation and installation of the virglrenderer library, which is crucial for the virtual GPU setup.
  • Updated Build Orchestration: The container_build.sh script was updated to recognize the new remoting build target and a new environment variable, RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND, enabling conditional backend compilation.
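The clone_and_build_virglrenderer flow highlighted above can be sketched as follows. This is an illustrative sketch, not the PR's actual code: the repository URL override, the Meson option names (-Dvenus, -Dapir), and the paths are assumptions beyond what the summary states (Venus/API-remoting support, install into a staging prefix, cleanup skipped in debug mode).

```shell
#!/usr/bin/env bash
# Sketch of a clone-and-build helper for virglrenderer.
# Option names and paths are illustrative assumptions.
set -eu

clone_and_build_virglrenderer() {
    local repo="${VIRGLRENDERER_REPO:-https://gitlab.freedesktop.org/virgl/virglrenderer.git}"
    local src=/tmp/virglrenderer

    git clone "$repo" "$src"
    meson setup "$src/build" "$src" --prefix=/tmp/install \
        -Dvenus=true -Dapir=true
    ninja -C "$src/build" install

    # Keep the sources around only when debugging the image build.
    if [ "${RAMALAMA_IMAGE_BUILD_DEBUG:-0}" != 1 ]; then
        rm -rf "$src"
    fi
}
```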


Changelog
  • container-images/remoting/Containerfile
    • Created a new multi-stage Dockerfile for the remoting image.
    • Configured build-time dependencies and runtime environment variables for ggml-virtgpu and APIR.
  • container-images/scripts/build_llama.sh
    • Implemented dnf_install_remoting to install libdrm-devel and optionally meson, libepoxy-devel, python3-yaml.
    • Extended dnf_install and dnf_install_runtime_deps to conditionally install Vulkan drivers for the remoting target.
    • Added remoting specific flags to configure_common_flags to enable GGML_VIRTGPU and GGML_VIRTGPU_BACKEND.
    • Added clone_and_build_virglrenderer function to clone, build, and install virglrenderer with Venus and APIR support.
    • Integrated the call to clone_and_build_virglrenderer into the main build flow for remoting images.
  • container_build.sh
    • Added RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND to the list of recognized build environment variables.
    • Included remoting as a valid target in the build function's case statement.
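The flag wiring described in the changelog can be sketched as a helper returning the remoting-specific CMake flags. The flag names GGML_VIRTGPU, GGML_VIRTGPU_BACKEND, GGML_VULKAN, and GGML_BACKEND_DL come from the change descriptions; the function shape and error text are assumptions.

```shell
# Sketch of the remoting branch of the flag configuration.
remoting_cmake_flags() {
    local flags="-DGGML_VIRTGPU=ON -DGGML_BACKEND_DL=OFF"
    case "${RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND:-}" in
        "")  ;;  # container-side virtgpu only, no host backend
        vulkan)
            flags="$flags -DGGML_VIRTGPU_BACKEND=ON -DGGML_VULKAN=ON" ;;
        *)
            echo "unknown remoting backend: ${RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND}" >&2
            return 1 ;;
    esac
    echo "$flags"
}
```

Note that the error message goes to stderr, which addresses one of the review points below.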

@gemini-code-assist bot left a comment

Code Review

This pull request adds support for building a remoting image that utilizes ggml-virtgpu. The changes include a new Containerfile for this image and substantial updates to the build scripts to handle the new remoting target. The overall approach is sound. However, I've identified a few areas for improvement in the shell scripts, including a potential bug in how CMake flags are configured, error messages being directed to stdout instead of stderr, and some inconsistencies in shell syntax. My review provides specific suggestions to address these points.


sourcery-ai bot commented Feb 26, 2026

Reviewer's Guide

Adds support for building a new remoting container image that enables the ggml-virtgpu backend (optionally with a Vulkan-based host backend) by wiring new build-time/runtime dependencies, CMake flags, and a dedicated Containerfile, including optional virglrenderer build support.

File-Level Changes

Change: Introduce remoting-specific dnf installation paths and runtime dependencies, including optional Vulkan support and debug tooling handling.
  • Add dnf_install_remoting helper to install libdrm-devel and, when requested, Meson and related build tools for the remoting backend.
  • Extend dnf_install to handle the remoting containerfile, invoking remoting-specific dependencies and optionally installing Vulkan drivers when the remoting backend is set to vulkan.
  • Extend dnf_install_runtime_deps to add libdrm and optional Vulkan runtime packages for remoting, including enabling a COPR for mesa-libkrun-vulkan, and guard against empty runtime package sets.
  • Adjust debug tooling installation so gdb/strace are added as runtime dependencies rather than always installed in the main dnf path.
Files: container-images/scripts/build_llama.sh

Change: Wire ggml-virtgpu and optional backend/Vulkan support into the CMake configuration for the new remoting target.
  • Extend configure_common_flags with a remoting case that enables GGML_VIRTGPU and disables GGML backend-DL.
  • Conditionally enable GGML_VIRTGPU_BACKEND and GGML_VULKAN when RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND is set to vulkan.
  • Add validation and error reporting when an unknown remoting backend value is provided.
Files: container-images/scripts/build_llama.sh

Change: Add support for building virglrenderer as part of the remoting image when a backend is requested.
  • Introduce clone_and_build_virglrenderer to clone from a configurable virglrenderer repo/commit, build with Meson/Ninja enabling Venus and API remoting, and install into /tmp/install.
  • Clean up the virglrenderer source directory unless debug mode is enabled.
  • Call clone_and_build_virglrenderer from main only when building the remoting containerfile with a remoting backend enabled.
Files: container-images/scripts/build_llama.sh

Change: Expose configuration for selecting the remoting backend via container build arguments.
  • Add RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND to the set of propagated build-time environment variables in add_build_platform, documented as controlling inclusion of the API backend and ggml-vulkan backend.
Files: container_build.sh

Change: Enable building a new remoting container image in the Ramalama build flow.
  • Allow remoting as a valid target in the build() case statement used in build-all flows.
  • Add a new Containerfile for the remoting image that runs the llama build in a builder stage, copies virglrenderer and ggml artifacts into the final image, and installs runtime dependencies via the existing build_llama.sh runtime path.
  • Set environment variables in the remoting Containerfile to configure virglrenderer, ggml-virtgpu-backend, ggml-vulkan, and logging paths for API remoting.
Files: container_build.sh, container-images/remoting/Containerfile
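The conditional runtime-dependency logic above can be sketched as a package-selection helper. The function name and exact package list are assumptions; libdrm, mesa-libkrun-vulkan, the gdb/strace debug additions, and the empty-set guard are named in the reviewer's guide.

```shell
# Sketch of the conditional runtime-package selection for remoting.
remoting_runtime_pkgs() {
    local pkgs=("libdrm")
    if [ "${RAMALAMA_IMAGE_BUILD_REMOTING_BACKEND:-}" = vulkan ]; then
        # In the real script this step also enables a COPR providing
        # mesa-libkrun-vulkan before installing it.
        pkgs+=("mesa-libkrun-vulkan")
    fi
    if [ "${RAMALAMA_IMAGE_BUILD_DEBUG:-0}" = 1 ]; then
        pkgs+=("gdb" "strace")
    fi
    # Guard against invoking dnf with an empty package set.
    if [ "${#pkgs[@]}" -gt 0 ]; then
        echo "dnf install -y ${pkgs[*]}"
    fi
}
```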


Signed-off-by: Kevin Pouget <kpouget@redhat.com>
Signed-off-by: Kevin Pouget <kpouget@redhat.com>
