Convert SBOM Generation from Java to Python #4442

Draft
Lukisorisch wants to merge 7 commits into adoptium:master from Lukisorisch:issue-4421-sbom-gen-python-conversion

@Lukisorisch
Contributor

@Lukisorisch Lukisorisch commented Apr 20, 2026

Description

This PR transitions our SBOM generation from the Java tool to the official cyclonedx-python-lib.

The changes have only been tested on Linux so far.

Resolves #4421.
Note: Please merge #4429 before merging this PR.

Workarounds and Changes

1.

The official cyclonedx-python-lib does not yet natively support the CycloneDX 1.6 formulation array. To migrate to Python and still produce valid JSON without patching the library's internals, I have implemented a temporary post-processing script (temporary_sbom_post_processing.py). It injects the formulation workflows into the finished JSON at the very end of the pipeline, i.e. at the end of generateSBoM().

All locations related to this temporary workaround are tagged with "TODO (CycloneDX 1.6)" for easy searching. Once the upstream library adds native support, the migration will look like this:

MIGRATION (once native formulation support arrives in cyclonedx-python-lib):
1. Delete cyclonedx-lib/temporary_sbom_post_processing.py
2. Update everything marked with TODO (CycloneDX 1.6) back to the correct state
3. Implement native formulation logic (--addWorkflow, --addWorkflowStep, --addWorkflowStepCmd) in cyclonedx-lib temurin_gen_sbom.py, replacing the _skip_formulation() stubs with real CycloneDX lib calls.
4. The Workflow/Step/Cmd calls in addTemurinBuildRecipeToSBOM and addReproducibleVerificationRecipeToSBOM will then work natively.

You can find this text in the source code of build.sh as well.
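
For illustration, the post-processing idea from item 1 can be sketched roughly as follows. This is not the code from temporary_sbom_post_processing.py; the function names, the `formula-1` and `workflow-…` bom-refs, and the `build` task type are assumptions made for the example. The field names (`formulation`, `workflows`, `steps`, `commands`, `executed`) follow my reading of the CycloneDX 1.6 JSON schema and should be checked against it.

```python
import json
import uuid
from pathlib import Path
from typing import Any


def make_workflow(name: str, steps: list[tuple[str, list[str]]]) -> dict[str, Any]:
    """Build one workflow object from (step name, [commands]) pairs.

    The bom-ref naming and the "build" task type are hypothetical choices.
    """
    return {
        "bom-ref": f"workflow-{name}",
        "uid": str(uuid.uuid4()),
        "name": name,
        "taskTypes": ["build"],
        "steps": [
            {"name": step_name, "commands": [{"executed": cmd} for cmd in cmds]}
            for step_name, cmds in steps
        ],
    }


def inject_formulation(
    sbom: dict[str, Any],
    workflows: list[dict[str, Any]],
    formula_ref: str = "formula-1",  # hypothetical bom-ref
) -> dict[str, Any]:
    """Return a shallow copy of the SBOM with a top-level formulation array."""
    patched = dict(sbom)
    patched["formulation"] = [{"bom-ref": formula_ref, "workflows": workflows}]
    return patched


def post_process_file(sbom_path: Path, workflows: list[dict[str, Any]]) -> None:
    """Rewrite a finished SBOM JSON file in place with formulation injected."""
    path = Path(sbom_path)
    sbom = json.loads(path.read_text(encoding="utf-8"))
    patched = inject_formulation(sbom, workflows)
    path.write_text(json.dumps(patched, indent=2) + "\n", encoding="utf-8")
```

Because the injection works on plain dictionaries after serialization, it does not touch the library's internals, which is also why it cannot benefit from the library's own schema validation.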

2.

For some reason, cyclonedx-python-lib emits the JSON keys in a different order than the Java tool. For now, I fixed this with a hardcoded constant that defines the key order. It can be removed later if I get confirmation that the order is indeed irrelevant (as it should be for JSON anyway).
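
A minimal sketch of what such a key-ordering constant could look like, assuming it only needs to cover top-level keys. `KEY_ORDER` and `reorder_keys` are hypothetical names, and the listed order is an example rather than the order used in this PR.

```python
from typing import Any

# Hypothetical preferred order for top-level SBOM keys.
KEY_ORDER = [
    "bomFormat",
    "specVersion",
    "serialNumber",
    "version",
    "metadata",
    "components",
    "dependencies",
    "formulation",
]


def reorder_keys(obj: dict[str, Any], order: list[str] = KEY_ORDER) -> dict[str, Any]:
    """Return a new dict with known keys first (in `order`), then the rest.

    Keys not listed in `order` keep their original relative order.
    Python dicts preserve insertion order, so json.dumps emits this order.
    """
    known = set(order)
    head = [key for key in order if key in obj]
    tail = [key for key in obj if key not in known]
    return {key: obj[key] for key in head + tail}
```

Since dictionary equality in Python ignores ordering, `reorder_keys(sbom) == sbom` still holds; only the serialized text changes.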

3.

To avoid externally-managed-environment errors, the build script now creates a temporary Python virtual environment (sbom_venv) for the SBOM generation. sbom_venv has been added to .gitignore.
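
The build script in this PR is a shell script, so the following is only a Python illustration of the same idea using the standard-library `venv` module. The helper names are hypothetical; the directory name sbom_venv comes from the description above and `cyclonedx-python-lib` is the package name on PyPI.

```python
import os
import venv
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Return the path of the interpreter inside a virtual environment."""
    if os.name == "nt":
        return Path(venv_dir) / "Scripts" / "python.exe"
    return Path(venv_dir) / "bin" / "python"


def create_sbom_venv(venv_dir: Path, with_pip: bool = True) -> Path:
    """Create (or recreate) an isolated environment and return its interpreter.

    Installing into this environment instead of the system interpreter is
    what avoids the externally-managed-environment error.
    """
    venv.EnvBuilder(with_pip=with_pip, clear=True).create(str(venv_dir))
    return venv_python(venv_dir)


def pip_install_command(venv_dir: Path, package: str = "cyclonedx-python-lib") -> list[str]:
    """Build the pip command line that targets the environment's interpreter."""
    return [str(venv_python(venv_dir)), "-m", "pip", "install", package]
```

A caller would create the environment with `create_sbom_venv(Path("sbom_venv"))` and then pass the result of `pip_install_command(Path("sbom_venv"))` to `subprocess.run(..., check=True)`.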

Showcase

Here are two SBOMs (not from the same JDK or build script), one generated with the current Java implementation, the other generated with the new Python implementation.

jdk-hotspot-sbom-python.json
jdk-hotspot-sbom-java.json

Lukisorisch and others added 7 commits March 20, 2026 11:45
Replace cyclonedx-core-java with cyclonedx-python-lib; add a temporary JSON post-processing script to inject workflows, since they are not currently supported in Python; add sbom_venv to .gitignore to ensure a clean Python installation

Co-authored-by: GitHub Copilot <copilot@github.com>
@github-actions github-actions Bot added the testing Issues that enhance or fix our test suites label Apr 20, 2026
@github-actions

Thank you for creating a pull request!
If you have not done so already, please familiarise yourself with our Contributing Guidelines and FAQ, even if you have contributed to the Adoptium project before. GitHub actions will now run a set of jobs against your PR that will lint and unit test your changes. Keep an eye out for the results from these on the latest commit you submitted. For more information, please see our testing documentation.

@andrew-m-leonard
Contributor

I am not sure injecting the missing support via a script is going to be a great solution in the long term, as it’s likely to be vulnerable to schema issues, and it adds an extra step that TemurinGenSBOM.py can’t use directly. Managing a Python venv is also a typical issue with Python, which we haven't had to resolve in the Temurin build.sh since we don't currently use Python.

I have also been doing some work with the CycloneDX Attestation tool, which is also not available in Python.

I think that although this may work, it's not architecturally the right direction, given the Python library's limited schema support.

@Lukisorisch
Contributor Author

Lukisorisch commented Apr 20, 2026

> I am not sure injecting the missing support via a script is going to be a great solution in the long term, as it’s likely to be vulnerable to schema issues, and it adds an extra step that TemurinGenSBOM.py can’t use directly. Managing a Python venv is also a typical issue with Python, which we haven't had to resolve in the Temurin build.sh since we don't currently use Python.
>
> I have also been doing some work with the CycloneDX Attestation tool, which is also not available in Python.
>
> I think that although this may work, it's not architecturally the right direction, given the Python library's limited schema support.

I agree, though the "injection script" would only be a temporary solution until official CycloneDX 1.6 support for Python is released.
The venv issue, on the other hand, is a bummer, yes.

Feel free to archive this PR.

Labels

testing Issues that enhance or fix our test suites

Development

Successfully merging this pull request may close these issues.

To provide wider usage of the Temurin SBOM & CDXA generation clients, provide a python client package

3 participants