Address Lora shortcomings #28801
Open
yuslepukhin wants to merge 5 commits into
export_adapter wrote tensor.DataRaw() for tensor.SizeInBytes() bytes regardless of element type. For tensor(string) parameters this copied the std::string object representation - heap pointers and SSO padding - directly into Parameter.raw_data, leaking runtime addresses (ASLR bypass) and uninitialized bytes, and producing an adapter that cannot be safely loaded (reinterpreting the saved bytes as std::string objects is undefined behavior).

Reject STRING element type with a clear error and defer opening the output file until after validation/serialization so a rejected export does not leave a stray empty file behind.

Test: test_adapter_export_rejects_string_tensors asserts export_adapter raises on tensor(string) parameters and leaves no file on disk.
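The ordering described in this commit message (validate, serialize in memory, only then open the file) can be sketched in plain Python. This is a minimal illustration, not the onnxruntime implementation: `Param`, its fields, and this `export_adapter` signature are hypothetical stand-ins, and the byte layout is chosen only for the example.

```python
import os
import struct
import tempfile
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical stand-in for one adapter parameter (not an onnxruntime type)."""
    name: str
    element_type: str  # e.g. "float" or "string"
    values: list


def export_adapter(path, params):
    """Validate and serialize everything in memory, then create the file."""
    chunks = []
    for p in params:
        # Reject string tensors up front: the in-memory layout of string
        # objects is not a portable byte representation.
        if p.element_type == "string":
            raise ValueError(f"parameter '{p.name}': string tensors cannot be exported")
        # Illustrative layout only: little-endian 32-bit floats.
        chunks.append(struct.pack(f"<{len(p.values)}f", *p.values))
    payload = b"".join(chunks)

    # Only now touch the filesystem; a rejected export never reaches this point.
    with open(path, "wb") as f:
        f.write(payload)


# Usage: a rejected export raises and leaves nothing on disk.
out_path = os.path.join(tempfile.mkdtemp(), "adapter.bin")
try:
    export_adapter(out_path, [Param("w", "float", [1.0, 2.0]),
                              Param("labels", "string", ["a"])])
except ValueError as err:
    print("rejected:", err)
print("file exists:", os.path.exists(out_path))
```

Because the file is opened last, any failure during validation or serialization happens before a file handle exists, so there is nothing to clean up.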
Pull request overview
This PR hardens LoRA adapter handling across the core C++ implementation and Python bindings, focusing on stronger exception-safety during adapter loading, safer object lifetimes in Python, and safer adapter export behavior.
Changes:
- Refactors LoraAdapter loading/memory-mapping to construct validated state locally before committing it to the object (strong exception guarantee).
- Fixes Python binding lifetime hazards by tying the returned parameters dict to its owning AdapterFormat instance via py::keep_alive.
- Rejects exporting string tensors from Python adapter export, and adds Python regression tests for both the keep-alive behavior and string-tensor rejection.
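The build-then-commit pattern behind the strong exception guarantee can be sketched as follows. This is an illustrative Python analog, not the C++ code in the PR: the `Adapter` class, `_build_params`, and the validation rules are hypothetical, and only the shape of the pattern (a side-effect-free builder followed by a single commit step) mirrors the description above.

```python
class Adapter:
    """Hypothetical stand-in for an adapter object; not the LoraAdapter API."""

    def __init__(self):
        self.params = {}

    @staticmethod
    def _build_params(raw):
        """Side-effect free: returns a new validated map, or raises."""
        built = {}
        for name, values in raw.items():
            if not name:
                raise ValueError("parameter name must not be empty")
            # float() raises ValueError on malformed input.
            built[name] = [float(v) for v in values]
        return built

    def load(self, raw):
        # Everything that can fail runs against local state only.
        new_params = self._build_params(raw)
        # Commit: a single assignment, reached only if building succeeded.
        self.params = new_params


adapter = Adapter()
adapter.load({"lora_a": ["1.0", "2.0"]})
try:
    adapter.load({"lora_a": ["3.0"], "lora_b": ["not-a-number"]})
except ValueError:
    pass
# The failed load left the previously loaded state untouched.
print(adapter.params)
```

The point of keeping the builder free of side effects is that a failure partway through the second load cannot leave the object holding a mix of old and new parameters.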
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| onnxruntime/core/session/lora_adapters.cc | Refactors Load/MemoryMap to build a fresh params map locally via BuildParamsValues() before committing state. |
| onnxruntime/core/session/lora_adapters.h | Replaces InitializeParamsValues() with side-effect-free BuildParamsValues() to support strong exception guarantees. |
| onnxruntime/python/onnxruntime_pybind_lora.cc | Fixes loaded_adapter_ typo, adds keep_alive for parameters, and rejects STRING tensors during export. |
| onnxruntime/test/python/onnxruntime_test_python.py | Adds regression tests for Python keep-alive and for rejecting string-tensor adapter export without creating a file. |
pybind11's def_property() does not accept keep_alive directly (static_assert fires). Wrap the getter in py::cpp_function so the policy can be attached, restoring the keep-alive behavior intended by 245ffaf. Also compute the serialized adapter span before opening the output file, per PR review: a failure inside FinishWithSpan should not leave a stray empty file behind.
Comment on lines +2129 to +2131
    # Drop the AdapterFormat temporary; only `params` keeps a reference.
    params = onnxrt.AdapterFormat.read_adapter(file_path).parameters
    gc.collect()
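What this test exercises can be illustrated with a pure-Python analog of the keep-alive link. The sketch below does not use onnxruntime or pybind11: `FakeAdapter` and `ParamsView` are hypothetical, and the strong reference stored on the returned dict only plays the role that the keep-alive policy plays in the binding.

```python
import gc
import weakref


class ParamsView(dict):
    """Dict that holds a strong reference to the object it was read from."""

    def __init__(self, data, owner):
        super().__init__(data)
        self._owner = owner  # plays the role of the keep-alive link


class FakeAdapter:
    """Hypothetical owner; stands in for AdapterFormat in this sketch."""

    def __init__(self):
        self._data = {"w": [1.0, 2.0]}

    @property
    def parameters(self):
        # The returned view keeps `self` alive for as long as it is referenced.
        return ParamsView(self._data, self)


adapter = FakeAdapter()
owner_ref = weakref.ref(adapter)  # observes the owner without keeping it alive

params = adapter.parameters
del adapter  # drop the only direct reference to the owner
gc.collect()
alive_while_params_held = owner_ref() is not None

del params  # releasing the view releases the owner too
gc.collect()
alive_after_release = owner_ref() is not None

print(alive_while_params_held, alive_after_release)
```

In the real binding the stakes are higher than in this analog: the returned values refer to memory owned by the adapter, so losing the owner early would be a use-after-free rather than just an early collection.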
Comment on lines 118 to +123
    for (auto& [n, value] : reader_writer->parameters_) {
      const std::string param_name = py::str(n);
      const OrtValue* ort_value = value.cast<OrtValue*>();
      const Tensor& tensor = ort_value->Get<Tensor>();
      const auto element_type = tensor.GetElementType();
      // Reject string tensors: Tensor::DataRaw() for a string tensor points to an
This pull request improves the robustness and safety of LoRA adapter handling in ONNX Runtime, focusing on exception safety, memory management, and secure parameter serialization. The most important changes include refactoring adapter loading to provide a strong exception guarantee, fixing a memory lifetime issue in the Python bindings, and ensuring that string tensors cannot be exported (which previously could cause security and correctness problems). Additional regression tests are added to verify these behaviors.
C++ Core: Exception Safety and Refactoring
- Refactored LoraAdapter::Load and LoraAdapter::MemoryMap to build and validate new adapter state in local variables before committing, ensuring that failed loads leave the object unchanged (strong exception guarantee). The parameter map is now constructed via a new BuildParamsValues method, which is side-effect free and returns the new map for the caller to commit. (onnxruntime/core/session/lora_adapters.cc, onnxruntime/core/session/lora_adapters.h)

Python Bindings: Memory Lifetime and API Safety
- Fixed a typo in the PyAdapterFormatReaderWriter struct (loaded_adater_ → loaded_adapter_) and updated the property definition accordingly. (onnxruntime/python/onnxruntime_pybind_lora.cc)
- Added the py::keep_alive<0, 1> policy to the parameters property, ensuring the parent adapter object is kept alive as long as the returned dictionary of parameter OrtValues is referenced in Python, preventing use-after-free bugs. (onnxruntime/python/onnxruntime_pybind_lora.cc)

Adapter Export: Security and Correctness
- Updated export_adapter to reject string-typed tensors, preventing serialization of invalid or unsafe memory (such as leaking heap pointers and uninitialized data), and ensuring only supported tensor types are exported. The file is only created after all parameters are validated. (onnxruntime/python/onnxruntime_pybind_lora.cc)

Testing: Regression Coverage
- Added a regression test for the keep-alive behavior of the parameters property. (onnxruntime/test/python/onnxruntime_test_python.py)
- Added a regression test, test_adapter_export_rejects_string_tensors, asserting that export_adapter raises on tensor(string) parameters and leaves no file on disk. (onnxruntime/test/python/onnxruntime_test_python.py)

These changes collectively make LoRA adapter handling in ONNX Runtime safer, more robust, and easier to use from Python.