Skip to content

Address Lora shortcomings#28801

Open
yuslepukhin wants to merge 5 commits into
mainfrom
yuslepukhin/lora_msrc
Open

Address Lora shortcomings#28801
yuslepukhin wants to merge 5 commits into
mainfrom
yuslepukhin/lora_msrc

Conversation

@yuslepukhin
Copy link
Copy Markdown
Member

This pull request improves the robustness and safety of LoRA adapter handling in ONNX Runtime, focusing on exception safety, memory management, and secure parameter serialization. The most important changes include refactoring adapter loading to provide a strong exception guarantee, fixing a memory lifetime issue in the Python bindings, and ensuring that string tensors cannot be exported (which previously could cause security and correctness problems). Additional regression tests are added to verify these behaviors.

C++ Core: Exception Safety and Refactoring

  • Refactored LoraAdapter::Load and LoraAdapter::MemoryMap to build and validate new adapter state in local variables before committing, ensuring that failed loads leave the object unchanged (strong exception guarantee). The parameter map is now constructed via a new BuildParamsValues method, which is side-effect free and returns the new map for the caller to commit. (onnxruntime/core/session/lora_adapters.cc, onnxruntime/core/session/lora_adapters.h) [1] [2] [3] [4] [5]

Python Bindings: Memory Lifetime and API Safety

  • Fixed a typo in the PyAdapterFormatReaderWriter struct (loaded_adater_loaded_adapter_) and updated the property definition accordingly. (onnxruntime/python/onnxruntime_pybind_lora.cc)
  • Added a py::keep_alive<0, 1> policy to the parameters property, ensuring the parent adapter object is kept alive as long as the returned dictionary of parameter OrtValues is referenced in Python, preventing use-after-free bugs. (onnxruntime/python/onnxruntime_pybind_lora.cc)

Adapter Export: Security and Correctness

  • Updated export_adapter to reject string-typed tensors, preventing serialization of invalid or unsafe memory (such as leaking heap pointers and uninitialized data), and ensuring only supported tensor types are exported. The file is only created after all parameters are validated. (onnxruntime/python/onnxruntime_pybind_lora.cc)

Testing: Regression Coverage

  • Added regression tests to verify that:
    • The new keep-alive policy prevents use-after-free when accessing adapter parameters from Python. (onnxruntime/test/python/onnxruntime_test_python.py)
    • Exporting string tensors is rejected and does not create an adapter file. (onnxruntime/test/python/onnxruntime_test_python.py)

These changes collectively make LoRA adapter handling in ONNX Runtime safer, more robust, and easier to use from Python.

export_adapter wrote tensor.DataRaw() for tensor.SizeInBytes() bytes
regardless of element type. For tensor(string) parameters this copied
the std::string object representation - heap pointers and SSO padding -
directly into Parameter.raw_data, leaking runtime addresses (ASLR
bypass) and uninitialized bytes, and producing an adapter that cannot
be safely loaded (reinterpreting the saved bytes as std::string objects
is undefined behavior).

Reject STRING element type with a clear error and defer opening the
output file until after validation/serialization so a rejected export
does not leave a stray empty file behind.

Test: test_adapter_export_rejects_string_tensors asserts export_adapter
raises on tensor(string) parameters and leaves no file on disk.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens LoRA adapter handling across the core C++ implementation and Python bindings, focusing on stronger exception-safety during adapter loading, safer object lifetimes in Python, and safer adapter export behavior.

Changes:

  • Refactors LoraAdapter loading/memory-mapping to construct validated state locally before committing it to the object (strong exception guarantee).
  • Fixes Python binding lifetime hazards by tying the returned parameters dict to its owning AdapterFormat instance via py::keep_alive.
  • Rejects exporting string tensors from Python adapter export, and adds Python regression tests for both the keep-alive behavior and string-tensor rejection.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
onnxruntime/core/session/lora_adapters.cc Refactors Load/MemoryMap to build a fresh params map locally via BuildParamsValues() before committing state.
onnxruntime/core/session/lora_adapters.h Replaces InitializeParamsValues() with side-effect-free BuildParamsValues() to support strong exception guarantees.
onnxruntime/python/onnxruntime_pybind_lora.cc Fixes loaded_adapter_ typo, adds keep_alive for parameters, and rejects STRING tensors during export.
onnxruntime/test/python/onnxruntime_test_python.py Adds regression tests for Python keep-alive and for rejecting string-tensor adapter export without creating a file.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread onnxruntime/python/onnxruntime_pybind_lora.cc Outdated
pybind11's def_property() does not accept keep_alive directly (static_assert fires). Wrap the getter in py::cpp_function so the policy can be attached, restoring the keep-alive behavior intended by 245ffaf.

Also compute the serialized adapter span before opening the output file, per PR review: a failure inside FinishWithSpan should not leave a stray empty file behind.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment on lines +2129 to +2131
# Drop the AdapterFormat temporary; only `params` keeps a reference.
params = onnxrt.AdapterFormat.read_adapter(file_path).parameters
gc.collect()
Comment on lines 118 to +123
for (auto& [n, value] : reader_writer->parameters_) {
const std::string param_name = py::str(n);
const OrtValue* ort_value = value.cast<OrtValue*>();
const Tensor& tensor = ort_value->Get<Tensor>();
const auto element_type = tensor.GetElementType();
// Reject string tensors: Tensor::DataRaw() for a string tensor points to an
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants