feat: implement UpdateMapping and apply meta change to UpdateSchema #561

Open
zhjwpku wants to merge 2 commits into apache:main from zhjwpku:implement_update_mapping

Collaborator

@zhjwpku zhjwpku commented Feb 11, 2026

No description provided.

return NameMapping::Make(std::move(mapped_fields));
}

std::optional<std::string> UpdateMappingFromJsonString(
Contributor

This probably works, but the pattern in this file seems to be to use Result for the *FromJson functions. The intent of the method is also unclear: from reading the code, it seems we are trying to apply updates to the initial JSON representation.

Collaborator Author

Changed to the Result pattern.

Member

@wgtmac wgtmac left a comment

Review Report: PR #561

📄 File: src/iceberg/name_mapping.cc

Java Counterpart: core/src/main/java/org/apache/iceberg/mapping/MappingUtil.java

  • Parity Check: ✅ The UpdateMappingVisitor logically matches the Java UpdateMapping visitor. The logic handling additions and reassigned names perfectly mirrors the Java implementation (e.g. !field.field_id.has_value() || assign_it->second != field.field_id.value() corresponds correctly to Java's nullable integer checks assignedId != null && !Objects.equals(assignedId, field.id())).
  • Style Check: ⚠️ In UpdateMappingVisitor::VisitFields, std::ranges::for_each is used with a lambda that mutates update_assignments. Standard C++ algorithms are less idiomatic when they perform side-effects on captured external variables. A simple range-based for loop for (const auto& field : field_results) is cleaner and more readable here.
  • Logic Check: ✅ Handling of recursive nested structures with std::unique_ptr and implicit conversion to std::shared_ptr is correct and safe.
  • Design & Conciseness: UpdateMappingVisitor does a great job encapsulating the logic internally. std::erase_if correctly translates Sets.difference from Java without reinventing the wheel.

📄 File: src/iceberg/json_serde.cc

Java Counterpart: api/src/main/java/org/apache/iceberg/mapping/NameMappingParser.java

  • Parity Check:
  • Style Check: ⚠️ UpdateMappingFromJsonString returns std::optional<std::string>, which silently ignores JSON parsing or mapping errors. Returning Result<std::string> would allow callers to log or trace errors if they occur, keeping consistent with the codebase's Result<T> error handling pattern.
  • Logic Check:
  • Design & Conciseness: ✅ Consolidating mapping deserialization/serialization into UpdateMappingFromJsonString is a concise design that avoids leaking JSON boilerplate into UpdateSchema.
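
The trade-off between std::optional and a Result-style return can be sketched as follows. This is an illustration under stated assumptions: Error, the Result alias (built here on std::variant), ParseFieldId, and ParseFieldIdOpt are all hypothetical and are not the codebase's actual Result<T> type or any function from this PR.

```cpp
#include <cassert>  // only needed for assert-based usage checks
#include <charconv>
#include <optional>
#include <string>
#include <system_error>
#include <variant>

// Hypothetical error payload; the real codebase's error type may differ.
struct Error {
  std::string message;
};

// Hypothetical Result alias: holds either a value or an Error.
template <typename T>
using Result = std::variant<T, Error>;

// Result-returning parser: on failure the caller learns why it failed.
Result<int> ParseFieldId(const std::string& text) {
  if (text.empty()) {
    return Error{"empty input"};
  }
  int value = 0;
  const char* begin = text.data();
  const char* end = begin + text.size();
  auto [ptr, ec] = std::from_chars(begin, end, value);
  if (ec != std::errc{} || ptr != end) {
    return Error{"not an integer: '" + text + "'"};
  }
  return value;
}

// optional-returning wrapper: every failure collapses to std::nullopt,
// so the reason is lost before the caller can log it.
std::optional<int> ParseFieldIdOpt(const std::string& text) {
  Result<int> result = ParseFieldId(text);
  if (const int* value = std::get_if<int>(&result)) {
    return *value;
  }
  return std::nullopt;
}
```

With the Result form, a caller can inspect std::get<Error>(result).message and log it; with the optional form, "empty input" and "not an integer" are indistinguishable.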

📄 File: src/iceberg/update/update_schema.cc

Java Counterpart: core/src/main/java/org/apache/iceberg/SchemaUpdate.java

  • Parity Check: ✅ In Java, updating the name mapping property is done inside SchemaUpdate#commit() via applyChangesToMetadata(). In C++, embedding this property update inside UpdateSchema::Apply() and returning it within updated_props is the correct equivalent pattern.
  • Style Check:
  • Logic Check:
  • Design & Conciseness:

Summary & Recommendation

  • Comment. The implementation is functionally sound, adheres tightly to Java parity, and cleanly introduces schema name mapping updates. Consider replacing the std::ranges::for_each with a range-based for loop in name_mapping.cc and returning a Result<std::string> instead of std::optional<std::string> in json_serde.cc for better error traceability, though neither is a critical blocker. Excellent work!

@zhjwpku zhjwpku force-pushed the implement_update_mapping branch from 467d908 to 08d3bac Compare February 26, 2026 06:29
Collaborator Author

zhjwpku commented Feb 26, 2026

Review Report: PR #561

📄 File: src/iceberg/name_mapping.cc

Java Counterpart: core/src/main/java/org/apache/iceberg/mapping/MappingUtil.java

  • Parity Check: ✅ The UpdateMappingVisitor logically matches the Java UpdateMapping visitor. The logic handling additions and reassigned names perfectly mirrors the Java implementation (e.g. !field.field_id.has_value() || assign_it->second != field.field_id.value() corresponds correctly to Java's nullable integer checks assignedId != null && !Objects.equals(assignedId, field.id())).
  • Style Check: ⚠️ In UpdateMappingVisitor::VisitFields, std::ranges::for_each is used with a lambda that mutates update_assignments. Standard C++ algorithms are less idiomatic when they perform side-effects on captured external variables. A simple range-based for loop for (const auto& field : field_results) is cleaner and more readable here.

Yeah, changed to a simple for loop.

  • Logic Check: ✅ Handling of recursive nested structures with std::unique_ptr and implicit conversion to std::shared_ptr is correct and safe.
  • Design & Conciseness: UpdateMappingVisitor does a great job encapsulating the logic internally. std::erase_if correctly translates Sets.difference from Java without reinventing the wheel.

📄 File: src/iceberg/json_serde.cc

Java Counterpart: api/src/main/java/org/apache/iceberg/mapping/NameMappingParser.java

  • Parity Check:
  • Style Check: ⚠️ UpdateMappingFromJsonString returns std::optional<std::string>, which silently ignores JSON parsing or mapping errors. Returning Result<std::string> would allow callers to log or trace errors if they occur, keeping consistent with the codebase's Result<T> error handling pattern.

Changed to Result<std::string>.

  • Logic Check:
  • Design & Conciseness: ✅ Consolidating mapping deserialization/serialization into UpdateMappingFromJsonString is a concise design that avoids leaking JSON boilerplate into UpdateSchema.

📄 File: src/iceberg/update/update_schema.cc

Java Counterpart: core/src/main/java/org/apache/iceberg/SchemaUpdate.java

  • Parity Check: ✅ In Java, updating the name mapping property is done inside SchemaUpdate#commit() via applyChangesToMetadata(). In C++, embedding this property update inside UpdateSchema::Apply() and returning it within updated_props is the correct equivalent pattern.
  • Style Check:
  • Logic Check:
  • Design & Conciseness:

Summary & Recommendation

  • Comment. The implementation is functionally sound, adheres tightly to Java parity, and cleanly introduces schema name mapping updates. Consider replacing the std::ranges::for_each with a range-based for loop in name_mapping.cc and returning a Result<std::string> instead of std::optional<std::string> in json_serde.cc for better error traceability, though neither is a critical blocker. Excellent work!
