Fix large field number parsing by using unsigned right shift#3509
Fix large field number parsing by using unsigned right shift#3509oldergod merged 4 commits intosquare:masterfrom
Conversation
This fixes an issue where Wire fails to parse protobuf data with large field numbers (greater than 2^29) due to incorrect signed right shift operations that cause sign extension. Changed shr to ushr in nextTag() and skipGroup() methods to ensure proper handling of large field numbers. Only two lines modified: - Line 184: tag = tagAndFieldEncoding shr TAG_FIELD_ENCODING_BITS -> tag = (tagAndFieldEncoding ushr TAG_FIELD_ENCODING_BITS) - Line 250: val tag = tagAndFieldEncoding shr TAG_FIELD_ENCODING_BITS -> val tag = (tagAndFieldEncoding ushr TAG_FIELD_ENCODING_BITS)
|
Thanks for the PR. Is writing a test for this gonna be difficult? |
- Change signed right shift to unsigned right shift in ProtoReader to fix parsing of large field numbers (numbers greater than 0x10000000) :) - Add test case and fixture for large field number validation
|
Thanks! I’ve confirmed locally that the issue occurs when field numbers are >= 0x10000000. Using signed right shift (shr) causes sign extension, which makes tag/tagAndField decode incorrectly. This is exactly what the PR fixes by switching to unsigned right shift (ushr). I’ve also added a unit test specifically covering field numbers >= 0x10000000 to verify the fix. |
|
Thank you! and I'll be able to merge it 👍 |
| private val adapter = createRuntimeMessageAdapter( | ||
| LargeFieldMessage::class.java, | ||
| "square.github.io/wire/unknown", | ||
| Syntax.PROTO_2, | ||
| LargeFieldNumberTest::class.java.classLoader, | ||
| ) |
There was a problem hiding this comment.
You could access the generated adapter as well, I think?
| private val adapter = createRuntimeMessageAdapter( | |
| LargeFieldMessage::class.java, | |
| "square.github.io/wire/unknown", | |
| Syntax.PROTO_2, | |
| LargeFieldNumberTest::class.java.classLoader, | |
| ) | |
| private val adapter = LargeFieldMessage.ADAPTER |
|
It's done! Thanks for the opportunity. I'm really happy to contribute to this project. :P |
refactor: use direct adapter reference instead of factory
|
Hmm, is this actually a problem? It looks like this PR addresses a tag value which is outside of the supported bound for the wire format anyway? |
Thanks for raising the important question about large field number support. You noticed the Google documentation limit comments, which prompted me to investigate deeply. I found some key points:
So your question is very valuable, it helped me discover that Google's own comments contain calculation errors. The actual limit is 2^29-1, not 2^28-1, so our fix is reasonable. |
|
@oldergod Hi! Just checking whether you had a chance to look at the latest updates. Let me know if anything else needs improvement. |
|
@xi0yu in total honesty, I felt a cold shower reading an LLM text. We have some interop tests in |
|
Thanks for the feedback. To be honest, because my English isn't very strong, I used an AI tool to help polish the wording to make it clearer. I'll review and rewrite it to make sure it sounds more natural.More importantly, I'll look into the failing interop test regarding the max tag number right away and get back to you with a fix. |
Add test in wire-protoc-compatibility-tests validating Wire and protoc can encode/decode messages with field numbers at the 2^28 boundary: - 268435455 (2^28 - 1, below boundary) - 268435456 (2^28, at boundary) - 536870911 (2^29 - 1, maximum allowed) Remove the duplicate LargeFieldNumberTest from wire-tests which only tested Wire's internal encoding. The new interop test validates actual Wire ↔ protoc compatibility as requested by maintainers.
|
@oldergod I've added the interop test in wire-protoc-compatibility-tests. Tests pass on my machine. Could you share what error you saw when you tested? Want to make sure I haven't missed anything. |
|
Hey, any thoughts on this when you get a chance? |
This fixes an issue where Wire fails to parse protobuf data when encountering very large field numbers (for example, 290,848,974).
The failure is caused by using a signed right shift (shr), which introduces sign extension for large values, leading to incorrect tag extraction.
Changed shr to ushr in nextTag() and skipGroup() methods to ensure proper handling of large field numbers.
Only two lines modified: