Change the Ruby code generator to emit a serialized proto instead of the DSL (#12319)#12462
Closed
copybara-service[bot] wants to merge 1 commit intomainfrom
Closed
Change the Ruby code generator to emit a serialized proto instead of the DSL (#12319)#12462copybara-service[bot] wants to merge 1 commit intomainfrom
copybara-service[bot] wants to merge 1 commit intomainfrom
Conversation
72db5b1 to
ceb0315
Compare
…the DSL (#12319) This PR removes the DSL from the code generator, in anticipation of splitting the DSL out into a separate package. Given a .proto file like: ```proto syntax = "proto3"; package pkg; message TestMessage { optional int32 i32 = 1; optional TestMessage msg = 2; } ``` Generated code before: ```ruby # Generated by the protocol buffer compiler. DO NOT EDIT! # source: protoc_explorer/main.proto require 'google/protobuf' Google::Protobuf::DescriptorPool.generated_pool.build do add_file("test.proto", :syntax => :proto3) do add_message "pkg.TestMessage" do proto3_optional :i32, :int32, 1 proto3_optional :msg, :message, 2, "pkg.TestMessage" end end end module Pkg TestMessage = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("pkg.TestMessage").msgclass end ``` Generated code after: ```ruby # frozen_string_literal: true # Generated by the protocol buffer compiler. DO NOT EDIT! # source: test.proto require 'google/protobuf' descriptor_data = "\n\ntest.proto\x12\x03pkg\"S\n\x0bTestMessage\x12\x10\n\x03i32\x18\x01 \x01(\x05H\x00\x88\x01\x01\x12\"\n\x03msg\x18\x02 \x01(\x0b\x32\x10.pkg.TestMessageH\x01\x88\x01\x01\x42\x06\n\x04_i32B\x06\n\x04_msgb\x06proto3" begin Google::Protobuf::DescriptorPool.generated_pool.add_serialized_file(descriptor_data) rescue TypeError => e # <compatibility code, see below> end module Pkg TestMessage = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("pkg.TestMessage").msgclass end ``` This change fixes nearly all remaining conformance problems that existed previously. This is a side effect of moving from the DSL (which is lossy) to a serialized descriptor (which preserves all information). ## Backward Compatibility This change should be 100% compatible with Ruby Protobuf >= 3.18.0, released in Sept 2021. Additionally, it should be compatible with all existing users and deployments. However there is some special compatibility code I inserted to achieve this level of backward compatibility. Without the compatibility code, there is an edge case that could break backward compatibility. The existing code is lax in a way that the new code would be more strict. When we use a full serialized descriptor, it will contain a list of all `.proto` files imported by this file (whereas the DSL never added dependencies properly): https://github.com/protocolbuffers/protobuf/blob/dfb71558a2226718dc3bcf5df27cbc11c1f72382/src/google/protobuf/descriptor.proto#L65-L66 `add_serialized_file` will verify that all dependencies listed in the descriptor were previously added with `add_serialized_file`. Generally that should be fine, because the generated code will contain Ruby `require` statements for all dependencies, and the descriptor will fail to load anyway if the types we depend on were not previously defined in the DescriptorPool. But there is a potential for problems if there are ambiguities around file paths. For example, consider the following scenario: ```proto // foo/bar.proto syntax = "proto2"; message Bar {} ``` ```proto // foo/baz.proto syntax = "proto2"; import "bar.proto"; message Baz { optional Bar bar = 1; } ``` If you invoke `protoc` like so, it will work correctly: ``` $ protoc --ruby_out=. -Ifoo foo/bar.proto foo/baz.proto $ RUBYLIB=. ruby baz_pb.rb ``` However if you invoke `protoc` like so, and didn't have any compatibility code, it would fail to load: ``` $ protoc --ruby_out=. -I. -Ifoo foo/baz.proto $ protoc --ruby_out=. -I. -Ifoo foo/bar.proto $ RUBYLIB=foo ruby foo/baz_pb.rb foo/baz_pb.rb:10:in `add_serialized_file': Unable to build file to DescriptorPool: Depends on file 'bar.proto', but it has not been loaded (Google::Protobuf::TypeError) from foo/baz_pb.rb:10:in `<main>' ``` The problem is that `bar.proto` is being referred to by two different canonical names: `bar.proto` and `foo/bar.proto`. This is a user error: each import should always be referred to by a consistent full path. Hopefully user errors of this sort are rare, but it is hard to know without trying. The code in this PR prints a warning using `warn` if we detect that this edge case has occurred. We will plan to remove this compatibility code in the next major version. Closes #12319 COPYBARA_INTEGRATE_REVIEW=#12319 from haberman:ruby-gencode-binary 5c0e8f2 FUTURE_COPYBARA_INTEGRATE_REVIEW=#12319 from haberman:ruby-gencode-binary 5c0e8f2 PiperOrigin-RevId: 524059189
0e62c89 to
bd52d04
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Change the Ruby code generator to emit a serialized proto instead of the DSL (#12319)
This PR removes the DSL from the code generator, in anticipation of splitting the DSL out into a separate package.
Given a .proto file like:
Generated code before:
Generated code after:
This change fixes nearly all remaining conformance problems that existed previously. This is a side effect of moving from the DSL (which is lossy) to a serialized descriptor (which preserves all information).
Backward Compatibility
This change should be 100% compatible with Ruby Protobuf >= 3.18.0, released in Sept 2021. Additionally, it should be compatible with all existing users and deployments. However there is some special compatibility code I inserted to achieve this level of backward compatibility.
Without the compatibility code, there is an edge case that could break backward compatibility. The existing code is lax in a way that the new code would be more strict.
When we use a full serialized descriptor, it will contain a list of all
.protofiles imported by this file (whereas the DSL never added dependencies properly):protobuf/src/google/protobuf/descriptor.proto
Lines 65 to 66 in dfb7155
add_serialized_filewill verify that all dependencies listed in the descriptor were previously added withadd_serialized_file. Generally that should be fine, because the generated code will contain Rubyrequirestatements for all dependencies, and the descriptor will fail to load anyway if the types we depend on were not previously defined in the DescriptorPool.But there is a potential for problems if there are ambiguities around file paths. For example, consider the following scenario:
If you invoke
protoclike so, it will work correctly:However if you invoke
protoclike so, and didn't have any compatibility code, it would fail to load:The problem is that
bar.protois being referred to by two different canonical names:bar.protoandfoo/bar.proto. This is a user error: each import should always be referred to by a consistent full path. Hopefully user errors of this sort are rare, but it is hard to know without trying.The code in this PR prints a warning using
warnif we detect that this edge case has occurred. We will plan to remove this compatibility code in the next major version.Closes #12319
COPYBARA_INTEGRATE_REVIEW=#12319 from haberman:ruby-gencode-binary 5c0e8f2
FUTURE_COPYBARA_INTEGRATE_REVIEW=#12319 from haberman:ruby-gencode-binary 5c0e8f2