Ruby: Only add RepeatedField instance to cache once fully initialized by casperisfine · Pull Request #13054 · protocolbuffers/protobuf

casperisfine · 2023-06-14T14:09:21Z

Fix: #11968

Since calling a method on WeakMap allow the RubyVM to switch threads, adding a non-fully initialized object into the cache allow another thread from using it.

Note: with this fix, the other two thread may create their own wrapper for the same underlying Array. We don't know if it's a big deal or not.

Co-Authored-By: @peterzhu2118

cc @haberman

Fix: protocolbuffers#11968 Since calling a method on WeakMap allow the RubyVM to switch threads, adding a non-fully initialized object into the cache allow another thread from using it. Note: with this fix, the other two thread may create their own wrapper for the same underlying Array. We don't know if it's a big deal or not. Co-Authored-By: Peter Zhu <[email protected]>

google-cla · 2023-06-14T14:09:29Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

casperisfine · 2023-06-14T14:17:24Z

Ah damn it. Crediting @peterzhu2118 as co-author is making the CLA bot nervous -_-.

Peter could you sign the CLA?

peterzhu2118 · 2023-06-14T14:44:36Z

I signed the CLA.

haberman · 2023-06-14T15:17:51Z

Thanks so much for debugging and fixing this.

This pattern also occurs in the following place, which will also need the same change:

protobuf/ruby/ext/google/protobuf_c/map.c

Lines 96 to 105 in 2cf94fa

    
           ObjectCache_Add(map, val); 
        
           TypedData_Get_Struct(val, Map, &Map_type, self); 
        
           self->map = map; 
        
           self->arena = arena; 
        
           self->key_type = key_type; 
        
           self->value_type_info = value_type; 
        
           if (self->value_type_info.type == kUpb_CType_Message) { 
        
             const upb_MessageDef* val_m = self->value_type_info.def.msgdef; 
        
             self->value_type_class = Descriptor_DefToClass(val_m); 
        
           }

I wish there was a way to atomically perform a "get or insert" operation on the WeakMap, so we could avoid the duplicate wrapper situation. Duplicate wrappers are mostly benign, but they are observable by the user, eg. with BasicObject#equal?. If we could write our own WeakMap it would be possible to implement "get or insert", but it appears that it's not feasible to make a custom WeakMap unless we have rb_objspace_garbage_object_p.

Would it be possible to add a "get or insert" operation to WeakMap upstream, so we could at least avoid duplicate wrappers going forward?

fowles · 2023-06-14T15:19:05Z

After this fix in debug builds we may start to fail the assert here:

protobuf/ruby/ext/google/protobuf_c/protobuf.c

Line 406 in 2cf94fa

PBRUBY_ASSERT(ObjectCache_Get(key) == val);

I think the best solution is to have ObjectCache_Add return the VALUE that was inserted and then the caller can do a try insert and get back the official one that won the race.

fowles · 2023-06-14T15:26:39Z

After this fix in debug builds we may start to fail the assert here:

protobuf/ruby/ext/google/protobuf_c/protobuf.c

Line 406 in 2cf94fa

PBRUBY_ASSERT(ObjectCache_Get(key) == val);

I think the best solution is to have ObjectCache_Add return the VALUE that was inserted and then the caller can do a try insert and get back the official one that won the race.

Ok, that won't quite work since a racing thread can call item_set and clobber the existing value. An insert or return operation (like @haberman asked for is really the right answer here). I think we can survive the race condition of duplicate wrappers although it is a leaky abstraction, but we do need to remove the assert at the end

casperisfine · 2023-06-14T15:43:54Z

Would it be possible to add a "get or insert" operation to WeakMap upstream, so we could at least avoid duplicate wrappers going forward?

You can implement that yourself in Ruby with a mutex. If I read the code correctly the "secondary map" codepath uses a mutex, so similar, but it's easier to do in ruby (untested pseudo-code):

class MyWeakMap
  def initialize
    @map = ObjectSpace::WeakMap.new
    @mutex = Mutex.new
  end

  def getset(key, new_value)
    return old_value if old_value = @map[key]
    @mutex.synchronize do
      @map[key] ||= new_value
    end
  end
end

casperisfine · 2023-06-14T16:02:58Z

the other two thread may create their own wrapper for the same underlying Array. We don't know if it's a big deal or not.

Reading the code again, I think this case was already possible before since we were calling RepeatedField_alloc(cRepeatedField) before ObjectCache_Add().

Even with getset, both thread may need to alloc the VALUE and eventually one will notice it ran into a race condition and may abandon its own instance and use the other.

All this to say, I think this is a different concern, and shouldn't prevent the fix from being merged.

fowles · 2023-06-14T16:41:19Z

All this to say, I think this is a different concern, and shouldn't prevent the fix from being merged.

Agreed, but the asserts are a really issue that we should remove

Superseeds: protocolbuffers#13054 The object cache is fundamentally subject to race conditions. Objects must be created before they are registered into the cache, so if two threads try to create the same object, we'll inevitably end up with two instances mapping to the same underlying memory. To entirely prevent that we'd need a lot of extra locking which I don't think is really worth it compared to a few useless allocations. Instead we can replace `ObjectCache_Add` by a `getset` type of operation, the extra instance is still created, but the later threads will receive the "canonical" instance and will be able to abandon their duplicated instance. Additionally, this PR moves the ObjectCache implementation in Ruby, as it's much easier to debug there, and the performance difference is negligible. The `ObjectCache` instance is also exposed as `Google::Protobuf::OBJECT_CACHE` to better allow to debug potential memory issues.

fowles · 2023-06-16T17:12:33Z

Closing this in favor of #13075 13075

Superseeds: protocolbuffers#13054 The object cache is fundamentally subject to race conditions. Objects must be created before they are registered into the cache, so if two threads try to create the same object, we'll inevitably end up with two instances mapping to the same underlying memory. To entirely prevent that we'd need a lot of extra locking which I don't think is really worth it compared to a few useless allocations. Instead we can replace `ObjectCache_Add` by a `getset` type of operation, the extra instance is still created, but the later threads will receive the "canonical" instance and will be able to abandon their duplicated instance. Additionally, this PR moves the ObjectCache implementation in Ruby, as it's much easier to debug there, and the performance difference is negligible. The `ObjectCache` instance is also exposed as `Google::Protobuf::OBJECT_CACHE` to better allow to debug potential memory issues.

casperisfine requested a review from a team as a code owner June 14, 2023 14:09

casperisfine requested review from JasonLunn and removed request for a team June 14, 2023 14:09

casperisfine mentioned this pull request Jun 14, 2023

ruby: segfault iterating over protobuf objects #11968

Closed

fowles requested review from fowles and removed request for JasonLunn June 14, 2023 15:17

haberman added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Jun 14, 2023

fowles added the ruby label Jun 14, 2023

casperisfine mentioned this pull request Jun 15, 2023

Ruby: Refactor the object cache to better account for race conditions #13075

Closed

fowles closed this Jun 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ruby: Only add RepeatedField instance to cache once fully initialized#13054

Ruby: Only add RepeatedField instance to cache once fully initialized#13054
casperisfine wants to merge 1 commit intoprotocolbuffers:mainfrom
casperisfine:fix-uninitialized-repeated-fields-in-obj-cache

casperisfine commented Jun 14, 2023

Uh oh!

google-cla Bot commented Jun 14, 2023

Uh oh!

casperisfine commented Jun 14, 2023

Uh oh!

peterzhu2118 commented Jun 14, 2023

Uh oh!

haberman commented Jun 14, 2023

Uh oh!

fowles commented Jun 14, 2023

Uh oh!

fowles commented Jun 14, 2023

Uh oh!

casperisfine commented Jun 14, 2023

Uh oh!

casperisfine commented Jun 14, 2023

Uh oh!

fowles commented Jun 14, 2023

Uh oh!

fowles commented Jun 16, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

casperisfine commented Jun 14, 2023

Uh oh!

google-cla Bot commented Jun 14, 2023

Uh oh!

casperisfine commented Jun 14, 2023

Uh oh!

peterzhu2118 commented Jun 14, 2023

Uh oh!

haberman commented Jun 14, 2023

Uh oh!

fowles commented Jun 14, 2023

Uh oh!

fowles commented Jun 14, 2023

Uh oh!

casperisfine commented Jun 14, 2023

Uh oh!

casperisfine commented Jun 14, 2023

Uh oh!

fowles commented Jun 14, 2023

Uh oh!

fowles commented Jun 16, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants