Hi,
This could be a dupe of #5601, so apologies if that is the case, but our data structures are not deeply nested at all, so I think the cause is different. We ran a single-node cluster without issues for several months, then a change in my Python code seemingly caused it to start crashing several times a day. We are running 2.2.5~0trusty; a sample backtrace is below:
2016-05-24T00:43:37.142876355 228.702036s error: Backtrace:
2016-05-24T00:43:37.337086495 228.896249s error: Tue May 24 00:43:37 2016\n\n1 [0xa5ce10]: backtrace_t::backtrace_t() at ??:?\n2 [0xa5d1a3]: format_backtrace(bool) at ??:?\n3 [0xc9f6b5]: report_fatal_error(char const*, int, char const*, ...) at ??:?\n4 [0x97f45f]: linux_thread_pool_t::fatal_signal_handler(int, siginfo_t*, void*) at ??:?\n5 [0x7f91120e2340]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f91120e2340] at 0x7f91120e2340 (/lib/x86_64-linux-gnu/libpthread.so.0)\n6 [0x9fd156]: key_range_t::intersection(key_range_t const&) const at ??:?\n7 [0x8fff50]: hash_region_t<key_range_t> region_intersection<key_range_t>(hash_region_t<key_range_t> const&, hash_region_t<key_range_t> const&) at ??:?\n8 [0x910fba]: bool rdb_r_shard_visitor_t::rangey_read<changefeed_stamp_t>(changefeed_stamp_t const&) const at ??:?\n9 [0x8fcdb2]: read_t::shard(hash_region_t<key_range_t> const&, read_t*) const at ??:?\n10 [0xb29d52]: void table_query_client_t::dispatch_immediate_op<read_t, fifo_enforcer_sink_t::exit_read_t, read_response_t>(void (primary_query_client_t::*)(fifo_enforcer_sink_t::exit_read_t*), void (primary_query_client_t::*)(read_t const&, read_response_t*, order_token_t, fifo_enforcer_sink_t::exit_read_t*, signal_t*), read_t const&, read_response_t*, order_token_t, signal_t*)::{lambda(hash_region_t<key_range_t> const&, std::set<table_query_client_t::relationship_t*, std::less<table_query_client_t::relationship_t*>, std::allocator<table_query_client_t::relationship_t*> > const&)#1}::operator()(hash_region_t<key_range_t> const&, std::set<table_query_client_t::relationship_t*, std::less<table_query_client_t::relationship_t*>, std::allocator<table_query_client_t::relationship_t*> > const&) const at ??:?\n11 [0xb2a1cf]: void region_map_t<std::set<table_query_client_t::relationship_t*, std::less<table_query_client_t::relationship_t*>, std::allocator<table_query_client_t::relationship_t*> > >::visit<void table_query_client_t::dispatch_immediate_op<read_t, 
fifo_enforcer_sink_t::exit_read_t, read_response_t>(void (primary_query_client_t::*)(fifo_enforcer_sink_t::exit_read_t*), void (primary_query_client_t::*)(read_t const&, read_response_t*, order_token_t, fifo_enforcer_sink_t::exit_read_t*, signal_t*), read_t const&, read_response_t*, order_token_t, signal_t*)::{lambda(hash_region_t<key_range_t> const&, std::set<table_query_client_t::relationship_t*, std::less<table_query_client_t::relationship_t*>, std::allocator<table_query_client_t::relationship_t*> > const&)#1}>(hash_region_t<key_range_t> const&, void table_query_client_t::dispatch_immediate_op<read_t, fifo_enforcer_sink_t::exit_read_t, read_response_t>(void (primary_query_client_t::*)(fifo_enforcer_sink_t::exit_read_t*), void (primary_query_client_t::*)(read_t const&, read_response_t*, order_token_t, fifo_enforcer_sink_t::exit_read_t*, signal_t*), read_t const&, read_response_t*, order_token_t, signal_t*)::{lambda(hash_region_t<key_range_t> const&, std::set<table_query_client_t::relationship_t*, std::less<table_query_client_t::relationship_t*>, std::allocator<table_query_client_t::relationship_t*> > const&)#1} const&) const::{lambda(key_range_t::right_bound_t const&, key_range_t::right_bound_t const, range_map_t<unsigned long, std::set<table_query_client_t::relationship_t*, std::less<table_query_client_t::relationship_t*>, std::allocator<table_query_client_t::relationship_t*> > > const&)#1}::operator()(key_range_t::right_bound_t const, key_range_t::right_bound_t const, range_map_t<unsigned long, std::set<table_query_client_t::relationship_t*, std::less<table_query_client_t::relationship_t*>, std::allocator<table_query_client_t::relationship_t*> > > const) const::{lambda(unsigned long, unsigned long, std::set<table_query_client_t::relationship_t*, std::less<table_query_client_t::relationship_t*>, std::allocator<table_query_client_t::relationship_t*> > const&)#1}::operator()(unsigned long, unsigned long, std::set<table_query_client_t::relationship_t*, 
std::less<table_query_client_t::relationship_t*>, std::allocator<table_query_client_t::relationship_t*> > const&) const at ??:?\n12 [0xb2a92f]: void table_query_client_t::dispatch_immediate_op<read_t, fifo_enforcer_sink_t::exit_read_t, read_response_t>(void (primary_query_client_t::*)(fifo_enforcer_sink_t::exit_read_t*), void (primary_query_client_t::*)(read_t const&, read_response_t*, order_token_t, fifo_enforcer_sink_t::exit_read_t*, signal_t*), read_t const&, read_response_t*, order_token_t, signal_t*) at ??:?\n13 [0xb2510f]: table_query_client_t::read(read_t const&, read_response_t*, order_token_t, signal_t*) at ??:?\n14 [0x8ba0c5]: ql::changefeed::range_sub_t::to_stream(ql::env_t*, std::string, namespace_interface_t*, mailbox_addr_t<void (ql::changefeed::stamped_msg_t)> const&, counted_t<ql::datum_stream_t>, scoped_ptr_t<ql::changefeed::subscription_t>&&, ql::backtrace_id_t) at ??:?\n15 [0x89a0b9]: ql::changefeed::client_t::new_stream(ql::env_t*, counted_t<ql::datum_stream_t>, ql::configured_limits_t, ql::datum_t const&, bool, uuid_u const&, ql::backtrace_id_t, std::string const&, boost::variant<ql::changefeed::keyspec_t::range_t, ql::changefeed::keyspec_t::limit_t, ql::changefeed::keyspec_t::point_t, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) at ??:?\n16 [0x7e74d2]: real_table_t::read_changes(ql::env_t*, counted_t<ql::datum_stream_t>, ql::configured_limits_t, ql::datum_t const&, bool, boost::variant<ql::changefeed::keyspec_t::range_t, ql::changefeed::keyspec_t::limit_t, 
ql::changefeed::keyspec_t::point_t, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>&&, ql::backtrace_id_t, std::string const&) at ??:?\n17 [0x841377]: ql::changes_term_t::eval_impl(ql::scope_env_t*, ql::args_t*, ql::eval_flags_t) const at ??:?\n18 [0x7e4d00]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n19 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n20 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n21 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n22 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n23 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n24 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n25 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n26 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n27 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n28 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n29 [0x7e2dfa]: 
ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n30 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n31 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n32 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n33 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n34 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n35 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n36 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n37 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n38 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n39 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n40 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n41 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n42 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n43 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n44 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n45 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) 
const at ??:?\n46 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n47 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n48 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n49 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n50 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n51 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n52 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n53 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n54 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n55 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n56 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n57 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n58 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n59 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n60 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n61 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n62 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n63 [0x8e19f6]: 
ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n64 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n65 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n66 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n67 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n68 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n69 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n70 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n71 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n72 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n73 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n74 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n75 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n76 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n77 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n78 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n79 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n80 [0x8e1dcf]: 
ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n81 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n82 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n83 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n84 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n85 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n86 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n87 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n88 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n89 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n90 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n91 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n92 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n93 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n94 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n95 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n96 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n97 [0x7e2dfa]: ql::op_term_t::maybe_grouped_data(ql::scope_env_t*, 
ql::argvec_t*, ql::eval_flags_t, counted_t<ql::grouped_data_t>*, scoped_ptr_t<ql::val_t>*) const at ??:?\n98 [0x7e4759]: ql::op_term_t::term_eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n99 [0x8e19f6]: ql::runtime_term_t::eval_on_current_stack(ql::scope_env_t*, ql::eval_flags_t) const at ??:?\n100 [0x8e1dcf]: ql::runtime_term_t::eval(ql::scope_env_t*, ql::eval_flags_t) const at ??:?
As the timestamps of the backtrace log entries below show, the server has crashed 14 times since May 19:
2016-05-19T21:00:53.317031140 3521882.845827s error: Backtrace:
2016-05-20T22:25:52.076356956 9842.432243s error: Backtrace:
2016-05-21T15:22:14.706357252 59901.255012s error: Backtrace:
2016-05-21T15:28:14.912214221 241.976314s error: Backtrace:
2016-05-22T17:24:59.430987482 93225.933973s error: Backtrace:
2016-05-23T14:03:02.058563867 16594.275001s error: Backtrace:
2016-05-23T14:08:21.160673814 274.767481s error: Backtrace:
2016-05-23T15:03:26.490427623 3219.926406s error: Backtrace:
2016-05-23T15:08:24.114893263 277.364548s error: Backtrace:
2016-05-23T23:56:08.358959810 31581.539294s error: Backtrace:
2016-05-24T00:00:22.318075819 154.145026s error: Backtrace:
2016-05-24T00:37:46.054522192 2157.820086s error: Backtrace:
2016-05-24T00:37:56.436025567 8.057525s error: Backtrace:
2016-05-24T00:43:37.142876355 228.702036s error: Backtrace:
The crashes began a few days ago. There was no change in data structures at that time. I am using the Python RethinkDB driver with the net_asyncio connection. Initially I used a single connection shared across the entire application. That caused some problems with changefeeds, so I switched to a connection pool, which is when these crashes started. (I also temporarily created one connection per query, but only for a day or two.)
So, I think my issue is that I need to be a better Python programmer :D. However, I think it's an issue for RethinkDB if a poorly written Python client can fatally crash the server.
The offending Python code is available here: https://gist.github.com/skolsuper/a48e5077b28a858f90806b736032abcf
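
As a rough illustration of the pooling approach described above, here is a minimal asyncio connection pool. This is a sketch and not the code from the gist: `ConnectionPool`, `fake_connect`, and `demo` are hypothetical names, and `fake_connect` is a stand-in for whatever coroutine actually opens a driver connection. It only shows the general pattern of handing each query its own connection from a bounded set and returning it afterwards.

```python
import asyncio
from contextlib import asynccontextmanager


class ConnectionPool:
    """Hypothetical bounded pool; `connect` is any coroutine function
    that returns a new connection object."""

    def __init__(self, connect, size=4):
        self._connect = connect
        self._size = size
        self._idle = asyncio.Queue()  # connections waiting to be reused
        self._created = 0             # connections opened so far

    @asynccontextmanager
    async def acquire(self):
        # Open connections lazily until the pool is full,
        # then wait for one to be handed back.
        if self._idle.empty() and self._created < self._size:
            self._created += 1
            try:
                conn = await self._connect()
            except Exception:
                self._created -= 1  # failed to open, free the slot
                raise
        else:
            conn = await self._idle.get()
        try:
            yield conn
        finally:
            # Always return the connection, even if the query raised.
            self._idle.put_nowait(conn)


async def fake_connect():
    # Stand-in for the real driver's connect call (assumption for this sketch).
    return object()


async def demo():
    # Build the pool inside the running event loop.
    pool = ConnectionPool(fake_connect, size=2)
    seen = set()

    async def worker():
        async with pool.acquire() as conn:
            seen.add(id(conn))
            await asyncio.sleep(0)  # pretend to run a query

    await asyncio.gather(*(worker() for _ in range(10)))
    return len(seen)  # number of distinct connections actually used


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a pool like this, no two coroutines use the same connection at the same time, which is the main difference from sharing a single connection across the whole application.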