Skip to content

Downgrade libstdc++ so we can run inside old glibc environments#4567

Merged
TheMarex merged 1 commit intomasterfrom
downgrade_libstdc++
Oct 5, 2017
Merged

Downgrade libstdc++ so we can run inside old glibc environments#4567
TheMarex merged 1 commit intomasterfrom
downgrade_libstdc++

Conversation

@danpat
Copy link
Copy Markdown
Member

@danpat danpat commented Oct 2, 2017

Issue

When trying to use the published OSRM node modules inside some environments (i.e. AWS Lambda), we get errors like:

Error: /usr/local/lib64/node-v4.3.x/lib/libstdc++.so.6: version 'GLIBCXX_3.4.21' not found (required by /var/task/node_modules/osrm/lib/binding/node_osrm.node)

This PR downgrades us to libstdc++-4.9-dev, and adds an override for a libstdc++ internal symbol that we don't use (but which causes a dependency on GLIBCXX_3.9.20, which is higher than we want).

The goal is to only use symbols from GLIBCXX_3.4.19 or below, allowing us to run on environments with libstdc++-4.8 (which is quite a lot of them). We can't build directly against libstdc++-4.8 any more (we use newer language features), but we can effectively cross-compile for it, which is what this PR does.

Tasklist

  • review
  • adjust for comments

@danpat
Copy link
Copy Markdown
Member Author

danpat commented Oct 2, 2017

Ugh. So, I've tried building against libstdc++-4.9-dev - there are a small handful of leaky symbols that put GLIBCXX_3.4.21 symbol versions in our binaries.

I've also tried using libstdc++-4.8-dev and backporting bits of our code that aren't compatible. This is mostly easy, except that I've hit:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56278

which is affecting include/util/string_view.hpp with the following compile error:

[ 34%] Building CXX object CMakeFiles/EXTRACTOR.dir/src/extractor/edge_based_graph_factory.cpp.o
In file included from /root/osrm-backend/src/extractor/edge_based_graph_factory.cpp:1:
In file included from /root/osrm-backend/include/extractor/edge_based_graph_factory.hpp:6:
In file included from /root/osrm-backend/include/extractor/compressed_edge_container.hpp:4:
In file included from /root/osrm-backend/include/extractor/segment_data_container.hpp:15:
In file included from /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/unordered_map:47:
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/bits/hashtable.h:268:7: error: static_assert failed "Cache the hash code or make functors involved in hash code and
      bucket index computation default constructible"
      static_assert(__if_hash_not_cached<
      ^             ~~~~~~~~~~~~~~~~~~~~~
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/bits/unordered_set.h:96:18: note: in instantiation of template class 'std::_Hashtable<boost::basic_string_ref<char,
      std::char_traits<char> >, boost::basic_string_ref<char, std::char_traits<char> >, std::allocator<boost::basic_string_ref<char, std::char_traits<char> > >,
      std::__detail::_Identity, std::equal_to<boost::basic_string_ref<char, std::char_traits<char> > >, std::hash< ::osrm::util::StringView>, std::__detail::_Mod_range_hashing,
      std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >' requested here
      _Hashtable _M_h;
                 ^
/root/osrm-backend/include/extractor/suffix_table.hpp:39:42: note: in instantiation of template class 'std::unordered_set<boost::basic_string_ref<char, std::char_traits<char> >,
      std::hash< ::osrm::util::StringView>, std::equal_to<boost::basic_string_ref<char, std::char_traits<char> > >, std::allocator<boost::basic_string_ref<char, std::char_traits<char>
      > > >' requested here
    std::unordered_set<util::StringView> suffix_set;
                                         ^

I've run out of time today to dig any deeper on this one. @oxidase or @springmeyer, if you have any great ideas during EU time, I'm all ears. /cc @TheMarex in case you know a quick way to re-write string_view.hpp to work with libstdc++-4.8

@daniel-j-h
Copy link
Copy Markdown
Member

Not sure downgrading to libstdc++-4.8 makes sense; it's the stdlib which shipped with gcc in 2013. We're not only missing out on C++14 stdlib support but four years of bug fixes and improvements.

Also you might need to re-build our dependencies in mason - or are all our current dependencies built against libstdc++4.8 already?

What about building a statically linked fat binary instead?
Would this work with how Node.js addons need to be set up?

@daniel-j-h
Copy link
Copy Markdown
Member

Ah, also noting: the underlying problem isn't the C++ stdlib but the available libc version. You could build the latest stdlib against an old libc version, and then ship this stdlib to your environment.

@springmeyer
Copy link
Copy Markdown
Contributor

Also you might need to re-build our dependencies in mason - or are all our current dependencies built against libstdc++4.8 already?

Boost 1.64 and 1.65 in mason are built against 4.8 to enable working on lambda/older systems. See https://github.com/mapbox/mason/blob/master/scripts/boost_libdate_time/1.64.0/.travis.yml#L15. I see that OSRM is currently using 1.63, so if you switch to 64/65 then the GLIBXX dep I think should drop from GLIBCXX_3.4.21 to GLIBCXX_3.4.19.

@oxidase
Copy link
Copy Markdown
Contributor

oxidase commented Oct 3, 2017

@danpat what about statically linking? cmake .. -DBUILD_SHARED_LIBS=On -DENABLE_NODE_BINDINGS=On -DCMAKE_EXE_LINKER_FLAGS="-static-libgcc -static-libstdc++"

  • with
ll Release/node_osrm.node
-rwxr-xr-x 1 miha miha 180352 Oct  3 12:52 Release/node_osrm.node
readelf -Wd Release/node_osrm.node | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libosrm.so]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_system.so.1.62.0]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
strings Release/node_osrm.node | grep GLIBC
GLIBC_2.2.5
GLIBC_2.14
GLIBC_2.4
  • without
ll Release/node_osrm.node
-rwxr-xr-x 1 miha miha 152584 Oct  3 12:54 Release/node_osrm.node
readelf -Wd Release/node_osrm.node | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libosrm.so]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_system.so.1.62.0]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
strings Release/node_osrm.node | grep GLIBC
GLIBCXX_3.4
GLIBCXX_3.4.21
GLIBC_2.2.5
GLIBC_2.14
GLIBC_2.4
GLIBCXX_3.4.20

@TheMarex
Copy link
Copy Markdown
Member

TheMarex commented Oct 3, 2017

Linking stdlibc++ statically by default has the downsides that security fixes to stdlibc++ would mean we need to roll patch releases. Maybe we could publish a variant MAJOR.MINOR.PATCH-static using the static linking trick @oxidase suggested.

Polyfill code has the added maintenance overhead and breaks easy ("oh shiny a new C++14 API") so it would be great if we could avoid that.

@danpat
Copy link
Copy Markdown
Member Author

danpat commented Oct 4, 2017

With the addition of boost 1.65, this PR works as expected - the OSRM node module can be used inside the AWS Lambda environment, which was the original goal.

So, now to the static-vs-dynamic+hacks debate.

There is no perfect solution here - either approach has compromises. We either:

  1. Static link, which prevents security updates and bloats the binary, or;
  2. Dynamic link, which requires we build against libstdc++-4.9-dev and the __throw_out_of_range monkey patch

I'm inclined to vote for (2), because:

  1. It's working.
  2. It's good enough for BitCoin
  3. Doesn't require any packaging gymnastics to support both static/non-static editions

I acknowledge that:

  1. We have to use a slightly older libstdc++
  2. There is a monkey patch to maintain.

My hope is that AWS will update their NodeJS environment at some point (last update was in 2015), and we can simply remove this change.

I think being able to run OSRM inside the AWS Lambda environment is important enough to warrant downgrading to libstdc++-4.9. The changes that landed with libstdc++-5 don't look important enough that we can't afford to downgrade: https://gcc.gnu.org/onlinedocs/libstdc++/manual/api.html#api.rel_50. Improved C++14 support and TS implementations. might get annoying, but 4.9 has much of C++14 that we want to use.

@danpat danpat changed the title DO NOT MERGE (yet): Try downgrading libstdc++ so we can run inside old glibc environments Downgrade libstdc++ so we can run inside old glibc environments Oct 4, 2017
@daniel-j-h
Copy link
Copy Markdown
Member

Can't you ship a more recent stdlib to AWS Lambda environment separately? The problem of an old stdlib available in the environment conceptually has nothing to do with how we build OSRM, no?

What about not touching OSRM and putting a workaround for AWS Lambda in place in your deployment code which replaces the system-default old stdlib with a more recent one? The stdlib then only has to be compatible with AWS Lambda's libc, no?

The benefit would be

  • we don't have to add workarounds to osrm for a specific downstream consumers
  • we don't have to constrain ourselves to old APIs and ABIs and maintain stdlib patches

with the downsides that you have to upgrade the AWS Lambda stdlib manually until they do it themselves. But that's fair and should put pressure on AWS.

@springmeyer
Copy link
Copy Markdown
Contributor

Can't you ship a more recent stdlib to AWS Lambda environment separately?

Yes, LD_PRELOAD + shipping our own version would be idea. Unfortuntely I've attempted this along with @rclark and we found it was not possible. We were able to bundle a more recent libstc++ that worked on lambda however we could not get the node.js lamdba launcher to respect LD_PRELOAD. Hit me up for more details if you want, but I think this is a dead end (hence why I've also spent time on the (2) solution above and think that is the best route.

Copy link
Copy Markdown
Member

@TheMarex TheMarex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bad solution but at least one that is rather contained and easily removed. Guess we are now enterprise ready.

@@ -1,6 +1,6 @@
{
"name": "osrm",
"version": "5.12.0-roundaboutexits.1",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be the same value as master. Would conflict with the actual 5.13 version bump in #4571

@@ -0,0 +1,31 @@
#ifdef GLIBC_WORKAROUND
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a great solution but at least it's a contained change. Yay CentOS support.

…we can run inside

old environments (CentOS, AWS Linux, AWS Lambda, etc)
@danpat danpat force-pushed the downgrade_libstdc++ branch from 426b027 to 8862dc1 Compare October 4, 2017 18:51
@springmeyer
Copy link
Copy Markdown
Contributor

Before merging this I would recommend testing again without the GLIBC_WORKAROUND just to confirm that it is truly needed after the boost package change. If it is needed, 👍 to this PR. But it would be great to confirm it is.

@danpat
Copy link
Copy Markdown
Member Author

danpat commented Oct 5, 2017

@springmeyer ja, I confirmed this - the workaround is still required. Here's the symbol version dump with GLIBC_WORKAROUND turned off:

# strings src/nodejs/Release/node_osrm.node  | grep GLIBCXX
GLIBCXX_3.4
GLIBCXX_3.4.18
GLIBCXX_3.4.11
GLIBCXX_3.4.9
GLIBCXX_3.4.20
GLIBCXX_3.4.14
GLIBCXX_3.4.15
# readelf -a src/nodejs/Release/node_osrm.node  | grep GLIBCXX_3.4.20
   262: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND _ZSt24__throw_out_of_rang@GLIBCXX_3.4.20 (14)

@springmeyer
Copy link
Copy Markdown
Contributor

Thanks for confirming @danpat 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants