Requests to osrm-routed can block unexpectedly #6039
Description
A request to osrm-routed can be assigned to a thread that is currently busy processing another request, even when other threads/cores are available. This unnecessarily delays the response or, as discussed in #6033, can make requests appear to hang while they wait for CPU-intensive requests to finish.
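To make the symptom concrete, here is a toy model of the delay described above. It is only a sketch: it does not reproduce how Boost.Asio or osrm-routed actually dispatch work, and the round-robin assignment, the function names and the durations are all assumptions chosen for illustration. It compares pinning each request to a fixed worker (busy or not) with handing each request to whichever worker frees up first.

```python
def finish_times_pinned(durations_ms, workers):
    """Request i is pinned to worker i % workers (hypothetical policy),
    so it waits behind whatever that worker is already running."""
    free_at = [0] * workers          # time at which each worker becomes idle
    finished = []
    for i, duration in enumerate(durations_ms):
        w = i % workers
        free_at[w] += duration       # all requests arrive at time 0
        finished.append(free_at[w])
    return finished


def finish_times_shared(durations_ms, workers):
    """Each request goes to the worker that becomes idle first."""
    free_at = [0] * workers
    finished = []
    for duration in durations_ms:
        w = min(range(workers), key=free_at.__getitem__)
        free_at[w] += duration
        finished.append(free_at[w])
    return finished


# One slow request (8 s, hypothetical) followed by eight quick ones (100 ms each).
durations_ms = [8000] + [100] * 8
pinned = finish_times_pinned(durations_ms, workers=4)
shared = finish_times_shared(durations_ms, workers=4)

print("slowest quick request, pinned:", max(pinned[1:]), "ms")
print("slowest quick request, shared:", max(shared[1:]), "ms")
```

In the pinned model, every fourth quick request lands behind the slow one and cannot finish until it does, even though three workers are idle. That matches the behaviour reported here, where quick requests stall until the long request completes.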
Example
Using the example from #6033, this can be reproduced on the France OSM extract.
Run osrm-routed with --threads 4
Make a request that takes 5-10 seconds to complete:
curl "http://localhost:5000/match/v1/car/2.321508,48.786804;2.321818,48.786879;2.324377,48.788494;2.325426,48.789201;2.329152,48.792156;2.330842,48.793761;2.331307,48.794565;2.331179,48.797515;2.329416,48.800333;2.330373,48.803502;2.325861,48.805695;2.326538,48.808909;2.326434,48.809562;2.322967,48.810585;2.321333,48.81268;2.318826,48.813388;2.313959,48.814762;2.311833,48.816219;2.3085,48.817093;2.307014,48.818689;2.306723,48.819006;2.306392,48.819381;2.30451,48.821406;2.302983,48.823178;2.302381,48.824416;2.302295,48.824626;2.302189,48.8249;2.302258,48.825207;2.302356,48.825429;2.29527,48.826921;2.286896,48.829859;2.278856,48.833171;2.264636,48.834948;2.254055,48.840947;2.253821,48.848681;2.252192,48.850262;2.252257,48.851837;2.252987,48.853035;2.25459,48.85441;2.25607,48.855465;2.25607,48.855465;2.25607,48.855465;2.256419,48.855363;2.261688,48.858715;2.263236,48.859848;2.263947,48.860949;2.264302,48.861604;2.265494,48.863572;2.265515,48.863127;2.26597,48.86359;2.268027,48.865524;2.269227,48.867344;2.271596,48.869799;2.272392,48.870425;2.276031,48.874408;2.280104,48.877403;2.28296,48.881999;2.290968,48.887533;2.299986,48.890925;2.3003,48.891081;2.300603,48.891258;2.300776,48.891359;2.300923,48.891491;2.301007,48.891499;2.30109,48.89158;2.301092,48.891574;2.307688,48.895367;2.309913,48.896152;2.309947,48.896179?geometries=geojson&radiuses=5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;100;5;5;5;5;20;5;5;5;5;5;1000;200;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5;5"
Then send some quicker requests in a loop. At some point these will block until the request above completes, and then continue:
while true
do
curl "http://localhost:5000/match/v1/car/2.320208,48.702049;2.320521,48.702363;2.320843,48.702727;2.320874,48.702761;2.320874,48.702761;2.319644,48.701588" &>/dev/null
done
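The shell loop above hides the stall because its output is discarded. As a rough alternative, the following sketch times each quick request and reports any that take suspiciously long. The URL is the quick request from this issue. The helper names (`timed`, `stalls`, `fetch`, `probe`) and the 1-second threshold are assumptions for illustration, not part of OSRM.

```python
import time
import urllib.error
import urllib.request

# Quick request from the reproduction steps in this issue.
QUICK_URL = (
    "http://localhost:5000/match/v1/car/"
    "2.320208,48.702049;2.320521,48.702363;2.320843,48.702727;"
    "2.320874,48.702761;2.320874,48.702761;2.319644,48.701588"
)


def timed(call):
    """Run call() and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    call()
    return time.monotonic() - start


def stalls(latencies, threshold_s=1.0):
    """Return (index, latency) pairs for requests slower than threshold_s."""
    return [(i, t) for i, t in enumerate(latencies) if t > threshold_s]


def fetch(url=QUICK_URL):
    """Fetch the URL and discard the body, like `curl ... &>/dev/null`."""
    try:
        with urllib.request.urlopen(url) as response:
            response.read()
    except urllib.error.HTTPError:
        # An HTTP error status is still a completed response; only time matters.
        pass


def probe(count=200, threshold_s=1.0):
    """Send `count` quick requests in sequence and print the ones that stalled."""
    latencies = [timed(fetch) for _ in range(count)]
    for index, latency in stalls(latencies, threshold_s):
        print(f"request {index} took {latency:.2f}s")
    return latencies
```

To use it, start the slow request in another terminal and call `probe()` while it is still running. If the problem reproduces, at least one quick request should be reported with a latency close to the remaining run time of the slow request.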
Cause
The issue looks like a bug in the Boost.Asio multithreaded networking stack.
The osrm-routed server implementation is heavily influenced by the HTTP server 3 example in the Boost.Asio docs.
Interestingly, if we upgrade osrm-routed to look more like the example provided in the 1.70.0 release, the problem goes away.
The difference between the two is a change in how strand objects are used to prevent concurrent access from connection handlers, which sounds related. It could be that the newer version includes a fix for this issue.