Description
Hi,
While performance testing our application (a JEE application running in WildFly, where a Servlet accepts client requests and processes them using graphql-java), we noticed a drop in request throughput from a source we didn't expect. If our data fetchers simply return 'null' but still perform all other processing (parsing the incoming request, doing all the work in the data fetcher needed to construct the response object, etc.), the throughput of our application increases significantly, usually at least doubling compared to when the data fetchers actually return a response. We did expect some overhead from the GraphQL parsing and processing, but not an impact this big.
(Note: the limiting factor during our load tests was the available CPU. Our application works on in-memory data, so in most cases we're not waiting on database access or requests to other systems in order to process client requests. The CPU was always at 100% while executing tests.)
As a test, I created a simple stand-alone web application for verification. It uses a simple GraphQL schema (located in src/main/resources) that contains some simple nested types and queries to retrieve them. A dedicated data fetcher is available for each of these queries. A compiled .war file of the web application can be found in the target directory of the attached zip.
In order to test the impact of different parts of the application, certain parts can be enabled or disabled in the test servlet:
- returnData: when 'true', the data fetcher implementations will return pre-generated instances of the correct type. If 'false', no data will be returned by the data fetchers (i.e. this is where we noticed the performance impact in our application)
- returnResponse: when 'true', the result of the GraphQL processing will be transformed into JSON and written into the client response. When 'false', the GraphQL result will not be converted. Using this setting, it can be made sure that the JSON transformation that is usually done after the GraphQL processing is not influencing the measured throughput.
- cpuLoadSize: Some simple list-manipulation code (sort & randomize) simulates the actual CPU processing that a normal application would have to perform in order to generate response objects. This setting specifies the size of the list that is used (and as such the amount of processing simulated). During the tests on my PC, I used a list size of 4096, which usually resulted in 1 or 2 milliseconds of processing when going at max throughput.
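To make the cpuLoadSize setting more concrete, here is a minimal sketch of what such a "randomize and sort" load could look like. This is not the code from the attached zip; the class name CpuLoadSimulator and the method simulate are made up for illustration, and the real implementation may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only, not the class used in the attached project.
public class CpuLoadSimulator {

    // Builds a list of `size` integers, shuffles it and sorts it again.
    // The sorted list is returned so the work has an observable result.
    public static List<Integer> simulate(int size, Random random) {
        List<Integer> values = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            values.add(i);
        }
        Collections.shuffle(values, random); // "randomize"
        Collections.sort(values);            // "sort"
        return values;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<Integer> result = simulate(4096, new Random());
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println("Processed " + result.size() + " items in " + micros + " us");
    }
}
```

A single cold call like the one in main says little about steady-state cost; the 1 to 2 milliseconds mentioned above were observed under full load in WildFly, not with this sketch.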
These settings can be changed at runtime using an HTTP GET request. For example, http://localhost:8080/graphql/graphql?returnData=true&returnResponse=false&cpuLoadSize=4096 makes the data fetchers return actual responses, simulates CPU processing using a list with 4096 items, and neither transforms the GraphQL result into JSON nor returns it to the client.
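For readers who don't want to open the zip, the three settings in that query string could be read roughly as follows. This is only a sketch: the class TestSettings, the method fromQuery and the default values are assumptions for illustration. In a real servlet one would more likely call request.getParameter(...) than parse the query string by hand.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the three runtime settings described above.
public class TestSettings {
    public final boolean returnData;
    public final boolean returnResponse;
    public final int cpuLoadSize;

    public TestSettings(boolean returnData, boolean returnResponse, int cpuLoadSize) {
        this.returnData = returnData;
        this.returnResponse = returnResponse;
        this.cpuLoadSize = cpuLoadSize;
    }

    // Parses "key=value&key=value" pairs. No URL decoding is done, which is
    // sufficient for plain boolean and integer values like the ones used here.
    public static TestSettings fromQuery(String query) {
        Map<String, String> params = new HashMap<>();
        if (query != null && !query.isEmpty()) {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    params.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        // The defaults below are assumptions, not taken from the test project.
        return new TestSettings(
                Boolean.parseBoolean(params.getOrDefault("returnData", "true")),
                Boolean.parseBoolean(params.getOrDefault("returnResponse", "true")),
                Integer.parseInt(params.getOrDefault("cpuLoadSize", "0")));
    }

    public static void main(String[] args) {
        TestSettings s = fromQuery("returnData=true&returnResponse=false&cpuLoadSize=4096");
        System.out.println(s.returnData + " " + s.returnResponse + " " + s.cpuLoadSize);
    }
}
```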
I executed four test queries (which can be found under src/test/resources) on my PC using this application (using WildFly as web server):
- queryOne.txt: Retrieves the TypeOne instance with all its properties and all children with all their properties
- queryOne-min.txt: Retrieves the TypeOne instance with all its children, but doesn't retrieve any of the simple properties
- queryTwo.txt: Retrieves the TypeTwo instance with all its properties and children
- queryThree.txt: Retrieves the TypeThree instance with all its properties.
In the attached results.pdf, the throughput that I measured on my system when executing these queries in different scenarios can be found:
- In Test 1, all functionality is enabled, so this simulates a fully active application.
- Test 2 will (compared to test 1) just disable the conversion of the GraphQL result into JSON. All other tests will also have the JSON conversion disabled, so that it doesn't influence the measured throughput.
- In Test 5, nothing is enabled, so the application is only parsing the incoming GraphQL query and calling the data fetchers. No data is actually returned by the data fetchers, no response is returned to the client and no CPU load is simulated, resulting in the maximum throughput that can be achieved when doing no processing.
- Test 4 will (compared to test 5) simulate the CPU load that would be needed to construct the data that is being requested in the data fetcher. No data is being returned by the data fetchers, however.
- Test 3 will (compared to test 5) return a pre-generated object in the data fetcher, but will not simulate any additional CPU load that may be needed to create this response data.
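Expressed as settings URLs, the five scenarios would look roughly like the sketch below. Two things here are assumptions rather than facts from the test project: that a cpuLoadSize of 0 disables the simulated load, and that the tests with load enabled all used the list size of 4096 mentioned earlier. The class name TestScenarios is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical overview of the five test scenarios as settings URLs.
public class TestScenarios {
    public static final String BASE = "http://localhost:8080/graphql/graphql";

    public static String url(boolean returnData, boolean returnResponse, int cpuLoadSize) {
        return BASE + "?returnData=" + returnData
                + "&returnResponse=" + returnResponse
                + "&cpuLoadSize=" + cpuLoadSize;
    }

    public static Map<String, String> all() {
        Map<String, String> scenarios = new LinkedHashMap<>();
        scenarios.put("Test 1", url(true, true, 4096));   // everything enabled
        scenarios.put("Test 2", url(true, false, 4096));  // no JSON conversion
        scenarios.put("Test 3", url(true, false, 0));     // data returned, no CPU load (0 = off is assumed)
        scenarios.put("Test 4", url(false, false, 4096)); // CPU load, no data returned
        scenarios.put("Test 5", url(false, false, 0));    // nothing enabled
        return scenarios;
    }

    public static void main(String[] args) {
        all().forEach((name, u) -> System.out.println(name + ": " + u));
    }
}
```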
Is it expected that the handling of data fetcher responses consumes this much CPU, especially where more complex responses are needed? I do understand that processing is needed on the data (e.g. in case any of the child properties returned needs further processing by another registered data fetcher, which is something that we do actually use in our full application), but the impact of this additional processing (which in this example application should only result in 'simple' property getters being invoked) does seem more significant than we expected.
graphlperftest.zip
results.pdf