DataLoader dispatch being called prematurely
Describe the bug We have a GraphQL query that calls a DataLoader that should combine several items into a single batch. It appears to be combining all but the last item, dispatching the request, and then dispatching a second batch containing the last item.
When the query show below runs, we can see that StoreService.getNumEmployees() is called five times ("1234", "2345", "3456", "4567", "5678"), followed by EmployeeCountBatchLoader.load() with the five IDs. After that completes, StoreService.getNumEmployees() is called again with the last ID ("6789") and then EmployeeCountBatchLoader.load() is called once with the last ID.
Rearranging the query always result in the last chunk of the query being run in a separate loader. If the query was ordered first, second, last, group, getNumEmployees would be called three times ("1234", "2345", "6789"), EmployeeCountBatchLoader.load would be called, and then getNumEmployees would be called three times ("3456", "4567", "5678"), followed by another call to EmployeeCountBatchLoader.load.
To Reproduce Using java-graphql 13.0, which includes java-dataloader 2.1.1.
The resolver:
public class StoreResolver implements GraphQLResolver<Store> {
...
public CompletableFuture<Integer> getNumEmployees(Store store) {
return storeService.getNumEmployees(store);
}
...
}
The service:
public class StoreService {
...
public CompletableFuture<Integer> getNumEmployees(Store store) {
// BREAKPOINT ON NEXT LINE COUNTS SERVICE CALLS
return employeeCountDataLoader.load(store.getStoreId());
}
...
}
The data loader:
@Bean
public DataLoader<Pair<String, Boolean>, Integer> employeeCountDataLoader() {
return DataLoader.newMappedDataLoader(employeeCountBatchLoader);
}
The batch loader:
public class EmployeeCountBatchLoader implements MappedBatchLoader<String, Integer> {
...
@Override
public CompletionStage<String, Integer>> load(Set<String> storeIds) {
// BREAKPOINT ON NEXT LINE COUNTS DATALOADER CALLS
return CompletableFuture.supplyAsync(() -> {
// Call the repository and process return a map of Store ID -> employee count
});
}
...
}
And the query:
query {
first: store(id: "1234") {
id
numEmployees
}
second: store(id: "2345) {
id
numEmployees
}
group: stores(ids: ["3456", "4567", "5678"]) {
id
numEmployees
}
last: store(id: "6789") {
id
numEmployees
}
}
Your are using I guess graphql-java-kickstart because I can see GraphQLResolver classes which are not in the base graphql-java lib.
you might want to ask this also over at https://github.com/graphql-java-kickstart in case there is some extra behaviour in a GraphQLResolver versus a plain old DataFetcher
That said I am intrigued by the "plural" stores field - how is its "fetching code" implemented. I can see code that takes N arguments. How is this implemented.
We will try to create a plain old graphql-java demonstration of this problem to see if we can help resolve it.
The store and stores fields are in the base query resolver. If I remove the service code from the mix, it basically looks like this:
public class QueryResolver implements GraphQLQueryResolver {
public List<Store> getStores(List<String> ids) {
return storeRepository.findByIdsIn(ids);
}
public Optional<Store> getStore(String id) {
return storeRepository.findByIdsIn(Collections.singletonList(id)).findFirst();
}
}
This gives our consumers the option of getting an unwrapped object (not in a list) when they have a single ID.
This does use graphql-java-kickstart, but I think the issue is a graphql-java issue after some more digging. I'm happy to go ask this in graphql-java-kickstart if you still think I should after reading the following.
I have done some additional debugging and it looks like this issue may be in the dispatchIfNeeded logic. What appears to be happening is that the loaders are dispatched when the last field on level 1 is resolved, which happens before the level 2 fields inside that last field are traversed.
When I set a breakpoint on if (dispatchNeeded) in beginFieldFetch (FieldLevelTrackingApproach.class:167 in 13.0), I can watch each field resolve. I see the store, followed by the string store ID, followed by a null (unresolved numEmployees) for each store. The code iterates through the query (store->id->numEmployees, store->id->numEmployees, etc.) and dispatchNeeded is always false, so it collects the loader calls and waits to dispatch them.
When onDispatched (FieldLevelTrackingApproach.class:160) is called for the last store in the list, it calls dispatchIfNeeded which in turn calls levelReady. Up until this point, levelReady has returned false. The call to levelReady now sees that all fetches have happened on level 1 (actual dispatch count equals expected dispatch count) and dispatchIfNeeded returns true.
Since dispatchNeeded is now true, FieldLevelTrackingApproach.this.dispatch is called, which resolves every numEmployees reference except the last one (it hasn't had a chance to get queued). After the last reference is traversed and queued, level 2 is finished, the lone load request is dispatched, and query processing wraps up.
I can confirm the behavior @scgallafent is seeing also is present when using dataloaders in spring-graphql, so the issue is not specific to graphql-java-kickstart.
We've been able to work around the issue by querying an additional field (__typename) on "level 1", so the dataloaders on "level 2" have a change to be dispatched. This obviously is a hack, so a proper solution to the problem would be very much welcome.
Hello, this issue has been inactive for 60 days, so we're marking it as stale. If you would like to continue this discussion, please comment within the next 30 days or we'll close the issue.
Hello, as this issue has been inactive for 90 days, we're closing the issue. If you would like to resume the discussion, please create a new issue.