SPDX, DejaCode and others would be a good start. The goal would be to have a single-purpose script to fetch, sync and update the ScanCode data.