Description
Knowing whether we detected a whole license text, or only a notice or a mention, is relevant data for later automated summarization of, and reasoning about, what was detected.
For instance, say you have a directory with a COPYING file at the root containing a full LGPL 2.1 license text, that all or most source files carry a simple notice in a comment such as "Licensed under the LGPL", and that one file carries a BSD text.
With this data in hand, you could conclude that the code is overall LGPL-2.1 licensed, inferring the version from the top-level full text, and that one file is BSD-licensed.
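The reasoning in this example could be sketched roughly as follows. This is only an illustration under stated assumptions: the `summarize` function, the `(path, license_key, kind)` tuples, the license keys and the crude prefix matching used to relate an unversioned notice to a versioned text are all hypothetical and are not an existing API.

```python
from collections import Counter

# Hypothetical names of top-level files that may hold a full license text.
TOP_LEVEL_FILES = ("COPYING", "LICENSE")


def summarize(detections):
    """Return (primary_license, exceptions) from a list of detections.

    Each detection is a hypothetical (path, license_key, kind) tuple where
    kind is one of "text", "notice" or "mention".
    """
    # Full license texts found in top-level license files.
    top_texts = [key for path, key, kind in detections
                 if kind == "text" and path in TOP_LEVEL_FILES]

    # Most frequent license key among the other files.
    counts = Counter(key for path, key, _ in detections
                     if path not in TOP_LEVEL_FILES)
    common = counts.most_common(1)[0][0] if counts else None

    # Refine an unversioned key (e.g. "lgpl") with a matching top-level
    # full text (e.g. "lgpl-2.1"). Prefix matching is a crude assumption.
    primary = common
    for key in top_texts:
        if common is None or key.startswith(common):
            primary = key
            break

    # Files whose license is not covered by the primary license.
    exceptions = {path: key for path, key, _ in detections
                  if path not in TOP_LEVEL_FILES
                  and not (primary and primary.startswith(key))}
    return primary, exceptions


detections = [
    ("COPYING", "lgpl-2.1", "text"),
    ("src/a.c", "lgpl", "notice"),
    ("src/b.c", "lgpl", "notice"),
    ("src/c.c", "bsd", "text"),
]
primary, exceptions = summarize(detections)
print(primary, exceptions)
```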
To implement this, there are two possible avenues:
1. Automated, using the size of a match and comparing it to the size of the corresponding full license text:
   - if matched to a full text, or if bigger than or equal to the license text size: this is a full text match
   - a small match of a few lines, significantly smaller than the text: this could be a mention, such as "licensed under the same license as Perl" or "license: BSD"
   - a medium match of several lines, still smaller than the full text: this could be a notice
2. Data driven, using tagged rules: we would tag each rule YAML as a full text, notice or mention and report that.
The two approaches can be combined: we could start with the first and progressively adopt the second (which requires updating several thousand rules... ;) )
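A minimal sketch of how the combination might look, assuming hypothetical names throughout: `classify_match`, the `rule_tag` argument and the `MENTION_MAX_WORDS` threshold are illustrative only and do not exist in the current code. An explicit tag on the rule wins when present; otherwise the size of the match is compared to the size of the full license text.

```python
from typing import Optional

# Hypothetical cut-off: a match of at most this many words is treated as a
# mere mention. The value is an assumption and would need tuning.
MENTION_MAX_WORDS = 15

VALID_TAGS = {"text", "notice", "mention"}


def classify_match(matched_length: int,
                   full_text_length: int,
                   rule_tag: Optional[str] = None) -> str:
    """Return "text", "notice" or "mention" for a license match.

    Both lengths are counted in the same unit (words here). rule_tag is the
    optional tag carried by the matched rule; when it is missing we fall
    back to the size comparison.
    """
    # Data-driven path: trust the tag set on the rule.
    if rule_tag is not None:
        if rule_tag not in VALID_TAGS:
            raise ValueError(f"unknown rule tag: {rule_tag!r}")
        return rule_tag

    # Size-based path.
    if full_text_length <= 0:
        raise ValueError("full_text_length must be positive")
    if matched_length >= full_text_length:
        return "text"
    if matched_length <= MENTION_MAX_WORDS:
        return "mention"
    return "notice"


print(classify_match(4000, 4000))                   # full text match
print(classify_match(8, 4000))                      # short mention
print(classify_match(120, 4000))                    # notice
print(classify_match(8, 4000, rule_tag="notice"))   # tag overrides size
```

Starting with the size-based path means untagged rules still get a classification, and each rule that later gains a tag simply bypasses the heuristic.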