When detecting a license, report whether a full license text or only a notice or mention was detected #80

@pombredanne

Knowing whether we detected a whole license text, or only a notice or mention, is relevant data for later automated summarization and reasoning about what was detected.

For instance, say you have a directory with a COPYING file at the root containing a full LGPL 2.1 license text, that all or most source files carry a simple notice in a comment such as "Licensed under the LGPL", and that one file carries a BSD text.

With this data in hand, you could conclude that the code is LGPL-2.1 licensed overall, inferring the version from the top-level full text, and that one file is BSD-licensed.
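The reasoning in this example could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `(path, key, kind)` record shape, the `text`/`notice`/`mention` kind names, the `summarize` helper, and the rule that a versioned key such as `lgpl-2.1` refines an unversioned `lgpl` notice. None of it is an existing API.

```python
from collections import Counter

# Hypothetical detection records: (path, license key, match kind).
detections = [
    ("COPYING", "lgpl-2.1", "text"),
    ("src/a.c", "lgpl", "notice"),
    ("src/b.c", "lgpl", "notice"),
    ("src/c.c", "bsd", "text"),
]


def summarize(detections):
    """Guess the overall license and list the files that deviate from it."""
    # The most common key among notices and mentions is the candidate license.
    notice_keys = Counter(key for _, key, kind in detections if kind != "text")
    if not notice_keys:
        return None, {}
    dominant = notice_keys.most_common(1)[0][0]

    # A top-level full text of the same license (assumed here to share the
    # key prefix) pins down the exact version.
    overall = dominant
    for path, key, kind in detections:
        is_top_level = "/" not in path
        same_license = key == dominant or key.startswith(dominant + "-")
        if kind == "text" and is_top_level and same_license:
            overall = key

    # Anything that is neither the dominant nor the refined license is an exception.
    exceptions = {
        path: key for path, key, _ in detections if key not in (dominant, overall)
    }
    return overall, exceptions


overall, exceptions = summarize(detections)
print(overall, exceptions)
```

The point of the sketch is only that the full text / notice distinction is what lets the top-level COPYING file supply the version while the per-file notices supply the coverage.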

There are two possible avenues to implement this:

  1. automated, using the size of a match and comparing it to the size of the corresponding full license text:
    • if matched to a full text, or if bigger than or equal to the license text size: this is a full text match
    • a small match of a few lines, significantly smaller than the full text: this could be a mention, such as "licensed under the same license as Perl" or "license: BSD"
    • a medium match of several lines, still smaller than the full text: this could be a notice
  2. data-driven, using tagged rules:
    we would tag each rule YAML as a full text, notice or mention and report that

The two approaches can be combined: we could start with 1. and progressively adopt 2. (which requires updating several thousand rules... ;) )
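A minimal sketch of how the two avenues might combine, assuming a hypothetical `classify_match` helper: a tag on the rule, when present, wins (avenue 2); otherwise the size comparison decides (avenue 1). The `rule_kind` parameter, the kind names and the 5% `mention_ratio` cutoff are all assumptions, and the cutoff in particular would need tuning against real rules. Lengths can be in any consistent unit, such as words or lines.

```python
def classify_match(matched_len, full_text_len, rule_kind=None, mention_ratio=0.05):
    """Return 'text', 'notice' or 'mention' for a single license match.

    rule_kind is a hypothetical tag read from the rule YAML. When it is
    missing, fall back to comparing the match size to the full text size.
    """
    # Avenue 2: trust an explicit tag on the rule when there is one.
    if rule_kind in ("text", "notice", "mention"):
        return rule_kind

    # Avenue 1: size-based guess.
    if full_text_len <= 0:
        raise ValueError("full_text_len must be positive")
    if matched_len >= full_text_len:
        return "text"
    if matched_len / full_text_len <= mention_ratio:
        return "mention"  # a very small fraction of the full text
    return "notice"  # several lines, but still short of the full text


# Untagged rules are classified by size; a tagged rule overrides the guess.
print(classify_match(4000, 4000))
print(classify_match(5, 4000))
print(classify_match(600, 4000))
print(classify_match(5, 4000, rule_kind="notice"))
```

With this shape, rules can be tagged incrementally: each newly tagged rule simply stops going through the size heuristic.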
