Improve detection when a subset of the matched_text is already detected, and avoid returning unknown-license-reference #4386

Open
@chinyeungli

- matches:
    - score: '100.0'
      matcher: 2-aho
      end_line: 4449
      rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/lead-in_unknown_30.RULE
      from_file:
      start_line: 4448
      matched_text: |
           • The ‘libunistring’ library and its header files are dual-licensed
             under "the GNU LGPLv3+ or the GNU GPLv2+".  This means, you can use
      match_coverage: '100.0'
      matched_length: 3
      rule_relevance: 100
      rule_identifier: lead-in_unknown_30.RULE
      license_expression: unknown-license-reference
      license_expression_spdx: LicenseRef-scancode-unknown-license-reference
    - score: '100.0'
      matcher: 2-aho
      end_line: 4449
      rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/lgpl-3.0-plus_74.RULE
      from_file:
      start_line: 4449
      matched_text: '     under "the GNU LGPLv3+ or the GNU GPLv2+".  This means, you can use'
      match_coverage: '100.0'
      matched_length: 2
      rule_relevance: 100
      rule_identifier: lgpl-3.0-plus_74.RULE
      license_expression: lgpl-3.0-plus
      license_expression_spdx: LGPL-3.0-or-later
    - score: '100.0'
      matcher: 2-aho
      end_line: 4449
      rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/gpl-2.0-plus_488.RULE
      from_file:
      start_line: 4449
      matched_text: '     under "the GNU LGPLv3+ or the GNU GPLv2+".  This means, you can use'
      match_coverage: '100.0'
      matched_length: 2
      rule_relevance: 100
      rule_identifier: gpl-2.0-plus_488.RULE
      license_expression: gpl-2.0-plus
      license_expression_spdx: GPL-2.0-or-later

The matched_text of the first match spans 2 lines, and the second line already has its licenses detected correctly. If the matched_text contained only the first line (The ‘libunistring’ library and its header files are dual-licensed), returning unknown-license-reference would make sense. However, since it also includes the second line (under "the GNU LGPLv3+ or the GNU GPLv2+"), the tool should be able to tell that the licenses in these 2 lines are already detected and should not return unknown-license-reference.
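
The behavior requested here could be sketched as a post-processing filter over the matches. This is only an illustration and not ScanCode's actual API or implementation: the function names `lines_overlap` and `drop_covered_unknown_references` are hypothetical, and the matches are plain dicts that reuse the field names from the scan output above. It assumes that an unknown-license-reference match can be dropped whenever a match with a known license overlaps its line range.

```python
# Hypothetical sketch: drop "unknown-license-reference" matches whose line
# range is overlapped by a match with a known license expression.
# This is NOT ScanCode's API; matches are plain dicts using the field names
# from the scan output.

UNKNOWN = "unknown-license-reference"


def lines_overlap(a, b):
    """Return True if the line ranges of matches a and b share any line."""
    return a["start_line"] <= b["end_line"] and b["start_line"] <= a["end_line"]


def drop_covered_unknown_references(matches):
    """Return matches without unknown references covered by known licenses."""
    known = [m for m in matches if m["license_expression"] != UNKNOWN]
    kept = []
    for m in matches:
        is_unknown = m["license_expression"] == UNKNOWN
        if is_unknown and any(lines_overlap(m, k) for k in known):
            # A known license was detected on the same lines: skip the lead-in.
            continue
        kept.append(m)
    return kept


# The three matches reported in this issue, reduced to the relevant fields.
matches = [
    {"rule_identifier": "lead-in_unknown_30.RULE",
     "license_expression": UNKNOWN, "start_line": 4448, "end_line": 4449},
    {"rule_identifier": "lgpl-3.0-plus_74.RULE",
     "license_expression": "lgpl-3.0-plus", "start_line": 4449, "end_line": 4449},
    {"rule_identifier": "gpl-2.0-plus_488.RULE",
     "license_expression": "gpl-2.0-plus", "start_line": 4449, "end_line": 4449},
]

filtered = drop_covered_unknown_references(matches)
print([m["license_expression"] for m in filtered])
```

With such a filter, an unknown-license-reference match would still be returned when no known license is detected on the lines it covers, which is the case where reporting it makes sense.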
