Improve MIT detection #4269

Open
pombredanne opened this issue Apr 21, 2025 · 0 comments
Comments

@pombredanne
Member

Following up from comments posted in @alok1304's #4121 (comment).

You can improve this further as follows:

  1. Create tests by adding a test file and its expected file in https://github.com/aboutcode-org/scancode-toolkit/tree/develop/tests/licensedcode/data/datadriven/lic4 ... see the examples of test file pairs there.

The test for #3860 and #3861 would be the same, using this text (as in https://github.com/aboutcode-org/scancode-toolkit/blob/develop/tests/licensedcode/data/datadriven/lic4/2675-sqlite.cpp ):

# Copyright: (c) 2020, Jordan Borean (@jborean93) <jborean93@gmail.com>
# MIT License (see LICENSE or https://opensource.org/licenses/MIT)

And the expected YAML file (as in https://github.com/aboutcode-org/scancode-toolkit/blob/develop/tests/licensedcode/data/datadriven/lic4/2675-sqlite.cpp.yml ):

license_expressions:
  - mit
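
As a rough sketch of that step, the pair can be written with a few lines of Python. The helper name `write_test_pair` and the file name `mit-see-license.txt` are made up for illustration, and the sketch writes to a temporary directory rather than the real `lic4` directory; the file contents are the ones quoted above.

```python
import tempfile
from pathlib import Path
from typing import Tuple

# Test text and expected YAML, copied from the issue text above.
TEST_TEXT = (
    "# Copyright: (c) 2020, Jordan Borean (@jborean93) <jborean93@gmail.com>\n"
    "# MIT License (see LICENSE or https://opensource.org/licenses/MIT)\n"
)
EXPECTED_YAML = "license_expressions:\n  - mit\n"


def write_test_pair(directory: Path, name: str) -> Tuple[Path, Path]:
    """Write a test file and its sibling `<name>.yml` expected file.

    Hypothetical helper, not part of scancode-toolkit.
    """
    test_file = directory / name
    expected_file = directory / (name + ".yml")
    test_file.write_text(TEST_TEXT, encoding="utf-8")
    expected_file.write_text(EXPECTED_YAML, encoding="utf-8")
    return test_file, expected_file


# Assumption: a temporary directory stands in for the lic4 data directory.
target = Path(tempfile.mkdtemp())
test_file, expected_file = write_test_pair(target, "mit-see-license.txt")
print(test_file.name, expected_file.name)
```

The expected file is simply the test file name with `.yml` appended, matching the `2675-sqlite.cpp` / `2675-sqlite.cpp.yml` pair linked above.
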

  2. Also add a few new rules with the following contents (this can be a separate PR):

---
license_expression: mit
is_license_notice: yes
relevance: 100
referenced_filenames:
    - LICENSE
ignorable_urls:
    - https://opensource.org/licenses/MIT
---

{{MIT License (see LICENSE or https://opensource.org/licenses/MIT) }}

And another:

---
license_expression: mit
is_license_notice: yes
relevance: 100
referenced_filenames:
    - LICENSE
---

{{MIT License (see LICENSE) }}
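
The two rules above share the same shape, so similar rules could be rendered from a small template. This is only a sketch: `render_rule` is a hypothetical helper, not a scancode-toolkit API, and it just reproduces the frontmatter-plus-text layout shown above as a string.

```python
from typing import Optional


def render_rule(notice: str, filename: str, url: Optional[str] = None) -> str:
    """Render one rule in the layout quoted above (hypothetical helper)."""
    lines = [
        "---",
        "license_expression: mit",
        "is_license_notice: yes",
        "relevance: 100",
        "referenced_filenames:",
        "    - " + filename,
    ]
    # Only rules whose text contains a URL get an ignorable_urls entry.
    if url:
        lines += ["ignorable_urls:", "    - " + url]
    lines += ["---", "", "{{" + notice + " }}", ""]
    return "\n".join(lines)


MIT_URL = "https://opensource.org/licenses/MIT"
rule_with_url = render_rule(
    "MIT License (see LICENSE or " + MIT_URL + ")", "LICENSE", MIT_URL
)
rule_without_url = render_rule("MIT License (see LICENSE)", "LICENSE")
print(rule_with_url)
print(rule_without_url)
```

Passing a different `filename` and `notice` would give rule text in the same layout for another referenced file.
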

And a few variations that can be seen in the wild, in case we do not detect these exactly:

And all the variations where there is a LICENSE.txt:

And a few rst:

And more variants:
