-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Mime type reload v2 #13647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mime type reload v2 #13647
Conversation
8c68f72
to
c104e61
Compare
Setting to draft to see how the CI is going. Will unset if green "enough". |
c3123c5
to
44c063f
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #13647 +/- ##
=======================================
Coverage 83.69% 83.69%
=======================================
Files 1011 1014 +3
Lines 275071 275149 +78
=======================================
+ Hits 230230 230296 +66
- Misses 44841 44853 +12
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
e843bee
to
a01a3ef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI failures: dependency issues? https://github.com/OISF/suricata/actions/runs/16516991164/job/46711126442?pr=13647#step:10:35
Yes, got an optional crate and need to make sure it is available everywhere. |
Following commit will add a new value that will overflow the actual size. Ticket: 7816
File magic is known to have performance issue. When looking for an alternative, I've been pointed to the tree_magic_mini crate that output the mime type of a binary stream. This is different from magic but it has multiple advantages as it is a standard so it can be correlated with other tools. So instead of replacing magic, this patch adds a new mime type output to fileinfo events. This patch also adds the `file.mimetype` sticky buffer. Mime type has 2 advantages over file magic. First it is really faster and second, the result are easier to use as the MIME type are well defined. But it provides less information than magic for example with regards to the size of images. Ticket: 7816
This patch adds a flag `--enable-gpl-mimetype` to force the usage of `tree_magic_db` that is GPL license and will not allow double licensing. This will allow the built Suricata to have mime type identification not dependant of the operating system database. Ticket: 7816
This can be used with the requires keyword to avoid signatures load failure. Ticket: 7816
As file data may have not been inspected yet, this is going to miss in the event even if it could be available. Let's compute it before logging the file. Ticket: 7816
a01a3ef
to
6f251a4
Compare
Why is CI still red ? |
strlcat(features, "HAVE_JA4 ", sizeof(features)); | ||
#endif | ||
strlcat(features, "HAVE_LIBJANSSON ", sizeof(features)); | ||
strlcat(features, "HAVE_MIMETYPE ", sizeof(features)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no ifdef around this ?
#define FILE_SIG_NEED_MD5 BIT_U16(4) | ||
#define FILE_SIG_NEED_SHA1 BIT_U16(5) | ||
#define FILE_SIG_NEED_SHA256 BIT_U16(6) | ||
#define FILE_SIG_NEED_SIZE BIT_U16(7) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can start by cherry-picking #13601 ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Show off :P
Dependencies -- libraries not found, still 🤔 |
der-parser = { version = "~9.0.0", default-features = false } | ||
kerberos-parser = { version = "~0.8.0", default-features = false } | ||
|
||
tree_magic_mini = { version = "~3.1.6", features = [@WITH_GPL_DATA@] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the support is always compiled in? What happens if the data is not there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can load the mime type def from a file on disk. I'll document that.
Would it make more sense to only compile in the mime-type crate if the feature is enabled, so we're not left in the situation where a user starts relying on the operating system files? I know the license always allows linking in, but practically, it might not make sense to. Tests could then rely on the HAVE_MIMETYPE feature flag and know they are always testing against the same database. Likewise for rule writers. |
2 issues:
I'll work on it this week. |
I'm fine with it but it means proprietary license user can't get the feature. Should we do:
So GPL build does This will simplify the CI a lot if we get the MIME_TYPE feature conditional doing so. |
For the purposes of getting CI working, I think we should have:
And I'm not 100% sure that would be the best for a release, but it gets us somewhere, with an opt-in behavior for now. I took a quick look, and there does not appear to be any other more permissively licensed MIME database out there. |
I fear bug report with that approach: |
6f251a4
to
7b456ce
Compare
7b456ce
to
f37f5ab
Compare
I think enabling anything that pulls in GPL license code/data needs to be opt-in at this time. The only other feature I think that pulls in GPL code is nfqueue, and its opt-in as well. |
Yes but |
CI is green, cool :-) , will you do a new clean PR ? |
Yes, fixing the last request from Jason and doing that. |
Update of #13620 addressing comments.
Contribution style:
https://docs.suricata.io/en/latest/devguide/contributing/contribution-process.html
Our Contribution agreements:
https://suricata.io/about/contribution-agreement/ (note: this is only required once)
Changes (if applicable):
(including schema descriptions)
https://redmine.openinfosecfoundation.org/projects/suricata/issues
Link to ticket: https://redmine.openinfosecfoundation.org/issues/
Describe changes:
SV_BRANCH=OISF/suricata-verify#2606