Skip to content

Conversation

@JosePineiro
Copy link

This pull request refactors the _setContentTypeFromPath method in AsyncFileResponse to improve maintainability and performance in MIME type detection.

Key changes:

  • "application/javascript" by "text/javascript". Per IETF RFC 9239 (May 2022) text/javascript is now standard and application/javascript is considered obsolete.
  • More accurate fallback behavior, switching the default MIME type from text/plain to application/octet-stream for unknown file types. This causes browsers to download the file and not attempt to display it on the screen.
  • Formats Added: .txt → text/plain .opus → audio/opus (great compression, open and support in all modern browsers) .webm → video/webm (great compression, open and support in all modern browsers)
  • Formats Removed: .zip → application/zip (send application/octet-stream. Browsers will always download it and not display it on the screen.) .gz → application/x-gzip (send application/octet-stream. Browsers will always download it and not display it on the screen.) .eot → font/eot (Font file. Obsolete, no longer supported in modern browsers.) .ttf → font/ttf (Size it is too large for embedded systems. WOLF2 should be used.)

The update brings the function more in line with modern web and IoT usage while maintaining backward compatibility with legacy formats where still relevant.

@me-no-dev
Copy link
Member

adding is OK, removing, I am not a fan of. Things like zip, gz and ttf should stay

@me-no-dev
Copy link
Member

Also the default type needs to be tested. Surely there was a reason to be text/plain by default

This comment was marked as outdated.

Copy link
Member

@mathieucarbou mathieucarbou Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Do not remove existing definitions
  • For the cases of CT change (like in 2022 for JS), add a new macro and do not change the existing definition to keep backward compatibility for users relying on the existing macros and its length.
  • IMO this is not the right way to speed up the comparison. Mime types are segmented by categories: application, text, image, etc. To speed up the comparison, first test if the content type starts with application/, text/, image/, etc. Then, for each one, you only test the second part of the content type. This will divide the number of comparisons to do way more and also avoid redoing each time the same comparisons for the categories.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extension removed:

  • .EOT. Only supported by Internet Explorer v9 to v11. Not supported by any current browser. Does it make sense to keep it?
  • .zip and .gz. MINE types are used to allow the browser to display files properly. All browsers download ZIP and GZ files, so the default mine type (application/octet-stream) is suitable for .zip and .gz.
    Having unnecessary extensions slows down your code, increases its size, and makes it harder to maintain. It's not a free breakfast.
    Do you still think it's worth keeping them?

I'll restore the old definition of for JS

I realize it's not the most efficient way to speed things up. But the cost is very low (I simply change the order) and the benefit is good. Do you think it's reasonable to leave it like this for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reordering the conditions is not something I would do because it just adds noise and more difficulties to review the diff.

I would do no reordering and instead to the correct code change to optimize the comparison by checking the mime category first and the the second part.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For content types: they should NEVER be linked to a file extension.

content types should be set by the user, depending on HOW the resource is used.

That’s totally ok to answer some js content with text/plain, like it is totally fine to serve an image with application/octet-stream. These are different usages on consumer side.

That being said, the server can assume some default content types for common static resources served on web pages like html, images, css, js, etc for example in the case of serveStatic because it assumes a standard use in a web browser.

And yes, defaulting to text/plain for some is bad, especially for binary content. Default should be application/octet-stream I think.

@mathieucarbou mathieucarbou marked this pull request as draft July 23, 2025 12:15
@JosePineiro
Copy link
Author

Also the default type needs to be tested. Surely there was a reason to be text/plain by default

Browsers use Content-Type to know how to handle received files.
Content-Type = text/plain causes the browser to interpret it as a text file and display it on the screen.
When receiving Content-Type = application/octet-stream, the browser interprets it as a binary file and should download it, not display it on the screen.
For this reason, it doesn't make sense to set Content-Type to gz or ZIP. application/octet-stream is sufficient for these types of files.
In any case, if you'd like, I can set up a small project on platformio so you can verify this behavior.

@JosePineiro
Copy link
Author

Make changes:
In _setContentTypeFromPath removed : T__zip, gz (works ok with default MINE) y eot (Obsolete)
In _setContentTypeFromPath added : csv, txt (don't work with new default MINE), opus, webm (Best formats)
In _setContentTypeFromPath chaged "application/javascript" to "text/javascript"
Don´t removed the obsolete definition T_application_javascript
Changed default MINE to application/octet-stream

No reordering, as it doesn't provide much benefit. I hadn't realized the difficulty of reviewing changes.
The search system you propose isn't adequate, as it's the extensions you're searching for, not the MINE types.

I hope these changes are enough.

@JosePineiro
Copy link
Author

The search system you propose isn't adequate, as it's the extensions you're searching for, not the MINE types.
Something similar could be used, but the code would be less maintainable. Do you think it's worth it?

  const char *cpath = path.c_str();
  const char *dot = strrchr(cpath, '.');

  if (!dot) {
    _contentType = T_application_octet_stream;
    return;
  }
  
  switch (dot[1]) {
        case 'a':
            if (strcmp(dot, T__avif) == 0) _contentType = T_image_avif;
            break;
            
        case 'c':
            if (strcmp(dot, T__css) == 0) _contentType = T_text_css;
            break;
            
        case 'g':
            if (strcmp(dot, T__gif) == 0) _contentType = T_image_gif;
            break;
            
        case 'h':
            if (strcmp(dot, T__html) == 0 || strcmp(dot, T__htm) == 0) _contentType = T_text_html;
            break;
            
        case 'i':
            if (strcmp(dot, T__ico) == 0) _contentType = T_image_x_icon;
            break;
            
        case 'j':
            // Order by likelihood: JS is more common. JSON in less common.
            if (strcmp(dot, T__js) == 0) _contentType = T_application_javascript;
            else if (strcmp(dot, T__jpg) == 0) _contentType = T_image_jpeg;
            else if (strcmp(dot, T__json) == 0) _contentType = T_application_json;
            break;
            
        case 'm':
            if (strcmp(dot, T__mp4) == 0) _contentType = T_video_mp4;
            break;
            
        case 'o':
            if (strcmp(dot, T__opus) == 0) _contentType = T_audio_opus;
            break;
            
        case 'p':
            // PNG is more common in web contexts, so check it first
            if (strcmp(dot, T__png) == 0) _contentType = T_image_png;
            else if (strcmp(dot, T__pdf) == 0) _contentType = T_application_pdf;
            break;
            
        case 's':
            if (strcmp(dot, T__svg) == 0) _contentType = T_image_svg_xml;
            break;
            
        case 't':
            if (strcmp(dot, T__ttf) == 0) _contentType = T_font_ttf;
            else if (strcmp(dot, T__txt) == 0) _contentType = T_text_plain;
            break;
            
        case 'w':
            // Order by likelihood in web serving contexts
            if (strcmp(dot, T__webp) == 0) _contentType = T_image_webp;
            else if (strcmp(dot, T__webm) == 0) _contentType = T_video_webm;
            else if (strcmp(dot, T__woff2) == 0) _contentType = T_font_woff2;
            else if (strcmp(dot, T__woff) == 0) _contentType = T_font_woff;
            break;
            
        case 'x':
            if (strcmp(dot, T__xml) == 0) _contentType = T_text_xml;
            break;
            
        default:
            _contentType = T_application_octet_stream;
            return;
    }
    
    // If we reach here and contentType wasn't set, use default
    if (_contentType == nullptr) {
        _contentType = T_application_octet_stream;
	}

@mathieucarbou
Copy link
Member

The search system you propose isn't adequate, as it's the extensions you're searching for, not the MINE types.

Oh sorry you are completely right!
That's the part that determines which CT to use depending on a file extension on FS!

Sorry... 😓

@mathieucarbou
Copy link
Member

The search system you propose isn't adequate, as it's the extensions you're searching for, not the MINE types. Something similar could be used, but the code would be less maintainable. Do you think it's worth it?

  const char *cpath = path.c_str();
  const char *dot = strrchr(cpath, '.');

  if (!dot) {
    _contentType = T_application_octet_stream;
    return;
  }
  
  switch (dot[1]) {
        case 'a':
            if (strcmp(dot, T__avif) == 0) _contentType = T_image_avif;
            break;
            
        case 'c':
            if (strcmp(dot, T__css) == 0) _contentType = T_text_css;
            break;
            
        case 'g':
            if (strcmp(dot, T__gif) == 0) _contentType = T_image_gif;
            break;
            
        case 'h':
            if (strcmp(dot, T__html) == 0 || strcmp(dot, T__htm) == 0) _contentType = T_text_html;
            break;
            
        case 'i':
            if (strcmp(dot, T__ico) == 0) _contentType = T_image_x_icon;
            break;
            
        case 'j':
            // Order by likelihood: JS is more common. JSON in less common.
            if (strcmp(dot, T__js) == 0) _contentType = T_application_javascript;
            else if (strcmp(dot, T__jpg) == 0) _contentType = T_image_jpeg;
            else if (strcmp(dot, T__json) == 0) _contentType = T_application_json;
            break;
            
        case 'm':
            if (strcmp(dot, T__mp4) == 0) _contentType = T_video_mp4;
            break;
            
        case 'o':
            if (strcmp(dot, T__opus) == 0) _contentType = T_audio_opus;
            break;
            
        case 'p':
            // PNG is more common in web contexts, so check it first
            if (strcmp(dot, T__png) == 0) _contentType = T_image_png;
            else if (strcmp(dot, T__pdf) == 0) _contentType = T_application_pdf;
            break;
            
        case 's':
            if (strcmp(dot, T__svg) == 0) _contentType = T_image_svg_xml;
            break;
            
        case 't':
            if (strcmp(dot, T__ttf) == 0) _contentType = T_font_ttf;
            else if (strcmp(dot, T__txt) == 0) _contentType = T_text_plain;
            break;
            
        case 'w':
            // Order by likelihood in web serving contexts
            if (strcmp(dot, T__webp) == 0) _contentType = T_image_webp;
            else if (strcmp(dot, T__webm) == 0) _contentType = T_video_webm;
            else if (strcmp(dot, T__woff2) == 0) _contentType = T_font_woff2;
            else if (strcmp(dot, T__woff) == 0) _contentType = T_font_woff;
            break;
            
        case 'x':
            if (strcmp(dot, T__xml) == 0) _contentType = T_text_xml;
            break;
            
        default:
            _contentType = T_application_octet_stream;
            return;
    }
    
    // If we reach here and contentType wasn't set, use default
    if (_contentType == nullptr) {
        _contentType = T_application_octet_stream;
	}

I think it is OK (providing we add some comments to explain why it is like that).

I would extract that in a specific static method: const char* lookupContentTypeByExtension(const char* ) or something similar to directly return from the switch block.

@JosePineiro
Copy link
Author

Version 1 is the version in this PR.
Version 2 is the version with a switch.
Version 3 uses a double comparison (#define)

The results are:
=== PERFORMANCE COMPARISON ===
Version 1: 7.2863 μs per call
Version 2: 6.5086 μs per call
Version 3: 6.4374 μs per call

=== SYSTEM INFORMATION ===
ESP32 Chip: ESP32-S3
Cpu Frequency: 240 Mhz

You can test in https://github.com/JosePineiro/esp32_benchmark_1
You can see that the increase in performance is very limited.
Which version do you think is most appropriate?

Code of version 3:

#define EXT_EQ(dot, ext) (dot[1] == ext[1] && strcmp(dot, ext) == 0)
/**
 * @brief Optimized version using table
 */
void _setContentTypeFromPath_v3(const String &path) {
  const char *dot = strrchr(path.c_str(), '.');
  if (!dot)
    _contentType = T_application_octet_stream;
  else if (EXT_EQ(dot, T__html) || EXT_EQ(dot, T__htm))
    _contentType = T_text_html;
  else if (EXT_EQ(dot, T__css))
    _contentType = T_text_css;
  else if (EXT_EQ(dot, T__js))
    _contentType = T_application_javascript;
  else if (EXT_EQ(dot, T__avif))
    _contentType = T_image_avif;
  else if (EXT_EQ(dot, T__webp))
    _contentType = T_image_webp;
  else if (EXT_EQ(dot, T__svg))
    _contentType = T_image_svg_xml;
  else if (EXT_EQ(dot, T__json))
    _contentType = T_application_json;
  else if (EXT_EQ(dot, T__woff2))
    _contentType = T_font_woff2;
  else if (EXT_EQ(dot, T__png))
    _contentType = T_image_png;
  else if (EXT_EQ(dot, T__jpg))
    _contentType = T_image_jpeg;
  else if (EXT_EQ(dot, T__gif))
    _contentType = T_image_gif;
  else if (EXT_EQ(dot, T__ico))
    _contentType = T_image_x_icon;
  else if (EXT_EQ(dot, T__woff))
    _contentType = T_font_woff;
  else if (EXT_EQ(dot, T__ttf))
    _contentType = T_font_ttf;
  else if (EXT_EQ(dot, T__xml))
    _contentType = T_text_xml;
  else if (EXT_EQ(dot, T__pdf))
    _contentType = T_application_pdf;
  else if (EXT_EQ(dot, T__webm))
    _contentType = T_video_webm;
  else if (EXT_EQ(dot, T__mp4))
    _contentType = T_video_mp4;
  else if (EXT_EQ(dot, T__opus))
    _contentType = T_audio_opus;
  else if (EXT_EQ(dot, T__txt))
    _contentType = T_text_plain;
  else
    _contentType = T_application_octet_stream;
}

@mathieucarbou
Copy link
Member

You can see that the increase in performance is very limited.
Which version do you think is most appropriate?

Indeed, in that case do not optimize and go with the more readable version… thanks for the comparison efforts! This helps taking the right décision.

@JosePineiro
Copy link
Author

You now have the PR with the most readable version.
The only changes are the addition of missing extensions, the removal of unneeded extensions, and the change to the default encoding.
You just need to approve it.

@mathieucarbou mathieucarbou marked this pull request as ready for review July 27, 2025 07:18
@mathieucarbou mathieucarbou requested a review from Copilot July 27, 2025 07:19
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors MIME type handling for file responses to modernize supported formats and improve browser behavior. The changes align with current web standards and IoT requirements while optimizing for embedded systems.

Key changes:

  • Updates JavaScript MIME type from obsolete "application/javascript" to standard "text/javascript" per RFC 9239
  • Changes default fallback from "text/plain" to "application/octet-stream" to ensure unknown files are downloaded rather than displayed
  • Adds modern formats (.opus, .webm, .txt, .csv) and removes obsolete/problematic formats (.eot, .zip, .gz)

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/literals.h Adds new file extension constants and MIME types, removes obsolete formats, adds documentation comments
src/WebResponses.cpp Updates MIME type detection logic with new formats and improved fallback behavior
Comments suppressed due to low confidence (1)

src/WebResponses.cpp:676

  • The JavaScript file extension is still mapped to the obsolete 'application/javascript' MIME type instead of the new standard 'text/javascript' mentioned in the PR description. This should be updated to use T_text_javascript.
    _contentType = T_font_ttf;

src/literals.h Outdated
static constexpr const char *T__woff = ".woff"; // WOFF: Font file. Legacy support
static constexpr const char *T__woff2 = ".woff2"; // WOFF2: Better compression. Compatible with all modern browsers.
static constexpr const char *T__xml = ".xml"; // XML: Configuration and data files
static constexpr const char *T_application_javascript = "application/javascript"; // Obsolete type for JavaScript
Copy link

Copilot AI Jul 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The obsolete 'application/javascript' constant is still defined but appears unused after the refactor. Consider removing it to reduce code clutter, or if keeping for backward compatibility, add a comment explaining why it's retained.

Suggested change
static constexpr const char *T_application_javascript = "application/javascript"; // Obsolete type for JavaScript
static constexpr const char *T_application_javascript = "application/javascript"; // Obsolete type for JavaScript
// Retained for backward compatibility with older systems or clients that may still rely on this MIME type.

Copilot uses AI. Check for mistakes.
Copy link
Member

@mathieucarbou mathieucarbou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the proposed behavior changes - they are more aligned with what a http server should do.

@mathieucarbou
Copy link
Member

@me-no-dev : can you please have a look ? thanks!

JosePineiro and others added 5 commits August 4, 2025 20:52
This pull request refactors the _setContentTypeFromPath method in AsyncFileResponse to improve maintainability and performance in MIME type detection.

Key changes:
- More accurate fallback behavior, switching the default MIME type from text/plain to application/octet-stream for unknown file types.
- Formats Added:
 .txt → text/plain
 .opus → audio/opus (great compression, open and support in all modern browsers)
 .webm → video/webm (great compression, open and support in all modern browsers)

- Formats Removed:
.zip → application/zip (send application/octet-stream. Browsers will always download it and not display it on the screen.)
.gz → application/x-gzip (send application/octet-stream. Browsers will always download it and not display it on the screen.)
.eot → font/eot (Font file. Obsolete, no longer supported in modern browsers.)
.ttf → font/ttf  (Size it is too large for embedded systems. WOLF2 should be used.)

The update brings the function more in line with modern web and IoT usage while maintaining backward compatibility with legacy formats where still relevant.









Preguntar a ChatGPT
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Remove autoMINE extensions: zip, gz y eot
Add extensions: csv, opus, txt, webm
@mathieucarbou mathieucarbou force-pushed the feature/mime-type-refactor branch from 94969f2 to b0db1e1 Compare August 4, 2025 18:52
@mathieucarbou mathieucarbou merged commit 2773653 into ESP32Async:main Aug 5, 2025
33 checks passed
@JosePineiro JosePineiro deleted the feature/mime-type-refactor branch August 19, 2025 18:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants