bootstrap.py regression doubles distribution size #1546

Open
juj opened this issue Apr 3, 2025 · 4 comments

Comments

@juj
Collaborator

juj commented Apr 3, 2025

It looks like there is an unfortunate regression from the creation of the bootstrap.py script in Emscripten.

Before the bootstrap.py script was added, the node_modules\ subdirectory on Windows was 81.0 MB (85,015,488 bytes) in size (measured after emsdk install sdk-main-64bit).

After the addition of that script, it then reads 169 MB (177,525,552 bytes), a 2x increase.
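
As a quick check of the arithmetic, here is a minimal sketch that uses only the two byte counts quoted above. The variable names are illustrative, and sizes are shown in units of 1024 * 1024 bytes, which is how the MB figures above appear to be rounded.

```python
# Byte counts taken from the two measurements quoted in this comment.
before_bytes = 85_015_488    # node_modules before bootstrap.py
after_bytes = 177_525_552    # node_modules after bootstrap.py

BYTES_PER_MB = 1024 * 1024   # unit the figures above appear to use

added_bytes = after_bytes - before_bytes
ratio = after_bytes / before_bytes

print(f"before: {before_bytes / BYTES_PER_MB:.1f} MB")
print(f"after:  {after_bytes / BYTES_PER_MB:.1f} MB")
print(f"added:  {added_bytes / BYTES_PER_MB:.1f} MB")
print(f"ratio:  {ratio:.2f}x")
```

The ratio comes out slightly above 2, so "2x" is a fair summary.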

Here are the directory contents:

The diff reads:

    43  node_modules\.bin
    40  node_modules\@colors
    29  node_modules\@dabh
   144  node_modules\@discoveryjs
10,251  node_modules\@esbuild
 1,358  node_modules\@eslint
   862  node_modules\@eslint-community
   158  node_modules\@humanfs
    84  node_modules\@humanwhocodes
   790  node_modules\@nicolo-ribaudo
    58  node_modules\@nodelib
 3,494  node_modules\@rollup
 2,509  node_modules\@types
   606  node_modules\@webassemblyjs
    19  node_modules\@webpack-cli
   194  node_modules\@xtuc
    23  node_modules\acorn-jsx
    51  node_modules\acorn-walk
   907  node_modules\ajv
 1,079  node_modules\ajv-formats
     9  node_modules\anymatch
   167  node_modules\argparse
   788  node_modules\async
     4  node_modules\binary-extensions
    43  node_modules\braces
     6  node_modules\callsites
    88  node_modules\chokidar
     9  node_modules\chrome-trace-event
     7  node_modules\clone-deep
    51  node_modules\color
     9  node_modules\color-string
    16  node_modules\colorette
     3  node_modules\colorspace
    15  node_modules\cross-spawn
     7  node_modules\deep-is
     5  node_modules\enabled
   208  node_modules\enhanced-resolve
   158  node_modules\envinfo
   771  node_modules\es-check
    89  node_modules\es-module-lexer
   130  node_modules\esbuild
     3  node_modules\escape-string-regexp
 2,836  node_modules\eslint
    57  node_modules\eslint-config-prettier
   151  node_modules\eslint-scope
    35  node_modules\eslint-visitor-keys
    77  node_modules\espree
 1,012  node_modules\esquery
    13  node_modules\esrecurse
    36  node_modules\estraverse
    80  node_modules\events
    12  node_modules\fast-deep-equal
    96  node_modules\fast-glob
    16  node_modules\fast-json-stable-stringify
     9  node_modules\fast-levenshtein
   106  node_modules\fast-uri
    20  node_modules\fastest-levenshtein
    42  node_modules\fastq
   140  node_modules\fecha
    15  node_modules\file-entry-cache
    16  node_modules\fill-range
    11  node_modules\find-up
    25  node_modules\flat
    28  node_modules\flat-cache
    30  node_modules\flatted
     6  node_modules\fn.name
    11  node_modules\glob-parent
    17  node_modules\glob-to-regexp
   183  node_modules\globals
13,339  node_modules\google-closure-compiler-java
    31  node_modules\graceful-fs
    52  node_modules\ignore
     4  node_modules\import-fresh
     4  node_modules\import-local
    11  node_modules\imurmurhash
    20  node_modules\interpret
    53  node_modules\is-arrayish
     3  node_modules\is-binary-path
     6  node_modules\is-extglob
    13  node_modules\is-glob
     9  node_modules\is-number
     7  node_modules\is-plain-object
     5  node_modules\is-stream
    10  node_modules\isexe
     6  node_modules\isobject
    79  node_modules\jest-worker
   395  node_modules\js-yaml
     5  node_modules\json-buffer
    10  node_modules\json-parse-even-better-errors
    19  node_modules\json-schema-traverse
    13  node_modules\json-stable-stringify-without-jsonify
    27  node_modules\keyv
    22  node_modules\kind-of
     5  node_modules\kuler
    24  node_modules\levn
    17  node_modules\loader-runner
     6  node_modules\locate-path
    52  node_modules\lodash.merge
   108  node_modules\logform
     4  node_modules\merge-stream
     8  node_modules\merge2
    55  node_modules\micromatch
   200  node_modules\mime-db
    17  node_modules\mime-types
    55  node_modules\nanoid
     5  node_modules\natural-compare
   290  node_modules\neo-async
     9  node_modules\normalize-path
     5  node_modules\one-time
    48  node_modules\optionator
     7  node_modules\p-limit
     7  node_modules\p-locate
     4  node_modules\p-try
     3  node_modules\parent-module
     3  node_modules\path-exists
     4  node_modules\path-key
    87  node_modules\picomatch
    36  node_modules\pkg-dir
   197  node_modules\postcss
    35  node_modules\prelude-ls
 7,691  node_modules\prettier
    32  node_modules\punycode
     8  node_modules\queue-microtask
     6  node_modules\randombytes
    19  node_modules\readdirp
     8  node_modules\rechoir
     3  node_modules\require-from-string
    10  node_modules\resolve-cwd
     4  node_modules\resolve-from
     9  node_modules\reusify
 2,645  node_modules\rollup
     6  node_modules\run-parallel
    29  node_modules\safe-stable-stringify
 1,231  node_modules\schema-utils
    16  node_modules\serialize-javascript
     9  node_modules\shallow-clone
     2  node_modules\shebang-command
     2  node_modules\shebang-regex
     3  node_modules\simple-swizzle
   220  node_modules\source-map
   136  node_modules\source-map-js
     8  node_modules\stack-trace
     6  node_modules\strip-json-comments
     8  node_modules\supports-color
    45  node_modules\tapable
    85  node_modules\terser-webpack-plugin
     2  node_modules\text-hex
    22  node_modules\to-regex-range
     9  node_modules\triple-beam
    20  node_modules\type-check
22,330  node_modules\typescript
    81  node_modules\undici-types
   458  node_modules\uri-js
 2,785  node_modules\vite
    55  node_modules\watchpack
 5,202  node_modules\webpack
   289  node_modules\webpack-cli
    48  node_modules\webpack-merge
    89  node_modules\webpack-sources
     9  node_modules\which
    13  node_modules\wildcard
   385  node_modules\winston
   174  node_modules\winston-transport
    11  node_modules\word-wrap
   143  node_modules\ws
     5  node_modules\yocto-queue
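
A listing like the one above can be reproduced with a short script. This is only a sketch, not the tool used to produce the numbers in this issue: the function names `top_level_sizes_kib` and `diff_sizes` are hypothetical, and KiB is an assumed unit, since the listing does not state one.

```python
import os


def top_level_sizes_kib(root):
    """Return {entry: size in KiB} for each direct child directory of root.

    Scoped directories such as @esbuild are reported as a single entry,
    matching the shape of the listing above.
    """
    sizes = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if not os.path.isdir(path):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(path):
            for name in filenames:
                file_path = os.path.join(dirpath, name)
                if not os.path.islink(file_path):  # skip symlinks, count real files
                    total += os.path.getsize(file_path)
        sizes[entry] = total // 1024
    return sizes


def diff_sizes(before, after):
    """Return entries that are new or larger in `after`, with their growth in KiB."""
    return {
        name: size - before.get(name, 0)
        for name, size in after.items()
        if size > before.get(name, 0)
    }


if __name__ == "__main__":
    # Hypothetical paths: two node_modules trees from two separate installs.
    old = top_level_sizes_kib(r"old_install\node_modules")
    new = top_level_sizes_kib(r"new_install\node_modules")
    for name, grown in sorted(diff_sizes(old, new).items()):
        print(f"{grown:>7,}  node_modules\\{name}")
```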

I think this might be caused by the installation of dev dependencies in addition to the release dependencies? Plus then the java version of closure compiler, which Emscripten doesn't use/need.
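
One way to test that hypothesis is to split the direct dependencies declared in package.json into release and dev groups. The sketch below relies only on the standard `dependencies` and `devDependencies` fields of package.json; the helper name `split_dependencies` and the package names in the example manifest are made up for illustration and are not taken from Emscripten's actual package.json.

```python
import json


def split_dependencies(package_json_text):
    """Return (release, dev) lists of direct dependency names from a package.json."""
    manifest = json.loads(package_json_text)
    release = sorted(manifest.get("dependencies", {}))
    dev = sorted(manifest.get("devDependencies", {}))
    return release, dev


# Hypothetical manifest, for illustration only.
EXAMPLE_MANIFEST = """
{
  "name": "example-project",
  "dependencies": {
    "example-runtime-lib": "^1.0.0"
  },
  "devDependencies": {
    "example-lint-tool": "^2.0.0",
    "example-test-runner": "^3.0.0"
  }
}
"""

release, dev = split_dependencies(EXAMPLE_MANIFEST)
print("release:", release)
print("dev:    ", dev)
```

This only covers direct dependencies. Attributing every directory under node_modules to one group or the other would also require the transitive information in the lockfile.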

This kind of increase causes three kinds of issues in our distribution at Unity:

  1. The size increase generating CDN costs is one concern. The reason I carefully removed the java closure was to help reduce dead weight.
  2. The license jungle with 3rd party NPM libraries. Even if a library is MIT or Apache or BSD, its use needs to be explicitly acknowledged and tracked by our legal. So every library goes through an audit.
  3. Some partners we work with have an ISO certification workflow that requires that no library with outstanding CVE reports is used, and they use automated checkers to scan for these. They must stop using tools that have outstanding CVE reports above some score threshold. This is a massive headache for us. I've tried to explain that we don't ship these NPM modules onto any website or production service, but that does not matter to them. So that is why I do not want to ship the devDependencies in our packaging of Unity, no matter how small those would be.
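
To illustrate the audit burden in point 2, here is a minimal sketch that lists the declared license of every top-level package in a node_modules tree by reading the standard `license` field of each package.json. The function names are hypothetical, nested node_modules directories are not visited, and this is not the process our legal team actually uses.

```python
import json
import os


def iter_package_dirs(node_modules):
    """Yield (package_name, path) for top-level and @scoped packages."""
    for entry in sorted(os.listdir(node_modules)):
        path = os.path.join(node_modules, entry)
        if entry.startswith(".") or not os.path.isdir(path):
            continue  # skip .bin and stray files
        if entry.startswith("@"):
            # A scope directory holds one subdirectory per package.
            for sub in sorted(os.listdir(path)):
                sub_path = os.path.join(path, sub)
                if os.path.isdir(sub_path):
                    yield f"{entry}/{sub}", sub_path
        else:
            yield entry, path


def license_inventory(node_modules):
    """Return {package_name: declared license, or "UNKNOWN" if not readable}."""
    inventory = {}
    for name, path in iter_package_dirs(node_modules):
        try:
            with open(os.path.join(path, "package.json"), encoding="utf-8") as f:
                manifest = json.load(f)
        except (OSError, ValueError):
            inventory[name] = "UNKNOWN"
            continue
        declared = manifest.get("license", "UNKNOWN")
        # Older packages may use a non-string form; flag those for manual review.
        inventory[name] = declared if isinstance(declared, str) else "UNKNOWN"
    return inventory


if __name__ == "__main__":
    for name, declared in license_inventory("node_modules").items():
        print(f"{declared:<16} {name}")
```

Every extra row in that output is one more library to acknowledge and track, which is why the number of packages matters even when each one is small.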

It would be great to somehow get back to the state before the bootstrap.py regression. PR #1541 also regressed the removal of the Java closure compiler, but that was masked by the earlier bootstrap.py regression.

@sbc100
Collaborator

sbc100 commented Apr 3, 2025

I agree the dev dependencies should never be part of the distribution of emscripten/emsdk.

./bootstrap.py is designed for local developers, so it needs to install dev dependencies and git submodules. These things should not be part of any distribution of emscripten to end users.

My understanding of the specific issue with closure compiler was that the various native binaries were the real problem, not the java version. Perhaps my memory is not correct; I will need to go back and re-read the issue.

@sbc100
Collaborator

sbc100 commented Apr 3, 2025

I can look into updating the emsdk build of emscripten to avoid shipping the result of ./bootstrap.py

@sbc100
Collaborator

sbc100 commented Apr 3, 2025

@juj what is the process you use to create a distribution from the contents of the emsdk directory? I want to make sure the changes I make are compatible.

@sbc100
Collaborator

sbc100 commented Apr 3, 2025

> My understanding of the specific issue with closure compiler was that the various native binaries were the real problem, not the java version. Perhaps my memory is not correct; I will need to go back and re-read the issue.

Going back to npm/cli#558 and google/closure-compiler-npm#186 and https://chromium-review.googlesource.com/c/emscripten-releases/+/6388656, I'm pretty sure the original issue that we were trying to solve was not avoiding the installation of the java version of closure-compiler, but avoiding the installation of all 3 mac/linux/windows binary versions.

Indeed, it seems that the java version is actually still required, for example on macOS arm64 machines. Currently we are seeing the fallback to the java version there due to the lack of an arm64 binary in npm: google/closure-compiler-npm#291
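
The fallback can be pictured roughly as follows. This is only a sketch of the idea, not how the closure compiler npm package actually selects a binary: the function name `pick_closure_flavor` is hypothetical, and the set of native targets is an assumption made for illustration (the thread only establishes that mac, linux and windows binaries exist and that a macOS arm64 binary is missing from npm).

```python
import platform


def pick_closure_flavor(system, machine, native_targets):
    """Return "native" if a prebuilt binary exists for this host, else "java"."""
    if (system, machine) in native_targets:
        return "native"
    return "java"


# ASSUMPTION: illustrative list of (platform.system(), platform.machine())
# pairs with a native binary. The architectures are guesses, not verified.
ASSUMED_NATIVE_TARGETS = {
    ("Darwin", "x86_64"),
    ("Linux", "x86_64"),
    ("Windows", "AMD64"),
}

if __name__ == "__main__":
    flavor = pick_closure_flavor(
        platform.system(), platform.machine(), ASSUMED_NATIVE_TARGETS
    )
    print(f"{platform.system()} {platform.machine()}: {flavor}")
```

Under that assumed table, a macOS arm64 host has no matching entry and ends up on the java version, which is why dropping the java package entirely would break those machines.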
