
[ php-wasm ] add intl extension#2187

Merged
adamziel merged 38 commits into WordPress:trunk from mho22:add-intl-extension
May 23, 2025

@mho22
Collaborator

@mho22 mho22 commented Apr 16, 2025

Motivation for the change, related issues

Mostly based on @oskardydo's excellent work and pull request #2173

This pull request adds support for the intl extension in php-wasm from version 7.2 to version 8.4 on both Node and Web platforms. A new option, withICU, is now available on the Web platform to enable loading of the ICU data file.

const php = new PHP( await loadWebRuntime( '8.4', {

    withICU : true,

    emscriptenOptions : { ... }

} ) );

Crash risk!

Warning

On the Web platform, calling an intl function without loading the .dat file will crash PHP.

These functions are available because Intl is built as a static extension and included in all php.wasm binaries. However, they cannot operate without the required data. To avoid accidental crashes, you can explicitly disable specific Intl functions using the disable_functions directive in your php.ini file.
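As a rough illustration of the disable_functions suggestion, the sketch below builds a php.ini line from a list of intl function names. The helper name `buildDisableFunctionsIni` and the sample function list are assumptions made for illustration and are not part of this pull request. Note that `disable_functions` only covers functions, so classes such as NumberFormatter are not affected by it.

```typescript
// Hypothetical helper: turns a list of PHP function names into a
// `disable_functions` php.ini entry. The list below is only a sample
// of intl functions, not an exhaustive one.
const intlFunctionsToDisable: string[] = [
	'numfmt_create',
	'numfmt_format',
	'collator_create',
	'collator_compare',
	'datefmt_create',
	'datefmt_format',
];

function buildDisableFunctionsIni(functionNames: string[]): string {
	// Trim names, then drop blanks and duplicates so the directive stays tidy.
	const unique = Array.from(
		new Set(functionNames.map((name) => name.trim()).filter(Boolean))
	);
	return `disable_functions = ${unique.join(',')}`;
}

const iniLine = buildDisableFunctionsIni(intlFunctionsToDisable);
console.log(iniLine);
```

The resulting line would then be added to the php.ini file used by the runtime.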

Roadmap

Implementation details

This pull request consists of:

  1. Intl extension build.
  2. ICU data file generation using ICU version 74.2.
  3. Copying the data files into shared directories on the Node and Web platforms during PHP compilation.
  4. Injection of the Intl extension into the PHP compilation, with additional manipulation for versions below 7.3.
  5. Copying the data files from src/ to dist/ at build time for the Node and Web platforms.
  6. Implementation of dedicated with-icu-data files that load the ICU data file at PHP runtime: always on Node, optional on the Web platform through the withICU boolean option.
  7. Data file import simulation during the Web build, similar to .wasm files.
  8. Ignoring data imports in Playground directories, similar to ignoring .wasm imports.
  9. Adding intl function tests to php.spec.ts.
  10. Test file splitting and job isolation for multiple Node tests.

Some points deserve more explanation :

2. The Intl extension requires locale-specific data to function properly. This data is stored in an ICU data file named icudt74l.dat, which is generated during the static build process in the compile/libintl Dockerfile. Note that this file is quite large [ ~30MB ].

3. After PHP compilation, the ICU data file, similar to the .wasm files tied to the PHP version, is copied into a platform-specific newly created shared directory:

  • packages/php-wasm/node/src/lib/data/shared for Node
  • packages/php-wasm/web/public/shared for Web

Note: The original location for the Node version was at the root of the php-wasm/node directory, but this caused issues (see point 6 below).

5. During the build process, the ICU data file is copied from src to dist:

For Node, this is handled using esbuild in node/build.js.
For Web, since the shared directory is placed under public, Vite automatically copies its contents to the dist directory.

6. Once the file is in place, we need to load it into the PHP-WASM filesystem and keep track of its path. This must be done before completing the loadNodeRuntime or loadWebRuntime process.

This logic is handled by the data/with-icu-data.ts file, which exports a function returning a Promise that resolves to EmscriptenOptions:

1. An ENV variable with the ICU data file path : ICU_DATA: '/internal/shared',
2. A callback with access to the PHP runtime [ needed to access the filesystem ] : onRuntimeInitialized: (phpRuntime: PHPRuntime) => { ... }

Note: The original location of the data file couldn't remain in the root php-wasm/node directory because the filePath must remain consistent between src and dist:

const filePath = `${__dirname}/shared/${fileName}`;

The simplest solution was to move the shared directory next to the with-icu-data.ts file.
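To make the shape of that function more concrete, here is a minimal sketch of what it might return. It is not the code from data/with-icu-data.ts: the interfaces `FsLike`, `RuntimeLike` and `OptionsLike`, the name `withICUDataSketch` and the `readData` parameter are all stand-ins invented for this example. Only the ICU_DATA value, the file name and the onRuntimeInitialized callback come from the description above.

```typescript
// Minimal stand-ins for the real types; only what this sketch needs.
interface FsLike {
	mkdirTree(path: string): void;
	writeFile(path: string, data: Uint8Array): void;
}
interface RuntimeLike {
	FS: FsLike;
}
interface OptionsLike {
	ENV: Record<string, string>;
	onRuntimeInitialized: (runtime: RuntimeLike) => void;
}

const ICU_DIR = '/internal/shared';
const ICU_FILE = 'icudt74l.dat';

// Hypothetical version of the exported function. `readData` abstracts
// readFileSync (Node) or fetch (Web) so the sketch has no dependencies.
async function withICUDataSketch(
	readData: () => Promise<Uint8Array>
): Promise<OptionsLike> {
	const data = await readData();
	return {
		ENV: { ICU_DATA: ICU_DIR },
		onRuntimeInitialized: (runtime) => {
			// The virtual filesystem is only reachable through the runtime.
			runtime.FS.mkdirTree(ICU_DIR);
			runtime.FS.writeFile(`${ICU_DIR}/${ICU_FILE}`, data);
		},
	};
}

// Usage with an in-memory fake filesystem instead of a real PHP runtime.
const written = new Map<string, Uint8Array>();
const fakeRuntime: RuntimeLike = {
	FS: {
		mkdirTree: () => undefined,
		writeFile: (path, data) => {
			written.set(path, data);
		},
	},
};

const icuOptionsPromise = withICUDataSketch(
	async () => new Uint8Array([1, 2, 3])
);
```

Reading the data before the runtime exists and writing it only inside the callback is what lets the file be fetched while the runtime is still loading.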

Next, this Promise is executed within the loadPHPRuntime method, called inside either loadWebRuntime or loadNodeRuntime. The key difference lies in how and when it’s executed:

For Node, loading the file is straightforward and synchronous.

For Web, the ICU data file must be loaded concurrently with the .wasm file due to its size and performance implications. This loading is also optional: if the withICU option is false or absent from the loaderOptions, the file is not loaded.
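The concurrent, optional loading on the Web side could be sketched roughly as follows. The `Loader` type, `loadConcurrently` and the fake loaders are hypothetical names used only for this illustration; the real loaders fetch the .wasm module and the ICU data file.

```typescript
// Hypothetical loader signature used only in this sketch.
type Loader<T> = () => Promise<T>;

async function loadConcurrently(
	loadWasm: Loader<string>,
	loadIcu: Loader<Uint8Array>,
	withICU = false
): Promise<{ wasm: string; icu: Uint8Array | null }> {
	// Both promises are created before either is awaited, so the two
	// downloads overlap instead of running one after the other.
	const [wasm, icu] = await Promise.all([
		loadWasm(),
		withICU ? loadIcu() : Promise.resolve(null),
	]);
	return { wasm, icu };
}

// Usage with fake loaders that count how often the ICU file is requested.
let icuRequests = 0;
const fakeWasm: Loader<string> = async () => 'php_8_4.wasm';
const fakeIcu: Loader<Uint8Array> = async () => {
	icuRequests++;
	return new Uint8Array([0]);
};

const withoutIcu = loadConcurrently(fakeWasm, fakeIcu);
const withIcu = loadConcurrently(fakeWasm, fakeIcu, true);
```

When the flag is false or omitted, the ICU loader is never called, so the data file is not downloaded at all.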

7. While loading the file in Node is easy via readFileSync, the Web version requires more setup due to the build system.

To load the data using fetch, a helper JavaScript file [ icudt74l.js ] is added to the shared directory :

web/public/shared/icudt74l.js

import dataFilename from './icudt74l.dat';
export { dataFilename };

This enables Vite to locate the .dat file. However, since we want to load the data at runtime and not during build, we must tell Vite to ignore .dat file imports. This is done in vite.config.ts:

vite.config.ts at line 37

{
	name: 'ignore-data-imports',

	load(id: string): any {
		if (id?.endsWith('.dat')) {
			return {
				code: 'export default {}',
				map: null,
			};
		}
	},
},

This ensures the .dat file is excluded from the bundle and is instead loaded dynamically at runtime, and only if the withICU option is enabled. In effect, the data import is simulated.

10. Due to resource limitations, several Node test files were split and distributed across multiple jobs to ensure tests pass reliably:

php.spec.ts → 1 new job
php-request-handler → split into 3 files → 2 new jobs
php-asyncify → split into 4 files → 4 new jobs

This increases test-unit-asyncify to 8 jobs.

Build process

# Intl extension
cd packages/php-wasm/compile

# Asyncify
rm -rf libintl/asyncify/dist && make libintl_asyncify

# JSPI
rm -rf libintl/jspi/dist && make libintl_jspi

# PHP with INTL (run from the repository root)

# Node JSPI
node packages/php-wasm/compile/build.js --PLATFORM=node --PHP_VERSION=8.4 --output-dir=packages/php-wasm/node/jspi --WITH_JSPI=yes --WITH_INTL=yes

# Web Asyncify
node packages/php-wasm/compile/build.js --PLATFORM=web --PHP_VERSION=8.4 --output-dir=packages/php-wasm/web/public/php/asyncify --WITH_INTL=yes

Testing Instructions

Node

scripts/node.js

import { PHP } from '@php-wasm/universal';
import { loadNodeRuntime } from '@php-wasm/node';


const script = `<?php

$formatter = new \NumberFormatter('en-US', \NumberFormatter::CURRENCY);

var_dump($formatter->format(100.00));

$formatter = new \NumberFormatter('fr-FR', \NumberFormatter::CURRENCY);

var_dump($formatter->format(100.00));`



const php = new PHP( await loadNodeRuntime( '8.4' ) );

const result = await php.run( { code : script } );

console.log( result.text );

commands

JSPI : node --experimental-wasm-stack-switching scripts/node.js
Asyncify : node scripts/node.js

results

string(7) "$100.00"
string(11) "€ 100.00"

Web

scripts/web.js

import { PHP } from '@php-wasm/universal';
import { loadWebRuntime } from '@php-wasm/web';
import { EmscriptenDownloadMonitor } from '@php-wasm/progress';


const script = `<?php

$formatter = new \NumberFormatter('en-US', \NumberFormatter::CURRENCY);

var_dump($formatter->format(100.00));

$formatter = new \NumberFormatter('fr-FR', \NumberFormatter::CURRENCY);

var_dump($formatter->format(100.00));`


let wasmUrl = '';

const php = new PHP( await loadWebRuntime( '8.4', {

    withICU : true,

    emscriptenOptions : { ... }

} ) );

const result = await php.run( { code : script } );

console.log( result.text );

commands

JSPI : chrome://flags > Search for JSPI > Enable && npm run dev
Asyncify : npm run dev

results

[vite] connecting...
[vite] connected.

string(7) "$100.00"
string(11) "100,00 €"

@mho22
Collaborator Author

mho22 commented Apr 16, 2025

I tried to optimize the intl builds in different ways :

But first of all, these are the sizes of the compile/libintl directories before build :

data : 30,8 Mb [ icudt74l.dat file ]
include : 4,8 Mb
lib : 8 Mb


My first attempt was to greatly decrease the icudt74l.dat file with a filter.json file :

{
	"localeFilter": {
		"filterType": "language",
		"includelist": ["en"]
	}
}

This brought the file down to 12.7 Mb. Not great, not terrible.



The second attempt was to decrease the lib directory with new flags :

RUN set -euxo pipefail && \
    mkdir -p /root/lib && \
    source /root/emsdk/emsdk_env.sh && \
    CPPFLAGS="-DUCONFIG_NO_LEGACY_CONVERSION=1 -DUCONFIG_NO_COLLATION=1 -DUCONFIGU_NO_FORMATTING=1 -DUCONFIG_NO_TRANSLITERATION=1 -DUCONFIG_NO_REGULAR_EXPRESSIONS=1" \
    emconfigure /root/icu/source/configure \
Resulting sizes:
include : 4,8 Mb
lib : 6,8 Mb


RUN set -euxo pipefail && \
    mkdir -p /root/lib && \
    source /root/emsdk/emsdk_env.sh && \
    CPPFLAGS="-DUCONFIG_ONLY_COLLATION=1 -DUCONFIG_NO_LEGACY_CONVERSION=1 -DUCONFIG_NO_SERVICE=1" \
    emconfigure /root/icu/source/configure \
Resulting sizes:
include : 4,8 Mb
lib : 3,9 Mb

Unfortunately PHP won't compile with the -DUCONFIG_ONLY_COLLATION flag on.

So my best result so far is:

data : 12,7 Mb
include : 4,8 Mb
lib : 6,8 Mb


And here is a comparison between dependenciesTotalSize data from different php_8_4.js versions :

php 8.4 with intl

- export const dependenciesTotalSize = 16143865; 
+ export const dependenciesTotalSize = 18472927;
php 8.4 with intl -DUCONFIG_NO_LEGACY_CONVERSION=1 -DUCONFIG_NO_COLLATION=1 -DUCONFIGU_NO_FORMATTING=1 -DUCONFIG_NO_TRANSLITERATION=1 -DUCONFIG_NO_REGULAR_EXPRESSIONS=1

- export const dependenciesTotalSize = 16143865;
+ export const dependenciesTotalSize = 18135309;
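To put these numbers side by side, the short calculation below works out the overhead in each case. It reads the `-` lines as the build without intl and the `+` lines as the builds with intl; the variable names are only illustrative.

```typescript
// Values copied from the dependenciesTotalSize diffs above (bytes).
const baseline = 16143865; // php 8.4 without intl
const withIntl = 18472927; // php 8.4 with intl
const withIntlAndFlags = 18135309; // php 8.4 with intl and the extra flags

// Converts bytes to megabytes (10^6 bytes), rounded to two decimals.
const toMB = (bytes: number): string => (bytes / 1e6).toFixed(2);

const intlOverhead = withIntl - baseline;
const intlOverheadWithFlags = withIntlAndFlags - baseline;
const savedByFlags = withIntl - withIntlAndFlags;

console.log(`intl overhead:            ${intlOverhead} bytes (~${toMB(intlOverhead)} MB)`);
// intl overhead:            2329062 bytes (~2.33 MB)
console.log(`intl overhead with flags: ${intlOverheadWithFlags} bytes (~${toMB(intlOverheadWithFlags)} MB)`);
// intl overhead with flags: 1991444 bytes (~1.99 MB)
console.log(`saved by the flags:       ${savedByFlags} bytes (~${toMB(savedByFlags)} MB)`);
// saved by the flags:       337618 bytes (~0.34 MB)
```

In other words, the flags trim roughly 0.34 MB from an overhead of about 2.33 MB.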


Questions :

  1. Should I keep these flags ?
  2. Should I remove WITH_INTL from web ?

@adamziel
Collaborator

adamziel commented Apr 17, 2025

Should I keep these flags ?

What are the consequences of having them? Are we missing out on some languages or types of information? Or is it just more compressed? If we retain most information, yes, let's keep those flags on.

Should I remove WITH_INTL from web ?

Just to summarize the total download size impact for the JSPI build

  • php.data: ~29MB, new file, must be downloaded upfront
  • php_8_4.wasm: 15MB -> 17MB
  • php_8_4.js: More or less the same

29MB is way too large for a default download on the web, so let's leave WITH_INTL=false until we figure out how to ship extensions as dynamic libraries that can be declared in the Blueprint (e.g. XDebug). Even better if we had optimistic lazy loading without declarations.

The ICU version is 74.2, version used in base pull request. Should I use a more recent or older release ?

Would using the latest version be just a matter of changing the build configuration? If so, let's do it. However, if that would create additional compilation hurdles, let's stick with 74.2 for now. It's from December 2023 so still fairly recent.

The preloaded ICU data file is located in /internal/shared/preload. Since this file is mandatory in order for PHP to work properly, should I create another more specific directory ?

/internal/shared/preload is for PHP files that are preloaded with auto_prepend_file. Just /internal/shared should be fine.

I didn't find another way to inject ICU_DATA inside the php-wasm build except with the use of --pre-js. But instead of adding yet another possibly useless file, do you think of a better way? Maybe that --pre-js flag and .env.js file can be useful in the future?

Thinking about Node.js, a separate file seems fine. Here's a few thoughts I had:

  • Using an actual meaningful library name would be more helpful than shipping a file called php.data
  • With dynamic libraries, we'll need to separate the dependencies of every library. We may potentially ship them as separate via npm packages eventually, or have a small package repository embedded in the Playground repo.
  • Do we know why it only works via --pre-js? Does the file need to be present when the first WASM function call is made, for example? If so, would creating it in the initializeRuntime() method still work?

@mho22
Collaborator Author

mho22 commented Apr 18, 2025

-DUCONFIG_NO_LEGACY_CONVERSION=1 disables support for legacy encodings, like ISO-8859-1, Shift-JIS, etc.
-DUCONFIG_NO_REGULAR_EXPRESSIONS=1 disables ICU's regex engine, but PHP itself uses the PCRE extension for regex.

Adding the others will disable some php functions like collator_compare or numfmt_format. I decided to remove them.

Here are the different sizes without and with intl.

php 8.4 without intl 

php_8_4.wasm: 16,1 Mb
php_8_4.js: 148 Kb
php 8.4 with intl without filters

data : 31,9 Mb
include : 5,1 Mb
lib : 8,5 Mb

php.data: 31,9 Mb
php_8_4.wasm: 18,5 Mb
php_8_4.js: 153 Kb
php 8.4 with intl -DUCONFIG_NO_LEGACY_CONVERSION=1 -DUCONFIG_NO_REGULAR_EXPRESSIONS=1 without filters

data : 31,9 Mb
include : 5,1 Mb
lib : 8,2 Mb

php.data: 31,9 Mb
php_8_4.wasm: 18,4 Mb
php_8_4.js: 153 Kb

These builds are made with latest ICU version 77.1. Nothing more has to be done to make this version work.



29MB is way too large for a default download on the web, so let's leave WITH_INTL=false until we figure out how to ship extensions as dynamic libraries that can be declared in the Blueprint (e.g. XDebug). Even better if we had optimistic lazy loading (#89) without declarations.

Just to be sure, should I disable intl completely or only in web :

web: {
    WITH_INTL: 'no',
},

Using an actual meaningful library name would be more helpful than shipping a file called php.data

If you add multiple --preload-file files in emcc all of these preloaded files will be stored in that php.data, that is why I didn't want to rename it intl.data for example.

With dynamic libraries, we'll need to separate the dependencies of every library. We may potentially ship them as separate via npm packages eventually, or have a small package repository embedded in the Playground repo.

If I understand that correctly : We have two strategies here. The first is the asyncify way, where we ship extensions as dynamic libraries [ .so files, I guess ]. This means keeping asyncify. But this seems to be normal since jspi is still experimental. On the other hand, with JSPI, we have lazy loading. JSPI lazy loading seems really promising and I will be glad to contribute to that, but I suppose dynamic libraries is more short term than lazy loading. Correct ?

Do we know why it only works via --pre-js? Does the file need to be present when the first WASM function call is made, for example? If so, would creating it in the initializeRuntime() method still work?

I am still investigating this, but simply setting the environment variable ENV.ICU_DATA = "/internal/shared" before the line callRuntimeCallbacks(__ATINIT__), which runs a single callback : (...args) => original(...args), works.

node/jspi/php_8_4.js on line 346:

function initRuntime() {
    runtimeInitialized = true;
    SOCKFS.root = FS.mount(SOCKFS, {}, null);
    if (!Module["noFSInit"] && !FS.init.initialized)
      FS.init();
    FS.ignorePermissions = false;
    TTY.init();
    PIPEFS.root = FS.mount(PIPEFS, {}, null);
    ENV.ICU_DATA = "/internal/shared";  // This works
    callRuntimeCallbacks(__ATINIT__);
    // ENV.ICU_DATA = "/internal/shared"; This doesn't work
}

Adding that line in the --pre-js file will run the content of that file before the initRuntime function and therefore run it successfully.
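The ordering constraint can be pictured with a small simulation. It assumes, without this having been verified against Emscripten's sources, that one of the `__ATINIT__` callbacks takes a snapshot of `ENV` for the native side. Every name below is a stand-in and none of it is Emscripten's actual code.

```typescript
type Env = Record<string, string>;

// Stand-in for initRuntime(): the single init callback copies ENV once,
// so changes made to ENV afterwards are invisible to the "native" side.
function simulateInit(setBeforeCallbacks: boolean): string | undefined {
	const ENV: Env = {};
	let nativeEnviron: Env = {};
	const atInit: Array<() => void> = [
		() => {
			nativeEnviron = { ...ENV };
		},
	];

	if (setBeforeCallbacks) {
		ENV['ICU_DATA'] = '/internal/shared';
	}
	atInit.forEach((callback) => callback());
	if (!setBeforeCallbacks) {
		ENV['ICU_DATA'] = '/internal/shared'; // too late in this model
	}

	return nativeEnviron['ICU_DATA'];
}

console.log(simulateInit(true)); // /internal/shared
console.log(simulateInit(false)); // undefined
```

Under that assumption, anything that sets the variable before the callbacks run would work equally well, whether it is a --pre-js file or an ENV option passed at load time.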

This is not the initializeRuntime you were looking for I guess.

@adamziel
Collaborator

Just to be sure, should I disable intl completely or only in web :

Only in web, let's still build the Node version with intl since the bundle size doesn't matter that much there. Could we reuse the same .data file for all the PHP versions to keep the npm package size small?

@adamziel
Collaborator

If you add multiple --preload-file files in emcc all of these preloaded files will be stored in that php.data, that is why I didn't want to rename it intl.data for example

Gotcha! What would it take to still rename it, though? Would it be as simple as a string replacement in the built php.js? Or is there more to it? If it's complex, let's leave it.

This means keeping asyncify.

We'll need to keep Asyncify until WebKit (Safari, Bun) supports JSPI 😢

I suppose dynamic libraries is more short term than lazy loading. Correct ?

Yes, e.g. XDebug is a dynamic library and it's a short term priority. Lazy loading will be challenging in that we'll need to create extension stubs with the right function signatures to trick PHP into thinking it actually loaded the extension.

This is not the initializeRuntime you were looking for I guess.

I've meant this one:

https://github.com/Automattic/wordpress-playground-private/blob/0d16adc6c1935037099e7d34466afd14d158be23/packages/php-wasm/universal/src/lib/php.ts#L212

But it seems to be called too late. Hm. There's always the ENV here that we can control without messing with the php.js module:

https://github.com/Automattic/wordpress-playground-private/blob/0d16adc6c1935037099e7d34466afd14d158be23/packages/php-wasm/universal/src/lib/load-php-runtime.ts#L141

Perhaps there's some elegant way of injecting that env variable from here:

https://github.com/Automattic/wordpress-playground-private/blob/0d16adc6c1935037099e7d34466afd14d158be23/packages/php-wasm/node/src/lib/load-runtime.ts#L21

Or maybe baking it into the php.js module is for the best, since it depends on the build options. Looping in @brandonpayton for thoughts

@mho22
Collaborator Author

mho22 commented Apr 19, 2025

Could we reuse the same .data file for all the PHP versions to keep the npm package size small?

Yes but there is one for asyncify and another for jspi. Should I investigate for one unique file?



What would it take to still rename it, though?

It is as easy as it looks. What would you like to name it? Maybe intl.data?



Perhaps there's some elegant way of injecting that env variable from here:

This works as you mentioned :

const runtime = await loadNodeRuntime( '8.4', { emscriptenOptions : { ENV : { ICU_DATA : "/internal/shared" } } } );

However, it implies that the path can be changed, while in reality it's fixed at build time based on this line in php/Dockerfile:

echo -n ' --preload-file /root/lib/data@/internal/shared ' >> /root/.emcc-php-wasm-flags; \

But it is indeed way more elegant and it means we can avoid a --pre-js file during build.

@mho22
Collaborator Author

mho22 commented Apr 20, 2025

Actually, this works as well without --pre-js and --preload-file during php build :

const script = `<?php $formatter = new \NumberFormatter('en-US', \NumberFormatter::CURRENCY);

var_dump($formatter->format(100.00));

$formatter = new \NumberFormatter('fr-FR', \NumberFormatter::CURRENCY);

var_dump($formatter->format(100.00));
`;




const php = new PHP( await loadNodeRuntime( '8.4', { emscriptenOptions : { ENV : { ICU_DATA : "/icu-data-path" } } } ) );

fs.readFile( 'data/icudt74l.dat', async ( error, data ) =>
{
    php.mkdir( '/icu-data-path' );

    php.writeFile( '/icu-data-path/icudt74l.dat', data );

    const result = await php.run( { code : script } );

    console.log( result.text );
} );


node --experimental-wasm-stack-switching scripts/node.js

string(7) "$100.00"
string(9) "€100.00"




import script from '../php/intl.php?raw';

...

const php = new PHP( await loadWebRuntime( '8.4', { emscriptenOptions : { ENV : { ICU_DATA : "/icu-data-path" }, ... } } ) );

fetch( 'data/icudt74l.dat' ).then( async data =>
{
    php.mkdir( '/icu-data-path' );

    php.writeFile( '/icu-data-path/icudt74l.dat', new Uint8Array( await data.arrayBuffer() ) );

    const result = await php.run( { code : script } );

    console.log( script );

    console.log( result.text );

    const phpinfo = await php.run( { code : "<?php echo phpinfo();" } );

    document.getElementById( 'app' ).innerHTML = phpinfo.text;
} );

web.js

[Screenshot, 2025-04-21 10:19:44]

So we could avoid having to add the big icudt74l.dat data file inside the builds, letting users use the data file version they want and perhaps enable WITH_INTL for web and node without extra data file ? The wasm file will still be 2Mb heavier per php version.

We should probably provide some documentation about the process if we decide to go for that solution.

@brandonpayton
Member

Perhaps there's some elegant way of injecting that env variable from here:

https://github.com/Automattic/wordpress-playground-private/blob/0d16adc6c1935037099e7d34466afd14d158be23/packages/php-wasm/node/src/lib/load-runtime.ts#L21

Or maybe baking it into the php.js module is for the best, since it depends on the build options. Looping in @brandonpayton for thoughts

If we are configuring a fixed path that we completely control, it seems like it would be cleanest to just bake a global into the build. I haven't digested all the details in this PR, but adding another --pre-js file that is populated conditionally seems like a fine approach.

@adamziel
Collaborator

So we could avoid having to add the big icudt74l.dat data file inside the builds, letting users use the data file version they want and perhaps enable WITH_INTL for web and node without extra data file ? The wasm file will still be 2Mb heavier per php version.

This is great! Lovely! To confirm my understanding:

  • Every php.wasm version would be 2MB larger
  • The npm packages for php-wasm/node and php-wasm/web would both ship a single icudt74l.dat file that's under 20MB
  • We can choose whether or not to download that file. If we do, the intl extension just works. If we don't, php still works, but the PHP refuses to load the intl extension.

Is that right? If yes then yes, let's build all php versions WITH_INTL. Then, separately from this PR, let's discuss the API to load the dat file on the web. In Node we can just always load it.

@mho22
Collaborator Author

mho22 commented Apr 24, 2025

@adamziel That's right! I probably need some extra information :

  • Where should I store the dat file in php-wasm/web and php-wasm/node ?
  • Where should I load the dat file in Node ? In the php/Dockerfile build with --pre-js for the env variable and --preload-file ? Or maybe more elegantly [ I currently don't know how but there is certainly another way ]

If we don't, php still works, but the PHP refuses to load the intl extension.

PHP will still work, and intl extension will be loaded, but when running intl functions without the data from the dat file, php exceptions will be thrown.

@adamziel
Collaborator

Where should I store the dat file in php-wasm/web and php-wasm/node?

For php-wasm/web, the public directory seems reasonable. For php-wasm/node, I'm not sure – feel free to propose something. The most important part is to make sure it's shipped with the built package and double-check it's being loaded. Unfortunately we don't have any post-build smoke tests.

Where should I load the dat file in Node ? In the php/Dockerfile build with --pre-js for the env variable and --preload-file ? Or maybe more elegantly [ I currently don't know how but there is certainly another way ]

I'm confused. I thought it worked as well without --pre-js and --preload-file? In which case we'd load it via fetch or node:fs somewhere around getPHPLoaderModule?

PHP will still work, and intl extension will be loaded, but when running intl functions without the data from the dat file, php exceptions will be thrown.

This is fine for v1. For v2, let's explore disabling those functions – I worry some developers might check the availability of the intl extension with a simplistic function_exists() check.

@mho22
Collaborator Author

mho22 commented May 2, 2025

@adamziel I was wrong about the size of icudt74l.dat. It is not 20Mb, but 30.8Mb. And I assume this is much more than expected. I think I need to make a summary of what this pull request is trying to do :



  • We can enable intl on every node and web PHP version from 7.2 up. Even without the ICU data file, PHP-WASM still runs.
  • PHP-WASM will return errors when trying to run related code and functions from the intl extension if there is no ICU_DATA environment variable added :
Uncaught IntlException: Constructor failed in /internal/eval.php:3

Users could add it manually in loadNodeRuntime or loadWebRuntime through emscriptenOptions OR I could add it in universal/src/lib/load-php-runtime.js on line 142:

ENV: {
	ICU_DATA : "/icu-data-path"
},


  • PHP-WASM will return errors when trying to run related code and functions from the intl extension if there is no data file available in the ICU_DATA directory mentioned in the environment variable above:
Uncaught IntlException: Constructor failed in /internal/eval.php:3

Users could add it manually after PHP-WASM is loaded this way :

const php = new PHP( await loadNodeRuntime( '8.3' ) );

php.mkdir( '/icu-data-path' );

php.writeFile( '/icu-data-path/icudt74l.dat', fs.readFileSync( 'node_modules/@php-wasm/node/shared/icudt74l.dat' ) );


OR I should do it in the code, around getPHPLoaderModule as you said, but honestly, I don't know exactly where. I need to access the PHP-WASM FS and also the data file with node:fs or fetch but to access the FS I need to have php-node or php-web ready. And I don't. So the only way I made it was with this :

node/src/lib/load-runtime.ts

export async function loadNodeRuntime(
	phpVersion: SupportedPHPVersion,
	options: PHPLoaderOptions = {}
) {
	const emscriptenOptions: EmscriptenOptions = {...};

	const id = await loadPHPRuntime(
		await getPHPLoaderModule(phpVersion),
		await withNetworking(emscriptenOptions)
	);

	const php = new PHP( id );

	php.mkdir( '/icu-data-path' );

	php.writeFile( '/icu-data-path/icudt74l.dat', new Uint8Array( readFileSync( `${__dirname}/shared/icudt74l.dat` ) ) );

	return id;
}

And yes, this is really bad.



But now this code works without having to set an ENV variable or load a data file myself :

import { PHP } from '@php-wasm/universal';
import { loadNodeRuntime } from '@php-wasm/node';


const code = `<?php

$formatter = new \NumberFormatter('en-US', \NumberFormatter::CURRENCY);

var_dump($formatter->format(100.00));

$formatter = new \NumberFormatter('fr-FR', \NumberFormatter::CURRENCY);

var_dump($formatter->format(100.00));`




const php = new PHP( await loadNodeRuntime( '8.3' ) );

const result = await php.run( { code : code } );

console.log( result.text );
> node --experimental-wasm-stack-switching scripts/node.js

string(7) "$100.00"
string(11) "100,00 €"


But honestly, this is not the right solution. What do you think about it ?



I'm confused. I thought it worked as well without --pre-js and --preload-file?

Apologies for the confusion. It works as well without, I just wanted to know what was the best way for you, and it seems to be the "after runtime loaded" way.

@mho22
Collaborator Author

mho22 commented May 2, 2025

@adamziel Regarding the directories, I suggest creating a shared directory in the following locations:

- dist/packages/php-wasm/web
- dist/packages/php-wasm/node
- packages/php-wasm/node
- packages/php-wasm/web/public

This setup will streamline the transfer of the icudt74l.dat file from the compile directory to the dist directory. The flow would look like this:



npm run recompile:php:node:jspi:8.3

packages/php-wasm/compile/libintl/icudt74l.dat    >>    packages/php-wasm/node/shared/icudt74l.dat

nx run php-wasm-node:build

packages/php-wasm/node/shared/icudt74l.dat    >>    dist/packages/php-wasm/node/shared/icudt74l.dat


npm run recompile:php:web:jspi:8.3

packages/php-wasm/compile/libintl/icudt74l.dat    >>    packages/php-wasm/web/public/shared/icudt74l.dat

nx run php-wasm-web:build

packages/php-wasm/web/public/shared/icudt74l.dat    >>    dist/packages/php-wasm/web/shared/icudt74l.dat


I think shared is a suitable name for a directory composed of files that are used by each php version. What do you think ?

@adamziel
Collaborator

adamziel commented May 2, 2025

Shared directory sounds great! The rest I'll address on Monday, but the rule of thumb is this: we don't want to increase the minimum download size by more than 2-3 MB

@adamziel
Collaborator

adamziel commented May 5, 2025

PHP-WASM will return errors when trying to run related code and functions from the intl extension if there is no data file available in the ICU_DATA directory mentioned in the environment variable above:

Ideally we'd disable these functions when the file is missing. It's tricky with a statically built extension – we'd either need to disable_functions in php.ini or remove the functions from the PHP functions hashtable. It's not blocking for this PR, though, we could ship it with functions that trigger an error when called and then iterate on it in a subsequent PR.

php-web ready. And I don't. So the only way I made it was with this :
node/src/lib/load-runtime.ts

export async function loadNodeRuntime(
	phpVersion: SupportedPHPVersion,
	options: PHPLoaderOptions = {}
) {
	const emscriptenOptions: EmscriptenOptions = {...};

	const id = await loadPHPRuntime(
		await getPHPLoaderModule(phpVersion),
		await withNetworking(emscriptenOptions)
	);
	const php = new PHP( id );
	php.mkdir( '/icu-data-path' );
	php.writeFile( '/icu-data-path/icudt74l.dat', new Uint8Array( readFileSync( `${__dirname}/shared/icudt74l.dat` ) ) );
	return id;
}

And yes, this is really bad.

It's not bad at all! But it would be a bit cleaner if we added another with function, e.g.:

 	const id = await loadPHPRuntime(
 		await getPHPLoaderModule(phpVersion),
 		await withNetworking(emscriptenOptions),
 		await withIcuData()
 	);

Where withIcuData() would be runtime-specific. In Node, we'd read a file from disk. In the browser, we'd fetch().

We could parallelize all the async stuff to speed it up:

 	const args = await Promise.all([
 		getPHPLoaderModule(phpVersion),
 		withNetworking(emscriptenOptions),
 		withIcuData()
 	]);
 	const id = await loadPHPRuntime( ...args );

And, finally, make the ICU loading conditional in the browser:

 	const args = await Promise.all([
 		getPHPLoaderModule(phpVersion),
 		withNetworking(emscriptenOptions),
 		options.withICU && withIcuData()
 	]);
 	const id = await loadPHPRuntime( ...args );

That way, loadWebRuntime() caller could decide whether they want the additional power and overhead or not. It would also work if we fetched in php.js, but I think it only runs once we have the wasm stream meaning we'd be able to parallelize the downloads less.

@adamziel
Collaborator

adamziel commented May 5, 2025

Aha, there's no way to access the php instance in the withIcuData() because we haven't created one yet. That's fine. Feel free to refactor loadPHPRuntime(). Perhaps we could utilize the onRuntimeInitialized callback? Or introduce another, similar one? Or offload some of that to the php.js module after all and initialize the ICU data file when a ReadableStream is provided in the initialization options?

@bgrgicak
Collaborator

bgrgicak commented May 6, 2025

Perhaps we could utilize the onRuntimeInitialized callback? Or introduce another, similar one?

In dynamic symlink mounting, I used onRuntimeInitialized to access PHP in loadNodeRuntime.

@mho22
Collaborator Author

mho22 commented May 7, 2025

@adamziel Thanks for pointing me in the right direction!
@bgrgicak Thanks a lot! Your suggestions made it much easier to implement mine!

Since the Files changed section isn’t the most readable, here’s a quick breakdown of what I coded:



First, the data file copy from compile to web and node :

compile/build.js at line 267:

// Copy data files
const libDir = path.resolve(process.cwd(), 'packages/php-wasm/compile');
const publicDir =
	platform === 'node'
		? `${path.dirname(outputDir)}`
		: `${path.dirname(path.dirname(outputDir))}`;
if (getArg('WITH_INTL').endsWith('yes')) {
	await asyncSpawn(
		'cp',
		[`${libDir}/libintl/icudt74l.dat`, `${publicDir}/shared/icudt74l.dat`],
		{ cwd: sourceDir, stdio: 'inherit' }
	);
}


Next is the copy from node to dist/packages/php-wasm/node:

node/build.js at line 11:

try {
	fs.mkdirSync('dist/packages/php-wasm/node/shared');

	fs.copyFileSync(
		'packages/php-wasm/node/shared/icudt74l.dat',
		'dist/packages/php-wasm/node/shared/icudt74l.dat'
	);
} catch (e) {
	// Ignore
}


Nothing to do on the web side, since the build copies everything from src/public to dist/packages/php-wasm/web.



Next up is the data file loading.
As mentioned in a previous comment, this requires setting the ICU_DATA environment variable and ensuring the data file is properly loaded. However, since files are accessed differently in node and web, this needs to be handled in two distinct ways.

Let’s start with the initialization of the environment variable:



universal/src/lib at line 141 :

ENV: {
	ICU_DATA: '/shared',
},


Then the node-side data file loading.

node/src/lib/load-runtime.ts at line 39:

...(options.emscriptenOptions || {}),
onRuntimeInitialized: (phpRuntime: PHPRuntime) => {
	/*
	* An ICU data file must be loaded to support Intl extension.
	* To achieve this, a shared directory is mounted and referenced
	* via the ICU_DATA environment variable.
	* By default, this variable is set to `/shared`,
	* which corresponds to the actual file location.
	*/
	const icuFileName = 'icudt74l.dat';
	const icuFilePath = `${__dirname}/shared/${icuFileName}`;
	if (
		!FSHelpers.fileExists(phpRuntime.FS, `${phpRuntime.ENV.ICU_DATA}/${icuFileName}`) &&
		fs.existsSync(icuFilePath)
	) {
		phpRuntime.FS.mkdirTree(phpRuntime.ENV.ICU_DATA);
		phpRuntime.FS.writeFile(
			`${phpRuntime.ENV.ICU_DATA}/${icuFileName}`,
			new Uint8Array(fs.readFileSync(icuFilePath))
		);
	}
},


The data file will be written to whatever path ICU_DATA points to. By default that is /shared, which corresponds to the actual location in the node package.
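
One detail worth noting in the snippet above: because `onRuntimeInitialized` is assigned after the spread of `options.emscriptenOptions`, it replaces any `onRuntimeInitialized` callback the caller supplied there. A minimal sketch of chaining both callbacks, assuming that preserving the caller's callback is desired; `chainOnRuntimeInitialized` and the simplified `Runtime` type are hypothetical names used only for illustration:

```typescript
// Simplified stand-in for the real runtime object; it only records what ran.
type Runtime = { FS: { written: string[] } };
type RuntimeCallback = (runtime: Runtime) => void;

// Returns a callback that runs the caller's callback (if any) before ours.
function chainOnRuntimeInitialized(
	userCallback: RuntimeCallback | undefined,
	ownCallback: RuntimeCallback
): RuntimeCallback {
	return (runtime) => {
		userCallback?.(runtime);
		ownCallback(runtime);
	};
}

// Options as a caller might pass them in.
const userOptions = {
	onRuntimeInitialized: (runtime: Runtime) => {
		runtime.FS.written.push('user');
	},
};

// Spread first, then install a callback that still honors the caller's one.
const emscriptenOptions = {
	...userOptions,
	onRuntimeInitialized: chainOnRuntimeInitialized(
		userOptions.onRuntimeInitialized,
		(runtime) => {
			runtime.FS.written.push('icu');
		}
	),
};
```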



On the web side, however, ICU loading is controlled via a withICU option:

web/src/lib/load-runtime.ts on line 48:

export async function loadWebRuntime(
	phpVersion: SupportedPHPVersion,
	options: LoaderOptions = {}
) {
	const emscriptenOptions: EmscriptenOptions = {
		...(options.emscriptenOptions || {}),
		onRuntimeInitialized: async (phpRuntime: PHPRuntime) => {
			/*
			* An ICU data file must be loaded to support Intl extension.
			* To achieve this, a shared directory is mounted and referenced
			* via the ICU_DATA environment variable.
			* By default, this variable is set to `/shared`,
			* which corresponds to the actual file location.
 			* The web version requires a `loaderOption` to load ICU data.
			*/
			if (options?.withICU === true) {
				const icuFileName = 'icudt74l.dat';
				const icuFilePath = 'node_modules/@php-wasm/web/shared/icudt74l.dat';
				if (
					!FSHelpers.fileExists(phpRuntime.FS, `${phpRuntime.ENV.ICU_DATA}/${icuFileName}`) &&
					(await fetch(icuFilePath, {method: 'HEAD'})).ok
				) {
					phpRuntime.FS.mkdirTree(phpRuntime.ENV.ICU_DATA);
					phpRuntime.FS.writeFile(
						`${phpRuntime.ENV.ICU_DATA}/${icuFileName}`,
						new Uint8Array(await (await fetch(icuFilePath)).arrayBuffer())
					);
				}
			}
		}
	}

	const phpLoaderModule = await getPHPLoaderModule(phpVersion);
	options.onPhpLoaderModuleLoaded?.(phpLoaderModule);
	const websocketExtension = options.tcpOverFetch
		? tcpOverFetchWebsocket(options.tcpOverFetch)
		: fakeWebsocket();
	return await loadPHPRuntime(phpLoaderModule, {
		...emscriptenOptions,
		...websocketExtension,
	});


This setup works, but in my separate test project, hardcoding the path to the ICU data file like this:

const icuFilePath = 'node_modules/@php-wasm/web/shared/icudt74l.dat';

feels like bad practice. I’m struggling to find a cleaner, more flexible approach. I've always found path resolution a bit tricky. Any suggestions?



Another consideration : to fetch() the data file in web/src/lib/load-runtime.ts I had to make the root onRuntimeInitialized function in load-php-runtime.ts async :

load-php-runtime.ts at line 150:

async onRuntimeInitialized() {
	if (phpModuleArgs.onRuntimeInitialized) {
		await phpModuleArgs.onRuntimeInitialized(PHPRuntime);
	}
	resolvePHP();
},

Is this relevant enough ?

@adamziel
Collaborator

adamziel commented May 7, 2025

About the hardcoded path, there are two stages to consider.

Take the php_8_2.js file and the process of loading the related php_8_2.wasm asset:

Stage 1: Building a reusable npm package

php_8_2.js "imports" a wasm file, but in reality it only gets a URL of that file. I think vite.config.js makes that happen.

import dependencyFilename from './8_2_10/php_8_2.wasm'; 
export { dependencyFilename }; 

When the @php-wasm/web package is then built, this is the ESM shipped on npm:

import dependencyFilename from './8_2_10/php_8_2.wasm'; 
export { dependencyFilename }; 

Stage 2: Building playground.wordpress.net with specific paths

The final build stage relies in part on the vite.config.js files shipped by the remote and website packages. Here's what the path looks like in the final php_8_2.js asset shipped with the Playground website:

var hi="/assets/php_8_2-5df719e6.wasm";
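
For the ICU data file, one way to avoid a hardcoded path under `node_modules` is to resolve the asset relative to the URL of the module that ships it. The sketch below is an assumption, not what this PR ended up doing: `resolveAssetUrl` is a hypothetical helper, the example location is made up, and the base URL is passed in explicitly (inside an ES module it would typically be `import.meta.url`). Whether the file actually ends up at that location after bundling depends on the build configuration.

```typescript
// Resolves a file that ships next to a module, given that module's own URL.
function resolveAssetUrl(relativePath: string, moduleUrl: string): string {
	return new URL(relativePath, moduleUrl).href;
}

// Example with a made-up module location.
const icuUrl = resolveAssetUrl(
	'./shared/icudt74l.dat',
	'https://example.test/assets/php-wasm-web/index.js'
);
// icuUrl === 'https://example.test/assets/php-wasm-web/shared/icudt74l.dat'
```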

@adamziel
Collaborator

adamziel commented May 7, 2025

Another consideration : to fetch() the data file in web/src/lib/load-runtime.ts I had to make the root onRuntimeInitialized function in load-php-runtime.ts async :

The built emscripten module doesn't await there so I'd say it cannot be async. But you could fetch earlier:

	/*
	* An ICU data file must be loaded to support Intl extension.
	* To achieve this, a shared directory is mounted and referenced
	* via the ICU_DATA environment variable.
	* By default, this variable is set to `/shared`,
	* which corresponds to the actual file location.
 	* The web version requires a `loaderOption` to load ICU data.
	*/
	if (options?.withICU === true) {
		// fetch()
		// Add the onRuntimeInitialized callback to emscriptenOptions
	}

	const phpLoaderModule = await getPHPLoaderModule(phpVersion);
	options.onPhpLoaderModuleLoaded?.(phpLoaderModule);
	const websocketExtension = options.tcpOverFetch
		? tcpOverFetchWebsocket(options.tcpOverFetch)
		: fakeWebsocket();
	return await loadPHPRuntime(phpLoaderModule, {
		...emscriptenOptions,
		...websocketExtension,
	});

Also, consider a separate async withIcu() helper to parallelize downloading the WASM file and the ICU data file.
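
A minimal sketch of that suggestion, assuming the data file is downloaded before the runtime starts so that `onRuntimeInitialized` can stay synchronous. The `withIcu` and `loadInParallel` signatures, the injected `fetchBytes` function, and the simplified types are all hypothetical; only the `ICU_DATA` variable and the `icudt74l.dat` file name come from the discussion above:

```typescript
type FS = {
	mkdirTree(path: string): void;
	writeFile(path: string, data: Uint8Array): void;
};
type PHPRuntime = { FS: FS; ENV: Record<string, string> };
type EmscriptenOptions = {
	onRuntimeInitialized?: (runtime: PHPRuntime) => void;
};

// Downloads the data file up front and returns options whose
// onRuntimeInitialized callback does no asynchronous work.
async function withIcu(
	icuUrl: string,
	fetchBytes: (url: string) => Promise<Uint8Array>
): Promise<EmscriptenOptions> {
	const icuData = await fetchBytes(icuUrl);
	return {
		onRuntimeInitialized(runtime) {
			const dir = runtime.ENV.ICU_DATA;
			runtime.FS.mkdirTree(dir);
			runtime.FS.writeFile(`${dir}/icudt74l.dat`, icuData);
		},
	};
}

// Starts both downloads before awaiting either, so they can overlap.
async function loadInParallel<T>(
	loadModule: () => Promise<T>,
	icuUrl: string,
	fetchBytes: (url: string) => Promise<Uint8Array>
) {
	return Promise.all([loadModule(), withIcu(icuUrl, fetchBytes)]);
}
```

In a browser, `fetchBytes` could wrap `fetch()` and `arrayBuffer()`; injecting it keeps the sketch testable without a network.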

@mho22
Collaborator Author

mho22 commented May 21, 2025

You were right to be concerned about the canceled operation.

It appears that php-request-handler.spec.ts is causing the first check to crash. I isolated it in the first job, but one job isn't sufficient. Even the first describe block, when isolated, is too resource-intensive. I'll likely need to split that describe block into multiple files and jobs. The rest of the file is fine.

@adamziel
Collaborator

Maybe separating php 7.2-8.0 from 8.1-8.4 could also help? I wonder if there's a single test somewhere that takes a lot of memory and we run it 10 times - once for each php version

@mho22
Collaborator Author

mho22 commented May 22, 2025

@adamziel Splitting php-request-handler tests made the checks pass! So now, I have two questions:

  1. Should I split up the first describe block into more descriptive pieces instead of by PHP version, or is splitting by PHP version enough?

  2. Do you think I can fix the last failing test-e2e-playwright checks, which are related to artifacts not being found? All green checks would be satisfying.

@adamziel
Collaborator

@mho22 1. Up to you :) 2. Yes please!

@adamziel
Collaborator

Let's brush up the PR description with all the necessary details for posterity and we can merge

@mho22
Collaborator Author

mho22 commented May 22, 2025

@adamziel For the record, the failing delete-artifacts GitHub action was caused by its version no longer being compatible with actions/upload-artifact@v4, as mentioned here.

I also rewrote the pull request description. Do you see anything missing?

@adamziel
Collaborator

adamziel commented May 22, 2025

The details section reads weird, e.g.:

Data file loading emulation during build for web platform, identical to wasm files.

Was it AI generated? Let's make it specific - what's the shared directory path? Why this design choice over another? What else was tried and didn't work? Etc. PR descriptions are a form of documentation and we'll be revisiting this one soon to learn about this or that aspect of intl support - let's provide a nice writeup for the next person.

@mho22
Collaborator Author

mho22 commented May 22, 2025

To be honest, I reviewed the files changed in this pull request and listed each step I took to complete it. I didn’t use AI, but it seems I wasn’t precise enough. Let me provide a more detailed explanation.

@mho22
Collaborator Author

mho22 commented May 22, 2025

@adamziel Done. Is this what you had in mind? I intentionally skipped some minor points and focused on explaining the most important ones, along with a bit of background context.

@adamziel
Collaborator

adamziel commented May 22, 2025

This is great, thank you! And it seems like I was too hasty with my AI comment, sorry about that. The description is great and gives so much context now ❤️ I'd just add a few more things:

  • explicitly say in the first paragraph that all php versions are built with intl now, web and node
  • document the gotcha where intl functions will exist but fail when the dat file is not loaded - maybe even with the colorful GitHub notice syntax to make it very prominent:

Crash risk!

[!WARNING] Crash risk
Calling an intl function without loading the dat file will crash PHP.

Those functions are available because intl is built as a static extension and baked into all PHP.wasm binaries. ... it's a risk ... we still take it ... potential solutions ...

Edit: it didn't work here 🙈 maybe because I'm writing this comment from the iOS app? I meant this: https://github.com/orgs/community/discussions/16925
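
The PR description mentions the `disable_functions` directive as one way to guard against this crash. A minimal sketch of generating that php.ini line from a list of function names; `buildDisableFunctionsIni` is a hypothetical helper, and the two functions listed are only examples of intl functions, not a complete list:

```typescript
// Builds a php.ini line that disables the given functions.
// Names are trimmed, de-duplicated, and joined with commas.
function buildDisableFunctionsIni(functionNames: string[]): string {
	const unique = Array.from(
		new Set(functionNames.map((name) => name.trim()))
	).filter((name) => name.length > 0);
	return `disable_functions = ${unique.join(',')}`;
}

const iniLine = buildDisableFunctionsIni(['numfmt_create', 'numfmt_format']);
// iniLine === 'disable_functions = numfmt_create,numfmt_format'
```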

Other than that, I only have one nitpick - a matter of taste really: There are a lot of empty lines in the description. It doesn't really matter if it's this way or the other way, but since:

  • these descriptions become commit messages
  • other commit messages use denser formatting

It would be nice to condense these empty lines for consistency.

@mho22
Collaborator Author

mho22 commented May 22, 2025

@adamziel I’ve updated the description with your suggestions. Let me know if there’s anything else that needs to be adjusted.

Ah and sorry for the extra empty lines! I sometimes find my comments too condensed, so I add a bit of spacing for readability. I’ll keep that in mind for future PR descriptions.

@adamziel adamziel merged commit ebfd486 into WordPress:trunk May 23, 2025
15 of 16 checks passed
@github-project-automation github-project-automation bot moved this from Inbox to Done in Playground Board May 23, 2025
adamziel pushed a commit that referenced this pull request Aug 28, 2025
## Motivation for the change, related issues

This is a pull request to dynamically load Intl in @php-wasm Node JSPI.

## Related issues and pull requests

Issues 

- #2466
- #2299
- #1295

Pull requests

- #2247
- #2187

## Implementation details

### Intl Dynamic Extension Compilation JSPI

- Creation of a dedicated `shared` directory in `php-wasm/compile` which
will store the dynamic extensions build processes and files.
- Creation of a main `build.js` script with options related to the
dynamic extensions
- Creation of a specific Dockerfile for the creation of the Intl
extension `.so` file based on PHP versions and JSPI
- Creation of a dedicated `project.json` file which will store the list
of compilation commands related to each dynamic extension for JSPI
- Compilation of every version of Intl Dynamic Extension JSPI

### PHP.wasm Node WithIntl option

- Loading of Intl extension based on the option `withIntl` [ same logic
as Xdebug ]. This dynamically loads the needed version of the dynamic
extension, stores it in the filesystem, prepares the related php ini
file, and loads the related ICU data file.
- Test the correct use of the extension in the
`php-dynamic-loading.spec.ts` file.
- Keep the Intl static extension working for PHP.wasm Node Asyncify. 
- Keep the Intl static extension compilation process for PHP.wasm Node
Asyncify and Web.

## Testing Instructions (or ideally a Blueprint)

`test.js` 

```javascript
import { PHP } from '@php-wasm/universal';
import { loadNodeRuntime } from '@php-wasm/node';

const script = `<?php

$formatter = numfmt_create('en-US', NumberFormatter::CURRENCY);
echo numfmt_format($formatter, 100.00);
$formatter = numfmt_create('fr-FR', NumberFormatter::CURRENCY);
echo numfmt_format($formatter, 100.00);

?>`;

const php = new PHP( await loadNodeRuntime( '8.3', { withIntl : true } ) );

const result = await php.runStream( { code : script } );

console.log( await result.stdoutText );
```

```
> node --experimental-wasm-jspi scripts/example.js

//withIntl : true
$100.00100,00 €

//withIntl : false
<br />
<b>Fatal error</b>:  Uncaught Error: Call to undefined function numfmt_create() in /internal/eval.php:3
Stack trace:
#0 {main}
  thrown in <b>/internal/eval.php</b> on line <b>3</b><br />
```

## Next steps

- [x] Experimental PHP Node JSPI 8.3
- [x] PHP.wasm Node JSPI 
- [ ] PHP.wasm Node  Asyncify
- [ ] PHP.wasm Web
- [ ] Remove artifacts in PHP.wasm
- [ ] Remove artifacts in Playground
- [ ] Move Xdebug in shared directory alongside Intl
adamziel pushed a commit that referenced this pull request Sep 2, 2025
…2501 (#2557)

## Motivation for the change, related issues

This is a pull request to dynamically load Intl in @php-wasm Node
ASYNCIFY.

## Related issues and pull requests

Issues 

- #2466
- #2299
- #1295

Pull requests

- #2501
- #2247
- #2187

## Implementation details

### Intl Dynamic Extension Compilation ASYNCIFY 

- Improvement to the specific Intl dynamic extension Dockerfile
based on PHP versions and ASYNCIFY
- Modification of the dedicated `project.json` file which will store the
list of compilation commands related to each dynamic extension for
asyncify.
- Compilation of every version of Intl Dynamic Extension For Asyncify

### PHP.wasm Node WithIntl option

- Add Intl extension file import for Asyncify
- Test the correct use of the extension in the
`php-dynamic-loading.spec.ts` file.
- Keep the Intl static extension compilation process for PHP.wasm Web.

## Testing Instructions (or ideally a Blueprint)

`test.js` 

```javascript
import { PHP } from '@php-wasm/universal';
import { loadNodeRuntime } from '@php-wasm/node';

const script = `<?php

$formatter = numfmt_create('en-US', NumberFormatter::CURRENCY);
echo numfmt_format($formatter, 100.00);
$formatter = numfmt_create('fr-FR', NumberFormatter::CURRENCY);
echo numfmt_format($formatter, 100.00);

?>`;

const php = new PHP( await loadNodeRuntime( '8.3', { withIntl : true } ) );

const result = await php.runStream( { code : script } );

console.log( await result.stdoutText );
```

```
> node scripts/example.js

//withIntl : true
$100.00100,00 €

//withIntl : false
<br />
<b>Fatal error</b>:  Uncaught Error: Call to undefined function numfmt_create() in /internal/eval.php:3
Stack trace:
#0 {main}
  thrown in <b>/internal/eval.php</b> on line <b>3</b><br />
```

## Next steps

- [x] Experimental PHP Node JSPI 8.3
- [x] PHP.wasm Node JSPI 
- [x] PHP.wasm Node  Asyncify
- [ ] PHP.wasm Web
- [ ] Remove artifacts in PHP.wasm
- [ ] Remove artifacts in Playground
- [ ] Move Xdebug in shared directory alongside Intl
@mho22 mho22 mentioned this pull request Nov 26, 2025
3 tasks
adamziel added a commit that referenced this pull request Nov 28, 2025
## Motivation for the change, related issues

This is a pull request to dynamically load Intl in PHP.wasm Web.

## Related issues and pull requests

Issues 

- #2466
- #2299
- #1295

Pull requests

- #2557
- #2501
- #2247
- #2187

## Implementation details

- Removal of static Intl options in PHP compilation
- Set up of PHP as a `MAIN_MODULE` in node and web 
- Correction of #2318 by adding `worker` to the [`web`] environment
- Improvement of build file for shared libraries 
- Implementation of Intl dynamic extension lazy loading logic in
PHP.wasm web
- Creation of an `ignore-lib-imports` Vite plugin
- Playwright E2E tests implementation for PHP.wasm web by duplicating
existing ones from PHP.wasm Node
- Creation of a virtual alias for `wasm-feature-detect` to simulate JSPI
mode enabled based on Playwright ENV
- CI jobs implementation to test PHP.wasm web in JSPI and Asyncify mode

## Testing Instructions (or ideally a Blueprint)

CI

🧪 test-e2e-php-wasm-web-jspi 
🧪 test-e2e-php-wasm-web-asyncify 

## Next steps

- [x] Experimental PHP.wasm Node JSPI 8.3
- [x] PHP.wasm Node JSPI 
- [x] PHP.wasm Node  Asyncify
- [x] Experimental PHP.wasm Web JSPI 8.3
- [x] Experimental PHP.wasm Web Asyncify 8.3
- [x] PHP.wasm Web JSPI
- [x] PHP.wasm Web Asyncify
- [ ] Implement Intl in Blueprints
- [ ] Remove remaining Intl artifacts in PHP.wasm
- [ ] Remove remaining Intl artifacts in Playground

---------

Co-authored-by: Adam Zieliński <adam@adamziel.com>