Commit 7d77415

Faster float formatting
This commit provides an alternative implementation of the float → decimal conversion.

It integrates a C implementation of Florian Loitsch's Grisu algorithm [[pdf]](http://florian.loitsch.com/publications/dtoa-pldi2010.pdf), extracted from https://github.com/night-shift/fpconv. The relevant files are added in this PR; like all of https://github.com/night-shift/fpconv, they are available under the Boost Software License 1.0 (see the bundled `license` file).

As a result, I see a speedup of roughly 900% on Apple Silicon M1 for a float-only set of benchmarks.

Floats don't have a single correct string representation: a float like `1000.0` can be represented as "1000", "1e3", "1000.0" (and more).

The Grisu algorithm converts floating point numbers to an optimal decimal string representation without loss of precision. As a result, a float that is exactly an integer (like `Float(10)`) will be converted by that algorithm into `"10"`. While technically correct (the JSON format treats floats and integers identically), this differs from the current behaviour of the `"json"` gem. To address this, the integration checks for that case and explicitly adds a ".0" suffix.

This is sufficient to meet all existing tests; there is, however, a chance that the current implementation and this implementation occasionally encode floats differently.
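To make the "several valid representations" point concrete, here is a minimal sketch that searches for the shortest `printf`-style rendering that still parses back to the same double. It is a brute-force stand-in, not Grisu and not code from fpconv or the json gem; the helper name `shortest_roundtrip` is hypothetical, and the function assumes a finite input.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of fpconv or the json gem).
 * Tries "%.1g", "%.2g", ... "%.17g" and stops at the first rendering
 * that strtod() parses back to exactly `value`. Grisu reaches a shortest
 * round-tripping string without this retry loop.
 * Assumes `value` is finite. Returns the string length, or -1 on error. */
static int shortest_roundtrip(double value, char *out, size_t out_size)
{
    for (int precision = 1; precision <= 17; precision++) {
        int len = snprintf(out, out_size, "%.*g", precision, value);
        if (len < 0 || (size_t)len >= out_size) {
            return -1; /* formatting failed or the buffer is too small */
        }
        if (strtod(out, NULL) == value) {
            return len; /* shortest precision that round-trips */
        }
    }
    return -1; /* 17 significant digits round-trip any finite double */
}
```

With a 32-byte buffer, `2.0` comes out as `"2"` and `1000.0` as `"1e+03"`. Both parse back to the original value, yet neither looks like the `"2.0"` or `"1000.0"` that the gem has emitted so far, which is the gap the ".0" suffix check is meant to close.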
```
== Encoding floats (4179311 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
        json (local)     4.000 i/100ms
Calculating -------------------------------------
        json (local)     46.046 (± 2.2%) i/s   (21.72 ms/i) -    232.000 in   5.039611s
Normalize to 2090234 byte
== Encoding floats (4179242 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
       json (2.10.2)     1.000 i/100ms
Calculating -------------------------------------
       json (2.10.2)      4.614 (± 0.0%) i/s  (216.74 ms/i) -     24.000 in   5.201871s
```

These benchmarks are run via a script ([link](https://gist.github.com/radiospiel/04019402726a28b31616df3d0c17bd1c)) that is based on the gem's `benchmark/encoder.rb` file. There are probably better ways to run benchmarks :) My version allows combining multiple test cases into a single one.

The `dumps` benchmark, which covers the JSON files in `benchmark/data/*.json` with the exception of `canada.json`, reported a minor speedup within statistical uncertainty.
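As a quick check of the 900% figure against the two float benchmark runs above, using only the reported iterations per second:

$$
\frac{46.046\ \text{i/s}}{4.614\ \text{i/s}} \approx 9.98, \qquad (9.98 - 1) \times 100\% \approx 898\%
$$

That is roughly ten times the throughput of json 2.10.2, which matches the quoted speedup after rounding. The per-iteration times (216.74 ms against 21.72 ms) give the same ratio.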
1 parent 2074f0c commit 7d77415

6 files changed, +516 -3 lines changed

ext/json/ext/generator/generator.c

Lines changed: 31 additions & 3 deletions
@@ -1050,12 +1050,15 @@ static void generate_json_integer(FBuffer *buffer, struct generate_json_data *da
 }
 #endif
 
+#include "../vendor/fpconv/src/fpconv.c"
+
 static void generate_json_float(FBuffer *buffer, struct generate_json_data *data, JSON_Generator_State *state, VALUE obj)
 {
     double value = RFLOAT_VALUE(obj);
     char allow_nan = state->allow_nan;
-    if (!allow_nan) {
-        if (isinf(value) || isnan(value)) {
+    if (isinf(value) || isnan(value)) {
+        /* for NaN and Infinity values we either raise an error or rely on Float#to_s. */
+        if (!allow_nan) {
             if (state->strict && state->as_json) {
                 VALUE casted_obj = rb_proc_call_with_block(state->as_json, 1, &obj, Qnil);
                 if (casted_obj != obj) {
@@ -1067,8 +1070,33 @@ static void generate_json_float(FBuffer *buffer, struct generate_json_data *data
             }
             raise_generator_error(obj, "%"PRIsVALUE" not allowed in JSON", rb_funcall(obj, i_to_s, 0));
         }
+
+        VALUE tmp = rb_funcall(obj, i_to_s, 0);
+        fbuffer_append_str(buffer, tmp);
+        return;
+    }
+
+    /* This implementation writes directly into the buffer. We reserve
+     * the 24 characters that fpconv_dtoa states as its maximum, plus
+     * 2 more characters for the potential ".0" suffix.
+     */
+    fbuffer_inc_capa(buffer, 26);
+    char* d = buffer->ptr + buffer->len;
+    int len = fpconv_dtoa(value, d);
+
+    /* fpconv_dtoa converts a float to its shortest string representation. When
+     * converting a float that is exactly an integer (e.g. `Float(2)`) this
+     * returns a string that looks like an integer. This is correct, since
+     * JSON treats ints and floats the same. However, to not break integrations
+     * that expect a string representation looking like a float, we append a
+     * ".0" in that case.
+     */
+    if(!memchr(d, '.', len) && !memchr(d, 'e', len)) {
+        d[len] = '.';
+        d[len+1] = '0';
+        len += 2;
     }
-    fbuffer_append_str(buffer, rb_funcall(obj, i_to_s, 0));
+    buffer->len += len;
 }
 
 static void generate_json_fragment(FBuffer *buffer, struct generate_json_data *data, JSON_Generator_State *state, VALUE obj)
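The suffix check in the diff above can be tried in isolation. The sketch below is a stand-alone approximation, not the gem's code: `ensure_float_suffix` and `format_json_float` are hypothetical names, and `snprintf("%.17g")` stands in for `fpconv_dtoa`, so the digits are not guaranteed to be the shortest ones. The buffer sizing mirrors the comment in the diff (24 characters plus 2 for the suffix), with one extra byte for the NUL terminator that `snprintf` writes. Like the code path in the diff, it assumes NaN and Infinity were handled earlier.

```c
#include <stdio.h>
#include <string.h>

#define DTOA_MAX_LEN 24 /* maximum the diff attributes to fpconv_dtoa */
#define SUFFIX_LEN    2 /* room for the optional ".0" */

/* Hypothetical helper mirroring the check in the diff: if the `len`
 * characters at `d` contain neither '.' nor 'e', append ".0".
 * `d` must have room for two more characters. Returns the new length. */
static int ensure_float_suffix(char *d, int len)
{
    if (!memchr(d, '.', (size_t)len) && !memchr(d, 'e', (size_t)len)) {
        d[len] = '.';
        d[len + 1] = '0';
        len += 2;
    }
    return len;
}

/* Stand-in for the float branch of the generator. Uses "%.17g" instead
 * of fpconv_dtoa (an assumption made for this sketch only).
 * `out` must hold DTOA_MAX_LEN + SUFFIX_LEN + 1 bytes; `value` must be
 * finite. Returns the length of the NUL-terminated result, or -1. */
static int format_json_float(double value, char *out)
{
    int len = snprintf(out, DTOA_MAX_LEN + 1, "%.17g", value);
    if (len < 0 || len > DTOA_MAX_LEN) {
        return -1;
    }
    len = ensure_float_suffix(out, len);
    out[len] = '\0';
    return len;
}
```

Called with `2.0` this yields `"2.0"`, while `1e20` stays `"1e+20"` because the exponent marker already identifies it as a float. Checking for both `'.'` and `'e'` is what keeps exponent notation from turning into something like `"1e+20.0"`.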

ext/json/ext/vendor/fpconv/README.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+The contents of this directory are extracted from https://github.com/night-shift/fpconv
+
+It is licensed under the provisions of the Boost Software License - Version 1.0 - August 17th, 2003. See the ./license file for details.

ext/json/ext/vendor/fpconv/license

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+Boost Software License - Version 1.0 - August 17th, 2003
+
+Permission is hereby granted, free of charge, to any person or organization
+obtaining a copy of the software and accompanying documentation covered by
+this license (the "Software") to use, reproduce, display, distribute,
+execute, and transmit the Software, and to prepare derivative works of the
+Software, and to permit third-parties to whom the Software is furnished to
+do so, all subject to the following:
+
+The copyright notices in the Software and this entire statement, including
+the above license grant, this restriction and the following disclaimer,
+must be included in all copies of the Software, in whole or in part, and
+all derivative works of the Software, unless such copies or derivative
+works are solely in the form of machine-executable object code generated by
+a source language processor.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
+SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
+FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
+ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+DEALINGS IN THE SOFTWARE.
