Skip to content

feat: switch grisu2 float-to-string algorithm to hybrid of xjb & zmij algorithms#3025

Open
JairusSW wants to merge 29 commits into
AssemblyScript:mainfrom
JairusSW:jairus/switch-to-xjb
Open

feat: switch grisu2 float-to-string algorithm to hybrid of xjb & zmij algorithms#3025
JairusSW wants to merge 29 commits into
AssemblyScript:mainfrom
JairusSW:jairus/switch-to-xjb

Conversation

@JairusSW

@JairusSW JairusSW commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Fixes #3012.

Changes proposed in this pull request:
⯈ Switch to xjb-as which is an improvement over zmij-as
⯈ Comply to the ECMA262 Specification with the exception of a trailing .0

Note: I wrote xjb-as with claude assisting me for an initial port. I then optimized by hand. Everything is carefully checked and passes over a trillion fuzz cases compared directly against Number::parse() in V8. It also reaches 100% code coverage and tests every edge case both I and claude could think of.

It's good quality stuff though, and I'm quite confident in that.

Here's some performance notes. xjb-as/README.md has more extensive notes and stuff. These benches are taken on an AMD 7800x3D.

dtoa (f64) vs the AssemblyScript stdlib

ftoa (f32) vs the AssemblyScript stdlib

dtoa (f64) per-stage latency

program uses grisu2 xjb Δ
f64 only 4.56 6.84 +2.28
f32 only 4.56 4.61 +0.05
both 5.17 8.97 +3.80

Lookup-table footprint vs the old grisu2 implementation (omiting shared tables):

program uses grisu2 xjb Δ
f64 (dtoa) only 910 B 672 B −238 B
f32 (ftoa) only 910 B 616 B −294 B
both 910 B 1288 B +378 B
  • I've read the contributing guidelines
  • I've added my name and email to the NOTICE file

P.S. Sorry about the diff. Most of it is just the wasm files.

JairusSW added 3 commits June 8, 2026 21:45
Signed-off-by: Jairus Tanaka <me@jairus.dev>
Signed-off-by: Jairus Tanaka <me@jairus.dev>
Signed-off-by: Jairus Tanaka <me@jairus.dev>
@JairusSW JairusSW marked this pull request as draft June 9, 2026 06:04
@JairusSW JairusSW marked this pull request as ready for review June 17, 2026 18:13
@PaperPrototype

Copy link
Copy Markdown

Just curious here, why the exception of a trailing .0?

@JairusSW

JairusSW commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Just curious here, why the exception of a trailing .0?

Max and Dan made that decision so that floats are easily identifiable when converting them to a string. For example, you know that the string "1.0" is a result of f64.toString() while the string "1" is likely a result of i32.toString(). Slight deviation from ECMA262-spec, but it's not awful

@MaxGraey

MaxGraey commented Jun 17, 2026

Copy link
Copy Markdown
Member

Can you rename xjb.ts to dtoa.ts and in comments mentioned about implementation based on xjb and zmij? xjb filename is very exotic and doesn't explains exactly what this file does

Comment thread std/assembly/util/dtoa.ts
Comment thread std/assembly/util/xjb.ts Outdated
Comment thread std/assembly/util/dtoa.ts
Comment thread std/assembly/util/dtoa.ts Outdated
Comment thread std/assembly/util/dtoa.ts Outdated
JairusSW and others added 2 commits June 17, 2026 16:23
Co-authored-by: Max Graey <maxgraey@gmail.com>
Co-authored-by: Max Graey <maxgraey@gmail.com>
Comment thread std/assembly/util/dtoa.ts Outdated
Comment thread std/assembly/util/dtoa.ts
Comment on lines +135 to +136
let gBcd: u64 = 0;
let gBcdLen: i32 = 0;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gBcd -> gBcdVal
?

Comment thread std/assembly/util/dtoa.ts Outdated
Comment thread std/assembly/util/dtoa.ts Outdated
Comment on lines +151 to +153
export let gDigHi: u64 = 0;
export let gDigLo: u64 = 0;
export let gDigNum: i32 = 0;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a lot of global vars. Does we need all of them? And can we declare all of necessary vars in one place?

Comment thread std/assembly/util/dtoa.ts Outdated
Co-authored-by: Max Graey <maxgraey@gmail.com>
Comment thread tests/compiler/bindings/raw.release.wat Outdated
Comment on lines +57 to +91
(data $5 (i32.const 1239) "\80\00\00\00\00\00\00\00\a0\00\00\00\00\00\00\00\c8\00\00\00\00\00\00\00\fa\00\00\00\00\00\00@\9c\00\00\00\00\00\00P\c3\00\00\00\00\00\00$\f4\00\00\00\00\00\80\96\98\00\00\00\00\00 \bc\be\00\00\00\00\00(k\ee\00\00\00\00\00\f9\02\95\00\00\00\00@\b7C\ba\00\00\00\00\10\a5\d4\e8\00\00\00\00*\e7\84\91\00\00\00\80\f4 \e6\b5\00\00\00\a01\a9_\e3\00\00\00\04\bf\c9\1b\8e\00\00\00\c5.\bc\a2\b1\00\00@v:k\0b\de\00\00\e8\89\04#\c7\8a\00\00b\ac\c5\ebx\ad\00\80z\17\b7&\d7\d8\00\90\acn2x\86\87\00\b4W\n?\16h\a9\00\a1\ed\cc\ce\1b\c2\d3\a0\84\14@aQY\84\c8\a5\19\90\b9\a5o\a5:\0f \f4\'\8f\cb\ce")
(data $6 (i32.const 1456) "o\1b\8e(\10T\8e\af\daM\e4^\ae\f0\ec\07J\fb\9f\f4\98\'D\b1\9dwA\df\cf\11\cd\99\07\ef\99\85\0b?\fe\b2\15\aa\b4\dc\e6\a7\1f\86c\beZ\06\0b\a5\bc\b4\aaSkuz\07\ed\0f\08\bf,)Ud\7f\b6C\d5\b1\17L\c8;\1a\fb;\efi\c2\87F\b8B\a7\ee@OQ]=\eb\dd\e4PF\1a\12\ba\13\e4labM\f3\92\ea\af(\b6\ef&\e2\bb\8c6U\n\f7\89\04\89\0f`\cb\05\e9\b8\b6\bd!\c9\c1\bb\87\e9\00T\96_\9a\84x\db\8f\bf4\d0\bdr\04R\98\de\'\8a\92\95\00\9am\c1\94\82\17\0f<\05\b7u\00\00\00\00\00\00P\c3\00\00\00\00\00\00\00\00\05\e3L6\12\197\c5\00\00\00\00\00\00(l\d6\aa\80\9d\ef\f0\"\c7\f6~\b9\b7\d2:MBL\c8q\d5m\93\13\c9\ea8\1e\cd\19:\bc\03\1cU\ab\01\80\0c\t\cb\c6,\07\d3\bf\f5\ad\\\a1\90\08\137h\03\cd\10\8cz\c3\87\a8\db6.\ef\07\12\c2\b2\02\cf\bc\f4\03^\e4g\f9\94\c7\85\d7in\f8\06\d1R\ba\be\01\d763\e1|\a0\1c4\a8E\10\d3Q\a0\t\12\11H\de\1e1Vx\85\fa\a6\1e\d5f\a5>\7f\"t*U3\f1\ca\ba\0f)2\d7\96@\adGy\17|\a9t\088\c7\b1\d8J\d9\bc\"x\ae\81R7\18")
(data $7 (i32.const 1824) "?6N\n@\18\00\00\00d\00\00@\00 $\00\00\00\00\00\00\00\0c\80\13\c8\82\1f\e0L^\0f\f60\d7\1b\00\00\00\00\00\00\00\fc\ff\f7\cd\d8\01\82n\d1?\cd@\01%d\db\r\r\00\00\00$\04\14@8qS\b4\1dx\11")
(data $9 (i32.const 2032) "p\\\ea{\ce2~\8f\1a\c7C\c6\b0\b7\96\e5\ae\05\03\05\'\c6\ab\b7\bf7\cf\d0\b8\d1\ef\92\fe%\e5\1a\8eO\19\eb2\ebP\e2\a4?\14\bc\f5\88\r\b5P\99v\96!\dbH\bb\1a\c2\bd\f0\b4\15\07\c9{\ce\97\c0]\11l:\96\0b\13\9a\c7\1b\e0\c3V\df\84\f6\06\e3L6\12\197\c5\9e\b5p+\a8\ad\c5\9d\97\"\81E@|o\fc\dfNg\04\cd\c9\f2\c9\e6\0b\b96\d7\07\8f\a1\85\t\94\f8x9?\81:\0f \f4\'\8f\cb\ce\c8\a5\19\90\b9\a5o\a5\a0\84\14@aQY\84\00\a1\ed\cc\ce\1b\c2\d3\00\b4W\n?\16h\a9\00\90\acn2x\86\87\00\80z\17\b7&\d7\d8\00\00b\ac\c5\ebx\ad\00\00\e8\89\04#\c7\8a\00\00@v:k\0b\de\00\00\00\c5.\bc\a2\b1\00\00\00\04\bf\c9\1b\8e\00\00\00\a01\a9_\e3\00\00\00\80\f4 \e6\b5\00\00\00\00*\e7\84\91\00\00\00\00\10\a5\d4\e8\00\00\00\00@\b7C\ba\00\00\00\00\00\f9\02\95\00\00\00\00\00(k\ee\00\00\00\00\00 \bc\be\00\00\00\00\00\80\96\98\00\00\00\00\00\00$\f4\00\00\00\00\00\00P\c3\00\00\00\00\00\00@\9c\00\00\00\00\00\00\00\fa\00\00\00\00\00\00\00\c8\00\00\00\00\00\00\00\a0\00\00\00\00\00\00\00\80\cd\cc\cc\cc\cc\cc\cc\cc\0b\d7\a3p=\n\d7\a3<\dfO\8d\97n\12\83,e\19\e2X\17\b7\d1$\84G\1bG\ac\c5\a7\b6il\af\05\bd7\86\bdBz\e5\d5\94\bf\d6\fd\cea\84\11w\cc\ab\98\a5\b46A_p\89\bf\d5\ed\bd\ce\fe\e6\db\ff\aa$\cb\0b\ff\eb\af\cc\88Po\t\cc\bc\8c\14\0e\b4KB\13.\e1\10\d8\\\t5\dc$\b4\da\ac\b0:\f7|\1d\90\\\e1M\c4\be\94\95\e6J\b4\a462\aaw\b8\08]\1d\92\8e\ee\92\93\a6a\95\b6}J\1e\ec\eb\1a\11\92d\08\e5\bc\ef{\datP\a0\1d\97\b2,\f7\ba\80\00\c9\f1(\8a\92\95\00\9am\c1S;uD\cd\14\be\9aR\c5\ee\d3\ae\87\96\f7\db\9dXv%\06\12\c6I~\e0\91\b7\d1t\9e\0e\ca\00\83\f2\b5\87\fd?;\9a5\f5\f7\d2\ca2\fc\14^\f7_B\a2\f5\fcCK,\b3\ce\81\bb\949E\ad\1e\b1\cf")
(data $10 (i32.const 2648) "\"\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$!\"#$\"#$\"#$\"#$!\"#")
(data $11 (i32.const 2908) ",")
(data $11.1 (i32.const 2920) "\02\00\00\00\1c\00\00\00I\00n\00v\00a\00l\00i\00d\00 \00l\00e\00n\00g\00t\00h")
(data $12 (i32.const 2956) "<")
(data $12.1 (i32.const 2968) "\02\00\00\00&\00\00\00~\00l\00i\00b\00/\00a\00r\00r\00a\00y\00b\00u\00f\00f\00e\00r\00.\00t\00s")
(data $13 (i32.const 3020) "<")
(data $13.1 (i32.const 3032) "\02\00\00\00(\00\00\00A\00l\00l\00o\00c\00a\00t\00i\00o\00n\00 \00t\00o\00o\00 \00l\00a\00r\00g\00e")
(data $14 (i32.const 3084) "<")
(data $14.1 (i32.const 3096) "\02\00\00\00 \00\00\00~\00l\00i\00b\00/\00r\00t\00/\00i\00t\00c\00m\00s\00.\00t\00s")
(data $17 (i32.const 3212) "<")
(data $17.1 (i32.const 3224) "\02\00\00\00$\00\00\00I\00n\00d\00e\00x\00 \00o\00u\00t\00 \00o\00f\00 \00r\00a\00n\00g\00e")
(data $18 (i32.const 3276) ",")
(data $18.1 (i32.const 3288) "\02\00\00\00\14\00\00\00~\00l\00i\00b\00/\00r\00t\00.\00t\00s")
(data $20 (i32.const 3356) "<")
(data $20.1 (i32.const 3368) "\02\00\00\00\1e\00\00\00~\00l\00i\00b\00/\00r\00t\00/\00t\00l\00s\00f\00.\00t\00s")
(data $21 (i32.const 3420) "\1c")
(data $21.1 (i32.const 3432) "\02")
(data $22 (i32.const 3452) "<")
(data $22.1 (i32.const 3464) "\02\00\00\00$\00\00\00~\00l\00i\00b\00/\00t\00y\00p\00e\00d\00a\00r\00r\00a\00y\00.\00t\00s")
(data $23 (i32.const 3516) "<")
(data $23.1 (i32.const 3528) "\02\00\00\00&\00\00\00~\00l\00i\00b\00/\00s\00t\00a\00t\00i\00c\00a\00r\00r\00a\00y\00.\00t\00s")
(data $24 (i32.const 3580) ",")
(data $24.1 (i32.const 3592) "\02\00\00\00\1a\00\00\00~\00l\00i\00b\00/\00a\00r\00r\00a\00y\00.\00t\00s")
(data $25 (i32.const 3628) "|")
(data $25.1 (i32.const 3640) "\02\00\00\00^\00\00\00E\00l\00e\00m\00e\00n\00t\00 \00t\00y\00p\00e\00 \00m\00u\00s\00t\00 \00b\00e\00 \00n\00u\00l\00l\00a\00b\00l\00e\00 \00i\00f\00 \00a\00r\00r\00a\00y\00 \00i\00s\00 \00h\00o\00l\00e\00y")
(data $26 (i32.const 3756) "<")
(data $26.1 (i32.const 3768) "\02\00\00\00*\00\00\00O\00b\00j\00e\00c\00t\00 \00a\00l\00r\00e\00a\00d\00y\00 \00p\00i\00n\00n\00e\00d")
(data $27 (i32.const 3820) "<")
(data $27.1 (i32.const 3832) "\02\00\00\00(\00\00\00O\00b\00j\00e\00c\00t\00 \00i\00s\00 \00n\00o\00t\00 \00p\00i\00n\00n\00e\00d")
(data $28 (i32.const 3888) "\10\00\00\00 \00\00\00 \00\00\00 ")
(data $28.1 (i32.const 3912) "\81\08\00\00\01\19\00\00\01\02\00\00$\t\00\00\a4\00\00\00$\n\00\00\02\t\00\00\02A\00\00\00\00\00\00A\00\00\00 ")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if the baseline amount of static memory can be reduced here by computing the parts of it that aren't strictly necessary to have as LUTs?

Comment thread std/assembly/util/dtoa.ts Outdated
Co-authored-by: Max Graey <maxgraey@gmail.com>
Comment thread std/assembly/util/dtoa.ts
Comment thread std/assembly/util/dtoa.ts Outdated
Comment thread std/assembly/util/dtoa.ts Outdated
JairusSW and others added 3 commits June 17, 2026 16:41
Co-authored-by: Max Graey <maxgraey@gmail.com>
Co-authored-by: Max Graey <maxgraey@gmail.com>
Co-authored-by: Max Graey <maxgraey@gmail.com>
Comment thread std/assembly/util/dtoa.ts Outdated
const shift = h + 1 + EXTRA_SHIFT;

loadPow10Xjb64(powExp);
const pHi = gPow10Hi, pLo = gPow10Lo;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const pHi = gPow10Hi, pLo = gPow10Lo;
let pHi = gPow10Hi;
let pLo = gPow10Lo;

Comment thread std/assembly/util/dtoa.ts Outdated
Comment on lines +476 to +478
const plo64 = pHi * y;
const lo = plo64 + mulhi64(pLo, y);
const p_hi = a + u64(lo < plo64);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very inconsistent names plo64 instead pLo64, p_hi instead pHi64

Comment thread std/assembly/util/dtoa.ts Outdated
Comment thread std/assembly/util/dtoa.ts
Comment on lines +547 to +552
while (v >= 100) {
const q = v / 100;
p -= 4;
store<u32>(p, load<u32>(DIGITS + (<usize>(v - q * 100) << alignof<u32>())));
v = q;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this loop can be unrolled

Comment thread std/assembly/util/dtoa.ts
Comment on lines +520 to +535
@inline function decimalLen15(v: u64): i32 {
if (v < 100000000) {
if (v < 10000) {
if (v < 100) return v < 10 ? 1 : 2;
return v < 1000 ? 3 : 4;
}
if (v < 1000000) return v < 100000 ? 5 : 6;
return v < 10000000 ? 7 : 8;
}
if (v < 1000000000000) {
if (v < 10000000000) return v < 1000000000 ? 9 : 10;
return v < 100000000000 ? 11 : 12;
}
if (v < 100000000000000) return v < 10000000000000 ? 13 : 14;
return 15;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See similar apreach which slightly more optimal for perf for branch leafs:

https://github.com/AssemblyScript/assemblyscript/blob/main/std/assembly/util/number.ts#L133

Comment thread std/assembly/util/dtoa.ts
Comment on lines +543 to +559
@inline function writeUInt16(buf: usize, value: u64): usize {
const len = decimalLen16(value);
let p = buf + (<usize>len << 1);
let v = value;
while (v >= 100) {
const q = v / 100;
p -= 4;
store<u32>(p, load<u32>(DIGITS + (<usize>(v - q * 100) << alignof<u32>())));
v = q;
}
if (v >= 10) {
store<u32>(buf, load<u32>(DIGITS + (<usize>v << alignof<u32>())));
} else {
store<u16>(buf, 0x30 + <u32>v);
}
return buf + (<usize>len << 1);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also did you compared this approach with original?
https://github.com/AssemblyScript/assemblyscript/blob/main/std/assembly/util/number.ts#L221

I think original whould be faster

Comment thread std/assembly/util/dtoa.ts Outdated
if (binSig != 0) return writeNaN(buf);
return writeInfinity(buf, neg);
}
if (binSig == 0) { store<u16>(buf, 0x30); return buf + 2; } // +/-0 -> "0"

@MaxGraey MaxGraey Jun 17, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (binSig == 0) { store<u16>(buf, 0x30); return buf + 2; } // +/-0 -> "0"
// +/-0 -> "0"
if (binSig == 0) {
store<u16>(buf, 0x30);
return buf + 2;
}

Btw I guess it should be "0.0"?

Comment thread std/assembly/util/dtoa.ts
Comment on lines +721 to +726
@inline export function toDigits32(value: u64): void {
if (ASC_FEATURE_SIMD) return toDigits32Simd(value);
toBcd8(value);
gDigHi = gBcd + ZEROS;
gDigNum = gBcdLen;
}

@MaxGraey MaxGraey Jun 17, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@inline export function toDigits32(value: u64): void {
if (ASC_FEATURE_SIMD) return toDigits32Simd(value);
toBcd8(value);
gDigHi = gBcd + ZEROS;
gDigNum = gBcdLen;
}
@inline export function toDigits32(value: u64): void {
if (ASC_FEATURE_SIMD) {
toDigits32Simd(value);
} else {
toBcd8(value);
gDigHi = gBcd + ZEROS;
gDigNum = gBcdLen;
}
}

Comment thread std/assembly/util/dtoa.ts Outdated
if (hasExtraDigit) store<u16>(buf + 16, <u32>lastDigitChar);
const sig = 8 + i32(hasExtraDigit);
const endByte = buf + ((decExp + 1) << 1);
for (let z = buf + (sig << 1); z < endByte; z += 16) putBlock8(z, ZEROS);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (let z = buf + (sig << 1); z < endByte; z += 16) putBlock8(z, ZEROS);
for (let z = buf + (sig << 1); z < endByte; z += 16) {
putBlock8(z, ZEROS);
}

Comment thread std/assembly/util/dtoa.ts Outdated
}

const n = numDigits + i32(hasExtraDigit);
const endPos = decExp >= 0 ? (n > decExp + 1 ? n + 1 : decExp + 1) : n;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: double ternary

@MaxGraey

Copy link
Copy Markdown
Member

I think I’ve roughly point style direction. There are still quite a lot of tweaks & nits to make. Ping me when you’re done, and I’ll give it another pass.

Comment thread std/assembly/util/dtoa.ts Outdated
Comment thread std/assembly/util/dtoa.ts Outdated
@MaxGraey MaxGraey changed the title feat: switch grisu2 float-to-string algorithm to Xiang JunBo's xjb algorithm feat: switch grisu2 float-to-string algorithm to hybrid of xjb & zmij algorithms Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Task] Replace Grisu2 with Zmij for float-to-string conversion

4 participants