This PR was discussed at the recent TC39 plenary, and concluded with support for merging this, once it's been reviewed by @sffc and/or @gibson042. |
|
With style percent, it should format as 6.50%, yes? So I don't see why we need to adjust stringDigitCount in style percent. I also don't understand why you say that |
No, its string digit count is 5.
Yes.
That is required because when multiplying by 100 for formatting the leading zeros need to be dropped.
It's not possible for us to always determine the significant digit count, for example for a value like |
gibson042
left a comment
There was a problem hiding this comment.
Unfortunately, the |ZeroDigits| approach does not work.
sffc
left a comment
There was a problem hiding this comment.
I still think modeling this as stringSignificantDigits instead of stringDigitCount would be better, because then you don't need to worry so much about how you deal with the leading zeros. Just define that 120 has stringSignificantDigits=3 for spec purposes.
spec.emu
Outdated
| 1. <ins>Else, let _magnitude_ be the base 10 logarithm of abs(_x_) rounded down to the nearest integer.</ins> | ||
| 1. If _numberFormat_.[[Style]] is *"percent"*<del>, set _x_ be 100 × _x_.</del><ins>, then</ins> | ||
| 1. <ins>Set _x_ to 100 × _x_.</ins> | ||
| 1. <ins>If _magnitude_ < 0, set _stringDigitCount_ to _stringDigitCount_ + max(_magnitude_, -2).</ins> |
There was a problem hiding this comment.
I've convinced myself for 0.61, 0.06, 0.067
how about 0.6 => 60, should get 2 digits out right?
And 65 => 6500, should be 4?
There was a problem hiding this comment.
We only need to apply a change to stringDigitCount here if we're effectively losing the leading 0 or 0.0 from the value. We don't want to artificially increase the count, because that would combine badly when percent style is e.g. combined with scientific notation, as we want the result of
let nf = new Intl.NumberFormat('en',
{ style: 'percent', notation: 'scientific', maximumFractionDigits: 3 });
nf.format(65);
to be '6.5E3%' and not '6.500E3%'
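A minimal sketch of the adjustment in the quoted spec step, assuming that at this revision stringDigitCount still includes the leading zero (so '0.06' has three string digits). The function name `adjustDigitCountForPercent` is hypothetical, and zero is assumed to be handled by a separate branch, as the leading "Else" in the quoted step suggests.

```javascript
// Hypothetical helper mirroring the quoted step; not part of any real API.
function adjustDigitCountForPercent(x, stringDigitCount) {
  // Assumption: zero is handled elsewhere, so leave the count alone.
  if (x === 0) return stringDigitCount;
  // Magnitude of x before it is multiplied by 100.
  const magnitude = Math.floor(Math.log10(Math.abs(x)));
  // Only shrink the count when a leading "0." or "0.0" is lost;
  // never grow it, so 65 keeps its two digits.
  if (magnitude < 0) {
    return stringDigitCount + Math.max(magnitude, -2);
  }
  return stringDigitCount;
}

console.log(adjustDigitCountForPercent(0.06, 3)); // 1
console.log(adjustDigitCountForPercent(0.61, 3)); // 2
console.log(adjustDigitCountForPercent(65, 2));   // 2
```

Because the count only matters when it exceeds the number of formatted digits, a result that is too small (for example 1 for 0.6, which formats as 60%) has no visible effect.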
spec.emu
Outdated
| 1. <ins>Set _magnitude_ to _magnitude_ + 2.</ins> | ||
| 1. Set _exponent_ to ComputeExponent(_numberFormat_, _x_). | ||
| 1. Set _x_ to _x_ × 10<sup>-_exponent_</sup>. | ||
| 1. <ins>If _magnitude_ < 0 and _exponent_ < 0, set _stringDigitCount_ to _stringDigitCount_ + max(_magnitude_, _exponent_).</ins> |
There was a problem hiding this comment.
I'm really trying to follow the cases here.
| x and stringDigitCount In | Magnitude | Exponent | x and stringDigitCount Out |
|---|---|---|---|
| 67.0, 3 | 2 | 2 | 6700, 4? |
| 67.0, 3 | 2 | -2 | 0.670, 4? |
| 0.670, 4 | -1 | 2 | 6.70, 3? |
| 0.670, 4 | -1 | -2 | 0.00670, 6 ? |
I don't think your formula achieves the above outcomes.
There was a problem hiding this comment.
I don't really follow your table. You might have missed the negation of the exponent in the line above this one?
Set _x_ to _x_ × 10<sup>-_exponent_</sup>.
Also, keep in mind that stringDigitCount is only used to ensure that trailing zeros are retained, and if it's smaller than the count of formatted digits, its value does not matter.
|
@gibson042 I've now dropped the ZeroDigits grammar constructions, as you asked for in #10 (comment). PTAL? |
| 1. <ins>If _rounded_ is *+0*<sub>𝔽</sub>, then</ins> | ||
| 1. <ins>If _intlMV_ < 0, set _intlMV_ to ~negative-zero~.</ins> | ||
| 1. <ins>Else, set _intlMV_ to 0.</ins> | ||
| 1. <ins>If _intlMV_ < 0, set _intlMV_ to ~negative-zero~; else set _intlMV_ to 0.</ins> |
There was a problem hiding this comment.
I think this also needs to override stringDigitCount, for cases like "123e-15000" (which should also be covered in test262).
| 1. <ins>If _intlMV_ < 0, set _intlMV_ to ~negative-zero~; else set _intlMV_ to 0.</ins> | |
| 1. <ins>If _intlMV_ < 0, set _intlMV_ to ~negative-zero~; else set _intlMV_ to 0.</ins> | |
| 1. <ins>Set _stringDigitCount_ to 0.</ins> |
There was a problem hiding this comment.
I disagree, as it doesn't make sense for us to format values differently depending on whether they get rounded to zero here, or during formatting. If we were to apply this change, you'd see:
let nf = new Intl.NumberFormat('en')
nf.format('1.00e-100') // '0.00'
nf.format('1.00e-1000') // '0'
If we do not reset the stringDigitCount, both of those would format with three significant digits.
There was a problem hiding this comment.
I think it does make sense, because precision truly is lost at the boundary. Consider a toy example in which we support at most three fractional digits and input consists of five significant digits, at least one of which extends beyond that threshold—the result inherently has fewer significant digits than the input, and that should be communicated. The most I could see doing is indicating precision loss by preserving the exponent in such cases even when rounding removes all significant digits (the last row below).
| Input | 10<sup>0</sup> | 10<sup>-1</sup> | 10<sup>-2</sup> | 10<sup>-3</sup> | 10<sup>-4</sup> | 10<sup>-5</sup> | 10<sup>-6</sup> | 10<sup>-7</sup> | 10<sup>-8</sup> | Truncated result |
|---|---|---|---|---|---|---|---|---|---|---|
| "1.2345" | 1 | 2 | 3 | 4 | | | | | | 1.234e0 |
| "0.12345" | 0 | 1 | 2 | 3 | | | | | | 1.23e-1 |
| "0.012345" | 0 | 0 | 1 | 2 | | | | | | 1.2e-2 |
| "0.0012345" | 0 | 0 | 0 | 1 | | | | | | 1e-3 |
| "0.00012345" | 0 | 0 | 0 | 0 | | | | | | 0e-3 |
> let nf = new Intl.NumberFormat('en')
> nf.format('1.00e-100') // '0.00'
> nf.format('1.00e-1000') // '0'
>
> If we do not reset the stringDigitCount, both of those would format with three significant digits.
What I'm advocating for is reduction of significant digits in correspondence with the limits. In an implementation that supports no more than 100 fractional digits, "1.00e-99" should be treated as "1.0e-99", "1.00e-100" as "1e-100", and `1e-${x}` (where x is an integer > 100) as "0e0" or "0e-100".
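One way to picture the reduction being advocated here is as truncation of the significant digits at the implementation limit. The following is only a sketch under stated assumptions: the name `truncateToLimit`, the split into a digit string plus the exponent of its first digit, and the choice of "0e-limit" (rather than "0e0") for the fully truncated case are all illustrative, not taken from the spec text.

```javascript
// Hypothetical helper: `digits` holds the significant digits (first one
// non-zero), `exponent` is the power of ten of the first digit, and
// `maxFractionDigits` is the assumed implementation limit.
function truncateToLimit(digits, exponent, maxFractionDigits) {
  // Digit positions run from 10^exponent down to 10^-maxFractionDigits.
  const representable = exponent + maxFractionDigits + 1;
  const keep = Math.min(digits.length, Math.max(0, representable));
  if (keep === 0) {
    // Every significant digit lies beyond the limit; keep the limit's exponent.
    return `0e${-maxFractionDigits}`;
  }
  const kept = digits.slice(0, keep);
  const mantissa = kept.length > 1 ? `${kept[0]}.${kept.slice(1)}` : kept;
  return `${mantissa}e${exponent}`;
}

console.log(truncateToLimit('12345', 0, 3));    // 1.234e0
console.log(truncateToLimit('12345', -4, 3));   // 0e-3
console.log(truncateToLimit('100', -99, 100));  // 1.0e-99
console.log(truncateToLimit('100', -100, 100)); // 1e-100
```

With a limit of three fractional digits this reproduces the "Truncated result" column of the toy table, and with a limit of 100 it reproduces the "1.00e-99" and "1.00e-100" cases.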
There was a problem hiding this comment.
I don't really understand the relevance of your toy example above. The only observable impact that stringDigitCount ever has is on the number of trailing zeros, and the impact of this particular line is only on values for which Number(n) === 0, like -0.0 and 1.00e-400, but not 1.00e-200.
Those input values fall into two categories:
- Nonzero values that are very very small, like 1.00e-400.
- Representations of zero, like 0.00.
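As a quick check of which inputs are affected, the condition Number(n) === 0 can be evaluated directly for the values mentioned here; this relies only on standard Number conversion.

```javascript
// Decimal strings and whether they convert to (positive or negative) zero.
const inputs = ['-0.0', '0.00', '1.00e-400', '1.00e-200'];
const roundsToZero = inputs.map((s) => Number(s) === 0);

console.log(roundsToZero); // [ true, true, true, false ]
```

Note that `-0 === 0` is true in JavaScript, so '-0.0' falls into the affected group as well.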
I think we've mostly been talking here about the former, but the latter is likely to be much more common. With the spec language that's currently proposed, we'd retain trailing zeros also for zero values:
new Intl.NumberFormat().format('0.00') // '0.00'
If we were to apply the change you suggest here, we would lose them:
new Intl.NumberFormat().format('0.00') // '0'
I don't think that's a good idea, and we should not lose this precision.
Note also that this is covered by the following tests in tc39/test262#4608:
const nf = new Intl.NumberFormat('en-US', { maximumFractionDigits: 20 });
assert.sameValue(nf.format('0.0'), '0.0');
assert.sameValue(nf.format('00.0'), '0.0');
assert.sameValue(nf.format('-0.00'), '-0.00');
assert.sameValue(nf.format('-.00'), '-0.00');
assert.sameValue(nf.format('1.2345e-1000'), '0.0000');
const nf3 = new Intl.NumberFormat('en-US', {
minimumSignificantDigits: 2,
maximumSignificantDigits: 4,
});
assert.sameValue(nf3.format('0.00'), '0.00');
assert.sameValue(nf3.format('.00'), '0.00');
const spf = new Intl.NumberFormat('en-US', {
style: 'percent',
notation: 'scientific',
maximumFractionDigits: 10,
});
assert.sameValue(spf.format('0.0'), '0.0E0%');
If you think one or more of the tests is wrong, maybe it'd be more useful to discuss that first, before continuing here?
| <ins class="block"> | ||
| <p> | ||
| The conversion of a |StringNumericLiteral| to a mathematical value and a precision is similar overall to the determination of the NumericValue of a |NumericLiteral| (see <emu-xref href="#sec-literals-numeric-literals"></emu-xref>), but some of the details are different. | ||
| The result of StringIntlMV is a List value with two elements, a mathematical value and the count of decimal digits in the source text. |
There was a problem hiding this comment.
I'm having a hard time with the StringIntlMV, and I think it's because the semantics of the mathematical value it returns are not clear—what exactly is meant by "the count of decimal digits in the source text"? Also, I'm not sure whether or not this is incorrect, but I'm definitely surprised that exponents can serve to increase that count:
| literal | e | 1 − e | m | m′ | n | stringDigitCount | surprising? |
|---|---|---|---|---|---|---|---|
| `.07` | 0 | 1 | 1 | - | 2 | 3 | |
| `0.07` | 0 | 1 | 1† | - | 2 | 3 | |
| `00.07` | 0 | 1 | 1† | - | 2 | 3 | |
| `7e-2` | -2 | 3 | 1 | 3 | - | 3 | |
| `70e-3` | -3 | 4 | 2 | 4 | - | 4 | mildly |
| `7.0e-2` | -2 | 3 | 1 | 3 | 1 | 4 | no |
| `0.7e-1` | -1 | 2 | 1† | 2 | 1 | 3 | |
| `.7e-1` | -1 | 2 | 2 | - | 1 | 3 | |
| `.07e0` | 0 | -1 | 1 | - | 2 | 3 | |
| `00.07e0` | 0 | -1 | 1† | - | 2 | 3 | |
| `.007e1` | 1 | 0 | 1 | - | 3 | 4 | yes, why equivalent to .070? |
| `0.007e1` | 1 | 0 | 1† | - | 3 | 4 | yes, why equivalent to 0.070? |
† clamped
If not for those last two rows, I would say that it's something like "the minimum count of digits necessary to express all explicitly-present digits of a decimal literal without using an exponent part".
There was a problem hiding this comment.
I've reworked this so that stringDigitCount now excludes all leading zeros, and is not affected by the exponent. These are the new values matching your table above:
| literal | m | n | z | stringDigitCount |
|---|---|---|---|---|
| `.07` | 0 | 2 | 1 | 1 |
| `0.07` | 1 | 2 | 2 | 1 |
| `00.07` | 2 | 2 | 3 | 1 |
| `7e-2` | 1 | 0 | 0 | 1 |
| `70e-3` | 2 | 0 | 0 | 2 |
| `7.0e-2` | 1 | 1 | 0 | 2 |
| `0.7e-1` | 1 | 1 | 1 | 1 |
| `.7e-1` | 0 | 1 | 0 | 1 |
| `.07e0` | 0 | 2 | 1 | 1 |
| `00.07e0` | 2 | 2 | 3 | 1 |
| `.007e1` | 0 | 3 | 2 | 1 |
| `0.007e1` | 1 | 3 | 3 | 1 |
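The values in this table are consistent with reading m as the number of integer digits, n as the number of fraction digits, z as the number of leading zeros across both, and stringDigitCount as m + n − z. That reading is inferred from the rows, not quoted from the spec text. A minimal sketch under that assumption, with a hypothetical function name and a simplified unsigned grammar:

```javascript
// Hypothetical helper; m, n and z follow the table columns above.
// Handles unsigned decimal literals only. The table does not cover all-zero
// literals such as '0.00', and this sketch makes no attempt to model them.
function countStringDigits(literal) {
  const match = /^(\d*)(?:\.(\d*))?(?:[eE][+-]?\d+)?$/.exec(literal);
  if (match === null) {
    throw new SyntaxError(`Not a decimal literal: ${literal}`);
  }
  const intPart = match[1];
  const fracPart = match[2] ?? '';
  if (intPart.length + fracPart.length === 0) {
    throw new SyntaxError(`No digits in literal: ${literal}`);
  }
  const m = intPart.length;  // digits before the decimal point
  const n = fracPart.length; // digits after the decimal point
  const z = /^0*/.exec(intPart + fracPart)[0].length; // leading zeros
  // The exponent part is matched but deliberately ignored.
  return { m, n, z, stringDigitCount: m + n - z };
}

console.log(countStringDigits('00.07').stringDigitCount);  // 1
console.log(countStringDigits('7.0e-2').stringDigitCount); // 2
console.log(countStringDigits('.007e1').stringDigitCount); // 1
```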
spec.emu
Outdated
| <emu-alg> | ||
| 1. Let _b_ be MV of |DecimalDigits|. | ||
| 1. If |ExponentPart| is present, let _e_ be MV of |ExponentPart|. Otherwise, let _e_ be 0. | ||
| 1. <ins>If _e_ < 0, let _m_ be 1 - _e_; else, let _m_ be 1.</ins> |
There was a problem hiding this comment.
It was rather cumbersome to confirm that the three algorithms for |StrUnsignedDecimalLiteral| productions with |DecimalDigits| are equivalent to each other with respect to missing parts (i.e., that `.` DecimalDigits ExponentPart? and DecimalDigits ExponentPart? are in fact treated as degenerate cases of DecimalDigits `.` DecimalDigits? ExponentPart?). I think that would be clearer if all three delegated to a common operation, which might also help with understanding the semantics of the returned count by giving it an internal name as an alias.
<emu-grammar>StrUnsignedDecimalLiteral ::: DecimalDigits `.` DecimalDigits? ExponentPart?</emu-grammar>
<emu-alg>
1. If |ExponentPart| is present, let _e_ be MV of |ExponentPart|; else let _e_ be 0.
1. Let _intPart_ be the first |DecimalDigits|.
1. If the second |DecimalDigits| is present, let _fracPart_ be the second |DecimalDigits|; else let _fracPart_ be ~empty~.
1. Return StringIntlMVFromParts(_intPart_, _fracPart_, _e_).
</emu-alg>
<emu-grammar>StrUnsignedDecimalLiteral ::: `.` DecimalDigits ExponentPart?</emu-grammar>
<emu-alg>
1. If |ExponentPart| is present, let _e_ be MV of |ExponentPart|; else let _e_ be 0.
1. Return StringIntlMVFromParts(~empty~, |DecimalDigits|, _e_).
</emu-alg>
<emu-grammar>StrUnsignedDecimalLiteral ::: DecimalDigits ExponentPart?</emu-grammar>
<emu-alg>
1. If |ExponentPart| is present, let _e_ be MV of |ExponentPart|; else let _e_ be 0.
1. Return StringIntlMVFromParts(|DecimalDigits|, ~empty~, _e_).
</emu-alg>
|
I'd like to propose a reframing... every non-empty sequence of decimal digits d<sub>1</sub>…d<sub>k</sub> that either contains only zeros or does not start with zero is uniquely correlated with a set of values having the same significant digits that differ only in power-of-ten scaling (i.e. d<sub>1</sub>e±x if k = 1 and d<sub>1</sub>.d<sub>2</sub>…d<sub>k</sub>e±x otherwise). For example, 7, 0.7, 0.07, 7e0, 7e6, and 7e-3 all correspond with the sequence "7" while 7.0, 0.70, 0.070, 7.0e0, 7.0e6, and 7.0e-3 all correspond with sequence "70". This proposal is concerned with the latter, values corresponding with sequences that end in one or more zeros.

Formatting a value from one of those sets is already solved AFAIK, but this PR relates to ingesting and describing such values.

Ingestion is trivial if the input includes a decimal point inside of or before the significant sequence, but can be tricky otherwise: 7.0 and 7.0e2 clearly both have sequence "70", but 700 and 700e-2 could be mapped to sequence "7" or "70" or "700" (together or separately). I think I want to advocate for both of those being mapped to sequence "7" (i.e., trailing zeros in the input representation are only significant if the last one is to the right of a decimal point in that representation, regardless of presence vs. absence of an exponent part).

As for description, I really want to use the above framing, capturing not a mathematical value and a count of digits but rather a sequence of significant digits and a power-of-ten scale (or something clearly analogous). For example, where d<sub>i</sub> ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, d<sub>1</sub> ≠ 0, d<sub>k</sub> ≠ 0, q ≥ 0, and s ≥ 0,
|
There was a problem hiding this comment.
I'm trying to figure out what to write here. I have stared at the first chunk of expressions, for at least 5 minutes apiece, to convince myself that they are correct. I haven't had time to do all of them. But, I do not wish to do so, because I believe that this logic should not be so hard to follow. All we should be doing is propagating a number of minimum significant digits through the stack, which should be not hard. Dealing with leading zeros is what makes it hard, and that seems totally unnecessary to me. At the same time, I hear what you said about implementing this in tests and in a polyfill according to this choice of language.
|
@gibson042 I've applied most of the changes you've asked for. I have not changed the consideration of integer trailing zeros, though, as I do not think that they ought to be discarded. This corresponds with the representation used by toPrecision:
(700).toPrecision(1) // '7e+2'
(700).toPrecision(2) // '7.0e+2'
(700).toPrecision(3) // '700'
(700).toPrecision(4) // '700.0'
As we have this prior art in the language, we should ensure that using it works as expected with Intl.NumberFormat. |
|
As I pointed out in Matrix, I don't consider that to be a relevant precedent. Most obviously, it doesn't even have bearing on input in exponential notation (e.g., it cannot differentiate "700e1" from "7000", which is critical to do here because exponential notation is the means by which precision is accurately conveyed). And even if it were relevant, this proposal is precisely the kind of extension that allows the language to improve. But that said, this PR can be reviewed independently of such concerns. |
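To illustrate the point about exponential input: once a decimal string has been converted to a Number, the distinction between "700e1" and "7000" is gone, so toPrecision has nothing to recover it from. This uses only standard Number behavior.

```javascript
// Both strings convert to the same Number value.
const a = Number('700e1');
const b = Number('7000');

console.log(a === b);           // true
console.log(a.toPrecision(3));  // 7.00e+3
console.log(b.toPrecision(3));  // 7.00e+3
```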
|
A rendered view of the spec with this PR's changes is currently available at https://eemeli.org/tc39-proposal-intl-keep-trailing-zeros/
I wasn't able to deploy it to tc39.es due to branch protection rules that I can't change myself for this repo. |
While putting together tc39/test262#4608, I validated the proposed spec text with a patched fork of FormatJS, and this identified a few places where the spec text needs to be updated:
- Leading zeros need to be discarded, so we count `'0012.3'` to have three string digits. To do so in the syntax-directed operation, I introduce `ZeroDigits` as a new syntax rule.
- An elided leading zero needs to be accounted for, so `'.45'` should count as having three string digits.
- Changes in exponents that reduce the number of leading zeros need to also reduce the string digit count accordingly. This means that when formatted as a percentage, `'0.06'` should format as if it had only one string digit, rather than three. This adjustment needs to be done potentially twice, as `style: 'percent'` can be combined with `notation: 'engineering'` or `notation: 'scientific'`.

Ping @sffc, @gibson042, @jessealama, @ben-allen for reviews.