Add automatic conversion to BigDecimal when using JSON.parse with high-precision floats #946

HoneyryderChuck wants to merge 1 commit into ruby:master

## Conversation
Automatically parses JSON numbers with >17 significant digits as BigDecimal. Falls back to using Float otherwise. This behaviour matches what `Oj.load` does.
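For context, the precision loss this change is meant to avoid can be reproduced with plain `JSON.parse` (a minimal sketch; the literal value is only an illustration):

```ruby
require "json"

# Plain JSON.parse turns every JSON number with a fractional part into a
# Float, so digits beyond Float's ~17 significant digits are silently lost.
num = "3.141592653589793238462643383279502884197"
parsed = JSON.parse("[#{num}]").first

p parsed.class       # Float
p parsed.to_s == num # false: the trailing digits were dropped
```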
```ruby
# JSON.load is liberal with empty string
JSON.load("")  #=> nil
JSON.parse("") #=> unexpected end of input at line 1 column 1 (JSON::ParserError)
```

This is desired behavior.

```ruby
# JSON.load ignores inputs which can't be converted
JSON.load([])  #=> nil
JSON.parse([]) #=> no implicit conversion of Array into String (TypeError)
```

That's a bug, I believe. (Line 707 in 3f32c47: `empty?`)
I'm open to making this an option, but as is, this PR isn't acceptable, as it's a breaking change.
Actually, I think the feature you want is doable in pure Ruby, without any change to the parser:

```ruby
require "bigdecimal"
require "json"

module AutoBigDecimal
  def self.try_convert(str)
    decimal = BigDecimal(str)
    if decimal.precision < 17
      decimal.to_f
    else
      decimal
    end
  end
end

DOC = <<~'JSON'
  {
    "small": 12.34,
    "large": 3.141592653589793238462643383279502884197
  }
JSON

p JSON.parse(DOC, decimal_class: AutoBigDecimal).values.map(&:class) # => [Float, BigDecimal]
```
How would you make it so that this option is taken by default? Considering that there's no
Indeed. There's the probably minor drawback of always instantiating a BigDecimal to perform the check, but then again, I was initially willing to just turn everything into BigDecimals. That's not usable for the same reasons I mentioned above, as I'd have to ensure that all calls to

I see now that I also need to do JRuby. LMK if this is something you'd consider, via some special option or as a breaking change for a future release, and I'll work on it. I can close it otherwise.
Yes. IMO there is no difference between a monkey patch and a global option; the consequence is the same (code ends up using options it didn't test for), but the monkey patch is more honest about that consequence, and is Ractor friendly.
Yes, that's why I prefer a pure Ruby solution when possible; it saves on fixing two parsers. IMO, given this feature is mostly for migrating away from Oj, I think it's fine with a custom
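For reference, the stock `decimal_class` option already covers the all-BigDecimal variant without any patching: passing `BigDecimal` itself makes every JSON decimal a BigDecimal (sketch assuming json >= 2.x):

```ruby
require "bigdecimal"
require "json"

# decimal_class: BigDecimal routes every decimal literal through BigDecimal
# instead of Float, trading speed for exactness.
result = JSON.parse('{"x": 0.1}', decimal_class: BigDecimal)
p result["x"].class # BigDecimal
```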
Ok, so I think I'm gonna close this. I get how this can be helpful to migrate away from Oj, but the more I think about this feature, the more I think it's a bad one, and as shown above, it can be emulated if strictly necessary.
@byroot sorry for the delay, I wanted to validate a few things on my side in order to provide a more complete list of patches one has to apply when going through an oj-to-json migration, just for posterity's sake. So here it goes:

```ruby
module JSONParseWithDecimalClass
  module AutoBigDecimalConversion
    def self.try_convert(str)
      decimal = BigDecimal(str)
      if decimal.precision > 17
        decimal
      else
        decimal.to_f
      end
    end
  end

  OPTS_COMPAT_WITH_OJ = {
    # Oj ignores control characters (like newline) by default
    allow_control_characters: true,
    # Oj converts high-precision floats to BigDecimal by default
    decimal_class: AutoBigDecimalConversion,
  }.freeze

  # Oj.mimic_json has its own internal logic to try to convert an input to JSON, whereas raw
  # `JSON.generate` calls `to_s` on it, which in most cases is wrong.
  #
  # @example
  #   JSON.generate(Object.new) #=> "\"#<Object:0x0000ffff72363f68>\""
  #   Oj.dump(Object.new)       #=> "{\"^o\":\"Object\"}"
  #
  # I couldn't figure out the exact logic, so I just preemptively call `.to_json`, since it's a
  # rails app; which, btw, is also not 100% compatible, as `.to_json` uses activesupport's JSON
  # encoding logic, which unicode-escapes characters like "&" to "\u0026".
  def generate(src, opts = nil)
    return super unless opts.nil?
    return src.to_json if src.respond_to?(:to_json)

    super
  end

  def parse(src, opts = nil)
    # Oj doesn't try to be fancy with nil
    if src
      # Oj can receive a "readable" source as input (files, IOs, ...), JSON doesn't.
      if src.respond_to?(:to_io)
        src = src.to_io
      end

      if src.respond_to?(:read)
        src = src.read
      else
        # this seems to be Oj's fallback when the source is not readable
        src = src.to_s
      end
    end

    opts ||= OPTS_COMPAT_WITH_OJ
    opts[:decimal_class] ||= AutoBigDecimalConversion
    opts[:allow_control_characters] = true unless opts.key?(:allow_control_characters)
    super
  end
end

JSON.singleton_class.prepend(JSONParseWithDecimalClass)
```

Other things that are different with
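The source-coercion branch of the `parse` wrapper can be exercised on its own; here's a standalone sketch of the same Oj-style fallback chain (`coerce_source` is a made-up helper name, not part of the patch):

```ruby
require "json"
require "stringio"

# Mirrors the patch's input handling: nil passes through, IO-convertible and
# readable sources get read, and anything else falls back to #to_s.
def coerce_source(src)
  return src if src.nil?
  src = src.to_io if src.respond_to?(:to_io)
  src.respond_to?(:read) ? src.read : src.to_s
end

p JSON.parse(coerce_source(StringIO.new('{"a": 1}')))
p JSON.parse(coerce_source('{"a": 1}'))
```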
I think so too. However, I just want to point out that there's a slight difference between the check performed in Ruby and in C. Both check the precision of digits (your initial suggestion had the condition the wrong way round, though; I fixed it), but the C extension also checks the mantissa, which isn't exposed in Ruby. Not sure how much of an edge case this will be in terms of precision loss, though.
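To make the Ruby-side check concrete: `BigDecimal#precision` (available on Ruby 3.1+ / bigdecimal 3.1+) counts significant decimal digits, which is all the pure-Ruby converter can consult. A quick sketch:

```ruby
require "bigdecimal"

# precision counts significant decimal digits; the pure-Ruby converter keys
# off this alone, while the C extension can also inspect the mantissa.
small = BigDecimal("12.34")
large = BigDecimal("3.141592653589793238462643383279502884197")

p small.precision # well under 17 -> would be converted to Float
p large.precision # above 17 -> stays a BigDecimal
```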
Thanks for posting your patch, that should be helpful to a few people going through the same sort of issues.
Automatically parses JSON numbers with >17 significant digits as BigDecimal; falls back to using Float otherwise. This behaviour is conditional on "bigdecimal" being around, and will fall back on the current behaviour of converting to Float with precision loss otherwise.
This behaviour matches what `Oj.load` does by default, and is a proposal designed to smooth migrations away from oj.

@byroot while I started #944 mostly focused on encoding BigDecimals, which we agreed could be done via some monkeypatching which can eventually live in activesupport, the main blocker for our internal migration has actually been parsing. I mentioned that this could be done by using `JSON.load` with the `:decimal_class` option, but in practice this is turning out not to be true, for a few reasons.

The main one is that `JSON.load` and `JSON.parse` have some surprisingly different behaviour. Take these examples:

The difference in behaviour and the (in our case) widespread use of `JSON.parse` (partly driven by a rubocop cop which prevents usage of `JSON.load` for security reasons) make a rewrite to `JSON.load` quite risky.

Another drawback is that 3rd-party libs mostly use `JSON.parse` under the hood, and few/none allow passing down your own JSON decoder, so one could potentially convert everything to `JSON.load` and still trip on it down the line.

A way to use `JSON.parse` while supporting this conversion to BigDecimals would be ideal. `json` does not support a similar `:bigdecimal_load` option, and `JSON.parse` does not support global options anyway (not that I'm advocating for it), hence why I went with this "no config" approach based on having bigdecimal loaded.

There's also the argument of backwards compatibility (a BigDecimal is not a Float), which could surprise someone in case of a release. I'm not sure what's the best way to deal with this, but FWIW I still consider that the benefit (no precision loss) would outweigh the cost of the backwards-incompatible change.
This change was originally authored by @samgiddins while trying to solve the same issue internally. FYI, I removed everything encoding-related, removed the block that always tried to load bigdecimal, and tweaked the check for const defined.