Skip to content

Add automatic conversion to BigDecimal when using JSON.parse with high-precision floats#946

Closed
HoneyryderChuck wants to merge 1 commit intoruby:masterfrom
HoneyryderChuck:decode-to-bigdecimal
Closed

Add automatic conversion to BigDecimal when using JSON.parse with high-precision floats#946
HoneyryderChuck wants to merge 1 commit intoruby:masterfrom
HoneyryderChuck:decode-to-bigdecimal

Conversation

@HoneyryderChuck
Copy link

Automatically parses JSON numbers with >17 significant digits as BigDecimal, falls back to using Float otherwise. This behaviour is conditional on "bigdecimal" being around, and will fall back on current behaviour of converting to Float with precision loss otherwise.

This behaviour matches what Oj.load does by default, and is a proposal designed to smoothen migrations away from oj.

@byroot while I started #944 mostly focused on encoding bigdecimals, which we agreed it could be done via some monkeypatching which eventually can live in activesupport, the main blocker for our internal migration has actually been parsing. I mentioned that, by using JSON.load with :decimal_class option, but in practice this is turning out not to be true, for a few reasons.

The main one is that JSON.load and JSON.parse have some surprisingly different behaviour. Take these examples:

# JSON.load is liberal with empty string
JSON.load("") #=> nil
JSON.parse("") #=> unexpected end of input at line 1 column 1 (JSON::ParserError)

# JSON.load ignores inputs which can't be converted
JSON.load([]) #=> nil
JSON.parse([]) #=> no implicit conversion of Array into String (TypeError)
# sidenote, but the above is made even more confusing with oj's monkey-patching of JSON.load and JSON.parse...
JSON.load([]) #=> ArgumentError: parser io argument must be a String or respond to readpartial() or read(). (ArgumentError)
JSON.parse([]) #=> []

The difference in behaviour and the (in our case) widespread use of JSON.parse (partly driven by a rubocop cop which prevents usage of JSON.load for security reasons) make a rewrite to JSON.load quite risky.

Another drawback is that 3rd party libs mostly use JSON.parse under the hood, and few/none allow passing down your own json decoder, so one could convert everything to JSON.load potentially and still trip on it down the line.

A way to use JSON.parse while supporting this conversion to bigdecimals would be ideal. json does not support a similar :bigdecimal_load option, and JSON.parse does not support global options anyway (not that I'm advocating for it), hence why I went with this "no config" approach based on having bigdecimals loaded.

There's also the argument of backwards compatibility (a bigdecimal is not a float) which could surprise someone in case of a release. I'm not sure what's the best way to deal with this, but FWIW I still consider that the benefit (no precision loss) would outweight the backwards incompatible change cost.

This change was originally authored by @samgiddins while trying to solve the same issue internally. FYI I removed everything encoding-related, removed the block always trying to load bigdecimal, and tweaked the check for const defined.

Automatically parses JSON numbers with >17 significant digits as BigDecimal

Falls back to using Float otherwise.

This behaviour matches what `Oj.load` does.
@byroot
Copy link
Member

byroot commented Mar 5, 2026

# JSON.load is liberal with empty string
JSON.load("") #=> nil
JSON.parse("") #=> unexpected end of input at line 1 column 1 (JSON::ParserError)

This is desired behavior. load is meant to be a bit more liberal than parse in ignoring nil and "blank", which is convenient in many cases.

# JSON.load ignores inputs which can't be converted
JSON.load([]) #=> nil
JSON.parse([]) #=> no implicit conversion of Array into String (TypeError)

That's a bug I believe.

if opts[:allow_blank] && (source.nil? || source.empty?)
should only allow strings, not anything that respond to empty?.

Automatically parses JSON numbers with >17 significant digits as BigDecimal, falls back to using Float otherwise.

I'm open to make this an option, but as is this PR isn't acceptable as it's a breaking change.

@byroot
Copy link
Member

byroot commented Mar 5, 2026

should only allow strings, not anything that respond to empty?.

#947

byroot added a commit to byroot/json that referenced this pull request Mar 5, 2026
byroot added a commit that referenced this pull request Mar 5, 2026
@byroot
Copy link
Member

byroot commented Mar 5, 2026

Actually, I think the feature you want is doable with pure Ruby without any change to the parser:

require "bigdecimal"
require "json"

module AutoBigDecimal
  def self.try_convert(str)
    decimal = BigDecimal(str)
    if decimal.precision < 17
      decimal.to_f
    else
      decimal
    end
  end
end

DOC = <<~'JSON'
{
  "small": 12.34,
  "large": 3.141592653589793238462643383279502884197
}
JSON

p JSON.parse(DOC, decimal_class: AutoBigDecimal).values.map(&:class) # => [Float, BigDecimal]

@HoneyryderChuck
Copy link
Author

I'm open to make this an option, but as is this PR isn't acceptable as it's a breaking change.

How would you make it so that this option is taken by default? Considering that there's no JSON.parse_default_options.

Actually, I think the feature you want is doable with pure Ruby without any change to the parser.

Indeed. There's the probably minor drawback of always instantiating a bigdecimal to perform the check, but then again, I was willing to just turn everything to bigdecimals initially.

That's not usable for the same reasons I mentioned above, as I'd have to ensure that all calls to JSON.parse also set :decimal_class. I guess I could monkey-patch that as well, is that what you're suggesting?

I see now that I also need to do jruby. LMK if this is something that you'd consider via some special option or as a breaking change for a future release, and I'll work on it. I can close it otherwise.

@byroot
Copy link
Member

byroot commented Mar 5, 2026

I guess I could monkey-patch that as well, is that what you're suggesting?

Yes. IMO there is no difference between a monkey patch and a global option, the consequence is the same (code end up using options it didn't test for) but the monkey patch is more honest about the consequence and is Ractor friendly.

I see now that I also need to do jruby.

Yes, that's why I prefer pure Ruby solution when possible, saves on fixing two parsers.

IMO given this feature is mostly for migrating out of Oj, I think it's fine with a custom decimal_class over a new feature.

@byroot
Copy link
Member

byroot commented Mar 6, 2026

Ok, so I think I'm gonna close. I get how this can be helpful to migrate out of Oj, but the more I think about this feature the more I think it's a bad one, and as shown above, it can be emulated if strictly necessary.

@byroot byroot closed this Mar 6, 2026
@HoneyryderChuck
Copy link
Author

@byroot sorry for the delay, I wanted to validate a few things on my side in order to provide a more complete list of patches one has to move forward with when going through an oj-to-json migration, just for posterity's sake. So here it goes:

module JSONParseWithDecimalClass
  module AutoBigDecimalConversion
    def self.try_convert(str)
      decimal = BigDecimal(str)
      if decimal.precision > 17
        decimal
      else
        decimal.to_f
      end
    end
  end

  OPTS_COMPAT_WITH_OJ = {
    # Oj ignores control characters (like newline) by default
    allow_control_characters: true,
    # Oj converts high-precision floats to bigdecimal by default
    decimal_class: AutoBigDecimalConversion,
  }.freeze

  # Oj.mimic_json has its own internal logic to try to convert an input to json, whereas raw `JSON.generate` 
  # calls to_s it, which in most cases is wrong. 
  # 
  # @example
  #     JSON.generate(Object.new) #=> "\"#<Object:0x0000ffff72363f68>\""
  #     Oj.dump(Object.new) #=> "{\"^o\":\"Object\"}"
  #
  # I couldn't figure the exact logic, so I just preemptively `.to_json` it, since it0s a rails app; which btw, it's 
  # also not 100% compatible, as `.to_json` uses activesupport json encoding logic, which unicode-escapes
  # characters like "&" to "\\u0026\\"
  def generate(src, opts = nil)
    return super unless opts.nil?

    return src.to_json if src.respond_to?(:to_json)

    super
  end

  def parse(src, opts = nil)
    # Oj doesn't try to be fancy with nil
    if src
      # Oj can receive a "readable" source as input (files, IOs, ...), JSON doesn't.
      if src.respond_to?(:to_io)
        src = src.to_io
      end

      if src.respond_to?(:read)
        src = src.read
      else
        # this seems to be Oj's fallback when not a readable source
        src = src.to_s
      end
    end

    opts ||= OPTS_COMPAT_WITH_OJ

    opts[:decimal_class] ||= AutoBigDecimalConversion
    opts[:allow_control_characters] = true unless opts.key?(:allow_control_characters)

    super
  end
end

JSON.singleton_class.prepend(JSONParseWithDecimalClass)

Other things that are different with Oj.mimic_JSON :

  • error messages are completely different, so if you're relying on them, it sucks. It will also make the error message contain the payload (this was surprising and was being relied on 🙈 )
  • JSON.pretty_generate returns slightly different payload.

... and as shown above, it can be emulated if strictly necessary.

I think so too, however just want to point out that there's a slight difference between the check performed in ruby and in C. Both check for precision of digits (your initial suggest had the condition the wrong way though, fixed it), but the C extension also checks the mantissa, which isn't exposed in ruby. Not sure how much of an edge case in terms of precision loss this will be though.

@byroot
Copy link
Member

byroot commented Mar 6, 2026

Thanks for posting your patch, that's should be helpful to a few people going though the same sort of issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants