[ruby-core:95898] Re: [Ruby master Bug#16352] Marshal limit of >= 2 GiB
From:
Austin Ziegler <halostatue@...>
Date:
2019-11-20 19:58:43 UTC
List:
ruby-core #95898
Marshal2?

On Tue, Nov 19, 2019 at 9:05 PM <[email protected]> wrote:

> Issue #16352 has been updated by shyouhei (Shyouhei Urabe).
>
> Description updated
>
> This behaviour has been there since the beginning. No ruby version since
> 0.49 has successfully dumped such long string. Same thing happens for a
> very big bignum, a very long array, a class that has very long classpath
> (Q::W::E::R::...), an object of 2**31 instance variables (which isn't
> impossible these days), and much much more.
>
> The limitation is due to marshal's binary format. I guess the reason
> behind this is simply because at the time the format was designed (back in
> 1990s), there simply was no such thing like a 64 bit integer type. To
> properly reroute we have to reconsider all use of `long` in marshal
> format. I guess that is essentially a format change. That should hurt
> data portability so not that easy.
>
> Any nice idea to fix the situation?
>
> ----------------------------------------
> Bug #16352: Marshal limit of >= 2 GiB
> https://bugs.ruby-lang.org/issues/16352#change-82729
>
> * Author: seoanezonjic (Pedro Seoane)
> * Status: Open
> * Priority: Normal
> * Assignee:
> * Target version:
> * ruby -v: ruby 2.7.0dev (2019-11-12T12:03:22Z master 3816622fbe)
> [x86_64-linux]
> * Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
> ----------------------------------------
> Hi
> Using a gem to handle matrix operations called Numo-array I found the
> following error when save large matrix:
> in `dump': long too big to dump (TypeError)
> Github thread: https://github.com/ruby-numo/numo-narray/issues/144
> Digging with the authors, we found the following code that reproduces the
> error:
> ```
> ruby -e 'Marshal.dump(" "*2**31)'
> ```
> Executed in :
> ruby 2.7.0dev (2019-11-12T12:03:22Z master 3816622fbe) [x86_64-linux]
>
> The marshal library has a limit that is checked with the SIZEOF_LONG
> constant. This check is performed in this line
> https://github.com/ruby/ruby/blob/e7ea6e078fecb70fbc91b04878b69f696749afac/marshal.c#L301
> to 321 of the Marshal.c file. I don't understand the motivation of this
> limit and has a great impact in libraries that need to serialize large
> objects as numeric matrix. In this case, the limit of >= 2 GiB it's
> reached easily and it blocks the ruby development in scientifical projects
> as cited. I found other bug related: #1560, but the Marshal problem itself
> was not addressed in this case.
> Thank you in advance
> PEdro Seoane
>
>
>
> --
> https://bugs.ruby-lang.org/
>
> Unsubscribe: <mailto:[email protected]?subject=unsubscribe>
> <http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
>

-- 
Austin Ziegler • [email protected] • [email protected]
http://www.halostatue.ca/ • http://twitter.com/halostatue
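
For illustration only: the limit described in the quoted report could be sidestepped, without touching the marshal format, by dumping a large string in pieces that each stay below 2**31 bytes. The following is a minimal sketch under that assumption, not a fix for Bug #16352. The names MARSHAL_MAX_LEN, dump_in_chunks and load_chunks are hypothetical helpers and are not part of Ruby or Numo; the 2**31 - 1 ceiling is inferred from the reproduction above (a string of exactly 2**31 bytes fails).

```ruby
# Assumed ceiling: lengths must fit in a signed 32-bit integer,
# as suggested by the failing 2**31-byte string in the report.
MARSHAL_MAX_LEN = 2**31 - 1

# Split +str+ into binary pieces of at most +chunk_size+ bytes and
# Marshal.dump each piece on its own. Returns the original encoding
# name plus the array of dumped pieces.
def dump_in_chunks(str, chunk_size = MARSHAL_MAX_LEN)
  unless (1..MARSHAL_MAX_LEN).cover?(chunk_size)
    raise ArgumentError, "chunk_size must be in 1..#{MARSHAL_MAX_LEN}"
  end

  chunks = []
  offset = 0
  while offset < str.bytesize
    # byteslice works on raw bytes; .b marks the piece as binary so a
    # multibyte character cut in half does not matter.
    chunks << Marshal.dump(str.byteslice(offset, chunk_size).b)
    offset += chunk_size
  end
  [str.encoding.name, chunks]
end

# Reverse of dump_in_chunks: load every piece, concatenate the bytes,
# then restore the original encoding.
def load_chunks(encoding_name, chunks)
  out = String.new(encoding: Encoding::BINARY)
  chunks.each { |chunk| out << Marshal.load(chunk) }
  out.force_encoding(encoding_name)
end
```

With a small chunk_size the round trip can be checked on ordinary strings, e.g. `load_chunks(*dump_in_chunks("some text", 4))`. This only addresses long strings; the other cases named in the thread (very long arrays, bignums, class paths) would need their own handling.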