-
Notifications
You must be signed in to change notification settings - Fork 6.6k
Evolved Binary (EB-843) reference counted objects - Java interface proposal #10736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
alanpaxton
wants to merge
78
commits into
facebook:main
Choose a base branch
from
alanpaxton:eb-843-ref-counted-objects
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
opening a db. Now we need to close it, use CFs from it successfully (before DB close) and throwing an expcetion (after DB close). Then similarly with iterators. Then check performance.
Add some C++ wrapper to turn the C++ handle into a C++ RCA (Reference Counted API) object. Pull on the cascading changes necessary. Nearly in a position to run RefCountTest in the new world, at least for opening up a DB. And remove all the shadow (org.rocksdb.api package) code as that path isn't the one we need to take.
Just a few.. lots still to do.. working out how it works for the moment.
It seems better to just incrementally change in place the existing classes.
RocksNative is now in the org.rocksdb package, org.rocksdb.api package seems unnecessary.
Further steps on the way to the new reference counted API.
Introduce some operator->() jiggery pokery All the iterator.cc methods Some more DB methods
Plenty more cases not fixed yet.
There is a single default_cf_handle_ which gets deleted as part of the deletion of the DB itself. So we can't put it in a shared_ptr or it will get deleted with the final shared_ptr instance, which is too soon, rather than explicitly deleted only when the DB is finally deleted.
So move it into that class which we use to confirm the issues with ref counts, or which we want to fix with ref counts.
nondefault is an unowned reference
rocksdb_get_helper converted to new model, along with the methods that call it. That's just the indirect methods, we will do rocksdb_get_helper_direct next. As part of this, add/revise some helper methods for API objects (getters, lockers) that make code look more consistent where we access stuff via the new model.
We can still improve things by adding a WeakDB to the DB, and using it to verify that the C++ rocks database itself has truly been closed.
JNI name mangling nonsense causing a problem. Can't figure out why the weakdb names are mangled. Bah.
Now linking the C++ method with the correct name.
Added closed DB checks. Removed a ReadOptions test which wasn't relevant to ref count testing.
fix put(), enableAutoCompaction() Convert parameters to use the new-style handles. Provide a helper function to copy a vector of CF handles.
cd88951
to
b1b0b1a
Compare
More template params for iterators so that they cover the WBWI iterators. WBWI iterator already delete()s the BaseDeltaIterator internally (C++ Rocks level), so we can’t run it as a shared_ptr. Best we can manage is a unique_ptr but we leave a hole where the base iterator Java object may crash if used after the derived is closed. Some polish/tweaks around references and constructors.
5775970
to
5dd112f
Compare
WriteBatchWithIndex passing all existing tests in new shared reference model.
CF->lockDBCheck() is too complicated, would need more context when dealing with CF used against base DB extracted from OptimisticTransactionDB (which should be valid). There’s some checking internally anyway, so I think what we’re doing may count as unnecessary. Some small bits of localized code improvement/polish, and comments, where I came across them.
e1acf5a
to
cf0eafa
Compare
0bcf435
to
53dee7f
Compare
remove booby traps
Convert to ref-counted API Add test
0c9fddc
to
ae52ca9
Compare
f6e9e9c
to
ae52ca9
Compare
alanpaxton
added a commit
to alanpaxton/rocksdb
that referenced
this pull request
Feb 2, 2023
ComparatorOptions is AutoCloseable. AbstractComparator does not hold a reference to its ComparatorOptions, but the native C++ ComparatorJniCallback holds a reference to the ComparatorOptions’ native C++ options structure. This gets deleted when the ComparatorOptions is closed, either explicitly, or as part of try-with-resources. Later, the deleted C++ options structure gets used by the callback and the comparator options are effectively random. The original bug report facebook#8715 was caused by a GC-initiated finalization closing the still-in-use ComparatorOptions. As of 7.0, finalization of RocksDB objects no longer closes them, which worked round the reported bug, but still left ComparatorOptions with a potentially broken lifetime. The solution implemented here is to copy the native C++ options object into the ComparatorJniCallback. Then the deletion of the native object held by ComparatorOptions is *correctly* deleted when its scope is closed in try/finally. There are probably other instances of the same problem. They could all be addressed using the model proposed/prototyped in facebook#10736 In the meantime, we will look for and fix other instances using the same copy technique.
facebook-github-bot
pushed a commit
that referenced
this pull request
Feb 17, 2023
Summary: The problem ------------- ComparatorOptions is AutoCloseable. AbstractComparator does not hold a reference to its ComparatorOptions, but the native C++ ComparatorJniCallback holds a reference to the ComparatorOptions’ native C++ options structure. This gets deleted when the ComparatorOptions is closed, either explicitly, or as part of try-with-resources. Later, the deleted C++ options structure gets used by the callback and the comparator options are effectively random. The original bug report #8715 was caused by a GC-initiated finalization closing the still-in-use ComparatorOptions. As of 7.0, finalization of RocksDB objects no longer closes them, which worked round the reported bug, but still left ComparatorOptions with a potentially broken lifetime. In any case, we encourage API clients to use the try-with-resources model, and so we need it to work. And if they don't use it, they leak resources. The solution ------------- The solution implemented here is to make a copy of the native C++ options object into the ComparatorJniCallback, rather than a reference. Then the deletion of the native object held by ComparatorOptions is *correctly* deleted when its scope is closed in try/finally. Testing ------- We added a regression unit test based on the original test for the reported ticket. This checkin closes #8715 We expect that there are more instances of "lifecycle" bugs in the Java API. They are a major source of support time/cost, and we note that they could be addressed as a whole using the model proposed/prototyped in #10736 Pull Request resolved: #11176 Reviewed By: cbi42 Differential Revision: D43160885 Pulled By: pdillinger fbshipit-source-id: 60b54215a02ad9abb17363319650328c00a9ad62
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Overview
Experimental re-imagining of how the Java API maintains consistency with internal RocksDB objects.
The current RocksObject / AbstractImmutableNativeReference / AbstractNativeReference implementation expects a single Java object to be the single owning handle (
isOwningHandle()
) for an underlying C++ object, for instance a database handle or a column family handle.Multiple Java objects sharing the same underlying object can exist, and then the owning protocol must be observed in order that neither double-deallocation nor a resource leak occurs.
We want to do away with the problem by introducing a form of reference counting so that the last owner of an object always close()s it for real, and no other owner does.
Our idea is that we can do this by changing the handle from being a C++ T* to being a reference to a C++ shared_ptr *, where each Java object has its own companion shared_ptr. On closing a Java object, the shared pointer is reset() and deallocated.
There is a pull request with the changes WIP - #10736
Implementation Detail (Java)
We have introduced a new class,
RocksNative
which wraps the native reference to a RocksDB C++ objectin a modified way to the old
RocksObject
.RocksNative
.AtomicBoolean isOpen
. There is no concept of having disowned the native reference at the Java side; ownership and lifetime is dealt with entirely by smart pointers on the C++ side.AbstractImmutableNativeReference
because all references are necessarily to an immutable shadow object.Implementation Detail (C++)
In rocksdb repo branch we have added a number of API files to the
java/rocksjni
directory, namedapi_<item>.h
thusapi_base.h
- common support functionsapi_rocksdb.h
- the shadow database handleapi_columnfamilyhandle.h
- the shadow column family handleapi_iterator.h
- the shadow iterator handleapi_wrapper.h
- a standard wrapper for less commonly used objects which have shadowsThese shadow objects all implement variations of the same pattern:
api_rocksdb.h
- the shadow database handleThe shadow database handle contains shared pointers to the RocksDB itself (
TDatabase
is usuallyROCKSDB_NAMESPACE::DB
)and to the column family handles open within the database.
The shadow database handle is created on a database
open()
JNI call, and its reference is cast tolong
and returned as the handlein the parallel Java object (class
RocksDB
inRocksDB.java
). Previously, a pointer to theROCKSDB_NAMESPACE::DB
was returned.When the
RocksDB
class, is accessed in native code, the handle is cast to the shadow object, so we define theshorthand in C++
using API_DB = APIRocksDB<ROCKSDB_NAMESPACE::DB>;
and thenthis replaces the previous common idiom
The shadow database handle also defines the standard pointer methods on itself,
so that it can be used as if it were a pointer
and this leads to the idiom
(*dbAPI)->Method(...)
being used as a replacement fordb->Method(...)
throughout the changed codebase.
The shadow object is deleted in the
Java_org_rocksdb_RocksDB_nativeClose
method.When this happens, reference counts on the
std::shared_ptr
members are reduced,and by the magic of shared pointers the
ROCKSDB_NAMESPACE::DB
andROCKSDB_NAMESPACE::ColumnFamilyHandle
objects are deleted (with the C++
delete()
method) on their final reference.Crucially, however, these references will persist when the shared pointers have been copied into other
shadow objects. This gives us the guarantees that we will be able to access certain structures (e.g. iterators)
where their lifetime persists beyond that of the database, while still cleaning them up when all references
to them are closed.
This means that we can use idioms like closing objects out of nested order, or passing objects out of a loop, while still being secure that our calls will work, or at the very worst fail cleanly with an obvious java exception.
What they won't ever do is crash in C++ when invoked from Java (practically speaking, not never but much less frequently and in much more corner cases).
api_iterator.h
- the shadow iterator handleIterators are used with shadow objects in a similar way to databases. Because iterators are created with a
database in hand, we can copy the database (and if there is one, column family handle) from the RocksDB shadow
object into the iterator shadow object. The shadow object for an iterator then consists of the following:
the
std::unique_ptr<TIterator> iterator
is all that is necessary because we never have multiple reference to a single iterator. Thestd::shared_ptr
s maintain the lifetime of the database and column family handle for thelifetime of the shadow object, so that the iterator is always valid until it is closed. The only requirement on the
user is to
close()
the iterator when they are done with it, most often just by using Java's try-with-resourcesmechanism, but reserving the option to
close()
it manually if its lifetime is more sophisticated.api_columnfamilyhandle.h
- the shadow column family handleColumn family handles also have shadow objects. We choose to make their lifetime different from that of iterators.
When the corresponding database (and all other objects which implicitly use it) are closed, the column family handle becomes invalid. We take this view because usually a column family is accessed with a database in hand anyway,
so it is wrong to do so if that database is closed.
When the column family handle is used closed, we return a clean Java exception.
This behaviour is achieved using the C++
std::weak_ptr
mechanism. Each weak pointer is associated with astd::shared_ptr
and can belock()
-ed to yield astd::shared_ptr
as long as there are other outstanding copies of the shared pointer (its reference count has not fallen to 0). A successfullock()
yields astd::shared_ptr<ROCKSDB_NAMESPACE::ColumnFamilyHandle>
which is valid at least until it goes out of scope, and then in C++ we dereference that to make the same calls to the database as before. An unsuccessfullock()
of the database weak pointer tells us that all database and iterator handles are already closed.So when accessing column families in client code, we must access the column family while we have an open database.
If the database has been closed, we will fail with a
RocksDBException
with the message "column family already closed".Performance
The new code does what might be considered as some extra work in 2 ways
AtomicBoolean isOpen
before every access of the native handlestd::shared_ptr
or astd::weak_ptr
rather than a raw pointer.For smart pointers, we devised a worst case performance test based on existing JMH tests for JNI performance of alternative implementations of
get()
methods.The RocksDB JNI Performance branch has been
adapted to provide an extra test which simulates the access pattern using a
std::shared_ptr
to access asimulated database object, and a
std::weak_ptr
to access a simulated column family object.The performance of 4 separate implementations of an implementation of
get()
which uses the JNISetRegion
methodsto return data in a java
byte[]
are compared. A distinct 5th implementation usingSetRegion
to get into an indirectByteBuffer
is compared for reference.The 4 implementations are:
get()
callsget()
callsstd::shared_ptr
holding a reference to the model database.std::shared_ptr
andstd::weak_ptr
references; fetch both,check that they are equal (of course this is true), and then use the reference.
As the buffer size grows,
get()
performance becomes indistinguishable between all theSetRegion
intobyte[]
implementations.SetRegion
into an indirectByteBuffer
is significantly slower.SetRegion
into an indirectByteBuffer
is significantly slower.For small buffer sizes there is a statistically significant but still extremely small difference between (1) and
(2,3,4). This may be attributable to optimizations possible when the address of the mock database is a known, static value or it may be due to passing 1 less parameter to the
get()
method.Otherwise, dereferencing a shadow object via a smart pointer, even where locking is involved, seems to have negligible performance impact. Bear in mind that the performance being tested is entirely stripped down to the
nuts and bolts of buffer transfer, so it would exaggerate any performance difference compared to what we would
see in RocksDB, which has to go through a longer code path before it can access buffers.
Status
A large number of
RocksObject
classes have been converted toRocksNative
, and the corresponding C++ code updated. All Java unit tests are passing in this branch. As well as the fundamental classes (RockDB, ColumnFamilyHandle, Iterator) this has necessitated the conversion of a large number of derivative or related classes.The classes that remain to be converted include the various
Options
classes.TODO