Skip to content

Refactor global allocator to safer TrackingAllocator #205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 6, 2023
Merged

Conversation

ianks
Copy link
Collaborator

@ianks ianks commented May 1, 2023

Previously, it was possible for the previous RbAllocator implementation to deadlock and/or segfault when dealloc was called after the Ruby VM was shut down. In particular, this can happen when thread local destructors are run, since they are run after the Ruby VM has been shut down. When the Ruby VM is shut down, calls to rb_gc_adjust_memory_usage will fail. So we need to avoid that.

Bullet Points

  • Use some hidden symbols (ruby_vm_current_ptr and ruby_current_ec) to check the availability of the Ruby VM before attempting to report memory usage.
  • Rename RbAllocator to TrackingAllocator to communicate the implementation a bit more accurately
  • Add new ManuallyTracked<T> type which allows RAII style allocation tracking outside of std::system::GlobalAlloc. This is useful when you want to track memory usage from direct calls to mmap(2), for example.

Caveats

Unfortunately, this solution will not work with Ruby 3.3 since ruby_current_vm_ptr is not longer publicly exported (as it was private to begin with). I've opened an issue on the Ruby bug tracker to add official support for the needed behavior, though. Please chime in on the issue so we can have reliable Rust memory tracking in the next Ruby release!

@ianks ianks force-pushed the allocator-check-vm branch from 6d5798a to b36114d Compare May 2, 2023 01:40
@ianks ianks marked this pull request as ready for review May 2, 2023 01:41
@ianks ianks force-pushed the allocator-check-vm branch from c1c9eb5 to 4d940bf Compare May 2, 2023 05:36
@ianks ianks force-pushed the allocator-check-vm branch from 3d26c77 to d02bd6f Compare May 2, 2023 17:53
R: Send + 'static,
{
self.run(|| {
trigger_full_gc!();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just for the tracking allocator tests? Couldn't this just be used in those tests?

for closure in receiver {
closure();
// Wait for the main thread to finish setting up Ruby
while unsafe { STATE.load(Ordering::Acquire) != 2 } {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the std::sync::Once::call_once docs:

This method will block the calling thread if another initialization routine is currently running.

so there shouldn't be any need for this redundant synchronization.

@@ -66,7 +77,7 @@ impl RubyTestExecutor {
F: FnOnce() -> R + Send + 'static,
R: Send + 'static,
{
let (result_sender, result_receiver) = mpsc::sync_channel(1);
let (result_sender, result_receiver) = mpsc::sync_channel(8);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 8? This channel is only used in this single run call, so it will only have a single send to it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sync_channel has some strange behaviors in 1.54, was using this as a way to robustness test. Will remove.

Comment on lines 39 to 40
static INIT: Once = Once::new();
static mut STATE: AtomicU8 = AtomicU8::new(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is more synchronization needed now? It looks like moving this initialization code into executor.run(|| { that precedes the executor being returned means that we shouldn't even need the INIT, let alone the extra STATE synchronization

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was having some major synchronization headaches on Rust 1.54 (but not on stable). I threw everything at the book at it to finally make these tests stable since libruby doesn't always use atomic loads for these values.

My thinking is that since these are test helpers, it's better have too much synchronization than not enough - to avoid flakiness at almost all costs. That being said, planning on circling back to clean up the unnecessary stuff.

let result = $e;
let after = unsafe { rb_sys::rb_gc_stat(id) };

$crate::trigger_full_gc!();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be triggered before the first rb_gc_stat?

Comment on lines 36 to 41
pub static CAPTURE_LOCK_INIT: std::sync::Once = std::sync::Once::new();
pub static mut CAPTURE_LOCK: Option<std::sync::Mutex<()>> = None;

CAPTURE_LOCK_INIT.call_once(|| unsafe {
CAPTURE_LOCK.replace(std::sync::Mutex::new(()));
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't each use of this macro end up with a separate lock? What is this lock protecting? Isn't #[ruby_test] already keeping multiple ruby tests from running in concurrency?

let ret = !crate::hidden::ruby_current_vm_ptr.is_null();

#[cfg(any(ruby_lte_2_4, ruby_gt_3_2))]
let ret = crate::rb_cBasicObject != 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does ruby rb_cBasicObject = 0? I only see it getting set in Init_class_hierarchy. Hopefully your upstream PR will make this unnecessary anyways.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this doesn't actually inform us about a shutdown VM, only if it has started or not. Half-measure.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, Ruby doensn't explicitly set it to 0 anywhere, instead we are relying on the fact that the default initialization value of a static variable is zero in C

@ianks ianks requested a review from dylanahsmith May 4, 2023 18:43
@ianks ianks merged commit b536c1e into main May 6, 2023
@ianks ianks deleted the allocator-check-vm branch May 6, 2023 02:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants