[#103680] [Ruby master Bug#17843] Ruby on Rails error[BUG] Segmentation fault at 0x0000000000000110 ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-darwin15] (#42110) — nayaronfire@...

Issue #17843 has been reported by nayaronfire (kk nayar).

7 messages 2021/05/01

[#103686] [Ruby master Misc#17845] Windows Ruby - ucrt build? — Greg.mpls@...

Issue #17845 has been reported by MSP-Greg (Greg L).

22 messages 2021/05/01

[#103690] [Ruby master Bug#17846] Percent mode changes the output from ERB beyond what is documented — wolf@...

Issue #17846 has been reported by graywolf (Gray Wolf).

8 messages 2021/05/02

[#103724] [Ruby master Feature#17849] Fix Timeout.timeout so that it can be used in threaded Web servers — duerst@...

Issue #17849 has been reported by duerst (Martin D=FCrst).

22 messages 2021/05/05

[#103756] [Ruby master Feature#17853] Add Thread#thread_id — komamitsu@...

Issue #17853 has been reported by komamitsu (Mitsunori Komatsu).

11 messages 2021/05/06

[#103801] [Ruby master Feature#17859] Start IRB when running just `ruby` — deivid.rodriguez@...

Issue #17859 has been reported by deivid (David Rodr=EDguez).

18 messages 2021/05/12

[#103866] [Ruby master Bug#17866] Incompatible changes with Psych 4.0.0 — hsbt@...

Issue #17866 has been reported by hsbt (Hiroshi SHIBATA).

13 messages 2021/05/17

[#103892] [Ruby master Bug#17871] TestGCCompact#test_ast_compacts test failing again — jaruga@...

Issue #17871 has been reported by jaruga (Jun Aruga).

11 messages 2021/05/19

[#103912] [Ruby master Bug#17873] Update of default gems in Ruby 3.1 — hsbt@...

Issue #17873 has been reported by hsbt (Hiroshi SHIBATA).

38 messages 2021/05/20

[#103971] [Ruby master Bug#17880] [BUG] We are killing the stack canary set by `opt_setinlinecache` — jean.boussier@...

Issue #17880 has been reported by byroot (Jean Boussier).

8 messages 2021/05/22

[#103974] [Ruby master Feature#17881] Add a Module#const_added callback — jean.boussier@...

Issue #17881 has been reported by byroot (Jean Boussier).

29 messages 2021/05/22

[#104004] [Ruby master Feature#17883] Load bundler/setup earlier to make `bundle exec ruby -r` respect Gemfile — mame@...

Issue #17883 has been reported by mame (Yusuke Endoh).

21 messages 2021/05/24

[#104109] [Ruby master Feature#17930] Add column information into error backtrace — mame@...

Issue #17930 has been reported by mame (Yusuke Endoh).

34 messages 2021/05/31

[ruby-core:103697] [Ruby master Feature#17795] Around `Process.fork` callbacks API

From: eregontp@...
Date: 2021-05-03 11:34:04 UTC
List: ruby-core #103697
Issue #17795 has been updated by Eregon (Benoit Daloze).


Another use-case I found: https://github.com/rubyjs/mini_racer#fork-safety
That gem (and others of course) could provide some helpers methods based on the chosen strategy if there is a common hook.

Right now providing fork safety (if it has any resource that needs handling across fork) is basically impossible for a gem, which seems a severe limitation.
And we all know that if something is not fork-safe and fork is used it's often leading to data corruption.

----------------------------------------
Feature #17795: Around `Process.fork` callbacks API
https://bugs.ruby-lang.org/issues/17795#change-91789

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
Replaces: https://bugs.ruby-lang.org/issues/5446

### Context

Ruby code in production is very often running in a forking setup (puma, unicorn, etc), and it is common some types of libraries to need to know when the Ruby process was forked. For instance:

  - Most database clients, ORMs or other libraries keeping a connection pool might need to close connections before the fork happens.
  - Libraries relying on some kind of dispatcher thread might need to restart the thread in the forked children, and clear any internal buffer (e.g. statsd clients, newrelic_rpm).

**This need is only for forking the whole ruby process, extensions doing a `fork(2) + exec(2)` combo etc are not a concern, this aim at only catching `kernel.fork`, `Process.fork` and maybe `Process.daemon`.**.
The use case is for forks that end up executing Ruby code.

### Current solutions

Right now this use case is handled in several ways.

#### Rely on the integrating code to call a `before_fork` or `after_fork` callback.

Some libraries simply rely on documentation and require the user to use the hooks provided by their forking server.

Examples:

  - Sequel: http://sequel.jeremyevans.net/rdoc/files/doc/fork_safety_rdoc.html
  - Rails's Active Record: https://devcenter.heroku.com/articles/concurrency-and-database-connections#multi-process-servers
  - ScoutAPM (it tries to detect popular forking setup and register itself): https://github.com/scoutapp/scout_apm_ruby/blob/fa83793b9e8d2f9a32c920f59b57d7f198f466b8/lib/scout_apm/environment.rb#L142-L146
  - NewRelic RPM (similarly tries to register to popular forking setups): https://www.rubydoc.info/github/newrelic/rpm/NewRelic%2FAgent:after_fork


#### Continuously check `Process.pid`

Some libraries chose to instead keep the process PID in a variable, and to regularly compare it to `Process.pid` to detect forked children.
Unfortunately `Process.pid` is relatively slow on Linux, and these checks tend to be in tight loops, so it's not uncommon when using these libraries
to spend `1` or `2%` of runtime in `Process.pid`.

Examples:

  - Rails's Active Record used to check `Process.pid` https://github.com/Shopify/rails/blob/411ccbdab2608c62aabdb320d52cb02d446bb39c/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L946, it still does but a bit less: https://github.com/rails/rails/pull/41850
  - the `Typhoeus` HTTP client: https://github.com/typhoeus/typhoeus/blob/a345545e5e4ac0522b883fe0cf19e5e2e807b4b0/lib/typhoeus/pool.rb#L34-L42
  - Redis client: https://github.com/redis/redis-rb/blob/6542934f01b9c390ee450bd372209a04bc3a239b/lib/redis/client.rb#L384
  - Some older versions of NewRelic RPM: https://github.com/opendna/scoreranking-api/blob/8fba96d23b4d3e6b64f625079c184f3a292bbc12/vendor/gems/ruby/1.9.1/gems/newrelic_rpm-3.7.3.204/lib/new_relic/agent/harvester.rb#L39-L41

#### Continuously check `Thread#alive?`

Similar to checking `Process.pid`, but for the background thread use case. `Thread#alive?` is regularly checked, and if the thread is dead, it is assumed that the process was forked.
It's much less costly than a `Process.pid`, but also a bit less reliable as the thread could have died for other reasons. It also delays re-creating the thread to the next check rather than immediately upon forking.

Examples:

  - `statsd-instrument`: https://github.com/Shopify/statsd-instrument/blob/0445cca46e29aa48e9f1efec7c72352aff7ec931/lib/statsd/instrument/batched_udp_sink.rb#L63

#### Decorate `Kernel.fork` and `Process.fork`

Another solution is to prepend a module in `Process` and `Kernel`, to decorate the fork method and implement your own callback. It works well, but is made difficult by `Kernel.fork`.


Examples:

  - Active Support: https://github.com/rails/rails/blob/9aed3dcdfea6b64c18035f8e2622c474ba499846/activesupport/lib/active_support/fork_tracker.rb
  - `dd-trace-rb`: https://github.com/DataDog/dd-trace-rb/blob/793946146b4709289cfd459f3b68e8227a9f5fa7/lib/ddtrace/profiling/ext/forking.rb
  - To some extent, `nakayoshi_fork` decorates the `fork` method: https://github.com/ko1/nakayoshi_fork/blob/19ef5efc51e0ae51d7f5f37a0b785309bf16e97f/lib/nakayoshi_fork.rb

### Proposals

I see two possible features to improve this situation:

####  Fork callbacks

One solution would be for Ruby to expose a callback API for these two events, similar to `Kernel.at_exit`.

Most implementations of this functionnality in other languages ([C's `pthread_atfork`](https://man7.org/linux/man-pages/man3/pthread_atfork.3.html), [Python's `os.register_at_fork`](https://docs.python.org/3/library/os.html#os.register_at_fork)) expose 3 callbacks:

  - `prepare` or `before` executed in the parent process before the `fork(2)`
  - `parent` or `after_in_parent` executed in the parent process after the `fork(2)`
  - `child` or `after_in_child` executed in the child process after the `fork(2)`

A direct translation of such API in Ruby could look like `Process.at_fork(prepare: Proc, parent: Proc, child: Proc)` if inspired by `pthread_atfork`.

Or alternatively each callback could be exposed idependently: `Process.before_fork {}`, `Process.after_fork_parent {}`, `Process.after_fork_child {}`.

Also note that similar APIs don't expose any way to unregister callbacks, and expect users to use weak references or to not hold onto objects that should be garbage collected.

Pseudo code:

```ruby
module Process
  @prepare = []
  @parent = []
  @child = []

  def self.at_fork(prepare: nil, parent: nil, child: nil)
    @prepare.unshift(prepare) if prepare # prepare callbacks are executed in reverse registration order
    @parent << parent if parent
    @child << child if child
  end

  def self.fork
    @prepare.each(&:call)
    if pid = Primitive.fork
      @parent.each(&:call) # We could consider passing the pid here.
    else
      @child.each(&:call)
    end
  end
end

```

#### Make `Kernel.fork` a delegator

A simpler change would be to just make `Kernel.fork` a delegator to `Process.fork`. This would make it much easier to prepend a module on `Process` for each library to implement its own callback.

Proposed patch: https://github.com/ruby/ruby/pull/4361



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:[email protected]?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread