From: Eric Wong Date: 2018-07-24T04:35:47+00:00 Subject: [ruby-core:88067] Re: [Feature #5446] identify execution context to avoid inadvertant sharing ``` tenderlove@ruby-lang.org wrote: > normalperson (Eric Wong) wrote: > > eregontp@gmail.com wrote: > > > I want to help protect future programmers from such bugs, if at all possible. > > > And I believe it's possible. > > > > > > > I agree with Jeremy on this; it will likely cause new problems > > > > and surprises if libraries use it. > > > > > > Let's design it so it doesn't. > > > What's the harm/surprise to reconnect in at_fork(:after_child) for instance? > > > > It's a waste of time and resources when the child has no need for > > a connection at all. Simply put, library authors have no clue > > how an application will use/manage processes and would > > (rightfully) not bother using such hooks. > > I think library authors can make things easier though. Web > frameworks, like Rails for example, are expected to handle > this situation for the user. In addition, say a library > author provided no such feature like Sequel, how would a user > know they need to call `DB.disconnect` after a fork? Are they > responsible for completely understanding the implementation of > the library they are using? Even if an end user called > `DB.disconnect` in an after fork hook, what if that wasn't > enough? How would an end user know what needs to be called? I don't know how to deal with unknowledgeable users aside from making them knowledgeable about the tools they use. However, the issue also relates to allowing things to become thread-safe and fiber-safe at the moment, and in the future: Guild-safe. So maybe ko1 can give his thoughts on this topic since he is working on Guilds > > Also, consider that pthread_atfork has been around for many > > years, it's not adopted by library authors (of C/C++ libraries) > > because of problems surrounding it; and POSIX is even > > considering deprecating pthread_atfork[1]. > > > > How about an alternate proposal? > > > > Introduce a new object_id-like identifier which changes > > across fork: Thread.current.thread_id > > > > It doesn't penalize platforms without fork, and can work well > > with existing thread-aware code. > > I think this is a good idea, but I'm not sure it addresses the > communication issue I brought up. IMO it would be great to > have some sort of hook so that library authors can dictate > what "the right thing to do" is after a fork (maybe there are > other resources or caches that need to be cleaned, and maybe > that changes from version to version). Library authors also do not know if the child process will touch their objects/code; so making such assumptions can be costly when child processes with different goals are used. I think the only thing low-level library authors (e.g. authors of libpq/sqlite3/libcurl/etc...) can do is document how to deal with such issues when interacting with threads and processes. Maybe Sequel and AR can have add optional PID checks ($$ != expected_pid), which is totally generic and usable in bunch of cases. One thing I have been annoyed by is END/at_exit hooks firing unexpectedly in child processes. I work around them with the PID check. So I want to avoid having the same annoyance from at_exit by avoiding atfork hooks. > > IMHO, Thread.current.object_id being stable in forked child > > isn't good; but I expect compatibility problems if we change it > > at this point. At least some usages of monitor.rb would > > break. > > > > > The current hooks are webserver-specific and so migrating > > > between unicorn/puma/passenger/etc means it's quite easy to > > > forget to adapt to the new webserver hook, which would trigger > > > this bug. > > > > I hate the amount of vendor lock-in each webserver has. > > But making hooks which library authors can fire unpredictably > > on application authors is worse, especially if there's no > > "opt-out". > > I think requiring users to specify a db disconnect after fork > causes even more "vendor lock-in". Lets say I did add the > after fork code to deal with Sequel, but now I want to switch > to a threaded webserver. Now I have to do more work to figure > out what's required (if anything) in a threaded environment. > It puts the onus on the app dev to figure out what's right for > a particular environment, and that means it's harder to > change: locking you in by making more work. AFAIK, Sequel and AR already provides out-of-the-box thread-safe behavior. Perhaps Sequel and AR should also have out-of-the-box thread-AND-fork-safe connection pool which checks the PID. I don't expect PID check cost to be expensive, but they can make that an option for users and it'll work on all forking servers and maybe Guild servers in the future. People who don't use forking can use old thread-safe-only or single-threaded-only code. The key thing I want to avoid from having atfork hooks is: DB.disconnect pid = fork do # something which never touches DB exit!(0) # avoid at_exit hook end DB.reconnect It's needlessly expensive in the parent process to pay the disconnect/reconnect cost when forking a child which never touches the DB. And there's already a gotcha for avoiding at_exit hooks. > Additionally, forking servers all have to provide this type of > hook anyway (Unicorn, Resque, Puma, to name a few) but today > they have to specify their own API. I think it would be great > if we had a "Rack for fork hooks", if that makes sense. :) I don't think limiting this to fork is necessary; forking may become obsolete with Guilds. ``` Unsubscribe: