From: "ioquatix (Samuel Williams)"
Date: 2022-01-09T20:19:50+00:00
Subject: [ruby-core:107019] [Ruby master Bug#18465] Make `IO#write` atomic.

Issue #18465 has been updated by ioquatix (Samuel Williams).

> I think it'd better to guarantee atomicity for puts and write, even if the same fd is used by multiple IO instances, and even if the same fd is used by multiple processes.

The current implementation before and after this PR makes no such guarantee unfortunately. The best you can do as a user is buffer your own string and call write with that as an argument to get any kind of atomic behaviour, but that only applies to non-sync IOs.

> The write lock does not address both of these cases and it's then leaking implementation details (e.g., other Rubies might not have a write lock).

The write lock internally protects `sync=true` IO which has an internal per-IO buffer. All I've done in my PR is perform the lock once per operation rather than once per system call, so I've moved the lock "up" a little bit to reduce the number of times it would be invoked and increase the amount of synchronisation so that IOs can't interrupt each other.

> This is what TruffleRuby does, and it's fully portable. https://github.com/oracle/truffleruby/blob/bd36e75003a1f2d57dbc947350cb076e9a827cbd/src/main/ruby/truffleruby/core/io.rb#L2375-L2394

How is sync handled? I don't see the internal buffer is used in `def write`.

> Ah and in either case the IO scheduler should either receive the already-concatenated string, or all strings to write together in a single hook call. Calling the write hook for each argument is of course broken.

The interface for IO#write has to deal with both buffered and un-buffered operations and so we hooked into the internal read and write functions of IO. The best we can hope for is to tidy up how they are used. We don't provide a strong guarantee between one call to `IO#write` corresponding to one call to `Scheduler#io_write`... while it's true in most cases, it's not true in all cases. This is really just an artefact of the complexity of the current implementation in `io.c`.

----------------------------------------
Bug #18465: Make `IO#write` atomic.
https://bugs.ruby-lang.org/issues/18465#change-95853

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
Right now, `IO#write` including everything that calls it including `IO#puts`, has a poorly specified behaviour w.r.t. other fibers/threads that call `IO#write` at the same time. Internally, we have a write lock, however it's only used to lock against individual writes rather than the whole operation. From a user point of view, there is some kind of atomicity, but it's not clearly defined and depends on many factors, e.g. whether `write` or `writev` is used internally.

We propose to make `IO#write` an atomic operation, that is, `IO#write` on a synchronous/buffered IO will always perform the write operation using a lock around the entire operation.

In theory, this should actually be more efficient than the current approach which may acquire and release the lock several times per operation, however in practice I'm sure it's almost unnoticeable. Where it does matter, is when interleaved operations invoke the fiber scheduler. By using a single lock around the entire operation, rather than one or more locks around the system calls, the entire operation is more predictable and behaves more robustly.
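
To illustrate the difference between locking each individual write and locking the whole operation, here is a minimal user-level sketch. It is not the code in `io.c` or in the PR: `LockedWriter` and `interleaving_demo` are hypothetical names invented for this example, and a `StringIO` stands in for a real file descriptor. The wrapper takes one `Mutex` for the entire multi-argument write, so the pieces of one call cannot be interleaved with pieces from another thread.

```ruby
require "stringio"

# Hypothetical wrapper (not part of Ruby's IO API): one lock is held for the
# whole multi-argument write, rather than once per underlying write call.
class LockedWriter
  def initialize(io)
    @io = io
    @lock = Mutex.new
  end

  # Writes every argument while holding the lock once.
  # Returns the total number of bytes written.
  def write(*strings)
    @lock.synchronize do
      strings.sum { |string| @io.write(string) }
    end
  end
end

# Several threads write three-part lines through the same wrapper.
# Returns the resulting lines so callers can check that none were split.
def interleaving_demo(thread_count: 4, lines_per_thread: 50)
  sink = StringIO.new
  writer = LockedWriter.new(sink)

  threads = thread_count.times.map do |id|
    Thread.new do
      lines_per_thread.times do |n|
        writer.write("thread #{id} ", "line #{n}", "\n")
      end
    end
  end
  threads.each(&:join)

  sink.string.lines
end
```

Calling `interleaving_demo` should yield only complete lines of the form `thread N line M`, in some thread-dependent order. Without such a wrapper, the user-level workaround mentioned in the reply above is to concatenate first and pass a single string, for example `io.write(parts.join)`.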