-
-
Notifications
You must be signed in to change notification settings - Fork 219
Description
This is based on an issue found by @bhive01 and discussed with him
and @clauswilke in wilkelab/gridtext#9.
Running the code included at the end in R versions up through 4.0.0
with the C stack set to 1M will segfault with a C stack overflow. The
same happens (for me on Ubuntu at least) with the default 8M stack if
you change the value of N to 1000.
Running this code as is (i.e. N == 500) and with an 8M stack will not
fail, but takes 14 minutes due to several multi-minute-long pauses in
the second loop.
The problem is that this code, via calls to R_PreserveObject
in
Rcpp, puts around 300K objects into the preserved object list. This is
a simple linked list, so has to be searched linearly to remove objects
pushed on early. That is the reason for the long pauses. The segfault
happens because the code for R_ReleaseObject
was using recursion to
work down the list. R-devel and R-patched now use iteration and no
longer segfault, but that doesn't help anyone running R 4.0.0 or
earlier.
The R Extensions Manual says about the
R_PreserveObject
/R_ReleaseObject
mechanism:
It is less efficient than the normal protection mechanism, and
should be used sparingly.
The linked list data structure, and even the recursive delete, are
reasonable for this usage but are not suited for handling 300K objects
that may need to be removed in random or first in/first-out order.
Assuming you need to keep alive R objects, a better way to do this is
to create an appropriate data structure that you control, and protect
only that data structure with R_PreserveObject
(and maybe release it
on unload with R_ReleseObject
). That way you can use a more
appropriate data structure, such as a hash table. Using a hash table
the code that currently takes 14 minutes would only take about 1.5
minutes. The improvement is even larger for N == 1000.
Even if you don't want to go to a hash table now it would be a good
idea to switch to using your own list so users of R 4.0.0 or earlier
don't run the risk of a segfault.
For using your own list you need something like this
static SEXP Rcpp_precious = NULL;
void R_init_Rcpp(DllInfo* dllinfo) {
Rcpp_precious = CONS(R_NilValue, R_NilValue);
R_PreserveObject(Rcpp_precious);
...
}
Then replace R_PreserveObject
and R_ReleaseObject
with
void Rcpp_PreserveObject(SEXP object)
{
SETCDR(Rcpp_precious, CONS(object, CDR(Rcpp_precious)));
}
void Rcpp_ReleaseObject(SEXP object)
{
SETCDR(Rcpp_precious, DeleteFromList(object, CDR(Rcpp_precious)));
}
where (this is now in R_devel/R_patched)
static SEXP DeleteFromList(SEXP object, SEXP list)
{
if (CAR(list) == object)
return CDR(list);
else {
SEXP last = list;
for (SEXP head = CDR(list); head != R_NilValue; head = CDR(head)) {
if (CAR(head) == object) {
SETCDR(last, CDR(head));
return list;
}
else last = head;
}
return list;
}
}
This maintains your own list, which is protected by being in the CDR
field of the protected Rcpp_precious
list cell. This will give you
the same reliability and performance across all R versions. [Code is
not tested, so there may be typos, but that is design I recommend].
If you want better performance for a use pattern as exhibited in the
example here you could look at using a hash table stored in a VECSXP
placed in one of the cells of Rcpp_precious
. In a quick and dirty
experiment a simple hash table did much better, but the code is of
course more complicated.
All that said, I have a feeling that it should be possible to do
better by using a combination of the protected fields in external
pointer objects and weak references. If you have a document describing
the design of your memory management I could take a look and see if I
can suggest something.
Here is the code to run, from @clauswilke in wilkelab/gridtext#9.
library(ggplot2)
library(ggtext)
library(grid)
plot_grob <- function(df) {
p <- ggplot(data.frame(x = 1, y = 1)) +
geom_point(aes(x, y)) +
labs(caption = "<br>Pink dots represent outliers and are removed from downstream analyses.<br>Error bars represent the standard error of the mean.<br>ANOVA model significance *p* < 0.05.<br>Treatments with the same letter are not significantly different at *α* = 0.05 according to Tukey's HSD. ") +
theme(plot.caption = element_textbox_simple())
ggplotGrob(p)
}
N <- 500
l <- list()
for (i in 1:N) {
cat(i, " ")
g <- plot_grob()
grid.newpage()
grid.draw(g)
l[[i]] <- g
}
l <- NULL
l <- list()
for (i in 1:N) {
cat(i, " ")
g <- plot_grob()
grid.newpage()
grid.draw(g)
l[[i]] <- g
}
Activity
eddelbuettel commentedon May 17, 2020
Thanks for the detailed post, and the worked suggestion using
Rcpp_precious
(love the name ;-) ) and a simple list. For disruptive changes I appreciate the minimalism. If this works (as expected) we could look into a hashed structure later.Now, is there by chance a truly minimally reproducible example that would not involve the ggplot2 stack, tibbles, and a number of other packages (including one not on CRAN) ?
ltierney commentedon May 17, 2020
eddelbuettel commentedon May 17, 2020
Use of
R_PreserveObject
(and the matching release) goes back 15 years to the very first C++ class wrappers aroundSEXP
objects. By using C++ facilities like the constructor and destructor we can generally ensure proper resource management (cf the RTTI idiom in C++ parlance).And we rather happily relied on this base R feature all those years.
But presumably all use cases up to now were only 'lower-case s' stress testing where @clauswilke and @bhive01 seem to have found a way to put a capital S there. Or, quite possibly, other use cases that may have seen failures were less determined to drill down. You all now did, so thanks for that.
Resource management in the context of a dynamically managed system can have its challenges. Relying on this mechanism worked really well so far. Let's see if you can turn it up one notch. Looking at the code now, requires some tweaks.
ltierney commentedon May 17, 2020
kevinushey commentedon May 17, 2020
This would probably function as a reproducible example (depending on stack size +
n
). Save this to a file and then callRcpp::sourceCpp("file.cpp")
:(Note: reproducibility may depend on whether your C++ standard library implementation destructs elements from first-to-last, or last-to-first. AFAIK this isn't explicitly mandated by the standard)
The problem occurs because of order of destruction for vector elements doesn't match the order in which they were created, leading to this deep recursion into the precious list.
Ultimately, the goal of the Rcpp constructors and destructors is to protect the associated object from the GC for the lifetime of that object, and unprotect it once collection is safe.
I think using a hash table is worth considering just because it would insulate us from other issues (e.g. performance) related to order-of-destruction issues.
eddelbuettel commentedon May 17, 2020
Correct, and we were on the same calling path.
eddelbuettel commentedon May 17, 2020
While the suggested design can be added to the package fairly easily, it does not account for use from
sourceCpp()
et al so this will need a refinement. Not as quick a change as I had hoped.clauswilke commentedon May 17, 2020
I have been trying to create a simpler reprex, but I'm not sure I've been successful. The code below attempts to simulate what I think is causing the issues in the ggplot2 example, and it crashes my R session if I run it twice. But the error is
segfault: 'memory not mapped'
, which seems different from stack overflow. Posting it here in case it's helpful.eddelbuettel commentedon May 17, 2020
(But that is slightly different.
XPtr
means "user knows what he is doing, we leave it alone". When you as a user pick one and allocate with vianew
, you are also either supposed todelete
(or ensure it is done for you). Plus we're now entering a whole new element to the story by adding an STL vector too.)clauswilke commentedon May 17, 2020
Ah, Ok. I'm the user here, but I may not know what I'm doing. :-) The gridtext package definitely wraps things into
XPtr
s. The STL vector may not be needed for the example. I'll try to refine the reprex.clauswilke commentedon May 17, 2020
@eddelbuettel Could you provide a minimal correct example using
XPtr
? How does the C++ code know when the R object holding the pointer goes out of scope? I always assumeddelete
is called automatically when the last reference to the pointer disappears.eddelbuettel commentedon May 17, 2020
Sorry, I need to rephrase that. The XPtr generally has a default finaliser too so I stated that poorly, my bad. I use them a lot myself when I also rely on a default cleanup. It's just that maybe in this example ensuring a cleanup at end of scope may help. To be seen.
clauswilke commentedon May 17, 2020
In my reprex, you could just replace
std::vector<Rcpp::String> s;
withRcpp::StringVector s;
to avoid the STL. It just gets very slow, and I'm not sure it triggers the stack overflow either.eddelbuettel commentedon May 17, 2020
Simplifying is good. And apologies again for the comment two up---I also rely on the default finalizer (see
inst/include/Rcpp/XPtr.h
but that is of course gnarly code -- key isstandard_delete_finalizer()
). In your example, though. you write to the same long list twice overwriting elements. Methinks those are then lost and never cleaned so that looks like a code smell to me.Other (standard) XPtr use (I do a ton myself) is to ... do some work, set up some data structures or pointers to persist ... do some other stuff ... have a code block relying on the XPtr allocated objects ... some other things ... and then end-of-scope (end of subroutine say) to clean things up. That works of course as intended.
"It just gets very slow": Yes, we always say that R objects we wrap should not use
push_back()
because by the semantics of R vectors each of those is a full copy. So to that extend you were right in that code piece using a STL vector. Anyway, maybe we can go back the example by @kevinushey .eddelbuettel commentedon May 17, 2020
Ok, needed to take a break, clear my head, work on something else ... to then come back and now have the (basic idea and solution of) Luke's proposal implemented (in a slightly different way because of the way Rcpp is called etc pp). Still passes unit tests, but (once I installed package
ggtext
) it does not appear to run the example above any better: it still slows to halt in loop and then spends minutes sitting there. (R 4.0.0, Ubuntu 19.10, everything current from CRAN).45 remaining items