How to debug the variable graph to find the source of a NaN?

I have a rather large PyMC model in v5.23.0. It samples the posterior fine, but when I sample the posterior predictive it throws an error complaining that a Binomial RV is seeing a NaN in the tensor for its p argument.

I know which Binomial RV is the offender, and in the debugger I can confirm that some of the values are NaNs.

But I cannot figure out how to debug further. Some ancestor of this node in the variable graph is creating the NaNs, but I cannot find it by code inspection.

Is there a way to step through the variable graph, one node at a time, to find the source of the NaNs?

PyTensor has a NanGuardMode (see the nanguardmode page in the PyTensor dev documentation) that you can try. Here is an example with sample_prior_predictive. I'm not sure whether it does the job for you.

from pytensor.compile.nanguardmode import NanGuardMode

import pymc as pm

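with pm.Model() as m:
    u = pm.Uniform("p")
    # du can be 0, so u / du can be Inf
    du = pm.Categorical("du", [.1, .3, .6])

    x = pm.NegativeBinomial("x", n=1000, p=u / du)
    # NanGuardMode checks the values going through each node while the function runs
    pm.sample_prior_predictive(compile_kwargs=dict(mode=NanGuardMode()))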
    
# AssertionError: Inf detected
# Big value detected
# NanGuardMode found an error in the output of a node in this variable:
# True_div [id A]
#  ├─ uniform_rv{"(),()->()"}.1 [id B] 'p'
#  │  ├─ RNG(<Generator(PCG64) at 0x7FF622D7ECE0>) [id C]
#  │  ├─ NoneConst{None} [id D]
#  │  ├─ 0 [id E]
#  │  └─ 1 [id F]
#  └─ categorical_rv{"(p)->()"}.1 [id G] 'du'
#     ├─ RNG(<Generator(PCG64) at 0x7FF6238D26C0>) [id H]
#     ├─ NoneConst{None} [id D]
#     └─ [0.1 0.3 0.6] [id I]
# ...
# Apply node that caused the error: True_div(p, du)
# Toposort index: 2
# Inputs types: [TensorType(float64, shape=()), TensorType(int64, shape=())]
# Inputs shapes: [(), ()]
# Inputs strides: [(), ()]
# Inputs values: [array(0.97849175), array(0)]
# Outputs clients: [[negative_binomial_rv{"(),()->()"}(RNG(<Generator(PCG64) at 0x7FF624053920>), NoneConst{None}, 1000, True_div.0)]]
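
In case the idea behind the trace above isn't obvious, here is a toy sketch in plain Python of "evaluate the graph one node at a time and stop at the first bad output". It is not PyTensor's implementation. GRAPH, np_like_div and first_bad_node are made-up names for illustration, and the input values are just the ones from the trace.

```python
import math


def np_like_div(a, b):
    # Hypothetical helper: plain Python raises ZeroDivisionError on a / 0,
    # so mimic the Inf that array division gives for a nonzero numerator.
    return a / b if b != 0 else math.copysign(math.inf, a)


# Toy graph: (node name, function, names of the input nodes), listed in
# topological order so every input is computed before it is used.
GRAPH = [
    ("p", lambda: 0.97849175, ()),
    ("du", lambda: 0.0, ()),
    ("ratio", np_like_div, ("p", "du")),         # Inf first appears here
    ("scaled", lambda r: r * 0.0, ("ratio",)),   # Inf * 0 -> NaN downstream
]


def first_bad_node(graph):
    """Evaluate nodes in order; return the first one whose output is NaN or Inf."""
    values = {}
    for name, fn, inputs in graph:
        args = [values[i] for i in inputs]
        out = fn(*args)
        values[name] = out
        if not math.isfinite(out):
            # Report the node, the inputs it saw, and the offending output.
            return name, dict(zip(inputs, args)), out
    return None


bad = first_bad_node(GRAPH)
print(bad)  # ('ratio', {'p': 0.97849175, 'du': 0.0}, inf)
```

The NaN only shows up in "scaled", but the check points at "ratio", the earliest node that produced a non-finite value. That is the same kind of answer the NanGuardMode trace gives with True_div(p, du).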

Thanks. Just what I needed.