Add save_trace and load_trace #2975
THANK YOU! This will solve so many pickling issues! |
Just so I understand, what are the pickle issues? Incompatibility between Python versions and security concerns? |
This looks good to me and ready to merge. Any objections, @ColCarroll? |
LGTM
Great stuff! |
Oh, I forgot, we should add this to the release-notes, and also add some example docs somewhere. |
A separate PR for that would work. I'll have a stab at adding some docs on
this this afternoon.
|
I will add release notes, and I realized that there's an edge case I didn't cover that requires deleting the directory before writing to it (if you save a model with variables |
Does anyone have sample code for predicting with this trace-loading functionality? I have a Gaussian model: My objective is to predict f(x) for a new x in a new Python session without rerunning the model training step... |
I might need more detail on what you're trying to do. Here's an example, though:
First, generate a random model:
import os
import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm  # required for the pm.* calls below
import theano
import theano.tensor as tt

dims = 2
N = 100
true_weights = np.random.normal(size=(dims,))
data = np.random.normal(size=(N, dims))
noise = np.random.normal(0, 0.5, size=N)
y = np.dot(data, true_weights) + noise
print(true_weights)
Now do a cached prediction -- running this multiple times will work, even changing the predict_data.
cache_file = 'my_trace.trace'
# A shared variable lets the inputs be swapped out after the model is built
s_data = theano.shared(data)
with pm.Model() as model:
    weights = pm.Normal('weights', mu=0, sd=1, shape=dims)
    y_obs = pm.Normal('y_obs', mu=tt.dot(s_data, weights), sd=0.5, observed=y, shape=s_data.shape[0].eval())

# Sample only when no saved trace exists; otherwise reload it from disk
if not os.path.exists(cache_file):
    with model:
        trace = pm.sample()
    pm.save_trace(trace, directory=cache_file)
else:
    trace = pm.load_trace(cache_file, model=model)

predict_data = np.array([
    [0, 1],
    [1, 0],
    [1, 1],
    [2, 2],
])
# Point the model at the new inputs before drawing posterior predictive samples
s_data.set_value(predict_data)
with model:
    ppc = pm.sample_ppc(trace)
print(trace['weights'].mean(axis=0))  # pretty close to true weights
print(ppc['y_obs'].mean(axis=0))  # should be reasonable |
Thanks very much. Here is what I'm trying to do....
*Defining Priors:*
with pm.Model() as gp_fit:
    ρ = pm.Gamma('ρ', 1, 2)
    η = pm.Gamma('η', 1, 2)
    K = η * pm.gp.cov.ExpQuad(2, ρ)
with gp_fit:
    M = pm.gp.mean.Zero()
    σ = pm.HalfCauchy('σ', 0.5)
*Initial Pseudo Points:*
k_m_point = 20
Xu_init = pm.gp.util.kmeans_inducing_points(k_m_point, Xs)
*Sparse Gaussian Optimization:*
with gp_fit:
    gp = pm.gp.MarginalSparse(cov_func=K, approx="VFE")
    #Xu=Xu_init
    Xu = pm.Flat("Xu", shape=(20, 2), testval=Xu_init)
    y_ = gp.marginal_likelihood("y_", X=Xs, Xu=Xu, y=y, noise=σ)
    mp = pm.find_MAP()
    trace = pm.sample(500, n_init=1000)
*Like to Save the fitted model at this stage....*
*Prediction:*
mu_dev, var_dev = gp.predict(X_new, point=mp, diag=True)
I would like to do this prediction without the preceding steps by loading
the model/trace... What all do I need to save? Do I need to also load the
pseudo points at the time of inference?
|
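For the question above about what to save, here is a minimal sketch of one possible piece of the answer. The prediction line in the snippet passes `point=mp`, so the MAP point (a dict mapping variable names to arrays, which would include the `Xu` pseudo points since they are a model variable) is one thing that would need to survive between sessions. The helper names `save_point` and `load_point` and the file name `map_point.npz` are hypothetical, and the values in `mp` are made-up stand-ins. The sketch also assumes the model and `gp` object are rebuilt in the new session before calling `gp.predict`, which is not shown here.

```python
import os
import tempfile

import numpy as np


def save_point(point, path):
    """Store a dict of name -> array in a single .npz archive (no pickle)."""
    np.savez(path, **{name: np.asarray(value) for name, value in point.items()})


def load_point(path):
    """Read the archive back into a plain dict of name -> array."""
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}


# Stand-in for the dict returned by pm.find_MAP(); names and values are assumptions.
mp = {"rho": np.array(0.7), "eta": np.array(1.3), "Xu": np.zeros((20, 2))}

path = os.path.join(tempfile.mkdtemp(), "map_point.npz")
save_point(mp, path)
restored = load_point(path)
```

In a new session, `restored` would presumably take the place of `mp` in the `gp.predict(X_new, point=mp, diag=True)` call once the model has been redefined.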
This provides functions to save and load traces, avoiding pickle. My main use would be saving traces while running a large notebook, or distributing the traces with code containing the models used to produce them.
Pros:
Cons:
trace.report yet (though that could be added without breaking compatibility)
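To illustrate the general idea of storing samples in a directory rather than in a pickle, here is a small standard-library sketch. It is not the on-disk format used by `save_trace`; the helpers `save_samples` and `load_samples`, the JSON layout, and the `overwrite` flag are all assumptions made for illustration. It also shows one plausible reading of the edge case mentioned in the thread: clearing an existing directory before writing, so that files from an earlier save cannot linger.

```python
import json
import os
import shutil
import tempfile


def save_samples(samples, directory, overwrite=False):
    """Write each variable's draws to its own JSON file inside `directory`."""
    if os.path.isdir(directory):
        if not overwrite:
            raise OSError("{!r} already exists; pass overwrite=True".format(directory))
        # Remove the old directory so stale variable files are not mixed in.
        shutil.rmtree(directory)
    os.makedirs(directory)
    for name, draws in samples.items():
        with open(os.path.join(directory, name + ".json"), "w") as handle:
            json.dump(draws, handle)
    return directory


def load_samples(directory):
    """Rebuild the name -> draws dict from the JSON files in `directory`."""
    samples = {}
    for filename in sorted(os.listdir(directory)):
        name, ext = os.path.splitext(filename)
        if ext == ".json":
            with open(os.path.join(directory, filename)) as handle:
                samples[name] = json.load(handle)
    return samples


cache_dir = os.path.join(tempfile.mkdtemp(), "my_trace.trace")
# First save: two variables (the numbers are made up).
save_samples({"weights": [[0.1, 0.2], [0.3, 0.4]], "sigma": [1.0, 1.1]}, cache_dir)
# Second save with fewer variables replaces the directory contents entirely.
save_samples({"weights": [[0.5, 0.6]]}, cache_dir, overwrite=True)
restored = load_samples(cache_dir)
```

Because the files are plain JSON, loading them does not execute arbitrary code and does not depend on the Python version that wrote them, which are the two pickle concerns raised earlier in the conversation.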