A Simple TensorFlow Linear-Fitting Class and a tf.get_variable() Usage Example

License: CC 4.0 BY-SA

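This post shows how to share variables in TensorFlow and builds a simple linear-fitting class on top of that mechanism. A worked example demonstrates how trained parameters can be shared between different model instances.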

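After finishing the second assignment of cs224d (assignment2), I felt that its model class was well encapsulated. To make it easier to reuse later, I simplified it into a small linear-fitting class. Along the way I explored how to share variables with tf.variable_scope() and tf.get_variable(). The full code is given at the end of this post.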

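While working on cs224d assignment2 I had to use variable_scope() and get_variable(), and found them very confusing. After reading a lot of online material and running experiments, I think the following blog post explains them fairly clearly: http://blog.csdn.net/Jerr__y/article/details/70809528. I will not repeat the details here.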

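Note that sharing variables through tf.get_variable() can be used to pass trained neural network parameters from one model instance to another. In q3_RNNLM of cs224d assignment2, the same class has to be instantiated twice: first to train the RNN model, and then to generate sentences with that RNN model. The two instances clearly need to share model parameters.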
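To make the sharing idea concrete, here is a toy model of the lookup semantics written in plain Python. It is not TensorFlow code and not how TensorFlow is implemented; the class ToyVariableStore and all of its methods are hypothetical names invented for this sketch. It only mimics the behaviour described above: a variable is stored under its scope-qualified name, and once reuse is switched on, asking for the same name returns the existing object instead of creating a new one.

```python
from contextlib import contextmanager


class ToyVariableStore:
    """Hypothetical stand-in for a variable registry (illustration only)."""

    def __init__(self):
        self._vars = {}     # scope-qualified name -> mutable cell
        self._scopes = []   # stack of currently open scope names
        self.reuse = False

    def full_name(self, name):
        return "/".join(self._scopes + [name])

    @contextmanager
    def variable_scope(self, name):
        previous_reuse = self.reuse
        self._scopes.append(name)
        try:
            yield self
        finally:
            self._scopes.pop()
            self.reuse = previous_reuse   # reuse only lasts until the scope closes

    def reuse_variables(self):
        self.reuse = True

    def get_variable(self, name, initial=0.0):
        key = self.full_name(name)
        if self.reuse:
            if key not in self._vars:
                raise ValueError("variable %s does not exist" % key)
            return self._vars[key]
        if key in self._vars:
            raise ValueError("variable %s already exists" % key)
        self._vars[key] = [initial]       # one-element list acts as a mutable cell
        return self._vars[key]


store = ToyVariableStore()
with store.variable_scope("LR"):
    with store.variable_scope("Layer"):
        w_train = store.get_variable("W")   # created as LR/Layer/W
    store.reuse_variables()
    with store.variable_scope("Layer"):
        w_test = store.get_variable("W")    # looked up, not created

w_train[0] = 2.0          # "training" writes into the shared cell
print(w_train is w_test)  # both handles refer to the same object
print(w_test[0])
```

In this toy, requesting an existing name without reuse raises an error rather than silently creating a second copy, which is the property that makes accidental duplication visible.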

The simplified linear-fitting class is used here to illustrate the approach briefly:

1. In the main program, instantiate the same class twice, as follows (note the use of tf.variable_scope() and scope.reuse_variables()):
    with tf.variable_scope('LR') as scope:
      model = linearReg(config)
      scope.reuse_variables()
      test_model = linearReg(config)
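Here the first instance, model, is used for training. The second instance, test_model, serves some other purpose but should use the parameters trained by model.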

2. Inside linearReg, define the model parameters as follows (note the use of tf.get_variable() rather than tf.Variable()):
def add_model(self):
    with tf.variable_scope('Layer'):
      self.W = tf.get_variable('W', [1,], initializer= tf.zeros_initializer())
      self.b = tf.get_variable('b', [1,], initializer= tf.zeros_initializer())
      output = self.W*self.x_placeholder + self.b
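W and b are stored as instance attributes (self.) here only so that they can be printed later to confirm that they are shared; in practice they could just as well be local variables inside the function. tf.get_variable() is required because tf.Variable() always creates a new variable, so the second instance would not pick up the first instance's parameters.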

With this setup, once model has been trained, W and b in test_model are exactly the same as in model. The example program prints the following:
======================================
Trained results, W = 2.000, b = 0.188
(Real value: W = 2.000, b = 0.200)
======================================
W in train model = W in test model ?
Yes!
b in train model = b in test model ?
Yes!
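As a sanity check on the trained values above, the same kind of synthetic data can be fitted in closed form with NumPy. The sketch below regenerates data with the slope, intercept, and noise level used by the class (2, 0.2, and 0.4); the fixed seed and the variable names are assumptions added here for repeatability, so the numbers will not match the printout above digit for digit. With 1000 noisy samples, an intercept estimate that differs from 0.2 by about 0.01, as in b = 0.188, is within ordinary sampling error.

```python
import numpy as np

# Assumption: a fixed seed, which the original program does not set.
rng = np.random.RandomState(0)

# Same data model as load_data(): x uniform in [-1, 1), y = 2x + 0.2 + noise.
x = rng.rand(1000) * 2 - 1
y = 2 * x + rng.randn(1000) * 0.4 + 0.2

# Ordinary least squares for a degree-1 polynomial returns (slope, intercept).
w_ls, b_ls = np.polyfit(x, y, deg=1)
print("least squares: W = %5.3f, b = %5.3f" % (w_ls, b_ls))
```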

See the full listing at the end of this post for details.

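As a side note, many of the cs224d assignment2 solutions that can be found online appear to target older TensorFlow versions and fail to run on newer ones (I use r1.3). Some of the failures come from API incompatibilities between old and new TensorFlow versions, but others come from the way variable_scope() is used. I made the corresponding changes so that the code runs correctly on TensorFlow r1.3. The modified cs224d assignment2 code has been uploaded to: http://download.csdn.net/download/foreseerwang/10274823 Downloads and discussion are welcome.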

The simplified linear-fitting class is as follows:

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

class Config(object):
  max_epochs = 100
  early_stopping = 5
  lr = 0.3

class linearReg(object):

  def load_data(self):

    self.X_train = np.random.rand(1000)*2-1
    self.y_train = 2 * self.X_train + np.random.randn(*self.X_train.shape) * 0.4 + 0.2

    self.X_dev = np.random.rand(100)*2-1
    self.y_dev = 2 * self.X_dev + np.random.randn(*self.X_dev.shape) * 0.4 + 0.2

    self.X_test = np.random.rand(100)*2-1
    self.y_test = 2 * self.X_test + np.random.randn(*self.X_test.shape) * 0.4 + 0.2

  def add_placeholders(self):
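    # shape=[None] declares a 1-D batch of any length
    # (the original shape=(None) is just None, i.e. a fully unknown shape).
    self.x_placeholder = tf.placeholder(tf.float32, shape=[None])
    self.y_placeholder = tf.placeholder(tf.float32, shape=[None])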

  def create_feed_dict(self, x_batch, y_batch=None):
    if y_batch is None:
      feed_dict = {
        self.x_placeholder: x_batch
      }
    else:
      feed_dict = {
        self.x_placeholder: x_batch,
        self.y_placeholder: y_batch
      }
    return feed_dict

  def add_model(self):
    with tf.variable_scope('Layer'):
      self.W = tf.get_variable('W', [1,], initializer= tf.zeros_initializer())
      self.b = tf.get_variable('b', [1,], initializer= tf.zeros_initializer())
      output = self.W*self.x_placeholder + self.b

    return output

  def add_loss_op(self, y):

    loss = tf.reduce_mean(tf.pow((y-self.y_placeholder), 2))

    return loss

  def add_training_op(self, loss):

    optimizer = tf.train.GradientDescentOptimizer(self.config.lr)
    train_op = optimizer.minimize(loss)

    return train_op

  def __init__(self, config):
    """Constructs the network using the helper functions defined above."""
    self.config = config
    self.load_data()
    self.add_placeholders()
    self.ypred = self.add_model()

    self.loss = self.add_loss_op(self.ypred)
    self.train_op = self.add_training_op(self.loss)

  def run_epoch(self, session, input_x, input_y, train_op=None):

    orig_X, orig_y = input_x, input_y

    if not train_op:
      train_op = tf.no_op()

    feed = self.create_feed_dict(x_batch=orig_X, y_batch=orig_y)
    loss, ypred, _ = session.run([self.loss, self.ypred, train_op], feed_dict=feed)

    return loss, ypred

def test_LR():

  config = Config()

  with tf.Graph().as_default():
    with tf.variable_scope('LR') as scope:
      model = linearReg(config)
      scope.reuse_variables()
      test_model = linearReg(config)

    # tf.initialize_all_variables() is deprecated; use its replacement.
    init = tf.global_variables_initializer()

    with tf.Session() as session:
      best_val_loss = float('inf')
      best_val_epoch = 0

      session.run(init)
      for epoch in range(config.max_epochs):  # range also works on Python 3, unlike xrange
        #print('Epoch {}'.format(epoch))
        train_loss, _ = model.run_epoch(session, model.X_train, model.y_train, model.train_op)
        val_loss, y_val_pred = model.run_epoch(session, model.X_dev, model.y_dev)

        #print('Training loss: {}'.format(train_loss))
        #print('Validation loss: {}'.format(val_loss))

        if val_loss < best_val_loss:
          best_val_loss = val_loss
          best_val_epoch = epoch

        if epoch - best_val_epoch > config.early_stopping:
          break

      print("======================================")
      # W and b have shape [1]; index element 0 so that %f receives a scalar.
      print("Trained results, W = %5.3f, b = %5.3f" %(session.run(model.W)[0], session.run(model.b)[0]))
      print("    (Real value: W = %5.3f, b = %5.3f)" %(2.0, 0.2))
      print("======================================")
      print("W in train model = W in test model ?")
      print("Yes!" if session.run(model.W)==session.run(test_model.W) else "No!")
      print("b in train model = b in test model ?")
      print("Yes!" if session.run(model.b)==session.run(test_model.b) else "No!")

  #plt.scatter(model.X_dev, model.y_dev)
  #plt.scatter(model.X_dev, y_val_pred)
  #plt.show()

if __name__ == "__main__":
  test_LR()
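For readers without a TensorFlow 1.x environment, the training loop above can be approximated in plain NumPy. The following is a minimal sketch, not the class itself: fit_linear_gd is a hypothetical helper written for this illustration, the gradients of the mean squared error are derived by hand rather than by the optimizer, and the fixed seed is an assumption. It keeps the same learning rate (0.3), epoch limit (100), and early-stopping patience (5) as Config, and performs one full-batch gradient step per epoch like run_epoch.

```python
import numpy as np


def fit_linear_gd(x, y, x_dev, y_dev, lr=0.3, max_epochs=100, early_stopping=5):
    """Full-batch gradient descent on mean squared error for y ~ w*x + b."""
    w, b = 0.0, 0.0                       # same zero initialisation as the class
    best_val_loss, best_val_epoch = float("inf"), 0
    epoch = 0
    for epoch in range(max_epochs):
        err = w * x + b - y
        # d/dw mean(err^2) = 2*mean(err*x), d/db mean(err^2) = 2*mean(err)
        w -= lr * 2.0 * np.mean(err * x)
        b -= lr * 2.0 * np.mean(err)

        val_loss = np.mean((w * x_dev + b - y_dev) ** 2)
        if val_loss < best_val_loss:
            best_val_loss, best_val_epoch = val_loss, epoch
        # Stop once validation loss has not improved for more than `early_stopping` epochs.
        if epoch - best_val_epoch > early_stopping:
            break
    return w, b, epoch


def make_data(rng, n):
    """Same data model as load_data(): y = 2x + 0.2 + Gaussian noise (std 0.4)."""
    x = rng.rand(n) * 2 - 1
    return x, 2 * x + rng.randn(n) * 0.4 + 0.2


rng = np.random.RandomState(0)            # assumption: fixed seed for repeatability
x_train, y_train = make_data(rng, 1000)
x_dev, y_dev = make_data(rng, 100)

w_gd, b_gd, last_epoch = fit_linear_gd(x_train, y_train, x_dev, y_dev)
print("gradient descent: W = %5.3f, b = %5.3f (stopped at epoch %d)" % (w_gd, b_gd, last_epoch))
```

Because early stopping is driven by the 100-sample validation set, the loop may stop before the parameters have fully converged to the training-set optimum, so the result is expected to be close to, but not identical to, a closed-form least-squares fit.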