Generating TFRecords data with TensorFlow from the Dogs vs. Cats dataset

AchDream

License: CC 4.0 BY-SA

Categories: tensorflow, tutorials

Article link: https://blog.csdn.net/AchDream/article/details/80039884

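This article shows how to generate TFRecords files from the Kaggle Dogs vs. Cats dataset. The original dataset contains 25,000 images, which are split into a training set (23,000 images) and a test set (2,000 images). A Python script processes the images and converts them into a format TensorFlow can read.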

The program uses the train data set from Kaggle's Dogs vs. Cats competition to generate TFRecords data.

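The original dataset contains 25,000 images in total.
Only the original train set is used; it is split into two files, train.tfrecords and test.tfrecords (named train_227.tfrecords and test_227.tfrecords in the code).
train.tfrecords holds 23,000 images and test.tfrecords holds the other 2,000.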
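
The labelling and splitting described above can be sketched in plain Python. This is only an illustration, not the author's method: the helper names `label_from_filename` and `split_names` are hypothetical, and the fixed-seed shuffle is an addition. The script below simply takes files in whatever order `os.listdir` returns them. The sketch assumes Kaggle-style file names such as `cat.0.jpg` and `dog.0.jpg`.

```python
import random


def label_from_filename(img_name):
    """Map a Kaggle-style name such as 'cat.0.jpg' to 0 (cat) or 1 (dog)."""
    animal = img_name.split(".")[0]
    return 0 if animal == "cat" else 1


def split_names(img_names, n_train=23000, seed=0):
    """Shuffle a copy of the names with a fixed seed, then cut it in two.

    Hypothetical helper: the shuffle is an assumption added here so that
    both parts are likely to contain cats and dogs.
    """
    names = sorted(img_names)  # sort first so the result ignores input order
    random.Random(seed).shuffle(names)
    return names[:n_train], names[n_train:]
```

With the full train folder, `split_names(os.listdir("train"))` would return 23,000 names for training and the remaining 2,000 for testing, and the same seed reproduces the same split on every run.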

Code

import os
import tensorflow as tf
from PIL import Image


cwd = os.getcwd()  # Working directory of the current process.
classes = ["cat", "dog"]

def create_record():
    # tf.python_io is the TensorFlow 1.x API; in TensorFlow 2.x the writer is tf.io.TFRecordWriter.
    writer_train = tf.python_io.TFRecordWriter("train_227.tfrecords")
    writer_test = tf.python_io.TFRecordWriter("test_227.tfrecords")
    class_path = cwd + "/train/"
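    img_names = os.listdir(class_path)
    print(len(img_names))
    # Preview the first 20 file names. A separate loop variable is used so the
    # image counter below really starts from 0.
    for preview_name in img_names[:20]:
        print(preview_name)
    i = 0  # Number of images processed so far.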
    for img_name in img_names:
        i += 1
        animal = img_name.split(".")[0]
        if animal == "cat":
            index = 0
        else:
            index = 1
        img_path = class_path + img_name
        # Force 3-channel RGB so every record holds exactly 227*227*3 bytes.
        img = Image.open(img_path).convert("RGB")
        img = img.resize((227, 227))
        img_raw = img.tobytes()
        #print(index,img_name)
        if i<=23000:
            example_train = tf.train.Example(
                features=tf.train.Features(feature={
                    "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
                    "img_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
                })
            )
            writer_train.write(example_train.SerializeToString())
        else:

            example_test = tf.train.Example(
                features=tf.train.Features(feature={
                    "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
                    "img_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
                })
            )
            #print("start to write test dataset....")
            writer_test.write(example_test.SerializeToString())

    writer_train.close()
    writer_test.close()


if __name__ == "__main__":
    create_record()
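
To check the generated files, the records can be read back and decoded. The following is a minimal sketch that assumes TensorFlow 2.x with eager execution (the script above uses the 1.x `tf.python_io` API) and assumes every image was stored as 227x227 RGB `uint8` bytes. The names `IMAGE_SIZE`, `FEATURES`, `parse_example` and `load_dataset` are hypothetical and do not come from the original post.

```python
import tensorflow as tf

IMAGE_SIZE = 227  # must match the resize used when the records were written

# Schema mirroring the two features written by create_record().
FEATURES = {
    "label": tf.io.FixedLenFeature([], tf.int64),
    "img_raw": tf.io.FixedLenFeature([], tf.string),
}


def parse_example(serialized):
    """Decode one serialized tf.train.Example into an (image, label) pair."""
    parsed = tf.io.parse_single_example(serialized, FEATURES)
    # tobytes() stored raw uint8 pixels without any shape, so restore it here.
    image = tf.io.decode_raw(parsed["img_raw"], tf.uint8)
    image = tf.reshape(image, [IMAGE_SIZE, IMAGE_SIZE, 3])  # assumes RGB data
    return image, parsed["label"]


def load_dataset(path):
    """Return a tf.data.Dataset of (image, label) pairs from a TFRecords file."""
    return tf.data.TFRecordDataset(path).map(parse_example)
```

Iterating over `load_dataset("train_227.tfrecords").take(1)` should yield an image tensor of shape (227, 227, 3) and a label of 0 for a cat or 1 for a dog. The reshape fails if a record holds a different number of bytes, for example from a non-RGB image.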