Graduation Project: Formula Recognition with Deep Learning and OpenCV

License: CC 4.0 BY-SA

Article tags:

#python #graduation-project

Contents

0 Preface
1 Project Description
2 Results
3 Implementation
4 Key Code
5 Overall Algorithm Performance

0 Preface

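This post shares a high-quality graduation project. Today's project is:

🚩 An implementation of a deep-learning-based mathematical formula recognition algorithm

Project demo (video): formula recognition with deep learning, graduation project

🧿 Project sharing: see the end of the article!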

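1 Project Description

Compared with conventional OCR, handwritten mathematical formula recognition is a more complex two-dimensional handwriting recognition problem: the intricate two-dimensional spatial structure of a formula makes it hard to parse, and traditional methods perform poorly. Following the success of deep learning in many fields, end-to-end deep-learning algorithms for offline mathematical formula recognition were proposed. They improved markedly on traditional methods on public datasets and opened up a new framework for formula recognition. No comparable framework existed for online handwritten formula recognition, however, until the TAP paper, which presented the first end-to-end deep-learning model for online handwritten mathematical formula recognition, together with several optimizations tailored to the task.

Formula recognition is a very challenging task in OCR. The difficulty is that a formula is two-dimensional data, so it cannot be recognized with a conventional CRNN.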

Recommended for use as a graduation project.

2 Results

Here is a brief demonstration of the results.

[Figure: recognition results]

3 Implementation

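The neural network model is Seq2Seq + Attention + Beam Search. The Seq2Seq encoder is a CNN and the decoder is an LSTM. An attention layer sits between the encoder and the decoder: the encoder output is flattened on its way to the decoder, and attention is applied at that point. A visualization of the model follows.

[Figure: visualization of the model]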
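To make the flatten-then-attend step and the beam search decoding more concrete, here is a minimal NumPy sketch. It is not the project's code: the function names (`flatten_feature_map`, `attend`, `beam_search`) are hypothetical, the attention is plain dot-product attention chosen for brevity (the project's attention layer may well use a different scoring function), and `step_log_probs` is a stand-in for one step of the LSTM decoder.

```python
import numpy as np


def flatten_feature_map(feat):
    """Turn an (h, w, c) CNN feature map into an (h*w, c) sequence of region vectors."""
    h, w, c = feat.shape
    return feat.reshape(h * w, c)


def attend(regions, query):
    """Dot-product attention (a simplification assumed for this sketch).

    regions: (n, c) flattened encoder outputs; query: (c,) decoder state.
    Returns the context vector (c,) and the attention weights (n,), which sum to 1.
    """
    scores = regions @ query
    scores = scores - scores.max()            # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ regions               # weighted average of the regions
    return context, weights


def beam_search(step_log_probs, beam_size, max_len, eos):
    """Keep the beam_size best partial sequences at every decoding step.

    step_log_probs(prefix) is a hypothetical callback returning a 1-D array of
    log-probabilities for the next token given the token tuple `prefix`.
    Returns (best token tuple, its cumulative log-probability).
    """
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, lp in enumerate(step_log_probs(prefix)):
                candidates.append((prefix + (tok,), score + float(lp)))
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            # Sequences that emit the end token leave the beam.
            (finished if seq[-1] == eos else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)                    # unfinished hypotheses compete as-is
    return max(finished, key=lambda item: item[1])
```

With `beam_size=1` this reduces to greedy decoding; a larger beam can recover a sequence whose first token is not the single most likely one but whose overall probability is higher.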

4 Key Code
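The encoder listing in this section relies on two small facts stated in its comments: uint8 pixels are rescaled with (img - 128) / 128, and a SAME-padded layer produces an output of size ceil(input / stride). The following NumPy sketch checks both independently of TensorFlow; the helper names `normalize_uint8` and `same_padding_output_size` are hypothetical and exist only for this illustration.

```python
import math

import numpy as np


def normalize_uint8(img):
    """Map uint8 pixels in [0, 255] to floats via (img - 128) / 128.

    0 maps to -1.0 and 255 maps to 127/128, so the range is [-1, 1).
    """
    return (img.astype(np.float32) - 128.0) / 128.0


def same_padding_output_size(size, stride):
    """Spatial output size of a SAME-padded conv or pool: ceil(size / stride)."""
    return math.ceil(size / stride)
```

For example, a 3x3 convolution with stride 1 and SAME padding keeps the height and width unchanged, while a stride-2 pooling step halves them (rounding up for odd sizes).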

import tensorflow as tf  # TensorFlow 1.x API (tf.variable_scope, tf.layers)


class Encoder(object):
    """Class with a __call__ method that applies convolutions to an image"""
 
    def __init__(self, config):
        self._config = config
 
 
    def __call__(self, img, dropout):
        """Applies convolutions to the image
        Args:
            img: batch of images, shape = (?, height, width, channels), of type tf.uint8.
                Since 2^8 = 256, uint8 values lie in [0, 255]; they are mapped linearly
                to approximately [-1, 1] via img = (img - 128) / 128
        Returns:
            the encoded images, shape = (?, h', w', c')
        """
        with tf.variable_scope("Encoder"):
            img = tf.cast(img, tf.float32) - 128.
            img = img / 128.
 
            with tf.variable_scope("convolutional_encoder"):
                # conv + max pool -> /2
                # 64 filters of size 3x3, stride = (1, 1); with SAME padding,
                # output size = ceil(input / stride) = (H, W)
                out = tf.layers.conv2d(img, 64, 3, 1, "SAME", activation=tf.nn.relu)
                image_summary("out_1_layer", out)
                out = tf