[Paper Reading] Towards Realistic Long-Tailed Semi-Supervised Learning: Consistency Is All You Need

Link to this post: https://blog.csdn.net/m0_53623159/article/details/144402150

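Abstract

The paper addresses the case where the class distributions of the labeled and unlabeled samples are imbalanced and do not match each other.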

While long-tailed semi-supervised learning (LTSSL) has received tremendous attention in many real-world classification problems, existing LTSSL algorithms typically assume that the class distributions of labeled and unlabeled data are almost identical. Those LTSSL algorithms built upon the assumption can severely suffer when the class distributions of labeled and unlabeled data are mismatched since they utilize biased pseudo-labels from the model. To alleviate this issue, we propose a new simple method that can effectively utilize unlabeled data of unknown class distributions by introducing the adaptive consistency regularizer (ACR). ACR realizes the dynamic refinery of pseudo-labels for various distributions in a unified formula by estimating the true class distribution of unlabeled data. Despite its simplicity, we show that ACR achieves state-of-the-art performance on a variety of standard LTSSL benchmarks, e.g., an averaged 10% absolute increase of test accuracy against existing algorithms when the class distributions of labeled and unlabeled data are mismatched. Even when the class distributions are identical, ACR consistently outperforms many sophisticated LTSSL algorithms. We carry out extensive ablation studies to tease apart the factors that are most important to ACR’s success. Source code is available at https://github.com/Gank0078/ACR.

[Figure: the imbalanced class-distribution settings]
[Figure: overall framework diagram]

Equation 3 of the paper is analyzed in detail below.

In-Depth Analysis: The Key Role of τ in the Balanced Softmax Loss

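In deep learning, classification tasks often suffer from class imbalance. Researchers have proposed many methods to address this, and the Balanced Softmax loss is one effective option. In this post I work through, based on my own study, the role that the parameter τ (tau) plays in this loss and how it affects model training.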

1. Problem Background

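We consider a dual-branch network consisting of a standard branch and a balanced branch. The standard branch follows FixMatch and optimizes the standard cross-entropy loss to learn a good feature representation $f$. The balanced branch $\tilde{f}$ instead optimizes a modified cross-entropy loss, Balanced Softmax, to train a more balanced classifier.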

The Balanced Softmax loss is defined as:
$\mathcal{L}_{b-\mathrm{label}} = -\sum_{i=1}^N \log \frac{\exp\left(\tilde{f}_{y_i^{(l)}}(x_i^{(l)}) + \tau \cdot \log \pi_{y_i^{(l)}}\right)}{\sum_{c=1}^C \exp\left(\tilde{f}_c(x_i^{(l)}) + \tau \cdot \log \pi_c\right)}$
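
To make the formula concrete, here is a minimal pure-Python sketch of it. It is not the authors' implementation: the function name `balanced_softmax_loss`, the toy logits, and the prior values are all made up for illustration, and I assume $\pi_c$ is the class frequency estimated from the labeled set.

```python
import math

def balanced_softmax_loss(logits, labels, class_prior, tau=1.0):
    """Sum over samples of -log softmax(logits + tau * log(prior))[label].

    logits:      list of per-sample logit lists (outputs of the balanced branch)
    labels:      list of integer class indices
    class_prior: list of pi_c values (assumed: labeled-set class frequencies)
    """
    total = 0.0
    for sample_logits, y in zip(logits, labels):
        # Shift every logit by tau * log(pi_c), as in the formula above.
        adjusted = [z + tau * math.log(p)
                    for z, p in zip(sample_logits, class_prior)]
        # Numerically stable log-sum-exp for the denominator.
        m = max(adjusted)
        log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
        # -log(exp(adjusted[y]) / sum_c exp(adjusted[c]))
        total += log_norm - adjusted[y]
    return total

# Toy data (hypothetical): 2 samples, 3 classes, long-tailed prior.
logits = [[2.0, 1.0, 0.1], [0.5, 0.3, 1.2]]
labels = [0, 2]
class_prior = [0.7, 0.2, 0.1]

loss_plain = balanced_softmax_loss(logits, labels, class_prior, tau=0.0)
loss_balanced = balanced_softmax_loss(logits, labels, class_prior, tau=1.0)
print(loss_plain, loss_balanced)
```

With `tau=0.0` the prior term vanishes and the function reduces to ordinary cross-entropy. With a uniform prior, the shift adds the same constant to every logit, so the loss does not depend on τ at all. In other words, τ only matters when the prior is imbalanced, and it controls how strongly that prior is injected into the logits.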