This repository provides the code and data for the paper Multi-Round Dialogue State Tracking by Object-Entity Alignment in Visual Dialog, accepted at CICAI 2023 (CAAI-A) as an Oral paper (Top 4%).
Perspective: Compressing the information flow of a multi-round dialogue into a single state is consistent with the idea behind Mamba. We have held this view from the outset.
The main contributions of the paper are:
- Introduction of the MDST model: The paper presents the Multi-round Dialogue State Tracking (MDST) model, which improves on previous Visual Dialog (VD) methods by addressing their limitation of treating the entire dialog history as a single text input.
- Enhanced dialogue state tracking: MDST uses internal dialogue state representations, defined as 2-tuples of vision-language representations, to capture and use the information from each round of the dialog history. This grounds the current question effectively and so supports accurate answers.
- Validation through human studies: Human studies further validate MDST, showing that it generates long, consistent, and human-like answers while maintaining high accuracy across a series of questions.
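To make the "single state" perspective concrete, the following is a minimal sketch of round-by-round state tracking. It is not the MDST implementation: the names `DialogueState`, `blend`, `update_state`, and `track`, the plain-list vectors, and the moving-average update rule are all assumptions chosen for illustration. It only shows the general idea of folding each round into a fixed-size 2-tuple state instead of concatenating the whole history into one text input.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple

Vector = List[float]


@dataclass(frozen=True)
class DialogueState:
    """Hypothetical 2-tuple state: (vision representation, language representation)."""
    vision: Vector
    language: Vector


def blend(old: Sequence[float], new: Sequence[float], rate: float) -> Vector:
    """Move `old` toward `new` by `rate` (a simple exponential moving average)."""
    return [(1.0 - rate) * o + rate * n for o, n in zip(old, new)]


def update_state(
    state: DialogueState,
    round_vision: Sequence[float],
    round_language: Sequence[float],
    rate: float = 0.5,
) -> DialogueState:
    """Fold one dialogue round into the state.

    Assumption: a moving average stands in for whatever learned update
    the actual model uses.
    """
    return DialogueState(
        vision=blend(state.vision, round_vision, rate),
        language=blend(state.language, round_language, rate),
    )


def track(
    rounds: Iterable[Tuple[Sequence[float], Sequence[float]]],
    dim: int,
    rate: float = 0.5,
) -> DialogueState:
    """Process rounds in order, keeping only the latest fixed-size state."""
    state = DialogueState(vision=[0.0] * dim, language=[0.0] * dim)
    for round_vision, round_language in rounds:
        state = update_state(state, round_vision, round_language, rate)
    return state


# Two toy rounds, each given as (vision features, language features).
rounds = [
    ([1.0, 0.0], [0.0, 2.0]),
    ([1.0, 1.0], [2.0, 2.0]),
]
final = track(rounds, dim=2, rate=0.5)
print(final.vision, final.language)
```

In this sketch the state keeps the same size no matter how many rounds are processed, which is the property the perspective above refers to.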
coming soon ...