活动介绍
file-type

Whisper对话框:创新UI交互组件

ZIP文件

1.21MB | 更新于2024-11-29 | 86 浏览量 | 0 下载量 举报 收藏
download 立即下载
该项目提供了一个名为Whisper的组件,旨在增强移动应用中的用户交互体验。Whisper组件打破了传统UI设计中单一的信息展示方式,通过模仿人类交流中的耳语、喊叫和吹口哨三种不同的方式来呈现应用内的通知或消息提示。 具体来说,Whisper对话框提供了三种不同的视图模式: 1. Whispers(耳语):这种方式提供了一种低调而微妙的交互方式,适用于不需要打扰用户的通知,例如后台更新状态、小提示或非紧急信息。 2. Shouts(喊叫):当需要立即引起用户注意时,使用喊叫模式。这种模式通常伴随有更显著的视觉和声音提示,适合重要的通知或警告,如应用错误、数据丢失警报或紧急事件通知。 3. Whistles(吹口哨):这是一种介于耳语和喊叫之间的互动方式,它比耳语更为明显,但又不会像喊叫那样打扰用户。它适用于一般的通知,例如新邮件到达、好友请求或系统更新提示。 Whisper组件的引入可以使应用的消息通知变得更加生动和有趣,改善用户体验。开发者可以通过简单配置或自定义,使其与应用的风格和品牌形象相符合。此外,该项目被作者hyperoslo以其撩妹高手的风格呈现,也从侧面展示了作者对用户体验和设计的独到见解。 Whisper项目的发布,为iOS开发者提供了一个实用的工具包,帮助他们快速实现吸引人的用户界面元素,从而提升应用的互动性和用户满意度。其源码包的文件名“Whisper-master”表明这是一个成熟的主版本,用户可以期待一个稳定、可信赖的组件。 在使用Whisper时,开发者需要关注以下几个方面: - 集成Whisper组件到iOS项目中,并确保兼容性。 - 根据应用需求选择合适的视图模式,并进行相应的配置。 - 个性化定制Whisper对话框的外观和行为,以符合应用的整体设计风格。 - 考虑用户体验和上下文,合理安排不同通知方式的触发时机和场景。 需要注意的是,过度依赖动态和有趣的通知可能会分散用户的注意力,因此开发者应该权衡利弊,合理使用Whisper组件,以增强应用的交互性,而不是干扰用户。此外,过度的通知可能会被用户视为骚扰,因此在设计通知策略时需要谨慎考虑。 总的来说,Whisper对话框是一个值得探索的iOS源码项目,它不仅仅是一个消息通知组件,更是一个提升用户体验的工具。开发者可以在这个基础上进一步扩展,创造出更多富有创意的用户界面交互设计。"

相关推荐

filetype

根据修改建议进行修改: 1、AnalysisThread.add_charts 中生成图表后虽删除临时文件,但未显式释放图表对象,可能导致内存泄漏(尤其批量处理时)。使用 plt.close(fig) 显式关闭图表,释放内存 2、AnalysisThread.stop() 仅设置 stop_requested=True,但线程池中的任务仍会继续运行,可能导致资源占用。结合 concurrent.futures 的 Executor.shutdown(wait=False) 强制终止线程池。 3、AudioAnalyzer.convert_audio 中若临时目录创建失败,temp_dir 为 None,后续 temp_dir.cleanup() 会报错。添加容错处理,确保临时目录安全清理。 4、ModelLoader 中加载的 Whisper 模型未充分利用硬件性能(如未指定 language 参数,可能增加语言检测耗时)。指定 language=“chinese” 减少推理时间。 5、convert_audio 中无论原始音频格式如何,均强制转换为 WAV,部分格式(如 16kHz 单声道 WAV)可跳过转换。 6、短时间内同一说话人的片段可能被拆分(如停顿导致),影响后续文本关联准确性。合并连续相同说话人的片段(如间隔 < 1 秒)。 7、当前情感分析仅基于文本内容,未结合对话上下文(如客户抱怨后客服回应的情感)。增加上下文权重,如客户表达消极情绪后,客服的回应情感权重提升。 8、模型加载失败后无重试逻辑,用户需重启程序。添加重试按钮,允许用户重新加载模型。 9、未校验音频文件的有效性(如损坏文件、非音频文件),可能导致分析线程崩溃。添加文件合法性校验,过滤无效文件。 代码: import os import sys import time import json import traceback import numpy as np import pandas as pd import torch import librosa import jieba import tempfile from pydub import AudioSegment from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer from pyannote.audio import Pipeline from concurrent.futures import ThreadPoolExecutor, as_completed from PyQt5.QtWidgets import (QApplication, QMainWindow, QWidget, QVBoxLayout, QHBoxLayout, QLabel, QLineEdit, QPushButton, QFileDialog, QTextEdit, QProgressBar, QGroupBox, QCheckBox, QListWidget, QMessageBox) from PyQt5.QtCore import QThread, pyqtSignal, Qt, QTimer from PyQt5.QtGui import QFont from docx import Document from docx.shared import Inches import matplotlib.pyplot as plt from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas from collections import Counter # 全局配置 MODEL_CONFIG = { "whisper_model": "openai/whisper-small", "diarization_model": "pyannote/[email protected]", # 使用更轻量模型 "sentiment_model": "IDEA-CCNL/Erlangshen-Roberta-110M-Sentiment", "chunk_size": 10, # 强制10秒分块 "sample_rate": 16000, "device": "cuda" if torch.cuda.is_available() else "cpu", "max_workers": 2 if torch.cuda.is_available() else 4, # GPU模式下并行度降低 "batch_size": 8 # 批处理大小 } # 初始化分词器 jieba.initialize() class ModelLoader(QThread): """模型加载线程""" progress = pyqtSignal(str) finished = pyqtSignal(bool, str) def __init__(self): super().__init__() self.models = {} self.error = None def run(self): try: self.progress.emit("正在加载语音识别模型...") # 语音识别模型 self.models["asr_pipeline"] = pipeline( "automatic-speech-recognition", model=MODEL_CONFIG["whisper_model"], torch_dtype=torch.float16, device=MODEL_CONFIG["device"], batch_size=MODEL_CONFIG["batch_size"] # 添加批处理支持 ) self.progress.emit("正在加载说话人分离模型...") # 说话人分离模型 - 使用更轻量版本 self.models["diarization_pipeline"] = Pipeline.from_pretrained( MODEL_CONFIG["diarization_model"], use_auth_token=True ).to(torch.device(MODEL_CONFIG["device"]), torch.float16) self.progress.emit("正在加载情感分析模型...") # 情感分析模型 self.models["sentiment_tokenizer"] = AutoTokenizer.from_pretrained( MODEL_CONFIG["sentiment_model"] ) self.models["sentiment_model"] = AutoModelForSequenceClassification.from_pretrained( MODEL_CONFIG["sentiment_model"], torch_dtype=torch.float16 ).to(MODEL_CONFIG["device"]) self.finished.emit(True, "模型加载完成!") except Exception as e: self.error = str(e) traceback.print_exc() self.finished.emit(False, f"模型加载失败: {str(e)}") class AudioAnalyzer: """深度优化的核心音频分析类""" def __init__(self, models): self.keywords = { "opening": ["您好", "请问是", "先生/女士", "很高兴为您服务"], "closing": ["感谢接听", "祝您生活愉快", "再见", "有问题随时联系"], "forbidden": ["不可能", "没办法", "我不管", "随便你", "投诉也没用"], "solution": ["解决", "处理好了", "已完成", "满意吗", "还有问题吗"] } self.synonyms = { "不可能": ["不可能", "没可能", "做不到", "无法做到"], "解决": ["解决", "处理", "完成", "搞定", "办妥"] } self.models = models self.models_loaded = True if models else False def load_keywords(self, excel_path): """从Excel加载关键词和同义词""" try: # 使用更健壮的Excel读取方式 df = pd.read_excel(excel_path, sheet_name=None) if "开场白" in df: self.keywords["opening"] = df["开场白"].dropna()["关键词"].tolist() if "结束语" in df: self.keywords["closing"] = df["结束语"].dropna()["关键词"].tolist() if "禁语" in df: self.keywords["forbidden"] = df["禁语"].dropna()["关键词"].tolist() if "解决关键词" in df: self.keywords["solution"] = df["解决关键词"].dropna()["关键词"].tolist() # 加载同义词表 if "同义词" in df: for _, row in df["同义词"].iterrows(): main_word = row["主词"] synonyms = row["同义词"].split("、") self.synonyms[main_word] = synonyms return True, "关键词加载成功" except Exception as e: error_msg = f"加载关键词失败: {str(e)}" return False, error_msg def convert_audio(self, input_path): """转换音频为WAV格式并分块,使用临时目录管理""" try: # 创建临时目录 temp_dir = tempfile.TemporaryDirectory() # 读取音频文件 audio = AudioSegment.from_file(input_path) # 转换为单声道16kHz audio = audio.set_frame_rate(MODEL_CONFIG["sample_rate"]) audio = audio.set_channels(1) # 计算总时长 duration = len(audio) / 1000.0 # 毫秒转秒 # 分块处理(10秒) chunks = [] chunk_size = MODEL_CONFIG["chunk_size"] * 1000 # 毫秒 for i in range(0, len(audio), chunk_size): chunk = audio[i:i + chunk_size] chunk_path = os.path.join(temp_dir.name, f"chunk_{i // chunk_size}.wav") chunk.export(chunk_path, format="wav") chunks.append({ "path": chunk_path, "start_time": i / 1000.0, # 全局起始时间(秒) "end_time": (i + len(chunk)) / 1000.0 # 全局结束时间(秒) }) return chunks, duration, temp_dir except Exception as e: error_msg = f"音频转换失败: {str(e)}" return [], 0, None def diarize_speakers(self, audio_path): """说话人分离""" try: diarization = self.models["diarization_pipeline"](audio_path) segments = [] for turn, _, speaker in diarization.itertracks(yield_label=True): segments.append({ "start": turn.start, "end": turn.end, "speaker": speaker, "text": "" }) return segments except Exception as e: error_msg = f"说话人分离失败: {str(e)}" raise Exception(error_msg) from e def transcribe_audio_batch(self, chunk_paths): """批量语音识别多个分块""" try: # 批量处理音频分块 results = self.models["asr_pipeline"]( chunk_paths, chunk_length_s=MODEL_CONFIG["chunk_size"], stride_length_s=(4, 2), batch_size=MODEL_CONFIG["batch_size"], return_timestamps=True ) # 整理结果 transcribed_data = [] for result in results: text = result["text"] chunks = result["chunks"] transcribed_data.append((text, chunks)) return transcribed_data except Exception as e: error_msg = f"语音识别失败: {str(e)}" raise Exception(error_msg) from e def analyze_sentiment_batch(self, texts): """批量情感分析 - 支持长文本处理""" try: if not texts: return [] # 预处理文本 - 截断并添加特殊token inputs = self.models["sentiment_tokenizer"]( texts, padding=True, truncation=True, max_length=512, return_tensors="pt" ).to(MODEL_CONFIG["device"]) # 批量推理 with torch.no_grad(): outputs = self.models["sentiment_model"](**inputs) # 计算概率 probs = torch.softmax(outputs.logits, dim=-1).cpu().numpy() # 处理结果 results = [] labels = ["积极", "消极", "中性"] for i, text in enumerate(texts): sentiment = labels[np.argmax(probs[i])] # 情感强度检测 strong_negative = probs[i][1] > 0.7 # 消极概率超过70% strong_positive = probs[i][0] > 0.7 # 积极概率超过70% # 特定情绪检测 specific_emotion = "无" if "生气" in text or "愤怒" in text or "气死" in text: specific_emotion = "愤怒" elif "不耐烦" in text or "快点" in text or "急死" in text: specific_emotion = "不耐烦" elif "失望" in text or "无奈" in text: specific_emotion = "失望" # 如果有强烈情感则覆盖平均结果 if strong_negative: sentiment = "强烈消极" elif strong_positive: sentiment = "强烈积极" results.append({ "sentiment": sentiment, "emotion": specific_emotion, "scores": probs[i].tolist() }) return results except Exception as e: error_msg = f"情感分析失败: {str(e)}" raise Exception(error_msg) from e def match_keywords(self, text, keyword_type): """高级关键词匹配 - 使用分词和同义词""" # 获取关键词列表 keywords = self.keywords.get(keyword_type, []) if not keywords: return False # 分词处理 words = jieba.lcut(text) # 检查每个关键词 for keyword in keywords: # 检查直接匹配 if keyword in text: return True # 检查同义词 synonyms = self.synonyms.get(keyword, []) for synonym in synonyms: if synonym in text: return True # 检查分词匹配(全词匹配) if keyword in words: return True return False def identify_agent(self, segments, full_text): """智能客服身份识别""" # 候选客服信息 candidates = {} # 特征1:开场白关键词 for i, segment in enumerate(segments[:5]): # 检查前5个片段 if self.match_keywords(segment["text"], "opening"): speaker = segment["speaker"] candidates.setdefault(speaker, {"score": 0, "segments": []}) candidates[speaker]["score"] += 3 # 开场白权重高 candidates[speaker]["segments"].append(i) # 特征2:结束语关键词 for i, segment in enumerate(segments[-3:]): # 检查最后3个片段 if self.match_keywords(segment["text"], "closing"): speaker = segment["speaker"] candidates.setdefault(speaker, {"score": 0, "segments": []}) candidates[speaker]["score"] += 2 # 结束语权重中等 candidates[speaker]["segments"].append(len(segments) - 3 + i) # 特征3:说话时长 speaker_durations = {} for segment in segments: duration = segment["end"] - segment["start"] speaker_durations[segment["speaker"]] = speaker_durations.get(segment["speaker"], 0) + duration # 为说话时长最长的加分 if speaker_durations: max_duration = max(speaker_durations.values()) for speaker, duration in speaker_durations.items(): candidates.setdefault(speaker, {"score": 0, "segments": []}) if duration == max_duration: candidates[speaker]["score"] += 1 # 特征4:客服特定词汇出现频率 agent_keywords = ["客服", "代表", "专员", "先生", "女士"] speaker_keyword_count = {} for segment in segments: text = segment["text"] speaker = segment["speaker"] for word in agent_keywords: if word in text: speaker_keyword_count[speaker] = speaker_keyword_count.get(speaker, 0) + 1 # 为关键词出现最多的加分 if speaker_keyword_count: max_count = max(speaker_keyword_count.values()) for speaker, count in speaker_keyword_count.items(): if count == max_count: candidates.setdefault(speaker, {"score": 0, "segments": []}) candidates[speaker]["score"] += 1 # 选择得分最高的作为客服 if candidates: best_speaker = max(candidates.items(), key=lambda x: x[1]["score"])[0] return best_speaker # 默认选择第一个说话人 return segments[0]["speaker"] if segments else None def associate_speaker_text(self, segments, full_text_chunks): """基于时间重叠度的说话人-文本关联""" for segment in segments: segment_text = "" segment_start = segment["start"] segment_end = segment["end"] for word_info in full_text_chunks: if "global_start" not in word_info: continue word_start = word_info["global_start"] word_end = word_info["global_end"] # 计算重叠度 overlap_start = max(segment_start, word_start) overlap_end = min(segment_end, word_end) overlap = max(0, overlap_end - overlap_start) # 计算重叠比例 word_duration = word_end - word_start segment_duration = segment_end - segment_start if overlap > 0: # 如果重叠超过50%或单词完全在片段内 if (overlap / word_duration > 0.5) or (overlap / segment_duration > 0.5): segment_text += word_info["text"] + " " segment["text"] = segment_text.strip() def analyze_audio(self, audio_path): """完整分析单个音频文件 - 优化版本""" try: # 步骤1: 转换音频并分块(使用临时目录) chunks, duration, temp_dir = self.convert_audio(audio_path) if not chunks or not temp_dir: raise Exception("音频转换失败或未生成分块") try: # 步骤2: 说话人分离 segments = self.diarize_speakers(audio_path) # 步骤3: 批量语音识别 chunk_paths = [chunk["path"] for chunk in chunks] transcribed_data = self.transcribe_audio_batch(chunk_paths) # 步骤4: 处理识别结果 full_text_chunks = [] for idx, (text, chunk_data) in enumerate(transcribed_data): chunk = chunks[idx] # 调整时间戳为全局时间 for word_info in chunk_data: if "timestamp" in word_info: start, end = word_info["timestamp"] word_info["global_start"] = chunk["start_time"] + start word_info["global_end"] = chunk["start_time"] + end else: word_info["global_start"] = chunk["start_time"] word_info["global_end"] = chunk["end_time"] full_text_chunks.extend(chunk_data) # 步骤5: 基于时间重叠度关联说话人和文本 self.associate_speaker_text(segments, full_text_chunks) # 步骤6: 智能识别客服身份 agent_id = self.identify_agent(segments, full_text_chunks) # 步骤7: 提取客服和客户文本 agent_text = "" customer_text = "" opening_found = False closing_found = False forbidden_found = False for segment in segments: if segment["speaker"] == agent_id: agent_text += segment["text"] + " " else: customer_text += segment["text"] + " " # 使用高级关键词匹配 if not opening_found and self.match_keywords(segment["text"], "opening"): opening_found = True if not closing_found and self.match_keywords(segment["text"], "closing"): closing_found = True if not forbidden_found and self.match_keywords(segment["text"], "forbidden"): forbidden_found = True # 步骤8: 批量情感分析 sentiment_results = self.analyze_sentiment_batch([agent_text, customer_text]) if sentiment_results: agent_sentiment = sentiment_results[0]["sentiment"] agent_emotion = sentiment_results[0]["emotion"] customer_sentiment = sentiment_results[1]["sentiment"] customer_emotion = sentiment_results[1]["emotion"] else: agent_sentiment = "未知" agent_emotion = "无" customer_sentiment = "未知" customer_emotion = "无" # 问题解决率分析 solution_found = self.match_keywords(agent_text, "solution") # 语速分析 agent_words = len(agent_text.split()) agent_duration = sum([s["end"] - s["start"] for s in segments if s["speaker"] == agent_id]) agent_speed = agent_words / (agent_duration / 60) if agent_duration > 0 else 0 # 词/分钟 # 音量分析(简单版) try: y, sr = librosa.load(audio_path, sr=MODEL_CONFIG["sample_rate"]) rms = librosa.feature.rms(y=y) avg_volume = np.mean(rms) volume_stability = np.std(rms) / avg_volume if avg_volume > 0 else 0 except: avg_volume = 0 volume_stability = 0 # 构建结果 result = { "file_name": os.path.basename(audio_path), "duration": round(duration, 2), "opening_check": "是" if opening_found else "否", "closing_check": "是" if closing_found else "否", "forbidden_check": "是" if forbidden_found else "否", "agent_sentiment": agent_sentiment, "agent_emotion": agent_emotion, "customer_sentiment": customer_sentiment, "customer_emotion": customer_emotion, "agent_speed": round(agent_speed, 1), "volume_level": round(avg_volume, 4), "volume_stability": round(volume_stability, 2), "solution_rate": "是" if solution_found else "否", "agent_text": agent_text[:500] + "..." if len(agent_text) > 500 else agent_text, "customer_text": customer_text[:500] + "..." if len(customer_text) > 500 else customer_text } return result finally: # 自动清理临时目录 temp_dir.cleanup() except Exception as e: error_msg = f"分析文件 {os.path.basename(audio_path)} 时出错: {str(e)}" raise Exception(error_msg) from e class AnalysisThread(QThread): """分析线程 - 并行优化版本""" progress = pyqtSignal(int, str) result_ready = pyqtSignal(dict) finished_all = pyqtSignal() error_occurred = pyqtSignal(str, str) def __init__(self, audio_files, keywords_file, output_dir, models): super().__init__() self.audio_files = audio_files self.keywords_file = keywords_file self.output_dir = output_dir self.stop_requested = False self.analyzer = AudioAnalyzer(models) self.completed_count = 0 def run(self): try: total = len(self.audio_files) # 加载关键词 if self.keywords_file: success, msg = self.analyzer.load_keywords(self.keywords_file) if not success: self.error_occurred.emit("关键词加载", msg) results = [] errors = [] # 使用线程池进行并行处理 with ThreadPoolExecutor(max_workers=MODEL_CONFIG["max_workers"]) as executor: # 提交所有任务 future_to_file = { executor.submit(self.analyzer.analyze_audio, audio_file): audio_file for audio_file in self.audio_files } # 处理完成的任务 for future in as_completed(future_to_file): if self.stop_requested: break audio_file = future_to_file[future] try: result = future.result() if result: results.append(result) self.result_ready.emit(result) except Exception as e: error_msg = str(e) errors.append({ "file": audio_file, "error": error_msg }) self.error_occurred.emit(os.path.basename(audio_file), error_msg) # 更新进度 self.completed_count += 1 progress = int(self.completed_count / total * 100) self.progress.emit( progress, f"已完成 {self.completed_count}/{total} ({progress}%)" ) # 生成报告 if results: self.generate_reports(results, errors) self.finished_all.emit() except Exception as e: self.error_occurred.emit("全局错误", str(e)) def stop(self): self.stop_requested = True def generate_reports(self, results, errors): """生成Excel和Word报告 - 优化版本""" try: # 生成Excel报告 df = pd.DataFrame(results) excel_path = os.path.join(self.output_dir, "质检分析报告.xlsx") # 创建Excel写入器 with pd.ExcelWriter(excel_path, engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='详细结果', index=False) # 添加统计摘要 stats_data = { "指标": ["分析文件总数", "成功分析文件数", "分析失败文件数", "开场白合格率", "结束语合格率", "禁语出现率", "客服积极情绪占比", "客户消极情绪占比", "问题解决率"], "数值": [ len(results) + len(errors), len(results), len(errors), f"{df['opening_check'].value_counts(normalize=True).get('是', 0) * 100:.1f}%", f"{df['closing_check'].value_counts(normalize=True).get('是', 0) * 100:.1f}%", f"{df['forbidden_check'].value_counts(normalize=True).get('是', 0) * 100:.1f}%", f"{df[df['agent_sentiment'] == '积极'].shape[0] / len(df) * 100:.1f}%", f"{df[df['customer_sentiment'] == '消极'].shape[0] / len(df) * 100:.1f}%", f"{df['solution_rate'].value_counts(normalize=True).get('是', 0) * 100:.1f}%" ] } stats_df = pd.DataFrame(stats_data) stats_df.to_excel(writer, sheet_name='统计摘要', index=False) # 生成Word报告 doc = Document() doc.add_heading('外呼电话质检分析汇总报告', 0) # 添加统计信息 doc.add_heading('整体统计', level=1) stats = [ f"分析文件总数: {len(results) + len(errors)}", f"成功分析文件数: {len(results)}", f"分析失败文件数: {len(errors)}", f"开场白合格率: {stats_data['数值'][3]}", f"结束语合格率: {stats_data['数值'][4]}", f"禁语出现率: {stats_data['数值'][5]}", f"客服积极情绪占比: {stats_data['数值'][6]}", f"客户消极情绪占比: {stats_data['数值'][7]}", f"问题解决率: {stats_data['数值'][8]}" ] for stat in stats: doc.add_paragraph(stat) # 添加图表 self.add_charts(doc, df) # 添加错误列表 if errors: doc.add_heading('分析失败文件', level=1) table = doc.add_table(rows=1, cols=2) hdr_cells = table.rows[0].cells hdr_cells[0].text = '文件' hdr_cells[1].text = '错误原因' for error in errors: row_cells = table.add_row().cells row_cells[0].text = os.path.basename(error['file']) row_cells[1].text = error['error'] word_path = os.path.join(self.output_dir, "可视化分析报告.docx") doc.save(word_path) return True, f"报告已保存到: {self.output_dir}" except Exception as e: return False, f"生成报告失败: {str(e)}" def add_charts(self, doc, df): """在Word文档中添加图表""" try: # 客服情感分布 fig1, ax1 = plt.subplots(figsize=(6, 4)) sentiment_counts = df['agent_sentiment'].value_counts() sentiment_counts.plot(kind='bar', ax=ax1, color=['green', 'red', 'blue', 'darkred', 'darkgreen']) ax1.set_title('客服情感分布') ax1.set_xlabel('情感类型') ax1.set_ylabel('数量') fig1.tight_layout() fig1.savefig('agent_sentiment.png') doc.add_picture('agent_sentiment.png', width=Inches(5)) os.remove('agent_sentiment.png') # 客户情感分布 fig2, ax2 = plt.subplots(figsize=(6, 4)) df['customer_sentiment'].value_counts().plot(kind='bar', ax=ax2, color=['green', 'red', 'blue', 'darkred', 'darkgreen']) ax2.set_title('客户情感分布') ax2.set_xlabel('情感类型') ax2.set_ylabel('数量') fig2.tight_layout() fig2.savefig('customer_sentiment.png') doc.add_picture('customer_sentiment.png', width=Inches(5)) os.remove('customer_sentiment.png') # 合规性检查 fig3, ax3 = plt.subplots(figsize=(6, 4)) compliance = df[['opening_check', 'closing_check', 'forbidden_check']].apply( lambda x: x.value_counts().get('是', 0)) compliance.plot(kind='bar', ax=ax3, color=['blue', 'green', 'red']) ax3.set_title('合规性检查') ax3.set_xlabel('检查项') ax3.set_ylabel('合格数量') fig3.tight_layout() fig3.savefig('compliance.png') doc.add_picture('compliance.png', width=Inches(5)) os.remove('compliance.png') except Exception as e: print(f"生成图表失败: {str(e)}") class MainWindow(QMainWindow): """主界面 - 优化版本""" def __init__(self): super().__init__() self.setWindowTitle("外呼电话录音质检分析系统") self.setGeometry(100, 100, 1000, 800) # 初始化变量 self.audio_files = [] self.keywords_file = "" self.output_dir = os.getcwd() self.analysis_thread = None self.model_loader = None self.models = {} self.models_loaded = False # 初始化为False # 设置全局字体 app_font = QFont("Microsoft YaHei", 10) QApplication.setFont(app_font) # 创建主布局 main_widget = QWidget() main_layout = QVBoxLayout() main_layout.setSpacing(10) main_layout.setContentsMargins(15, 15, 15, 15) # 状态栏 self.status_label = QLabel("准备就绪") self.status_label.setAlignment(Qt.AlignCenter) self.status_label.setStyleSheet("background-color: #f0f0f0; padding: 5px; border-radius: 5px;") # 文件选择区域 file_group = QGroupBox("文件选择") file_layout = QVBoxLayout() file_layout.setSpacing(10) # 音频选择 audio_layout = QHBoxLayout() self.audio_label = QLabel("音频文件/文件夹:") self.audio_path_edit = QLineEdit() self.audio_path_edit.setReadOnly(True) self.audio_path_edit.setPlaceholderText("请选择音频文件或文件夹") self.audio_browse_btn = QPushButton("浏览...") self.audio_browse_btn.setFixedWidth(80) self.audio_browse_btn.clicked.connect(self.browse_audio) audio_layout.addWidget(self.audio_label) audio_layout.addWidget(self.audio_path_edit, 1) audio_layout.addWidget(self.audio_browse_btn) # 关键词选择 keyword_layout = QHBoxLayout() self.keyword_label = QLabel("关键词文件:") self.keyword_path_edit = QLineEdit() self.keyword_path_edit.setReadOnly(True) self.keyword_path_edit.setPlaceholderText("可选:选择关键词Excel文件") self.keyword_browse_btn = QPushButton("浏览...") self.keyword_browse_btn.setFixedWidth(80) self.keyword_browse_btn.clicked.connect(self.browse_keywords) keyword_layout.addWidget(self.keyword_label) keyword_layout.addWidget(self.keyword_path_edit, 1) keyword_layout.addWidget(self.keyword_browse_btn) # 输出目录 output_layout = QHBoxLayout() self.output_label = QLabel("输出目录:") self.output_path_edit = QLineEdit(os.getcwd()) self.output_path_edit.setReadOnly(True) self.output_browse_btn = QPushButton("浏览...") self.output_browse_btn.setFixedWidth(80) self.output_browse_btn.clicked.connect(self.browse_output) output_layout.addWidget(self.output_label) output_layout.addWidget(self.output_path_edit, 1) output_layout.addWidget(self.output_browse_btn) file_layout.addLayout(audio_layout) file_layout.addLayout(keyword_layout) file_layout.addLayout(output_layout) file_group.setLayout(file_layout) # 控制按钮区域 control_layout = QHBoxLayout() control_layout.setSpacing(15) self.start_btn = QPushButton("开始分析") self.start_btn.setFixedHeight(40) self.start_btn.setStyleSheet("background-color: #4CAF50; color: white; font-weight: bold;") self.start_btn.clicked.connect(self.start_analysis) self.stop_btn = QPushButton("停止分析") self.stop_btn.setFixedHeight(40) self.stop_btn.setStyleSheet("background-color: #f44336; color: white; font-weight: bold;") self.stop_btn.clicked.connect(self.stop_analysis) self.stop_btn.setEnabled(False) self.clear_btn = QPushButton("清空") self.clear_btn.setFixedHeight(40) self.clear_btn.setStyleSheet("background-color: #2196F3; color: white; font-weight: bold;") self.clear_btn.clicked.connect(self.clear_all) control_layout.addWidget(self.start_btn) control_layout.addWidget(self.stop_btn) control_layout.addWidget(self.clear_btn) # 进度条 self.progress_bar = QProgressBar() self.progress_bar.setRange(0, 100) self.progress_bar.setTextVisible(True) self.progress_bar.setStyleSheet("QProgressBar {border: 1px solid grey; border-radius: 5px; text-align: center;}" "QProgressBar::chunk {background-color: #4CAF50; width: 10px;}") # 结果展示区域 result_group = QGroupBox("分析结果") result_layout = QVBoxLayout() result_layout.setSpacing(10) # 结果标签 result_header = QHBoxLayout() self.result_label = QLabel("分析结果:") self.result_count_label = QLabel("0/0") self.result_count_label.setAlignment(Qt.AlignRight) result_header.addWidget(self.result_label) result_header.addWidget(self.result_count_label) self.result_text = QTextEdit() self.result_text.setReadOnly(True) self.result_text.setStyleSheet("font-family: Consolas, 'Microsoft YaHei';") # 错误列表 error_header = QHBoxLayout() self.error_label = QLabel("错误信息:") self.error_count_label = QLabel("0") self.error_count_label.setAlignment(Qt.AlignRight) error_header.addWidget(self.error_label) error_header.addWidget(self.error_count_label) self.error_list = QListWidget() self.error_list.setFixedHeight(120) self.error_list.setStyleSheet("color: #d32f2f;") result_layout.addLayout(result_header) result_layout.addWidget(self.result_text) result_layout.addLayout(error_header) result_layout.addWidget(self.error_list) result_group.setLayout(result_layout) # 添加到主布局 main_layout.addWidget(file_group) main_layout.addLayout(control_layout) main_layout.addWidget(self.progress_bar) main_layout.addWidget(self.status_label) main_layout.addWidget(result_group) main_widget.setLayout(main_layout) self.setCentralWidget(main_widget) # 启动模型加载 self.load_models() def load_models(self): """后台加载模型""" self.status_label.setText("正在加载AI模型,请稍候...") self.start_btn.setEnabled(False) self.model_loader = ModelLoader() self.model_loader.progress.connect(self.update_model_loading_status) self.model_loader.finished.connect(self.handle_model_loading_finished) self.model_loader.start() def update_model_loading_status(self, message): """更新模型加载状态""" self.status_label.setText(message) def handle_model_loading_finished(self, success, message): """处理模型加载完成""" if success: self.models = self.model_loader.models self.models_loaded = True # 修复标志位 self.status_label.setText(message) self.start_btn.setEnabled(True) else: self.status_label.setText(message) QMessageBox.critical(self, "模型加载失败", message) def browse_audio(self): """选择音频文件或文件夹""" options = QFileDialog.Options() files, _ = QFileDialog.getOpenFileNames( self, "选择音频文件", "", "音频文件 (*.mp3 *.wav *.amr *.flac *.m4a);;所有文件 (*)", options=options ) if files: self.audio_files = files self.audio_path_edit.setText(f"已选择 {len(files)} 个文件") self.result_count_label.setText(f"0/{len(files)}") def browse_keywords(self): """选择关键词文件""" options = QFileDialog.Options() file, _ = QFileDialog.getOpenFileName( self, "选择关键词文件", "", "Excel文件 (*.xlsx);;所有文件 (*)", options=options ) if file: self.keywords_file = file self.keyword_path_edit.setText(os.path.basename(file)) def browse_output(self): """选择输出目录""" options = QFileDialog.Options() directory = QFileDialog.getExistingDirectory( self, "选择输出目录", options=options ) if directory: self.output_dir = directory self.output_path_edit.setText(directory) def start_analysis(self): """开始分析""" if not self.audio_files: self.show_message("错误", "请先选择音频文件!") return if not self.models_loaded: # 使用修复后的标志位 self.show_message("错误", "AI模型尚未加载完成!") return # 检查输出目录 if not os.path.exists(self.output_dir): try: os.makedirs(self.output_dir) except Exception as e: self.show_message("错误", f"无法创建输出目录: {str(e)}") return # 更新UI状态 self.start_btn.setEnabled(False) self.stop_btn.setEnabled(True) self.result_text.clear() self.error_list.clear() self.error_count_label.setText("0") self.result_text.append("开始分析音频文件...") self.progress_bar.setValue(0) # 创建并启动分析线程 self.analysis_thread = AnalysisThread( self.audio_files, self.keywords_file, self.output_dir, self.models ) # 连接信号 self.analysis_thread.progress.connect(self.update_progress) self.analysis_thread.result_ready.connect(self.handle_result) self.analysis_thread.finished_all.connect(self.analysis_finished) self.analysis_thread.error_occurred.connect(self.handle_error) self.analysis_thread.start() def stop_analysis(self): """停止分析""" if self.analysis_thread and self.analysis_thread.isRunning(): self.analysis_thread.stop() self.analysis_thread.wait() self.result_text.append("分析已停止") self.status_label.setText("分析已停止") self.start_btn.setEnabled(True) self.stop_btn.setEnabled(False) def clear_all(self): """清空所有内容""" self.audio_files = [] self.keywords_file = "" self.audio_path_edit.clear() self.keyword_path_edit.clear() self.result_text.clear() self.error_list.clear() self.progress_bar.setValue(0) self.status_label.setText("准备就绪") self.result_count_label.setText("0/0") self.error_count_label.setText("0") def update_progress(self, value, message): """更新进度""" self.progress_bar.setValue(value) self.status_label.setText(message) # 更新结果计数 if "已完成" in message: parts = message.split() if len(parts) >= 2: self.result_count_label.setText(parts[1]) def handle_result(self, result): """处理单个结果""" summary = f""" 文件: {result['file_name']} 时长: {result['duration']}秒 ---------------------------------------- 开场白: {result['opening_check']} | 结束语: {result['closing_check']} | 禁语: {result['forbidden_check']} 客服情感: {result['agent_sentiment']} ({result['agent_emotion']}) | 语速: {result['agent_speed']}词/分 客户情感: {result['customer_sentiment']} ({result['customer_emotion']}) 问题解决: {result['solution_rate']} 音量水平: {result['volume_level']} | 稳定性: {result['volume_stability']} ---------------------------------------- """ self.result_text.append(summary) def handle_error(self, file_name, error): """处理错误""" self.error_list.addItem(f"{file_name}: {error}") self.error_count_label.setText(str(self.error_list.count())) def analysis_finished(self): """分析完成""" self.start_btn.setEnabled(True) self.stop_btn.setEnabled(False) self.status_label.setText(f"分析完成! 报告已保存到: {self.output_dir}") self.result_text.append("分析完成!") # 显示完成消息 self.show_message("完成", f"分析完成! 报告已保存到: {self.output_dir}") def show_message(self, title, message): """显示消息对话框""" msg = QMessageBox(self) msg.setWindowTitle(title) msg.setText(message) msg.setStandardButtons(QMessageBox.Ok) msg.exec_() if __name__ == "__main__": app = QApplication(sys.argv) # 检查GPU可用性 if MODEL_CONFIG["device"] == "cuda": try: gpu_mem = torch.cuda.get_device_properties(0).total_memory / (1024 ** 3) print(f"GPU内存: {gpu_mem:.2f}GB") # 根据GPU内存调整并行度 if gpu_mem < 4: # 确保有足够内存 MODEL_CONFIG["device"] = "cpu" MODEL_CONFIG["max_workers"] = 4 print("GPU内存不足,切换到CPU模式") elif gpu_mem < 8: MODEL_CONFIG["max_workers"] = 2 else: MODEL_CONFIG["max_workers"] = 4 except: MODEL_CONFIG["device"] = "cpu" MODEL_CONFIG["max_workers"] = 4 print("无法获取GPU信息,切换到CPU模式") window = MainWindow() window.show() sys.exit(app.exec_())

filetype
资源下载链接为: https://pan.quark.cn/s/140386800631 通用大模型文本分类实践的基本原理是,借助大模型自身较强的理解和推理能力,在使用时需在prompt中明确分类任务目标,并详细解释每个类目概念,尤其要突出类目间的差别。 结合in-context learning思想,有效的prompt应包含分类任务介绍及细节、类目概念解释、每个类目对应的例子和待分类文本。但实际应用中,类目和样本较多易导致prompt过长,影响大模型推理效果,因此可先通过向量检索缩小范围,再由大模型做最终决策。 具体方案为:离线时提前配置好每个类目的概念及对应样本;在线时先对给定query进行向量召回,再将召回结果交给大模型决策。 该方法不更新任何模型参数,直接使用开源模型参数。其架构参考GPT-RE并结合相关实践改写,加入上下文学习以提高准确度,还使用BGE作为向量模型,K-BERT提取文本关键词,拼接召回的相似例子作为上下文输入大模型。 代码实现上,大模型用Qwen2-7B-Instruct,Embedding采用bge-base-zh-v1.5,向量库选择milvus。分类主函数的作用是在向量库中召回相似案例,拼接prompt后输入大模型。 结果方面,使用ICL时accuracy达0.94,比bert文本分类的0.98低0.04,错误类别6个,处理时添加“家居”类别,影响不大;不使用ICL时accuracy为0.88,错误58项,可能与未修改prompt有关。 优点是无需训练即可有较好结果,例子优质、类目界限清晰时效果更佳,适合围绕通用大模型api打造工具;缺点是上限不高,仅针对一个分类任务部署大模型不划算,推理速度慢,icl的token使用多,用收费api会有额外开销。 后续可优化的点是利用key-bert提取的关键词,因为核心词语有时比语意更重要。 参考资料包括
filetype
内容概要:本文详细介绍了哈希表及其相关概念和技术细节,包括哈希表的引入、哈希函数的设计、冲突处理机制、字符串哈希的基础、哈希错误率分析以及哈希的改进与应用。哈希表作为一种高效的数据结构,通过键值对存储数据,能够快速定位和检索。文中讨论了整数键值和字符串键值的哈希方法,特别是字符串哈希中的多项式哈希及其优化方法,如双哈希和子串哈希的快速计算。此外,还探讨了常见的冲突处理方法——拉链法和闭散列法,并提供了C++实现示例。最后,文章列举了哈希在字符串匹配、最长回文子串、最长公共子字符串等问题中的具体应用。 适合人群:计算机科学专业的学生、算法竞赛选手以及有一定编程基础并对数据结构和算法感兴趣的开发者。 使用场景及目标:①理解哈希表的工作原理及其在各种编程任务中的应用;②掌握哈希函数的设计原则,包括如何选择合适的模数和基数;③学会处理哈希冲突的方法,如拉链法和闭散列法;④了解并能运用字符串哈希解决实际问题,如字符串匹配、回文检测等。 阅读建议:由于哈希涉及较多数学知识和编程技巧,建议读者先熟悉基本的数据结构和算法理论,再结合代码实例进行深入理解。同时,在实践中不断尝试不同的哈希策略,对比性能差异,从而更好地掌握哈希技术。
weixin_38638596
  • 粉丝: 3
上传资源 快速赚钱