
Official Code Repository for a CVPR 2019 Paper: Overcoming the Limitations of Mixture Density Networks

22.1 MB | Updated 2024-12-13
CVPR 2019 is one of the top conferences in computer vision and pattern recognition, and this paper studies how to overcome the limitations of Mixture Density Networks in multimodal future prediction. A Mixture Density Network (MDN) models a data distribution by having the network output the parameters of a Gaussian mixture. This gives it a natural advantage on tasks with multiple possible outcomes: in trajectory prediction, for example, there are often several plausible future paths.

Multimodal future prediction, the problem the paper addresses, is currently an active research direction in intelligent systems and autonomous driving. It requires a system to predict several possible future behaviors of an object: the future motion of a pedestrian or vehicle, say, can unfold in multiple ways depending on the current environment and context. Traditional unimodal methods output only a single predicted path, which is often insufficient in a real world full of uncertainty. Multimodal prediction instead provides a set of possible outcomes, giving downstream decision-making more complete information.

The MDN is designed precisely for this kind of problem: by learning the data distribution, it outputs a probability distribution from which multiple candidate trajectories can be sampled. However, although an MDN can in theory produce multimodal output, in practice it runs into limitations such as training instability, high computational cost, and the difficulty of balancing diversity against accuracy.

To address these limitations, the paper proposes a new method to overcome the shortcomings of Mixture Density Networks for multimodal future prediction. The official repository provides the source implementation so that researchers and developers can reproduce the experimental results and build further work on top of it.

To use the code, install Tensorflow-GPU 1.14 and several Python libraries (opencv-python, sklearn, matplotlib, Pillow). TensorFlow-GPU is Google's open-source deep-learning framework with GPU-accelerated computation, well suited to large datasets and complex models such as those used here. opencv-python is the open-source computer-vision library with rich image-processing functionality; sklearn, matplotlib, and Pillow cover machine-learning utilities, plotting, and image handling respectively.

The repository also uses source code obtained from WEMD to compute the SEMD evaluation metric. SEMD is a metric for evaluating the quality of multimodal predictions; WEMD refers to a wavelet-based approximation of the Earth Mover's Distance (EMD), which makes the otherwise expensive EMD computation tractable. Building it requires first unpacking and compiling blitz++.zip to obtain the required library.

Finally, to reproduce the paper's experimental results, the repository provides preprocessed test samples derived from the Stanford Drone Dataset (SDD), which contains rich data on many kinds of objects across multiple scenes and is well suited to multimodal future-prediction research. Unzipping datasets.zip yields a dataset with one folder per test scene, each containing image and feature subfolders.

In summary, the paper and its repository represent notable progress in multimodal future prediction. By overcoming the limitations of Mixture Density Networks, researchers can build better models for predicting multimodal future paths in complex scenes, which matters for path planning and decision-making in intelligent systems and autonomous vehicles.
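To make the MDN idea above concrete, here is a minimal sketch (not the paper's actual model) of how a network's output head can parameterize a 2D Gaussian mixture and sample multiple candidate future positions from it. The function name and shapes are hypothetical; a real MDN would produce `logits`, `means`, and `log_sigmas` from learned layers.

```python
import numpy as np

def mdn_sample(logits, means, log_sigmas, n_samples, rng=None):
    """Sample future (x, y) positions from a diagonal-Gaussian mixture
    parameterized by hypothetical network outputs.

    logits     : (K,)   unnormalized mixture weights
    means      : (K, 2) per-component mean future position
    log_sigmas : (K, 2) per-component log standard deviations
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Softmax over logits gives the mixture weights pi_k.
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()
    # Pick a mixture component per sample, then draw from its Gaussian.
    ks = rng.choice(len(pi), size=n_samples, p=pi)
    sigmas = np.exp(log_sigmas)
    return means[ks] + sigmas[ks] * rng.standard_normal((n_samples, 2))

# Two well-separated modes: samples cluster around both means,
# illustrating a genuinely multimodal prediction.
logits = np.array([0.0, 0.0])
means = np.array([[10.0, 0.0], [-10.0, 0.0]])
log_sigmas = np.full((2, 2), -1.0)
samples = mdn_sample(logits, means, log_sigmas, n_samples=1000)
```

With equal weights, roughly half the sampled trajectories end near each mode, which is exactly the behavior a unimodal predictor cannot express.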
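MDNs are typically trained by minimizing the negative log-likelihood of the observed future under the predicted mixture; the optimization difficulties mentioned above stem largely from this objective. A minimal sketch of that loss for a single diagonal-Gaussian mixture (assumed shapes, not the paper's exact formulation):

```python
import numpy as np

def mdn_nll(pi, means, sigmas, target):
    """Negative log-likelihood of a 2D target under a diagonal-Gaussian
    mixture: pi (K,), means (K, 2), sigmas (K, 2), target (2,)."""
    z = (target - means) / sigmas
    # Log density of each diagonal Gaussian component.
    log_comp = (-0.5 * (z ** 2).sum(axis=1)
                - np.log(sigmas).sum(axis=1)
                - np.log(2 * np.pi))
    # Numerically stable log-sum-exp over weighted components.
    a = np.log(pi) + log_comp
    m = a.max()
    return -(m + np.log(np.exp(a - m).sum()))

pi = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
sigmas = np.ones((2, 2))
nll_near = mdn_nll(pi, means, sigmas, np.array([0.0, 0.0]))   # on a mode
nll_far = mdn_nll(pi, means, sigmas, np.array([20.0, 20.0]))  # off all modes
```

A target near a predicted mode receives a much lower loss than one far from every mode; balancing the weights, means, and variances of all components against this single scalar is one source of the training instability the paper discusses.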
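To illustrate the idea behind the EMD that SEMD builds on: the EMD measures the minimum "work" needed to transform one distribution into another. The repository's WEMD code approximates it with wavelets for 2D distributions; the sketch below is only the exact 1D special case, where the EMD reduces to the L1 distance between cumulative distribution functions.

```python
import numpy as np

def emd_1d(p, q):
    """Exact Earth Mover's Distance between two 1D histograms of equal
    mass on the same unit-spaced grid: the L1 distance between CDFs."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()

d_near = emd_1d([1, 0, 0, 0], [0, 1, 0, 0])  # mass moved one bin
d_far = emd_1d([1, 0, 0, 0], [0, 0, 0, 1])   # mass moved three bins
```

Unlike a pointwise comparison, the EMD grows with how far mass must travel, which is why it is a natural basis for scoring how well a predicted multimodal distribution matches the ground truth.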


Uploader: 量子学园