Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-05-26 | GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning | Yeonjoon Jung et.al. | 2505.20355 | null |
2025-05-26 | Parameter-Efficient Fine-Tuning with Column Space Projection | Junseo Hwang et.al. | 2505.20211 | null |
2025-05-26 | UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models | Xueyan Zhang et.al. | 2505.20154 | null |
2025-05-25 | Optimization-Inspired Few-Shot Adaptation for Large Language Models | Boyan Gao et.al. | 2505.19107 | null |
2025-05-27 | Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs | Jaemin Kim et.al. | 2505.19075 | link |
2025-05-24 | HD-PiSSA: High-Rank Distributed Orthogonal Adaptation | Yiding Wang et.al. | 2505.18777 | null |
2025-05-24 | AuroRA: Breaking Low-Rank Bottleneck of LoRA with Nonlinear Mapping | Haonan Dong et.al. | 2505.18738 | null |
2025-05-24 | LLM-QFL: Distilling Large Language Model for Quantum Federated Learning | Dev Gurung et.al. | 2505.18656 | null |
2025-05-24 | Knowledge Grafting of Large Language Models | Guodong Du et.al. | 2505.18502 | null |
2025-05-22 | Representation Discrepancy Bridging Method for Remote Sensing Image-Text Retrieval | Hailong Ning et.al. | 2505.16756 | null |
2025-05-28 | Larger Is Not Always Better: Exploring Small Open-source Language Models in Logging Statement Generation | Renyi Zhong et.al. | 2505.16590 | null |
2025-05-21 | VP Lab: a PEFT-Enabled Visual Prompting Laboratory for Semantic Segmentation | Niccolo Avogaro et.al. | 2505.15592 | null |
2025-05-21 | CoLA: Collaborative Low-Rank Adaptation | Yiyun Zhou et.al. | 2505.15471 | link |
2025-05-21 | Gated Integration of Low-Rank Adaptation for Continual Learning of Language Models | Yan-Shuo Liang et.al. | 2505.15424 | link |
2025-05-21 | Parameter-Efficient Fine-Tuning of Multispectral Foundation Models for Hyperspectral Image Classification | Bernardin Ligan et.al. | 2505.15334 | null |
2025-05-21 | Few-Shot Adversarial Low-Rank Fine-Tuning of Vision-Language Models | Sajjad Ghiasvand et.al. | 2505.15130 | null |
2025-05-21 | Dual Decomposition of Weights and Singular Value Low Rank Adaptation | Jialong Han et.al. | 2505.14367 | null |
2025-05-21 | OSoRA: Output-Dimension and Singular-Value Initialized Low-Rank Adaptation | Jialong Han et.al. | 2505.14350 | null |
2025-05-23 | ABBA: Highly Expressive Hadamard Product Adaptation for Large Language Models | Raghav Singhal et.al. | 2505.14238 | link |
2025-05-18 | Adaptive parameter-efficient fine-tuning via Hessian-informed subset selection | Shiyun Xu et.al. | 2505.12579 | null |
2025-05-18 | Exploring Sparsity for Parameter Efficient Fine Tuning Using Wavelets | Ahmet Bilican et.al. | 2505.12532 | link |
2025-05-18 | SRLoRA: Subspace Recomposition in Low-Rank Adaptation via Importance-Based Fusion and Reinitialization | Haodong Yang et.al. | 2505.12433 | null |
2025-05-16 | Memory-Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation | Fei Wu et.al. | 2505.11235 | null |
2025-05-15 | Multi-Token Prediction Needs Registers | Anastasios Gerontopoulos et.al. | 2505.10518 | link |
2025-05-14 | PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | Zongqian Li et.al. | 2505.09519 | link |
2025-05-13 | Parameter-Efficient Fine-Tuning of Vision Foundation Model for Forest Floor Segmentation from UAV Imagery | Mohammad Wasil et.al. | 2505.08932 | link |
2025-05-10 | Efficient Telecom Specific LLM: TSLAM-Mini with QLoRA and Digital Twin Data | Vignesh Ethiraj et.al. | 2505.07877 | null |
2025-05-11 | DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models | Junhao Xia et.al. | 2505.07057 | null |
2025-05-10 | Enfoque Odychess: Un método dialéctico, constructivista y adaptativo para la enseñanza del ajedrez con inteligencias artificiales generativas | Ernesto Giralt Hernandez et.al. | 2505.06652 | null |
2025-05-07 | GAPrompt: Geometry-Aware Point Cloud Prompt for 3D Vision Model | Zixiang Ai et.al. | 2505.04119 | link |
2025-05-05 | HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models | Zheng Lin et.al. | 2505.02795 | null |
2025-05-05 | Parameter-Efficient Fine-Tuning with Attributed Patch Semantic Graph for Automated Patch Correctness Assessment | Zhenyu Yang et.al. | 2505.02629 | link |
2025-05-01 | AdCare-VLM: Leveraging Large Vision Language Model (LVLM) to Monitor Long-Term Medication Adherence and Care | Md Asaduzzaman Jabin et.al. | 2505.00275 | link |
2025-04-30 | Enhancing Health Mention Classification Performance: A Study on Advancements in Parameter Efficient Tuning | Reem Abdel-Salam et.al. | 2504.21685 | null |
2025-05-09 | A Systematic Literature Review of Parameter-Efficient Fine-Tuning for Large Code Models | Md Zahidul Haque et.al. | 2504.21569 | link |
2025-04-29 | TT-LoRA MoE: Unifying Parameter-Efficient Fine-Tuning and Sparse Mixture-of-Experts | Pradip Kunwar et.al. | 2504.21190 | null |
2025-04-29 | A Survey on Parameter-Efficient Fine-Tuning for Foundation Models in Federated Learning | Jieming Bian et.al. | 2504.21099 | null |
2025-04-29 | ReCIT: Reconstructing Full Private Data from Gradient in Parameter-Efficient Fine-Tuning of Large Language Models | Jin Xie et.al. | 2504.20570 | null |
2025-04-23 | Parameter-Efficient Checkpoint Merging via Metrics-Weighted Averaging | Shi Jie Yu et.al. | 2504.18580 | null |
2025-04-24 | Fine-tune Smarter, Not Harder: Parameter-Efficient Fine-Tuning for Geospatial Foundation Models | Francesc Marti-Escofet et.al. | 2504.17397 | null |
2025-04-22 | PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning | Song Wang et.al. | 2504.16023 | null |
2025-04-21 | What Lurks Within? Concept Auditing for Shared Diffusion Models at Scale | Xiaoyong Yuan et.al. | 2504.14815 | null |
2025-04-20 | Harnessing Generative LLMs for Enhanced Financial Event Entity Extraction Performance | Soo-joon Choi et.al. | 2504.14633 | null |
2025-04-20 | Vision-Centric Representation-Efficient Fine-Tuning for Robust Universal Foreground Segmentation | Guoyi Zhang et.al. | 2504.14481 | null |
2025-04-19 | PEFT A2Z: Parameter-Efficient Fine-Tuning Survey for Large Language and Vision Models | Nusrat Jahan Prottasha et.al. | 2504.14117 | null |
2025-04-18 | Parameter-Efficient Continual Fine-Tuning: A Survey | Eric Nuertey Coleman et.al. | 2504.13822 | null |
2025-04-17 | All-in-One Transferring Image Compression from Human Perception to Multi-Machine Perception | Jiancheng Zhao et.al. | 2504.12997 | null |
2025-04-15 | A Decade of Wheat Mapping for Lebanon | Hasan Wehbi et.al. | 2504.11366 | null |
2025-04-14 | CROSSAN: Towards Efficient and Effective Adaptation of Multiple Multimodal Foundation Models for Sequential Recommendation | Junchen Fu et.al. | 2504.10307 | link |
2025-04-10 | LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation | Juzheng Zhang et.al. | 2504.07448 | link |
2025-04-14 | DUKAE: DUal-level Knowledge Accumulation and Ensemble for Pre-Trained Model-Based Continual Learning | Songze Li et.al. | 2504.06521 | null |
2025-04-16 | Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation | Xiaoxing Hu et.al. | 2504.06220 | link |
2025-04-11 | AROMA: Autonomous Rank-one Matrix Adaptation | Hao Nan Sheng et.al. | 2504.05343 | link |
2025-04-05 | FISH-Tuning: Enhancing PEFT Methods with Fisher Information | Kang Xue et.al. | 2504.04050 | null |
2025-04-02 | CLIP-SLA: Parameter-Efficient CLIP Adaptation for Continuous Sign Language Recognition | Sarah Alyami et.al. | 2504.01666 | link |
2025-04-01 | Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations | Chongjie Si et.al. | 2504.00851 | null |
2025-04-01 | DynMoLE: Boosting Mixture of LoRA Experts Fine-Tuning with a Hybrid Routing Mechanism | Dengchun Li et.al. | 2504.00661 | link |
2025-03-31 | ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning | Huandong Chang et.al. | 2504.00254 | null |
2025-03-31 | Order Matters: On Parameter-Efficient Image-to-Video Probing for Recognizing Nearly Symmetric Actions | Thinesh Thiyakesan Ponbagavathi et.al. | 2503.24298 | null |
2025-03-29 | Efficient Adaptation For Remote Sensing Visual Grounding | Hasan Moughnieh et.al. | 2503.23083 | null |
2025-03-27 | MSPLoRA: A Multi-Scale Pyramid Low-Rank Adaptation for Efficient Model Fine-Tuning | Jiancheng Zhao et.al. | 2503.21838 | link |
2025-03-26 | Enhancing Multi-modal Models with Heterogeneous MoE Adapters for Fine-tuning | Sashuai Zhou et.al. | 2503.20633 | null |
2025-03-26 | IAP: Improving Continual Learning of Vision-Language Models via Instance-Aware Prompting | Hao Fu et.al. | 2503.20612 | link |
2025-03-26 | Unlocking the Hidden Potential of CLIP in Generalizable Deepfake Detection | Andrii Yermakov et.al. | 2503.19683 | link |
2025-03-25 | VectorFit: Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models | Suhas G Hegde et.al. | 2503.19530 | null |
2025-03-24 | MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning | Xu Han et.al. | 2503.18368 | link |
2025-03-24 | Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models | Zichen Miao et.al. | 2503.18337 | null |
2025-03-23 | Decoupling Angles and Strength in Low-rank Adaptation | Massimo Bini et.al. | 2503.18225 | link |
2025-03-22 | Visual Variational Autoencoder Prompt Tuning | Xi Xiao et.al. | 2503.17650 | null |
2025-03-21 | PE-CLIP: A Parameter-Efficient Fine-Tuning of Vision Language Models for Dynamic Facial Expression Recognition | Ibtissam Saadi et.al. | 2503.16945 | null |
2025-03-20 | VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data Synthesis | Chia-Yi Hsu et.al. | 2503.16195 | null |
2025-03-20 | SALT: Singular Value Adaptation with Low-Rank Transformation | Abdelrahman Elsayed et.al. | 2503.16055 | link |
2025-03-19 | FedSCA: Federated Tuning with Similarity-guided Collaborative Aggregation for Heterogeneous Medical Image Segmentation | Yumin Zhang et.al. | 2503.15390 | null |
2025-03-18 | MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts | Runqi Meng et.al. | 2503.14355 | null |
2025-03-15 | A Survey on Federated Fine-tuning of Large Language Models | Yebo Wu et.al. | 2503.12016 | link |
2025-03-14 | Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages | Matteo Farina et.al. | 2503.11609 | link |
2025-03-14 | MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling | Rachel S. Y. Teo et.al. | 2503.11144 | link |
2025-03-13 | Efficient Federated Fine-Tuning of Large Language Models with Layer Dropout | Shilong Wang et.al. | 2503.10217 | null |
2025-03-13 | Singular Value Fine-tuning for Few-Shot Class-Incremental Learning | Zhiwu Wang et.al. | 2503.10214 | null |
2025-03-12 | Revisiting semi-supervised learning in the era of foundation models | Ping Zhang et.al. | 2503.09707 | link |
2025-03-11 | 1LoRA: Summation Compression for Very Low-Rank Adaptation | Alessio Quercia et.al. | 2503.08333 | null |
2025-03-11 | Adapting Large Language Models for Parameter-Efficient Log Anomaly Detection | Ying Fu Lim et.al. | 2503.08045 | null |
2025-03-09 | MoFE: Mixture of Frozen Experts Architecture | Jean Seo et.al. | 2503.06491 | null |
2025-03-08 | Lifelong Learning with Task-Specific Adaptation: Addressing the Stability-Plasticity Dilemma | Ruiyu Wang et.al. | 2503.06213 | null |
2025-03-07 | Quantum-PEFT: Ultra parameter-efficient fine-tuning | Toshiaki Koike-Akino et.al. | 2503.05431 | null |
2025-03-07 | Personalized Text Generation with Contrastive Activation Steering | Jinghao Zhang et.al. | 2503.05213 | null |
2025-03-06 | TableLoRA: Low-rank Adaptation on Table Structure Understanding for Large Language Models | Xinyi He et.al. | 2503.04396 | null |
2025-03-05 | State-offset Tuning: State-based Parameter-Efficient Fine-Tuning for State Space Models | Wonjun Kang et.al. | 2503.03499 | link |
2025-03-11 | PaCA: Partial Connection Adaptation for Efficient Fine-Tuning | Sunghyeon Woo et.al. | 2503.01905 | null |
2025-03-03 | Parameter-Efficient Fine-Tuning of Large Language Models via Deconvolution in Subspace | Jia-Chen Zhang et.al. | 2503.01419 | null |
2025-03-03 | PROPER: A Progressive Learning Framework for Personalized Large Language Models with Group-Level Adaptation | Linhai Zhang et.al. | 2503.01303 | null |
2025-03-03 | Beyond QA Pairs: Assessing Parameter-Efficient Fine-Tuning for Fact Embedding in LLMs | Shivam Ratnakar et.al. | 2503.01131 | null |
2025-03-09 | Re-Imagining Multimodal Instruction Tuning: A Representation View | Yiyang Liu et.al. | 2503.00723 | link |
2025-02-27 | MobiLLM: Enabling LLM Fine-Tuning on the Mobile Device via Server Assisted Side Tuning | Liang Li et.al. | 2502.20421 | null |
2025-02-26 | LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM | Yehonathan Refael et.al. | 2502.19571 | null |
2025-02-22 | ELBA-Bench: An Efficient Learning Backdoor Attacks Benchmark for Large Language Models | Xuxu Liu et.al. | 2502.18511 | null |
2025-03-04 | SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models | Yuxuan Zhang et.al. | 2502.18168 | null |
2025-02-21 | Sparsity May Be All You Need: Sparse Random Parameter Adaptation | Jesus Rios et.al. | 2502.15975 | link |
2025-02-19 | Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition | Xinyu Tian et.al. | 2502.15809 | null |
2025-02-21 | R-LoRA: Random Initialization of Multi-Head LoRA for Multi-Task Learning | Jinda Liu et.al. | 2502.15455 | link |
2025-02-20 | Generative Modeling of Individual Behavior at Scale | Nabil Omi et.al. | 2502.14998 | null |
2025-02-20 | LoRA-GGPO: Mitigating Double Descent in LoRA Fine-Tuning via Gradient-Guided Perturbation Optimization | Yupeng Chang et.al. | 2502.14538 | link |
2025-02-20 | NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language Models | Chenlu Guo et.al. | 2502.14482 | link |
2025-02-21 | Token Adaptation via Side Graph Convolution for Efficient Fine-tuning of 3D Point Cloud Transformers | Takahiko Furuya et.al. | 2502.14142 | link |
2025-02-19 | LSR-Adapt: Ultra-Efficient Parameter Tuning with Matrix Low Separation Rank Kernel Adaptation | Xin Li et.al. | 2502.13568 | null |
2025-02-24 | GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning | Sifan Zhou et.al. | 2502.12913 | null |
2025-02-17 | Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent | Junda Wu et.al. | 2502.11740 | null |
2025-02-13 | DiffoRA: Enabling Parameter-Efficient LLM Fine-Tuning via Differential Low-Rank Matrix Adaptation | Tangyu Jiang et.al. | 2502.08905 | null |
2025-02-12 | LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits | Zikai Zhou et.al. | 2502.08141 | null |
2025-02-12 | Music for All: Exploring Multicultural Representations in Music Generation Models | Atharva Mehta et.al. | 2502.07328 | link |
2025-02-10 | Model Diffusion for Certifiable Few-shot Transfer Learning | Fady Rezk et.al. | 2502.06970 | null |
2025-02-10 | Hyper Compressed Fine-Tuning of Large Foundation Models with Quantum Inspired Adapters | Snehal Raj et.al. | 2502.06916 | null |
2025-02-10 | KARST: Multi-Kernel Kronecker Adaptation with Re-Scaling Transmission for Visual Classification | Yue Zhu et.al. | 2502.06779 | null |
2025-02-10 | FunduSAM: A Specialized Deep Learning Model for Enhanced Optic Disc and Cup Segmentation in Fundus Images | Jinchen Yu et.al. | 2502.06220 | null |
2025-02-08 | SSH: Sparse Spectrum Adaptation via Discrete Hartley Transformation | Yixian Shen et.al. | 2502.05539 | null |
2025-02-07 | SSMLoRA: Enhancing Low-Rank Adaptation with State Space Model | Jiayang Yu et.al. | 2502.04958 | link |
2025-02-05 | FedP | Royson Lee et.al. | 2502.04387 | null |
2025-02-06 | Rank Also Matters: Hierarchical Configuration for Mixture of Adapter Experts in LLM Fine-Tuning | Peizhuang Cong et.al. | 2502.03884 | null |
2025-02-05 | Bilevel ZOFO: Bridging Parameter-Efficient and Zeroth-Order Techniques for Efficient LLM Fine-Tuning and Meta-Training | Reza Shirkavand et.al. | 2502.03604 | null |
2025-02-05 | RepLoRA: Reparameterizing Low-Rank Adaptation via the Perspective of Mixture of Experts | Tuan Truong et.al. | 2502.03044 | null |
2025-02-13 | Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA | Shuangyi Chen et.al. | 2502.01755 | null |
2025-02-03 | Joint Localization and Activation Editing for Low-Resource Fine-Tuning | Wen Lai et.al. | 2502.01179 | link |
2025-02-03 | PARA: Parameter-Efficient Fine-tuning with Prompt Aware Representation Adjustment | Zequan Liu et.al. | 2502.01033 | null |
2025-02-01 | Parameter Efficient Fine-Tuning of Segment Anything Model | Carolin Teuber et.al. | 2502.00418 | link |
2025-02-01 | Sparse Gradient Compression for Fine-Tuning Large Language Models | David H. Yang et.al. | 2502.00311 | null |
2025-01-30 | Enhancing Large Language Model Efficiency via Symbolic Compression: A Formal Approach Towards Interpretability | Lumen AI et.al. | 2501.18657 | null |
2025-01-23 | Low-Rank Adapters Meet Neural Architecture Search for LLM Compression | J. Pablo Muñoz et.al. | 2501.16372 | link |
2025-01-26 | Fine Tuning without Catastrophic Forgetting via Selective Low Rank Adaptation | Reza Akbarian Bafghi et.al. | 2501.15377 | null |
2025-02-09 | Decentralized Low-Rank Fine-Tuning of Large Language Models | Sajjad Ghiasvand et.al. | 2501.15361 | null |
2025-01-25 | Complementary Subspace Low-Rank Adaptation of Vision-Language Models for Few-Shot Classification | Zhongqi Wang et.al. | 2501.15040 | null |
2025-01-24 | Domain Expansion: Parameter-Efficient Modules as Building Blocks for Composite Domains | Mann Patel et.al. | 2501.14321 | link |
2025-01-23 | Parameter-Efficient Fine-Tuning for Foundation Models | Dan Zhang et.al. | 2501.13787 | link |
2025-01-21 | EDoRA: Efficient Weight-Decomposed Low-Rank Adaptation via Singular Value Decomposition | Hamid Nasiri et.al. | 2501.12067 | link |
2025-01-21 | Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs | Saiful Haq et.al. | 2501.11833 | null |
2025-01-17 | OMoE: Diversifying Mixture of Low-Rank Adaptation by Orthogonal Finetuning | Jinyuan Feng et.al. | 2501.10062 | null |
2025-01-15 | Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models | Zerui Tao et.al. | 2501.08727 | null |
2025-01-14 | TriAdaptLoRA: Brain-Inspired Triangular Adaptive Low-Rank Adaptation for Parameter-Efficient Fine-Tuning | Yao Liang et.al. | 2501.08008 | null |
2025-01-14 | Optimizing Language Models for Grammatical Acceptability: A Comparative Study of Fine-Tuning Techniques | Shobhit Ratan et.al. | 2501.07853 | null |
2025-01-12 | A Hessian-informed hyperparameter optimization for differential learning rate | Shiyun Xu et.al. | 2501.06954 | null |
2025-01-10 | Aggregating Low Rank Adapters in Federated Fine-tuning | Evelyn Trautmann et.al. | 2501.06332 | null |
2025-01-10 | How to Tune a Multilingual Encoder Model for Germanic Languages: A Study of PEFT, Full Fine-Tuning, and Language Adapters | Romina Oji et.al. | 2501.06025 | link |
2025-01-08 | TADFormer: Task-Adaptive Dynamic Transformer for Efficient Multi-Task Learning | Seungmin Baek et.al. | 2501.04293 | null |
2025-01-20 | Spectral-Aware Low-Rank Adaptation for Speaker Verification | Zhe Li et.al. | 2501.03829 | link |
2025-01-06 | ADePT: Adaptive Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning | Pengwei Tang et.al. | 2501.03291 | link |
2025-01-05 | HALO: Hadamard-Assisted Lossless Optimization for Efficient Low-Precision LLM Training and Fine-Tuning | Saleh Ashkboos et.al. | 2501.02625 | link |
2025-01-05 | Efficient Deployment of Large Language Models on Resource-constrained Devices | Zhiwei Yao et.al. | 2501.02438 | null |
2025-01-09 | tCURLoRA: Tensor CUR Decomposition Based Low-Rank Parameter Adaptation and Its Application in Medical Image Segmentation | Guanghua He et.al. | 2501.02227 | null |
2025-01-03 | SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation | Mingjie Li et.al. | 2501.01765 | null |
2025-01-07 | Practical Secure Inference Algorithm for Fine-tuned Large Language Model Based on Fully Homomorphic Encryption | Zhang Ruoyan et.al. | 2501.01672 | null |
2024-12-30 | Disentangling Preference Representation and Text Generation for Efficient Individual Preference Alignment | Jianfei Zhang et.al. | 2412.20834 | link |
2024-12-28 | VELoRA: A Low-Rank Adaptation Approach for Efficient RGB-Event based Recognition | Lan Chen et.al. | 2412.20064 | link |
2025-01-05 | Gradient Weight-normalized Low-rank Projection for Efficient LLM Training | Jia-Hong Huang et.al. | 2412.19616 | link |
2024-12-27 | Parameter Efficient Fine-Tuning for Deep Learning-Based Full-Waveform Inversion | Koustav Ghosal et.al. | 2412.19510 | null |
2024-12-24 | Multi-Point Positional Insertion Tuning for Small Object Detection | Kanoko Goto et.al. | 2412.18090 | null |
2024-12-23 | Interweaving Memories of a Siamese Large Language Model | Xin Song et.al. | 2412.17383 | link |
2024-12-26 | LLMsAgainstHate @ NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification in Devanagari Languages via Parameter Efficient Fine-Tuning of LLMs | Rushendra Sidibomma et.al. | 2412.17131 | link |
2024-12-21 | Label Privacy in Split Learning for Large Models with Parameter-Efficient Training | Philip Zmushko et.al. | 2412.16669 | link |
2024-12-19 | FedPIA -- Permuting and Integrating Adapters leveraging Wasserstein Barycenters for Finetuning Foundation Models in Multi-Modal Federated Learning | Pramit Saha et.al. | 2412.14424 | null |
2024-12-18 | Parameter-efficient Fine-tuning for improved Convolutional Baseline for Brain Tumor Segmentation in Sub-Saharan Africa Adult Glioma Dataset | Bijay Adhikari et.al. | 2412.14100 | link |
2024-12-18 | A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Method-Level Code Smell Detection | Beiqi Zhang et.al. | 2412.13801 | link |
2024-12-18 | Refining Salience-Aware Sparse Fine-Tuning Strategies for Language Models | Xinxin Liu et.al. | 2412.13488 | null |
2024-12-17 | Train More Parameters But Mind Their Placement: Insights into Language Adaptation with PEFT | Jenny Kunz et.al. | 2412.12674 | link |
2024-12-16 | Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering | Jinhe Bi et.al. | 2412.12359 | link |
2024-12-16 | A LoRA is Worth a Thousand Pictures | Chenxi Liu et.al. | 2412.12048 | null |
2024-12-11 | Adaptive Principal Components Allocation with the | Jingjing Zheng et.al. | 2412.08592 | link |
2024-12-10 | PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition | Kartik Narayan et.al. | 2412.07771 | null |
2024-12-10 | MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning | Yufei Ma et.al. | 2412.07405 | null |
2024-12-13 | Crack-EdgeSAM Self-Prompting Crack Segmentation System for Edge Devices | Yingchu Wang et.al. | 2412.07205 | null |
2024-12-08 | Taming Sensitive Weights: Noise Perturbation Fine-tuning for Robust LLM Quantization | Dongwei Wang et.al. | 2412.06858 | null |
2024-12-09 | BoRA: Bi-dimensional Weight-Decomposed Low-Rank Adaptation | Qiushi Wang et.al. | 2412.06441 | null |
2024-12-19 | S | Xinyu Yang et.al. | 2412.06289 | null |
2024-12-08 | KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models | Fan Wang et.al. | 2412.06071 | link |
2024-12-07 | Training-Free Bayesianization for Low-Rank Adapters of Large Language Models | Haizhou Shi et.al. | 2412.05723 | link |
2024-12-06 | PETapter: Leveraging PET-style classification heads for modular few-shot parameter-efficient fine-tuning | Jonas Rieger et.al. | 2412.04975 | null |
2024-12-04 | Prompting Large Language Models for Clinical Temporal Relation Extraction | Jianping He et.al. | 2412.04512 | null |
2024-12-05 | SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning | Seokju Yun et.al. | 2412.04077 | link |
2024-12-04 | Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning | Long Mai et.al. | 2412.03343 | link |
2024-12-03 | Mixture of Physical Priors Adapter for Parameter-Efficient Fine-Tuning | Zhaozhi Wang et.al. | 2412.02759 | null |
2024-12-03 | CPP-UT-Bench: Can LLMs Write Complex Unit Tests in C++? | Vaishnavi Bhargava et.al. | 2412.02735 | null |
2024-12-03 | LoRA Diffusion: Zero-Shot LoRA Synthesis for Diffusion Model Personalization | Ethan Smith et.al. | 2412.02352 | null |
2024-12-03 | A Comprehensive Evaluation of Large Language Models on Aspect-Based Sentiment Analysis | Changzhi Zhou et.al. | 2412.02279 | null |
2024-11-30 | Unified Parameter-Efficient Unlearning for LLMs | Chenlu Ding et.al. | 2412.00383 | link |
2024-11-29 | SURE-VQA: Systematic Understanding of Robustness Evaluation in Medical VQA Tasks | Kim-Celine Kahl et.al. | 2411.19688 | link |
2024-11-28 | Parameter-Efficient Transfer Learning for Music Foundation Models | Yiwei Ding et.al. | 2411.19371 | link |
2024-11-28 | PEFT-as-an-Attack! Jailbreaking Language Models during Federated Parameter-Efficient Fine-Tuning | Shenghui Li et.al. | 2411.19335 | null |
2024-11-28 | Enhancing Parameter-Efficient Fine-Tuning of Vision Transformers through Frequency-Based Adaptation | Son Thai Ly et.al. | 2411.19297 | link |
2024-11-27 | Challenges in Adapting Multilingual LLMs to Low-Resource Languages using LoRA PEFT Tuning | Omkar Khade et.al. | 2411.18571 | null |
2024-11-26 | PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning | Zhen Sun et.al. | 2411.17453 | null |
2024-11-29 | Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning | Hui-Yue Yang et.al. | 2411.17217 | null |
2024-11-25 | Towards Efficient Model-Heterogeneity Federated Learning for Large Models | Ruofan Jia et.al. | 2411.16796 | null |
2024-11-25 | Parameter Efficient Instruction Tuning: An Empirical Study | Pengfei He et.al. | 2411.16775 | link |
2024-11-25 | Graph Adapter of EEG Foundation Models for Parameter Efficient Fine Tuning | Toyotaro Suzumura et.al. | 2411.16155 | null |
2024-11-24 | Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models | Olivia Ma et.al. | 2411.15831 | null |
2024-11-21 | Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation | Seokil Ham et.al. | 2411.15224 | null |
2024-11-22 | LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement | Jieming Bian et.al. | 2411.14961 | null |
2024-11-21 | Multi LoRA Meets Vision: Merging multiple adapters to create a multi task model | Ege Kesim et.al. | 2411.14064 | null |
2024-11-17 | F | Pramit Saha et.al. | 2411.11912 | null |
2024-11-16 | HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization | Huaqin Zhao et.al. | 2411.10696 | null |
2024-11-12 | PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model | Yilun Liu et.al. | 2411.08212 | null |
2024-11-10 | Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques | Daniil Sulimov et.al. | 2411.06445 | null |
2024-11-06 | MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba | Masakazu Yoshimura et.al. | 2411.03855 | link |
2024-11-04 | PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption | Yifan Tan et.al. | 2411.03357 | null |
2024-11-05 | Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation | Junchen Fu et.al. | 2411.02992 | null |
2024-11-04 | Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study | André Storhaug et.al. | 2411.02462 | null |
2024-11-04 | Expanding Sparse Tuning for Low Memory Usage | Shufan Shen et.al. | 2411.01800 | link |
2024-11-15 | Visual Fourier Prompt Tuning | Runjia Zeng et.al. | 2411.01327 | link |
2024-10-31 | CleaR: Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label Learning | Yeachan Kim et.al. | 2411.00873 | null |
2024-10-30 | FPE-LLM: Highly Intelligent Time-Series Forecasting and Language Interaction LLM in Energy Systems | Zihang Qiu et.al. | 2411.00852 | null |
2024-11-01 | Dual Low-Rank Adaptation for Continual Learning with Pre-Trained Models | Huancheng Chen et.al. | 2411.00623 | null |
2024-11-01 | Is Multiple Object Tracking a Matter of Specialization? | Gianluca Mancusi et.al. | 2411.00553 | null |
2024-11-01 | C2A: Client-Customized Adaptation for Parameter-Efficient Federated Learning | Yeachan Kim et.al. | 2411.00311 | link |
2024-10-29 | Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models | Donghoon Kim et.al. | 2411.00029 | null |
2024-10-30 | Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation | Wei Dong et.al. | 2410.22952 | null |
2024-10-30 | MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning | Xujia Wang et.al. | 2410.22782 | null |
2024-10-29 | Meta-Learning Adaptable Foundation Models | Jacob L. Block et.al. | 2410.22264 | null |
2024-10-29 | Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models | Raman Dutt et.al. | 2410.22149 | link |
2024-10-30 | IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models | Hang Guo et.al. | 2410.21759 | link |
2024-10-28 | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Rambod Azimi et.al. | 2410.20777 | link |
2024-10-27 | Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation | Maohao Shen et.al. | 2410.20336 | null |
2024-11-01 | Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies | Luping Wang et.al. | 2410.19878 | null |
2024-10-23 | MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning | Jingfan Zhang et.al. | 2410.18035 | null |
2024-10-22 | Towards Real Zero-Shot Camouflaged Object Segmentation without Camouflaged Annotations | Cheng Lei et.al. | 2410.16953 | null |
2024-10-22 | MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report | Samrajya Thapa et.al. | 2410.16239 | link |
2024-10-21 | Natural GaLore: Accelerating GaLore for memory-efficient LLM Training and Fine-tuning | Arijit Das et.al. | 2410.16029 | link |
2024-10-18 | Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation | Shuai Zhao et.al. | 2410.14425 | link |
2024-10-17 | LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning | Yiming Shi et.al. | 2410.13618 | link |
2024-10-16 | Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models | Sajjad Ghiasvand et.al. | 2410.13097 | null |
2024-10-17 | Prompt Compression for Large Language Models: A Survey | Zongqian Li et.al. | 2410.12388 | link |
2024-10-15 | Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models | Kai Yao et.al. | 2410.11772 | link |
2024-10-15 | LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models | Hossein Abdi et.al. | 2410.11551 | null |
2024-10-15 | RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates | Md Kowsher et.al. | 2410.10075 | link |
2024-10-13 | BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation | Peijia Qin et.al. | 2410.09758 | null |
2024-10-12 | Towards Efficient Visual-Language Alignment of the Q-Former for Visual Reasoning Tasks | Sungkyung Kim et.al. | 2410.09489 | link |
2024-10-15 | MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning | Yaming Yang et.al. | 2410.09437 | link |
2024-10-09 | Parameter-Efficient Fine-Tuning via Selective Discrete Cosine Transform | Yixian Shen et.al. | 2410.09103 | null |
2024-10-04 | BIPEFT: Budget-Guided Iterative Search for Parameter Efficient Fine-Tuning of Large Pretrained Language Models | Aofei Chang et.al. | 2410.09079 | null |
2024-10-11 | Parameter-Efficient Fine-Tuning of State Space Models | Kevin Galim et.al. | 2410.09016 | link |
2024-10-10 | Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning | Dingkang Liang et.al. | 2410.08114 | link |
2024-10-10 | SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture | Jiayi Han et.al. | 2410.07739 | null |
2024-10-10 | Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures | Yiming Chen et.al. | 2410.07698 | link |
2024-10-09 | SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers | Viktoriia Chekalina et.al. | 2410.07383 | link |
2024-10-09 | Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs | Ruijia Niu et.al. | 2410.06431 | null |
2024-10-08 | Are Large Language Models State-of-the-art Quality Estimators for Machine Translation of User-generated Content? | Shenbin Qian et.al. | 2410.06338 | link |
2024-10-15 | LoRTA: Low Rank Tensor Adaptation of Large Language Models | Ignacio Hounie et.al. | 2410.04060 | null |
2024-10-03 | Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection | Tianxiang Chen et.al. | 2410.02330 | link |
2024-10-02 | TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models | Zefang Liu et.al. | 2410.02062 | link |
2024-10-02 | NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models | Yibo Zhong et.al. | 2410.01870 | null |
2024-09-27 | A GEN AI Framework for Medical Note Generation | Hui Yi Leong et.al. | 2410.01841 | null |
2024-10-02 | DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models | Yuxuan Zhang et.al. | 2410.01497 | link |
2024-10-01 | PrivTuner with Homomorphic Encryption and LoRA: A P3EFT Scheme for Privacy-Preserving Parameter-Efficient Fine-Tuning of AI Foundation Models | Yang Li et.al. | 2410.00433 | null |
2024-09-30 | Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation | Pedro Henrique Paiola et.al. | 2410.00163 | null |
2024-09-30 | Resource Allocation for Stable LLM Training in Mobile Edge Computing | Chang Liu et.al. | 2409.20247 | null |
2024-09-30 | Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models | Luohe Shi et.al. | 2409.20181 | link |
2024-09-28 | FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models | Yucheng Xie et.al. | 2409.19289 | null |
2024-10-01 | Backdoor Attacks for LLMs with Weak-To-Strong Knowledge Distillation | Shuai Zhao et.al. | 2409.17946 | null |
2024-09-26 | PEDRO: Parameter-Efficient Fine-tuning with Prompt DEpenDent Representation MOdification | Tianfang Xie et.al. | 2409.17834 | null |
2024-09-30 | Efficient In-Domain Question Answering for Resource-Constrained Environments | Isaac Chung et.al. | 2409.17648 | null |
2024-10-07 | PACE: marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization | Yao Ni et.al. | 2409.17137 | link |
2024-09-25 | Parameter-efficient Bayesian Neural Networks for Uncertainty-aware Depth Estimation | Richard D. Paul et.al. | 2409.17085 | null |
2024-10-02 | Bone: Block Affine Transformation as Parameter Efficient Fine-tuning Methods for Large Language Models | Jiale Kang et.al. | 2409.15371 | link |
2024-09-22 | Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape | Tao Li et.al. | 2409.14396 | null |
2024-10-01 | Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm | Jaehan Kim et.al. | 2409.14119 | link |
2024-09-20 | HUT: A More Computation Efficient Fine-Tuning Method With Hadamard Updated Transformation | Geyuan Zhang et.al. | 2409.13501 | null |
2024-09-17 | THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models | Mengfei Liang et.al. | 2409.11353 | link |
2024-09-17 | LPT++: Efficient Training on Mixture of Long-tailed Experts | Bowen Dong et.al. | 2409.11323 | null |
2024-09-17 | Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models | Divij Gupta et.al. | 2409.11302 | null |
2024-09-18 | Propulsion: Steering LLM with Tiny Fine-Tuning | Md Kowsher et.al. | 2409.10927 | link |
2024-09-16 | From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs | Navya Jain et.al. | 2409.10245 | null |
2024-09-14 | COMFORT: A Continual Fine-Tuning Framework for Foundation Models Targeted at Consumer Healthcare | Chia-Hao Li et.al. | 2409.09549 | null |
2024-09-14 | Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models | Alireza Salemi et.al. | 2409.09510 | link |
2024-09-13 | Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights | Dixi Yao et.al. | 2409.08482 | null |
2024-09-12 | Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation? | Kerem Cekmeceli et.al. | 2409.07960 | link |
2024-09-11 | Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region | Muhammad Akhtar Munir et.al. | 2409.07585 | link |
2024-09-10 | Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts | Assefa Seyoum Wahd et.al. | 2409.06821 | link |
2024-09-11 | Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models | Yao Shu et.al. | 2409.06277 | link |
2024-09-09 | SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values | Chengwei Sun et.al. | 2409.05926 | null |
2024-09-10 | Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment | Zhixian Zhao et.al. | 2409.05015 | null |
2024-09-06 | Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning | Xinyue Liu et.al. | 2409.04574 | null |
2024-09-04 | iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation | Hayeon Jo et.al. | 2409.02838 | null |
2024-09-04 | Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs | Ruoyu Wang et.al. | 2409.02686 | null |
2024-09-04 | Robust Federated Finetuning of Foundation Models via Alternating Minimization of LoRA | Shuangyi Chen et.al. | 2409.02346 | null |
2024-09-02 | Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning | Chongjie Si et.al. | 2409.01035 | link |
2024-08-28 | 3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability | Baohao Liao et.al. | 2409.00119 | link |
2024-08-21 | SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models | Yang Cao et.al. | 2409.00055 | link |
2024-08-30 | MoRe Fine-Tuning with 10x Fewer Parameters | Wenxuan Tan et.al. | 2408.17383 | link |
2024-09-02 | Instant Adversarial Purification with Adversarial Consistency Distillation | Chun Tong Lei et.al. | 2408.17064 | null |
2024-08-28 | Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization | Léo Hemamou et.al. | 2408.15801 | null |
2024-08-27 | GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs | Maxim Zhelnin et.al. | 2408.15300 | link |
2024-08-27 | Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training | Xingliang Lei et.al. | 2408.15011 | null |
2024-08-27 | CVPT: Cross-Attention help Visual Prompt Tuning adapt visual task | Lingyun Huang et.al. | 2408.14961 | link |
2024-08-27 | Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models | Aradhye Agarwal et.al. | 2408.14470 | link |
2024-08-24 | Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings | Sagar Srinivas Sakhinana et.al. | 2408.13622 | null |
2024-08-21 | Positional Prompt Tuning for Efficient 3D Representation Learning | Shaochen Zhang et.al. | 2408.11567 | link |
2024-08-20 | Pluto and Charon: A Time and Memory Efficient Collaborative Edge AI Framework for Personal LLMs Fine-Tuning | Bei Ouyang et.al. | 2408.10746 | null |
2024-08-20 | TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning | Bin Wang et.al. | 2408.10688 | link |
2024-08-19 | TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition | Tianwei Lin et.al. | 2408.09856 | link |
2024-08-16 | Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models | Vladimir Araujo et.al. | 2408.09053 | null |
2024-08-14 | KIND: Knowledge Integration and Diversion in Diffusion Models | Yucheng Xie et.al. | 2408.07337 | link |
2024-08-30 | TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning | Yujie Feng et.al. | 2408.05200 | link |
2024-08-08 | Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models | Yupeng Chang et.al. | 2408.04556 | link |
2024-08-06 | SARA: Singular-Value Based Adaptive Low-Rank Adaption | Jihao Gu et.al. | 2408.03290 | null |
2024-08-06 | Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi | Pranita Deshmukh et.al. | 2408.03172 | null |
2024-08-03 | TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks | Yang Yu et.al. | 2408.01835 | link |
2024-08-02 | MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts | Lin Ning et.al. | 2408.01505 | null |
2024-08-02 | Tensor Train Low-rank Approximation (TT-LoRA): Democratizing AI with Accelerated LLMs | Afia Anjum et.al. | 2408.01008 | null |
2024-07-31 | A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation | Mothilal Asokan et.al. | 2407.21739 | null |
2024-07-28 | Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models | Jifeng Wang et.al. | 2407.19564 | link |
2024-07-24 | Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective | Jingren Liu et.al. | 2407.17120 | null |
2024-07-22 | Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders | Laura Niss et.al. | 2407.15731 | null |
2024-07-21 | Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization | Jiajun Hu et.al. | 2407.15085 | link |
2024-07-16 | InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification | Yujia Hu et.al. | 2407.12882 | link |
2024-07-18 | Turning Generative Models Degenerate: The Power of Data Poisoning Attacks | Shuli Jiang et.al. | 2407.12281 | null |
2024-07-16 | Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification | Naif Alkhunaizi et.al. | 2407.11573 | null |
2024-07-16 | An efficient framework based on large foundation model for cervical cytopathology whole slide image screening | Jialong Huang et.al. | 2407.11486 | link |
2024-07-10 | RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization | Xijie Huang et.al. | 2407.08044 | link |
2024-07-10 | ROSA: Random Subspace Adaptation for Efficient Fine-Tuning | Marawan Gamal Abdel Hameed et.al. | 2407.07802 | link |
2024-07-10 | Parameter Efficient Fine Tuning for Multi-scanner PET to PET Reconstruction | Yumin Kim et.al. | 2407.07517 | null |
2024-07-09 | Reprogramming Distillation for Medical Foundation Models | Yuhang Zhou et.al. | 2407.06504 | link |
2024-07-07 | See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition | Chongjie Si et.al. | 2407.05417 | link |
2024-07-16 | LoRA-GA: Low-Rank Adaptation with Gradient Approximation | Shaowen Wang et.al. | 2407.05000 | link |
2024-07-05 | GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning | Aleksander Ficek et.al. | 2407.04528 | null |
2024-07-04 | Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models | Vorakit Vorakitphan et.al. | 2407.04050 | link |
2024-07-04 | ASteISR: Adapting Single Image Super-resolution Pre-trained Model for Efficient Stereo Image Super-resolution | Yuanbo Zhou et.al. | 2407.03598 | link |
2024-07-03 | Knowledge Composition using Task Vectors with Learned Anisotropic Scaling | Frederic Z. Zhang et.al. | 2407.02880 | link |
2024-07-03 | Exploring the Capabilities of LLMs for Code Change Related Tasks | Lishui Fan et.al. | 2407.02824 | link |
2024-07-02 | FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs | Haodong Chen et.al. | 2407.02157 | null |
2024-07-02 | CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications | Yupeng Cao et.al. | 2407.01953 | null |
2024-07-05 | Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models | Zihan Wang et.al. | 2407.01906 | link |
2024-07-01 | A Fingerprint for Large Language Models | Zhiguang Yang et.al. | 2407.01235 | null |
2024-07-02 | Embedded Prompt Tuning: Towards Enhanced Calibration of Pretrained Models for Medical Images | Wenqiang Zu et.al. | 2407.01003 | link |
2024-06-25 | Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning | Arijit Sehanobish et.al. | 2406.17740 | link |
2024-06-19 | Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks | Liangxin Qian et.al. | 2406.13602 | null |
2024-06-19 | Sparse High Rank Adapters | Kartikeya Bhardwaj et.al. | 2406.13175 | null |
2024-06-18 | Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates | Cristian Meo et.al. | 2406.13046 | null |
2024-06-18 | Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation | Branislav Pecher et.al. | 2406.12471 | link |
2024-06-17 | A Semantic-based Layer Freezing Approach to Efficient Fine-Tuning of Language Models | Jian Gu et.al. | 2406.11753 | null |
2024-06-16 | ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts | Samar Khanna et.al. | 2406.10973 | null |
2024-06-16 | ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation | Yurun Song et.al. | 2406.10785 | link |
2024-06-16 | RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning | Haoyu Wang et.al. | 2406.10777 | link |
2024-06-15 | Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models | Ruchao Fan et.al. | 2406.10507 | link |
2024-06-15 | Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts | Zhaoxuan Tan et.al. | 2406.10471 | link |
2024-06-13 | Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models | Lukas Thede et.al. | 2406.09384 | null |
2024-06-12 | Exploring Fact Memorization and Style Imitation in LLMs Using QLoRA: An Experimental Study and Quality Assessment Methods | Eugene Vyborov et.al. | 2406.08582 | null |
2024-06-12 | The Impact of Initialization on LoRA Finetuning Dynamics | Soufiane Hayou et.al. | 2406.08447 | null |
2024-06-20 | Low-Rank Quantization-Aware Training for LLMs | Yelysei Bondarenko et.al. | 2406.06385 | link |
2024-06-10 | A Parameter-efficient Language Extension Framework for Multilingual ASR | Wei Liu et.al. | 2406.06329 | null |
2024-06-09 | A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Automated Program Repair | Guochang Li et.al. | 2406.05639 | link |
2024-06-07 | Efficient Differentially Private Fine-Tuning of Diffusion Models | Jing Liu et.al. | 2406.05257 | null |
2024-06-07 | CorDA: Context-Oriented Decomposition Adaptation of Large Language Models | Yibo Yang et.al. | 2406.05223 | link |
2024-06-07 | An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models | Xiongtao Zhou et.al. | 2406.05130 | link |
2024-06-07 | MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Jitai Hao et.al. | 2406.04984 | link |
2024-06-06 | Time Sensitive Knowledge Editing through Efficient Finetuning | Xiou Ge et.al. | 2406.04496 | link |
2024-06-06 | VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation | Prashanth Vijayaraghavan et.al. | 2406.04379 | null |
2024-06-10 | Hypernetworks for Personalizing ASR to Atypical Speech | Max Müller-Eberstein et.al. | 2406.04240 | null |
2024-06-06 | Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning | Naibin Gu et.al. | 2406.03792 | link |
2024-06-05 | Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need | Martin Wistuba et.al. | 2406.03216 | null |
2024-06-06 | Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision | Minglei Li et.al. | 2406.03051 | null |
2024-05-31 | Mamba State-Space Models Can Be Strong Downstream Learners | John T. Halloran et.al. | 2406.00209 | null |
2024-05-30 | ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections | Massimo Bini et.al. | 2405.20271 | link |
2024-05-30 | SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors | Vijay Lingam et.al. | 2405.19597 | link |
2024-05-29 | MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection | Raman Dutt et.al. | 2405.19458 | link |
2024-05-29 | MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning | Junjie Wang et.al. | 2405.18897 | link |
2024-05-29 | Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation | Zelin Peng et.al. | 2405.18840 | null |
2024-06-01 | Low-Rank Few-Shot Adaptation of Vision-Language Models | Maxime Zanella et.al. | 2405.18541 | null |
2024-05-28 | Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning | Renzhi Wang et.al. | 2405.18292 | null |
2024-05-28 | VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections | Roy Miles et.al. | 2405.17991 | link |
2024-05-28 | Sparsity- and Hybridity-Inspired Visual Parameter-Efficient Fine-Tuning for Medical Diagnosis | Mingyuan Liu et.al. | 2405.17877 | null |
2024-05-27 | LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Klaudia Bałazy et.al. | 2405.17604 | link |
2024-05-23 | EMR-Merging: Tuning-Free High-Performance Model Merging | Chenyu Huang et.al. | 2405.17461 | link |
2024-05-28 | DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution | Yulong Mao et.al. | 2405.17357 | link |
2024-05-27 | | Runqian Wang et.al. | 2405.17258 | null |
2024-05-30 | Sparse Matrix in Large Language Model Fine-tuning | Haoze He et.al. | 2405.15525 | null |
2024-05-24 | Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation | Abhinav Jain et.al. | 2405.15282 | link |
2024-05-27 | VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks | Yang Li et.al. | 2405.15179 | link |
2024-05-23 | Bitune: Bidirectional Instruction-Tuning | Dawid J. Kopiczko et.al. | 2405.14862 | null |
2024-05-23 | Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference | Ting Liu et.al. | 2405.14700 | link |
2024-05-22 | Spectral Adapter: Fine-Tuning in Spectral Space | Fangzhao Zhang et.al. | 2405.13952 | link |
2024-05-24 | MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | Jingwei Xu et.al. | 2405.13053 | link |
2024-05-20 | FeTT: Continual Class Incremental Learning via Feature Transformation Tuning | Sunyuan Qiang et.al. | 2405.11822 | null |
2024-05-21 | HARIS: Human-Like Attention for Reference Image Segmentation | Mengxi Zhang et.al. | 2405.10707 | null |
2024-05-28 | DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation | Jie Xu et.al. | 2405.06368 | null |
2024-05-09 | Selective Fine-tuning on LLM-labeled Data May Reduce Reliance on Human Annotation: A Case Study Using Schedule-of-Event Table Detection | Bhawesh Kumar et.al. | 2405.06093 | null |
2024-05-09 | Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning | Shibo Jie et.al. | 2405.05615 | link |
2024-05-07 | Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning | Karim Galliamov et.al. | 2405.04126 | link |
2024-05-04 | Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning | Jing Xu et.al. | 2405.02596 | link |
2024-03-16 | Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R | Amirreza Esmaeili et.al. | 2405.01553 | link |
2024-05-02 | NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment | Gerald Shen et.al. | 2405.01481 | link |
2024-04-29 | LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report | Justin Zhao et.al. | 2405.00732 | link |
2024-05-01 | Investigating Automatic Scoring and Feedback using Large Language Models | Gloria Ashiya Katuka et.al. | 2405.00602 | null |
2024-05-01 | MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model | Rajat Sahay et.al. | 2405.00293 | null |
2024-04-30 | SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models | Samir Arora et.al. | 2405.00201 | null |
2024-05-23 | HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning | Chunlin Tian et.al. | 2404.19245 | link |
2024-05-25 | FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition | Yuxuan Yan et.al. | 2404.18848 | null |
2024-04-25 | Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models | Jiawei Chen et.al. | 2404.16385 | null |
2024-05-23 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Dengchun Li et.al. | 2404.15159 | link |
2024-04-22 | ColA: Collaborative Adaptation with Gradient Learning | Enmao Diao et.al. | 2404.13844 | link |
2024-04-23 | Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications | Charith Chandra Sai Balne et.al. | 2404.13506 | null |
2024-04-18 | SKIP: Skill-Localized Prompt Tuning for Inference Speed Boost-Up | Nakyeong Yang et.al. | 2404.11916 | null |
2024-04-16 | Shears: Unstructured Sparsity with Neural Low-rank Adapter Search | J. Pablo Muñoz et.al. | 2404.10934 | link |
2024-04-16 | Exact and Efficient Unlearning for Large Language Model-based Recommendation | Zhiyu Hu et.al. | 2404.10327 | null |
2024-04-15 | LoRA Dropout as a Sparsity Regularizer for Overfitting Control | Yang Lin et.al. | 2404.09610 | null |
2024-04-21 | Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in LLMs | Ahmed Agiza et.al. | 2404.08699 | link |
2024-04-08 | Certified PEFTSmoothing: Parameter-Efficient Fine-Tuning with Randomized Smoothing | Chengyan Fu et.al. | 2404.05350 | null |
2024-04-08 | DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model | Chao Gao et.al. | 2404.05182 | null |
2024-04-12 | Q-PEFT: Query-dependent Parameter Efficient Fine-tuning for Text Reranking with Large Language Models | Zhiyuan Peng et.al. | 2404.04522 | null |
2024-04-05 | Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation | Tong Su et.al. | 2404.04212 | null |
2024-05-22 | ReFT: Representation Finetuning for Language Models | Zhengxuan Wu et.al. | 2404.03592 | link |
2024-06-11 | Personalized LLM Response Generation with Parameterized Memory Injection | Kai Zhang et.al. | 2404.03565 | link |
2024-06-20 | Eigenpruning: an Interpretability-Inspired PEFT Method | Tomás Vergara-Browne et.al. | 2404.03147 | link |
2024-05-28 | PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models | Fanxu Meng et.al. | 2404.02948 | link |
2024-04-03 | Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data | Parth Patwa et.al. | 2404.02422 | null |
2024-04-11 | IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT | Junchen Fu et.al. | 2404.02059 | link |
2024-03-31 | Query-driven Relevant Paragraph Extraction from Legal Judgments | T. Y. S. S Santosh et.al. | 2404.00595 | null |
2024-03-30 | Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune your model unless you have access to GPT-4 | Aryo Pradipta Gema et.al. | 2404.00484 | link |
2024-04-03 | InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning | Yan-Shuo Liang et.al. | 2404.00228 | link |
2024-03-27 | Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation | Mateusz Klimaszewski et.al. | 2403.18804 | link |
2024-03-26 | The Unreasonable Ineffectiveness of the Deeper Layers | Andrey Gromov et.al. | 2403.17887 | null |
2024-04-15 | ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models | Zequan Liu et.al. | 2403.16187 | null |
2024-03-22 | KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation | Xindi Luo et.al. | 2403.14950 | link |
2024-03-22 | A Single Linear Layer Yields Task-Adapted Low-Rank Matrices | Hwichan Kim et.al. | 2403.14946 | null |
2024-03-21 | AutoRE: Document-Level Relation Extraction with Large Language Models | Xue Lilong et.al. | 2403.14888 | link |
2024-04-29 | Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey | Zeyu Han et.al. | 2403.14608 | null |
2024-03-20 | Harnessing Large Language Models for Text-Rich Sequential Recommendation | Zhi Zheng et.al. | 2403.13325 | link |
2024-04-16 | AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models | Zeyu Liu et.al. | 2403.13269 | null |
2024-03-18 | Improving LoRA in Privacy-preserving Federated Learning | Youbang Sun et.al. | 2403.12313 | null |
2024-03-18 | Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation | Wangbo Zhao et.al. | 2403.11808 | link |
2024-03-18 | Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model | Haoyun Xu et.al. | 2403.11621 | null |
2024-03-19 | JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning | Anique Tahir et.al. | 2403.11366 | link |
2024-03-14 | Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks | Tingyu Qu et.al. | 2403.09377 | link |
2024-03-14 | PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation | Yizhe Xiong et.al. | 2403.09192 | link |
2024-03-13 | Data-oriented Dynamic Fine-tuning Parameter Selection Strategy for FISH Mask based Efficient Fine-tuning | Ming Dong et.al. | 2403.08484 | null |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-05-27 | Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers | Wei Pang et.al. | 2505.21497 | null |
2025-05-27 | Be Decisive: Noise-Induced Layouts for Multi-Subject Generation | Omer Dahary et.al. | 2505.21488 | null |
2025-05-27 | PropMolFlow: Property-guided Molecule Generation with Geometry-Complete Flow Matching | Cheng Zeng et.al. | 2505.21469 | null |
2025-05-27 | Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion | Zhanqiu Hu et.al. | 2505.21467 | null |
2025-05-27 | Designing Cyclic Peptides via Harmonic SDE with Atom-Bond Modeling | Xiangxin Zhou et.al. | 2505.21452 | null |
2025-05-27 | CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects | Huaijin Pi et.al. | 2505.21437 | null |
2025-05-27 | Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks | Francesco Cozzi et.al. | 2505.21426 | null |
2025-05-27 | GUARD: Dual-Agent based Backdoor Defense on Chain-of-Thought in Neural Code Generation | Naizhu Jin et.al. | 2505.21425 | null |
2025-05-27 | A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment | Brett Bissey et.al. | 2505.21414 | null |
2025-05-27 | A Convergence Theory for Diffusion Language Models: An Information-Theoretic Perspective | Gen Li et.al. | 2505.21400 | null |
2025-05-28 | OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models | Ziheng Cheng et.al. | 2505.21347 | link |
2025-05-28 | MagicTryOn: Harnessing Diffusion Transformer for Garment-Preserving Video Virtual Try-on | Guangyuan Li et.al. | 2505.21325 | null |
2025-05-27 | Evaluation of LLMs in Medical Text Summarization: The Role of Vocabulary Adaptation in High OOV Settings | Gunjan Balde et.al. | 2505.21242 | null |
2025-05-28 | Custom Representations of Inductive Families | Constantine Theocharis et.al. | 2505.21225 | null |
2025-05-27 | Simulations of the churning mode: toroidally symmetric plasma convection and turbulence around the X-points in a snowflake divertor | D Power et.al. | 2505.21223 | null |
2025-05-26 | Multimodal Federated Learning With Missing Modalities through Feature Imputation Network | Pranav Poudel et.al. | 2505.20232 | null |
2025-05-26 | Continuous Learning for Children's ASR: Overcoming Catastrophic Forgetting with Elastic Weight Consolidation and Synaptic Intelligence | Edem Ahadzi et.al. | 2505.20216 | null |
2025-05-26 | Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking | Pengxiang Li et.al. | 2505.20199 | null |
2025-05-26 | Private Geometric Median in Nearly-Linear Time | Syamantak Kumar et.al. | 2505.20189 | null |
2025-05-26 | Exposing Go's Hidden Bugs: A Novel Concolic Framework | Karolina Gorna et.al. | 2505.20183 | null |
2025-05-26 | Long-Context State-Space Video World Models | Ryan Po et.al. | 2505.20171 | null |
2025-05-26 | MolEditRL: Structure-Preserving Molecular Editing via Discrete Diffusion and Reinforcement Learning | Yuanxin Zhuang et.al. | 2505.20131 | null |
2025-05-26 | Understanding Generalization in Diffusion Models via Probability Flow Distance | Huijie Zhang et.al. | 2505.20123 | null |
2025-05-26 | Proxy-Free GFlowNet | Ruishuo Chen et.al. | 2505.20110 | null |
2025-05-26 | Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning | Ziyi Zhang et.al. | 2505.20107 | null |
2025-05-26 | Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models | Makesh Narsimhan Sreedhar et.al. | 2505.20087 | null |
2025-05-26 | PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation | Hongsong Wang et.al. | 2505.20056 | null |
2025-05-26 | Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion | Zheqi Lv et.al. | 2505.20053 | null |
2025-05-26 | The Many Challenges of Human-Like Agents in Virtual Game Environments | Maciej Świechowski et.al. | 2505.20011 | null |
2025-05-26 | ICDM: Interference Cancellation Diffusion Models for Wireless Semantic Communications | Tong Wu et.al. | 2505.19983 | null |
2025-05-26 | Rethinking Probabilistic Circuit Parameter Learning | Anji Liu et.al. | 2505.19982 | null |
2025-05-26 | UltraVSR: Achieving Ultra-Realistic Video Super-Resolution with Efficient One-Step Diffusion Space | Yong Liu et.al. | 2505.19958 | null |
2025-05-26 | Underwater Diffusion Attention Network with Contrastive Language-Image Joint Learning for Underwater Image Enhancement | Afrah Shaahid et.al. | 2505.19895 | null |
2025-05-26 | A fully automated urban PV parameterization framework for improved estimation of energy production profiles | Bowen Tian et.al. | 2505.19876 | null |
2025-05-26 | StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation | Yi Wu et.al. | 2505.19874 | null |
2025-05-26 | Harnessing the Power of Training-Free Techniques in Text-to-2D Generation for Text-to-3D Generation via Score Distillation Sampling | Junhong Lee et.al. | 2505.19868 | null |
2025-05-23 | Generative Distribution Embeddings | Nic Fishman et.al. | 2505.18150 | null |
2025-05-23 | Stochastic agent-based Monte Carlo simulations for reaction-diffusion models, population dynamics, and epidemic spreading | Mohamed Swailem et.al. | 2505.18145 | null |
2025-05-26 | TokBench: Evaluating Your Visual Tokenizer before Visual Generation | Junfeng Wu et.al. | 2505.18142 | null |
2025-05-23 | One RL to See Them All: Visual Triple Unified Reinforcement Learning | Yan Ma et.al. | 2505.18129 | null |
2025-05-23 | Towards more transferable adversarial attack in black-box manner | Chun Tong Lei et.al. | 2505.18097 | null |
2025-05-23 | DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations | Ziqiao Peng et.al. | 2505.18096 | null |
2025-05-23 | SpikeGen: Generative Framework for Visual Spike Stream Processing | Gaole Dai et.al. | 2505.18049 | null |
2025-05-23 | RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration | Sudarshan Rajagopalan et.al. | 2505.18047 | null |
2025-05-26 | Strictly Constrained Generative Modeling via Split Augmented Langevin Sampling | Matthieu Blanke et.al. | 2505.18017 | null |
2025-05-23 | Segment Anyword: Mask Prompt Inversion for Open-Set Grounded Segmentation | Zhihua Liu et.al. | 2505.17994 | null |
2025-05-23 | To Glue or Not to Glue? Classical vs Learned Image Matching for Mobile Mapping Cameras to Textured Semantic 3D Building Models | Simone Gaisbauer et.al. | 2505.17973 | null |
2025-05-23 | Diffusion Classifiers Understand Compositionality, but Conditions Apply | Yujin Jeong et.al. | 2505.17955 | null |
2025-05-23 | SplatCo: Structure-View Collaborative Gaussian Splatting for Detail-Preserving Rendering of Large-Scale Unbounded Scenes | Haihong Xiao et.al. | 2505.17951 | null |
2025-05-23 | Survival Games: Human-LLM Strategic Showdowns under Severe Resource Scarcity | Zhihong Chen et.al. | 2505.17937 | null |
2025-05-23 | Flexible MOF Generation with Torsion-Aware Flow Matching | Nayoung Kim et.al. | 2505.17914 | null |
2025-05-22 | GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning | Chengqi Duan et.al. | 2505.17022 | link |
2025-05-22 | When Are Concepts Erased From Diffusion Models? | Kevin Lu et.al. | 2505.17013 | null |
2025-05-22 | Guided Diffusion Sampling on Function Spaces with Applications to PDEs | Jiachen Yao et.al. | 2505.17004 | link |
2025-05-22 | Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction | Dong Li et.al. | 2505.16980 | null |
2025-05-22 | Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On | Siqi Wan et.al. | 2505.16977 | link |
2025-05-22 | Creatively Upscaling Images with Global-Regional Priors | Yurui Qian et.al. | 2505.16976 | null |
2025-05-22 | Bigger Isn't Always Memorizing: Early Stopping Overparameterized Diffusion Models | Alessandro Favero et.al. | 2505.16959 | null |
2025-05-22 | From Reality to Virtual Worlds: The Role of Photogrammetry in Game Development | Santiago Berrezueta-Guzman et.al. | 2505.16951 | null |
2025-05-22 | LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning | Zebin You et.al. | 2505.16933 | null |
2025-05-22 | Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks | Hongyuan Tao et.al. | 2505.16901 | null |
2025-05-22 | T2I-ConBench: Text-to-Image Benchmark for Continual Post-training | Zhehao Huang et.al. | 2505.16875 | null |
2025-05-22 | Training-Free Efficient Video Generation via Dynamic Token Carving | Yuechen Zhang et.al. | 2505.16864 | link |
2025-05-22 | Conditional Panoramic Image Generation via Masked Autoregressive Modeling | Chaoyang Wang et.al. | 2505.16862 | null |
2025-05-23 | LaViDa: A Large Diffusion Language Model for Multimodal Understanding | Shufan Li et.al. | 2505.16839 | link |
2025-05-22 | From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Pedagogical Visualization | Haonian Ji et.al. | 2505.16832 | link |
2025-05-21 | Leveraging the Powerful Attention of a Pre-trained Diffusion Model for Exemplar-based Image Colorization | Satoshi Kosugi et.al. | 2505.15812 | link |
2025-05-21 | On the creation of narrow AI: hierarchy and nonlocality of neural network skills | Eric J. Michaud et.al. | 2505.15811 | link |
2025-05-21 | Neural Conditional Transport Maps | Carlos Rodriguez-Pardo et.al. | 2505.15808 | null |
2025-05-21 | Interspatial Attention for Efficient 4D Human Video Generation | Ruizhi Shao et.al. | 2505.15800 | null |
2025-05-21 | VARD: Efficient and Dense Fine-Tuning for Diffusion Models with Value-based RL | Fengyuan Dai et.al. | 2505.15791 | null |
2025-05-21 | Exploring the Innovation Opportunities for Pre-trained Models | Minjung Park et.al. | 2505.15790 | null |
2025-05-21 | IA-T2I: Internet-Augmented Text-to-Image Generation | Chuanhao Li et.al. | 2505.15779 | null |
2025-05-21 | Constructing a 3D Town from a Single Image | Kaizhi Zheng et.al. | 2505.15765 | null |
2025-05-21 | HybridProver: Augmenting Theorem Proving with LLM-Driven Proof Synthesis and Refinement | Jilin Hu et.al. | 2505.15740 | null |
2025-05-21 | Distributionally Robust Planning of Hydrogen-Electrical Microgrids for Sea Islands | Yuchen Dong et.al. | 2505.15733 | null |
2025-05-21 | Can Large Language Models be Effective Online Opinion Miners? | Ryang Heo et.al. | 2505.15695 | null |
2025-05-21 | SwarmDiff: Swarm Robotic Trajectory Planning in Cluttered Environments via Diffusion Transformer | Kang Ding et.al. | 2505.15679 | null |
2025-05-21 | Graph Conditional Flow Matching for Relational Data Generation | Davide Scassola et.al. | 2505.15668 | link |
2025-05-21 | FragFake: A Dataset for Fine-Grained Detection of Edited Images with Vision Language Models | Zhen Sun et.al. | 2505.15644 | link |
2025-05-21 | Trial and Return Option Strategy in Omnichannel Retailing | Yasuyuki Kusuda et.al. | 2505.15597 | null |
2025-05-20 | NExT-Search: Rebuilding User Feedback Ecosystem for Generative AI Search | Sunhao Dai et.al. | 2505.14680 | null |
2025-05-20 | Training-Free Watermarking for Autoregressive Image Generation | Yu Tong et.al. | 2505.14673 | null |
2025-05-21 | General-Reasoner: Advancing LLM Reasoning Across All Domains | Xueguang Ma et.al. | 2505.14652 | null |
2025-05-20 | Enhancing Learned Knowledge in LoRA Adapters Through Efficient Contrastive Decoding on Ascend NPUs | Morgan Lindsay Heisler et.al. | 2505.14620 | null |
2025-05-20 | Towards a Foundation Model for Communication Systems | Davide Buffelli et.al. | 2505.14603 | null |
2025-05-20 | Neural Inverse Scattering with Score-based Regularization | Yuan Gao et.al. | 2505.14560 | null |
2025-05-20 | Dynadiff: Single-stage Decoding of Images from Continuously Evolving fMRI | Marlène Careil et.al. | 2505.14556 | link |
2025-05-20 | GUARD: Constructing Realistic Two-Player Matrix and Security Games for Benchmarking Game-Theoretic Algorithms | Noah Krever et.al. | 2505.14547 | null |
2025-05-20 | NavBench: A Unified Robotics Benchmark for Reinforcement Learning-Based Autonomous Navigation | Matteo El-Hariry et.al. | 2505.14526 | null |
2025-05-21 | Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling | Zhihao Li et.al. | 2505.14521 | null |
2025-05-20 | Learning to Integrate Diffusion ODEs by Averaging the Derivatives | Wenze Liu et.al. | 2505.14502 | null |
2025-05-20 | A Direct Comparison of Simultaneously Recorded Scalp, Around-Ear, and In-Ear EEG for Neural Selective Auditory Attention Decoding to Speech | Simon Geirnaert et.al. | 2505.14478 | null |
2025-05-20 | Enhancing Interpretability of Sparse Latent Representations with Class Information | Farshad Sangari Abiz et.al. | 2505.14476 | null |
2025-05-20 | CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block Prediction and Controllable Generation | Chihan Huang et.al. | 2505.14455 | null |
2025-05-20 | Compositional amortized inference for large-scale hierarchical Bayesian models | Jonas Arruda et.al. | 2505.14429 | null |
2025-05-19 | Mean Flows for One-step Generative Modeling | Zhengyang Geng et.al. | 2505.13447 | null |
2025-05-19 | Synthetic-Powered Predictive Inference | Meshi Bashari et.al. | 2505.13432 | link |
2025-05-20 | A Practical Guide for Incorporating Symmetry in Diffusion Policy | Dian Wang et.al. | 2505.13431 | null |
2025-05-19 | Faster Video Diffusion with Trainable Sparse Attention | Peiyuan Zhang et.al. | 2505.13389 | null |
2025-05-19 | Restoration Score Distillation: From Corrupted Diffusion Pretraining to One-Step High-Quality Generation | Yasi Zhang et.al. | 2505.13377 | null |
2025-05-20 | Minimum-Excess-Work Guidance | Christopher Kolloff et.al. | 2505.13375 | null |
2025-05-20 | One-Step Offline Distillation of Diffusion-based Models via Koopman Modeling | Nimrod Berman et.al. | 2505.13358 | link |
2025-05-19 | Frequency-Dependent Power Consumption Modeling of CMOS Transmitters for WNoC Architectures | Mohammad Shahmoradi et.al. | 2505.13310 | null |
2025-05-19 | FlowPure: Continuous Normalizing Flows for Adversarial Purification | Elias Collaert et.al. | 2505.13280 | link |
2025-05-19 | Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models | Lucas Berry et.al. | 2505.13273 | null |
2025-05-19 | Distilling a speech and music encoder with task arithmetic | Fabian Ritter-Gutierrez et.al. | 2505.13270 | null |
2025-05-19 | Correlation between U/Th and Pb/Os abundance ratios and its application in nuclear cosmochronology | Y. Y. Huang et.al. | 2505.13269 | null |
2025-05-19 | JNLP at SemEval-2025 Task 11: Cross-Lingual Multi-Label Emotion Detection Using Generative Models | Jieying Xue et.al. | 2505.13244 | link |
2025-05-19 | Conformalized Decision Risk Assessment | Wenbin Zhou et.al. | 2505.13243 | null |
2025-05-19 | Diffusion Models with Double Guidance: Generate with aggregated datasets | Yanfeng Yang et.al. | 2505.13213 | null |
2025-05-16 | Evolution of granular salty ice analogs for Europa: Sublimation and Irradiation | Rafael Ottersberg et.al. | 2505.11498 | null |
2025-05-16 | QVGen: Pushing the Limit of Quantized Video Generative Models | Yushi Huang et.al. | 2505.11497 | null |
2025-05-16 | Unsupervised Detection of Distribution Shift in Inverse Problems using Diffusion Models | Shirin Shoushtari et.al. | 2505.11482 | null |
2025-05-16 | PSDiffusion: Harmonized Multi-Layer Image Generation via Layout and Appearance Alignment | Dingbang Huang et.al. | 2505.11468 | null |
2025-05-16 | Exploiting Radiance Fields for Grasp Generation on Novel Synthetic Views | Abhishek Kashyap et.al. | 2505.11467 | null |
2025-05-16 | Disentangling Reasoning and Knowledge in Medical Large Language Models | Rahul Thapa et.al. | 2505.11462 | null |
2025-05-16 | A Generative Framework for Causal Estimation via Importance-Weighted Diffusion Distillation | Xinran Song et.al. | 2505.11444 | null |
2025-05-19 | MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production | Chao Jin et.al. | 2505.11432 | null |
2025-05-16 | Diff-Unfolding: A Model-Based Score Learning Framework for Inverse Problems | Yuanhao Wang et.al. | 2505.11393 | null |
2025-05-16 | LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models | Danilo de Oliveira et.al. | 2505.11391 | null |
2025-05-16 | MARRS: Masked Autoregressive Unit-based Reaction Synthesis | Y. B. Wang et.al. | 2505.11334 | null |
2025-05-16 | Decomposing stimulus-specific sensory neural information via diffusion models | Steeve Laquitaine et.al. | 2505.11309 | null |
2025-05-16 | Effective Probabilistic Time Series Forecasting with Fourier Adaptive Noise-Separated Diffusion | Xinyan Wang et.al. | 2505.11306 | null |
2025-05-16 | A Fourier Space Perspective on Diffusion Models | Fabian Falck et.al. | 2505.11278 | null |
2025-05-16 | DRAGON: A Large-Scale Dataset of Realistic Images Generated by Diffusion Models | Giulia Bertazzini et.al. | 2505.11257 | null |
2025-05-15 | 3D-Fixup: Advancing Photo Editing with 3D Priors | Yen-Chi Cheng et.al. | 2505.10566 | null |
2025-05-15 | T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback | Zehan Wang et.al. | 2505.10561 | null |
2025-05-15 | Style Customization of Text-to-Vector Generation with Image Diffusion Priors | Peiying Zhang et.al. | 2505.10558 | null |
2025-05-15 | Flowing Through Hilbert Space: Quantum-Enhanced Generative Models for Lattice Field Theory | Jehu Martinez et.al. | 2505.10553 | null |
2025-05-15 | Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data | Yiwen Liu et.al. | 2505.10551 | link |
2025-05-15 | Pharmacophore-Conditioned Diffusion Model for Ligand-Based De Novo Drug Design | Amira Alakhdar et.al. | 2505.10545 | null |
2025-05-15 | LibIQ: Toward Real-Time Spectrum Classification in O-RAN dApps | Filippo Olimpieri et.al. | 2505.10537 | null |
2025-05-15 | Optimal Pricing With Impatient Customers | Jieqi Di et.al. | 2505.10514 | null |
2025-05-15 | CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs | Raman Dutt et.al. | 2505.10496 | link |
2025-05-15 | Campus AI vs Commercial AI: A Late-Breaking Study on How LLM As-A-Service Customizations Shape Trust and Usage Patterns | Leon Hannig et.al. | 2505.10490 | null |
2025-05-15 | UniEval: Unified Holistic Evaluation for Unified Multimodal Understanding and Generation | Yi Li et.al. | 2505.10483 | null |
2025-05-15 | Fine-tuning Diffusion Policies with Backpropagation Through Diffusion Timesteps | Ningyuan Yang et.al. | 2505.10482 | null |
2025-05-15 | AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenge | Ranjan Sapkota et.al. | 2505.10468 | null |
2025-05-15 | Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models | Zemin Huang et.al. | 2505.10446 | null |
2025-05-15 | Score-based diffusion nowcasting of GOES imagery | Randy J. Chase et.al. | 2505.10432 | null |
2025-05-14 | Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors | Nicolas Dupuis et.al. | 2505.09610 | null |
2025-05-14 | LightLab: Controlling Light Sources in Images with Diffusion Models | Nadav Magar et.al. | 2505.09608 | null |
2025-05-14 | Don't Forget your Inverse DDIM for Image Editing | Guillermo Gomez-Trenado et.al. | 2505.09571 | null |
2025-05-14 | BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset | Jiuhai Chen et.al. | 2505.09568 | link |
2025-05-14 | CXMArena: Unified Dataset to benchmark performance in realistic CXM Scenarios | Raghav Garg et.al. | 2505.09436 | link |
2025-05-14 | Efficient Modelling of Lyman-α opacity fluctuations during late EoR | Barun Maity et.al. | 2505.09369 | null |
2025-05-14 | Diffusion Recommender Models and the Illusion of Progress: A Concerning Study of Reproducibility and a Conceptual Mismatch | Michael Benigni et.al. | 2505.09364 | null |
2025-05-14 | Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis | Bingxin Ke et.al. | 2505.09358 | link |
2025-05-14 | APR-Transformer: Initial Pose Estimation for Localization in Complex Environments through Absolute Pose Regression | Srinivas Ravuri et.al. | 2505.09356 | link |
2025-05-14 | Access Controls Will Solve the Dual-Use Dilemma | Evžen Wybitul et.al. | 2505.09341 | null |
2025-05-14 | DCSNet: A Lightweight Knowledge Distillation-Based Model with Explainable AI for Lung Cancer Diagnosis from Histopathological Images | Sadman Sakib Alif et.al. | 2505.09334 | null |
2025-05-14 | TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving | Xuefeng Jiang et.al. | 2505.09315 | null |
2025-05-14 | Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations | Panqi Chen et.al. | 2505.09284 | null |
2025-05-14 | A Note on Semantic Diffusion | Alexander P. Ryjov et.al. | 2505.09283 | null |
2025-05-14 | Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation | Guan Gui et.al. | 2505.09263 | link |
2025-05-13 | PCS-UQ: Uncertainty Quantification via the Predictability-Computability-Stability Framework | Abhineet Agarwal et.al. | 2505.08784 | null |
2025-05-13 | Generative Molecular Design with Steerable and Granular Synthesizability Control | Jeff Guo et.al. | 2505.08774 | link |
2025-05-13 | Controllable Image Colorization with Instance-aware Texts and Masks | Yanru An et.al. | 2505.08705 | null |
2025-05-13 | A Survey of Deep Learning for Complex Speech Spectrograms | Yuying Xie et.al. | 2505.08694 | null |
2025-05-13 | A Machine Learning Pipeline for Molecular Property Prediction using ChemXploreML | Aravindh Nivas Marimuthu et.al. | 2505.08688 | null |
2025-05-13 | Comparison of laser system designs for quantum technologies: BECCAL flight system vs. BECCAL ground test bed | Victoria A. Henderson et.al. | 2505.08680 | null |
2025-05-13 | A Study of Data-driven Methods for Inventory Optimization | Lee Yeung Ping et.al. | 2505.08673 | null |
2025-05-13 | WixQA: A Multi-Dataset Benchmark for Enterprise Retrieval-Augmented Generation | Dvir Cohen et.al. | 2505.08643 | null |
2025-05-13 | Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models | Donghoon Kim et.al. | 2505.08622 | null |
2025-05-13 | Boosting Zero-shot Stereo Matching using Large-scale Mixed Images Sources in the Real World | Yuran Wang et.al. | 2505.08607 | null |
2025-05-13 | Extract the Best, Discard the Rest: CSI Feedback with Offline Large AI Models | Jialin Zhuang et.al. | 2505.08566 | null |
2025-05-13 | DFA-CON: A Contrastive Learning Approach for Detecting Copyright Infringement in DeepFake Art | Haroon Wahab et.al. | 2505.08552 | null |
2025-05-13 | Diffusion-assisted Model Predictive Control Optimization for Power System Real-Time Operation | Linna Xu et.al. | 2505.08535 | null |
2025-05-13 | Building-Block Aware Generative Modeling for 3D Crystals of Metal Organic Frameworks | Chenru Duan et.al. | 2505.08531 | link |
2025-05-14 | Improving Data Fidelity via Diffusion Model-based Correction and Super-Resolution | Wuzhe Xu et.al. | 2505.08526 | null |
2025-05-12 | H | Yiyang Lu et.al. | 2505.07819 | null |
2025-05-12 | DanceGRPO: Unleashing GRPO on Visual Generation | Zeyue Xue et.al. | 2505.07818 | null |
2025-05-12 | Pixel Motion as Universal Representation for Robot Control | Kanchana Ranasinghe et.al. | 2505.07817 | null |
2025-05-12 | Continuous Visual Autoregressive Generation via Score Maximization | Chenze Shao et.al. | 2505.07812 | link |
2025-05-12 | Improving Trajectory Stitching with Flow Models | Reece O'Mahoney et.al. | 2505.07802 | null |
2025-05-12 | Learning Dynamics in Continual Pre-Training for Large Language Models | Xingjin Wang et.al. | 2505.07796 | null |
2025-05-12 | Synthesizing Diverse Network Flow Datasets with Scalable Dynamic Multigraph Generation | Arya Grayeli et.al. | 2505.07777 | null |
2025-05-12 | LAMM-ViT: AI Face Detection via Layer-Aware Modulation of Region-Guided Attention | Jiangling Zhang et.al. | 2505.07734 | null |
2025-05-12 | ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models | Ozgur Kara et.al. | 2505.07652 | null |
2025-05-12 | Markov Modelling Approach for Queues with Correlated Service Times -- the | Suman Thapa et.al. | 2505.07648 | null |
2025-05-12 | Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models | Riccardo Passoni et.al. | 2505.07615 | null |
2025-05-12 | SecReEvalBench: A Multi-turned Security Resilience Evaluation Benchmark for Large Language Models | Huining Cui et.al. | 2505.07584 | null |
2025-05-12 | Noise Optimized Conditional Diffusion for Domain Adaptation | Lingkun Luo et.al. | 2505.07548 | null |
2025-05-12 | RAI: Flexible Agent Framework for Embodied AI | Kajetan Rachwał et.al. | 2505.07532 | link |
2025-05-13 | FLUXSynID: A Framework for Identity-Controlled Synthetic Face Generation with Document and Live Images | Raul Ismayilov et.al. | 2505.07530 | link |
2025-05-09 | Long time behaviour of Mean Field Games with fractional diffusion | Olav Ersland et.al. | 2505.06183 | null |
2025-05-09 | DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models | Radu Alexandru Rosu et.al. | 2505.06166 | null |
2025-05-09 | Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study | Faeze Ghorbanpour et.al. | 2505.06149 | null |
2025-05-09 | Constraints to Lorentz violation and ultrahigh-energy electrons in D-foamy space-times | Chengyi Li et.al. | 2505.06121 | null |
2025-05-09 | Photovoltaic Defect Image Generator with Boundary Alignment Smoothing Constraint for Domain Shift Mitigation | Dongying Li et.al. | 2505.06117 | null |
2025-05-09 | FIC-TSC: Learning Time Series Classification with Fisher Information Constraint | Xiwen Chen et.al. | 2505.06114 | null |
2025-05-09 | Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation | Kunpeng Qiu et.al. | 2505.06068 | link |
2025-05-09 | Droplet Outbursts from Onion Cutting | Zixuan Wu et.al. | 2505.06016 | null |
2025-05-09 | Offline Multi-agent Reinforcement Learning via Score Decomposition | Dan Qiao et.al. | 2505.05968 | null |
2025-05-09 | GEORCE: A Fast New Control Algorithm for Computing Geodesics | Frederik Möbius Rygaard et.al. | 2505.05961 | link |
2025-05-09 | Summarisation of German Judgments in conjunction with a Class-based Evaluation | Bianca Steffes et.al. | 2505.05947 | link |
2025-05-09 | Autoencoder-Based Hybrid Replay for Class-Incremental Learning | Milad Khademi Nori et.al. | 2505.05926 | null |
2025-05-09 | A 3D pocket-aware and evolutionary conserved interaction guided diffusion model for molecular optimization | Anjie Qiao et.al. | 2505.05874 | null |
2025-05-09 | Screening Mechanisms on White Dwarfs: Symmetron & Dilaton | Joan Bachs-Esteban et.al. | 2505.05871 | null |
2025-05-09 | Generative Discovery of Partial Differential Equations by Learning from Math Handbooks | Hao Xu et.al. | 2505.05869 | null |
2025-05-08 | SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation | Yonwoo Choi et.al. | 2505.05475 | link |
2025-05-08 | 3D Scene Generation: A Survey | Beichen Wen et.al. | 2505.05474 | link |
2025-05-08 | DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion | Qitao Zhao et.al. | 2505.05473 | null |
2025-05-08 | Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation | Chao Liao et.al. | 2505.05472 | null |
2025-05-08 | Denoising Diffusion Probabilistic Models for Coastal Inundation Forecasting | Kazi Ashik Islam et.al. | 2505.05381 | null |
2025-05-08 | SDR-RDMA: Software-Defined Reliability Architecture for Planetary Scale RDMA Communication | Mikhail Khalilov et.al. | 2505.05366 | null |
2025-05-08 | Modelling and Verifying Neuronal Archetypes in Coq | Abdorrahim Bahrami et.al. | 2505.05362 | link |
2025-05-08 | SmartTrap: Automated Precision Experiments with Optical Tweezers | Martin Selin et.al. | 2505.05290 | null |
2025-05-08 | Diffusion Model Quantization: A Review | Qian Zeng et.al. | 2505.05215 | link |
2025-05-08 | EAM: Enhancing Anything with Diffusion Transformers for Blind Super-Resolution | Haizhen Xie et.al. | 2505.05209 | null |
2025-05-08 | Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt | Joel Z. Leibo et.al. | 2505.05197 | null |
2025-05-08 | Overcoming Dimensional Factorization Limits in Discrete Diffusion Models through Quantum Joint Distribution Learning | Chuangtao Chen et.al. | 2505.05151 | link |
2025-05-08 | Research on Anomaly Detection Methods Based on Diffusion Models | Yi Chen et.al. | 2505.05137 | null |
2025-05-08 | Taming OOD Actions for Offline Reinforcement Learning: An Advantage-Based Approach | Xuyang Chen et.al. | 2505.05126 | null |
2025-05-08 | MDAA-Diff: CT-Guided Multi-Dose Adaptive Attention Diffusion Model for PET Denoising | Xiaolong Niu et.al. | 2505.05112 | null |
2025-05-07 | Score Distillation Sampling for Audio: Source Separation, Synthesis, and Beyond | Jessie Richter-Powell et.al. | 2505.04621 | null |
2025-05-08 | Flexing RISC-V Instruction Subset Processors (RISPs) to Extreme Edge | Alireza Raisiardali et.al. | 2505.04567 | null |
2025-05-07 | Risk-sensitive Reinforcement Learning Based on Convex Scoring Functions | Shanyu Han et.al. | 2505.04553 | null |
2025-05-07 | Text2CT: Towards 3D CT Volume Generation from Free-text Descriptions Using Diffusion Model | Pengfei Guo et.al. | 2505.04522 | null |
2025-05-08 | HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation | Teng Hu et.al. | 2505.04512 | null |
2025-05-07 | Detecting Spelling and Grammatical Anomalies in Russian Poetry Texts | Ilya Koziev et.al. | 2505.04507 | null |
2025-05-07 | Uncovering Key Features for Model-Driven Engineering of Complex Performance Indicators: A Scoping Review | Benito Giunta et.al. | 2505.04498 | null |
2025-05-08 | Defining and Quantifying Creative Behavior in Popular Image Generators | Aditi Ramaswamy et.al. | 2505.04497 | null |
2025-05-07 | Efficient Flow Matching using Latent Variables | Anirban Samaddar et.al. | 2505.04486 | null |
2025-05-08 | FA-KPConv: Introducing Euclidean Symmetries to KPConv via Frame Averaging | Ali Alawieh et.al. | 2505.04485 | null |
2025-05-07 | Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration | Shigeki Karita et.al. | 2505.04457 | link |
2025-05-07 | An Asynchronous Distributed-Memory Parallel Algorithm for k-mer Counting | Souvadra Hati et.al. | 2505.04431 | link |
2025-05-07 | Recognizing Ornaments in Vocal Indian Art Music with Active Annotation | Sumit Kumar et.al. | 2505.04419 | null |
2025-05-07 | Localized Diffusion Models for High Dimensional Distributions Generation | Georg A. Gottwald et.al. | 2505.04417 | null |
2025-05-07 | The Aloe Family Recipe for Open and Specialized Healthcare LLMs | Dario Garcia-Gasulla et.al. | 2505.04388 | null |
2025-05-06 | FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios | Shiyi Zhang et.al. | 2505.03730 | null |
2025-05-06 | Demonstrating ViSafe: Vision-enabled Safety for High-speed Detect and Avoid | Parv Kapoor et.al. | 2505.03694 | null |
2025-05-06 | CaRaFFusion: Improving 2D Semantic Segmentation with Camera-Radar Point Cloud Fusion and Zero-Shot Image Inpainting | Huawei Sun et.al. | 2505.03679 | null |
2025-05-06 | Distribution-Conditional Generation: From Class Distribution to Creative Generation | Fu Feng et.al. | 2505.03667 | null |
2025-05-06 | Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map | Alessandro Simoni et.al. | 2505.03623 | link |
2025-05-07 | PAHA: Parts-Aware Audio-Driven Human Animation with Diffusion Model | Y. B. Wang et.al. | 2505.03603 | null |
2025-05-06 | From Pixels to Polygons: A Survey of Deep Learning Approaches for Medical Image-to-Mesh Reconstruction | Fengming Lin et.al. | 2505.03599 | null |
2025-05-06 | Real-Time Person Image Synthesis Using a Flow Matching Model | Jiwoo Jeong et.al. | 2505.03562 | null |
2025-05-06 | A Comprehensive Survey of Large AI Models for Future Communications: Foundations, Applications and Challenges | Feibo Jiang et.al. | 2505.03556 | link |
2025-05-06 | Efficient Training of Physics-enhanced Neural ODEs via Direct Collocation and Nonlinear Programming | Linus Langenkamp et.al. | 2505.03552 | null |
2025-05-06 | Causal Intervention Framework for Variational Auto Encoder Mechanistic Interpretability | Dip Roy et.al. | 2505.03530 | null |
2025-05-06 | Modality-Guided Dynamic Graph Fusion and Temporal Diffusion for Self-Supervised RGB-T Tracking | Shenglan Li et.al. | 2505.03507 | link |
2025-05-06 | A new membership inference attack that spots memorization in generative and predictive models: Loss-Based with Reference Model algorithm (LBRM) | Faiz Taleb et.al. | 2505.03490 | null |
2025-05-06 | Wasserstein Convergence of Score-based Generative Models under Semiconvexity and Discontinuous Gradients | Stefano Bruno et.al. | 2505.03432 | null |
2025-05-06 | Phenotype-Guided Generative Model for High-Fidelity Cardiac MRI Synthesis: Advancing Pretraining and Clinical Applications | Ziyu Li et.al. | 2505.03426 | null |
2025-05-05 | Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models | Kuofeng Gao et.al. | 2505.02824 | link |
2025-05-05 | MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing | Zinan Guo et.al. | 2505.02823 | link |
2025-05-05 | Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models | Yankai Jiang et.al. | 2505.02753 | link |
2025-05-05 | The use of Artificial Intelligence for Intervention and Assessment in Individuals with ASD | Aggeliki Sideraki et.al. | 2505.02747 | null |
2025-05-05 | Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play | Yemin Shi et.al. | 2505.02707 | link |
2025-05-05 | Hierarchical random measures without tables | Marta Catalano et.al. | 2505.02653 | null |
2025-05-06 | MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation | Mingcheng Li et.al. | 2505.02648 | null |
2025-05-05 | Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era | Chenxi Liu et.al. | 2505.02583 | link |
2025-05-05 | Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities | Xinjie Zhang et.al. | 2505.02567 | link |
2025-05-05 | Bielik v3 Small: Technical Report | Krzysztof Ociepa et.al. | 2505.02550 | null |
2025-05-06 | Resolving Memorization in Empirical Diffusion Model for Manifold Data in High-Dimensional Spaces | Yang Lyu et.al. | 2505.02508 | null |
2025-05-05 | Hypothesis testing and Stein's lemma in general probability theories with Euclidean Jordan algebra and its quantum realization | Kanta Sonoda et.al. | 2505.02487 | null |
2025-05-05 | Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction | Biao Gong et.al. | 2505.02471 | link |
2025-05-05 | Data Augmentation With Back translation for Low Resource languages: A case of English and Luganda | Richard Kimera et.al. | 2505.02463 | null |
2025-05-05 | Predicting the Dynamics of Complex System via Multiscale Diffusion Autoencoder | Ruikun Li et.al. | 2505.02450 | null |
2025-05-02 | GENMO: A GENeralist Model for Human MOtion | Jiefeng Li et.al. | 2505.01425 | null |
2025-05-02 | Computational, Data-Driven, and Physics-Informed Machine Learning Approaches for Microstructure Modeling in Metal Additive Manufacturing | D. Patel et.al. | 2505.01424 | null |
2025-05-02 | VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models | Mohammadreza Teymoorianfard et.al. | 2505.01406 | link |
2025-05-02 | Provable Efficiency of Guidance in Diffusion Models for General Data Distribution | Gen Li et.al. | 2505.01382 | null |
2025-05-02 | Binamix -- A Python Library for Generating Binaural Audio Datasets | Dan Barry et.al. | 2505.01369 | link |
2025-05-02 | FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors | Chenxi Li et.al. | 2505.01322 | null |
2025-05-02 | Model See Model Do: Speech-Driven Facial Animation with Style Control | Yifang Pan et.al. | 2505.01319 | null |
2025-05-02 | ViSA-Flow: Accelerating Robot Skill Learning via Large-Scale Video Semantic Action Flow | Changhe Chen et.al. | 2505.01288 | null |
2025-05-02 | Scoring-Assisted Generative Exploration for Proteins (SAGE-Prot): A Framework for Multi-Objective Protein Optimization via Iterative Sequence Generation and Evaluation | Hocheol Lim et.al. | 2505.01277 | link |
2025-05-02 | Enhancing Obsolescence Forecasting with Deep Generative Data Augmentation: A Semi-Supervised Framework for Low-Data Industrial Applications | Elie Saad et.al. | 2505.01261 | null |
2025-05-05 | Enabling Training-Free Semantic Communication Systems with Generative Diffusion Models | Shunpu Tang et.al. | 2505.01209 | null |
2025-05-02 | A Secured Triad of IoT, Machine Learning, and Blockchain for Crop Forecasting in Agriculture | Najmus Sakib Sizan et.al. | 2505.01196 | null |
2025-05-02 | A Combinatorial Proof of Universal Optimality for Computing a Planar Convex Hull | Ivor van der Hoog et.al. | 2505.01194 | null |
2025-05-02 | FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis | Jiangtong Tan et.al. | 2505.01172 | link |
2025-05-02 | Retrieval-Augmented Generation in Biomedicine: A Survey of Technologies, Datasets, and Clinical Applications | Jiawei He et.al. | 2505.01146 | null |
2025-05-01 | Controllable Weather Synthesis and Removal with Video Diffusion Models | Chih-Hao Lin et.al. | 2505.00704 | null |
2025-05-01 | T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT | Dongzhi Jiang et.al. | 2505.00703 | link |
2025-05-01 | GuideSR: Rethinking Guidance for One-Step High-Fidelity Diffusion-Based Super-Resolution | Aditya Arora et.al. | 2505.00687 | null |
2025-05-01 | Visual Trajectory Prediction of Vessels for Inland Navigation | Alexander Puzicha et.al. | 2505.00599 | null |
2025-05-01 | ParkDiffusion: Heterogeneous Multi-Agent Multi-Modal Trajectory Prediction for Automated Parking using Diffusion Models | Jiarong Wei et.al. | 2505.00586 | null |
2025-05-01 | Safety-Critical Traffic Simulation with Guided Latent Diffusion Model | Mingxing Peng et.al. | 2505.00515 | null |
2025-05-01 | A General Model for Linearly Polarized Optical Vector Beams | Jonathan Nichols et.al. | 2505.00471 | null |
2025-05-01 | A Neural Network Mode for PX4 on Embedded Flight Controllers | Sindre M. Hegre et.al. | 2505.00432 | link |
2025-05-01 | Over-the-Air Inference over Multi-hop MIMO Networks | Chenghong Bian et.al. | 2505.00430 | null |
2025-05-01 | Leveraging Pretrained Diffusion Models for Zero-Shot Part Assembly | Ruiyuan Zhang et.al. | 2505.00426 | null |
2025-05-01 | CSE-SFP: Enabling Unsupervised Sentence Representation Learning via a Single Forward Pass | Bowen Zhang et.al. | 2505.00389 | link |
2025-05-01 | Towards Lightweight Hyperspectral Image Super-Resolution with Depthwise Separable Dilated Convolutional Network | Usman Muhammad et.al. | 2505.00374 | link |
2025-05-01 | Denoising weak lensing mass maps with diffusion model: systematic comparison with generative adversarial network | Shohei D. Aoyama et.al. | 2505.00345 | null |
2025-05-01 | T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation | Xuyang Guo et.al. | 2505.00337 | null |
2025-05-01 | Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution | Luigi Sigillo et.al. | 2505.00334 | null |
2025-04-30 | ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction | Qihao Liu et.al. | 2504.21855 | null |
2025-04-30 | 3D Stylization via Large Reconstruction Model | Ipek Oztas et.al. | 2504.21836 | null |
2025-04-30 | From Aesthetics to Human Preferences: Comparative Perspectives of Evaluating Text-to-Music Systems | Huan Zhang et.al. | 2504.21815 | null |
2025-04-30 | Anatomical Similarity as a New Metric to Evaluate Brain Generative Models | Bahram Jafrasteh et.al. | 2504.21771 | null |
2025-04-30 | MovementVR: An open-source tool for the study of motor control and learning in virtual reality | Cristina Rossi et.al. | 2504.21696 | null |
2025-04-30 | HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation | Haiyang Zhou et.al. | 2504.21650 | link |
2025-04-30 | Diffusion-based Adversarial Identity Manipulation for Facial Privacy Protection | Liqin Wang et.al. | 2504.21646 | null |
2025-04-30 | ODE and PDE models for COVID-19, with reinfection and vaccination process for Cameroon and Germany | Hamadjam Abboubakar et.al. | 2504.21613 | null |
2025-04-30 | Latent Feature-Guided Conditional Diffusion for High-Fidelity Generative Image Semantic Communication | Zehao Chen et.al. | 2504.21577 | null |
2025-04-30 | Generative AI in Financial Institution: A Global Survey of Opportunities, Threats, and Regulation | Bikash Saha et.al. | 2504.21574 | null |
2025-04-30 | FreeBeacon: Efficient Communication and Data Aggregation in Battery-Free IoT | Gaosheng Liu et.al. | 2504.21571 | null |
2025-04-30 | MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance | Mengting Wei et.al. | 2504.21497 | link |
2025-04-30 | DGSolver: Diffusion Generalist Solver with Universal Posterior Sampling for Image Restoration | Hebaixu Wang et.al. | 2504.21487 | link |
2025-04-30 | GarmentDiffusion: 3D Garment Sewing Pattern Generation with Multimodal Diffusion Transformers | Xinyu Li et.al. | 2504.21476 | null |
2025-04-30 | SimPRIVE: a Simulation framework for Physical Robot Interaction with Virtual Environments | Federico Nesti et.al. | 2504.21454 | null |
2025-04-29 | YoChameleon: Personalized Vision and Language Generation | Thao Nguyen et.al. | 2504.20998 | null |
2025-04-29 | TesserAct: Learning 4D Embodied World Models | Haoyu Zhen et.al. | 2504.20995 | null |
2025-04-29 | Trace-of-Thought: Enhanced Arithmetic Problem Solving via Reasoning Distillation From Large to Small Language Models | Tyler McDonald et.al. | 2504.20946 | null |
2025-04-30 | End-to-end Audio Deepfake Detection from RAW Waveforms: a RawNet-Based Approach with Cross-Dataset Evaluation | Andrea Di Pierno et.al. | 2504.20923 | link |
2025-04-29 | Evaluating Generative Models for Tabular Data: Novel Metrics and Benchmarking | Dayananda Herurkar et.al. | 2504.20900 | null |
2025-04-29 | The Leaderboard Illusion | Shivalika Singh et.al. | 2504.20879 | null |
2025-04-29 | AI-GenBench: A New Ongoing Benchmark for AI-Generated Image Detection | Lorenzo Pellegrini et.al. | 2504.20865 | null |
2025-04-29 | Universal language model with the intervention of quantum theory | D. -F. Qin et.al. | 2504.20839 | null |
2025-04-29 | SoccerDiffusion: Toward Learning End-to-End Humanoid Robot Soccer from Gameplay Recordings | Florian Vahl et.al. | 2504.20808 | null |
2025-04-29 | JTreeformer: Graph-Transformer via Latent-Diffusion Model for Molecular Generation | Ji Shi et.al. | 2504.20770 | null |
2025-04-29 | DDPS: Discrete Diffusion Posterior Sampling for Paths in Layered Graphs | Hao Luan et.al. | 2504.20754 | null |
2025-04-29 | Learning a General Model: Folding Clothing with Topological Dynamics | Yiming Liu et.al. | 2504.20720 | null |
2025-04-29 | What's Wrong with Your Synthetic Tabular Data? Using Explainable AI to Evaluate Generative Models | Jan Kapar et.al. | 2504.20687 | link |
2025-04-29 | DiffLiB: High-fidelity differentiable modeling of lithium-ion batteries and efficient gradient-based parameter identification | Weipeng Xu et.al. | 2504.20674 | link |
2025-04-29 | LDPoly: Latent Diffusion for Polygonal Road Outline Extraction in Large-Scale Topographic Mapping | Weiqin Jiao et.al. | 2504.20645 | null |
2025-04-28 | Shopformer: Transformer-Based Framework for Detecting Shoplifting via Human Pose | Narges Rashvand et.al. | 2504.19970 | null |
2025-04-28 | Warm-Starting QAOA with XY Mixers: A Novel Approach for Quantum-Enhanced Vehicle Routing Optimization | Rafael S. do Carmo et.al. | 2504.19934 | null |
2025-04-28 | CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition | Quynh Phung et.al. | 2504.19894 | null |
2025-04-28 | Queue or lounge: strategic design for strategic customer | Riya Sultana et.al. | 2504.19889 | null |
2025-04-28 | DeeCLIP: A Robust and Generalizable Transformer-Based Framework for Detecting AI-Generated Images | Mamadou Keita et.al. | 2504.19876 | link |
2025-04-28 | CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback | Chenhan Jiang et.al. | 2504.19860 | null |
2025-04-28 | Automated Generation of Precedence Graphs in Digital Value Chains for Automotive Production | Cornelius Hake et.al. | 2504.19835 | null |
2025-04-28 | Contextures: The Mechanism of Representation Learning | Runtian Zhai et.al. | 2504.19792 | null |
2025-04-28 | Heterophily-informed Message Passing | Haishan Wang et.al. | 2504.19785 | null |
2025-04-28 | Crafting a Personal Journaling Practice: Negotiating Ecosystems of Materials, Personal Context, and Community in Analog Journaling | Katherine Lin et.al. | 2504.19767 | null |
2025-04-28 | Lossy Beyond Diagonal Reconfigurable Intelligent Surfaces: Modeling and Optimization | Yiyang Peng et.al. | 2504.19744 | null |
2025-04-28 | RepText: Rendering Visual Text via Replicating | Haofan Wang et.al. | 2504.19724 | null |
2025-04-28 | | Madhur Jindal et.al. | 2504.19674 | link |
2025-04-28 | Multimodal Conditioned Diffusive Time Series Forecasting | Chen Su et.al. | 2504.19669 | null |
2025-04-28 | Hardware/Software Co-Design of RISC-V Extensions for Accelerating Sparse DNNs on FPGAs | Muhammad Sabih et.al. | 2504.19659 | null |
2025-04-25 | Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation | Shivam Duggal et.al. | 2504.18509 | null |
2025-04-25 | Action-Minimization Meets Generative Modeling: Efficient Transition Path Sampling with the Onsager-Machlup Functional | Sanjeev Raja et.al. | 2504.18506 | null |
2025-04-25 | LaRI: Layered Ray Intersections for Single-view 3D Geometric Reasoning | Rui Li et.al. | 2504.18424 | null |
2025-04-25 | HepatoGEN: Generating Hepatobiliary Phase MRI with Perceptual and Adversarial Models | Jens Hooge et.al. | 2504.18405 | null |
2025-04-25 | Paradigm shift on Coding Productivity Using GenAI | Liang Yu et.al. | 2504.18404 | null |
2025-04-25 | The Foundation for Developing an Exoskeleton for the Rehabilitation of Temporomandibular Disorders | Paul-Otto Müller et.al. | 2504.18379 | link |
2025-04-25 | Enhanced Sampling, Public Dataset and Generative Model for Drug-Protein Dissociation Dynamics | Maodong Li et.al. | 2504.18367 | null |
2025-04-25 | SSD-Poser: Avatar Pose Estimation with State Space Duality from Sparse Observations | Shuting Zhao et.al. | 2504.18332 | null |
2025-04-25 | STP4D: Spatio-Temporal-Prompt Consistent Modeling for Text-to-4D Gaussian Splatting | Yunze Deng et.al. | 2504.18318 | null |
2025-04-25 | Seeing Soundscapes: Audio-Visual Generation and Separation from Soundscapes Using Audio-Visual Separator | Minjae Kang et.al. | 2504.18283 | null |
2025-04-25 | TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation | Shintaro Ozaki et.al. | 2504.18269 | null |
2025-04-25 | Efficient Single-Pass Training for Multi-Turn Reasoning | Ritesh Goru et.al. | 2504.18246 | null |
2025-04-25 | Optimizing Multi-Round Enhanced Training in Diffusion Models for Improved Preference Understanding | Kun Li et.al. | 2504.18204 | null |
2025-04-25 | Generative AI for Physical-Layer Authentication | Rui Meng et.al. | 2504.18175 | null |
2025-04-25 | Offline Learning of Controllable Diverse Behaviors | Mathieu Petitbois et.al. | 2504.18160 | null |
2025-04-24 | LiDPM: Rethinking Point Diffusion for Lidar Scene Completion | Tetiana Martyniuk et.al. | 2504.17791 | null |
2025-04-24 | Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models | Xu Ma et.al. | 2504.17789 | null |
2025-04-24 | WI2easy: warm inflation dynamics made easy | Gabriel S. Rodrigues et.al. | 2504.17760 | null |
2025-04-24 | User Profiles: The Achilles' Heel of Web Browsers | Dolière Francis Somé et.al. | 2504.17692 | null |
2025-04-24 | DiMeR: Disentangled Mesh Reconstruction Model | Lutao Jiang et.al. | 2504.17670 | null |
2025-04-24 | polyGen: A Learning Framework for Atomic-level Polymer Structure Generation | Ayush Jain et.al. | 2504.17656 | null |
2025-04-24 | Beyond Labels: Zero-Shot Diabetic Foot Ulcer Wound Segmentation with Self-attention Diffusion Models and the Potential for Text-Guided Customization | Abderrachid Hamrani et.al. | 2504.17628 | null |
2025-04-24 | Likelihood-Free Variational Autoencoders | Chen Xu et.al. | 2504.17622 | null |
2025-04-24 | Enhancing CNNs robustness to occlusions with bioinspired filters for border completion | Catarina P. Coutinho et.al. | 2504.17619 | null |
2025-04-24 | Mitigating xApp conflicts for efficient network slicing in 6G O-RAN: a graph convolutional-based attention network approach | Sihem Bakri et.al. | 2504.17590 | null |
2025-04-24 | TileLang: A Composable Tiled Programming Model for AI Systems | Lei Wang et.al. | 2504.17577 | null |
2025-04-24 | ESDiff: Encoding Strategy-inspired Diffusion Model with Few-shot Learning for Color Image Inpainting | Junyan Zhang et.al. | 2504.17524 | null |
2025-04-24 | Unveiling Hidden Vulnerabilities in Digital Human Generation via Adversarial Attacks | Zhiying Li et.al. | 2504.17457 | null |
2025-04-24 | 3DV-TON: Textured 3D-Guided Consistent Video Try-on via Diffusion Models | Min Wei et.al. | 2504.17414 | null |
2025-04-24 | DRC: Enhancing Personalized Image Generation via Disentangled Representation Composition | Yiyan Xu et.al. | 2504.17349 | null |
2025-04-23 | Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light | Ali Hassani et.al. | 2504.16922 | null |
2025-04-23 | DreamO: A Unified Framework for Image Customization | Chong Mou et.al. | 2504.16915 | null |
2025-04-23 | BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation | Ruotong Wang et.al. | 2504.16907 | null |
2025-04-23 | Practical approaches for crystal structure predictions with inpainting generation and universal interatomic potentials | Peichen Zhong et.al. | 2504.16893 | null |
2025-04-23 | Situational Preparedness Dynamics for Sequential Tropical Cyclone Hazards | Tianle Duan et.al. | 2504.16878 | null |
2025-04-23 | Planning with Diffusion Models for Target-Oriented Dialogue Systems | Hanwen Du et.al. | 2504.16858 | null |
2025-04-23 | Physically Consistent Humanoid Loco-Manipulation using Latent Diffusion Models | Ilyass Taouil et.al. | 2504.16843 | null |
2025-04-23 | Snorkeling in dark waters: A longitudinal surface exploration of unique Tor Hidden Services (Extended Version) | Alfonso Rodriguez Barredo-Valenzuela et.al. | 2504.16836 | null |
2025-04-23 | Evaluating Autoencoders for Parametric and Invertible Multidimensional Projections | Frederik L. Dennig et.al. | 2504.16831 | null |
2025-04-23 | Advanced Chest X-Ray Analysis via Transformer-Based Image Descriptors and Cross-Model Attention Mechanism | Lakshita Agarwal et.al. | 2504.16774 | null |
2025-04-23 | How Effective are Generative Large Language Models in Performing Requirements Classification? | Waad Alhoshan et.al. | 2504.16768 | null |
2025-04-23 | Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism | Lakshita Agarwal et.al. | 2504.16761 | null |
2025-04-23 | Feature Mixing Approach for Detecting Intraoperative Adverse Events in Laparoscopic Roux-en-Y Gastric Bypass Surgery | Rupak Bose et.al. | 2504.16749 | null |
2025-04-24 | Simple Graph Contrastive Learning via Fractional-order Neural Diffusion Networks | Yanan Zhao et.al. | 2504.16748 | null |
2025-04-23 | MOSAIC: A Skill-Centric Algorithmic Framework for Long-Horizon Manipulation Planning | Itamar Mishani et.al. | 2504.16738 | null |
2025-04-22 | Survey of Video Diffusion Models: Foundations, Implementations, and Applications | Yimu Wang et.al. | 2504.16081 | link |
2025-04-22 | From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning | Le Zhuo et.al. | 2504.16080 | null |
2025-04-22 | Intent-aware Diffusion with Contrastive Learning for Sequential Recommendation | Yuanpeng Qu et.al. | 2504.16077 | null |
2025-04-22 | High-performance training and inference for deep equivariant interatomic potentials | Chuin Wei Tan et.al. | 2504.16068 | null |
2025-04-22 | Boosting Generative Image Modeling via Joint Image-Feature Synthesis | Theodoros Kouzelis et.al. | 2504.16064 | null |
2025-04-22 | Evaluating Vision Language Models (VLMs) for Radiology: A Comprehensive Analysis | Frank Li et.al. | 2504.16047 | null |
2025-04-22 | Efficient Temporal Consistency in Diffusion-Based Video Editing with Adaptor Modules: A Theoretical Framework | Xinyuan Song et.al. | 2504.16016 | null |
2025-04-22 | Deep learning of point processes for modeling high-frequency data | Yoshihiro Gyotoku et.al. | 2504.15944 | null |
2025-04-22 | Adversarial Observations in Weather Forecasting | Erik Imgrund et.al. | 2504.15942 | null |
2025-04-22 | Text-based Animatable 3D Avatars with Morphable Model Alignment | Yiqian Wu et.al. | 2504.15835 | null |
2025-04-22 | Satellite to GroundScape -- Large-scale Consistent Ground View Generation from Satellite Views | Ningli Xu et.al. | 2504.15786 | null |
2025-04-22 | Clifford Group Equivariant Diffusion Models for 3D Molecular Generation | Cong Liu et.al. | 2504.15773 | null |
2025-04-22 | Stochastic Programming for Dynamic Temperature Control of Refrigerated Road Transport | Francesco Giliberto et.al. | 2504.15741 | null |
2025-04-22 | Riemannian Neural Geodesic Interpolant | Jiawen Wu et.al. | 2504.15736 | null |
2025-04-22 | Structure-Preserving Zero-Shot Image Editing via Stage-Wise Latent Injection in Diffusion Models | Dasol Jeong et.al. | 2504.15723 | null |
2025-04-21 | Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction | Vaishnavh Nagarajan et.al. | 2504.15266 | link |
2025-04-21 | Bringing Diversity from Diffusion Models to Semantic-Guided Face Asset Generation | Yunxuan Cai et.al. | 2504.15259 | null |
2025-04-21 | Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators | Yilun Zhou et.al. | 2504.15253 | link |
2025-04-21 | DRAGON: Distributional Rewards Optimize Diffusion Generative Models | Yatong Bai et.al. | 2504.15217 | null |
2025-04-21 | Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs | Marina Sakharova et.al. | 2504.15210 | null |
2025-04-21 | Tiger200K: Manually Curated High Visual Quality Video Dataset from UGC Platform | Xianpan Zhou et.al. | 2504.15182 | null |
2025-04-21 | FaceCraft4D: Animated 3D Facial Avatar Generation from a Single Image | Fei Yin et.al. | 2504.15179 | null |
2025-04-21 | DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution | Miaomiao Cai et.al. | 2504.15176 | null |
2025-04-21 | Automatic Generation of Aerobatic Flight in Complex Environments via Diffusion Models | Yuhang Zhong et.al. | 2504.15138 | null |
2025-04-21 | Robust and Real-time Surface Normal Estimation from Stereo Disparities using Affine Transformations | Csongor Csanad Kariko et.al. | 2504.15121 | null |
2025-04-22 | VistaDepth: Frequency Modulation With Bias Reweighting For Enhanced Long-Range Depth Estimation | Mingxia Zhan et.al. | 2504.15095 | null |
2025-04-21 | Generative Artificial Intelligence for Beamforming in Low-Altitude Economy | Geng Sun et.al. | 2504.15079 | null |
2025-04-21 | SOLIDO: A Robust Watermarking Method for Speech Synthesis via Low-Rank Adaptation | Yue Li et.al. | 2504.15035 | null |
2025-04-21 | Gaussian Shading++: Rethinking the Realistic Deployment Challenge of Performance-Lossless Image Watermark for Diffusion Models | Zijin Yang et.al. | 2504.15026 | null |
2025-04-21 | PIV-FlowDiffuser: Transfer-learning-based denoising diffusion models for PIV | Qianyu Zhu et.al. | 2504.14952 | link |
2025-04-18 | Decoding Vision Transformers: the Diffusion Steering Lens | Ryota Takatsuki et.al. | 2504.13763 | link |
2025-04-18 | ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis | Andrea Rigo et.al. | 2504.13745 | null |
2025-04-18 | MLEP: Multi-granularity Local Entropy Patterns for Universal AI-generated Image Detection | Lin Yuan et.al. | 2504.13726 | null |
2025-04-18 | Magnecko: Design and Control of a Quadrupedal Magnetic Climbing Robot | Stefan Leuthard et.al. | 2504.13672 | null |
2025-04-18 | Word Embedding Techniques for Classification of Star Ratings | Hesham Abdelmotaleb et.al. | 2504.13653 | null |
2025-04-18 | Simulating Before Planning: Constructing Intrinsic User World Model for User-Tailored Dialogue Policy Planning | Tao He et.al. | 2504.13643 | null |
2025-04-18 | SupResDiffGAN a new approach for the Super-Resolution task | Dawid Kopeć et.al. | 2504.13622 | null |
2025-04-18 | Entropic Time Schedulers for Generative Diffusion Models | Dejan Stancevic et.al. | 2504.13612 | null |
2025-04-18 | WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion | Yang Wu et.al. | 2504.13561 | link |
2025-04-18 | Task Assignment and Exploration Optimization for Low Altitude UAV Rescue via Generative AI Enhanced Multi-agent Reinforcement Learning | Xin Tang et.al. | 2504.13554 | null |
2025-04-18 | Beyond One-Hot Labels: Semantic Mixing for Model Calibration | Haoyang Luo et.al. | 2504.13548 | link |
2025-04-18 | Enhancing Multilingual Sentiment Analysis with Explainability for Sinhala, English, and Code-Mixed Content | Azmarah Rizvi et.al. | 2504.13545 | null |
2025-04-18 | MusFlow: Multimodal Music Generation via Conditional Flow Matching | Jiahao Song et.al. | 2504.13535 | null |
2025-04-18 | U-Shape Mamba: State Space Model for faster diffusion | Alex Ergasti et.al. | 2504.13499 | link |
2025-04-18 | Early Timestep Zero-Shot Candidate Selection for Instruction-Guided Image Editing | Joowon Kim et.al. | 2504.13490 | null |
2025-04-17 | Aligning Constraint Generation with Design Intent in Parametric CAD | Evan Casey et.al. | 2504.13178 | null |
2025-04-17 | SemCORE: A Semantic-Enhanced Generative Cross-Modal Retrieval Framework with MLLMs | Haoxuan Li et.al. | 2504.13172 | null |
2025-04-17 | Personalized Text-to-Image Generation with Auto-Regressive Models | Kaiyue Sun et.al. | 2504.13162 | link |
2025-04-17 | Science-T2I: Addressing Scientific Illusions in Image Synthesis | Jialuo Li et.al. | 2504.13129 | null |
2025-04-17 | UniEdit-Flow: Unleashing Inversion and Editing in the Era of Flow Models | Guanlong Jiao et.al. | 2504.13109 | null |
2025-04-17 | RF-DETR Object Detection vs YOLOv12: A Study of Transformer-based and CNN-based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity | Ranjan Sapkota et.al. | 2504.13099 | null |
2025-04-17 | An All-Atom Generative Model for Designing Protein Complexes | Ruizhe Chen et.al. | 2504.13075 | null |
2025-04-18 | SkyReels-V2: Infinite-length Film Generative Model | Guibin Chen et.al. | 2504.13074 | link |
2025-04-17 | ArtistAuditor: Auditing Artist Style Pirate in Text-to-Image Generation Models | Linkang Du et.al. | 2504.13061 | link |
2025-04-17 | Design Topological Materials by Reinforcement Fine-Tuned Generative Model | Haosheng Xu et.al. | 2504.13048 | null |
2025-04-17 | Evidence for sulfur chemistry in the atmosphere of the warm sub-Neptune TOI-270 d | Lukas Felix et.al. | 2504.13039 | null |
2025-04-17 | TTRD3: Texture Transfer Residual Denoising Dual Diffusion Model for Remote Sensing Image Super-Resolution | Yide Liu et.al. | 2504.13026 | link |
2025-04-17 | GSAC: Leveraging Gaussian Splatting for Photorealistic Avatar Creation with Unity Integration | Rendong Zhang et.al. | 2504.12999 | link |
2025-04-17 | QLLM: Do We Really Need a Mixing Network for Credit Assignment in Multi-Agent Reinforcement Learning? | Zhouyang Jiang et.al. | 2504.12961 | null |
2025-04-17 | Systemic risk mitigation in supply chains through network rewiring | Giacomo Zelbi et.al. | 2504.12955 | null |
2025-04-16 | VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate | Zhihang Yuan et.al. | 2504.12259 | link |
2025-04-16 | Cobra: Efficient Line Art COlorization with BRoAder References | Junhao Zhuang et.al. | 2504.12240 | null |
2025-04-16 | Coding-Prior Guided Diffusion Network for Video Deblurring | Yike Liu et.al. | 2504.12222 | null |
2025-04-16 | Validating and monitoring bibliographic and citation data in OpenCitations collections | Ivan Heibi et.al. | 2504.12195 | null |
2025-04-16 | Deep Generative Models for Bayesian Inference on High-Rate Sensor Data: Applications in Automotive Radar and Medical Imaging | Tristan S. W. Stevens et.al. | 2504.12154 | null |
2025-04-16 | Anti-Aesthetics: Protecting Facial Privacy against Customized Text-to-Image Synthesis | Songping Wang et.al. | 2504.12129 | null |
2025-04-16 | A Diffusion-Based Framework for Terrain-Aware Remote Sensing Image Reconstruction | Zhenyu Yu et.al. | 2504.12112 | null |
2025-04-16 | Generalized Visual Relation Detection with Diffusion Models | Kaifeng Gao et.al. | 2504.12100 | null |
2025-04-16 | Generative Deep Learning Framework for Inverse Design of Fuels | Kiran K. Yalamanchi et.al. | 2504.12075 | null |
2025-04-16 | Modular-Cam: Modular Dynamic Camera-view Video Generation with LLM | Zirui Pan et.al. | 2504.12048 | null |
2025-04-17 | Understanding Attention Mechanism in Video Diffusion Models | Bingyan Liu et.al. | 2504.12027 | null |
2025-04-16 | Instruction-augmented Multimodal Alignment for Image-Text and Element Matching | Xinli Yue et.al. | 2504.12018 | null |
2025-04-17 | Dual-Energy Cone-Beam CT Using Two Orthogonal Projection Views: A Phantom Study | Junbo Peng et.al. | 2504.12010 | null |
2025-04-16 | Generative Recommendation with Continuous-Token Diffusion | Haohao Qu et.al. | 2504.12007 | null |
2025-04-16 | R-Meshfusion: Reinforcement Learning Powered Sparse-View Mesh Reconstruction with Diffusion Priors | Haoyang Wang et.al. | 2504.11946 | null |
2025-04-15 | Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception | Ziqi Pang et.al. | 2504.11457 | link |
2025-04-16 | Elucidating the Design Space of Multimodal Protein Language Models | Cheng-Yen Hsieh et.al. | 2504.11454 | null |
2025-04-16 | Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion | An Zhao et.al. | 2504.11447 | link |
2025-04-15 | NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors | Yanrui Bin et.al. | 2504.11427 | null |
2025-04-15 | ADT: Tuning Diffusion Models with Adversarial Supervision | Dazhong Shen et.al. | 2504.11423 | null |
2025-04-15 | VideoPanda: Video Panoramic Diffusion with Multi-view Attention | Kevin Xie et.al. | 2504.11389 | null |
2025-04-15 | Ring Artifacts Correction Based on Global-Local Features Interaction Guidance in the Projection Domain | Yunze Liu et.al. | 2504.11375 | null |
2025-04-15 | Evaluating DAO Sustainability and Longevity Through On-Chain Governance Metrics | Silvio Meneguzzo et.al. | 2504.11341 | null |
2025-04-15 | Autoregressive Distillation of Diffusion Transformers | Yeongmin Kim et.al. | 2504.11295 | link |
2025-04-15 | DeepSelective: Feature Gating and Representation Matching for Interpretable Clinical Prediction | Ruochi Zhang et.al. | 2504.11264 | null |
2025-04-15 | VEXP: A Low-Cost RISC-V ISA Extension for Accelerated Softmax Computation in Transformers | Run Wang et.al. | 2504.11227 | null |
2025-04-15 | Focal Split: Untethered Snapshot Depth from Differential Defocus | Junjie Luo et.al. | 2504.11202 | null |
2025-04-15 | DMAGaze: Gaze Estimation Based on Feature Disentanglement and Multi-Scale Attention | Haohan Chen et.al. | 2504.11160 | null |
2025-04-15 | SAR-to-RGB Translation with Latent Diffusion for Earth Observation | Kaan Aydin et.al. | 2504.11154 | null |
2025-04-15 | Taming Consistency Distillation for Accelerated Human Image Animation | Xiang Wang et.al. | 2504.11143 | null |
2025-04-14 | REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers | Xingjian Leng et.al. | 2504.10483 | null |
2025-04-14 | Online Advanced Labs in Physics | Peter A. Bennett et.al. | 2504.10470 | null |
2025-04-14 | Art3D: Training-Free 3D Generation from Flat-Colored Illustration | Xiaoyan Cong et.al. | 2504.10466 | null |
2025-04-14 | Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing | Taihang Hu et.al. | 2504.10434 | link |
2025-04-14 | MonoDiff9D: Monocular Category-Level 9D Object Pose Estimation via Diffusion Model | Jian Liu et.al. | 2504.10433 | link |
2025-04-14 | AI-Driven Code Refactoring: Using Graph Neural Networks to Enhance Software Maintainability | Gopichand Bandarupalli et.al. | 2504.10412 | null |
2025-04-14 | LLM-driven Constrained Copy Generation through Iterative Refinement | Varun Vasudevan et.al. | 2504.10391 | null |
2025-04-14 | Improving diffusion modeling in all-solid-state lithium batteries: a novel approach for grain boundary effects | Lena Scholz et.al. | 2504.10348 | null |
2025-04-14 | | Chaoran Cheng et.al. | 2504.10283 | null |
2025-04-14 | DiffMOD: Progressive Diffusion Point Denoising for Moving Object Detection in Remote Sensing | Jinyue Zhang et.al. | 2504.10278 | null |
2025-04-14 | When Technologies Are Not Enough: Understanding How Domestic Workers Employ (and Avoid) Online Technologies in Their Work Practices | Mariana Fernandez-Espinosa et.al. | 2504.10265 | null |
2025-04-14 | A Model Zoo of Vision Transformers | Damian Falk et.al. | 2504.10231 | link |
2025-04-14 | Localized Cultural Knowledge is Conserved and Controllable in Large Language Models | Veniamin Veselovsky et.al. | 2504.10191 | null |
2025-04-14 | Efficient Generative Model Training via Embedded Representation Warmup | Deyuan Liu et.al. | 2504.10188 | link |
2025-04-14 | A New Paradigm in IBR Modeling for Power Flow and Short Circuit Analysis | Zahid Javid et.al. | 2504.10181 | null |
2025-04-11 | Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model | Team Seawead et.al. | 2504.08685 | null |
2025-04-11 | Safe Flow Matching: Robot Motion Planning with Control Barrier Functions | Xiaobing Dai et.al. | 2504.08661 | null |
2025-04-11 | Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization | Jialu Li et.al. | 2504.08641 | null |
2025-04-11 | Quantum Fluctuation-enhanced Milli-Kelvin Magnetic Refrigeration in Triangular Lattice Magnet GdBO3 | Weijie Lin et.al. | 2504.08636 | null |
2025-04-11 | Discretization Error Analysis of a High Order Unfitted Space-Time Method for moving domain problems | Fabian Heimann et.al. | 2504.08608 | null |
2025-04-11 | Neural Fidelity Calibration for Informative Sim-to-Real Adaptation | Youwei Yu et.al. | 2504.08604 | null |
2025-04-11 | ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration | Yongsheng Yu et.al. | 2504.08591 | null |
2025-04-11 | COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails | Miguel Espinosa et.al. | 2504.08548 | null |
2025-04-11 | Slicing the Gaussian Mixture Wasserstein Distance | Moritz Piening et.al. | 2504.08544 | link |
2025-04-11 | Discriminator-Free Direct Preference Optimization for Video Diffusion | Haoran Cheng et.al. | 2504.08542 | null |
2025-04-11 | On The Landscape of Spoken Language Models: A Comprehensive Survey | Siddhant Arora et.al. | 2504.08528 | null |
2025-04-11 | TickIt: Leveraging Large Language Models for Automated Ticket Escalation | Fengrui Liu et.al. | 2504.08475 | null |
2025-04-11 | Cut-and-Splat: Leveraging Gaussian Splatting for Synthetic Data Generation | Bram Vanherle et.al. | 2504.08473 | link |
2025-04-11 | On the Design of Diffusion-based Neural Speech Codecs | Pietro Foti et.al. | 2504.08470 | null |
2025-04-11 | Muon-Accelerated Attention Distillation for Real-Time Edge Synthesis via Optimized Latent Diffusion | Weiye Chen et.al. | 2504.08451 | link |
2025-04-10 | PixelFlow: Pixel-Space Generative Models with Flow | Shoufa Chen et.al. | 2504.07963 | link |
2025-04-10 | Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction | Zeren Jiang et.al. | 2504.07961 | link |
2025-04-10 | VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning | Zhong-Yu Li et.al. | 2504.07960 | null |
2025-04-10 | Activating high-power parametric oscillation in photonic-crystal resonators | Grant M. Brodnik et.al. | 2504.07947 | null |
2025-04-10 | GenEAva: Generating Cartoon Avatars with Fine-Grained Facial Expressions from Realistic Diffusion-based Faces | Hao Yu et.al. | 2504.07945 | null |
2025-04-10 | Echo: An Open-Source, Low-Cost Teleoperation System with Force Feedback for Dataset Collection in Robot Learning | Artem Bazhenov et.al. | 2504.07939 | null |
2025-04-10 | Optimal Control For Anti-Abeta Treatment in Alzheimer's Disease using a Reaction-Diffusion Model | Wenrui Hao et.al. | 2504.07913 | null |
2025-04-10 | DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows | Mashrur M. Morshed et.al. | 2504.07894 | null |
2025-04-10 | QubitHammer Attacks: Qubit Flipping Attacks in Multi-tenant Superconducting Quantum Computers | Yizhuo Tan et.al. | 2504.07875 | null |
2025-04-11 | Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs | Yichun Yin et.al. | 2504.07866 | null |
2025-04-10 | A Review of HPC-Accelerated CFD in National Security and Defense | James Afful et.al. | 2504.07837 | null |
2025-04-10 | The ISC Creator: Human-Centered Design of Learning Analytics Interactive Indicator Specification Cards | Shoeb Joarder et.al. | 2504.07811 | null |
2025-04-10 | Revisiting Likelihood-Based Out-of-Distribution Detection by Modeling Representations | Yifan Ding et.al. | 2504.07793 | link |
2025-04-10 | Characterization of the Electronic Noise in the Readout of Resistive Micromegas in the High-Angle Time Projection Chambers of the T2K Experiment | D. Attié et.al. | 2504.07759 | null |
2025-04-10 | Virtual-mask Informed Prior for Sparse-view Dual-Energy CT Reconstruction | Zini Chen et.al. | 2504.07753 | null |
2025-04-09 | Identifying Unknown Stochastic Dynamics via Finite expression methods | Senwei Liang et.al. | 2504.07085 | null |
2025-04-09 | Evaluating Retrieval Augmented Generative Models for Document Queries in Transportation Safety | Chad Melton et.al. | 2504.07022 | null |
2025-04-09 | Latent Diffusion U-Net Representations Contain Positional Embeddings and Anomalies | Jonas Loos et.al. | 2504.07008 | link |
2025-04-09 | A Comparison of Deep Learning Methods for Cell Detection in Digital Cytology | Marco Acerbis et.al. | 2504.06957 | link |
2025-04-09 | PathSegDiff: Pathology Segmentation using Diffusion model representations | Sachin Kumar Danisetty et.al. | 2504.06950 | null |
2025-04-09 | The Importance of Being Discrete: Measuring the Impact of Discretization in End-to-End Differentially Private Synthetic Data | Georgi Ganev et.al. | 2504.06923 | null |
2025-04-09 | Data Augmentation for Fake Reviews Detection in Multiple Languages and Multiple Domains | Ming Liu et.al. | 2504.06917 | null |
2025-04-09 | MedSegFactory: Text-Guided Generation of Medical Image-Mask Pairs | Jiawei Mao et.al. | 2504.06897 | null |
2025-04-09 | EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation | Diljeet Jagpal et.al. | 2504.06861 | null |
2025-04-09 | CasTex: Cascaded Text-to-Texture Synthesis via Explicit Texture Maps and Physically-Based Shading | Mishan Aliev et.al. | 2504.06856 | null |
2025-04-09 | Open Problems and a Hypothetical Path Forward in LLM Knowledge Paradigms | Xiaotian Ye et.al. | 2504.06823 | null |
2025-04-09 | DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation | Wangbo Zhao et.al. | 2504.06803 | link |
2025-04-09 | A Meaningful Perturbation Metric for Evaluating Explainability Methods | Danielle Cohen et.al. | 2504.06800 | null |
2025-04-09 | FedMerge: Federated Personalization via Model Merging | Shutong Chen et.al. | 2504.06768 | null |
2025-04-09 | DIMA: DIffusing Motion Artifacts for unsupervised correction in brain MRI images | Paolo Angella et.al. | 2504.06767 | null |
2025-04-08 | OmniSVG: A Unified Scalable Vector Graphics Generation Model | Yiying Yang et.al. | 2504.06263 | null |
2025-04-08 | Transfer between Modalities with MetaQueries | Xichen Pan et.al. | 2504.06256 | null |
2025-04-08 | Electronic Structure Guided Inverse Design Using Generative Models | Shuyi Jia et.al. | 2504.06249 | link |
2025-04-08 | From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models | Chejian Xu et.al. | 2504.06214 | null |
2025-04-08 | WoundAmbit: Bridging State-of-the-Art Semantic Segmentation and Real-World Wound Care | Vanessa Borst et.al. | 2504.06185 | null |
2025-04-08 | Deploying Chatbots in Customer Service: Adoption Hurdles and Simple Remedies | Evgeny Kagan et.al. | 2504.06145 | null |
2025-04-08 | QGen Studio: An Adaptive Question-Answer Generation, Training and Evaluation Platform | Movina Moses et.al. | 2504.06136 | null |
2025-04-08 | FaceCloak: Learning to Protect Face Templates | Sudipta Banerjee et.al. | 2504.06131 | null |
2025-04-08 | OSDM-MReg: Multimodal Image Registration based One Step Diffusion Model | Xiaochen Wei et.al. | 2504.06027 | null |
2025-04-08 | CamContextI2V: Context-aware Controllable Video Generation | Luis Denninger et.al. | 2504.06022 | link |
2025-04-08 | Note on the Universality of Parameterized IQP Circuits with Hidden Units for Generating Probability Distributions | Andrii Kurkin et.al. | 2504.05997 | null |
2025-04-08 | An Empirical Study of GPT-4o Image Generation Capabilities | Sixiang Chen et.al. | 2504.05979 | link |
2025-04-08 | Diffusion Based Ambiguous Image Segmentation | Jakob Lønborg Christensen et.al. | 2504.05977 | null |
2025-04-08 | Adaptive Extended Kalman Filtering for Battery State of Charge Estimation on STM32 | António Barros et.al. | 2504.05936 | null |
2025-04-08 | Pushing JWST to the extremes: search and scrutiny of bright galaxy candidates at z | M. Castellano et.al. | 2504.05893 | null |
2025-04-07 | CREA: A Collaborative Multi-Agent Framework for Creative Content Generation with Diffusion Models | Kavana Venkatesh et.al. | 2504.05306 | null |
2025-04-07 | Gaussian Mixture Flow Matching Models | Hansheng Chen et.al. | 2504.05304 | link |
2025-04-07 | Dimension-Free Convergence of Diffusion Models for Approximate Gaussian Mixtures | Gen Li et.al. | 2504.05300 | null |
2025-04-07 | Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling | Hengran Zhang et.al. | 2504.05216 | null |
2025-04-07 | P2Mark: Plug-and-play Parameter-intrinsic Watermarking for Neural Speech Generation | Yong Ren et.al. | 2504.05197 | null |
2025-04-07 | Learning symmetries in datasets | Veronica Sanz et.al. | 2504.05174 | null |
2025-04-07 | DDPM Score Matching and Distribution Learning | Sinho Chewi et.al. | 2504.05161 | null |
2025-04-07 | DA2Diff: Exploring Degradation-aware Adaptive Diffusion Priors for All-in-One Weather Restoration | Jiamei Xiong et.al. | 2504.05135 | null |
2025-04-07 | Graph-based Diffusion Model for Collaborative Filtering | Xuan Zhang et.al. | 2504.05029 | null |
2025-04-07 | RS-RAG: Bridging Remote Sensing Imagery and Comprehensive Knowledge with a Multi-Modal Dataset and Retrieval-Augmented Generation Model | Congcong Wen et.al. | 2504.04988 | null |
2025-04-07 | Low-Rate Semantic Communication with Codebook-based Conditional Generative Models | Kailang Ye et.al. | 2504.04977 | null |
2025-04-08 | REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning | Jihyun Lee et.al. | 2504.04956 | null |
2025-04-07 | A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization | Wenyuan Xu et.al. | 2504.04950 | null |
2025-04-07 | One Quantizer is Enough: Toward a Lightweight Audio Codec | Linwei Zhai et.al. | 2504.04949 | link |
2025-04-07 | Video-Bench: Human-Aligned Video Generation Benchmark | Hui Han et.al. | 2504.04907 | null |
2025-04-04 | MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models | Wulin Xie et.al. | 2504.03641 | null |
2025-04-04 | Enhancing Causal Effect Estimation with Diffusion-Generated Data | Li Chen et.al. | 2504.03630 | null |
2025-04-04 | Quantifying the uncertainty of model-based synthetic image quality metrics | Ciaran Bench et.al. | 2504.03623 | null |
2025-04-04 | VISTA-OCR: Towards generative and interactive end to end OCR models | Laziz Hamdi et.al. | 2504.03621 | null |
2025-04-04 | Autonomous and Self-Adapting System for Synthetic Media Detection and Attribution | Aref Azizpour et.al. | 2504.03615 | null |
2025-04-04 | Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal | Yuyang Hu et.al. | 2504.03607 | null |
2025-04-04 | HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration | Boyuan Wang et.al. | 2504.03536 | null |
2025-04-04 | Diffusion Active Learning: Towards Data-Driven Experimental Design in Computed Tomography | Luis Barba et.al. | 2504.03491 | null |
2025-04-04 | BUFF: Bayesian Uncertainty Guided Diffusion Probabilistic Model for Single Image Super-Resolution | Zihao He et.al. | 2504.03490 | null |
2025-04-04 | Structured Legal Document Generation in India: A Model-Agnostic Wrapper Approach with VidhikDastaavej | Shubham Kumar Nigam et.al. | 2504.03486 | null |
2025-04-04 | Dynamic Importance in Diffusion U-Net for Enhanced Image Synthesis | Xi Wang et.al. | 2504.03471 | link |
2025-04-04 | D-Garment: Physics-Conditioned Latent Diffusion for Dynamic Garment Deformations | Antoine Dumoulin et.al. | 2504.03468 | null |
2025-04-04 | Generating ensembles of spatially-coherent in-situ forecasts using flow matching | David Landry et.al. | 2504.03463 | null |
2025-04-04 | Conditioning Diffusions Using Malliavin Calculus | Jakiw Pidstrigach et.al. | 2504.03461 | null |
2025-04-04 | QuinID: Enabling FDMA-Based Fully Parallel RFID with Frequency-Selective Antenna | Xin Na et.al. | 2504.03412 | link |
2025-04-03 | Concept Lancet: Image Editing with Compositional Representation Transplant | Jinqi Luo et.al. | 2504.02828 | null |
2025-04-03 | Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization | Kangle Deng et.al. | 2504.02817 | null |
2025-04-03 | F-ViTA: Foundation Model Guided Visible to Thermal Translation | Jay N. Paranjape et.al. | 2504.02801 | link |
2025-04-03 | Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model | Shengjun Zhang et.al. | 2504.02764 | null |
2025-04-03 | MD-ProjTex: Texturing 3D Shapes with Multi-Diffusion Projection | Ahmet Burak Yildirim et.al. | 2504.02762 | null |
2025-04-03 | Echoes of the hidden: Uncovering coordination beyond network structure | Shahar Somin et.al. | 2504.02757 | null |
2025-04-04 | RBT4DNN: Requirements-based Testing of Neural Networks | Nusrat Jahan Mozumder et.al. | 2504.02737 | link |
2025-04-03 | Pushing the Limit of PPG Sensing in Sedentary Conditions by Addressing Poor Skin-sensor Contact | Manh Pham Hung et.al. | 2504.02735 | null |
2025-04-03 | RoSMM: A Robust and Secure Multi-Modal Watermarking Framework for Diffusion Models | ZhongLi Fang et.al. | 2504.02640 | null |
2025-04-03 | Variational Online Mirror Descent for Robust Learning in Schrödinger Bridge | Dong-Sig Han et.al. | 2504.02618 | null |
2025-04-03 | Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation | Jiwoo Chung et.al. | 2504.02612 | null |
2025-04-03 | Bridging the Gap between Gaussian Diffusion Models and Universal Quantization for Image Compression | Lucas Relic et.al. | 2504.02579 | null |
2025-04-03 | MAD: Makeup All-in-One with Cross-Domain Diffusion Model | Bo-Kai Ruan et.al. | 2504.02545 | null |
2025-04-03 | High Numerical Aperture Achromatic Meta-Devices through Dispersion Compensation | Yuzhong Wang et.al. | 2504.02535 | null |
2025-04-04 | ARCANE: Adaptive RISC-V Cache Architecture for Near-memory Extensions | Vincenzo Petrolo et.al. | 2504.02533 | null |
2025-04-02 | Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis | Niluthpol Chowdhury Mithun et.al. | 2504.01960 | null |
2025-04-03 | VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step | Hanyang Wang et.al. | 2504.01956 | null |
2025-04-02 | A Unified Approach to Analysis and Design of Denoising Markov Models | Yinuo Ren et.al. | 2504.01938 | null |
2025-04-03 | ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement | Runhui Huang et.al. | 2504.01934 | null |
2025-04-02 | Gen-C: Populating Virtual Worlds with Generative Crowds | Andreas Panayiotou et.al. | 2504.01924 | null |
2025-04-03 | Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation | Baban Gain et.al. | 2504.01919 | null |
2025-04-02 | Multi-fidelity Parameter Estimation Using Conditional Diffusion Models | Caroline Tatsuoka et.al. | 2504.01894 | null |
2025-04-02 | A Diffusion-Based Framework for Occluded Object Movement | Zheng-Peng Duan et.al. | 2504.01873 | null |
2025-04-02 | Interpreting Emergent Planning in Model-Free Reinforcement Learning | Thomas Bush et.al. | 2504.01871 | null |
2025-04-02 | BOGausS: Better Optimized Gaussian Splatting | Stéphane Pateux et.al. | 2504.01844 | null |
2025-04-02 | YourBench: Easy Custom Evaluation Sets for Everyone | Sumuk Shashidhar et.al. | 2504.01833 | link |
2025-04-02 | Implicit Bias Injection Attacks against Text-to-Image Diffusion Models | Huayang Huang et.al. | 2504.01819 | link |
2025-04-02 | DISINFOX: an open-source threat exchange platform serving intelligence on disinformation and influence operations | Felipe Sánchez González et.al. | 2504.01803 | null |
2025-04-02 | The protein escape process at the ribosomal exit tunnel has conserved mechanisms across the domains of life | Phuong Thuy Bui et.al. | 2504.01731 | null |
2025-04-02 | An Adaptive Proximal Inexact Gradient Framework and Its Application to Per-Antenna Constrained Joint Beamforming and Compression Design | Xilai Fan et.al. | 2504.01721 | null |
2025-03-31 | Consistent Subject Generation via Contrastive Instantiated Concepts | Lee Hsin-Ying et.al. | 2503.24387 | null |
2025-03-31 | Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation | Shengqiong Wu et.al. | 2503.24379 | null |
2025-03-31 | InstructRestore: Region-Customized Image Restoration with Human Instructions | Shuaizheng Liu et.al. | 2503.24357 | link |
2025-03-31 | Enhancing Image Resolution of Solar Magnetograms: A Latent Diffusion Model Approach | Francesco Pio Ramunno et.al. | 2503.24271 | link |
2025-04-01 | Visual Acoustic Fields | Yuelei Li et.al. | 2503.24270 | null |
2025-03-31 | Pre-training with 3D Synthetic Data: Learning 3D Point Cloud Instance Segmentation from 3D Synthetic Scenes | Daichi Otsuka et.al. | 2503.24229 | null |
2025-03-31 | AI-Assisted Colonoscopy: Polyp Detection and Segmentation using Foundation Models | Uxue Delaquintana-Aramendi et.al. | 2503.24138 | link |
2025-03-31 | Grounding Agent Reasoning in Image Schemas: A Neurosymbolic Approach to Embodied Cognition | François Olivier et.al. | 2503.24110 | null |
2025-03-31 | Controlled Latent Diffusion Models for 3D Porous Media Reconstruction | Danilo Naiff et.al. | 2503.24083 | link |
2025-03-31 | COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation | Siqi Zhang et.al. | 2503.24065 | null |
2025-03-31 | ReaLM: Reliable and Efficient Large Language Model Inference with Statistical Algorithm-Based Fault Tolerance | Tong Xie et.al. | 2503.24053 | link |
2025-03-31 | Automated Discovery of Tactic Libraries for Interactive Theorem Proving | Yutong Xin et.al. | 2503.24036 | null |
2025-03-31 | DenseFormer: Learning Dense Depth Map from Sparse Depth and Image via Conditional Diffusion Model | Ming Yuan et.al. | 2503.23993 | null |
2025-03-31 | Two-wheel-driven Electric Superbike Powertrain Optimization | Adelmo Niccolai et.al. | 2503.23984 | null |
2025-04-02 | Machine Learning-assisted High-speed Combinatorial Optimization with Ising Machines for Dynamically Changing Problems | Yohei Hamakawa et.al. | 2503.23966 | null |
2025-03-28 | DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness | Ruining Li et.al. | 2503.22677 | null |
2025-03-28 | Zero4D: Training-Free 4D Video Generation From Single Video Using Off-the-Shelf Video Diffusion Model | Jangho Park et.al. | 2503.22622 | null |
2025-03-28 | Generative Latent Neural PDE Solver using Flow Matching | Zijie Li et.al. | 2503.22600 | null |
2025-03-28 | RELD: Regularization by Latent Diffusion Models for Image Restoration | Pasquale Cascarano et.al. | 2503.22563 | null |
2025-03-28 | Deterministic Medical Image Translation via High-fidelity Brownian Bridges | Qisheng He et.al. | 2503.22531 | null |
2025-03-28 | Automated UX Insights from User Research Videos by Integrating Facial Emotion and Text Sentiment | Simran Kaur Ghatoray et.al. | 2503.22510 | null |
2025-03-28 | Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments | Luke Rowe et.al. | 2503.22496 | null |
2025-03-28 | GAITGen: Disentangled Motion-Pathology Impaired Gait Generative Model -- Bringing Motion Generation to the Clinical Domain | Vida Adeli et.al. | 2503.22397 | null |
2025-03-28 | Volumetric Material Decomposition Using Spectral Diffusion Posterior Sampling with a Compressed Polychromatic Forward Model | Xiao Jiang et.al. | 2503.22392 | null |
2025-03-28 | Meta-LoRA: Meta-Learning LoRA Components for Domain-Aware ID Personalization | Barış Batuhan Topal et.al. | 2503.22352 | null |
2025-03-28 | GCRayDiffusion: Pose-Free Surface Reconstruction via Geometric Consistent Ray Diffusion | Li-Heng Chen et.al. | 2503.22349 | null |
2025-03-28 | Semantix: An Energy Guided Sampler for Semantic Style Transfer | Huiang He et.al. | 2503.22344 | null |
2025-03-28 | SKDU at De-Factify 4.0: Natural Language Features for AI-Generated Text-Detection | Shrikant Malviya et.al. | 2503.22338 | link |
2025-03-28 | Imperceptible but Forgeable: Practical Invisible Watermark Forgery via Diffusion Models | Ziping Dong et.al. | 2503.22330 | null |
2025-03-28 | BanglAssist: A Bengali-English Generative AI Chatbot for Code-Switching and Dialect-Handling in Customer Service | Francesco Kruk et.al. | 2503.22283 | null |
2025-03-27 | VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models | Chi-Pin Huang et.al. | 2503.21781 | null |
2025-03-27 | StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion | Ziyu Guo et.al. | 2503.21775 | null |
2025-03-27 | Optimal Stepsize for Diffusion Sampling | Jianning Pei et.al. | 2503.21774 | link |
2025-03-27 | A Unified Image-Dense Annotation Generation Model for Underwater Scenes | Hongkai Lin et.al. | 2503.21771 | link |
2025-03-27 | Exploring the Evolution of Physics Cognition in Video Generation: A Survey | Minghui Lin et.al. | 2503.21765 | link |
2025-03-27 | A Unified Framework for Diffusion Bridge Problems: Flow Matching and Schrödinger Matching into One | Minyoung Kim et.al. | 2503.21756 | null |
2025-03-27 | VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness | Dian Zheng et.al. | 2503.21755 | link |
2025-03-27 | 3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models | Yuhan Zhang et.al. | 2503.21745 | null |
2025-03-27 | Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data | Zhiyuan Ma et.al. | 2503.21694 | link |
2025-03-27 | A Comprehensive Benchmark for RNA 3D Structure-Function Modeling | Luis Wyss et.al. | 2503.21681 | link |
2025-03-27 | A friendly introduction to triangular transport | Maximilian Ramgraber et.al. | 2503.21673 | null |
2025-03-27 | Audio-driven Gesture Generation via Deviation Feature in the Latent Space | Jiahui Chen et.al. | 2503.21616 | null |
2025-03-27 | Critical Iterative Denoising: A Discrete Generative Model Applied to Graphs | Yoann Boget et.al. | 2503.21592 | null |
2025-03-27 | AlignDiff: Learning Physically-Grounded Camera Alignment via Diffusion | Liuyue Xie et.al. | 2503.21581 | null |
2025-03-27 | SyncSDE: A Probabilistic Framework for Diffusion Synchronization | Hyunjun Lee et.al. | 2503.21555 | null |
2025-03-26 | Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency | Tianqi Liu et.al. | 2503.20785 | link |
2025-03-26 | FB-4D: Spatial-Temporal Coherent Dynamic 3D Content Generation with Feature Banks | Jinwei Li et.al. | 2503.20784 | link |
2025-03-26 | PUREPath-B: A Tessellated Bayesian Model for Recovering CMB B-modes over Large Angular Scales of the Sky | Vipin Sudevan et.al. | 2503.20774 | null |
2025-03-26 | Reliable algorithm selection for machine learning-guided design | Clara Fannjiang et.al. | 2503.20767 | null |
2025-03-26 | RecTable: Fast Modeling Tabular Data with Rectified Flow | Masane Fuchi et.al. | 2503.20731 | link |
2025-03-26 | Continual learning via probabilistic exchangeable sequence modelling | Hanwen Xing et.al. | 2503.20725 | null |
2025-03-26 | Dynamic Motion Blending for Versatile Motion Editing | Nan Jiang et.al. | 2503.20724 | null |
2025-03-26 | From Annotation to Adaptation: Metrics, Synthetic Data, and Aspect Extraction for Aspect-Based Sentiment Analysis with Large Language Models | Nikita Neveditsin et.al. | 2503.20715 | null |
2025-03-26 | Flow of a two-dimensional liquid foam: Impact of surfactant type and boundary conditions | Farshad Nazari et.al. | 2503.20710 | null |
2025-03-26 | BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation | Yuyang Peng et.al. | 2503.20672 | null |
2025-03-26 | ARMO: Autoregressive Rigging for Multi-Category Objects | Mingze Sun et.al. | 2503.20663 | null |
2025-03-26 | MMGen: Unified Multi-modal Image Generation and Understanding in One Go | Jiepeng Wang et.al. | 2503.20644 | null |
2025-03-26 | Diffusion Counterfactuals for Image Regressors | Trung Duc Ha et.al. | 2503.20595 | link |
2025-03-26 | Supply chain network rewiring dynamics at the firm-level | Tobias Reisch et.al. | 2503.20594 | link |
2025-03-26 | Stochastic Transport Maps in Diffusion Models and Sampling | Xicheng Zhang et.al. | 2503.20573 | null |
2025-03-25 | Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models | Sangwon Beak et.al. | 2503.19914 | null |
2025-03-25 | PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model | Mingju Gao et.al. | 2503.19913 | null |
2025-03-26 | AvatarArtist: Open-Domain 4D Avatarization | Hongyu Liu et.al. | 2503.19906 | null |
2025-03-25 | ICE: Intrinsic Concept Extraction from a Single Image via Diffusion Models | Fernando Julio Cendra et.al. | 2503.19902 | null |
2025-03-25 | Scaling Down Text Encoders of Text-to-Image Diffusion Models | Lifu Wang et.al. | 2503.19897 | link |
2025-03-25 | Visuo-Tactile Object Pose Estimation for a Multi-Finger Robot Hand with Low-Resolution In-Hand Tactile Sensing | Lukas Mack et.al. | 2503.19893 | null |
2025-03-25 | FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model | Jun Zhou et.al. | 2503.19839 | null |
2025-03-25 | TopoGEN: topology-driven microstructure generation for in silico modeling of fiber network mechanics | Sara Cardona et.al. | 2503.19832 | null |
2025-03-25 | IgCraft: A versatile sequence generation framework for antibody discovery and engineering | Matthew Greenig et.al. | 2503.19821 | link |
2025-03-25 | Unpaired Object-Level SAR-to-Optical Image Translation for Aircraft with Keypoints-Guided Diffusion Models | Ruixi You et.al. | 2503.19798 | null |
2025-03-26 | In the Blink of an Eye: Instant Game Map Editing using a Generative-AI Smart Brush | Vitaly Gnatyuk et.al. | 2503.19793 | null |
2025-03-25 | SITA: Structurally Imperceptible and Transferable Adversarial Attacks for Stylized Image Generation | Jingdan Kang et.al. | 2503.19791 | link |
2025-03-25 | Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models | Kartik Thakral et.al. | 2503.19783 | null |
2025-03-25 | PCM : Picard Consistency Model for Fast Parallel Sampling of Diffusion Models | Junhyuk So et.al. | 2503.19731 | null |
2025-03-25 | CoSimGen: Controllable Diffusion Model for Simultaneous Image and Mask Generation | Rupak Bose et.al. | 2503.19661 | null |
2025-03-24 | Target-Aware Video Diffusion Models | Taeksoo Kim et.al. | 2503.18950 | null |
2025-03-24 | Equivariant Image Modeling | Ruixiao Dong et.al. | 2503.18948 | link |
2025-03-25 | Aether: Geometric-Aware Unified World Modeling | Aether Team et.al. | 2503.18945 | null |
2025-03-24 | Video-T1: Test-Time Scaling for Video Generation | Fangfu Liu et.al. | 2503.18942 | null |
2025-03-24 | Training-free Diffusion Acceleration with Bottleneck Sampling | Ye Tian et.al. | 2503.18940 | null |
2025-03-24 | SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction | Enrico Pallotta et.al. | 2503.18933 | link |
2025-03-24 | Entanglement swapping systems toward a quantum internet | Samantha I. Davis et.al. | 2503.18906 | null |
2025-03-24 | 3DSwapping: Texture Swapping For 3D Object From Single Reference Image | Xiao Cao et.al. | 2503.18853 | null |
2025-03-24 | Dual-domain Multi-path Self-supervised Diffusion Model for Accelerated MRI Reconstruction | Yuxuan Zhang et.al. | 2503.18836 | null |
2025-03-24 | Blind structured illumination microscopy via generalized Richardson-Lucy method | Valentina Capalbo et.al. | 2503.18786 | null |
2025-03-24 | Duality Symmetry in Causality Constraints for Enhanced Acoustic Absorption | Sichao Qu et.al. | 2503.18740 | null |
2025-03-24 | RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation | Chengbo Yuan et.al. | 2503.18738 | null |
2025-03-24 | Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos | Chris Pedersen et.al. | 2503.18731 | null |
2025-03-24 | NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping | Tianyi Wang et.al. | 2503.18678 | null |
2025-03-24 | Human Motion Unlearning | Edoardo De Matteis et.al. | 2503.18674 | null |
2025-03-21 | Position: Interactive Generative Video as Next-Generation Game Engine | Jiwen Yu et.al. | 2503.17359 | null |
2025-03-21 | Predicting Potential Customer Support Needs and Optimizing Search Ranking in a Two-Sided Marketplace | Do-kyum Kim et.al. | 2503.17329 | null |
2025-03-21 | Preference-Guided Diffusion for Multi-Objective Offline Optimization | Yashas Annadani et.al. | 2503.17299 | null |
2025-03-21 | Cross-Band Modulation Design for Hybrid RF-Optical Systems | Thrassos K. Oikonomou et.al. | 2503.17296 | null |
2025-03-21 | Offline Model-Based Optimization: Comprehensive Review | Minsu Kim et.al. | 2503.17286 | link |
2025-03-21 | Unsupervised Joint Learning of Optical Flow and Intensity with Event Cameras | Shuang Guo et.al. | 2503.17262 | link |
2025-03-21 | Deep End-to-End Posterior ENergy (DEEPEN) for image recovery | Jyothi Rikhab Chand et.al. | 2503.17244 | null |
2025-03-21 | Leveraging Text-to-Image Generation for Handling Spurious Correlation | Aryan Yazdan Parast et.al. | 2503.17226 | null |
2025-03-21 | Neuro-Symbolic Scene Graph Conditioning for Synthetic Image Dataset Generation | Giacomo Savazzi et.al. | 2503.17224 | null |
2025-03-21 | UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models | Fanghua Yu et.al. | 2503.17221 | null |
2025-03-21 | FreeUV: Ground-Truth-Free Realistic Facial UV Texture Recovery via Cross-Assembly Inference Strategy | Xingchao Yang et.al. | 2503.17197 | null |
2025-03-21 | TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning | Sheng Wang et.al. | 2503.17195 | null |
2025-03-21 | ExplainitAI: When do we trust artificial intelligence? The influence of content and explainability in a cross-cultural comparison | Sora Kang et.al. | 2503.17158 | null |
2025-03-21 | D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens | Panpan Wang et.al. | 2503.17155 | null |
2025-03-21 | R2LDM: An Efficient 4D Radar Super-Resolution Framework Leveraging Diffusion Model | Boyuan Zheng et.al. | 2503.17097 | null |
2025-03-20 | Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation | Yuqing Wang et.al. | 2503.16430 | null |
2025-03-20 | SynCity: Training-Free Generation of 3D Worlds | Paul Engstler et.al. | 2503.16420 | null |
2025-03-20 | DreamTexture: Shape from Virtual Texture with Analysis by Augmentation | Ananta R. Bhattarai et.al. | 2503.16412 | null |
2025-03-20 | VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness | SeungJu Cha et.al. | 2503.16406 | link |
2025-03-20 | ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos | Haolin Yang et.al. | 2503.16400 | null |
2025-03-20 | Scale-wise Distillation of Diffusion Models | Nikita Starodubcev et.al. | 2503.16397 | null |
2025-03-21 | SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation | Chun-Han Yao et.al. | 2503.16396 | null |
2025-03-20 | Do Visual Imaginations Improve Vision-and-Language Navigation Agents? | Akhil Perincherry et.al. | 2503.16394 | null |
2025-03-20 | LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images | Leyang Wang et.al. | 2503.16376 | null |
2025-03-20 | Heat transfer and mixing in initiated Chemical Vapor Deposition analyzed by in-situ gas composition sensing | Simon Shindler et.al. | 2503.16373 | null |
2025-03-20 | Ultra-Resolution Adaptation with Ease | Ruonan Yu et.al. | 2503.16322 | link |
2025-03-20 | Rapid patient-specific neural networks for intraoperative X-ray to volume registration | Vivek Gopalakrishnan et.al. | 2503.16309 | link |
2025-03-20 | Unleashing Vecset Diffusion Model for Fast Shape Generation | Zeqiang Lai et.al. | 2503.16302 | link |
2025-03-20 | Diffusion-augmented Graph Contrastive Learning for Collaborative Filter | Fan Huang et.al. | 2503.16290 | null |
2025-03-20 | SceneMI: Motion In-betweening for Modeling Human-Scene Interactions | Inwoo Hwang et.al. | 2503.16289 | null |
2025-03-19 | FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers | Ruichen Chen et.al. | 2503.15465 | link |
2025-03-19 | Di | Yuanzhi Zhu et.al. | 2503.15457 | null |
2025-03-19 | MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space | Lixing Xiao et.al. | 2503.15451 | null |
2025-03-19 | LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding | Amirhossein Kazerouni et.al. | 2503.15420 | null |
2025-03-19 | Temporal Regularization Makes Your Video Generator Stronger | Harold Haodong Chen et.al. | 2503.15417 | null |
2025-03-19 | Visual Persona: Foundation Model for Full-Body Human Customization | Jisu Nam et.al. | 2503.15406 | null |
2025-03-19 | HQNN-FSP: A Hybrid Classical-Quantum Neural Network for Regression-Based Financial Stock Market Prediction | Prashant Kumar Choudhary et.al. | 2503.15403 | null |
2025-03-19 | Online Matching under KIID: Enhanced Competitive Analysis through Ordinary Differential Equation Systems | Pan Xu et.al. | 2503.15399 | null |
2025-03-19 | CCDP: Composition of Conditional Diffusion Policies with Guided Sampling | Amirreza Razmjoo et.al. | 2503.15386 | null |
2025-03-19 | Material Decomposition in Photon-Counting Computed Tomography with Diffusion Models: Comparative Study and Hybridization with Variational Regularizers | Corentin Vazia et.al. | 2503.15383 | null |
2025-03-19 | Real-world validation of a multimodal LLM-powered pipeline for High-Accuracy Clinical Trial Patient Matching leveraging EHR data | Anatole Callies et.al. | 2503.15374 | link |
2025-03-19 | SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models | I-Fan Lin et.al. | 2503.15351 | null |
2025-03-19 | Euclid Quick Data Release (Q1). Active galactic nuclei identification using diffusion-based inpainting of Euclid VIS images | Euclid Collaboration et.al. | 2503.15321 | null |
2025-03-19 | SENAI: Towards Software Engineering Native Generative Artificial Intelligence | Mootez Saad et.al. | 2503.15282 | null |
2025-03-19 | ImputeGAP: A Comprehensive Library for Time Series Imputation | Quentin Nater et.al. | 2503.15250 | null |
2025-03-18 | MusicInfuser: Making Video Diffusion Listen and Dance | Susung Hong et.al. | 2503.14505 | null |
2025-03-18 | The Power of Context: How Multimodality Improves Image Super-Resolution | Kangfu Mei et.al. | 2503.14503 | null |
2025-03-18 | Deeply Supervised Flow-Based Generative Models | Inkyu Shin et.al. | 2503.14494 | null |
2025-03-18 | Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control | NVIDIA et.al. | 2503.14492 | link |
2025-03-18 | Stable Virtual Camera: Generative View Synthesis with Diffusion Models | Jensen et.al. | 2503.14489 | null |
2025-03-18 | DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers | Minglei Shi et.al. | 2503.14487 | null |
2025-03-18 | Lux Post Facto: Learning Portrait Performance Relighting with Conditional Video Diffusion and a Hybrid Dataset | Yiqun Mei et.al. | 2503.14485 | null |
2025-03-18 | ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing | Yulin Pan et.al. | 2503.14482 | null |
2025-03-18 | SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model | Yucheng Mao et.al. | 2503.14463 | null |
2025-03-18 | The Atacama Cosmology Telescope: DR6 Constraints on Extended Cosmological Models | Erminia Calabrese et.al. | 2503.14454 | null |
2025-03-18 | Bolt3D: Generating 3D Scenes in Seconds | Stanislaw Szymanowicz et.al. | 2503.14445 | null |
2025-03-18 | MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation | Hongyu Zhang et.al. | 2503.14428 | null |
2025-03-18 | Diffusion-based Facial Aesthetics Enhancement with 3D Structure Guidance | Lisha Li et.al. | 2503.14402 | null |
2025-03-18 | A Comprehensive Scatter Correction Model for Micro-Focus Dual-Source Imaging Systems: Combining Ambient, Cross, and Forward Scatter | Jianing Sun et.al. | 2503.14386 | null |
2025-03-18 | Impossible Videos | Zechen Bai et.al. | 2503.14378 | null |
2025-03-17 | Amodal3R: Amodal 3D Reconstruction from Occluded 2D Images | Tianhao Wu et.al. | 2503.13439 | null |
2025-03-17 | Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation | Xinyu Lian et.al. | 2503.13424 | null |
2025-03-17 | Securing Virtual Reality Experiences: Unveiling and Tackling Cybersickness Attacks with Explainable AI | Ripan Kumar Kundu et.al. | 2503.13419 | null |
2025-03-17 | Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning | Mengyao Lyu et.al. | 2503.13383 | null |
2025-03-17 | One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation | Daniil Selikhanovych et.al. | 2503.13358 | null |
2025-03-17 | A 1.8 m class pathfinder Raman LIDAR for the Northern Site of the Cherenkov Telescope Array Observatory -- Technical Design | Otger Ballester et.al. | 2503.13349 | null |
2025-03-17 | Artificial Intelligence-Driven Prognostic Classification of COVID-19 Using Chest X-rays: A Deep Learning Approach | Alfred Simbun et.al. | 2503.13277 | null |
2025-03-17 | Generative Gaussian Splatting: Generating 3D Scenes with Video Diffusion Priors | Katja Schwarz et.al. | 2503.13272 | null |
2025-03-17 | Graph Generative Models Evaluation with Masked Autoencoder | Chengen Wang et.al. | 2503.13271 | null |
2025-03-17 | FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis | Luxi Chen et.al. | 2503.13265 | null |
2025-03-17 | Dense Policy: Bidirectional Autoregressive Learning of Actions | Yue Su et.al. | 2503.13217 | null |
2025-03-17 | MedLoRD: A Medical Low-Resource Diffusion Model for High-Resolution 3D CT Image Synthesis | Marvin Seyfarth et.al. | 2503.13211 | null |
2025-03-17 | Patient-specific radiomic feature selection with reconstructed healthy persona of knee MR images | Yaxi Chen et.al. | 2503.13131 | null |
2025-03-17 | 3D Human Interaction Generation: A Survey | Siyuan Fan et.al. | 2503.13120 | null |
2025-03-17 | DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry | Jing Li et.al. | 2503.13110 | link |
2025-03-14 | From few to many maps: A fast map-level emulator for extreme augmentation of CMB systematics datasets | P. Campeti et.al. | 2503.11643 | link |
2025-03-14 | Gradient-bridged Posterior: Bayesian Inference for Models with Implicit Functions | Cheng Zeng et.al. | 2503.11637 | null |
2025-03-14 | Pathology Image Compression with Pre-trained Autoencoders | Srikar Yellapragada et.al. | 2503.11591 | null |
2025-03-14 | Dynamics of a coupled nonlocal PDE-ODE system with spatial memory: well-posedness, stability, and bifurcation analysis | Yurij Salmaniw et.al. | 2503.11550 | null |
2025-03-14 | AugGen: Synthetic Augmentation Can Improve Discriminative Models | Parsa Rahimi et.al. | 2503.11544 | null |
2025-03-14 | Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models | Hao Cheng et.al. | 2503.11519 | null |
2025-03-14 | Perfect Stabilization of Biomolecular Adhesions under Load | Anton F. Burnet et.al. | 2503.11510 | null |
2025-03-14 | Exponential Quantum Advantage for Simulating Open Classical Systems | Agi Villanyi et.al. | 2503.11483 | null |
2025-03-14 | T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation | Seyed Mohammad Hadi Hosseini et.al. | 2503.11481 | null |
2025-03-14 | Integrating LLMs in Gamified Systems | Carlos J. Costa et.al. | 2503.11458 | null |
2025-03-14 | Extending Ambient Pressure X-ray Photoelectron Spectroscopy to Plasma Studies: A novel and flexible plasma gun approach | Yang Gu et.al. | 2503.11446 | null |
2025-03-14 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation | Hongxiang Zhao et.al. | 2503.11423 | null |
2025-03-14 | MTV-Inpaint: Multi-Task Long Video Inpainting | Shiyuan Yang et.al. | 2503.11412 | null |
2025-03-14 | Towards A Correct Usage of Cryptography in Semantic Watermarks for Diffusion Models | Jonas Thietke et.al. | 2503.11404 | null |
2025-03-14 | BEVDiffLoc: End-to-End LiDAR Global Localization in BEV View based on Diffusion Model | Ziyue Wang et.al. | 2503.11372 | link |
2025-03-13 | GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing | Rongyao Fang et.al. | 2503.10639 | link |
2025-03-13 | Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective | Xiaoming Zhao et.al. | 2503.10638 | null |
2025-03-14 | Distilling Diversity and Control in Diffusion Models | Rohit Gandikota et.al. | 2503.10637 | null |
2025-03-13 | HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model | Jiaming Liu et.al. | 2503.10631 | null |
2025-03-13 | NIL: No-data Imitation Learning by Leveraging Pre-trained Video Diffusion Models | Mert Albaba et.al. | 2503.10626 | null |
2025-03-13 | DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation | Chen Chen et.al. | 2503.10618 | null |
2025-03-13 | MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction | Yingshuang Zou et.al. | 2503.10604 | null |
2025-03-13 | CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models | Hao He et.al. | 2503.10592 | null |
2025-03-13 | Long Context Tuning for Video Generation | Yuwei Guo et.al. | 2503.10589 | null |
2025-03-13 | Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures | Nina Vesseron et.al. | 2503.10576 | null |
2025-03-13 | MASQUE: A Text-Guided Diffusion-Based Framework for Localized and Customized Adversarial Makeup | Youngjin Kwon et.al. | 2503.10549 | null |
2025-03-13 | Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression | Hooman Shahrokhi et.al. | 2503.10512 | null |
2025-03-13 | Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion | Evgeniia Vu et.al. | 2503.10488 | null |
2025-03-13 | Applying Tabular Deep Learning Models to Estimate Crash Injury Types of Young Motorcyclists | Shriyank Somvanshi et.al. | 2503.10474 | null |
2025-03-13 | Finetuning Generative Trajectory Model with Reinforcement Learning from Human Feedback | Derun Li et.al. | 2503.10434 | null |
2025-03-12 | PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop | Chenyu Li et.al. | 2503.09595 | link |
2025-03-12 | Minimax Optimality of the Probability Flow ODE for Diffusion Models | Changxiao Cai et.al. | 2503.09583 | null |
2025-03-12 | Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | Marianne Arriola et.al. | 2503.09573 | link |
2025-03-12 | TPDiff: Temporal Pyramid Video Diffusion Model | Lingmin Ran et.al. | 2503.09566 | null |
2025-03-12 | FCaS: Fine-grained Cardiac Image Synthesis based on 3D Template Conditional Diffusion Model | Jiahao Xia et.al. | 2503.09560 | null |
2025-03-12 | GenHPE: Generative Counterfactuals for 3D Human Pose Estimation with Radio Frequency Signals | Shuokang Huang et.al. | 2503.09537 | null |
2025-03-12 | Total Ionizing Dose Measurements in Small Satellites in LEO using LabOSat-01 | Lucas Finazzi et.al. | 2503.09520 | null |
2025-03-12 | CM-Diff: A Single Generative Network for Bidirectional Cross-Modality Translation Diffusion Model Between Infrared and Visible Images | Bin Hu et.al. | 2503.09514 | null |
2025-03-12 | DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction | Junjie Zhou et.al. | 2503.09491 | link |
2025-03-12 | Hybrid Rendering for Multimodal Autonomous Driving: Merging Neural and Physics-Based Simulation | Máté Tóth et.al. | 2503.09464 | null |
2025-03-12 | How Well Does Your Tabular Generator Learn the Structure of Tabular Data? | Xiangjian Jiang et.al. | 2503.09453 | link |
2025-03-12 | Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models | Zhihua Tian et.al. | 2503.09446 | link |
2025-03-12 | SuperCarver: Texture-Consistent 3D Geometry Super-Resolution for High-Fidelity Surface Detail Generation | Qijian Zhang et.al. | 2503.09439 | null |
2025-03-12 | Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space | Yifan Zhou et.al. | 2503.09419 | link |
2025-03-12 | Diff-CL: A Novel Cross Pseudo-Supervision Method for Semi-supervised Medical Image Segmentation | Xiuzhen Guo et.al. | 2503.09408 | null |
2025-03-11 | OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models | Jialv Zou et.al. | 2503.08686 | link |
2025-03-11 | GarmentCrafter: Progressive Novel View Synthesis for Single-View 3D Garment Reconstruction and Editing | Yuanhao Wang et.al. | 2503.08678 | null |
2025-03-12 | OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting | Yongsheng Yu et.al. | 2503.08677 | null |
2025-03-11 | Language-Depth Navigated Thermal and Visible Image Fusion | Jinchang Zhang et.al. | 2503.08676 | null |
2025-03-11 | Keypoint Detection and Description for Raw Bayer Images | Jiakai Lin et.al. | 2503.08673 | null |
2025-03-11 | Modeling Stock Return Distributions and Pricing Options | Xinxin Jiang et.al. | 2503.08666 | null |
2025-03-11 | REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder | Yitian Zhang et.al. | 2503.08665 | null |
2025-03-11 | MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention | Yuhan Wang et.al. | 2503.08664 | link |
2025-03-11 | MF-VITON: High-Fidelity Mask-Free Virtual Try-On with Minimal Input | Zhenchen Wan et.al. | 2503.08650 | null |
2025-03-11 | Rethinking Diffusion Model in High Dimension | Zhenxin Zheng et.al. | 2503.08643 | link |
2025-03-11 | Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention | Emily Xiao et.al. | 2503.08640 | link |
2025-03-11 | LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization | Xianfeng Wu et.al. | 2503.08619 | link |
2025-03-11 | Tuning-Free Multi-Event Long Video Generation via Synchronized Coupled Sampling | Subin Kim et.al. | 2503.08605 | null |
2025-03-11 | 3D Point Cloud Generation via Autoregressive Up-sampling | Ziqiao Meng et.al. | 2503.08594 | null |
2025-03-11 | Proc4Gem: Foundation models for physical agency through procedural generation | Yixin Lin et.al. | 2503.08593 | null |
2025-03-10 | GenAIReading: Augmenting Human Cognition with Interactive Digital Textbooks Using Large Language Models and Image Generation Models | Ryugo Morita et.al. | 2503.07463 | null |
2025-03-10 | Advancing our Understanding of Optoionic Effects for the Design of Solar Batteries: A Theoretical Perspective | Matteo Rinaldi et.al. | 2503.07460 | null |
2025-03-10 | Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration | Dylan J. Foster et.al. | 2503.07453 | null |
2025-03-10 | DRESS: Diffusion Reasoning-based Reward Shaping Scheme For Intelligent Networks | Feiran You et.al. | 2503.07433 | link |
2025-03-10 | AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion | Mingzhen Sun et.al. | 2503.07418 | null |
2025-03-10 | TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision | Shaobin Zhuang et.al. | 2503.07416 | null |
2025-03-10 | SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models | Ouxiang Li et.al. | 2503.07392 | link |
2025-03-10 | PersonaBooth: Personalized Text-to-Motion Generation | Boeun Kim et.al. | 2503.07390 | null |
2025-03-10 | TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models | Ruidong Chen et.al. | 2503.07389 | link |
2025-03-10 | RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing | Yiqing Xie et.al. | 2503.07358 | link |
2025-03-10 | AttenST: A Training-Free Attention-Driven Style Transfer Framework with Pre-Trained Diffusion Models | Bo Huang et.al. | 2503.07307 | link |
2025-03-10 | Cool-3D: An End-to-End Thermal-Aware Framework for Early-Phase Design Space Exploration of Microfluidic-Cooled 3DICs | Runxi Wang et.al. | 2503.07297 | link |
2025-03-10 | Efficient Distillation of Classifier-Free Guidance using Adapters | Cristian Perez Jensen et.al. | 2503.07274 | null |
2025-03-10 | Customized SAM 2 for Referring Remote Sensing Image Segmentation | Fu Rong et.al. | 2503.07266 | null |
2025-03-11 | AnomalyPainter: Vision-Language-Diffusion Synergy for Zero-Shot Realistic and Diverse Industrial Anomaly Synthesis | Zhangyu Lai et.al. | 2503.07253 | null |
2025-03-07 | AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data | Zengqun Zhao et.al. | 2503.05665 | link |
2025-03-07 | TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models | Mark YU et.al. | 2503.05638 | null |
2025-03-07 | A functional approach for curve alignment and shape analysis | Issam-Ali Moindjié et.al. | 2503.05632 | null |
2025-03-07 | Geometric Optimization of Patterned Conductive Polymer Composite-based Strain Sensors Toward Enhanced Sensing Performance | Jia-Chen Shang et.al. | 2503.05603 | null |
2025-03-07 | Diffusion Models for Cayley Graphs | Michael R. Douglas et.al. | 2503.05558 | null |
2025-03-07 | Radio Frequency from Optical with Instabilities below | A. Hati et.al. | 2503.05547 | null |
2025-03-10 | Accelerating db-A* for Kinodynamic Motion Planning Using Diffusion | Julius Franke et.al. | 2503.05539 | null |
2025-03-07 | Post-Hoc Concept Disentanglement: From Correlated to Isolated Concept Representations | Eren Erogullari et.al. | 2503.05522 | link |
2025-03-07 | Noise-Robust Radio Frequency Fingerprint Identification Using Denoise Diffusion Model | Guolin Yin et.al. | 2503.05514 | null |
2025-03-07 | Localized necking under global compression in two-scale metallic hierarchical solids | Naresh Chockalingam S. et.al. | 2503.05498 | null |
2025-03-07 | Umbilical Choir: Automated Live Testing for Edge-To-Cloud FaaS Applications | Mohammadreza Malekabbasi et.al. | 2503.05495 | link |
2025-03-07 | Statistical Deficiency for Task Inclusion Estimation | Loïc Fosse et.al. | 2503.05491 | null |
2025-03-07 | De Novo Design of Protein-Binding Peptides by Quantum Computing | Lars Meuser et.al. | 2503.05458 | null |
2025-03-07 | VLMs Play StarCraft II: A Benchmark and Multimodal Decision Method | Weiyu Ma et.al. | 2503.05383 | link |
2025-03-07 | PhysicsGen: Can Generative Models Learn from Images to Predict Complex Physical Relations? | Martin Spitznagel et.al. | 2503.05333 | null |
2025-03-06 | Compositional World Knowledge leads to High Utility Synthetic data | Sachit Gaudi et.al. | 2503.04687 | null |
2025-03-06 | What Are You Doing? A Closer Look at Controllable Human Video Generation | Emanuele Bugliarello et.al. | 2503.04666 | null |
2025-03-06 | Risk-aware Trading Portfolio Optimization | Marco Bianchetti et.al. | 2503.04662 | null |
2025-03-06 | IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval | Tingyu Song et.al. | 2503.04644 | null |
2025-03-06 | Simulating the Real World: A Unified Survey of Multimodal Generative Models | Yuqi Hu et.al. | 2503.04641 | link |
2025-03-06 | 3HANDS Dataset: Learning from Humans for Generating Naturalistic Handovers with Supernumerary Robotic Limbs | Artin Saberpour Abadian et.al. | 2503.04635 | null |
2025-03-06 | The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation | Aoxiong Yin et.al. | 2503.04606 | link |
2025-03-07 | Method for recovering data on unreported low-severity crashes | Alberto Morando et.al. | 2503.04529 | null |
2025-03-06 | Learning Object Placement Programs for Indoor Scene Synthesis with Iterative Self Training | Adrian Chang et.al. | 2503.04496 | null |
2025-03-06 | InfoSEM: A Deep Generative Model with Informative Priors for Gene Regulatory Network Inference | Tianyu Cui et.al. | 2503.04483 | null |
2025-03-06 | ToolFuzz -- Automated Agent Tool Testing | Ivan Milev et.al. | 2503.04479 | null |
2025-03-06 | Semantic Alignment of Unimodal Medical Text and Vision Representations | Maxime Di Folco et.al. | 2503.04478 | null |
2025-03-06 | PALo: Learning Posture-Aware Locomotion for Quadruped Robots | Xiangyu Miao et.al. | 2503.04462 | null |
2025-03-06 | Polling on a circle with non-uniform batch arrivals | Tim Engels et.al. | 2503.04448 | null |
2025-03-06 | Can Large Language Models Predict Antimicrobial Resistance Gene? | Hyunwoo Yoo et.al. | 2503.04413 | null |
2025-03-05 | Rethinking Video Tokenization: A Conditioned Diffusion-based Approach | Nianzu Yang et.al. | 2503.03708 | link |
2025-03-05 | DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance | Zhao Yang et.al. | 2503.03689 | link |
2025-03-05 | Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models | Bar Karov et.al. | 2503.03669 | link |
2025-03-05 | A Generative Approach to High Fidelity 3D Reconstruction from Text Data | Venkat Kumar R et.al. | 2503.03664 | null |
2025-03-05 | DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles | Rui Zhao et.al. | 2503.03651 | link |
2025-03-05 | Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias | Rui Lu et.al. | 2503.03595 | null |
2025-03-05 | Generative Artificial Intelligence in Robotic Manipulation: A Survey | Kun Zhang et.al. | 2503.03464 | null |
2025-03-05 | Predicting Practically? Domain Generalization for Predictive Analytics in Real-world Environments | Hanyu Duan et.al. | 2503.03399 | link |
2025-03-05 | Top-K Maximum Intensity Projection Priors for 3D Liver Vessel Segmentation | Xiaotong Zhang et.al. | 2503.03367 | null |
2025-03-05 | Video Super-Resolution: All You Need is a Video Diffusion Model | Zhihao Zhan et.al. | 2503.03355 | null |
2025-03-05 | Label-Efficient LiDAR Semantic Segmentation with 2D-3D Vision Transformer Adapters | Julia Hindel et.al. | 2503.03299 | null |
2025-03-05 | Group Delay Dispersion Measurements of Novel Multilayer Interference Coatings in the Mid-Infrared Spectral Regime | Ulrich Galander et.al. | 2503.03289 | null |
2025-03-06 | Optimizing for the Shortest Path in Denoising Diffusion Model | Ping Chen et.al. | 2503.03265 | link |
2025-03-05 | Mean Field Game of Controls with State Reflections: Existence and Limit Theory | Lijun Bo et.al. | 2503.03253 | null |
2025-03-05 | GenColor: Generative Color-Concept Association in Visual Design | Yihan Hou et.al. | 2503.03236 | null |
2025-03-04 | ARINAR: Bi-Level Autoregressive Feature-by-Feature Generative Models | Qinyu Zhao et.al. | 2503.02883 | link |
2025-03-04 | SeqFusion: Sequential Fusion of Pre-Trained Models for Zero-Shot Time-Series Forecasting | Ting-Ji Huang et.al. | 2503.02836 | link |
2025-03-04 | A Multimodal Symphony: Integrating Taste and Sound through Generative AI | Matteo Spanio et.al. | 2503.02823 | link |
2025-03-04 | Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts | Marta Skreta et.al. | 2503.02819 | link |
2025-03-04 | "What If Smart Homes Could See Our Homes?": Exploring DIY Smart Home Building Experiences with VLM-Based Camera Sensors | Sojeong Yun et.al. | 2503.02816 | null |
2025-03-04 | Generating Reliable Initial Velocity Models for Full-waveform Inversion with Well and Structural Constraints | Qingchen Zhang et.al. | 2503.02815 | null |
2025-03-04 | Applying Computational Engineering Modelling to Analyse the Social Impact of Conflict and Violent Events | Felix Schwebel et.al. | 2503.02771 | null |
2025-03-04 | Revolutionizing Command Interface: Maximizing Control Efficiency in INO ICAL Experiment with UDP Protocol | Yuvaraj Elangovan et.al. | 2503.02751 | null |
2025-03-04 | Seeded Poisson Factorization: Leveraging domain knowledge to fit topic models | Bernd Prostmaier et.al. | 2503.02741 | link |
2025-03-04 | Variable-Friction In-Hand Manipulation for Arbitrary Objects via Diffusion-Based Imitation Learning | Qiyang Yan et.al. | 2503.02738 | null |
2025-03-04 | Zero-Shot Complex Question-Answering on Long Scientific Documents | Wanting Wang et.al. | 2503.02695 | link |
2025-03-04 | Generative Modeling of Microweather Wind Velocities for Urban Air Mobility | Tristan A. Shah et.al. | 2503.02690 | link |
2025-03-04 | A user-friendly SPARQL query editor powered by lightweight metadata | Vincent Emonet et.al. | 2503.02688 | link |
2025-03-04 | Cellular Automaton With CNN | Valery Ashu et.al. | 2503.02652 | link |
2025-03-04 | Xavier: Toward Better Coding Assistance in Authoring Tabular Data Wrangling Scripts | Yunfan Zhou et.al. | 2503.02639 | null |
2025-02-28 | How far can we go with ImageNet for Text-to-Image generation? | L. Degeorge et.al. | 2502.21318 | null |
2025-02-28 | Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos | Zhiyu Tan et.al. | 2502.21314 | null |
2025-02-28 | Does Generation Require Memorization? Creative Diffusion Models using Ambient Diffusion | Kulin Shah et.al. | 2502.21278 | null |
2025-02-28 | Dynamic Markov Blanket Detection for Macroscopic Physics Discovery | Jeff Beck et.al. | 2502.21217 | link |
2025-02-28 | AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks | Pedro Gimenes et.al. | 2502.21196 | null |
2025-02-28 | Joint Modeling in Recommendations: A Survey | Xiangyu Zhao et.al. | 2502.21195 | null |
2025-02-28 | SYN-LUNGS: Towards Simulating Lung Nodules with Anatomy-Informed Digital Twins for AI Training | Fakrul Islam Tushar et.al. | 2502.21187 | null |
2025-02-28 | A Review on Generative AI For Text-To-Image and Image-To-Image Generation and Implications To Scientific Images | Zineb Sordo et.al. | 2502.21151 | null |
2025-02-28 | Rare event modeling with self-regularized normalizing flows: what can we learn from a single failure? | Charles Dawson et.al. | 2502.21110 | null |
2025-02-28 | Spatial Reasoning with Denoising Models | Christopher Wewer et.al. | 2502.21075 | null |
2025-02-28 | GUIDE: LLM-Driven GUI Generation Decomposition for Automated Prototyping | Kristian Kolthoff et.al. | 2502.21068 | null |
2025-02-28 | Synthesizing Individualized Aging Brains in Health and Disease with Generative Models and Parallel Transport | Jingru Fu et.al. | 2502.21049 | link |
2025-02-28 | Toward interoperable representation and sharing of disinformation incidents in cyber threat intelligence | Felipe Sánchez González et.al. | 2502.20997 | link |
2025-02-28 | Generative Uncertainty in Diffusion Models | Metod Jazbec et.al. | 2502.20946 | null |
2025-02-28 | DiffBrush: Just Painting the Art by Your Hands | Jiaming Chu et.al. | 2502.20904 | null |
2025-02-27 | InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions | Sirui Xu et.al. | 2502.20390 | link |
2025-02-27 | Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation | Sucheng Ren et.al. | 2502.20388 | link |
2025-02-27 | Tight Inversion: Image-Conditioned Inversion for Real Image Editing | Edo Kadosh et.al. | 2502.20376 | null |
2025-02-27 | Constrained Generative Modeling with Manually Bridged Diffusion Models | Saeid Naderiparizi et.al. | 2502.20371 | null |
2025-02-27 | ACCORD: Application Context-aware Cross-layer Optimization and Resource Design for 5G/NextG Machine-centric Applications | Azuka Chiejina et.al. | 2502.20320 | null |
2025-02-27 | FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction | Siyu Jiao et.al. | 2502.20313 | link |
2025-02-27 | Mobius: Text to Seamless Looping Video Generation via Latent Shift | Xiuli Bi et.al. | 2502.20307 | link |
2025-02-27 | Explainable, Multi-modal Wound Infection Classification from Images Augmented with Generated Captions | Palawat Busaranuvong et.al. | 2502.20277 | null |
2025-02-27 | Do computer vision foundation models learn the low-level characteristics of the human visual system? | Yancheng Cai et.al. | 2502.20256 | null |
2025-02-28 | Beyond Natural Language Perplexity: Detecting Dead Code Poisoning in Code Generation Datasets | Chi-Chien Tsai et.al. | 2502.20246 | null |
2025-02-27 | From Retrieval to Generation: Comparing Different Approaches | Abdelrahman Abdallah et.al. | 2502.20245 | null |
2025-02-27 | Attention Distillation: A Unified Approach to Visual Characteristics Transfer | Yang Zhou et.al. | 2502.20235 | link |
2025-02-27 | AI Will Always Love You: Studying Implicit Biases in Romantic AI Companions | Clare Grogan et.al. | 2502.20231 | link |
2025-02-27 | Model Checking Linear Temporal Logic with Standpoint Modalities | Rajab Aghamov et.al. | 2502.20193 | null |
2025-02-27 | Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think | Liang Chen et.al. | 2502.20172 | link |
2025-02-26 | Multi-modal Contrastive Learning for Tumor-specific Missing Modality Synthesis | Minjoo Lim et.al. | 2502.19390 | null |
2025-02-26 | Deep Learning For Time Series Analysis With Application On Human Motion | Ali Ismail-Fawaz et.al. | 2502.19364 | null |
2025-02-26 | Shh, don't say that! Domain Certification in LLMs | Cornelius Emde et.al. | 2502.19320 | null |
2025-02-26 | AI-Powered Bayesian Inference | Veronika Ročková et.al. | 2502.19231 | null |
2025-02-26 | HDM: Hybrid Diffusion Model for Unified Image Anomaly Detection | Zekang Weng et.al. | 2502.19200 | null |
2025-02-27 | INFO-SEDD: Continuous Time Markov Chains as Scalable Information Metrics Estimators | Alberto Foresti et.al. | 2502.19183 | null |
2025-02-26 | A Model-Centric Review of Deep Learning for Protein Design | Gregory W. Kyro et.al. | 2502.19173 | null |
2025-02-27 | RetinaRegen: A Hybrid Model for Readability and Detail Restoration in Fundus Images | Yuhan Tang et.al. | 2502.19153 | null |
2025-02-26 | Identification Under the Semantic Effective Secrecy Constraint | Abdalla Ibrahim et.al. | 2502.19142 | null |
2025-02-26 | Improving customer service with automatic topic detection in user emails | Bojana Bašaragin et.al. | 2502.19115 | null |
2025-02-26 | Modulation of the galactic cosmic ray spectrum in an anisotropic diffusion approach | V. D. Borisov et.al. | 2502.19062 | null |
2025-02-26 | A Dual-Purpose Framework for Backdoor Defense and Backdoor Amplification in Diffusion Models | Vu Tuan Truong Long et.al. | 2502.19047 | null |
2025-02-26 | OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment | Jiaxin Deng et.al. | 2502.18965 | null |
2025-02-26 | DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model | Lei Zhao et.al. | 2502.18952 | null |
2025-02-26 | A Novel Topology Recovery Method for Low Voltage Distribution Networks | Sina Mohammadi et.al. | 2502.18939 | null |
2025-02-25 | K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs | Ziheng Ouyang et.al. | 2502.18461 | null |
2025-02-25 | ToMCAT: Theory-of-Mind for Cooperative Agents in Teams via Multiagent Diffusion Policies | Pedro Sequeira et.al. | 2502.18438 | null |
2025-02-25 | Sparse Bayesian Generative Modeling for Joint Parameter and Channel Estimation | Benedikt Böck et.al. | 2502.18369 | null |
2025-02-25 | ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation | Yifan Pu et.al. | 2502.18364 | null |
2025-02-25 | Stretchable Capacitive and Resistive Strain Sensors: Accessible Manufacturing Using Direct Ink Writing | Lukas Cha et.al. | 2502.18363 | null |
2025-02-25 | Towards softerware: Enabling personalization of interactive data representations for users with disabilities | Frank Elavsky et.al. | 2502.18348 | link |
2025-02-25 | LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation | Pengzhi Li et.al. | 2502.18302 | null |
2025-02-26 | Bayesian Computation in Deep Learning | Wenlong Chen et.al. | 2502.18300 | null |
2025-02-26 | Citrus: Leveraging Expert Cognitive Pathways in a Medical Language Model for Advanced Medical Decision Support | Guoxin Wang et.al. | 2502.18274 | link |
2025-02-25 | Imperfect Knowledge Management (IKM) in GEFRED (GENeralized model for Fuzzy RElational Databases) | Leoncio Jimenez et.al. | 2502.18255 | null |
2025-02-25 | A 3D Printed Quad-Ridged Flared Horn Antenna Feeder for Radio-Telescopes | Andreas Hofmann et.al. | 2502.18243 | null |
2025-02-25 | Causal AI-based Root Cause Identification: Research to Practice at Scale | Saurabh Jha et.al. | 2502.18240 | null |
2025-02-25 | Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints | Mihaela Cătălina Stoian et.al. | 2502.18237 | link |
2025-02-25 | Principled priors for Bayesian inference of circular models | Xiang Ye et.al. | 2502.18223 | null |
2025-02-25 | UASTrack: A Unified Adaptive Selection Framework with Modality-Customization in Single Object Tracking | He Wang et.al. | 2502.18220 | null |
2025-02-24 | Fractal Generative Models | Tianhong Li et.al. | 2502.17437 | link |
2025-02-24 | GCC: Generative Color Constancy via Diffusing a Color Checker | Chen-Wei Chang et.al. | 2502.17435 | null |
2025-02-24 | S4S: Solving for a Diffusion Model Solver | Eric Frankel et.al. | 2502.17423 | null |
2025-02-24 | X-Dancer: Expressive Music to Human Dance Video Generation | Zeyuan Chen et.al. | 2502.17414 | null |
2025-02-24 | What is a Good Question? Utility Estimation with LLM-based Simulations | Dong-Ho Lee et.al. | 2502.17383 | null |
2025-02-25 | KV-Edit: Training-Free Image Editing for Precise Background Preservation | Tianrui Zhu et.al. | 2502.17363 | link |
2025-02-24 | RELICT: A Replica Detection Framework for Medical Image Generation | Orhun Utku Aydin et.al. | 2502.17360 | link |
2025-02-24 | How Scientists Use Large Language Models to Program | Gabrielle O'Brien et.al. | 2502.17348 | null |
2025-02-24 | AnyTop: Character Animation Diffusion with Any Topology | Inbar Gat et.al. | 2502.17327 | link |
2025-02-24 | Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents | Prafulla Kumar Choubey et.al. | 2502.17321 | null |
2025-02-24 | Robust Federated Learning in Unreliable Wireless Networks: A Client Selection Approach | Yanmeng Wang et.al. | 2502.17260 | null |
2025-02-24 | VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing | Xiangpeng Yang et.al. | 2502.17258 | null |
2025-02-24 | Learning Image Fractals Using Chaotic Differentiable Point Splatting | Adarsh Djeacoumar et.al. | 2502.17230 | null |
2025-02-24 | Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation | Baptiste Chopin et.al. | 2502.17198 | null |
2025-02-24 | Unsupervised Accelerated MRI Reconstruction via Ground-Truth-Free Flow Matching | Xinzhe Luo et.al. | 2502.17174 | null |
2025-02-21 | One-step Diffusion Models with | Yilun Xu et.al. | 2502.15681 | null |
2025-02-21 | VaViM and VaVAM: Autonomous Driving through Video Generative Modeling | Florent Bartoccioni et.al. | 2502.15672 | link |
2025-02-21 | Overview of the data acquisition system architecture for the DarkSide-20k experiment | Maria Adriana Sabia et.al. | 2502.15651 | null |
2025-02-21 | WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents | Xinhang Liu et.al. | 2502.15601 | null |
2025-02-21 | Chats-Grid: An Iterative Retrieval Q&A Optimization Scheme Leveraging Large Model and Retrieval Enhancement Generation in smart grid | Yunfeng Li et.al. | 2502.15583 | null |
2025-02-21 | Enhancing RWKV-based Language Models for Long-Sequence Text Generation | Xinghan Pan et.al. | 2502.15485 | link |
2025-02-21 | Development and Performance Validation of a Versatile VLBI Digital Backend Using the ROACH2 Platform | Jiyun Li et.al. | 2502.15446 | null |
2025-02-21 | Modeling Infectious Diseases: From SIR Models to Diffusion-Based Approaches and Numerical Solutions | Ayesha Baig et.al. | 2502.15439 | null |
2025-02-21 | Efficiently Solving Discounted MDPs with Predictions on Transition Matrices | Lixing Lyu et.al. | 2502.15345 | null |
2025-02-21 | Bridging Bug Localization and Issue Fixing: A Hierarchical Localization Framework Leveraging Large Language Models | Jianming Chang et.al. | 2502.15292 | null |
2025-02-21 | BundleFlow: Deep Menus for Combinatorial Auctions by Diffusion-Based Optimization | Tonghan Wang et.al. | 2502.15283 | null |
2025-02-21 | CopyJudge: Automated Copyright Infringement Identification and Mitigation in Text-to-Image Diffusion Models | Shunchang Liu et.al. | 2502.15278 | null |
2025-02-21 | On the (In)Security of Non-resettable Device Identifiers in Custom Android Systems | Zikan Dong et.al. | 2502.15270 | null |
2025-02-21 | User Experience with LLM-powered Conversational Recommendation Systems: A Case of Music Recommendation | Sojeong Yun et.al. | 2502.15229 | null |
2025-02-21 | Lung-DDPM: Semantic Layout-guided Diffusion Models for Thoracic CT Image Synthesis | Yifan Jiang et.al. | 2502.15204 | link |
2025-02-20 | Improving the Diffusability of Autoencoders | Ivan Skorokhodov et.al. | 2502.14831 | null |
2025-02-20 | A Survey on Text-Driven 360-Degree Panorama Generation | Hai Wang et.al. | 2502.14799 | null |
2025-02-20 | Real-Time Device Reach Forecasting Using HLL and MinHash Data Sketches | Chandrashekar Muniyappa et.al. | 2502.14785 | null |
2025-02-20 | DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models | Hongji Yang et.al. | 2502.14779 | null |
2025-02-20 | Multi-dataset synergistic in supervised learning to pre-label structural components in point clouds from shell construction scenes | Lukas Rauch et.al. | 2502.14721 | null |
2025-02-20 | ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation | Angxiao Yue et.al. | 2502.14637 | link |
2025-02-20 | A Theory for Conditional Generative Modeling on Multiple Data Sources | Rongzhen Wang et.al. | 2502.14583 | link |
2025-02-20 | Multiscale Byte Language Models -- A Hierarchical Architecture for Causal Million-Length Sequence Modeling | Eric Egli et.al. | 2502.14553 | link |
2025-02-20 | Dynamic Preference-based Multi-modal Trip Planning of Public Transport and Shared Mobility | Yimeng Zhang et.al. | 2502.14528 | null |
2025-02-20 | How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? | Sergey Pletenev et.al. | 2502.14502 | link |
2025-02-20 | StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following | Jinnan Li et.al. | 2502.14494 | link |
2025-02-20 | How Jailbreak Defenses Work and Ensemble? A Mechanistic Investigation | Zhuohang Long et.al. | 2502.14486 | null |
2025-02-20 | Algorithms for min-buying in networks | Aaditya Bhardwaj et.al. | 2502.14459 | null |
2025-02-20 | PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data | Shijie Huang et.al. | 2502.14397 | link |
2025-02-20 | Enhancing Portuguese Variety Identification with Cross-Domain Approaches | Hugo Sousa et.al. | 2502.14394 | null |
2025-02-19 | IP-Composer: Semantic Composition of Visual Concepts | Sara Dorfman et.al. | 2502.13951 | null |
2025-02-19 | Image compositing is all you need for data augmentation | Ang Jia Ning Shermaine et.al. | 2502.13936 | null |
2025-02-19 | TESS 2: A Large-Scale Generalist Diffusion Language Model | Jaesung Tae et.al. | 2502.13917 | link |
2025-02-19 | DataSciBench: An LLM Agent Benchmark for Data Science | Dan Zhang et.al. | 2502.13897 | link |
2025-02-19 | Performance Comparison of Graph Representations Which Support Dynamic Graph Updates | Subhajit Sahu et.al. | 2502.13862 | link |
2025-02-19 | Reverse Markov Learning: Multi-Step Generative Models for Complex Distributions | Xinwei Shen et.al. | 2502.13747 | null |
2025-02-19 | Deep Learning for VWAP Execution in Crypto Markets: Beyond the Volume Curve | Remi Genet et.al. | 2502.13722 | link |
2025-02-19 | Beyond One-Size-Fits-All: Tailored Benchmarks for Efficient Evaluation | Peiwen Yuan et.al. | 2502.13576 | null |
2025-02-19 | ETS: Efficient Tree Search for Inference-Time Scaling | Coleman Hooper et.al. | 2502.13575 | link |
2025-02-19 | RestoreGrad: Signal Restoration Using Conditional Denoising Diffusion Models with Jointly Learned Prior | Ching-Hua Lee et.al. | 2502.13574 | null |
2025-02-19 | Diffusion Model Agnostic Social Influence Maximization in Hyperbolic Space | Hongliang Qiao et.al. | 2502.13571 | null |
2025-02-19 | Extracting Social Connections from Finnish Karelian Refugee Interviews Using LLMs | Joonatan Laato et.al. | 2502.13566 | null |
2025-02-19 | Controlling deposition and characterising dynamics of thin liquid films with high temporal and spatial resolution | G Le Lay et.al. | 2502.13552 | null |
2025-02-19 | VLAS: Vision-Language-Action Model With Speech Instructions For Customized Robot Manipulation | Wei Zhao et.al. | 2502.13508 | link |
2025-02-19 | Towards Lightweight, Adaptive and Attribute-Aware Multi-Aspect Controllable Text Generation with Large Language Models | Chenyu Zhu et.al. | 2502.13474 | null |
2025-02-18 | AV-Flow: Transforming Text to Audio-Visual Human-like Interactions | Aggelina Chatziagapi et.al. | 2502.13133 | null |
2025-02-18 | Is Noise Conditioning Necessary for Denoising Generative Models? | Qiao Sun et.al. | 2502.13129 | null |
2025-02-18 | HARP: A Taxonomy for Heterogeneous and Hierarchical Processors for Mixed-reuse Workloads | Raveesh Garg et.al. | 2502.13113 | null |
2025-02-18 | Score Matching Riemannian Diffusion Means | Frederik Möbius Rygaard et.al. | 2502.13106 | null |
2025-02-18 | tn4ml: Tensor Network Training and Customization for Machine Learning | Ema Puljak et.al. | 2502.13090 | link |
2025-02-18 | A Neural Difference-of-Entropies Estimator for Mutual Information | Haoran Ni et.al. | 2502.13085 | null |
2025-02-18 | Personalized Image Generation with Deep Generative Models: A Decade Survey | Yuxiang Wei et.al. | 2502.13081 | link |
2025-02-18 | Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs | Longxu Dou et.al. | 2502.12982 | null |
2025-02-18 | Towards Variational Flow Matching on General Geometries | Olga Zaghen et.al. | 2502.12981 | null |
2025-02-18 | Does Training with Synthetic Data Truly Protect Privacy? | Yunpeng Zhao et.al. | 2502.12976 | link |
2025-02-18 | CooLBM: A Collaborative Open-Source Reactive Multi-Phase/Component Simulation Code via Lattice Boltzmann Method | R. Alamian et.al. | 2502.12955 | null |
2025-02-18 | Guaranteed Conditional Diffusion: 3D Block-based Models for Scientific Data Compression | Jaemoon Lee et.al. | 2502.12951 | null |
2025-02-18 | A Simplified and Numerically Stable Approach to the BG/NBD Churn Prediction model | Dylan Zammit et.al. | 2502.12912 | null |
2025-02-18 | Probabilistic neural operators for functional uncertainty quantification | Christopher Bülte et.al. | 2502.12902 | link |
2025-02-18 | CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image | Kaixin Yao et.al. | 2502.12894 | null |
2025-02-17 | Diffusion Models without Classifier-free Guidance | Zhicong Tang et.al. | 2502.12154 | link |
2025-02-17 | Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening | Ye Tian et.al. | 2502.12146 | link |
2025-02-17 | Correlative X-ray and electron tomography for scale-bridging, quantitative analysis of complex, hierarchical particle systems | Alexander Götz et.al. | 2502.12140 | null |
2025-02-17 | LaM-SLidE: Latent Space Modeling of Spatial Dynamical Systems via Linked Entities | Florian Sestak et.al. | 2502.12128 | link |
2025-02-17 | Descriminative-Generative Custom Tokens for Vision-Language Models | Pramuditha Perera et.al. | 2502.12095 | null |
2025-02-17 | How compositional generalization and creativity improve as diffusion models are trained | Alessandro Favero et.al. | 2502.12089 | null |
2025-02-17 | AdaSplash: Adaptive Sparse Flash Attention | Nuno Gonçalves et.al. | 2502.12082 | link |
2025-02-17 | HumanGif: Single-View Human Diffusion with Generative Prior | Shoukang Hu et.al. | 2502.12080 | link |
2025-02-17 | A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond | Shreya Shukla et.al. | 2502.12048 | null |
2025-02-17 | Unsupervised Structural-Counterfactual Generation under Domain Shift | Krishn Vishwas Kher et.al. | 2502.12013 | null |
2025-02-17 | Characterizing Photorealism and Artifacts in Diffusion Model-Generated Images | Negar Kamali et.al. | 2502.11989 | link |
2025-02-17 | Design Considerations Based on Stability for a Class of TCP Algorithms | Sreekanth Prabhakar et.al. | 2502.11983 | null |
2025-02-17 | Image Inversion: A Survey from GANs to Diffusion and Beyond | Yinan Chen et.al. | 2502.11974 | link |
2025-02-17 | Generating Text from Uniform Meaning Representation | Emma Markle et.al. | 2502.11973 | link |
2025-02-17 | Massively Scaling Explicit Policy-conditioned Value Functions | Nico Bohlinger et.al. | 2502.11949 | null |
2025-02-14 | Region-Adaptive Sampling for Diffusion Transformers | Ziming Liu et.al. | 2502.10389 | null |
2025-02-14 | ReStyle3D: Scene-Level Appearance Transfer with Semantic Correspondences | Liyuan Zhu et.al. | 2502.10377 | null |
2025-02-14 | AffinityFlow: Guided Flows for Antibody Affinity Maturation | Can Chen et.al. | 2502.10365 | null |
2025-02-14 | Dimension-free Score Matching and Time Bootstrapping for Diffusion Models | Syamantak Kumar et.al. | 2502.10354 | null |
2025-02-14 | DiOpt: Self-supervised Diffusion for Constrained Optimization | Shutong Ding et.al. | 2502.10330 | null |
2025-02-14 | Generalised Parallel Tempering: Flexible Replica Exchange via Flows and Diffusions | Leo Zhang et.al. | 2502.10328 | null |
2025-02-14 | Analysis and Prediction of Coverage and Channel Rank for UAV Networks in Rural Scenarios with Foliage | Donggu Lee et.al. | 2502.10324 | null |
2025-02-14 | Probabilistic Super-Resolution for High-Fidelity Physical System Simulations with Uncertainty Quantification | Pengyu Zhang et.al. | 2502.10280 | null |
2025-02-14 | Dark Matter Attenuation Effects: Sensitivity Ceilings for Spin-Dependent and Spin-Independent Interactions | QUEST-DMC Collaboration et.al. | 2502.10251 | null |
2025-02-14 | Shaping Inductive Bias in Diffusion Models through Frequency-Based Noise Control | Thomas Jiralerspong et.al. | 2502.10236 | null |
2025-02-14 | Integrated Multi-Simulation Environments for Aerial Robotics Research | Pascal Goldschmid et.al. | 2502.10218 | link |
2025-02-14 | VideoDiff: Human-AI Video Co-Creation with Alternatives | Mina Huh et.al. | 2502.10190 | null |
2025-02-14 | Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model | Bo Ni et.al. | 2502.10173 | null |
2025-02-14 | Modeling biases in binary decision-making within the generalized nonlinear q-voter model | Maciej Doniec et.al. | 2502.10172 | link |
2025-02-14 | Modeling and Simulating Emerging Memory Technologies: A Tutorial | Yun-Chih Chen et.al. | 2502.10167 | null |
2025-02-13 | Theoretical Benefit and Limitation of Diffusion Language Model | Guhao Feng et.al. | 2502.09622 | null |
2025-02-13 | RigAnything: Template-Free Autoregressive Rigging for Diverse 3D Assets | Isabella Liu et.al. | 2502.09615 | null |
2025-02-13 | Designing a Conditional Prior Distribution for Flow-Based Generative Models | Noam Issachar et.al. | 2502.09611 | null |
2025-02-14 | Score-of-Mixture Training: Training One-Step Generative Models Made Simple via Score Estimation of Mixture Distributions | Tejas Jayashankar et.al. | 2502.09609 | null |
2025-02-13 | Rolling Ahead Diffusion for Traffic Scene Simulation | Yunpeng Liu et.al. | 2502.09587 | null |
2025-02-13 | Memorization and Generalization in Generative Diffusion under the Manifold Hypothesis | Beatrice Achilli et.al. | 2502.09578 | null |
2025-02-13 | Wireless and passive pressure detection using magneto-mechanical resonances in process engineering | Timo Merbach et.al. | 2502.09575 | null |
2025-02-13 | DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra | Montgomery Bohde et.al. | 2502.09571 | link |
2025-02-13 | Diffusing DeBias: a Recipe for Turning a Bug into a Feature | Massimiliano Ciranni et.al. | 2502.09564 | null |
2025-02-13 | Cryogenic SiPMs for the Optical Readout of DarkSide-20k | Giuseppe Matteucci et.al. | 2502.09558 | null |
2025-02-13 | Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model | Fei Shen et.al. | 2502.09533 | null |
2025-02-13 | SQ-GAN: Semantic Image Communications Using Masked Vector Quantization | Francesco Pezone et.al. | 2502.09520 | link |
2025-02-13 | Diffusion Models for Molecules: A Survey of Methods and Tasks | Liang Wang et.al. | 2502.09511 | link |
2025-02-14 | EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling | Theodoros Kouzelis et.al. | 2502.09509 | null |
2025-02-13 | AttentionSmithy: A Modular Framework for Rapid Transformer Development and Customization | Caleb Cranney et.al. | 2502.09503 | null |
2025-02-12 | SwiftSketch: A Diffusion Model for Image-to-Vector Sketch Generation | Ellie Arar et.al. | 2502.08642 | null |
2025-02-12 | CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation | Qinghe Wang et.al. | 2502.08639 | null |
2025-02-12 | Learning Selection Cuts With Gradients | Mike Hance et.al. | 2502.08615 | null |
2025-02-12 | An Initial Condition-Dependent Neural Network Approach for Optimal Control Problems | Mominul Rubel et.al. | 2502.08607 | null |
2025-02-12 | Chasing Charge Carriers: Diffusion Dynamics in Mixed-n Quasi-Two-Dimensional Colloidal MAPbBr3 Perovskites | Ronja Maria Piehler et.al. | 2502.08601 | null |
2025-02-12 | Enhancing Diffusion Models Efficiency by Disentangling Total-Variance and Signal-to-Noise Ratio | Khaled Kahouli et.al. | 2502.08598 | link |
2025-02-12 | Light-A-Video: Training-free Video Relighting via Progressive Light Fusion | Yujie Zhou et.al. | 2502.08590 | link |
2025-02-12 | Ultrasound Image Generation using Latent Diffusion Models | Benoit Freiche et.al. | 2502.08580 | null |
2025-02-12 | Mapping the Landscape of Generative AI in Network Monitoring and Management | Giampaolo Bovenzi et.al. | 2502.08576 | null |
2025-02-12 | Statistically validated projection of bipartite signed networks | Anna Gallo et.al. | 2502.08567 | null |
2025-02-12 | Human-Centric Foundation Models: Perception, Generation and Agentic Modeling | Shixiang Tang et.al. | 2502.08556 | link |
2025-02-12 | BCDDM: Branch-Corrected Denoising Diffusion Model for Black Hole Image Generation | Ao liu et.al. | 2502.08528 | null |
2025-02-12 | FedMHO: Heterogeneous One-Shot Federated Learning Towards Resource-Constrained Edge Devices | Dezhong Yao et.al. | 2502.08518 | link |
2025-02-12 | One-Shot Federated Learning with Classifier-Free Diffusion Models | Obaidullah Zaland et.al. | 2502.08488 | null |
2025-02-12 | Computed fingertip touch for the instrumental control of musical sound with an excursion on the computed retinal afterimage | Staas de Jong et.al. | 2502.08471 | null |
2025-02-11 | Pippo: High-Resolution Multi-View Humans from a Single Image | Yash Kant et.al. | 2502.07785 | null |
2025-02-11 | MatSwap: Light-aware material transfers in images | Ivan Lopes et.al. | 2502.07784 | null |
2025-02-11 | Stay-Positive: A Case for Ignoring Real Image Features in Fake Image Detection | Anirudh Sundara Rajan et.al. | 2502.07778 | null |
2025-02-11 | The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing | Dirk Bergemann et.al. | 2502.07736 | null |
2025-02-11 | Revisiting Non-Acyclic GFlowNets in Discrete Environments | Nikita Morozov et.al. | 2502.07735 | link |
2025-02-11 | DOGlove: Dexterous Manipulation with a Low-Cost Open-Source Haptic Force Feedback Glove | Han Zhang et.al. | 2502.07730 | null |
2025-02-11 | Near-Optimal Sample Complexity in Reward-Free Kernel-Based Reinforcement Learning | Aya Kayal et.al. | 2502.07715 | null |
2025-02-11 | Magic 1-For-1: Generating One Minute Video Clips within One Minute | Hongwei Yi et.al. | 2502.07701 | link |
2025-02-11 | Steering Protein Family Design through Profile Bayesian Flow | Jingjing Gong et.al. | 2502.07671 | null |
2025-02-11 | Guiding Time-Varying Generative Models with Natural Gradients on Exponential Family Manifold | Song Liu et.al. | 2502.07650 | null |
2025-02-11 | Distributional Instrumental Variable Method | Anastasiia Holovchak et.al. | 2502.07641 | link |
2025-02-11 | Consistency Training with Physical Constraints | Che-Chia Chang et.al. | 2502.07636 | null |
2025-02-11 | Tractable Transformers for Flexible Conditional Generation | Anji Liu et.al. | 2502.07616 | null |
2025-02-11 | YOLO Network For Defect Detection In Optical lenses | Habib Yaseen et.al. | 2502.07592 | null |
2025-02-11 | Generative Modeling with Bayesian Sample Inference | Marten Lienen et.al. | 2502.07580 | link |
2025-02-10 | Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT | Dongyang Liu et.al. | 2502.06782 | null |
2025-02-10 | Learning an Optimal Assortment Policy under Observational Data | Yuxuan Han et.al. | 2502.06777 | null |
2025-02-10 | Enhancing Performance of Explainable AI Models with Constrained Concept Refinement | Geyu Liang et.al. | 2502.06775 | null |
2025-02-10 | Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions | Jaeyeon Kim et.al. | 2502.06768 | null |
2025-02-10 | History-Guided Video Diffusion | Kiwhan Song et.al. | 2502.06764 | null |
2025-02-10 | Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists | Bojia Zi et.al. | 2502.06734 | null |
2025-02-10 | RSAttAE: An Information-Aware Attention-based Autoencoder Recommender System | Amirhossein Dadashzadeh Taromi et.al. | 2502.06705 | null |
2025-02-10 | No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers | Jiajun He et.al. | 2502.06685 | null |
2025-02-10 | Transfer Your Perspective: Controllable 3D Generation from Any Viewpoint in a Driving Scene | Tai-Yu Pan et.al. | 2502.06682 | null |
2025-02-10 | Filling a gap in materials mechanics: Nanoindentation at high constant strain rates upto | Lalith Kumar Bhaskar et.al. | 2502.06668 | null |
2025-02-11 | Unleashing the Potential of Pre-Trained Diffusion Models for Generalizable Person Re-Identification | Jiachen Li et.al. | 2502.06619 | link |
2025-02-10 | MaterialFusion: High-Quality, Zero-Shot, and Controllable Material Transfer with Diffusion Models | Kamil Garifullin et.al. | 2502.06606 | null |
2025-02-10 | Joint parameter and state estimation for regularized time-discrete multibody dynamics | Hannes Marklund et.al. | 2502.06599 | null |
2025-02-10 | A Large-scale AI-generated Image Inpainting Benchmark | Paschalis Giakoumoglou et.al. | 2502.06593 | null |
2025-02-10 | Optimizing Energy Efficiency in Subthreshold RISC-V Cores | Asbjørn Djupdal et.al. | 2502.06588 | null |
2025-02-07 | FlashVideo:Flowing Fidelity to Detail for Efficient High-Resolution Video Generation | Shilong Zhang et.al. | 2502.05179 | link |
2025-02-07 | Fillerbuster: Multi-View Scene Completion for Casual Captures | Ethan Weber et.al. | 2502.05175 | null |
2025-02-07 | Multitwine: Multi-Object Compositing with Text and Layout Control | Gemma Canet Tarrés et.al. | 2502.05165 | null |
2025-02-07 | Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment | Minh-Quan Le et.al. | 2502.05153 | null |
2025-02-07 | Latent Swap Joint Diffusion for Long-Form Audio Generation | Yusheng Dai et.al. | 2502.05130 | null |
2025-02-07 | Beautiful Images, Toxic Words: Understanding and Addressing Offensive Text in Generated Images | Aditya Kumar et.al. | 2502.05066 | link |
2025-02-07 | Prospects for detecting generic fast-time features in the neutrino lightcurve of nearby supernovae in neutrino telescopes | Jakob Beise et.al. | 2502.05024 | null |
2025-02-07 | Seasonal Station-Keeping of Short Duration High Altitude Balloons using Deep Reinforcement Learning | Tristan K. Schuler et.al. | 2502.05014 | null |
2025-02-07 | Robust Graph Learning Against Adversarial Evasion Attacks via Prior-Free Diffusion-Based Structure Purification | Jiayi Luo et.al. | 2502.05000 | link |
2025-02-07 | C2GM: Cascading Conditional Generation of Multi-scale Maps from Remote Sensing Images Constrained by Geographic Features | Chenxing Sun et.al. | 2502.04991 | null |
2025-02-07 | FF7: A Code Package for High-throughput Calculations and Constructing Materials Database | Tiancheng Ma et.al. | 2502.04984 | null |
2025-02-07 | Generative-enhanced optimization for knapsack problems: an industry-relevant study | Yelyzaveta Vodovozova et.al. | 2502.04928 | null |
2025-02-07 | ARTInp: CBCT-to-CT Image Inpainting and Image Translation in Radiotherapy | Ricardo Coimbra Brioso et.al. | 2502.04898 | null |
2025-02-07 | Goku: Flow Based Video Generative Foundation Models | Shoufa Chen et.al. | 2502.04896 | null |
2025-02-07 | Training-free Task-oriented Grasp Generation | Jiaming Wang et.al. | 2502.04873 | null |
2025-02-06 | Can Grammarly and ChatGPT accelerate language change? AI-powered technologies and their impact on the English language: wordiness vs. conciseness | Karolina Rudnicka et.al. | 2502.04324 | null |
2025-02-06 | HOG-Diff: Higher-Order Guided Diffusion for Graph Generation | Yiming Huang et.al. | 2502.04308 | link |
2025-02-06 | MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation | Jinbo Xing et.al. | 2502.04299 | null |
2025-02-06 | Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression | Lirui Wang et.al. | 2502.04296 | null |
2025-02-06 | Breaking the Vault: A Case Study of the 2022 LastPass Data Breach | Jessica Gentles et.al. | 2502.04287 | null |
2025-02-06 | Non-Variational Quantum Random Access Optimization with Alternating Operator Ansatz | Zichang He et.al. | 2502.04277 | null |
2025-02-06 | Digital Gatekeeping: An Audit of Search Engine Results shows tailoring of queries on the Israel-Palestine Conflict | Íris Damião et.al. | 2502.04266 | null |
2025-02-06 | Realistic Image-to-Image Machine Unlearning via Decoupling and Knowledge Retention | Ayush K. Varshney et.al. | 2502.04260 | null |
2025-02-06 | TriNER: A Series of Named Entity Recognition Models For Hindi, Bengali & Marathi | Mohammed Amaan Dhamaskar et.al. | 2502.04245 | null |
2025-02-06 | NLP-Based .NET CLR Event Logs Analyzer | Maxim Stavtsev et.al. | 2502.04219 | link |
2025-02-06 | MRAMG-Bench: A BeyondText Benchmark for Multimodal Retrieval-Augmented Multimodal Generation | Qinhan Yu et.al. | 2502.04176 | link |
2025-02-06 | Diffusion-based mass map reconstruction from weak lensing data | Supranta S. Boruah et.al. | 2502.04158 | null |
2025-02-06 | Synthetic Datasets for Machine Learning on Spatio-Temporal Graphs using PDEs | Jost Arndt et.al. | 2502.04140 | link |
2025-02-06 | Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis | Zhen Ye et.al. | 2502.04128 | link |
2025-02-06 | Generative Adversarial Networks Bridging Art and Machine Intelligence | Junhao Song et.al. | 2502.04116 | null |
2025-02-05 | Dress-1-to-3: Single Image to Simulation-Ready 3D Outfit with Diffusion Prior and Differentiable Physics | Xuan Li et.al. | 2502.03449 | null |
2025-02-05 | Masked Autoencoders Are Effective Tokenizers for Diffusion Models | Hao Chen et.al. | 2502.03444 | null |
2025-02-05 | Taking a Big Step: Large Learning Rates in Denoising Score Matching Prevent Memorization | Yu-Han Wu et.al. | 2502.03435 | null |
2025-02-05 | A Temporal Convolutional Network-Based Approach and a Benchmark Dataset for Colonoscopy Video Temporal Segmentation | Carlo Biffi et.al. | 2502.03430 | null |
2025-02-05 | TruePose: Human-Parsing-guided Attention Diffusion for Full-ID Preserving Pose Transfer | Zhihong Xu et.al. | 2502.03426 | null |
2025-02-05 | Can Text-to-Image Generative Models Accurately Depict Age? A Comparative Study on Synthetic Portrait Generation and Age Estimation | Alexey A. Novikov et.al. | 2502.03420 | null |
2025-02-05 | A Mixture-Based Framework for Guiding Diffusion Models | Yazid Janati et.al. | 2502.03332 | null |
2025-02-05 | An efficient end-to-end computational framework for the generation of ECG calibrated volumetric models of human atrial electrophysiology | Elena Zappon et.al. | 2502.03322 | null |
2025-02-05 | Simplifying Formal Proof-Generating Models with ChatGPT and Basic Searching Techniques | Sangjun Han et.al. | 2502.03321 | null |
2025-02-05 | Electronic properties and transport in metal/2D material/metal vertical junctions | Gaëlle Bigeard et.al. | 2502.03318 | null |
2025-02-05 | Posterior SBC: Simulation-Based Calibration Checking Conditional on Data | Teemu Säilynoja et.al. | 2502.03279 | link |
2025-02-05 | General Time-series Model for Universal Knowledge Representation of Multivariate Time-Series data | Cheng He et.al. | 2502.03264 | null |
2025-02-05 | Practical Introduction to FEM with GMSH: A MATLAB/Octave Perspective | Victor Dominguez et.al. | 2502.03248 | null |
2025-02-05 | MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent | Xinyao Liao et.al. | 2502.03207 | null |
2025-02-05 | Low-cost analog signal chain for transmit-receive circuits of passive induction-based resonators | Fabian Mohn et.al. | 2502.03202 | null |
2025-02-04 | COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation | Xueqing Deng et.al. | 2502.02589 | null |
2025-02-04 | Calibrated Multi-Preference Optimization for Aligning Diffusion Models | Kyungmin Lee et.al. | 2502.02588 | null |
2025-02-04 | Open Materials Generation with Stochastic Interpolants | Philipp Hoellmer et.al. | 2502.02582 | null |
2025-02-04 | A Family-Based Approach to Safety Cases for Controlled Airspaces in Small Uncrewed Aerial Systems | Michael C. Hunter et.al. | 2502.02559 | null |
2025-02-04 | Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation | Jian Liu et.al. | 2502.02525 | link |
2025-02-04 | Privacy Attacks on Image AutoRegressive Models | Antoni Kowalczuk et.al. | 2502.02514 | link |
2025-02-04 | Generative Modeling on Lie Groups via Euclidean Generalized Score Matching | Marco Bertolini et.al. | 2502.02513 | null |
2025-02-04 | Learning to generate physical ocean states: Towards hybrid climate modeling | Etienne Meunier et.al. | 2502.02499 | null |
2025-02-04 | Do Graph Diffusion Models Accurately Capture and Generate Substructure Distributions? | Xiyuan Wang et.al. | 2502.02488 | null |
2025-02-04 | Distributional Diffusion Models with Scoring Rules | Valentin De Bortoli et.al. | 2502.02483 | null |
2025-02-04 | Style transfer as data augmentation: evaluating unpaired image-to-image translation models in mammography | Emir Ahmed et.al. | 2502.02475 | null |
2025-02-04 | Towards Consistent and Controllable Image Synthesis for Face Editing | Mengting Wei et.al. | 2502.02465 | null |
2025-02-04 | Personalization Toolkit: Training Free Personalization of Large Vision Language Models | Soroush Seifi et.al. | 2502.02452 | null |
2025-02-04 | Sparse Data Generation Using Diffusion Models | Phil Ostheimer et.al. | 2502.02448 | null |
2025-02-04 | TransformDAS: Mapping Φ-OTDR Signals to Riemannian Manifold for Robust Classification | Jiaju Kang et.al. | 2502.02428 | null |
2025-01-31 | LiDAR Loop Closure Detection using Semantic Graphs with Graph Attention Networks | Liudi Yang et.al. | 2501.19382 | link |
2025-01-31 | Creative Problem-Solving: A Study with Blind and Low Vision Software Professionals | Karina Kohl et.al. | 2501.19380 | null |
2025-01-31 | Beyond Fixed Horizons: A Theoretical Framework for Adaptive Denoising Diffusions | Sören Christensen et.al. | 2501.19373 | null |
2025-01-31 | Addressing the correlation of Stokes-shifted photons emitted from two quantum emitters | Adrián Juan-Delgado et.al. | 2501.19356 | null |
2025-01-31 | Do Large Multimodal Models Solve Caption Generation for Scientific Figures? Lessons Learned from SCICAP Challenge 2023 | Ting-Yao E. Hsu et.al. | 2501.19353 | null |
2025-01-31 | Low-cost Microfluidic Testbed for Molecular Communications with Integrated Hydrodynamic Gating and Screen-printed Sensors | Maide Miray Albay et.al. | 2501.19341 | null |
2025-01-31 | Pathological MRI Segmentation by Synthetic Pathological Data Generation in Fetuses and Neonates | Misha P. T Kaandorp et.al. | 2501.19338 | null |
2025-01-31 | Analysis of LLMs vs Human Experts in Requirements Engineering | Cory Hymel et.al. | 2501.19297 | null |
2025-01-31 | Medical Semantic Segmentation with Diffusion Pretrain | David Li et.al. | 2501.19265 | null |
2025-01-31 | Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search | Yuta Oshima et.al. | 2501.19252 | null |
2025-01-31 | Single cell resolution 3D imaging and segmentation within intact live tissues | G. Paci et.al. | 2501.19203 | link |
2025-01-31 | A Variational Perspective on Generative Protein Fitness Optimization | Lea Bogensperger et.al. | 2501.19200 | null |
2025-01-31 | PSyDUCK: Training-Free Steganography for Latent Diffusion | Georgia Channing et.al. | 2501.19172 | null |
2025-01-31 | RMDM: Radio Map Diffusion Model with Physics Informed | Haozhe Jia et.al. | 2501.19160 | link |
2025-01-31 | A theoretical framework for overfitting in energy-based modeling | Giovanni Catania et.al. | 2501.19158 | null |
2025-01-30 | Diffusion Autoencoders are Scalable Image Tokenizers | Yinbo Chen et.al. | 2501.18593 | null |
2025-01-30 | DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models | Ruofan Liang et.al. | 2501.18590 | null |
2025-01-30 | WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training | Benjamin Feuer et.al. | 2501.18511 | link |
2025-01-30 | Examining the Expanding Role of Synthetic Data Throughout the AI Development Pipeline | Shivani Kapania et.al. | 2501.18493 | null |
2025-01-30 | CodeBrain: Impute Any Brain MRI via Instance-specific Scalar-quantized Codes | Yicheng Wu et.al. | 2501.18328 | null |
2025-01-30 | How to Select Datapoints for Efficient Human Evaluation of NLG Models? | Vilém Zouhar et.al. | 2501.18251 | link |
2025-01-30 | Free-T2M: Frequency Enhanced Text-to-Motion Diffusion Model With Consistency Loss | Wenshuo Chen et.al. | 2501.18232 | link |
2025-01-30 | Inverse source problem of sub-diffusion of variable exponent | Zhiyuan Li et.al. | 2501.18228 | null |
2025-01-30 | Behavior Modeling Space Reconstruction for E-Commerce Search | Yejing Wang et.al. | 2501.18216 | null |
2025-01-30 | Joint Design and Pricing of Extended Warranties for Multiple Automobiles with Different Price Bands | Yajing Chen et.al. | 2501.18203 | null |
2025-01-30 | Advancing Personalized Federated Learning: Integrative Approaches with AI for Enhanced Privacy and Customization | Kevin Cooper et.al. | 2501.18174 | null |
2025-01-31 | RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing | Jinyao Guo et.al. | 2501.18160 | null |
2025-01-30 | The Dilemma of Building Do-It-Yourself (DIY) Solutions for Workplace Accessibility | Yoonha Cha et.al. | 2501.18148 | null |
2025-01-30 | HyperZero: A Customized End-to-End Auto-Tuning System for Recommendation with Hourly Feedback | Xufeng Cai et.al. | 2501.18126 | null |
2025-01-29 | SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders | Bartosz Cywiński et.al. | 2501.18052 | link |
2025-01-29 | Enriched Immersed Finite Element and Isogeometric Analysis -- Algorithms and Data Structures | Nils Wunsch et.al. | 2501.17853 | null |
2025-01-29 | acoupi: An Open-Source Python Framework for Deploying Bioacoustic AI Models on Edge Devices | Aude Vuilliomenet et.al. | 2501.17841 | link |
2025-01-29 | Atomic Transfer Graphs: Secure-by-design Protocols for Heterogeneous Blockchain Ecosystems | Stephan Dübler et.al. | 2501.17786 | null |
2025-01-29 | Generative Unordered Flow for Set-Structured Data Generation | Yangming Li et.al. | 2501.17770 | null |
2025-01-29 | Formally Verified Binary-level Pointer Analysis | Freek Verbeek et.al. | 2501.17766 | null |
2025-01-29 | In-IDE Programming Courses: Learning Software Development in a Real-World Setting | Anastasiia Birillo et.al. | 2501.17747 | null |
2025-01-29 | Testing Research Software: An In-Depth Survey of Practices, Methods, and Tools | Nasir U. Eisty et.al. | 2501.17739 | null |
2025-01-29 | A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches | Ana R. Baião et.al. | 2501.17729 | null |
2025-01-29 | VICCA: Visual Interpretation and Comprehension of Chest X-ray Anomalies in Generated Report Without Human Feedback | Sayeh Gholipour Picha et.al. | 2501.17726 | null |
2025-01-29 | Source-Channel Separation Theorems for Distortion Perception Coding | Chao Tian et.al. | 2501.17706 | null |
2025-01-29 | Distinguished Quantized Guidance for Diffusion-based Sequence Recommendation | Wenyu Mao et.al. | 2501.17670 | null |
2025-01-29 | In-Context Meta LoRA Generation | Yihua Shao et.al. | 2501.17635 | null |
2025-01-29 | Semantic Consistency Regularization with Large Language Models for Semi-supervised Sentiment Analysis | Kunrong Li et.al. | 2501.17598 | null |
2025-01-29 | Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding | Marco Pasini et.al. | 2501.17578 | null |
2025-01-29 | Exploring the Potential of Wireless-enabled Multi-Chip AI Accelerators | Emmanuel Irabor et.al. | 2501.17567 | null |
2025-01-28 | CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation | Nikolai Kalischek et.al. | 2501.17162 | null |
2025-01-28 | IC-Portrait: In-Context Matching for View-Consistent Personalized Portrait | Han Yang et.al. | 2501.17159 | null |
2025-01-28 | First Axion-Like Particle Results from a Broadband Search for Wave-Like Dark Matter in the 44 to 52 | Gabe Hoshino et.al. | 2501.17119 | null |
2025-01-28 | Goodness of Fit for Bayesian Generative Models with Applications in Population Genetics | Guillaume Le Mailloux et.al. | 2501.17107 | link |
2025-01-28 | DataLens: ML-Oriented Interactive Tabular Data Quality Dashboard | Mohamed Abdelaal et.al. | 2501.17074 | null |
2025-01-28 | Generative diffusion models from a PDE perspective | Fei Cao et.al. | 2501.17054 | null |
2025-01-28 | MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition | Philippe Pasquier et.al. | 2501.17011 | null |
2025-01-28 | Generative quantum combinatorial optimization by means of a novel conditional generative quantum eigensolver | Shunya Minami et.al. | 2501.16986 | null |
2025-01-28 | A totally non-compensatory multi-criteria method for evaluating and improving level of satisfaction (LoS): proposal and application on Airport Terminal of Passengers | Phelipe Medeiros da Rocha et.al. | 2501.16979 | null |
2025-01-28 | Adversarial Masked Autoencoder Purifier with Defense Transferability | Yuan-Chih Chen et.al. | 2501.16904 | null |
2025-01-28 | Extending Information Bottleneck Attribution to Video Sequences | Veronika Solopova et.al. | 2501.16889 | link |
2025-01-28 | DIRIGENt: End-To-End Robotic Imitation of Human Demonstrations Based on a Diffusion Model | Josua Spisak et.al. | 2501.16800 | null |
2025-01-28 | Algorithm for Automatic Legislative Text Consolidation | Matias Etcheverry et.al. | 2501.16794 | null |
2025-01-28 | Exponential Family Attention | Kevin Christian Wibisono et.al. | 2501.16790 | link |
2025-01-28 | FlexMotion: Lightweight, Physics-Aware, and Controllable Human Motion Generation | Arvin Tashakori et.al. | 2501.16778 | null |
2025-01-27 | RelightVid: Temporal-Consistent Diffusion Model for Video Relighting | Ye Fang et.al. | 2501.16330 | null |
2025-01-27 | Movement- and Traffic-based User Identification in Commercial Virtual Reality Applications: Threats and Opportunities | Sara Baldoni et.al. | 2501.16326 | link |
2025-01-27 | Evaluating The Performance of Using Large Language Models to Automate Summarization of CT Simulation Orders in Radiation Oncology | Meiyun Cao et.al. | 2501.16309 | null |
2025-01-27 | RAPID: Retrieval-Augmented Parallel Inference Drafting for Text-Based Video Event Retrieval | Long Nguyen et.al. | 2501.16303 | null |
2025-01-27 | Congested Crossing Pedestrian Traffic Flow: Dispersion vs. Transport in Crowded Areas | Mariam Al Khatib et.al. | 2501.16275 | null |
2025-01-27 | Improving DBMS Scheduling Decisions with Fine-grained Performance Prediction on Concurrent Queries -- Extended | Ziniu Wu et.al. | 2501.16256 | null |
2025-01-27 | A foundation model for human-AI collaboration in medical literature mining | Zifeng Wang et.al. | 2501.16255 | null |
2025-01-27 | UDBE: Unsupervised Diffusion-based Brightness Enhancement in Underwater Images | Tatiana Taís Schein et.al. | 2501.16211 | link |
2025-01-27 | HERITRACE: A User-Friendly Semantic Data Editor with Change Tracking and Provenance Management for Cultural Heritage Institutions | Arcangelo Massari et.al. | 2501.16197 | null |
2025-01-27 | Multi-front dynamics in spatially inhomogeneous Allen-Cahn equations | Robbin Bastiaansen et.al. | 2501.16195 | null |
2025-01-27 | BAG: Body-Aligned 3D Wearable Asset Generation | Zhongjin Luo et.al. | 2501.16177 | null |
2025-01-27 | Efficient Portrait Matte Creation With Layer Diffusion and Connectivity Priors | Zhiyuan Lu et.al. | 2501.16147 | null |
2025-01-27 | Disruption-aware Microservice Re-orchestration for Cost-efficient Multi-cloud Deployments | Marco Zambianco et.al. | 2501.16143 | null |
2025-01-27 | Using Generative Models to Produce Realistic Populations of UK Windstorms | Yee Chun Tsoi et.al. | 2501.16110 | null |
2025-01-27 | ARFlow: Autogressive Flow with Hybrid Linear Attention | Mude Hui et.al. | 2501.16085 | null |
2025-01-24 | An Attentive Graph Agent for Topology-Adaptive Cyber Defence | Ilya Orson Sandoval et.al. | 2501.14700 | link |
2025-01-24 | Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning | Jisi Zhang et.al. | 2501.14680 | null |
2025-01-24 | End-to-end workflow for machine learning-based qubit readout with QICK and hls4ml | Giuseppe Di Guglielmo et.al. | 2501.14663 | null |
2025-01-24 | Towards Scalable Topological Regularizers | Hiu-Tung Wong et.al. | 2501.14641 | null |
2025-01-24 | Single-neuron deep generative model uncovers underlying physics of neuronal activity in Ca imaging data | Jordi Abante et.al. | 2501.14615 | null |
2025-01-24 | Visual Localization via Semantic Structures in Autonomous Photovoltaic Power Plant Inspection | Viktor Kozák et.al. | 2501.14587 | null |
2025-01-24 | Training-Free Style and Content Transfer by Leveraging U-Net Skip Connections in Stable Diffusion 2.* | Ludovica Schaerf et.al. | 2501.14524 | null |
2025-01-24 | Pesti-Gen: Unleashing a Generative Molecule Approach for Toxicity Aware Pesticide Design | Taehan Kim et.al. | 2501.14469 | null |
2025-01-24 | CENTS: Generating synthetic electricity consumption time series for rare and unseen scenarios | Michael Fuest et.al. | 2501.14426 | null |
2025-01-24 | DeepFlow: Serverless Large Language Model Serving at Scale | Junhao Hu et.al. | 2501.14417 | null |
2025-01-24 | Uncovering the bias in the evidence for dynamical dark energy through minimal and generalized modeling approaches | Ziad Sakr et.al. | 2501.14366 | null |
2025-01-24 | Advancing data-driven broadband seismic wavefield simulation with multi-conditional diffusion model | Zhengfa Bi et.al. | 2501.14348 | null |
2025-01-24 | HorNets: Learning from Discrete and Continuous Signals with Routing Neural Networks | Boshko Koloski et.al. | 2501.14346 | link |
2025-01-24 | Stochastic Method for Delayed Neutron Precursors Transport in Liquid Fuel | Mathis Caprais et.al. | 2501.14332 | null |
2025-01-24 | PAID: A Framework of Product-Centric Advertising Image Design | Hongyu Chen et.al. | 2501.14316 | null |
2025-01-23 | IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models | Jiayi Lei et.al. | 2501.13920 | null |
2025-01-23 | Improving Video Generation with Human Feedback | Jie Liu et.al. | 2501.13918 | null |
2025-01-23 | Binary Diffusion Probabilistic Model | Vitaliy Kinakh et.al. | 2501.13915 | null |
2025-01-23 | Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models | Linh Tran et.al. | 2501.13904 | null |
2025-01-23 | A RAG-Based Institutional Assistant | Gustavo Kuratomi et.al. | 2501.13880 | null |
2025-01-23 | Unveiling the Power of Noise Priors: Enhancing Diffusion Models for Mobile Traffic Prediction | Zhi Sheng et.al. | 2501.13794 | null |
2025-01-23 | An Efficient Diffusion-based Non-Autoregressive Solver for Traveling Salesman Problem | Mingzhao Wang et.al. | 2501.13767 | link |
2025-01-23 | A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation | Dario Serez et.al. | 2501.13718 | null |
2025-01-23 | YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID | Iñaki Erregue et.al. | 2501.13710 | link |
2025-01-23 | Training-Free Consistency Pipeline for Fashion Repose | Potito Aghilar et.al. | 2501.13692 | null |
2025-01-23 | A Transformer-based Autoregressive Decoder Architecture for Hierarchical Text Classification | Younes Yousef et.al. | 2501.13598 | link |
2025-01-23 | Funnelling super-resolution STED microscopy through multimode fibres | André Gomes et.al. | 2501.13572 | null |
2025-01-24 | One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt | Tao Liu et.al. | 2501.13554 | link |
2025-01-23 | Diffusion-based Perceptual Neural Video Compression with Temporal Diffusion Information Reuse | Wenzhuo Ma et.al. | 2501.13528 | null |
2025-01-23 | LDR-Net: A Novel Framework for AI-generated Image Detection via Localized Discrepancy Representation | JiaXin Chen et.al. | 2501.13475 | null |
2025-01-22 | Accelerate High-Quality Diffusion Models with Inner Loop Feedback | Matthew Gwilliam et.al. | 2501.13107 | null |
2025-01-22 | Robust Representation Consistency Model via Contrastive Denoising | Jiachen Lei et.al. | 2501.13094 | link |
2025-01-22 | Innovative Web Tool for Remote Data Acquisition and Analysis: Customized for SKA Low frequency Beamforming Test Bed LPDA Array at Gauribidanur Radio Observatory | Anumanchi Agastya Sai Ram Likhit et.al. | 2501.13090 | null |
2025-01-22 | Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation | Akshay Krishnan et.al. | 2501.13087 | null |
2025-01-22 | Robust Body Composition Analysis by Generating 3D CT Volumes from Limited 2D Slices | Lianrui Zuo et.al. | 2501.13071 | null |
2025-01-22 | Beyond the Lungs: Extending the Field of View in Chest CT with Latent Diffusion Models | Lianrui Zuo et.al. | 2501.13068 | null |
2025-01-22 | Neural network enhanced cross entropy benchmark for monitored circuits | Yangrui Hu et.al. | 2501.13005 | null |
2025-01-22 | Low-dimensional adaptation of diffusion models: Convergence in total variation | Jiadong Liang et.al. | 2501.12982 | null |
2025-01-22 | Accessible Smart Contracts Verification: Synthesizing Formal Models with Tamed LLMs | Jan Corazza et.al. | 2501.12972 | null |
2025-01-22 | Observation of Strong Nonreciprocal Thermal Emission | Zhenong Zhang et.al. | 2501.12947 | null |
2025-01-22 | 3D Object Manipulation in a Single Image using Generative Models | Ruisi Zhao et.al. | 2501.12935 | null |
2025-01-22 | Reinforcement learning Based Automated Design of Differential Evolution Algorithm for Black-box Optimization | Xu Yang et.al. | 2501.12881 | null |
2025-01-22 | CrossDiff: Diffusion Probabilistic Model With Cross-conditional Encoder-Decoder for Crack Segmentation | Xianglong Shi et.al. | 2501.12860 | null |
2025-01-22 | AMM-Diff: Adaptive Multi-Modality Diffusion Network for Missing Modality Imputation | Aghiles Kebaili et.al. | 2501.12840 | null |
2025-01-22 | Inverse Design of Chiral Structures for Giant Helical Dichroism | Chia-Chun Pan et.al. | 2501.12825 | null |
2025-01-21 | Towards Affordance-Aware Articulation Synthesis for Rigged Objects | Yu-Chu Yu et.al. | 2501.12393 | null |
2025-01-22 | GPS as a Control Signal for Image Generation | Chao Feng et.al. | 2501.12390 | null |
2025-01-21 | Audio Texture Manipulation by Exemplar-Based Analogy | Kan Jen Cheng et.al. | 2501.12385 | null |
2025-01-21 | Accelerating Pulsar Parameter Estimation Using Convolutional Neural Networks | Greg Olmschenk et.al. | 2501.12383 | null |
2025-01-21 | DiffDoctor: Diagnosing Image Diffusion Models Before Treating | Yiyang Wang et.al. | 2501.12382 | null |
2025-01-22 | Video Depth Anything: Consistent Depth Estimation for Super-Long Videos | Sili Chen et.al. | 2501.12375 | null |
2025-01-21 | FuocChuVIP123 at CoMeDi Shared Task: Disagreement Ranking with XLM-Roberta Sentence Embeddings and Deep Neural Regression | Phuoc Duong Huy Chu et.al. | 2501.12336 | null |
2025-01-21 | VipDiff: Towards Coherent and Diverse Video Inpainting via Training-free Denoising Diffusion Models | Chaohao Xie et.al. | 2501.12267 | null |
2025-01-21 | Joint Reconstruction and Motion Estimation in Sparse-View 4DCT Using Diffusion Models within a Blind Inverse Problem Framework | Antoine De Paepe et.al. | 2501.12249 | null |
2025-01-21 | InsTALL: Context-aware Instructional Task Assistance with Multi-modal Large Language Models | Pha Nguyen et.al. | 2501.12231 | null |
2025-01-21 | TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space | Daniel Garibi et.al. | 2501.12224 | null |
2025-01-21 | Early Detection and Classification of Breast Cancer Using Deep Learning Techniques | Mst. Mumtahina Labonno et.al. | 2501.12217 | null |
2025-01-22 | Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation | Zibo Zhao et.al. | 2501.12202 | link |
2025-01-21 | An End-to-End Approach for Korean Wakeword Systems with Speaker Authentication | Geonwoo Seo et.al. | 2501.12194 | link |
2025-01-21 | ComposeAnyone: Controllable Layout-to-Human Generation with Decoupled Multimodal Conditions | Shiyue Zhang et.al. | 2501.12173 | link |
2025-01-17 | Zero-Shot Monocular Scene Flow Estimation in the Wild | Yiqing Liang et.al. | 2501.10357 | null |
2025-01-17 | Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems | Weibo Gao et.al. | 2501.10332 | link |
2025-01-17 | DiffStereo: High-Frequency Aware Diffusion Model for Stereo Image Restoration | Huiyun Cao et.al. | 2501.10325 | null |
2025-01-17 | SEANN: A Domain-Informed Neural Network for Epidemiological Insights | Jean-Baptiste Guimbaud et.al. | 2501.10273 | null |
2025-01-17 | Drift time calibration of the ultra-Low material budget GEM-based TPC for MIXE | X. Zhao et.al. | 2501.10249 | null |
2025-01-17 | Over-the-Air Multi-Sensor Inference with Neural Networks Using Memristor-Based Analog Computing | Busra Tegin et.al. | 2501.10245 | null |
2025-01-17 | Modelling Activity Scheduling Behaviour with Deep Generative Machine Learning | Fred Shone et.al. | 2501.10221 | null |
2025-01-17 | Adaptive Clustering for Efficient Phenotype Segmentation of UAV Hyperspectral Data | Ciem Cornelissen et.al. | 2501.10199 | null |
2025-01-17 | Optimizing Structured-Sparse Matrix Multiplication in RISC-V Vector Processors | Vasileios Titopoulos et.al. | 2501.10189 | null |
2025-01-17 | Convex Physics Informed Neural Networks for the Monge-Ampère Optimal Transport Problem | Alexandre Caboussat et.al. | 2501.10162 | null |
2025-01-17 | AI-Generated Music Detection and its Challenges | Darius Afchar et.al. | 2501.10111 | link |
2025-01-17 | DiffVSR: Enhancing Real-World Video Super-Resolution with Diffusion Models for Advanced Visual Quality and Temporal Consistency | Xiaohui Li et.al. | 2501.10110 | null |
2025-01-17 | landmarker: a Toolkit for Anatomical Landmark Localization in 2D/3D Images | Jef Jonkers et.al. | 2501.10098 | link |
2025-01-17 | Conditional Latent Diffusion-Based Speech Enhancement Via Dual Context Learning | Shengkui Zhao et.al. | 2501.10052 | link |
2025-01-17 | DiffuEraser: A Diffusion Model for Video Inpainting | Xiaowen Li et.al. | 2501.10018 | link |
2025-01-16 | SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces | Sumit Chaturvedi et.al. | 2501.09756 | null |
2025-01-16 | Learnings from Scaling Visual Tokenizers for Reconstruction and Generation | Philippe Hansen-Estruch et.al. | 2501.09755 | null |
2025-01-16 | KU AIGEN ICL EDI@BC8 Track 3: Advancing Phenotype Named Entity Recognition and Normalization for Dysmorphology Physical Examination Reports | Hajung Kim et.al. | 2501.09744 | null |
2025-01-16 | Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps | Nanye Ma et.al. | 2501.09732 | null |
2025-01-16 | Comparative Insights from 12 Machine Learning Models in Extracting Economic Ideology from Political Text | Jihed Ncib et.al. | 2501.09719 | null |
2025-01-16 | Reward-Guided Controlled Generation for Inference-Time Alignment in Diffusion Models: Tutorial and Review | Masatoshi Uehara et.al. | 2501.09685 | null |
2025-01-16 | A Survey of Research in Large Language Models for Electronic Design Automation | Jingyu Pan et.al. | 2501.09655 | null |
2025-01-16 | Fabrication of Mode-Matched, Low-Loss Optical Resonators by Combination of FIB-Milling and CO | Patrick Maier et.al. | 2501.09577 | null |
2025-01-16 | AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation | Junjie He et.al. | 2501.09503 | link |
2025-01-16 | Pruning for Sparse Diffusion Models based on Gradient Flow | Ben Wan et.al. | 2501.09464 | null |
2025-01-16 | "A Great Start, But...": Evaluating LLM-Generated Mind Maps for Information Mapping in Video-Based Design | Tianhao He et.al. | 2501.09457 | null |
2025-01-16 | CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation | Hwan Heo et.al. | 2501.09433 | link |
2025-01-16 | Towards a Framework for Enterprise Architecture in Mobile Government: A Case Study | Son Pham et.al. | 2501.09401 | null |
2025-01-16 | Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse | Guangyuan Liu et.al. | 2501.09391 | null |
2025-01-16 | Identification of Traditional Medicinal Plant Leaves Using an effective Deep Learning model and Self-Curated Dataset | Deepjyoti Chetia et.al. | 2501.09363 | null |
2025-01-15 | How Do Generative Models Draw a Software Engineer? A Case Study on Stable Diffusion Bias | Tosin Fadahunsi et.al. | 2501.09014 | link |
2025-01-15 | SimGen: A Diffusion-Based Framework for Simultaneous Surgical Image and Segmentation Mask Generation | Aditya Bhat et.al. | 2501.09008 | null |
2025-01-15 | CrystalGRW: Generative Modeling of Crystal Structures with Targeted Properties via Geodesic Random Walks | Krit Tangsongcharoen et.al. | 2501.08998 | link |
2025-01-15 | VECT-GAN: A variationally encoded generative model for overcoming data scarcity in pharmaceutical science | Youssef Abdalla et.al. | 2501.08995 | link |
2025-01-15 | RepVideo: Rethinking Cross-Layer Representation for Video Generation | Chenyang Si et.al. | 2501.08994 | null |
2025-01-15 | CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities | Haozhe Xie et.al. | 2501.08983 | link |
2025-01-15 | Learning to Extract Cross-Domain Aspects and Understanding Sentiments Using Large Language Models | Karukriti Kaushik Ghosh et.al. | 2501.08974 | null |
2025-01-15 | Karatsuba Matrix Multiplication and its Efficient Custom Hardware Implementations | Trevor E. Pogue et.al. | 2501.08889 | link |
2025-01-15 | Connecting SPDE to SGMs | Junsu Seo et.al. | 2501.08877 | null |
2025-01-16 | Silent Abandonment in Text-Based Contact Centers: Identifying, Quantifying, and Mitigating its Operational Impacts | Antonio Castellanos et.al. | 2501.08869 | null |
2025-01-15 | Boosting Diffusion Guidance via Learning Degradation-Aware Models for Blind Super Resolution | Shao-Hao Lu et.al. | 2501.08819 | link |
2025-01-15 | Securities Transaction Settlement Optimization on superconducting quantum devices | Francesco Martini et.al. | 2501.08794 | null |
2025-01-15 | Near-Field ISAC: Synergy of Dual-Purpose Codebooks and Space-Time Adaptive Processing | Ahmed Hussain et.al. | 2501.08776 | null |
2025-01-15 | Adaptive Approximation Schemes for Matching Queues | Alireza AmaniHamedani et.al. | 2501.08775 | null |
2025-01-15 | An Ultra-Wideband Dual Polarization Antenna Array for the Detection and Localization of Bright Fast Radio Transients in the Milky Way | Diego Gallardo et.al. | 2501.08764 | null |
2025-01-14 | DAViD: Modeling Dynamic Affordance of 3D Objects using Pre-trained Video Diffusion Models | Hyeonwoo Kim et.al. | 2501.08333 | null |
2025-01-14 | MangaNinja: Line Art Colorization with Precise Reference Following | Zhiheng Liu et.al. | 2501.08332 | null |
2025-01-14 | Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise | Ryan Burgert et.al. | 2501.08331 | link |
2025-01-14 | GameFactory: Creating New Games with Generative Interactive Videos | Jiwen Yu et.al. | 2501.08325 | null |
2025-01-14 | Diffusion Adversarial Post-Training for One-Step Video Generation | Shanchuan Lin et.al. | 2501.08316 | null |
2025-01-14 | LayerAnimate: Layer-specific Control for Animation | Yuxue Yang et.al. | 2501.08295 | null |
2025-01-14 | HALoGEN: Fantastic LLM Hallucinations and Where to Find Them | Abhilasha Ravichander et.al. | 2501.08292 | null |
2025-01-14 | FDPP: Fine-tune Diffusion Policy with Human Preference | Yuxin Chen et.al. | 2501.08259 | null |
2025-01-14 | Text-Diffusion Red-Teaming of Large Language Models: Unveiling Harmful Behaviors with Proximity Constraints | Jonathan Nöther et.al. | 2501.08246 | null |
2025-01-14 | Engineering LLM Powered Multi-agent Framework for Autonomous CloudOps | Kannan Parthasarathy et.al. | 2501.08243 | null |
2025-01-14 | CodecFake-Omni: A Large-Scale Codec-based Deepfake Speech Dataset | Jiawei Du et.al. | 2501.08238 | null |
2025-01-14 | FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors | Yabo Zhang et.al. | 2501.08225 | link |
2025-01-14 | D | Qian Zeng et.al. | 2501.08180 | link |
2025-01-14 | DM-Mamba: Dual-domain Multi-scale Mamba for MRI reconstruction | Yucong Meng et.al. | 2501.08163 | link |
2025-01-14 | Multiple-Input Variational Auto-Encoder for Anomaly Detection in Heterogeneous Data | Phai Vu Dinh et.al. | 2501.08149 | null |
2025-01-13 | Training-Free Motion-Guided Video Generation with Enhanced Temporal Consistency Using Motion Consistency Loss | Xinyu Zhang et.al. | 2501.07563 | null |
2025-01-13 | Confident Pseudo-labeled Diffusion Augmentation for Canine Cardiomegaly Detection | Shiman Zhang et.al. | 2501.07533 | link |
2025-01-13 | IP-FaceDiff: Identity-Preserving Facial Video Editing with Diffusion | Tharun Anand et.al. | 2501.07530 | null |
2025-01-13 | LitmusKt: Concurrency Stress Testing for Kotlin | Denis Lochmelis et.al. | 2501.07472 | link |
2025-01-13 | PrecipDiff: Leveraging image diffusion models to enhance satellite-based precipitation observations | Ting-Yu Dai et.al. | 2501.07447 | null |
2025-01-13 | Diff-Ensembler: Learning to Ensemble 2D Diffusion Models for Volume-to-Volume Medical Image Translation | Xiyue Zhu et.al. | 2501.07430 | null |
2025-01-13 | OCORD: Open-Campus Object Removal Dataset | Shuo Zhang et.al. | 2501.07397 | null |
2025-01-13 | Bigger Isn't Always Better: Towards a General Prior for Medical Image Reconstruction | Lukas Glaszner et.al. | 2501.07376 | link |
2025-01-13 | Simulating the Hubbard Model with Equivariant Normalizing Flows | Dominic Schuh et.al. | 2501.07371 | null |
2025-01-13 | Multimodal semantic retrieval for product search | Dong Liu et.al. | 2501.07365 | null |
2025-01-13 | Predicting System Dynamics of Universal Growth Patterns in Complex Systems | Leila Hedayatifar et.al. | 2501.07349 | null |
2025-01-13 | The Spectrum of C/2023 A3 Indicates A Depleted Composition | Yunyi Tang et.al. | 2501.07340 | null |
2025-01-13 | Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring | Buse Sibel Korkmaz et.al. | 2501.07324 | link |
2025-01-13 | ViewVR: Visual Feedback Modes to Achieve Quality of VR-based Telemanipulation | A. Erkhov et.al. | 2501.07299 | link |
2025-01-13 | Skip Mamba Diffusion for Monocular 3D Semantic Scene Completion | Li Liang et.al. | 2501.07260 | link |
2025-01-10 | ScooterLab: A Programmable and Participatory Sensing Research Testbed using Micromobility Vehicles | Ubaidullah Khan et.al. | 2501.06177 | null |
2025-01-10 | VideoAuteur: Towards Long Narrative Video Generation | Junfei Xiao et.al. | 2501.06173 | null |
2025-01-10 | GenMol: A Drug Discovery Generalist with Discrete Diffusion | Seul Lee et.al. | 2501.06158 | null |
2025-01-10 | From discrete-time policies to continuous-time diffusion samplers: Asymptotic equivalences and faster training | Julius Berner et.al. | 2501.06148 | link |
2025-01-10 | The interplay of user preference and precision in different gaze-based interaction methods | Björn Rene Severitt et.al. | 2501.06073 | null |
2025-01-10 | Photokinetics of Photothermal Reactions | Mounir Maafi et.al. | 2501.06057 | null |
2025-01-10 | Nonisotropic Gaussian Diffusion for Realistic 3D Human Motion Prediction | Cecilia Curreli et.al. | 2501.06035 | null |
2025-01-10 | Resiliency metrics quantifying emergency response in a distribution system | Shikhar Pandey et.al. | 2501.06030 | null |
2025-01-10 | RPKI-Based Location-Unaware Tor Guard Relay Selection Algorithms | Zhifan Lu et.al. | 2501.06010 | link |
2025-01-10 | CamCtrl3D: Single-Image Scene Exploration with Precise 3D Camera Control | Stefan Popov et.al. | 2501.06006 | null |
2025-01-10 | Model Inversion in Split Learning for Personalized LLMs: New Insights from Information Bottleneck Theory | Yunmeng Shu et.al. | 2501.05965 | null |
2025-01-10 | Estimation and Restoration of Unknown Nonlinear Distortion using Diffusion | Michal Švento et.al. | 2501.05959 | link |
2025-01-10 | DiffuSETS: 12-lead ECG Generation Conditioned on Clinical Text Reports and Patient-Specific Information | Yongfan Lai et.al. | 2501.05932 | link |
2025-01-10 | Beyond Flat Text: Dual Self-inherited Guidance for Visual Text Generation | Minxing Luo et.al. | 2501.05892 | null |
2025-01-10 | Poetry in Pixels: Prompt Tuning for Poem Image Generation via Diffusion Models | Sofia Jamil et.al. | 2501.05839 | link |
2025-01-09 | Decentralized Diffusion Models | David McAllister et.al. | 2501.05450 | null |
2025-01-09 | Consistent Flow Distillation for Text-to-3D Generation | Runjie Yan et.al. | 2501.05445 | null |
2025-01-09 | Progressive Growing of Video Tokenizers for Highly Compressed Latent Spaces | Aniruddha Mahapatra et.al. | 2501.05442 | null |
2025-01-09 | The GAN is dead; long live the GAN! A Modern GAN Baseline | Yiwen Huang et.al. | 2501.05441 | link |
2025-01-09 | Zero-1-to-G: Taming Pretrained 2D Diffusion Model for Direct 3D Generation | Xuyi Meng et.al. | 2501.05427 | null |
2025-01-09 | Seeing Sound: Assembling Sounds from Visuals for Audio-to-Image Generation | Darius Petermann et.al. | 2501.05413 | null |
2025-01-09 | TimeDP: Learning to Generate Multi-Domain Time Series with Domain Prompts | Yu-Hao Huang et.al. | 2501.05403 | link |
2025-01-09 | Integrating Explainable AI for Effective Malware Detection in Encrypted Network Traffic | Sileshi Nibret Zeleke et.al. | 2501.05387 | null |
2025-01-09 | Accelerated Diffusion Models via Speculative Sampling | Valentin De Bortoli et.al. | 2501.05370 | null |
2025-01-09 | CROPS: Model-Agnostic Training-Free Framework for Safe Image Synthesis with Latent Diffusion Models | Junha Park et.al. | 2501.05359 | null |
2025-01-09 | Video-Conferencing Beyond Screen-Sharing and Thumbnail Webcam Videos: Gesture-Aware Augmented Reality Video for Data-Rich Remote Presentations | Matthew Brehmer et.al. | 2501.05345 | null |
2025-01-09 | The Bakers and Millers Game with Restricted Locations | Simon Krogmann et.al. | 2501.05334 | null |
2025-01-09 | Patch-GAN Transfer Learning with Reconstructive Models for Cloud Removal | Wanli Ma et.al. | 2501.05265 | null |
2025-01-09 | Light Transport-aware Diffusion Posterior Sampling for Single-View Reconstruction of 3D Volumes | Ludwic Leonard et.al. | 2501.05226 | link |
2025-01-09 | A Novel Approach to Scalable and Automatic Topic-Controlled Question Generation in Education | Ziqing Li et.al. | 2501.05220 | null |
2025-01-08 | EditAR: Unified Conditional Generation with Autoregressive Models | Jiteng Mu et.al. | 2501.04699 | null |
2025-01-08 | ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning | Yuzhou Huang et.al. | 2501.04698 | null |
2025-01-08 | SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images | Zixuan Huang et.al. | 2501.04689 | null |
2025-01-08 | URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics | Ruilin Luo et.al. | 2501.04686 | link |
2025-01-08 | Integrating IPbus ALFRED into the ALICE-FIT setup | Krystian Roslon et.al. | 2501.04685 | null |
2025-01-08 | Enhancing Financial VQA in Vision Language Models using Intermediate Structured Representations | Archita Srivastava et.al. | 2501.04675 | null |
2025-01-08 | A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI | Kazusato Oko et.al. | 2501.04641 | link |
2025-01-08 | Knowledge Retrieval Based on Generative AI | Te-Lun Yang et.al. | 2501.04635 | null |
2025-01-08 | Disentangled Clothed Avatar Generation with Layered Representation | Weitian Zhang et.al. | 2501.04631 | null |
2025-01-09 | MedCoDi-M: A Multi-Prompt Foundation Model for Multimodal Medical Data Generation | Daniele Molino et.al. | 2501.04614 | null |
2025-01-08 | Enhancing Low-Cost Video Editing with Lightweight Adaptors and Temporal-Aware Inversion | Yangfan He et.al. | 2501.04606 | link |
2025-01-08 | Understanding Expectations for a Robotic Guide Dog for Visually Impaired People | J. Taery Kim et.al. | 2501.04594 | null |
2025-01-08 | Improving Image Captioning by Mimicking Human Reformulation Feedback at Inference-time | Uri Berger et.al. | 2501.04513 | null |
2025-01-08 | Simultaneous MOKE imaging and measurement of magneto-resistance with vector magnet: a low noise customized setup for low field magnetic devices and thin films characterization | Imtiaz Noor Bhatti et.al. | 2501.04431 | null |
2025-01-08 | End-to-End Bangla AI for Solving Math Olympiad Problem Benchmark: Leveraging Large Language Model Using Integrated Approach | H. M. Shadman Tabib et.al. | 2501.04425 | null |
2025-01-07 | WAPTS: A Weighted Allocation Probability Adjusted Thompson Sampling Algorithm for High-Dimensional and Sparse Experiment Settings | Haochen Song et.al. | 2501.03999 | null |
2025-01-07 | Synthetic Data for Portfolios: A Throw of the Dice Will Never Abolish Chance | Adil Rengim Cetingoz et.al. | 2501.03993 | null |
2025-01-07 | NeuralSVG: An Implicit Representation for Text-to-Vector Generation | Sagi Polaczek et.al. | 2501.03992 | null |
2025-01-07 | Stabilising effect of generic anomalous diffusion independent of the Rayleigh number | Antonio Barletta et.al. | 2501.03990 | null |
2025-01-07 | Synthetic Data Privacy Metrics | Amy Steier et.al. | 2501.03941 | null |
2025-01-07 | Visual question answering: from early developments to recent advances -- a survey | Ngoc Dung Huynh et.al. | 2501.03939 | null |
2025-01-07 | A precise asymptotic analysis of learning diffusion models: theory and insights | Hugo Cui et.al. | 2501.03937 | link |
2025-01-07 | Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers | Yuechen Zhang et.al. | 2501.03931 | link |
2025-01-07 | HYB-VITON: A Hybrid Approach to Virtual Try-On Combining Explicit and Implicit Warping | Kosuke Takemoto et.al. | 2501.03910 | link |
2025-01-07 | mFabric: An Efficient and Scalable Fabric for Mixture-of-Experts Training | Xudong Liao et.al. | 2501.03905 | null |
2025-01-07 | Rendezfood: A Design Case Study of a Conversational Location-based Approach in Restaurants | Philip Weber et.al. | 2501.03862 | null |
2025-01-07 | Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control | Zekai Gu et.al. | 2501.03847 | link |
2025-01-07 | Deep Sylvester Posterior Inference for Adaptive Compressed Sensing in Ultrasound Imaging | Simon W. Penninga et.al. | 2501.03825 | null |
2025-01-07 | Impact of diffusion mechanisms on persistence and spreading | Nathanaël Boutillon et.al. | 2501.03816 | null |
2025-01-07 | Private, Auditable, and Distributed Ledger for Financial Institutes | Shaltiel Eloul et.al. | 2501.03808 | link |
2025-01-06 | MObI: Multimodal Object Inpainting Using Diffusion Models | Alexandru Buburuzan et.al. | 2501.03173 | null |
2025-01-06 | Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches | Alhassan Mumuni et.al. | 2501.03151 | null |
2025-01-06 | DDRM-PR: Fourier Phase Retrieval using Denoising Diffusion Restoration Models | Mehmet Onurcan Kaya et.al. | 2501.03030 | link |
2025-01-06 | TransPixar: Advancing Text-to-Video Generation with Transparency | Luozhou Wang et.al. | 2501.03006 | link |
2025-01-06 | STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution | Rui Xie et.al. | 2501.02976 | null |
2025-01-06 | Leader Rotation Is Not Enough: Scrutinizing Leadership Democracy of Chained BFT Consensus | Yining Tang et.al. | 2501.02970 | null |
2025-01-07 | SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild | Jiawei Liu et.al. | 2501.02962 | null |
2025-01-06 | Inhibition of bacterial growth by antibiotics | Barnabe Ledoux et.al. | 2501.02944 | null |
2025-01-06 | Deep Generative Model-Aided Power System Dynamic State Estimation and Reconstruction with Unknown Control Inputs or Data Distributions | Jianhua Pei et.al. | 2501.02928 | null |
2025-01-06 | Pointmap-Conditioned Diffusion for Consistent Novel View Synthesis | Thang-Anh-Quan Nguyen et.al. | 2501.02913 | null |
2025-01-06 | Sim-to-Real Transfer for Mobile Robots with Reinforcement Learning: from NVIDIA Isaac Sim to Gazebo and Real ROS 2 Robots | Sahar Salimpour et.al. | 2501.02902 | link |
2025-01-06 | Conditional Mutual Information Based Diffusion Posterior Sampling for Solving Inverse Problems | Shayan Mohajer Hamidi et.al. | 2501.02880 | null |
2025-01-06 | Towards HRTF Personalization using Denoising Diffusion Models | Juan Camilo Albarracín Sánchez et.al. | 2501.02871 | null |
2025-01-07 | Diff-Lung: Diffusion-Based Texture Synthesis for Enhanced Pathological Tissue Segmentation in Lung CT Scans | Rezkellah Noureddine Khiati et.al. | 2501.02867 | null |
2025-01-06 | Large Language Models for Video Surveillance Applications | Ulindu De Silva et.al. | 2501.02850 | null |
2025-01-03 | Metadata Conditioning Accelerates Language Model Pre-training | Tianyu Gao et.al. | 2501.01956 | link |
2025-01-03 | MADGEN -- Mass-Spec attends to De Novo Molecular generation | Yinkai Wang et.al. | 2501.01950 | link |
2025-01-03 | Bridging Classification and Segmentation in Osteosarcoma Assessment via Foundation and Discrete Diffusion Models | Manh Duong Nguyen et.al. | 2501.01932 | link |
2025-01-03 | EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation | Siyuan Huang et.al. | 2501.01895 | null |
2025-01-03 | Exploring Equality: An Investigation into Custom Loss Functions for Fairness Definitions | Gordon Lee et.al. | 2501.01889 | null |
2025-01-03 | LCFed: An Efficient Clustered Federated Learning Framework for Heterogeneous Data | Yuxin Zhang et.al. | 2501.01850 | null |
2025-01-03 | MoColl: Agent-Based Specific and General Model Collaboration for Image Captioning | Pu Yang et.al. | 2501.01834 | null |
2025-01-03 | Creating Artificial Students that Never Existed: Leveraging Large Language Models and CTGANs for Synthetic Data Generation | Mohammad Khalil et.al. | 2501.01793 | link |
2025-01-03 | Ingredients: Blending Custom Photos with Video Diffusion Transformers | Zhengcong Fei et.al. | 2501.01790 | link |
2025-01-03 | Nonparametric estimation of a factorizable density using diffusion models | Hyeok Kyu Kwon et.al. | 2501.01783 | null |
2025-01-03 | Customizing pseudospin unidirectional states of acoustic and electromagnetic waves in two-dimensional phoxonic topological insulators via multi-objective strategies | Gang-Gang Xu et.al. | 2501.01766 | null |
2025-01-03 | Constrained Pricing in Choice-based Revenue Management | Qian Shao et.al. | 2501.01764 | null |
2025-01-03 | Adverse Weather Conditions Augmentation of LiDAR Scenes with Latent Diffusion Models | Andrea Matteazzi et.al. | 2501.01761 | null |
2025-01-03 | MusicGen-Stem: Multi-stem music generation and edition through autoregressive modeling | Simon Rouard et.al. | 2501.01757 | null |
2025-01-03 | Combined Hyper-Extensible Extremely-Secured Zero-Trust CIAM-PAM architecture | Shivom Aggarwal et.al. | 2501.01732 | null |
2025-01-02 | Object-level Visual Prompts for Compositional Image Generation | Gaurav Parmar et.al. | 2501.01424 | null |
2025-01-02 | Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models | Jingfeng Yao et.al. | 2501.01423 | link |
2025-01-02 | Multi-Modal Video Feature Extraction for Popularity Prediction | Haixu Liu et.al. | 2501.01422 | null |
2025-01-02 | Deep Discrete Encoders: Identifiable Deep Generative Models for Rich Data with Discrete Latent Layers | Seunghyun Lee et.al. | 2501.01414 | null |
2025-01-02 | On Unifying Video Generation and Camera Pose Estimation | Chun-Hao Paul Huang et.al. | 2501.01409 | null |
2025-01-02 | Test-time Controllable Image Generation by Explicit Spatial Constraint Enforcement | Z. Zhang et.al. | 2501.01368 | null |
2025-01-02 | Contrastive Learning from Exploratory Actions: Leveraging Natural Interactions for Preference Elicitation | Nathaniel Dennler et.al. | 2501.01367 | null |
2025-01-03 | Conditional Consistency Guided Image Translation and Enhancement | Amil Bhagat et.al. | 2501.01223 | link |
2025-01-03 | TabTreeFormer: Tabular Data Generation Using Hybrid Tree-Transformer | Jiayu Li et.al. | 2501.01216 | null |
2025-01-02 | Range-Only Localization System for Small-Scale Flapping-Wing Robots | Raul Tapia et.al. | 2501.01213 | link |
2025-01-02 | LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge | Kyoungkook Kang et.al. | 2501.01197 | null |
2025-01-02 | TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions | Vriksha Srihari et.al. | 2501.01156 | null |
2025-01-02 | Semantics-Guided Diffusion for Deep Joint Source-Channel Coding in Wireless Image Transmission | Maojun Zhang et.al. | 2501.01138 | link |
2025-01-02 | Co-Design of a Robot Controller Board and Indoor Positioning System for IoT-Enabled Applications | Ali Safa et.al. | 2501.01115 | null |
2025-01-02 | MalCL: Leveraging GAN-Based Generative Replay to Combat Catastrophic Forgetting in Malware Classification | Jimin Park et.al. | 2501.01110 | link |
2024-12-30 | The Gaussian Kicked Rotor: Periodic forcing with finite-width pulses and the role of shifting the kick | Jonathan Berkheim et.al. | 2412.21186 | null |
2024-12-30 | Unified dimensionality reduction techniques in chronic liver disease detection | Anand Karna et.al. | 2412.21156 | null |
2025-01-02 | Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation | Yuanbo Yang et.al. | 2412.21117 | null |
2024-12-30 | Impact of Fourth Industrial Revolution (4IR) on Small and Medium Enterprises (SMEs) and Employment in Bangladesh: Opportunities and Challenges | Toukir Ahammed et.al. | 2412.21106 | null |
2024-12-30 | Quantum Diffusion Model for Quark and Gluon Jet Generation | Mariia Baidachna et.al. | 2412.21082 | link |
2025-01-02 | Edicho: Consistent Image Editing in the Wild | Qingyan Bai et.al. | 2412.21079 | link |
2024-12-30 | Varformer: Adapting VAR's Generative Prior for Image Restoration | Siyang Wang et.al. | 2412.21063 | link |
2024-12-30 | VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation | Jiazheng Xu et.al. | 2412.21059 | link |
2024-12-30 | E2EDiff: Direct Mapping from Noise to Data for Enhanced Diffusion Models | Zhiyu Tan et.al. | 2412.21044 | null |
2024-12-30 | Visual Style Prompt Learning Using Diffusion Models for Blind Face Restoration | Wanglong Lu et.al. | 2412.21042 | link |
2024-12-30 | TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization | Chia-Yu Hung et.al. | 2412.21037 | link |
2024-12-30 | Verified Lifting of Deep learning Operators | Qi Zhan et.al. | 2412.20992 | null |
2024-12-30 | AlignAb: Pareto-Optimal Energy Alignment for Designing Nature-Like Antibodies | Yibo Wen et.al. | 2412.20984 | null |
2024-12-30 | AGON: Automated Design Framework for Customizing Processors from ISA Documents | Chongxiao Li et.al. | 2412.20954 | null |
2024-12-30 | AI-Supported Data Analysis Boosts Student Motivation and Reduces Stress in Physics Education | Jannik Henze et.al. | 2412.20951 | null |
2024-12-27 | Tensor Network Estimation of Distribution Algorithms | John Gardiner et.al. | 2412.19780 | null |
2024-12-27 | Generative Video Propagation | Shaoteng Liu et.al. | 2412.19761 | null |
2024-12-27 | Complement or substitute? How AI increases the demand for human skills | Elina Mäkelä et.al. | 2412.19754 | null |
2024-12-27 | Text2Insight: Transform natural language text into insights seamlessly using multi-model architecture | Pradeep Sain et.al. | 2412.19718 | null |
2024-12-27 | From Elements to Design: A Layered Approach for Automatic Graphic Design Composition | Jiawei Lin et.al. | 2412.19712 | null |
2024-12-27 | An Integrated Optimization and Deep Learning Pipeline for Predicting Live Birth Success in IVF Using Feature Optimization and Transformer-Based Models | Arezoo Borji et.al. | 2412.19696 | null |
2024-12-27 | From prediction to explanation: managing influential negative reviews through explainable AI | Rongping Shen et.al. | 2412.19692 | null |
2024-12-27 | VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models | Tao Wu et.al. | 2412.19645 | null |
2024-12-27 | Diverse Rare Sample Generation with Pretrained GANs | Subeen Lee et.al. | 2412.19543 | link |
2024-12-27 | Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning | Xuan Zhou et.al. | 2412.19538 | null |
2024-12-27 | StyleRWKV: High-Quality and High-Efficiency Style Transfer with RWKV-like Architecture | Miaomiao Dai et.al. | 2412.19535 | null |
2024-12-27 | Lévy Score Function and Score-Based Particle Algorithm for Nonlinear Lévy--Fokker--Planck Equations | Yuanfei Huang et.al. | 2412.19520 | link |
2024-12-27 | Estimation of System Parameters Including Repeated Cross-Sectional Data through Emulator-Informed Deep Generative Model | Hyunwoo Cho et.al. | 2412.19517 | null |
2024-12-27 | DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT | Xiaotao Hu et.al. | 2412.19505 | link |
2024-12-27 | RobotDiffuse: Motion Planning for Redundant Manipulator based on Diffusion Model | Xiaohan Zhang et.al. | 2412.19500 | link |
2024-12-24 | PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models | Minghao Chen et.al. | 2412.18608 | null |
2024-12-24 | DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers | Yuntao Chen et.al. | 2412.18607 | null |
2024-12-24 | Explaining in Diffusion: Explaining a Classifier Through Hierarchical Semantics with Text-to-Image Diffusion Models | Tahira Kazimi et.al. | 2412.18604 | null |
2024-12-24 | Long-Form Speech Generation with Spoken Language Models | Se Jin Park et.al. | 2412.18603 | link |
2024-12-24 | ZeroHSI: Zero-Shot 4D Human-Scene Interaction by Video Generation | Hongjie Li et.al. | 2412.18600 | null |
2024-12-24 | DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation | Minghong Cai et.al. | 2412.18597 | link |
2024-12-24 | LatentCRF: Continuous CRF for Efficient Latent Diffusion | Kanchana Ranasinghe et.al. | 2412.18596 | null |
2024-12-24 | Resolution-Robust 3D MRI Reconstruction with 2D Diffusion Priors: Diverse-Resolution Training Outperforms Interpolation | Anselm Krainovic et.al. | 2412.18584 | null |
2024-12-24 | 3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement | Yihang Luo et.al. | 2412.18565 | null |
2024-12-24 | Elevating Information System Performance: A Deep Dive into Quality Metrics | Dana A Abdullah et.al. | 2412.18512 | null |
2024-12-24 | A region-wide, multi-year set of crop field boundary labels for Africa | L. D. Estes et.al. | 2412.18483 | null |
2024-12-24 | GeFL: Model-Agnostic Federated Learning with Generative Models | Honggu Kang et.al. | 2412.18460 | null |
2024-12-24 | Gaussian entropic optimal transport: Schrödinger bridges and the Sinkhorn algorithm | O. Deniz Akyildiz et.al. | 2412.18432 | null |
2024-12-24 | Fashionability-Enhancing Outfit Image Editing with Conditional Diffusion Models | Qice Qin et.al. | 2412.18421 | null |
2024-12-24 | Discovery of 2D Materials via Symmetry-Constrained Diffusion Model | Shihang Xu et.al. | 2412.18414 | null |
2024-12-23 | FaceLift: Single Image to 3D Head with View Generation and GS-LRM | Weijie Lyu et.al. | 2412.17812 | null |
2024-12-23 | PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion | Sophia Tang et.al. | 2412.17780 | null |
2024-12-23 | The Superposition of Diffusion Models Using the Itô Density Estimator | Marta Skreta et.al. | 2412.17762 | null |
2024-12-23 | Superconductivity in Nanosystems: A Fruitful Path to New Phenomenology in Quantum Materials | M. V. Ramallo et.al. | 2412.17722 | null |
2024-12-23 | A Bias-Free Training Paradigm for More General AI-generated Image Detection | Fabrizio Guillaro et.al. | 2412.17671 | null |
2024-12-23 | Benchmarking Generative AI Models for Deep Learning Test Input Generation | Maryam et.al. | 2412.17652 | link |
2024-12-23 | DreamFit: Garment-Centric Human Generation via a Lightweight Anything-Dressing Encoder | Ente Lin et.al. | 2412.17644 | null |
2024-12-23 | ANID: How Far Are We? Evaluating the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance | Renyang Liu et.al. | 2412.17632 | link |
2024-12-23 | Be More Diverse than the Most Diverse: Online Selection of Diverse Mixtures of Generative Models | Parham Rezaei et.al. | 2412.17622 | link |
2024-12-23 | Empathetic Response in Audio-Visual Conversations Using Emotion Preference Optimization and MambaCompressor | Yeonju Kim et.al. | 2412.17572 | null |
2024-12-23 | The Dynamic Duo of Collaborative Masking and Target for Advanced Masked Autoencoder Learning | Shentong Mo et.al. | 2412.17566 | null |
2024-12-23 | S-INF: Towards Realistic Indoor Scene Synthesis via Scene Implicit Neural Field | Zixi Liang et.al. | 2412.17561 | link |
2024-12-23 | Resource-Aware Arabic LLM Creation: Model Adaptation, Integration, and Multi-Domain Testing | Prakash Aryan et.al. | 2412.17548 | link |
2024-12-23 | Retention Score: Quantifying Jailbreak Risks for Vision Language Models | Zaitang Li et.al. | 2412.17544 | null |
2024-12-23 | CiteBART: Learning to Generate Citations for Local Citation Recommendation | Ege Yiğit Çelik et.al. | 2412.17534 | link |
2024-12-20 | Personalized Representation from Personalized Generation | Shobhita Sundaram et.al. | 2412.16156 | link |
2024-12-20 | Can Generative Video Models Help Pose Estimation? | Ruojin Cai et.al. | 2412.16155 | null |
2024-12-20 | FedGAT: A Privacy-Preserving Federated Approximation Algorithm for Graph Attention Networks | Siddharth Ambekar et.al. | 2412.16144 | null |
2024-12-20 | NeRF-To-Real Tester: Neural Radiance Fields as Test Image Generators for Vision of Autonomous Systems | Laura Weihl et.al. | 2412.16141 | null |
2024-12-20 | Predicting human cooperation: sensitizing drift-diffusion model to interaction and external stimuli | Lucila G. Alvarez-Zuzek et.al. | 2412.16121 | null |
2024-12-20 | Differentially Private Federated Learning of Diffusion Models for Synthetic Tabular Data Generation | Timur Sattarov et.al. | 2412.16083 | null |
2024-12-20 | Label-Efficient Data Augmentation with Video Diffusion Models for Guidewire Segmentation in Cardiac Fluoroscopy | Shaoyan Pan et.al. | 2412.16050 | null |
2024-12-20 | SafeCFG: Redirecting Harmful Classifier-Free Guidance for Safe Generation | Jiadong Pan et.al. | 2412.16039 | null |
2024-12-20 | Electric Vehicle Charging Stations Placement Optimization in Vietnam Using Mixed-Integer Nonlinear Programming Model | Quynh Vu Truc et.al. | 2412.16025 | link |
2024-12-20 | Data-Centric Improvements for Enhancing Multi-Modal Understanding in Spoken Conversation Modeling | Maximillian Chen et.al. | 2412.15995 | null |
2024-12-20 | Optimization of Beyond Diagonal RIS: A Universal Framework Applicable to Arbitrary Architectures | Zheyu Wu et.al. | 2412.15965 | null |
2024-12-20 | Reframing Image Difference Captioning with BLIP2IDC and Synthetic Augmentation | Gautier Evennou et.al. | 2412.15939 | link |
2024-12-20 | RiTTA: Modeling Event Relations in Text-to-Audio Generation | Yuhang He et.al. | 2412.15922 | link |
2024-12-20 | Less is More: Towards Green Code Large Language Models via Unified Structural Pruning | Guang Yang et.al. | 2412.15921 | null |
2024-12-20 | Semi-Supervised Adaptation of Diffusion Models for Handwritten Text Generation | Kai Brandenbusch et.al. | 2412.15853 | null |
2024-12-19 | LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis | Hanlin Wang et.al. | 2412.15214 | link |
2024-12-19 | Flowing from Words to Pixels: A Framework for Cross-Modality Evolution | Qihao Liu et.al. | 2412.15213 | null |
2024-12-19 | Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation | Hadi Alzayer et.al. | 2412.15211 | null |
2024-12-19 | AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation | Moayed Haji-Ali et.al. | 2412.15191 | null |
2024-12-19 | LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation | Weijia Shi et.al. | 2412.15188 | null |
2024-12-19 | Tiled Diffusion | Or Madar et.al. | 2412.15185 | null |
2024-12-19 | SqueezeMe: Efficient Gaussian Avatars for VR | Shunsuke Saito et.al. | 2412.15171 | null |
2024-12-19 | OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization | Jiacheng Zhang et.al. | 2412.15159 | null |
2024-12-19 | Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM | Yatai Ji et.al. | 2412.15156 | link |
2024-12-19 | Jet: A Modern Transformer-Based Normalizing Flow | Alexander Kolesnikov et.al. | 2412.15129 | null |
2024-12-19 | Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation | Yang Tian et.al. | 2412.15109 | link |
2024-12-19 | Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation | Haoran Liu et.al. | 2412.15086 | null |
2024-12-19 | Eigenstate Preparation on Quantum Computers | Joey Bonitati et.al. | 2412.15081 | null |
2024-12-19 | Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion | Zhifei Chen et.al. | 2412.15050 | null |
2024-12-19 | DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space | Mang Ning et.al. | 2412.15032 | link |
2024-12-18 | AniDoc: Animation Creation Made Easier | Yihao Meng et.al. | 2412.14173 | null |
2024-12-19 | E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling | Zhihang Yuan et.al. | 2412.14170 | null |
2024-12-18 | Autoregressive Video Generation without Vector Quantization | Haoge Deng et.al. | 2412.14169 | link |
2024-12-18 | VideoDPO: Omni-Preference Alignment for Video Diffusion Generation | Runtao Liu et.al. | 2412.14167 | null |
2024-12-18 | MetaMorph: Multimodal Understanding and Generation via Instruction Tuning | Shengbang Tong et.al. | 2412.14164 | null |
2024-12-18 | MCMat: Multiview-Consistent and Physically Accurate PBR Material Generation | Shenhao Zhu et.al. | 2412.14148 | null |
2024-12-18 | Event-based Photometric Bundle Adjustment | Shuang Guo et.al. | 2412.14111 | link |
2024-12-18 | Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report | Markus Dablander et.al. | 2412.14085 | null |
2024-12-18 | SurgSora: Decoupled RGBD-Flow Diffusion Model for Controllable Surgical Video Generation | Tong Chen et.al. | 2412.14018 | null |
2024-12-18 | Comparative Analysis of Machine Learning-Based Imputation Techniques for Air Quality Datasets with High Missing Data Rates | Sen Yan et.al. | 2412.13966 | null |
2024-12-18 | A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI | Beiduo Chen et.al. | 2412.13942 | link |
2024-12-18 | Development of a High-Resolution, High-Dynamic-Range Charge Detector for Ion Beam Monitoring | O. Adriani et.al. | 2412.13934 | null |
2024-12-18 | Investigating the Effects of Diffusion-based Conditional Generative Speech Models Used for Speech Enhancement on Dysarthric Speech | Joanna Reszka et.al. | 2412.13933 | null |
2024-12-18 | Graph-Driven Models for Gas Mixture Identification and Concentration Estimation on Heterogeneous Sensor Array Signals | Ding Wang et.al. | 2412.13891 | null |
2024-12-18 | Navigating limitations with precision: A fine-grained ensemble approach to wrist pathology recognition on a limited x-ray dataset | Ammar Ahmed et.al. | 2412.13884 | null |
2024-12-17 | CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models | Gaoyang Zhang et.al. | 2412.13195 | link |
2024-12-17 | StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models | Yunzhi Yan et.al. | 2412.13188 | null |
2024-12-17 | Move-in-2D: 2D-Conditioned Human Motion Generation | Hsin-Ping Huang et.al. | 2412.13185 | null |
2024-12-17 | F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration | Lu Liu et.al. | 2412.13155 | null |
2024-12-17 | Prompt Augmentation for Self-supervised Text-guided Image Manipulation | Rumeysa Bodur et.al. | 2412.13081 | null |
2024-12-17 | 3D MedDiffusion: A 3D Medical Diffusion Model for Controllable and High-quality Medical Image Generation | Haoshen Wang et.al. | 2412.13059 | null |
2024-12-17 | Guiding Generative Protein Language Models with Reinforcement Learning | Filippo Stocco et.al. | 2412.12979 | link |
2024-12-18 | Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance | Wenhao Sun et.al. | 2412.12974 | link |
2024-12-17 | ArchesWeather & ArchesWeatherGen: a deterministic and generative model for efficient ML weather forecasting | Guillaume Couairon et.al. | 2412.12971 | link |
2024-12-17 | Modified UNIFAC 2.0 -- A Group-Contribution Method Completed with Machine Learning | Nicolas Hayer et.al. | 2412.12962 | null |
2024-12-17 | MOPO: Multi-Objective Prompt Optimization for Affective Text Generation | Yarik Menchaca Resendiz et.al. | 2412.12948 | null |
2024-12-17 | Generation of cosmic ray trajectories by a Diffusion Model trained on test particles in 3D magnetohydrodynamic turbulence | Johannes Martin et.al. | 2412.12923 | null |
2024-12-17 | Unsupervised Region-Based Image Editing of Denoising Diffusion Models | Zixiang Li et.al. | 2412.12912 | null |
2024-12-18 | ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction | Zhongjie Duan et.al. | 2412.12888 | link |
2024-12-17 | Memory-minimal quantum generation of stochastic processes: spectral invariants of quantum hidden Markov models | Magdalini Zonnios et.al. | 2412.12812 | null |
2024-12-16 | Causal Diffusion Transformers for Generative Modeling | Chaorui Deng et.al. | 2412.12095 | link |
2024-12-16 | CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models | Felix Taubner et.al. | 2412.12093 | null |
2024-12-16 | Wonderland: Navigating 3D Scenes from a Single Image | Hanwen Liang et.al. | 2412.12091 | null |
2024-12-16 | A LoRA is Worth a Thousand Pictures | Chenxi Liu et.al. | 2412.12048 | null |
2024-12-16 | LLMs for Cold-Start Cutting Plane Separator Configuration | Connor Lawless et.al. | 2412.12038 | link |
2024-12-16 | Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps | Linfeng Zhao et.al. | 2412.12024 | null |
2024-12-16 | The entropic optimal (self-)transport problem: Limit distributions for decreasing regularization with application to score function estimation | Gilles Mordant et.al. | 2412.12007 | null |
2024-12-16 | Controllable Shadow Generation with Single-Step Diffusion Models from Synthetic Data | Onur Tasar et.al. | 2412.11972 | null |
2024-12-16 | The Erdős unit distance problem for small point sets | Boris Alexeev et.al. | 2412.11914 | null |
2024-12-16 | CharacterBench: Benchmarking Character Customization of Large Language Models | Jinfeng Zhou et.al. | 2412.11912 | link |
2024-12-16 | Towards Understanding Systems Trade-offs in Retrieval-Augmented Generation Model Inference | Michael Shen et.al. | 2412.11854 | null |
2024-12-16 | ColorFlow: Retrieval-Augmented Image Sequence Colorization | Junhao Zhuang et.al. | 2412.11815 | null |
2024-12-16 | InterDyn: Controllable Interactive Dynamics with Video Diffusion Models | Rick Akkerman et.al. | 2412.11785 | null |
2024-12-16 | Joint Reconstruction of the Activity and the Attenuation in PET by Diffusion Posterior Sampling: a Feasibility Study | Clémentine Phung-Ngoc et.al. | 2412.11776 | null |
2024-12-17 | No More Adam: Learning Rate Scaling at Initialization is All You Need | Minghao Xu et.al. | 2412.11768 | link |
2024-12-13 | Towards a foundation model for heavy-ion collision experiments through point cloud diffusion | Manjunath Omana Kuttan et.al. | 2412.10352 | null |
2024-12-13 | BrushEdit: All-In-One Image Inpainting and Editing | Yaowei Li et.al. | 2412.10316 | null |
2024-12-13 | Iterating the Transient Light Transport Matrix for Non-Line-of-Sight Imaging | Talha Sultan et.al. | 2412.10300 | null |
2024-12-13 | Coherent 3D Scene Diffusion From a Single RGB Image | Manuel Dahnert et.al. | 2412.10294 | null |
2024-12-13 | Adversarial Robustness of Bottleneck Injected Deep Neural Networks for Task-Oriented Communication | Alireza Furutanpey et.al. | 2412.10265 | null |
2024-12-13 | Targeted Angular Reversal of Weights (TARS) for Knowledge Removal in Large Language Models | Harry J. Davies et.al. | 2412.10257 | null |
2024-12-13 | Exploring the Frontiers of Animation Video Generation in the Sora Era: Method, Dataset and Benchmark | Yudong Jiang et.al. | 2412.10255 | link |
2024-12-13 | Radiator Tailoring for Enhanced Performance in InAs-Based Near-Field Thermophotovoltaics | Mathieu Giroux et.al. | 2412.10217 | null |
2024-12-13 | GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion | Jiapeng Tang et.al. | 2412.10209 | null |
2024-12-13 | Efficient Generative Modeling with Residual Vector Quantization-Based Tokens | Jaehyeon Kim et.al. | 2412.10208 | null |
2024-12-13 | Simple Guidance Mechanisms for Discrete Diffusion Models | Yair Schiff et.al. | 2412.10193 | link |
2024-12-13 | SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models | Hung Nguyen et.al. | 2412.10178 | null |
2024-12-13 | Learning payoffs while routing in skill-based queues | Sanne van Kempen et.al. | 2412.10168 | null |
2024-12-13 | The Art of Deception: Color Visual Illusions and Diffusion Models | Alex Gomez-Villa et.al. | 2412.10122 | null |
2024-12-13 | Familiarity: Better Evaluation of Zero-Shot Named Entity Recognition by Quantifying Label Shifts in Synthetic Training Data | Jonas Golde et.al. | 2412.10121 | link |
2024-12-12 | FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion | Haonan Qiu et.al. | 2412.09626 | null |
2024-12-12 | Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors | Yue Feng et.al. | 2412.09625 | null |
2024-12-12 | GenEx: Generating an Explorable World | Taiming Lu et.al. | 2412.09624 | null |
2024-12-12 | OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation | Weiqi Li et.al. | 2412.09623 | null |
2024-12-12 | LoRACLR: Contrastive Adaptation for Customization of Diffusion Models | Enis Simsar et.al. | 2412.09622 | null |
2024-12-12 | SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training | Dongting Hu et.al. | 2412.09619 | null |
2024-12-12 | EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM | Zhuofan Zong et.al. | 2412.09618 | null |
2024-12-12 | Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG | Kavana Venkatesh et.al. | 2412.09614 | null |
2024-12-13 | Olympus: A Universal Task Router for Computer Vision Tasks | Yuanze Lin et.al. | 2412.09612 | link |
2024-12-12 | Owl-1: Omni World Model for Consistent Long Video Generation | Yuanhui Huang et.al. | 2412.09600 | link |
2024-12-12 | LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors | Yabo Chen et.al. | 2412.09597 | null |
2024-12-12 | Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion | Zexin He et.al. | 2412.09593 | null |
2024-12-12 | Improving the Reliability of Cable Broadband Networks via Proactive Network Maintenance | Jiyao Hu et.al. | 2412.09564 | null |
2024-12-12 | Meshtron: High-Fidelity, Artist-Like 3D Mesh Generation at Scale | Zekun Hao et.al. | 2412.09548 | null |
2024-12-12 | SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing | Xueting Li et.al. | 2412.09545 | null |
2024-12-11 | Generative Semantic Communication: Architectures, Technologies, and Applications | Jinke Ren et.al. | 2412.08642 | null |
2024-12-11 | DMin: Scalable Training Data Influence Estimation for Diffusion Models | Huawei Lin et.al. | 2412.08637 | link |
2024-12-11 | Multimodal Latent Language Modeling with Next-Token Diffusion | Yutao Sun et.al. | 2412.08635 | link |
2024-12-11 | An SDR-Based Monostatic Wi-Fi System with Analog Self-Interference Cancellation for Sensing | Andreas Toftegaard Kristensen et.al. | 2412.08612 | null |
2024-12-12 | Design2GarmentCode: Turning Design Concepts to Tangible Garments Through Program Synthesis | Feng Zhou et.al. | 2412.08603 | null |
2024-12-11 | TryOffAnyone: Tiled Cloth Generation from a Dressed Person | Ioannis Xarchakos et.al. | 2412.08573 | link |
2024-12-12 | Watermarking Training Data of Music Generation Models | Pascal Epple et.al. | 2412.08549 | null |
2024-12-11 | Orderly Management of Packets in RDMA by Eunomia | Sana Mahmood et.al. | 2412.08540 | null |
2024-12-11 | Ensemble-Based Quantum-Token Protocol Benchmarked on IBM Quantum Processors | Lucas Tsunaki et.al. | 2412.08530 | link |
2024-12-11 | Comparative Opinion Mining in Product Reviews: Multi-perspective Prompt-based Learning | Hai-Yen Thi Nguyen et.al. | 2412.08508 | null |
2024-12-11 | Open-Loop and Model Predictive Control for Electric Vehicle Charging to Manage Excess Renewable Energy Supply in Texas | Kelsey M. Nelson et.al. | 2412.08505 | null |
2024-12-11 | Learning Flow Fields in Attention for Controllable Person Image Generation | Zijian Zhou et.al. | 2412.08486 | link |
2024-12-11 | InvDiff: Invariant Guidance for Bias Mitigation in Diffusion Models | Min Hou et.al. | 2412.08480 | link |
2024-12-11 | CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis | Mu Zhang et.al. | 2412.08464 | null |
2024-12-11 | Federated Learning for Traffic Flow Prediction with Synthetic Data Augmentation | Fermin Orozco et.al. | 2412.08460 | null |
2024-12-10 | Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets | Zhen Liu et.al. | 2412.07775 | null |
2024-12-10 | UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics | Xi Chen et.al. | 2412.07774 | null |
2024-12-10 | From Slow Bidirectional to Fast Causal Video Generators | Tianwei Yin et.al. | 2412.07772 | null |
2024-12-10 | Make-A-Texture: Fast Shape-Aware Texture Generation in 3 Seconds | Xiaoyu Xiang et.al. | 2412.07766 | null |
2024-12-10 | Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences | Alan Nawzad Amin et.al. | 2412.07763 | link |
2024-12-10 | Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation | Jingxi Chen et.al. | 2412.07761 | null |
2024-12-10 | SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints | Jianhong Bai et.al. | 2412.07760 | link |
2024-12-10 | PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation | Fatemeh Nazarieh et.al. | 2412.07754 | null |
2024-12-10 | Multi-Shot Character Consistency for Text-to-Video Generation | Yuval Atzmon et.al. | 2412.07750 | null |
2024-12-10 | StyleMaster: Stylize Your Video with Artistic Generation and Translation | Zixuan Ye et.al. | 2412.07744 | null |
2024-12-10 | STIV: Scalable Text and Image Conditioned Video Generation | Zongyu Lin et.al. | 2412.07730 | null |
2024-12-10 | ObjCtrl-2.5D: Training-free Object Control with Camera Poses | Zhouxia Wang et.al. | 2412.07721 | null |
2024-12-10 | ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer | Jinyi Hu et.al. | 2412.07720 | link |
2024-12-10 | Privacy-Preserving Customer Support: A Framework for Secure and Scalable Interactions | Anant Prakash Awasthi et.al. | 2412.07687 | null |
2024-12-10 | Optimizing Sensor Redundancy in Sequential Decision-Making Problems | Jonas Nüßlein et.al. | 2412.07686 | null |
2024-12-10 | [MASK] is All You Need | Vincent Tao Hu et.al. | 2412.06787 | link |
2024-12-09 | Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation | Ruihan Gao et.al. | 2412.06785 | link |
2024-12-09 | Diverse Score Distillation | Yanbo Xu et.al. | 2412.06780 | null |
2024-12-09 | Visual Lexicon: Rich Image Features in Language Space | XuDong Wang et.al. | 2412.06774 | null |
2024-12-09 | InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention | Howard Zhang et.al. | 2412.06753 | null |
2024-12-09 | ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities | Adhiraj Ghosh et.al. | 2412.06745 | null |
2024-12-10 | ContRail: A Framework for Realistic Railway Image Synthesis using ControlNet | Andrei-Robert Alexandrescu et.al. | 2412.06742 | null |
2024-12-09 | Take Fake as Real: Realistic-like Robust Black-box Adversarial Attack to Evade AIGC Detection | Caiyun Xie et.al. | 2412.06727 | link |
2024-12-09 | You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale | Baorui Ma et.al. | 2412.06699 | link |
2024-12-09 | Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy | Yuxuan Xue et.al. | 2412.06698 | null |
2024-12-09 | Diff5T: Benchmarking Human Brain Diffusion MRI with an Extensive 5.0 Tesla K-Space and Spatial Dataset | Shanshan Wang et.al. | 2412.06666 | null |
2024-12-09 | Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion | Shuaiting Li et.al. | 2412.06661 | null |
2024-12-09 | MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences | Weitao Wang et.al. | 2412.06614 | null |
2024-12-09 | Augmented reality for upper limb rehabilitation: real-time kinematic feedback with HoloLens 2 | Beatrice Luciani et.al. | 2412.06596 | null |
2024-12-09 | EmoSpeech: A Corpus of Emotionally Rich and Contextually Detailed Speech Annotations | Weizhen Bian et.al. | 2412.06581 | null |
2024-12-06 | Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model | Lening Wang et.al. | 2412.05280 | link |
2024-12-06 | Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories | Susung Hong et.al. | 2412.05279 | null |
2024-12-06 | Birth and Death of a Rose | Chen Geng et.al. | 2412.05278 | null |
2024-12-06 | MotionFlow: Attention-Driven Motion Transfer in Video Diffusion Models | Tuna Han Salih Meral et.al. | 2412.05275 | null |
2024-12-06 | Go-or-Grow Models in Biology: a Monster on a Leash | R. Thiessen et.al. | 2412.05191 | null |
2024-12-06 | Privacy Drift: Evolving Privacy Concerns in Incremental Learning | Sayyed Farid Ahamed et.al. | 2412.05183 | null |
2024-12-06 | DNF: Unconditional 4D Generation with Dictionary-based Neural Fields | Xinyi Zhang et.al. | 2412.05161 | null |
2024-12-06 | A text-to-tabular approach to generate synthetic patient data using LLMs | Margaux Tornqvist et.al. | 2412.05153 | link |
2024-12-06 | LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation | Donald Shenaj et.al. | 2412.05148 | link |
2024-12-06 | How to Squeeze An Explanation Out of Your Model | Tiago Roxo et.al. | 2412.05134 | null |
2024-12-06 | Probabilistic Galaxy Field Generation with Diffusion Models | Tanner Sether et.al. | 2412.05131 | null |
2024-12-06 | The Silent Prompt: Initial Noise as Implicit Guidance for Goal-Driven Image Generation | Ruoyu Wang et.al. | 2412.05101 | null |
2024-12-06 | Reconstructing Quantitative Cerebral Perfusion Images Directly From Measured Sinogram Data Acquired Using C-arm Cone-Beam CT | Haotian Zhao et.al. | 2412.05084 | null |
2024-12-06 | ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration | Chi-Wei Hsiao et.al. | 2412.05043 | null |
2024-12-06 | Get It Right: Improving Comprehensibility with Adaptable Speech Expression of a Humanoid Service Robot | Thomas Sievers et.al. | 2412.05022 | null |
2024-12-05 | PaintScene4D: Consistent 4D Scene Generation from Text Prompts | Vinayak Gupta et.al. | 2412.04471 | null |
2024-12-05 | LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors | Yusuf Dalva et.al. | 2412.04460 | null |
2024-12-05 | Four-Plane Factorized Video Autoencoders | Mohammed Suhail et.al. | 2412.04452 | null |
2024-12-05 | MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation | Longtao Zheng et.al. | 2412.04448 | null |
2024-12-05 | DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models | Yizhuo Li et.al. | 2412.04446 | null |
2024-12-05 | Learning Artistic Signatures: Symmetry Discovery and Style Transfer | Emma Finn et.al. | 2412.04441 | null |
2024-12-05 | GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration | Kaiyi Huang et.al. | 2412.04440 | null |
2024-12-05 | Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation | Yuying Ge et.al. | 2412.04432 | link |
2024-12-05 | Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | Jian Han et.al. | 2412.04431 | link |
2024-12-05 | Reversible molecular simulation for training classical and machine learning force fields | Joe G Greener et.al. | 2412.04374 | link |
2024-12-05 | Machine Theory of Mind for Autonomous Cyber-Defence | Luke Swaby et.al. | 2412.04367 | null |
2024-12-05 | ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation | Dayoung Gong et.al. | 2412.04353 | null |
2024-12-05 | RMD: A Simple Baseline for More General Human Motion Generation via Training-free Retrieval-Augmented Motion Diffuse | Zhouyingcheng Liao et.al. | 2412.04343 | null |
2024-12-05 | Likelihood-Scheduled Score-Based Generative Modeling for Fully 3D PET Image Reconstruction | George Webber et.al. | 2412.04339 | null |
2024-12-05 | Multi-Subject Image Synthesis as a Generative Prior for Single-Subject PET Image Reconstruction | George Webber et.al. | 2412.04324 | null |
2024-12-04 | Navigation World Models | Amir Bar et.al. | 2412.03572 | null |
2024-12-04 | MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation | Zehuan Huang et.al. | 2412.03558 | null |
2024-12-04 | NODE-AdvGAN: Improving the transferability and perceptual similarity of adversarial examples by dynamic-system-driven adversarial generative model | Xinheng Xie et.al. | 2412.03539 | null |
2024-12-04 | NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images | Lingen Li et.al. | 2412.03517 | null |
2024-12-04 | Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion | Shengyuan Zhang et.al. | 2412.03515 | link |
2024-12-04 | Data Fusion of Semantic and Depth Information in the Context of Object Detection | Md Abu Yusuf et.al. | 2412.03490 | null |
2024-12-04 | Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective | Neta Shaul et.al. | 2412.03487 | null |
2024-12-04 | Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks | Dario Serez et.al. | 2412.03453 | link |
2024-12-04 | CleanDIFT: Diffusion Features without Noise | Nick Stracke et.al. | 2412.03439 | link |
2024-12-04 | SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model | Yan Li et.al. | 2412.03430 | null |
2024-12-04 | Skel3D: Skeleton Guided Novel View Synthesis | Aron Fóthi et.al. | 2412.03407 | null |
2024-12-04 | Identifiability implies consistency of MLE in partially observed diffusions on a torus | Ibrahim Ekren et.al. | 2412.03380 | null |
2024-12-04 | TASR: Timestep-Aware Diffusion Model for Image Super-Resolution | Qinwei Lin et.al. | 2412.03355 | link |
2024-12-04 | DIVE: Taming DINO for Subject-Driven Video Editing | Yi Huang et.al. | 2412.03347 | null |
2024-12-04 | Geometry-guided Cross-view Diffusion for One-to-many Cross-view Image Synthesis | Tao Jun Lin et.al. | 2412.03315 | null |
2024-12-03 | Motion Prompting: Controlling Video Generation with Motion Trajectories | Daniel Geng et.al. | 2412.02700 | null |
2024-12-03 | Diffusion-based Visual Anagram as Multi-task Learning | Zhiyuan Xu et.al. | 2412.02693 | link |
2024-12-03 | FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation | Kefan Chen et.al. | 2412.02690 | null |
2024-12-04 | SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance | Viet Nguyen et.al. | 2412.02687 | null |
2024-12-03 | AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction | Lingteng Qiu et.al. | 2412.02684 | null |
2024-12-03 | Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation | Yiftach Edelstein et.al. | 2412.02631 | null |
2024-12-03 | The effect of priors on Learning with Restricted Boltzmann Machines | Gianluca Manzan et.al. | 2412.02623 | null |
2024-12-03 | ComPair-2: A Next Generation Medium Energy Gamma-ray Telescope Prototype | Regina Caputo et.al. | 2412.02562 | null |
2024-12-03 | The Two-Center Problem of Uncertain Points on Cactus Graphs | Haitao Xu et.al. | 2412.02559 | null |
2024-12-03 | ShadowHack: Hacking Shadows via Luminance-Color Divide and Conquer | Jin Hu et.al. | 2412.02545 | link |
2024-12-03 | Unveiling Concept Attribution in Diffusion Models | Quang H. Nguyen et.al. | 2412.02542 | link |
2024-12-03 | LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data | Hanyu Zhang et.al. | 2412.02525 | null |
2024-12-03 | GerPS-Compare: Comparing NER methods for legal norm analysis | Sarah T. Bachinger et.al. | 2412.02427 | null |
2024-12-03 | It Takes Two: Real-time Co-Speech Two-person's Interaction Generation via Reactive Auto-regressive Diffusion Model | Mingyi Shi et.al. | 2412.02419 | null |
2024-12-03 | A Multi-Agent Framework for Extensible Structured Text Generation in PLCs | Donghao Yang et.al. | 2412.02410 | null |
2024-11-29 | Nanostructured micrometric-pore membranes for nanofiltration: Micrometric geometry may optimize performance, energy efficiency and operational lifetime | J. C. Verde et.al. | 2411.19900 | null |
2024-11-29 | Input-Output Optics as a Causal Time Series Mapping: A Generative Machine Learning Solution | Abhijit Sen et.al. | 2411.19897 | null |
2024-11-29 | MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks | Yiming Wu et.al. | 2411.19786 | null |
2024-11-29 | Riemannian Denoising Score Matching for Molecular Structure Optimization with Accurate Energy | Jeheon Woo et.al. | 2411.19769 | null |
2024-11-29 | JetFormer: An Autoregressive Generative Model of Raw Images and Text | Michael Tschannen et.al. | 2411.19722 | link |
2024-11-29 | Inverse Design of Mechanical Metamaterials Using a Point-Cloud-Based Deep Generative Model | Seungwook Hong et.al. | 2411.19681 | null |
2024-11-29 | TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting | Bojun Xiong et.al. | 2411.19654 | link |
2024-11-29 | Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing | Wenyi Mo et.al. | 2411.19652 | link |
2024-11-29 | Enhancing Security in Third-Party Library Reuse -- Comprehensive Detection of 1-day Vulnerability through Code Patch Analysis | Shangzhi Xu et.al. | 2411.19648 | null |
2024-11-29 | Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings | Qiong Wu et.al. | 2411.19628 | link |
2024-11-29 | Unimib Assistant: designing a student-friendly RAG-based chatbot for all their needs | Chiara Antico et.al. | 2411.19554 | null |
2024-11-29 | Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook | Florinel-Alin Croitoru et.al. | 2411.19537 | link |
2024-11-29 | Quantized Delta Weight Is Safety Keeper | Yule Liu et.al. | 2411.19530 | null |
2024-12-02 | DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding | Jungbin Cho et.al. | 2411.19527 | null |
2024-11-29 | Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis | Tianqi Li et.al. | 2411.19509 | link |
2024-11-27 | Textured Gaussians for Enhanced 3D Scene Appearance Modeling | Brian Chao et.al. | 2411.18625 | null |
2024-11-27 | GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data | Wentao Wang et.al. | 2411.18624 | null |
2024-11-27 | Diffusion Self-Distillation for Zero-Shot Customized Image Generation | Shengqu Cai et.al. | 2411.18616 | null |
2024-11-27 | CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models | Rundi Wu et.al. | 2411.18613 | null |
2024-11-27 | Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis | Eva Prakash et.al. | 2411.18602 | null |
2024-11-27 | Bit symmetry entails the symmetry of the quantum transition probability | Gerd Niestegge et.al. | 2411.18589 | null |
2024-11-27 | Building Confidence in Deep Generative Protein Design | Tianyuan Zheng et.al. | 2411.18568 | link |
2024-11-27 | High-throughput antibody screening with high-quality factor nanophotonics and bioprinting | Sajjad Abdollahramezani et.al. | 2411.18557 | null |
2024-11-27 | FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion | Haosen Yang et.al. | 2411.18552 | null |
2024-11-28 | Enhancing weed detection performance by means of GenAI-based image augmentation | Sourav Modak et.al. | 2411.18513 | null |
2024-11-27 | GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation | Pengfei Zhou et.al. | 2411.18499 | null |
2024-11-27 | Synthetic ECG Generation for Data Augmentation and Transfer Learning in Arrhythmia Classification | José Fernando Núñez et.al. | 2411.18456 | null |
2024-11-27 | Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator | Frederic Kirstein et.al. | 2411.18444 | null |
2024-11-27 | Learning the Evolution of Physical Structure of Galaxies via Diffusion Models | Andrew Lizarraga et.al. | 2411.18440 | link |
2024-11-27 | Search for heavy scalar or pseudoscalar states in | Laurids Jeppe et.al. | 2411.18414 | null |
2024-11-27 | StableAnimator: High-Quality Identity-Preserving Human Image Animation | Shuyuan Tu et.al. | 2411.17697 | link |
2024-11-26 | ScribbleLight: Single Image Indoor Relighting with Scribbles | Jun Myeong Choi et.al. | 2411.17696 | null |
2024-11-26 | Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis | Akshita Gupta et.al. | 2411.17690 | null |
2024-11-26 | GenDeg: Diffusion-Based Degradation Synthesis for Generalizable All-in-One Image Restoration | Sudarshan Rajagopalan et.al. | 2411.17687 | null |
2024-11-26 | Semi-analytical model for the calculation of solar radiation pressure and its effects on a LEO satellite with predicting the change in position vectors using machine learning techniques | Pranava Seth et.al. | 2411.17626 | null |
2024-11-26 | Accelerating Vision Diffusion Transformers with Skip Branches | Guanjie Chen et.al. | 2411.17616 | link |
2024-11-26 | Mixed-State Quantum Denoising Diffusion Probabilistic Model | Gino Kwun et.al. | 2411.17608 | null |
2024-11-26 | Making History Readable | Bipasha Banerjee et.al. | 2411.17600 | null |
2024-11-26 | VideoDirector: Precise Video Editing via Text-to-Video Models | Yukun Wang et.al. | 2411.17592 | null |
2024-11-26 | Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving | Jon Gutiérrez-Zaballa et.al. | 2411.17543 | null |
2024-11-26 | Metaverse Innovation Canvas: A Tool for Extended Reality Product/Service Development | Amir Reza Asadi et.al. | 2411.17541 | null |
2024-11-26 | IMPROVE: Improving Medical Plausibility without Reliance on HumanValidation -- An Enhanced Prototype-Guided Diffusion Framework | Anurag Shandilya et.al. | 2411.17535 | null |
2024-11-26 | FTMoMamba: Motion Generation with Frequency and Text State Space Models | Chengjian Li et.al. | 2411.17532 | null |
2024-11-26 | Exact and Heuristic Approaches for the Covering Tour Location Routing Problem | Andreas Hagn et.al. | 2411.17510 | link |
2024-11-26 | WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model | Zongjian Li et.al. | 2411.17459 | link |
2024-11-25 | Generative Omnimatte: Learning to Decompose Video into Layers | Yao-Chih Lee et.al. | 2411.16683 | null |
2024-11-25 | Diffusion Features for Zero-Shot 6DoF Object Pose Estimation | Bernd Von Gimborn et.al. | 2411.16668 | null |
2024-11-25 | DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation | Zun Wang et.al. | 2411.16657 | null |
2024-11-25 | Exploring Discrete Flow Matching for 3D De Novo Molecule Generation | Ian Dunn et.al. | 2411.16644 | link |
2024-11-25 | LegoPET: Hierarchical Feature Guided Conditional Diffusion for PET Image Reconstruction | Yiran Sun et.al. | 2411.16629 | link |
2024-11-25 | Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models | Ronghuan Wu et.al. | 2411.16602 | null |
2024-11-25 | Unlocking The Potential of Adaptive Attacks on Diffusion-Based Purification | Andre Kassis et.al. | 2411.16598 | link |
2024-11-25 | Rethinking Diffusion for Text-Driven Human Motion Generation | Zichong Meng et.al. | 2411.16575 | null |
2024-11-25 | Representation Collapsing Problems in Vector Quantization | Wenhao Zhao et.al. | 2411.16550 | null |
2024-11-25 | ADOBI: Adaptive Diffusion Bridge For Blind Inverse Problems with Application to MRI Reconstruction | Yuyang Hu et.al. | 2411.16535 | null |
2024-11-25 | PriorPath: Coarse-To-Fine Approach for Controlled De-Novo Pathology Semantic Masks Generation | Nati Daniel et.al. | 2411.16515 | null |
2024-11-25 | Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis | Boming Miao et.al. | 2411.16503 | null |
2024-11-25 | Multi-Resolution Generative Modeling of Human Motion from Limited Data | David Eduardo Moreno-Villamarín et.al. | 2411.16498 | null |
2024-11-25 | Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval | Xiaocong Yang et.al. | 2411.16454 | null |
2024-11-25 | Model-based reinforcement corrosion prediction: Continuous calibration with Bayesian optimization and corrosion wire sensor data | A. Potnis et.al. | 2411.16447 | null |
2024-11-22 | DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving | Bencheng Liao et.al. | 2411.15139 | link |
2024-11-22 | Material Anything: Generating Materials for Any 3D Object via Diffusion | Xin Huang et.al. | 2411.15138 | null |
2024-11-22 | VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement | Daeun Lee et.al. | 2411.15115 | null |
2024-11-22 | RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts | Hjalmar Wijk et.al. | 2411.15114 | link |
2024-11-22 | Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion | Samarth N Ramesh et.al. | 2411.15113 | null |
2024-11-22 | Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation | Lakshmikar R. Polamreddy et.al. | 2411.15084 | link |
2024-11-22 | Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network | Irfan Nafiz Shahan et.al. | 2411.15082 | link |
2024-11-22 | Empowering Clients: Transformation of Design Processes Due to Generative AI | Johannes Schneider et.al. | 2411.15061 | null |
2024-11-22 | The 1D nonlocal Fisher-KPP equation with a top hat kernel. Part 3. The effect of perturbations in the kernel | David John Needham et.al. | 2411.15054 | null |
2024-11-22 | FloAt: Flow Warping of Self-Attention for Clothing Animation Generation | Swasti Shreya Mishra et.al. | 2411.15028 | null |
2024-11-22 | Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation | Huy Le et.al. | 2411.14913 | null |
2024-11-22 | Dynamically Encircled Higher-order Exceptional Points in an Optical Fiber | Arpan Roy et.al. | 2411.14874 | null |
2024-11-22 | Prioritize Denoising Steps on Diffusion Model Preference Alignment via Explicit Denoised Distribution Estimation | Dingyuan Shi et.al. | 2411.14871 | null |
2024-11-22 | Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation | Jeongsol Kim et.al. | 2411.14863 | null |
2024-11-22 | Style-Friendly SNR Sampler for Style-Driven Generation | Jooyoung Choi et.al. | 2411.14793 | null |
2024-11-21 | Stable Flow: Vital Layers for Training-Free Image Editing | Omri Avrahami et.al. | 2411.14430 | link |
2024-11-21 | Transformer-based Heuristic for Advanced Air Mobility Planning | Jun Xiang et.al. | 2411.14427 | null |
2024-11-21 | A Python-Based Approach to Sputter Deposition Simulations in Combinatorial Materials Science | Felix Thelen et.al. | 2411.14413 | null |
2024-11-21 | Multi-Agent Environments for Vehicle Routing Problems | Ricardo Gama et.al. | 2411.14411 | link |
2024-11-21 | Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation | Yuanhao Cai et.al. | 2411.14384 | null |
2024-11-21 | CoNFiLD-inlet: Synthetic Turbulence Inflow Using Generative Latent Diffusion Models with Neural Fields | Xin-Yang Liu et.al. | 2411.14378 | null |
2024-11-21 | Enhancing Medical Image Segmentation with Deep Learning and Diffusion Models | Houze Liu et.al. | 2411.14353 | null |
2024-11-21 | DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Tianhe Ren et.al. | 2411.14347 | link |
2024-11-21 | Lower Dimensional Spherical Representation of Medium Voltage Load Profiles for Visualization, Outlier Detection, and Generative Modelling | Edgar Mauricio Salazar Duque et.al. | 2411.14346 | null |
2024-11-21 | StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart | Jian Shi et.al. | 2411.14295 | link |
2024-11-21 | Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models | Iacopo Ghinassi et.al. | 2411.14272 | link |
2024-11-21 | Guided MRI Reconstruction via Schrödinger Bridge | Yue Wang et.al. | 2411.14269 | null |
2024-11-21 | Regional Attention for Shadow Removal | Hengxing Liu et.al. | 2411.14201 | link |
2024-11-21 | TaQ-DiT: Time-aware Quantization for Diffusion Transformers | Xinyan Liu et.al. | 2411.14172 | null |
2024-11-21 | Creating a Formally Verified Neural Network for Autonomous Navigation: An Experience Report | Syed Ali Asadullah Bukhari et.al. | 2411.14163 | link |
2024-11-20 | REDUCIO! Generating 1024 | Rui Tian et.al. | 2411.13552 | link |
2024-11-20 | Identity Preserving 3D Head Stylization with Multiview Score Distillation | Bahri Batuhan Bilecen et.al. | 2411.13536 | null |
2024-11-20 | VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models | Ziqi Huang et.al. | 2411.13503 | link |
2024-11-20 | LIMBA: An Open-Source Framework for the Preservation and Valorization of Low-Resource Languages using Generative Models | Salvatore Mario Carta et.al. | 2411.13453 | null |
2024-11-20 | Heuristically Adaptive Diffusion-Model Evolutionary Strategy | Benedikt Hartl et.al. | 2411.13420 | null |
2024-11-20 | Energy-based generative models for monoclonal antibodies | Paul Pereira et.al. | 2411.13390 | link |
2024-11-20 | Small and Close-In Planets are Uncommon around A-type Stars | Steven Giacalone et.al. | 2411.13363 | null |
2024-11-20 | Vertical Validation: Evaluating Implicit Generative Models for Graphs on Thin Support Regions | Mai Elkady et.al. | 2411.13358 | null |
2024-11-20 | A CSI Feedback Framework based on Transmitting the Important Values and Generating the Others | Zhilin Du et.al. | 2411.13298 | null |
2024-11-21 | Structure-Based Molecule Optimization via Gradient-Guided Bayesian Update | Keyue Qiu et.al. | 2411.13280 | null |
2024-11-20 | XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation | Ziyi Wang et.al. | 2411.13243 | link |
2024-11-20 | BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework | Xu Zou et.al. | 2411.13237 | null |
2024-11-20 | Building music with Lego bricks and Raspberry Pi | Ana M. Barbancho et.al. | 2411.13224 | null |
2024-11-20 | A computational framework for integrating Predictive processes with evidence Accumulation Models (PAM) | Antonino Visalli et.al. | 2411.13203 | link |
2024-11-20 | OpenMS WebApps: Building User-Friendly Solutions for MS Analysis | Tom David Müller et.al. | 2411.13189 | link |
2024-11-19 | Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs | Ahmed Akib Jawad Karim et.al. | 2411.12712 | null |
2024-11-19 | OrigamiPlot: An R Package and Shiny Web App Enhanced Visualizations for Multivariate Data | Yiwen Lu et.al. | 2411.12674 | null |
2024-11-19 | Auto-Evaluation with Few Labels through Post-hoc Regression | Benjamin Eyre et.al. | 2411.12665 | null |
2024-11-19 | PoM: Efficient Image and Video Generation with the Polynomial Mixer | David Picard et.al. | 2411.12663 | link |
2024-11-19 | Optimizing Airline Reservation Systems with Edge-Enabled Microservices: A Framework for Real-Time Data Processing and Enhanced User Responsiveness | Biman Barua et.al. | 2411.12650 | null |
2024-11-19 | DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models | Vinay Kumar Sankarapu et.al. | 2411.12643 | link |
2024-11-19 | Improving Controllability and Editability for Pretrained Text-to-Music Generation Models | Yixiao Zhang et.al. | 2411.12641 | null |
2024-11-19 | Universal programmable waveguide arrays | Akram Youssry et.al. | 2411.12610 | null |
2024-11-19 | Whisper Finetuning on Nepali Language | Sanjay Rijal et.al. | 2411.12587 | null |
2024-11-19 | Predicting Customer Satisfaction by Replicating the Survey Response Distribution | Etienne Manderscheid et.al. | 2411.12539 | null |
2024-11-19 | Data Pruning in Generative Diffusion Models | Rania Briq et.al. | 2411.12523 | link |
2024-11-19 | Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing | Ruyi Ding et.al. | 2411.12508 | null |
2024-11-19 | Empirical Privacy Evaluations of Generative and Predictive Machine Learning Models -- A review and challenges for practice | Flavio Hafner et.al. | 2411.12451 | null |
2024-11-19 | Frequency-Aware Guidance for Blind Image Restoration via Diffusion Models | Jun Xiao et.al. | 2411.12450 | null |
2024-11-19 | A general modeling and simulation framework for dynamic vehicle routing | Markó Horváth et.al. | 2411.12406 | link |
2024-11-18 | QARM: Quantitative Alignment Multi-Modal Recommendation at Kuaishou | Xinchen Luo et.al. | 2411.11739 | null |
2024-11-18 | Aligning Few-Step Diffusion Models with Dense Reward Difference Learning | Ziyi Zhang et.al. | 2411.11727 | link |
2024-11-18 | Multiscale nonlinear integration drives accurate encoding of input information | Giorgio Nicoletti et.al. | 2411.11710 | null |
2024-11-18 | Robust Reinforcement Learning under Diffusion Models for Data with Jumps | Chenyang Jiang et.al. | 2411.11697 | null |
2024-11-18 | Active droplets controlled by enzymatic reactions | Jacques Fries et.al. | 2411.11696 | null |
2024-11-18 | Do Captioning Metrics Reflect Music Semantic Alignment? | Jinwoo Lee et.al. | 2411.11692 | null |
2024-11-18 | Conceptwm: A Diffusion Model Watermark for Concept Protection | Liangqi Lei et.al. | 2411.11688 | null |
2024-11-19 | GNN-Based Code Annotation Logic for Establishing Security Boundaries in C Code | Varun Gadey et.al. | 2411.11567 | null |
2024-11-19 | Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation | Rüveyda Yilmaz et.al. | 2411.11515 | link |
2024-11-18 | Collaborative Contrastive Network for Click-Through Rate Prediction | Chen Gao et.al. | 2411.11508 | null |
2024-11-18 | LaVin-DiT: Large Vision Diffusion Transformer | Zhaoqing Wang et.al. | 2411.11505 | null |
2024-11-18 | Alien Recombination: Exploring Concept Blends Beyond Human Cognitive Availability in Visual Art | Alejandro Hernandez et.al. | 2411.11494 | null |
2024-11-18 | MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion | Dongseok Shim et.al. | 2411.11475 | null |
2024-11-18 | GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts | Junwen He et.al. | 2411.11435 | null |
2024-11-18 | CLUE-MARK: Watermarking Diffusion Models using CLWE | Kareem Shehata et.al. | 2411.11434 | null |
2024-11-15 | M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation | Sucheng Ren et.al. | 2411.10433 | link |
2024-11-15 | Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems | Feiqin Zhu et.al. | 2411.10431 | null |
2024-11-15 | Multiscale Dubuc: A New Similarity Measure for Time Series | Mahsa Khazaei et.al. | 2411.10418 | link |
2024-11-15 | Experimental generation of extreme electron beams for advanced accelerator applications | Claudio Emma et.al. | 2411.10413 | null |
2024-11-15 | How to Build a Quantum Supercomputer: Scaling Challenges and Opportunities | Masoud Mohseni et.al. | 2411.10406 | null |
2024-11-15 | Nonlinearity-Driven Morphing and Control of Topological Modes in Non-Hermitian Systems | Zhao-Fan Cai et.al. | 2411.10398 | null |
2024-11-15 | Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion | Haoran Wei et.al. | 2411.10369 | null |
2024-11-15 | Safe Text-to-Image Generation: Simply Sanitize the Prompt Embedding | Huming Qiu et.al. | 2411.10329 | null |
2024-11-15 | Probabilistic Prior Driven Attention Mechanism Based on Diffusion Model for Imaging Through Atmospheric Turbulence | Guodong Sun et.al. | 2411.10321 | null |
2024-11-15 | Assortment Optimization under the Multinomial Logit Model with Covering Constraints | Omar El Housni et.al. | 2411.10310 | null |
2024-11-15 | Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting | Ziqi Xie et.al. | 2411.10309 | link |
2024-11-15 | MDHP-Net: Detecting Injection Attacks on In-vehicle Network using Multi-Dimensional Hawkes Process and Temporal Model | Qi Liu et.al. | 2411.10258 | null |
2024-11-15 | The Unreasonable Effectiveness of Guidance for Diffusion Models | Tim Kaiser et.al. | 2411.10257 | null |
2024-11-15 | Smooth transport map via diffusion process | Arthur Stéphanovitch et.al. | 2411.10235 | null |
2024-11-15 | ColorEdit: Training-free Image-Guided Color editing with diffusion model | Xingxi Yin et.al. | 2411.10232 | null |
2024-11-14 | A Bayesian Optimization Approach to Machine Translation Reranking | Julius Cheng et.al. | 2411.09694 | link |
2024-11-14 | SimTube: Generating Simulated Video Comments through Multimodal AI and User Personas | Yu-Kai Hung et.al. | 2411.09577 | null |
2024-11-14 | Golden Noise for Diffusion Models: A Learning Framework | Zikai Zhou et.al. | 2411.09502 | link |
2024-11-14 | Sparse Bayesian Generative Modeling for Compressive Sensing | Benedikt Böck et.al. | 2411.09483 | link |
2024-11-14 | DiffRoad: Realistic and Diverse Road Scenario Generation for Autonomous Vehicle Testing | Junjie Zhou et.al. | 2411.09451 | null |
2024-11-14 | Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models | Chutian Meng et.al. | 2411.09449 | null |
2024-11-14 | A survey of probabilistic generative frameworks for molecular simulations | Richard John et.al. | 2411.09388 | link |
2024-11-14 | Multi-scale Generative Modeling for Fast Sampling | Xiongye Xiao et.al. | 2411.09356 | null |
2024-11-14 | ParaLBench: A Large-Scale Benchmark for Computational Paralinguistics over Acoustic Foundation Models | Zixing Zhang et.al. | 2411.09349 | null |
2024-11-15 | Approximate Probabilistic Inference for Time-Series Data A Robust Latent Gaussian Model With Temporal Awareness | Anton Johansson et.al. | 2411.09312 | null |
2024-11-14 | EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models | Soowon Kim et.al. | 2411.09302 | null |
2024-11-14 | LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space | Guanwen Feng et.al. | 2411.09268 | null |
2024-11-14 | Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey | Xuannan Liu et.al. | 2411.09259 | link |
2024-11-14 | RibCageImp: A Deep Learning Framework for 3D Ribcage Implant Generation | Gyanendra Chaubey et.al. | 2411.09204 | null |
2024-11-14 | Improvement and Implementation of a Speech Emotion Recognition Model Based on Dual-Layer LSTM | Xiaoran Yang et.al. | 2411.09189 | null |
2024-11-13 | 4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization | Mijeong Kim et.al. | 2411.08879 | null |
2024-11-13 | A generalized software framework for consolidation of radiotherapy planning and delivery data from diverse data sources | Yasin Abdulkadir et.al. | 2411.08876 | null |
2024-11-13 | Offline Adaptation of Quadruped Locomotion using Diffusion Models | Reece O'Mahoney et.al. | 2411.08832 | null |
2024-11-13 | SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate | Yifei Jin et.al. | 2411.08767 | null |
2024-11-13 | Analyst Reports and Stock Performance: Evidence from the Chinese Market | Rui Liu et.al. | 2411.08726 | null |
2024-11-14 | Reducing ADC Front-end Costs During Training of On-sensor Printed Multilayer Perceptrons | Florentia Afentaki et.al. | 2411.08674 | link |
2024-11-13 | Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks | Zhang Liu et.al. | 2411.08672 | null |
2024-11-13 | Toward Human Understanding with Controllable Synthesis | Hanz Cuevas-Velasquez et.al. | 2411.08663 | null |
2024-11-13 | The Galactica database: an open, generic and versatile tool for the dissemination of simulation data in astrophysics | Damien Chapon et.al. | 2411.08647 | null |
2024-11-13 | Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models | Chengdong Dong et.al. | 2411.08642 | null |
2024-11-13 | Deep Generative Demand Learning for Newsvendor and Pricing | Shijin Gong et.al. | 2411.08631 | null |
2024-11-13 | LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation | Pengwei Yin et.al. | 2411.08606 | null |
2024-11-13 | CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs | Suhas S Kowshik et.al. | 2411.08553 | null |
2024-11-13 | Explainers' Mental Representations of Explainees' Needs in Everyday Explanations | Michael Erol Schaffer et.al. | 2411.08514 | null |
2024-11-13 | HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere | Hatef Otroshi Shahreza et.al. | 2411.08470 | null |
2024-11-12 | Scaling Properties of Diffusion Models for Perceptual Tasks | Rahul Ravishankar et.al. | 2411.08034 | null |
2024-11-12 | GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation | Yushi Lan et.al. | 2411.08033 | null |
2024-11-12 | Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings | Aditya Sanghi et.al. | 2411.08017 | link |
2024-11-12 | JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | Yiyang Ma et.al. | 2411.07975 | link |
2024-11-12 | Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules | Binxu Wang et.al. | 2411.07873 | null |
2024-11-12 | Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders | Xiaofeng Zhu et.al. | 2411.07870 | null |
2024-11-12 | CDXFormer: Boosting Remote Sensing Change Detection with Extended Long Short-Term Memory | Zhenkai Wu et.al. | 2411.07863 | link |
2024-11-12 | Sparsity-Aware Optimization of In-Memory Bayesian Binary Neural Network Accelerators | Prabodh Katti et.al. | 2411.07842 | null |
2024-11-12 | Novel View Synthesis with Pixel-Space Diffusion Models | Noam Elata et.al. | 2411.07765 | null |
2024-11-12 | Nanosecond nanothermometry in an electron microscope | Florian Castioni et.al. | 2411.07764 | null |
2024-11-12 | LapGSR: Laplacian Reconstructive Network for Guided Thermal Super-Resolution | Aditya Kasliwal et.al. | 2411.07750 | null |
2024-11-12 | The relationship between general equilibrium models with infinite-lived agents and overlapping generations models, and some applications | Ngoc-Sang Pham et.al. | 2411.07674 | null |
2024-11-12 | Evaluating the Generation of Spatial Relations in Text and Image Generative Models | Shang Hong Sim et.al. | 2411.07664 | null |
2024-11-12 | Leveraging Previous Steps: A Training-free Fast Solver for Flow Diffusion | Kaiyu Song et.al. | 2411.07627 | null |
2024-11-12 | Unraveling the Connections between Flow Matching and Diffusion Probabilistic Models in Training-free Conditional Generation | Kaiyu Song et.al. | 2411.07625 | null |
2024-11-11 | Score-based generative diffusion with "active" correlated noise sources | Alexandra Lamtyugina et.al. | 2411.07233 | null |
2024-11-12 | Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models | Yoad Tewel et.al. | 2411.07232 | null |
2024-11-11 | Learning from Limited and Imperfect Data | Harsh Rangwani et.al. | 2411.07229 | null |
2024-11-11 | TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models | Matheus Simão et.al. | 2411.07224 | null |
2024-11-11 | DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID | Nyle Siddiqui et.al. | 2411.07205 | link |
2024-11-11 | Crossover from inhomogeneous to homogeneous response of a resonantly driven hBN quantum emitter | Domitille Gérard et.al. | 2411.07202 | null |
2024-11-11 | OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision | Cong Wei et.al. | 2411.07199 | null |
2024-11-11 | More Expressive Attention with Negative Weights | Ang Lv et.al. | 2411.07176 | link |
2024-11-11 | Edify 3D: Scalable High-Quality 3D Asset Generation | NVIDIA et.al. | 2411.07135 | null |
2024-11-11 | Benchmarking LLMs' Judgments with No Gold Standard | Shengwei Xu et.al. | 2411.07127 | link |
2024-11-11 | Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models | NVIDIA et.al. | 2411.07126 | null |
2024-11-11 | Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models | Yanchen Wang et.al. | 2411.07121 | link |
2024-11-11 | Scaling Mesh Generation via Compressive Tokenization | Haohan Weng et.al. | 2411.07025 | link |
2024-11-11 | An Electrocardiogram Monitoring Device Based on STM32 | Wenqi Guan et.al. | 2411.06962 | null |
2024-11-11 | Generative Feature Training of Thin 2-Layer Networks | Johannes Hertrich et.al. | 2411.06848 | link |
2024-11-08 | StdGEN: Semantic-Decomposed 3D Character Generation from Single Images | Yuze He et.al. | 2411.05738 | null |
2024-11-08 | Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models | Jia-Hong Huang et.al. | 2411.05706 | null |
2024-11-08 | Improving Molecular Graph Generation with Flow Matching and Optimal Transport | Xiaoyang Hou et.al. | 2411.05676 | null |
2024-11-08 | Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | Nan Song et.al. | 2411.05544 | null |
2024-11-08 | Improving image synthesis with diffusion-negative sampling | Alakh Desai et.al. | 2411.05473 | null |
2024-11-08 | Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation | Peidong Liu et.al. | 2411.05472 | link |
2024-11-08 | IntellBot: Retrieval Augmented LLM Chatbot for Cyber Threat Knowledge Delivery | Dincy R. Arikkat et.al. | 2411.05442 | link |
2024-11-08 | RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction | Xingyu Ai et.al. | 2411.05354 | link |
2024-11-08 | Electro-diffusive modeling and the role of spine geometry on action potential propagation in neurons | Rahul Gulati et.al. | 2411.05329 | null |
2024-11-08 | Social balance in directed networks | Bingjie Hao et.al. | 2411.05327 | null |
2024-11-08 | SeqRFM: Fast RFM Analysis in Sequence Data | Yanxin Zheng et.al. | 2411.05317 | link |
2024-11-08 | Differentiable Calibration of Inexact Stochastic Simulation Models via Kernel Score Minimization | Ziwei Su et.al. | 2411.05315 | null |
2024-11-08 | A Real-time Face Mask Detection and Social Distancing System for COVID-19 using Attention-InceptionV3 Model | Abdullah Al Asif et.al. | 2411.05312 | null |
2024-11-08 | Adaptive Whole-Body PET Image Denoising Using 3D Diffusion Models with ControlNet | Boxiao Yu et.al. | 2411.05302 | null |
2024-11-08 | GPT Semantic Cache: Reducing LLM Costs and Latency via Semantic Embedding Caching | Sajal Regmi et.al. | 2411.05276 | null |
2024-11-07 | SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | Muyang Li et.al. | 2411.05007 | link |
2024-11-07 | ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing | Jun-Kun Chen et.al. | 2411.05006 | null |
2024-11-07 | Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models | Shuhong Zheng et.al. | 2411.05005 | null |
2024-11-07 | ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning | David Junhao Zhang et.al. | 2411.05003 | null |
2024-11-07 | SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation | Koichi Namekata et.al. | 2411.04989 | null |
2024-11-07 | Few-Shot Task Learning through Inverse Generative Modeling | Aviv Netanyahu et.al. | 2411.04987 | null |
2024-11-07 | How fast does the WallGo? A package for computing wall velocities in first-order phase transitions | Andreas Ekstedt et.al. | 2411.04970 | link |
2024-11-07 | VAIR: Visuo-Acoustic Implicit Representations for Low-Cost, Multi-Modal Transparent Surface Reconstruction in Indoor Scenes | Advaith V. Sethuraman et.al. | 2411.04963 | null |
2024-11-07 | Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification | Mischa Dombrowski et.al. | 2411.04956 | null |
2024-11-07 | Fed-LDR: Federated Local Data-infused Graph Creation with Node-centric Model Refinement | Jiechao Gao et.al. | 2411.04936 | null |
2024-11-07 | DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion | Wenqiang Sun et.al. | 2411.04928 | null |
2024-11-07 | StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration | Panwen Hu et.al. | 2411.04925 | null |
2024-11-07 | Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion | Kaizhe Hu et.al. | 2411.04919 | link |
2024-11-07 | GASE: Generatively Augmented Sentence Encoding | Manuel Frank et.al. | 2411.04914 | null |
2024-11-07 | Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation | Benito Buchheim et.al. | 2411.04724 | null |
2024-11-06 | Community Forensics: Using Thousands of Generators to Train Fake Image Detectors | Jeongsoo Park et.al. | 2411.04125 | null |
2024-11-06 | Stepping Forward on the Last Mile | Chen Feng et.al. | 2411.04036 | null |
2024-11-06 | Prototyping O-RAN Enabled UAV Experimentation for the AERPAW Testbed | Joshua Moore et.al. | 2411.04027 | null |
2024-11-06 | Object-Centric Dexterous Manipulation from Human Motion Data | Yuanpei Chen et.al. | 2411.04005 | null |
2024-11-06 | Synomaly Noise and Multi-Stage Diffusion: A Novel Approach for Unsupervised Anomaly Detection in Ultrasound Imaging | Yuan Bi et.al. | 2411.04004 | null |
2024-11-06 | ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy | Chenrui Tie et.al. | 2411.03990 | null |
2024-11-06 | ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models | Ashutosh Srivastava et.al. | 2411.03982 | null |
2024-11-06 | Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning | Jiawei Yao et.al. | 2411.03978 | link |
2024-11-06 | Bayesian algorithmic perfumery: A Hierarchical Relevance Vector Machine for the Estimation of Personalized Fragrance Preferences based on Three Sensory Layers and Jungian Personality Archetypes | Rolando Gonzales Martinez et.al. | 2411.03965 | null |
2024-11-06 | Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks | Felipe Marra et.al. | 2411.03948 | link |
2024-11-06 | Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks | Ryan Campbell et.al. | 2411.03945 | link |
2024-11-06 | GUIDE-VAE: Advancing Data Generation with User Information and Pattern Dictionaries | Kutay Bölat et.al. | 2411.03936 | link |
2024-11-06 | Large Generative Model-assisted Talking-face Semantic Communication System | Feibo Jiang et.al. | 2411.03876 | null |
2024-11-06 | ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization | Huayang Huang et.al. | 2411.03862 | link |
2024-11-06 | Sub-DM: Subspace Diffusion Model with Orthogonal Decomposition for MRI Reconstruction | Yu Guan et.al. | 2411.03758 | link |
2024-11-05 | MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning | Ziliang Gan et.al. | 2411.03314 | null |
2024-11-05 | LLMs for Domain Generation Algorithm Detection | Reynier Leyva La O et.al. | 2411.03307 | null |
2024-11-05 | DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models | Ying Zhou et.al. | 2411.03250 | null |
2024-11-05 | On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models | Tariq Berrada Ifriqi et.al. | 2411.03177 | null |
2024-11-05 | Unleashing the power of novel conditional generative approaches for new materials discovery | Lev Novitskiy et.al. | 2411.03156 | link |
2024-11-05 | Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting | Adrian B. Chłopowiec et.al. | 2411.03098 | null |
2024-11-05 | Gradient-Guided Conditional Diffusion Models for Private Image Reconstruction: Analyzing Adversarial Impacts of Differential Privacy and Denoising | Tao Huang et.al. | 2411.03053 | null |
2024-11-05 | GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details | Zhongjin Luo et.al. | 2411.03047 | null |
2024-11-05 | Speaker Emotion Recognition: Leveraging Self-Supervised Models for Feature Extraction Using Wav2Vec2 and HuBERT | Pourya Jafarzadeh et.al. | 2411.02964 | null |
2024-11-05 | IMUDiffusion: A Diffusion Model for Multivariate Time Series Synthetisation for Inertial Motion Capturing Systems | Heiko Oppel et.al. | 2411.02954 | null |
2024-11-05 | LDPM: Towards undersampled MRI reconstruction with MR-VAE and Latent Diffusion Prior | Xingjian Tang et.al. | 2411.02951 | null |
2024-11-05 | A scalable generative model for dynamical system reconstruction from neuroimaging data | Eric Volkmann et.al. | 2411.02949 | link |
2024-11-05 | Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey | Ao Fu et.al. | 2411.02914 | null |
2024-11-05 | The Unreasonable Effectiveness of LLMs for Query Optimization | Peter Akioyamen et.al. | 2411.02862 | link |
2024-11-05 | ADOPT: Modified Adam Can Converge with Any | Shohei Taniguchi et.al. | 2411.02853 | link |
2024-11-04 | Training-free Regional Prompting for Diffusion Transformers | Anthony Chen et.al. | 2411.02395 | link |
2024-11-04 | How Far is Video Generation from World Model: A Physical Law Perspective | Bingyi Kang et.al. | 2411.02385 | null |
2024-11-04 | Virgo Filaments IV: Using WISE to Measure the Modification of Star-Forming Disks in the Extended Regions Around the Virgo Cluster | Kim Conger et.al. | 2411.02352 | null |
2024-11-04 | Diffusion-based Generative Multicasting with Intent-aware Semantic Decomposition | Xinkai Liu et.al. | 2411.02334 | null |
2024-11-05 | PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance | Ruyang Liu et.al. | 2411.02327 | link |
2024-11-04 | LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation | Mufei Li et.al. | 2411.02322 | link |
2024-11-04 | CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments | Kung-Hsiang Huang et.al. | 2411.02305 | link |
2024-11-04 | Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation | Xianghui Yang et.al. | 2411.02293 | null |
2024-11-04 | Counterfactual Explanations via Riemannian Latent Space Traversal | Paraskevas Pegios et.al. | 2411.02259 | null |
2024-11-04 | FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training | Ruihong Yin et.al. | 2411.02229 | null |
2024-11-04 | Recursive Learning of Asymptotic Variational Objectives | Alessandro Mastrototaro et.al. | 2411.02217 | null |
2024-11-04 | Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models | Anjith George et.al. | 2411.02188 | null |
2024-11-04 | Touch-to-Touch Translation -- Learning the Mapping Between Heterogeneous Tactile Sensing Technologies | Francesco Grella et.al. | 2411.02187 | null |
2024-11-04 | CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality | Yiqin Zhao et.al. | 2411.02179 | null |
2024-11-04 | CryptoEL: A Novel Experiential Learning Tool for Enhancing K-12 Cryptography Education | Pranathi Rayavaram et.al. | 2411.02143 | null |
2024-10-31 | Bridging Geometric States via Geometric Diffusion Bridge | Shengjie Luo et.al. | 2410.24220 | null |
2024-10-31 | Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning | Penghui Ruan et.al. | 2410.24219 | link |
2024-10-31 | DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion | Weicai Ye et.al. | 2410.24203 | link |
2024-10-31 | Multi-Attribute Linguistic Tuning for Controlled Paraphrase Generation | Mohamed Elgaar et.al. | 2410.24199 | null |
2024-10-31 | Generative modelling for mass-mapping with fast uncertainty quantification | Jessica J. Whitney et.al. | 2410.24197 | link |
2024-10-31 | AR-Pro: Counterfactual Explanations for Anomaly Repair with Formal Properties | Xiayan Ji et.al. | 2410.24178 | link |
2024-10-31 | Redefining in Dictionary: Towards a Enhanced Semantic Understanding of Creative Generation | Fu Feng et.al. | 2410.24160 | null |
2024-10-31 | Scaling Concept With Text-Guided Diffusion Models | Chao Huang et.al. | 2410.24151 | null |
2024-10-31 | Repository-Level Compositional Code Translation and Validation | Ali Reza Ibrahimzada et.al. | 2410.24117 | link |
2024-10-31 | Extended electrochemical monitoring of biomolecular binding using commercially available, reusable electrodes in microliter volumes | Jeremy Mendez et.al. | 2410.24110 | null |
2024-10-31 | Sparsh: Self-supervised touch representations for vision-based tactile sensing | Carolina Higuera et.al. | 2410.24090 | null |
2024-10-31 | Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure | Xiang Li et.al. | 2410.24060 | link |
2024-10-31 | TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation | Sunjae Yoon et.al. | 2410.24037 | null |
2024-10-31 | Unveiling Synthetic Faces: How Synthetic Datasets Can Expose Real Identities | Hatef Otroshi Shahreza et.al. | 2410.24015 | null |
2024-10-31 | DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination | Jia Fu et.al. | 2410.24006 | link |
2024-10-30 | ReferEverything: Towards Segmenting Everything We Can Speak of in Videos | Anurag Bagchi et.al. | 2410.23287 | null |
2024-10-30 | Provable acceleration for diffusion models under minimal assumptions | Gen Li et.al. | 2410.23285 | null |
2024-10-30 | RelationBooth: Towards Relation-Aware Customized Object Generation | Qingyu Shi et.al. | 2410.23280 | null |
2024-10-30 | SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation | Yining Hong et.al. | 2410.23277 | null |
2024-10-30 | Multi-student Diffusion Distillation for Better One-step Generators | Yanke Song et.al. | 2410.23274 | null |
2024-10-30 | ReaWristic: Remote Touch Sensation to Fingers from a Wristband via Visually Augmented Electro-Tactile Feedback | Yudai Tanaka et.al. | 2410.23193 | null |
2024-10-30 | Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning | Keqin Bao et.al. | 2410.23136 | link |
2024-10-30 | Educating for Hardware Specialization in the Chiplet Era: A Path for the HPC Community | Kazutomo Yoshii et.al. | 2410.23127 | null |
2024-10-30 | CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense | Mingkun Zhang et.al. | 2410.23091 | link |
2024-10-30 | General Bayesian quantile regression for counts via generative modeling | Yuta Yamauchi et.al. | 2410.23081 | null |
2024-10-30 | Controlling Language and Diffusion Models by Transporting Activations | Pau Rodriguez et.al. | 2410.23054 | link |
2024-10-30 | Dispersion kinks from electronic correlations in an unconventional iron-based superconductor | Ming-Hua Chang et.al. | 2410.23044 | null |
2024-10-30 | Improving Musical Accompaniment Co-creation via Diffusion Transformers | Javier Nistal et.al. | 2410.23005 | null |
2024-10-30 | DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes | Jialiang Zhang et.al. | 2410.23004 | null |
2024-10-30 | LumiSculpt: A Consistency Lighting Control Network for Video Generation | Yuxin Zhang et.al. | 2410.22979 | null |
2024-10-29 | CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning | Weihang Guo et.al. | 2410.22225 | null |
2024-10-29 | A Gaussian Process Generative Model for QCD Equation of State | Jiaxuan Gong et.al. | 2410.22160 | null |
2024-10-29 | Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models | Raman Dutt et.al. | 2410.22149 | link |
2024-10-29 | AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts | Vishal Kumar et.al. | 2410.22143 | null |
2024-10-29 | Infrared photometry with InGaAs detectors: First light with SPECULOOS | Peter P. Pedersen et.al. | 2410.22140 | link |
2024-10-29 | SimRec: Mitigating the Cold-Start Problem in Sequential Recommendation by Integrating Item Similarity | Shaked Brody et.al. | 2410.22136 | link |
2024-10-29 | Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench | Zheyuan Liu et.al. | 2410.22108 | link |
2024-10-29 | Variational inference for pile-up removal at hadron colliders with diffusion models | Malte Algren et.al. | 2410.22074 | null |
2024-10-29 | PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement | Shutong Jin et.al. | 2410.22059 | null |
2024-10-29 | Dual Conditional Diffusion Models for Sequential Recommendation | Hongtao Huang et.al. | 2410.21967 | null |
2024-10-29 | PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference | Kendong Liu et.al. | 2410.21966 | null |
2024-10-29 | CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach | Dac Thai Nguyen et.al. | 2410.21932 | link |
2024-10-29 | Guided Diffusion-based Counterfactual Augmentation for Robust Session-based Recommendation | Muskan Gupta et.al. | 2410.21892 | null |
2024-10-29 | On the study of the limit cycles for a class of population models with time-varying factors | Renhao Tian et.al. | 2410.21848 | null |
2024-10-29 | Diffusion as Reasoning: Enhancing Object Goal Navigation with LLM-Biased Diffusion Model | Yiming Ji et.al. | 2410.21842 | null |
2024-10-28 | On Inductive Biases That Enable Generalization of Diffusion Transformers | Jie An et.al. | 2410.21273 | link |
2024-10-28 | EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation | Shih-Yang Liu et.al. | 2410.21271 | null |
2024-10-28 | LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior | Hanyu Wang et.al. | 2410.21264 | null |
2024-10-28 | One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation | Zhendong Wang et.al. | 2410.21257 | null |
2024-10-28 | On learning higher-order cumulants in diffusion models | Gert Aarts et.al. | 2410.21212 | null |
2024-10-28 | The VSPEC Collection: A suite of utilities to model spectroscopic phase curves of 3D exoplanet atmospheres in the presence of stellar variability | Ted M Johnson et.al. | 2410.21190 | null |
2024-10-28 | Trajectory Flow Matching with Applications to Clinical Time Series Modeling | Xi Zhang et.al. | 2410.21154 | link |
2024-10-28 | Synthetica: Large Scale Synthetic Data for Robot Perception | Ritvik Singh et.al. | 2410.21153 | null |
2024-10-28 | Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences | Zhihao Zhao et.al. | 2410.21130 | null |
2024-10-28 | Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models | Wenda Li et.al. | 2410.21088 | link |
2024-10-28 | Federated Time Series Generation on Feature and Temporally Misaligned Data | Chenrui Fan et.al. | 2410.21072 | null |
2024-10-28 | Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework | Vladimir Arkhipkin et.al. | 2410.21061 | link |
2024-10-28 | Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | Justin Deschenaux et.al. | 2410.21035 | link |
2024-10-29 | EEG-Driven 3D Object Reconstruction with Color Consistency and Diffusion Prior | Xin Xiang et.al. | 2410.20981 | null |
2024-10-28 | MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis | Di Qiu et.al. | 2410.20974 | null |
2024-10-25 | Model merging with SVD to tie the Knots | George Stoica et.al. | 2410.19735 | link |
2024-10-25 | Adversarial Environment Design via Regret-Guided Diffusion Models | Hojun Chung et.al. | 2410.19715 | null |
2024-10-25 | Perception, Control and Hardware for In-Hand Slip-Aware Object Manipulation with Parallel Grippers | Gabriel Arslan Waltersson et.al. | 2410.19660 | null |
2024-10-25 | DiffGS: Functional Gaussian Splatting Diffusion | Junsheng Zhou et.al. | 2410.19657 | null |
2024-10-25 | VARS: Vision-based Assessment of Risk in Security Systems | Pranav Gupta et.al. | 2410.19642 | null |
2024-10-25 | Diffusion models for lattice gauge field simulations | Qianteng Zhu et.al. | 2410.19602 | null |
2024-10-25 | Energy Efficient Dual Designs of FeFET-Based Analog In-Memory Computing with Inherent Shift-Add Capability | Zeyu Yang et.al. | 2410.19593 | null |
2024-10-25 | Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges | Zubin Zheng et.al. | 2410.19580 | null |
2024-10-25 | Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series | Ilan Naiman et.al. | 2410.19538 | null |
2024-10-25 | Ensemble Data Assimilation for Particle-based Methods | Marius Duvillard et.al. | 2410.19525 | null |
2024-10-25 | Marked Temporal Bayesian Flow Point Processes | Hui Chen et.al. | 2410.19512 | null |
2024-10-25 | EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data | Xuetian Chen et.al. | 2410.19461 | null |
2024-10-28 | NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction | Zixuan Gong et.al. | 2410.19452 | link |
2024-10-25 | Learned Reference-based Diffusion Sampling for multi-modal distributions | Maxence Noble et.al. | 2410.19449 | null |
2024-10-25 | Generative Diffusion Models for Sequential Recommendations | Sharare Zolghadr et.al. | 2410.19429 | null |
2024-10-24 | Framer: Interactive Frame Interpolation | Wen Wang et.al. | 2410.18978 | null |
2024-10-24 | MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms | Ling-Hao Chen et.al. | 2410.18977 | null |
2024-10-24 | Unbounded: A Generative Infinite Game of Character Life Simulation | Jialu Li et.al. | 2410.18975 | null |
2024-10-24 | 3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation | Hansheng Chen et.al. | 2410.18974 | link |
2024-10-24 | On the Crucial Role of Initialization for Matrix Factorization | Bingcong Li et.al. | 2410.18965 | null |
2024-10-24 | Stable Consistency Tuning: Understanding and Improving Consistency Models | Fu-Yun Wang et.al. | 2410.18958 | link |
2024-10-24 | Generation of synthetic financial time series by diffusion models | Tomonori Takahashi et.al. | 2410.18897 | null |
2024-10-24 | Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences | Weijian Luo et.al. | 2410.18881 | null |
2024-10-24 | The Cat and Mouse Game: The Ongoing Arms Race Between Diffusion Models and Detection Methods | Linda Laurier et.al. | 2410.18866 | null |
2024-10-24 | From Efficiency to Equity: Measuring Fairness in Preference Learning | Shreeyash Gowaikar et.al. | 2410.18841 | null |
2024-10-24 | From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages | Artur Kiulian et.al. | 2410.18836 | null |
2024-10-24 | Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation | Xiaoyu Zhang et.al. | 2410.18830 | null |
2024-10-24 | Towards Visual Text Design Transfer Across Languages | Yejin Choi et.al. | 2410.18823 | null |
2024-10-24 | Fast constrained sampling in pre-trained diffusion models | Alexandros Graikos et.al. | 2410.18804 | null |
2024-10-24 | Large Generative AI Models meet Open Networks for 6G: Integration, Platform, and Monetization | Peizheng Li et.al. | 2410.18790 | null |
2024-10-23 | DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes | Hengwei Bian et.al. | 2410.18084 | null |
2024-10-23 | Prioritized Generative Replay | Renhao Wang et.al. | 2410.18082 | null |
2024-10-23 | WorldSimBench: Towards Video Generation Models as World Simulators | Yiran Qin et.al. | 2410.18072 | null |
2024-10-23 | TP-Eval: Tap Multimodal LLMs' Potential in Evaluation by Customizing Prompts | Yuxuan Xie et.al. | 2410.18071 | null |
2024-10-23 | Training Free Guided Flow Matching with Optimal Control | Luran Wang et.al. | 2410.18070 | null |
2024-10-23 | Spectrally shaped THz pulses from tapered dielectric waveguides | Karel Peetermans et.al. | 2410.17975 | null |
2024-10-23 | Optical Generative Models | Shiqi Chen et.al. | 2410.17970 | null |
2024-10-23 | A Wavelet Diffusion GAN for Image Super-Resolution | Lorenzo Aloisi et.al. | 2410.17966 | null |
2024-10-23 | Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation | Wenfang Yao et.al. | 2410.17918 | link |
2024-10-23 | regAL: Python Package for Active Learning of Regression Problems | Elizaveta Surzhikova et.al. | 2410.17917 | null |
2024-10-23 | Scaling Diffusion Language Models via Adaptation from Autoregressive Models | Shansan Gong et.al. | 2410.17891 | link |
2024-10-23 | Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech | Danilo de Oliveira et.al. | 2410.17834 | null |
2024-10-23 | PGDiffSeg: Prior-Guided Denoising Diffusion Model with Parameter-Shared Attention for Breast Cancer Segmentation | Feiyan Feng et.al. | 2410.17812 | null |
2024-10-23 | GenUDC: High Quality 3D Mesh Generation with Unsigned Dual Contouring Representation | Ruowei Wang et.al. | 2410.17802 | link |
2024-10-23 | Regularized autoregressive modeling and its application to audio signal declipping | Ondřej Mokrý et.al. | 2410.17790 | link |
2024-10-22 | Large Language Models Empowered Personalized Web Agents | Hongru Cai et.al. | 2410.17236 | null |
2024-10-22 | Creativity in AI: Progresses and Challenges | Mete Ismayilzada et.al. | 2410.17218 | link |
2024-10-22 | Audio-to-Score Conversion Model Based on Whisper methodology | Hongyao Zhang et.al. | 2410.17209 | null |
2024-10-22 | Reinforcement learning on structure-conditioned categorical diffusion for protein inverse folding | Yasha Ektefaie et.al. | 2410.17173 | link |
2024-10-22 | Performance of the CMS high-level trigger during LHC Run 2 | CMS Collaboration et.al. | 2410.17038 | null |
2024-10-22 | Hybrid Generative AI for De Novo Design of Co-Crystals with Enhanced Tabletability | Nina Gubina et.al. | 2410.17005 | link |
2024-10-22 | DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization | Haowei Zhu et.al. | 2410.16942 | null |
2024-10-22 | Hierarchical Clustering for Conditional Diffusion in Image Generation | Jorge da Silva Goncalves et.al. | 2410.16910 | link |
2024-10-22 | Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections | Marco Miani et.al. | 2410.16901 | null |
2024-10-22 | VistaDream: Sampling multiview consistent images for single-view scene reconstruction | Haiping Wang et.al. | 2410.16892 | null |
2024-10-22 | CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare | Nicholas I-Hsien Kuo et.al. | 2410.16872 | null |
2024-10-22 | MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model | Meng Xu et.al. | 2410.16840 | null |
2024-10-22 | Bridging Search and Recommendation in Generative Retrieval: Does One Task Help the Other? | Gustavo Penha et.al. | 2410.16823 | null |
2024-10-22 | Evaluating the Effectiveness of Attack-Agnostic Features for Morphing Attack Detection | Laurent Colbois et.al. | 2410.16802 | link |
2024-10-22 | One-Step Diffusion Distillation through Score Implicit Matching | Weijian Luo et.al. | 2410.16794 | link |
2024-10-21 | MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors | Honghua Chen et.al. | 2410.16272 | null |
2024-10-21 | Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos | Gengshan Yang et.al. | 2410.16259 | null |
2024-10-21 | Distribution Learning with Valid Outputs Beyond the Worst-Case | Nick Rittler et.al. | 2410.16253 | null |
2024-10-21 | Building A Coding Assistant via the Retrieval-Augmented Language Model | Xinze Li et.al. | 2410.16229 | link |
2024-10-21 | CiteClick: A Browser Extension for Real-Time Scholar Citation Tracking | Nishat Raihan et.al. | 2410.16211 | null |
2024-10-21 | A Framework for Evaluating Predictive Models Using Synthetic Image Covariates and Longitudinal Data | Simon Deltadahl et.al. | 2410.16177 | null |
2024-10-22 | Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models | Giannis Daras et.al. | 2410.16152 | null |
2024-10-21 | Modelling Structured Data Learning with Restricted Boltzmann Machines in the Teacher-Student Setting | Robin Thériault et.al. | 2410.16150 | null |
2024-10-21 | SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation | Xinyi Zhou et.al. | 2410.16119 | null |
2024-10-21 | Critical Example Mining for Vehicle Trajectory Prediction using Flow-based Generative Models | Zhezhang Ding et.al. | 2410.16083 | null |
2024-10-21 | Continuous Speech Synthesis using per-token Latent Diffusion | Arnon Turetzky et.al. | 2410.16048 | null |
2024-10-21 | Some generalizations of the convective model of jet generation | S. N. Artekha et.al. | 2410.16035 | null |
2024-10-21 | ComPO: Community Preferences for Language Model Personalization | Sachin Kumar et.al. | 2410.16027 | null |
2024-10-21 | Massimo: Public Queue Monitoring and Management using Mass-Spring Model | Abhijeet Kumar et.al. | 2410.16012 | null |
2024-10-21 | AI-Driven Innovations in Modern Cloud Computing | Animesh Kumar et.al. | 2410.15960 | null |
2024-10-18 | BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities | Shaozhe Hao et.al. | 2410.14672 | link |
2024-10-18 | How Does Data Diversity Shape the Weight Landscape of Neural Networks? | Yang Ba et.al. | 2410.14602 | null |
2024-10-18 | Bayesian Multi-wavelength Imaging of the LMC SN1987A with SRG/eROSITA | Vincent Eberle et.al. | 2410.14599 | null |
2024-10-18 | Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets | Namid R. Stillman et.al. | 2410.14587 | null |
2024-10-18 | Reimagining partial thickness keratoplasty: An eye mountable robot for autonomous big bubble needle insertion | Y. Wang et.al. | 2410.14577 | null |
2024-10-18 | Multi-modal Pose Diffuser: A Multimodal Generative Conditional Pose Prior | Calvin-Khang Ta et.al. | 2410.14540 | null |
2024-10-18 | Blockchain-Based Trust and Transparency in Airline Reservation Systems using Microservices Architecture | Biman Barua et.al. | 2410.14518 | null |
2024-10-18 | LEAD: Latent Realignment for Human Motion Diffusion | Nefeli Andreou et.al. | 2410.14508 | null |
2024-10-18 | Reinforcement Learning in Non-Markov Market-Making | Luca Lalor et.al. | 2410.14504 | null |
2024-10-18 | Data-driven topology design with persistent homology for enhancing population diversity | Taisei Kii et.al. | 2410.14496 | null |
2024-10-18 | ANT: Adaptive Noise Schedule for Time Series Diffusion Models | Seunghan Lee et.al. | 2410.14488 | link |
2024-10-21 | CaTs and DAGs: Integrating Directed Acyclic Graphs with Transformers and Fully-Connected Neural Networks for Causally Constrained Predictions | Matthew J. Vowels et.al. | 2410.14485 | link |
2024-10-18 | DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation | Junjie Wu et.al. | 2410.14481 | null |
2024-10-18 | Flow-based Sampling for Entanglement Entropy and the Machine Learning of Defects | Andrea Bulgarelli et.al. | 2410.14466 | null |
2024-10-18 | FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models | Rui Hu et.al. | 2410.14429 | null |
2024-10-17 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens | Lijie Fan et.al. | 2410.13863 | null |
2024-10-17 | Diffusing States and Matching Scores: A New Framework for Imitation Learning | Runzhe Wu et.al. | 2410.13855 | link |
2024-10-17 | Influence Functions for Scalable Data Attribution in Diffusion Models | Bruno Mlodozeniec et.al. | 2410.13850 | null |
2024-10-17 | VidPanos: Generative Panoramic Videos from Casual Panning Videos | Jingwei Ma et.al. | 2410.13832 | null |
2024-10-17 | DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control | Yujie Wei et.al. | 2410.13830 | null |
2024-10-17 | Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning | Xiaodan Xing et.al. | 2410.13823 | link |
2024-10-17 | ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution | Junhao Gu et.al. | 2410.13807 | null |
2024-10-17 | Probing the Latent Hierarchical Structure of Data via Diffusion Models | Antonio Sclocchi et.al. | 2410.13770 | null |
2024-10-17 | Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers | Yuchen Liang et.al. | 2410.13746 | null |
2024-10-17 | Improved Convergence Rate for Diffusion Probabilistic Models | Gen Li et.al. | 2410.13738 | null |
2024-10-17 | Optimizing Probabilistic Conformal Prediction with Vectorized Non-Conformity Scores | Minxing Zheng et.al. | 2410.13735 | null |
2024-10-18 | DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation | Hanbo Cheng et.al. | 2410.13726 | link |
2024-10-17 | Movie Gen: A Cast of Media Foundation Models | Adam Polyak et.al. | 2410.13720 | link |
2024-10-18 | Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion | Yijun Liang et.al. | 2410.13674 | link |
2024-10-17 | Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design | Chenyu Wang et.al. | 2410.13643 | link |
2024-10-16 | Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds | Xingzhi Sun et.al. | 2410.12779 | null |
2024-10-16 | Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts | Hongcheng Gao et.al. | 2410.12777 | link |
2024-10-16 | SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | Jaehong Yoon et.al. | 2410.12761 | null |
2024-10-16 | Signature of Vertical Mixing in Hydrogen-dominated Exoplanet Atmospheres | Vikas Soni et.al. | 2410.12737 | null |
2024-10-16 | Counterfactual Generative Modeling with Variational Causal Inference | Yulun Wu et.al. | 2410.12730 | link |
2024-10-16 | FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression | Zhenheng Tang et.al. | 2410.12707 | null |
2024-10-16 | Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization | Xingqi Wang et.al. | 2410.12700 | link |
2024-10-16 | AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing | DuoSheng Chen et.al. | 2410.12696 | link |
2024-10-16 | 3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation | Dewei Zhou et.al. | 2410.12669 | link |
2024-10-16 | Towards Designing Scalable Quantum-Enhanced Generative Networks for Neutrino Physics Experiments with Liquid Argon Time Projection Chambers | Andrea Delgado et.al. | 2410.12650 | null |
2024-10-16 | A Robo-Advisor System: expected utility modeling via pairwise comparisons | Bo Chen et.al. | 2410.12570 | null |
2024-10-16 | One Step Diffusion via Shortcut Models | Kevin Frans et.al. | 2410.12557 | link |
2024-10-16 | Disentangling data distribution for Federated Learning | Xinyuan Zhao et.al. | 2410.12530 | null |
2024-10-16 | Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing | Mingce Guo et.al. | 2410.12526 | null |
2024-10-16 | MING: A Functional Approach to Learning Molecular Generative Models | Van Khoa Nguyen et.al. | 2410.12522 | null |
2024-10-15 | High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion | Junhwa Hur et.al. | 2410.11838 | null |
2024-10-15 | On the Effectiveness of Dataset Alignment for Fake Image Detection | Anirudh Sundara Rajan et.al. | 2410.11835 | null |
2024-10-15 | Bayesian Experimental Design via Contrastive Diffusions | Jacopo Iollo et.al. | 2410.11826 | link |
2024-10-15 | KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities | Hsin-Ping Huang et.al. | 2410.11824 | null |
2024-10-15 | Improving Long-Text Alignment for Text-to-Image Diffusion Models | Luping Liu et.al. | 2410.11817 | link |
2024-10-15 | SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing | Zhiyuan Zhang et.al. | 2410.11815 | null |
2024-10-16 | Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices | Zhiyuan Ma et.al. | 2410.11795 | null |
2024-10-15 | G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks | Guibin Zhang et.al. | 2410.11782 | null |
2024-10-15 | Technical Report of 1:10 Scale Autonomous Vehicle Robot | Amirhossein Kheiri Holighi et.al. | 2410.11746 | null |
2024-10-15 | Probabilistic Principles for Biophysics and Neuroscience: Entropy Production, Bayesian Mechanics & the Free-Energy Principle | Lancelot Da Costa et.al. | 2410.11735 | null |
2024-10-15 | Patch-Based Diffusion Models Beat Whole-Image Models for Mismatched Distribution Inverse Problems | Jason Hu et.al. | 2410.11730 | null |
2024-10-15 | Parameter estimation of structural dynamics with neural operators enabled surrogate modeling | Mingyuan Zhou et.al. | 2410.11712 | null |
2024-10-15 | Findings of the WMT 2024 Shared Task on Chat Translation | Wafaa Mohammed et.al. | 2410.11624 | null |
2024-10-15 | DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment | Wendi Chen et.al. | 2410.11584 | link |
2024-10-15 | A Data-Driven Aggressive Autonomous Racing Framework Utilizing Local Trajectory Planning with Velocity Prediction | Zhouheng Li et.al. | 2410.11570 | link |
2024-10-14 | Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models | Jingzhi Bao et.al. | 2410.10821 | link |
2024-10-15 | TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models | Mu Cai et.al. | 2410.10818 | link |
2024-10-14 | LVD-2M: A Long-take Video Dataset with Temporally Dense Captions | Tianwei Xiong et.al. | 2410.10816 | link |
2024-10-14 | Depth Any Video with Scalable Synthetic Data | Honghui Yang et.al. | 2410.10815 | link |
2024-10-14 | HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Haotian Tang et.al. | 2410.10812 | link |
2024-10-14 | TrajDiffuse: A Conditional Diffusion Model for Environment-Aware Trajectory Prediction | Qingze et.al. | 2410.10804 | link |
2024-10-14 | Boosting Camera Motion Control for Video Diffusion Transformers | Soon Yau Cheong et.al. | 2410.10802 | null |
2024-10-14 | Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations | Litu Rout et.al. | 2410.10792 | null |
2024-10-14 | ControlMM: Controllable Masked Motion Generation | Ekkasit Pinyoanuntapong et.al. | 2410.10780 | null |
2024-10-14 | Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation | Youwei Yu et.al. | 2410.10766 | link |
2024-10-14 | DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships | Zhang Wan et.al. | 2410.10751 | null |
2024-10-14 | CosForce: A Force-Based General Model for Simulating Pedestrian Anticipation and Reaction Mechanisms | Jinghui Wang et.al. | 2410.10746 | null |
2024-10-14 | FlexGen: Flexible Multi-View Generation from Text and Image Inputs | Xinli Xu et.al. | 2410.10745 | null |
2024-10-14 | Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models | Junyu Chen et.al. | 2410.10733 | link |
2024-10-14 | Large Language Models Are Active Critics in NLG Evaluation | Shuying Xu et.al. | 2410.10724 | null |
2024-10-11 | SceneCraft: Layout-Guided 3D Scene Generation | Xiuyu Yang et.al. | 2410.09049 | link |
2024-10-11 | Linear Convergence of Diffusion Models Under the Manifold Hypothesis | Peter Potaptchik et.al. | 2410.09046 | null |
2024-10-11 | PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents | Xiangyu Yin et.al. | 2410.09034 | link |
2024-10-11 | Semantic Score Distillation Sampling for Compositional Text-to-3D Generation | Ling Yang et.al. | 2410.09009 | link |
2024-10-11 | WaveDiffusion: Exploring Full Waveform Inversion via Joint Diffusion in the Latent Space | Hanchen Wang et.al. | 2410.09002 | null |
2024-10-11 | Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory | Aymane El Firdoussi et.al. | 2410.08942 | null |
2024-10-11 | DiffPO: A causal diffusion model for learning distributions of potential outcomes | Yuchen Ma et.al. | 2410.08924 | null |
2024-10-11 | An End-to-End Deep Learning Method for Solving Nonlocal Allen-Cahn and Cahn-Hilliard Phase-Field Models | Yuwei Geng et.al. | 2410.08914 | null |
2024-10-11 | Conditional Generative Models for Contrast-Enhanced Synthesis of T1w and T1 Maps in Brain MRI | Moritz Piening et.al. | 2410.08894 | link |
2024-10-11 | MATCH: Model-Aware TVM-based Compilation for Heterogeneous Edge Devices | Mohamed Amine Hamdi et.al. | 2410.08855 | link |
2024-10-14 | LIME-Eval: Rethinking Low-light Image Enhancement Evaluation via Object Detection | Mingjia Li et.al. | 2410.08810 | link |
2024-10-11 | Bad Neighbors: On Understanding VPN Provider Networks | Teemu Rytilahti et.al. | 2410.08737 | link |
2024-10-11 | 5G as Enabler for Industrie 4.0 Use Cases: Challenges and Concepts | M. Gundall et.al. | 2410.08726 | null |
2024-10-11 | Investigating Human-Computer Interaction and Visual Comprehension in Text Generation Process of Natural Language Generation Models | Yunchao Wang et.al. | 2410.08723 | null |
2024-10-11 | Impact of Surface Reflections in Maritime Obstacle Detection | Samed Yalçın et.al. | 2410.08713 | link |
2024-10-10 | LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts | Anh-Quan Cao et.al. | 2410.08211 | null |
2024-10-10 | DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models | Xiaoxiao He et.al. | 2410.08207 | null |
2024-10-10 | HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation | Shanyan Guan et.al. | 2410.08192 | null |
2024-10-10 | DifFRelight: Diffusion-Based Facial Performance Relighting | Mingming He et.al. | 2410.08188 | null |
2024-10-10 | RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image | Xiaoxue Chen et.al. | 2410.08181 | null |
2024-10-10 | ZeroComp: Zero-shot Object Compositing from Image Intrinsics via Diffusion | Zitian Zhang et.al. | 2410.08168 | link |
2024-10-10 | DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | Jiatao Gu et.al. | 2410.08159 | null |
2024-10-10 | Progressive Autoregressive Video Diffusion Models | Desai Xie et.al. | 2410.08151 | link |
2024-10-10 | Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction | Jarrid Rector-Brooks et.al. | 2410.08134 | null |
2024-10-10 | Robust AI-Generated Text Detection by Restricted Embeddings | Kristian Kuznetsov et.al. | 2410.08113 | link |
2024-10-10 | LiPO: LiDAR Inertial Odometry for ICP Comparison | Darwin Mick et.al. | 2410.08097 | null |
2024-10-10 | Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models | Vinith M. Suriyakumar et.al. | 2410.08074 | null |
2024-10-10 | Reversible Decoupling Network for Single Image Reflection Removal | Hao Zhao et.al. | 2410.08063 | link |
2024-10-10 | A Target-Aware Analysis of Data Augmentation for Hate Speech Detection | Camilla Casula et.al. | 2410.08053 | null |
2024-10-10 | LADIMO: Face Morph Generation through Biometric Template Inversion with Latent Diffusion | Marcel Grimmer et.al. | 2410.07988 | link |
2024-10-09 | IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation | Xinchen Zhang et.al. | 2410.07171 | link |
2024-10-09 | Sylber: Syllabic Embedding Representation of Speech from Raw Audio | Cheol Jun Cho et.al. | 2410.07168 | link |
2024-10-09 | AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation | Yukang Cao et.al. | 2410.07164 | null |
2024-10-09 | InstructG2I: Synthesizing Images from Multimodal Attributed Graphs | Bowen Jin et.al. | 2410.07157 | link |
2024-10-09 | Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis | Bohan Zeng et.al. | 2410.07155 |