AI Research News Feeds for August 20th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

Nanopore sequencing of intact aminoacylated tRNAs
What counts as plagiarism? AI-generated papers pose new risks
The importance of negative training data for robust antibody binding prediction
Electron-density-informed effective and reliable de novo molecular design and optimization with ED2Mol
SciToolAgent: a knowledge-graph-driven scientific agent for multitool integration
Training data composition determines machine learning generalization and biological rule discovery
AI-based diagnosis of acute aortic syndrome from noncontrast CT
A comprehensive deep learning approach to improve enchondroma detection on X-ray images
Automatic detection of cognitive events using machine learning and understanding models’ interpretations of human cognition
Diffusion-Driven High-Dimensional Variable Selection
LEARNER: A Transfer Learning Method for Low-Rank Matrix Estimation
RadGPT: Constructing 3D Image-Text Tumor Datasets
Rapid Urban Visibility Hotspots: Quantifying Building Vertex Visibility from Connected Vehicle Trajectories using Spatial Indexing
BRISC: Annotated Dataset for Brain Tumor Segmentation and Classification with Swin-HAFNet
UltraDfeGAN: Detail-Enhancing Generative Adversarial Networks for High-Fidelity Functional Ultrasound Synthesis
Colon Polyps Detection from Colonoscopy Images Using Deep Learning
Benchmarking GPT-5 for Zero-Shot Multimodal Medical Reasoning in Radiology and Radiation Oncology
PediDemi -- A Pediatric Demyelinating Lesion Segmentation Dataset
InnerGS: Internal Scenes Rendering via Factorized 3D Gaussian Splatting
Susceptibility Distortion Correction of Diffusion MRI with a single Phase-Encoding Direction
Towards Understanding and Harnessing the Transferability of Prognostic Knowledge in Computational Pathology
ROVER: Robust Loop Closure Verification with Trajectory Prior in Repetitive Environments
State of Abdominal CT Datasets: A Critical Review of Bias, Clinical Relevance, and Real-world Applicability
Model-based Multi-object Visual Tracking: Identification and Standard Model Limitations
subCellSAM: Zero-Shot (Sub-)Cellular Segmentation for Hit Validation in Drug Discovery
Deep Biomechanically-Guided Interpolation for Keypoint-Based Brain Shift Registration
Sketch3DVE: Sketch-based 3D-Aware Scene Video Editing
Is-NeRF: In-scattering Neural Radiance Field for Blurred Images
Latent Interpolation Learning Using Diffusion Models for Cardiac Volume Reconstruction
Multimodal Data Storage and Retrieval for Embodied AI: A Survey
Learning to See Through Flare
MMIS-Net for Retinal Fluid Segmentation and Detection
Real-Time, Population-Based Reconstruction of 3D Bone Models via Very-Low-Dose Protocols
Augmenting cobots for sheet-metal SMEs with 3D object recognition and localisation
UNICON: UNIfied CONtinual Learning for Medical Foundational Models
Advancing Toward Robust and Scalable Fingerprint Orientation Estimation: From Gradients to Deep Learning
Diffusion Noise Feature: Accurate and Fast Generated Image Detection
A global optimization SAR image segmentation model can be easily transformed to a general ROF denoising model
SAR image segmentation algorithms based on I-divergence-TV model
Active contours driven by local and global intensity fitting energy with application to SAR image segmentation and its fast solvers
Rethinking Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising
ContrastAlign: Toward Robust BEV Feature Alignment via Contrastive Learning for Multi-Modal 3D Object Detection
HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model
WHALES: A Multi-Agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving
ResFlow: Fine-tuning Residual Optical Flow for Event-based High Temporal Resolution Motion Estimation
Image Augmentation Agent for Weakly Supervised Semantic Segmentation
MMHMER: Multi-viewer and Multi-task for Handwritten Mathematical Expression Recognition
Towards Vision Zero: The TUM Traffic Accid3nD Dataset
AutoComPose: Automatic Generation of Pose Transition Descriptions for Composed Pose Retrieval Using Multimodal LLMs
Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction
DNF-Avatar: Distilling Neural Fields for Real-time Animatable Avatar Relighting
EmoSEM: Segment and Explain Emotion Stimuli in Visual Art
Beyond the Horizon: Decoupling Multi-View UAV Action Recognition via Partial Order Transfer
ReservoirTTA: Prolonged Test-time Adaptation for Evolving and Recurring Domains
Boosting Adversarial Transferability for Hyperspectral Image Classification Using 3D Structure-invariant Transformation and Weighted Intermediate Feature Divergence
MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation
FreqDGT: Frequency-Adaptive Dynamic Graph Networks with Transformer for Cross-subject EEG Emotion Recognition
Upsample What Matters: Region-Adaptive Latent Sampling for Accelerated Diffusion Transformers
Stereo-based 3D Anomaly Object Detection for Autonomous Driving: A New Dataset and Baseline
Regional quality estimation for echocardiography using deep learning
SEA-LION: Southeast Asian Languages in One Network
Crossing Borders Without Crossing Boundaries: How Sociolinguistic Awareness Can Optimize User Engagement with Localized Spanish AI Models Across Hispanophone Countries
Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization
Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration
MedVisionLlama: Leveraging Pre-Trained Large Language Model Layers to Enhance Medical Image Segmentation
YOLO11-CR: a Lightweight Convolution-and-Attention Framework for Accurate Fatigue Driving Detection
DianJin-OCR-R1: Enhancing OCR Capabilities via a Reasoning-and-Tool Interleaved Vision-Language Model
Exploration of Deep Learning Based Recognition for Urdu Text
Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving
Automated Assessment of Aesthetic Outcomes in Facial Plastic Surgery
Applications of Small Language Models in Medical Imaging Classification with a Focus on Prompt Strategies
AIM 2025 Rip Current Segmentation (RipSeg) Challenge Report
EDTalk++: Full Disentanglement for Controllable Talking Head Synthesis
Revisiting MLLM Token Technology through the Lens of Classical Visual Coding
MINR: Efficient Implicit Neural Representations for Multi-Image Encoding
Distribution-Aware Hadamard Quantization for Hardware-Efficient Implicit Neural Representations
AIM 2025 challenge on Inverse Tone Mapping Report: Methods and Results
Enhancing Robustness of Implicit Neural Representations Against Weight Perturbations
FAMNet: Integrating 2D and 3D Features for Micro-expression Recognition via Multi-task Learning and Hierarchical Attention
AdaptiveAE: An Adaptive Exposure Strategy for HDR Capturing in Dynamic Scenes
Bridging the Gap: Doubles Badminton Analysis with Singles-Trained Models
2D Gaussians Meet Visual Tokenizer
GazeProphet: Software-Only Gaze Prediction for VR Foveated Rendering
A Lightweight Dual-Mode Optimization for Generative Face Video Coding
Color Spike Data Generation via Bio-inspired Neuron-like Encoding with an Artificial Photoreceptor Layer
DictAS: A Framework for Class-Generalizable Few-Shot Anomaly Segmentation via Dictionary Lookup
Learnable SMPLify: A Neural Solution for Optimization-Free Human Pose Inverse Kinematics
Generative Model-Based Feature Attention Module for Video Action Analysis
Temporal-Conditional Referring Video Object Segmentation with Noise-Free Text-to-Video Diffusion Model
Bridging Clear and Adverse Driving Conditions
Towards Efficient Vision State Space Models via Token Merging
Unleashing Semantic and Geometric Priors for 3D Scene Completion
PersonaVlog: Personalized Multimodal Vlog Generation with Multi-Agent Collaboration and Iterative Self-Correction
Two-Factor Authentication Smart Entryway Using Modified LBPH Algorithm
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
RCGNet: RGB-based Category-Level 6D Object Pose Estimation with Geometric Guidance
DiffIER: Optimizing Diffusion Models with Iterative Error Reduction
OmniTry: Virtual Try-On Anything without Masks
DeH4R: A Decoupled and Hybrid Method for Road Network Graph Extraction
HumanPCR: Probing MLLM Capabilities in Diverse Human-Centric Scenes
Diversity-enhanced Collaborative Mamba for Semi-supervised Medical Image Segmentation
Hierarchical Vision-Language Retrieval of Educational Metaverse Content in Agriculture
Enhancing Targeted Adversarial Attacks on Large Vision-Language Models through Intermediate Projector Guidance
MR6D: Benchmarking 6D Pose Estimation for Mobile Robots
Shape-from-Template with Generalised Camera
VisionLaw: Inferring Interpretable Intrinsic Dynamics from Visual Observations via Bilevel Optimization
Timestep-Compressed Attack on Spiking Neural Networks through Timestep-Level Backpropagation
Self-Aware Adaptive Alignment: Enabling Accurate Perception for Intelligent Transportation Systems
SAGA: Learning Signal-Aligned Distributions for Improved Text-to-Image Generation
RED.AI Id-Pattern: First Results of Stone Deterioration Patterns with Multi-Agent Systems
RICO: Two Realistic Benchmarks and an In-Depth Analysis for Incremental Learning in Object Detection
In-hoc Concept Representations to Regularise Deep Learning in Medical Imaging
Forecasting Smog Events Using ConvLSTM: A Spatio-Temporal Approach for Aerosol Index Prediction in South Asia
SCRNet: Spatial-Channel Regulation Network for Medical Ultrasound Image Segmentation
PhysGM: Large Physical Gaussian Model for Feed-Forward 4D Synthesis
DIME-Net: A Dual-Illumination Adaptive Enhancement Network Based on Retinex and Mixture-of-Experts
ViT-FIQA: Assessing Face Image Quality using Vision Transformers
ROVR-Open-Dataset: A Large-Scale Depth Dataset for Autonomous Driving
OmViD: Omni-supervised active learning for video action detection
Physics-Based 3D Simulation for Synthetic Data Generation and Failure Analysis in Packaging Stability Assessment
Self-Supervised Sparse Sensor Fusion for Long Range Perception
ResPlan: A Large-Scale Vector-Graph Dataset of 17,000 Residential Floor Plans
Online 3D Gaussian Splatting Modeling with Novel View Selection
Backdooring Self-Supervised Contrastive Learning by Noisy Alignment
InfiniteTalk: Audio-driven Video Generation for Sparse-Frame Video Dubbing
Distilled-3DGS: Distilled 3D Gaussian Splatting
Beyond Simple Edits: Composed Video Retrieval with Dense Modifications
LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos
Query Logs Analytics: A Systematic Literature Review
BQA: Body Language Question Answering Dataset for Video Large Language Models
Consolidating and Developing Benchmarking Datasets for the Nepali Natural Language Understanding Tasks
Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain
Universal Abstraction: Harnessing Frontier Models to Structure Real-World Data at Scale
Basic Category Usage in Vision Language Models
Fair Play in the Newsroom: Actor-Based Filtering Gender Discrimination in Text Corpora
Stands to Reason: Investigating the Effect of Reasoning on Idiomaticity Detection
MATA (māta): Mindful Assessment of the Telugu Abilities of Large Language Models
AdaDocVQA: Adaptive Framework for Long Document Visual Question Answering in Low-Resource Settings
CRISP: Persistent Concept Unlearning via Sparse Autoencoders
EEG-MedRAG: Enhancing EEG-based Clinical Decision-Making via Hierarchical Hypergraph Retrieval-Augmented Generation
Sycophancy under Pressure: Evaluating and Mitigating Sycophantic Bias via Adversarial Dialogues in Scientific QA
MGT-Prism: Enhancing Domain Generalization for Machine-Generated Text Detection via Spectral Alignment
Can Large Language Models (LLMs) Describe Pictures Like Children? A Comparative Corpus Study
TracSum: A New Benchmark for Aspect-Based Summarization with Sentence-Level Traceability in Medical Domain
Beyond Human Judgment: A Bayesian Evaluation of LLMs' Moral Values Understanding
MME-SCI: A Comprehensive and Challenging Science Benchmark for Multimodal Large Language Models
ReviewGraph: A Knowledge Graph Embedding Based Framework for Review Rating Prediction with Sentiment Features
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
The Promise of Large Language Models in Digital Health: Evidence from Sentiment Analysis in Online Health Communities
Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Finite Expression Method for Solving High-Dimensional Partial Differential Equations
Active Learning of Mealy Machines with Timers
Contrastive Learning on Multimodal Analysis of Electronic Health Records
Robustly estimating heterogeneity in factorial data using Rashomon Partitions
Disciplined Geodesically Convex Programming
Unsupervised Anomaly Detection Using Diffusion Trend Analysis for Display Inspection
Parallel Network Reconstruction with Multi-directional Regularization
Development of Pre-Trained Transformer-based Models for the Nepali Language
TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations
Investigating the importance of county-level characteristics in opioid-related mortality across the United States
Spatially-guided Temporal Aggregation for Robust Event-RGB Optical Flow Estimation
Hybrid Machine Learning Model with a Constrained Action Space for Trajectory Prediction
Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent
Fact or Guesswork? Evaluating Large Language Models' Medical Knowledge with Structured One-Hop Judgments
Rectifying Conformity Scores for Better Conditional Coverage
MR-EEGWaveNet: Multiresolutional EEGWaveNet for Seizure Detection from Long EEG Recordings
Cross-Modal Characterization of Thin Film MoS$_2$ Using Generative Models
BLIPs: Bayesian Learned Interatomic Potentials
Learning from Preferences and Mixed Demonstrations in General Settings
FedChip: Federated LLM for Artificial Intelligence Accelerator Chip Design
Sex-Specific Vascular Score: A Novel Perfusion Biomarker from Supervoxel Analysis of 3D pCASL MRI
Modeling GRNs with a Probabilistic Categorical Framework
The Course Difficulty Analysis Cookbook
Structural Foundations for Leading Digit Laws: Beyond Probabilistic Mixtures
Automated Cervical Cancer Detection through Visual Inspection with Acetic Acid in Resource-Poor Settings with Lightweight Deep Learning Models Deployed on an Android Device
CLoE: Curriculum Learning on Endoscopic Images for Robust MES Classification
DAASH: A Meta-Attack Framework for Synthesizing Effective and Stealthy Adversarial Examples
Flow Matching-Based Generative Modeling for Efficient and Scalable Data Assimilation
A Risk Manager for Intrusion Tolerant Systems: Enhancing HAL 9000 with New Scoring and Data Sources
OrbitChain: Orchestrating In-orbit Real-time Analytics of Earth Observation Data
Vision Transformers for Kidney Stone Image Classification: A Comparative Study with CNNs
Multi-view Clustering via Bi-level Decoupling and Consistency Learning
Saudi-Dialect-ALLaM: LoRA Fine-Tuning for Dialectal Arabic Generation
Compressed Models are NOT Trust-equivalent to Their Large Counterparts
Understanding Distribution Structure on Calibrated Recommendation Systems
Towards safe control parameter tuning in distributed multi-agent systems
ViExam: Are Vision Language Models Better than Humans on Vietnamese Multimodal Exam Questions?
Know Me by My Pulse: Toward Practical Continuous Authentication on Wearable Devices via Wrist-Worn PPG
Optimizing Region of Interest Selection for Effective Embedding in Video Steganography Based on Genetic Algorithms
Unsupervised Urban Tree Biodiversity Mapping from Street-Level Imagery Using Spatially-Aware Visual Clustering
Smooth Flow Matching
Online Conformal Selection with Accept-to-Reject Changes
Generalisation and benign over-fitting for linear regression onto random functional covariates
A PC Algorithm for Max-Linear Bayesian Networks
Uncertainty-Aware PCA for Arbitrarily Distributed Data Modeled by Gaussian Mixture Models
Machine Learning H-theorem
Mask and Restore: Blind Backdoor Defense at Test Time with Masked Autoencoder
Disentangled Representation Learning with the Gromov-Monge Gap
Correlations Are Ruining Your Gradient Descent
FDR-SVM: A Federated Distributionally Robust Support Vector Machine via a Mixture of Wasserstein Balls Ambiguity Set
A Causal Graph-Enhanced Gaussian Process Regression for Modeling Engine-out NOx
Rethinking Weight-Averaged Model-merging
High-Order Tensor Regression in Sparse Convolutional Neural Networks
Environmental Feature Engineering and Statistical Validation for ML-Based Path Loss Prediction
Closed-Form Feedback-Free Learning with Forward Projection
Joint Learning of Energy-based Models and their Partition Function
Enhancing Cost Efficiency in Active Learning with Candidate Set Query
Recommendations with Sparse Comparison Data: Provably Fast Convergence for Nonconvex Matrix Factorization
A kinetic-based regularization method for data science applications
Performance Comparisons of Reinforcement Learning Algorithms for Sequential Experimental Design
Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data
Incorporating Attributes and Multi-Scale Structures for Heterogeneous Graph Contrastive Learning
Reinforcement Learning for Solving the Pricing Problem in Column Generation: Applications to Vehicle Routing
Can Masked Autoencoders Also Listen to Birds?
MEGA: Second-Order Gradient Alignment for Catastrophic Forgetting Mitigation in GFSCIL
Always Skip Attention
Epistemic Wrapping for Uncertainty Quantification
Quiet Feature Learning in Algorithmic Tasks
Good Things Come in Pairs: Paired Autoencoders for Inverse Problems
Bidirectional Information Flow (BIF) -- A Sample Efficient Hierarchical Gaussian Process for Bayesian Optimization
Flexible Operator Fusion for Fast Sparse Transformer with Diverse Masking on GPU
SymMatika: Structure-Aware Symbolic Discovery
PinFM: Foundation Model for User Activity Sequences at a Billion-scale Visual Discovery Platform
Improving DAPO from a Mixed-Policy Perspective
A Comprehensive Re-Evaluation of Biometric Modality Properties in the Modern Era
Revisiting Diffusion Q-Learning: From Iterative Denoising to One-Step Action Generation
Automated Energy-Aware Time-Series Model Deployment on Embedded FPGAs for Resilient Combined Sewer Overflow Management
How Usable is Automated Feature Engineering for Tabular Data?
Convergent Reinforcement Learning Algorithms for Stochastic Shortest Path Problem
AutoScale: Linear Scalarization Guided by Multi-Task Optimization Metrics
Multi-User Contextual Cascading Bandits for Personalized Recommendation
Formal Algorithms for Model Efficiency
GDNSQ: Gradual Differentiable Noise Scale Quantization for Low-bit Neural Networks
Typed Topological Structures Of Datasets
Enhancing Visual Reliance in Text Generation: A Bayesian Perspective on Mitigating Hallucination in Large Vision-Language Models
Recipes for Pre-training LLMs with MXFP8
PlantDeBERTa: An Open Source Language Model for Plant Science
ConTextTab: A Semantics-Aware Tabular In-Context Learner
Scaling Intelligence: Designing Data Centers for Next-Gen Language Models
Neural Cellular Automata for ARC-AGI
Segment Anything in Pathology Images with Natural Language
Tensor Program Optimization for the RISC-V Vector Extension Using Probabilistic Programs
Identify, Isolate, and Purge: Mitigating Hallucinations in LVLMs via Self-Evolving Distillation
Penalizing Infeasible Actions and Reward Scaling in Reinforcement Learning with Offline Data
Spatial-Temporal Transformer with Curriculum Learning for EEG-Based Emotion Recognition
BERT-VQA: Visual Question Answering on Plots
Strategies for training point distributions in physics-informed neural networks
A Recurrent Neural Network based Clustering Method for Binary Data Sets in Education
RISE: Enhancing VLM Image Annotation with Self-Supervised Reasoning
Data driven feedback linearization of nonlinear control systems via Lie derivatives and stacked regression approach
Physically Plausible Data Augmentations for Wearable IMU-based Human Activity Recognition Using Physics Simulation
Towards Human-AI Complementarity in Matching Tasks
Efficient Constraint-Aware Flow Matching via Randomized Exploration
Decoding Communications with Partial Information
X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms
Dimension lower bounds for linear approaches to function approximation
Adaptive Conformal Prediction Intervals Over Trajectory Ensembles
Batching-Aware Joint Model Onloading and Offloading for Hierarchical Multi-Task Inference
NovoMolGen: Rethinking Molecular Language Model Pretraining
Decentralized Contextual Bandits with Network Adaptivity
MAVIS: Multi-Objective Alignment via Value-Guided Inference-Time Search
ASAP: Unsupervised Post-training with Label Distribution Shift Adaptive Learning Rate
Hierarchy-Consistent Learning and Adaptive Loss Balancing for Hierarchical Multi-Label Classification
Classifying Clinical Outcome of Epilepsy Patients with Ictal Chirp Embeddings
DyMixOp: Guiding Neural Operator Design for PDEs from a Complex Dynamics Perspective with Local-Global-Mixing
Uncertainty Tube Visualization of Particle Trajectories
Explainability of Algorithms
MuFlex: A Scalable, Physics-based Platform for Multi-Building Flexibility Analysis and Coordination
CALYPSO: Forecasting and Analyzing MRSA Infection Patterns with Community and Healthcare Transmission Dynamics
Prediction of Hospital Associated Infections During Continuous Hospital Stays
A Generalized Learning Framework for Self-Supervised Contrastive Learning
Approximate Bayesian Inference via Bitstring Representations
Text2Weight: Bridging Natural Language and Neural Network Weight Spaces
Explainable Learning Rate Regimes for Stochastic Optimization
Personalized Subgraph Federated Learning with Sheaf Collaboration
MACTAS: Self-Attention-Based Module for Inter-Agent Communication in Multi-Agent Reinforcement Learning
Heavy-tailed Linear Bandits: Adversarial Robustness, Best-of-both-worlds, and Beyond
Minimizing the Weighted Number of Tardy Jobs: Data-Driven Heuristic for Single-Machine Scheduling
Trans-XFed: An Explainable Federated Learning for Supply Chain Credit Assessment
DREAMS: Preserving both Local and Global Structure in Dimensionality Reduction
Order Optimal Regret Bounds for Sharpe Ratio Optimization in the Bandit Setting
Communication-Efficient Federated Learning with Adaptive Number of Participants
Reinforcement Learning-based Adaptive Path Selection for Programmable Networks
Disentangled Deep Smoothed Bootstrap for Fair Imbalanced Regression
FedUP: Efficient Pruning-based Federated Unlearning for Model Poisoning Attacks
The AI Risk Spectrum: From Dangerous Capabilities to Existential Threats
Generics and Default Reasoning in Large Language Models
Prediction is not Explanation: Revisiting the Explanatory Capacity of Mapping Embeddings
On the Security and Privacy of Federated Learning: A Survey with Attacks, Defenses, Frameworks, Applications, and Future Directions
Mitigating Cross-Image Information Leakage in LVLMs for Multi-Image Tasks
Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration
COMPASS: A Multi-Dimensional Benchmark for Evaluating Code Generation in Large Language Models
PENGUIN: Enhancing Transformer with Periodic-Nested Group Attention for Long-term Time Series Forecasting
Agentic DraCor and the Art of Docstring Engineering: Evaluating MCP-empowered LLM Usage of the DraCor API
Comparing Conditional Diffusion Models for Synthesizing Contrast-Enhanced Breast MRI from Pre-Contrast Images
DegDiT: Controllable Audio Generation with Dynamic Event Graph Guided Diffusion Transformer
BetaWeb: Towards a Blockchain-enabled Trustworthy Agentic Web
A Fully Transformer Based Multimodal Framework for Explainable Cancer Image Segmentation Using Radiology Reports
Prompt-Based One-Shot Exact Length-Controlled Generation with LLMs
Assessing Trustworthiness of AI Training Dataset using Subjective Logic -- A Use Case on Bias
The illusion of a perfect metric: Why evaluating AI's words is harder than it looks
Extracting Structured Requirements from Unstructured Building Technical Specifications for Building Information Modeling
One Shot vs. Iterative: Rethinking Pruning Strategies for Model Compression
UniECS: Unified Multimodal E-Commerce Search Framework with Gated Cross-modal Fusion
A Novel Attention-Augmented Wavelet YOLO System for Real-time Brain Vessel Segmentation on Transcranial Color-coded Doppler
Toward Deployable Multi-Robot Collaboration via a Symbolically-Guided Decision Transformer
Fisher-Orthogonal Projection Methods for Natural Gradient Descent with Large Batches
Categorical Policies: Multimodal Policy Learning and Exploration in Continuous Control
InPars+: Supercharging Synthetic Data Generation for Information Retrieval Systems
Prompt Orchestration Markup Language
A Mechanism for Mutual Fairness in Cooperative Games with Replicable Resources -- Extended Version
Learning to Use AI for Learning: How Can We Effectively Teach and Measure Prompting Literacy for K-12 Students?
RotBench: Evaluating Multimodal Large Language Models on Identifying Image Rotation
The Social Context of Human-Robot Interactions
Chunks as Arms: Multi-Armed Bandit-Guided Sampling for Long-Context LLM Preference Optimization
Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation
ASDFormer: A Transformer with Mixtures of Pooling-Classifier Experts for Robust Autism Diagnosis and Biomarker Discovery
Evaluating Identity Leakage in Speaker De-Identification Systems
Efficient Knowledge Graph Unlearning with Zeroth-order Information
Ask Good Questions for Large Language Models
Unintended Misalignment from Agentic Fine-Tuning: Risks and Mitigation
GeoSAM2: Unleashing the Power of SAM2 for 3D Part Segmentation
LEGO-GraphRAG: Modularizing Graph-based Retrieval-Augmented Generation for Design Space Exploration
Where to Go Next Day: Multi-scale Spatial-Temporal Decoupled Model for Mid-term Human Mobility Prediction
VRoPE: Rotary Position Embedding for Video Large Language Models
The StudyChat Dataset: Student Dialogues With ChatGPT in an Artificial Intelligence Course
GoAI: Enhancing AI Students' Learning Paths and Idea Generation via Graph of AI Ideas
Hawkeye: Efficient Reasoning with Model Collaboration
Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning
Trust, but verify
Hierarchical Reinforcement Learning in Multi-Goal Spatial Navigation with Autonomous Mobile Robots
It's the Thought that Counts: Evaluating the Attempts of Frontier LLMs to Persuade on Harmful Topics
Language-Guided Multi-Agent Learning in Simulations: A Unified Framework and Evaluation
Modeling the Diachronic Evolution of Legal Norms: An LRMoo-Based, Component-Level Approach
Efficient Network Automatic Relevance Determination
Dispositions and Roles of Generically Dependent Entities
Towards Urban Planning AI Agent in the Age of Agentic AI
Data-Efficient Safe Policy Improvement Using Parametric Structure
Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication
LEGO: Learning and Graph-Optimized Modular Tracker for Online Multi-Object Tracking with Point Clouds
"I see models being a whole other thing": An Empirical Study of Pre-Trained Model Naming Conventions and A Tool for Enhancing Naming Consistency
Radio Map Estimation: Empirical Validation and Analysis
Joint Problems in Learning Multiple Dynamical Systems
Fusing Echocardiography Images and Medical Records for Continuous Patient Stratification
iTBLS: A Dataset of Interactive Conversations Over Tabular Information
Iterative Utility Judgment Framework via LLMs Inspired by Relevance in Philosophy
Boolean Matrix Logic Programming on the GPU
Vision Backbone Efficient Selection for Image Classification in Low-Data Regimes
SSD-TS: Exploring the Potential of Linear State Space Models for Diffusion Models in Time Series Imputation
Script-Strategy Aligned Generation: Aligning LLMs with Expert-Crafted Dialogue Scripts and Therapeutic Strategies for Psychotherapy
DDD-GenDT: Dynamic Data-driven Generative Digital Twin Framework
Setup Once, Secure Always: A Single-Setup Secure Federated Learning Aggregation Protocol with Forward and Backward Secrecy for Dynamic Users
Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling
Parameter-Efficient Continual Fine-Tuning: A Survey
POPri: Private Federated Learning using Preference-Optimized Synthetic Data
Hallucinations and Key Information Extraction in Medical Texts: A Comprehensive Assessment of Open-Source Large Language Models
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors
Position: We Need Responsible, Application-Driven (RAD) AI Research
"Haet Bhasha aur Diskrimineshun": Phonetic Perturbations in Code-Mixed Hinglish to Red-Team LLMs
Sample Complexity of Diffusion Model Training Without Empirical Risk Minimizer Access
G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning
Piano: A Multi-Constraint Pin Assignment-Aware Floorplanner
Sustainable AI Training via Hardware-Software Co-Design on NVIDIA, AMD, and Emerging GPU Architectures
White-Box Reasoning: Synergizing LLM Strategy and gm/Id Data for Automated Analog Circuit Design
Toward an African Agenda for AI Safety
Using Artificial Intuition in Distinct, Minimalist Classification of Scientific Abstracts for Management of Technology Portfolios
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
Combating Homelessness Stigma with LLMs: A New Multi-Modal Dataset for Bias Detection
Preference Models assume Proportional Hazards of Utilities
Contextual Attention-Based Multimodal Fusion of LLM and CNN for Sentiment Analysis
The Rise of Generative AI for Metal-Organic Framework Design and Synthesis
Benchmarking LLM-based Agents for Single-cell Omics Analysis
Utilizing the RAIN method and Graph SAGE Model to Identify Effective Drug Combinations for Gastric Neoplasm Treatment
Research on Conversational Recommender System Considering Consumer Types
Too Easily Fooled? Prompt Injection Breaks LLMs on Frustratingly Simple Multiple-Choice Questions
Deep Graph Neural Point Process For Learning Temporal Interactive Networks
MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols
MIRAGE: Towards AI-Generated Image Detection in the Wild
PreSem-Surf: RGB-D Surface Reconstruction with Progressive Semantic Modeling and SG-MLP Pre-Rendering Mechanism
Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System
The Role of AI in Facilitating Interdisciplinary Collaboration: Evidence from AlphaFold
Uncertainty-Aware Learning Policy for Reliable Pulmonary Nodule Detection on Chest X-Ray
Quantifying Loss Aversion in Cyber Adversaries via LLM Analysis
Involuntary Jailbreak
Goal-Directedness is in the Eye of the Beholder
ViTAD: Timing Violation-Aware Debugging of RTL Code using Large Language Models
Hierarchical Conformal Classification
GaitCrafter: Diffusion Model for Biometric Preserving Gait Synthesis
Diff-MSM: Differentiable MusculoSkeletal Model for Simultaneous Identification of Human Muscle and Bone Parameters
A Surveillance Based Interactive Robot
A Dual-Attention Graph Network for fMRI Data Classification
Counterfactual Probabilistic Diffusion with Expert Models
Overcoming Latency Bottlenecks in On-Device Speech Translation: A Cascaded Approach with Alignment-Based Streaming MT
Whispering Context: Distilling Syntax and Semantics for Long Speech Transcripts
Datarus-R1: An Adaptive Multi-Step Reasoning LLM for Automated Data Analysis
Semi-Supervised Anomaly Detection Pipeline for SOZ Localization Using Ictal-Related Chirp
AdaptJobRec: Enhancing Conversational Career Recommendation through an LLM-Powered Agentic System
ALIGN: Word Association Learning for Cross-Cultural Generalization in Large Language Models
Mitigating Easy Option Bias in Multiple-Choice Question Answering
AlphaX: An AI-Based Value Investing Strategy for the Brazilian Stock Market
EventTSF: Event-Aware Non-Stationary Time Series Forecasting
SVDformer: Direction-Aware Spectral Graph Embedding Learning via SVD and Transformer
Dynamic Design of Machine Learning Pipelines via Metalearning
Structured Prompting and Multi-Agent Knowledge Distillation for Traffic Video Interpretation and Risk Inference
Consumer Autonomy or Illusion? Rethinking Consumer Agency in the Age of Algorithms
STER-VLM: Spatio-Temporal With Enhanced Reference Vision-Language Models
CORENet: Cross-Modal 4D Radar Denoising Network with LiDAR Supervision for Autonomous Driving
LLM-Enhanced Linear Autoencoders for Recommendation
ProMed: Shapley Information Gain Guided Reinforcement Learning for Proactive Medical LLMs
Heterogeneous Influence Maximization in User Recommendation
Calibrating Biased Distribution in VFM-derived Latent Space via Cross-Domain Geometric Consistency
DDoS Attacks in Cloud Computing: Detection and Prevention
Evaluating Open-Source Vision Language Models for Facial Emotion Recognition against Traditional Deep Learning Models
MimicFunc: Imitating Tool Manipulation from a Single Human Video via Functional Correspondence
EAvatar: Expression-Aware Head Avatar Reconstruction with Generative Geometry Priors
FLAIR: Frequency- and Locality-Aware Implicit Neural Representations
Collapsing ROC approach for risk prediction research on both common and rare variants
Physics-Informed Neural Networks for Programmable Origami Metamaterials with Controlled Deployment
The 9th AI City Challenge
End-to-End Audio-Visual Learning for Cochlear Implant Sound Coding in Noisy Environments
A Comparative Study of Decoding Strategies in Medical Text Generation
Who Gets the Mic? Investigating Gender Bias in the Speaker Assignment of a Speech-LLM
Bounding Causal Effects and Counterfactuals
Towards a Larger Model via One-Shot Federated Learning on Heterogeneous Client Models
GRAFT: Gradient-Aware Fast MaxVol Technique for Dynamic Data Sampling
Input Time Scaling
In-Context Decision Making for Optimizing Complex AutoML Pipelines
Multi-Plasticity Synergy with Adaptive Mechanism Assignment for Training Spiking Neural Networks
Knowledge Graph Completion for Action Prediction on Situational Graphs -- A Case Study on Household Tasks
MHSNet: An MoE-based Hierarchical Semantic Representation Network for Accurate Duplicate Resume Detection with Large Language Model
Neuro-Symbolic Artificial Intelligence: Towards Improving the Reasoning Abilities of Large Language Models
The DeepLog Neurosymbolic Machine
CausalPlan: Empowering Efficient LLM Multi-Agent Collaboration Through Causality-Driven Planning
Expertise-aware Multi-LLM Recruitment and Collaboration for Medical Decision-Making
Quantifier Instantiations: To Mimic or To Revolt?
Revisiting RAG Ensemble: A Theoretical and Mechanistic Analysis of Multi-RAG System Collaboration
Improved Generalized Planning with LLMs through Strategy Refinement and Reflection
Structured Agentic Workflows for Financial Time-Series Modeling with LLMs and Reflective Feedback
The Collaboration Paradox: Why Generative AI Requires Both Strategic Intelligence and Operational Stability in Supply Chain Management
ChronoLLM: Customizing Language Models for Physics-Based Simulation Code Generation
A Biased Random Key Genetic Algorithm for Solving the Longest Run Subsequence Problem
ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents
Preliminary suggestions for rigorous GPAI model evaluations
TaoSR1: The Thinking Model for E-commerce Relevance Search
EvoVerilog: Large Language Model Assisted Evolution of Verilog Code
Image2Net: Datasets, Benchmark and Hybrid Framework to Convert Analog Circuit Diagrams into Netlists
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL
Cognitive Workspace: Active Memory Management for LLMs -- An Empirical Study of Functional Infinite Context
AlphaEval: A Comprehensive and Efficient Evaluation Framework for Formula Alpha Mining
Fitting Ontologies and Constraints to Relational Structures
A Hardware-oriented Approach for Efficient Active Inference Computation and Deployment
The Interpretability Analysis of the Model Can Bring Improvements to the Text-to-SQL Task
Search-Time Data Contamination
QuickMerge++: Fast Token Merging with Autoregressive Prior
AI sustains higher strategic tension than humans in chess
Explicit vs. Implicit Memory: Exploring Multi-hop Complex Reasoning Over Personalized Information
"DIVE" into Hydrogen Storage Materials Discovery with AI Agents
CardAIc-Agents: A Multimodal Framework with Hierarchical Adaptation for Cardiac Care Support
Towards Unified Multimodal Financial Forecasting: Integrating Sentiment Embeddings and Market Indicators via Cross-Modal Attention
HiFo-Prompt: Prompting with Hindsight and Foresight for LLM-based Automatic Heuristic Design
LOOP: A Plug-and-Play Neuro-Symbolic Framework for Enhancing Planning in Autonomous Systems
SPANER: Shared Prompt Aligner for Multimodal Semantic Representation
TASER: Table Agents for Schema-guided Extraction and Recommendation
Virtuous Machines: Towards Artificial General Science
STPFormer: A State-of-the-Art Pattern-Aware Spatio-Temporal Transformer for Traffic Forecasting
Discrete Optimization of Min-Max Violation and its Applications Across Computational Sciences
LM Agents May Fail to Act on Their Own Risk Knowledge
CrafterDojo: A Suite of Foundation Models for Building Open-Ended Embodied Agents in Crafter
Toward Better EHR Reasoning in LLMs: Reinforcement Learning with Expert Attention Guidance
Breaking the SFT Plateau: Multimodal Structured Reinforcement Learning for Chart-to-Code Generation
V2P: From Background Suppression to Center Peaking for Robust GUI Grounding Task
Interactive Query Answering on Knowledge Graphs with Soft Entity Constraints
ITL-LIME: Instance-Based Transfer Learning for Enhancing Local Explanations in Low-Resource Data Settings

Research Sources: 475 | Generated: 8/25/2025