AI Research News Feeds for August 14th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

A deep learning model with machine vision system for recognizing type of the food during the food consumption
A dataset of high-resolution plantar pressures for gait analysis across varying footwear and walking speeds
Distributional Sensitivity Analysis: Enabling Differentiability in Sample-Based Inference
Learning topological states from randomized measurements using variational tensor network tomography
Variance-Reduced Fast Operator Splitting Methods for Generalized Equations
HiFi-Mamba: Dual-Stream W-Laplacian Enhanced Mamba for High-Fidelity MRI Reconstruction
MedPatch: Confidence-Guided Multi-Stage Fusion for Multimodal Clinical Data
Dynamic Survival Prediction using Longitudinal Images based on Transformer
DAgger Diffusion Navigation: DAgger Boosted Diffusion Policy for Vision-Language Navigation
Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
Robustness analysis of Deep Sky Objects detection models on HPC
Toward Human-Robot Teaming: Learning Handover Behaviors from 3D Scenes
Prompt-aligned Gradient for Prompt Tuning
Debiased Fine-Tuning for Vision-language Models by Prompt Regularization
Ear-Keeper: A Cross-Platform AI System for Rapid and Accurate Ear Disease Diagnosis
STAC: Leveraging Spatio-Temporal Data Associations For Efficient Cross-Camera Streaming and Analytics
Are you Struggling? Dataset and Baselines for Struggle Determination in Assembly Videos
Revisiting 3D Medical Scribble Supervision: Benchmarking Beyond Cardiac Segmentation
From Few to More: Scribble-based Medical Image Segmentation via Masked Context Modeling and Continuous Pseudo Labels
ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs
Joint multi-dimensional dynamic attention and transformer for general image restoration
ViewDelta: Scaling Scene Change Detection through Text-Conditioning
UltraRay: Introducing Full-Path Ray Tracing in Physics-Based Ultrasound Simulation
HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation
Image Intrinsic Scale Assessment: Bridging the Gap Between Quality and Resolution
Towards Synthesized and Editable Motion In-Betweening Through Part-Wise Phase Representation
GranQ: Granular Zero-Shot Quantization with Channel-Wise Activation Scaling in QAT
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models
NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations
Cyc3D: Fine-grained Controllable 3D Generation via Cycle Consistency Regularization
Scaling Vision Mamba Across Resolutions via Fractal Traversal
CAS-IQA: Teaching Vision-Language Models for Synthetic Angiography Quality Assessment
MultiFormer: A Multi-Person Pose Estimation System Based on CSI and Attention Mechanism
Follow-Your-Motion: Video Motion Transfer via Efficient Spatial-Temporal Decoupled Finetuning
SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence
3D Gaussian Splatting Driven Multi-View Robust Physical Adversarial Camouflage Generation
Calibrated Self-supervised Vision Transformers Improve Intracranial Arterial Calcification Segmentation from Clinical CT Head Scans
Modulate and Reconstruct: Learning Hyperspectral Imaging from Misaligned Smartphone Views
Learning Adaptive Node Selection with External Attention for Human Interaction Recognition
RoHOI: Robustness Benchmark for Human-Object Interaction Detection
Advancing Reliable Test-Time Adaptation of Vision-Language Models under Visual Variations
HRSeg: High-Resolution Visual Perception and Enhancement for Reasoning Segmentation
FROST-BRDF: A Fast and Robust Optimal Sampling Technique for BRDF Acquisition
Lung-DDPM: Semantic Layout-guided Diffusion Models for Thoracic CT Image Synthesis
RAGAR: Retrieval Augmented Personalized Image Generation Guided by Recommendation
Emotion-Qwen: A Unified Framework for Emotion and Vision Understanding
DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes
Distilling LLM Prior to Flow Model for Generalizable Agent's Imagination in Object Goal Navigation
MPT: Motion Prompt Tuning for Micro-Expression Recognition
RASR: Retrieval-Augmented Super Resolution for Practical Reference-based Image Restoration
Animate-X++: Universal Character Image Animation with Dynamic Backgrounds
Event-driven Robust Fitting on Neuromorphic Hardware
CitySeg: A 3D Open Vocabulary Semantic Segmentation Foundation Model in City-scale Scenarios
Leveraging Failed Samples: A Few-Shot and Training-Free Framework for Generalized Deepfake Detection
From Large Angles to Consistent Faces: Identity-Preserving Video Generation via Mixture of Facial Experts
CLIP-Flow: A Universal Discriminator for AI-Generated Images Inspired by Anomaly Detection
GazeLT: Visual attention-guided long-tailed disease classification in chest radiographs
SkySplat: Generalizable 3D Gaussian Splatting from Multi-Temporal Sparse Satellite Images
SARE: Semantic-Aware Reconstruction Error for Generalizable Diffusion-Generated Image Detection
SOI is the Root of All Evil: Quantifying and Breaking Similar Object Interference in Single Object Tracking
Learning Spatial Decay for Vision Transformers
Physics-guided Deep Unfolding Network for Enhanced Kronecker Compressive sensing
Iterative Volume Fusion for Asymmetric Stereo Matching
Exploring the Equivalence of Closed-Set Generative and Real Data Augmentation in Image Classification
Topological Invariant-Based Iris Identification via Digital Homology and Machine Learning
WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
WEC-DG: Multi-Exposure Wavelet Correction Method Guided by Degradation Description
A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation
Dual Recursive Feedback on Generation and Appearance Latents for Pose-Robust Text-to-Image Diffusion
SHALE: A Scalable Benchmark for Fine-grained Hallucination Evaluation in LVLMs
Offline Auto Labeling: BAAS
SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing
Images Speak Louder Than Scores: Failure Mode Escape for Enhancing Generative Quality
BridgeTA: Bridging the Representation Gap in Knowledge Distillation via Teacher Assistant for Bird's Eye View Map Segmentation
Plane Detection and Ranking via Model Information Optimization
Semantic-aware DropSplat: Adaptive Pruning of Redundant Gaussians for 3D Aerial-View Segmentation
Enhancing Monocular 3D Hand Reconstruction with Learned Texture Priors
Multi-Contrast Fusion Module: An attention mechanism integrating multi-contrast features for fetal torso plane classification
Multi-Sequence Parotid Gland Lesion Segmentation via Expert Text-Guided Segment Anything Model
The Brain Resection Multimodal Image Registration (ReMIND2Reg) 2025 Challenge
TOTNet: Occlusion-Aware Temporal Tracking for Robust Ball Detection in Sports Videos
Noise-adapted Neural Operator for Robust Non-Line-of-Sight Imaging
NegFaceDiff: The Power of Negative Context in Identity-Conditioned Diffusion for Synthetic Face Generation
GSFixer: Improving 3D Gaussian Splatting with Reference-Guided Video Diffusion Priors
PaCo-FR: Patch-Pixel Aligned End-to-End Codebook Learning for Facial Representation Pre-training
Slot Attention-based Feature Filtering for Few-Shot Learning
MangaDiT: Reference-Guided Line Art Colorization with Hierarchical Attention in Diffusion Transformers
Predictive Uncertainty for Runtime Assurance of a Real-Time Computer Vision-Based Landing System
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory
MoIIE: Mixture of Intra- and Inter-Modality Experts for Large Vision Language Models
DSS-Prompt: Dynamic-Static Synergistic Prompting for Few-Shot Class-Incremental Learning
MeMoSORT: Memory-Assisted Filtering and Motion-Adaptive Association Metric for Multi-Person Tracking
MUJICA: Reforming SISR Models for PBR Material Super-Resolution via Cross-Map Attention
Poaching Hotspot Identification Using Satellite Imagery
Evolution of Low-Level and Texture Human-CLIP Alignment
ViMoNet: A Multimodal Vision-Language Framework for Human Behavior Understanding from Motion and Video
Physical Autoregressive Model for Robotic Manipulation without Action Pretraining
KonfAI: A Modular and Fully Configurable Framework for Deep Learning in Medical Imaging
Reverse Convolution and Its Applications to Image Restoration
Hierarchical Graph Attention Network for No-Reference Omnidirectional Image Quality Assessment
Enhancing Diffusion Face Generation with Contrastive Embeddings and SegFormer Guidance
ARI3D: A Software for Interactive Quantification of Regions in X-Ray CT 3D Images
Do Vision Transformers See Like Humans? Evaluating their Perceptual Alignment
OneVAE: Joint Discrete and Continuous Optimization Helps Discrete Video VAE Train Better
HumanGenesis: Agent-Based Geometric and Generative Modeling for Synthetic Human Dynamics
E-4DGS: High-Fidelity Dynamic Reconstruction from the Multi-view Event Cameras
SpeechForensics: Audio-Visual Speech Representation Learning for Face Forgery Detection
Towards Comprehensive Cellular Characterisation of H&E slides
Quo Vadis Handwritten Text Generation for Handwritten Text Recognition?
AST-n: A Fast Sampling Approach for Low-Dose CT Reconstruction using Diffusion Models
LIA-X: Interpretable Latent Portrait Animator
MOC: Meta-Optimized Classifier for Few-Shot Whole Slide Image Classification
PERSONA: Personalized Whole-Body 3D Avatar with Pose-Driven Deformations from a Single Image
A Survey on 3D Gaussian Splatting Applications: Segmentation, Editing, and Generation
LLMC+: Benchmarking Vision-Language Model Compression with a Plug-and-play Toolkit
From Stars to Insights: Exploration and Implementation of Unified Sentiment Analysis with Distant Supervision
LongIns: A Challenging Long-context Instruction-based Exam for LLMs
Memorization Over Reasoning? Exposing and Mitigating Verbatim Memorization in Large Language Models' Character Understanding Evaluation
Efficient Inference for Large Reasoning Models: A Survey
Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models
Non-native Children's Automatic Speech Assessment Challenge (NOCASA)
IP-CRR: Information Pursuit for Interpretable Classification of Chest Radiology Reports
LogicCat: A Chain-of-Thought Text-to-SQL Benchmark for Complex Reasoning
MemGuide: Intent-Driven Memory Selection for Goal-Oriented Multi-Session LLM Agents
DefenderBench: A Toolkit for Evaluating Language Agents in Cybersecurity Environments
Applying Text Embedding Models for Efficient Analysis in Labeled Property Graphs
Synthetic Data Generation for Emotional Depth Faces: Optimizing Conditional DCGANs via Genetic Algorithms in the Latent Space and Stabilizing Training with Knowledge Distillation
FineState-Bench: A Comprehensive Benchmark for Fine-Grained State Control in GUI Agents
Beyond Blanket Masking: Examining Granularity for Privacy Protection in Images Captured by Blind and Low Vision Users
Lung-DDPM+: Efficient Thoracic CT Image Synthesis using Diffusion Probabilistic Model
UltraLight Med-Vision Mamba for Classification of Neoplastic Progression in Tubular Adenomas
Blink-to-code: real-time Morse code communication via eye blink detection and classification
DenoDet V2: Phase-Amplitude Cross Denoising for SAR Object Detection
Skyshield: Event-Driven Submillimetre Thin Obstacle Detection for Drone Flight Safety
Autonomous AI Bird Feeder for Backyard Biodiversity Monitoring
Waymo-3DSkelMo: A Multi-Agent 3D Skeletal Motion Dataset for Pedestrian Interaction Modeling in Autonomous Driving
Decoding Neural Emotion Patterns through Natural Language Processing Embeddings
Flow-SLM: Joint Learning of Linguistic and Acoustic Information for Spoken Language Modeling
Columbo: Expanding Abbreviated Column Names for Tabular Data Using Large Language Models
Leveraging Zipformer Model for Effective Language Identification in Code-Switched Child-Directed Speech
From Charts to Fair Narratives: Uncovering and Mitigating Geo-Economic Biases in Chart-to-Text
User-centric Subjective Leaderboard by Customizable Reward Modeling
LACA: Improving Cross-lingual Aspect-Based Sentiment Analysis with LLM Data Augmentation
Cross-lingual Aspect-Based Sentiment Analysis: A Survey on Tasks, Approaches, and Challenges
UWBa at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval
The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage
AINL-Eval 2025 Shared Task: Detection of AI-Generated Scientific Abstracts in Russian
EffiEval: Efficient and Generalizable Model Evaluation via Capability Coverage Maximization
Slow Tuning and Low-Entropy Masking for Safe Chain-of-Thought Distillation
The Perils of Chart Deception: How Misleading Visualizations Affect Vision-Language Models
Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation
Echoes of Agreement: Argument Driven Opinion Shifts in Large Language Models
UtterTune: LoRA-Based Target-Language Pronunciation Edit and Control in Multilingual Text-to-Speech
BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning
Assessing the Feasibility of Lightweight Whisper Models for Low-Resource Urdu Transcription
A Survey of Cognitive Distortion Detection and Classification in NLP
Language of Persuasion and Misrepresentation in Business Communication: A Textual Detection Approach
Shaping Event Backstories to Estimate Potential Emotion Contexts
Performance of GPT-5 Frontier Models in Ophthalmology Question Answering
Which one Performs Better? Wav2Vec or Whisper? Applying both in Badini Kurdish Speech to Text (BKSTT)
IAG: Input-aware Backdoor Attack on VLMs for Visual Grounding
Enhancing Deep Hedging of Options with Implied Volatility Surface Feedback Information
VulScribeR: Exploring RAG-based Vulnerability Augmentation with LLMs
On the Robustness of Kernel Goodness-of-Fit Tests
Leveraging Reviewer Experience in Code Review Comment Generation
A spectral method for multi-view subspace learning using the product of projections
Improving Multimodal Large Language Models Using Continual Learning
Learning Whole-Body Loco-Manipulation for Omni-Directional Task Space Pose Tracking with a Wheeled-Quadrupedal-Manipulator
A2SB: Audio-to-Audio Schrodinger Bridges
Gradient Descent Algorithm in Hilbert Spaces under Stationary Markov Chains with $\phi$- and $\beta$-Mixing
RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression
Verifying Quantized Graph Neural Networks is PSPACE-complete
Generative Active Adaptation for Drifting and Imbalanced Network Intrusion Detection
Cryo-em images are intrinsically low dimensional
ParkDiffusion: Heterogeneous Multi-Agent Multi-Modal Trajectory Prediction for Automated Parking using Diffusion Models
Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale datasets for Responsible LLMs
M-learner: A Flexible And Powerful Framework To Study Heterogeneous Treatment Effect In Mediation Model
MoCA: Multi-modal Cross-masked Autoencoder for Digital Health Measurements
AbRank: A Benchmark Dataset and Metric-Learning Framework for Antibody-Antigen Affinity Ranking
MetaCipher: A Time-Persistent and Universal Multi-Agent Framework for Cipher-Based Jailbreak Attacks for LLMs
Efficient Visual Appearance Optimization by Learning from Prior Preferences
Understanding Nonlinear Implicit Bias via Region Counts in Input Space
Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning
Mini-Game Lifetime Value Prediction in WeChat
Leveraging Predictive Equivalence in Decision Trees
Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges
Faster Diffusion Models via Higher-Order Approximation
Quantum Machine Learning in Transportation: A Case Study of Pedestrian Stress Modelling
Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
PrAViC: Probabilistic Adaptation Framework for Real-Time Video Classification
Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy
Importance Corrected Neural JKO Sampling
MiCo: End-to-End Mixed Precision Neural Network Co-Exploration Framework for Edge AI
Causal Graph Profiling via Structural Divergence for Robust Anomaly Detection in Cyber-Physical Systems
Enhancing Memory Recall in LLMs with Gauss-Tin: A Hybrid Instructional and Gaussian Replay Approach
Time-Aware and Transition-Semantic Graph Neural Networks for Interpretable Predictive Business Process Monitoring
SYNAPSE-G: Bridging Large Language Models and Graph Learning for Rare Event Classification
Edge General Intelligence Through World Models and Agentic AI: Fundamentals, Solutions, and Challenges
Online Prediction with Limited Selectivity
Physics- and geometry-aware spatio-spectral graph neural operator for time-independent and time-dependent PDEs
Thermal Tracks: A Gaussian process-based framework for universal melting curve analysis enabling unconstrained hit identification in thermal proteome profiling experiments
Global Convergence Analysis of Vanilla Gradient Descent for Asymmetric Matrix Completion
Temporal Anchoring in Deepening Embedding Spaces: Event-Indexed Projections, Drift, Convergence, and an Internal Computational Architecture
Combating Noisy Labels via Dynamic Connection Masking
GraphTreeGen: Subtree-Centric Approach to Efficient and Supervised Graph Generation
Generative Modeling with Multi-Instance Reward Learning for E-commerce Creative Optimization
HKT: A Biologically Inspired Framework for Modular Hereditary Knowledge Transfer in Neural Networks
A Machine Learning Approach to Predict Biological Age and its Longitudinal Drivers
$\mu$-Parametrization for Mixture of Experts
TriForecaster: A Mixture of Experts Framework for Multi-Region Electric Load Forecasting with Tri-dimensional Specialization
Bayesian autoregression to optimize temporal Matérn kernel Gaussian process hyperparameters
Feature Impact Analysis on Top Long-Jump Performances with Quantile Random Forest and Explainable AI Techniques
RankList -- A Listwise Preference Learning Framework for Predicting Subjective Preferences
FedShard: Federated Unlearning with Efficiency Fairness and Performance Fairness
Modern Neural Networks for Small Tabular Datasets: The New Default for Field-Scale Digital Soil Mapping?
Prototype-Guided Diffusion: Visual Conditioning without External Memory
Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models
Dynamic Mixture-of-Experts for Incremental Graph Learning
RadioMamba: Breaking the Accuracy-Efficiency Trade-off in Radio Map Construction via a Hybrid Mamba-UNet
GANime: Generating Anime and Manga Character Drawings from Sketches with Deep Learning
Exploring Molecular Odor Taxonomies for Structure-based Odor Predictions using Machine Learning
Objective Soups: Multilingual Multi-Task Modeling for Speech Processing
Forecasting Binary Economic Events in Modern Mercantilism: Traditional methodologies coupled with PCA and K-means Quantitative Analysis of Qualitative Sentimental Data
Harnessing Input-Adaptive Inference for Efficient VLN
A Generative Imputation Method for Multimodal Alzheimer's Disease Diagnosis
Teaching Code Refactoring Using LLMs
Classifying Cool Dwarfs: Comprehensive Spectral Typing of Field and Peculiar Dwarfs Using Machine Learning
ProMode: A Speech Prosody Model Conditioned on Acoustic and Textual Inputs
A pseudo-inverse of a line graph
HyperKD: Distilling Cross-Spectral Knowledge in Masked Autoencoders via Inverse Domain Shift with Spatial-Aware Masking and Specialized Loss
CWFBind: Geometry-Awareness for Fast and Accurate Protein-Ligand Docking
DeepWKB: Learning WKB Expansions of Invariant Distributions for Stochastic Systems
Emergence of Hierarchies in Multi-Agent Self-Organizing Systems Pursuing a Joint Objective
HierMoE: Accelerating MoE Training with Hierarchical Token Deduplication and Expert Swap
Scalable h-adaptive probabilistic solver for time-independent and time-dependent systems
Personalized Product Search Ranking: A Multi-Task Learning Approach with Tabular and Non-Tabular Data
Improving Diversity in Language Models: When Temperature Fails, Change the Loss
Social-Sensor Identity Cloning Detection Using Weakly Supervised Deep Forest and Cryptographic Authentication
DeputyDev -- AI Powered Developer Assistant: Breaking the Code Review Logjam through Contextual AI to Boost Developer Productivity
NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation
Multimodal Sheaf-based Network for Glioblastoma Molecular Subtype Prediction
Structured Kernel Regression VAE: A Computationally Efficient Surrogate for GP-VAEs in ICA
Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning
Improving the Speaker Anonymization Evaluation's Robustness to Target Speakers with Adversarial Learning
On the Generalization Limits of Quantum Generative Adversarial Networks with Pure State Generators
Stable Diffusion Models are Secretly Good at Visual In-Context Learning
Neural Bandit Based Optimal LLM Selection for a Pipeline of Tasks
Story2Board: A Training-Free Approach for Expressive Storyboard Generation
Forecasting steam mass flow in power plants using the parallel hybrid network
Semi-Bandit Learning for Monotone Stochastic Optimization
Discrete Neural Algorithmic Reasoning
No-Regret M$^{\natural}$-Concave Function Maximization: Stochastic Bandit Algorithms and Hardness of Adversarial Full-Information Setting
Sparse Spectral Training and Inference on Euclidean and Hyperbolic Neural Networks
Distributed Lag Transformer based on Time-Variable-Aware Learning for Explainable Multivariate Time Series Forecasting
Federated Learning for Smart Grid: A Survey on Applications and Potential Vulnerabilities
Differentiation Through Black-Box Quadratic Programming Solvers
Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning
Generative Feature Training of Thin 2-Layer Networks
Scalable Out-of-distribution Robustness in the Presence of Unobserved Confounders
Indirect Query Bayesian Optimization with Integrated Feedback
MVICAD2: Multi-View Independent Component Analysis with Delays and Dilations
Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models
Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity
LEAPS: A discrete neural sampler via locally equivariant networks
Fast, Accurate Manifold Denoising by Tunneling Riemannian Optimization
Underdamped Diffusion Bridges with Applications to Sampling
Dequantified Diffusion-Schrödinger Bridge for Density Ratio Estimation
Leveraging Audio and Text Modalities in Mental Health: A Study of LLMs Performance
SLTNet: Efficient Event-based Semantic Segmentation with Spike-driven Lightweight Transformer-based Networks
Evaluation of Bio-Inspired Models under Different Learning Settings For Energy Efficiency in Network Traffic Prediction
Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions
GenAI Confessions: Black-box Membership Inference for Generative Image Models
Benchmarking LLMs' Mathematical Reasoning with Unseen Random Variables Questions
Conformal Prediction of Classifiers with Many Classes based on Noisy Labels
One-shot Optimized Steering Vectors Mediate Safety-relevant Behaviors in LLMs
RIZE: Regularized Imitation Learning via Distributional Reinforcement Learning
Simulating the Real World: A Unified Survey of Multimodal Generative Models
Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs
The Illusory Normativity of Rights-Based AI Regulation
FT-Transformer: Resilient and Reliable Transformer with End-to-End Fault Tolerant Attention
CO-Bench: Benchmarking Language Model Agents in Algorithm Search for Combinatorial Optimization
Mosaic: Composite Projection Pruning for Resource-efficient LLMs
GraspClutter6D: A Large-scale Real-world Dataset for Robust Perception and Grasping in Cluttered Scenes
AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation
FedRecon: Missing Modality Reconstruction in Heterogeneous Distributed Environments
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting
Deep Learning Warm Starts for Trajectory Optimization on the International Space Station
Halting Recurrent GNNs and the Graded $\mu$-Calculus
Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning
MapStory: Prototyping Editable Map Animations with LLM Agents
Exploring Scaling Laws for EHR Foundation Models
Sarc7: Evaluating Sarcasm Detection and Generation with Seven Types and Emotion-Informed Techniques
Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning
Deep Learning Model Acceleration and Optimization Strategies for Real-Time Recommendation Systems
MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection
Poison Once, Control Anywhere: Clean-Text Visual Backdoors in VLM-based Mobile Agents
Open-Set LiDAR Panoptic Segmentation Guided by Uncertainty-Aware Learning
HVL: Semi-Supervised Segmentation leveraging Hierarchical Vision-Language Synergy with Dynamic Text-Spatial Query Alignment
Human Motion Capture from Loose and Sparse Inertial Sensors with Garment-aware Diffusion Models
The Importance of Being Lazy: Scaling Limits of Continual Learning
SWA-SOP: Spatially-aware Window Attention for Semantic Occupancy Prediction in Autonomous Driving
OC-SOP: Enhancing Vision-Based 3D Semantic Occupancy Prediction by Object-Centric Awareness
Beyond Autocomplete: Designing CopilotLens Towards Transparent and Explainable AI Coding Agents
Audio-3DVG: Unified Audio -- Point Cloud Fusion for 3D Visual Grounding
WebArXiv: Evaluating Multimodal Agents on Time-Invariant arXiv Tasks
GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
A multi-strategy improved snake optimizer for three-dimensional UAV path planning and engineering problems
MoLAN: A Unified Modality-Aware Noise Dynamic Editing Framework for Multimodal Sentiment Analysis
Presenting DiaData for Research on Type 1 Diabetes
An Unsupervised Deep XAI Framework for Localization of Concurrent Replay Attacks in Nuclear Reactor Signals
Generating Feasible and Diverse Synthetic Populations Using Diffusion Models
Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images
SVGen: Interpretable Vector Graphics Generation with Large Language Models
Breath as a biomarker: A survey of contact and contactless applications and approaches in respiratory monitoring
Building Safer Sites: A Large-Scale Multi-Level Dataset for Construction Safety Research
The First Differentiable Transfer-Based Algorithm for Discrete MicroLED Repair
Blockchain Network Analysis using Quantum Inspired Graph Neural Networks & Ensemble Models
LLM Empowered Prototype Learning for Zero and Few-Shot Tasks on Tabular Data
Over-Squashing in GNNs and Causal Inference of Rewiring Strategies
Constrained Black-Box Attacks Against Multi-Agent Reinforcement Learning
Pattern-based Knowledge Component Extraction from Student Code Using Representation Learning
Distilling Reinforcement Learning into Single-Batch Datasets
Resurrecting the Salmon: Rethinking Mechanistic Interpretability with Domain-Specific Sparse Autoencoders
Integrating Feature Attention and Temporal Modeling for Collaborative Financial Risk Assessment
Graph Neural Network and Transformer Integration for Unsupervised System Anomaly Discovery
NEXICA: Discovering Road Traffic Causality (Extended arXiv Version)
Open-Set Fault Diagnosis in Multimode Processes via Fine-Grained Deep Feature Representation
Learn to Explore: Meta NAS via Bayesian Optimization Guided Graph Generation
EGGS-PTP: An Expander-Graph Guided Structured Post-training Pruning Method for Large Language Models
Multi-Step Reasoning with Large Language Models, a Survey
Probing Mechanical Reasoning in Large Vision Language Models
Revisiting Your Memory: Reconstruction of Affect-Contextualized Memory via EEG-guided Audiovisual Generation
Analyzing Finetuning Representation Shift for Multimodal LLMs Steering
MedRep: Medical Concept Representation for General Electronic Health Record Foundation Models
Integrating Visual Interpretation and Linguistic Reasoning for Math Problem Solving
GridRoute: A Benchmark for LLM-Based Route Planning with Cardinal Movement in Grid Environments
AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving
MoSE: Skill-by-Skill Mixture-of-Experts Learning for Embodied Autonomous Machines
Game-Theoretic Multiagent Reinforcement Learning
LEAVES: Learning Views for Time-Series Biobehavioral Data in Contrastive Learning
Learning to Defer in Congested Systems: The AI-Human Interplay
From Model Performance to Claim: How a Change of Focus in Machine Learning Replicability Can Help Bridge the Responsibility Gap
Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning Code LLMs
Towards Black-Box Membership Inference Attack for Diffusion Models
LUMA: A Benchmark Dataset for Learning from Uncertain and Multimodal Data
Integrating Clinical Knowledge Graphs and Gradient-Based Neural Systems for Enhanced Melanoma Diagnosis via the 7-Point Checklist
Towards flexible perception with visual memory
SpectralEarth: Training Hyperspectral Foundation Models at Scale
Explaining Caption-Image Interactions in CLIP Models with Second-Order Attributions
CTRQNets & LQNets: Continuous Time Recurrent and Liquid Quantum Neural Networks
Pediatric brain tumor classification using digital histopathology and deep learning: evaluation of SOTA methods on a multi-center Swedish cohort
Episodic Memory Verbalization using Hierarchical Representations of Life-Long Robot Experience
Downscaling Extreme Precipitation with Wasserstein Regularized Diffusion
Retrieval-Augmented Decision Transformer: External Memory for In-context RL
Depth-Guided Self-Supervised Human Keypoint Detection via Cross-Modal Distillation
Learning Characteristics of Reverse Quaternion Neural Network
What Can We Learn from Inter-Annotator Variability in Skin Lesion Segmentation?
X-UniMotion: Animating Human Images with Expressive, Unified and Identity-Agnostic Motion Latents
Understanding Dementia Speech Alignment with Diffusion-Based Image Generation
RampNet: A Two-Stage Pipeline for Bootstrapping Curb Ramp Detection in Streetscape Images from Open Government Metadata
Domain-Generalization to Improve Learning in Meta-Learning Algorithms
Implicit Hypergraph Neural Networks: A Stable Framework for Higher-Order Relational Learning with Provable Guarantees
What-Meets-Where: Unified Learning of Action and Contact Localization in a New Dataset
Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference
A Unified Contrastive-Generative Framework for Time Series Classification
Hallucination vs interpretation: rethinking accuracy and precision in AI-assisted data extraction for knowledge synthesis
RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization
Gen-AFFECT: Generation of Avatar Fine-grained Facial Expressions with Consistent identiTy
DeepFeatIoT: Unifying Deep Learned, Randomized, and LLM Features for Enhanced IoT Time Series Sensor Data Classification in Smart Industries
NeuronTune: Fine-Grained Neuron Modulation for Balanced Safety-Utility Alignment in LLMs
Episodic Memory Representation for Long-form Video Understanding
Large-Small Model Collaborative Framework for Federated Continual Learning
Learning Facts at Scale with Active Reading
From Ranking to Selection: A Simple but Efficient Dynamic Passage Selector for Retrieval Augmented Generation
Verify Distributed Deep Learning Model Implementation Refinement with Iterative Relation Inference
SMART-OC: A Real-time Time-risk Optimal Replanning Algorithm for Dynamic Obstacles and Spatio-temporally Varying Currents
COMPEER: Controllable Empathetic Reinforcement Reasoning for Emotional Support Conversation
Generation of Indian Sign Language Letters, Numbers, and Words
Decentralized Rank Scheduling for Energy-Constrained Multi-Task Federated Fine-Tuning in Edge-Assisted IoV Networks
COXNet: Cross-Layer Fusion with Adaptive Alignment and Scale Integration for RGBT Tiny Object Detection
AI Blob! LLM-Driven Recontextualization of Italian Television Archives
Your Coding Intent is Secretly in the Context and You Should Deliberately Infer It Before Completion
GoViG: Goal-Conditioned Visual Navigation Instruction Generation
CaRoBio: 3D Cable Routing with a Bio-inspired Gripper Fingernail
Hierarchical Brain Structure Modeling for Predicting Genotype of Glioma
A Lightweight Learned Cardinality Estimation Model
How Persuasive Could LLMs Be? A First Study Combining Linguistic-Rhetorical Analysis and User Experiments
MInDI-3D: Iterative Deep Learning in 3D for Sparse-view Cone Beam Computed Tomography
Interpretable Robot Control via Structured Behavior Trees and Large Language Models
Goal Discovery with Causal Capacity for Efficient Reinforcement Learning
TimeMKG: Knowledge-Infused Causal Reasoning for Multivariate Time Series Modeling
AmbiGraph-Eval: Can LLMs Effectively Handle Ambiguous Graph Queries?
Preacher: Paper-to-Video Agentic System
A Close Reading Approach to Gender Narrative Biases in AI-Generated Stories
Demystifying the Role of Rule-based Detection in AI Systems for Windows Malware Detection
On Negative-aware Preference Optimization for Recommendation
Anomaly Detection for IoT Global Connectivity
Surg-InvNeRF: Invertible NeRF for 3D tracking and reconstruction in surgical vision
Evaluating the Role of Large Language Models in Legal Practice in India
Improving ARDS Diagnosis Through Context-Aware Concept Bottleneck Models
Region-to-Region: Enhancing Generative Image Harmonization with Adaptive Regional Injection
NEUBORN: The Neurodevelopmental Evolution framework Using BiOmechanical RemodelliNg
Enhance the machine learning algorithm performance in phishing detection with keyword features
Counting Short Trajectories in Elementary Cellular Automata using the Transfer Matrix Method
Can LLM-Generated Textual Explanations Enhance Model Classification Performance? An Empirical Study
Combinative Matching for Geometric Shape Assembly
Adoption of Explainable Natural Language Processing: Perspectives from Industry and Academia on Practices and Challenges
Prototype Training with Dual Pseudo-Inverse and Optimized Hidden Activations
LibRec: Benchmarking Retrieval-Augmented LLMs for Library Migration Recommendations
Explainable Ensemble Learning for Graph-Based Malware Detection
Automated Segmentation of Coronal Brain Tissue Slabs for 3D Neuropathology
A Comprehensive Survey of Datasets for Clinical Mental Health AI Systems
TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos
Provable In-Context Vector Arithmetic via Retrieving Task Concepts
RayletDF: Raylet Distance Fields for Generalizable 3D Surface Reconstruction from Point Clouds or Gaussians
Exploring the Potential of Large Language Models in Fine-Grained Review Comment Classification
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
PRELUDE: A Benchmark Designed to Require Global Comprehension and Reasoning over Long Contexts
Perceptual Reality Transformer: Neural Architectures for Simulating Neurological Perception Conditions
STREAM (ChemBio): A Standard for Transparently Reporting Evaluations in AI Model Reports
Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
Beyond Scaling Law: A Data-Efficient Distillation Framework for Reasoning
COME: Dual Structure-Semantic Learning with Collaborative MoE for Universal Lesion Detection Across Heterogeneous Ultrasound Datasets
Rare anomalies require large datasets: About proving the existence of anomalies
Beyond Naïve Prompting: Strategies for Improved Zero-shot Context-aided Forecasting with LLMs
T-CACE: A Time-Conditioned Autoregressive Contrast Enhancement Multi-Task Framework for Contrast-Free Liver MRI Synthesis, Segmentation, and Diagnosis
Residual Reservoir Memory Networks
A Comprehensive Evaluation framework of Alignment Techniques for LLMs
VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models
Specialised or Generic? Tokenization Choices for Radiology Language Models
GBC: Generalized Behavior-Cloning Framework for Whole-Body Humanoid Imitation
January Food Benchmark (JFB): A Public Benchmark Dataset and Evaluation Suite for Multimodal Food Analysis
Vision-driven River Following of UAV via Safe Reinforcement Learning using Semantic Dynamics Model
Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation
Value Function Initialization for Knowledge Transfer and Jump-start in Deep Reinforcement Learning
The Othello AI Arena: Evaluating Intelligent Systems Through Limited-Time Adaptation to Unseen Boards
An Automated Multi-Modal Evaluation Framework for Mobile Intelligent Assistants
EvoCurr: Self-evolving Curriculum with Behavior Code Generation for Complex Decision-making
UbiQTree: Uncertainty Quantification in XAI with Tree Ensembles
MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement
UDA: Unsupervised Debiasing Alignment for Pair-wise LLM-as-a-Judge
The PacifAIst Benchmark: Would an Artificial Intelligence Choose to Sacrifice Itself for Human Safety?
Reasoning About Knowledge on Regular Expressions is 2EXPTIME-complete
Human-Aligned Procedural Level Generation Reinforcement Learning via Text-Level-Sketch Shared Representation
AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving
RAGulating Compliance: A Multi-Agent Knowledge Graph for Regulatory QA
Mathematical Computation and Reasoning Errors by Large Language Models
QuickGrasp: Lightweight Antipodal Grasp Planning with Point Clouds
User-Intent-Driven Semantic Communication via Adaptive Deep Understanding
Bayesian-Driven Graph Reasoning for Active Radio Map Construction
Efficient Real-Time Aircraft ETA Prediction via Feature Tokenization Transformer
To Theoretically Understand Transformer-Based In-Context Learning for Optimizing CSMA
Agentic TinyML for Intent-aware Handover in 6G Wireless Networks
Motif 2.6B Technical Report
5G Core Fault Detection and Root Cause Analysis using Machine Learning and Generative AI
JustDense: Just using Dense instead of Sequence Mixer for Time Series analysis
Peer Effect Estimation in the Presence of Simultaneous Feedback and Unobserved Confounders
A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models
Physics-Constrained Fine-Tuning of Flow-Matching Models for Generation and Inverse Problems
EvaDrive: Evolutionary Adversarial Policy Optimization for End-to-End Autonomous Driving
Agoran: An Agentic Open Marketplace for 6G RAN Automation
Physics-Guided Memory Network for Building Energy Modeling
Energy-Efficient Stochastic Computing (SC) Neural Networks for Internet of Things Devices With Layer-Wise Adjustable Sequence Length (ASL)
Multimodal RAG Enhanced Visual Description
webMCP: Efficient AI-Native Client-Side Interaction for Agent-Ready Web Design
FedMP: Tackling Medical Feature Heterogeneity in Federated Learning from a Manifold Perspective
A Context-aware Attention and Graph Neural Network-based Multimodal Framework for Misogyny Detection
DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic
Generative Artificial Intelligence in Medical Imaging: Foundations, Progress, and Clinical Translation
IAD-R1: Reinforcing Consistent Reasoning in Industrial Anomaly Detection
scAGC: Learning Adaptive Cell Graphs with Contrastive Guidance for Single-Cell Clustering
Long-Term Client Selection for Federated Learning with Non-IID Data: A Truthful Auction Approach
Quantum-Efficient Reinforcement Learning Solutions for Last-Mile On-Demand Delivery
HiSTM: Hierarchical Spatiotemporal Mamba for Cellular Traffic Forecasting
A Neurosymbolic Framework for Interpretable Cognitive Attack Detection in Augmented Reality
RL-MoE: An Image-Based Privacy Preserving Approach In Intelligent Transportation System
Hybrid(Transformer+CNN)-based Polyp Segmentation
Fine-Grained Safety Neurons with Training-Free Continual Projection to Reduce LLM Fine Tuning Risks
From Values to Tokens: An LLM-Driven Framework for Context-aware Time Series Forecasting via Symbolic Discretization
Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing
Multi-Objective Instruction-Aware Representation Learning in Procedural Content Generation RL
Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments
impuTMAE: Multi-modal Transformer with Masked Pre-training for Missing Modalities Imputation in Cancer Survival Prediction
FIVA: Federated Inverse Variance Averaging for Universal CT Segmentation with Uncertainty Estimation
MX-AI: Agentic Observability and Control Platform for Open and AI-RAN
ADT4Coupons: An Innovative Framework for Sequential Coupon Distribution in E-commerce
Δ-AttnMask: Attention-Guided Masked Hidden States for Efficient Data Selection and Augmentation
Zero-shot self-supervised learning of single breath-hold magnetic resonance cholangiopancreatography (MRCP) reconstruction
Learning to Detect Unknown Jailbreak Attacks in Large Vision-Language Models: A Unified and Accurate Approach
Personalized Feature Translation for Expression Recognition: An Efficient Source-Free Domain Adaptation Method
MoQE: Improve Quantization Model performance via Mixture of Quantization Experts
From Explainable to Explained AI: Ideas for Falsifying and Quantifying Explanations
CoMoE: Collaborative Optimization of Expert Aggregation and Offloading for MoE-based LLMs at Edge
Quantum-Enhanced Generative Adversarial Networks: Comparative Analysis of Classical and Hybrid Quantum-Classical Generative Adversarial Networks
MME-Emotion: A Holistic Evaluation Benchmark for Emotional Intelligence in Multimodal Large Language Models
Deep Generative Models for Discrete Genotype Simulation
Real-time deep learning phase imaging flow cytometer reveals blood cell aggregate biomarkers for haematology diagnostics
Towards Effective MLLM Jailbreaking Through Balanced On-Topicness and OOD-Intensity
Understanding Ethical Practices in AI: Insights from a Cross-Role, Cross-Region Survey of AI Development Teams
Towards Scalable Training for Handwritten Mathematical Expression Recognition
Hierarchical Adaptive networks with Task vectors for Test-Time Adaptation
From Hard Refusals to Safe-Completions: Toward Output-Centric Safety Training
AMRG: Extend Vision Language Models for Automatic Mammography Report Generation
GSMT: Graph Fusion and Spatiotemporal TaskCorrection for Multi-Bus Trajectory Prediction
Cluster Topology-Driven Placement of Experts Reduces Network Traffic in MoE Inference
Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems
Beyond Technocratic XAI: The Who, What & How in Explanation Design
PETLP: A Privacy-by-Design Pipeline for Social Media Data in AI Research
Gradient-Direction-Aware Density Control for 3D Gaussian Splatting
NEFMind: Parameter-Efficient Fine-Tuning of Open-Source LLMs for Telecom APIs Automation
Cross-BCI, A Cross-BCI-Paradigm Classification Model Towards Universal BCI Applications
Detection of Odor Presence via Deep Neural Networks
Can AI Keep a Secret? Contextual Integrity Verification: A Provable Security Architecture for LLMs
Ethical Medical Image Synthesis
Fake-Mamba: Real-Time Speech Deepfake Detection Using Bidirectional Mamba as Self-Attention's Alternative
Based AI improves human decision-making but reduces trust
Decentralized Weather Forecasting via Distributed Machine Learning and Blockchain-Based Model Validation
ParallelSearch: Train your LLMs to Decompose Query and Search Sub-queries in Parallel with Reinforcement Learning
TPTP World Infrastructure for Non-classical Logics
Exact Verification of Graph Neural Networks with Incremental Constraint Solving
Leveraging Large Language Models for Rare Disease Named Entity Recognition
TEN: Table Explicitization, Neurosymbolically
SegDAC: Segmentation-Driven Actor-Critic for Visual Reinforcement Learning
Synaptic Pruning: A Biological Inspiration for Deep Learning Regularization
RicciFlowRec: A Geometric Root Cause Recommender Using Ricci Curvature on Financial Graphs
Collective dynamics of strategic classification
The Human-AI Hybrid Delphi Model: A Structured Framework for Context-Rich, Expert Consensus in Complex Domains
FusionEnsemble-Net: An Attention-Based Ensemble of Spatiotemporal Networks for Multimodal Sign Language Recognition
A Signer-Invariant Conformer and Multi-Scale Fusion Transformer for Continuous Sign Language Recognition
APIO: Automatic Prompt Induction and Optimization for Grammatical Error Correction and Text Simplification

Research Sources: 530 | Generated: 8/25/2025