AI Research News Feeds for September 29th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation
Multimodal Recurrent Ensembles for Predicting Brain Responses to Naturalistic Movies (Algonauts 2025)
Recent Advancements in Microscopy Image Enhancement using Deep Learning: A Survey
SeamCrafter: Enhancing Mesh Seam Generation for Artist UV Unwrapping via Reinforcement Learning
FERD: Fairness-Enhanced Data-Free Robustness Distillation
Differential-Integral Neural Operator for Long-Term Turbulence Forecasting
TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting
Shape-for-Motion: Precise and Consistent Video Editing with 3D Proxy
Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning
SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alignment
Investigating Redundancy in Multimodal Large Language Models with Multiple Vision Encoders
pFedMMA: Personalized Federated Fine-Tuning with Multi-Modal Adapter for Vision-Language Models
NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation
STQE: Spatial-Temporal Attribute Quality Enhancement for G-PCC Compressed Dynamic Point Clouds
DriveAgent-R1: Advancing VLM-based Autonomous Driving with Active Perception and Hybrid Thinking
$A^2R^2$: Advancing Img2LaTeX Conversion via Visual Reasoning with Attention-Guided Refinement
Content-Aware Mamba for Learned Image Compression
Small Dents, Big Impact: A Dataset and Deep Learning Approach for Vehicle Dent Detection
Re-Densification Meets Cross-Scale Propagation: Real-Time Neural Compression of LiDAR Point Clouds
Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing
GLEAM: Learning to Match and Explain in Cross-View Geo-Localization
Deep Learning for Clouds and Cloud Shadow Segmentation in Methane Satellite and Airborne Imaging Spectroscopy
Diffence: Fencing Membership Privacy With Diffusion Models
Metric-Guided Conformal Bounds for Probabilistic Image Reconstruction
STHN: Deep Homography Estimation for UAV Thermal Geo-localization with Satellite Imagery
Diverse Subset Selection via Norm-Based Sampling and Orthogonality
Excavating in the Wild: The GOOSE-Ex Dataset for Semantic Segmentation
DOTA: Distributional Test-Time Adaptation of Vision-Language Models
Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Surgical Vision World Model
Learning Personalized Driving Styles via Reinforcement Learning from Human Feedback
Texture or Semantics? Vision-Language Models Get Lost in Font Recognition
Can Diffusion Models Disentangle? A Theoretical Perspective
Geometry aware inference of steady state PDEs using Equivariant Neural Fields representations
Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling
Mobi-$\pi$: Mobilizing Your Robot Learning Policy
iTACO: Interactable Digital Twins of Articulated Objects from Casually Captured RGBD Videos
NeuVAS: Neural Implicit Surfaces for Variational Shape Modeling
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities
Multi-View Hypercomplex Learning for Breast Cancer Screening
Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution
Leveraging Model Guidance to Extract Training Data from Personalized Diffusion Models
Pose Prior Learner: Unsupervised Categorical Prior Learning for Pose Estimation
Diffusion Curriculum: Synthetic-to-Real Data Curriculum via Image-Guided Diffusion
Large Pre-Training Datasets Don't Always Guarantee Robustness after Fine-Tuning
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video
DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
Self-Guidance: Boosting Flow and Diffusion Generation on Their Own
LOGen: Toward Lidar Object Generation by Point Diffusion
UIP2P: Unsupervised Instruction-based Image Editing via Edit Reversibility Constraint
Unforgettable Lessons from Forgettable Images: Intra-Class Memorability Matters in Computer Vision
LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation
Single-weight Model Editing for Post-hoc Spurious Correlation Neutralization
PDV: Prompt Directional Vectors for Zero-shot Composed Image Retrieval
VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation
GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder
SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models
Semantic Consistent Language Gaussian Splatting for Point-Level Open-vocabulary Querying
DanceText: A Training-Free Layered Framework for Controllable Multilingual Text Transformation in Images
CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis
Image Recognition with Online Lightweight Vision Transformer: A Survey
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
OS-W2S: An Automatic Labeling Engine for Language-Guided Open-Set Aerial Object Detection
Intentional Gesture: Deliver Your Intentions with Gestures for Speech
Octic Vision Transformers: Quicker ViTs Through Equivariance
PhyMAGIC: Physical Motion-Aware Generative Inference with Confidence-guided LLM
Deeper Diffusion Models Amplify Bias
DVD-Quant: Data-free Video Diffusion Transformers Quantization
ChartGalaxy: A Dataset for Infographic Chart Understanding and Generation
CARE: Confidence-aware Ratio Estimation for Medical Biomarkers
Mamba-Driven Topology Fusion for Monocular 3D Human Pose Estimation
Towards Scalable Language-Image Pre-training for 3D Medical Imaging
Pose-free 3D Gaussian splatting via shape-ray estimation
Physics-Guided Motion Loss for Video Generation Model
ReSpace: Text-Driven 3D Indoor Scene Synthesis and Editing with Preference Alignment
Dual Branch VideoMamba with Gated Class Token Fusion for Violence Detection
Astraea: A Token-wise Acceleration Framework for Video Diffusion Transformers
DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models
Structure before the Machine: Input Space is the Prerequisite for Concepts
HiSin: A Sinogram-Aware Framework for Efficient High-Resolution Inpainting
VidBridge-R1: Bridging QA and Captioning for RL-based Video Understanding Models with Intermediate Proxy Tasks
LEO-VL: Efficient Scene Representation for Scalable 3D Vision-Language Learning
Think With Videos For Agentic Long-Video Understanding
video-SALMONN 2: Caption-Enhanced Audio-Visual Large Language Models
HyCoVAD: A Hybrid SSL-LLM Model for Complex Video Anomaly Detection
JanusVLN: Decoupling Semantics and Spatiality with Dual Implicit Memory for Vision-Language Navigation
SpikeMatch: Semi-Supervised Learning with Temporal Dynamics of Spiking Neural Networks
Vision-Language Alignment from Compressed Image Representations using 2D Gaussian Splatting
LongLive: Real-time Interactive Long Video Generation
SPARK: Synergistic Policy And Reward Co-Evolving Framework
CCNeXt: An Effective Self-Supervised Stereo Depth Estimation Approach
UML-CoT: Structured Reasoning and Planning with Unified Modeling Language for Robotic Room Cleaning
LABELING COPILOT: A Deep Research Agent for Automated Data Curation in Computer Vision
Training-Free Synthetic Data Generation with Dual IP-Adapter Guidance
Scale-Wise VAR is Secretly Discrete Diffusion
Hierarchical Representation Matching for CLIP-based Class-Incremental Learning
Learning Human-Perceived Fakeness in AI-Generated Videos via Multimodal LLMs
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
RefAM: Attention Magnets for Zero-Shot Referral Segmentation
SGAligner++: Cross-Modal Language-Aided 3D Scene Graph Alignment
Cross-Modal Retrieval with Cauchy-Schwarz Divergence
Language-in-the-Loop Culvert Inspection on the Erie Canal
Are Hallucinations Bad Estimations?
VISION: Prompting Ocean Vertical Velocity Reconstruction from Incomplete Observations
SlimDiff: Training-Free, Activation-Guided Hands-free Slimming of Diffusion Models
DistillKac: Few-Step Image Generation via Damped Wave Equations
TRiCo: Triadic Game-Theoretic Co-Training for Robust Semi-Supervised Learning
Patch-Based Diffusion for Data-Efficient, Radiologist-Preferred MRI Reconstruction
ControlHair: Physically-based Video Diffusion for Controllable Dynamic Hair Rendering
Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow
Perception-Consistency Multimodal Large Language Models Reasoning via Caption-Regularized Policy Optimization
Closing the Oracle Gap: Increment Vector Transformation for Class Incremental Learning
Comparative Analysis of GAN and Diffusion for MRI-to-CT translation
Enriching Knowledge Distillation with Intra-Class Contrastive Learning
Guidance Watermarking for Diffusion Models
Rigidity-Aware 3D Gaussian Deformation from a Single Image
Aerial Path Planning for Urban Geometry and Texture Co-Capture
COMPASS: Robust Feature Conformal Prediction for Medical Segmentation Metrics
Clinical Uncertainty Impacts Machine Learning Evaluations
RoboView-Bias: Benchmarking Visual Bias in Embodied Agents for Robotic Manipulation
Deep Learning-Based Cross-Anatomy CT Synthesis Using Adapted nnResU-Net with Anatomical Feature Prioritized Loss
Adaptive Dual-Mode Distillation with Incentive Schemes for Scalable, Heterogeneous Federated Learning on Non-IID Data
JointDiff: Bridging Continuous and Discrete in Multi-Agent Trajectory Generation
Activation Function Design Sustains Plasticity in Continual Learning
MINT-RVAE: Multi-Cues Intention Prediction of Human-Robot Interaction using Human Pose and Emotion Information from RGB-only Camera Data
Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning
WoW: Towards a World omniscient World model Through Embodied Interaction
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing
Pixel Motion Diffusion is What We Need for Robot Control
See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation
Self-Supervised Point Cloud Completion based on Multi-View Augmentations of Single Partial Point Cloud
REFINE-CONTROL: A Semi-supervised Distillation Method For Conditional Image Generation
Joint graph entropy knowledge distillation for point cloud classification and robustness against corruptions
MultiMat: Multimodal Program Synthesis for Procedural Materials using Large Multimodal Models
DragGANSpace: Latent Space Exploration and Control for GANs
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
Towards Faithful Reasoning in Remote Sensing: A Perceptually-Grounded GeoSpatial Chain-of-Thought for Vision-Language Models
Polysemous Language Gaussian Splatting via Matching-based Mask Lifting
UrbanFeel: A Comprehensive Benchmark for Temporal and Perceptual Understanding of City Scenes through Human Perspective
A Tale of Two Experts: Cooperative Learning for Source-Free Unsupervised Domain Adaptation
FlashEdit: Decoupling Speed, Structure, and Semantics for Precise Image Editing
Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks
UniMapGen: A Generative Framework for Large-Scale Map Construction from Multi-modal Data
GS-2M: Gaussian Splatting for Joint Mesh Reconstruction and Material Decomposition
MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning
Rule-Based Reinforcement Learning for Document Image Classification with Vision Language Models
Jailbreaking on Text-to-Video Models via Scene Splitting Strategy
HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models
Johnson-Lindenstrauss Lemma Guided Network for Efficient 3D Medical Segmentation
NIFTY: a Non-Local Image Flow Matching for Texture Synthesis
RAPID^3: Tri-Level Reinforced Acceleration Policies for Diffusion Transformer
Pedestrian Attribute Recognition via Hierarchical Cross-Modality HyperGraph Learning
CircuitSense: A Hierarchical Circuit System Benchmark Bridging Visual Comprehension and Symbolic Reasoning in Engineering Design Process
HierLight-YOLO: A Hierarchical and Lightweight Object Detection Network for UAV Photography
Effectiveness of Large Multimodal Models in Detecting Disinformation: Experimental Results
GPT-4 for Occlusion Order Recovery
Gradient-based multi-focus image fusion with focus-aware saliency enhancement
Text Adversarial Attacks with Dynamic Outputs
Integrating Background Knowledge in Medical Semantic Segmentation with Logic Tensor Networks
Closing the Safety Gap: Surgical Concept Erasure in Visual Autoregressive Models
RAU: Reference-based Anatomical Understanding with Vision Language Models
FreqDebias: Towards Generalizable Deepfake Detection via Consistency-Driven Frequency Debiasing
LucidFlux: Caption-Free Universal Image Restoration via a Large-Scale Diffusion Transformer
Explaining multimodal LLMs via intra-modal token interactions
U-MAN: U-Net with Multi-scale Adaptive KAN Network for Medical Image Segmentation
$\gamma$-Quant: Towards Learnable Quantization for Low-bit Pattern Recognition
SSVIF: Self-Supervised Segmentation-Oriented Visible and Infrared Image Fusion
Bézier Meets Diffusion: Robust Generation Across Domains for Medical Image Segmentation
PSTTS: A Plug-and-Play Token Selector for Efficient Event-based Spatio-temporal Representation Learning
Group Critical-token Policy Optimization for Autoregressive Image Generation
Where MLLMs Attend and What They Rely On: Explaining Autoregressive Token Generation
Color Names in Vision-Language Models
EfficientDepth: A Fast and Detail-Preserving Monocular Depth Estimation Model
Category Discovery: An Open-World Perspective
MIRG-RL: Multi-Image Reasoning and Grounding with Reinforcement Learning
LongScape: Advancing Long-Horizon Embodied World Models with Context-Aware MoE
MoWM: Mixture-of-World-Models for Embodied Planning via Latent-to-Pixel Feature Modulation
DiTraj: training-free trajectory control for video diffusion transformer
A Comprehensive Evaluation of Transformer-Based Question Answering Models and RAG-Enhanced Design
Dynamic Novel View Synthesis in High Dynamic Range
SRHand: Super-Resolving Hand Images and 3D Shapes via View/Pose-aware Neural Image Representations and Explicit 3D Meshes
Deepfakes: we need to re-think the concept of "real" images
Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization
StableDub: Taming Diffusion Prior for Generalized and Efficient Visual Dubbing
Drag4D: Align Your Motion with Text-Driven 3D Scene Generation
Syncphony: Synchronized Audio-to-Video Generation with Diffusion Transformers
LG-CD: Enhancing Language-Guided Change Detection through SAM2 Adaptation
TDEdit: A Unified Diffusion Framework for Text-Drag Guided Image Manipulation
Enhancing Vehicle Detection under Adverse Weather Conditions with Contrastive Learning
Taming Flow-based I2V Models for Creative Video Editing
Multi-View Crowd Counting With Self-Supervised Learning
Spatial Reasoning in Foundation Models: Benchmarking Object-Centric Spatial Understanding
PANICL: Mitigating Over-Reliance on Single Prompt in Visual In-Context Learning
SingRef6D: Monocular Novel Object Pose Estimation with a Single RGB Reference
DynaNav: Dynamic Feature and Layer Selection for Efficient Visual Navigation
SemanticControl: A Training-Free Approach for Handling Loosely Aligned Visual Conditions in ControlNet
Customizing Visual Emotion Evaluation for MLLMs: An Open-vocabulary, Multifaceted, and Scalable Approach
MultiCrafter: High-Fidelity Multi-Subject Generation via Spatially Disentangled Attention and Identity-Aware Reinforcement Learning
PartSAM: A Scalable Promptable Part Segmentation Model Trained on Native 3D Data
No-Reference Image Contrast Assessment with Customized EfficientNet-B0
Geo-R1: Improving Few-Shot Geospatial Referring Expression Understanding with Reinforcement Fine-Tuning
Benchmarking and Mitigate Psychological Sycophancy in Medical Vision-Language Models
Resolving Ambiguity in Gaze-Facilitated Visual Assistant Interaction Paradigm
From Bias to Balance: Exploring and Mitigating Spatial Bias in LVLMs
Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation
WAVE: Learning Unified & Versatile Audio-Visual Embeddings with Multimodal LLM
ERGO: Efficient High-Resolution Visual Understanding for Vision-Language Models
DualFocus: Depth from Focus with Spatio-Focal Dual Variational Constraints
Rate-Distortion Optimized Communication for Collaborative Perception
FailureAtlas: Mapping the Failure Landscape of T2I Models via Active Exploration
Exposing Hallucinations To Suppress Them: VLMs Representation Editing With Generative Anchors
CoFFT: Chain of Foresight-Focus Thought for Visual Language Models
Lightweight Structured Multimodal Reasoning for Clinical Scene Understanding in Robotics
EgoInstruct: An Egocentric Video Dataset of Face-to-face Instructional Interactions with Multi-modal LLM Benchmarking
High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling
SpecXNet: A Dual-Domain Convolutional Network for Robust Deepfake Detection
Large Material Gaussian Model for Relightable 3D Generation
On the Status of Foundation Models for SAR Imagery
UISim: An Interactive Image-Based UI Simulator for Dynamic Mobile Environments
LFA-Net: A Lightweight Network with LiteFusion Attention for Retinal Vessel Segmentation
Incorporating Scene Context and Semantic Labels for Enhanced Group-level Emotion Recognition
KG-SAM: Injecting Anatomical Knowledge into Segment Anything Models via Conditional Random Fields
UniVid: Unifying Vision Tasks with Pre-trained Video Generation Models
CubistMerge: Spatial-Preserving Token Merging For Diverse ViT Backbones
Training-Free Multimodal Deepfake Detection via Graph Reasoning
Prompt-guided Representation Disentanglement for Action Recognition
DeHate: A Stable Diffusion-based Multimodal Approach to Mitigate Hate Speech in Images
R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning
Resource Consumption Red-Teaming for Large Vision-Language Models
Sparse but Wrong: Incorrect L0 Leads to Incorrect Features in Sparse Autoencoders
EigenBench: A Comparative Behavioral Measure of Value Alignment
Cognitive Load Limits in Large Language Models: Benchmarking Multi-Hop Reasoning
TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them
Expanding Reasoning Potential in Foundation Model by Learning Diverse Chains of Thought Patterns
Random Direct Preference Optimization for Radiography Report Generation
Improving Autism Detection with Multimodal Behavioral Analysis
KV-Efficient VLA: A Method of Speed up Vision Language Model with RNN-Gated Chunked KV Cache
Phrase-grounded Fact-checking for Automatically Generated Chest X-ray Reports
MDF-MLLM: Deep Fusion Through Cross-Modal Feature Alignment for Contextually Aware Fundoscopic Image Classification
Multimodal Prompt Decoupling Attack on the Safety Filters in Text-to-Image Models
A Mutual Learning Method for Salient Object Detection with intertwined Multi-Supervision--Revised
MAJORScore: A Novel Metric for Evaluating Multimodal Relevance via Joint Representation
Safety Assessment of Scaffolding on Construction Site using AI
Automated Prompt Generation for Creative and Counterfactual Text-to-image Synthesis
In silico Deep Learning Protocols for Label-Free Super-Resolution Microscopy: A Comparative Study of Network Architectures and SNR Dependence
Dynamic Multi-Target Fusion for Efficient Audio-Visual Navigation
SAEmnesia: Erasing Concepts in Diffusion Models with Sparse Autoencoders
Coreset selection based on Intra-class diversity
The LongiMam model for improved breast cancer risk prediction using longitudinal mammograms
Assessing the Alignment of Popular CNNs to the Brain for Valence Appraisal
Debugging Concept Bottleneck Models through Removal and Retraining
ShipwreckFinder: A QGIS Tool for Shipwreck Detection in Multibeam Sonar Data
Do Sparse Subnetworks Exhibit Cognitively Aligned Attention? Effects of Pruning on Saliency Map Fidelity, Sparsity, and Concept Coherence
TUN3D: Towards Real-World Scene Understanding from Unposed Images
Large AI Model-Enabled Generative Semantic Communications for Image Transmission
mmHSense: Multi-Modal and Distributed mmWave ISAC Datasets for Human Sensing
Skeleton Sparsification and Densification Scale-Spaces
Downscaling climate projections to 1 km with single-image super resolution
JaiLIP: Jailbreaking Vision-Language Models via Loss Guided Image Perturbation
Overview of ExpertLifeCLEF 2018: how far automated identification systems are from the best experts?
QuadGPT: Native Quadrilateral Mesh Generation with Autoregressive Models
DyME: Dynamic Multi-Concept Erasure in Diffusion Models with Bi-Level Orthogonal LoRA Adaptation
VideoJudge: Bootstrapping Enables Scalable Supervision of MLLM-as-a-Judge for Video Understanding
Residual Vector Quantization For Communication-Efficient Multi-Agent Perception
Gender Stereotypes in Professional Roles Among Saudis: An Analytical Study of AI-Generated Images Using Language Models
Reasoning-Enhanced Domain-Adaptive Pretraining of Multimodal Large Language Models for Short Video Content Moderation
Learning GUI Grounding with Spatial Reasoning from Visual Feedback
X-CoT: Explainable Text-to-Video Retrieval via LLM-based Chain-of-Thought Reasoning
Unsupervised Defect Detection for Surgical Instruments
No Alignment Needed for Generation: Learning Linearly Separable Representations in Diffusion Models
Enhancing Contrastive Learning for Geolocalization by Discovering Hard Negatives on Semivariograms
X-Streamer: Unified Human World Modeling with Audiovisual Interaction
What Happens Next? Anticipating Future Motion by Generating Point Trajectories
Temporal vs. Spatial: Comparing DINOv3 and V-JEPA2 Feature Representations for Video Action Analysis
VLCE: A Knowledge-Enhanced Framework for Image Description in Disaster Assessment
A Data-driven Typology of Vision Models from Integrated Representational Metrics
FantasyWorld: Geometry-Consistent World Modeling via Unified Video and 3D Prediction
MORPH: Shape-agnostic PDE Foundation Models
MS-YOLO: Infrared Object Detection for Edge Deployment via MobileNetV4 and SlideLoss
Motion-Aware Transformer for Multi-Object Tracking
DeLiVR: Differential Spatiotemporal Lie Bias for Efficient Video Deraining
InfiMed: Low-Resource Medical MLLMs with Advancing Understanding and Reasoning
RARE: Retrieval-Aware Robustness Evaluation for Retrieval-Augmented Generation Systems
Probing Neural Topology of Large Language Models
Resisting Contextual Interference in RAG via Parametric-Knowledge Reinforcement
Personalized LLM Decoding via Contrasting Personal Preference
MUCAR: Benchmarking Multilingual Cross-Modal Ambiguity Resolution for Multimodal Large Language Models
KaLM-Embedding-V2: Superior Training Techniques and Data Inspire A Versatile Embedding Model
VAT-KG: Knowledge-Intensive Multimodal Knowledge Graph Dataset for Retrieval-Augmented Generation
WildSpeech-Bench: Benchmarking End-to-End SpeechLLMs in the Wild
Why Reinforcement Fine-Tuning Enables MLLMs Preserve Prior Knowledge Better: A Data Perspective
Unveiling the Potential of Diffusion Large Language Model in Controllable Generation
Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
Learn Globally, Speak Locally: Bridging the Gaps in Multilingual Reasoning
What Factors Affect LLMs and RLLMs in Financial Question Answering?
KV Cache Steering for Controlling Frozen LLMs
The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner
LoopServe: An Adaptive Dual-phase LLM Inference Acceleration System for Multi-Turn Dialogues
Persona-Augmented Benchmarking: Evaluating LLMs Across Diverse Writing Styles
DAMR: Efficient and Adaptive Context-Aware Knowledge Graph Question Answering with LLM-Guided MCTS
MLP Memory: A Retriever-Pretrained Memory for Large Language Models
Conflict-Aware Soft Prompting for Retrieval-Augmented Generation
Influence-driven Curriculum Learning for Pre-training on Limited Data
Dream to Chat: Model-based Reinforcement Learning on Dialogues with User Belief Modeling
JudgeAgent: Knowledge-wise and Dynamic LLM Evaluation with Agent-as-Interviewer
CMRAG: Co-modality-based visual document retrieval and question answering
Chain or tree? Re-evaluating complex reasoning from the perspective of a matrix of thought
Towards an AI Musician: Synthesizing Sheet Music Problems for Musical Reasoning
Positional Encoding via Token-Aware Phase Attention
DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models
Distribution-Aligned Decoding for Efficient LLM Task Adaptation
RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation
HiCoLoRA: Addressing Context-Prompt Misalignment via Hierarchical Collaborative LoRA for Zero-Shot DST
A Critical Look At Tokenwise Reward-Guided Text Generation
Large Language Models versus Classical Machine Learning: Performance in COVID-19 Mortality Prediction Using High-Dimensional Tabular Data
On the Within-class Variation Issue in Alzheimer's Disease Detection
Development and Validation of a Large Language Model for Generating Fully-Structured Radiology Reports
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
Detecting and Interpreting NSFW Prompts in Text-to-Image Models through Uncovering Harmful Semantics
Process Reinforcement through Implicit Rewards
Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models
Recursive Training Loops in LLMs: How training data properties modulate distribution shift in generated data?
TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning
Feature Hedging: Correlated Features Break Narrow Sparse Autoencoders
The Polar Express: Optimal Matrix Sign Methods and Their Application to the Muon Algorithm
Domain-Aware Tensor Network Structure Search
Security Degradation in Iterative AI Code Generation -- A Systematic Analysis of the Paradox
Latent Concept Disentanglement in Transformer-based Language Models
MultiVox: A Benchmark for Evaluating Voice Assistants for Multimodal Interactions
From Roots to Rewards: Dynamic Tree Reasoning with Reinforcement Learning
The Invisible Leash: Why RLVR May or May Not Escape Its Origin
Library Hallucinations in LLMs: Risk Analysis Grounded in Developer Queries
InfiMed-Foundation: Pioneering Advanced Multimodal Medical Models with Compute-Efficient Pre-Training and Multi-Stage Fine-Tuning
PRIME: Planning and Retrieval-Integrated Memory for Enhanced Reasoning
Can Synthetic Query Rewrites Capture User Intent Better than Humans in Retrieval-Augmented Generation?
Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers
MDAR: A Multi-scene Dynamic Audio Reasoning Benchmark
IIET: Efficient Numerical Transformer via Implicit Iterative Euler Method
Mental Health Impacts of AI Companions: Triangulating Social Media Quasi-Experiments, User Perspectives, and Relational Theory
Does AI Coaching Prepare us for Workplace Negotiations?
Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning
Towards Efficient Online Exploration for Reinforcement Learning with Human Feedback
Constituency Parsing using LLMs
TEXT2AFFORD: Probing Object Affordance Prediction abilities of Language Models solely from Text
Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs
LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking
Position IDs Matter: An Enhanced Position Layout for Efficient Context Compression in Large Language Models
Stuffed Mamba: Oversized States Lead to the Inability to Forget
Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning
Vulnerability of LLMs to Vertically Aligned Text Manipulations
Semantic Component Analysis: Introducing Multi-Topic Distributions to Clustering-Based Topic Modeling
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources
AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning
Can LLMs be Good Graph Judge for Knowledge Graph Construction?
Demystifying Domain-adaptive Post-training for Financial LLMs
Demystifying Multilingual Chain-of-Thought in Process Reward Modeling
Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare
LoRA-MGPO: Mitigating Double Descent in Low-Rank Adaptation via Momentum-Guided Perturbation Optimization
RuCCoD: Towards Automated ICD Coding in Russian
How LLMs Fail to Support Fact-Checking
Adaptively profiling models with task elicitation
Generator-Assistant Stepwise Rollback Framework for Large Language Model Agent
Improving LLM-as-a-Judge Inference with the Judgment Distribution
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
Cost-Optimal Grouped-Query Attention for Long-Context Modeling
Retrieval-Augmented Generation with Hierarchical Knowledge
CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives
SOLAR: Towards Characterizing Subjectivity of Individuals through Modeling Value Conflicts and Trade-offs
MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
LLM-OptiRA: LLM-Driven Optimization of Resource Allocation for Non-Convex Problems in Wireless Communications
Follow the Path: Reasoning over Knowledge Graph Paths to Improve LLM Factuality
SuperCoder: Assembly Program Superoptimization with Large Language Models
HiddenBench: Assessing Collective Reasoning in Multi-Agent LLMs via Hidden Profile Tasks
ZeroTuning: Unlocking the Initial Token's Power to Enhance Large Language Models Without Training
HBO: Hierarchical Balancing Optimization for Fine-Tuning Large Language Models
ExpertSteer: Intervening in LLMs through Expert Knowledge
Shadow-FT: Tuning Instruct Model via Training on Paired Base Model
Language-Specific Latent Process Hinders Cross-Lingual Performance
UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Language Models
UniErase: Towards Balanced and Precise Unlearning in Language Models
Beyond Early-Token Bias: Model-Specific and Language-Specific Position Effects in Multilingual LLMs
Attributing Response to Context: A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation
Beyond Static Testbeds: An Interaction-Centric Agent Simulation Platform for Dynamic Recommender Systems
Unlearning Isn't Deletion: Investigating Reversibility of Machine Unlearning in LLMs
BP-Seg: A graphical model approach to unsupervised and non-contiguous text segmentation using belief propagation
From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
Prompting is not Enough: Exploring Knowledge Integration and Controllable Generation on Large Language Models
BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases
EmoBench-UA: A Benchmark Dataset for Emotion Detection in Ukrainian
Table-R1: Inference-Time Scaling for Table Reasoning
FeatBench: Evaluating Coding Agents on Feature Implementation for Vibe Coding
FLEXI: Benchmarking Full-duplex Human-LLM Speech Interaction
Safety Compliance: Rethinking LLM Safety Reasoning through the Lens of Compliance
Beyond Textual Context: Structural Graph Encoding with Adaptive Space Alignment to alleviate the hallucination of LLMs
Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection?
Advancing Natural Language Formalization to First Order Logic with Fine-tuned LLMs
Transformers Can Learn Connectivity in Some Graphs but Not Others
The InviTE Corpus: Annotating Invectives in Tudor English Texts for Computational Modeling
Conversational Implicatures: Modelling Relevance Theory Probabilistically
CHRONOBERG: Capturing Language Evolution and Temporal Awareness in Foundation Models
Exploratory Semantic Reliability Analysis of Wind Turbine Maintenance Logs using Large Language Models
What Is The Political Content in LLMs' Pre- and Post-Training Data?
Chimera: Diagnosing Shortcut Learning in Visual-Language Understanding
Detecting (Un)answerability in Large Language Models with Linear Directions
Evaluating the Limits of Large Language Models in Multilingual Legal Reasoning
NeLLCom-Lex: A Neural-agent Framework to Study the Interplay between Lexical Systems and Language Use
Exploring Solution Divergence and Its Effect on Large Language Model Problem Solving
JGU Mainz's Submission to the WMT25 Shared Task on LLMs with Limited Resources for Slavic Languages: MT and QA
Representing LLMs in Prompt Semantic Task Space
We Think, Therefore We Align LLMs to Helpful, Harmless and Honest Before They Go Wrong
InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models
Think Socially via Cognitive Reasoning
Retrieval-Augmented Guardrails for AI-Drafted Patient-Portal Messages: Error Taxonomy Construction and Large-Scale Evaluation
Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs
ArabJobs: A Multinational Corpus of Arabic Job Ads
From Formal Language Theory to Statistical Learning: Finite Observability of Subregular Languages
Capturing Opinion Shifts in Deliberative Discourse through Frequency-based Quantum deep learning methods
From tests to effect sizes: Quantifying uncertainty and statistical variability in multilingual and multitask NLP evaluation benchmarks
StateX: Enhancing RNN Recall via Post-training State Expansion
Variational Reasoning for Language Models
Language Models Can Learn from Verbal Feedback Without Scalar Rewards
Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity
WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing
Accelerate Creation of Product Claims Using Generative AI
HetaRAG: Hybrid Deep Retrieval-Augmented Generation across Heterogeneous Data Stores
Towards mitigating information leakage when evaluating safety monitors
Random Direct Preference Optimization for Radiography Report Generation
ReGeS: Reciprocal Retrieval-Generation Synergy for Conversational Recommender Systems
LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet?
ARTI-6: Towards Six-dimensional Articulatory Speech Encoding
VideoJudge: Bootstrapping Enables Scalable Supervision of MLLM-as-a-Judge for Video Understanding
Gender Stereotypes in Professional Roles Among Saudis: An Analytical Study of AI-Generated Images Using Language Models
Are Hallucinations Bad Estimations?
LLM Agent Meets Agentic AI: Can LLM Agents Simulate Customers to Evaluate Agentic-AI-based Shopping Assistants?
Uncertainty-Aware Knowledge Tracing Models
C-QUERI: Congressional Questions, Exchanges, and Responses in Institutions Dataset
Learning GUI Grounding with Spatial Reasoning from Visual Feedback
Leveraging Big Data Frameworks for Spam Detection in Amazon Reviews
AUDDT: Audio Unified Deepfake Detection Benchmark Toolkit
InvBench: Can LLMs Accelerate Program Verification with Invariant Synthesis?
UISim: An Interactive Image-Based UI Simulator for Dynamic Mobile Environments
UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios
DeHate: A Stable Diffusion-based Multimodal Approach to Mitigate Hate Speech in Images
Compiling by Proving: Language-Agnostic Automatic Optimization from Formal Semantics
SBFA: Single Sneaky Bit Flip Attack to Break Large Language Models
What Makes LLM Agent Simulations Useful for Policy? Insights From an Iterative Design Engagement in Emergency Preparedness
You Can't Steal Nothing: Mitigating Prompt Leakages in LLMs via System Vectors
AgentPack: A Dataset of Code Changes, Co-Authored by Agents and Humans
Evaluating Open-Source Large Language Models for Technical Telecom Question Answering
RISK: A Framework for GUI Agents in E-commerce Risk Management
From Bias to Balance: Exploring and Mitigating Spatial Bias in LVLMs
ERGO: Efficient High-Resolution Visual Understanding for Vision-Language Models
The Thinking Spectrum: An Empirical Study of Tunable Reasoning in LLMs through Model Merging
A2R: An Asymmetric Two-Stage Reasoning Framework for Parallel Reasoning
Speak Your Mind: The Speech Continuation Task as a Probe of Voice-Based Model Bias
SecureAgentBench: Benchmarking Secure Code Generation under Realistic Vulnerability Scenarios
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
Thinking in Many Modes: How Composite Reasoning Elevates Large Language Model Performance with Limited Data
In Their Own Words: Reasoning Traces Tailored for Small Models Make Them Better Reasoners
Influence Guided Context Selection for Effective Retrieval-Augmented Generation
Context Is What You Need: The Maximum Effective Context Window for Real World Limits of LLMs
How Large Language Models Need Symbolism
One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning
LLM-Based Support for Diabetes Diagnosis: Opportunities, Scenarios, and Challenges with GPT-5
Diagnosing the Performance Trade-off in Moral Alignment: A Case Study on Gender Stereotypes
A State-of-the-Art SQL Reasoning Model using RLVR
Learning to Reason with Mixture of Tokens
Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning
On Code-Induced Reasoning in LLMs
Agribot: agriculture-specific question answer system
Domain-Aware Speaker Diarization On African-Accented English
Generation-Time vs. Post-hoc Citation: A Holistic Evaluation of LLM Attribution
Comparative Personalization for Multi-document Summarization
Vision Language Models Cannot Plan, but Can They Formalize?
"Be My Cheese?": Assessing Cultural Nuance in Multilingual LLM Translations
Multi-Objective Reinforcement Learning for Large Language Model Optimization: Visionary Perspective
OjaKV: Context-Aware Online Low-Rank KV Cache Compression with Oja's Rule
Towards Transparent AI: A Survey on Explainable Language Models
ReviewScore: Misinformed Peer Review Detection with Large Language Models
GRAB: A Risk Taxonomy–Grounded Benchmark for Unsupervised Topic Discovery in Financial Disclosures
Think-on-Graph 3.0: Efficient and Adaptive LLM Reasoning on Heterogeneous Graphs via Multi-Agent Dual-Evolving Context Retrieval
ProPerSim: Developing Proactive and Personalized AI Assistants through User-Assistant Simulation
How Accurate Are LLMs at Multi-Question Answering on Conversational Transcripts?
Self-Speculative Biased Decoding for Faster Live Translation
Thinking with Sound: Audio Chain-of-Thought Enables Multimodal Reasoning in Large Audio-Language Models
SynerGen: Contextualized Generative Recommender for Unified Search and Recommendation
Navigating the Impact of Structured Output Format on Large Language Models through the Compass of Causal Inference
Evaluating and Improving Cultural Awareness of Reward Models for LLM Alignment
Redefining Machine Simultaneous Interpretation: From Incremental Translation to Human-Like Strategies
Towards Minimal Causal Representations for Human Multimodal Language Understanding
Can LLMs Solve and Generate Linguistic Olympiad Puzzles?
ResT: Reshaping Token-Level Policy Gradients for Tool-Use Large Language Models
Semantic Agreement Enables Efficient Open-Ended LLM Cascades
Following the TRACE: A Structured Path to Empathetic Response Generation with Multi-Agent Models
KnowMT-Bench: Benchmarking Knowledge-Intensive Long-Form Question Answering in Multi-Turn Dialogues
Enhancing Low-Rank Adaptation with Structured Nonlinear Transformations
LUMINA: Detecting Hallucinations in RAG System with Context-Knowledge Signals
No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping
QoNext: Towards Next-generation QoE for Foundation Models
Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts
A Large-Scale Dataset and Citation Intent Classification in Turkish with LLMs
AutoSCORE: Enhancing Automated Scoring with Multi-Agent Large Language Models via Structured Component Recognition
SimulSense: Sense-Driven Interpreting for Efficient Simultaneous Speech Translation
Why Chain of Thought Fails in Clinical Text Understanding
Debiasing Large Language Models in Thai Political Stance Detection via Counterfactual Calibration
MotivGraph-SoIQ: Integrating Motivational Knowledge Graphs and Socratic Dialogue for Enhanced LLM Ideation
Black-Box Hallucination Detection via Consistency Under the Uncertain Expression
GraphSearch: An Agentic Deep Searching Workflow for Graph Retrieval-Augmented Generation
From Outliers to Topics in Language Models: Anticipating Trends in News Corpora
Taxonomy of Comprehensive Safety for Clinical Agents
Fuzzy Reasoning Chain (FRC): An Innovative Reasoning Framework from Fuzziness to Clarity
RedNote-Vibe: A Dataset for Capturing Temporal Dynamics of AI-Generated Text in Social Media
The QCET Taxonomy of Standard Quality Criterion Names and Definitions for the Evaluation of NLP Systems
Fine-tuning Done Right in Model Editing
COSPADI: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning
Multilingual Dialogue Generation and Localization with Dialogue Act Scripting
S2J: Bridging the Gap Between Solving and Judging Ability in Generative Reward Models
Think Right, Not More: Test-Time Scaling for Numerical Claim Verification
Universal Legal Article Prediction via Tight Collaboration between Supervised Classification Model and LLM
Multilingual Vision-Language Models, A Survey
FoodSEM: Large Language Model Specialized in Food Named-Entity Linking
R-Capsule: Compressing High-Level Plans for Efficient Large Language Model Reasoning
Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding
NFDI4DS Shared Tasks for Scholarly Document Processing
From Long to Lean: Performance-aware and Adaptive Chain-of-Thought Compression via Multi-round Refinement
Mixture of Detectors: A Compact View of Machine-Generated Text Detection
Context Parametrization with Compositional Adapters
When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance
The Outputs of Large Language Models are Meaningless
Question-Driven Analysis and Synthesis: Building Interpretable Thematic Trees with LLMs for Text Clustering and Controllable Generation
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs
A Novel Differential Feature Learning for Effective Hallucination Detection and Classification
Intuition emerges in Maximum Caliber models at criticality
Mitigating Exponential Mixed Frequency Growth through Frequency Selection
Partially Functional Dynamic Backdoor Diffusion-based Causal Model
EigenBench: A Comparative Behavioral Measure of Value Alignment
Decentralized Stochastic Nonconvex Optimization under the Relaxed Smoothness
Scaling to Multimodal and Multichannel Heart Sound Classification: Fine-Tuning Wav2Vec 2.0 with Synthetic and Augmented Biosignals
DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models
Audio Super-Resolution with Latent Bridge Models
Residual Off-Policy RL for Finetuning Behavior Cloning Policies
Cognitive Load Limits in Large Language Models: Benchmarking Multi-Hop Reasoning
Benchmarking LLMs in Web API Integration Tasks
Thinking Augmented Pre-training
IntSR: An Integrated Generative Framework for Search and Recommendation
Data-driven Neural Networks for Windkessel Parameter Calibration
Pre-Training Representations of Binary Code Using Contrastive Learning
Data-driven Piecewise Affine Decision Rules for Stochastic Programming with Covariate Information
QECO: A QoE-Oriented Computation Offloading Algorithm based on Deep Reinforcement Learning for Mobile Edge Computing
Diffence: Fencing Membership Privacy With Diffusion Models
DoDo-Code: an Efficient Levenshtein Distance Embedding-based Code for 4-ary IDS Channel
Online Resource Allocation with Average Budget Constraints
Discretization Error of Fourier Neural Operators
On the Within-class Variation Issue in Alzheimer's Disease Detection
Leveraging Model Guidance to Extract Training Data from Personalized Diffusion Models
A Survey on LLM-based Code Generation for Low-Resource and Domain-Specific Programming Languages
Stuffed Mamba: Oversized States Lead to the Inability to Forget
GraphSCENE: On-Demand Critical Scenario Generation for Autonomous Vehicles in Simulation
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources
Conditional Latent Space Molecular Scaffold Optimization for Accelerated Molecular Design
Effectively Leveraging Momentum Terms in Stochastic Line Search Frameworks for Fast Optimization of Finite-Sum Problems
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
Demystifying Domain-adaptive Post-training for Financial LLMs
IP$^{2}$-RSNN: Bi-level Intrinsic Plasticity Enables Learning-to-learn in Recurrent Spiking Neural Networks
Forecasting the future development in quality and value of professional football players
VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation
Surgical Vision World Model
Learning Personalized Driving Styles via Reinforcement Learning from Human Feedback
Ethical AI for Young Digital Citizens: A Call to Action on Privacy Governance
Detecting Scarce and Sparse Anomalous: Solving Dual Imbalance in Multi-Instance Learning
Do Data Valuations Make Good Data Prices?
Multi-Agent Reinforcement Learning for Greenhouse Gas Offset Credit Markets
Can Code Language Models Learn Clarification-Seeking Behaviors?
CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis
From Grunts to Lexicons: Emergent Language from Cooperative Foraging
Octic Vision Transformers: Quicker ViTs Through Equivariance
Distillation-Enabled Knowledge Alignment Protocol for Semantic Communication in AI Agent Networks
Transfer learning for multifidelity simulation-based inference in cosmology
Mobi-$\pi$: Mobilizing Your Robot Learning Policy
Two failure modes of deep transformers and how to avoid them: a unified theory of signal propagation at initialisation
Scalable In-Context Q-Learning
Dual Branch VideoMamba with Gated Class Token Fusion for Violence Detection
SNR and Resource Adaptive Deep JSCC for Distributed IoT Image Classification
Security Degradation in Iterative AI Code Generation -- A Systematic Analysis of the Paradox
HEIST: A Graph Foundation Model for Spatial Transcriptomics and Proteomics Data
MUCAR: Benchmarking Multilingual Cross-Modal Ambiguity Resolution for Multimodal Large Language Models
Learn Globally, Speak Locally: Bridging the Gaps in Multilingual Reasoning
A Unified Empirical Risk Minimization Framework for Flexible N-Tuples Weak Supervision
Tricks and Plug-ins for Gradient Boosting in Image Classification
Hallucination to Truth: A Review of Fact-Checking and Factuality Evaluation in Large Language Models
Beyond the Proxy: Trajectory-Distilled Guidance for Offline GFlowNet Training
Practical estimation of the optimal classification error with soft labels and calibration
Spectral-inspired Operator Learning with Limited Data and Unknown Physics
SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training
Model-Preserving Adaptive Rounding
Domain-Aware Tensor Network Structure Search
Mamba Integrated with Physics Principles Masters Long-term Chaotic System Forecasting
Intercept Cancer: Cancer Pre-Screening with Large Scale Healthcare Foundation Models
RsGCN: Subgraph-Based Rescaling Enhances Generalization of GCNs for Solving Traveling Salesman Problems
WeightLoRA: Keep Only Necessary Adapters
Sign-SGD is the Golden Gate between Multi-Node to Single-Node Learning: Significant Boost via Parameter-Free Optimization
OrthoGrad Improves Neural Calibration
Spectral Graph Neural Networks are Incomplete on Graphs with a Simple Spectrum
AMPED: Adaptive Multi-objective Projection for balancing Exploration and skill Diversification
Caterpillar GNN: Replacing Message Passing with Efficient Aggregation
Aircraft Trajectory Dataset Augmentation in Latent Space
Exploiting Block Coordinate Descent for Cost-Effective LLM Model Training
RL-Obfuscation: Can Language Models Learn to Evade Latent-Space Monitors?
SecP-Tuning: Efficient Privacy-Preserving Prompt Tuning for Large Language Models via MPC
Latent Concept Disentanglement in Transformer-based Language Models
Online Multi-Agent Control with Adversarial Disturbances
On the Necessity of Output Distribution Reweighting for Effective Class Unlearning
Beyond Simple Graphs: Neural Multi-Objective Routing on Multigraphs
Whom to Trust? Adaptive Collaboration in Personalized Federated Learning
Neural-Network solver of ideal MHD equilibria
Lightweight MSA Design Advances Protein Folding From Evolutionary Embeddings
Relative Entropy Pathwise Policy Optimization
The Invisible Leash: Why RLVR May or May Not Escape Its Origin
SpectrumWorld: Artificial Intelligence Foundation for Spectroscopy
Tricks and Plug-ins for Gradient Boosting with Transformers
Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning
ERIS: An Energy-Guided Feature Disentanglement Framework for Out-of-Distribution Time Series Classification
Multi-Channel Differential Transformer for Cross-Domain Sleep Stage Classification with Heterogeneous EEG and EOG
Sparse but Wrong: Incorrect L0 Leads to Incorrect Features in Sparse Autoencoders
In-Context Algorithm Emulation in Fixed-Weight Transformers
Scalable Option Learning in High-Throughput Environments
Challenges in Non-Polymeric Crystal Structure Prediction: Why a Geometric, Permutation-Invariant Loss is Needed
AI for Scientific Discovery is a Social Problem
Towards a Physics Foundation Model
A Variational Framework for Residual-Based Adaptivity in Neural PDE Solvers and Operator Learning
GPU Temperature Simulation-Based Testing for In-Vehicle Deep Learning Frameworks
TimeMosaic: Temporal Heterogeneity Guided Time Series Forecasting via Adaptive Granularity Patch and Segment-wise Decoding
The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization
Multi-View Hypercomplex Learning for Breast Cancer Screening
Efficient Epistemic Uncertainty Estimation in Regression Ensemble Models Using Pairwise-Distance Estimators
VDFD: Multi-Agent Value Decomposition Framework with Disentangled World Model
Fast Partition-Based Cross-Validation With Centering and Scaling for $\mathbf{X}^\mathbf{T}\mathbf{X}$ and $\mathbf{X}^\mathbf{T}\mathbf{Y}$
Metric-Guided Conformal Bounds for Probabilistic Image Reconstruction
A Notion of Uniqueness for the Adversarial Bayes Classifier
Machine Learning-Assisted Sustainable Remanufacturing, Reusing and Recycling for Lithium-ion Batteries
Diverse Subset Selection via Norm-Based Sampling and Orthogonality
A Critical Look At Tokenwise Reward-Guided Text Generation
VeriFlow: Modeling Distributions for Neural Network Verification
Large Language Models versus Classical Machine Learning: Performance in COVID-19 Mortality Prediction Using High-Dimensional Tabular Data
Closed-Form Interpretation of Neural Network Latent Spaces with Symbolic Gradients
DOTA: Distributional Test-Time Adaptation of Vision-Language Models
Degree-Conscious Spiking Graph for Cross-Domain Adaptation
Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models
Measurability in the Fundamental Theorem of Statistical Learning
Capacity-Aware Planning and Scheduling in Budget-Constrained Multi-Agent MDPs: A Meta-RL Approach
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Machine Unlearning for Speaker-Agnostic Detection of Gender-Based Violence Condition in Speech
How Strategic Agents Respond: Comparing Analytical Models with LLM-Generated Responses in Strategic Classification
Avoiding $\mathbf{exp(R_{max})}$ scaling in RLHF through Preference-based Exploration
Efficient Prior Selection in Gaussian Process Bandits with Thompson Sampling
Process Reinforcement through Implicit Rewards
GNN-DT: Graph Neural Network Enhanced Decision Transformer for Efficient Optimization in Dynamic Environments
ReciNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction
Mechanisms of Projective Composition of Diffusion Models
LDC-MTL: Balancing Multi-Task Learning through Scalable Loss Discrepancy Control
Beyond Shallow Behavior: Task-Efficient Value-Based Multi-Task Offline MARL via Skill Discovery
Fused Partial Gromov-Wasserstein for Structured Objects
Tokenizing Single-Channel EEG with Time-Frequency Motif Learning
Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models
Multi-View Causal Discovery without Non-Gaussianity: Identifiability and Algorithms
BPINN-EM-Post: Bayesian Physics-Informed Neural Network based Stochastic Electromigration Damage Analysis in the Post-void Phase
MNT-TNN: Spatiotemporal Traffic Data Imputation via Compact Multimode Nonlinear Transform-based Tensor Nuclear Norm
Can Diffusion Models Disentangle? A Theoretical Perspective
CSF: Fixed-outline Floorplanning Based on the Conjugate Subgradient Algorithm Assisted by Q-Learning
Recursive Training Loops in LLMs: How training data properties modulate distribution shift in generated data?
Sparsity Forcing: Reinforcing Token Sparsity of MLLMs
Geometry aware inference of steady state PDEs using Equivariant Neural Fields representations
Trial and Trust: Addressing Byzantine Attacks with Comprehensive Defense Strategy
Exploiting the Asymmetric Uncertainty Structure of Pre-trained VLMs on the Unit Hypersphere
Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation
Quantization Meets Reasoning: Exploring and Mitigating Degradation of Low-Bit LLMs in Mathematical Reasoning
TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning
Feature Hedging: Correlated Features Break Narrow Sparse Autoencoders
Latent Veracity Inference for Identifying Errors in Stepwise Reasoning
Structured Relational Representations
Implicit bias produces neural scaling laws in learning curves, from perceptrons to deep networks
Forward-only Diffusion Probabilistic Models
Learning Flexible Forward Trajectories for Masked Molecular Diffusion
The Polar Express: Optimal Matrix Sign Methods and Their Application to the Muon Algorithm
SPAR: Self-supervised Placement-Aware Representation Learning for Distributed Sensing
Bottlenecked Transformers: Periodic KV Cache Consolidation for Generalised Reasoning
Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration
FFT-based Dynamic Subspace Selection for Low-Rank Adaptive Optimization of Large Language Models
Can LLMs Alleviate Catastrophic Forgetting in Graph Continual Learning? A Systematic Study
HD-PiSSA: High-Rank Distributed Orthogonal Adaptation
ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior
Scale-Wise VAR is Secretly Discrete Diffusion
See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation
Adaptive Policy Learning to Additional Tasks
A Random Matrix Perspective of Echo State Networks: From Precise Bias–Variance Characterization to Optimal Regularization
Exploring the Early Universe with Deep Learning
Comparative Analysis of GAN and Diffusion for MRI-to-CT translation
Direct Bias-Correction Term Estimation for Propensity Scores and Average Treatment Effect Estimation
Incorporating priors in learning: a random matrix study under a teacher-student framework
Multi-Agent Path Finding via Offline RL and LLM Collaboration
DragGANSpace: Latent Space Exploration and Control for GANs
COMPASS: Robust Feature Conformal Prediction for Medical Segmentation Metrics
Clinical Uncertainty Impacts Machine Learning Evaluations
Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models
HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models
NIFTY: a Non-Local Image Flow Matching for Texture Synthesis
Preventing Model Collapse Under Overparametrization: Optimal Mixing Ratios for Interpolation Learning and Ridge Regression
Multi-channel convolutional neural quantum embedding
Multidimensional Uncertainty Quantification via Optimal Transport
Integrating Background Knowledge in Medical Semantic Segmentation with Logic Tensor Networks
NeuroScalar: A Deep Learning Framework for Fast, Accurate, and In-the-Wild Cycle-Level Performance Prediction
Learning to Ball: Composing Policies for Long-Horizon Basketball Moves
Universal Inverse Distillation for Matching Models with Real-Data Supervision (No GANs)
CausalKANs: interpretable treatment effect estimation with Kolmogorov-Arnold networks
Estimating the Empowerment of Language Model Agents
TrueGradeAI: Retrieval-Augmented and Bias-Resistant AI for Transparent and Explainable Digital Assessments
REMA: A Unified Reasoning Manifold Framework for Interpreting Large Language Model
Smoothing-Based Conformal Prediction for Balancing Efficiency and Interpretability
Debiased Front-Door Learners for Heterogeneous Effects
Metrics for Parametric Families of Networks
ConQuER: Modular Architectures for Control and Bias Mitigation in IQP Quantum Generative Models
Linear Causal Representation Learning by Topological Ordering, Pruning, and Disentanglement
Nearly Tight Regret Bounds for Profit Maximization in Bilateral Trade
Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time
Effective Policy Learning for Multi-Agent Online Coordination Beyond Submodular Objectives
Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective
SPARK: Synergistic Policy And Reward Co-Evolving Framework
Towards Efficient Online Exploration for Reinforcement Learning with Human Feedback
Training-Free Synthetic Data Generation with Dual IP-Adapter Guidance
New Algorithmic Directions in Optimal Transport and Applications for Product Spaces
AutoClimDS: Climate Data Science Agentic AI -- A Knowledge Graph is All You Need
No Alignment Needed for Generation: Learning Linearly Separable Representations in Diffusion Models
EEG-Based Consumer Behaviour Prediction: An Exploration from Classical Machine Learning to Graph Neural Networks
General Pruning Criteria for Fast SBL
IndiSeek learns information-guided disentangled representations
What Happens Next? Anticipating Future Motion by Generating Point Trajectories
Automated and Interpretable Survival Analysis from Multimodal Data
VLCE: A Knowledge-Enhanced Framework for Image Description in Disaster Assessment
Effective continuous equations for adaptive SGD: a stochastic analysis view
Guiding Audio Editing with Audio Language Model
MobiLLM: An Agentic AI Framework for Closed-Loop Threat Mitigation in 6G Open RANs
Automated Machine Learning Pipeline for Training and Analysis Using Large Language Models
A regret minimization approach to fixed-point iterations
Automating Sensor Characterization with Bayesian Optimization
Generating Stable Placements via Physics-guided Diffusion Models
MORPH: Shape-agnostic PDE Foundation Models
HuLA: Prosody-Aware Anti-Spoofing with Multi-Task Learning for Expressive and Emotional Synthetic Speech
SADA: Safe and Adaptive Inference with Multiple Black-Box Predictions
Multi-modal Bayesian Neural Network Surrogates with Conjugate Last-Layer Estimation
Align2Speak: Improving TTS for Low Resource Languages via ASR-Guided Online Preference Optimization
Noise-to-Notes: Diffusion-based Generation and Refinement for Automatic Drum Transcription
Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts
Reinforcement Learning Based Traffic Signal Design to Minimize Queue Lengths
CubistMerge: Spatial-Preserving Token Merging For Diverse ViT Backbones
Lifelong Learning with Behavior Consolidation for Vehicle Routing
Causal-EPIG: A Prediction-Oriented Active Learning Framework for CATE Estimation
No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping
Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts
Error Analysis of Discrete Flow with Generator Matching
Sequential 1-bit Mean Estimation with Near-Optimal Sample Complexity
Outlier Detection in Plantar Pressure: Human-Centered Comparison of Statistical Parametric Mapping and Explainable Machine Learning
Learnable Conformal Prediction with Context-Aware Nonconformity Functions for Robotic Planning and Perception
FlowDrive: moderated flow matching with data balancing for trajectory planning
ERGO: Efficient High-Resolution Visual Understanding for Vision-Language Models
Bilinear relational structure fixes reversal curse and enables consistent model editing
A Nonparametric Discrete Hawkes Model with a Collapsed Gaussian-Process Prior
GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments
From Parameters to Behavior: Unsupervised Compression of the Policy Space
Machine learning approaches to seismic event classification in the Ostrava region
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning
The Lie of the Average: How Class Incremental Learning Evaluation Deceives You?
Transport Based Mean Flows for Generative Modeling
Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning
Quantile Advantage Estimation for Entropy-Safe Reasoning
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning
A Theoretical Analysis of Discrete Flow Matching Generative Models
Learning Admissible Heuristics for A*: Theory and Practice
Assessment of deep learning models integrated with weather and environmental variables for wildfire spread prediction and a case study of the 2023 Maui fires
Interpretable Spectral Features Predict Conductivity in Self-Driving Doped Conjugated Polymer Labs
Seismic Velocity Inversion from Multi-Source Shot Gathers Using Deep Segmentation Networks: Benchmarking U-Net Variants and SeismoLabV3+
Cycle is All You Need: More Is Different
From Embeddings to Equations: Genetic-Programming Surrogates for Interpretable Transformer Classification
SGNNBench: A Holistic Evaluation of Spiking Graph Neural Network on Large-scale Graph
Towards mitigating information leakage when evaluating safety monitors
Spiking Neural Networks for Mental Workload Classification with a Multimodal Approach
Accurate typhoon intensity forecasts using a non-iterative spatiotemporal transformer model
Improving Autism Detection with Multimodal Behavioral Analysis
Data-driven approach to the design of complexing agents for trivalent transuranium elements
ReGeS: Reciprocal Retrieval-Generation Synergy for Conversational Recommender Systems
Coreset selection based on Intra-class diversity
The LongiMam model for improved breast cancer risk prediction using longitudinal mammograms
Debugging Concept Bottleneck Models through Removal and Retraining
Do Sparse Subnetworks Exhibit Cognitively Aligned Attention? Effects of Pruning on Saliency Map Fidelity, Sparsity, and Concept Coherence
mmHSense: Multi-Modal and Distributed mmWave ISAC Datasets for Human Sensing
Downscaling climate projections to 1 km with single-image super resolution
Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models
DyME: Dynamic Multi-Concept Erasure in Diffusion Models with Bi-Level Orthogonal LoRA Adaptation
Foundation models for high-energy physics
A State-of-the-Art SQL Reasoning Model using RLVR
Enhanced Generative Machine Listener
Learning to Reason with Mixture of Tokens
Context-Aware Hybrid Routing in Bluetooth Mesh Networks Using Multi-Model Machine Learning and AODV Fallback
Functional Encryption in Secure Neural Network Training: Data Leakage and Practical Mitigations
ASSESS: A Semantic and Structural Evaluation Framework for Statement Similarity
Wavelet-Induced Rotary Encodings: RoPE Meets Graphs
Erase or Hide? Suppressing Spurious Unlearning Neurons for Robust Unlearning
Towards a more realistic evaluation of machine learning models for bearing fault diagnosis
Fine-Grained Uncertainty Decomposition in Large Language Models: A Spectral Approach
Unlocking the Power of Mixture-of-Experts for Task-Aware Time Series Analytics
Conditional Denoising Diffusion Autoencoders for Wireless Semantic Communications
A Multi-Level Framework for Multi-Objective Hypergraph Partitioning: Combining Minimum Spanning Tree and Proximal Gradient
Aurora: Towards Universal Generative Multimodal Time Series Forecasting
HEAPr: Hessian-based Efficient Atomic Expert Pruning in Output Space
SoDaDE: Solvent Data-Driven Embeddings with Small Transformer Models
Adaptive Policy Backbone via Shared Network
Progressive Weight Loading: Accelerating Initial Inference and Gradually Boosting Performance on Resource-Constrained Environments
Distributed Associative Memory via Online Convex Optimization
Spectral Collapse Drives Loss of Plasticity in Deep Continual Learning
SurvDiff: A Diffusion Model for Generating Synthetic Data in Survival Analysis
Context and Diversity Matter: The Emergence of In-Context Learning in World Models
Stochastic activations
Neural Feature Geometry Evolves as Discrete Ricci Flow
Investigating Faithfulness in Large Audio Language Models
Role-Aware Multi-modal federated learning system for detecting phishing webpages
Enhancing Credit Risk Prediction: A Meta-Learning Framework Integrating Baseline Models, LASSO, and ECOC for Superior Accuracy
(Sometimes) Less is More: Mitigating the Complexity of Rule-based Representation for Interpretable Classification
SpinGPT: A Large-Language-Model Approach to Playing Poker Correctly
Improving accuracy in short mortality rate series: Exploring Multi-step Forecasting Approaches in Hybrid Systems
ReLAM: Learning Anticipation Model for Rewarding Visual Robotic Manipulation
MoveFM-R: Advancing Mobility Foundation Models via Language-driven Semantic Reasoning
Fast-Forward Lattice Boltzmann: Learning Kinetic Behaviour with Physics-Informed Neural Operators
One Prompt Fits All: Universal Graph Adaptation for Pretrained Models
Partial Parameter Updates for Efficient Distributed Training
Learning from Delayed Feedback in Games via Extra Prediction
The Flood Complex: Large-Scale Persistent Homology on Millions of Points
Global Convergence in Neural ODEs: Impact of Activation Functions
Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers
Overclocking Electrostatic Generative Models
Physics-informed GNN for medium-high voltage AC power flow with edge-aware attention and line search correction operator
Nonlinear Optimization with GPU-Accelerated Neural Network Constraints
IIET: Efficient Numerical Transformer via Implicit Iterative Euler Method
Learning the Neighborhood: Contrast-Free Multimodal Self-Supervised Molecular Graph Pretraining
Bayesian Transfer Operators in Reproducing Kernel Hilbert Spaces
OFMU: Optimization-Driven Framework for Machine Unlearning
A Machine Learning Pipeline for Multiple Sclerosis Biomarker Discovery: Comparing explainable AI and Traditional Statistical Approaches
Dual Optimistic Ascent (PI Control) is the Augmented Lagrangian Method in Disguise
Adaptive Dual-Mode Distillation with Incentive Schemes for Scalable, Heterogeneous Federated Learning on Non-IID Data
JointDiff: Bridging Continuous and Discrete in Multi-Agent Trajectory Generation
ECHO: Toward Contextual Seq2Seq Paradigms in Large EEG Models
Learning to Price Bundles: A GCN Approach for Mixed Bundling
Activation Function Design Sustains Plasticity in Continual Learning
Preference-Guided Learning for Sparse-Reward Multi-Agent Reinforcement Learning
On the Complexity Theory of Masked Discrete Diffusion: From $\mathrm{poly}(1/\epsilon)$ to Nearly $\epsilon$-Free
Beyond Johnson-Lindenstrauss: Uniform Bounds for Sketched Bilinear Forms
Graph of Agents: Principled Long Context Modeling by Emergent Multi-Agent Collaboration
MolSpectLLM: A Molecular Foundation Model Bridging Spectroscopy, Molecule Elucidation, and 3D Structure Generation
Beyond RAG vs. Long-Context: Learning Distraction-Aware Retrieval for Efficient Knowledge Grounding
Abductive Logical Rule Induction by Bridging Inductive Logic Programming and Multimodal Large Language Models
Zubov-Net: Adaptive Stability for Neural ODEs Reconciling Accuracy with Robustness
Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards
Why High-rank Neural Networks Generalize?: An Algebraic Framework with RKHSs
Closing the Oracle Gap: Increment Vector Transformation for Class Incremental Learning
Discrete Guidance Matching: Exact Guidance for Discrete Flow Matching
Multiplicative-Additive Constrained Models: Toward Joint Visualization of Interactive and Independent Effects
Generation Properties of Stochastic Interpolation under Finite Training Set
Extracting Actionable Insights from Building Energy Data using Vision LLMs on Wavelet and 3D Recurrence Representations
Statistical Advantage of Softmax Attention: Insights from Single-Location Regression
Structural Information-based Hierarchical Diffusion for Offline Reinforcement Learning
Active Attacks: Red-teaming LLMs via Adaptive Environments
Think Smart, Not Hard: Difficulty Adaptive Reasoning for Large Audio Language Models
GRAM-TDI: adaptive multimodal representation learning for drug target interaction prediction
Stage-wise Dynamics of Classifier-Free Guidance in Diffusion Models
Goal-Guided Efficient Exploration via Large Language Model in Reinforcement Learning
Concept-SAE: Active Causal Probing of Visual Model Behavior
AEGIS: Authentic Edge Growth In Sparsity for Link Prediction in Edge-Sparse Bipartite Knowledge Graphs
Task-Adaptive Parameter-Efficient Fine-Tuning for Weather Foundation Models
Teaching Transformers to Solve Combinatorial Problems through Efficient Trial & Error
MCGM: Multi-stage Clustered Global Modeling for Long-range Interactions in Molecules
OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features
Latent Diffusion: Multi-Dimension Stable Diffusion Latent Space Explorer
Convexity-Driven Projection for Point Cloud Dimensionality Reduction
MO-GRPO: Mitigating Reward Hacking of Group Relative Policy Optimization on Multi-Objective Problems
BrainPro: Towards Large-scale Brain State-aware EEG Representation Learning
Enriching Knowledge Distillation with Intra-Class Contrastive Learning
Towards Understanding Feature Learning in Parameter Transfer
The Rogue Scalpel: Activation Steering Compromises LLM Safety
Non-Linear Trajectory Modeling for Multi-Step Gradient Inversion Attacks in Federated Learning
SHAKE-GNN: Scalable Hierarchical Kirchhoff-Forest Graph Neural Network
Reinforcement Learning for Durable Algorithmic Recourse
Modeling Psychological Profiles in Volleyball via Mixed-Type Bayesian Networks
Countering adversarial evasion in regression analysis
Learning More with Less: A Dynamic Dual-Level Down-Sampling Framework for Efficient Policy Optimization
Mind the Missing: Variable-Aware Representation Learning for Irregular EHR Time Series using Large Language Models
Slicing Wasserstein Over Wasserstein Via Functional Optimal Transport
Pushing Toward the Simplex Vertices: A Simple Remedy for Code Collapse in Smoothed Vector Quantization
Lightweight error mitigation strategies for post-training N:M activation sparsity in LLMs
Efficiency Boost in Decentralized Optimization: Reimagining Neighborhood Aggregation with Minimal Overhead
Learning Equivariant Functions via Quadratic Forms
Mechanistic Independence: A Principle for Identifiable Disentangled Representations
Kernel Regression of Multi-Way Data via Tensor Trains with Hadamard Overparametrization: The Dynamic Graph Flow Case
Reversible GNS for Dissipative Fluids with Consistent Bidirectional Dynamics
A Law of Data Reconstruction for Random Features (and Beyond)
Automatic Discovery of One Parameter Subgroups of $SO(n)$
Fairness-Aware Reinforcement Learning (FAReL): A Framework for Transparent and Balanced Sequential Decision-Making
Limitations on Safe, Trusted, Artificial General Intelligence
DriftLite: Lightweight Drift Control for Inference-Time Scaling of Diffusion Models
Differentiable Structure Learning for General Binary Data
RED-DiffEq: Regularization by denoising diffusion models for solving inverse PDE problems with application to full waveform inversion
A Systematic Review of Conformal Inference Procedures for Treatment Effect Estimation: Methods and Challenges
MMPlanner: Zero-Shot Multimodal Procedural Planning with Chain-of-Thought Object State Reasoning
Logic of Hypotheses: from Zero to Full Knowledge in Neurosymbolic Integration
DIM: Enforcing Domain-Informed Monotonicity in Deep Neural Networks
Neuroprobe: Evaluating Intracranial Brain Responses to Naturalistic Stimuli
SlotFM: A Motion Foundation Model with Slot Attention for Diverse Downstream Tasks
Scalable Second-order Riemannian Optimization for $K$-means Clustering
Prophecy: Inferring Formal Properties from Neuron Activations
SpecMER: Fast Protein Generation with K-mer Guided Speculative Decoding
Wav2Arrest 2.0: Long-Horizon Cardiac Arrest Prediction with Time-to-Event Modeling, Identity-Invariance, and Pseudo-Lab Alignment
Exact Subgraph Isomorphism Network for Predictive Graph Mining
Downscaling human mobility data based on demographic socioeconomic and commuting characteristics using interpretable machine learning methods
PQFed: A Privacy-Preserving Quality-Controlled Federated Learning Framework
A Unifying Framework for Parallelizing Sequential Models with Linear Dynamical Systems
Information-Theoretic Bayesian Optimization for Bilevel Optimization Problems
Uncovering Alzheimer's Disease Progression via SDE-based Spatio-Temporal Graph Deep Learning on Longitudinal Brain Networks
POLO: Preference-Guided Multi-Turn Reinforcement Learning for Lead Optimization
Brain PathoGraph Learning
HyperCore: Coreset Selection under Noise via Hypersphere Models
SubZeroCore: A Submodular Approach with Zero Training for Coreset Selection
Reparameterizing 4DVAR with neural fields
Machine Learning and AI Applied to fNIRS Data Reveals Novel Brain Activity Biomarkers in Stable Subclinical Multiple Sclerosis
Beyond Formula Complexity: Effective Information Criterion Improves Performance and Interpretability for Symbolic Regression
FastGRPO: Accelerating Policy Optimization via Concurrency-aware Speculative Decoding and Online Draft Learning
Exploring the Relationships Between Physiological Signals During Automated Fatigue Detection
ChaosNexus: A Foundation Model for Universal Chaotic System Forecasting with Multi-scale Representations
Scaling Laws for Neural Material Models
Sharpness-Aware Minimization Can Hallucinate Minimizers
High-Probability Analysis of Online and Federated Zero-Order Optimisation
Neural Operators for Mathematical Modeling of Transient Fluid Flow in Subsurface Reservoir Systems
GraphPFN: A Prior-Data Fitted Graph Foundation Model
SlimDiff: Training-Free, Activation-Guided Hands-free Slimming of Diffusion Models
Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training
Contrastive Mutual Information Learning: Toward Robust Representations without Positive-Pair Augmentations
DistillKac: Few-Step Image Generation via Damped Wave Equations
Uncertainty-Aware Knowledge Tracing Models
$\mathbf{Li_2}$: A Framework on Dynamics of Feature Emergence and Delayed Generalization
TRiCo: Triadic Game-Theoretic Co-Training for Robust Semi-Supervised Learning
Preemptive Detection and Steering of LLM Misalignment via Latent Reachability
Expert-guided Clinical Text Augmentation via Query-Based Model Collaboration
A circuit for predicting hierarchical structure in-context in Large Language Models
Evidence for Limited Metacognition in LLMs
Machine Learning. The Science of Selection under Uncertainty
Interpretable time series analysis with Gumbel dynamics
Leveraging Big Data Frameworks for Spam Detection in Amazon Reviews
GenUQ: Predictive Uncertainty Estimates via Generative Hyper-Networks
Task-Agnostic Federated Continual Learning via Replay-Free Gradient Projection
Causal Abstraction Inference under Lossy Representations
LANCE: Low Rank Activation Compression for Efficient On-Device Continual Learning
PreLoRA: Hybrid Pre-training of Vision Transformers with Full Training and Low-Rank Adapters
Shoot from the HIP: Hessian Interatomic Potentials without derivatives
Blockwise Hadamard high-Rank Adaptation for Parameter-Efficient LLM Fine-Tuning
Understanding and Enhancing Mask-Based Pretraining towards Universal Representations
Hallucination to Truth: A Review of Fact-Checking and Factuality Evaluation in Large Language Models
GTPO and GRPO-S: Token and Sequence-Level Reward Shaping with Policy Entropy
StreetReaderAI: Making Street View Accessible Using Context-Aware Multimodal AI
Scalable Option Learning in High-Throughput Environments
JudgeAgent: Knowledge-wise and Dynamic LLM Evaluation with Agent-as-Interviewer
Chain or tree? Re-evaluating complex reasoning from the perspective of a matrix of thought
Positional Encoding via Token-Aware Phase Attention
Justice in Judgment: Unveiling (Hidden) Bias in LLM-assisted Peer Reviews
Towards a Physics Foundation Model
Constructive Conflict-Driven Multi-Agent Reinforcement Learning for Strategic Diversity
Comparing RAG and GraphRAG for Page-Level Retrieval Question Answering on Math Textbook
Dendritic Resonate-and-Fire Neuron for Effective and Efficient Long Sequence Modeling
Automated Facility Enumeration for Building Compliance Checking using Door Detection and Large Language Models
Diffusion-Augmented Contrastive Learning: A Noise-Robust Encoder for Biosignal Representations
AnchDrive: Bootstrapping Diffusion Policies with Hybrid Trajectory Anchors for End-to-End Driving
Discovering and Analyzing Stochastic Processes to Reduce Waste in Food Retail
Impact of Loss Weight and Model Complexity on Physics-Informed Neural Networks for Computational Fluid Dynamics
LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet?
Object Identification Under Known Dynamics: A PIRNN Approach for UAV Classification
Null-Space Filtering for Data-Free Continual Model Merging: Preserving Transparency, Promoting Fidelity
Forecasting Seismic Waveforms: A Deep Learning Approach for Einstein Telescope
Talking Trees: Reasoning-Assisted Induction of Decision Trees for Tabular Data
Score-based Idempotent Distillation of Diffusion Models
Are Hallucinations Bad Estimations?
d2: Improved Techniques for Training Reasoning Diffusion Language Models
VISION: Prompting Ocean Vertical Velocity Reconstruction from Incomplete Observations
Filtering with Confidence: When Data Augmentation Meets Conformal Prediction
SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training
DORAEMON: Decentralized Ontology-aware Reliable Agent with Enhanced Memory Oriented Navigation
Model-Preserving Adaptive Rounding
Mamba Integrated with Physics Principles Masters Long-term Chaotic System Forecasting
InfiMed: Low-Resource Medical MLLMs with Advancing Understanding and Reasoning
Probing Neural Topology of Large Language Models
Physics-Guided Motion Loss for Video Generation Model
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech
Dual Branch VideoMamba with Gated Class Token Fusion for Violence Detection
Resisting Contextual Interference in RAG via Parametric-Knowledge Reinforcement
DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models
AMPED: Adaptive Multi-objective Projection for balancing Exploration and skill Diversification
Position: Simulating Society Requires Simulating Thought
VidBridge-R1: Bridging QA and Captioning for RL-based Video Understanding Models with Intermediate Proxy Tasks
Think With Videos For Agentic Long-Video Understanding
Security Degradation in Iterative AI Code Generation -- A Systematic Analysis of the Paradox
Exploiting Block Coordinate Descent for Cost-Effective LLM Model Training
Personalized LLM Decoding via Contrasting Personal Preference
Latent Concept Disentanglement in Transformer-based Language Models
On the Necessity of Output Distribution Reweighting for Effective Class Unlearning
Beyond Simple Graphs: Neural Multi-Objective Routing on Multigraphs
Why Reinforcement Fine-Tuning Enables MLLMs Preserve Prior Knowledge Better: A Data Perspective
Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
Learn Globally, Speak Locally: Bridging the Gaps in Multilingual Reasoning
Lightweight MSA Design Advances Protein Folding From Evolutionary Embeddings
KV Cache Steering for Controlling Frozen LLMs
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities
LoopServe: An Adaptive Dual-phase LLM Inference Acceleration System for Multi-Turn Dialogues
The Invisible Leash: Why RLVR May or May Not Escape Its Origin
R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning
DAMR: Efficient and Adaptive Context-Aware Knowledge Graph Question Answering with LLM-Guided MCTS
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Geometry aware inference of steady state PDEs using Equivariant Neural Fields representations
CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis
Neural Orchestration for Multi-Agent Systems: A Deep Learning Framework for Optimal Agent Selection in Multi-Domain Task Environments
Formalising Human-in-the-Loop: Computational Reductions, Failure Modes, and Legal-Moral Responsibility
Follow the Path: Reasoning over Knowledge Graph Paths to Improve LLM Factuality
SuperCoder: Assembly Program Superoptimization with Large Language Models
HiddenBench: Assessing Collective Reasoning in Multi-Agent LLMs via Hidden Profile Tasks
Quantization Meets Reasoning: Exploring and Mitigating Degradation of Low-Bit LLMs in Mathematical Reasoning
TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning
ZeroTuning: Unlocking the Initial Token's Power to Enhance Large Language Models Without Training
Feature Hedging: Correlated Features Break Narrow Sparse Autoencoders
Latent Veracity Inference for Identifying Errors in Stepwise Reasoning
Structured Relational Representations
Shadow-FT: Tuning Instruct Model via Training on Paired Base Model
Learning Hierarchical Domain Models Through Environment-Grounded Interaction
VocalAgent: Large Language Models for Vocal Health Diagnostics with Safety-Aware Evaluation
UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Language Models
Intentional Gesture: Deliver Your Intentions with Gestures for Speech
Octic Vision Transformers: Quicker ViTs Through Equivariance
UniErase: Towards Balanced and Precise Unlearning in Language Models
Attributing Response to Context: A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation
Beyond Static Testbeds: An Interaction-Centric Agent Simulation Platform for Dynamic Recommender Systems
Learning Flexible Forward Trajectories for Masked Molecular Diffusion
Unlearning Isn't Deletion: Investigating Reversibility of Machine Unlearning in LLMs
The Polar Express: Optimal Matrix Sign Methods and Their Application to the Muon Algorithm
Bottlenecked Transformers: Periodic KV Cache Consolidation for Generalised Reasoning
BP-Seg: A graphical model approach to unsupervised and non-contiguous text segmentation using belief propagation
From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
FFT-based Dynamic Subspace Selection for Low-Rank Adaptive Optimization of Large Language Models
Can LLMs Alleviate Catastrophic Forgetting in Graph Continual Learning? A Systematic Study
HD-PiSSA: High-Rank Distributed Orthogonal Adaptation
Prompting is not Enough: Exploring Knowledge Integration and Controllable Generation on Large Language Models
Beyond the Proxy: Trajectory-Distilled Guidance for Offline GFlowNet Training
BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases
Spectral-inspired Operator Learning with Limited Data and Unknown Physics
Large Language Models versus Classical Machine Learning: Performance in COVID-19 Mortality Prediction Using High-Dimensional Tabular Data
Closed-Form Interpretation of Neural Network Latent Spaces with Symbolic Gradients
On the Within-class Variation Issue in Alzheimer's Disease Detection
DOTA: Distributional Test-Time Adaptation of Vision-Language Models
Leveraging Model Guidance to Extract Training Data from Personalized Diffusion Models
Degree-Conscious Spiking Graph for Cross-Domain Adaptation
Stuffed Mamba: Oversized States Lead to the Inability to Forget
Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning
Diffusion Curriculum: Synthetic-to-Real Data Curriculum via Image-Guided Diffusion
Capacity-Aware Planning and Scheduling in Budget-Constrained Multi-Agent MDPs: A Meta-RL Approach
Large Pre-Training Datasets Don't Always Guarantee Robustness after Fine-Tuning
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Conditional Latent Space Molecular Scaffold Optimization for Accelerated Molecular Design
Can LLMs be Good Graph Judge for Knowledge Graph Construction?
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
Demystifying Domain-adaptive Post-training for Financial LLMs
How Strategic Agents Respond: Comparing Analytical Models with LLM-Generated Responses in Strategic Classification
Avoiding $\mathbf{exp(R_{max})}$ scaling in RLHF through Preference-based Exploration
Process Reinforcement through Implicit Rewards
VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation
Beyond Shallow Behavior: Task-Efficient Value-Based Multi-Task Offline MARL via Skill Discovery
Tokenizing Single-Channel EEG with Time-Frequency Motif Learning
RuCCoD: Towards Automated ICD Coding in Russian
How LLMs Fail to Support Fact-Checking
Adaptively profiling models with task elicitation
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
Cost-Optimal Grouped-Query Attention for Long-Context Modeling
Retrieval-Augmented Generation with Hierarchical Knowledge
Ethical AI for Young Digital Citizens: A Call to Action on Privacy Governance
Detecting Scarce and Sparse Anomalous: Solving Dual Imbalance in Multi-Instance Learning
Can Diffusion Models Disentangle? A Theoretical Perspective
Recursive Training Loops in LLMs: How training data properties modulate distribution shift in generated data?
Prima.cpp: Fast 30-70B LLM Inference on Heterogeneous and Low-Resource Home Clusters
CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
Toward a Physics of Deep Learning and Brains
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing
See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation
A critical review of methods and challenges in large language models
Attributing Responsibility in AI-Induced Incidents: A Computational Reflective Equilibrium Framework for Accountability
Development and Validation of a Large Language Model for Generating Fully-Structured Radiology Reports
Grounding Multimodal LLMs to Embodied Agents that Ask for Help with Reinforcement Learning
A Domain-Agnostic Scalable AI Safety Ensuring Framework
Reasoning BO: Enhancing Bayesian Optimization with Long-Context Reasoning Power of LLMs
From Grunts to Lexicons: Emergent Language from Cooperative Foraging
The First Impression Problem: Internal Bias Triggers Overthinking in Reasoning Models
XBOUND: Exploring Capability Boundaries of Device-Control Agents at the State Level
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
Scalable In-Context Q-Learning
Mitigating Hallucination Through Theory-Consistent Symmetric Multimodal Preference Optimization
AgentOrchestra: Orchestrating Hierarchical Multi-Agent Intelligence with the Tool-Environment-Agent (TEA) Protocol
LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning
From Roots to Rewards: Dynamic Tree Reasoning with Reinforcement Learning
MMSearch-Plus: Benchmarking Provenance-Aware Search for Multimodal Browsing Agents
Towards Agentic OS: An LLM Agent Framework for Linux Schedulers
EigenBench: A Comparative Behavioral Measure of Value Alignment
The Anatomy of Alignment: Decomposing Preference Optimization by Steering Sparse Features
The STAR-XAI Protocol: A Framework for Inducing and Verifying Agency, Reasoning, and Reliability in AI Agents
Multi-View Hypercomplex Learning for Breast Cancer Screening
Efficient Epistemic Uncertainty Estimation in Regression Ensemble Models Using Pairwise-Distance Estimators
VDFD: Multi-Agent Value Decomposition Framework with Disentangled World Model
Biospheric AI
Machine Learning-Assisted Sustainable Remanufacturing, Reusing and Recycling for Lithium-ion Batteries
Diverse Subset Selection via Norm-Based Sampling and Orthogonality
VeriFlow: Modeling Distributions for Neural Network Verification
CHRONOBERG: Capturing Language Evolution and Temporal Awareness in Foundation Models
What Is The Political Content in LLMs' Pre- and Post-Training Data?
Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach
Deep Learning-Based Cross-Anatomy CT Synthesis Using Adapted nnResU-Net with Anatomical Feature Prioritized Loss
RAU: Reference-based Anatomical Understanding with Vision Language Models
Explaining multimodal LLMs via intra-modal token interactions
An Ontology for Unified Modeling of Tasks, Actions, Environments, and Capabilities in Personal Service Robotics
Chimera: Diagnosing Shortcut Learning in Visual-Language Understanding
Learning to Ball: Composing Policies for Long-Horizon Basketball Moves
MDAR: A Multi-scene Dynamic Audio Reasoning Benchmark
Evaluating the Limits of Large Language Models in Multilingual Legal Reasoning
Exploring Solution Divergence and Its Effect on Large Language Model Problem Solving
Ontological foundations for contrastive explanatory narration of robot plans
Mental Health Impacts of AI Companions: Triangulating Social Media Quasi-Experiments, User Perspectives, and Relational Theory
InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models
Does AI Coaching Prepare us for Workplace Negotiations?
ConQuER: Modular Architectures for Control and Bias Mitigation in IQP Quantum Generative Models
Activation Function Design Sustains Plasticity in Continual Learning
Retrieval-Augmented Guardrails for AI-Drafted Patient-Portal Messages: Error Taxonomy Construction and Large-Scale Evaluation
From Parameters to Behavior: Unsupervised Compression of the Policy Space
Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning
Quantile Advantage Estimation for Entropy-Safe Reasoning
Vision-Language Alignment from Compressed Image Representations using 2D Gaussian Splatting
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning
A Theoretical Analysis of Discrete Flow Matching Generative Models
Learning Admissible Heuristics for A*: Theory and Practice
StateX: Enhancing RNN Recall via Post-training State Expansion
Towards Efficient Online Exploration for Reinforcement Learning with Human Feedback
Variational Reasoning for Language Models
Language Models Can Learn from Verbal Feedback Without Scalar Rewards
Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity
WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning
Hierarchical Representation Matching for CLIP-based Class-Incremental Learning
Learning Human-Perceived Fakeness in AI-Generated Videos via Multimodal LLMs
Thinking in Many Modes: How Composite Reasoning Elevates Large Language Model Performance with Limited Data
Polysemous Language Gaussian Splatting via Matching-based Mask Lifting
Fairness-Aware Reinforcement Learning (FAReL): A Framework for Transparent and Balanced Sequential Decision-Making
FeatBench: Evaluating Coding Agents on Feature Implementation for Vibe Coding
ASSESS: A Semantic and Structural Evaluation Framework for Statement Similarity
Safety Compliance: Rethinking LLM Safety Reasoning through the Lens of Compliance
Beyond Textual Context: Structural Graph Encoding with Adaptive Space Alignment to alleviate the hallucination of LLMs
Secure and Efficient Access Control for Computer-Use Agents via Context Space
Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks
Wavelet-Induced Rotary Encodings: RoPE Meets Graphs
A Global Analysis of Cyber Threats to the Energy Sector: "Currents of Conflict" from a Geopolitical Perspective
Leveraging Large Language Models for Robot-Assisted Learning of Morphological Structures in Preschool Children with Language Vulnerabilities
Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection?
Jailbreaking on Text-to-Video Models via Scene Splitting Strategy
HEAPr: Hessian-based Efficient Atomic Expert Pruning in Output Space
HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models
Adaptive Policy Backbone via Shared Network
Progressive Weight Loading: Accelerating Initial Inference and Gradually Boosting Performance on Resource-Constrained Environments
Pedestrian Attribute Recognition via Hierarchical Cross-Modality HyperGraph Learning
Spectral Collapse Drives Loss of Plasticity in Deep Continual Learning
Advancing Natural Language Formalization to First Order Logic with Fine-tuned LLMs
Transformers Can Learn Connectivity in Some Graphs but Not Others
SurvDiff: A Diffusion Model for Generating Synthetic Data in Survival Analysis
Context and Diversity Matter: The Emergence of In-Context Learning in World Models
Stochastic activations
Forecasting the Future with Yesterday's Climate: Temperature Bias in AI Weather and Climate Models
REFINE-CONTROL: A Semi-supervised Distillation Method For Conditional Image Generation
From Long to Lean: Performance-aware and Adaptive Chain-of-Thought Compression via Multi-round Refinement
Pushing Toward the Simplex Vertices: A Simple Remedy for Code Collapse in Smoothed Vector Quantization
Lightweight error mitigation strategies for post-training N:M activation sparsity in LLMs
Teaching AI to Feel: A Collaborative, Full-Body Exploration of Emotive Communication
Efficiency Boost in Decentralized Optimization: Reimagining Neighborhood Aggregation with Minimal Overhead
Learning Equivariant Functions via Quadratic Forms
MimicDreamer: Aligning Human and Robot Demonstrations for Scalable VLA Training
The Outputs of Large Language Models are Meaningless
Reversible GNS for Dissipative Fluids with Consistent Bidirectional Dynamics
Question-Driven Analysis and Synthesis: Building Interpretable Thematic Trees with LLMs for Text Clustering and Controllable Generation
Impact of Collective Behaviors of Autonomous Vehicles on Urban Traffic Dynamics: A Multi-Agent Reinforcement Learning Approach
VizGen: Data Exploration and Visualization from Natural Language via a Multi-Agent AI Architecture
Automatic Discovery of One Parameter Subgroups of $SO(n)$
Rigidity-Aware 3D Gaussian Deformation from a Single Image
SAGE: Scene Graph-Aware Guidance and Execution for Long-Horizon Manipulation Tasks
Why Chain of Thought Fails in Clinical Text Understanding
SemanticControl: A Training-Free Approach for Handling Loosely Aligned Visual Conditions in ControlNet
Unveiling Many Faces of Surrogate Models for Configuration Tuning: A Fitness Landscape Analysis Perspective
Debiasing Large Language Models in Thai Political Stance Detection via Counterfactual Calibration
Active Attacks: Red-teaming LLMs via Adaptive Environments
FlowDrive: moderated flow matching with data balancing for trajectory planning
No-Reference Image Contrast Assessment with Customized EfficientNet-B0
From Superficial Outputs to Superficial Learning: Risks of Large Language Models in Education
Geo-R1: Improving Few-Shot Geospatial Referring Expression Understanding with Reinforcement Fine-Tuning
Benchmarking and Mitigating Psychological Sycophancy in Medical Vision-Language Models
Hybrid Diffusion for Simultaneous Symbolic and Continuous Planning
Developing Vision-Language-Action Model from Egocentric Videos
ERGO: Efficient High-Resolution Visual Understanding for Vision-Language Models
Black-Box Hallucination Detection via Consistency Under the Uncertain Expression
Lightweight Structured Multimodal Reasoning for Clinical Scene Understanding in Robotics
Latent Diffusion: Multi-Dimension Stable Diffusion Latent Space Explorer
Fuzzy Reasoning Chain (FRC): An Innovative Reasoning Framework from Fuzziness to Clarity
An Adaptive ICP LiDAR Odometry Based on Reliable Initial Pose
Decoding Deception: Understanding Automatic Speech Recognition Vulnerabilities in Evasion and Poisoning Attacks
The QCET Taxonomy of Standard Quality Criterion Names and Definitions for the Evaluation of NLP Systems
The Rogue Scalpel: Activation Steering Compromises LLM Safety
Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation
SecureAgentBench: Benchmarking Secure Code Generation under Realistic Vulnerability Scenarios
Reinforcement Learning for Durable Algorithmic Recourse
Learning More with Less: A Dynamic Dual-Level Down-Sampling Framework for Efficient Policy Optimization
The AI_INFN Platform: Artificial Intelligence Development in the Cloud
Universal Legal Article Prediction via Tight Collaboration between Supervised Classification Model and LLM
Multi-Agent Path Finding via Offline RL and LLM Collaboration
R-Capsule: Compressing High-Level Plans for Efficient Large Language Model Reasoning
Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding
DIM: Enforcing Domain-Informed Monotonicity in Deep Neural Networks
MORPH: Shape-agnostic PDE Foundation Models
SlotFM: A Motion Foundation Model with Slot Attention for Diverse Downstream Tasks
QueryGym: Step-by-Step Interaction with Relational Databases
Optimizing the non-Clifford-count in unitary synthesis using Reinforcement Learning
Not My Agent, Not My Boundary? Elicitation of Personal Privacy Boundaries in AI-Delegated Information Sharing
Developing Strategies to Increase Capacity in AI Education
UISim: An Interactive Image-Based UI Simulator for Dynamic Mobile Environments
Uncovering Alzheimer's Disease Progression via SDE-based Spatio-Temporal Graph Deep Learning on Longitudinal Brain Networks
POLO: Preference-Guided Multi-Turn Reinforcement Learning for Lead Optimization
LFA-Net: A Lightweight Network with LiteFusion Attention for Retinal Vessel Segmentation
Self-Speculative Biased Decoding for Faster Live Translation
Brain PathoGraph Learning
HyperCore: Coreset Selection under Noise via Hypersphere Models
SubZeroCore: A Submodular Approach with Zero Training for Coreset Selection
Backdoor Attribution: Elucidating and Controlling Backdoor in Language Models
Beyond Structure: Invariant Crystal Property Prediction with Pseudo-Particle Ray Diffraction
Unbiased Binning: Fairness-aware Attribute Representation
FastGRPO: Accelerating Policy Optimization via Concurrency-aware Speculative Decoding and Online Draft Learning
Evaluating and Improving Cultural Awareness of Reward Models for LLM Alignment
ChaosNexus: A Foundation Model for Universal Chaotic System Forecasting with Multi-scale Representations
DiTraj: training-free trajectory control for video diffusion transformer
Can Large Language Models Autoformalize Kinematics?
Beyond Johnson-Lindenstrauss: Uniform Bounds for Sketched Bilinear Forms
Graph of Agents: Principled Long Context Modeling by Emergent Multi-Agent Collaboration
Enhancing Low-Rank Adaptation with Structured Nonlinear Transformations
Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization
No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping
Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards
You Can't Steal Nothing: Mitigating Prompt Leakages in LLMs via System Vectors
Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts
A Large-Scale Dataset and Citation Intent Classification in Turkish with LLMs
AutoSCORE: Enhancing Automated Scoring with Multi-Agent Large Language Models via Structured Component Recognition
EqDiff-CT: Equivariant Conditional Diffusion model for CT Image Synthesis from CBCT
Generation Properties of Stochastic Interpolation under Finite Training Set
One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning
ARTI-6: Towards Six-dimensional Articulatory Speech Encoding
A State-of-the-Art SQL Reasoning Model using RLVR
Enhanced Generative Machine Listener
Gender Stereotypes in Professional Roles Among Saudis: An Analytical Study of AI-Generated Images Using Language Models
Score-based Idempotent Distillation of Diffusion Models
Are Hallucinations Bad Estimations?
Learning to Reason with Mixture of Tokens
Neural Operators for Mathematical Modeling of Transient Fluid Flow in Subsurface Reservoir Systems
Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning
Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training
New Algorithmic Directions in Optimal Transport and Applications for Product Spaces
DistillKac: Few-Step Image Generation via Damped Wave Equations
$\mathbf{Li_2}$: A Framework on Dynamics of Feature Emergence and Delayed Generalization
Shortcut Flow Matching for Speech Enhancement: Step-Invariant flows via single stage training
Preemptive Detection and Steering of LLM Misalignment via Latent Reachability
Agribot: agriculture-specific question answer system
Psychological and behavioural responses in human-agent vs. human-human interactions: a systematic review and meta-analysis
Domain-Aware Speaker Diarization On African-Accented English
No Alignment Needed for Generation: Learning Linearly Separable Representations in Diffusion Models
Enhancing Contrastive Learning for Geolocalization by Discovering Hard Negatives on Semivariograms
What Happens Next? Anticipating Future Motion by Generating Point Trajectories
Temporal vs. Spatial: Comparing DINOv3 and V-JEPA2 Feature Representations for Video Action Analysis
Multi-Objective Reinforcement Learning for Large Language Model Optimization: Visionary Perspective
LANCE: Low Rank Activation Compression for Efficient On-Device Continual Learning
OjaKV: Context-Aware Online Low-Rank KV Cache Compression with Oja's Rule
Guiding Audio Editing with Audio Language Model
A Data-driven Typology of Vision Models from Integrated Representational Metrics
InvBench: Can LLMs Accelerate Program Verification with Invariant Synthesis?
MobiLLM: An Agentic AI Framework for Closed-Loop Threat Mitigation in 6G Open RANs
Limitations on Safe, Trusted, Artificial General Intelligence
Logic of Hypotheses: from Zero to Full Knowledge in Neurosymbolic Integration
StepORLM: A Self-Evolving Framework With Generative Process Supervision For Operations Research Language Models
UniMIC: Token-Based Multimodal Interactive Coding for Human-AI Collaboration
Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time
Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective
From Search to Reasoning: A Five-Level RAG Capability Framework for Enterprise Data
PIR-RAG: A System for Private Information Retrieval in Retrieval-Augmented Generation
Assessment of deep learning models integrated with weather and environmental variables for wildfire spread prediction and a case study of the 2023 Maui fires
Seismic Velocity Inversion from Multi-Source Shot Gathers Using Deep Segmentation Networks: Benchmarking U-Net Variants and SeismoLabV3+
Cross-Modal Retrieval with Cauchy-Schwarz Divergence
Cycle is All You Need: More Is Different
From Embeddings to Equations: Genetic-Programming Surrogates for Interpretable Transformer Classification
SGNNBench: A Holistic Evaluation of Spiking Graph Neural Network on Large-scale Graph
Random Direct Preference Optimization for Radiography Report Generation
KV-Efficient VLA: A Method of Speed up Vision Language Model with RNN-Gated Chunked KV Cache
Domain-Informed Genetic Superposition Programming: A Case Study on SFRC Beams
Phrase-grounded Fact-checking for Automatically Generated Chest X-ray Reports
A Novel Differential Feature Learning for Effective Hallucination Detection and Classification
MDF-MLLM: Deep Fusion Through Cross-Modal Feature Alignment for Contextually Aware Fundoscopic Image Classification
Influence Guided Context Selection for Effective Retrieval-Augmented Generation
Multimodal Prompt Decoupling Attack on the Safety Filters in Text-to-Image Models
Context Is What You Need: The Maximum Effective Context Window for Real World Limits of LLMs
A Mutual Learning Method for Salient Object Detection with intertwined Multi-Supervision--Revised
MAJORScore: A Novel Metric for Evaluating Multimodal Relevance via Joint Representation
Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks -- A Case Study of Hsinchu, Taiwan
Safety Assessment of Scaffolding on Construction Site using AI
ReGeS: Reciprocal Retrieval-Generation Synergy for Conversational Recommender Systems
Automated Prompt Generation for Creative and Counterfactual Text-to-image Synthesis
In silico Deep Learning Protocols for Label-Free Super-Resolution Microscopy: A Comparative Study of Network Architectures and SNR Dependence
Dynamic Multi-Target Fusion for Efficient Audio-Visual Navigation
SAEmnesia: Erasing Concepts in Diffusion Models with Sparse Autoencoders
Toward a Realistic Encoding Model of Auditory Affective Understanding in the Brain
Do Sparse Subnetworks Exhibit Cognitively Aligned Attention? Effects of Pruning on Saliency Map Fidelity, Sparsity, and Concept Coherence
Towards Adapting Federated & Quantum Machine Learning for Network Intrusion Detection: A Survey
MIXRAG: Mixture-of-Experts Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering
Large AI Model-Enabled Generative Semantic Communications for Image Transmission
How Large Language Models Need Symbolism
Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models
PhenoMoler: Phenotype-Guided Molecular Optimization via Chemistry Large Language Model
DyME: Dynamic Multi-Concept Erasure in Diffusion Models with Bi-Level Orthogonal LoRA Adaptation
Foundation models for high-energy physics
Can AI Perceive Physical Danger and Intervene?
Align2Speak: Improving TTS for Low Resource Languages via ASR-Guided Online Preference Optimization
Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts
Lifelong Learning with Behavior Consolidation for Vehicle Routing
UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios
Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety
D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents
ProRe: A Proactive Reward System for GUI Agents via Reasoner-Actor Collaboration
DS-STAR: Data Science Agent via Iterative Planning and Verification
Axiomatic Choice and the Decision-Evaluation Paradox
DeepTravel: An End-to-End Agentic Reinforcement Learning Framework for Autonomous Travel Planning Agents
Reimagining Agent-based Modeling with Large Language Model Agents via Shachi
TRACE: Learning to Compute on Graphs
GenesisGeo: Technical Report
DyRo-MCTS: A Robust Monte Carlo Tree Search Approach to Dynamic Job Shop Scheduling
Outlier Detection in Plantar Pressure: Human-Centered Comparison of Statistical Parametric Mapping and Explainable Machine Learning
CoBel-World: Harnessing LLM Reasoning to Build a Collaborative Belief World for Optimizing Embodied Multi-Agent Collaboration
RISK: A Framework for GUI Agents in E-commerce Risk Management
Bilinear relational structure fixes reversal curse and enables consistent model editing
GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments
The Thinking Spectrum: An Empirical Study of Tunable Reasoning in LLMs through Model Merging
A2R: An Asymmetric Two-Stage Reasoning Framework for Parallel Reasoning
Generalizing Multi-Objective Search via Objective-Aggregation Functions
Ground-Truthing AI Energy Consumption: Validating CodeCarbon Against External Measurements
Log2Plan: An Adaptive GUI Automation Framework Integrated with Task Mining Approach
Clinical Uncertainty Impacts Machine Learning Evaluations
Evaluating LLMs for Combinatorial Optimization: One-Phase and Two-Phase Heuristics for 2D Bin-Packing
InfiMed-Foundation: Pioneering Advanced Multimodal Medical Models with Compute-Efficient Pre-Training and Multi-Stage Fine-Tuning
Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models
Large Language Models as Nondeterministic Causal Models
PRIME: Planning and Retrieval-Integrated Memory for Enhanced Reasoning
Do LLM Agents Know How to Ground, Recover, and Assess? A Benchmark for Epistemic Competence in Information-Seeking Agents
EMMA: Generalizing Real-World Robot Manipulation via Generative Visual Transfer
Guiding Evolution of Artificial Life Using Vision-Language Models
GeoSketch: A Neural-Symbolic Approach to Geometric Multimodal Reasoning with Auxiliary Line Construction and Affine Transformation
InfiAgent: Self-Evolving Pyramid Agent Framework for Infinite Scenarios
Estimating the Empowerment of Language Model Agents
TrueGradeAI: Retrieval-Augmented and Bias-Resistant AI for Transparent and Explainable Digital Assessments
REMA: A Unified Reasoning Manifold Framework for Interpreting Large Language Model
The Emergence of Altruism in Large-Language-Model Agents Society
Towards mitigating information leakage when evaluating safety monitors
Correct Reasoning Paths Visit Shared Decision Pivots
AutoClimDS: Climate Data Science Agentic AI -- A Knowledge Graph is All You Need
EEG-Based Consumer Behaviour Prediction: An Exploration from Classical Machine Learning to Graph Neural Networks
GeoEvolve: Automating Geospatial Model Discovery via Multi-Agent Large Language Models
Automated and Interpretable Survival Analysis from Multimodal Data
Semantic F1 Scores: Fair Evaluation Under Fuzzy Class Boundaries

Research Sources: 1438 | Generated: 9/29/2025