AI RESEARCH PAPERS & ACADEMIC SOURCES
- A high-resolution, nanopore-based artificial intelligence assay for DNA replication stress in human cancer cells
- TrimNN: characterizing cellular community motifs for studying multicellular topological organization in complex tissues
- Finding your academic voice: faster, healthier writing with AI speech recognition
- Author Correction: Meta-prediction of coronary artery disease risk
- Data-driven organic solubility prediction at the limit of aleatoric uncertainty
- The chronODE framework for modelling multi-omic time series with ordinary differential equations and machine learning
- Cracking the code: predicting tumor microenvironment enabled chemoresistance with machine learning in the human tumoroid models
- On computing and the complexity of computing higher-order U-statistics, exactly
- A note on simulation methods for the Dirichlet-Laplace prior
- Unified Conformalized Multiple Testing with Full Data Efficiency
- Does the Barron space really defy the curse of dimensionality?
- Asymptotic breakdown point analysis of the minimum density power divergence estimator under independent non-homogeneous setups
- Simultaneous estimation of connectivity and dimensionality in samples of networks
- A self-supervised learning approach for denoising autoregressive models with additive noise: finite and infinite variance cases
- Convergence analysis of online algorithms for vector-valued kernel regression
- Efficiently matching random inhomogeneous graphs via degree profiles
- Prompt-to-Slate: Diffusion Models for Prompt-Conditioned Slate Generation
- Multilingual hierarchical classification of job advertisements for job vacancy statistics
- Constructive approximate transport maps with normalizing flows
- Author Correction: A scoping review of self-supervised representation learning for clinical decision making using EHR categorical data
- Beyond AlphaFold: how AI is decoding the grammar of the genome
- Boosting the predictive power of protein representations with a corpus of text annotations
- Real-time prediction of HFNC treatment failure in acute hypoxemic respiratory failure using machine learning
- ProtAlign-ARG: antibiotic resistance gene characterization integrating protein language models and alignment-based scoring
- Adaptive deep SVM for detecting early heart disease among cardiac patients
- Transforming Blood Cell Detection and Classification with Advanced Deep Learning Models: A Comparative Study
- Communicate Less, Synthesize the Rest: Latency-aware Intent-based Generative Semantic Multicasting with Diffusion Models
- A polynomial formula for the perspective four points problem
- WIR3D: Visually-Informed and Geometry-Aware 3D Shape Abstraction
- Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model
- EgoTwin: Dreaming Body and View in First Person
- HierAdaptMR: Cross-Center Cardiac MRI Reconstruction with Hierarchical Feature Adapters
- IntelliCap: Intelligent Guidance for Consistent View Sampling
- Odo: Depth-Guided Diffusion for Identity-Preserving Body Reshaping
- ID-Card Synthetic Generation: Toward a Simulated Bona fide Dataset
- Checkmate: interpretable and explainable RSVQA is the endgame
- DMS: Diffusion-Based Multi-Baseline Stereo Generation for Improving Self-Supervised Depth Estimation
- Real-Time Beach Litter Detection and Counting: A Comparative Analysis of RT-DETR Model Variants
- Precise Action-to-Video Generation Through Visual Action Prompts
- Motion2Motion: Cross-topology Motion Transfer with Sparse Correspondence
- IGFuse: Interactive 3D Gaussian Scene Reconstruction via Multi-Scans Fusion
- 4DNeX: Feed-Forward 4D Generative Modeling Made Easy
- Data-driven RF Tomography via Cross-modal Sensing and Continual Learning
- BeeNet: Reconstructing Flower Shapes from Electric Fields using Deep Learning
- Statistical analysis of multivariate planar curves and applications to X-ray classification
- DermINO: Hybrid Pretraining for a Versatile Dermatology Foundation Model
- iTrace: Click-Based Gaze Visualization on the Apple Vision Pro
- Express4D: Expressive, Friendly, and Extensible 4D Facial Motion Generation Benchmark
- FractMorph: A Fractional Fourier-Based Multi-Domain Transformer for Deformable Image Registration
- Mechanical Automation with Vision: A Design for Rubik's Cube Solver
- Segmenting Thalamic Nuclei: T1 Maps Provide a Reliable and Efficient Solution
- PROD: Palpative Reconstruction of Deformable Objects through Elastostatic Signed Distance Functions
- Anatomic Feature Fusion Model for Diagnosing Calcified Pulmonary Nodules on Chest X-Ray
- Temporal and Rotational Calibration for Event-Centric Multi-Sensor Systems
- HOMI: Ultra-Fast EdgeAI platform for Event Cameras
- Point upsampling networks for single-photon sensing
- Grounding Actions in Camera Space: Observation-Centric Vision-Language-Action Policy
- Adaptively Clustering Neighbor Elements for Image-Text Generation
- A locally statistical active contour model for SAR image segmentation can be solved by denoising algorithms
- Optimization of Prompt Learning via Multi-Knowledge Representation for Vision-Language Models
- Multispectral Fine-Grained Classification of Blackgrass in Wheat and Barley Crops
- MicroMIL: Graph-Based Multiple Instance Learning for Context-Aware Diagnosis with Microscopic Images
- EventHallusion: Diagnosing Event Hallucinations in Video LLMs
- Style Ambiguity Loss Using CLIP
- OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
- SLGaussian: Fast Language Gaussian Splatting in Sparse Views
- Rethinking Model Redundancy for Low-light Image Enhancement
- Embodied Image Quality Assessment for Robotic Intelligence
- Co-Paced Learning Strategy Based on Confidence for Flying Bird Object Detection Model Training
- Shape from Semantics: 3D Shape Generation from Multi-View Semantics
- D-Attn: Decomposed Attention for Large Vision-and-Language Models
- From One Single Sketch to 3D Detailed Face Reconstruction
- Best Foot Forward: Robust Foot Reconstruction in-the-wild
- TopoMortar: A dataset to evaluate image segmentation methods focused on topology accuracy
- STORM: Token-Efficient Long Video Understanding for Multimodal LLMs
- Novel Object 6D Pose Estimation with a Single Reference View
- PixelPonder: Dynamic Patch Adaptation for Enhanced Multi-Conditional Text-to-Image Generation
- ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing
- ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
- DAGait: Generalized Skeleton-Guided Data Alignment for Gait Recognition
- Reasoning and Learning a Perceptual Metric for Self-Training of Reflective Objects in Bin-Picking with a Low-cost Camera
- Diffusion Based Ambiguous Image Segmentation
- DLTPose: 6DoF Pose Estimation From Accurate Dense Surface Point Estimates
- InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation
- Cognitive-Inspired Hierarchical Attention Fusion With Visual and Textual for Cross-Domain Sequential Recommendation
- Segmenting Objectiveness and Task-awareness Unknown Region for Autonomous Driving
- Differentiable Room Acoustic Rendering with Multi-View Vision Priors
- Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
- RMMSS: Towards Advanced Robust Multi-Modal Semantic Segmentation with Hybrid Prototype Distillation and Feature Selection
- Diving into the Fusion of Monocular Priors for Generalized Stereo Matching
- Continual Learning on CLIP via Incremental Prompt Tuning with Intrinsic Textual Anchors
- InterRVOS: Interaction-aware Referring Video Object Segmentation
- Hyperspectral Image Generation with Unmixing Guided Diffusion Model
- LoRA-Edit: Controllable First-Frame-Guided Video Editing via Mask-Aware LoRA Fine-Tuning
- Not All Tokens and Heads Are Equally Important: Dual-Level Attention Intervention for Hallucination Mitigation
- EraserDiT: Fast Video Inpainting with Diffusion Transformer Model
- Visual Content Detection in Educational Videos with Transfer Learning and Dataset Enrichment
- Attention to the Burstiness in Visual Prompt Tuning!
- Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data
- Learn 3D VQA Better with Active Selection and Reannotation
- Mammo-SAE: Interpreting Breast Cancer Concept Learning with Sparse Autoencoders
- D2-Mamba: Dual-Scale Fusion and Dual-Path Scanning with SSMs for Shadow Removal
- SocialTrack: Multi-Object Tracking in Complex Urban Traffic Scenes Inspired by Social Behavior
- Leveraging Diffusion Models for Stylization using Multiple Style Images
- Morphological classification of eclipsing binary stars using computer vision methods
- DEEP-SEA: Deep-Learning Enhancement for Environmental Perception in Submerged Aquatics
- Multi-source Multimodal Progressive Domain Adaption for Audio-Visual Deception Detection
- Cross-Domain Few-Shot Learning via Multi-View Collaborative Optimization with Vision-Language Models
- Preserve and Sculpt: Manifold-Aligned Fine-tuning of Vision-Language Models for Few-Shot Learning
- S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models
- ONG: One-Shot NMF-based Gradient Masking for Efficient Model Sparsification
- CMF-IoU: Multi-Stage Cross-Modal Fusion 3D Object Detection with IoU Joint Prediction
- 7Bench: a Comprehensive Benchmark for Layout-guided Text-to-image Models
- Towards High-Resolution Industrial Image Anomaly Detection
- Lumen: Consistent Video Relighting and Harmonious Background Replacement with Video Generative Models
- MaskSem: Semantic-Guided Masking for Learning 3D Hybrid High-Order Motion Representation
- Breaking Reward Collapse: Adaptive Reinforcement for Open-ended Medical Reasoning with Enhanced Semantic Discrimination
- GazeDETR: Gaze Detection using Disentangled Head and Gaze Representations
- Compact Attention: Exploiting Structured Spatio-Temporal Sparsity for Fast Video Generation
- Dextr: Zero-Shot Neural Architecture Search with Singular Value Decomposition and Extrinsic Curvature
- Omni Survey for Multimodality Analysis in Visual Object Tracking
- SlimComm: Doppler-Guided Sparse Queries for Bandwidth-Efficient Cooperative 3-D Perception
- Exploring Spatial-Temporal Dynamics in Event-based Facial Micro-Expression Analysis
- InstDrive: Instance-Aware 3D Gaussian Splatting for Driving Scenes
- WiseLVAM: A Novel Framework For Left Ventricle Automatic Measurements
- Enhancing 3D point accuracy of laser scanner through multi-stage convolutional neural network for applications in construction
- Error Propagation Mechanisms and Compensation Strategies for Quantized Diffusion
- VELVET-Med: Vision and Efficient Language Pre-training for Volumetric Imaging Tasks in Medicine
- DualFit: A Two-Stage Virtual Try-On via Warping and Synthesis
- TriQDef: Disrupting Semantic and Gradient Alignment to Prevent Adversarial Patch Transferability in Quantized Neural Networks
- Infusing fine-grained visual knowledge to Vision-Language Models
- Scalable RF Simulation in Generative 4D Worlds
- Splat Feature Solver
- C2PSA-Enhanced YOLOv11 Architecture: A Novel Approach for Small Target Detection in Cotton Disease Diagnosis
- In vivo 3D ultrasound computed tomography of musculoskeletal tissues with generative neural physics
- WXSOD: A Benchmark for Robust Salient Object Detection in Adverse Weather Conditions
- Superpixel-informed Continuous Low-Rank Tensor Representation for Multi-Dimensional Data Recovery
- SNNSIR: A Simple Spiking Neural Network for Stereo Image Restoration
- CLAIR: CLIP-Aided Weakly Supervised Zero-Shot Cross-Domain Image Retrieval
- Improving Densification in 3D Gaussian Splatting for High-Fidelity Rendering
- Neural Cellular Automata for Weakly Supervised Segmentation of White Blood Cells
- Attention Pooling Enhances NCA-based Classification of Microscopy Images
- DoppDrive: Doppler-Driven Temporal Aggregation for Improved Radar Object Detection
- Geometry-Aware Video Inpainting for Joint Headset Occlusion Removal and Face Reconstruction in Social XR
- AquaFeat: A Features-Based Image Enhancement Model for Underwater Object Detection
- MBMamba: When Memory Buffer Meets Mamba for Structure-Aware Image Deblurring
- EgoLoc: A Generalizable Solution for Temporal Interaction Localization in Egocentric Videos
- ViT-EnsembleAttack: Augmenting Ensemble Models for Stronger Adversarial Transferability in Vision Transformers
- DeCoT: Decomposing Complex Instructions for Enhanced Text-to-Image Generation with Large Language Models
- Federated Cross-Modal Style-Aware Prompt Generation
- MPCAR: Multi-Perspective Contextual Augmentation for Enhanced Visual Reasoning in Large Vision-Language Models
- LMAD: Integrated End-to-End Vision-Language Model for Explainable Autonomous Driving
- S5: Scalable Semi-Supervised Semantic Segmentation in Remote Sensing
- TiP4GEN: Text to Immersive Panorama 4D Scene Generation
- Illusions in Humans and AI: How Visual Perception Aligns and Diverges
- X-Ray-CoT: Interpretable Chest X-ray Diagnosis with Vision-Language Models via Chain-of-Thought Reasoning
- Skin Cancer Classification: Hybrid CNN-Transformer Models with KAN-Based Fusion
- LangVision-LoRA-NAS: Neural Architecture Search for Variable LoRA Rank in Vision Language Models
- MuSACo: Multimodal Subject-Specific Selection and Adaptation for Expression Recognition with Co-Training
- REVEAL -- Reasoning and Evaluation of Visual Evidence through Aligned Language
- Structure-preserving Feature Alignment for Old Photo Colorization
- Foundation Model for Skeleton-Based Human Action Understanding
- Multimodal Chain of Continuous Thought for Latent-Space Reasoning in Vision-Language Models
- ViLaD: A Large Vision Language Diffusion Framework for End-to-End Autonomous Driving
- ViDA-UGC: Detailed Image Quality Analysis via Visual Distortion Assessment for UGC Images
- WIPES: Wavelet-based Visual Primitives
- Creative4U: MLLMs-based Advertising Creative Image Selector with Comparative Reasoning
- Learn Faster and Remember More: Balancing Exploration and Exploitation for Continual Test-time Adaptation
- DyCrowd: Towards Dynamic Crowd Reconstruction from a Large-scene Video
- Stable Diffusion-Based Approach for Human De-Occlusion
- WP-CLIP: Leveraging CLIP to Predict Wölfflin's Principles in Visual Art
- Refine-and-Contrast: Adaptive Instance-Aware BEV Representations for Multi-UAV Collaborative Object Detection
- Neural Rendering for Sensor Adaptation in 3D Object Detection
- Drifting Away from Truth: GenAI-Driven News Diversity Challenges LVLM-Based Misinformation Detection
- Real-Time Sign Language Gestures to Speech Transcription using Deep Learning
- Single-Reference Text-to-Image Manipulation with Dual Contrastive Denoising Score
- Quantifying and Alleviating Co-Adaptation in Sparse-View 3D Gaussian Splatting
- Frequency-Driven Inverse Kernel Prediction for Single Image Defocus Deblurring
- MHPP: Exploring the Capabilities and Limitations of Language Models Beyond Basic Code Generation
- Towards No-Code Programming of Cobots: Experiments with Code Synthesis by Large Code Models for Conversational Programming
- StepTool: Enhancing Multi-Step Tool Usage in LLMs via Step-Grained Reinforcement Learning
- NormXLogit: The Head-on-Top Never Lies
- Idiom Detection in Sorani Kurdish Texts
- VisualSpeech: Enhancing Prosody Modeling in TTS Using Video
- LIDDIA: Language-based Intelligent Drug Discovery Agent
- Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities
- An Information-Theoretic Approach to Identifying Formulaic Clusters in Textual Data
- High-Dimensional Interlingual Representations of Large Language Models
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient Training R1-like Reasoning Models
- SCORE: Story Coherence and Retrieval Enhancement for AI Narratives
- TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection
- EvalAgent: Discovering Implicit Evaluation Criteria from the Web
- Deliberate Planning in Language Models with Symbolic Representation
- From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning
- Concealment of Intent: A Game-Theoretic Analysis
- Translation in the Wild
- PromptSuite: A Task-Agnostic Framework for Multi-Prompt Generation
- CoRank: LLM-Based Compact Reranking with Document Features for Scientific Retrieval
- USAD: Universal Speech and Audio Representation via Distillation
- A Deep Learning-Based CCTV System for Automatic Smoking Detection in Fire Exit Zones
- Towards Understanding 3D Vision: the Role of Gaussian Curvature
- Impact of Clinical Image Quality on Efficient Foundation Model Finetuning
- Large Kernel Modulation Network for Efficient Image Super-Resolution
- OVG-HQ: Online Video Grounding with Hybrid-modal Queries
- SafeCtrl: Region-Based Safety Control for Text-to-Image Diffusion via Detect-Then-Suppress
- TimeSenCLIP: A Vision-Language Model for Remote Sensing Using Single-Pixel Time Series
- Assessment of Using Synthetic Data in Brain Tumor Segmentation
- Deep Learning For Point Cloud Denoising: A Survey
- DynamicPose: Real-time and Robust 6D Object Pose Tracking for Fast-Moving Cameras and Objects
- Transferable Class Statistics and Multi-scale Feature Approximation for 3D Object Detection
- UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding
- SAMDWICH: Moment-aware Video-text Alignment for Referring Video Object Segmentation
- PEdger++: Practical Edge Detection via Assembling Cross Information
- VideoAVE: A Multi-Attribute Video-to-Text Attribute Value Extraction Dataset and Benchmark Models
- Mitigating Jailbreaks with Intent-Aware LLMs
- Insight Rumors: A Novel Textual Rumor Locating and Marking Model Leveraging Att_BiMamba2 Network
- Vision-G1: Towards General Vision Language Reasoning with Multi-Domain Data Curation
- Investigating Transcription Normalization in the Faetar ASR Benchmark
- A Multi-Task Evaluation of LLMs' Processing of Academic Text Input
- LLM-Guided Planning and Summary-Based Scientific Text Simplification: DS@GT at CLEF 2025 SimpleText
- Hallucination Detection and Mitigation in Scientific Text Simplification using Ensemble Approaches: DS@GT at CLEF 2025 SimpleText
- A Survey of Idiom Datasets for Psycholinguistic and Computational Research
- In-Context Examples Matter: Improving Emotion Recognition in Conversation with Instruction Tuning
- LLMs Struggle with NLI for Perfect Aspect: A Cross-Linguistic Study in Chinese and Japanese
- CAMF: Collaborative Adversarial Multi-agent Framework for Machine Generated Text Detection
- Learning Wisdom from Errors: Promoting LLM's Continual Relation Learning through Exploiting Error Cases
- Exploring Efficiency Frontiers of Thinking Budget in Medical Reasoning: Scaling Laws between Computational Resources and Reasoning Quality
- LLM-as-a-Judge for Privacy Evaluation? Exploring the Alignment of Human and LLM Perceptions of Privacy in Textual Data
- Arabic Multimodal Machine Learning: Datasets, Applications, Approaches, and Challenges
- SEA-BED: Southeast Asia Embedding Benchmark
- What do Speech Foundation Models Learn? Analysis and Applications
- Structuring the Unstructured: A Systematic Review of Text-to-Structure Generation for Agentic AI with a Universal Evaluation Framework
- Fast, Slow, and Tool-augmented Thinking for LLMs: A Review
- LegalΔ: Enhancing Legal Reasoning in LLMs via Reinforcement Learning with Chain-of-Thought Guided Information Gain
- A Question Answering Dataset for Temporal-Sensitive Retrieval-Augmented Generation
- Incorporating Legal Logic into Deep Learning: An Intelligent Approach to Probation Prediction
- Consensus or Conflict? Fine-Grained Evaluation of Conflicting Answers in Question-Answering
- ReaLM: Reflection-Enhanced Autonomous Reasoning with Small Language Models
- ZigzagAttention: Efficient Long-Context Inference with Exclusive Retrieval and Streaming Heads
- The Cultural Gene of Large Language Models: A Study on the Impact of Cross-Corpus Training on Model Values and Biases
- M3PO: Multimodal-Model-Guided Preference Optimization for Visual Instruction Following
- LoraxBench: A Multitask, Multilingual Benchmark Suite for 20 Indonesian Languages
- Is GPT-OSS Good? A Comprehensive Evaluation of OpenAI's Latest Open Source Models
- The Structural Sources of Verb Meaning Revisited: Large Language Models Display Syntactic Bootstrapping
- Semantic Anchoring in Agentic Memory: Leveraging Linguistic Structures for Persistent Conversational Context
- Beyond GPT-5: Making LLMs Cheaper and Better via Performance-Efficiency Optimized Routing
- Prompt-Induced Linguistic Fingerprints for LLM-Generated Fake News Detection
- Leveraging Large Language Models for Predictive Analysis of Human Misery
- DESIGNER: Design-Logic-Guided Multidisciplinary Data Synthesis for LLM Reasoning
- From SALAMANDRA to SALAMANDRATA: BSC Submission for WMT25 General Machine Translation Shared Task
- HeteroRAG: A Heterogeneous Retrieval-Augmented Generation Framework for Medical Vision Language Tasks
- When Alignment Hurts: Decoupling Representational Spaces in Multilingual Models
- ding-01 :ARG0: An AMR Corpus for Spontaneous French Dialogue
- It takes a village to write a book: Mapping anonymous contributions in Stephen Langton's Quaestiones Theologiae
- An LLM Agent-Based Complex Semantic Table Annotation Approach
- Analyzing Information Sharing and Coordination in Multi-Agent Planning
- WebMall -- A Multi-Shop Benchmark for Evaluating Web Agents
- Integrating Feedback Loss from Bi-modal Sarcasm Detector for Sarcastic Speech Synthesis
- Büyük Dil Modelleri için TR-MMLU Benchmarkı: Performans Değerlendirmesi, Zorluklar ve İyileştirme Fırsatları
- Doğal Dil İşlemede Tokenizasyon Standartları ve Ölçümü: Türkçe Üzerinden Büyük Dil Modellerinin Karşılaştırmalı Analizi
- Evaluating ASR robustness to spontaneous speech errors: A study of WhisperX using a Speech Error Database
- DocHPLT: A Massively Multilingual Document-Level Translation Dataset
- All for law and law for all: Adaptive RAG Pipeline for Legal Research
- AutoBnB-RAG: Enhancing Multi-Agent Incident Response with Retrieval-Augmented Generation
- MuDRiC: Multi-Dialect Reasoning for Arabic Commonsense Validation
- Asymptotic Optimism of Random-Design Linear and Kernel Regression Models
- ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation
- Rashomon perspective for measuring uncertainty in the survival predictive maintenance models
- Partially stochastic deep learning with uncertainty quantification for model predictive heating control
- Can LLMs Handle WebShell Detection? Overcoming Detection Challenges with Behavioral Function-Aware Framework
- Efficient Discovery of Motif Transition Process for Large-Scale Temporal Graphs
- High-Fidelity And Complex Test Data Generation For Real-World SQL Code Generation Services
- Balancing Interpretability and Flexibility in Modeling Diagnostic Trajectories with an Embedded Neural Hawkes Process Model
- RIFT: Closed-Loop RL Fine-Tuning for Realistic and Controllable Traffic Simulation
- Adaptive Noise Resilient Keyword Spotting Using One-Shot Learning
- Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models
- Wavelet Flow For Extragalactic Foreground Simulations
- Symmetry-Aware GFlowNets
- Towards Generalized Source Tracing for Codec-Based Deepfake Speech
- Generative Modeling of Full-Atom Protein Conformations using Latent Diffusion on Graph Embeddings
- Information Must Flow: Recursive Bootstrapping for Information Bottleneck in Optimal Transport
- Model-free reinforcement learning with noisy actions for automated experimental control in optics
- Clustering-Based Validation Splits for Model Selection under Domain Shift
- MUC: Machine Unlearning for Contrastive Learning with Black-box Evaluation
- Variational Flow Matching for Graph Generation
- LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
- State-Space Modeling in Long Sequence Processing: A Survey on Recurrence in the Transformer Era
- Data-dependent and Oracle Bounds on Forgetting in Continual Learning
- KACQ-DCNN: Uncertainty-Aware Interpretable Kolmogorov-Arnold Classical-Quantum Dual-Channel Neural Network for Heart Disease Detection
- Towards Optimal Environmental Policies: Policy Learning under Arbitrary Bipartite Network Interference
- Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning
- Direct Preference Optimization for Primitive-Enabled Hierarchical Reinforcement Learning
- On-device Anomaly Detection in Conveyor Belt Operations
- Segmenting Action-Value Functions Over Time-Scales in SARSA via TD(Δ)
- Rethinking Aleatoric and Epistemic Uncertainty
- Sub-Sequential Physics-Informed Learning with State Space Model
- OneForecast: A Universal Framework for Global and Regional Weather Forecasting
- Inverse Bridge Matching Distillation
- Reverse Markov Learning: Multi-Step Generative Models for Complex Distributions
- SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning
- Hierarchical Refinement: Optimal Transport to Infinity and Beyond
- Seldonian Reinforcement Learning for Ad Hoc Teamwork
- Enabling Weak Client Participation via On-device Knowledge Distillation in Heterogenous Federated Learning
- MedSpaformer: a Transferable Transformer with Multi-granularity Token Sparsification for Medical Time Series Classification
- Optimizing Language Models for Inference Time Objectives using Reinforcement Learning
- NoProp: Training Neural Networks without Full Back-propagation or Full Forward-propagation
- Deep Positive-Negative Prototypes for Adversarially Robust Discriminative Prototypical Learning
- Learning from Samples: Inverse Problems over measures via Sharpened Fenchel-Young Losses
- The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning
- Beyond Zero Initialization: Investigating the Impact of Non-Zero Initialization on LoRA Fine-Tuning Dynamics
- Generalizable LLM Learning of Graph Synthetic Data with Post-training Alignment
- Mixture of Experts Provably Detect and Learn the Latent Cluster Structure in Gradient-Based Learning
- When can in-context learning generalize out of task distribution?
- Exponential Family Variational Flow Matching for Tabular Data Generation
- Towards Infant Sleep-Optimized Driving: Synergizing Wearable and Vehicle Sensing in Intelligent Cruise Control
- Breaking Data Silos: Towards Open and Scalable Mobility Foundation Models via Generative Continual Learning
- Scalable Gaussian Processes with Latent Kronecker Structure
- Overcoming Long-Context Limitations of State-Space Models via Context-Dependent Sparse Attention
- AdaMuon: Adaptive Muon Optimizer
- Nonlinear Concept Erasure: a Density Matching Approach
- Near-Optimal Sparse Allreduce for Distributed Deep Learning
- Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers
- Kernel Ridge Regression Inference
- Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence
- A Consistent and Scalable Algorithm for Best Subset Selection in Single Index Models
- Variational Optimization for Quantum Problems using Deep Generative Networks
- CCDM: Continuous Conditional Diffusion Models for Image Generation
- FacLens: Transferable Probe for Foreseeing Non-Factuality in Fact-Seeking Question Answering of Large Language Models
- LieRE: Lie Rotational Positional Encodings
- Optimal Projections for Classification with Naive Bayes
- Differentially Private Covariate Balancing Causal Inference
- Emoji Attack: Enhancing Jailbreak Attacks Against Judge LLM Detection
- Universal on-chip polarization handling with deep photonic networks
- Nonparametric Filtering, Estimation and Classification using Neural Jump ODEs
- Benchmarking Federated Learning for Semantic Datasets: Federated Scene Graph Generation
- STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning
- Convex Physics Informed Neural Networks for the Monge-Ampère Optimal Transport Problem
- Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
- Linear Bandits with Partially Observable Features
- The path to a goal: Understanding soccer possessions via path signatures
- Simulation-Based Inference: A Practical Guide
- Fully Automated Segmentation of Fiber Bundles in Anatomic Tracing Data
- Shapley Values: Paired-Sampling Approximations
- Arabic ASR on the SADA Large-Scale Arabic Speech Corpus with Transformer-Based Models
- Transfer Learning for Neutrino Scattering: Domain Adaptation with GANs
- Empirical Evidences for the Effects of Feature Diversity in Open Set Recognition and Continual Learning
- Is This News Still Interesting to You?: Lifetime-aware Interest Matching for News Recommendation
- Eyes on the Image: Gaze Supervised Multimodal Learning for Chest X-ray Diagnosis and Report Generation
- Denoising diffusion models for inverse design of inflatable structures with programmable deformations
- Improving Detection of Watermarked Language Models
- OptimalThinkingBench: Evaluating Over and Underthinking in LLMs
- Has GPT-5 Achieved Spatial Intelligence? An Empirical Study
- Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
- On Delta-Homology Analogy: Memory as Structured Trajectories
- STRIDE: Structure and Embedding Distillation with Attention for Graph Neural Networks
- Latent Plan Transformer for Trajectory Abstraction: Planning as Latent Space Inference
- HetSyn: Versatile Timescale Integration in Spiking Neural Networks via Heterogeneous Synapses
- Inductive transfer learning from regression to classification in ECG analysis
- Robust Sparse Bayesian Learning Based on Minimum Error Entropy for Noisy High-Dimensional Brain Activity Decoding
- Unsupervised Pairwise Learning Optimization Framework for Cross-Corpus EEG-Based Emotion Recognition Based on Prototype Representation
- Energy-Efficient Real-Time 4-Stage Sleep Classification at 10-Second Resolution: A Comprehensive Study
- Explainable Deep Neural Network for Multimodal ECG Signals: Intermediate vs Late Fusion
- A Graph Neural Network based on a Functional Topology Model: Unveiling the Dynamic Mechanisms of Non-Suicidal Self-Injury in Single-Channel EEG
- Enhancing Corrosion Resistance of Aluminum Alloys Through AI and ML Modeling
- Data-Driven Discovery of Interpretable Kalman Filter Variants through Large Language Models and Genetic Programming
- BaMANI: Bayesian Multi-Algorithm causal Network Inference
- Limitation Learning: Catching Adverse Dialog with GAIL
- Ontology-Guided Query Expansion for Biomedical Document Retrieval using Large Language Models
- An MLP Baseline for Handwriting Recognition Using Planar Curvature and Gradient Orientation
- Audio Flamingo Sound-CoT Technical Report: Improving Chain-of-Thought Reasoning in Sound Understanding
- From Pixels to Graphs: Deep Graph-Level Anomaly Detection on Dermoscopic Images
- Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings
- Adversarial Robustness in Distributed Quantum Machine Learning
- ComplicitSplat: Downstream Models are Vulnerable to Blackbox Attacks by 3D Gaussian Splat Camouflages
- On Balancing Sparsity with Reliable Connectivity in Distributed Network Design with Random K-out Graphs
- A Sobel-Gradient MLP Baseline for Handwritten Character Recognition
- Reduced-order modeling of Hamiltonian dynamics based on symplectic neural networks
- Optimizing Token Choice for Code Watermarking: A RL Approach
- Leveraging Geometric Insights in Hyperbolic Triplet Loss for Improved Recommendations
- Optimizing Neural Architectures for Hindi Speech Separation and Enhancement in Noisy Environments
- Robust Data Fusion via Subsampling
- Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing
- ATLAS: AI-Native Receiver Test-and-Measurement by Leveraging AI-Guided Search
- CarelessWhisper: Turning Whisper into a Causal Streaming Model
- Uncovering Emergent Physics Representations Learned In-Context by Large Language Models
- SimQFL: A Quantum Federated Learning Simulator with Real-Time Visualization
- Data-driven Trust Bootstrapping for Mobile Edge Computing-based Industrial IoT Services
- A Self-Ensemble Inspired Approach for Effective Training of Binary-Weight Spiking Neural Networks
- Towards SISO Bistatic Sensing for ISAC
- Synthesizing Accurate and Realistic T1-weighted Contrast-Enhanced MR Images using Posterior-Mean Rectified Flow
- DIT: Dimension Reduction View on Optimal NFT Rarity Meters
- Unfolded Laplacian Spectral Embedding: A Theoretically Grounded Approach to Dynamic Network Representation
- Adaptive Model-Predictive Control of a Soft Continuum Robot Using a Physics-Informed Neural Network Based on Cosserat Rod Theory
- MixCache: Mixture-of-Cache for Video Diffusion Transformer Acceleration
- Unlearning Comparator: A Visual Analytics System for Comparative Evaluation of Machine Unlearning Methods
- A Hierarchical Surrogate Model for Efficient Multi-Task Parameter Learning in Closed-Loop Control
- On the Importance of Behavioral Nuances: Amplifying Non-Obvious Motor Noise Under True Empirical Considerations May Lead to Briefer Assays and Faster Classification Processes
- Deep Semantic Inference over the Air: An Efficient Task-Oriented Communication System
- SIS-Challenge: Event-based Spatio-temporal Instance Segmentation Challenge at the CVPR 2025 Event-based Vision Workshop
- Efficient and Verifiable Privacy-Preserving Convolutional Computation for CNN Inference with Untrusted Clouds
- Optimal Condition for Initialization Variance in Deep Neural Networks: An SGD Dynamics Perspective
- Fairness Regularization in Federated Learning
- VARAN: Variational Inference for Self-Supervised Speech Models Fine-Tuning on Downstream Tasks
- Content Accuracy and Quality Aware Resource Allocation Based on LP-Guided DRL for ISAC-Driven AIGC Networks
- Time-Scale Coupling Between States and Parameters in Recurrent Neural Networks
- DE-VAE: Revealing Uncertainty in Parametric and Inverse Projections with Variational Autoencoders using Differential Entropy
- Communication-Efficient Distributed Asynchronous ADMM
- CC-Time: Cross-Model and Cross-Modality Time Series Forecasting
- DHG-Bench: A Comprehensive Benchmark on Deep Hypergraph Learning
- L-SR1: Learned Symmetric-Rank-One Preconditioning
- Convergence Analysis of the Lion Optimizer in Centralized and Distributed Settings
- Bi-Axial Transformers: Addressing the Increasing Complexity of EHR Classification
- Machine Learning-Based Manufacturing Cost Prediction from 2D Engineering Drawings via Geometric Features
- Local Cluster Cardinality Estimation for Adaptive Mean Shift
- Cost-Aware Contrastive Routing for LLMs
- Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference
- Results of the NeurIPS 2023 Neural MMO Competition on Multi-task Reinforcement Learning
- Toward Architecture-Agnostic Local Control of Posterior Collapse in VAEs
- Illuminating LLM Coding Agents: Visual Analytics for Deeper Understanding and Enhancement
- Deep Learning-Based Financial Time Series Forecasting via Sliding Window and Variational Mode Decomposition
- Data-driven particle dynamics: Structure-preserving coarse-graining for emergent behavior in non-equilibrium systems
- Physics-informed deep operator network for traffic state estimation
- FLARE: Fast Low-rank Attention Routing Engine
- Constructing Invariant and Equivariant Operations by Symmetric Tensor Network
- A Hybrid Surrogate for Electric Vehicle Parameter Estimation and Power Consumption via Physics-Informed Neural Operators
- FlowMol3: Flow Matching for 3D De Novo Small-Molecule Generation
- BUILDA: A Thermal Building Data Generation Framework for Transfer Learning
- Argos: A Decentralized Federated System for Detection of Traffic Signs in CAVs
- FedSODA: Federated Fine-tuning of LLMs via Similarity Group Pruning and Orchestrated Distillation Alignment
- A Multi-Resolution Benchmark Framework for Spatial Reasoning Assessment in Neural Networks
- Constrained Centroid Clustering: A Novel Approach for Compact and Structured Partitioning
- Short-Term Forecasting of Energy Production and Consumption Using Extreme Learning Machine: A Comprehensive MIMO based ELM Approach
- Online Ensemble Transformer for Accurate Cloud Workload Forecasting in Predictive Auto-Scaling
- Wavy Transformer
- Maximum Score Routing For Mixture-of-Experts
- Learning In-context $\pmb{n}$-grams with Transformers: Sub-$\pmb{n}$-grams Are Near-stationary Points
- TCUQ: Single-Pass Uncertainty Quantification from Temporal Consistency with Streaming Conformal Calibration for TinyML
- SparseMap: A Sparse Tensor Accelerator Framework Based on Evolution Strategy
- SNAP-UQ: Self-supervised Next-Activation Prediction for Single-Pass Uncertainty in TinyML
- Fed-DPRoC: Communication-Efficient Differentially Private and Robust Federated Learning
- Predicting the Performance of Graph Convolutional Networks with Spectral Properties of the Graph Laplacian
- Fairness-Aware Multi-view Evidential Learning with Adaptive Prior
- Monte Carlo Functional Regularisation for Continual Learning
- Design and Analysis of Robust Adaptive Filtering with the Hyperbolic Tangent Exponential Kernel M-Estimator Function for Active Noise Control
- Beyond Internal Data: Bounding and Estimating Fairness from Incomplete Data
- Seeing the Many: Exploring Parameter Distributions Conditioned on Features in Surrogates
- Outlier Detection of Poisson-Distributed Targets Using a Seabed Sensor Network
- A Perfectly Truthful Calibration Measure
- Causally-Guided Pairwise Transformer -- Towards Foundational Digital Twins in Process Industry
- Training Machine Learning Models on Human Spatio-temporal Mobility Data: An Experimental Study [Experiment Paper]
- MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models
- Tightening the mixed integer linear formulation for the piecewise linear approximation in general dimensions
- Sparse Attention across Multiple-context KV Cache
- From Heuristics to Data: Quantifying Site Planning Layout Indicators with Deep Learning and Multi-Modal Data
- Causal Structure Learning in Hawkes Processes with Complex Latent Confounder Networks
- Scalable Geospatial Data Generation Using AlphaEarth Foundations Model
- Fed-Meta-Align: A Similarity-Aware Aggregation and Personalization Pipeline for Federated TinyML on Heterogeneous Data
- Combinations of Fast Activation and Trigonometric Functions in Kolmogorov-Arnold Networks
- PCA- and SVM-Grad-CAM for Convolutional Neural Networks: Closed-form Jacobian Expression
- Scale-Disentangled Spatiotemporal Modeling for Long-term Traffic Emission Forecasting
- An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction
- M3OOD: Automatic Selection of Multimodal OOD Detectors
- Learning Marked Temporal Point Process Explanations based on Counterfactual and Factual Reasoning
- Set-Valued Transformer Network for High-Emission Mobile Source Identification
- Universal Learning of Nonlinear Dynamics
- FedUHD: Unsupervised Federated Learning using Hyperdimensional Computing
- LLMs Are In-Context Bandit Reinforcement Learners
- Advanced Gesture Recognition for Autism Spectrum Disorder Detection: Integrating YOLOv7, Video Augmentation, and VideoMAE for Naturalistic Video Analysis
- Testing Components of the Attention Schema Theory in Artificial Neural Networks
- Regress, Don't Guess -- A Regression-like Loss on Number Tokens for Language Models
- Diagnostic performance of deep learning for predicting glioma isocitrate dehydrogenase and 1p/19q co-deletion in MRI: a systematic review and meta-analysis
- Heuristic-Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models
- SGPT: Few-Shot Prompt Tuning for Signed Graphs
- Machine Learning-Based Automated Assessment of Intracorporeal Suturing in Laparoscopic Fundoplication
- On Fusing ChatGPT and Ensemble Learning in Discontinuous Named Entity Recognition in Health Corpora
- Emergent Symbol-like Number Variables in Artificial Neural Networks
- 2SSP: A Two-Stage Framework for Structured Pruning of LLMs
- Adaptive Exploration for Multi-Reward Multi-Policy Evaluation
- Dealing with Annotator Disagreement in Hate Speech Classification
- AI-Augmented Thyroid Scintigraphy for Robust Classification
- Dimensionality reduction for homological stability and global structure preservation
- SKALD: Learning-Based Shot Assembly for Coherent Multi-Shot Video Creation
- Alzheimer's Disease Classification Using Retinal OCT: TransnetOCT and Swin Transformer Models
- More Women, Same Stereotypes: Unpacking the Gender Bias Paradox in Large Language Models
- PVChat: Personalized Video Chat with One-Shot Learning
- Collaborating with AI Agents: Field Experiments on Teamwork, Productivity, and Performance
- Embodied Long Horizon Manipulation with Closed-loop Code Generation and Incremental Few-shot Adaptation
- SpectR: Dynamically Composing LM Experts with Spectral Routing
- Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models
- Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling
- LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models
- Co-Writing with AI, on Human Terms: Aligning Research with User Demands Across the Writing Process
- Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration
- CaRL: Learning Scalable Planning Policies with Simple Rewards
- Sharpness-Aware Minimization with Z-Score Gradient Filtering
- D-CODA: Diffusion for Coordinated Dual-Arm Data Augmentation
- Convert Language Model into a Value-based Strategic Planner
- HuB: Learning Extreme Humanoid Balance
- OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval
- SLAG: Scalable Language-Augmented Gaussian Splatting
- RT-Cache: Training-Free Retrieval for Real-Time Manipulation
- Unsupervised Invariant Risk Minimization
- JARVIS: A Multi-Agent Code Assistant for High-Quality EDA Script Generation
- Action is All You Need: Dual-Flow Generative Ranking Network for Recommendation
- Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models
- Explaining Large Language Models with gSMILE
- Flexible Tool Selection through Low-dimensional Attribute Alignment of Vision and Language
- INSIGHT: A Survey of In-Network Systems for Intelligent, High-Efficiency AI and Topology Optimization
- AutoChemSchematic AI: Agentic Physics-Aware Automation for Chemical Manufacturing Scale-Up
- Improving LLM Agents with Reinforcement Learning on Cryptographic CTF Challenges
- SLAC: Simulation-Pretrained Latent Action Space for Whole-Body Real-World RL
- Policy Search, Retrieval, and Composition via Task Similarity in Collaborative Agentic Systems
- Towards an Explainable Comparison and Alignment of Feature Embeddings
- Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias
- Fast Geometric Embedding for Node Influence Maximization
- From Teacher to Student: Tracking Memorization Through Model Distillation
- Continual Learning with Columnar Spiking Neural Networks
- Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM- and Agent-Based Repair Systems
- Controlled Generation with Equivariant Variational Flow Matching
- Geological Everything Model 3D: A Physics-informed Promptable Foundation Model for Unified and Zero-Shot Subsurface Understanding
- Multi-agent Auditory Scene Analysis
- OrthoRank: Token Selection via Sink Token Orthogonality for Efficient LLM inference
- A Novel Approach for Estimating Largest Lyapunov Exponents in One-Dimensional Chaotic Time Series Using Machine Learning
- LoRA-Augmented Generation (LAG) for Knowledge-Intensive Language Tasks
- Loss-Complexity Landscape and Model Structure Functions
- LeAdQA: LLM-Driven Context-Aware Temporal Grounding for Video Question Answering
- Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems
- LinguaSafe: A Comprehensive Multilingual Safety Benchmark for Large Language Models
- FedUNet: A Lightweight Additive U-Net Module for Federated Learning with Heterogeneous Models
- DCSCR: A Class-Specific Collaborative Representation based Network for Image Set Classification
- CLAIRE-DSA: Fluoroscopic Image Classification for Quality Assurance of Computer Vision Pipelines in Acute Ischemic Stroke
- Harnessing Group-Oriented Consistency Constraints for Semi-Supervised Semantic Segmentation in CdZnTe Semiconductors
- CRED-SQL: Enhancing Real-world Large Scale Database Text-to-SQL Parsing through Cluster Retrieval and Execution Description
- Randomized PCA Forest for Outlier Detection
- Bridging Human and LLM Judgments: Understanding and Narrowing the Gap
- Vehicle detection from GSV imagery: Predicting travel behaviour for cycling and motorcycling using Computer Vision
- A Shift in Perspective on Causality in Domain Generalization
- Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward
- Next Visual Granularity Generation
- Learning to Steer: Input-dependent Steering for Multimodal LLMs
- Context Matters: Incorporating Target Awareness in Conversational Abusive Language Detection
- Toward Storage-Aware Learning with Compressed Data: An Empirical Exploratory Study on JPEG
- HRS: Hybrid Representation Framework with Scheduling Awareness for Time Series Forecasting in Crowdsourced Cloud-Edge Platforms
- Word Meanings in Transformer Language Models
- One-Class Intrusion Detection with Dynamic Graphs
- CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis
- A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models
- SecFSM: Knowledge Graph-Guided Verilog Code Generation for Secure Finite State Machines in Systems-on-Chip
- Learning local and global prototypes with optimal transport for unsupervised anomaly detection and localization
- SEDEG: Sequential Enhancement of Decoder and Encoder's Generality for Class Incremental Learning with Small Memory
- Multi-Phase Automated Segmentation of Dental Structures in CBCT Using a Lightweight Auto3DSeg and SegResNet Implementation
- SL-ACC: A Communication-Efficient Split Learning Framework with Adaptive Channel-wise Compression
- Kourkoutas-Beta: A Sunspike-Driven Adam Optimizer with Desert Flair
- Vitamin N: Benefits of Different Forms of Public Greenery for Urban Health
- The Application of Transformer-Based Models for Predicting Consequences of Cyber Attacks
- Can Large Models Teach Student Models to Solve Mathematical Problems Like Human Beings? A Reasoning Distillation Method via Multi-LoRA Interaction
- Using AI for User Representation: An Analysis of 83 Persona Prompts
- XR-NPE: High-Throughput Mixed-precision SIMD Neural Processing Engine for Extended Reality Perception Workloads
- Hierarchical Evaluation Function (HEF): A Multi-Metric Approach for Optimizing Demand Forecasting Models
- Reinforced Context Order Recovery for Adaptive Reasoning and Planning
- From Transthoracic to Transesophageal: Cross-Modality Generation using LoRA Diffusion
- VerilogLAVD: LLM-Aided Rule Generation for Vulnerability Detection in Verilog
- Contrastive Representations for Temporal Reasoning
- Spot the BlindSpots: Systematic Identification and Quantification of Fine-Grained LLM Biases in Contact Center Summaries
- RepreGuard: Detecting LLM-Generated Text by Revealing Hidden Representation Patterns
- Exploring Scholarly Data by Semantic Query on Knowledge Graph Embedding Space
- Unravelling Responsibility for AI
- FCL-ViT: Task-Aware Attention Tuning for Continual Learning
- Encoding Argumentation Frameworks to Propositional Logic Systems
- Does Prior Data Matter? Exploring Joint Training in the Context of Few-Shot Class-Incremental Learning
- Advancing AI-Scientist Understanding: Multi-Agent LLMs with Interpretable Physics Reasoning
- Learning Adaptive Parallel Reasoning with Language Models
- Bridging Econometrics and AI: VaR Estimation via Reinforcement Learning and GARCH Models
- Explainable Reinforcement Learning Agents Using World Models
- LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios
- LocalGPT: Benchmarking and Advancing Large Language Models for Local Life Services in Meituan
- Fragile Preferences: A Deep Dive Into Order Effects in Large Language Models
- Mobile-R1: Towards Interactive Reinforcement Learning for VLM-Based Mobile Agent via Task-Level Rewards
- Opus: A Prompt Intention Framework for Complex Workflow Generation
- InsightX Agent: An LMM-based Agentic Framework with Integrated Tools for Reliable X-ray NDT Analysis
- Unveiling the Unseen: A Comprehensive Survey on Explainable Anomaly Detection in Images and Videos
- Interpretable and Robust AI in EEG Systems: A Survey
- NeFT: Negative Feedback Training to Improve Robustness of Compute-In-Memory DNN Accelerators
- Self-Tuning PID Control via a Hybrid Actor-Critic-Based Neural Structure for Quadcopter Control
- Towards Safe Autonomous Driving Policies using a Neuro-Symbolic Deep Reinforcement Learning Approach
- New Interaction Paradigm for Complex EDA Software Leveraging GPT
- Large language models can replicate cross-cultural differences in personality
- A Deep Learning Approach to Teeth Segmentation and Orientation from Panoramic X-rays
- TRIALSCOPE: A Unifying Causal Framework for Scaling Real-World Evidence Generation with Biomedical Language Models
- GeoSAM: Fine-tuning SAM with Multi-Modal Prompts for Mobility Infrastructure Segmentation
- Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation
- TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods
- Quantformer: from attention to profit with a quantitative transformer trading strategy
- An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models
- Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models
- Large Language Models Must Be Taught to Know What They Don't Know
- Benchmarking Spectral Graph Neural Networks: A Comprehensive Study on Effectiveness and Efficiency
- European Space Agency Benchmark for Anomaly Detection in Satellite Telemetry
- Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing
- Regime-Aware Time Weighting for Physics-Informed Neural Networks
- V-RoAst: Visual Road Assessment. Can VLM be a Road Safety Assessor Using the iRAP Standard?
- A Law of Next-Token Prediction in Large Language Models
- Solving Stochastic Orienteering Problems with Chance Constraints Using a GNN Powered Monte Carlo Tree Search
- S2Cap: A Benchmark and a Baseline for Singing Style Captioning
- GraphLand: Evaluating Graph Machine Learning Models on Diverse Industrial Data
- Interpreting Time Series Forecasts with LIME and SHAP: A Case Study on the Air Passengers Dataset
- Fortifying the Agentic Web: A Unified Zero-Trust Architecture Against Logic-layer Threats
- Region-Level Context-Aware Multimodal Understanding
- The Self-Execution Benchmark: Measuring LLMs' Attempts to Overcome Their Lack of Self-Execution
- CRoC: Context Refactoring Contrast for Graph Anomaly Detection with Limited Supervision
- TSLA: A Task-Specific Learning Adaptation for Semantic Segmentation on Autonomous Vehicles Platform
- "My productivity is boosted, but ..." Demystifying Users' Perception on AI Coding Assistants
- HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization
- Mutually Assured Deregulation
- Synchronization Dynamics of Heterogeneous, Collaborative Multi-Agent AI Systems
- Semantic Discrepancy-aware Detector for Image Forgery Identification
- A Large-Scale Web Search Dataset for Federated Online Learning to Rank
- Synthetic Data is Sufficient for Zero-Shot Visual Generalization from Offline Data
- Uncovering Systematic Failures of LLMs in Verifying Code Against Natural Language Specifications
- Navigating the Exploration-Exploitation Tradeoff in Inference-Time Scaling of Diffusion Models
- IPGPhormer: Interpretable Pathology Graph-Transformer for Survival Analysis
- MedKGent: A Large Language Model Agent Framework for Constructing Temporally Evolving Medical Knowledge Graph
- Where to Start Alignment? Diffusion Large Language Model May Demand a Distinct Position
- Extracting Post-Acute Sequelae of SARS-CoV-2 Infection Symptoms from Clinical Notes via Hybrid Natural Language Processing
- SRMA-Mamba: Spatial Reverse Mamba Attention Network for Pathological Liver Segmentation in MRI Volumes
- LumiMAS: A Comprehensive Framework for Real-Time Monitoring and Enhanced Observability in Multi-Agent Systems
- Quantum Flow Matching
- fCrit: A Visual Explanation System for Furniture Design Creative Support
- Adversarial Attacks on VQA-NLE: Exposing and Alleviating Inconsistencies in Visual Question Answering Explanations
- Tactile Gesture Recognition with Built-in Joint Sensors for Industrial Robots
- Inverse-LLaVA: Eliminating Alignment Pre-training Through Text-to-Vision Mapping
- A Robust Cross-Domain IDS using BiGRU-LSTM-Attention for Medical and Industrial IoT Security
- Standardization of Neuromuscular Reflex Analysis -- Role of Fine-Tuned Vision-Language Model Consortium and OpenAI gpt-oss Reasoning LLM Enabled Decision Support System
- EXOTIC: An Exact, Optimistic, Tree-Based Algorithm for Min-Max Optimization
- Cold-RL: Learning Cache Eviction with Offline Reinforcement Learning for NGINX
- Mitigating Hallucinations in Large Language Models via Causal Reasoning
- Design and Validation of a Responsible Artificial Intelligence-based System for the Referral of Diabetic Retinopathy Patients
- An Introduction to Sliced Optimal Transport
- An Initial Study of Bird's-Eye View Generation for Autonomous Vehicles using Cross-View Transformers
- Rethinking Safety in LLM Fine-tuning: An Optimization Perspective
- Defining and Benchmarking a Data-Centric Design Space for Brain Graph Construction
- CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection
- Systematic Analysis of MCP Security
- OS-R1: Agentic Operating System Kernel Tuning with Reinforcement Learning
- Deep Learning Model for Amyloidogenicity Prediction using a Pre-trained Protein LLM
- Widening the Network Mitigates the Impact of Data Heterogeneity on FedAvg
- Energy-Efficient Wireless LLM Inference via Uncertainty and Importance-Aware Speculative Decoding
- Beyond Modality Limitations: A Unified MLLM Approach to Automated Speaking Assessment with Effective Curriculum Learning
- SSPO: Self-traced Step-wise Preference Optimization for Process Supervision and Reasoning Compression
- OpenMoCap: Rethinking Optical Motion Capture under Real-world Occlusion
- A Generalized Genetic Random Field Method for the Genetic Association Analysis of Sequencing Data
- How can we trust opaque systems? Criteria for robust explanations in XAI
- SpotVLM: Cloud-edge Collaborative Real-time VLM based on Context Transfer
- Score-informed Neural Operator for Enhancing Ordering-based Causal Discovery
- Breaking Language Barriers: Equitable Performance in Multilingual Language Models
- Robust Federated Learning under Adversarial Attacks via Loss-Based Client Clustering
- Deploying Models to Non-participating Clients in Federated Learning without Fine-tuning: A Hypernetwork-based Approach
- A Taxonomy of Hierarchical Multi-Agent Systems: Design Patterns, Coordination Mechanisms, and Industrial Applications
- ToolACE-MT: Non-Autoregressive Generation for Agentic Multi-Turn Interaction
- TTA-DAME: Test-Time Adaptation with Domain Augmentation and Model Ensemble for Dynamic Driving Conditions
- Multi-Level Knowledge Distillation and Dynamic Self-Supervised Learning for Continual Learning
- A Unified Cortical Circuit Model with Divisive Normalization and Self-Excitation for Robust Representation and Memory Maintenance
- Asymmetric Diffusion Recommendation Model
- MATPAC++: Enhanced Masked Latent Prediction for Self-Supervised Audio Representation Learning
- Efficient Modular Learning through Naive LoRA Summation: Leveraging Orthogonality in High-Dimensional Models
- MOON: Generative MLLM-based Multimodal Representation Learning for E-commerce Product Understanding
- Predicting ChatGPT Use in Assignments: Implications for AI-Aware Assessment Design
- BConformeR: A Conformer Based on Mutual Sampling for Unified Prediction of Continuous and Discontinuous Antibody Binding Sites
- Q-FSRU: Quantum-Augmented Frequency-Spectral Fusion for Medical Visual Question Answering
- Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation
- Large Language Models Enable Personalized Nudges to Promote Carbon Offsetting Among Air Travellers
- Generalized invariants meet constitutive neural networks: A novel framework for hyperelastic materials
- VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models
- Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability
- Generic Event Boundary Detection via Denoising Diffusion
- J6: Jacobian-Driven Role Attribution for Multi-Objective Prompt Optimization in LLMs
- STEM: Efficient Relative Capability Evaluation of LLMs through Structured Transition Samples
- Generative Medical Event Models Improve with Scale
- Simple o3: Towards Interleaved Vision-Language Reasoning
- DynamixSFT: Dynamic Mixture Optimization of Instruction Tuning Collections
- Substituting Proof of Work in Blockchain with Training-Verified Collaborative Model Computation
- KP-INR: A Dual-Branch Implicit Neural Representation Model for Cardiac Cine MRI Reconstruction
- Demystifying Foreground-Background Memorization in Diffusion Models
- AICRN: Attention-Integrated Convolutional Residual Network for Interpretable Electrocardiogram Analysis
- RealTalk: Realistic Emotion-Aware Lifelike Talking-Head Synthesis
- Self-Guided Action Diffusion
- Exploring Multimodal AI Reasoning for Meteorological Forecasting from Skew-T Diagrams
- Improving Pre-Trained Vision-Language-Action Policies with Model-Based Search
- ProtTeX-CC: Activating In-Context Learning in Protein LLM via Two-Stage Instruction Compression
- Towards Generalizable Human Activity Recognition: A Survey
- Unlearning at Scale: Implementing the Right to be Forgotten in Large Language Models
- Distribution Matching via Generalized Consistency Models
- LinkAnchor: An Autonomous LLM-Based Agent for Issue-to-Commit Link Recovery
- STM3: Mixture of Multiscale Mamba for Long-Term Spatio-Temporal Time-Series Prediction
- The Maximum Coverage Model and Recommendation System for UAV Vertiports Location Planning
- GridCodex: A RAG-Driven AI Framework for Power Grid Code Reasoning and Compliance
- EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding
- GTool: Graph Enhanced Tool Planning with Large Language Model
- Beyond Ethical Alignment: Evaluating LLMs as Artificial Moral Assistants
- HeroBench: A Benchmark for Long-Horizon Planning and Structured Reasoning in Virtual Worlds
- Reinforcement Learning with Rubric Anchors
- [Social] Allostasis: Or, How I Learned To Stop Worrying and Love The Noise
- Scaling Multi-Agent Epistemic Planning through GNN-Derived Heuristics
- CAMAR: Continuous Actions Multi-Agent Routing
- E3RG: Building Explicit Emotion-driven Empathetic Response Generation System with Multimodal Large Language Model
- Reliability, Embeddedness, and Agency: A Utility-Driven Mathematical Framework for Agent-Centric AI Adoption
- FuSaR: A Fuzzification-Based Method for LRM Safety-Reasoning Balance
- Do Large Language Model Agents Exhibit a Survival Instinct? An Empirical Study in a Sugarscape-Style Simulation
- Towards Open-Ended Emotional Support Conversations in LLMs via Reinforcement Learning with Future-Oriented Rewards
- OPTIC-ER: A Reinforcement Learning Framework for Real-Time Emergency Response and Equitable Resource Allocation in Underserved African Communities
- EvolMathEval: Towards Evolvable Benchmarks for Mathematical Reasoning via Evolutionary Testing
- e-boost: Boosted E-Graph Extraction with Adaptive Heuristics and Exact Solving
- PC-Sampler: Position-Aware Calibration of Decoding Bias in Masked Diffusion Models
- G$^2$RPO-A: Guided Group Relative Policy Optimization with Adaptive Guidance
- A Language-Signal-Vision Multimodal Framework for Multitask Cardiac Analysis
- Bayesian Optimization-based Search for Agent Control in Automated Game Testing
- Exploring Autonomous Agents: A Closer Look at Why They Fail When Completing Tasks
- Vibe2Spike: Batteryless Wireless Tags for Vibration Sensing with Event Cameras and Spiking Networks
- Categorical Construction of Logically Verifiable Neural Architectures
- Toward Practical Equilibrium Propagation: Brain-inspired Recurrent Neural Network with Feedback Regulation and Residual Connections
- Generative AI in Training and Coaching: Redefining the Design Process of Learning Materials
- Assessing Representation Stability for Transformer Models
- Collaborative Learning-Enhanced Lightweight Models for Predicting Arterial Blood Pressure Waveform in a Large-scale Perioperative Dataset
- RRRA: Resampling and Reranking through a Retriever Adapter
- LLM-Based Intelligent Agents for Music Recommendation: A Comparison with Classical Content-Based Filtering
- Revealing Neurocognitive and Behavioral Patterns by Unsupervised Manifold Learning from Dynamic Brain Data
- Contrastive Regularization over LoRA for Multimodal Biomedical Image Incremental Learning
- Learning Internal Biological Neuron Parameters and Complexity-Based Encoding for Improved Spiking Neural Networks Performance
- Deep Language Geometry: Constructing a Metric Space from LLM Weights
- Lifelong Learner: Discovering Versatile Neural Solvers for Vehicle Routing Problems
- Comparative Analysis of Time Series Foundation Models for Demographic Forecasting: Enhancing Predictive Accuracy in US Population Dynamics
- Future progress in artificial intelligence: A survey of expert opinion
- Age-Normalized HRV Features for Non-Invasive Glucose Prediction: A Pilot Sleep-Aware Machine Learning Study
- Adaptive Spiking with Plasticity for Energy Aware Neuromorphic Systems
- Real Time Child Abduction And Detection System
- Towards Generalizable Learning Models for EEG-Based Identification of Pain Perception
- Scalable, Technology-Agnostic Diagnosis and Predictive Maintenance for Point Machine using Deep Learning
- Track Component Failure Detection Using Data Analytics over existing STDS Track Circuit data
- RefAdGen: High-Fidelity Advertising Image Generation
- Separating Knowledge and Perception with Procedural Data
- Next-Gen Education: Enhancing AI for Microlearning
- Centralized Permutation Equivariant Policy for Cooperative Multi-Agent Reinforcement Learning
- Listening with Language Models: Using LLMs to Collect and Interpret Classroom Feedback
- Street Review: A Participatory AI-Based Framework for Assessing Streetscape Inclusivity
- Navigating the New Landscape: A Conceptual Model for Project-Based Assessment (PBA) in the Age of GenAI
- Code Vulnerability Detection Across Different Programming Languages with AI Models
- Enhancing GraphQL Security by Detecting Malicious Queries Using Large Language Models, Sentence Transformers, and Convolutional Neural Networks
- Benchmark Dataset Generation and Evaluation for Excel Formula Repair with LLMs
- Privacy-Aware Detection of Fake Identity Documents: Methodology, Benchmark, and Improved Detection Methods (FakeIDet2)
- Are AI Machines Making Humans Obsolete?
- FusionFM: Fusing Eye-specific Foundational Models for Optimized Ophthalmic Diagnosis
- UniDCF: A Foundation Model for Comprehensive Dentocraniofacial Hard Tissue Reconstruction
- The Stories We Govern By: AI, Risk, and the Power of Imaginaries
- BRIEF: BRain-Inspired network connection search with Extensive temporal feature Fusion enhances disease classification
- SafeSieve: From Heuristics to Experience in Progressive Pruning for LLM-based Multi-Agent Communication
- Ovis2.5 Technical Report
- Artificial Intelligence in Rural Healthcare Delivery: Bridging Gaps and Enhancing Equity through Innovation
- Can we Evaluate RAGs with Synthetic Data?
- Using Natural Language for Human-Robot Collaboration in the Real World
- Uncalibrated Reasoning: GRPO Induces Overconfidence for Stochastic Outcomes
- Labels or Input? Rethinking Augmentation in Multimodal Hate Detection
- FairTabGen: Unifying Counterfactual and Causal Fairness in Synthetic Tabular Data Generation
- Rethinking Autonomy: Preventing Failures in AI-Driven Software Engineering
- Every 28 Days the AI Dreams of Soft Skin and Burning Stars: Scaffolding AI Agents with Hormones and Emotions
- When Does Language Transfer Help? Sequential Fine-Tuning for Cross-Lingual Euphemism Detection
- Recent Advances in Transformer and Large Language Models for UAV Applications
- What Matters for Bioacoustic Encoding
- SupraTok: Cross-Boundary Tokenization for Enhanced Language Model Performance
- AI-Augmented CI/CD Pipelines: From Code Commit to Production with Autonomous Decisions
- Data Shift of Object Detection in Autonomous Driving
- AdaRing: Towards Ultra-Light Vision-Language Adaptation via Cross-Layer Tensor Ring Decomposition
- Singing Syllabi with Virtual Avatars: Enhancing Student Engagement Through AI-Generated Music and Digital Embodiment
- SimInterview: Transforming Business Education through Large Language Model-Based Simulated Multilingual Interview Training System
- Discovering Expert-Level Nash Equilibrium Algorithms with Large Language Models
- EVTP-IVS: Effective Visual Token Pruning For Unifying Instruction Visual Segmentation In Multi-Modal Large Language Models
- Integrating Symbolic RL Planning into a BDI-based Autonomous UAV Framework: System Integration and SIL Validation
- Deciphering the Interplay between Attack and Protection Complexity in Privacy-Preserving Federated Learning
- CORE: Measuring Multi-Agent LLM Interaction Quality under Game-Theoretic Pressures
- ENA: Efficient N-dimensional Attention
- No More Blind Spots: Learning Vision-Based Omnidirectional Bipedal Locomotion for Challenging Terrain
- HPD: Hybrid Projection Decomposition for Robust State Space Models on Analog CIM Hardware
- Extending Straight-Through Estimation for Robust Neural Networks on Analog CIM Hardware
- A Comprehensive Review of AI Agents: Transforming Possibilities in Technology and Beyond
- TBGRecall: A Generative Retrieval Model for E-commerce Recommendation Scenarios
- Finite Automata Extraction: Low-data World Model Learning as Programs from Gameplay Video
- EvoCut: Strengthening Integer Programs via Evolution-Guided Language Models
- LARC: Towards Human-level Constrained Retrosynthesis Planning through an Agentic Framework
- QuarkMed Medical Foundation Model Technical Report
- CHBench: A Cognitive Hierarchy Benchmark for Evaluating Strategic Reasoning Capability of LLMs
- Data Mixing Optimization for Supervised Fine-Tuning of Large Language Models
- UniCast: A Unified Multimodal Prompting Framework for Time Series Forecasting
- Rigorous Feature Importance Scores based on Shapley Value and Banzhaf Index
- Chart-CoCa: Self-Improving Chart Understanding of Vision LMs via Code-Driven Synthesis and Candidate-Conditioned Answering
- FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction
- Modeling Relational Logic Circuits for And-Inverter Graph Convolutional Network
- AgentCDM: Enhancing Multi-Agent Collaborative Decision-Making via ACH-Inspired Structured Reasoning
- AI Models for Depressive Disorder Detection and Diagnosis: A Review
- Bongard-RWR+: Real-World Representations of Fine-Grained Concepts in Bongard Problems
- Active inference for action-unaware agents
- MAPF-World: Action World Model for Multi-Agent Path Finding
- Overcoming Knowledge Discrepancies: Structuring Reasoning Threads through Knowledge Balancing in Interactive Scenarios
- MOVER: Multimodal Optimal Transport with Volume-based Embedding Regularization
- RLNVR: Reinforcement Learning from Non-Verified Real-World Rewards
- Mantis: A Simulation-Grounded Foundation Model for Disease Forecasting
- RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts
- Wisdom of the Crowd: Reinforcement Learning from Coevolutionary Collective Feedback
- Hierarchical knowledge guided fault intensity diagnosis of complex industrial systems
- GraphCogent: Overcoming LLMs' Working Memory Constraints via Multi-Agent Collaboration in Complex Graph Understanding
- Non-Iterative Symbolic-Aided Chain-of-Thought for Logical Reasoning
- GALA: Can Graph-Augmented Large Language Model Agentic Workflows Elevate Root Cause Analysis?
- The Yokai Learning Environment: Tracking Beliefs Over Space and Time
- Advanced DOA Regulation with a Whale-Optimized Fractional Order Fuzzy PID Framework
- Root Cause Analysis of Hydrogen Bond Separation in Spatio-Temporal Molecular Dynamics using Causal Models
- Help or Hurdle? Rethinking Model Context Protocol-Augmented Large Language Models
- An LLM + ASP Workflow for Joint Entity-Relation Extraction
- Cognitive Structure Generation: From Educational Priors to Policy Optimization
Research Sources: 816 | Generated: 8/25/2025