AI RESEARCH PAPERS & ACADEMIC SOURCES
- Frequency-Compensated Network for Daily Arctic Sea Ice Concentration Prediction
- Buffer-free Class-Incremental Learning with Out-of-Distribution Detection
- TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
- AnyPlace: Learning Generalized Object Placement for Robot Manipulation
- Benchmarking for Practice: Few-Shot Time-Series Crop-Type Classification on the EuroCropsML Dataset
- Estimating Deep Learning energy consumption based on model architecture and training environment
- Reinforcement Learning in Categorical Cybernetics
- FoMo-0D: A Foundation Model for Zero-shot Tabular Outlier Detection
- Expressiveness of Multi-Neuron Convex Relaxations in Neural Network Certification
- TReMu: Towards Neuro-Symbolic Temporal Reasoning for LLM-Agents with Memory in Multi-Session Dialogues
- The Value of Information in Human-AI Decision-making
- RollPacker: Mitigating Long-Tail Rollouts for Fast, Synchronous RL Post-Training
- MPC-based Deep Reinforcement Learning Method for Space Robotic Control with Fuel Sloshing Mitigation
- Physics Informed Neural Networks for design optimisation of diamond particle detectors for charged particle fast-tracking at high luminosity hadron colliders
- IntSR: An Integrated Generative Framework for Search and Recommendation
- Data-driven Neural Networks for Windkessel Parameter Calibration
- The Use of the Simplex Architecture to Enhance Safety in Deep-Learning-Powered Autonomous Systems
- Fine-Tuning LLMs to Analyze Multiple Dimensions of Code Review: A Maximum Entropy Regulated Long Chain-of-Thought Approach
- Adoption, usability and perceived clinical value of a UK AI clinical reference platform (iatroX): a mixed-methods formative evaluation of real-world usage and a 1,223-respondent user survey
- Semantic Edge-Cloud Communication for Real-Time Urban Traffic Surveillance with ViT and LLMs over Mobile Networks
- Data-Centric Elastic Pipeline Parallelism for Efficient Long-Context LLM Training
- A Decision Theoretic Framework for Measuring AI Reliance
- Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation
- Examining the Prevalence and Dynamics of AI-Generated Media in Art Subreddits
- The Asymptotic Behavior of Attention in Transformers
- i-LAVA: Insights on Low Latency Voice-2-Voice Architecture for Agents
- Dual-Path Phishing Detection: Integrating Transformer-Based NLP with Structural URL Analysis
- AnywhereVLA: Language-Conditioned Exploration and Mobile Manipulation
- Automatic Red Teaming LLM-based Agents with Model Context Protocol Tools
- Data-Efficient Time-Dependent PDE Surrogates: Graph Neural Simulators vs. Neural Operators
- A Finite-Time Analysis of TD Learning with Linear Function Approximation without Projections or Strong Convexity
- Improved Scaling Laws in Linear Regression via Data Reuse
- Generalizing while preserving monotonicity in comparison-based preference learning models
- CopulaSMOTE: A Copula-Based Oversampling Approach for Imbalanced Classification in Diabetes Prediction
- An entropy-optimal path to humble AI
- Enhanced Generative Model Evaluation with Clipped Density and Coverage
- Training-Free Stein Diffusion Guidance: Posterior Correction for Sampling Beyond High-Density Regions
- Contextual Combinatorial Bandits with Changing Action Sets via Gaussian Processes
- Rosenthal-type inequalities for linear statistics of Markov chains
- Energy based diffusion generator for efficient sampling of Boltzmann distributions
- Understanding Optimization in Deep Learning with Central Flows
- Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble
- Regularization can make diffusion models more efficient
- Provably Sample-Efficient Robust Reinforcement Learning with Average Reward
- Why and When Deep is Better than Shallow: An Implementation-Agnostic State-Transition View of Depth Supremacy
- Bridging Arbitrary and Tree Metrics via Differentiable Gromov Hyperbolicity
- Response to Promises and Pitfalls of Deep Kernel Learning
- Incorporating External Controls for Estimating the Average Treatment Effect on the Treated with High-Dimensional Data: Retaining Double Robustness and Ensuring Double Safety
- Towards Complete Causal Explanation with Expert Knowledge
- Hybrid Summary Statistics
- Revenue Maximization Under Sequential Price Competition Via The Estimation Of s-Concave Demand Functions
- Uncertainty-Aware Surrogate-based Amortized Bayesian Inference for Computationally Expensive Models
- Tensor State Space-based Dynamic Multilayer Network Modeling
- Learning to Bid Optimally and Efficiently in Adversarial First-price Auctions
- Empirical PAC-Bayes bounds for Markov chains
- Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute
- WISER: Segmenting watermarked region - an epidemic change-point perspective
- Breaking the curse of dimensionality for linear rules: optimal predictors over the ellipsoid
- Why Settle for One? Text-to-ImageSet Generation and Evaluation
- 3D-MoRe: Unified Modal-Contextual Reasoning for Embodied Question Answering
- GeMix: Conditional GAN-Based Mixup for Improved Medical Image Augmentation
- CNS-Bench: Benchmarking Image Classifier Robustness Under Continuous Nuisance Shifts
- RIS-LAD: A Benchmark and Model for Referring Low-Altitude Drone Image Segmentation
- VideoPASTA: 7K Preference Pairs That Matter for Video-LLM Alignment
- Learning Flow-Guided Registration for RGB-Event Semantic Segmentation
- Automated Visual Attention Detection using Mobile Eye Tracking in Behavioral Classroom Studies
- Instance-aware Image Colorization with Controllable Textual Descriptions and Segmentation Masks
- CONSIGN: Conformal Segmentation Informed by Spatial Groupings via Decomposition
- MMaDA: Multimodal Large Diffusion Language Models
- MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios
- Beyond Quantity: Distribution-Aware Labeling for Visual Grounding
- ProstaTD: Bridging Surgical Triplet from Classification to Fully Supervised Detection
- O-MaMa: Learning Object Mask Matching between Egocentric and Exocentric Views
- LadderMIL: Multiple Instance Learning with Coarse-to-Fine Self-Distillation
- Efficiently Disentangling CLIP for Multi-Object Perception
- HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
- Diff-Reg v2: Diffusion-Based Matching Matrix Estimation for Image Matching and 3D Registration
- REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints
- LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition?
- Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection
- Autoregressive Image Generation with Randomized Parallel Decoding
- Radar-Guided Polynomial Fitting for Metric Depth Estimation
- Improving Brain Disorder Diagnosis with Advanced Brain Function Representation and Kolmogorov-Arnold Networks
- Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain
- Model Agnostic Defense against Adversarial Patch Attacks on Object Detection in Unmanned Aerial Vehicles
- SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection
- Lightweight Modular Parameter-Efficient Tuning for Open-Vocabulary Object Detection
- Asynchronous Perception Machine For Efficient Test-Time-Training
- Training-Free Layout-to-Image Generation with Marginal Attention Constraints
- Supercharged One-step Text-to-Image Diffusion Models with Negative Prompts
- GVDepth: Zero-Shot Monocular Depth Estimation for Ground Vehicles based on Probabilistic Cue Fusion
- MonSter++: Unified Stereo Matching, Multi-view Stereo, and Real-time Stereo with Monodepth Priors
- Technical report on label-informed logit redistribution for better domain generalization in low-shot classification with foundation models
- MASt3R-Fusion: Integrating Feed-Forward Visual Model with IMU, GNSS for High-Functionality SLAM
- ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction
- ArchGPT: Understanding the World's Architectures with Large Multimodal Models
- Autoregressive End-to-End Planning with Time-Invariant Spatial Alignment and Multi-Objective Policy Refinement
- Marching Neurons: Accurate Surface Extraction for Neural Implicit Shapes
- KeyWorld: Key Frame Reasoning Enables Effective and Efficient World Models
- Cross-Modal Instructions for Robot Motion Generation
- CHARM: Control-point-based 3D Anime Hairstyle Auto-Regressive Modeling
- Human-like Navigation in a World Built for Humans
- SuperPatchMatch: an Algorithm for Robust Correspondences using Superpixel Patches
- Retina Vision Transformer (RetinaViT): Introducing Scaled Patches into Vision Transformers
- Quantized Visual Geometry Grounded Transformer
- NewtonGen: Physics-Consistent and Controllable Text-to-Video Generation via Neural Newtonian Dynamics
- SD3.5-Flash: Distribution-Guided Distillation of Generative Flows
- BlockFUL: Enabling Unlearning in Blockchained Federated Learning
- Optimal Transport Based Hyperspectral Unmixing for Highly Mixed Observations
- Equi-RO: A 4D mmWave Radar Odometry via Equivariant Networks
- RAM-NAS: Resource-aware Multiobjective Neural Architecture Search Method for Robot Vision Tasks
- ArtUV: Artist-style UV Unwrapping
- SeamCrafter: Enhancing Mesh Seam Generation for Artist UV Unwrapping via Reinforcement Learning
- SLAM-Free Visual Navigation with Hierarchical Vision-Language Perception and Coarse-to-Fine Semantic Topological Planning
- SlideMamba: Entropy-Based Adaptive Fusion of GNN and Mamba for Enhanced Representation Learning in Digital Pathology
- Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets
- Learning to Look: Cognitive Attention Alignment with Vision-Language Models
- Decipher-MR: A Vision-Language Foundation Model for 3D MRI Representations
- Instruction-tuned Self-Questioning Framework for Multimodal Reasoning
- Every Subtlety Counts: Fine-grained Person Independence Micro-Action Recognition via Distributionally Robust Optimization
- Dense Semantic Matching with VGGT Prior
- MedVSR: Medical Video Super-Resolution with Cross State-Space Propagation
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
- A Sentinel-3 foundation model for ocean colour
- Does FLUX Already Know How to Perform Physically Plausible Image Composition?
- Vision Transformers: the threat of realistic adversarial patches
- UniTransfer: Video Concept Transfer via Progressive Spatial and Timestep Decomposition
- VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception
- Mammo-CLIP Dissect: A Framework for Analysing Mammography Concepts in Vision-Language Models
- MOSS-ChatV: Reinforcement Learning with Process Reasoning Reward for Video Temporal Reasoning
- MotionFlow: Learning Implicit Motion Flow for Complex Camera Trajectory Control in Video Generation
- The Unwinnable Arms Race of AI Image Detection
- WAVECLIP: Wavelet Tokenization for Adaptive-Resolution CLIP
- Can Less Precise Be More Reliable? A Systematic Evaluation of Quantization's Impact on CLIP Beyond Accuracy
- Learning Conformal Explainers for Image Classifiers
- Unlocking Noise-Resistant Vision: Key Architectural Secrets for Robust Models
- Decoding the Surgical Scene: A Scoping Review of Scene Graphs in Surgery
- A Real-Time On-Device Defect Detection Framework for Laser Power-Meter Sensors via Unsupervised Learning
- An Adaptor for Triggering Semi-Supervised Learning to Out-of-Box Serve Deep Image Clustering
- SiNGER: A Clearer Voice Distills Vision Transformers Further
- Fast-SEnSeI: Lightweight Sensor-Independent Cloud Masking for On-board Multispectral Sensors
- A Single Neuron Works: Precise Concept Erasure in Text-to-Image Diffusion Models
- OmniPlantSeg: Species Agnostic 3D Point Cloud Organ Segmentation for High-Resolution Plant Phenotyping Across Modalities
- Background Prompt for Few-Shot Out-of-Distribution Detection
- Stratify or Die: Rethinking Data Splits in Image Segmentation
- EnGraf-Net: Multiple Granularity Branch Network with Fine-Coarse Graft Grained for Classification Task
- Plant identification based on noisy web data: the amazing performance of deep learning (LifeCLEF 2017)
- SD-RetinaNet: Topologically Constrained Semi-Supervised Retinal Lesion and Layer Segmentation in OCT
- Plant identification in an open-world (LifeCLEF 2016)
- The Unanticipated Asymmetry Between Perceptual Optimization and Assessment
- Concepts in Motion: Temporal Bottlenecks for Interpretable Video Classification
- FSMODNet: A Closer Look at Few-Shot Detection in Multispectral Data
- Finding 3D Positions of Distant Objects from Noisy Camera Movement and Semantic Segmentation Sequences
- SwinMamba: A hybrid local-global mamba framework for enhancing semantic segmentation of remotely sensed images
- Revisiting Data Challenges of Computational Pathology: A Pack-based Multiple Instance Learning Framework
- SimDiff: Simulator-constrained Diffusion Model for Physically Plausible Motion Generation
- Recov-Vision: Linking Street View Imagery and Vision-Language Models for Post-Disaster Recovery
- Enhancing Cross-View Geo-Localization Generalization via Global-Local Consistency and Geometric Equivariance
- DENet: Dual-Path Edge Network with Global-Local Attention for Infrared Small Target Detection
- Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection
- FreeInsert: Personalized Object Insertion with Geometric and Style Control
- CompressAI-Vision: Open-source software to evaluate compression methods for computer vision tasks
- Dual-supervised Asymmetric Co-training for Semi-supervised Medical Domain Generalization
- Real-Time Object Detection Meets DINOv3
- Federated Domain Generalization with Domain-specific Soft Prompts Generation
- Poisoning Prompt-Guided Sampling in Video Large Language Models
- Punching Above Precision: Small Quantized Model Distillation with Learnable Regularizer
- Quasi-Synthetic Riemannian Data Generation for Writer-Independent Offline Signature Verification
- Seedream 4.0: Toward Next-generation Multimodal Image Generation
- A Contrastive Learning Framework for Breast Cancer Detection
- Are Foundation Models Ready for Industrial Defect Recognition? A Reality Check on Real-World Data
- Data-Efficient Stream-Based Active Distillation for Scalable Edge Model Deployment
- A Comparative Benchmark of Real-time Detectors for Blueberry Detection towards Precision Orchard Management
- Reflect3r: Single-View 3D Stereo Reconstruction Aided by Mirror Reflections
- SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning
- MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
- Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute
- Blending Supervised and Reinforcement Fine-Tuning with Prefix Sampling
- Reinforcement Fine-Tuning Naturally Mitigates Forgetting in Continual Post-Training
- NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining
- Can social media provide early warning of retraction? Evidence from critical tweets identified by human annotation and large language models
- TestAgent: Automatic Benchmarking and Exploratory Interaction for Evaluating LLMs in Vertical Domains
- Bias Similarity Measurement: A Black-Box Audit of Fairness Across LLMs
- Reformulation is All You Need: Addressing Malicious Text Features in DNNs
- AdaSVD: Adaptive Singular Value Decomposition for Large Language Models
- Scaling Rich Style-Prompted Text-to-Speech Datasets
- What Makes a Reward Model a Good Teacher? An Optimization Perspective
- On the Perception Bottleneck of VLMs for Chart Understanding
- A Framework for Situating Innovations, Opportunities, and Challenges in Advancing Vertical Systems with Large AI Models
- Process Reward Models That Think
- UDDETTS: Unifying Discrete and Dimensional Emotions for Controllable Emotional Text-to-Speech
- Just-in-time and distributed task representations in language models
- PLaMo 2 Technical Report
- ConsistentChat: Building Skeleton-Guided Consistent Multi-Turn Dialogues for Large Language Models from Scratch
- From Replication to Redesign: Exploring Pairwise Comparisons for LLM-Based Peer Review
- ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge
- When Does Meaning Backfire? Investigating the Role of AMRs in NLI
- THCM-CAL: Temporal-Hierarchical Causal Modelling with Conformal Calibration for Clinical Risk Prediction
- A Simple "Motivation" Can Enhance Reinforcement Finetuning of Large Reasoning Models
- ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs
- ARF-RLHF: Adaptive Reward-Following for RLHF through Emotion-Driven Self-Supervision and Trace-Biased Dynamic Optimization
- ixi-GEN: Efficient Industrial sLLMs through Domain Adaptive Continual Pretraining
- Turning Internal Gap into Self-Improvement: Promoting the Generation-Understanding Unification in MLLMs
- Constructions are Revealed in Word Distributions
- Inference-Time Scaling for Generalist Reward Modeling
- Decoding Open-Ended Information Seeking Goals from Eye Movements in Reading
- Ambiguity Resolution in Text-to-Structured Data Mapping
- VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models
- UNCERTAINTY-LINE: Length-Invariant Estimation of Uncertainty for Large Language Models
- InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing
- Bayesian Attention Mechanism: A Probabilistic Framework for Positional Encoding and Context Length Extrapolation
- BabyLM's First Constructions: Causal probing provides a signal of learning
- Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning
- Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown
- LAMA-UT: Language Agnostic Multilingual ASR through Orthography Unification and Language-Specific Transliteration
- Labeling Free-text Data using Language Model Ensembles
- Improving LLM Unlearning Robustness via Random Perturbations
- Quantifying depressive mental states with large language models
- MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task
- The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It
- JUREX-4E: Juridical Expert-Annotated Four-Element Knowledge Base for Legal Reasoning
- Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents
- PMark: Towards Robust and Distortion-free Semantic-level Watermarking with Channel Constraints
- Communication Bias in Large Language Models: A Regulatory Perspective
- Automotive-ENV: Benchmarking Multimodal Agents in Vehicle Interface Systems
- TABLET: A Large-Scale Dataset for Robust Visual Table Understanding
- Sigma: Semantically Informative Pre-training for Skeleton-based Sign Language Understanding
- Evaluating the Evaluators: Metrics for Compositional Text-to-Image Generation
- Hallucination as an Upper Bound: A New Perspective on Text-to-Image Evaluation
- Interactive Recommendation Agent with Active User Commands
- Higher-Order DisCoCat (Peirce-Lambek-Montague semantics)
- ASCIIEval: Benchmarking Models' Visual Perception in Text Strings via ASCII Art
- UniHR: Hierarchical Representation Learning for Unified Knowledge Graph Link Prediction
- LLMTrace: A Corpus for Classification and Fine-Grained Localization of AI-Written Text
- Bounds of Chain-of-Thought Robustness: Reasoning Steps, Embed Norms, and Beyond
- DisCoCLIP: A Distributional Compositional Tensor Network Encoder for Vision-Language Understanding
- The role of synthetic data in Multilingual, Multi-cultural AI systems: Lessons from Indic Languages
- Sycophancy Is Not One Thing: Causal Separation of Sycophantic Behaviors in LLMs
- RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards
- SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines
- Intercept Cancer: Cancer Pre-Screening with Large Scale Healthcare Foundation Models
- RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows
- Human Semantic Representations of Social Interactions from Moving Shapes
- Visual Authority and the Rhetoric of Health Misinformation: A Multimodal Analysis of Social Media Videos
- Retrieval over Classification: Integrating Relation Semantics for Multimodal Relation Extraction
- Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models
- Who's Laughing Now? An Overview of Computational Humour Generation and Explanation
- GEP: A GCG-Based method for extracting personally identifiable information from chatbots built on small language models
- Eigen-1: Adaptive Multi-Agent Refinement with Monitor-Based RAG for Scientific Reasoning
- CLaw: Benchmarking Chinese Legal Knowledge in Large Language Models - A Fine-grained Corpus and Reasoning Analysis
- SGMem: Sentence Graph Memory for Long-Term Conversational Agents
- Query-Centric Graph Retrieval Augmented Generation
- Un-Doubling Diffusion: LLM-guided Disambiguation of Homonym Duplication
- LLM Output Homogenization is Task Dependent
- Analysis of instruction-based LLMs' capabilities to score and judge text-input problems in an academic setting
- Generative AI for FFRDCs
- Behind RoPE: How Does Causal Mask Encode Positional Information?
- When Instructions Multiply: Measuring and Estimating LLM Capabilities of Multiple Instructions Following
- SoM-1K: A Thousand-Problem Benchmark Dataset for Strength of Materials
- Which Cultural Lens Do Models Adopt? On Cultural Positioning Bias and Agentic Mitigation in LLMs
- PerHalluEval: Persian Hallucination Evaluation Benchmark for Large Language Models
- BESPOKE: Benchmark for Search-Augmented Large Language Model Personalization via Diagnostic Feedback
- VoiceBBQ: Investigating Effect of Content and Acoustics in Social Bias of Spoken Language Model
- Acoustic-based Gender Differentiation in Speech-aware Language Models
- AutoIntent: AutoML for Text Classification
- SFT Doesn't Always Hurt General Capabilities: Revisiting Domain-Specific Fine-Tuning in LLMs
- Few-Shot and Training-Free Review Generation via Conversational Prompting
- Enrich-on-Graph: Query-Graph Alignment for Complex Reasoning with LLM Enriching
- Distilling Many-Shot In-Context Learning into a Cheat Sheet
- Zero-Shot Privacy-Aware Text Rewriting via Iterative Tree Search
- Concise and Sufficient Sub-Sentence Citations for Retrieval-Augmented Generation
- WeFT: Weighted Entropy-driven Fine-Tuning for dLLMs
- Learning to Summarize by Learning to Quiz: Adversarial Agentic Collaboration for Long Document Summarization
- MemLens: Uncovering Memorization in LLMs with Activation Trajectories
- Cross-Linguistic Analysis of Memory Load in Sentence Comprehension: Linear Distance and Structural Density
- Tool Calling for Arabic LLMs: Data Strategies and Instruction Tuning
- Nuclear Diffusion Models for Low-Rank Background Suppression in Videos
- Conditionally Whitened Generative Models for Probabilistic Time Series Forecasting
- ShortCheck: Checkworthiness Detection of Multilingual Short-Form Videos
- SiniticMTError: A Machine Translation Dataset with Error Annotations for Sinitic Languages
- Building Tailored Speech Recognizers for Japanese Speaking Assessment
- Enhancing Molecular Property Prediction with Knowledge from Large Language Models
- RedHerring Attack: Testing the Reliability of Attack Detection
- Overcoming Black-box Attack Inefficiency with Hybrid and Dynamic Select Algorithms
- MI-Fuse: Label Fusion for Unsupervised Domain Adaptation with Closed-Source Large-Audio Language Model
- Probability Distribution Collapse: A Critical Bottleneck to Compact Unsupervised Neural Grammar Induction
- Cryptographic Backdoor for Neural Networks: Boon and Bane
- PALQO: Physics-informed Model for Accelerating Large-scale Quantum Optimization
- Real-Time System for Audio-Visual Target Speech Enhancement
- RAPTOR-GEN: RApid PosTeriOR GENerator for Bayesian Learning in Biomanufacturing
- Identifying Group Anchors in Real-World Group Interactions Under Label Scarcity
- Leveraging Temporally Extended Behavior Sharing for Multi-task Reinforcement Learning
- Extrapolating Phase-Field Simulations in Space and Time with Purely Convolutional Architectures
- Actively Learning Halfspaces without Synthetic Data
- Single Answer is Not Enough: On Generating Ranked Lists with Medical Reasoning Models
- RecIS: Sparse to Dense, A Unified Training Framework for Recommendation Models
- Objective Evaluation of Prosody and Intelligibility in Speech Synthesis via Conditional Prediction of Discrete Tokens
- Fast Estimation of Wasserstein Distances via Regression on Sliced Wasserstein Distances
- Innovative Deep Learning Architecture for Enhanced Altered Fingerprint Recognition
- Large Pre-Trained Models for Bimanual Manipulation in 3D
- Region-of-Interest Augmentation for Mammography Classification under Patient-Level Cross-Validation
- Unsupervised Domain Adaptation with an Unobservable Source Subpopulation
- A Gapped Scale-Sensitive Dimension and Lower Bounds for Offset Rademacher Complexity
- Design, Implementation and Evaluation of a Novel Programming Language Topic Classification Workflow
- A Hierarchical Variational Graph Fused Lasso for Recovering Relative Rates in Spatial Compositional Data
- Implicit Augmentation from Distributional Symmetry in Turbulence Super-Resolution
- No Prior, No Leakage: Revisiting Reconstruction Attacks in Trained Neural Networks
- Copycats: the many lives of a publicly available medical imaging dataset
- An Analytical and AI-discovered Stable, Accurate, and Generalizable Subgrid-scale Closure for Geophysical Turbulence
- Speaker Style-Aware Phoneme Anchoring for Improved Cross-Lingual Speech Emotion Recognition
- Leveraging NTPs for Efficient Hallucination Detection in VLMs
- A Comparative Analysis of Ensemble-Based Machine Learning Approaches with Explainable AI for Multi-Class Intrusion Detection in Drone Networks
- Sample completion, structured correlation, and Netflix problems
- Structuring Collective Action with LLM-Guided Evolution: From Ill-Structured Problems to Executable Heuristics
- SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent
- Neural Networks as Surrogate Solvers for Time-Dependent Accretion Disk Dynamics
- Document Summarization with Conformal Importance Guarantees
- Go With The Flow: Churn-Tolerant Decentralized Training of Large Language Models
- AbideGym: Turning Static RL Worlds into Adaptive Challenges
- Tree Search for LLM Agent Reinforcement Learning
- Explaining Fine Tuned LLMs via Counterfactuals: A Knowledge Graph Driven Framework
- Federated Flow Matching
- humancompatible.train: Implementing Optimization Algorithms for Stochastically-Constrained Stochastic Optimization Problems
- A Causality-Aware Spatiotemporal Model for Multi-Region and Multi-Pollutant Air Quality Forecasting
- SuperOffload: Unleashing the Power of Large-Scale LLM Training on Superchips
- It's Not You, It's Clipping: A Soft Trust-Region via Probability Smoothing for LLM RL
- Optimal Robust Recourse with $L^p$-Bounded Model Change
- LAVA: Explainability for Unsupervised Latent Embeddings
- CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization
- GRPO is Secretly a Process Reward Model
- DATS: Distance-Aware Temperature Scaling for Calibrated Class-Incremental Learning
- Mixture of Thoughts: Learning to Aggregate What Experts Think, Not Just What They Say
- A Unified Framework for Diffusion Model Unlearning with f-Divergence
- Inverse Reinforcement Learning Using Just Classification and a Few Regressions
- Closed-form $\ell_r$ norm scaling with data for overparameterized linear regression and diagonal linear networks under $\ell_p$ bias
- Towards Foundation Models for Zero-Shot Time Series Anomaly Detection: Leveraging Synthetic Data and Relative Context Discrepancy
- Differential-Integral Neural Operator for Long-Term Turbulence Forecasting
- From Physics to Machine Learning and Back: Part II - Learning and Observational Bias in PHM
- Physics of Learning: A Lagrangian perspective to different learning paradigms
- GeoRef: Referring Expressions in Geometry via Task Formulation, Synthetic Supervision, and Reinforced MLLM-based Solutions
- SPREAD: Sampling-based Pareto front Refinement via Efficient Adaptive Diffusion
- Structure-Attribute Transformations with Markov Chain Boost Graph Domain Adaptation
- ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning
- TyphoonMLA: A Mixed Naive-Absorb MLA Kernel For Shared Prefix
- GraphUniverse: Enabling Systematic Evaluation of Inductive Generalization
- Teaching RL Agents to Act Better: VLM as Action Advisor for Online Reinforcement Learning
- EvoMail: Self-Evolving Cognitive Agents for Adaptive Spam and Phishing Email Defense
- Sparse Representations Improve Adversarial Robustness of Neural Network Classifiers
- Feature Augmentation of GNNs for ILPs: Local Uniqueness Suffices
- Lossless Compression: A New Benchmark for Time Series Model Evaluation
- MAIFormer: Multi-Agent Inverted Transformer for Flight Trajectory Prediction
- ExMolRL: Phenotype-Target Joint Generation of De Novo Molecules via Multi-Objective Reinforcement Learning
- Mechanism of Task-oriented Information Removal in In-context Learning
- Predicting LLM Reasoning Performance with Small Proxy Model
- DELTA-Code: How Does RL Unlock and Transfer New Programming Algorithms in LLMs?
- Efficient Ensemble Conditional Independence Test Framework for Causal Discovery
- Actor-Critic without Actor
- FORCE: Transferable Visual Jailbreaking Attacks via Feature Over-Reliance CorrEction
- Reinforcement Learning Fine-Tuning Enhances Activation Intensity and Diversity in the Internal Circuitry of LLMs
- GenFacts: Generative Counterfactual Explanations for Multi-Variate Time Series
- Why Attention Fails: The Degeneration of Transformers into MLPs in Time Series Forecasting
- Decoupled-Value Attention for Prior-Data Fitted Networks: GP Inference for Physical Equations
- Alignment Unlocks Complementarity: A Framework for Multiview Circuit Representation Learning
- Knowledgeable Language Models as Black-Box Optimizers for Personalized Medicine
- CLUE: Conflict-guided Localization for LLM Unlearning Framework
- FracAug: Fractional Augmentation boost Graph-level Anomaly Detection under Limited Supervision
- Toward Robust and Efficient ML-Based GPU Caching for Modern Inference
- Learning Ising Models under Hard Constraints using One Sample
- Binary Autoencoder for Mechanistic Interpretability of Large Language Models
- LiLAW: Lightweight Learnable Adaptive Weighting to Meta-Learn Sample Difficulty and Improve Noisy Training
- Aligning Inductive Bias for Data-Efficient Generalization in State Space Models
- FERD: Fairness-Enhanced Data-Free Robustness Distillation
- T2I-Diff: fMRI Signal Generation via Time-Frequency Image Transform and Classifier-Free Denoising Diffusion Models
- Explaining Grokking and Information Bottleneck through Neural Collapse Emergence
- Shaping Initial State Prevents Modality Competition in Multi-modal Fusion: A Two-stage Scheduling Framework via Fast Partial Information Decomposition
- Causal Time Series Generation via Diffusion Models
- Distribution-Controlled Client Selection to Improve Federated Learning Strategies
- Deterministic Discrete Denoising
- Energy saving in off-road vehicles using leakage compensation technique
- Investigating Modality Contribution in Audio LLMs for Music
- Wonder Wins Ways: Curiosity-Driven Exploration through Multi-Agent Contextual Calibration
- Guiding Application Users via Estimation of Computational Resources for Massively Parallel Chemistry Computations
- Theoretical Bounds for Stable In-Context Learning
- Can Federated Learning Safeguard Private Data in LLM Training? Vulnerabilities, Attacks, and Defense Evaluation
- CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning
- A Genetic Algorithm for Navigating Synthesizable Molecular Spaces
- Scaling Laws are Redundancy Laws
- The Impact of Audio Watermarking on Audio Anti-Spoofing Countermeasures
- Sig2Model: A Boosting-Driven Model for Updatable Learned Indexes
- A Recovery Theory for Diffusion Priors: Deterministic Analysis of the Implicit Prior Algorithm
- MDBench: Benchmarking Data-Driven Methods for Model Discovery
- Generalizable Diabetes Risk Stratification via Hybrid Machine Learning Models
- The Sensitivity of Variational Bayesian Neural Network Performance to Hyperparameters
- Learning Green's Operators through Hierarchical Neural Networks Inspired by the Fast Multipole Method
- TSKAN: Interpretable Machine Learning for QoE modeling over Time Series Data
- Explicit and Effectively Symmetric Schemes for Neural SDEs
- Function Spaces Without Kernels: Learning Compact Hilbert Space Representations
- Policy Compatible Skill Incremental Learning via Lazy Learning Interface
- Latent Twins
- Training Task Reasoning LLM Agents for Multi-turn Task Planning via Single-turn Reinforcement Learning
- A Theory of Multi-Agent Generative Flow Networks
- FastEagle: Cascaded Drafting for Accelerating Speculative Decoding
- mloz: A Highly Efficient Machine Learning-Based Ozone Parameterization for Climate Sensitivity Simulations
- Bridging Privacy and Utility: Synthesizing anonymized EEG with constraining utility functions
- Efficiently Attacking Memorization Scores
- Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations
- Beyond Visual Similarity: Rule-Guided Multimodal Clustering with explicit domain rules
- Myosotis: structured computation for attention like layer
- Auto-Regressive U-Net for Full-Field Prediction of Shrinkage-Induced Damage in Concrete
- StyleBench: Evaluating thinking styles in Large Language Models
- Model-Based Reinforcement Learning under Random Observation Delays
- SCRA-VQA: Summarized Caption-Rerank for Augmented Large Language Models in Visual Question Answering
- On Theoretical Interpretations of Concept-Based In-Context Learning
- Integrating Object Interaction Self-Attention and GAN-Based Debiasing for Visual Question Answering
- Improving Early Sepsis Onset Prediction Through Federated Learning
- FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies
- Deep Learning for Crime Forecasting: The Role of Mobility at Fine-grained Spatiotemporal Scales
- CTI Dataset Construction from Telegram
- Flow Matching in the Low-Noise Regime: Pathologies and a Contrastive Remedy
- Unlocking Financial Insights: An advanced Multimodal Summarization with Multimodal Output Framework for Financial Advisory Videos
- Revolutionizing Precise Low Back Pain Diagnosis via Contrastive Learning
- Even More Kawaii than Real-Person-Driven VTubers? Understanding How Viewers Perceive AI-Driven VTubers
- CaTS-Bench: Can Language Models Describe Numeric Time Series?
- Trustworthy Semantic Communication for Vehicular Networks: Challenges and Solutions
- Security-aware Semantic-driven ISAC via Paired Adversarial Residual Networks
- Verification Limits Code LLM Training
- ImaginationPolicy: Towards Generalizable, Precise and Reliable End-to-End Policy for Robotic Manipulation
- Robust Multi-Omics Integration from Incomplete Modalities Significantly Improves Prediction of Alzheimer's Disease
- FHRFormer: A Self-supervised Transformer Approach for Fetal Heart Rate Inpainting and Forecasting
- TasselNetV4: A vision foundation model for cross-scene, cross-scale, and cross-species plant counting
- Federated Markov Imputation: Privacy-Preserving Temporal Imputation in Multi-Centric ICU Environments
- Imagining Design Workflows in Agentic AI Futures
- AI-Enabled Crater-Based Navigation for Lunar Mapping
- Confidence-guided Refinement Reasoning for Zero-shot Question Answering
- Seeing Through Words, Speaking Through Pixels: Deep Representational Alignment Between Vision and Language Models
- Measuring LLM Sensitivity in Transformer-based Tabular Data Synthesis
- Provenance Analysis of Archaeological Artifacts via Multimodal RAG Systems
- CusEnhancer: A Zero-Shot Scene and Controllability Enhancement Method for Photo Customization via ResInversion
- IConv: Focusing on Local Variation with Channel Independent Convolution for Multivariate Time Series Forecasting
- Towards Atoms of Large Language Models
- DAC-LoRA: Dynamic Adversarial Curriculum for Efficient and Robust Few-Shot Adaptation
- Leveraging What's Overfixed: Post-Correction via LLM Grammatical Error Overcorrection
- Bispectral OT: Dataset Comparison using Symmetry-Aware Optimal Transport
- QAMO: Quality-aware Multi-centroid One-class Learning For Speech Deepfake Detection
- Efficient Construction of Implicit Surface Models From a Single Image for Motion Generation
- Addressing Gradient Misalignment in Data-Augmented Training for Robust Speech Deepfake Detection
- Learning to Align Molecules and Proteins: A Geometry-Aware Approach to Binding Affinity
- Incorporating LLM Embeddings for Variation Across the Human Genome
- Joint Flow Trajectory Optimization For Feasible Robot Motion Generation from Video Demonstrations
- Beyond the Individual: Introducing Group Intention Forecasting with SHOT Dataset
- RobotDancing: Residual-Action Reinforcement Learning Enables Robust Long-Horizon Humanoid Motion Tracking
- Every Character Counts: From Vulnerability to Defense in Phishing Detection
- An LLM-based Agentic Framework for Accessible Network Control
- Experience Deploying Containerized GenAI Services at an HPC Center
- MMG: Mutual Information Estimation via the MMSE Gap in Diffusion
- FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models
- Personalized Federated Dictionary Learning for Modeling Heterogeneity in Multi-site fMRI Data
- Recidivism and Peer Influence with LLM Text Embeddings in Low Security Correctional Facilities
- Learning Terrain-Specialized Policies for Adaptive Locomotion in Challenging Environments
- A Framework for Rapidly Developing and Deploying Protection Against Large Language Model Attacks
- Look Before you Leap: Estimating LLM Benchmark Scores from Descriptions
- Understanding Mode Switching in Human-AI Collaboration: Behavioral Insights and Predictive Modeling
- CHOIR: A Chatbot-mediated Organizational Memory Leveraging Communication in University Research Labs
- InstructVTON: Optimal Auto-Masking and Natural-Language-Guided Interactive Style Control for Inpainting-Based Virtual Try-On
- Understanding and Improving Adversarial Robustness of Neural Probabilistic Circuits
- GraspFactory: A Large Object-Centric Grasping Dataset
- Perspectra: Choosing Your Experts Enhances Critical Thinking in Multi-Agent Research Ideation
- SwasthLLM: a Unified Cross-Lingual, Multi-Task, and Meta-Learning Zero-Shot Framework for Medical Diagnosis Using Contrastive Representations
- PIRF: Physics-Informed Reward Fine-Tuning for Diffusion Models
- MechStyle: Augmenting Generative AI with Mechanical Simulation to Create Stylized and Structurally Viable 3D Models
- Dynamic Reasoning Chains through Depth-Specialized Mixture-of-Experts in Transformer Architectures
- Hierarchical Resolution Transformers: A Wavelet-Inspired Architecture for Multi-Scale Language Understanding
- Shared Neural Space: Unified Precomputed Feature Encoding for Multi-Task and Cross Domain Vision
- CoSupFormer: A Contrastive Supervised learning approach for EEG signal Classification
- AI-Specific Code Smells: From Specification to Detection
- Boosting Zero-Shot VLN via Abstract Obstacle Map-Based Waypoint Prediction with TopoGraph-and-VisitInfo-Aware Prompting
- MARS: toward more efficient multi-agent collaboration for LLM reasoning
- Complexity-Driven Policy Optimization
- Variational Low-Rank Adaptation for Personalized Impaired Speech Recognition
- Defending against Stegomalware in Deep Neural Networks with Permutation Symmetry
- Adversarial Defense in Cybersecurity: A Systematic Review of GANs for Threat Detection and Mitigation
- A Taxonomy of Data Risks in AI and Quantum Computing (QAI) - A Systematic Review
- Wartime Media Dynamics in Emerging Democracies: Case Study of Pakistani Media in May 2025 Indo-Pak Conflict
- ACCeLLiuM: Supervised Fine-Tuning for Automated OpenACC Pragma Generation
- USB-Rec: An Effective Framework for Improving Conversational Recommendation Capability of Large Language Model
- Lightweight MobileNetV1+GRU for ECG Biometric Authentication: Federated and Adversarial Evaluation
- MARS: A Malignity-Aware Backdoor Defense in Federated Learning
- R1-Fuzz: Specializing Language Models for Textual Fuzzing via Reinforcement Learning
- Dynamic ReAct: Scalable Tool Selection for Large-Scale MCP Environments
- Can You Trust Your Copilot? A Privacy Scorecard for AI Coding Assistants
- The Secret Agenda: LLMs Strategically Lie and Our Current Safety Tools Are Blind
- Blueprints of Trust: AI System Cards for End to End Transparency and Governance
- Centralized vs. Decentralized Security for Space AI Systems? A New Look
- Data-Efficient ASR Personalization for Non-Normative Speech Using an Uncertainty-Based Phoneme Difficulty Score for Guided Sampling
- Grounding AI Explanations in Experience: A Reflective Cognitive Architecture for Clinical Decision Support
- VC-Agent: An Interactive Agent for Customized Video Dataset Collection
- SAGE: A Realistic Benchmark for Semantic Understanding
- Interpreting Public Sentiment in Diplomacy Events: A Counterfactual Analysis Framework Using Large Language Models
- AI-driven formative assessment and adaptive learning in data-science education: Evaluating an LLM-powered virtual teaching assistant
- CFD-LLMBench: A Benchmark Suite for Evaluating Large Language Models in Computational Fluid Dynamics
- Assessing Classical Machine Learning and Transformer-based Approaches for Detecting AI-Generated Research Text
- ConceptViz: A Visual Analytics Approach for Exploring Concepts in Large Language Models
- SKILL-RAG: Self-Knowledge Induced Learning and Filtering for Retrieval-Augmented Generation
- Beyond Global Emotion: Fine-Grained Emotional Speech Synthesis with Dynamic Word-Level Modulation
- Combinatorial Creativity: A New Frontier in Generalization Abilities
- Disagreements in Reasoning: How a Model's Thinking Process Dictates Persuasion in Multi-Agent Systems
- Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation, and Task Execution
- TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them
- Expanding Reasoning Potential in Foundation Model by Learning Diverse Chains of Thought Patterns
- RL Squeezes, SFT Expands: A Comparative Study of Reasoning LLMs
- ToMPO: Training LLM Strategic Decision Making from a Multi-Agent Perspective
- Embodied Representation Alignment with Mirror Neurons
- Distributed Specialization: Rare-Token Neurons in Large Language Models
- A Fano-Style Accuracy Upper Bound for LLM Single-Pass Reasoning in Multi-Hop QA
- What Do LLM Agents Do When Left Alone? Evidence of Spontaneous Meta-Cognitive Patterns
- Fairy: Interactive Mobile Assistant to Real-world Tasks via LMM-based Multi-agent
- Parallel Thinking, Sequential Answering: Bridging NAR and AR for Efficient Reasoning
- Meta-Memory: Retrieving and Integrating Semantic-Spatial Memories for Robot Spatial Reasoning
- LogReasoner: Empowering LLMs with Expert-like Coarse-to-Fine Reasoning for Log Analysis Tasks
- DeFacto: Counterfactual Thinking with Images for Enforcing Evidence-Grounded and Faithful Reasoning
- GALAX: Graph-Augmented Language Model for Explainable Reinforcement-Guided Subgraph Reasoning in Precision Medicine
- Beyond Stars: Bridging the Gap Between Ratings and Review Sentiment with LLM
- AOT*: Efficient Synthesis Planning via LLM-Empowered AND-OR Tree Search
- CORE: Full-Path Evaluation of LLM Agents Beyond Final State
- Who Gets Cited Most? Benchmarking Long-Context Language Models on Scientific Articles
- CLAUSE: Agentic Neuro-Symbolic Knowledge Graph Reasoning via Dynamic Learnable Context Engineering
- An Approach to Checking Correctness for Agentic Systems
- LATTS: Locally Adaptive Test-Time Scaling
- Philosophy-informed Machine Learning
- InsightGUIDE: An Opinionated AI Assistant for Guided Critical Reading of Scientific Literature
- Reconstruction-Based Adaptive Scheduling Using AI Inferences in Safety-Critical Systems
- Adaptive Approach to Enhance Machine Learning Scheduling Algorithms During Runtime Using Reinforcement Learning in Metascheduling Applications
- A Compound Classification System Based on Fuzzy Relations Applied to the Noise-Tolerant Control of a Bionic Hand via EMG Signal Recognition
- SAMULE: Self-Learning Agents Enhanced by Multi-level Reflection
- Adaptive Cybersecurity Architecture for Digital Product Ecosystems Using Agentic AI
- Accelerate Creation of Product Claims Using Generative AI
- An Automated Retrieval-Augmented Generation LLaMA-4 109B-based System for Evaluating Radiotherapy Treatment Plans
Research Sources: 537 | Generated: 9/27/2025