AI RESEARCH PAPERS & ACADEMIC SOURCES
- WHAT-IF: Exploring Branching Narratives by Meta-Prompting Large Language Models
- DialUp! Modeling the Language Continuum by Adapting Models to Dialects and Dialects to Models
- DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
- Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains
- CoDial: Interpretable Task-Oriented Dialogue Systems Through Dialogue Flow Alignment
- The Translation Barrier Hypothesis: Multilingual Generation with Large Language Models Suffers from Implicit Translation Failure
- Fairshare Data Pricing via Data Valuation for Large Language Models
- DrunkAgent: Stealthy Memory Corruption in LLM-Powered Recommender Agents
- Don't Retrieve, Generate: Prompting LLMs for Synthetic Training Data in Dense Retrieval
- Facts are Harder Than Opinions -- A Multilingual, Comparative Analysis of LLM-Based Fact-Checking Reliability
- MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models
- CoIDO: Efficient Data Selection for Visual Instruction Tuning via Coupled Importance-Diversity Optimization
- CMIS-Net: A Cascaded Multi-Scale Individual Standardization Network for Backchannel Agreement Estimation
- Robotic Classification of Divers' Swimming States using Visual Pose Keypoints as IMUs
- InsideOut: Integrated RGB-Radiative Gaussian Splatting for Comprehensive 3D Object Representation
- GAN-based Content-Conditioned Generation of Handwritten Musical Symbols
- Investigating Demographic Bias in Brain MRI Segmentation: A Comparative Study of Deep-Learning and Non-Deep-Learning Methods
- ManzaiSet: A Multimodal Dataset of Viewer Responses to Japanese Manzai Comedy
- Chimera: Compositional Image Generation using Part-based Concepting
- Big Data, Tiny Targets: An Exploratory Study in Machine Learning-enhanced Detection of Microplastic from Filters
- From Volume Rendering to 3D Gaussian Splatting: Theory and Applications
- Online In-Context Distillation for Low-Resource Vision Language Models
- World-in-World: World Models in a Closed-Loop World
- Adapting Stereo Vision From Objects To 3D Lunar Surface Reconstruction with the StereoLunar Dataset
- EMA-SAM: Exponential Moving-average for SAM-based PTMC Segmentation
- Beyond Frequency: Scoring-Driven Debiasing for Object Detection via Blueprint-Prompted Image Synthesis
- DeepSeek-OCR: Contexts Optical Compression
- BlendCLIP: Bridging Synthetic and Real Domains for Zero-Shot 3D Object Classification with Multimodal Pretraining
- OpenInsGaussian: Open-vocabulary Instance Gaussian Segmentation with Context-aware Cross-view Fusion
- UWBench: A Comprehensive Vision-Language Benchmark for Underwater Understanding
- TreeFedDG: Alleviating Global Drift in Federated Domain Generalization for Medical Image Segmentation
- GeoDiff: Geometry-Guided Diffusion for Metric Depth Estimation
- Proactive Reasoning-with-Retrieval Framework for Medical Multimodal Large Language Models
- OmniNWM: Omniscient Driving Navigation World Models
- Beyond Single Models: Mitigating Multimodal Hallucinations via Adaptive Token Ensemble Decoding
- Enhancing Few-Shot Classification of Benchmark and Disaster Imagery with ATTBHFA-Net
- ViSE: A Systematic Approach to Vision-Only Street-View Extrapolation
- GPTFace: Generative Pre-training of Facial-Linguistic Transformer by Span Masking and Weakly Correlated Text-image Data
- AV-Master: Dual-Path Comprehensive Perception Makes Better Audio-Visual Question Answering
- Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback
- Learning Human-Object Interaction as Groups
- FeatureFool: Zero-Query Fooling of Video Models via Feature Map
- Cross-Modal Scene Semantic Alignment for Image Complexity Assessment
- Entropy-Enhanced Conformal Features from Ricci Flow for Robust Alzheimer's Disease Classification
- Bayesian Fully-Connected Tensor Network for Hyperspectral-Multispectral Image Fusion
- Beyond Single Images: Retrieval Self-Augmented Unsupervised Camouflaged Object Detection
- LAND: Lung and Nodule Diffusion for 3D Chest CT Synthesis with Anatomical Guidance
- Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos
- DWaste: Greener AI for Waste Sorting using Mobile and Edge Devices
- RayPose: Ray Bundling Diffusion for Template Views in Unseen 6D Object Pose Estimation
- GBlobs: Local LiDAR Geometry for Improved Sensor Placement Generalization
- Descriptor: Occluded nuScenes: A Multi-Sensor Dataset for Evaluating Perception Robustness in Automated Driving
- Image augmentation with invertible networks in interactive satellite image change detection
- Beyond the Pipeline: Analyzing Key Factors in End-to-End Deep Learning for Historical Writer Identification
- MoGA: Mixture-of-Groups Attention for End-to-End Long Video Generation
- UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
- Exploring a Unified Vision-Centric Contrastive Alternatives on Multi-Modal Web Documents
- A Renaissance of Explicit Motion Information Mining from Transformers for Action Recognition
- PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-Forward Planar Splatting
- SSD: Spatial-Semantic Head Decoupling for Efficient Autoregressive Image Generation
- IF-VidCap: Can Video Caption Models Follow Instructions?
- Moving Light Adaptive Colonoscopy Reconstruction via Illumination-Attenuation-Aware 3D Gaussian Splatting
- SEAL: Semantic-Aware Hierarchical Learning for Generalized Category Discovery
- Detection and Simulation of Urban Heat Islands Using a Fine-Tuned Geospatial Foundation Model for Microclimate Impact Prediction
- UltraGen: High-Resolution Video Generation with Hierarchical Attention
- Rebellious Student: A Complementary Learning Framework for Background Feature Enhancement in Hyperspectral Anomaly Detection
- ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder
- A Geometric Approach to Steerable Convolutions
- SAM 2++: Tracking Anything at Any Granularity
- Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework
- FedDEAP: Adaptive Dual-Prompt Tuning for Multi-Domain Federated Learning
- DSI-Bench: A Benchmark for Dynamic Spatial Intelligence
- Robobench: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models as Embodied Brain
- Cross-Domain Multi-Person Human Activity Recognition via Near-Field Wi-Fi Sensing
- DMTrack: Deformable State-Space Modeling for UAV Multi-Object Tracking with Kalman Fusion and Uncertainty-Aware Association
- Conformal Lesion Segmentation for 3D Medical Images
- A Generalizable Light Transport 3D Embedding for Global Illumination
- DualHash: A Stochastic Primal-Dual Algorithm with Theoretical Guarantee for Deep Hashing
- CUARewardBench: A Benchmark for Evaluating Reward Models on Computer-using Agent
- When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
- Learning Collaborative Knowledge with Multimodal Representation for Polyp Re-Identification
- H3D-DGS: Exploring Heterogeneous 3D Motion Representation for Deformable 3D Gaussian Splatting
- 3D Audio-Visual Segmentation
- Med-2E3: A 2D-Enhanced 3D Medical Multimodal Large Language Model
- Foundation Cures Personalization: Improving Personalized Models' Prompt Consistency via Hidden Foundation Knowledge
- View Transformation Robustness for Multi-View 3D Object Reconstruction with Reconstruction Error-Guided View Selection
- WMamba: Wavelet-based Mamba for Face Forgery Detection
- ITVTON: Virtual Try-On Diffusion Transformer Based on Integrated Image and Text
- RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
- Mask Image Watermarking
- Monitoring morphometric drift in lifelong learning segmentation of the spinal cord
- VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank
- Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning
- A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking
- DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response
- Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape
- Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
- From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes
- SDTagNet: Leveraging Text-Annotated Navigation Maps for Online HD Map Construction
- ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model
- HOIDiNi: Human-Object Interaction through Diffusion Noise Optimization
- Polyline Path Masked Attention for Vision Transformer
- GreenHyperSpectra: A multi-source hyperspectral dataset for global vegetation trait prediction
- RODS: Robust Optimization Inspired Diffusion Sampling for Detecting and Reducing Hallucination in Generative Models
- Moving Object Detection from Moving Camera Using Focus of Expansion Likelihood and Segmentation
- RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation
- SimCortex: Collision-free Simultaneous Cortical Surfaces Reconstruction
- ViBED-Net: Video Based Engagement Detection Network Using Face-Aware and Scene-Aware Spatiotemporal Cues
- Transformer Redesign for Late Fusion of Audio-Text Features on Ultra-Low-Power Edge Hardware
- Fast Agnostic Learners in the Plane
- Arbitrated Indirect Treatment Comparisons
- PrivaDE: Privacy-preserving Data Evaluation for Blockchain-based Data Marketplaces
- Generalization Below the Edge of Stability: The Role of Data Geometry
- Extracting Rule-based Descriptions of Attention Features in Transformers
- Beating the Winner's Curse via Inference-Aware Policy Optimization
- Joint Estimation of Piano Dynamics and Metrical Structure with a Multi-task Multi-Scale Network
- RESCUE: Retrieval Augmented Secure Code Generation
- The Bias-Variance Tradeoff in Data-Driven Optimization: A Local Misspecification Perspective
- LIME: Link-based user-item Interaction Modeling with decoupled xor attention for Efficient test time scaling
- Efficient Few-shot Identity Preserving Attribute Editing for 3D-aware Deep Generative Models
- A Distributed Framework for Causal Modeling of Performance Variability in GPU Traces
- Parametrising the Inhomogeneity Inducing Capacity of a Training Set, and its Impact on Supervised Learning
- ECG-LLM -- training and evaluation of domain-specific large language models for electrocardiography
- A machine learning approach to automation and uncertainty evaluation for self-validating thermocouples
- Vision Foundation Models Can Be Good Tokenizers for Latent Diffusion Models
- Decoding Dynamic Visual Experience from Calcium Imaging via Cell-Pattern-Aware SSL
- Interval Prediction of Annual Average Daily Traffic on Local Roads via Quantile Random Forest with High-Dimensional Spatial Data
- A Multi-Evidence Framework Rescues Low-Power Prognostic Signals and Rejects Statistical Artifacts in Cancer Genomics
- CovMatch: Cross-Covariance Guided Multimodal Dataset Distillation with Trainable Text Encoder
- Channel-Aware Vector Quantization for Robust Semantic Communication on Discrete Channels
- A Compositional Paradigm for Foundation Models: Towards Smarter Robotic Agents
- Differentially Private E-Values
- Bayesian Low-Rank Factorization for Robust Model Adaptation
- Adapting Language Balance in Code-Switching Speech
- Diffusion Buffer for Online Generative Speech Enhancement
- Symbolic Emulators for Cosmology: Accelerating Cosmological Analyses Without Sacrificing Precision
- Analyse comparative d'algorithmes de restauration en architecture dépliée pour des signaux chromatographiques parcimonieux
- A Frequentist Statistical Introduction to Variational Inference, Autoencoders, and Diffusion Models
- SO(3)-invariant PCA with application to molecular data
- MTraining: Distributed Dynamic Sparse Attention for Efficient Ultra-Long Context Training
- One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares
- Better NTK Conditioning: A Free Lunch from (ReLU) Nonlinear Activation in Wide Neural Networks
- Sparse Explanations of Neural Networks Using Pruned Layer-Wise Relevance Propagation
- A Flow-Based Model for Conditional and Probabilistic Electricity Consumption Profile Generation and Prediction
- Learning Confidence Bounds for Classification with Imbalanced Data
- One protein is all you need
- FedMeld: A Model-dispersal Federated Learning Framework for Space-ground Integrated Networks
- Asynchronous Federated Learning: A Scalable Approach for Decentralized Machine Learning
- A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI
- Can We Validate Counterfactual Estimations in the Presence of General Network Interference?
- Analyzing Similarity Metrics for Data Selection for Language Model Pretraining
- Beyond Benign Overfitting in Nadaraya-Watson Interpolators
- In-Context Learning of Linear Dynamical Systems with Transformers: Approximation Bounds and Depth-Separation
- Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples
- CayleyPy RL: Pathfinding and Reinforcement Learning on Cayley Graphs
- In-Context Learning of Stochastic Differential Equations with Foundation Inference Models
- Reinforcement Learning with Verifiable Rewards: GRPO's Effective Loss, Dynamics, and Success Amplification
- Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization
- PAUSE: Low-Latency and Privacy-Aware Active User Selection for Federated Learning
- Efficient Verified Machine Unlearning For Distillation
- Enabling Automatic Differentiation with Mollified Graph Neural Operators
- Spike-timing-dependent Hebbian learning as noisy gradient descent
- FlashBias: Fast Computation of Attention with Bias
- Neural Graduated Assignment for Maximum Common Edge Subgraphs
- Fair Supervised Learning Through Constraints on Smooth Nonconvex Unfairness-Measure Surrogates
- Understanding Differential Transformer Unchains Pretrained Self-Attentions
- The Spacetime of Diffusion Models: An Information Geometry Perspective
- Inverse Q-Learning Done Right: Offline Imitation Learning in $Q^\pi$-Realizable MDPs
- Generative or Discriminative? Revisiting Text Classification in the Era of Transformers
- PARALLELPROMPT: Extracting Parallelism from Large Language Model Queries
- A unified framework for establishing the universal approximation of transformer-type architectures
- Class-wise Balancing Data Replay for Federated Class-Incremental Learning
- The Impact of Coreset Selection on Spurious Correlations and Group Robustness
- Graph Neural Networks for Road Safety Modeling: Datasets and Evaluations for Accident Analysis
- Generation of Uncertainty-Aware Emergent Concepts in Factorized 3D Scene Graphs via Graph Neural Networks
- Implicit Neural Compression of Point Clouds
- On Generalization and Distributional Update for Mimicking Observations with Adequate Exploration
- Molecular Fingerprints Are Strong Models for Peptide Function Prediction
- Dynamic object goal pushing with mobile manipulators through model-free constrained reinforcement learning
- The $\varphi$ Curve: The Shape of Generalization through the Lens of Norm-based Capacity Control
- VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching
- Improving Diffusion-based Inverse Algorithms under Few-Step Constraint via Learnable Linear Extrapolation
- Harnessing Test-time Adaptation for NLU tasks Involving Dialects of English
- Low-cost Embedded Breathing Rate Determination Using 802.15.4z IR-UWB Hardware for Remote Healthcare
- REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers
- A Physics-Informed Spatiotemporal Deep Learning Framework for Turbulent Systems
- Backward Conformal Prediction
- Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning
- Steering Generative Models with Experimental Data for Protein Fitness Optimization
- gen2seg: Generative Models Enable Generalizable Instance Segmentation
- DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding
- Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference
- FlySearch: Exploring how vision-language models explore
- ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition
- Dynamic Diffusion Schrödinger Bridge in Astrophysical Observational Inversions
- ReVeal: Self-Evolving Code Agents via Reliable Self-Verification
- Advances in Pre-trained Language Models for Domain-Specific Text Classification: A Systematic Review
- Atomic Literary Styling: Mechanistic Manipulation of Prose Generation in Neural Language Models
- Chain-of-Thought Reasoning Improves Context-Aware Translation with Large Language Models
- Na Prática, qual IA Entende o Direito? Um Estudo Experimental com IAs Generalistas e uma IA Jurídica
- Does Reasoning Help LLM Agents Play Dungeons and Dragons? A Prompt Engineering Experiment
- LLMs Encode How Difficult Problems Are
- CMT-Bench: Cricket Multi-Table Generation Benchmark for Probing Robustness in Large Language Models
- MARCUS: An Event-Centric NLP Pipeline that generates Character Arcs from Narratives
- BrailleLLM: Braille Instruction Tuning with Large Language Models for Braille Domain Tasks
- Food4All: A Multi-Agent Framework for Real-time Free Food Discovery with Integrated Nutritional Metadata
- Combining Distantly Supervised Models with In Context Learning for Monolingual and Cross-Lingual Relation Extraction
- KrishokBondhu: A Retrieval-Augmented Voice-Based Agricultural Advisory Call Center for Bengali Farmers
- KoSimpleQA: A Korean Factuality Benchmark with an Analysis of Reasoning LLMs
- Towards Fair ASR For Second Language Speakers Using Fairness Prompted Finetuning
- Adamas: Hadamard Sparse Attention for Efficient Long-Context Inference
- Chain-of-Conceptual-Thought: Eliciting the Agent to Deeply Think within the Response
- Grounding or Guessing? Visual Signals for Detecting Hallucinations in Sign Language Translation
- Engagement Undermines Safety: How Stereotypes and Toxicity Shape Humor in Language Models
- ChronoPlay: A Framework for Modeling Dual Dynamics and Authenticity in Game RAG Benchmarks
- DePass: Unified Feature Attributing by Simple Decomposed Forward Pass
- CEFR-Annotated WordNet: LLM-Based Proficiency-Guided Semantic Database for Language Learning
- IMB: An Italian Medical Benchmark for Question Answering
- DART: A Structured Dataset of Regulatory Drug Documents in Italian for Clinical NLP
- How Efficient Are Diffusion Language Models? A Critical Examination of Efficiency Evaluation Practices
- Identity-Aware Large Language Models require Cultural Reasoning
- Building Trust in Clinical LLMs: Bias Analysis and Dataset Transparency
- Beyond the Explicit: A Bilingual Dataset for Dehumanization Detection in Social Media
- Dynamical model parameters from ultrasound tongue kinematics
- MLMA: Towards Multilingual with Mamba Based Architectures
- Investigating LLM Capabilities on Long Context Comprehension for Medical Question Answering
- SemiAdapt and SemiLoRA: Efficient Domain Adaptation for Transformer-based Low-Resource Language Translation with a Case Study on Irish
- Topoformer: brain-like topographic organization in Transformer language models through spatial querying and reweighting
- AI use in American newspapers is widespread, uneven, and rarely disclosed
- KAT-Coder Technical Report
- WebSeer: Training Deeper Search Agents through Reinforcement Learning with Self-Reflection
- HouseTour: A Virtual Real Estate A(I)gent
- The Impact of Image Resolution on Biomedical Multimodal Large Language Models
- Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption
- See the Text: From Tokenization to Visual Reading
- Unconditional Truthfulness: Learning Unconditional Uncertainty of Large Language Models
- Rethinking LLM Uncertainty: A Multi-Agent Approach to Estimating Black-Box Model Uncertainty
- Discovering the curriculum with AI: A proof-of-concept demonstration with an intelligent tutoring system for teaching project selection
- LENS: Large Pre-trained Transformer for Exploring Financial Time Series Regularities
- Counterfactual Effect Decomposition in Multi-Agent Sequential Decision Making
- InternLM2.5-StepProver: Advancing Automated Theorem Proving via Critic-Guided Search
- Do LLMs Strategically Reveal, Conceal, and Infer Information? A Theoretical and Empirical Analysis in The Chameleon Game
- Modeling Human Beliefs about AI Behavior for Scalable Oversight
- A representational framework for learning and encoding structurally enriched trajectories in complex agent environments
- HyperGraphRAG: Retrieval-Augmented Generation via Hypergraph-Structured Knowledge Representation
- Improving Human-AI Coordination through Online Adversarial Training and Generative Models
- Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning
- MTRE: Multi-Token Reliability Estimation for Hallucination Detection in VLMs
- SOCIA: Joint Structure-Parameter Co-Optimization for Automated Simulator Construction
- Can Agents Fix Agent Issues?
- VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
- Mitigating Prior Errors in Causal Structure Learning: A Resilient Approach via Bayesian Networks
- Learning by Watching: A Review of Video-based Learning Approaches for Robot Manipulation
- Exploring Data-Efficient Adaptation of Large Language Models for Code Generation
- A Survey of Automatic Hallucination Evaluation on Natural Language Generation
- Learning Fairer Representations with FairVIC
- Review of Explainable Graph-Based Recommender Systems
- BlockScan: Detecting Anomalies in Blockchain Transactions
- Transition of $\alpha$-mixing in Random Iterations with Applications in Queuing Theory
- When Text Embedding Meets Large Language Model: A Comprehensive Survey
- Deep Learning in Palmprint Recognition -- A Comprehensive Survey
- LLM Safety Alignment is Divergence Estimation in Disguise
- Foundations of a Developmental Design Paradigm for Integrated Continual Learning, Deliberative Behavior, and Comprehensibility
- Challenges in Testing Large Language Model Based Software: A Faceted Taxonomy
- Temporal Alignment of LLMs through Cycle Encoding for Long-Range Time Representations
- Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling
- VLLFL: A Vision-Language Model Based Lightweight Federated Learning Framework for Smart Agriculture
- Dendritic Computing with Multi-Gate Ferroelectric Field-Effect Transistors
- Regression is all you need for medical image translation
- VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model
- The Shift Towards Preprints in AI Policy Research: A Comparative Study of Preprint Trends in the U.S., Europe, and South Korea
- MetaBox-v2: A Unified Benchmark Platform for Meta-Black-Box Optimization
- COLORA: Efficient Fine-Tuning for Convolutional Models with a Study Case on Optical Coherence Tomography Image Classification
- GraSS: Scalable Data Attribution with Gradient Sparsification and Sparse Projection
- LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling
- Multi-Agent Collaboration via Evolving Orchestration
- Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
- REOrdering Patches Improves Vision Models
- EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
- Counterfactual reasoning: an analysis of in-context emergence
- Model-based Implicit Neural Representation for sub-wavelength Radio Localization
- Mind the Web: The Security of Web Use Agents
- Understanding In-Context Learning on Structured Manifolds: Bridging Attention to Kernel Methods
- C-SEO Bench: Does Conversational SEO Work?
- Sparse Feature Coactivation Reveals Causal Semantic Modules in Large Language Models
- Iterative Quantum Feature Maps
- UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
- From Noise to Laws: Regularized Time-Series Forecasting via Denoised Dynamic Graphs
- Shock-Aware Physics-Guided Fusion-DeepONet Operator for Rarefied Micro-Nozzle Flows
- Demystifying Transition Matching: When and Why It Can Beat Flow Matching
- Attention-Guided Deep Adversarial Temporal Subspace Clustering (A-DATSC) Model for multivariate spatiotemporal data
- Benchmarking Probabilistic Time Series Forecasting Models on Neural Activity
- Batch Distillation Data for Developing Machine Learning Anomaly Detection Methods
- MEG-GPT: A transformer-based foundation model for magnetoencephalography data
- Provably Optimal Reinforcement Learning under Safety Filtering
- Gradient Variance Reveals Failure Modes in Flow-Based Generative Models
- Efficient Long-context Language Model Training by Core Attention Disaggregation
- HyperDiffusionFields (HyDiF): Diffusion-Guided Hypernetworks for Learning Implicit Molecular Neural Fields
- Rethinking PCA Through Duality
- Nash Policy Gradient: A Policy Gradient Method with Iteratively Refined Regularization for Finding Nash Equilibria
- Ensemble based Closed-Loop Optimal Control using Physics-Informed Neural Networks
- Joint Optimization of Cooperation Efficiency and Communication Covertness for Target Detection with AUVs
- Towards Fast LLM Fine-tuning through Zeroth-Order Optimization with Projected Gradient-Aligned Perturbations
- ACTG-ARL: Differentially Private Conditional Text Generation with RL-Boosted Control
- Fostering the Ecosystem of AI for Social Impact Requires Expanding and Strengthening Evaluation Standards
- Learning with Dual-level Noisy Correspondence for Multi-modal Entity Alignment
- From Competition to Synergy: Unlocking Reinforcement Learning for Subject-Driven Image Generation
- Online Time Series Forecasting with Theoretical Guarantees
- Physics-Informed Parametric Bandits for Beam Alignment in mmWave Communications
- Towards Identifiability of Hierarchical Temporal Causal Representation Learning
- Uncertainty Estimation by Flexible Evidential Deep Learning
- Why Policy Gradient Algorithms Work for Undiscounted Total-Reward MDPs
- Computable universal online learning
- Ensembling Pruned Attention Heads For Uncertainty-Aware Efficient Transformers
- Learning to Flow from Generative Pretext Tasks for Neural Architecture Encoding
- Towards Unsupervised Open-Set Graph Domain Adaptation via Dual Reprogramming
- Training Diverse Graph Experts for Ensembles: A Systematic Empirical Study
- Approximation Rates of Shallow Neural Networks: Barron Spaces, Activation Functions and Optimality Analysis
- Provable Generalization Bounds for Deep Neural Networks with Adaptive Regularization
- Learning Boltzmann Generators via Constrained Mass Transport
- Safe But Not Sorry: Reducing Over-Conservatism in Safety Critics via Uncertainty-Aware Modulation
- Learning to Navigate Under Imperfect Perception: Conformalised Segmentation for Safe Reinforcement Learning
- Alibaba International E-commerce Product Search Competition DILAB Team Technical Report
- Partial VOROS: A Cost-aware Performance Metric for Binary Classifiers with Precision and Capacity Constraints
- HeFS: Helper-Enhanced Feature Selection via Pareto-Optimized Genetic Search
- Robustness Verification of Graph Neural Networks Via Lightweight Satisfiability Testing
- Unrolled-SINDy: A Stable Explicit Method for Non linear PDE Discovery from Sparsely Sampled Data
- Hardness of Learning Regular Languages in the Next Symbol Prediction Setting
- Optimality and NP-Hardness of Transformers in Learning Markovian Dynamical Functions
- Informed Learning for Estimating Drought Stress at Fine-Scale Resolution Enables Accurate Yield Prediction
- Learning Time-Varying Turn-Taking Behavior in Group Conversations
- Prototyping an End-to-End Multi-Modal Tiny-CNN for Cardiovascular Sensor Patches
- Learning Task-Agnostic Representations through Multi-Teacher Distillation
- Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach
- OmniCast: A Masked Latent Diffusion Model for Weather Forecasting Across Time Scales
- Improving the Generation and Evaluation of Synthetic Data for Downstream Medical Causal Inference
- Enhancing Fractional Gradient Descent with Learned Optimizers
- CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training
- Stick-Breaking Embedded Topic Model with Continuous Optimal Transport for Online Analysis of Document Streams
- On Biologically Plausible Learning in Continuous Time
- When LRP Diverges from Leave-One-Out in Transformers
- A Unified Perspective on Optimization in Machine Learning and Neuroscience: From Gradient Descent to Neural Adaptation
- Search Self-play: Pushing the Frontier of Agent Capability without Supervision
- BO4Mob: Bayesian Optimization Benchmarks for High-Dimensional Urban Mobility Problem
- A Hybrid Enumeration Framework for Optimal Counterfactual Generation in Post-Acute COVID-19 Heart Failure
- Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting
- In-Process Monitoring of Gear Power Honing Using Vibration Signal Analysis and Machine Learning
- Exploring Complexity Changes in Diseased ECG Signals for Enhanced Classification
- Single-Snapshot Gridless 2D-DoA Estimation for UCAs: A Joint Optimization Approach
- CLARAE: Clarity Preserving Reconstruction AutoEncoder for Denoising and Rhythm Classification of Intracardiac Electrograms
- Covariance Matrix Construction with Preprocessing-Based Spatial Sampling for Robust Adaptive Beamforming
- Neural networks for neurocomputing circuits: a computational study of tolerance to noise and activation function non-uniformity when machine learning materials properties
- Provenance of AI-Generated Images: A Vector Similarity and Blockchain-based Approach
- Shortcutting Pre-trained Flow Matching Diffusion Models is Almost Free Lunch
- Mixed Monotonicity Reachability Analysis of Neural ODE: A Trade-Off Between Tightness and Efficiency
- Three-dimensional inversion of gravity data using implicit neural representations
- Graphical model for tensor factorization by sparse sampling
- TritonRL: Training LLMs to Think and Code Triton Without Cheating
- Learning Time-Varying Graphs from Incomplete Graph Signals
- QINNs: Quantum-Informed Neural Networks
- Does GenAI Rewrite How We Write? An Empirical Study on Two-Million Preprints
- From Flows to Words: Can Zero-/Few-Shot LLMs Detect Network Intrusions? A Grammar-Constrained, Calibrated Evaluation on UNSW-NB15
- When Intelligence Fails: An Empirical Study on Why LLMs Struggle with Password Cracking
- Metrics and evaluations for computational and sustainable AI efficiency
- Hey Pentti, We Did It!: A Fully Vector-Symbolic Lisp
- MIN-Merging: Merge the Important Neurons for Model Merging
- Hierarchical Federated Unlearning for Large Language Models
- Long-Context Attention Benchmark: From Kernel Efficiency to Distributed Context Parallelism
- L-MoE: End-to-End Training of a Lightweight Mixture of Low-Rank Adaptation Experts
- Automated Algorithm Design for Auto-Tuning Optimizers
- Are LLMs Court-Ready? Evaluating Frontier Models on Indian Legal Reasoning
- The Sherpa.ai Blind Vertical Federated Learning Paradigm to Minimize the Number of Communications
- BreakFun: Jailbreaking LLMs via Schema Exploitation
- Interpretability Framework for LLMs in Undergraduate Calculus
- TACLA: An LLM-Based Multi-Agent Tool for Transactional Analysis Training in Education
- NeuCo-Bench: A Novel Benchmark Framework for Neural Embeddings in Earth Observation
- Uncertainty-Aware Post-Hoc Calibration: Mitigating Confidently Incorrect Predictions Beyond Calibration Metrics
- Self-Evidencing Through Hierarchical Gradient Decomposition: A Dissipative System That Maintains Non-Equilibrium Steady-State by Minimizing Variational Free Energy
- Data Unlearning Beyond Uniform Forgetting via Diffusion Time and Frequency Selection
- JT-Safe: Intrinsically Enhancing the Safety and Trustworthiness of LLMs
- ParaVul: A Parallel Large Language Model and Retrieval-Augmented Framework for Smart Contract Vulnerability Detection
- CBINNS: Cancer Biology-Informed Neural Network for Unknown Parameter Estimation and Missing Physics Identification
- CLAWS: Creativity detection for LLM-generated solutions using Attention Window of Sections
- Select-Then-Decompose: From Empirical Analysis to Adaptive Selection Strategy for Task Decomposition in Large Language Models
- Rewarding the Journey, Not Just the Destination: A Composite Path and Answer Self-Scoring Reward Mechanism for Test-Time Reinforcement Learning
- Efficient Toxicity Detection in Gaming Chats: A Comparative Study of Embeddings, Fine-Tuned Transformers and LLMs
- SpecAgent: A Speculative Retrieval and Forecasting Agent for Code Completion
- EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning
- Diagnosing Representation Dynamics in NER Model Extension
- Attracting Commercial Artificial Intelligence Firms to Support National Security through Collaborative Contracts
- From Charts to Code: A Hierarchical Benchmark for Multimodal Models
- From Observations to Parameters: Detecting Changepoint in Nonlinear Dynamics with Simulation-based Inference
- AtlasKV: Augmenting LLMs with Billion-Scale Knowledge Graphs in 20GB VRAM
- XDXD: End-to-end crystal structure determination with low resolution X-ray diffraction
- UniRL-Zero: Reinforcement Learning on Unified Models with Joint Language Model and Diffusion Model Experts
- The Integration of Artificial Intelligence in Undergraduate Medical Education in Spain: Descriptive Analysis and International Perspectives
- Believe It or Not: How Deeply do LLMs Believe Implanted Facts?
- Trust in foundation models and GenAI: A geographic perspective
- Intuitionistic j-Do-Calculus in Topos Causal Models
- PLAGUE: Plug-and-play framework for Lifelong Adaptive Generation of Multi-turn Exploits
- Studying the Effects of Robot Intervention on School Shooters in Virtual Reality
- Universal Spectral Tokenization via Self-Supervised Panchromatic Representation Learning
- SimBA: Simplifying Benchmark Analysis Using Performance Matrices Alone
- BadScientist: Can a Research Agent Write Convincing but Unsound Papers that Fool LLM Reviewers?
- Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution
- DynaQuery: A Self-Adapting Framework for Querying Structured and Multimodal Data
- From Local to Global: Revisiting Structured Pruning Paradigms for Large Language Models
- SAVANT: Semantic Analysis with Vision-Augmented Anomaly deTection
- TriggerNet: A Novel Explainable AI Framework for Red Palm Mite Detection and Multi-Model Comparison and Heuristic-Guided Annotation
- Cross-Domain Long-Term Forecasting: Radiation Dose from Sparse Neutron Sensor via Spatio-Temporal Operator Network
- Language Models as Semantic Augmenters for Sequential Recommenders
- Measure-Theoretic Anti-Causal Representation Learning
- Adaptive Divergence Regularized Policy Optimization for Fine-tuning Generative Models
- SPACeR: Self-Play Anchoring with Centralized Reference Models
- Fine-tuning Flow Matching Generative Models with Intermediate Feedback
- R2L: Reliable Reinforcement Learning: Guaranteed Return & Reliable Policies in Reinforcement Learning
- Any-Depth Alignment: Unlocking Innate Safety Alignment of LLMs to Any-Depth
- RL-Driven Security-Aware Resource Allocation Framework for UAV-Assisted O-RAN
- R2BC: Multi-Agent Imitation Learning from Single-Agent Demonstrations
- Accelerating Vision Transformers with Adaptive Patch Sizes
- Enhancing mortality prediction in cardiac arrest ICU patients through meta-modeling of structured clinical data from MIMIC-IV
- From AutoRecSys to AutoRecLab: A Call to Build, Evaluate, and Govern Autonomous Recommender-Systems Research Labs
- Latent Discrete Diffusion Models
- SafeCoop: Unravelling Full Stack Safety in Agentic Collaborative Driving
- Automatic Prompt Generation via Adaptive Selection of Prompting Techniques
- ActivationReasoning: Logical Reasoning in Latent Activation Spaces
- VelocityNet: Real-Time Crowd Anomaly Detection via Person-Specific Velocity Analysis
- RadDiagSeg-M: A Vision Language Model for Joint Diagnosis and Multi-Target Segmentation in Radiology
- Contrastive Decoding Mitigates Score Range Bias in LLM-as-a-Judge
- VLSU: Mapping the Limits of Joint Multimodal Understanding for AI Safety
- The Emergence of Complex Behavior in Large-Scale Ecological Environments
- EVER: Edge-Assisted Auto-Verification for Mobile MR-Aided Operation
- Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs
- Finding the Sweet Spot: Optimal Data Augmentation Ratio for Imbalanced Credit Scoring Using ADASYN
- Hyperbolic Space Learning Method Leveraging Temporal Motion Priors for Human Mesh Recovery
- DelvePO: Direction-Guided Self-Evolving Framework for Flexible Prompt Optimization
- NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective
- Learning under Quantization for High-Dimensional Linear Regression
- SPIKE: Stable Physics-Informed Kernel Evolution Method for Solving Hyperbolic Conservation Laws
- Latent-Info and Low-Dimensional Learning for Human Mesh Recovery and Parallel Optimization
- StreamingTOM: Streaming Token Compression for Efficient Video Understanding
- Text or Pixels? It Takes Half: On the Token Efficiency of Visual Text Inputs in Multimodal LLMs
- From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering
- Higher Embedding Dimension Creates a Stronger World Model for a Simple Sorting Task
- MoMaGen: Generating Demonstrations under Soft and Hard Constraints for Multi-Step Bimanual Mobile Manipulation
- Scalable, Explainable and Provably Robust Anomaly Detection with One-Step Flow Matching
- PGTT: Phase-Guided Terrain Traversal for Perceptive Legged Locomotion
- S2AP: Score-space Sharpness Minimization for Adversarial Pruning
- MENTOR: A Reinforcement Learning Framework for Model Enhancement via Teacher-Optimized Rewards in Small Models
- Automated Wicket-Taking Delivery Segmentation and Weakness Detection in Cricket Videos Using OCR-Guided YOLOv8 and Trajectory Modeling
- Learning from N-Tuple Data with M Positive Instances: Unbiased Risk Estimation and Theoretical Guarantees
- On AI Verification in Open RAN
- Optimistic Higher-Order Superposition
- ScaleNet: Scaling up Pretrained Neural Networks with Incremental Parameters
- ImageGem: In-the-wild Generative Image Interaction Dataset for Generative Model Personalization
- DeLoad: Demand-Driven Short-Video Preloading with Scalable Watch-Time Estimation
- Simple and Efficient Heterogeneous Temporal Graph Neural Network
- CodeRL+: Improving Code Generation via Reinforcement with Execution Semantics Alignment
- Benchmarking Fairness-aware Graph Neural Networks in Knowledge Graphs
- One Size Fits All? A Modular Adaptive Sanitization Kit (MASK) for Customizable Privacy-Preserving Phone Scam Detection
- Zero-Shot Vehicle Model Recognition via Text-Based Retrieval-Augmented Generation
- Pay Attention to the Triggers: Constructing Backdoors That Survive Distillation
- EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval
- RAISE: A Unified Framework for Responsible AI Scoring and Evaluation
- WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality
- Large language models for folktale type automation based on motifs: Cinderella case study
- Kaleido: Open-Sourced Multi-Subject Reference Video Generation Model
- The Cost-Benefit of Interdisciplinarity in AI for Mental Health
- A Rectification-Based Approach for Distilling Boosted Trees into Decision Trees
- Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views
- C-SWAP: Explainability-Aware Structured Pruning for Efficient Neural Networks Compression
- ε-Seg: Sparsely Supervised Semantic Segmentation of Microscopy Data
- Binary Quadratic Quantization: Beyond First-Order Quantization for Real-Valued Matrix Compression
- Reasoning Language Model Inference Serving Unveiled: An Empirical Study
- Exploring Membership Inference Vulnerabilities in Clinical Large Language Models
- Fetch.ai: An Architecture for Modern Multi-Agent Systems
- Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options
- Causally Perturbed Fairness Testing
- HarmNet: A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models
- Verifiable Accuracy and Abstention Rewards in Curriculum RL to Alleviate Lost-in-Conversation
- Computational Foundations for Strategic Coopetition: Formalizing Interdependence and Complementarity
- Online SFT for LLM Reasoning: Surprising Effectiveness of Self-Tuning without Rewards
- Fine-Tuned Thoughts: Leveraging Chain-of-Thought Reasoning for Industrial Asset Health Monitoring
- An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection
- Actor-Free Continuous Control via Structurally Maximizable Q-Functions
- Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning
- DP²O-SR: Direct Perceptual Preference Optimization for Real-World Image Super-Resolution
- Lyapunov-Aware Quantum-Inspired Reinforcement Learning for Continuous-Time Vehicle Control: A Feasibility Study
- Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model
- LightMem: Lightweight and Efficient Memory-Augmented Generation
- How Do LLMs Use Their Depth?
- Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
- Activation Manifold Projection: Liberating Task-Specific Behaviors from LLM Architectures
- Beyond More Context: Retrieval Diversity Boosts Multi-Turn Intent Understanding
- FABRIC: Framework for Agent-Based Realistic Intelligence Creation
- OPTAGENT: Optimizing Multi-Agent LLM Interactions Through Verbal Reinforcement Learning for Enhanced Reasoning
- Subject-Event Ontology Without Global Time: Foundations and Execution Semantics
- CompactPrompt: A Unified Pipeline for Prompt Data Compression in LLM Workflows
- Planned Diffusion
- SMaRT: Select, Mix, and ReinvenT - A Strategy Fusion Framework for LLM-Driven Reasoning and Planning
- Measuring Reasoning in LLMs: a New Dialectical Angle
- Learning from Generalization Patterns: An Evaluation-Driven Approach to Enhanced Data Augmentation for Fine-Tuning Small Language Models
- Annotating the Chain-of-Thought: A Behavior-Labeled Dataset for AI Safety
- LLM-Based Multi-Agent System for Simulating and Analyzing Marketing and Consumer Behavior
- Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model
- AgentChangeBench: A Multi-Dimensional Evaluation Framework for Goal-Shift Robustness in Conversational AI
- Local Coherence or Global Validity? Investigating RLVR Traces in Math Domains
- FST.ai 2.0: An Explainable AI Ecosystem for Fair, Fast, and Inclusive Decision-Making in Olympic and Paralympic Taekwondo
- A Definition of AGI
- ssToken: Self-modulated and Semantic-aware Token Selection for LLM Fine-tuning
- Illusions of reflection: open-ended task reveals systematic failures in Large Language Models' reflective reasoning
- Genesis: Evolving Attack Strategies for LLM Web Agent Red-Teaming
- Earth AI: Unlocking Geospatial Insights with Foundation Models and Cross-Modal Reasoning
- ShortcutBreaker: Low-Rank Noisy Bottleneck with Global Perturbation Attention for Multi-Class Unsupervised Anomaly Detection
- Memory-Augmented State Machine Prompting: A Novel LLM Agent Framework for Real-Time Strategy Games
- Heterogeneous Adversarial Play in Interactive Environments
- Deep Learning-Based Control Optimization for Glass Bottle Forming
- Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents
- Automated urban waterlogging assessment and early warning through a mixture of foundation models
- AlphaOPT: Formulating Optimization Programs with Self-Improving LLM Experience Library
- PlanU: Large Language Model Decision Making through Planning under Uncertainty
- CircuitSeer: Mining High-Quality Data by Probing Mathematical Reasoning Circuits in LLMs
- Probabilistic Modeling of Intentions in Socially Intelligent LLM Agents
- LAFA: Agentic LLM-Driven Federated Analytics over Decentralized Data Sources
- StarBench: A Turn-Based RPG Benchmark for Agentic Multimodal Decision-Making and Information Seeking
- AndroidControl-Curated: Revealing the True Potential of GUI Agents through Benchmark Purification
- Crucible: Quantifying the Potential of Control Algorithms through LLM Agents
- Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models
- Physics-guided Emulators Reveal Resilience and Fragility under Operational Latencies and Outages
- SOCIA-Nabla: Textual Gradient Meets Multi-Agent Orchestration for Automated Simulator Generation
- Extracting alignment data in open models
- QuantEvolve: Automating Quantitative Strategy Discovery through Multi-Agent Evolutionary Framework
- VAR: Visual Attention Reasoning via Structured Search and Backtracking
- Leveraging Association Rules for Better Predictions and Better Explanations
- Comparative Expressivity for Structured Argumentation Frameworks with Uncertain Rules and Premises
- Query Decomposition for RAG: Balancing Exploration-Exploitation
- Sherlock Your Queries: Learning to Ask the Right Questions for Dialogue-Based Retrieval
- Seg the HAB: Language-Guided Geospatial Algae Bloom Reasoning and Segmentation
- Decoding Funded Research: Comparative Analysis of Topic Models and Uncovering the Effect of Gender and Geographic Location
- Visual Space Optimization for Zero-shot Learning
- LLM Assisted Alpha Fairness for 6 GHz WiFi and NR_U Coexistence: An Agentic Orchestrator for Throughput, Energy, and SLA
- A Biophysical-Model-Informed Source Separation Framework For EMG Decomposition
- Carbon-Aware Orchestration of Integrated Satellite Aerial Terrestrial Networks via Digital Twin
- Speak to a Protein: An Interactive Multimodal Co-Scientist for Protein Analysis
- Multi-Agent Design Assistant for the Simulation of Inertial Fusion Energy
- Synthetic EEG Generation using Diffusion Models for Motor Imagery Tasks
- Brain-Language Model Alignment: Insights into the Platonic Hypothesis and Intermediate-Layer Advantage
- GRETEL: A Goal-driven Retrieval and Execution-based Trial Framework for LLM Tool Selection Enhancing
- Modeling Layered Consciousness with Multi-Agent Large Language Models
- MAT-Agent: Adaptive Multi-Agent Training Optimization
- CARLE: A Hybrid Deep-Shallow Learning Framework for Robust and Explainable RUL Estimation of Rolling Element Bearings
- Pre to Post-Treatment Glioblastoma MRI Prediction using a Latent Diffusion Model
- Deploying Atmospheric and Oceanic AI Models on Chinese Hardware and Framework: Migration Strategies, Performance Optimization and Analysis
- MUSE: Model-based Uncertainty-aware Similarity Estimation for zero-shot 2D Object Detection and Segmentation
- A Survey of Recursive and Recurrent Neural Networks
- Auditing and Mitigating Bias in Gender Classification Algorithms: A Data-Centric Approach
- Repairing Tool Calls Using Post-tool Execution Reflection and RAG
- 3D Weakly Supervised Semantic Segmentation via Class-Aware and Geometry-Guided Pseudo-Label Refinement
- DRL-Based Resource Allocation for Energy-Efficient IRS-Assisted UAV Spectrum Sharing Systems
- Decoding Listeners Identity: Person Identification from EEG Signals Using a Lightweight Spiking Transformer
- Outraged AI: Large language models prioritise emotion over cost in fairness enforcement
- POPI: Personalizing LLMs via Optimized Natural Language Preference Inference
Research Sources: 562 | Generated: 10/22/2025
