AI RESEARCH PAPERS & ACADEMIC SOURCES
- DreamCS: Geometry-Aware Text-to-3D Generation with Unpaired 3D Reward Supervision
- Learning Frequency and Memory-Aware Prompts for Multi-Modal Object Tracking
- IC-Custom: Diverse Image Customization via In-Context Learning
- Towards More Accurate Diffusion Model Acceleration with A Timestep Tuner
- Semi-Supervised Unconstrained Head Pose Estimation in the Wild
- Scheduling Weight Transitions for Quantization-Aware Training
- DPDETR: Decoupled Position Detection Transformer for Infrared-Visible Object Detection
- SL²A-INR: Single-Layer Learnable Activation for Implicit Neural Representation
- Rectified Diffusion Guidance for Conditional Generation
- Dressing the Imagination: A Dataset for AI-Powered Translation of Text into Fashion Outfits and A Novel KAN Adapter for Enhanced Feature Adaptation
- Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization
- SEE: See Everything Every Time -- Adaptive Brightness Adjustment for Broad Light Range Images via Events
- Beyond Semantics: Rediscovering Spatial Awareness in Vision-Language Models
- Easi3R: Estimating Disentangled Motion from DUSt3R Without Training
- H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
- SpikeGen: Decoupled "Rods and Cones" Visual Representation Processing with Latent Generative Framework
- STORK: Faster Diffusion And Flow Matching Sampling By Resolving Both Stiffness And Structure-Dependence
- ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction
- A Scene is Worth a Thousand Features: Feed-Forward Camera Localization from a Collection of Image Features
- Visual Self-Refinement for Autoregressive Models
- SoftCFG: Uncertainty-guided Stable Guidance for Visual autoregressive Model
- POVQA: Preference-Optimized Video Question Answering with Rationales for Data Efficiency
- ImageDoctor: Diagnosing Text-to-Image Generation via Grounded Image Reasoning
- Towards Adversarial Training under Hyperspectral Images
- Instant4D: 4D Gaussian Splatting in Minutes
- EvoWorld: Evolving Panoramic World Generation with Explicit 3D Memory
- IMAGEdit: Let Any Subject Transform
- Latent Representation Learning from 3D Brain MRI for Interpretable Prediction in Multiple Sclerosis
- Adapting Large Language Models to Mitigate Skin Tone Biases in Clinical Dermatology Tasks: A Mixed-Methods Study
- Variable Rate Image Compression via N-Gram Context based Swin-transformer
- Behavioural Classification in C. elegans: a Spatio-Temporal Analysis of Locomotion
- A Fast and Precise Method for Searching Rectangular Tumor Regions in Brain MR Images
- Adaptive Event Stream Slicing for Open-Vocabulary Event-Based Object Detection via Vision-Language Knowledge Distillation
- ProtoMask: Segmentation-Guided Prototype Learning
- Graph Integrated Multimodal Concept Bottleneck Model
- Training-free Uncertainty Guidance for Complex Visual Tasks with MLLMs
- Deep learning motion correction of quantitative stress perfusion cardiovascular magnetic resonance
- DEAP DIVE: Dataset Investigation with Vision transformers for EEG evaluation
- Defect Segmentation in OCT scans of ceramic parts for non-destructive inspection using deep learning
- ZQBA: Zero Query Black-box Adversarial Attack
- From Seeing to Predicting: A Vision-Language Framework for Trajectory Forecasting and Controlled Video Generation
- PhraseStereo: The First Open-Vocabulary Stereo Image Segmentation Dataset
- NSARM: Next-Scale Autoregressive Modeling for Robust Real-World Image Super-Resolution
- PAL-Net: A Point-Wise CNN with Patch-Attention for 3D Facial Landmark Localization
- Equivariant Splitting: Self-supervised learning from incomplete data
- Looking Alike From Far to Near: Enhancing Cross-Resolution Re-Identification via Feature Vector Panning
- InfVSR: Breaking Length Limits of Generic Video Super-Resolution
- JEPA-T: Joint-Embedding Predictive Architecture with Text Fusion for Image Generation
- Cascaded Diffusion Framework for Probabilistic Coarse-to-Fine Hand Pose Estimation
- Assessing Foundation Models for Mold Colony Detection with Limited Training Data
- Arbitrary Generative Video Interpolation
- Color Models in Image Processing: A Review and Experimental Comparison
- Multi-level Dynamic Style Transfer for NeRFs
- LVLMs as inspectors: an agentic framework for category-level structural defect annotation
- Disentangling Foreground and Background for vision-Language Navigation via Online Augmentation
- Robust Context-Aware Object Recognition
- UCD: Unconditional Discriminator Promotes Nash Equilibrium in GANs
- LAKAN: Landmark-assisted Adaptive Kolmogorov-Arnold Network for Face Forgery Detection
- Erased, But Not Forgotten: Erased Rectified Flow Transformers Still Remain Unsafe Under Concept Attack
- FIN: Fast Inference Network for Map Segmentation
- OTTER: Open-Tagging via Text-Image Representation for Multi-modal Understanding
- Weakly Supervised Cloud Detection Combining Spectral Features and Multi-Scale Deep Network
- Unsupervised Unfolded rPCA (U2-rPCA): Deep Interpretable Clutter Filtering for Ultrasound Microvascular Imaging
- Beyond one-hot encoding? Journey into compact encoding for large multi-class segmentation
- OIG-Bench: A Multi-Agent Annotated Benchmark for Multimodal One-Image Guides Understanding
- Improved Hyperspectral Anomaly Detection via Unsupervised Subspace Modeling in the Signed Cumulative Distribution Transform Domain
- PAL-UI: Planning with Active Look-back for Vision-Based GUI Agents
- BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration
- VLOD-TTA: Test-Time Adaptation of Vision-Language Object Detectors
- MathSticks: A Benchmark for Visual Symbolic Compositional Reasoning with Matchstick Puzzles
- Affordance-Guided Diffusion Prior for 3D Hand Reconstruction
- Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
- CardioBench: Do Echocardiography Foundation Models Generalize Beyond the Lab?
- Hearing the Order: Investigating Selection Bias in Large Audio-Language Models
- Spiralformer: Low Latency Encoder for Streaming Speech Recognition with Circular Layer Skipping and Early Exiting
- Improving Code Localization with Repository Memory
- Language Models can Subtly Deceive Without Lying: A Case Study on Strategic Phrasing in Legislation
- Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion
- Self-Evolving Multi-Agent Simulations for Realistic Clinical Interactions
- Improving Retrieval-Augmented Neural Machine Translation with Monolingual Data
- GuRE: Generative Query REwriter for Legal Passage Retrieval
- Precise Information Control in Long-Form Text Generation
- Through the Valley: Path to Effective Long CoT Training for Small Language Models
- REAL: Reading Out Transformer Activations for Precise Localization in Language Model Steering
- CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of Large Language Models in Mental Health Question Answering
- Retain or Reframe? A Computational Framework for the Analysis of Framing in News Articles and Reader Comments
- Are Knowledge and Reference in Multilingual Language Models Cross-Lingually Consistent?
- CoT Vectors: Transferring and Probing the Reasoning Mechanisms of LLMs
- MCM-DPO: Multifaceted Cross-Modal Direct Preference Optimization for Alt-text Generation
- Family Matters: Language Transfer and Merging for Adapting Small LLMs to Faroese
- Exposing the Cracks: Vulnerabilities of Retrieval-Augmented LLM-based Machine Translation
- ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs
- HalluGuard: Evidence-Grounded Small Reasoning Models to Mitigate Hallucinations in Retrieval-Augmented Generation
- Making, not Taking, the Best of N
- Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks
- Syntax-Guided Diffusion Language Models with User-Integrated Personalization
- Research on the Integration of Embodied Intelligence and Reinforcement Learning in Textual Domains
- Automatic Speech Recognition (ASR) for African Low-Resource Languages: A Systematic Literature Review
- Pay-Per-Search Models are Abstention Models
- Energy-Regularized Sequential Model Editing on Hyperspheres
- Unpacking Musical Symbolism in Online Communities: Content-Based and Network-Centric Approaches
- QSearchNet: A Quantum Walk Search Framework for Link Prediction
- When Silence Matters: The Impact of Irrelevant Audio on Text Reasoning in Large Audio-Language Models
- TAMA: Tool-Augmented Multimodal Agent for Procedural Activity Understanding
- DRBench: A Realistic Benchmark for Enterprise Deep Research
- PrimeX: A Dataset of Worldview, Opinion, and Explanation
- Judging with Confidence: Calibrating Autoraters to Preference Distributions
- ReEvalMed: Rethinking Medical Report Evaluation by Aligning Metrics with Real-World Clinical Judgment
- CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage
- TokMem: Tokenized Procedural Memory for Large Language Models
- LongCodeZip: Compress Long Context for Code Language Models
- Enhancing Rating Prediction with Off-the-Shelf LLMs Using In-Context User Reviews
- Agent Fine-tuning through Distillation for Domain-specific LLMs in Microdomains
- Agent-ScanKit: Unraveling Memory and Reasoning of Multimodal Agents via Sensitivity Perturbations
- JoyAgent-JDGenie: Technical Report on the GAIA
- GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness
- ThinkBrake: Mitigating Overthinking in Tool Reasoning
- Are Large Language Models Chronically Online Surfers? A Dataset for Chinese Internet Meme Explanation
- ReSeek: A Self-Correcting Framework for Search Agents with Instructive Rewards
- Divergence-Based Similarity Function for Multi-View Contrastive Learning
- Imagining Alternatives: Towards High-Resolution 3D Counterfactual Medical Image Generation via Language Guidance
- Online Non-convex Optimization with Long-term Non-convex Constraints
- Achieving More Human Brain-Like Vision via Human EEG Representational Alignment
- PhyloLM: Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks
- Statistically Truthful Auctions via Acceptance Rule
- Graph Transformer Networks for Accurate Band Structure Prediction: An End-to-End Approach
- Estimating quantum relative entropies on quantum computers
- Integration of Calcium Imaging Traces via Deep Generative Modeling
- An Iterative Bayesian Approach for System Identification based on Linear Gaussian Models
- Learning the Universe: Learning to Optimize Cosmic Initial Conditions with Non-Differentiable Structure Formation Models
- OBELiX: A Curated Dataset of Crystal Structures and Experimentally Measured Ionic Conductivities for Lithium Solid-State Electrolytes
- Toward a Robust R2D2 Paradigm for Radio-interferometric Imaging: Revisiting Deep Neural Network Training and Architecture
- Is Limited Participant Diversity Impeding EEG-based Machine Learning?
- Robustness and sex differences in skin cancer detection: logistic regression vs CNNs
- Learning simple heuristic rules for classifying materials based on chemical composition
- EVALOOOP: A Self-Consistency-Centered Framework for Assessing Large Language Model Robustness in Programming
- Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation
- Gauges and Accelerated Optimization over Smooth and/or Strongly Convex Sets
- Mutual information maximizing quantum generative adversarial networks
- The causal structure of galactic astrophysics
- Uncovering Challenges of Solving the Continuous Gromov-Wasserstein Problem
- The Inhibitor: ReLU and Addition-Based Attention for Efficient Transformers under Fully Homomorphic Encryption on the Torus
- SMaRt: Improving GANs with Score Matching Regularity
- Neural Network Characterization and Entropy Regulated Data Balancing through Principal Component Analysis
- Fully Heteroscedastic Count Regression with Deep Double Poisson Networks
- Out-of-Distribution Detection with Relative Angles
- Krony-PT: GPT2 compressed with Kronecker Products
- CYCle: Choosing Your Collaborators Wisely to Enhance Collaborative Fairness in Decentralized Learning
- Prompt Tuning Decision Transformers with Structured and Scalable Bandits
- Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model
- Federated Dynamic Modeling and Learning for Spatiotemporal Data Forecasting
- Rapid training of Hamiltonian graph networks using random features
- Temporal Misalignment Attacks against Multimodal Perception in Autonomous Driving
- LoRA meets Riemannion: Muon Optimizer for Parametrization-independent Low-Rank Adapters
- Scaling Linear Attention with Sparse State Expansion
- Parametric modeling of shear wave velocity profiles for the conterminous U.S.
- A Deep Learning Pipeline for Epilepsy Genomic Analysis Using GPT-2 XL and NVIDIA H100
- Improving Virtual Contrast Enhancement using Longitudinal Data
- EuroSpeech: A Multilingual Speech Corpus
- Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum
- Virtual Fashion Photo-Shoots: Building a Large-Scale Garment-Lookbook Dataset
- Multi-Domain Brain Vessel Segmentation Through Feature Disentanglement
- A Geometric Unification of Generative AI with Manifold-Probabilistic Projection Models
- Stochastic Self-Organization in Multi-Agent Systems
- Discovering Communities in Continuous-Time Temporal Networks by Optimizing L-Modularity
- GeoGraph: Geometric and Graph-based Ensemble Descriptors for Intrinsically Disordered Proteins
- AI-CNet3D: An Anatomically-Informed Cross-Attention Network with Multi-Task Consistency Fine-tuning for 3D Glaucoma Classification
- COMMET: orders-of-magnitude speed-up in finite element method via batch-vectorized neural constitutive updates
- Modeling Market States with Clustering and State Machines
- Secure and reversible face anonymization with diffusion models
- Temporal Score Rescaling for Temperature Sampling in Diffusion and Flow Models
- Dirichlet-Prior Shaping: Guiding Expert Specialization in Upcycled MoEs
- FTSCommDetector: Discovering Behavioral Communities through Temporal Synchronization
- A Recall-First CNN for Sleep Apnea Screening from Snoring Audio
- DPsurv: Dual-Prototype Evidential Fusion for Uncertainty-Aware and Interpretable Whole-Slide Image Survival Prediction
- Enhancing Certifiable Semantic Robustness via Robust Pruning of Deep Neural Networks
- Revealing the temporal dynamics of antibiotic anomalies in the infant gut microbiome with neural jump ODEs
- Quantum reservoir computing using Jaynes-Cummings model
- Learning from the electronic structure of molecules across the periodic table
- Board Gender Diversity and Carbon Emissions Performance: Insights from Panel Regressions, Machine Learning and Explainable AI
- Low Resource Audio Codec Challenge Baseline Systems
- SafePassage: High-Fidelity Information Extraction with Black Box LLMs
- Electron neural closure for turbulent magnetosheath simulations: energy channels
- Malliavin Calculus with Weak Derivatives for Counterfactual Stochastic Optimization
- Looking Beyond the Known: Towards a Data Discovery Guided Open-World Object Detection
- End-to-end Training of High-Dimensional Optimal Control with Implicit Hamiltonians via Jacobian-Free Backpropagation
- Meaningless Tokens, Meaningful Gains: How Activation Shifts Enhance LLM Reasoning
- Gated X-TFC: Soft Domain Decomposition for Forward and Inverse Problems in Sharp-Gradient PDEs
- Eliciting Secret Knowledge from Language Models
- Predicting Diabetic Retinopathy Using a Two-Level Ensemble Model
- Multi-Actor Multi-Critic Deep Deterministic Reinforcement Learning with a Novel Q-Ensemble Method
- Dynamical system reconstruction from partial observations using stochastic dynamics
- Geometric Properties of Neural Multivariate Regression
- Augmenting LLMs for General Time Series Understanding and Prediction
- Privacy Preserved Federated Learning with Attention-Based Aggregation for Biometric Recognition
- Eliciting Chain-of-Thought Reasoning for Time Series Analysis using Reinforcement Learning
- Breaking the Euclidean Barrier: Hyperboloid-Based Biological Sequence Analysis
- Prompt Curriculum Learning for Efficient LLM Post-Training
- Sample-Efficient Differentially Private Fine-Tuning via Gradient Matrix Denoising
- Neural Hamilton-Jacobi Characteristic Flows for Optimal Transport
- Multi-Marginal Flow Matching with Adversarially Learnt Interpolants
- BroRL: Scaling Reinforcement Learning via Broadened Exploration
- Downgrade to Upgrade: Optimizer Simplification Enhances Robustness in LLM Unlearning
- In-Place Feedback: A New Paradigm for Guiding LLMs in Multi-Turn Reasoning
- Complex System Exploration with Interactive Human Guidance
- Guiding Evolutionary Molecular Design: Adding Reinforcement Learning for Mutation Selection
- Online Minimization of Polarization and Disagreement via Low-Rank Matrix Bandits
- Are Time Series Foundation Models Susceptible to Catastrophic Forgetting?
- LLM Routing with Dueling Feedback
- Population Synthesis using Incomplete Information
- The data-quality illusion: Rethinking Classifier-based quality filtering for LLM Pretraining
- Target Population Synthesis using CT-GAN
- Reducción de ruido por medio de autoencoders: caso de estudio con la señal GW150914 [Noise reduction using autoencoders: a case study with the GW150914 signal]
- Rectifying Regression in Reinforcement Learning
- BoMGene: Integrating Boruta-mRMR feature selection for enhanced Gene expression classification
- Large Reasoning Models Learn Better Alignment from Flawed Thinking
- It Takes Two: Your GRPO Is Secretly DPO
- Riemannian Consistency Model
- Random Feature Spiking Neural Networks
- Randomized Matrix Sketching for Neural Network Training and Gradient Monitoring
- Rehearsal-free and Task-free Online Continual Learning With Contrastive Prompt
- Diagnosing Shortcut-Induced Rigidity in Continual Learning: The Einstellung Rigidity Index (ERI)
- Vicinity-Guided Discriminative Latent Diffusion for Privacy-Preserving Domain Adaptation
- Diffusion Alignment as Variational Expectation-Maximization
- Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space?
- Interpretable Machine Learning for Life Expectancy Prediction: A Comparative Study of Linear Regression, Decision Tree, and Random Forest
- Private Online Learning against an Adaptive Adversary: Realizable and Agnostic Settings
- Probability calibration for precipitation nowcasting
- Designing Ambiguity Sets for Distributionally Robust Optimization Using Structural Causal Optimal Transport
- Multi-Agent Stage-wise Conservative Linear Bandits
- Physics-Informed Extreme Learning Machine (PIELM) for Tunnelling-Induced Soil-Pile Interactions
- Comparison of Machine Learning Models to Classify Documents on Digital Development
- TD-JEPA: Latent-predictive Representations for Zero-Shot Reinforcement Learning
- How Foundational are Foundation Models for Time Series Forecasting?
- LEAP: Local ECT-Based Learnable Positional Encodings for Graphs
- Cutting the Skip: Training Residual-Free Transformers
- Initial Distribution Sensitivity of Constrained Markov Decision Processes
- Flow Autoencoders are Effective Protein Tokenizers
- AReUReDi: Annealed Rectified Updates for Refining Discrete Flows with Multi-Objective Guidance
- Continual Learning with Query-Only Attention
- The Transformer Cookbook
- GDLNN: Marriage of Programming Language and Neural Networks for Accurate and Easy-to-Explain Graph Classification
- Composer: A Search Framework for Hybrid Neural Architecture Design
- Efficient Probabilistic Tensor Networks
- Learning Passive Continuous-Time Dynamics with Multistep Port-Hamiltonian Gaussian Processes
- Bayesian Distributional Models of Executive Functioning
- Graph2Region: Efficient Graph Similarity Learning with Structure and Scale Restoration
- Can Mamba Learn In Context with Outliers? A Theoretical Generalization Analysis
- Hierarchy-Aware Neural Subgraph Matching with Enhanced Similarity Measure
- Learning a Zeroth-Order Optimizer for Fine-Tuning LLMs
- On-the-Fly Data Augmentation via Gradient-Guided and Sample-Aware Influence Estimation
- On the Soundness and Consistency of LLM Agents for Executing Test Cases Written in Natural Language
- Linear Regression in p-adic metric spaces
- Federated Learning Meets LLMs: Feature Extraction From Heterogeneous Clients
- Large Language Models Inference Engines based on Spiking Neural Networks
- RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers
- Differentiable Autoencoding Neural Operator for Interpretable and Integrable Latent Space Modeling
- Per-example gradients: a new frontier for understanding and improving optimizers
- Reward driven discovery of the optimal microstructure representations with invariant variational autoencoders
- CODED-SMOOTHING: Coding Theory Helps Generalization
- Delayed Attention Training Improves Length Generalization in Transformer-RNN Hybrids
- Beyond Token Probes: Hallucination Detection via Activation Tensors with ACT-ViT
- Robust Federated Inference
- DiSC-AMC: Token- and Parameter-Efficient Discretized Statistics In-Context Automatic Modulation Classification
- MS-DFTVNet: A Long-Term Time Series Prediction Method Based on Multi-Scale Deformable Convolution
- EFRame: Deeper Reasoning via Exploration-Filter-Replay Reinforcement Learning Framework
- Towards a Progress Bar for Reasoning: Progress Prediction in Large Reasoning Models
- Model Parallelism With Subnetwork Data Parallelism
- Fair CCA for Fair Representation Learning: An ADNI Study
- Nonlinear Framework for Speech Bandwidth Extension
- AS400-DET: Detection using Deep Learning Model for IBM i (AS/400)
- Training-free LLM Verification via Recycling Few-shot Examples
- Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning
- A Physics-Inspired Optimizer: Velocity Regularized Adam
- Choosing a Model, Shaping a Future: Comparing LLM Perspectives on Sustainability and its Relationship with AI
- Towards Holistic Evaluation of Large Audio-Language Models: A Comprehensive Survey
- Federated Causal Inference from Multi-Site Observational Data via Propensity Score Aggregation
- Steering LLM Reasoning Through Bias-Only Adaptation
- Object Centric Concept Bottlenecks
- Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining
- PCoreSet: Effective Active Learning through Knowledge Distillation from Vision-Language Models
- Unpacking Let Alone: Human-Scale Models Generalize to a Rare Construction in Form but not Meaning
- MLLM-CL: Continual Learning for Multimodal Large Language Models
- Estimating Visceral Adiposity from Wrist-Worn Accelerometry
- LoRA Users Beware: A Few Spurious Tokens Can Manipulate Your Finetuned Model
- Graphon Particle Systems, Part II: Dynamics of Distributed Stochastic Continuum Optimization
- Balancing Multimodal Training Through Game-Theoretic Regularization
- 3D Interaction Geometric Pre-training for Molecular Relational Learning
- PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation
- Stability Bounds for the Unfolded Forward-Backward Algorithm
- Exploring and Controlling Diversity in LLM-Agent Conversation
- Distilling Calibration via Conformalized Credal Inference
- Mitigating Domain Shift in Federated Learning via Intra- and Inter-Domain Prototypes
- ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data
- Toward Foundational Model for Sleep Analysis Using a Multimodal Hybrid Self-Supervised Learning Framework
- Addressing Moral Uncertainty using Large Language Models for Ethical Decision-Making
- Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing
- BlobCtrl: Taming Controllable Blob for Element-level Image Editing
- TDBench: A Benchmark for Top-Down Image Understanding with Reliability Analysis of Vision-Language Models
- R&D-Agent: An LLM-Agent Framework Towards Autonomous Data Science
- MoveGPT: Scaling Mobility Foundation Models with Spatially-Aware Mixture of Experts
- AgentMisalignment: Measuring the Propensity for Misaligned Behaviour in LLM-Based Agents
- Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties
- Discerning What Matters: A Multi-Dimensional Assessment of Moral Competence in LLMs
- ConciseHint: Boosting Efficient Reasoning via Continuous Concise Hints during Generation
- What if Othello-Playing Language Models Could See?
- Code Like Humans: A Multi-Agent Solution for Medical Coding
- Foam-Agent 2.0: An End-to-End Composable Multi-Agent Framework for Automating CFD Simulation in OpenFOAM
- Learning Dynamic Graph Embeddings with Neural Controlled Differential Equations
- Adversarial Attacks to Latent Representations of Distributed Neural Networks in Split Computing
- TabINR: An Implicit Neural Representation Framework for Tabular Data Imputation
- mR3: Multilingual Rubric-Agnostic Reward Reasoning Models
- Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?
- GRAD: Generative Retrieval-Aligned Demonstration Sampler for Efficient Few-Shot Reasoning
- Simultaneous Multi-objective Alignment Across Verifiable and Non-verifiable Rewards
- Fiaingen: A financial time series generative method matching real-world data quality
- Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity
- COM-BOM: Bayesian Exemplar Search for Efficiently Exploring the Accuracy-Calibration Pareto Frontier
- TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
- NL2Plan: Robust LLM-Driven Planning from Minimal Text Descriptions
- Breast Cancer Diagnosis: A Comprehensive Exploration of Explainable Artificial Intelligence (XAI) Techniques
- Whose Journey Matters? Investigating Identity Biases in Large Language Models (LLMs) for Travel Planning Assistance
- PETAH: Parameter Efficient Task Adaptation for Hybrid Transformers in a resource-limited Context
- Diffusion Model-based Parameter Estimation in Dynamic Power Systems
- ViLBias: Detecting and Reasoning about Bias in Multimodal Content
- MathConstruct: Challenging LLM Reasoning with Constructive Proofs
- Neural Theorem Proving: Generating and Structuring Proofs for Formal Verification
- Advancing Automated Ethical Profiling in SE: a Zero-Shot Evaluation of LLM Reasoning
- GLAI: GreenLightningAI for Accelerated Training through Knowledge Decoupling
- Span-level Detection of AI-generated Scientific Text via Contrastive Learning and Structural Calibration
- TubeDAgger: Reducing the Number of Expert Interventions with Stochastic Reach-Tubes
- RiskPO: Risk-based Policy Optimization via Verifiable Reward for LLM Post-Training
- Reinforcement Learning with Verifiable yet Noisy Rewards under Imperfect Verifiers
- Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving
- Bridging the Gap Between Simulated and Real Network Data Using Transfer Learning
- TextCAM: Explaining Class Activation Map with Text
- CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs
- Authentic Discrete Diffusion Model
- Interpreting Language Models Through Concept Descriptions: A Survey
- GEM: A Gym for Agentic LLMs
- Hybrid Dialogue State Tracking for Persian Chatbots: A Language Model-Based Approach
- CodeGenLink: A Tool to Find the Likely Origin and License of Automatically Generated Code
- Rethinking Thinking Tokens: LLMs as Improvement Operators
- A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning
- Extreme Blind Image Restoration via Prompt-Conditioned Information Bottleneck
- Neural Diffusion Processes for Physically Interpretable Survival Prediction
- From Scores to Preferences: Redefining MOS Benchmarking for Speech Quality Reward Modeling
- Multi-Objective Task-Aware Predictor for Image-Text Alignment
- UniverSR: Unified and Versatile Audio Super-Resolution via Vocoder-Free Flow Matching
- Uncertainty-Aware Concept Bottleneck Models with Enhanced Interpretability
- MetaLogic: Robustness Evaluation of Text-to-Image Models via Logically Equivalent Prompts
- Solar PV Installation Potential Assessment on Building Facades Based on Vision and Language Foundation Models
- MG2FlowNet: Accelerating High-Reward Sample Generation via Enhanced MCTS and Greediness Control
- What You See is What You Ask: Evaluating Audio Descriptions
- Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning
- Towards Verifiable Federated Unlearning: Framework, Challenges, and The Road Ahead
- Feature Identification for Hierarchical Contrastive Learning
- Mechanistic Interpretability as Statistical Estimation: A Variance Analysis of EAP-IG
- Can World Models Benefit VLMs for World Dynamics?
- Gather-Scatter Mamba: Accelerating Propagation with Efficient State Space Model
- Copy-Paste to Mitigate Large Language Model Hallucinations
- Adaptive Data-Knowledge Alignment in Genetic Perturbation Prediction
- Architectural Transformations and Emerging Verification Demands in AI-Enabled Cyber-Physical Systems
- Forestpest-YOLO: A High-Performance Detection Framework for Small Forestry Pests
- EMR-AGENT: Automating Cohort and Feature Extraction from EMR Databases
- On Predictability of Reinforcement Learning Dynamics for Large Language Models
- Memory Determines Learning Direction: A Theory of Gradient-Based Optimization in State Space Models
- Panorama: Fast-Track Nearest Neighbors
- Adaptive Shared Experts with LoRA-Based Mixture of Experts for Multi-Task Learning
- SAGE-LD: Towards Scalable and Generalizable End-to-End Language Diarization via Simulated Data Augmentation
- U-DFA: A Unified DINOv2-Unet with Dual Fusion Attention for Multi-Dataset Medical Segmentation
- AI-Driven Self-Evolving Software: A Promising Path Toward Software Automation
- FAME: Adaptive Functional Attention with Expert Routing for Function-on-Function Regression
- Tenyidie Syllabification corpus creation and deep learning applications
- Align Your Tangent: Training Better Consistency Models via Manifold-Aligned Tangents
- Facilitating Cognitive Accessibility with LLMs: A Multi-Task Approach to Easy-to-Read Text Generation
- Inclusive Easy-to-Read Generation for Individuals with Cognitive Impairments
- Domain-Specialized Interactive Segmentation Framework for Meningioma Radiotherapy Planning
- Automated Structured Radiology Report Generation with Rich Clinical Context
- Plug-and-Play Prompt Refinement via Latent Feedback for Diffusion Model Alignment
- Measuring and Controlling the Spectral Bias for Self-Supervised Image Denoising
- UrbanGraph: Physics-Informed Spatio-Temporal Dynamic Heterogeneous Graphs for Urban Microclimate Prediction
- TimeEmb: A Lightweight Static-Dynamic Disentanglement Framework for Time Series Forecasting
- Feature Identification via the Empirical NTK
- Analyzing Latent Concepts in Code Language Models
- PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation
- Black-Box Time-Series Domain Adaptation via Cross-Prompt Foundation Models
- Exploring System 1 and 2 communication for latent reasoning in LLMs
- Normal-Abnormal Guided Generalist Anomaly Detection
- MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance
- Relative-Absolute Fusion: Rethinking Feature Extraction in Image-Based Iterative Method Selection for Solving Sparse Linear Systems
- Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs
- SLogic: Subgraph-Informed Logical Rule Learning for Knowledge Graph Completion
- Data driven approaches in nanophotonics: A review of AI-enabled metadevices
- o-MEGA: Optimized Methods for Explanation Generation and Analysis
- Free Draft-and-Verification: Toward Lossless Parallel Decoding for Diffusion Large Language Models
- Barriers for Learning in an Evolving World: Mathematical Understanding of Loss of Plasticity
- Digital Domination: A Case for Republican Liberty in Artificial Intelligence
- DecepChain: Inducing Deceptive Reasoning in Large Language Models
- A Framework for Selection of Machine Learning Algorithms Based on Performance Metrices and Akaike Information Criteria in Healthcare, Telecommunication, and Marketing Sector
- Reasoning-Aware Prompt Orchestration: A Foundation Model for Multi-Agent Language Model Coordination
- Structural Refinement of Bayesian Networks for Efficient Model Parameterisation
- In-Context Curiosity: Distilling Exploration for Decision-Pretrained Transformers on Bandit Tasks
- Combining Large Language Models and Gradient-Free Optimization for Automatic Control Policy Synthesis
- Discrete Wavelet Transform as a Facilitator for Expressive Latent Space Representation in Variational Autoencoders in Satellite Imagery
- SAGE-Music: Low-Latency Symbolic Music Generation via Attribute-Specialized Key-Value Head Sharing
- AbsTopK: Rethinking Sparse Autoencoders For Bidirectional Features
- David and Goliath in Medical Vision: Convolutional Networks vs Biomedical Vision Language Models
- BigBang-Proton Technical Report: Next-Word-Prediction is Scientific Multitask Learner
- Which Rewards Matter? Reward Selection for Reinforcement Learning under Limited Feedback
- Partial Identification Approach to Counterfactual Fairness Assessment
- Personalized Reasoning: Just-In-Time Personalization and Why LLMs Fail At It
- Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls
- PrunedLoRA: Robust Gradient-Based structured pruning for Low-rank Adaptation in Fine-tuning
- GRPO-$\lambda$: Credit Assignment improves LLM Reasoning
- LoRAFusion: Efficient LoRA Fine-Tuning for LLMs
- Directed-MAML: Meta Reinforcement Learning Algorithm with Task-directed Approximation
- Thoughtbubbles: an Unsupervised Method for Parallel Thinking in Latent Space
- The Pitfalls of KV Cache Compression
- BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses
- Debunk the Myth of SFT Generalization
- TASER: Translation Assessment via Systematic Evaluation and Reasoning
- Learning Energy-based Variational Latent Prior for VAEs
- Retrieval-Augmented Generation for Electrocardiogram-Language Models
- Efficient Layer-wise LLM Fine-tuning for Revision Intention Prediction
- DexBench: Benchmarking LLMs for Personalized Decision Making in Diabetes Management
- Uncovering Intrinsic Capabilities: A Paradigm for Data Curation in Vision-Language Models
- Culture In a Frame: C$^3$B as a Comic-Based Benchmark for Multimodal Culturally Awareness
- Beyond the Prompt: Gender Bias in Text-to-Image Models, with a Case Study on Hospital Professions
- Reinforcement Learning-Based Prompt Template Stealing for Text-to-Image Models
- Explanation-Driven Counterfactual Testing for Faithfulness in Vision-Language Model Explanations
- AI-Based Stroke Rehabilitation Domiciliary Assessment System with ST_GCN Attention
- Object-AVEdit: An Object-level Audio-Visual Editing Model
- HiDe: Rethinking The Zoom-IN method in High Resolution MLLMs via Hierarchical Decoupling
- FSDENet: A Frequency and Spatial Domains based Detail Enhancement Network for Remote Sensing Semantic Segmentation
- Survey of AI-Powered Approaches for Osteoporosis Diagnosis in Medical Imaging
- Efficient CNN Compression via Multi-method Low Rank Factorization and Feature Map Similarity
- AstroMMBench: A Benchmark for Evaluating Multimodal Large Language Models Capabilities in Astronomy
- Geo-R1: Unlocking VLM Geospatial Reasoning with Cross-View Reinforcement Learning
- Adaptive and Resource-efficient Agentic AI Systems for Mobile and Embedded Devices: A Survey
- SoREX: Towards Self-Explainable Social Recommendation with Relevant Ego-Path Extraction
- Simulating Student Success in the Age of GenAI: A Kantian-Axiomatic Perspective
- Methodological Framework for Quantifying Semantic Test Coverage in RAG Systems
- IA aplicada al análisis del conflicto Irán-Israel: Mapeo de discursos en YouTube
- EpidemIQs: Prompt-to-Paper LLM Agents for Epidemic Modeling and Analysis
- Learning Inter-Atomic Potentials without Explicit Equivariance
- Rethinking RoPE Scaling in Quantized LLM: Theory, Outlier, and Channel-Band Analysis with Weight Rescaling
- Enhancing Safety in Diabetic Retinopathy Detection: Uncertainty-Aware Deep Learning Models with Rejection Capabilities
- Temporal-Aware Iterative Speech Model for Dementia Detection
- VibeCodeHPC: An Agent-Based Iterative Prompting Auto-Tuner for HPC Code Generation Using LLMs
- WaveMind: Towards a Conversational EEG Foundation Model Aligned to Textual and Visual Modalities
- Hybrid Deep Learning for Hyperspectral Single Image Super-Resolution
- Review of Hallucination Understanding in Large Language and Vision Models
- Deep Learning-Based Pneumonia Detection from Chest X-ray Images: A CNN Approach with Performance Analysis and Clinical Implications
- On Robustness of Vision-Language-Action Model against Multi-Modal Perturbations
- FusionAdapter for Few-Shot Relation Learning in Multimodal Knowledge Graphs
- On Discovering Algorithms for Adversarial Imitation Learning
- Test-Time Search in Neural Graph Coarsening Procedures for the Capacitated Vehicle Routing Problem
- A Neuro-Fuzzy System for Interpretable Long-Term Stock Market Forecasting
- QUASAR: Quantum Assembly Code Generation Using Tool-Augmented LLMs via Agentic RL
- Integrating AI and Ensemble Forecasting: Explainable Materials Planning with Scorecards and Trend Insights for a Large-Scale Manufacturer
- Shape Happens: Automatic Feature Manifold Discovery in LLMs via Supervised Multi-Dimensional Scaling
- Uncovering the Computational Ingredients of Human-Like Representations in LLMs
- Activation-Deactivation: A General Framework for Robust Post-hoc Explainable AI
- Typed Chain-of-Thought: A Curry-Howard Framework for Verifying LLM Reasoning
- Safety Instincts: LLMs Learn to Trust Their Internal Compass for Self-Defense
- Optimizing Fairness in Production Planning: A Human-Centric Approach to Machine and Workforce Allocation
- PRISM-Consult: A Panel-of-Experts Architecture for Clinician-Aligned Diagnosis
- Exploring Network-Knowledge Graph Duality: A Case Study in Agentic Supply Chain Risk Analysis
- Apriel-1.5-15b-Thinker
- Generalized Parallel Scaling with Interdependent Generations
- MARS: Audio Generation via Multi-Channel Autoregression on Spectrograms
- Collaborative-Distilled Diffusion Models (CDDM) for Accelerated and Lightweight Trajectory Prediction
- Expected Attention: KV Cache Compression by Estimating Attention from Future Queries Distribution
- Batch-CAM: Introduction to better reasoning in convolutional deep learning models
- Relevance-Zone Reduction in Game Solving
- ACPO: Adaptive Curriculum Policy Optimization for Aligning Vision-Language Models in Complex Reasoning
- EvolProver: Advancing Automated Theorem Proving by Evolving Formalized Problems via Symmetry and Difficulty
- DIA: The Adversarial Exposure of Deterministic Inversion in Diffusion Models
- AI in data science education: experiences from the classroom
- Benchmarking Agentic Systems in Automated Scientific Information Extraction with ChemX
- Semantic Bridges Between First Order c-Representations and Cost-Based Semantics: An Initial Perspective
- Logical Consistency Between Disagreeing Experts and Its Role in AI Safety
- Benchmarking Machine Learning Models for Fault Classification and Localization in Power System Protection
- Improving Cryptocurrency Pump-and-Dump Detection through Ensemble-Based Models and Synthetic Oversampling Techniques
- Learning Compact Representations of LLM Abilities via Item Response Theory
- Unveiling Interesting Insights: Monte Carlo Tree Search for Knowledge Discovery
- DualTune: Decoupled Fine-Tuning for On-Device Agentic Systems
- MAGIC-MASK: Multi-Agent Guided Inter-Agent Collaboration with Mask-Based Explainability for Reinforcement Learning
- ICL Optimized Fragility
- BiasBusters: Uncovering and Mitigating Tool Selection Bias in Large Language Models
- When Hallucination Costs Millions: Benchmarking AI Agents in High-Stakes Adversarial Financial Markets
- Hierarchical Reasoning Model: A Critical Supplementary Material
- Semantic-Driven AI Agent Communications: Challenges and Solutions
- Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm
- Automated Evaluation can Distinguish the Good and Bad AI Responses to Patient Questions about Hospitalization
- Expandable Decision-Making States for Multi-Agent Deep Reinforcement Learning in Soccer Tactical Analysis
- Rethinking Reward Models for Multi-Domain Test-Time Scaling
- VIRTUE: Visual-Interactive Text-Image Universal Embedder
- Toward Safer Diffusion Language Models: Discovery and Mitigation of Priming Vulnerability
- ACON: Optimizing Context Compression for Long-horizon LLM Agents
- HARPA: A Testability-Driven, Literature-Grounded Framework for Research Ideation
- Is Model Editing Built on Sand? Revealing Its Illusory Success and Fragile Foundation
- Learning to Lead Themselves: Agentic AI in MAS using MARL
- ToolBrain: A Flexible Reinforcement Learning Framework for Agentic Tools
- ARS: Adaptive Reasoning Suppression for Efficient Large Reasoning Language Models
- NeurIPS should lead scientific consensus on AI policy
- Towards a Framework for Supporting the Ethical and Regulatory Certification of AI Systems
- Judging by Appearances? Auditing and Intervening Vision-Language Models for Bail Prediction
- AuditAgent: Expert-Guided Multi-Agent Reasoning for Cross-Document Fraudulent Evidence Discovery
- Object-Centric Case-Based Reasoning via Argumentation
- Thinkquel: A Model Dedicated to Text-to-dbt Using Synthetic Data and a Span-Aware Objective
- Synthetic Census Data Generation via Multidimensional Multiset Sum
- A Framework for Double-Blind Federated Adaptation of Foundation Models
- Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers
- A Backdoor-based Explainable AI Benchmark for High Fidelity Evaluation of Attributions
- Phantom: General Backdoor Attacks on Retrieval Augmented Language Generation
- COOKIEGUARD: Characterizing and Isolating the First-Party Cookie Jar
- Noninterference Analysis of Irreversible or Reversible Systems with Nondeterminism and Probabilities
- A Hitchhiker's Guide to Privacy-Preserving Digital Payment Systems: A Survey on Anonymity, Confidentiality, and Auditability
- gh0stEdit: Exploiting Layer-Based Access Vulnerability Within Docker Container Images
- Extended c-differential distinguishers of full 9 and reduced-round Kuznyechik cipher
- Vectorised Hashing Based on Bernstein-Rabin-Winograd Polynomials over Prime Order Fields
- Verifiability and Privacy in Federated Learning through Context-Hiding Multi-Key Homomorphic Authenticators
- B-Privacy: Defining and Enforcing Privacy in Weighted Voting
- Differential Privacy of Network Parameters from a System Identification Perspective
- Blockchain-Based Secure Online Voting Platform Ensuring Voter Anonymity, Integrity, and End-to-End Verifiability
- Hot PATE: Private Aggregation of Distributions for Diverse Task
- Fast, Secure, and High-Capacity Image Watermarking with Autoencoded Text Vectors
- Universally Composable Termination Analysis of Tendermint
- EditTrack: Detecting and Attributing AI-assisted Image Editing
- Direct Token Optimization: A Self-contained Approach to Large Language Model Unlearning
- MOLM: Mixture of LoRA Markers
- Understanding Sensitivity of Differential Attention through the Lens of Adversarial Robustness
- LSPFuzz: Hunting Bugs in Language Servers
- Eyes-on-Me: Scalable RAG Poisoning through Transferable Attention-Steering Attractors
- Maven-Lockfile: High Integrity Rebuild of Past Java Releases
- Computational Monogamy of Entanglement and Non-Interactive Quantum Key Distribution
- Adaptive Federated Few-Shot Rare-Disease Diagnosis with Energy-Aware Secure Aggregation
- Semantics-Aligned, Curriculum-Driven, and Reasoning-Enhanced Vulnerability Repair Framework
- HVAC-EAR: Eavesdropping Human Speech Using HVAC Systems
- Backdoor Attacks Against Speech Language Models
- Stealing AI Model Weights Through Covert Communication Channels
- Calyx: Privacy-Preserving Multi-Token Optimistic-Rollup Protocol
- CHAI: Command Hijacking against embodied AI
- SecureBERT 2.0: Advanced Language Model for Cybersecurity Intelligence
- MAVUL: Multi-Agent Vulnerability Detection via Contextual Reasoning and Interactive Refinement
- Privately Estimating Black-Box Statistics
- Security and Privacy Analysis of Tile's Location Tracking Protocol
- A Call to Action for a Secure-by-Design Generative AI Paradigm
- Cloud Investigation Automation Framework (CIAF): An AI-Driven Approach to Cloud Forensics
- Has the Two-Decade-Old Prophecy Come True? Artificial Bad Intelligence Triggered by Merely a Single-Bit Flip in Large Language Models
- Memory-Augmented Log Analysis with Phi-4-mini: Enhancing Threat Detection in Structured Security Logs
- Sentry: Authenticating Machine Learning Artifacts on the Fly
- IntrusionX: A Hybrid Convolutional-LSTM Deep Learning Framework with Squirrel Search Optimization for Network Intrusion Detection
- A Monoid Ring Approach to Color Visual Cryptography
- HLTCOE at TREC 2024 NeuCLIR Track
- Privacy-Preserving Learning-Augmented Data Structures
- Milco: Learned Sparse Retrieval Across Languages via a Multilingual Connector
- On Listwise Reranking for Corpus Feedback
- Bridging Language Gaps: Advances in Cross-Lingual Information Retrieval with Multilingual LLMs
- Deep Learning-Based Approach for Improving Relational Aggregated Search
- ModernVBERT: Towards Smaller Visual Document Retrievers
- AutoPK: Leveraging LLMs and a Hybrid Similarity Metric for Advanced Retrieval of Pharmacokinetic Data from Complex Tables and Documents
- Which Programming Language and Model Work Best With LLM-as-a-Judge For Code Retrieval?
- ALARB: An Arabic Legal Argument Reasoning Benchmark
- AttentionDep: Domain-Aware Attention for Explainable Depression Severity Assessment
- Erase to Improve: Erasable Reinforcement Learning for Search-Augmented LLMs
- PaECTER: Patent-level Representation Learning using Citation-informed Transformers
- Stop Playing the Guessing Game! Target-free User Simulation for Evaluating Conversational Recommender Systems
- Stop Misusing t-SNE and UMAP for Visual Analytics
- Designing Psychometric Bias Measures for ChatBots: An Application to Racial Bias Measurement
- Confirmation Bias as a Cognitive Resource in LLM-Supported Deliberation
- "Having Lunch Now": Understanding How Users Engage with a Proactive Agent for Daily Planning and Self-Reflection
- Grounded GUI Understanding for Vision-Based Spatial Intelligent Agent: Exemplified by Extended Reality Apps
- XRZoo: A Large-Scale and Versatile Dataset of Extended Reality (XR) Applications
- GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents
- From latent factors to language: a user study on LLM-generated explanations for an inherently interpretable matrix-based recommender system
- Motion In-Betweening for Densely Interacting Characters
- ReSWD: ReSTIR'd, not shaken. Combining Reservoir Sampling and Sliced Wasserstein Distance for Variance Reduction
- Audio Driven Real-Time Facial Animation for Social Telepresence
- Temporally Smooth Mesh Extraction for Procedural Scenes with Long-Range Camera Trajectories using Spacetime Octrees
- Optimizing What Matters: AUC-Driven Learning for Robust Neural Retrieval
- Virtual Reality Alters Perceived Functional Body Size
- "We are not Future-ready": Understanding AI Privacy Risks and Existing Mitigation Strategies from the Perspective of AI Developers in Europe
- Social Photo-Elicitation: The Use of Communal Production of Meaning to Hear a Vulnerable Population
- Intelligent 5S Audit: Application of Artificial Intelligence for Continuous Improvement in the Automotive Industry
- Multidimensional Bayesian Active Machine Learning of Working Memory Task Performance
- Make a Video Call with LLM: A Measurement Campaign over Five Mainstream Apps
- Data Quality Challenges in Retrieval-Augmented Generation
- AI Where It Matters: Where, Why, and How Developers Want AI Support in Daily Work
- A Visual Diagnostics Framework for District Heating Data: Enhancing Data Quality for AI-Driven Heat Consumption Prediction
- A Technique Based on Trade-off Maps to Visualise and Analyse Relationships Between Objectives in Optimisation Problems
- Visualizing Quantum Circuits: State Vector Difference Highlighting and the Half-Matrix
- Intuitions of Machine Learning Researchers about Transfer Learning for Medical Image Classification
- Disc-Cover Complexity Trends in Music Illustrations from Sinatra to Swift
- Social Welfare Function Leaderboard: When LLM Agents Allocate Social Welfare
- Code2Video: A Code-centric Paradigm for Educational Video Generation
- DISCOVER: Data-driven Identification of Sub-activities via Clustering and Visualization for Enhanced Activity Recognition in Smart Homes
- Perceived Weight of Mediated Reality Sticks
- Data Melodification FM: Where Musical Rhetoric Meets Sonification
- Can AI agents understand spoken conversations about data visualizations in online meetings?
- Visualization Was Here: Reorienting Research When Visualizations Fade into the Background
- Navigating the Synchrony-Stability Frontier in Adaptive Chatbots
- The Feng Shui of Visualization: Design the Path to SUCCESS and GOOD FORTUNE
- Attribution Gradients: Incrementally Unfolding Citations for Critical Examination of Attributed AI Answers
- Investigating Encoding and Perspective for Augmented Reality
- RELATE-Sim: Leveraging Turning Point Theory and LLM Agents to Predict and Understand Long-Term Relationship Dynamics through Interactive Narrative Simulations
- Face2Feel: Emotion-Aware Adaptive User Interface
- PromptPilot: Improving Human-AI Collaboration Through LLM-Enhanced Prompt Engineering
- Rethinking Wine Tasting for Chinese Consumers: A Service Design Approach Enhanced by Multimodal Personalization
- Designing Wine Tasting Experiences for All: The role of Human Diversity and Personal food memory
- Datasets for Valence and Arousal Inference: A Survey
- ImpedanceGPT: VLM-driven Impedance Control of Swarm of Mini-drones for Intelligent Navigation in Dynamic Environment
- Adaptive Diffusion Constrained Sampling for Bimanual Robot Manipulation
- Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving
- Heterogeneous Predictor-based Risk-Aware Planning with Conformal Prediction in Dense, Uncertain Environments
- Sampling-Based Global Optimal Control and Estimation via Semidefinite Programming
- Probabilistic Collision Risk Estimation through Gauss-Legendre Cubature and Non-Homogeneous Poisson Processes
- How Safe Will I Be Given What I Saw? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy
- RoVerFly: Robust and Versatile Implicit Hybrid Control of Quadrotor-Payload Systems
- DBF-MA: A Differential Bayesian Filtering Planner for Multi-Agent Autonomous Racing Overtakes
- Certifiably Optimal Estimation and Calibration in Robotics via Trace-Constrained Semi-Definite Programming
- HR-INR: Continuous Space-Time Video Super-Resolution via Event Camera
- On the Application of Model Predictive Control to a Weighted Coverage Path Planning Problem
- Beyond Needle(s) in the Embodied Haystack: Environment, Architecture, and Training Considerations for Long Context Reasoning
- Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition
- Real-Time Trajectory Generation and Hybrid Lyapunov-Based Control for Hopping Robots
- Less is More: Lean yet Powerful Vision-Language Model for Autonomous Driving
- The Formation of Trust in Autonomous Vehicles after Interacting with Robotaxis on Public Roads
- Drones that Think on their Feet: Sudden Landing Decisions with Embodied AI
- Robust Attitude Control of Nonlinear Multi-Rotor Dynamics with LFT Models and $\mathcal{H}_\infty$ Performance
- A Hierarchical Agentic Framework for Autonomous Drone-Based Visual Inspection
- EgoTraj-Bench: Towards Robust Trajectory Prediction Under Ego-view Noisy Observations
- Conflict-Based Search as a Protocol: A Multi-Agent Motion Planning Protocol for Heterogeneous Agents, Solvers, and Independent Tasks
- KeySG: Hierarchical Keyframe-Based 3D Scene Graphs
- Predictive Control Barrier Functions for Discrete-Time Linear Systems with Unmodeled Delays
- Strategic Fusion of Vision Language Models: Shapley-Credited Context-Aware Dawid-Skene for Multi-Label Tasks in Autonomous Driving
- Active Shadowing (ASD): Manipulating Perception of Robotic Behaviors via Implicit Virtual Communication
- Optimization-based Task and Motion Planning under Signal Temporal Logic Specifications using Logic Network Flow
- HetSwarm: Cooperative Navigation of Heterogeneous Swarm in Dynamic and Dense Environments through Impedance-based Guidance
- GRITS: A Spillage-Aware Guided Diffusion Policy for Robot Food Scooping Tasks
- Hybrid Training for Vision-Language-Action Models
- What Did I Learn? Operational Competence Assessment for AI-Based Trajectory Planners
- Trajectory Based Observer Design: A Framework for Lightweight Sensor Fusion
- Enabling High-Frequency Cross-Modality Visual Positioning Service for Accurate Drone Landing
- Shared Object Manipulation with a Team of Collaborative Quadrupeds
- HAMLET: Switch your Vision-Language-Action Model into a History-Aware Policy
- MultiPhysio-HRC: Multimodal Physiological Signals Dataset for industrial Human-Robot Collaboration
- CroSTAta: Cross-State Transition Attention Transformer for Robotic Manipulation
- Tele-rehabilitation with online skill transfer and adaptation in $\mathbb{R}^3 \times \mathit{S}^3$
- Semantic Visual Simultaneous Localization and Mapping: A Survey on State of the Art, Challenges, and Future Directions
- RTFF: Random-to-Target Fabric Flattening Policy using Dual-Arm Manipulator
- Product-oriented Product-Process-Resource Asset Network and its Representation in AutomationML for Asset Administration Shell
- Non-submodular Visual Attention for Robot Navigation
- ROSflight 2.0: Lean ROS 2-Based Autopilot for Unmanned Aerial Vehicles
- Prometheus: Universal, Open-Source Mocap-Based Teleoperation System with Force Feedback for Dataset Collection in Robot Learning
- ROSplane 2.0: A Fixed-Wing Autopilot for Research
- The Gauss-Markov Adjunction Provides Categorical Semantics of Residuals in Supervised Learning
- RoboPilot: Generalizable Dynamic Robotic Manipulation with Dual-thinking Modes
- A Systematic Study of Large Language Models for Task and Motion Planning With PDDLStream
- A Novel Robust Control Method Combining DNN-Based NMPC Approximation and PI Control: Application to Exoskeleton Squat Movements
- TGPO: Temporal Grounded Policy Optimization for Signal Temporal Logic Tasks
- BC-MPPI: A Probabilistic Constraint Layer for Safe Model-Predictive Path-Integral Control
- Learning Human Reaching Optimality Principles from Minimal Observation Inverse Reinforcement Learning
- DiSA-IQL: Offline Reinforcement Learning for Robust Soft Robot Control under Distribution Shifts
- Physics-Informed Neural Controlled Differential Equations for Scalable Long Horizon Multi-Agent Motion Forecasting
- VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators
- Seeing through Uncertainty: Robust Task-Oriented Optimization in Visual Navigation
- Integrating Offline Pre-Training with Online Fine-Tuning: A Reinforcement Learning Approach for Robot Social Navigation
- From Human Hands to Robot Arms: Manipulation Skills Transfer via Trajectory Alignment
- Two stage GNSS outlier detection for factor graph optimization based GNSS-RTK/INS/odometer fusion
- How Does the Pretraining Distribution Shape In-Context Learning? Task Selection, Generalization, and Robustness
- A first-order method for constrained nonconvex-nonconcave minimax problems under a local Kurdyka-Łojasiewicz condition
- On the Benefits of Weight Normalization for Overparameterized Matrix Sensing
- Propagating Model Uncertainty through Filtering-based Probabilistic Numerical ODE Solvers
- Learning linear dynamical systems under convex constraints
- On the Natural Gradient of the Evidence Lower Bound
- A Likelihood Based Approach to Distribution Regression Using Conditional Deep Generative Models
- Learning Pattern-Specific Experts for Time Series Forecasting Under Patch-level Distribution Shift
- Learning to Dissipate Energy in Oscillatory State-Space Models
- Learning to Rank Chain-of-Thought: Using a Small Model
- Direct Preference Optimization for Adaptive Concept-based Explanations
- Entropic Risk Optimization in Discounted MDPs: Sample Complexity Bounds with a Generative Model
- Accurate Estimation of Mutual Information in High Dimensional Data
- Guided Speculative Inference for Efficient Test-Time Alignment of LLMs
- Minimax and Bayes Optimal Best-Arm Identification
- Theory of Scaling Laws for In-Context Regression: Depth, Width, Context and Time
- Deep Learning Approaches with Explainable AI for Differentiating Alzheimer Disease and Mild Cognitive Impairment
- Directed Information $\gamma$-covering: An Information-Theoretic Framework for Context Engineering
- Approximately Unimodal Likelihood Models for Ordinal Regression
- Nonparametric Identification of Latent Concepts
- Lipschitz Bandits with Stochastic Delayed Feedback
- Train on Validation (ToV): Fast data selection with applications to fine-tuning
- Progressively Sampled Equality-Constrained Optimization
- Robust Spatiotemporally Contiguous Anomaly Detection Using Tensor Decomposition
- On the joint observability of flow fields and particle properties from Lagrangian trajectories: evidence from neural data assimilation
- Mathematical Theory of Collinearity Effects on Machine Learning Variable Importance Measures
- Error Feedback for Muon and Friends
- Learn to Guide Your Diffusion Model
- Non-Euclidean Broximal Point Method: A Blueprint for Geometry-Aware Optimization
- False Discovery Rate Control via Bayesian Mirror Statistic
- The Good, the Bad, and the Sampled: a No-Regret Approach to Safe Online Classification
- Equivariant Geometric Scattering Networks via Vector Diffusion Wavelets
- Identifying All $\epsilon$-Best Arms in (Misspecified) Linear Bandits
- Private Learning of Littlestone Classes, Revisited
- CINDES: Classification induced neural density estimator and simulator
- On the Adversarial Robustness of Learning-based Conformal Novelty Detection
- A universal compression theory: Lottery ticket hypothesis and superpolynomial scaling laws
- Bayesian Neural Networks for Functional ANOVA model
- Guaranteed Noisy CP Tensor Recovery via Riemannian Optimization on the Segre Manifold
- Approximation of differential entropy in Bayesian optimal experimental design
- Optimal placement of wind farms via quantile constraint learning
Research Sources: 713 | Generated: 10/2/2025