AI RESEARCH PAPERS & ACADEMIC SOURCES
- Identifiability of the minimum-trace directed acyclic graph and hill climbing algorithms without strict local optima under weakly increasing error variances
- Estimating the size of a set using cascading exclusion
- A variational approach to dimension-free self-normalized concentration
- Non-asymptotic estimates for accelerated high order Langevin Monte Carlo algorithms
- Proximal optimal transport divergences
- Can Multimodal Large Language Models Understand Spatial Relations?
- TD3Net: A Temporal Densely Connected Multi-Dilated Convolutional Network for Lipreading
- End-to-End Fine-Tuning of 3D Texture Generation using Differentiable Rewards
- Neural-Driven Image Editing
- Trustworthy Pedestrian Trajectory Prediction via Pattern-Aware Interaction Modeling
- InterAct-Video: Reasoning-Rich Video QA for Urban Traffic
- Survival Modeling from Whole Slide Images via Patch-Level Graph Clustering and Mixture Density Experts
- Direct Robot Configuration Space Construction using Convolutional Encoder-Decoders
- MambaEviScrib: Mamba and Evidence-Guided Consistency Enhance CNN Robustness for Scribble-Based Weakly Supervised Ultrasound Image Segmentation
- CDI: Blind Image Restoration Fidelity Evaluation based on Consistency with Degraded Image
- POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction
- DAVSP: Safety Alignment for Large Vision-Language Models via Deep Aligned Visual Safety Prompt
- MoDA: Multi-modal Diffusion Architecture for Talking Head Generation
- PASG: A Closed-Loop Framework for Automated Geometric Primitive Extraction and Semantic Anchoring in Robotic Manipulation
- AnimateScene: Camera-controllable Animation in Any Scene
- Fast Motion Estimation and Context-Aware Refinement for Efficient Bayer-Domain Video Vision
- EvoMakeup: High-Fidelity and Controllable Makeup Editing with MakeupQuad
- MathReal: We Keep It Real! A Real Scene Benchmark for Evaluating Math Reasoning in Multimodal Large Language Models
- ExploreGS: Explorable 3D Scene Reconstruction with Virtual Camera Samplings and Diffusion Priors
- Learning 3D Texture-Aware Representations for Parsing Diverse Human Clothing and Body Parts
- InstantEdit: Text-Guided Few-Step Image Editing with Piecewise Rectified Flow
- More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment
- NEP: Autoregressive Image Editing via Next Editing Token Prediction
- VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning
- LV-Net: Anatomy-aware lateral ventricle shape modeling with a case study on Alzheimer's disease, the Australian Imaging Biomarkers and Lifestyle flagship study of ageing
- Lightweight Quad Bayer HybridEVS Demosaicing via State Space Augmented Cross-Attention
- Distribution-Specific Learning for Joint Salient and Camouflaged Object Detection
- DreamVE: Unified Instruction-based Image and Video Editing
- SwiftVideo: A Unified Framework for Few-Step Video Generation through Trajectory-Distribution Alignment
- AdaptInfer: Adaptive Token Pruning for Vision-Language Model Inference with Dynamical Text Guidance
- Q-CLIP: Unleashing the Power of Vision-Language Models for Video Quality Assessment through Unified Cross-Modal Adaptation
- E-React: Towards Emotionally Controlled Synthesis of Human Reactions
- UGD-IML: A Unified Generative Diffusion-based Framework for Constrained and Unconstrained Image Manipulation Localization
- MCA: 2D-3D Retrieval with Noisy Labels via Multi-level Adaptive Correction and Alignment
- GMF-Drive: Gated Mamba Fusion with Spatial-Aware BEV Representation for End-to-End Autonomous Driving
- SynSeg: Feature Synergy for Multi-Category Contrastive Learning in Open-Vocabulary Semantic Segmentation
- Learning Representations of Satellite Images with Evaluations on Synoptic Weather Events
- SC-Captioner: Improving Image Captioning with Self-Correction by Reinforcement Learning
- SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures
- DiffCap: Diffusion-based Real-time Human Motion Capture using Sparse IMUs and a Monocular Camera
- SDEval: Safety Dynamic Evaluation for Multimodal Large Language Models
- Text-guided Visual Prompt DINO for Generic Segmentation
- DSConv: Dynamic Splitting Convolution for Pansharpening
- VISTAR: A User-Centric and Role-Driven Benchmark for Text-to-Image Evaluation
- An Interpretable Multi-Plane Fusion Framework With Kolmogorov-Arnold Network Guided Attention Enhancement for Alzheimer's Disease Diagnosis
- Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment
- Graph-based Robot Localization Using a Graph Neural Network with a Floor Camera and a Feature Rich Industrial Floor
- MA-CBP: A Criminal Behavior Prediction Framework Based on Multi-Agent Asynchronous Collaboration
- A Semantic Segmentation Algorithm for Pleural Effusion Based on DBIF-AUNet
- AnomalyMoE: Towards a Language-free Generalist Model for Unified Visual Anomaly Detection
- PA-HOI: A Physics-Aware Human and Object Interaction Dataset
- Interpretable Rheumatoid Arthritis Scoring via Anatomy-aware Multiple Instance Learning
- TEFormer: Texture-Aware and Edge-Guided Transformer for Semantic Segmentation of Urban Remote Sensing Images
- Depth Jitter: Seeing through the Depth
- Towards Unified Image Deblurring using a Mixture-of-Experts Decoder
- Deepfake Detection that Generalizes Across Benchmarks
- FedX: Explanation-Guided Pruning for Communication-Efficient Federated Learning in Remote Sensing
- XAG-Net: A Cross-Slice Attention and Skip Gating Network for 2.5D Femur MRI Segmentation
- Uncertainty-quantified Rollout Policy Adaptation for Unlabelled Cross-domain Temporal Grounding
- Can Diffusion Models Bridge the Domain Gap in Cardiac MR Imaging?
- ViPro-2: Unsupervised State Estimation via Integrated Dynamics for Guiding Video Prediction
- Street View Sociability: Interpretable Analysis of Urban Social Behavior Across 15 Cities
- Aligning Effective Tokens with Video Anomaly in Large Language Models
- An Implementation of Two-Phase Image Segmentation using the Split Bregman Method
- Text as Any-Modality for Zero-Shot Classification by Consistent Prompt Tuning
- FVGen: Accelerating Novel-View Synthesis with Adversarial Video Diffusion Distillation
- Feature-Space Oversampling for Addressing Class Imbalance in SAR Ship Classification
- MotionSwap
- LightSwitch: Multi-view Relighting with Material-guided Diffusion
- Neural Field-Based 3D Surface Reconstruction of Microstructures from Multi-Detector Signals in Scanning Electron Microscopy
- Universally Unfiltered and Unseen: Input-Agnostic Multimodal Jailbreaks against Text-to-Image Model Safeguards
- KnapFormer: An Online Load Balancer for Efficient Diffusion Transformers Training
- Transformer-Based Explainable Deep Learning for Breast Cancer Detection in Mammography: The MammoFormer Framework
- Clinically-guided Data Synthesis for Laryngeal Lesion Detection
- Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model
- Anti-Tamper Protection for Unauthorized Individual Image Generation
- CPT-Interp: Continuous sPatial and Temporal Motion Modeling for 4D Medical Image Interpolation
- A Calibration Tool for Refractive Underwater Vision
- Hybrid-TTA: Continual Test-time Adaptation via Dynamic Domain Shift Detection
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence
- MBA-SLAM: Motion Blur Aware Gaussian Splatting SLAM
- SAR Strikes Back: A New Hope for RSVQA
- Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation
- MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation
- Embodied Intelligence for 3D Understanding: A Survey on 3D Scene Question Answering
- Generative Video Bi-flow
- COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation
- Can Test-Time Scaling Improve World Foundation Model?
- Two-stage deep learning framework for the restoration of incomplete-ring PET images
- Event2Vec: Processing neuromorphic events directly by representations in vector space
- Dome-DETR: DETR with Density-Oriented Feature-Query Manipulation for Efficient Tiny Object Detection
- Spectrum Projection Score: Aligning Retrieved Summaries with Reader Models in Retrieval-Augmented Generation
- Adversarial Topic-aware Prompt-tuning for Cross-topic Automated Essay Scoring
- ConlangCrafter: Constructing Languages with a Multi-Hop LLM Pipeline
- Few-Shot Prompting for Extractive Quranic QA with Instruction-Tuned LLMs
- You Don't Need Pre-built Graphs for RAG: Retrieval Augmented Generation with Adaptive Reasoning Structures
- AURA: Affordance-Understanding and Risk-aware Alignment Technique for Large Language Models
- Scaling Personality Control in LLMs with Big Five Scaler Prompts
- Semantic and Structural Analysis of Implicit Biases in Large Language Models: An Interpretable Approach
- Pragmatics beyond humans: meaning, communication, and LLMs
- Comparing Knowledge Injection Methods for LLMs in a Low-Resource Regime
- DKG-LLM: A Framework for Medical Diagnosis and Personalized Treatment Recommendations via Dynamic Knowledge Graph and Large Language Model Integration
- Beyond Uniform Criteria: Scenario-Adaptive Multi-Dimensional Jailbreak Evaluation
- EICAP: Deep Dive in Assessment and Enhancement of Large Language Models in Emotional Intelligence through Multi-Turn Conversations
- Matrix-Driven Instant Review: Confident Detection and Reconstruction of LLM Plagiarism on PC
- Cyberbullying Detection via Aggression-Enhanced Prompting
- Evaluating Style-Personalized Text Generation: Challenges and Directions
- LLMs vs. Chinese Anime Enthusiasts: A Comparative Study on Emotionally Supportive Role-Playing
- Quantifying Conversation Drift in MCP via Latent Polytope
- SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning
- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
- HapticLLaMA: A Multimodal Sensory Language Model for Haptic Captioning
- DINA: A Dual Defense Framework Against Internal Noise and External Attacks in Natural Language Processing
- Basic interactive algorithms: Preview
- NanoCodec: Towards High-Quality Ultra Fast Speech LLM Inference
- Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System
- Effective Training Data Synthesis for Improving MLLM Chart Understanding
- Benchmarking LLMs on the Semantic Overlap Summarization Task
- Towards Pareto Optimal Throughput in Small Language Model Serving
- Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework
- Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions
- Structural Embedding Projection for Contextual Large Language Model Inference
- Context-Preserving Tensorial Reconfiguration in Large Language Model Training
- Contextual Morphogenesis in Large Language Models: A Novel Approach to Self-Organizing Token Representations
- Context-Aware Hierarchical Merging for Long Document Summarization
- Gradient-Regularized Latent Space Modulation in Large Language Models for Structured Contextual Synthesis
- Latent Structure Modulation in Large Language Models Through Stochastic Concept Embedding Transitions
- Structural Perturbation in Large Language Model Representations through Recursive Symbolic Regeneration
- Structural Reformation of Large Language Model Neuron Encapsulation for Divergent Information Aggregation
- Structured Convergence in Large Language Model Representations via Hierarchical Latent Space Folding
- Statistical Coherence Alignment for Large Language Model Representation Learning Through Tensor Field Convergence
- Exploring Contextual Flux in Large Language Models: A Novel Approach to Self-Modulating Semantic Networks
- One ruler to measure them all: Benchmarking multilingual long-context language models
- OpenCodeReasoning: Advancing Data Distillation for Competitive Coding
- Single-Pass Document Scanning for Question Answering
- Not All Data Are Unlearned Equally
- EvidenceBench: A Benchmark for Extracting Evidence from Biomedical Papers
- Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance
- The Devil Is in the Word Alignment Details: On Translation-Based Cross-Lingual Transfer for Token Classification Tasks
- Automated Privacy Information Annotation in Large Language Model Interactions
- DrVoice: Parallel Speech-Text Voice Conversation Model via Dual-Resolution Speech Representations
- AALC: Large Language Model Efficient Reasoning via Adaptive Accuracy-Length Control
- Integrating large language models and active inference to understand eye movements in reading and dyslexia
- Reducibility among NP-Hard graph problems and boundary classes
- OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs
- From Autonomy to Agency: Agentic Vehicles for Human-Centered Mobility Systems
- Nyay-Darpan: Enhancing Decision Making Through Summarization and Case Retrieval for Consumer Law in India
- Generalized Few-Shot Out-of-Distribution Detection
- Improving Masked Style Transfer using Blended Partial Convolution
- MAISI-v2: Accelerated 3D High-Resolution Medical Image Synthesis with Rectified Flow and Region-specific Contrastive Loss
- Optimization-Free Style Transfer for 3D Gaussian Splats
- MZEN: Multi-Zoom Enhanced NeRF for 3-D Reconstruction with Unknown Camera Poses
- TSMS-SAM2: Multi-scale Temporal Sampling Augmentation and Memory-Splitting Pruning for Promptable Video Object Segmentation and Tracking in Surgical Scenarios
- Temporal Cluster Assignment for Efficient Real-Time Video Segmentation
- VISTA: Vision-Language Imitation of Situational Thinking and Attention for Human-Like Driver Focus in Dynamic Environments
- Multi-view Gaze Target Estimation
- ETTA: Efficient Test-Time Adaptation for Vision-Language Models through Dynamic Embedding Updates
- HOLODECK 2.0: Vision-Language-Guided 3D World Generation with Editing
- Robust Image Stitching with Optimal Plane
- Neural Field Representations of Mobile Computational Photography
- PEACH: A sentence-aligned Parallel English-Arabic Corpus for Healthcare
- Guardians and Offenders: A Survey on Harmful Content Generation and Safety Mitigation
- FineDialFact: A benchmark for Fine-grained Dialogue Fact Verification
- Human-like fleeting memory improves language learning but impairs reading time prediction in transformer language models
- "Mirror" Language AI Models of Depression are Criterion-Contaminated
- Discovering Properties of Inflectional Morphology in Neural Emergent Communication
- CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction
- WildSAT: Learning Satellite Image Representations from Wildlife Observations
- EVA-S2PLoR: Decentralized Secure 2-party Logistic Regression with A Subtly Hadamard Product Protocol (Full Version)
- Conditional Diffusion Models are Medical Image Classifiers that Provide Explainability and Uncertainty for Free
- Building Age Estimation: A New Multi-Modal Benchmark Dataset and Community Challenge
- Rank1: Test-Time Compute for Reranking in Information Retrieval
- CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation
- MAATS: A Multi-Agent Automated Translation System Based on MQM Evaluation
- Decompositional Reasoning for Graph Retrieval with Large Language Models
- Measurement as Bricolage: Examining How Data Scientists Construct Target Variables for Predictive Modeling Tasks
- The Fourth State: Signed-Zero Ternary for Stable LLM Quantization (and More)
- Dual Signal Decomposition of Stochastic Time Series
- Fast, Convex and Conditioned Network for Multi-Fidelity Vectors and Stiff Univariate Differential Equations
- Mitigating Think-Answer Mismatch in LLM Reasoning Through Noise-Aware Advantage Reweighting
- LinguaFluid: Language Guided Fluid Control via Semantic Rewards in Reinforcement Learning
- Parameter-free Optimal Rates for Nonlinear Semi-Norm Contractions with Applications to $Q$-Learning
- Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal
- Optimizing Prompt Sequences using Monte Carlo Tree Search for LLM-Based Optimization
- Stepwise Fine and Gray: Subject-Specific Variable Selection Shows When Hemodynamic Data Improves Prognostication of Comatose Post-Cardiac Arrest Patients
- Recurrent Deep Differentiable Logic Gate Networks
- Improving Diagnostic Accuracy for Oral Cancer with inpainting Synthesis Lesions Generated Using Diffusion Models
- SCAR: State-Space Compression for AI-Driven Resource Management in 6G-Enabled Vehicular Infotainment Systems
- Near-Optimal Regret for Efficient Stochastic Combinatorial Semi-Bandits
- Multi-Omics Analysis for Cancer Subtype Inference via Unrolling Graph Smoothness Priors
- A Study on Regularization-Based Continual Learning Methods for Indic ASR
- Low-Bit Data Processing Using Multiple-Output Spiking Neurons with Non-linear Reset Feedback
- Introducing Fractional Classification Loss for Robust Learning with Noisy Labels
- Geometric-k-means: A Bound Free Approach to Fast and Eco-Friendly k-means
- A New Lens on Homelessness: Daily Tent Monitoring with 311 Calls and Street Images
- Sample-efficient LLM Optimization with Reset Replay
- LLM Unlearning using Gradient Ratio-Based Influence Estimation and Noise Injection
- Indian Legal NLP Benchmarks: A Survey
- Moment Estimate and Variational Approach for Learning Generalized Diffusion with Non-gradient Structures
- AI Guided Accelerator For Search Experience
- Random Walk Learning and the Pac-Man Attack
- Domain-Specific Fine-Tuning and Prompt-Based Learning: A Comparative Study for developing Natural Language-Based BIM Information Retrieval Systems
- MM-FusionNet: Context-Aware Dynamic Fusion for Multi-modal Fake News Detection with Large Vision-Language Models
- Boosting Adversarial Transferability via Residual Perturbation Attack
- Leveraging large language models for SQL behavior-based database intrusion detection
- MambaITD: An Efficient Cross-Modal Mamba Network for Insider Threat Detection
- G-UBS: Towards Robust Understanding of Implicit Feedback via Group-Aware User Behavior Simulation
- Reduction Techniques for Survival Analysis
- Detecting Model Misspecification in Cosmology with Scale-Dependent Normalizing Flows
- Evaluating Universal Machine Learning Force Fields Against Experimental Measurements
- Stochastic Trace Optimization of Parameter Dependent Matrices Based on Statistical Learning Theory
- Stochastic Bandits for Crowdsourcing and Multi-Platform Autobidding
- Training chord recognition models on artificially generated audio
- Hybrid Physics-Machine Learning Models for Quantitative Electron Diffraction Refinements
- Enhancing Construction Site Analysis and Understanding with 3D Segmentation
- Position: Intelligent Coding Systems Should Write Programs with Justifications
- Efficient Knowledge Probing of Large Language Models by Adapting Pre-trained Embeddings
- Data-Driven Density Steering via the Gromov-Wasserstein Optimal Transport Distance
- AGI for the Earth, the path, possibilities and how to evaluate intelligence of models that work with Earth Observation Data?
- Lightweight Auto-bidding based on Traffic Prediction in Live Advertising
- Adaptive Backtracking for Privacy Protection in Large Language Models
- Ensemble-Based Graph Representation of fMRI Data for Cognitive Brain State Classification
- IOCC: Aligning Semantic and Cluster Centers for Few-shot Short Text Clustering
- Enhancing the Scalability of Classical Surrogates for Real-World Quantum Machine Learning Applications
- Large Language Model Data Generation for Enhanced Intent Recognition in German Speech
- EmoAugNet: A Signal-Augmented Hybrid CNN-LSTM Framework for Speech Emotion Recognition
- Decorrelated feature importance from local sample weighting
- DP-SPRT: Differentially Private Sequential Probability Ratio Tests
- Tree-Based Deep Learning for Ranking Symbolic Integration Algorithms
- Blockchain-Enabled Federated Learning
- eSASRec: Enhancing Transformer-based Recommendations in a Modular Fashion
- TRUST: Leveraging Text Robustness for Unsupervised Domain Adaptation
- Maximum Impact with Fewer Features: Efficient Feature Selection for Cold-Start Recommenders through Collaborative Importance Weighting
- Multivariate Fields of Experts
- Reinforcement Learning Based Sensor Optimization for Bio-markers
- Bayesian Gaussian Process ODEs via Double Normalizing Flows
- Decision-focused predictions via pessimistic bilevel optimization: complexity and algorithms
- Soft Dice Confidence: A Near-Optimal Confidence Estimator for Selective Prediction in Semantic Segmentation
- Data Collaboration Analysis with Orthonormal Basis Selection and Alignment
- Position: Lifetime tuning is incompatible with continual reinforcement learning
- Formal Local Implication Between Two Neural Networks
- Adaptive Collocation Point Strategies For Physics Informed Neural Networks via the QR Discrete Empirical Interpolation Method
- Rethinking the Bias of Foundation Model under Long-tailed Distribution
- The Ensemble Kalman Update is an Empirical Matheron Update
- Sample-Efficient Reinforcement Learning from Human Feedback via Information-Directed Sampling
- Navigating Demand Uncertainty in Container Shipping: Deep Reinforcement Learning for Enabling Adaptive and Feasible Master Stowage Planning
- Global graph features unveiled by unsupervised geometric deep learning
- Learning to Match Unpaired Data with Minimum Entropy Coupling
- Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining
- On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
- Khan-GCL: Kolmogorov-Arnold Network Based Graph Contrastive Learning with Hard Negatives
- A Graph Sufficiency Perspective for Neural Networks
- MESAHA-Net: Multi-Encoders based Self-Adaptive Hard Attention Network with Maximum Intensity Projections for Lung Nodule Segmentation in CT Scan
- Generalization Bound for Diffusion Models using Random Features
- Towards More Realistic Extraction Attacks: An Adversarial Perspective
- Modeling Spatial Extremal Dependence of Precipitation Using Distributional Neural Networks
- Optimal sampling for least-squares approximation
- Optimal Linear Baseline Models for Scientific Machine Learning
- An Effective Approach for Node Classification in Textual Graphs
- A Markov Decision Process Framework for Early Maneuver Decisions in Satellite Collision Avoidance
- Graph Federated Learning for Personalized Privacy Recommendation
- Reparameterization Proximal Policy Optimization
- InfoCausalQA: Can Models Perform Non-explicit Causal Reasoning Based on Infographic?
- Membership Inference Attack with Partial Features
- In-Training Defenses against Emergent Misalignment in Language Models
- Synthetic Data Generation and Differential Privacy using Tensor Networks' Matrix Product States (MPS)
- SIFThinker: Spatially-Aware Image Focus for Visual Reasoning
- Numerical Considerations in Weighted Model Counting
- OM2P: Offline Multi-Agent Mean-Flow Policy
- Advanced Deep Learning Techniques for Accurate Lung Cancer Detection and Classification
- FedMeNF: Privacy-Preserving Federated Meta-Learning for Neural Fields
- Mixture of Experts Guided by Gaussian Splatters Matters: A new Approach to Weakly-Supervised Video Anomaly Detection
- Unsupervised Partner Design Enables Robust Ad-hoc Teamwork
- On Approximate MMS Allocations on Restricted Graph Classes
- Harnessing Adaptive Topology Representations for Zero-Shot Graph Question Answering
- Structural Equation-VAE: Disentangled Latent Representations for Tabular Data
- Are you In or Out (of gallery)? Wisdom from the Same-Identity Crowd
- Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts
- ActivityDiff: A diffusion model with Positive and Negative Activity Guidance for De Novo Drug Design
- SpeakerLM: End-to-End Versatile Speaker Diarization and Recognition with Multimodal Large Language Models
- End-to-End Text-to-SQL with Dataset Selection: Leveraging LLMs for Adaptive Query Generation
- Identity Increases Stability in Neural Cellular Automata
- Robust Target Speaker Diarization and Separation via Augmented Speaker Embedding Sampling
- A Systematic Literature Review of Retrieval-Augmented Generation: Techniques, Metrics, and Challenges
- A Classification-Aware Super-Resolution Framework for Ship Targets in SAR Imagery
- Dimensional Characterization and Pathway Modeling for Catastrophic AI Risks
- Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation
- SPARSE Data, Rich Results: Few-Shot Semi-Supervised Learning via Class-Conditioned Image Translation
- Memp: Exploring Agent Procedural Memory
- CLIPin: A Non-contrastive Plug-in to CLIP for Multimodal Semantic Alignment
- Learning the Topic, Not the Language: How LLMs Classify Online Immigration Discourse Across Languages
- Echoes of Automation: The Increasing Use of LLMs in Newsmaking
- Text Embedded Swin-UMamba for DeepLesion Segmentation
- ScamAgents: How AI Agents Can Simulate Human-Level Scam Calls
- Intuition emerges in Maximum Caliber models at criticality
- Post-training for Efficient Communication via Convention Formation
- WGAST: Weakly-Supervised Generative Network for Daily 10 m Land Surface Temperature Estimation via Spatio-Temporal Fusion
- From Next-Token to Mathematics: The Learning Dynamics of Mathematical Reasoning in Language Models
- Are Your LLMs Capable of Stable Reasoning?
- Probabilistic Foundations for Metacognition via Hybrid-AI
- Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
- Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding
- DONOD: Efficient and Generalizable Instruction Fine-Tuning for LLMs via Model-Intrinsic Dataset Pruning
- Contemplative Artificial Intelligence
- Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?
- SuperRL: Reinforcement Learning with Supervision to Boost Language Model Reasoning
- HASD: Hierarchical Adaption for pathology Slide-level Domain-shift
- Efficient Knowledge Graph Construction and Retrieval from Unstructured Text for Large-Scale RAG Systems
- Benchmarking Deception Probes via Black-to-White Performance Boosts
- From research to clinic: Accelerating the translation of clinical decision support systems by making synthetic data interoperable
- A Markov Random Field model for Hypergraph-based Machine Learning
- Learning to Initialize Trajectory Optimization for Vision-Based Autonomous Flight in Unknown Environments
- Improved DDIM Sampling with Moment Matching Gaussian Mixtures
- Entropy Causal Graphs for Multivariate Time Series Anomaly Detection
- INS-MMBench: A Comprehensive Benchmark for Evaluating LVLMs' Performance in Insurance
- Reorganizing attention-space geometry with expressive attention
- Spatio-Temporal Partial Sensing Forecast for Long-term Traffic
- ATM: Improving Model Merging by Alternating Tuning and Merging
- Reconsidering the Performance of GAE in Link Prediction
- CodeXEmbed: A Generalist Embedding Model Family for Multilingual and Multi-task Code Retrieval
- DeepMDV: Global Spatial Matching for Multi-depot Vehicle Routing Problems
- TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models
- LeakAgent: RL-based Red-teaming Agent for LLM Privacy Leakage
- The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs
- Neural Contextual Reinforcement Framework for Logical Structure Language Generation
- Unveiling Zero-Space Detection: A Novel Framework for Autonomous Ransomware Identification in High-Velocity Environments
- Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration
- Autonomous Structural Memory Manipulation for Large Language Models Using Hierarchical Embedding Augmentation
- Systemizing Multiplicity: The Curious Case of Arbitrariness in Machine Learning
- Hierarchical Pattern Decryption Methodology for Ransomware Detection Using Probabilistic Cryptographic Footprints
- FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint
- Contextual Reinforcement in Multimodal Token Compression for Large Language Models
- Algorithmic Segmentation and Behavioral Profiling for Ransomware Detection Using Temporal-Correlation Graphs
- Contextually Entangled Gradient Mapping for Optimized LLM Comprehension
- DeToNATION: Decoupled Torch Network-Aware Training on Interlinked Online Nodes
- CAST: Cross Attention based multimodal fusion of Structure and Text for materials property prediction
- ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation
- Exploring Synaptic Resonance in Large Language Models: A Novel Approach to Contextual Memory Integration
- Topic Over Source: The Key to Effective Data Mixing for Language Models Pre-training
- ACTIVA: Amortized Causal Effect Estimation via Transformer-based Variational Autoencoder
- Code-as-Symbolic-Planner: Foundation Model-Based Robot Planning via Symbolic Code Generation
- Training Plug-n-Play Knowledge Modules with Deep Context Distillation
- CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis
- Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality
- M$^2$IV: Towards Efficient and Fine-grained Multimodal In-Context Learning via Representation Engineering
- Self-Steering Language Models
- Layers at Similar Depths Generate Similar Activations Across LLM Architectures
- AI-Assisted Conversational Interviewing: Effects on Data Quality and User Experience
- iTFKAN: Interpretable Time Series Forecasting with Kolmogorov-Arnold Network
- SMOGAN: Synthetic Minority Oversampling with GAN Refinement for Imbalanced Regression
- Vision-Language Model-Based Semantic-Guided Imaging Biomarker for Lung Nodule Malignancy Prediction
- Solving Copyright Infringement on Short Video Platforms: Novel Datasets and an Audio Restoration Deep Learning Pipeline
- No Query, No Access
- Are Large Language Models Robust in Understanding Code Against Semantics-Preserving Mutations?
- LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation
- CUB: Benchmarking Context Utilisation Techniques for Language Models
- LLM-Meta-SR: In-Context Learning for Evolving Selection Operators in Symbolic Regression
- Fusing Cross-Domain Knowledge from Multimodal Data to Solve Problems in the Physical World
- Survey on the Evaluation of Generative Models in Music
- AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models
- No Universal Prompt: Unifying Reasoning through Adaptive Prompting for Temporal Table Reasoning
- Scientifically-Interpretable Reasoning Network (ScIReN): Discovering Hidden Relationships in the Carbon Cycle and Beyond
- Time-Prompt: Integrated Heterogeneous Prompts for Unlocking LLMs in Time Series Forecasting
- Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective
- Quality over Quantity: An Effective Large-Scale Data Reduction Strategy Based on Pointwise V-Information
- Humans overrely on overconfident language models, across languages
- Diagrams-to-Dynamics (D2D): Exploring Causal Loop Diagram Leverage Points under Uncertainty
- A Graph Neural Network Approach for Mapping the Conceptual Structure and Inter-Branch Connectivity of Physics
- Machine Learning-Based Nonlinear Nudging for Chaotic Dynamical Systems
- InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization
- A Framework for Inherently Safer AGI through Language-Mediated Active Inference
- Whither symbols in the era of advanced neural networks?
- Holistic Explainable AI (H-XAI): Extending Transparency Beyond Developers in AI-Driven Decision Making
- Safety of Embodied Navigation: A Survey
- Planning Agents on an Ego-Trip: Leveraging Hybrid Ego-Graph Ensembles for Improved Tool Retrieval in Enterprise Task Planning
- Mediator-Guided Multi-Agent Collaboration among Open-Source Models for Medical Decision-Making
- Society of Mind Meets Real-Time Strategy: A Hierarchical Multi-Agent Framework for Strategic Reasoning
- LLMs for Resource Allocation: A Participatory Budgeting Approach to Inferring Preferences
- Don't Forget Imagination!
- A Generic Complete Anytime Beam Search for Optimal Decision Tree
- ME$^3$-BEV: Mamba-Enhanced Deep Reinforcement Learning for End-to-End Autonomous Driving with BEV-Perception
- Aggregate-Combine-Readout GNNs Are More Expressive Than Logic C2
- PanelTR: Zero-Shot Table Reasoning Framework Through Multi-Agent Scientific Discussion
- SKATE, a Scalable Tournament Eval: Weaker LLMs differentiate between stronger ones using verifiable challenges
- Study of Robust Features in Formulating Guidance for Heuristic Algorithms for Solving the Vehicle Routing Problem
- Retrieval Augmented Large Language Model System for Comprehensive Drug Contraindications
- Overconfidence in LLM-as-a-Judge: Diagnosis and Confidence-Driven Solution
- GeoLaux: A Benchmark for Evaluating MLLMs' Geometry Performance on Long-Step Problems Requiring Auxiliary Lines
- Learning Logical Rules using Minimum Message Length
- Symmetry breaking for inductive logic programming
- LLM Robustness Leaderboard v1 -- Technical report
- A "good regulator theorem" for embodied agents
- AntiCheatPT: A Transformer-Based Approach to Cheat Detection in Competitive Computer Games
- From Explainable to Explanatory Artificial Intelligence: Toward a New Paradigm for Human-Centered Explanations through Generative AI
- Automated Creation of the Legal Knowledge Graph Addressing Legislation on Violence Against Women: Resource, Methodology and Lessons Learned
- The Fair Game: Auditing & Debiasing AI Algorithms Over Time
- What Voting Rules Actually Do: A Data-Driven Analysis of Multi-Winner Voting
- Epidemic Control on a Large-Scale-Agent-Based Epidemiology Model using Deep Deterministic Policy Gradient
- SHACL Validation in the Presence of Ontologies: Semantics and Rewriting Techniques
- Automated Visualization Makeovers with LLMs
- Request-Only Optimization for Recommendation Systems
- Query-Aware Graph Neural Networks for Enhanced Retrieval-Augmented Generation
- AquiLLM: a RAG Tool for Capturing Tacit Knowledge in Research Groups
- OmniBench-RAG: A Multi-Domain Evaluation Platform for Retrieval-Augmented Generation Tools
- Lessons from A Large Language Model-based Outdoor Trail Recommendation Chatbot with Retrieval Augmented Generation
- Modeling Interactive Narrative Systems: A Formal Approach
- Comparison of Information Retrieval Techniques Applied to IT Support Tickets
- Beyond Single Labels: Improving Conversational Recommendation through LLM-Powered Data Augmentation
- Open-Source Agentic Hybrid RAG Framework for Scientific Literature Review
- Zero-Shot Retrieval for Scalable Visual Search in a Two-Sided Marketplace
- From Static to Dynamic: A Streaming RAG Approach to Real-time Knowledge Base
- Enhancing Retrieval-Augmented Generation for Electric Power Industry Customer Support
- HySemRAG: A Hybrid Semantic Retrieval-Augmented Generation Framework for Automated Literature Synthesis and Methodological Gap Analysis
- ITDR: An Instruction Tuning Dataset for Enhancing Large Language Models in Recommendations
- A Survey of LLM-based Deep Search Agents: Paradigm, Optimization, Evaluation, and Challenges
- Fine-Tuning Vision-Language Models for Markdown Conversion of Financial Tables in Malaysian Audited Financial Reports
- Can LLMs effectively provide game-theoretic-based scenarios for cybersecurity?
- LMAR: Language Model Augmented Retriever for Domain-specific Knowledge Indexing
- Breaking the Top-$K$ Barrier: Advancing Top-$K$ Ranking Metrics Optimization in Recommender Systems
- Towards Effective Offensive Security LLM Agents: Hyperparameter Tuning, LLM as a Judge, and a Lightweight CTF Benchmark
- Principle-Guided Verilog Optimization: IP-Safe Knowledge Transfer via Local-Cloud Collaboration
- Adversarial Attacks on Reinforcement Learning-based Medical Questionnaire Systems: Input-level Perturbation Strategies and Medical Constraint Validation
- Are All Genders Equal in the Eyes of Algorithms? -- Analysing Search and Retrieval Algorithms for Algorithmic Gender Fairness
- Selection-Based Vulnerabilities: Clean-Label Backdoor Attacks in Active Learning
- Risk Analysis Techniques for Governed LLM-based Multi-Agent Systems
- Empirical Evaluation of AI-Assisted Software Package Selection: A Knowledge Graph Approach
- DMFI: Dual-Modality Fine-Tuning and Inference Framework for LLM-Based Insider Threat Detection
- Log2Sig: Frequency-Aware Insider Threat Detection via Multivariate Behavioral Signal Decomposition
- Multi-Faceted Large Embedding Tables for Pinterest Ads Ranking
- Semantic Reasoning Meets Numerical Precision: An LLM-Powered Multi-Agent System for Power Grid Control
- A Physiologically-Constrained Neural Network Digital Twin Framework for Replicating Glucose Dynamics in Type 1 Diabetes
- Klear-CodeTest: Scalable Test Case Generation for Code Reinforcement Learning
- CLAPP: The CLASS LLM Agent for Pair Programming
- UnGuide: Learning to Forget with LoRA-Guided Diffusion Models
- Few-Shot Deployment of Pretrained MRI Transformers in Brain Imaging Tasks
- From Imperfect Signals to Trustworthy Structure: Confidence-Aware Inference from Heterogeneous and Reliability-Varying Utility Data
- AI-Guided Exploration of Large-Scale Codebases
- Integrating Vision Foundation Models with Reinforcement Learning for Enhanced Object Interaction
- Towards Transparent Ethical AI: A Roadmap for Trustworthy Robotic Systems
- Do Machines Think Emotionally? Cognitive Appraisal Analysis of Large Language Models
- Do Ethical AI Principles Matter to Users? A Large-Scale Analysis of User Sentiment and Satisfaction
- Enhancing Software Vulnerability Detection Through Adaptive Test Input Generation Using Genetic Algorithm
- REFS: Robust EEG feature selection with missing multi-dimensional annotation for emotion recognition
- ASLSL: Adaptive shared latent structure learning with incomplete multi-modal physiological data for multi-dimensional emotional feature selection
- Prosocial Behavior Detection in Player Game Chat: From Aligning Human-AI Definitions to Efficient Annotation at Scale
- A 3DGS-Diffusion Self-Supervised Framework for Normal Estimation from a Single Image
- Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents
- Multi-Armed Bandits-Based Optimization of Decision Trees
- Mildly Conservative Regularized Evaluation for Offline Reinforcement Learning
- Impact-driven Context Filtering For Cross-file Code Completion
- DAFMSVC: One-Shot Singing Voice Conversion with Dual Attention Mechanism and Flow Matching
- Learning by Teaching: Engaging Students as Instructors of Large Language Models in Computer Science Education
- ETA: Energy-based Test-time Adaptation for Depth Completion
- ECMF: Enhanced Cross-Modal Fusion for Multimodal Emotion Recognition in MER-SEMI Challenge
- Hand by Hand: LLM Driving EMS Assistant for Operational Skill Learning
- Crisp Attention: Regularizing Transformers via Structured Sparsity
- Improved Sub-Visible Particle Classification in Flow Imaging Microscopy via Generative AI-Based Image Synthesis
- Temporal Self-Rewarding Language Models: Decoupling Chosen-Rejected via Past-Future
- Adaptive Heterogeneous Graph Neural Networks: Bridging Heterophily and Heterogeneity
- Fourier-VLM: Compressing Vision Tokens in the Frequency Domain for Large Vision-Language Models
- DP-LLM: Runtime Model Adaptation with Dynamic Layer-wise Precision Assignment
- EvolvR: Self-Evolving Pairwise Reasoning for Story Evaluation to Enhance Generation
- ThematicPlane: Bridging Tacit User Intent and Latent Spaces for Image Generation
- Architecture-Aware Generalization Bounds for Temporal Networks: Theory and Fair Comparison Methodology
- Can Large Models Fool the Eye? A New Turing Test for Biological Animation
- Towards MR-Based Trochleoplasty Planning
- Bounding Distributional Shifts in World Modeling through Novelty Detection
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
- Mask & Match: Learning to Recognize Handwritten Math with Self-Supervised Attention
- GCHR: Goal-Conditioned Hindsight Regularization for Sample-Efficient Reinforcement Learning
- FMCE-Net++: Feature Map Convergence Evaluation and Training
- LLM Serving Optimization with Variable Prefill and Decode Lengths
- Less is More: Selective Reflection for Compatible and Efficient Knowledge Distillation in Large Language Models
- Roll Your Eyes: Gaze Redirection via Explicit 3D Eyeball Rotation
- Semantic Item Graph Enhancement for Multimodal Recommendation
- One Size Does Not Fit All: A Distribution-Aware Sparsification for More Precise Model Merging
- UR$^2$: Unify RAG and Reasoning through Reinforcement Learning
- UW-3DGS: Underwater 3D Reconstruction with Physics-Aware Gaussian Splatting
- Synthetic Data-Driven Multi-Architecture Framework for Automated Polyp Segmentation Through Integrated Detection and Mask Generation
- Differentially Private Federated Clustering with Random Rebalancing
- Benchmarking Pretrained Molecular Embedding Models For Molecular Representation Learning
- LoRA in LoRA: Towards Parameter-Efficient Architecture Expansion for Continual Visual Instruction Tuning
- Classification is a RAG problem: A case study on hate speech detection
Research Sources: 488 | Generated: 8/25/2025