AI RESEARCH PAPERS & ACADEMIC SOURCES
- Nanopore sequencing of intact aminoacylated tRNAs
- What counts as plagiarism? AI-generated papers pose new risks
- The importance of negative training data for robust antibody binding prediction
- Electron-density-informed effective and reliable de novo molecular design and optimization with ED2Mol
- SciToolAgent: a knowledge-graph-driven scientific agent for multitool integration
- Training data composition determines machine learning generalization and biological rule discovery
- AI-based diagnosis of acute aortic syndrome from noncontrast CT
- A comprehensive deep learning approach to improve enchondroma detection on X-ray images
- Automatic detection of cognitive events using machine learning and understanding models’ interpretations of human cognition
- Diffusion-Driven High-Dimensional Variable Selection
- LEARNER: A Transfer Learning Method for Low-Rank Matrix Estimation
- RadGPT: Constructing 3D Image-Text Tumor Datasets
- Rapid Urban Visibility Hotspots: Quantifying Building Vertex Visibility from Connected Vehicle Trajectories using Spatial Indexing
- BRISC: Annotated Dataset for Brain Tumor Segmentation and Classification with Swin-HAFNet
- UltraDfeGAN: Detail-Enhancing Generative Adversarial Networks for High-Fidelity Functional Ultrasound Synthesis
- Colon Polyps Detection from Colonoscopy Images Using Deep Learning
- Benchmarking GPT-5 for Zero-Shot Multimodal Medical Reasoning in Radiology and Radiation Oncology
- PediDemi -- A Pediatric Demyelinating Lesion Segmentation Dataset
- InnerGS: Internal Scenes Rendering via Factorized 3D Gaussian Splatting
- Susceptibility Distortion Correction of Diffusion MRI with a single Phase-Encoding Direction
- Towards Understanding and Harnessing the Transferability of Prognostic Knowledge in Computational Pathology
- ROVER: Robust Loop Closure Verification with Trajectory Prior in Repetitive Environments
- State of Abdominal CT Datasets: A Critical Review of Bias, Clinical Relevance, and Real-world Applicability
- Model-based Multi-object Visual Tracking: Identification and Standard Model Limitations
- subCellSAM: Zero-Shot (Sub-)Cellular Segmentation for Hit Validation in Drug Discovery
- Deep Biomechanically-Guided Interpolation for Keypoint-Based Brain Shift Registration
- Sketch3DVE: Sketch-based 3D-Aware Scene Video Editing
- Is-NeRF: In-scattering Neural Radiance Field for Blurred Images
- Latent Interpolation Learning Using Diffusion Models for Cardiac Volume Reconstruction
- Multimodal Data Storage and Retrieval for Embodied AI: A Survey
- Learning to See Through Flare
- MMIS-Net for Retinal Fluid Segmentation and Detection
- Real-Time, Population-Based Reconstruction of 3D Bone Models via Very-Low-Dose Protocols
- Augmenting cobots for sheet-metal SMEs with 3D object recognition and localisation
- UNICON: UNIfied CONtinual Learning for Medical Foundational Models
- Advancing Toward Robust and Scalable Fingerprint Orientation Estimation: From Gradients to Deep Learning
- Diffusion Noise Feature: Accurate and Fast Generated Image Detection
- A global optimization SAR image segmentation model can be easily transformed to a general ROF denoising model
- SAR image segmentation algorithms based on I-divergence-TV model
- Active contours driven by local and global intensity fitting energy with application to SAR image segmentation and its fast solvers
- Rethinking Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising
- ContrastAlign: Toward Robust BEV Feature Alignment via Contrastive Learning for Multi-Modal 3D Object Detection
- HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model
- WHALES: A Multi-Agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving
- ResFlow: Fine-tuning Residual Optical Flow for Event-based High Temporal Resolution Motion Estimation
- Image Augmentation Agent for Weakly Supervised Semantic Segmentation
- MMHMER: Multi-viewer and Multi-task for Handwritten Mathematical Expression Recognition
- Towards Vision Zero: The TUM Traffic Accid3nD Dataset
- AutoComPose: Automatic Generation of Pose Transition Descriptions for Composed Pose Retrieval Using Multimodal LLMs
- Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction
- DNF-Avatar: Distilling Neural Fields for Real-time Animatable Avatar Relighting
- EmoSEM: Segment and Explain Emotion Stimuli in Visual Art
- Beyond the Horizon: Decoupling Multi-View UAV Action Recognition via Partial Order Transfer
- ReservoirTTA: Prolonged Test-time Adaptation for Evolving and Recurring Domains
- Boosting Adversarial Transferability for Hyperspectral Image Classification Using 3D Structure-invariant Transformation and Weighted Intermediate Feature Divergence
- MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation
- FreqDGT: Frequency-Adaptive Dynamic Graph Networks with Transformer for Cross-subject EEG Emotion Recognition
- Upsample What Matters: Region-Adaptive Latent Sampling for Accelerated Diffusion Transformers
- Stereo-based 3D Anomaly Object Detection for Autonomous Driving: A New Dataset and Baseline
- Regional quality estimation for echocardiography using deep learning
- SEA-LION: Southeast Asian Languages in One Network
- Crossing Borders Without Crossing Boundaries: How Sociolinguistic Awareness Can Optimize User Engagement with Localized Spanish AI Models Across Hispanophone Countries
- Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization
- Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration
- MedVisionLlama: Leveraging Pre-Trained Large Language Model Layers to Enhance Medical Image Segmentation
- YOLO11-CR: a Lightweight Convolution-and-Attention Framework for Accurate Fatigue Driving Detection
- DianJin-OCR-R1: Enhancing OCR Capabilities via a Reasoning-and-Tool Interleaved Vision-Language Model
- Exploration of Deep Learning Based Recognition for Urdu Text
- Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving
- Automated Assessment of Aesthetic Outcomes in Facial Plastic Surgery
- Applications of Small Language Models in Medical Imaging Classification with a Focus on Prompt Strategies
- AIM 2025 Rip Current Segmentation (RipSeg) Challenge Report
- EDTalk++: Full Disentanglement for Controllable Talking Head Synthesis
- Revisiting MLLM Token Technology through the Lens of Classical Visual Coding
- MINR: Efficient Implicit Neural Representations for Multi-Image Encoding
- Distribution-Aware Hadamard Quantization for Hardware-Efficient Implicit Neural Representations
- AIM 2025 challenge on Inverse Tone Mapping Report: Methods and Results
- Enhancing Robustness of Implicit Neural Representations Against Weight Perturbations
- FAMNet: Integrating 2D and 3D Features for Micro-expression Recognition via Multi-task Learning and Hierarchical Attention
- AdaptiveAE: An Adaptive Exposure Strategy for HDR Capturing in Dynamic Scenes
- Bridging the Gap: Doubles Badminton Analysis with Singles-Trained Models
- 2D Gaussians Meet Visual Tokenizer
- GazeProphet: Software-Only Gaze Prediction for VR Foveated Rendering
- A Lightweight Dual-Mode Optimization for Generative Face Video Coding
- Color Spike Data Generation via Bio-inspired Neuron-like Encoding with an Artificial Photoreceptor Layer
- DictAS: A Framework for Class-Generalizable Few-Shot Anomaly Segmentation via Dictionary Lookup
- Learnable SMPLify: A Neural Solution for Optimization-Free Human Pose Inverse Kinematics
- Generative Model-Based Feature Attention Module for Video Action Analysis
- Temporal-Conditional Referring Video Object Segmentation with Noise-Free Text-to-Video Diffusion Model
- Bridging Clear and Adverse Driving Conditions
- Towards Efficient Vision State Space Models via Token Merging
- Unleashing Semantic and Geometric Priors for 3D Scene Completion
- PersonaVlog: Personalized Multimodal Vlog Generation with Multi-Agent Collaboration and Iterative Self-Correction
- Two-Factor Authentication Smart Entryway Using Modified LBPH Algorithm
- TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
- RCGNet: RGB-based Category-Level 6D Object Pose Estimation with Geometric Guidance
- DiffIER: Optimizing Diffusion Models with Iterative Error Reduction
- OmniTry: Virtual Try-On Anything without Masks
- DeH4R: A Decoupled and Hybrid Method for Road Network Graph Extraction
- HumanPCR: Probing MLLM Capabilities in Diverse Human-Centric Scenes
- Diversity-enhanced Collaborative Mamba for Semi-supervised Medical Image Segmentation
- Hierarchical Vision-Language Retrieval of Educational Metaverse Content in Agriculture
- Enhancing Targeted Adversarial Attacks on Large Vision-Language Models through Intermediate Projector Guidance
- MR6D: Benchmarking 6D Pose Estimation for Mobile Robots
- Shape-from-Template with Generalised Camera
- VisionLaw: Inferring Interpretable Intrinsic Dynamics from Visual Observations via Bilevel Optimization
- Timestep-Compressed Attack on Spiking Neural Networks through Timestep-Level Backpropagation
- Self-Aware Adaptive Alignment: Enabling Accurate Perception for Intelligent Transportation Systems
- SAGA: Learning Signal-Aligned Distributions for Improved Text-to-Image Generation
- RED.AI Id-Pattern: First Results of Stone Deterioration Patterns with Multi-Agent Systems
- RICO: Two Realistic Benchmarks and an In-Depth Analysis for Incremental Learning in Object Detection
- In-hoc Concept Representations to Regularise Deep Learning in Medical Imaging
- Forecasting Smog Events Using ConvLSTM: A Spatio-Temporal Approach for Aerosol Index Prediction in South Asia
- SCRNet: Spatial-Channel Regulation Network for Medical Ultrasound Image Segmentation
- PhysGM: Large Physical Gaussian Model for Feed-Forward 4D Synthesis
- DIME-Net: A Dual-Illumination Adaptive Enhancement Network Based on Retinex and Mixture-of-Experts
- ViT-FIQA: Assessing Face Image Quality using Vision Transformers
- ROVR-Open-Dataset: A Large-Scale Depth Dataset for Autonomous Driving
- OmViD: Omni-supervised active learning for video action detection
- Physics-Based 3D Simulation for Synthetic Data Generation and Failure Analysis in Packaging Stability Assessment
- Self-Supervised Sparse Sensor Fusion for Long Range Perception
- ResPlan: A Large-Scale Vector-Graph Dataset of 17,000 Residential Floor Plans
- Online 3D Gaussian Splatting Modeling with Novel View Selection
- Backdooring Self-Supervised Contrastive Learning by Noisy Alignment
- InfiniteTalk: Audio-driven Video Generation for Sparse-Frame Video Dubbing
- Distilled-3DGS: Distilled 3D Gaussian Splatting
- Beyond Simple Edits: Composed Video Retrieval with Dense Modifications
- LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos
- Query Logs Analytics: A Systematic Literature Review
- BQA: Body Language Question Answering Dataset for Video Large Language Models
- Consolidating and Developing Benchmarking Datasets for the Nepali Natural Language Understanding Tasks
- Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain
- Universal Abstraction: Harnessing Frontier Models to Structure Real-World Data at Scale
- Basic Category Usage in Vision Language Models
- Fair Play in the Newsroom: Actor-Based Filtering Gender Discrimination in Text Corpora
- Stands to Reason: Investigating the Effect of Reasoning on Idiomaticity Detection
- MATA (māta): Mindful Assessment of the Telugu Abilities of Large Language Models
- AdaDocVQA: Adaptive Framework for Long Document Visual Question Answering in Low-Resource Settings
- CRISP: Persistent Concept Unlearning via Sparse Autoencoders
- EEG-MedRAG: Enhancing EEG-based Clinical Decision-Making via Hierarchical Hypergraph Retrieval-Augmented Generation
- Sycophancy under Pressure: Evaluating and Mitigating Sycophantic Bias via Adversarial Dialogues in Scientific QA
- MGT-Prism: Enhancing Domain Generalization for Machine-Generated Text Detection via Spectral Alignment
- Can Large Language Models (LLMs) Describe Pictures Like Children? A Comparative Corpus Study
- TracSum: A New Benchmark for Aspect-Based Summarization with Sentence-Level Traceability in Medical Domain
- Beyond Human Judgment: A Bayesian Evaluation of LLMs' Moral Values Understanding
- MME-SCI: A Comprehensive and Challenging Science Benchmark for Multimodal Large Language Models
- ReviewGraph: A Knowledge Graph Embedding Based Framework for Review Rating Prediction with Sentiment Features
- Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
- The Promise of Large Language Models in Digital Health: Evidence from Sentiment Analysis in Online Health Communities
- Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations
- Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging
- Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
- Finite Expression Method for Solving High-Dimensional Partial Differential Equations
- Active Learning of Mealy Machines with Timers
- Contrastive Learning on Multimodal Analysis of Electronic Health Records
- Robustly estimating heterogeneity in factorial data using Rashomon Partitions
- Disciplined Geodesically Convex Programming
- Unsupervised Anomaly Detection Using Diffusion Trend Analysis for Display Inspection
- Parallel Network Reconstruction with Multi-directional Regularization
- Development of Pre-Trained Transformer-based Models for the Nepali Language
- TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations
- Investigating the importance of county-level characteristics in opioid-related mortality across the United States
- Spatially-guided Temporal Aggregation for Robust Event-RGB Optical Flow Estimation
- Hybrid Machine Learning Model with a Constrained Action Space for Trajectory Prediction
- Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent
- Fact or Guesswork? Evaluating Large Language Models' Medical Knowledge with Structured One-Hop Judgments
- Rectifying Conformity Scores for Better Conditional Coverage
- MR-EEGWaveNet: Multiresolutional EEGWaveNet for Seizure Detection from Long EEG Recordings
- Cross-Modal Characterization of Thin Film MoS$_2$ Using Generative Models
- BLIPs: Bayesian Learned Interatomic Potentials
- Learning from Preferences and Mixed Demonstrations in General Settings
- FedChip: Federated LLM for Artificial Intelligence Accelerator Chip Design
- Sex-Specific Vascular Score: A Novel Perfusion Biomarker from Supervoxel Analysis of 3D pCASL MRI
- Modeling GRNs with a Probabilistic Categorical Framework
- The Course Difficulty Analysis Cookbook
- Structural Foundations for Leading Digit Laws: Beyond Probabilistic Mixtures
- Automated Cervical Cancer Detection through Visual Inspection with Acetic Acid in Resource-Poor Settings with Lightweight Deep Learning Models Deployed on an Android Device
- CLoE: Curriculum Learning on Endoscopic Images for Robust MES Classification
- DAASH: A Meta-Attack Framework for Synthesizing Effective and Stealthy Adversarial Examples
- Flow Matching-Based Generative Modeling for Efficient and Scalable Data Assimilation
- A Risk Manager for Intrusion Tolerant Systems: Enhancing HAL 9000 with New Scoring and Data Sources
- OrbitChain: Orchestrating In-orbit Real-time Analytics of Earth Observation Data
- Vision Transformers for Kidney Stone Image Classification: A Comparative Study with CNNs
- Multi-view Clustering via Bi-level Decoupling and Consistency Learning
- Saudi-Dialect-ALLaM: LoRA Fine-Tuning for Dialectal Arabic Generation
- Compressed Models are NOT Trust-equivalent to Their Large Counterparts
- Understanding Distribution Structure on Calibrated Recommendation Systems
- Towards safe control parameter tuning in distributed multi-agent systems
- ViExam: Are Vision Language Models Better than Humans on Vietnamese Multimodal Exam Questions?
- Know Me by My Pulse: Toward Practical Continuous Authentication on Wearable Devices via Wrist-Worn PPG
- Optimizing Region of Interest Selection for Effective Embedding in Video Steganography Based on Genetic Algorithms
- Unsupervised Urban Tree Biodiversity Mapping from Street-Level Imagery Using Spatially-Aware Visual Clustering
- Smooth Flow Matching
- Online Conformal Selection with Accept-to-Reject Changes
- Generalisation and benign over-fitting for linear regression onto random functional covariates
- A PC Algorithm for Max-Linear Bayesian Networks
- Uncertainty-Aware PCA for Arbitrarily Distributed Data Modeled by Gaussian Mixture Models
- Machine Learning H-theorem
- Mask and Restore: Blind Backdoor Defense at Test Time with Masked Autoencoder
- Disentangled Representation Learning with the Gromov-Monge Gap
- Correlations Are Ruining Your Gradient Descent
- FDR-SVM: A Federated Distributionally Robust Support Vector Machine via a Mixture of Wasserstein Balls Ambiguity Set
- A Causal Graph-Enhanced Gaussian Process Regression for Modeling Engine-out NOx
- Rethinking Weight-Averaged Model-merging
- High-Order Tensor Regression in Sparse Convolutional Neural Networks
- Environmental Feature Engineering and Statistical Validation for ML-Based Path Loss Prediction
- Closed-Form Feedback-Free Learning with Forward Projection
- Joint Learning of Energy-based Models and their Partition Function
- Enhancing Cost Efficiency in Active Learning with Candidate Set Query
- Recommendations with Sparse Comparison Data: Provably Fast Convergence for Nonconvex Matrix Factorization
- A kinetic-based regularization method for data science applications
- Performance Comparisons of Reinforcement Learning Algorithms for Sequential Experimental Design
- Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data
- Incorporating Attributes and Multi-Scale Structures for Heterogeneous Graph Contrastive Learning
- Reinforcement Learning for Solving the Pricing Problem in Column Generation: Applications to Vehicle Routing
- Can Masked Autoencoders Also Listen to Birds?
- MEGA: Second-Order Gradient Alignment for Catastrophic Forgetting Mitigation in GFSCIL
- Always Skip Attention
- Epistemic Wrapping for Uncertainty Quantification
- Quiet Feature Learning in Algorithmic Tasks
- Good Things Come in Pairs: Paired Autoencoders for Inverse Problems
- Bidirectional Information Flow (BIF) -- A Sample Efficient Hierarchical Gaussian Process for Bayesian Optimization
- Flexible Operator Fusion for Fast Sparse Transformer with Diverse Masking on GPU
- SymMatika: Structure-Aware Symbolic Discovery
- PinFM: Foundation Model for User Activity Sequences at a Billion-scale Visual Discovery Platform
- Improving DAPO from a Mixed-Policy Perspective
- A Comprehensive Re-Evaluation of Biometric Modality Properties in the Modern Era
- Revisiting Diffusion Q-Learning: From Iterative Denoising to One-Step Action Generation
- Automated Energy-Aware Time-Series Model Deployment on Embedded FPGAs for Resilient Combined Sewer Overflow Management
- How Usable is Automated Feature Engineering for Tabular Data?
- Convergent Reinforcement Learning Algorithms for Stochastic Shortest Path Problem
- AutoScale: Linear Scalarization Guided by Multi-Task Optimization Metrics
- Multi-User Contextual Cascading Bandits for Personalized Recommendation
- Formal Algorithms for Model Efficiency
- GDNSQ: Gradual Differentiable Noise Scale Quantization for Low-bit Neural Networks
- Typed Topological Structures Of Datasets
- Enhancing Visual Reliance in Text Generation: A Bayesian Perspective on Mitigating Hallucination in Large Vision-Language Models
- Recipes for Pre-training LLMs with MXFP8
- PlantDeBERTa: An Open Source Language Model for Plant Science
- ConTextTab: A Semantics-Aware Tabular In-Context Learner
- Scaling Intelligence: Designing Data Centers for Next-Gen Language Models
- Neural Cellular Automata for ARC-AGI
- Segment Anything in Pathology Images with Natural Language
- Tensor Program Optimization for the RISC-V Vector Extension Using Probabilistic Programs
- Identify, Isolate, and Purge: Mitigating Hallucinations in LVLMs via Self-Evolving Distillation
- Penalizing Infeasible Actions and Reward Scaling in Reinforcement Learning with Offline Data
- Spatial-Temporal Transformer with Curriculum Learning for EEG-Based Emotion Recognition
- BERT-VQA: Visual Question Answering on Plots
- Strategies for training point distributions in physics-informed neural networks
- A Recurrent Neural Network based Clustering Method for Binary Data Sets in Education
- RISE: Enhancing VLM Image Annotation with Self-Supervised Reasoning
- Data driven feedback linearization of nonlinear control systems via Lie derivatives and stacked regression approach
- Physically Plausible Data Augmentations for Wearable IMU-based Human Activity Recognition Using Physics Simulation
- Towards Human-AI Complementarity in Matching Tasks
- Efficient Constraint-Aware Flow Matching via Randomized Exploration
- Decoding Communications with Partial Information
- X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms
- Dimension lower bounds for linear approaches to function approximation
- Adaptive Conformal Prediction Intervals Over Trajectory Ensembles
- Batching-Aware Joint Model Onloading and Offloading for Hierarchical Multi-Task Inference
- NovoMolGen: Rethinking Molecular Language Model Pretraining
- Decentralized Contextual Bandits with Network Adaptivity
- MAVIS: Multi-Objective Alignment via Value-Guided Inference-Time Search
- ASAP: Unsupervised Post-training with Label Distribution Shift Adaptive Learning Rate
- Hierarchy-Consistent Learning and Adaptive Loss Balancing for Hierarchical Multi-Label Classification
- Classifying Clinical Outcome of Epilepsy Patients with Ictal Chirp Embeddings
- DyMixOp: Guiding Neural Operator Design for PDEs from a Complex Dynamics Perspective with Local-Global-Mixing
- Uncertainty Tube Visualization of Particle Trajectories
- Explainability of Algorithms
- MuFlex: A Scalable, Physics-based Platform for Multi-Building Flexibility Analysis and Coordination
- CALYPSO: Forecasting and Analyzing MRSA Infection Patterns with Community and Healthcare Transmission Dynamics
- Prediction of Hospital Associated Infections During Continuous Hospital Stays
- A Generalized Learning Framework for Self-Supervised Contrastive Learning
- Approximate Bayesian Inference via Bitstring Representations
- Text2Weight: Bridging Natural Language and Neural Network Weight Spaces
- Explainable Learning Rate Regimes for Stochastic Optimization
- Personalized Subgraph Federated Learning with Sheaf Collaboration
- MACTAS: Self-Attention-Based Module for Inter-Agent Communication in Multi-Agent Reinforcement Learning
- Heavy-tailed Linear Bandits: Adversarial Robustness, Best-of-both-worlds, and Beyond
- Minimizing the Weighted Number of Tardy Jobs: Data-Driven Heuristic for Single-Machine Scheduling
- Trans-XFed: An Explainable Federated Learning for Supply Chain Credit Assessment
- DREAMS: Preserving both Local and Global Structure in Dimensionality Reduction
- Order Optimal Regret Bounds for Sharpe Ratio Optimization in the Bandit Setting
- Communication-Efficient Federated Learning with Adaptive Number of Participants
- Reinforcement Learning-based Adaptive Path Selection for Programmable Networks
- Disentangled Deep Smoothed Bootstrap for Fair Imbalanced Regression
- FedUP: Efficient Pruning-based Federated Unlearning for Model Poisoning Attacks
- The AI Risk Spectrum: From Dangerous Capabilities to Existential Threats
- Generics and Default Reasoning in Large Language Models
- Prediction is not Explanation: Revisiting the Explanatory Capacity of Mapping Embeddings
- On the Security and Privacy of Federated Learning: A Survey with Attacks, Defenses, Frameworks, Applications, and Future Directions
- Mitigating Cross-Image Information Leakage in LVLMs for Multi-Image Tasks
- Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration
- COMPASS: A Multi-Dimensional Benchmark for Evaluating Code Generation in Large Language Models
- PENGUIN: Enhancing Transformer with Periodic-Nested Group Attention for Long-term Time Series Forecasting
- Agentic DraCor and the Art of Docstring Engineering: Evaluating MCP-empowered LLM Usage of the DraCor API
- Comparing Conditional Diffusion Models for Synthesizing Contrast-Enhanced Breast MRI from Pre-Contrast Images
- DegDiT: Controllable Audio Generation with Dynamic Event Graph Guided Diffusion Transformer
- BetaWeb: Towards a Blockchain-enabled Trustworthy Agentic Web
- A Fully Transformer Based Multimodal Framework for Explainable Cancer Image Segmentation Using Radiology Reports
- Prompt-Based One-Shot Exact Length-Controlled Generation with LLMs
- Assessing Trustworthiness of AI Training Dataset using Subjective Logic -- A Use Case on Bias
- The illusion of a perfect metric: Why evaluating AI's words is harder than it looks
- Extracting Structured Requirements from Unstructured Building Technical Specifications for Building Information Modeling
- One Shot vs. Iterative: Rethinking Pruning Strategies for Model Compression
- UniECS: Unified Multimodal E-Commerce Search Framework with Gated Cross-modal Fusion
- A Novel Attention-Augmented Wavelet YOLO System for Real-time Brain Vessel Segmentation on Transcranial Color-coded Doppler
- Toward Deployable Multi-Robot Collaboration via a Symbolically-Guided Decision Transformer
- Fisher-Orthogonal Projection Methods for Natural Gradient Descent with Large Batches
- Categorical Policies: Multimodal Policy Learning and Exploration in Continuous Control
- InPars+: Supercharging Synthetic Data Generation for Information Retrieval Systems
- Prompt Orchestration Markup Language
- A Mechanism for Mutual Fairness in Cooperative Games with Replicable Resources -- Extended Version
- Learning to Use AI for Learning: How Can We Effectively Teach and Measure Prompting Literacy for K-12 Students?
- RotBench: Evaluating Multimodal Large Language Models on Identifying Image Rotation
- The Social Context of Human-Robot Interactions
- Chunks as Arms: Multi-Armed Bandit-Guided Sampling for Long-Context LLM Preference Optimization
- Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation
- ASDFormer: A Transformer with Mixtures of Pooling-Classifier Experts for Robust Autism Diagnosis and Biomarker Discovery
- Evaluating Identity Leakage in Speaker De-Identification Systems
- Efficient Knowledge Graph Unlearning with Zeroth-order Information
- Ask Good Questions for Large Language Models
- Unintended Misalignment from Agentic Fine-Tuning: Risks and Mitigation
- GeoSAM2: Unleashing the Power of SAM2 for 3D Part Segmentation
- LEGO-GraphRAG: Modularizing Graph-based Retrieval-Augmented Generation for Design Space Exploration
- Where to Go Next Day: Multi-scale Spatial-Temporal Decoupled Model for Mid-term Human Mobility Prediction
- VRoPE: Rotary Position Embedding for Video Large Language Models
- The StudyChat Dataset: Student Dialogues With ChatGPT in an Artificial Intelligence Course
- GoAI: Enhancing AI Students' Learning Paths and Idea Generation via Graph of AI Ideas
- Hawkeye: Efficient Reasoning with Model Collaboration
- Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning
- Trust, but verify
- Hierarchical Reinforcement Learning in Multi-Goal Spatial Navigation with Autonomous Mobile Robots
- It's the Thought that Counts: Evaluating the Attempts of Frontier LLMs to Persuade on Harmful Topics
- Language-Guided Multi-Agent Learning in Simulations: A Unified Framework and Evaluation
- Modeling the Diachronic Evolution of Legal Norms: An LRMoo-Based, Component-Level Approach
- Efficient Network Automatic Relevance Determination
- Dispositions and Roles of Generically Dependent Entities
- Towards Urban Planning AI Agent in the Age of Agentic AI
- Data-Efficient Safe Policy Improvement Using Parametric Structure
- Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication
- LEGO: Learning and Graph-Optimized Modular Tracker for Online Multi-Object Tracking with Point Clouds
- "I see models being a whole other thing": An Empirical Study of Pre-Trained Model Naming Conventions and A Tool for Enhancing Naming Consistency
- Radio Map Estimation: Empirical Validation and Analysis
- Joint Problems in Learning Multiple Dynamical Systems
- Fusing Echocardiography Images and Medical Records for Continuous Patient Stratification
- iTBLS: A Dataset of Interactive Conversations Over Tabular Information
- Iterative Utility Judgment Framework via LLMs Inspired by Relevance in Philosophy
- Boolean Matrix Logic Programming on the GPU
- Vision Backbone Efficient Selection for Image Classification in Low-Data Regimes
- SSD-TS: Exploring the Potential of Linear State Space Models for Diffusion Models in Time Series Imputation
- Script-Strategy Aligned Generation: Aligning LLMs with Expert-Crafted Dialogue Scripts and Therapeutic Strategies for Psychotherapy
- DDD-GenDT: Dynamic Data-driven Generative Digital Twin Framework
- Setup Once, Secure Always: A Single-Setup Secure Federated Learning Aggregation Protocol with Forward and Backward Secrecy for Dynamic Users
- Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling
- Parameter-Efficient Continual Fine-Tuning: A Survey
- POPri: Private Federated Learning using Preference-Optimized Synthetic Data
- Hallucinations and Key Information Extraction in Medical Texts: A Comprehensive Assessment of Open-Source Large Language Models
- Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors
- Position: We Need Responsible, Application-Driven (RAD) AI Research
- "Haet Bhasha aur Diskrimineshun": Phonetic Perturbations in Code-Mixed Hinglish to Red-Team LLMs
- Sample Complexity of Diffusion Model Training Without Empirical Risk Minimizer Access
- G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning
- Piano: A Multi-Constraint Pin Assignment-Aware Floorplanner
- Sustainable AI Training via Hardware-Software Co-Design on NVIDIA, AMD, and Emerging GPU Architectures
- White-Box Reasoning: Synergizing LLM Strategy and gm/Id Data for Automated Analog Circuit Design
- Toward an African Agenda for AI Safety
- Using Artificial Intuition in Distinct, Minimalist Classification of Scientific Abstracts for Management of Technology Portfolios
- MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
- Combating Homelessness Stigma with LLMs: A New Multi-Modal Dataset for Bias Detection
- Preference Models assume Proportional Hazards of Utilities
- Contextual Attention-Based Multimodal Fusion of LLM and CNN for Sentiment Analysis
- The Rise of Generative AI for Metal-Organic Framework Design and Synthesis
- Benchmarking LLM-based Agents for Single-cell Omics Analysis
- Utilizing the RAIN method and Graph SAGE Model to Identify Effective Drug Combinations for Gastric Neoplasm Treatment
- Research on Conversational Recommender System Considering Consumer Types
- Too Easily Fooled? Prompt Injection Breaks LLMs on Frustratingly Simple Multiple-Choice Questions
- Deep Graph Neural Point Process For Learning Temporal Interactive Networks
- MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols
- MIRAGE: Towards AI-Generated Image Detection in the Wild
- PreSem-Surf: RGB-D Surface Reconstruction with Progressive Semantic Modeling and SG-MLP Pre-Rendering Mechanism
- Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System
- The Role of AI in Facilitating Interdisciplinary Collaboration: Evidence from AlphaFold
- Uncertainty-Aware Learning Policy for Reliable Pulmonary Nodule Detection on Chest X-Ray
- Quantifying Loss Aversion in Cyber Adversaries via LLM Analysis
- Involuntary Jailbreak
- Goal-Directedness is in the Eye of the Beholder
- ViTAD: Timing Violation-Aware Debugging of RTL Code using Large Language Models
- Hierarchical Conformal Classification
- GaitCrafter: Diffusion Model for Biometric Preserving Gait Synthesis
- Diff-MSM: Differentiable MusculoSkeletal Model for Simultaneous Identification of Human Muscle and Bone Parameters
- A Surveillance Based Interactive Robot
- A Dual-Attention Graph Network for fMRI Data Classification
- Counterfactual Probabilistic Diffusion with Expert Models
- Overcoming Latency Bottlenecks in On-Device Speech Translation: A Cascaded Approach with Alignment-Based Streaming MT
- Whispering Context: Distilling Syntax and Semantics for Long Speech Transcripts
- Datarus-R1: An Adaptive Multi-Step Reasoning LLM for Automated Data Analysis
- Semi-Supervised Anomaly Detection Pipeline for SOZ Localization Using Ictal-Related Chirp
- AdaptJobRec: Enhancing Conversational Career Recommendation through an LLM-Powered Agentic System
- ALIGN: Word Association Learning for Cross-Cultural Generalization in Large Language Models
- Mitigating Easy Option Bias in Multiple-Choice Question Answering
- AlphaX: An AI-Based Value Investing Strategy for the Brazilian Stock Market
- EventTSF: Event-Aware Non-Stationary Time Series Forecasting
- SVDformer: Direction-Aware Spectral Graph Embedding Learning via SVD and Transformer
- Dynamic Design of Machine Learning Pipelines via Metalearning
- Structured Prompting and Multi-Agent Knowledge Distillation for Traffic Video Interpretation and Risk Inference
- Consumer Autonomy or Illusion? Rethinking Consumer Agency in the Age of Algorithms
- STER-VLM: Spatio-Temporal With Enhanced Reference Vision-Language Models
- CORENet: Cross-Modal 4D Radar Denoising Network with LiDAR Supervision for Autonomous Driving
- LLM-Enhanced Linear Autoencoders for Recommendation
- ProMed: Shapley Information Gain Guided Reinforcement Learning for Proactive Medical LLMs
- Heterogeneous Influence Maximization in User Recommendation
- Calibrating Biased Distribution in VFM-derived Latent Space via Cross-Domain Geometric Consistency
- DDoS Attacks in Cloud Computing: Detection and Prevention
- Evaluating Open-Source Vision Language Models for Facial Emotion Recognition against Traditional Deep Learning Models
- MimicFunc: Imitating Tool Manipulation from a Single Human Video via Functional Correspondence
- EAvatar: Expression-Aware Head Avatar Reconstruction with Generative Geometry Priors
- FLAIR: Frequency- and Locality-Aware Implicit Neural Representations
- Collapsing ROC approach for risk prediction research on both common and rare variants
- Physics-Informed Neural Networks for Programmable Origami Metamaterials with Controlled Deployment
- The 9th AI City Challenge
- End-to-End Audio-Visual Learning for Cochlear Implant Sound Coding in Noisy Environments
- A Comparative Study of Decoding Strategies in Medical Text Generation
- Who Gets the Mic? Investigating Gender Bias in the Speaker Assignment of a Speech-LLM
- Bounding Causal Effects and Counterfactuals
- Towards a Larger Model via One-Shot Federated Learning on Heterogeneous Client Models
- GRAFT: Gradient-Aware Fast MaxVol Technique for Dynamic Data Sampling
- Input Time Scaling
- In-Context Decision Making for Optimizing Complex AutoML Pipelines
- Multi-Plasticity Synergy with Adaptive Mechanism Assignment for Training Spiking Neural Networks
- Knowledge Graph Completion for Action Prediction on Situational Graphs -- A Case Study on Household Tasks
- MHSNet: An MoE-based Hierarchical Semantic Representation Network for Accurate Duplicate Resume Detection with Large Language Model
- Neuro-Symbolic Artificial Intelligence: Towards Improving the Reasoning Abilities of Large Language Models
- The DeepLog Neurosymbolic Machine
- CausalPlan: Empowering Efficient LLM Multi-Agent Collaboration Through Causality-Driven Planning
- Expertise-aware Multi-LLM Recruitment and Collaboration for Medical Decision-Making
- Quantifier Instantiations: To Mimic or To Revolt?
- Revisiting RAG Ensemble: A Theoretical and Mechanistic Analysis of Multi-RAG System Collaboration
- Improved Generalized Planning with LLMs through Strategy Refinement and Reflection
- Structured Agentic Workflows for Financial Time-Series Modeling with LLMs and Reflective Feedback
- The Collaboration Paradox: Why Generative AI Requires Both Strategic Intelligence and Operational Stability in Supply Chain Management
- ChronoLLM: Customizing Language Models for Physics-Based Simulation Code Generation
- A Biased Random Key Genetic Algorithm for Solving the Longest Run Subsequence Problem
- ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents
- Preliminary suggestions for rigorous GPAI model evaluations
- TaoSR1: The Thinking Model for E-commerce Relevance Search
- EvoVerilog: Large Language Model Assisted Evolution of Verilog Code
- Image2Net: Datasets, Benchmark and Hybrid Framework to Convert Analog Circuit Diagrams into Netlists
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL
- Cognitive Workspace: Active Memory Management for LLMs -- An Empirical Study of Functional Infinite Context
- AlphaEval: A Comprehensive and Efficient Evaluation Framework for Formula Alpha Mining
- Fitting Ontologies and Constraints to Relational Structures
- A Hardware-oriented Approach for Efficient Active Inference Computation and Deployment
- The Interpretability Analysis of the Model Can Bring Improvements to the Text-to-SQL Task
- Search-Time Data Contamination
- QuickMerge++: Fast Token Merging with Autoregressive Prior
- AI sustains higher strategic tension than humans in chess
- Explicit v.s. Implicit Memory: Exploring Multi-hop Complex Reasoning Over Personalized Information
- "DIVE" into Hydrogen Storage Materials Discovery with AI Agents
- CardAIc-Agents: A Multimodal Framework with Hierarchical Adaptation for Cardiac Care Support
- Towards Unified Multimodal Financial Forecasting: Integrating Sentiment Embeddings and Market Indicators via Cross-Modal Attention
- HiFo-Prompt: Prompting with Hindsight and Foresight for LLM-based Automatic Heuristic Design
- LOOP: A Plug-and-Play Neuro-Symbolic Framework for Enhancing Planning in Autonomous Systems
- SPANER: Shared Prompt Aligner for Multimodal Semantic Representation
- TASER: Table Agents for Schema-guided Extraction and Recommendation
- Virtuous Machines: Towards Artificial General Science
- STPFormer: A State-of-the-Art Pattern-Aware Spatio-Temporal Transformer for Traffic Forecasting
- Discrete Optimization of Min-Max Violation and its Applications Across Computational Sciences
- LM Agents May Fail to Act on Their Own Risk Knowledge
- CrafterDojo: A Suite of Foundation Models for Building Open-Ended Embodied Agents in Crafter
- Toward Better EHR Reasoning in LLMs: Reinforcement Learning with Expert Attention Guidance
- Breaking the SFT Plateau: Multimodal Structured Reinforcement Learning for Chart-to-Code Generation
- V2P: From Background Suppression to Center Peaking for Robust GUI Grounding Task
- Interactive Query Answering on Knowledge Graphs with Soft Entity Constraints
- ITL-LIME: Instance-Based Transfer Learning for Enhancing Local Explanations in Low-Resource Data Settings
Research Sources: 475 | Generated: 8/25/2025