AI RESEARCH PAPERS & ACADEMIC SOURCES
- A deep learning model with machine vision system for recognizing type of the food during the food consumption
- A dataset of high-resolution plantar pressures for gait analysis across varying footwear and walking speeds
- Distributional Sensitivity Analysis: Enabling Differentiability in Sample-Based Inference
- Learning topological states from randomized measurements using variational tensor network tomography
- Variance-Reduced Fast Operator Splitting Methods for Generalized Equations
- HiFi-Mamba: Dual-Stream W-Laplacian Enhanced Mamba for High-Fidelity MRI Reconstruction
- MedPatch: Confidence-Guided Multi-Stage Fusion for Multimodal Clinical Data
- Dynamic Survival Prediction using Longitudinal Images based on Transformer
- DAgger Diffusion Navigation: DAgger Boosted Diffusion Policy for Vision-Language Navigation
- Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
- Robustness analysis of Deep Sky Objects detection models on HPC
- Toward Human-Robot Teaming: Learning Handover Behaviors from 3D Scenes
- Prompt-aligned Gradient for Prompt Tuning
- Debiased Fine-Tuning for Vision-language Models by Prompt Regularization
- Ear-Keeper: A Cross-Platform AI System for Rapid and Accurate Ear Disease Diagnosis
- STAC: Leveraging Spatio-Temporal Data Associations For Efficient Cross-Camera Streaming and Analytics
- Are you Struggling? Dataset and Baselines for Struggle Determination in Assembly Videos
- Revisiting 3D Medical Scribble Supervision: Benchmarking Beyond Cardiac Segmentation
- From Few to More: Scribble-based Medical Image Segmentation via Masked Context Modeling and Continuous Pseudo Labels
- ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs
- Joint multi-dimensional dynamic attention and transformer for general image restoration
- ViewDelta: Scaling Scene Change Detection through Text-Conditioning
- UltraRay: Introducing Full-Path Ray Tracing in Physics-Based Ultrasound Simulation
- HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation
- Image Intrinsic Scale Assessment: Bridging the Gap Between Quality and Resolution
- Towards Synthesized and Editable Motion In-Betweening Through Part-Wise Phase Representation
- GranQ: Granular Zero-Shot Quantization with Channel-Wise Activation Scaling in QAT
- Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models
- NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations
- Cyc3D: Fine-grained Controllable 3D Generation via Cycle Consistency Regularization
- Scaling Vision Mamba Across Resolutions via Fractal Traversal
- CAS-IQA: Teaching Vision-Language Models for Synthetic Angiography Quality Assessment
- MultiFormer: A Multi-Person Pose Estimation System Based on CSI and Attention Mechanism
- Follow-Your-Motion: Video Motion Transfer via Efficient Spatial-Temporal Decoupled Finetuning
- SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence
- 3D Gaussian Splatting Driven Multi-View Robust Physical Adversarial Camouflage Generation
- Calibrated Self-supervised Vision Transformers Improve Intracranial Arterial Calcification Segmentation from Clinical CT Head Scans
- Modulate and Reconstruct: Learning Hyperspectral Imaging from Misaligned Smartphone Views
- Learning Adaptive Node Selection with External Attention for Human Interaction Recognition
- RoHOI: Robustness Benchmark for Human-Object Interaction Detection
- Advancing Reliable Test-Time Adaptation of Vision-Language Models under Visual Variations
- HRSeg: High-Resolution Visual Perception and Enhancement for Reasoning Segmentation
- FROST-BRDF: A Fast and Robust Optimal Sampling Technique for BRDF Acquisition
- Lung-DDPM: Semantic Layout-guided Diffusion Models for Thoracic CT Image Synthesis
- RAGAR: Retrieval Augmented Personalized Image Generation Guided by Recommendation
- Emotion-Qwen: A Unified Framework for Emotion and Vision Understanding
- DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes
- Distilling LLM Prior to Flow Model for Generalizable Agent's Imagination in Object Goal Navigation
- MPT: Motion Prompt Tuning for Micro-Expression Recognition
- RASR: Retrieval-Augmented Super Resolution for Practical Reference-based Image Restoration
- Animate-X++: Universal Character Image Animation with Dynamic Backgrounds
- Event-driven Robust Fitting on Neuromorphic Hardware
- CitySeg: A 3D Open Vocabulary Semantic Segmentation Foundation Model in City-scale Scenarios
- Leveraging Failed Samples: A Few-Shot and Training-Free Framework for Generalized Deepfake Detection
- From Large Angles to Consistent Faces: Identity-Preserving Video Generation via Mixture of Facial Experts
- CLIP-Flow: A Universal Discriminator for AI-Generated Images Inspired by Anomaly Detection
- GazeLT: Visual attention-guided long-tailed disease classification in chest radiographs
- SkySplat: Generalizable 3D Gaussian Splatting from Multi-Temporal Sparse Satellite Images
- SARE: Semantic-Aware Reconstruction Error for Generalizable Diffusion-Generated Image Detection
- SOI is the Root of All Evil: Quantifying and Breaking Similar Object Interference in Single Object Tracking
- Learning Spatial Decay for Vision Transformers
- Physics-guided Deep Unfolding Network for Enhanced Kronecker Compressive sensing
- Iterative Volume Fusion for Asymmetric Stereo Matching
- Exploring the Equivalence of Closed-Set Generative and Real Data Augmentation in Image Classification
- Topological Invariant-Based Iris Identification via Digital Homology and Machine Learning
- WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
- WEC-DG: Multi-Exposure Wavelet Correction Method Guided by Degradation Description
- A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation
- Dual Recursive Feedback on Generation and Appearance Latents for Pose-Robust Text-to-Image Diffusion
- SHALE: A Scalable Benchmark for Fine-grained Hallucination Evaluation in LVLMs
- Offline Auto Labeling: BAAS
- SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing
- Images Speak Louder Than Scores: Failure Mode Escape for Enhancing Generative Quality
- BridgeTA: Bridging the Representation Gap in Knowledge Distillation via Teacher Assistant for Bird's Eye View Map Segmentation
- Plane Detection and Ranking via Model Information Optimization
- Semantic-aware DropSplat: Adaptive Pruning of Redundant Gaussians for 3D Aerial-View Segmentation
- Enhancing Monocular 3D Hand Reconstruction with Learned Texture Priors
- Multi-Contrast Fusion Module: An attention mechanism integrating multi-contrast features for fetal torso plane classification
- Multi-Sequence Parotid Gland Lesion Segmentation via Expert Text-Guided Segment Anything Model
- The Brain Resection Multimodal Image Registration (ReMIND2Reg) 2025 Challenge
- TOTNet: Occlusion-Aware Temporal Tracking for Robust Ball Detection in Sports Videos
- Noise-adapted Neural Operator for Robust Non-Line-of-Sight Imaging
- NegFaceDiff: The Power of Negative Context in Identity-Conditioned Diffusion for Synthetic Face Generation
- GSFixer: Improving 3D Gaussian Splatting with Reference-Guided Video Diffusion Priors
- PaCo-FR: Patch-Pixel Aligned End-to-End Codebook Learning for Facial Representation Pre-training
- Slot Attention-based Feature Filtering for Few-Shot Learning
- MangaDiT: Reference-Guided Line Art Colorization with Hierarchical Attention in Diffusion Transformers
- Predictive Uncertainty for Runtime Assurance of a Real-Time Computer Vision-Based Landing System
- Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory
- MoIIE: Mixture of Intra- and Inter-Modality Experts for Large Vision Language Models
- DSS-Prompt: Dynamic-Static Synergistic Prompting for Few-Shot Class-Incremental Learning
- MeMoSORT: Memory-Assisted Filtering and Motion-Adaptive Association Metric for Multi-Person Tracking
- MUJICA: Reforming SISR Models for PBR Material Super-Resolution via Cross-Map Attention
- Poaching Hotspot Identification Using Satellite Imagery
- Evolution of Low-Level and Texture Human-CLIP Alignment
- ViMoNet: A Multimodal Vision-Language Framework for Human Behavior Understanding from Motion and Video
- Physical Autoregressive Model for Robotic Manipulation without Action Pretraining
- KonfAI: A Modular and Fully Configurable Framework for Deep Learning in Medical Imaging
- Reverse Convolution and Its Applications to Image Restoration
- Hierarchical Graph Attention Network for No-Reference Omnidirectional Image Quality Assessment
- Enhancing Diffusion Face Generation with Contrastive Embeddings and SegFormer Guidance
- ARI3D: A Software for Interactive Quantification of Regions in X-Ray CT 3D Images
- Do Vision Transformers See Like Humans? Evaluating their Perceptual Alignment
- OneVAE: Joint Discrete and Continuous Optimization Helps Discrete Video VAE Train Better
- HumanGenesis: Agent-Based Geometric and Generative Modeling for Synthetic Human Dynamics
- E-4DGS: High-Fidelity Dynamic Reconstruction from the Multi-view Event Cameras
- SpeechForensics: Audio-Visual Speech Representation Learning for Face Forgery Detection
- Towards Comprehensive Cellular Characterisation of H&E slides
- Quo Vadis Handwritten Text Generation for Handwritten Text Recognition?
- AST-n: A Fast Sampling Approach for Low-Dose CT Reconstruction using Diffusion Models
- LIA-X: Interpretable Latent Portrait Animator
- MOC: Meta-Optimized Classifier for Few-Shot Whole Slide Image Classification
- PERSONA: Personalized Whole-Body 3D Avatar with Pose-Driven Deformations from a Single Image
- A Survey on 3D Gaussian Splatting Applications: Segmentation, Editing, and Generation
- LLMC+: Benchmarking Vision-Language Model Compression with a Plug-and-play Toolkit
- From Stars to Insights: Exploration and Implementation of Unified Sentiment Analysis with Distant Supervision
- LongIns: A Challenging Long-context Instruction-based Exam for LLMs
- Memorization Over Reasoning? Exposing and Mitigating Verbatim Memorization in Large Language Models' Character Understanding Evaluation
- Efficient Inference for Large Reasoning Models: A Survey
- Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models
- Non-native Children's Automatic Speech Assessment Challenge (NOCASA)
- IP-CRR: Information Pursuit for Interpretable Classification of Chest Radiology Reports
- LogicCat: A Chain-of-Thought Text-to-SQL Benchmark for Complex Reasoning
- MemGuide: Intent-Driven Memory Selection for Goal-Oriented Multi-Session LLM Agents
- DefenderBench: A Toolkit for Evaluating Language Agents in Cybersecurity Environments
- Applying Text Embedding Models for Efficient Analysis in Labeled Property Graphs
- Synthetic Data Generation for Emotional Depth Faces: Optimizing Conditional DCGANs via Genetic Algorithms in the Latent Space and Stabilizing Training with Knowledge Distillation
- FineState-Bench: A Comprehensive Benchmark for Fine-Grained State Control in GUI Agents
- Beyond Blanket Masking: Examining Granularity for Privacy Protection in Images Captured by Blind and Low Vision Users
- Lung-DDPM+: Efficient Thoracic CT Image Synthesis using Diffusion Probabilistic Model
- UltraLight Med-Vision Mamba for Classification of Neoplastic Progression in Tubular Adenomas
- Blink-to-code: real-time Morse code communication via eye blink detection and classification
- DenoDet V2: Phase-Amplitude Cross Denoising for SAR Object Detection
- Skyshield: Event-Driven Submillimetre Thin Obstacle Detection for Drone Flight Safety
- Autonomous AI Bird Feeder for Backyard Biodiversity Monitoring
- Waymo-3DSkelMo: A Multi-Agent 3D Skeletal Motion Dataset for Pedestrian Interaction Modeling in Autonomous Driving
- Decoding Neural Emotion Patterns through Natural Language Processing Embeddings
- Flow-SLM: Joint Learning of Linguistic and Acoustic Information for Spoken Language Modeling
- Columbo: Expanding Abbreviated Column Names for Tabular Data Using Large Language Models
- Leveraging Zipformer Model for Effective Language Identification in Code-Switched Child-Directed Speech
- From Charts to Fair Narratives: Uncovering and Mitigating Geo-Economic Biases in Chart-to-Text
- User-centric Subjective Leaderboard by Customizable Reward Modeling
- LACA: Improving Cross-lingual Aspect-Based Sentiment Analysis with LLM Data Augmentation
- Cross-lingual Aspect-Based Sentiment Analysis: A Survey on Tasks, Approaches, and Challenges
- UWBa at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval
- The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage
- AINL-Eval 2025 Shared Task: Detection of AI-Generated Scientific Abstracts in Russian
- EffiEval: Efficient and Generalizable Model Evaluation via Capability Coverage Maximization
- Slow Tuning and Low-Entropy Masking for Safe Chain-of-Thought Distillation
- The Perils of Chart Deception: How Misleading Visualizations Affect Vision-Language Models
- Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation
- Echoes of Agreement: Argument Driven Opinion Shifts in Large Language Models
- UtterTune: LoRA-Based Target-Language Pronunciation Edit and Control in Multilingual Text-to-Speech
- BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning
- Assessing the Feasibility of Lightweight Whisper Models for Low-Resource Urdu Transcription
- A Survey of Cognitive Distortion Detection and Classification in NLP
- Language of Persuasion and Misrepresentation in Business Communication: A Textual Detection Approach
- Shaping Event Backstories to Estimate Potential Emotion Contexts
- Performance of GPT-5 Frontier Models in Ophthalmology Question Answering
- Which one Performs Better? Wav2Vec or Whisper? Applying both in Badini Kurdish Speech to Text (BKSTT)
- IAG: Input-aware Backdoor Attack on VLMs for Visual Grounding
- Enhancing Deep Hedging of Options with Implied Volatility Surface Feedback Information
- VulScribeR: Exploring RAG-based Vulnerability Augmentation with LLMs
- On the Robustness of Kernel Goodness-of-Fit Tests
- Leveraging Reviewer Experience in Code Review Comment Generation
- A spectral method for multi-view subspace learning using the product of projections
- Improving Multimodal Large Language Models Using Continual Learning
- Learning Whole-Body Loco-Manipulation for Omni-Directional Task Space Pose Tracking with a Wheeled-Quadrupedal-Manipulator
- A2SB: Audio-to-Audio Schrodinger Bridges
- Gradient Descent Algorithm in Hilbert Spaces under Stationary Markov Chains with $\phi$- and $\beta$-Mixing
- RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression
- Verifying Quantized Graph Neural Networks is PSPACE-complete
- Generative Active Adaptation for Drifting and Imbalanced Network Intrusion Detection
- Cryo-EM images are intrinsically low dimensional
- ParkDiffusion: Heterogeneous Multi-Agent Multi-Modal Trajectory Prediction for Automated Parking using Diffusion Models
- Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale datasets for Responsible LLMs
- M-learner: A Flexible And Powerful Framework To Study Heterogeneous Treatment Effect In Mediation Model
- MoCA: Multi-modal Cross-masked Autoencoder for Digital Health Measurements
- AbRank: A Benchmark Dataset and Metric-Learning Framework for Antibody-Antigen Affinity Ranking
- MetaCipher: A Time-Persistent and Universal Multi-Agent Framework for Cipher-Based Jailbreak Attacks for LLMs
- Efficient Visual Appearance Optimization by Learning from Prior Preferences
- Understanding Nonlinear Implicit Bias via Region Counts in Input Space
- Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning
- Mini-Game Lifetime Value Prediction in WeChat
- Leveraging Predictive Equivalence in Decision Trees
- Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges
- Faster Diffusion Models via Higher-Order Approximation
- Quantum Machine Learning in Transportation: A Case Study of Pedestrian Stress Modelling
- Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
- PrAViC: Probabilistic Adaptation Framework for Real-Time Video Classification
- Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy
- Importance Corrected Neural JKO Sampling
- MiCo: End-to-End Mixed Precision Neural Network Co-Exploration Framework for Edge AI
- Causal Graph Profiling via Structural Divergence for Robust Anomaly Detection in Cyber-Physical Systems
- Enhancing Memory Recall in LLMs with Gauss-Tin: A Hybrid Instructional and Gaussian Replay Approach
- Time-Aware and Transition-Semantic Graph Neural Networks for Interpretable Predictive Business Process Monitoring
- SYNAPSE-G: Bridging Large Language Models and Graph Learning for Rare Event Classification
- Edge General Intelligence Through World Models and Agentic AI: Fundamentals, Solutions, and Challenges
- Online Prediction with Limited Selectivity
- Physics- and geometry-aware spatio-spectral graph neural operator for time-independent and time-dependent PDEs
- Thermal Tracks: A Gaussian process-based framework for universal melting curve analysis enabling unconstrained hit identification in thermal proteome profiling experiments
- Global Convergence Analysis of Vanilla Gradient Descent for Asymmetric Matrix Completion
- Temporal Anchoring in Deepening Embedding Spaces: Event-Indexed Projections, Drift, Convergence, and an Internal Computational Architecture
- Combating Noisy Labels via Dynamic Connection Masking
- GraphTreeGen: Subtree-Centric Approach to Efficient and Supervised Graph Generation
- Generative Modeling with Multi-Instance Reward Learning for E-commerce Creative Optimization
- HKT: A Biologically Inspired Framework for Modular Hereditary Knowledge Transfer in Neural Networks
- A Machine Learning Approach to Predict Biological Age and its Longitudinal Drivers
- $\mu$-Parametrization for Mixture of Experts
- TriForecaster: A Mixture of Experts Framework for Multi-Region Electric Load Forecasting with Tri-dimensional Specialization
- Bayesian autoregression to optimize temporal Matérn kernel Gaussian process hyperparameters
- Feature Impact Analysis on Top Long-Jump Performances with Quantile Random Forest and Explainable AI Techniques
- RankList -- A Listwise Preference Learning Framework for Predicting Subjective Preferences
- FedShard: Federated Unlearning with Efficiency Fairness and Performance Fairness
- Modern Neural Networks for Small Tabular Datasets: The New Default for Field-Scale Digital Soil Mapping?
- Prototype-Guided Diffusion: Visual Conditioning without External Memory
- Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models
- Dynamic Mixture-of-Experts for Incremental Graph Learning
- RadioMamba: Breaking the Accuracy-Efficiency Trade-off in Radio Map Construction via a Hybrid Mamba-UNet
- GANime: Generating Anime and Manga Character Drawings from Sketches with Deep Learning
- Exploring Molecular Odor Taxonomies for Structure-based Odor Predictions using Machine Learning
- Objective Soups: Multilingual Multi-Task Modeling for Speech Processing
- Forecasting Binary Economic Events in Modern Mercantilism: Traditional methodologies coupled with PCA and K-means Quantitative Analysis of Qualitative Sentimental Data
- Harnessing Input-Adaptive Inference for Efficient VLN
- A Generative Imputation Method for Multimodal Alzheimer's Disease Diagnosis
- Teaching Code Refactoring Using LLMs
- Classifying Cool Dwarfs: Comprehensive Spectral Typing of Field and Peculiar Dwarfs Using Machine Learning
- ProMode: A Speech Prosody Model Conditioned on Acoustic and Textual Inputs
- A pseudo-inverse of a line graph
- HyperKD: Distilling Cross-Spectral Knowledge in Masked Autoencoders via Inverse Domain Shift with Spatial-Aware Masking and Specialized Loss
- CWFBind: Geometry-Awareness for Fast and Accurate Protein-Ligand Docking
- DeepWKB: Learning WKB Expansions of Invariant Distributions for Stochastic Systems
- Emergence of Hierarchies in Multi-Agent Self-Organizing Systems Pursuing a Joint Objective
- HierMoE: Accelerating MoE Training with Hierarchical Token Deduplication and Expert Swap
- Scalable h-adaptive probabilistic solver for time-independent and time-dependent systems
- Personalized Product Search Ranking: A Multi-Task Learning Approach with Tabular and Non-Tabular Data
- Improving Diversity in Language Models: When Temperature Fails, Change the Loss
- Social-Sensor Identity Cloning Detection Using Weakly Supervised Deep Forest and Cryptographic Authentication
- DeputyDev -- AI Powered Developer Assistant: Breaking the Code Review Logjam through Contextual AI to Boost Developer Productivity
- NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation
- Multimodal Sheaf-based Network for Glioblastoma Molecular Subtype Prediction
- Structured Kernel Regression VAE: A Computationally Efficient Surrogate for GP-VAEs in ICA
- Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning
- Improving the Speaker Anonymization Evaluation's Robustness to Target Speakers with Adversarial Learning
- On the Generalization Limits of Quantum Generative Adversarial Networks with Pure State Generators
- Stable Diffusion Models are Secretly Good at Visual In-Context Learning
- Neural Bandit Based Optimal LLM Selection for a Pipeline of Tasks
- Story2Board: A Training-Free Approach for Expressive Storyboard Generation
- Forecasting steam mass flow in power plants using the parallel hybrid network
- Semi-Bandit Learning for Monotone Stochastic Optimization
- Discrete Neural Algorithmic Reasoning
- No-Regret M${}^{\natural}$-Concave Function Maximization: Stochastic Bandit Algorithms and Hardness of Adversarial Full-Information Setting
- Sparse Spectral Training and Inference on Euclidean and Hyperbolic Neural Networks
- Distributed Lag Transformer based on Time-Variable-Aware Learning for Explainable Multivariate Time Series Forecasting
- Federated Learning for Smart Grid: A Survey on Applications and Potential Vulnerabilities
- Differentiation Through Black-Box Quadratic Programming Solvers
- Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning
- Generative Feature Training of Thin 2-Layer Networks
- Scalable Out-of-distribution Robustness in the Presence of Unobserved Confounders
- Indirect Query Bayesian Optimization with Integrated Feedback
- MVICAD2: Multi-View Independent Component Analysis with Delays and Dilations
- Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models
- Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity
- LEAPS: A discrete neural sampler via locally equivariant networks
- Fast, Accurate Manifold Denoising by Tunneling Riemannian Optimization
- Underdamped Diffusion Bridges with Applications to Sampling
- Dequantified Diffusion-Schrödinger Bridge for Density Ratio Estimation
- Leveraging Audio and Text Modalities in Mental Health: A Study of LLMs Performance
- SLTNet: Efficient Event-based Semantic Segmentation with Spike-driven Lightweight Transformer-based Networks
- Evaluation of Bio-Inspired Models under Different Learning Settings For Energy Efficiency in Network Traffic Prediction
- Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions
- GenAI Confessions: Black-box Membership Inference for Generative Image Models
- Benchmarking LLMs' Mathematical Reasoning with Unseen Random Variables Questions
- Conformal Prediction of Classifiers with Many Classes based on Noisy Labels
- One-shot Optimized Steering Vectors Mediate Safety-relevant Behaviors in LLMs
- RIZE: Regularized Imitation Learning via Distributional Reinforcement Learning
- Simulating the Real World: A Unified Survey of Multimodal Generative Models
- Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs
- The Illusory Normativity of Rights-Based AI Regulation
- FT-Transformer: Resilient and Reliable Transformer with End-to-End Fault Tolerant Attention
- CO-Bench: Benchmarking Language Model Agents in Algorithm Search for Combinatorial Optimization
- Mosaic: Composite Projection Pruning for Resource-efficient LLMs
- GraspClutter6D: A Large-scale Real-world Dataset for Robust Perception and Grasping in Cluttered Scenes
- AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation
- FedRecon: Missing Modality Reconstruction in Heterogeneous Distributed Environments
- EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting
- Deep Learning Warm Starts for Trajectory Optimization on the International Space Station
- Halting Recurrent GNNs and the Graded $\mu$-Calculus
- Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning
- MapStory: Prototyping Editable Map Animations with LLM Agents
- Exploring Scaling Laws for EHR Foundation Models
- Sarc7: Evaluating Sarcasm Detection and Generation with Seven Types and Emotion-Informed Techniques
- Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning
- Deep Learning Model Acceleration and Optimization Strategies for Real-Time Recommendation Systems
- MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection
- Poison Once, Control Anywhere: Clean-Text Visual Backdoors in VLM-based Mobile Agents
- Open-Set LiDAR Panoptic Segmentation Guided by Uncertainty-Aware Learning
- HVL: Semi-Supervised Segmentation leveraging Hierarchical Vision-Language Synergy with Dynamic Text-Spatial Query Alignment
- Human Motion Capture from Loose and Sparse Inertial Sensors with Garment-aware Diffusion Models
- The Importance of Being Lazy: Scaling Limits of Continual Learning
- SWA-SOP: Spatially-aware Window Attention for Semantic Occupancy Prediction in Autonomous Driving
- OC-SOP: Enhancing Vision-Based 3D Semantic Occupancy Prediction by Object-Centric Awareness
- Beyond Autocomplete: Designing CopilotLens Towards Transparent and Explainable AI Coding Agents
- Audio-3DVG: Unified Audio -- Point Cloud Fusion for 3D Visual Grounding
- WebArXiv: Evaluating Multimodal Agents on Time-Invariant arXiv Tasks
- GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
- A multi-strategy improved snake optimizer for three-dimensional UAV path planning and engineering problems
- MoLAN: A Unified Modality-Aware Noise Dynamic Editing Framework for Multimodal Sentiment Analysis
- Presenting DiaData for Research on Type 1 Diabetes
- An Unsupervised Deep XAI Framework for Localization of Concurrent Replay Attacks in Nuclear Reactor Signals
- Generating Feasible and Diverse Synthetic Populations Using Diffusion Models
- Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images
- SVGen: Interpretable Vector Graphics Generation with Large Language Models
- Breath as a biomarker: A survey of contact and contactless applications and approaches in respiratory monitoring
- Building Safer Sites: A Large-Scale Multi-Level Dataset for Construction Safety Research
- The First Differentiable Transfer-Based Algorithm for Discrete MicroLED Repair
- Blockchain Network Analysis using Quantum Inspired Graph Neural Networks & Ensemble Models
- LLM Empowered Prototype Learning for Zero and Few-Shot Tasks on Tabular Data
- Over-Squashing in GNNs and Causal Inference of Rewiring Strategies
- Constrained Black-Box Attacks Against Multi-Agent Reinforcement Learning
- Pattern-based Knowledge Component Extraction from Student Code Using Representation Learning
- Distilling Reinforcement Learning into Single-Batch Datasets
- Resurrecting the Salmon: Rethinking Mechanistic Interpretability with Domain-Specific Sparse Autoencoders
- Integrating Feature Attention and Temporal Modeling for Collaborative Financial Risk Assessment
- Graph Neural Network and Transformer Integration for Unsupervised System Anomaly Discovery
- NEXICA: Discovering Road Traffic Causality (Extended arXiv Version)
- Open-Set Fault Diagnosis in Multimode Processes via Fine-Grained Deep Feature Representation
- Learn to Explore: Meta NAS via Bayesian Optimization Guided Graph Generation
- EGGS-PTP: An Expander-Graph Guided Structured Post-training Pruning Method for Large Language Models
- Multi-Step Reasoning with Large Language Models, a Survey
- Probing Mechanical Reasoning in Large Vision Language Models
- Revisiting Your Memory: Reconstruction of Affect-Contextualized Memory via EEG-guided Audiovisual Generation
- Analyzing Finetuning Representation Shift for Multimodal LLMs Steering
- MedRep: Medical Concept Representation for General Electronic Health Record Foundation Models
- Integrating Visual Interpretation and Linguistic Reasoning for Math Problem Solving
- GridRoute: A Benchmark for LLM-Based Route Planning with Cardinal Movement in Grid Environments
- AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving
- MoSE: Skill-by-Skill Mixture-of-Experts Learning for Embodied Autonomous Machines
- Game-Theoretic Multiagent Reinforcement Learning
- LEAVES: Learning Views for Time-Series Biobehavioral Data in Contrastive Learning
- Learning to Defer in Congested Systems: The AI-Human Interplay
- From Model Performance to Claim: How a Change of Focus in Machine Learning Replicability Can Help Bridge the Responsibility Gap
- Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning Code LLMs
- Towards Black-Box Membership Inference Attack for Diffusion Models
- LUMA: A Benchmark Dataset for Learning from Uncertain and Multimodal Data
- Integrating Clinical Knowledge Graphs and Gradient-Based Neural Systems for Enhanced Melanoma Diagnosis via the 7-Point Checklist
- Towards flexible perception with visual memory
- SpectralEarth: Training Hyperspectral Foundation Models at Scale
- Explaining Caption-Image Interactions in CLIP Models with Second-Order Attributions
- CTRQNets & LQNets: Continuous Time Recurrent and Liquid Quantum Neural Networks
- Pediatric brain tumor classification using digital histopathology and deep learning: evaluation of SOTA methods on a multi-center Swedish cohort
- Episodic Memory Verbalization using Hierarchical Representations of Life-Long Robot Experience
- Downscaling Extreme Precipitation with Wasserstein Regularized Diffusion
- Retrieval-Augmented Decision Transformer: External Memory for In-context RL
- Depth-Guided Self-Supervised Human Keypoint Detection via Cross-Modal Distillation
- Learning Characteristics of Reverse Quaternion Neural Network
- What Can We Learn from Inter-Annotator Variability in Skin Lesion Segmentation?
- X-UniMotion: Animating Human Images with Expressive, Unified and Identity-Agnostic Motion Latents
- Understanding Dementia Speech Alignment with Diffusion-Based Image Generation
- RampNet: A Two-Stage Pipeline for Bootstrapping Curb Ramp Detection in Streetscape Images from Open Government Metadata
- Domain-Generalization to Improve Learning in Meta-Learning Algorithms
- Implicit Hypergraph Neural Networks: A Stable Framework for Higher-Order Relational Learning with Provable Guarantees
- What-Meets-Where: Unified Learning of Action and Contact Localization in a New Dataset
- Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference
- A Unified Contrastive-Generative Framework for Time Series Classification
- Hallucination vs interpretation: rethinking accuracy and precision in AI-assisted data extraction for knowledge synthesis
- RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization
- Gen-AFFECT: Generation of Avatar Fine-grained Facial Expressions with Consistent identiTy
- DeepFeatIoT: Unifying Deep Learned, Randomized, and LLM Features for Enhanced IoT Time Series Sensor Data Classification in Smart Industries
- NeuronTune: Fine-Grained Neuron Modulation for Balanced Safety-Utility Alignment in LLMs
- Episodic Memory Representation for Long-form Video Understanding
- Large-Small Model Collaborative Framework for Federated Continual Learning
- Learning Facts at Scale with Active Reading
- From Ranking to Selection: A Simple but Efficient Dynamic Passage Selector for Retrieval Augmented Generation
- Verify Distributed Deep Learning Model Implementation Refinement with Iterative Relation Inference
- SMART-OC: A Real-time Time-risk Optimal Replanning Algorithm for Dynamic Obstacles and Spatio-temporally Varying Currents
- COMPEER: Controllable Empathetic Reinforcement Reasoning for Emotional Support Conversation
- Generation of Indian Sign Language Letters, Numbers, and Words
- Decentralized Rank Scheduling for Energy-Constrained Multi-Task Federated Fine-Tuning in Edge-Assisted IoV Networks
- COXNet: Cross-Layer Fusion with Adaptive Alignment and Scale Integration for RGBT Tiny Object Detection
- AI Blob! LLM-Driven Recontextualization of Italian Television Archives
- Your Coding Intent is Secretly in the Context and You Should Deliberately Infer It Before Completion
- GoViG: Goal-Conditioned Visual Navigation Instruction Generation
- CaRoBio: 3D Cable Routing with a Bio-inspired Gripper Fingernail
- Hierarchical Brain Structure Modeling for Predicting Genotype of Glioma
- A Lightweight Learned Cardinality Estimation Model
- How Persuasive Could LLMs Be? A First Study Combining Linguistic-Rhetorical Analysis and User Experiments
- MInDI-3D: Iterative Deep Learning in 3D for Sparse-view Cone Beam Computed Tomography
- Interpretable Robot Control via Structured Behavior Trees and Large Language Models
- Goal Discovery with Causal Capacity for Efficient Reinforcement Learning
- TimeMKG: Knowledge-Infused Causal Reasoning for Multivariate Time Series Modeling
- AmbiGraph-Eval: Can LLMs Effectively Handle Ambiguous Graph Queries?
- Preacher: Paper-to-Video Agentic System
- A Close Reading Approach to Gender Narrative Biases in AI-Generated Stories
- Demystifying the Role of Rule-based Detection in AI Systems for Windows Malware Detection
- On Negative-aware Preference Optimization for Recommendation
- Anomaly Detection for IoT Global Connectivity
- Surg-InvNeRF: Invertible NeRF for 3D tracking and reconstruction in surgical vision
- Evaluating the Role of Large Language Models in Legal Practice in India
- Improving ARDS Diagnosis Through Context-Aware Concept Bottleneck Models
- Region-to-Region: Enhancing Generative Image Harmonization with Adaptive Regional Injection
- NEUBORN: The Neurodevelopmental Evolution framework Using BiOmechanical RemodelliNg
- Enhance the machine learning algorithm performance in phishing detection with keyword features
- Counting Short Trajectories in Elementary Cellular Automata using the Transfer Matrix Method
- Can LLM-Generated Textual Explanations Enhance Model Classification Performance? An Empirical Study
- Combinative Matching for Geometric Shape Assembly
- Adoption of Explainable Natural Language Processing: Perspectives from Industry and Academia on Practices and Challenges
- Prototype Training with Dual Pseudo-Inverse and Optimized Hidden Activations
- LibRec: Benchmarking Retrieval-Augmented LLMs for Library Migration Recommendations
- Explainable Ensemble Learning for Graph-Based Malware Detection
- Automated Segmentation of Coronal Brain Tissue Slabs for 3D Neuropathology
- A Comprehensive Survey of Datasets for Clinical Mental Health AI Systems
- TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos
- Provable In-Context Vector Arithmetic via Retrieving Task Concepts
- RayletDF: Raylet Distance Fields for Generalizable 3D Surface Reconstruction from Point Clouds or Gaussians
- Exploring the Potential of Large Language Models in Fine-Grained Review Comment Classification
- Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
- PRELUDE: A Benchmark Designed to Require Global Comprehension and Reasoning over Long Contexts
- Perceptual Reality Transformer: Neural Architectures for Simulating Neurological Perception Conditions
- STREAM (ChemBio): A Standard for Transparently Reporting Evaluations in AI Model Reports
- Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
- Beyond Scaling Law: A Data-Efficient Distillation Framework for Reasoning
- COME: Dual Structure-Semantic Learning with Collaborative MoE for Universal Lesion Detection Across Heterogeneous Ultrasound Datasets
- Rare anomalies require large datasets: About proving the existence of anomalies
- Beyond Naïve Prompting: Strategies for Improved Zero-shot Context-aided Forecasting with LLMs
- T-CACE: A Time-Conditioned Autoregressive Contrast Enhancement Multi-Task Framework for Contrast-Free Liver MRI Synthesis, Segmentation, and Diagnosis
- Residual Reservoir Memory Networks
- A Comprehensive Evaluation framework of Alignment Techniques for LLMs
- VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models
- Specialised or Generic? Tokenization Choices for Radiology Language Models
- GBC: Generalized Behavior-Cloning Framework for Whole-Body Humanoid Imitation
- January Food Benchmark (JFB): A Public Benchmark Dataset and Evaluation Suite for Multimodal Food Analysis
- Vision-driven River Following of UAV via Safe Reinforcement Learning using Semantic Dynamics Model
- Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation
- Value Function Initialization for Knowledge Transfer and Jump-start in Deep Reinforcement Learning
- The Othello AI Arena: Evaluating Intelligent Systems Through Limited-Time Adaptation to Unseen Boards
- An Automated Multi-Modal Evaluation Framework for Mobile Intelligent Assistants
- EvoCurr: Self-evolving Curriculum with Behavior Code Generation for Complex Decision-making
- UbiQTree: Uncertainty Quantification in XAI with Tree Ensembles
- MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement
- UDA: Unsupervised Debiasing Alignment for Pair-wise LLM-as-a-Judge
- The PacifAIst Benchmark: Would an Artificial Intelligence Choose to Sacrifice Itself for Human Safety?
- Reasoning About Knowledge on Regular Expressions is 2EXPTIME-complete
- Human-Aligned Procedural Level Generation Reinforcement Learning via Text-Level-Sketch Shared Representation
- AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving
- RAGulating Compliance: A Multi-Agent Knowledge Graph for Regulatory QA
- Mathematical Computation and Reasoning Errors by Large Language Models
- QuickGrasp: Lightweight Antipodal Grasp Planning with Point Clouds
- User-Intent-Driven Semantic Communication via Adaptive Deep Understanding
- Bayesian-Driven Graph Reasoning for Active Radio Map Construction
- Efficient Real-Time Aircraft ETA Prediction via Feature Tokenization Transformer
- To Theoretically Understand Transformer-Based In-Context Learning for Optimizing CSMA
- Agentic TinyML for Intent-aware Handover in 6G Wireless Networks
- Motif 2.6B Technical Report
- 5G Core Fault Detection and Root Cause Analysis using Machine Learning and Generative AI
- JustDense: Just using Dense instead of Sequence Mixer for Time Series analysis
- Peer Effect Estimation in the Presence of Simultaneous Feedback and Unobserved Confounders
- A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models
- Physics-Constrained Fine-Tuning of Flow-Matching Models for Generation and Inverse Problems
- EvaDrive: Evolutionary Adversarial Policy Optimization for End-to-End Autonomous Driving
- Agoran: An Agentic Open Marketplace for 6G RAN Automation
- Physics-Guided Memory Network for Building Energy Modeling
- Energy-Efficient Stochastic Computing (SC) Neural Networks for Internet of Things Devices With Layer-Wise Adjustable Sequence Length (ASL)
- Multimodal RAG Enhanced Visual Description
- webMCP: Efficient AI-Native Client-Side Interaction for Agent-Ready Web Design
- FedMP: Tackling Medical Feature Heterogeneity in Federated Learning from a Manifold Perspective
- A Context-aware Attention and Graph Neural Network-based Multimodal Framework for Misogyny Detection
- DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic
- Generative Artificial Intelligence in Medical Imaging: Foundations, Progress, and Clinical Translation
- IAD-R1: Reinforcing Consistent Reasoning in Industrial Anomaly Detection
- scAGC: Learning Adaptive Cell Graphs with Contrastive Guidance for Single-Cell Clustering
- Long-Term Client Selection for Federated Learning with Non-IID Data: A Truthful Auction Approach
- Quantum-Efficient Reinforcement Learning Solutions for Last-Mile On-Demand Delivery
- HiSTM: Hierarchical Spatiotemporal Mamba for Cellular Traffic Forecasting
- A Neurosymbolic Framework for Interpretable Cognitive Attack Detection in Augmented Reality
- RL-MoE: An Image-Based Privacy Preserving Approach In Intelligent Transportation System
- Hybrid(Transformer+CNN)-based Polyp Segmentation
- Fine-Grained Safety Neurons with Training-Free Continual Projection to Reduce LLM Fine Tuning Risks
- From Values to Tokens: An LLM-Driven Framework for Context-aware Time Series Forecasting via Symbolic Discretization
- Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing
- Multi-Objective Instruction-Aware Representation Learning in Procedural Content Generation RL
- Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments
- impuTMAE: Multi-modal Transformer with Masked Pre-training for Missing Modalities Imputation in Cancer Survival Prediction
- FIVA: Federated Inverse Variance Averaging for Universal CT Segmentation with Uncertainty Estimation
- MX-AI: Agentic Observability and Control Platform for Open and AI-RAN
- ADT4Coupons: An Innovative Framework for Sequential Coupon Distribution in E-commerce
- Δ-AttnMask: Attention-Guided Masked Hidden States for Efficient Data Selection and Augmentation
- Zero-shot self-supervised learning of single breath-hold magnetic resonance cholangiopancreatography (MRCP) reconstruction
- Learning to Detect Unknown Jailbreak Attacks in Large Vision-Language Models: A Unified and Accurate Approach
- Personalized Feature Translation for Expression Recognition: An Efficient Source-Free Domain Adaptation Method
- MoQE: Improve Quantization Model performance via Mixture of Quantization Experts
- From Explainable to Explained AI: Ideas for Falsifying and Quantifying Explanations
- CoMoE: Collaborative Optimization of Expert Aggregation and Offloading for MoE-based LLMs at Edge
- Quantum-Enhanced Generative Adversarial Networks: Comparative Analysis of Classical and Hybrid Quantum-Classical Generative Adversarial Networks
- MME-Emotion: A Holistic Evaluation Benchmark for Emotional Intelligence in Multimodal Large Language Models
- Deep Generative Models for Discrete Genotype Simulation
- Real-time deep learning phase imaging flow cytometer reveals blood cell aggregate biomarkers for haematology diagnostics
- Towards Effective MLLM Jailbreaking Through Balanced On-Topicness and OOD-Intensity
- Understanding Ethical Practices in AI: Insights from a Cross-Role, Cross-Region Survey of AI Development Teams
- Towards Scalable Training for Handwritten Mathematical Expression Recognition
- Hierarchical Adaptive networks with Task vectors for Test-Time Adaptation
- From Hard Refusals to Safe-Completions: Toward Output-Centric Safety Training
- AMRG: Extend Vision Language Models for Automatic Mammography Report Generation
- GSMT: Graph Fusion and Spatiotemporal Task Correction for Multi-Bus Trajectory Prediction
- Cluster Topology-Driven Placement of Experts Reduces Network Traffic in MoE Inference
- Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems
- Beyond Technocratic XAI: The Who, What & How in Explanation Design
- PETLP: A Privacy-by-Design Pipeline for Social Media Data in AI Research
- Gradient-Direction-Aware Density Control for 3D Gaussian Splatting
- NEFMind: Parameter-Efficient Fine-Tuning of Open-Source LLMs for Telecom APIs Automation
- Cross-BCI, A Cross-BCI-Paradigm Classification Model Towards Universal BCI Applications
- Detection of Odor Presence via Deep Neural Networks
- Can AI Keep a Secret? Contextual Integrity Verification: A Provable Security Architecture for LLMs
- Ethical Medical Image Synthesis
- Fake-Mamba: Real-Time Speech Deepfake Detection Using Bidirectional Mamba as Self-Attention's Alternative
- Based AI improves human decision-making but reduces trust
- Decentralized Weather Forecasting via Distributed Machine Learning and Blockchain-Based Model Validation
- ParallelSearch: Train your LLMs to Decompose Query and Search Sub-queries in Parallel with Reinforcement Learning
- TPTP World Infrastructure for Non-classical Logics
- Exact Verification of Graph Neural Networks with Incremental Constraint Solving
- Leveraging Large Language Models for Rare Disease Named Entity Recognition
- TEN: Table Explicitization, Neurosymbolically
- SegDAC: Segmentation-Driven Actor-Critic for Visual Reinforcement Learning
- Synaptic Pruning: A Biological Inspiration for Deep Learning Regularization
- RicciFlowRec: A Geometric Root Cause Recommender Using Ricci Curvature on Financial Graphs
- Collective dynamics of strategic classification
- The Human-AI Hybrid Delphi Model: A Structured Framework for Context-Rich, Expert Consensus in Complex Domains
- FusionEnsemble-Net: An Attention-Based Ensemble of Spatiotemporal Networks for Multimodal Sign Language Recognition
- A Signer-Invariant Conformer and Multi-Scale Fusion Transformer for Continuous Sign Language Recognition
- APIO: Automatic Prompt Induction and Optimization for Grammatical Error Correction and Text Simplification
Research Sources: 530 | Generated: 8/25/2025