AI RESEARCH PAPERS & ACADEMIC SOURCES
- Informed protein–ligand docking via geodesic guidance in translational, rotational and torsional spaces
- AI helps assemble ‘brain’ of future quantum computer
- Integrating CNN and transformer architectures for superior Arabic printed and handwriting characters classification
- A mind-reading brain implant that comes with password protection
- Curated CYP450 Interaction Dataset: Covering the Majority of Phase I Drug Metabolism
- Development of a deep learning algorithm for radiographic detection of syndesmotic instability in ankle fractures with intraoperative validation
- Automated insect detection and biomass monitoring via AI and electrical field sensor technology
- HIBRID: histology-based risk-stratification with deep learning and ctDNA in colorectal cancer
- Transparent artificial intelligence-enabled interpretable and interactive sleep apnea assessment across flexible monitoring scenarios
- Expert evaluation of ChatGPT accuracy and reliability for basic celiac disease frequently asked questions
- Scan-rescan reliability assessment of brain volumetric analysis across scanners and software solutions
- Collaborative Mean Estimation Among Heterogeneous Strategic Agents: Individual Rationality, Fairness, and Truthful Contribution
- Evaluation of Speech Foundation Models for ASR on Child-Adult Conversations in Autism Diagnostic Sessions
- Combining Machine Learning Defenses without Conflicts
- Using machine learning to inform harvest control rule design in complex fishery settings
- Bi-Sparse Unsupervised Feature Selection
- TAR: Teacher-Aligned Representations via Contrastive Learning for Quadrupedal Locomotion
- Neuronal correlations shape the scaling behavior of memory capacity and nonlinear computational capability of reservoir recurrent neural networks
- Sequential QCQP for Bilevel Optimization with Line Search
- Competitive Algorithms for Multi-Agent Ski-Rental Problems
- GraphFedMIG: Tackling Class Imbalance in Federated Graph Learning via Mutual Information-Guided Generation
- EDAPT: Towards Calibration-Free BCIs with Continual Online Adaptation
- Learning State-Space Models of Dynamic Systems from Arbitrary Data using Joint Embedding Predictive Architectures
- Nonlocal Monte Carlo via Reinforcement Learning
- Projected Coupled Diffusion for Test-Time Constrained Joint Generation
- Driving Accurate Allergen Prediction with Protein Language Models and Generalization-Focused Evaluation
- Technical Report: Facilitating the Adoption of Causal Inference Methods Through LLM-Empowered Co-Pilot
- GNN-based Unified Deep Learning
- Self-Supervised Temporal Super-Resolution of Energy Data using Generative Adversarial Transformer
- Oops!... They Stole it Again: Attacks on Split Learning
- Variance Reduced Policy Gradient Method for Multi-Objective Reinforcement Learning
- Beyond Random Sampling: Instance Quality-Based Data Partitioning via Item Response Theory
- Energy-Based Models for Predicting Mutational Effects on Proteins
- Conditional Information Bottleneck for Multimodal Fusion: Overcoming Shortcut Learning in Sarcasm Detection
- Graph Learning via Logic-Based Weisfeiler-Leman Variants and Tabularization
- IBEX: Information-Bottleneck-EXplored Coarse-to-Fine Molecular Generation under Limited Data
- Non-Stationary Restless Multi-Armed Bandits with Provable Guarantee
- SoK: Data Minimization in Machine Learning
- Efficiently Verifiable Proofs of Data Attribution
- A Dataset for Distilling Knowledge Priors from Literature for Therapeutic Design
- Whisper Smarter, not Harder: Adversarial Attack on Partial Suppression
- zERExtractor: An Automated Platform for Enzyme-Catalyzed Reaction Data Extraction from Scientific Literature
- Neural Network-Based Detection and Multi-Class Classification of FDI Attacks in Smart Grid Home Energy Systems
- Dynamical Alignment: A Principle for Adaptive Neural Computation
- Next Edit Prediction: Learning to Predict Code Edits from Context and Interaction History
- In silico study on the cytotoxicity against HeLa cancer cells of xanthones bioactive compounds from Garcinia cowa: QSAR based on Graph Deep Learning, Network Pharmacology, and Molecular Docking
- Machine Learning for Cloud Detection in IASI Measurements: A Data-Driven SVM Approach with Physical Constraints
- Pre-trained Transformer-models using chronic invasive electrophysiology for symptom decoding without patient-individual training
- Estimating carbon pools in the shelf sea environment: reanalysis or model-informed machine learning?
- Flexible Personalized Split Federated Learning for On-Device Fine-Tuning of Foundation Models
- Clicks Versus Conversion: Choosing a Recommender's Training Objective in E-Commerce
- Efficient Methods for Accurate Sparse Trajectory Recovery and Map Matching
- Virtual Sensing for Solder Layer Degradation and Temperature Monitoring in IGBT Modules
- Mitigating Exponential Mixed Frequency Growth through Frequency Selection and Dimensional Separation in Quantum Machine Learning
- Physics-Informed Deep Contrast Source Inversion: A Unified Framework for Inverse Scattering Problems
- Reproducible Physiological Features in Affective Computing: A Preliminary Analysis on Arousal Modeling
- Advancing Autonomous Incident Response: Leveraging LLMs and Cyber Threat Intelligence
- Symmetry-Constrained Multi-Scale Physics-Informed Neural Networks for Graphene Electronic Band Structure Prediction
- Memorisation and forgetting in a learning Hopfield neural network: bifurcation mechanisms, attractors and basins
- Parity Cross-Resonance: A Multiqubit Gate
- Accelerating exoplanet climate modelling: A machine learning approach to complement 3D GCM grid simulations
- Performance of universal machine-learned potentials with explicit long-range interactions in biomolecular simulations
- CrossDenoise: Denoising Implicit Feedback via a Lightweight Entity-Aware Synergistic Framework
- Learning to Schedule in Parallel-Server Queues with Stochastic Bilinear Rewards
- Clipping Improves Adam-Norm and AdaGrad-Norm when the Noise Is Heavy-Tailed
- HGAurban: Heterogeneous Graph Autoencoding for Urban Spatial-Temporal Learning
- FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training
- Efficient Distributed Optimization under Heavy-Tailed Noise
- A Market for Accuracy: Classification under Competition
- Learning Classifiers That Induce Markets
- From Actions to Words: Towards Abstractive-Textual Policy Summarization in RL
- Rethinking Client-oriented Federated Graph Learning
- Fast Convergence for High-Order ODE Solvers in Diffusion Probabilistic Models
- Transferable Parasitic Estimation via Graph Contrastive Learning and Label Rebalancing in AMS Circuits
- Diversifying Policy Behaviors with Extrinsic Behavioral Curiosity
- DiRW: Path-Aware Digraph Learning for Heterophily
- Multi-objective Optimization in CPU Design Space Exploration: Attention is All You Need
- A Training-Free Approach for Music Style Transfer with Latent Diffusion Models
- Interpretable Neural ODEs for Gene Regulatory Network Discovery under Perturbations
- CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization
- Rollout Roulette: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods
- Delayed Feedback Modeling with Influence Functions
- Rhythmic sharing: A bio-inspired paradigm for zero-shot adaptive learning in neural networks
- Boosting Cross-problem Generalization in Diffusion-Based Neural Combinatorial Solver via Inference Time Adaptation
- Advancing MAPF towards the Real World: A Scalable Multi-Agent Realistic Testbed (SMART)
- VectorFit: Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models
- FinSage: A Multi-aspect RAG System for Financial Filings Question Answering
- Goal-Oriented Time-Series Forecasting: Foundation Framework Design
- Adaptive Budgeted Multi-Armed Bandits for IoT with Dynamic Resource Constraints
- Unraveling the iterative CHAD
- Is Quantum Optimization Ready? An Effort Towards Neural Network Compression using Adiabatic Quantum Computing
- Security Concerns for Large Language Models: A Survey
- PromptTSS: A Prompting-Based Approach for Interactive Multi-Granularity Time Series Segmentation
- Discrepancy-Aware Graph Mask Auto-Encoder
- AmpLyze: A Deep Learning Model for Predicting the Hemolytic Concentration
- Class-Proportional Coreset Selection for Difficulty-Separable Data
- A Personalized Exercise Assistant using Reinforcement Learning (PEARL): Results from a four-arm Randomized-controlled Trial
- Measuring Time Series Forecast Stability for Demand Planning
- Constrained Decoding of Diffusion LLMs with Context-Free Grammars
- Characterizing Evolution in Expectation-Maximization Estimates for Overspecified Mixed Linear Regression
- Benchmark-Driven Selection of AI: Evidence from DeepSeek-R1
- Interpretable Machine Learning Model for Early Prediction of Acute Kidney Injury in Critically Ill Patients with Cirrhosis: A Retrospective Study
- Can Transformers Break Encryption Schemes via In-Context Learning?
- Pruning and Malicious Injection: A Retraining-Free Backdoor Attack on Transformer Models
- Convergence Analysis of Max-Min Exponential Neural Network Operators in Orlicz Space
- Multi-Agent Reinforcement Learning for Adaptive Resource Orchestration in Cloud-Native Clusters
- Federated Anomaly Detection for Multi-Tenant Cloud Platforms with Personalized Modeling
- Source Component Shift Adaptation via Offline Decomposition and Online Mixing Approach
- A Hierarchical IDS for Zero-Day Attack Detection in Internet of Medical Things Networks
- Semantic Communication with Distribution Learning through Sequential Observations
- A Unified Evaluation Framework for Multi-Annotator Tendency Learning
- XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization
- SC2Arena and StarEvolve: Benchmark and Self-Improvement Framework for LLMs in Complex Decision-Making Tasks
- Estimating Covariance for Global Minimum Variance Portfolio: A Decision-Focused Learning Approach
- The SET Perceptual Factors Framework: Towards Assured Perception for Autonomous Systems
- A Multimodal Neural Network for Recognizing Subjective Self-Disclosure Towards Social Robots
- TLE-Based A2C Agent for Terrestrial Coverage Orbital Path Planning
- Empirical Investigation into Configuring Echo State Networks for Representative Benchmark Problem Domains
- Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval
- A Random-Key Optimizer for Combinatorial Optimization
- FAIRGAME: a Framework for AI Agents Bias Recognition using Game Theory
- Communication Cost Reduction for Subgraph Counting under Local Differential Privacy via Hash Functions
- An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach
- Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning
- Advancing Data Equity: Practitioner Responsibility and Accountability in NLP Data Practices
- Less is More: Learning Graph Tasks with Just LLMs
- rETF-semiSL: Semi-Supervised Learning for Neural Collapse in Temporal Data
- Out-of-Distribution Detection using Counterfactual Distance
- CATNet: A geometric deep learning approach for CAT bond spread prediction in the primary market
- An Explainable AI based approach for Monitoring Animal Health
- No Free Lunch from Audio Pretraining in Bioacoustics: A Benchmark Study of Embeddings
- Facilitating Longitudinal Interaction Studies of AI Systems
- A Vision-Language Pre-training Model-Guided Approach for Mitigating Backdoor Attacks in Federated Learning
- Layer-Wise Analysis of Self-Supervised Representations for Age and Gender Classification in Children's Speech
- Welfare-Centric Clustering
- eMamba: Efficient Acceleration Framework for Mamba Models in Edge Computing
- AnalogSeeker: An Open-source Foundation Language Model for Analog Circuit Design
- MCP2OSC: Parametric Control by Natural Language
- MASH: Cooperative-Heterogeneous Multi-Agent Reinforcement Learning for Single Humanoid Robot Locomotion
- Alternating Approach-Putt Models for Multi-Stage Speech Enhancement
- RealAC: A Domain-Agnostic Framework for Realistic and Actionable Counterfactual Explanations
- X-Node: Self-Explanation is All We Need
- Pinet: Optimizing hard-constrained neural networks with orthogonal projection layers
- Contrastive ECOC: Learning Output Codes for Adversarial Defense
- A Unified Multi-Agent Framework for Universal Multimodal Understanding and Generation
- Advances in Logic-Based Entity Resolution: Enhancing ASPEN with Local Merges and Optimality Criteria
- Fake Speech Wild: Detecting Deepfake Speech on Social Media Platform
- FreeGAD: A Training-Free yet Effective Approach for Graph Anomaly Detection
- SPHENIC: Topology-Informed Multi-View Clustering for Spatial Transcriptomics
- Deep Learning in Classical and Quantum Physics
- REFN: A Reinforcement-Learning-From-Network Framework against 1-day/n-day Exploitations
- Electromagnetic Simulations of Antennas on GPUs for Machine Learning Applications
- APFL: Analytic Personalized Federated Learning via Dual-Stream Least Squares
- Natively Trainable Sparse Attention for Hierarchical Point Cloud Datasets
- FROGENT: An End-to-End Full-process Drug Design Agent
- A Survey of Optimization Modeling Meets LLMs: Progress and Future Directions
- MCP-Orchestrated Multi-Agent System for Automated Disinformation Detection
- Agentic AI Frameworks: Architectures, Protocols, and Design Challenges
- Improving and Evaluating Open Deep Research Agents
- Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization
- KompeteAI: Accelerated Autonomous Multi-Agent System for End-to-End Pipeline Generation for Machine Learning Problems
- Extending the Entropic Potential of Events for Uncertainty Quantification and Decision-Making in Artificial Intelligence
- Why Cannot Large Language Models Ever Make True Correct Reasoning?
- Promoting Efficient Reasoning with Verifiable Stepwise Reward
- A Curriculum Learning Approach to Reinforcement Learning: Leveraging RAG for Multimodal Question Answering
- Multi-Agent Trust Region Policy Optimisation: A Joint Constraint Approach
- What to Ask Next? Probing the Imaginative Reasoning of LLMs with TurtleSoup Puzzles
- LeanRAG: Knowledge-Graph-Based Generation with Semantic Aggregation and Hierarchical Retrieval
- HiRef: Leveraging Hierarchical Ontology and Network Refinement for Robust Medication Recommendation
- FIRESPARQL: A LLM-based Framework for SPARQL Query Generation over Scholarly Knowledge Graphs
- SEQ-GPT: LLM-assisted Spatial Query via Example
- PASS: Probabilistic Agentic Supernet Sampling for Interpretable and Adaptive Chest X-Ray Reasoning
- MSRS: Adaptive Multi-Subspace Representation Steering for Attribute Alignment in Large Language Models
- STEP: Stepwise Curriculum Learning for Context-Knowledge Fusion in Conversational Recommendation
- GenOM: Ontology Matching with Description Generation and Large Language Model
- Scaling Up without Fading Out: Goal-Aware Sparse GNN for RL-based Generalized Planning
- Modeling Human Responses to Multimodal AI Content
- The Knowledge-Reasoning Dissociation: Fundamental Limitations of LLMs in Clinical Natural Language Inference
- Who Benefits from AI Explanations? Towards Accessible and Interpretable Systems
- OpenFPL: An open-source forecasting method rivaling state-of-the-art Fantasy Premier League services
- A Robust Pipeline for Differentially Private Federated Learning on Imbalanced Clinical Data using SMOTETomek and FedProx
- Cognitive Cybersecurity for Artificial Intelligence: Guardrail Engineering with CCS-7
- Jet Image Tagging Using Deep Learning: An Ensemble Model
- Certifiably robust malware detectors by design
- Multi-task Adversarial Attacks against Black-box Model with Few-shot Queries
- Exploring Content and Social Connections of Fake News with Explainable Text and Graph Learning
- FIDELIS: Blockchain-Enabled Protection Against Poisoning Attacks in Federated Learning
- Securing Agentic AI: Threat Modeling and Risk Analysis for Network Monitoring Agentic AI System
- Generative AI for Cybersecurity of Energy Management Systems: Methods, Challenges, and Future Directions
- SABIA: An AI-Powered Tool for Detecting Opioid-Related Behaviors on Social Media
- Legal Zero-Days: A Novel Risk Vector for Advanced AI Systems
- NetMoniAI: An Agentic AI Framework for Network Security & Monitoring
- Prediction-Powered Inference with Inverse Probability Weighting
- Mo' Memory, Mo' Problems: Stream-Native Machine Unlearning
- Dimension-Free Bounds for Generalized First-Order Methods via Gaussian Coupling
- An Iterative Algorithm for Differentially Private k-PCA with Adaptive Noise
- Conic Formulations of Transport Metrics for Unbalanced Measure Networks and Hypernetworks
- xRFM: Accurate, scalable, and interpretable feature learning models for tabular data
- Bayesian Models for Joint Selection of Features and Auto-Regressive Lags: Theory and Applications in Environmental and Financial Forecasting
- Comparison of D-Wave Quantum Annealing and Markov Chain Monte Carlo for Sampling from a Probability Distribution of a Restricted Boltzmann Machine
- The Conditional Regret-Capacity Theorem for Batch Universal Prediction
- Uncertainty-Aware Prediction of Parkinson's Disease Medication Needs: A Two-Stage Conformal Prediction Approach
- Online selective conformal inference: adaptive scores, convergence rate and optimality
- Unpacking the Implicit Norm Dynamics of Sharpness-Aware Minimization in Tensorized Models
- BKP: An R Package for Beta Kernel Process Modeling
- Confounding is a Pervasive Problem in Real World Recommender Systems
- Nonlinear filtering based on density approximation and deep BSDE prediction
- A Guide to Bayesian Optimization in Bioprocess Engineering
- MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control
- Enhancing Fairness in Autoencoders for Node-Level Graph Anomaly Detection
- Comparison of Data Reduction Criteria for Online Gaussian Processes
- Continuous Parallel Relaxation for Finding Diverse Solutions in Combinatorial Optimization Problems
- Hypothesis Spaces for Deep Learning
- A Geometric Unification of Distributionally Robust Covariance Estimators: Shrinking the Spectrum by Inflating the Ambiguity Set
- Online Distributional Regression
- Sharp Generalization for Nonparametric Regression in Interpolation Space by Over-Parameterized Neural Networks Trained with Preconditioned Gradient Descent and Early-Stopping
- A Two-Stage Learning-to-Defer Approach for Multi-Task Learning
- Adversarial Robustness in Two-Stage Learning-to-Defer: Algorithms and Guarantees
- Hyperflux: Pruning Reveals the Importance of Weights
- MIRRAMS: Learning Robust Tabular Models under Unseen Missingness Shifts
- Minimax Optimality in Contextual Dynamic Pricing with General Valuation Models
- A Parametric Contextual Online Learning Theory of Brokerage
- Neural Networks Generalize on Low Complexity Data
- Responsible Machine Learning via Mixed-Integer Optimization
- Identifying Causal Direction via Variational Bayesian Compression
- Reinforcement Learning with Random Time Horizons
- Optimistic critics can empower small actors
- MAP Estimation with Denoisers: Convergence Rates and Guarantees
- Improved Regularization and Robustness for Fine-tuning in Neural Networks
- Unifying Self-Supervised Clustering and Energy-Based Models
- Detection and Tracking of MAVs Using a Rosette Scanning Pattern LiDAR
- Tuning-Free Online Robust Principal Component Analysis through Implicit Regularization
- Visual SLAMMOT Considering Multiple Motion Models
- INSIGHT: Explainable Weakly-Supervised Medical Image Analysis
- Bootstrapping, Autonomous Testing, and Initialization System for Si/SiGe Multi-quantum Dot Devices
- Robotic Ultrasound-Guided Femoral Artery Reconstruction of Anatomically-Representative Phantoms
- EvRWKV: A Continuous Interactive RWKV Framework for Effective Event-Guided Low-Light Image Enhancement
- Reinforcement Learning meets Masked Video Modeling: Trajectory-Guided Adaptive Token Selection
- CCL-LGS: Contrastive Codebook Learning for 3D Language Gaussian Splatting
- Point or Line? Using Line-based Representation for Panoptic Symbol Spotting in CAD Drawings
- Data Pruning by Information Maximization
- Stepwise Decomposition and Dual-stream Focus: A Novel Approach for Training-free Camouflaged Object Segmentation
- Quantitative Comparison of Fine-Tuning Techniques for Pretrained Latent Diffusion Models in the Generation of Unseen SAR Images
- Semantic Structure-Aware Generative Attacks for Enhanced Adversarial Transferability
- VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory
- Deblurring in the Wild: A Real-World Dataset from Smartphone High-Speed Videos
- EXAONE Path 2.0: Pathology Foundation Model with End-to-End Supervision
- Common Data Properties Limit Object-Attribute Binding in CLIP
- M2DAO-Talker: Harmonizing Multi-granular Motion Decoupling and Alternating Optimization for Talking-head Generation
- Warehouse Spatial Question Answering with LLM Agent
- Hierarchical Cross-modal Prompt Learning for Vision-Language Models
- STAMP: Multi-pattern Attention-aware Multiple Instance Learning for STAS Diagnosis in Multi-center Histopathology Images
- TweezeEdit: Consistent and Efficient Image Editing with Path Regularization
- Multi-Sample Anti-Aliasing and Constrained Optimization for 3D Gaussian Splatting
- A Segmentation-driven Editing Method for Bolt Defect Augmentation and Detection
- EgoMusic-driven Human Dance Motion Estimation with Skeleton Mamba
- Reasoning in Computer Vision: Taxonomy, Models, Tasks, and Methodologies
- Med-GLIP: Advancing Medical Language-Image Pre-training with Large-scale Grounded Dataset
- GCRPNet: Graph-Enhanced Contextual and Regional Perception Network For Salient Object Detection in Optical Remote Sensing Images
- PSScreen: Partially Supervised Multiple Retinal Disease Screening
- AR Surgical Navigation With Surface Tracing: Comparing In-Situ Visualization with Tool-Tracking Guidance for Neurosurgical Applications
- Retrieval-Augmented Prompt for OOD Detection
- PTQAT: A Hybrid Parameter-Efficient Quantization Algorithm for 3D Perception Tasks
- HM-Talker: Hybrid Motion Modeling for High-Fidelity Talking Head Synthesis
- SpaRC-AD: A Baseline for Radar-Camera Fusion in End-to-End Autonomous Driving
- Adapting SAM via Cross-Entropy Masking for Class Imbalance in Remote Sensing Change Detection
- Towards Agentic AI for Multimodal-Guided Video Object Segmentation
- HumanSense: From Multimodal Perception to Empathetic Context-Aware Responses through Reasoning MLLMs
- EvTurb: Event Camera Guided Turbulence Removal
- Towards Powerful and Practical Patch Attacks for 2D Object Detection in Autonomous Driving
- Fourier-Guided Attention Upsampling for Image Super-Resolution
- FIND-Net: Fourier-Integrated Network with Dictionary Kernels for Metal Artifact Reduction
- Increasing the Utility of Synthetic Images through Chamfer Guidance
- ChatENV: An Interactive Vision-Language Model for Sensor-Guided Environmental Monitoring and Scenario Simulation
- Processing and acquisition traces in visual encoders: What does CLIP know about your camera?
- Lameness detection in dairy cows using pose estimation and bidirectional LSTMs
- SemPT: Semantic Prompt Tuning for Vision-Language Models
- Serial Over Parallel: Learning Continual Unification for Multi-Modal Visual Object Tracking and Benchmarking
- AddressVLM: Cross-view Alignment Tuning for Image Address Localization using Large Vision-Language Models
- Hybrid Generative Fusion for Efficient and Privacy-Preserving Face Recognition Dataset Generation
- HyperTea: A Hypergraph-based Temporal Enhancement and Alignment Network for Moving Infrared Small Target Detection
- Physics-Informed Joint Multi-TE Super-Resolution with Implicit Neural Representation for Robust Fetal T2 Mapping
- IADGPT: Unified LVLM for Few-Shot Industrial Anomaly Detection, Localization, and Reasoning via In-Context Learning
- Novel View Synthesis using DDIM Inversion
- Beyond conventional vision: RGB-event fusion for robust object detection in dynamic traffic scenarios
- CountCluster: Training-Free Object Quantity Guidance with Cross-Attention Map Clustering for Text-to-Image Generation
- NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale
- Lightweight CNNs for Embedded SAR Ship Target Detection and Classification
- Revisiting Cross-View Localization from Image Matching
- Exploiting Discriminative Codebook Prior for Autoregressive Image Generation
- EgoCross: Benchmarking Multimodal Large Language Models for Cross-Domain Egocentric Video Question Answering
- Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction
- Privacy-enhancing Sclera Segmentation Benchmarking Competition: SSBC 2025
- Axis-level Symmetry Detection with Group-Equivariant Representation
- Forgery Guided Learning Strategy with Dual Perception Network for Deepfake Cross-domain Detection
- An Efficient Model-Driven Groupwise Approach for Atlas Construction
- From Diagnosis to Improvement: Probing Spatio-Physical Reasoning in Vision Language Models
- AEGIS: Authenticity Evaluation Benchmark for AI-Generated Video Sequences
- Video-BLADE: Block-Sparse Attention Meets Step Distillation for Efficient Video Generation
- Ultra-High-Definition Reference-Based Landmark Image Super-Resolution with Generative Diffusion Prior
- Cooperative Face Liveness Detection from Optical Flow
- VasoMIM: Vascular Anatomy-Aware Masked Image Modeling for Vessel Segmentation
- Object Fidelity Diffusion for Remote Sensing Image Generation
- Mobile-Friendly Deep Learning for Plant Disease Detection: A Lightweight CNN Benchmark Across 101 Classes of 33 Crops
- UI-Venus Technical Report: Building High-performance UI Agents with RFT
- Self-Supervised Stereo Matching with Multi-Baseline Contrastive Learning
- Generalizable Federated Learning using Client Adaptive Focal Modulation
- Hierarchical Fine-grained Preference Optimization for Physically Plausible Video Generation
- Performance of GPT-5 in Brain Tumor MRI Reasoning
- TexVerse: A Universe of 3D Objects with High-Resolution Textures
- Medico 2025: Visual Question Answering for Gastrointestinal Imaging
- ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing
- STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer
- MAESTRO: Masked AutoEncoders for Multimodal, Multitemporal, and Multispectral Earth Observation Data
- ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning
- Human-in-Context: Unified Cross-Domain 3D Human Motion Modeling via In-Context Learning
- Puppeteer: Rig and Animate Your 3D Models
- Quantum Visual Fields with Neural Amplitude Encoding
- Invisible Watermarks, Visible Gains: Steering Machine Unlearning with Bi-Level Watermarking Design
- From Intent to Execution: Multimodal Chain-of-Thought Reinforcement Learning for Precise CAD Code Generation
- Explainable AI Technique in Lung Cancer Detection Using Convolutional Neural Networks
- Data-Efficient Learning for Generalizable Surgical Video Understanding
- AI-Driven Detection and Analysis of Handwriting on Seized Ivory: A Tool to Uncover Criminal Networks in the Illicit Wildlife Trade
- DINOMotion: advanced robust tissue motion tracking with DINOv2 in 2D-Cine MRI-guided radiotherapy
- SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning
- Improving Learning of New Diseases through Knowledge-Enhanced Initialization for Federated Adapter Tuning
- Efficient Image Denoising Using Global and Local Circulant Representation
- ReconVLA: Reconstructive Vision-Language-Action Model as Effective Robot Perceiver
- MM-Food-100K: A 100,000-Sample Multimodal Food Intelligence Dataset with Verifiable Provenance
- We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
- On the Complexity-Faithfulness Trade-off of Gradient-Based Explanations
- On Spectral Properties of Gradient-based Explanation Methods
- DIVA-VQA: Detecting Inter-frame Variations in UGC Video Quality
- Geospatial Diffusion for Land Cover Imperviousness Change Forecasting
- Agentic Design Review System
- Insights from the Algonauts 2025 Winners
- When Experts Disagree: Characterizing Annotator Variability for Vessel Segmentation in DSA Images
- GC-MVSNet: Multi-View, Multi-Scale, Geometrically-Consistent Multi-View Stereo
- Video-based automatic lameness detection of dairy cows using pose estimation and multiple locomotion traits
- Debiasing Multimodal Large Language Models via Penalization of Language Priors
- VPOcc: Exploiting Vanishing Point for 3D Semantic Occupancy Prediction
- MinD-3D++: Advancing fMRI-Based 3D Reconstruction with High-Quality Textured Mesh Generation and a Comprehensive Dataset
- CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal Large Language Models
- Quantum-Brain: Quantum-Inspired Neural Network Approach to Vision-Brain Understanding
- MyTimeMachine: Personalized Facial Age Transformation
- Speedy-Splat: Fast 3D Gaussian Splatting with Sparse Pixels and Sparse Primitives
- DGNS: Deformable Gaussian Splatting and Dynamic Neural Surface for Monocular Dynamic 3D Reconstruction
- DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction
- Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness
- Understanding Transformer-based Vision Models through Inversion
- A Lightweight Transformer with Phase-Only Cross-Attention for Illumination-Invariant Biometric Authentication
- Nautilus: Locality-aware Autoencoder for Scalable Mesh Generation
- Leveraging Motion Estimation for Efficient Bayer-Domain Computer Vision
- NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning
- MIDAS: Modeling Ground-Truth Distributions with Dark Knowledge for Domain Generalized Stereo Matching
- Just Functioning as a Hook for Two-Stage Referring Multi-Object Tracking
- Continual Learning for Multiple Modalities
- UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving
- Scaling Open-Vocabulary Action Detection
- OrderChain: Towards General Instruct-Tuning for Stimulating the Ordinal Understanding Ability of MLLM
- A Rose by Any Other Name Would Smell as Sweet: Categorical Homotopy Theory for Large Language Models
- Decoupling Understanding from Reasoning via Problem Space Mapping for Small-scale Model Reasoning
- FedCoT: Communication-Efficient Federated Reasoning Enhancement for Large Language Models
- LATTE: Learning Aligned Transactions and Textual Embeddings for Bank Clients
- Conformal P-Value in Multiple-Choice Question Answering Tasks with Provable Risk Control
- RTTC: Reward-Guided Collaborative Test-Time Compute
- Detecting and explaining postpartum depression in real-time with generative artificial intelligence
- SABER: Switchable and Balanced Training for Efficient LLM Reasoning
- LLMCARE: Alzheimer's Detection via Transformer Models Enhanced by LLM-Generated Synthetic Data
- PREF: Reference-Free Evaluation of Personalised Text Generation in LLMs
- Latent Fusion Jailbreak: Blending Harmful and Harmless Representations to Elicit Unsafe LLM Outputs
- Inference-Aware Prompt Optimization for Aligning Black-Box Large Language Models
- The Cost of Thinking: Increased Jailbreak Risk in Large Language Models
- Reflect then Learn: Active Prompting for Information Extraction Guided by Introspective Confusion
- mSCoRe: a Multilingual and Scalable Benchmark for Skill-based Commonsense Reasoning
- Multi-Turn Puzzles: Evaluating Interactive Reasoning and Strategic Dialogue in LLMs
- LaajMeter: A Framework for LaaJ Evaluation
- Estimating Machine Translation Difficulty
- Efficient Forward-Only Data Valuation for Pretrained LLMs and VLMs
- PakBBQ: A Culturally Adapted Bias Benchmark for QA
- Prompt-Response Semantic Divergence Metrics for Faithfulness Hallucination and Misalignment Detection in Large Language Models
- Understanding Textual Emotion Through Emoji Prediction
- Using Large Language Models to Measure Symptom Severity in Patients At Risk for Schizophrenia
- A Computational Approach to Analyzing Language Change and Variation in the Constructed Language Toki Pona
- Inductive Bias Extraction and Matching for LLM Prompts
- Yet another algorithmic bias: A Discursive Analysis of Large Language Models Reinforcing Dominant Discourses on Gender and Race
- ReviewRL: Towards Automated Scientific Review with RL
- From Surface to Semantics: Semantic Structure Parsing for Table-Centric Document Analysis
- Beyond Semantic Understanding: Preserving Collaborative Frequency Components in LLM-based Recommendation
- Cross-Prompt Encoder for Low-Performing Languages
- Making Qwen3 Think in Korean with Reinforcement Learning
- Advancing Cross-lingual Aspect-Based Sentiment Analysis with LLMs and Constrained Decoding for Sequence-to-Sequence Models
- Large Language Models for Summarizing Czech Historical Documents and Beyond
- Improving Generative Cross-lingual Aspect-Based Sentiment Analysis with Constrained Decoding
- Jailbreaking Commercial Black-Box LLMs with Explicitly Harmful Prompts
- Layer-Wise Perturbations via Sparse Autoencoders for Adversarial Text Generation
- ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning
- Evaluating LLMs on Chinese Idiom Translation
- Computational Economics in Large Language Models: Exploring Model Behavior and Incentive Design under Resource Constraints
- DiFaR: Enhancing Multimodal Misinformation Detection with Diverse, Factual, and Relevant Rationales
- When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing
- When Language Overrules: Revealing Text Dominance in Multimodal Large Language Models
- eDIF: A European Deep Inference Fabric for Remote Interpretability of LLM
- Neural Machine Translation for Coptic-French: Strategies for Low-Resource Ancient Languages
- Continuous Bangla Sign Language Translation: Mitigating the Expense of Gloss Annotation with the Assistance of Graph
- Learning from Natural Language Feedback for Personalized Question Answering
- Thinking Inside the Mask: In-Place Prompting in Diffusion LLMs
- Beyond "Not Novel Enough": Enriching Scholarly Critique with LLM-Assisted Feedback
- Reinforced Language Models for Sequential Decision Making
- Psyche-R1: Towards Reliable Psychological LLMs through Unified Empathy, Expertise, and Reasoning
- From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms
- SSRL: Self-Search Reinforcement Learning
- A Survey on Diffusion Language Models
- Context Misleads LLMs: The Role of Context Filtering in Maintaining Safe Alignment of LLMs
- Large Language Models Show Signs of Alignment with Human Neurocognition During Abstract Reasoning
- SaraCoder: Orchestrating Semantic and Structural Cues for Profit-Oriented Repository-Level Code Completion
- Amazon Nova AI Challenge -- Trusted AI: Advancing secure, AI-assisted software development
- Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts
- Personalized Real-time Jargon Support for Online Meetings
- Improving OCR for Historical Texts of Multiple Languages
- CorrectNav: Self-Correction Flywheel Empowers Vision-Language-Action Navigation Model
- Reverse Physician-AI Relationship: Full-process Clinical Diagnosis Driven by a Large Language Model
- Diversity First, Quality Later: A Two-Stage Assumption for Language Model Alignment
- Improving Value-based Process Verifier via Low-Cost Variance Reduction
- Stabilizing Long-term Multi-turn Reinforcement Learning with Gated Rewards
- Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
- Memory-Augmented Transformers: A Systematic Review from Neuroscience Principles to Technical Solutions
- Searching for Privacy Risks in LLM Agents via Simulation
- Knowledge-based Consistency Testing of Large Language Models
- This Candidate is [MASK]. Prompt-based Sentiment Extraction and Reference Letters
- Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding
- Measuring Diversity in Synthetic Datasets
- LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint
- TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
- Explainable Sentiment Analysis with DeepSeek-R1: Performance, Efficiency, and Few-Shot Learning
- Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models
- ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning
- CoTAL: Human-in-the-Loop Prompt Engineering for Generalizable Formative Assessment Scoring
- Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
- AF-MAT: Aspect-aware Flip-and-Fuse xLSTM for Aspect-based Sentiment Analysis
- Meanings are like Onions: a Layered Approach to Metaphor Processing
- CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks
- DeepWriter: A Fact-Grounded Multimodal Writing Assistant Based On Offline Knowledge Base
- A New Query Expansion Approach via Agent-Mediated Dialogic Inquiry
- BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache
- CAPTURe: Evaluating Spatial Reasoning in Vision Language Models via Occluded Object Counting
- Grouped Sequency-arranged Rotation: Optimizing Rotation Transformation for Quantization for Free
- FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference
- MAPS: A Multilingual Benchmark for Global Agent Performance and Security
- Evaluation of Cultural Competence of Vision-Language Models
- Prompt Attacks Reveal Superficial Knowledge Removal in Unlearning Methods
- LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization
- Stochastic-based Patch Filtering for Few-Shot Learning
- DINOv3
- Empowering Morphing Attack Detection using Interpretable Image-Text Foundation Model
- Interpretable Oracle Bone Script Decipherment through Radical and Pictographic Analysis with LVLMs
- Deep Learning Enables Large-Scale Shape and Appearance Modeling in Total-Body DXA Imaging
- MANGO: Multimodal Attention-based Normalizing Flow Approach to Fusion Learning
- Improving watermelon (Citrullus lanatus) disease classification with generative artificial intelligence (GenAI)-based synthetic and real-field images via a custom EfficientNetV2-L model
- SynSpill: Improved Industrial Spill Detection With Synthetic Data
- EntropyGS: An Efficient Entropy Coding on 3D Gaussian Splatting
- CellSymphony: Deciphering the molecular and phenotypic orchestration of cells with single-cell pathomics
- Deep Learning for Crack Detection: A Review of Learning Paradigms, Generalizability, and Datasets
- MRFD: Multi-Region Fusion Decoding with Self-Consistency for Mitigating Hallucinations in LVLMs
- Pose-Robust Calibration Strategy for Point-of-Gaze Estimation on Mobile Phones
- High Fidelity Text to Image Generation with Contrastive Alignment and Structural Guidance
- VIFSS: View-Invariant and Figure Skating-Specific Pose Representation Learning for Temporal Action Segmentation
- JRDB-Reasoning: A Difficulty-Graded Benchmark for Visual Reasoning in Robotics
- A Sub-Pixel Multimodal Optical Remote Sensing Images Matching Method
- InterSyn: Interleaved Learning for Dynamic Motion Synthesis in the Wild
- From Pixel to Mask: A Survey of Out-of-Distribution Segmentation
- Integrating Reinforcement Learning with Visual Generative Models: Foundations and Advances
- Concepts or Skills? Rethinking Instruction Selection for Multi-modal Models
- Glo-DMU: A Deep Morphometry Framework of Ultrastructural Characterization in Glomerular Electron Microscopic Images
- AtomDiffuser: Time-Aware Degradation Modeling for Drift and Beam Damage in STEM Imaging
- Contrast Sensitivity Function of Multimodal Vision-Language Models
- Towards Spatially Consistent Image Generation: On Incorporating Intrinsic Scene Properties into Diffusion Models
- Unlocking Robust Semantic Segmentation Performance via Label-only Elastic Deformations against Implicit Label Noise
- PQ-DAF: Pose-driven Quality-controlled Data Augmentation for Data-scarce Driver Distraction Detection
- Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models
- SC-Lane: Slope-aware and Consistent Road Height Estimation Framework for 3D Lane Detection
- NanoControl: A Lightweight Framework for Precise and Efficient Control in Diffusion Transformer
- STRIDE-QA: Visual Question Answering Dataset for Spatiotemporal Reasoning in Urban Driving Scenes
- CRISP: Contrastive Residual Injection and Semantic Prompting for Continual Video Instance Segmentation
- DOD-SA: Infrared-Visible Decoupled Object Detection with Single-Modality Annotations
- SkeySpot: Automating Service Key Detection for Digital Electrical Layout Plans in the Construction Industry
- From Images to Perception: Emergence of Perceptual Properties by Reconstructing Images
- Trajectory-aware Shifted State Space Models for Online Video Super-Resolution
- Multi-Label Plant Species Prediction with Metadata-Enhanced Multi-Head Vision Transformers
- SingleStrip: learning skull-stripping from a single labeled example
- Enhanced Sparse Point Cloud Data Processing for Privacy-aware Human Action Recognition
- A Transparent Fairness Evaluation Protocol for Open-Source Language Model Benchmarking on the Blockchain
- Thematic and Task-Based Categorization of K-12 GenAI Usages with Hierarchical Topic Modeling
- INTIMA: A Benchmark for Human-AI Companionship Behavior
- XFacta: Contemporary, Real-World Dataset and Evaluation for Multimodal Misinformation Detection with Multimodal LLMs
- AutoGeTS: Knowledge-based Automated Generation of Text Synthetics for Improving Text Classification
- HiFACTMix: A Code-Mixed Benchmark and Graph-Aware Model for Evidence-Based Political Claim Verification in Hinglish
- Semantic Structure in Large Language Model Embeddings
- User Perception of Attention Visualizations: Effects on Interpretability Across Evidence-Based Medical Documents
- From Answers to Questions: EQGBench for Evaluating LLMs' Educational Question Generation
- Automated scoring of the Ambiguous Intentions Hostility Questionnaire using fine-tuned large language models
- Multidimensional classification of posts for online course discussion forum curation
- Beyond Hard Sharing: Efficient Multi-Task Speech-to-Text Modeling with Supervised Mixture of Experts
- An Audit and Analysis of LLM-Assisted Health Misinformation Jailbreaks Against LLMs
- Evaluation of GPT-based large language generative AI models as study aids for the national licensure examination for registered dietitians in Japan
- Guided Navigation in Knowledge-Dense Environments: Structured Semantic Exploration with Guidance Graphs
- Semantic Bridge: Universal Multi-Hop Question Generation via AMR-Driven Graph Synthesis
- PersonaEval: Are LLM Evaluators Human Enough to Judge Role-Play?
- RealTalk-CN: A Realistic Chinese Speech-Text Dialogue Benchmark With Cross-Modal Interaction Analysis
- Training-Free Multimodal Large Language Model Orchestration
- Bridging AI Innovation and Healthcare Needs: Lessons Learned from Incorporating Modern NLP at The BC Cancer Registry
Research Sources: 511 | Generated: 8/25/2025