AI Research News Feeds for November 6th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

Signal Intensity-weighted coordinate channels improve learning stability and generalisation in 1D and 2D CNNs in localisation tasks on biomedical signals : Abstract: Localisation tasks in biomedical data often require models to learn meaningful spatial or temporal relationships from signals with complex intensity distributions. A common strategy, exempli...
A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential : Abstract: This paper presents a lightweight three-dimensional convolutional neural network (3DCNN) for human activity recognition (HAR) using event-based vision data. Privacy preservation is a key cha...
Part-Aware Bottom-Up Group Reasoning for Fine-Grained Social Interaction Detection : Abstract: Social interactions often emerge from subtle, fine-grained cues such as facial expressions, gaze, and gestures. However, existing methods for social interaction detection overlook such nuanc...
Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition : Abstract: Effective explanations of video action recognition models should disentangle how movements unfold over time from the surrounding spatial context. However, existing methods based on saliency ...
Benchmarking ResNet for Short-Term Hypoglycemia Classification with DiaData : Abstract: Individualized therapy is driven forward by medical data analysis, which provides insight into the patient's context. In particular, for Type 1 Diabetes (T1D), which is an autoimmune disease...
Optimizing the nnU-Net model for brain tumor (Glioma) segmentation Using a BraTS Sub-Saharan Africa (SSA) dataset : Abstract: Medical image segmentation is a critical achievement in modern medical science, developed over decades of research. It allows for the exact delineation of anatomical and pathological feature...
Domain-Adaptive Transformer for Data-Efficient Glioma Segmentation in Sub-Saharan MRI : Abstract: Glioma segmentation is critical for diagnosis and treatment planning, yet remains challenging in Sub-Saharan Africa due to limited MRI infrastructure and heterogeneous acquisition protocols ...
Comprehensive Assessment of LiDAR Evaluation Metrics: A Comparative Study Using Simulated and Real Data : Abstract: For developing safe Autonomous Driving Systems (ADS), rigorous testing is required before they are deemed safe for road deployments. Since comprehensive conventional physical testing is impr...
Morpho-Genomic Deep Learning for Ovarian Cancer Subtype and Gene Mutation Prediction from Histopathology : Abstract: Ovarian cancer remains one of the most lethal gynecological malignancies, largely due to late diagnosis and extensive heterogeneity across subtypes. Current diagnostic methods are limited in...
Seeing What You Say: Expressive Image Generation from Speech : Abstract: This paper proposes VoxStudio, the first unified and end-to-end speech-to-image model that generates expressive images directly from spoken descriptions by jointly aligning linguistic and pa...
OneOcc: Semantic Occupancy Prediction for Legged Robots with a Single Panoramic Camera : Abstract: Robust 3D semantic occupancy is crucial for legged/humanoid robots, yet most semantic scene completion (SSC) systems target wheeled platforms with forward-facing sensors. We present OneOcc, ...
Flying Robotics Art: ROS-based Drone Draws the Record-Breaking Mural : Abstract: This paper presents the innovative design and successful deployment of a pioneering autonomous unmanned aerial system developed for executing the world's largest mural painted by a drone. Ad...
A New Comprehensive Framework for Multi-Exposure Stereo Coding Utilizing Low Rank Tucker-ALS and 3D-HEVC Techniques : Abstract: Display technology must offer high dynamic range (HDR) contrast-based depth induction and 3D personalization simultaneously. Efficient algorithms to compress HDR stereo data are critical. Dir...
Seal2Real: Prompt Prior Learning on Diffusion Model for Unsupervised Document Seal Data Generation and Realisation : Abstract: Seal-related tasks in document processing (such as seal segmentation, authenticity verification, seal removal, and text recognition under seals) hold substantial commercial importance. However...
BoxCell: Leveraging SAM for Cell Segmentation with Box Supervision : Abstract: Cell segmentation in histopathological images is vital for the diagnosis and treatment of several diseases. Annotating data is tedious and requires medical expertise, making it difficult to em...
A Label Propagation Strategy for CutMix in Multi-Label Remote Sensing Image Classification : Abstract: The development of supervised deep learning-based methods for multi-label scene classification (MLC) is one of the prominent research directions in remote sensing (RS). However, collecting a...
ROADWork: A Dataset and Benchmark for Learning to Recognize, Observe, Analyze and Drive Through Work Zones : Abstract: Perceiving and autonomously navigating through work zones is a challenging and underexplored problem. Open datasets for this long-tailed scenario are scarce. We propose the ROADWork dataset ...
FusionRF: High-Fidelity Satellite Neural Radiance Fields from Multispectral and Panchromatic Acquisitions : Abstract: We introduce FusionRF, a novel framework for digital surface reconstruction from satellite multispectral and panchromatic images. Current work has demonstrated the increased accuracy of neur...
SAM-EM: Real-Time Segmentation for Automated Liquid Phase Transmission Electron Microscopy : Abstract: The absence of robust segmentation frameworks for noisy liquid phase transmission electron microscopy (LPTEM) videos prevents reliable extraction of particle trajectories, creating a major b...
ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation : Abstract: While humans effortlessly draw visual objects and shapes by adaptively allocating attention based on their complexity, existing multimodal large language models (MLLMs) remain constrained by...
ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS : Abstract: Feed-forward 3D Gaussian Splatting (3DGS) models have recently emerged as a promising solution for novel view synthesis, enabling one-pass inference without the need for per-scene 3DGS optim...
SpatialLM: Training Large Language Models for Structured Indoor Modeling : Abstract: SpatialLM is a large language model designed to process 3D point cloud data and generate structured 3D scene understanding outputs. These outputs include architectural elements like walls, d...
MagCache: Fast Video Generation with Magnitude-Aware Cache : Abstract: Existing acceleration techniques for video diffusion models often rely on uniform heuristics or time-embedding variants to skip timesteps and reuse cached features. These approaches typicall...
Human Perception-Inspired Grain Segmentation Refinement Using Conditional Random Fields : Abstract: Automated detection of grain boundaries (GBs) in electron microscope images of polycrystalline materials could help accelerate the nanoscale characterization of myriad engineering materials ...
MAROON: A Framework for the Joint Characterization of Near-Field High-Resolution Radar and Optical Depth Imaging Techniques : Abstract: Utilizing the complementary strengths of wavelength-specific range or depth sensors is crucial for robust computer-assisted tasks such as autonomous driving. Despite this, there is still lit...
ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing : Abstract: While end-to-end video-to-audio generation has greatly improved, producing high-fidelity audio that authentically captures the nuances of visual content remains challenging. Like professiona...
Accelerating Physical Property Reasoning for Augmented Visual Cognition : Abstract: This paper introduces a system that accelerates vision-guided physical property reasoning to enable augmented visual cognition. The system minimizes the run-time latency of this reas...
Finetuning-Free Personalization of Text to Image Generation via Hypernetworks : Abstract: Personalizing text-to-image diffusion models has traditionally relied on subject-specific fine-tuning approaches such as DreamBooth (Ruiz et al., 2023), which are computationally expen...
Subsampled Randomized Fourier GaLore for Adapting Foundation Models in Depth-Driven Liver Landmark Segmentation : Abstract: Accurate detection and delineation of anatomical structures in medical imaging are critical for computer-assisted interventions, particularly in laparoscopic liver surgery where 2D video str...
SurgAnt-ViVQA: Learning to Anticipate Surgical Events through GRU-Driven Temporal Cross-Attention : Abstract: Anticipating forthcoming surgical events is vital for real-time assistance in endonasal transsphenoidal pituitary surgery, where visibility is limited and workflow changes rapidly. Most visu...
PETWB-REP: A Multi-Cancer Whole-Body FDG PET/CT and Radiology Report Dataset for Medical Imaging Research : Abstract: Publicly available, large-scale medical imaging datasets are crucial for developing and validating artificial intelligence models and conducting retrospective clinical research. However, dat...
MvBody: Multi-View-Based Hybrid Transformer Using Optical 3D Body Scan for Explainable Cesarean Section Prediction : Abstract: Accurately assessing the risk of cesarean section (CS) delivery is critical, especially in settings with limited medical resources, where access to healthcare is often restricted. Early and ...
Diffusion-Guided Mask-Consistent Paired Mixing for Endoscopic Image Segmentation : Abstract: Augmentation for dense prediction typically relies on either sample mixing or generative synthesis. Mixing improves robustness but misaligned masks yield soft label ambiguity. Diffusion synt...
Transformer-Progressive Mamba Network for Lightweight Image Super-Resolution : Abstract: Recently, Mamba-based super-resolution (SR) methods have demonstrated the ability to capture global receptive fields with linear complexity, addressing the quadratic computational cost of Tr...
Decoupled Multi-Predictor Optimization for Inference-Efficient Model Tuning : Abstract: Recently, remarkable progress has been made in large-scale pre-trained model tuning, and inference efficiency is becoming more crucial for practical deployment. Early exiting in conjunction ...
Enhancing Medical Image Segmentation via Heat Conduction Equation : Abstract: Medical image segmentation has been significantly advanced by deep learning architectures, notably U-Net variants. However, existing models struggle to achieve efficient global context model...
IEC3D-AD: A 3D Dataset of Industrial Equipment Components for Unsupervised Point Cloud Anomaly Detection : Abstract: 3D anomaly detection (3D-AD) plays a critical role in industrial manufacturing, particularly in ensuring the reliability and safety of core equipment components. Although existing 3D dataset...
Unified Long Video Inpainting and Outpainting via Overlapping High-Order Co-Denoising : Abstract: Generating long videos remains a fundamental challenge, and achieving high controllability in video inpainting and outpainting is particularly demanding. To address both of these challenges ...
Diffusion-SDPO: Safeguarded Direct Preference Optimization for Diffusion Models : Abstract: Text-to-image diffusion models deliver high-quality images, yet aligning them with human preferences remains challenging. We revisit diffusion-based Direct Preference Optimization (DPO) for ...
SurgViVQA: Temporally-Grounded Video Question Answering for Surgical Scene Understanding : Abstract: Video Question Answering (VideoQA) in the surgical domain aims to enhance intraoperative understanding by enabling AI models to reason over temporally coherent events rather than isolated fr...
Multi-Object Tracking Retrieval with LLaVA-Video: A Training-Free Solution to MOT25-StAG Challenge : Abstract: In this report, we present our solution to the MOT25-Spatiotemporal Action Grounding (MOT25-StAG) Challenge. The aim of this challenge is to accurately localize and track multiple objects th...
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions : Abstract: Due to the lack of effective cross-modal modeling, existing open-source audio-video generation methods often exhibit compromised lip synchronization and insufficient semantic consistency. To...
Robust Alignment of the Human Embryo in 3D Ultrasound using PCA and an Ensemble of Heuristic, Atlas-based and Learning-based Classifiers Evaluated on the Rotterdam Periconceptional Cohort : Abstract: Standardized alignment of the embryo in three-dimensional (3D) ultrasound images aids prenatal growth monitoring by facilitating standard plane detection, improving visualization of landmark...
Generalizing Shape-from-Template to Topological Changes : Abstract: Reconstructing the surfaces of deformable objects from correspondences between a 3D template and a 2D image is well studied under Shape-from-Template (SfT) methods; however, existing approac...
Human Mesh Modeling for Anny Body : Abstract: Parametric body models are central to many human-centric tasks, yet existing models often rely on costly 3D scans and learned shape spaces that are proprietary and demographically narrow. We...
Data-Efficient Adaptation and a Novel Evaluation Method for Aspect-based Sentiment Analysis : Abstract: Aspect-based Sentiment Analysis (ABSA) is a fine-grained opinion mining approach that identifies and classifies opinions associated with specific entities (aspects) or their categories withi...
Precise asymptotic analysis of Sobolev training for random feature models : Abstract: Gradient information is widely useful and available in applications, and is therefore natural to include in the training of neural networks. Yet little is known theoretically about the impac...
Min-Max Optimization Is Strictly Easier Than Variational Inequalities : Abstract: Classically, a mainstream approach for solving a convex-concave min-max problem is to instead solve the variational inequality problem arising from its first-order optimality conditions. Is ...
From Propagation to Prediction: Point-level Uncertainty Evaluation of MLS Point Clouds under Limited Ground Truth : Abstract: Evaluating uncertainty is critical for reliable use of Mobile Laser Scanning (MLS) point clouds in many high-precision applications such as Scan-to-BIM, deformation analysis, and 3D modeling...
PolyNorm: Few-Shot LLM-Based Text Normalization for Text-to-Speech : Abstract: Text Normalization (TN) is a key preprocessing step in Text-to-Speech (TTS) systems, converting written forms into their canonical spoken equivalents. Traditional TN systems can exhibit high...
Quantifying Articulatory Coordination as a Biomarker for Schizophrenia : Abstract: Advances in artificial intelligence (AI) and deep learning have improved diagnostic capabilities in healthcare, yet limited interpretability continues to hinder clinical adoption. Schizophre...
Provable Accelerated Bayesian Optimization with Knowledge Transfer : Abstract: We study how Bayesian optimization (BO) can be accelerated on a target task with historical knowledge transferred from related source tasks. Existing works on BO with knowledge transfer eith...
Scheduling the Off-Diagonal Weingarten Loss of Neural SDFs for CAD Models : Abstract: Neural signed distance functions (SDFs) have become a powerful representation for geometric reconstruction from point clouds, yet they often require both gradient- and curvature-based regula...
Modeling Headway in Heterogeneous and Mixed Traffic Flow: A Statistical Distribution Based on a General Exponential Function : Abstract: The ability of existing headway distributions to accurately reflect the diverse behaviors and characteristics in heterogeneous traffic (different types of vehicles) and mixed traffic (human-...
Learning-based Cooperative Robotic Paper Wrapping: A Unified Control Policy with Residual Force Control : Abstract: Human-robot cooperation is essential in environments such as warehouses and retail stores, where workers frequently handle deformable objects like paper, bags, and fabrics. Coordinating robo...
Understanding Robustness of Model Editing in Code LLMs: An Empirical Study : Abstract: Large language models (LLMs) are increasingly used in software development. However, while LLMs remain static after pretraining, programming languages and APIs continue to evolve, leading to...
Statistical Properties of Rectified Flow : Abstract: Rectified flow (Liu et al., 2022; Liu, 2022; Wu et al., 2023) is a method for defining a transport map between two distributions, and enjoys popularity in machine learning, although theoreti...
Provable Separations between Memorization and Generalization in Diffusion Models : Abstract: Diffusion models have achieved remarkable success across diverse domains, but they remain vulnerable to memorization -- reproducing training data rather than generating novel outputs. This n...
RKUM: An R Package for Robust Kernel Unsupervised Methods : Abstract: RKUM is an R package developed for implementing robust kernel-based unsupervised methods. It provides functions for estimating the robust kernel covariance operator (CO) and the robust kerne...
Topography, climate, land cover, and biodiversity: Explaining endemic richness and management implications on a Mediterranean island : Abstract: Island endemism is shaped by complex interactions among environmental, ecological, and evolutionary factors, yet the relative contributions of topography, climate, and land cover remain inco...
Death by a Thousand Prompts: Open Model Vulnerability Analysis : Abstract: Open-weight models provide researchers and developers with accessible foundations for diverse downstream applications. We tested the safety and security postures of eight open-weight large l...
Influence of Data Dimensionality Reduction Methods on the Effectiveness of Quantum Machine Learning Models : Abstract: Data dimensionality reduction techniques are often utilized in the implementation of Quantum Machine Learning models to address two significant issues: the constraints of NISQ quantum device...
SyMuPe: Affective and Controllable Symbolic Music Performance : Abstract: Emotions are fundamental to the creation and perception of music performances. However, achieving human-like expression and emotion through machine learning models for performance rendering ...
A Support-Set Algorithm for Optimization Problems with Nonnegative and Orthogonal Constraints : Abstract: In this paper, we investigate optimization problems with nonnegative and orthogonal constraints, where any feasible matrix of size $n \times p$ exhibits a sparsity pattern such that each row...
System Identification of a Moored ASV with Recessed Moon Pool via Deterministic and Bayesian Hankel-DMDc : Abstract: This study addresses the system identification of a small autonomous surface vehicle (ASV) under moored conditions using Hankel dynamic mode decomposition with control (HDMDc) and its Bayesi...
BanglaSTEM: A Parallel Corpus for Technical Domain Bangla-English Translation : Abstract: Large language models work well for technical problem solving in English but perform poorly when the same questions are asked in Bangla. A simple solution would be to translate Bangla questi...
The Structure of Cross-Validation Error: Stability, Covariance, and Minimax Limits : Abstract: Despite ongoing theoretical research on cross-validation (CV), many theoretical questions about CV remain widely open. This motivates our investigation into how properties of algorithm-distr...
Vector-valued self-normalized concentration inequalities beyond sub-Gaussianity : Abstract: The study of self-normalized processes plays a crucial role in a wide range of applications, from sequential decision-making to econometrics. While the behavior of self-normalized concentrat...
CLAX: Fast and Flexible Neural Click Models in JAX : Abstract: CLAX is a JAX-based library that implements classic click models using modern gradient-based optimization. While neural click models have emerged over the past decade, complex click models b...
Neural Beamforming with Doppler-Aware Sparse Attention for High Mobility Environments : Abstract: Beamforming has significance for enhancing spectral efficiency and mitigating interference in multi-antenna wireless systems, facilitating spatial multiplexing and diversity in dense and hig...
Towards Transparent Stance Detection: A Zero-Shot Approach Using Implicit and Explicit Interpretability : Abstract: Zero-Shot Stance Detection (ZSSD) identifies the attitude of the post toward unseen targets. Existing research using contrastive, meta-learning, or data augmentation suffers from generalizab...
Quantifying Weighted Morphological Content of Large-Scale Structures via Simulation-Based Inference : Abstract: In this work, we perform a simulation-based forecasting analysis to compare the constraining power of two higher-order summary statistics of the large-scale structure (LSS), the Minkowski Fu...
Efficient Testing Implies Structured Symmetry : Abstract: Given a small random sample of $n$-bit strings labeled by an unknown Boolean function, which properties of this function can be tested computationally efficiently? We show an equivalence bet...
Colorectal Cancer Histopathological Grading using Multi-Scale Federated Learning : Abstract: Colorectal cancer (CRC) grading is a critical prognostic factor but remains hampered by inter-observer variability and the privacy constraints of multi-institutional data sharing. While deep...
The Adaptivity Barrier in Batched Nonparametric Bandits: Sharp Characterization of the Price of Unknown Margin : Abstract: We study batched nonparametric contextual bandits under a margin condition when the margin parameter $\alpha$ is unknown. To capture the statistical price of this ignorance, we introduce the...
Trustworthy Representation Learning via Information Funnels and Bottlenecks : Abstract: Ensuring trustworthiness in machine learning -- by balancing utility, fairness, and privacy -- remains a critical challenge, particularly in representation learning. In this work, we investi...
How does training shape the Riemannian geometry of neural network representations? : Abstract: In machine learning, there is a long history of trying to build neural networks that can learn from fewer example data by baking in strong geometric priors. However, it is not always clear a...
AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization : Abstract: Implicit Q-learning (IQL) serves as a strong baseline for offline RL, which learns the value function using only dataset actions through quantile regression. However, it is unclear how to re...
Data Quality Monitoring for the Hadron Calorimeters Using Transfer Learning for Anomaly Detection : Abstract: The proliferation of sensors brings an immense volume of spatio-temporal (ST) data in many domains, including monitoring, diagnostics, and prognostics applications. Data curation is a time-c...
Dynamical loss functions shape landscape topography and improve learning in artificial neural networks : Abstract: Dynamical loss functions are derived from standard loss functions used in supervised classification tasks, but are modified so that the contribution from each class periodically increases an...
Learning Expressive Random Feature Models via Parametrized Activations : Abstract: The random feature (RF) method is a powerful kernel approximation technique, but is typically equipped with fixed activation functions, limiting its adaptability across diverse tasks. To overcom...
HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs : Abstract: Quantized training of Large Language Models (LLMs) remains an open challenge, as maintaining accuracy while performing all matrix multiplications in low precision has proven difficult. This ...
REINFORCE-ING Chemical Language Models for Drug Discovery : Abstract: Chemical language models, combined with reinforcement learning (RL), have shown significant promise to efficiently traverse large chemical spaces for drug discovery. However, the performance...
Sundial: A Family of Highly Capable Time Series Foundation Models : Abstract: We introduce Sundial, a family of native, flexible, and scalable time series foundation models. To predict the next-patch's distribution, we propose a TimeFlow Loss based on flow-matching, w...
Stable Port-Hamiltonian Neural Networks : Abstract: In recent years, nonlinear dynamic system identification using artificial neural networks has garnered attention due to its broad potential applications across science and engineering. Howev...
Decision-aware training of spatiotemporal forecasting models to select a top K subset of sites for intervention : Abstract: Optimal allocation of scarce resources is a common problem for decision makers faced with choosing a limited number of locations for intervention. Spatiotemporal prediction models could make...
UniFault: A Fault Diagnosis Foundation Model from Bearing Data : Abstract: Machine fault diagnosis (FD) is a critical task for predictive maintenance, enabling early fault detection and preventing unexpected failures. Despite its importance, existing FD models are ...
Reliable and efficient inverse analysis using physics-informed neural networks with normalized distance functions and adaptive weight tuning : Abstract: Physics-informed neural networks have attracted significant attention in scientific machine learning for their capability to solve forward and inverse problems governed by partial differenti...
NeuralSurv: Deep Survival Analysis with Bayesian Uncertainty Quantification : Abstract: We introduce NeuralSurv, the first deep survival model to incorporate Bayesian uncertainty quantification. Our non-parametric, architecture-agnostic framework captures time-varying covariate...
On scalable and efficient training of diffusion samplers : Abstract: We address the challenge of training diffusion models to sample from unnormalized energy distributions in the absence of data, the so-called diffusion samplers. Although these approaches hav...
A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation : Abstract: We study the dynamics of gradient flow with small weight decay on general training losses $F: \mathbb{R}^d \to \mathbb{R}$. Under mild regularity assumptions and assuming convergence of the ...
Robust and Computation-Aware Gaussian Processes : Abstract: Gaussian processes (GPs) are widely used for regression and optimization tasks such as Bayesian optimization (BO) due to their expressiveness and principled uncertainty estimates. However, i...
Why Machine Learning Models Fail to Fully Capture Epistemic Uncertainty : Abstract: In recent years various supervised learning methods that disentangle aleatoric and epistemic uncertainty based on second-order distributions have been proposed. We argue that these methods f...
DiCoFlex: Model-agnostic diverse counterfactuals with flexible control : Abstract: Counterfactual explanations play a pivotal role in explainable artificial intelligence (XAI) by offering intuitive, human-understandable alternatives that elucidate machine learning model de...
Model-Informed Flows for Bayesian Inference : Abstract: Variational inference often struggles with the posterior geometry exhibited by complex hierarchical Bayesian models. Recent advances in flow-based variational families and Variationally Infe...
Inference-Time Reward Hacking in Large Language Models : Abstract: A common paradigm to improve the performance of large language models is optimizing for a reward model. Reward models assign a numerical score to an LLM's output that indicates, for example,...
Compliance Minimization via Physics-Informed Gaussian Processes : Abstract: Machine learning (ML) techniques have recently gained significant attention for solving compliance minimization (CM) problems. However, these methods typically provide poor feature boundarie...
Composing Linear Layers from Irreducibles : Abstract: Contemporary large models often exhibit behaviors suggesting the presence of low-level primitives that compose into modules with richer functionality, but these fundamental building blocks r...
OrdShap: Feature Position Importance for Sequential Black-Box Models : Abstract: Sequential deep learning models excel in domains with temporal or sequential dependencies, but their complexity necessitates post-hoc feature attribution methods for understanding their pred...
Variable Selection in Maximum Mean Discrepancy for Interpretable Distribution Comparison : Abstract: We study two-sample variable selection: identifying variables that discriminate between the distributions of two sets of data vectors. Such variables help scientists understand the mechanism...
Contraction of Private Quantum Channels and Private Quantum Hypothesis Testing : Abstract: A quantum generalized divergence by definition satisfies the data-processing inequality; as such, the relative decrease in such a divergence under the action of a quantum channel is at most ...
Disentanglement with Factor Quantized Variational Autoencoders : Abstract: Disentangled representation learning aims to represent the underlying generative factors of a dataset in a latent representation independently of one another. In our work, we propose a discr...
Alleviating Hyperparameter-Tuning Burden in SVM Classifiers for Pulmonary Nodules Diagnosis with Multi-Task Bayesian Optimization : Abstract: In the field of non-invasive medical imaging, radiomic features are utilized to measure tumor characteristics. However, these features can be affected by the techniques used to discretize th...
Aspen Open Jets: Unlocking LHC Data for Foundation Models in Particle Physics : Abstract: Foundation models are deep learning models pre-trained on large amounts of data which are capable of generalizing to multiple datasets and/or downstream tasks. This work demonstrates how dat...
Online Learning of Pure States is as Hard as Mixed States : Abstract: Quantum state tomography, the task of learning an unknown quantum state, is a fundamental problem in quantum information. In standard settings, the complexity of this problem depends signifi...
Data-Driven Probabilistic Air-Sea Flux Parameterization : Abstract: Accurately quantifying air-sea fluxes is important for understanding air-sea interactions and improving coupled weather and climate systems. This study introduces a probabilistic framework t...
Depth Matters: Multimodal RGB-D Perception for Robust Autonomous Agents : Abstract: Autonomous agents that rely purely on perception to make real-time control decisions require efficient and robust architectures. In this work, we demonstrate that augmenting RGB input with d...
A Polynomial-Time Algorithm for Variational Inequalities under the Minty Condition : Abstract: Solving variational inequalities (SVIs) is a foundational problem at the heart of optimization. However, this expressivity comes at the cost of computational hardness. As a result, most rese...
Tight Regret Bounds for Fixed-Price Bilateral Trade : Abstract: We examine fixed-price mechanisms in bilateral trade through the lens of regret minimization. Our main results are twofold. (i) For independent values, a near-optimal $\widetilde{\Theta}(T^{...
VQC-MLPNet: An Unconventional Hybrid Quantum-Classical Architecture for Scalable and Robust Quantum Machine Learning : Abstract: Variational quantum circuits (VQCs) hold promise for quantum machine learning but face challenges in expressivity, trainability, and noise resilience. We propose VQC-MLPNet, a hybrid archite...
Recurrent neural network-based robust control systems with closed-loop regional incremental ISS and application to MPC design : Abstract: This paper investigates the design of output-feedback schemes for systems described by a class of recurrent neural networks. We propose a procedure based on linear matrix inequalities for de...
Cache Mechanism for Agent RAG Systems : Abstract: Recent advances in Large Language Model (LLM)-based agents have been propelled by Retrieval-Augmented Generation (RAG), which grants the models access to vast external knowledge bases. Despi...
LEGO-Eval: Towards Fine-Grained Evaluation on Synthesizing 3D Embodied Environments with Tool Augmentation : Abstract: Despite recent progress in using Large Language Models (LLMs) for automatically generating 3D scenes, generated scenes often lack realistic spatial layouts and object attributes found in rea...
Targeted Error Correction in Knowledge Distillation: Small Language Models Surpass GPT : Abstract: We introduce an Analyze-Revise-Finetune (ARF) pipeline that enables smaller open-source language models (LLMs) to surpass substantially larger proprietary models in customer service summariz...
ROBoto2: An Interactive System and Dataset for LLM-assisted Clinical Trial Risk of Bias Assessment : Abstract: We present ROBOTO2, an open-source, web-based platform for large language model (LLM)-assisted risk of bias (ROB) assessment of clinical trials. ROBOTO2 streamlines the traditionally labor-i...
A Computational Approach to Analyzing Disrupted Language in Schizophrenia: Integrating Surprisal and Coherence Measures : Abstract: Language disruptions are one of the well-known effects of schizophrenia symptoms. They are often manifested as disorganized speech and impaired discourse coherence. These abnormalities in sp...
MME-CC: A Challenging Multi-Modal Evaluation Benchmark of Cognitive Capacity : Abstract: As reasoning models scale rapidly, the essential role of multimodality in human cognition has come into sharp relief, driving a growing need to probe vision-centric cognitive behaviors. Yet,...
Measuring Aleatoric and Epistemic Uncertainty in LLMs: Empirical Evaluation on ID and OOD QA Tasks : Abstract: Large Language Models (LLMs) have become increasingly pervasive, finding applications across many industries and disciplines. Ensuring the trustworthiness of LLM outputs is paramount, where ...
BengaliMoralBench: A Benchmark for Auditing Moral Reasoning in Large Language Models within Bengali Language and Culture : Abstract: As multilingual Large Language Models (LLMs) gain traction across South Asia, their alignment with local ethical norms, particularly for Bengali, which is spoken by over 285 million people a...
Beyond Ranked Lists: The SARAL Framework for Cross-Lingual Document Set Retrieval : Abstract: Machine Translation for English Retrieval of Information in Any Language (MATERIAL) is an IARPA initiative targeted to advance the state of cross-lingual information retrieval (CLIR). This r...
IndicSuperTokenizer: An Optimized Tokenizer for Indic Multilingual LLMs : Abstract: Tokenizers play a crucial role in determining the performance, training efficiency, and the inference cost of Large Language Models (LLMs). Designing effective tokenizers for multilingual LL...
SCALE: Upscaled Continual Learning of Large Language Models : Abstract: We revisit continual pre-training for large language models and argue that progress now depends more on scaling the right structure than on scaling parameters alone. We introduce SCALE, a wi...
Silenced Biases: The Dark Side LLMs Learned to Refuse : Abstract: Safety-aligned large language models (LLMs) are becoming increasingly widespread, especially in sensitive applications where fairness is essential and biased outputs can cause significant ha...
EQ-Negotiator: Dynamic Emotional Personas Empower Small Language Models for Edge-Deployable Credit Negotiation : Abstract: The deployment of large language models (LLMs) in automated negotiation has set a high performance benchmark, but their computational cost and data privacy requirements render them unsuitabl...
LFC-DA: Logical Formula-Controlled Data Augmentation for Enhanced Logical Reasoning : Abstract: For complex logical data augmentation, heavy reliance on human annotation is costly, whereas direct generation with large language models yields uninterpretable and logically homogeneous exa...
Segmentation Beyond Defaults: Asymmetrical Byte Pair Encoding for Optimal Machine Translation Performance : Abstract: Existing Machine Translation (MT) research often suggests a single, fixed set of hyperparameters for word segmentation models, symmetric Byte Pair Encoding (BPE), which applies the same numb...
Overcoming the Generalization Limits of SLM Finetuning for Shape-Based Extraction of Datatype and Object Properties : Abstract: Small language models (SLMs) have shown promise for relation extraction (RE) when extracting RDF triples guided by SHACL shapes focused on common datatype properties. This paper investigate...
Efficient Reasoning via Thought-Training and Thought-Free Inference : Abstract: Recent advances in large language models (LLMs) have leveraged explicit Chain-of-Thought (CoT) prompting to improve reasoning accuracy. However, most existing methods primarily compress verb...
Knowledge-Augmented Question Error Correction for Chinese Question Answer System with QuestionRAG : Abstract: Input errors in question-answering (QA) systems often lead to incorrect responses. Large language models (LLMs) struggle with this task, frequently failing to interpret user intent (misinter...
Kastor: Fine-tuned Small Language Models for Shape-based Active Relation Extraction : Abstract: RDF pattern-based extraction is a compelling approach for fine-tuning small language models (SLMs) by focusing a relation extraction task on a specified SHACL shape. This technique enables t...
HaluMem: Evaluating Hallucinations in Memory Systems of Agents : Abstract: Memory systems are key components that enable AI systems such as LLMs and AI agents to achieve long-term learning and sustained interaction. However, during memory storage and retrieval, the...
One Battle After Another: Probing LLMs' Limits on Multi-Turn Instruction Following with a Benchmark Evolving Framework : Abstract: Understanding how well large language models can follow users' instructions throughout a dialogue spanning multiple topics is of great importance for data-intensive conversational applicatio...
Bearing Syntactic Fruit with Stack-Augmented Neural Networks : Abstract: Any finite set of training data is consistent with an infinite number of hypothetical algorithms that could have generated it. Studies have shown that when human children learn language, the...
ASVRI-Legal: Fine-Tuning LLMs with Retrieval Augmented Generation for Enhanced Legal Regulation : Abstract: In this study, we explore the fine-tuning of Large Language Models (LLMs) to better support policymakers in their crucial work of understanding, analyzing, and crafting legal regulations. To...
A systematic review of relation extraction task since the emergence of Transformers : Abstract: This article presents a systematic review of relation extraction (RE) research since the advent of Transformer-based models. Using an automated framework to collect and annotate publications...
Do Androids Dream of Unseen Puppeteers? Probing for a Conspiracy Mindset in Large Language Models : Abstract: In this paper, we investigate whether Large Language Models (LLMs) exhibit conspiratorial tendencies, whether they display sociodemographic biases in this domain, and how easily they can be ...
Let the Bees Find the Weak Spots: A Path Planning Perspective on Multi-Turn Jailbreak Attacks against LLMs : Abstract: Large Language Models (LLMs) have been widely deployed across various applications, yet their potential security and ethical risks have raised increasing concerns. Existing research employs ...
Beyond Citations: Measuring Idea-level Knowledge Diffusion from Research to Journalism and Policy-making : Abstract: Despite the importance of social science knowledge for various stakeholders, measuring its diffusion into different domains remains a challenge. This study uses a novel text-based approach t...
Retrieval-Augmented Feature Generation for Domain-Specific Classification : Abstract: Feature generation can significantly enhance learning outcomes, particularly for tasks with limited data. An effective way to improve feature generation is to expand the current feature spac...
Verdict: A Library for Scaling Judge-Time Compute : Abstract: The use of LLMs as automated judges ("LLM-as-a-judge") is now widespread, yet standard judges suffer from a multitude of reliability issues. To address these challenges, we introduce Verdict...
Does Synthetic Data Help Named Entity Recognition for Low-Resource Languages? : Abstract: Named Entity Recognition (NER) for low-resource languages aims to produce robust systems for languages where there is limited labeled training data available, and has been an area of increasi...
The Case for Repeatable, Open, and Expert-Grounded Hallucination Benchmarks in Large Language Models : Abstract: Plausible, but inaccurate, tokens in model-generated text are widely believed to be pervasive and problematic for the responsible adoption of language models. Despite this concern, there is ...
Read Your Own Mind: Reasoning Helps Surface Self-Confidence Signals in LLMs : Abstract: We study the source of uncertainty in DeepSeek R1-32B by analyzing its self-reported verbal confidence on question answering (QA) tasks. In the default answer-then-confidence setting, the mo...
LexTime: A Benchmark for Temporal Ordering of Legal Events : Abstract: Understanding temporal relationships and accurately reconstructing the event timeline is important for case law analysis, compliance monitoring, and legal summarization. However, existing be...
Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models : Abstract: Large language models (LLMs) have transformed natural language processing, but their reliable deployment requires effective uncertainty quantification (UQ). Existing UQ methods are often heu...
Scalable Medication Extraction and Discontinuation Identification from Electronic Health Records Using Large Language Models : Abstract: Identifying medication discontinuations in electronic health records (EHRs) is vital for patient safety but is often hindered by information being buried in unstructured notes. This study ai...
Post Persona Alignment for Multi-Session Dialogue Generation : Abstract: Multi-session persona-based dialogue generation presents challenges in maintaining long-term consistency and generating diverse, personalized responses. While large language models (LLMs) ex...
Token Perturbation Guidance for Diffusion Models : Abstract: Classifier-free guidance (CFG) has become an essential component of modern diffusion models to enhance both generation quality and alignment with input conditions. However, CFG requires spec...
Cropland Mapping using Geospatial Embeddings : Abstract: Accurate and up-to-date land cover maps are essential for understanding land use change, a key driver of climate change. Geospatial embeddings offer a more efficient and accessible way to ma...
ProM3E: Probabilistic Masked MultiModal Embedding Model for Ecology : Abstract: We introduce ProM3E, a probabilistic masked multimodal embedding model for any-to-any generation of multimodal representations for ecology. ProM3E is based on masked modality reconstruction ...
SCALE-VLP: Soft-Weighted Contrastive Volumetric Vision-Language Pre-training with Spatial-Knowledge Semantics : Abstract: Vision-language models (VLMs) have demonstrated strong cross-modal capabilities, yet most work remains limited to 2D data and assumes binary supervision (i.e., positive vs. negative pairs), ...
Learning with less: label-efficient land cover classification at very high spatial resolution using self-supervised deep learning : Abstract: Deep learning semantic segmentation methods have shown promising performance for very high 1-m resolution land cover classification, but the challenge of collecting large volumes of represen...
A Foundation Model for Brain MRI with Dynamic Modality Integration : Abstract: We present a foundation model for brain MRI that can work with different combinations of imaging sequences. The model uses one encoder with learnable modality embeddings, conditional layer n...
A Plug-and-Play Framework for Volumetric Light-Sheet Image Reconstruction : Abstract: Cardiac contraction is a rapid, coordinated process that unfolds across three-dimensional tissue on millisecond timescales. Traditional optical imaging is often inadequate for capturing dyna...
ISC-Perception: A Hybrid Computer Vision Dataset for Object Detection in Novel Steel Assembly : Abstract: The Intermeshed Steel Connection (ISC) system, when paired with robotic manipulators, can accelerate steel-frame assembly and improve worker safety by eliminating manual assembly. Dependable...
DentalSplat: Dental Occlusion Novel View Synthesis from Sparse Intra-Oral Photographs : Abstract: In orthodontic treatment, particularly within telemedicine contexts, observing patients' dental occlusion from multiple viewpoints facilitates timely clinical decision-making. Recent advance...
Online Learning to Rank under Corruption: A Robust Cascading Bandits Approach : Abstract: Online learning to rank (OLTR) studies how to recommend a short ranked list of items from a large pool and improves future rankings based on user clicks. This setting is commonly modeled as ...
An Efficient Classification Model for Cyber Text : Abstract: The uprising of deep learning methodology and practice in recent years has brought about a severe consequence of increasing carbon footprint due to the insatiable demand for computational re...
Towards Scalable Backpropagation-Free Gradient Estimation : Abstract: While backpropagation--reverse-mode automatic differentiation--has been extraordinarily successful in deep learning, it requires two passes (forward and backward) through the neural network ...
From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation : Abstract: LLMs can provide substantial zero-shot performance on diverse tasks using a simple task prompt, eliminating the need for training or fine-tuning. However, when applying these models to sensi...
Test Time Adaptation Using Adaptive Quantile Recalibration : Abstract: Domain adaptation is a key strategy for enhancing the generalizability of deep learning models in real-world scenarios, where test distributions often diverge significantly from the training...
UnCLe: Towards Scalable Dynamic Causal Discovery in Non-linear Temporal Systems : Abstract: Uncovering cause-effect relationships from observational time series is fundamental to understanding complex systems. While many methods infer static causal graphs, real-world systems often ...
Periodic Skill Discovery : Abstract: Unsupervised skill discovery in reinforcement learning (RL) aims to learn diverse behaviors without relying on external rewards. However, current methods often overlook the periodic nature o...
Cross-Modal Alignment via Variational Copula Modelling : Abstract: Various data modalities are common in real-world applications (e.g., electronic health records, medical images and clinical notes in healthcare). It is essential to develop multimodal learni...
A Probabilistic U-Net Approach to Downscaling Climate Simulations : Abstract: Climate models are limited by heavy computational costs, often producing outputs at coarse spatial resolutions, while many climate change impact studies require finer scales. Statistical dow...
Incorporating Quality of Life in Climate Adaptation Planning via Reinforcement Learning : Abstract: Urban flooding is expected to increase in frequency and severity as a consequence of climate change, causing wide-ranging impacts that include a decrease in urban Quality of Life (QoL). Mean...
A Feedback-Control Framework for Efficient Dataset Collection from In-Vehicle Data Streams : Abstract: Modern AI systems are increasingly constrained not by model capacity but by the quality and diversity of their data. Despite growing emphasis on data-centric AI, most datasets are still gath...
A unified physics-informed generative operator framework for general inverse problems : Abstract: Solving inverse problems governed by partial differential equations (PDEs) is central to science and engineering, yet remains challenging when measurements are sparse, noisy, or when the und...
Climate Adaptation with Reinforcement Learning: Economic vs. Quality of Life Adaptation Pathways : Abstract: Climate change will cause an increase in the frequency and severity of flood events, prompting the need for cohesive adaptation policymaking. Designing effective adaptation policies, however...
Decoupled Entropy Minimization : Abstract: Entropy Minimization (EM) is beneficial to reducing class overlap, bridging domain gap, and restricting uncertainty for various tasks in machine learning, yet its potential is limited. To st...
Diffusion Language Models are Super Data Learners : Abstract: Under strictly controlled pre-training settings, we observe a Crossover: when unique data is limited, diffusion language models (DLMs) consistently surpass autoregressive (AR) models by trai...
Multi-Objective Adaptive Rate Limiting in Microservices Using Deep Reinforcement Learning : Abstract: As cloud computing and microservice architectures become increasingly prevalent, API rate limiting has emerged as a critical mechanism for ensuring system stability and service quality. Trad...
A Probabilistic Approach to Pose Synchronization for Multi-Reference Alignment with Applications to MIMO Wireless Communication Systems : Abstract: From molecular imaging to wireless communications, the ability to align and reconstruct signals from multiple misaligned observations is crucial for system performance. We study the problem ...
Graph Neural AI with Temporal Dynamics for Comprehensive Anomaly Detection in Microservices : Abstract: This study addresses the problem of anomaly detection and root cause tracing in microservice architectures and proposes a unified framework that combines graph neural networks with temporal ...
SORTeD Rashomon Sets of Sparse Decision Trees: Anytime Enumeration : Abstract: Sparse decision tree learning provides accurate and interpretable predictive models that are ideal for high-stakes applications by finding the single most accurate tree within a (soft) size ...
A Modular, Data-Free Pipeline for Multi-Label Intention Recognition in Transportation Agentic AI Applications : Abstract: In this study, a modular, data-free pipeline for multi-label intention recognition is proposed for agentic AI applications in transportation. Unlike traditional intent recognition systems th...
TripleWin: Fixed-Point Equilibrium Pricing for Data-Model Coupled Markets : Abstract: The rise of the machine learning (ML) model economy has intertwined markets for training datasets and pre-trained models. However, most pricing approaches still separate data and model trans...
POEMS: Product of Experts for Interpretable Multi-omic Integration using Sparse Decoding : Abstract: Integrating different molecular layers, i.e., multiomics data, is crucial for unraveling the complexity of diseases; yet, most deep generative models either prioritize predictive performance...
Reinforcement Learning Using known Invariances : Abstract: In many real-world reinforcement learning (RL) problems, the environment exhibits inherent symmetries that can be exploited to improve learning efficiency. This paper develops a theoretical ...
RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse : Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) with retrieved context but often suffers from downgraded prefill performance as modern applications demand longer a...
NAP: Attention-Based Late Fusion for Automatic Sleep Staging : Abstract: Polysomnography signals are highly heterogeneous, varying in modality composition (e.g., EEG, EOG, ECG), channel availability (e.g., frontal, occipital EEG), and acquisition protocols across...
Why Less is More (Sometimes): A Theory of Data Curation : Abstract: This paper introduces a theoretical framework to resolve a central paradox in modern machine learning: When is it better to use less data? This question has become critical as classical scal...
Learning Without Critics? Revisiting GRPO in Classical Reinforcement Learning Environments : Abstract: Group Relative Policy Optimization (GRPO) has emerged as a scalable alternative to Proximal Policy Optimization (PPO) by eliminating the learned critic and instead estimating advantages thro...
Byzantine-Robust Federated Learning with Learnable Aggregation Weights : Abstract: Federated Learning (FL) enables clients to collaboratively train a global model without sharing their private data. However, the presence of malicious (Byzantine) clients poses significant c...
Flat Minima and Generalization: Insights from Stochastic Convex Optimization : Abstract: Understanding the generalization behavior of learning algorithms is a central goal of learning theory. A recently emerging explanation is that learning algorithms are successful in practice ...
TabGemma: Text-Based Tabular ICL via LLM using Continued Pretraining and Retrieval : Abstract: We study LLMs for tabular prediction with mixed text, numeric, and categorical fields. We introduce TabGemma, a schema-agnostic in-context learner that treats rows as sequences and tackles t...
Tensor-Efficient High-Dimensional Q-learning : Abstract: High-dimensional reinforcement learning faces challenges with complex calculations and low sample efficiency in large state-action spaces. Q-learning algorithms struggle particularly with th...
Going Beyond Expert Performance via Deep Implicit Imitation Reinforcement Learning : Abstract: Imitation learning traditionally requires complete state-action demonstrations from optimal or near-optimal experts. These requirements severely limit practical applicability, as many real-w...
Towards Formalizing Reinforcement Learning Theory : Abstract: In this paper, we formalize the almost sure convergence of $Q$-learning and linear temporal difference (TD) learning with Markovian samples using the Lean 4 theorem prover based on the Mathl...
Financial Management System for SMEs: Real-World Deployment of Accounts Receivable and Cash Flow Prediction : Abstract: Small and Medium Enterprises (SMEs), particularly freelancers and early-stage businesses, face unique financial management challenges due to limited resources, small customer bases, and cons...
nanoTabPFN: A Lightweight and Educational Reimplementation of TabPFN : Abstract: Tabular foundation models such as TabPFN have revolutionized predictive machine learning for tabular data. At the same time, the driving factors of this revolution are hard to understand. Ex...
SHIELD: Securing Healthcare IoT with Efficient Machine Learning Techniques for Anomaly Detection : Abstract: The integration of IoT devices in healthcare introduces significant security and reliability challenges, increasing susceptibility to cyber threats and operational anomalies. This study prop...
Behavior-Adaptive Q-Learning: A Unifying Framework for Offline-to-Online RL : Abstract: Offline reinforcement learning (RL) enables training from fixed data without online interaction, but policies learned offline often struggle when deployed in dynamic environments due to dist...
Shrinking the Variance: Shrinkage Baselines for Reinforcement Learning with Verifiable Rewards : Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful paradigm for post-training large reasoning models (LRMs) using policy-gradient methods such as GRPO. To stabil...
Supersimulators : Abstract: We prove that every randomized Boolean function admits a supersimulator: a randomized polynomial-size circuit whose output on random inputs cannot be efficiently distinguished from reality w...
Association-sensory spatiotemporal hierarchy and functional gradient-regularised recurrent neural network with implications for schizophrenia : Abstract: The human neocortex is functionally organised at its highest level along a continuous sensory-to-association (AS) hierarchy. This study characterises the AS hierarchy of patients with schizo...
ECGXtract: Deep Learning-based ECG Feature Extraction for Automated CVD Diagnosis : Abstract: This paper presents ECGXtract, a deep learning-based approach for interpretable ECG feature extraction, addressing the limitations of traditional signal processing and black-box machine lear...
Automatic Machine Translation Detection Using a Surrogate Multilingual Translation Model : Abstract: Modern machine translation (MT) systems depend on large parallel corpora, often collected from the Internet. However, recent evidence indicates that (i) a substantial portion of these texts ...
Scalable Single-Cell Gene Expression Generation with Latent Diffusion Models : Abstract: Computational modeling of single-cell gene expression is crucial for understanding cellular processes, but generating realistic expression profiles remains a major challenge. This difficulty...
Hybrid Convolution and Vision Transformer NAS Search Space for TinyML Image Classification : Abstract: Hybrids of Convolutional Neural Network (CNN) and Vision Transformer (ViT) have outperformed pure CNN or ViT architecture. However, since these architectures require large parameters and inc...
Unifying Information-Theoretic and Pair-Counting Clustering Similarity : Abstract: Comparing clusterings is central to evaluating unsupervised models, yet the many existing similarity measures can produce widely divergent, sometimes contradictory, evaluations. Clustering s...
Exploratory Analysis of Cyberattack Patterns on E-Commerce Platforms Using Statistical Methods : Abstract: Cyberattacks on e-commerce platforms have grown in sophistication, threatening consumer trust and operational continuity. This research presents a hybrid analytical framework that integrates...
The OpenHands Software Agent SDK: A Composable and Extensible Foundation for Production Agents : Abstract: Agents are now used widely in the process of software development, but building production-ready software engineering agents is a complex task. Deploying software agents effectively requires...
AnaFlow: Agentic LLM-based Workflow for Reasoning-Driven Explainable and Sample-Efficient Analog Circuit Sizing : Abstract: Analog/mixed-signal circuits are key for interfacing electronics with the physical world. Their design, however, remains a largely handcrafted process, resulting in long and error-prone desi...
Grounded Misunderstandings in Asymmetric Dialogue: A Perspectivist Annotation Scheme for MapTask : Abstract: Collaborative dialogue relies on participants incrementally establishing common ground, yet in asymmetric settings they may believe they agree while referring to different entities. We intro...
Beyond Single Pass, Looping Through Time: KG-IRAG with Iterative Knowledge Retrieval : Abstract: Graph Retrieval-Augmented Generation (GraphRAG) has proven highly effective in enhancing the performance of Large Language Models (LLMs) on tasks that require external knowledge. By leveragi...
Leveraging LLMs to Automate Energy-Aware Refactoring of Parallel Scientific Codes : Abstract: While large language models (LLMs) are increasingly used for generating parallel scientific codes, most efforts emphasize functional correctness, often overlooking performance, especially en...
Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning : Abstract: Retrieval-Augmented Generation (RAG) systems empower large language models (LLMs) with external knowledge, yet struggle with efficiency-accuracy trade-offs when scaling to large knowledge gr...
s3: You Don't Need That Much Data to Train a Search Agent via RL : Abstract: Retrieval-augmented generation (RAG) systems empower large language models (LLMs) to access external knowledge during inference. Recent advances have enabled LLMs to act as search agents via...
Multi-Agent Reinforcement Learning for Autonomous Multi-Satellite Earth Observation: A Realistic Case Study : Abstract: The exponential growth of Low Earth Orbit (LEO) satellites has revolutionised Earth Observation (EO) missions, addressing challenges in climate monitoring, disaster management, and more. How...
LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning : Abstract: Atomic commits, which address a single development concern, are a best practice in software development. In practice, however, developers often produce tangled commits that mix unrelated cha...
Reinforcement Learning Foundations for Deep Research Systems: A Survey : Abstract: Deep research systems, agentic AI that solve complex, multi-step tasks by coordinating reasoning, search across the open web and user files, and tool use, are moving toward hierarchical depl...
TabDSR: Decompose, Sanitize, and Reason for Complex Numerical Reasoning in Tabular Data : Abstract: Complex reasoning over tabular data is crucial in real-world data analysis, yet large language models (LLMs) often underperform due to complex queries, noisy data, and limited numerical capa...
The ORCA Benchmark: Evaluating Real-World Calculation Accuracy in Large Language Models : Abstract: We present ORCA (Omni Research on Calculation in AI) Benchmark - a novel benchmark that evaluates large language models (LLMs) on multi-domain, real-life quantitative reasoning using verifie...
Kosmos: An AI Scientist for Autonomous Discovery : Abstract: Data-driven scientific discovery requires iterative cycles of literature search, hypothesis generation, and data analysis. Substantial progress has been made towards AI agents that can autom...
Emotion Detection From Social Media Posts : Abstract: Over the last few years, social media has evolved into a medium for expressing personal views, emotions, and even business and political proposals, recommendations, and advertisements. We ad...
Transfer Learning-based Real-time Handgun Detection : Abstract: Traditional surveillance systems rely on human attention, limiting their effectiveness. This study employs convolutional neural networks and transfer learning to develop a real-time computer...
Survey on AI Ethics: A Socio-technical Perspective : Abstract: The past decade has observed a significant advancement in AI with deep learning-based models being deployed in diverse scenarios, including safety-critical applications. As these AI systems ...
Neural Physics: Using AI Libraries to Develop Physics-Based Solvers for Incompressible Computational Fluid Dynamics : Abstract: Numerical discretisations of partial differential equations (PDEs) can be written as discrete convolutions, which, themselves, are a key tool in AI libraries and used in convolutional neural...
A Survey of Graph Neural Networks in Real world: Imbalance, Noise, Privacy and OOD Challenges : Abstract: Graph-structured data exhibits universality and widespread applicability across diverse domains, such as social network analysis, biochemistry, financial fraud detection, and network securit...
A Reliable Cryptographic Framework for Empirical Machine Unlearning Evaluation : Abstract: Machine unlearning updates machine learning models to remove information from specific training samples, complying with data protection regulations that allow individuals to request the remo...
Autonomous Robotic Drilling System for Mice Cranial Window Creation : Abstract: Robotic assistance for experimental manipulation in the life sciences is expected to enable favorable outcomes, regardless of the skill of the scientist. Experimental specimens in the life s...
MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping : Abstract: Few-shot Semantic Segmentation addresses the challenge of segmenting objects in query images with only a handful of annotated examples. However, many previous state-of-the-art methods either...
Inverse Entropic Optimal Transport Solves Semi-supervised Learning via Data Likelihood Maximization : Abstract: Learning conditional distributions $\pi^*(\cdot|x)$ is a central problem in machine learning, which is typically approached via supervised methods with paired data $(x,y) \sim \pi^*$. Howeve...
Mastering Contact-rich Tasks by Combining Soft and Rigid Robotics with Imitation Learning : Abstract: Soft robots have the potential to revolutionize the use of robotic systems with their capability of establishing safe, robust, and adaptable interactions with their environment, but their pr...
Intelligent Computing Social Modeling and Methodological Innovations in Political Science in the Era of Large Language Models : Abstract: The recent wave of artificial intelligence, epitomized by large language models (LLMs), has presented opportunities and challenges for methodological innovation in political science, sparking ...
Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation : Abstract: Modern LLMs can now produce highly readable abstractive summaries, to the point that traditional automated metrics for evaluating summary quality, such as ROUGE, have saturated. However, LLM...
RAG-IT: Retrieval-Augmented Instruction Tuning for Automated Financial Analysis : Abstract: Financial analysis relies heavily on the interpretation of earnings reports to assess company performance and guide decision-making. Traditional methods for generating such analyses demand s...
REFA: Reference Free Alignment for multi-preference optimization : Abstract: To mitigate reward hacking from response verbosity, modern preference optimization methods are increasingly adopting length normalization (e.g., SimPO, ORPO, LN-DPO). While effective against...
From Haystack to Needle: Label Space Reduction for Zero-shot Classification : Abstract: We present Label Space Reduction (LSR), a novel method for improving zero-shot classification performance of Large Language Models (LLMs). LSR iteratively refines the classification label sp...
Beyond Covariance Matrix: The Statistical Complexity of Private Linear Regression : Abstract: We study the statistical complexity of private linear regression under an unknown, potentially ill-conditioned covariate distribution. Somewhat surprisingly, under privacy constraints the in...
A Survey on Text-Driven 360-Degree Panorama Generation : Abstract: The advent of text-driven 360-degree panorama generation, enabling the synthesis of 360-degree panoramic images directly from textual descriptions, marks a transformative advancement in imme...
Assessing the Macro and Micro Effects of Random Seeds on Fine-Tuning Large Language Models : Abstract: The impact of random seeds in fine-tuning large language models (LLMs) has been largely overlooked despite its potential influence on model performance. In this study, we systematically evalu...
SecRepoBench: Benchmarking Code Agents for Secure Code Completion in Real-World Repositories : Abstract: This paper introduces SecRepoBench, a benchmark to evaluate code agents on secure code completion in real-world repositories. SecRepoBench has 318 code completion tasks in 27 C/C++ repositor...
A data-driven framework for team selection in Fantasy Premier League : Abstract: Fantasy football is a billion-dollar industry with millions of participants. Under a fixed budget, managers select squads to maximize future Fantasy Premier League (FPL) points. This study f...
Traversal Verification for Speculative Tree Decoding : Abstract: Speculative decoding is a promising approach for accelerating large language models. The primary idea is to use a lightweight draft model to speculate the output of the target model for mult...
RoboRAN: A Unified Robotics Framework for Reinforcement Learning-Based Autonomous Navigation : Abstract: Autonomous robots must navigate and operate in diverse environments, from terrestrial and aquatic settings to aerial and space domains. While Reinforcement Learning (RL) has shown promise in...
This Time is Different: An Observability Perspective on Time Series Foundation Models : Abstract: We introduce Toto, a time series forecasting foundation model with 151 million parameters. Toto uses a modern decoder-only architecture coupled with architectural innovations designed to acc...
Distilling LLM Agent into Small Models with Retrieval and Code Tools : Abstract: Large language models (LLMs) excel at complex reasoning tasks but remain computationally expensive, limiting their practical deployment. To address this, recent works have focused on distill...
Large Language Models Miss the Multi-Agent Mark : Abstract: Recent interest in Multi-Agent Systems of Large Language Models (MAS LLMs) has led to an increase in frameworks leveraging multiple LLMs to tackle complex tasks. However, much of this litera...
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing : Abstract: Large Language Models (LLMs) achieve impressive reasoning capabilities at the cost of substantial inference overhead, posing substantial deployment challenges. Although distilled Small Langu...
SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving : Abstract: The growing gap between the increasing complexity of large language models (LLMs) and the limited computational budgets of edge devices poses a key challenge for efficient on-device inferenc...
Balancing Tails when Comparing Distributions: Comprehensive Equity Index (CEI) with Application to Bias Evaluation in Operational Face Biometrics : Abstract: Demographic bias in high-performance face recognition (FR) systems often eludes detection by existing metrics, especially with respect to subtle disparities in the tails of the score distrib...
AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs : Abstract: Weight decay is a standard regularization technique for training large language models (LLMs). While it is common to assign a uniform decay rate to every layer, this approach overlooks the s...
Dense SAE Latents Are Features, Not Bugs : Abstract: Sparse autoencoders (SAEs) are designed to extract interpretable features from language models by enforcing a sparsity constraint. Ideally, training an SAE would yield latents that are both ...
Benchmarking Foundation Models and Parameter-Efficient Fine-Tuning for Prognosis Prediction in Medical Imaging : Abstract: Despite the significant potential of Foundation Models (FMs) in medical imaging, their application to prognosis prediction remains challenging due to data scarcity, class imbalance, and task...
Layer Importance for Mathematical Reasoning is Forged in Pre-Training and Invariant after Post-Training : Abstract: Large language models improve at math after instruction tuning, reinforcement learning, or knowledge distillation. We ask whether these gains come from major changes in the transformer layer...
FedRef: Communication-Efficient Bayesian Fine-Tuning using a Reference Model : Abstract: Federated learning (FL) collaboratively trains artificial intelligence (AI) models to ensure user data privacy. Sharing only model updates generated from local training on client data with t...
Omni-Router: Sharing Routing Decisions in Sparse Mixture-of-Experts for Speech Recognition : Abstract: Mixture-of-experts (MoE) architectures have expanded from language modeling to automatic speech recognition (ASR). Traditional MoE methods, such as the Switch Transformer, route experts inde...
DiffSpectra: Molecular Structure Elucidation from Spectra using Diffusion Models : Abstract: Molecular structure elucidation from spectra is a fundamental challenge in molecular science. Conventional approaches rely heavily on expert interpretation and lack scalability, while retrie...
Automatic Road Subsurface Distress Recognition from Ground Penetrating Radar Images using Deep Learning-based Cross-verification : Abstract: Ground penetrating radar (GPR) has become a rapid and non-destructive solution for road subsurface distress (RSD) detection. Deep learning-based automatic RSD recognition, though amelioratin...
GDS Agent for Graph Algorithmic Reasoning : Abstract: Large language models (LLMs) have shown remarkable multimodal information processing and reasoning ability. When equipped with tools through function calling and enhanced with retrieval-augm...
LA-MARRVEL: A Knowledge-Grounded and Language-Aware LLM Reranker for AI-MARRVEL in Rare Disease Diagnosis : Abstract: Diagnosing rare diseases often requires connecting variant-bearing genes to evidence that is written as unstructured clinical prose, which the current established pipelines still leave for c...
Adaptive and Robust Data Poisoning Detection and Sanitization in Wearable IoT Systems using Large Language Models : Abstract: The widespread integration of wearable sensing devices in Internet of Things (IoT) ecosystems, particularly in healthcare, smart homes, and industrial applications, has required robust human...
Digital Twin-Driven Pavement Health Monitoring and Maintenance Optimization Using Graph Neural Networks : Abstract: Pavement infrastructure monitoring is challenged by complex spatial dependencies, changing environmental conditions, and non-linear deterioration across road networks. Traditional Pavement M...
Inference-Time Personalized Alignment with a Few User Preference Queries : Abstract: We study the problem of aligning a generative model's response with a user's preferences. Recent works have proposed several different formulations for personalized alignment; however, they ...
Heterogeneous Metamaterials Design via Multiscale Neural Implicit Representation : Abstract: Metamaterials are engineered materials composed of specially designed unit cells that exhibit extraordinary properties beyond those of natural materials. Complex engineering tasks often requ...
Discrete Bayesian Sample Inference for Graph Generation : Abstract: Generating graph-structured data is crucial in applications such as molecular generation, knowledge graphs, and network analysis. However, their discrete, unordered nature makes them difficu...
Leveraging Discrete Function Decomposability for Scientific Design : Abstract: In the era of AI-driven science and engineering, we often want to design discrete objects in silico according to user-specified properties. For example, we may wish to design a protein to bi...
Data-Efficient Realized Volatility Forecasting with Vision Transformers : Abstract: Recent work in financial machine learning has shown the virtue of complexity: the phenomenon by which deep learning methods capable of learning highly nonlinear relationships outperform simp...
Unsupervised Evaluation of Multi-Turn Objective-Driven Interactions : Abstract: Large language models (LLMs) have seen increasing popularity in enterprise applications where AI agents and humans engage in objective-driven interactions. However, these systems are difficu...
The Curved Spacetime of Transformer Architectures : Abstract: We present a geometric framework for understanding Transformer-based language models, drawing an explicit analogy to General Relativity. Queries and keys induce an effective metric on repres...
Homomorphism distortion: A metric to distinguish them all and in the latent space bind them : Abstract: For far too long, expressivity of graph neural networks has been measured only in terms of combinatorial properties. In this work we stray away from this tradition and provide a princ...
Evaluating Control Protocols for Untrusted AI Agents : Abstract: As AI systems become more capable and widely deployed as agents, ensuring their safe operation becomes critical. AI control offers one approach to mitigating the risk from untrusted AI agent...
PublicAgent: Multi-Agent Design Principles From an LLM-Based Open Data Analysis Framework : Abstract: Open data repositories hold potential for evidence-based decision-making, yet are inaccessible to non-experts lacking expertise in dataset discovery, schema mapping, and statistical analysis...
No-Human in the Loop: Agentic Evaluation at Scale for Recommendation : Abstract: Evaluating large language models (LLMs) as judges is increasingly critical for building scalable and trustworthy evaluation pipelines. We present ScalingEval, a large-scale benchmarking stud...
Epidemiology of Large Language Models: A Benchmark for Observational Distribution Knowledge : Abstract: Artificial intelligence (AI) systems hold great promise for advancing various scientific disciplines, and are increasingly used in real-world applications. Despite their remarkable progress,...
SnapStream: Efficient Long Sequence Decoding on Dataflow Accelerators : Abstract: The proliferation of 100B+ parameter Large Language Models (LLMs) with 100k+ context length support has resulted in increasing demands for on-chip memory to support large KV caches. Techniq...
Large language models require a new form of oversight: capability-based monitoring : Abstract: The rapid adoption of large language models (LLMs) in healthcare has been accompanied by scrutiny of their oversight. Existing monitoring approaches, inherited from traditional machine learn...
miniF2F-Lean Revisited: Reviewing Limitations and Charting a Path Forward : Abstract: We perform a thorough analysis of the formal and informal statements in the miniF2F benchmark from the perspective of an AI system that is tasked to participate in a math Olympiad consisting...
Using Multi-modal Large Language Model to Boost Fireworks Algorithm's Ability in Settling Challenging Optimization Tasks : Abstract: As optimization problems grow increasingly complex and diverse, advancements in optimization techniques and paradigm innovations hold significant importance. The challenges posed by optimiza...
A Proprietary Model-Based Safety Response Framework for AI Agents : Abstract: With the widespread application of Large Language Models (LLMs), their associated security issues have become increasingly prominent, severely constraining their trustworthy deployment in cr...
Uncovering Bugs in Formal Explainers: A Case Study with PyXAI : Abstract: Formal explainable artificial intelligence (XAI) offers unique theoretical guarantees of rigor when compared to other non-formal methods of explainability. However, little attention has been...
Toward Autonomous Engineering Design: A Knowledge-Guided Multi-Agent Framework : Abstract: The engineering design process often demands expertise from multiple domains, leading to complex collaborations and iterative refinements. Traditional methods can be resource-intensive and p...
Adobe Summit Concierge Evaluation with Human in the Loop : Abstract: Generative AI assistants offer significant potential to enhance productivity, streamline information access, and improve user experience in enterprise contexts. In this work, we present Summ...
From Five Dimensions to Many: Large Language Models as Precise and Interpretable Psychological Profilers : Abstract: Psychological constructs within individuals are widely believed to be interconnected. We investigated whether and how Large Language Models (LLMs) can model the correlational structure of hu...
Towards Scalable Web Accessibility Audit with MLLMs as Copilots : Abstract: Ensuring web accessibility is crucial for advancing social welfare, justice, and equality in digital spaces, yet the vast majority of website user interfaces remain non-compliant, due in par...
Explaining Decisions in ML Models: a Parameterized Complexity Analysis (Part I) : Abstract: This paper presents a comprehensive theoretical investigation into the parameterized complexity of explanation problems in various machine learning (ML) models. Contrary to the prevalent bla...
Outbidding and Outbluffing Elite Humans: Mastering Liar's Poker via Self-Play and Reinforcement Learning : Abstract: AI researchers have long focused on poker-like games as a testbed for environments characterized by multi-player dynamics, imperfect information, and reasoning under uncertainty. While recen...
An extended reality-based framework for user risk training in urban built environment : Abstract: In the context of increasing urban risks, particularly from climate change-induced flooding, this paper presents an extended Reality (XR)-based framework to improve user risk training within...
Evaluating Generative AI as an Educational Tool for Radiology Resident Report Drafting : Abstract: Objective: Radiology residents require timely, personalized feedback to develop accurate image analysis and reporting skills. Increasing clinical workload often limits attendings' ability to...
Digital Transformation Chatbot (DTchatbot): Integrating Large Language Model-based Chatbot in Acquiring Digital Transformation Needs : Abstract: Many organisations pursue digital transformation to enhance operational efficiency, reduce manual efforts, and optimise processes by automation and digital tools. To achieve this, a comprehe...
AI-Enhanced Wi-Fi Sensing Through Single Transceiver Pair : Abstract: The advancement of next-generation Wi-Fi technology heavily relies on sensing capabilities, which play a pivotal role in enabling sophisticated applications. In response to the growing deman...
Spatio-Temporal Attention Network for Epileptic Seizure Prediction : Abstract: In this study, we present a deep learning framework that learns complex spatio-temporal correlation structures of EEG signals through a Spatio-Temporal Attention Network (STAN) for accurate ...
EEGReXferNet: A Lightweight Gen-AI Framework for EEG Subspace Reconstruction via Cross-Subject Transfer Learning and Channel-Aware Embedding : Abstract: Electroencephalography (EEG) is a widely used non-invasive technique for monitoring brain activity, but low signal-to-noise ratios (SNR) due to various artifacts often compromise its utility...
Approaching Low-Cost Cardiac Intelligence with Semi-Supervised Knowledge Distillation : Abstract: Deploying advanced cardiac artificial intelligence for daily cardiac monitoring is hindered by its reliance on extensive medical data and high computational resources. Low-cost cardiac intel...
Consciousness-ECG Transformer for Conscious State Estimation System with Real-Time Monitoring : Abstract: Conscious state estimation is important in various medical settings, including sleep staging and anesthesia management, to ensure patient safety and optimize health outcomes. Traditional met...
SELF-REDRAFT: Eliciting Intrinsic Exploration-Exploitation Balance in Test-Time Scaling for Code Generation : Abstract: Test-time scaling without interpreter feedback is essential for real-world code generation scenarios where test cases are not readily available. While existing paradigms often rely on either...
Digitizing Spermatogenesis Lineage at Nanoscale Resolution In Tissue-Level Electron Microscopy : Abstract: Recent advances in 2D large-scale and 3D volume electron microscopy have stimulated the rapid development of nanoscale functional analysis at the tissue and organ levels. Digitizing the cell...
Mathematical exploration and discovery at scale : Abstract: AlphaEvolve is a generic evolutionary coding agent that combines the generative capabilities of LLMs with automated evaluation in an iterative evolutionary framework that proposes, tests, an...
LM-Fix: Lightweight Bit-Flip Detection and Rapid Recovery Framework for Language Models : Abstract: This paper presents LM-Fix, a lightweight detection and rapid recovery framework for faults in large language models (LLMs). Existing integrity approaches are often heavy or slow for modern ...
Proof-of-Spiking-Neurons (PoSN): Neuromorphic Consensus for Next-Generation Blockchains : Abstract: Blockchain systems face persistent challenges of scalability, latency, and energy inefficiency. Existing consensus protocols such as Proof-of-Work (PoW) and Proof-of-Stake (PoS) either consu...
Analysis of AdvFusion: Adapter-based Multilingual Learning for Code Large Language Models : Abstract: Programming languages can benefit from one another by utilizing a language model for software engineering tasks. Full fine-tuning and Parameter Efficient Fine-Tuning (PEFT) of Code Language ...
FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels : Abstract: Recent advances in large language models (LLMs) have demonstrated impressive capabilities in formal theorem proving, particularly on contest-based mathematical benchmarks like the IMO. Howev...
Academics and Generative AI: Empirical and Epistemic Indicators of Policy-Practice Voids : Abstract: As generative AI diffuses through academia, policy-practice divergence becomes consequential, creating demand for auditable indicators of alignment. This study prototypes a ten-item, indirec...
A Novel Reservoir Computing Framework for Chaotic Time Series Prediction Using Time Delay Embedding and Random Fourier Features : Abstract: Forecasting chaotic time series requires models that can capture the intrinsic geometry of the underlying attractor while remaining computationally efficient. We introduce a novel reservoir ...
Stochastic Deep Graph Clustering for Practical Group Formation : Abstract: While prior work on group recommender systems (GRSs) has primarily focused on improving recommendation accuracy, most approaches assume static or predefined groups, making them unsuitable fo...
NEF-NET+: Adapting Electrocardio panorama in the wild : Abstract: Conventional multi-lead electrocardiogram (ECG) systems capture cardiac signals from a fixed set of anatomical viewpoints defined by lead placement. However, certain cardiac conditions (e.g....
AgentSLA : Towards a Service Level Agreement for AI Agents : Abstract: AI components are increasingly becoming a key element of all types of software systems to enhance their functionality. These AI components are often implemented as AI Agents, offering more a...
Test-time Adaptation of Tiny Recursive Models : Abstract: Prior to the close of the 2025 ARC Prize competition, the leading open source approach - known as TRM, or Tiny Recursive Models - involved training a 7M parameter recursive neural network on...
Predicting Weekly Fishing Concentration Zones through Deep Learning Integration of Heterogeneous Environmental Spatial Datasets : Abstract: The North Indian Ocean, including the Arabian Sea and the Bay of Bengal, represents a vital source of livelihood for coastal communities, yet fishermen often face uncertainty in locating pro...
NABench: Large-Scale Benchmarks of Nucleotide Foundation Models for Fitness Prediction : Abstract: Nucleotide sequence variation can induce significant shifts in functional fitness. Recent nucleotide foundation models promise to predict such fitness effects directly from sequence, yet het...
A Criminology of Machines : Abstract: While the possibility of reaching human-like Artificial Intelligence (AI) remains controversial, the likelihood that the future will be characterized by a society with a growing presence of ...
Performance Evaluation of Bitstring Representations in a Linear Genetic Programming Framework : Abstract: Different bitstring representations can yield varying computational performance. This work compares three bitstring implementations in C++: std::bitset, boost::dynamic_bitset, and a custom d...
Generative Hints : Abstract: Data augmentation is widely used in vision to introduce variation and mitigate overfitting, through enabling models to learn invariant properties, such as spatial invariance. However, these ...
Zero-shot data citation function classification using transformer-based large language models (LLMs) : Abstract: Efforts have increased in recent years to identify associations between specific datasets and the scientific literature that incorporates them. Knowing that a given publication cites a given...
From Narrow to Wide: Autoencoding Transformers for Ultrasound Bandwidth Recovery : Abstract: Conventional pulse-echo ultrasound suffers when low-cost probes deliver only narrow fractional bandwidths, elongating pulses and erasing high-frequency detail. We address this limitation by ...
Power Constrained Nonstationary Bandits with Habituation and Recovery Dynamics : Abstract: A common challenge for decision makers is selecting actions whose rewards are unknown and evolve over time based on prior policies. For instance, repeated use may reduce an action's effectiv...
EvtSlowTV - A Large and Diverse Dataset for Event-Based Depth Estimation : Abstract: Event cameras, with their high dynamic range (HDR) and low latency, offer a promising alternative for robust depth estimation in challenging environments. However, many event-based depth est...
Value of Information-Enhanced Exploration in Bootstrapped DQN : Abstract: Efficient exploration in deep reinforcement learning remains a fundamental challenge, especially in environments characterized by high-dimensional states and sparse rewards. Traditional expl...
Systematizing LLM Persona Design: A Four-Quadrant Technical Taxonomy for AI Companion Applications : Abstract: The design and application of LLM-based personas in AI companionship is a rapidly expanding but fragmented field, spanning from virtual emotional companions and game NPCs to embodied funct...
SLIP: Structural-aware Language-Image Pretraining for Vision-Language Alignment : Abstract: Vision-Language Pretraining (VLP) has achieved remarkable success across various downstream tasks, but such gains are largely driven by scaling up on training data. Yet, literature methods t...
Adaptive-Sensorless Monitoring of Shipping Containers : Abstract: Monitoring the internal temperature and humidity of shipping containers is essential to preventing quality degradation during cargo transportation. Sensorless monitoring -- machine learning ...
Reading Between the Lines: The One-Sided Conversation Problem : Abstract: Conversational AI is constrained in many real-world settings where only one side of a dialogue can be recorded, such as telemedicine, call centers, and smart glasses. We formalize this as th...
Sparse, self-organizing ensembles of local kernels detect rare statistical anomalies : Abstract: Modern artificial intelligence has revolutionized our ability to extract rich and versatile data representations across scientific disciplines. Yet, the statistical properties of these repre...
Scaling Multi-Agent Environment Co-Design with Diffusion Models : Abstract: The agent-environment co-design paradigm jointly optimises agent policies and environment configurations in search of improved system performance. With application domains ranging from wareh...
CARMA: Comprehensive Automatically-annotated Reddit Mental Health Dataset for Arabic : Abstract: Mental health disorders affect millions worldwide, yet early detection remains a major challenge, particularly for Arabic-speaking populations where resources are limited and mental health d...
Adaptive Detection of Software Aging under Workload Shift : Abstract: Software aging is a phenomenon that affects long-running systems, leading to progressive performance degradation and increasing the risk of failures. To mitigate this problem, this work prop...
FP-AbDiff: Improving Score-based Antibody Design by Capturing Nonequilibrium Dynamics through the Underlying Fokker-Planck Equation : Abstract: Computational antibody design holds immense promise for therapeutic discovery, yet existing generative models are fundamentally limited by two core challenges: (i) a lack of dynamical consis...
An Augmentation Overlap Theory of Contrastive Learning : Abstract: Recently, self-supervised contrastive learning has achieved great success on various tasks. However, its underlying working mechanism is yet unclear. In this paper, we first provide the tigh...
Image-Intrinsic Priors for Integrated Circuit Defect Detection and Novel Class Discovery via Self-Supervised Learning : Abstract: Integrated circuit manufacturing is highly complex, comprising hundreds of process steps. Defects can arise at any stage, causing yield loss and ultimately degrading product reliability. Sup...
Control Barrier Function for Aligning Large Language Models : Abstract: This paper proposes a control-based framework for aligning large language models (LLMs) by leveraging a control barrier function (CBF) to ensure user-desirable text generation. The presented...
EGMOF: Efficient Generation of Metal-Organic Frameworks Using a Hybrid Diffusion-Transformer Architecture : Abstract: Designing materials with targeted properties remains challenging due to the vastness of chemical space and the scarcity of property-labeled data. While recent advances in generative models o...
Optimal Boundary Control of Diffusion on Graphs via Linear Programming : Abstract: We propose a linear programming (LP) framework for steady-state diffusion and flux optimization on geometric networks. The state variable satisfies a discrete diffusion law on a weighted, or...
Deploying Rapid Damage Assessments from sUAS Imagery for Disaster Response : Abstract: This paper presents the first AI/ML system for automating building damage assessment in uncrewed aerial systems (sUAS) imagery to be deployed operationally during federally declared disaster...
From Measurement to Expertise: Empathetic Expert Adapters for Context-Based Empathy in Conversational AI Agents : Abstract: Empathy is a critical factor in fostering positive user experiences in conversational AI. While models can display empathy, it is often generic rather than tailored to specific tasks and con...
Forecast2Anomaly (F2A): Adapting Multivariate Time Series Foundation Models for Anomaly Prediction : Abstract: Forecasting anomalies (anomaly prediction) in multivariate time series from different real-world, dynamic, and complex systems is vital for preempting critical failures, leading to a substan...
Who Sees the Risk? Stakeholder Conflicts and Explanatory Policies in LLM-based Risk Assessment : Abstract: Understanding how different stakeholders perceive risks in AI systems is essential for their responsible deployment. This paper presents a framework for stakeholder-grounded risk assessment ...
RefAgent: A Multi-agent LLM-based Framework for Automatic Software Refactoring : Abstract: Large Language Models (LLMs) have substantially influenced various software engineering tasks. Indeed, in the case of software refactoring, traditional LLMs have shown the ability to reduce ...
GraphCliff: Short-Long Range Gating for Subtle Differences but Critical Changes : Abstract: Quantitative structure-activity relationship assumes a smooth relationship between molecular structure and biological activity. However, activity cliffs defined as pairs of structurally simi...
Optimizing Earth-Moon Transfer and Cislunar Navigation: Integrating Low-Energy Trajectories, AI Techniques and GNSS-R Technologies : Abstract: The rapid growth of cislunar activities, including lunar landings, the Lunar Gateway, and in-space refueling stations, requires advances in cost-efficient trajectory design and reliable inte...
Efficient Linear Attention for Multivariate Time Series Modeling via Entropy Equality : Abstract: Attention mechanisms have been extensively employed in various applications, including time series modeling, owing to their capacity to capture intricate dependencies; however, their utility...
A Quantized VAE-MLP Botnet Detection Model: A Systematic Evaluation of Quantization-Aware Training and Post-Training Quantization Strategies : Abstract: In an effort to counter the increasing IoT botnet-based attacks, state-of-the-art deep learning methods have been proposed and have achieved impressive detection accuracy. However, their com...
QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models : Abstract: Recently, Multimodal Large Language Models (MLLMs) encounter two key issues in multi-image contexts: (1) a lack of fine-grained perception across disparate images, and (2) a diminished capab...
Retrofitters, pragmatists and activists: Public interest litigation for accountable automated decision-making : Abstract: This paper examines the role of public interest litigation in promoting accountability for AI and automated decision-making (ADM) in Australia. Since ADM regulation faces geopolitical headwin...
LGM: Enhancing Large Language Models with Conceptual Meta-Relations and Iterative Retrieval : Abstract: Large language models (LLMs) exhibit strong semantic understanding, yet struggle when user instructions involve ambiguous or conceptually misaligned terms. We propose the Language Graph Mode...
Hybrid Fact-Checking that Integrates Knowledge Graphs, Large Language Models, and Search-Based Retrieval Agents Improves Interpretable Claim Verification : Abstract: Large language models (LLMs) excel in generating fluent utterances but can lack reliable grounding in verified information. At the same time, knowledge-graph-based fact-checkers deliver prec...
Node-Based Editing for Multimodal Generation of Text, Audio, Image, and Video : Abstract: We present a node-based storytelling system for multimodal content generation. The system represents stories as graphs of nodes that can be expanded, edited, and iteratively refined through ...
GMoPE: A Prompt-Expert Mixture Framework for Graph Foundation Models : Abstract: Graph Neural Networks (GNNs) have demonstrated impressive performance on task-specific benchmarks, yet their ability to generalize across diverse domains and tasks remains limited. Existing ...
Generative deep learning for foundational video translation in ultrasound : Abstract: Deep learning (DL) has the potential to revolutionize image acquisition and interpretation across medicine; however, attention to data imbalance and missingness is required. Ultrasound data ...
Comparing the Performance of LLMs in RAG-based Question-Answering: A Case Study in Computer Science Literature : Abstract: Retrieval Augmented Generation (RAG) is emerging as a powerful technique to enhance the capabilities of Generative AI models by reducing hallucination. Thus, the increasing prominence of RAG...
When Generative Artificial Intelligence meets Extended Reality: A Systematic Review : Abstract: With the continuous advancement of technology, the application of generative artificial intelligence (AI) in various fields is gradually demonstrating great potential, particularly when comb...
How to Evaluate Speech Translation with Source-Aware Neural MT Metrics : Abstract: Automatic evaluation of speech-to-text translation (ST) systems is typically performed by comparing translation hypotheses with one or more reference translations. While effective to some ex...
Extending Fair Null-Space Projections for Continuous Attributes to Kernel Methods : Abstract: With the on-going integration of machine learning systems into the everyday social life of millions the notion of fairness becomes an ever increasing priority in their development. Fairness ...
Benchmarking the Thinking Mode of Multimodal Large Language Models in Clinical Tasks : Abstract: A recent advancement in Multimodal Large Language Models (MLLMs) research is the emergence of "reasoning MLLMs" that offer explicit control over their internal thinking processes (normally r...
Discourse-Aware Scientific Paper Recommendation via QA-Style Summarization and Multi-Level Contrastive Learning : Abstract: The rapid growth of open-access (OA) publications has intensified the challenge of identifying relevant scientific papers. Due to privacy constraints and limited access to user interaction d...
Generative Artificial Intelligence in Bioinformatics: A Systematic Review of Models, Applications, and Methodological Advances : Abstract: Generative artificial intelligence (GenAI) has become a transformative approach in bioinformatics that often enables advancements in genomics, proteomics, transcriptomics, structural biology...
Open Source State-Of-the-Art Solution for Romanian Speech Recognition : Abstract: In this work, we present a new state-of-the-art Romanian Automatic Speech Recognition (ASR) system based on NVIDIA's FastConformer architecture--explored here for the first time in the conte...
Decoupling Augmentation Bias in Prompt Learning for Vision-Language Models : Abstract: Recent advances in large-scale vision and language models have led to significant progress in zero-shot learning tasks. Methods such as CoOp and CoCoOp have shown that replacing handcrafted ...
Computational Imaging Meets LLMs: Zero-Shot IDH Mutation Prediction in Brain Gliomas : Abstract: We present a framework that combines Large Language Models with computational image analytics for non-invasive, zero-shot prediction of IDH mutation status in brain gliomas. For each subject...
Adaptable Hindsight Experience Replay for Search-Based Learning : Abstract: AlphaZero-like Monte Carlo Tree Search systems, originally introduced for two-player games, dynamically balance exploration and exploitation using neural network guidance. This combination m...
Light over Heavy: Automated Performance Requirements Quantification with Linguistic Inducement : Abstract: Elicited performance requirements need to be quantified for compliance in different engineering tasks, e.g., configuration tuning and performance testing. Much existing work has relied on ma...
Inter-Agent Trust Models: A Comparative Study of Brief, Claim, Proof, Stake, Reputation and Constraint in Agentic Web Protocol Design-A2A, AP2, ERC-8004, and Beyond : Abstract: As the "agentic web" takes shape-billions of AI agents (often LLM-powered) autonomously transacting and collaborating-trust shifts from human oversight to protocol design. In 2025, several i...
CareMedEval dataset: Evaluating Critical Appraisal and Reasoning in the Biomedical Field : Abstract: Critical appraisal of scientific literature is an essential skill in the biomedical field. While large language models (LLMs) can offer promising support in this task, their reliability rema...
Development of the Bioinspired Tendon-Driven DexHand 021 with Proprioceptive Compliance Control : Abstract: The human hand plays a vital role in daily life and industrial applications, yet replicating its multifunctional capabilities-including motion, sensing, and coordinated manipulation-with rob...
ROSBag MCP Server: Analyzing Robot Data with LLMs for Agentic Embodied AI Applications : Abstract: Agentic AI systems and Physical or Embodied AI systems have been two key research verticals at the forefront of Artificial Intelligence and Robotics, with Model Context Protocol (MCP) increa...
A Theoretical Framework for Environmental Similarity and Vessel Mobility as Coupled Predictors of Marine Invasive Species Pathways : Abstract: Marine invasive species spread through global shipping and generate substantial ecological and economic impacts. Traditional risk assessments require detailed records of ballast water and tr...
Efficient Neural Networks with Discrete Cosine Transform Activations : Abstract: In this paper, we extend our previous work on the Expressive Neural Network (ENN), a multilayer perceptron with adaptive activation functions parametrized using the Discrete Cosine Transform...
SOLVE-Med: Specialized Orchestration for Leading Vertical Experts across Medical Specialties : Abstract: Medical question answering systems face deployment challenges including hallucinations, bias, computational demands, privacy concerns, and the need for specialized expertise across diverse d...
Uncovering Code Insights: Leveraging GitHub Artifacts for Deeper Code Understanding : Abstract: Understanding the purpose of source code is a critical task in software maintenance, onboarding, and modernization. While large language models (LLMs) have shown promise in generating code e...
MultiZebraLogic: A Multilingual Logical Reasoning Benchmark : Abstract: Measuring the full abilities of large language models (LLMs) requires benchmarks representing multiple tasks. We aim to create large, high-quality datasets for comparison of logical reasonin...
AILA--First Experiments with Localist Language Models : Abstract: This paper presents the first empirical demonstration of controllable locality in transformer language models, a novel architectural framework that enables continuous control over the degree...
Imitation Learning in the Deep Learning Era: A Novel Taxonomy and Recent Advances : Abstract: Imitation learning (IL) enables agents to acquire skills by observing and replicating the behavior of one or multiple experts. In recent years, advances in deep learning have significantly e...
Multi-User Personalisation in Human-Robot Interaction: Using Quantitative Bipolar Argumentation Frameworks for Preferences Conflict Resolution : Abstract: While personalisation in Human-Robot Interaction (HRI) has advanced significantly, most existing approaches focus on single-user adaptation, overlooking scenarios involving multiple stakehol...
Learning Under Laws: A Constraint-Projected Neural PDE Solver that Eliminates Hallucinations : Abstract: Neural networks can approximate solutions to partial differential equations, but they often break the very laws they are meant to model-creating mass from nowhere, drifting shocks, or violat...
PerfDojo: Automated ML Library Generation for Heterogeneous Architectures : Abstract: The increasing complexity of machine learning models and the proliferation of diverse hardware architectures (CPUs, GPUs, accelerators) make achieving optimal performance a significant chall...
Step-Audio-EditX Technical Report : Abstract: We present Step-Audio-EditX, the first open-source LLM-based audio model excelling at expressive and iterative audio editing encompassing emotion, speaking style, and paralinguistics alongsi...
Visualization Biases MLLM's Decision Making in Network Data Tasks : Abstract: We evaluate how visualizations can influence the judgment of MLLMs about the presence or absence of bridges in a network. We show that the inclusion of visualization improves confidence over...
LiveTradeBench: Seeking Real-World Alpha with Large Language Models : Abstract: Large language models (LLMs) achieve strong performance across benchmarks--from knowledge quizzes and math reasoning to web-agent tasks--but these tests occur in static settings, lacking rea...
Watermarking Large Language Models in Europe: Interpreting the AI Act in Light of Technology : Abstract: To foster trustworthy Artificial Intelligence (AI) within the European Union, the AI Act requires providers to mark and detect the outputs of their general-purpose models. The Article 50 and...
Explaining Human Choice Probabilities with Simple Vector Representations : Abstract: When people pursue rewards in stochastic environments, they often match their choice frequencies to the observed target frequencies, even when this policy is demonstrably sub-optimal. We use...
ChiMDQA: Towards Comprehensive Chinese Document QA with Fine-grained Evaluation : Abstract: With the rapid advancement of natural language processing (NLP) technologies, the demand for high-quality Chinese document question-answering datasets is steadily growing. To address this is...
DQN Performance with Epsilon Greedy Policies and Prioritized Experience Replay : Abstract: We present a detailed study of Deep Q-Networks in finite environments, emphasizing the impact of epsilon-greedy exploration schedules and prioritized experience replay. Through systematic ex...
Whisper Leak: a side-channel attack on Large Language Models : Abstract: Large Language Models (LLMs) are increasingly deployed in sensitive domains including healthcare, legal services, and confidential communications, where privacy is paramount. This paper intr...
Structured Matrix Scaling for Multi-Class Calibration : Abstract: Post-hoc recalibration methods are widely used to ensure that classifiers provide faithful probability estimates. We argue that parametric recalibration functions based on logistic regressio...

Research Sources: 375 | Generated: 11/6/2025