AI RESEARCH PAPERS & ACADEMIC SOURCES
- Signal Intensity-weighted coordinate channels improve learning stability and generalisation in 1D and 2D CNNs in localisation tasks on biomedical signals : Abstract: Localisation tasks in biomedical data often require models to learn meaningful spatial or temporal relationships from signals with complex intensity distributions. A common strategy, exempli...
- A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential : Abstract: This paper presents a lightweight three-dimensional convolutional neural network (3DCNN) for human activity recognition (HAR) using event-based vision data. Privacy preservation is a key cha...
- Part-Aware Bottom-Up Group Reasoning for Fine-Grained Social Interaction Detection : Abstract: Social interactions often emerge from subtle, fine-grained cues such as facial expressions, gaze, and gestures. However, existing methods for social interaction detection overlook such nuanc...
- Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition : Abstract: Effective explanations of video action recognition models should disentangle how movements unfold over time from the surrounding spatial context. However, existing methods based on saliency ...
- Benchmarking ResNet for Short-Term Hypoglycemia Classification with DiaData : Abstract: Individualized therapy is driven forward by medical data analysis, which provides insight into the patient's context. In particular, for Type 1 Diabetes (T1D), which is an autoimmune disease...
- Optimizing the nnU-Net model for brain tumor (Glioma) segmentation Using a BraTS Sub-Saharan Africa (SSA) dataset : Abstract: Medical image segmentation is a critical achievement in modern medical science, developed over decades of research. It allows for the exact delineation of anatomical and pathological feature...
- Domain-Adaptive Transformer for Data-Efficient Glioma Segmentation in Sub-Saharan MRI : Abstract: Glioma segmentation is critical for diagnosis and treatment planning, yet remains challenging in Sub-Saharan Africa due to limited MRI infrastructure and heterogeneous acquisition protocols ...
- Comprehensive Assessment of LiDAR Evaluation Metrics: A Comparative Study Using Simulated and Real Data : Abstract: For developing safe Autonomous Driving Systems (ADS), rigorous testing is required before they are deemed safe for road deployments. Since comprehensive conventional physical testing is impr...
- Morpho-Genomic Deep Learning for Ovarian Cancer Subtype and Gene Mutation Prediction from Histopathology : Abstract: Ovarian cancer remains one of the most lethal gynecological malignancies, largely due to late diagnosis and extensive heterogeneity across subtypes. Current diagnostic methods are limited in...
- Seeing What You Say: Expressive Image Generation from Speech : Abstract: This paper proposes VoxStudio, the first unified and end-to-end speech-to-image model that generates expressive images directly from spoken descriptions by jointly aligning linguistic and pa...
- OneOcc: Semantic Occupancy Prediction for Legged Robots with a Single Panoramic Camera : Abstract: Robust 3D semantic occupancy is crucial for legged/humanoid robots, yet most semantic scene completion (SSC) systems target wheeled platforms with forward-facing sensors. We present OneOcc, ...
- Flying Robotics Art: ROS-based Drone Draws the Record-Breaking Mural : Abstract: This paper presents the innovative design and successful deployment of a pioneering autonomous unmanned aerial system developed for executing the world's largest mural painted by a drone. Ad...
- A New Comprehensive Framework for Multi-Exposure Stereo Coding Utilizing Low Rank Tucker-ALS and 3D-HEVC Techniques : Abstract: Display technology must offer high dynamic range (HDR) contrast-based depth induction and 3D personalization simultaneously. Efficient algorithms to compress HDR stereo data are critical. Dir...
- Seal2Real: Prompt Prior Learning on Diffusion Model for Unsupervised Document Seal Data Generation and Realisation : Abstract: Seal-related tasks in document processing (such as seal segmentation, authenticity verification, seal removal, and text recognition under seals) hold substantial commercial importance. However...
- BoxCell: Leveraging SAM for Cell Segmentation with Box Supervision : Abstract: Cell segmentation in histopathological images is vital for the diagnosis and treatment of several diseases. Annotating data is tedious and requires medical expertise, making it difficult to em...
- A Label Propagation Strategy for CutMix in Multi-Label Remote Sensing Image Classification : Abstract: The development of supervised deep learning-based methods for multi-label scene classification (MLC) is one of the prominent research directions in remote sensing (RS). However, collecting a...
- ROADWork: A Dataset and Benchmark for Learning to Recognize, Observe, Analyze and Drive Through Work Zones : Abstract: Perceiving and autonomously navigating through work zones is a challenging and underexplored problem. Open datasets for this long-tailed scenario are scarce. We propose the ROADWork dataset ...
- FusionRF: High-Fidelity Satellite Neural Radiance Fields from Multispectral and Panchromatic Acquisitions : Abstract: We introduce FusionRF, a novel framework for digital surface reconstruction from satellite multispectral and panchromatic images. Current work has demonstrated the increased accuracy of neur...
- SAM-EM: Real-Time Segmentation for Automated Liquid Phase Transmission Electron Microscopy : Abstract: The absence of robust segmentation frameworks for noisy liquid phase transmission electron microscopy (LPTEM) videos prevents reliable extraction of particle trajectories, creating a major b...
- ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation : Abstract: While humans effortlessly draw visual objects and shapes by adaptively allocating attention based on their complexity, existing multimodal large language models (MLLMs) remain constrained by...
- ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS : Abstract: Feed-forward 3D Gaussian Splatting (3DGS) models have recently emerged as a promising solution for novel view synthesis, enabling one-pass inference without the need for per-scene 3DGS optim...
- SpatialLM: Training Large Language Models for Structured Indoor Modeling : Abstract: SpatialLM is a large language model designed to process 3D point cloud data and generate structured 3D scene understanding outputs. These outputs include architectural elements like walls, d...
- MagCache: Fast Video Generation with Magnitude-Aware Cache : Abstract: Existing acceleration techniques for video diffusion models often rely on uniform heuristics or time-embedding variants to skip timesteps and reuse cached features. These approaches typicall...
- Human Perception-Inspired Grain Segmentation Refinement Using Conditional Random Fields : Abstract: Automated detection of grain boundaries (GBs) in electron microscope images of polycrystalline materials could help accelerate the nanoscale characterization of myriad engineering materials ...
- MAROON: A Framework for the Joint Characterization of Near-Field High-Resolution Radar and Optical Depth Imaging Techniques : Abstract: Utilizing the complementary strengths of wavelength-specific range or depth sensors is crucial for robust computer-assisted tasks such as autonomous driving. Despite this, there is still lit...
- ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing : Abstract: While end-to-end video-to-audio generation has greatly improved, producing high-fidelity audio that authentically captures the nuances of visual content remains challenging. Like professiona...
- Accelerating Physical Property Reasoning for Augmented Visual Cognition : Abstract: This paper introduces a system that accelerates vision-guided physical property reasoning to enable augmented visual cognition. The system minimizes the run-time latency of this reas...
- Finetuning-Free Personalization of Text to Image Generation via Hypernetworks : Abstract: Personalizing text-to-image diffusion models has traditionally relied on subject-specific fine-tuning approaches such as DreamBooth (Ruiz et al., 2023), which are computationally expen...
- Subsampled Randomized Fourier GaLore for Adapting Foundation Models in Depth-Driven Liver Landmark Segmentation : Abstract: Accurate detection and delineation of anatomical structures in medical imaging are critical for computer-assisted interventions, particularly in laparoscopic liver surgery where 2D video str...
- SurgAnt-ViVQA: Learning to Anticipate Surgical Events through GRU-Driven Temporal Cross-Attention : Abstract: Anticipating forthcoming surgical events is vital for real-time assistance in endonasal transsphenoidal pituitary surgery, where visibility is limited and workflow changes rapidly. Most visu...
- PETWB-REP: A Multi-Cancer Whole-Body FDG PET/CT and Radiology Report Dataset for Medical Imaging Research : Abstract: Publicly available, large-scale medical imaging datasets are crucial for developing and validating artificial intelligence models and conducting retrospective clinical research. However, dat...
- MvBody: Multi-View-Based Hybrid Transformer Using Optical 3D Body Scan for Explainable Cesarean Section Prediction : Abstract: Accurately assessing the risk of cesarean section (CS) delivery is critical, especially in settings with limited medical resources, where access to healthcare is often restricted. Early and ...
- Diffusion-Guided Mask-Consistent Paired Mixing for Endoscopic Image Segmentation : Abstract: Augmentation for dense prediction typically relies on either sample mixing or generative synthesis. Mixing improves robustness but misaligned masks yield soft label ambiguity. Diffusion synt...
- Transformer-Progressive Mamba Network for Lightweight Image Super-Resolution : Abstract: Recently, Mamba-based super-resolution (SR) methods have demonstrated the ability to capture global receptive fields with linear complexity, addressing the quadratic computational cost of Tr...
- Decoupled Multi-Predictor Optimization for Inference-Efficient Model Tuning : Abstract: Recently, remarkable progress has been made in large-scale pre-trained model tuning, and inference efficiency is becoming more crucial for practical deployment. Early exiting in conjunction ...
- Enhancing Medical Image Segmentation via Heat Conduction Equation : Abstract: Medical image segmentation has been significantly advanced by deep learning architectures, notably U-Net variants. However, existing models struggle to achieve efficient global context model...
- IEC3D-AD: A 3D Dataset of Industrial Equipment Components for Unsupervised Point Cloud Anomaly Detection : Abstract: 3D anomaly detection (3D-AD) plays a critical role in industrial manufacturing, particularly in ensuring the reliability and safety of core equipment components. Although existing 3D dataset...
- Unified Long Video Inpainting and Outpainting via Overlapping High-Order Co-Denoising : Abstract: Generating long videos remains a fundamental challenge, and achieving high controllability in video inpainting and outpainting is particularly demanding. To address both of these challenges ...
- Diffusion-SDPO: Safeguarded Direct Preference Optimization for Diffusion Models : Abstract: Text-to-image diffusion models deliver high-quality images, yet aligning them with human preferences remains challenging. We revisit diffusion-based Direct Preference Optimization (DPO) for ...
- SurgViVQA: Temporally-Grounded Video Question Answering for Surgical Scene Understanding : Abstract: Video Question Answering (VideoQA) in the surgical domain aims to enhance intraoperative understanding by enabling AI models to reason over temporally coherent events rather than isolated fr...
- Multi-Object Tracking Retrieval with LLaVA-Video: A Training-Free Solution to MOT25-StAG Challenge : Abstract: In this report, we present our solution to the MOT25-Spatiotemporal Action Grounding (MOT25-StAG) Challenge. The aim of this challenge is to accurately localize and track multiple objects th...
- UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions : Abstract: Due to the lack of effective cross-modal modeling, existing open-source audio-video generation methods often exhibit compromised lip synchronization and insufficient semantic consistency. To...
- Robust Alignment of the Human Embryo in 3D Ultrasound using PCA and an Ensemble of Heuristic, Atlas-based and Learning-based Classifiers Evaluated on the Rotterdam Periconceptional Cohort : Abstract: Standardized alignment of the embryo in three-dimensional (3D) ultrasound images aids prenatal growth monitoring by facilitating standard plane detection, improving visualization of landmark...
- Generalizing Shape-from-Template to Topological Changes : Abstract: Reconstructing the surfaces of deformable objects from correspondences between a 3D template and a 2D image is well studied under Shape-from-Template (SfT) methods; however, existing approac...
- Human Mesh Modeling for Anny Body : Abstract: Parametric body models are central to many human-centric tasks, yet existing models often rely on costly 3D scans and learned shape spaces that are proprietary and demographically narrow. We...
- Data-Efficient Adaptation and a Novel Evaluation Method for Aspect-based Sentiment Analysis : Abstract: Aspect-based Sentiment Analysis (ABSA) is a fine-grained opinion mining approach that identifies and classifies opinions associated with specific entities (aspects) or their categories withi...
- Precise asymptotic analysis of Sobolev training for random feature models : Abstract: Gradient information is widely useful and available in applications, and is therefore natural to include in the training of neural networks. Yet little is known theoretically about the impac...
- Min-Max Optimization Is Strictly Easier Than Variational Inequalities : Abstract: Classically, a mainstream approach for solving a convex-concave min-max problem is to instead solve the variational inequality problem arising from its first-order optimality conditions. Is ...
- From Propagation to Prediction: Point-level Uncertainty Evaluation of MLS Point Clouds under Limited Ground Truth : Abstract: Evaluating uncertainty is critical for reliable use of Mobile Laser Scanning (MLS) point clouds in many high-precision applications such as Scan-to-BIM, deformation analysis, and 3D modeling...
- PolyNorm: Few-Shot LLM-Based Text Normalization for Text-to-Speech : Abstract: Text Normalization (TN) is a key preprocessing step in Text-to-Speech (TTS) systems, converting written forms into their canonical spoken equivalents. Traditional TN systems can exhibit high...
- Quantifying Articulatory Coordination as a Biomarker for Schizophrenia : Abstract: Advances in artificial intelligence (AI) and deep learning have improved diagnostic capabilities in healthcare, yet limited interpretability continues to hinder clinical adoption. Schizophre...
- Provable Accelerated Bayesian Optimization with Knowledge Transfer : Abstract: We study how Bayesian optimization (BO) can be accelerated on a target task with historical knowledge transferred from related source tasks. Existing works on BO with knowledge transfer eith...
- Scheduling the Off-Diagonal Weingarten Loss of Neural SDFs for CAD Models : Abstract: Neural signed distance functions (SDFs) have become a powerful representation for geometric reconstruction from point clouds, yet they often require both gradient- and curvature-based regula...
- Modeling Headway in Heterogeneous and Mixed Traffic Flow: A Statistical Distribution Based on a General Exponential Function : Abstract: The ability of existing headway distributions to accurately reflect the diverse behaviors and characteristics in heterogeneous traffic (different types of vehicles) and mixed traffic (human-...
- Learning-based Cooperative Robotic Paper Wrapping: A Unified Control Policy with Residual Force Control : Abstract: Human-robot cooperation is essential in environments such as warehouses and retail stores, where workers frequently handle deformable objects like paper, bags, and fabrics. Coordinating robo...
- Understanding Robustness of Model Editing in Code LLMs: An Empirical Study : Abstract: Large language models (LLMs) are increasingly used in software development. However, while LLMs remain static after pretraining, programming languages and APIs continue to evolve, leading to...
- Statistical Properties of Rectified Flow : Abstract: Rectified flow (Liu et al., 2022; Liu, 2022; Wu et al., 2023) is a method for defining a transport map between two distributions, and enjoys popularity in machine learning, although theoreti...
- Provable Separations between Memorization and Generalization in Diffusion Models : Abstract: Diffusion models have achieved remarkable success across diverse domains, but they remain vulnerable to memorization -- reproducing training data rather than generating novel outputs. This n...
- RKUM: An R Package for Robust Kernel Unsupervised Methods : Abstract: RKUM is an R package developed for implementing robust kernel-based unsupervised methods. It provides functions for estimating the robust kernel covariance operator (CO) and the robust kerne...
- Topography, climate, land cover, and biodiversity: Explaining endemic richness and management implications on a Mediterranean island : Abstract: Island endemism is shaped by complex interactions among environmental, ecological, and evolutionary factors, yet the relative contributions of topography, climate, and land cover remain inco...
- Death by a Thousand Prompts: Open Model Vulnerability Analysis : Abstract: Open-weight models provide researchers and developers with accessible foundations for diverse downstream applications. We tested the safety and security postures of eight open-weight large l...
- Influence of Data Dimensionality Reduction Methods on the Effectiveness of Quantum Machine Learning Models : Abstract: Data dimensionality reduction techniques are often utilized in the implementation of Quantum Machine Learning models to address two significant issues: the constraints of NISQ quantum device...
- SyMuPe: Affective and Controllable Symbolic Music Performance : Abstract: Emotions are fundamental to the creation and perception of music performances. However, achieving human-like expression and emotion through machine learning models for performance rendering ...
- A Support-Set Algorithm for Optimization Problems with Nonnegative and Orthogonal Constraints : Abstract: In this paper, we investigate optimization problems with nonnegative and orthogonal constraints, where any feasible matrix of size $n \times p$ exhibits a sparsity pattern such that each row...
- System Identification of a Moored ASV with Recessed Moon Pool via Deterministic and Bayesian Hankel-DMDc : Abstract: This study addresses the system identification of a small autonomous surface vehicle (ASV) under moored conditions using Hankel dynamic mode decomposition with control (HDMDc) and its Bayesi...
- BanglaSTEM: A Parallel Corpus for Technical Domain Bangla-English Translation : Abstract: Large language models work well for technical problem solving in English but perform poorly when the same questions are asked in Bangla. A simple solution would be to translate Bangla questi...
- The Structure of Cross-Validation Error: Stability, Covariance, and Minimax Limits : Abstract: Despite ongoing theoretical research on cross-validation (CV), many theoretical questions about CV remain widely open. This motivates our investigation into how properties of algorithm-distr...
- Vector-valued self-normalized concentration inequalities beyond sub-Gaussianity : Abstract: The study of self-normalized processes plays a crucial role in a wide range of applications, from sequential decision-making to econometrics. While the behavior of self-normalized concentrat...
- CLAX: Fast and Flexible Neural Click Models in JAX : Abstract: CLAX is a JAX-based library that implements classic click models using modern gradient-based optimization. While neural click models have emerged over the past decade, complex click models b...
- Neural Beamforming with Doppler-Aware Sparse Attention for High Mobility Environments : Abstract: Beamforming has significance for enhancing spectral efficiency and mitigating interference in multi-antenna wireless systems, facilitating spatial multiplexing and diversity in dense and hig...
- Towards Transparent Stance Detection: A Zero-Shot Approach Using Implicit and Explicit Interpretability : Abstract: Zero-Shot Stance Detection (ZSSD) identifies the attitude of the post toward unseen targets. Existing research using contrastive, meta-learning, or data augmentation suffers from generalizab...
- Quantifying Weighted Morphological Content of Large-Scale Structures via Simulation-Based Inference : Abstract: In this work, we perform a simulation-based forecasting analysis to compare the constraining power of two higher-order summary statistics of the large-scale structure (LSS), the Minkowski Fu...
- Efficient Testing Implies Structured Symmetry : Abstract: Given a small random sample of $n$-bit strings labeled by an unknown Boolean function, which properties of this function can be tested computationally efficiently? We show an equivalence bet...
- Colorectal Cancer Histopathological Grading using Multi-Scale Federated Learning : Abstract: Colorectal cancer (CRC) grading is a critical prognostic factor but remains hampered by inter-observer variability and the privacy constraints of multi-institutional data sharing. While deep...
- The Adaptivity Barrier in Batched Nonparametric Bandits: Sharp Characterization of the Price of Unknown Margin : Abstract: We study batched nonparametric contextual bandits under a margin condition when the margin parameter $\alpha$ is unknown. To capture the statistical price of this ignorance, we introduce the...
- Trustworthy Representation Learning via Information Funnels and Bottlenecks : Abstract: Ensuring trustworthiness in machine learning -- by balancing utility, fairness, and privacy -- remains a critical challenge, particularly in representation learning. In this work, we investi...
- How does training shape the Riemannian geometry of neural network representations? : Abstract: In machine learning, there is a long history of trying to build neural networks that can learn from fewer example data by baking in strong geometric priors. However, it is not always clear a...
- AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization : Abstract: Implicit Q-learning (IQL) serves as a strong baseline for offline RL, which learns the value function using only dataset actions through quantile regression. However, it is unclear how to re...
- Data Quality Monitoring for the Hadron Calorimeters Using Transfer Learning for Anomaly Detection : Abstract: The proliferation of sensors brings an immense volume of spatio-temporal (ST) data in many domains, including monitoring, diagnostics, and prognostics applications. Data curation is a time-c...
- Dynamical loss functions shape landscape topography and improve learning in artificial neural networks : Abstract: Dynamical loss functions are derived from standard loss functions used in supervised classification tasks, but are modified so that the contribution from each class periodically increases an...
- Learning Expressive Random Feature Models via Parametrized Activations : Abstract: The random feature (RF) method is a powerful kernel approximation technique, but is typically equipped with fixed activation functions, limiting its adaptability across diverse tasks. To overcom...
- HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs : Abstract: Quantized training of Large Language Models (LLMs) remains an open challenge, as maintaining accuracy while performing all matrix multiplications in low precision has proven difficult. This ...
- REINFORCE-ING Chemical Language Models for Drug Discovery : Abstract: Chemical language models, combined with reinforcement learning (RL), have shown significant promise to efficiently traverse large chemical spaces for drug discovery. However, the performance...
- Sundial: A Family of Highly Capable Time Series Foundation Models : Abstract: We introduce Sundial, a family of native, flexible, and scalable time series foundation models. To predict the next-patch's distribution, we propose a TimeFlow Loss based on flow-matching, w...
- Stable Port-Hamiltonian Neural Networks : Abstract: In recent years, nonlinear dynamic system identification using artificial neural networks has garnered attention due to its broad potential applications across science and engineering. Howev...
- Decision-aware training of spatiotemporal forecasting models to select a top K subset of sites for intervention : Abstract: Optimal allocation of scarce resources is a common problem for decision makers faced with choosing a limited number of locations for intervention. Spatiotemporal prediction models could make...
- UniFault: A Fault Diagnosis Foundation Model from Bearing Data : Abstract: Machine fault diagnosis (FD) is a critical task for predictive maintenance, enabling early fault detection and preventing unexpected failures. Despite its importance, existing FD models are ...
- Reliable and efficient inverse analysis using physics-informed neural networks with normalized distance functions and adaptive weight tuning : Abstract: Physics-informed neural networks have attracted significant attention in scientific machine learning for their capability to solve forward and inverse problems governed by partial differenti...
- NeuralSurv: Deep Survival Analysis with Bayesian Uncertainty Quantification : Abstract: We introduce NeuralSurv, the first deep survival model to incorporate Bayesian uncertainty quantification. Our non-parametric, architecture-agnostic framework captures time-varying covariate...
- On scalable and efficient training of diffusion samplers : Abstract: We address the challenge of training diffusion models to sample from unnormalized energy distributions in the absence of data, the so-called diffusion samplers. Although these approaches hav...
- A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation : Abstract: We study the dynamics of gradient flow with small weight decay on general training losses $F: \mathbb{R}^d \to \mathbb{R}$. Under mild regularity assumptions and assuming convergence of the ...
- Robust and Computation-Aware Gaussian Processes : Abstract: Gaussian processes (GPs) are widely used for regression and optimization tasks such as Bayesian optimization (BO) due to their expressiveness and principled uncertainty estimates. However, i...
- Why Machine Learning Models Fail to Fully Capture Epistemic Uncertainty : Abstract: In recent years various supervised learning methods that disentangle aleatoric and epistemic uncertainty based on second-order distributions have been proposed. We argue that these methods f...
- DiCoFlex: Model-agnostic diverse counterfactuals with flexible control : Abstract: Counterfactual explanations play a pivotal role in explainable artificial intelligence (XAI) by offering intuitive, human-understandable alternatives that elucidate machine learning model de...
- Model-Informed Flows for Bayesian Inference : Abstract: Variational inference often struggles with the posterior geometry exhibited by complex hierarchical Bayesian models. Recent advances in flow-based variational families and Variationally Infe...
- Inference-Time Reward Hacking in Large Language Models : Abstract: A common paradigm to improve the performance of large language models is optimizing for a reward model. Reward models assign a numerical score to an LLM's output that indicates, for example,...
- Compliance Minimization via Physics-Informed Gaussian Processes : Abstract: Machine learning (ML) techniques have recently gained significant attention for solving compliance minimization (CM) problems. However, these methods typically provide poor feature boundarie...
- Composing Linear Layers from Irreducibles : Abstract: Contemporary large models often exhibit behaviors suggesting the presence of low-level primitives that compose into modules with richer functionality, but these fundamental building blocks r...
- OrdShap: Feature Position Importance for Sequential Black-Box Models : Abstract: Sequential deep learning models excel in domains with temporal or sequential dependencies, but their complexity necessitates post-hoc feature attribution methods for understanding their pred...
- Variable Selection in Maximum Mean Discrepancy for Interpretable Distribution Comparison : Abstract: We study two-sample variable selection: identifying variables that discriminate between the distributions of two sets of data vectors. Such variables help scientists understand the mechanism...
- Contraction of Private Quantum Channels and Private Quantum Hypothesis Testing : Abstract: A quantum generalized divergence by definition satisfies the data-processing inequality; as such, the relative decrease in such a divergence under the action of a quantum channel is at most ...
- Disentanglement with Factor Quantized Variational Autoencoders : Abstract: Disentangled representation learning aims to represent the underlying generative factors of a dataset in a latent representation independently of one another. In our work, we propose a discr...
- Alleviating Hyperparameter-Tuning Burden in SVM Classifiers for Pulmonary Nodules Diagnosis with Multi-Task Bayesian Optimization : Abstract: In the field of non-invasive medical imaging, radiomic features are utilized to measure tumor characteristics. However, these features can be affected by the techniques used to discretize th...
- Aspen Open Jets: Unlocking LHC Data for Foundation Models in Particle Physics : Abstract: Foundation models are deep learning models pre-trained on large amounts of data which are capable of generalizing to multiple datasets and/or downstream tasks. This work demonstrates how dat...
- Online Learning of Pure States is as Hard as Mixed States : Abstract: Quantum state tomography, the task of learning an unknown quantum state, is a fundamental problem in quantum information. In standard settings, the complexity of this problem depends signifi...
- Data-Driven Probabilistic Air-Sea Flux Parameterization : Abstract: Accurately quantifying air-sea fluxes is important for understanding air-sea interactions and improving coupled weather and climate systems. This study introduces a probabilistic framework t...
- Depth Matters: Multimodal RGB-D Perception for Robust Autonomous Agents : Abstract: Autonomous agents that rely purely on perception to make real-time control decisions require efficient and robust architectures. In this work, we demonstrate that augmenting RGB input with d...
- A Polynomial-Time Algorithm for Variational Inequalities under the Minty Condition : Abstract: Solving variational inequalities (SVIs) is a foundational problem at the heart of optimization. However, this expressivity comes at the cost of computational hardness. As a result, most rese...
- Tight Regret Bounds for Fixed-Price Bilateral Trade : Abstract: We examine fixed-price mechanisms in bilateral trade through the lens of regret minimization. Our main results are twofold. (i) For independent values, a near-optimal $\widetilde{\Theta}(T^{...
- VQC-MLPNet: An Unconventional Hybrid Quantum-Classical Architecture for Scalable and Robust Quantum Machine Learning : Abstract: Variational quantum circuits (VQCs) hold promise for quantum machine learning but face challenges in expressivity, trainability, and noise resilience. We propose VQC-MLPNet, a hybrid archite...
- Recurrent neural network-based robust control systems with closed-loop regional incremental ISS and application to MPC design : Abstract: This paper investigates the design of output-feedback schemes for systems described by a class of recurrent neural networks. We propose a procedure based on linear matrix inequalities for de...
- Cache Mechanism for Agent RAG Systems : Abstract: Recent advances in Large Language Model (LLM)-based agents have been propelled by Retrieval-Augmented Generation (RAG), which grants the models access to vast external knowledge bases. Despi...
- LEGO-Eval: Towards Fine-Grained Evaluation on Synthesizing 3D Embodied Environments with Tool Augmentation : Abstract: Despite recent progress in using Large Language Models (LLMs) for automatically generating 3D scenes, generated scenes often lack realistic spatial layouts and object attributes found in rea...
- Targeted Error Correction in Knowledge Distillation: Small Language Models Surpass GPT : Abstract: We introduce an Analyze-Revise-Finetune (ARF) pipeline that enables smaller open-source language models (LLMs) to surpass substantially larger proprietary models in customer service summariz...
- ROBoto2: An Interactive System and Dataset for LLM-assisted Clinical Trial Risk of Bias Assessment : Abstract: We present ROBOTO2, an open-source, web-based platform for large language model (LLM)-assisted risk of bias (ROB) assessment of clinical trials. ROBOTO2 streamlines the traditionally labor-i...
- A Computational Approach to Analyzing Disrupted Language in Schizophrenia: Integrating Surprisal and Coherence Measures : Abstract: Language disruptions are one of the well-known effects of schizophrenia symptoms. They are often manifested as disorganized speech and impaired discourse coherence. These abnormalities in sp...
- MME-CC: A Challenging Multi-Modal Evaluation Benchmark of Cognitive Capacity : Abstract: As reasoning models scale rapidly, the essential role of multimodality in human cognition has come into sharp relief, driving a growing need to probe vision-centric cognitive behaviors. Yet,...
- Measuring Aleatoric and Epistemic Uncertainty in LLMs: Empirical Evaluation on ID and OOD QA Tasks : Abstract: Large Language Models (LLMs) have become increasingly pervasive, finding applications across many industries and disciplines. Ensuring the trustworthiness of LLM outputs is paramount, where ...
- BengaliMoralBench: A Benchmark for Auditing Moral Reasoning in Large Language Models within Bengali Language and Culture : Abstract: As multilingual Large Language Models (LLMs) gain traction across South Asia, their alignment with local ethical norms, particularly for Bengali, which is spoken by over 285 million people a...
- Beyond Ranked Lists: The SARAL Framework for Cross-Lingual Document Set Retrieval : Abstract: Machine Translation for English Retrieval of Information in Any Language (MATERIAL) is an IARPA initiative targeted to advance the state of cross-lingual information retrieval (CLIR). This r...
- IndicSuperTokenizer: An Optimized Tokenizer for Indic Multilingual LLMs : Abstract: Tokenizers play a crucial role in determining the performance, training efficiency, and the inference cost of Large Language Models (LLMs). Designing effective tokenizers for multilingual LL...
- SCALE: Upscaled Continual Learning of Large Language Models : Abstract: We revisit continual pre-training for large language models and argue that progress now depends more on scaling the right structure than on scaling parameters alone. We introduce SCALE, a wi...
- Silenced Biases: The Dark Side LLMs Learned to Refuse : Abstract: Safety-aligned large language models (LLMs) are becoming increasingly widespread, especially in sensitive applications where fairness is essential and biased outputs can cause significant ha...
- EQ-Negotiator: Dynamic Emotional Personas Empower Small Language Models for Edge-Deployable Credit Negotiation : Abstract: The deployment of large language models (LLMs) in automated negotiation has set a high performance benchmark, but their computational cost and data privacy requirements render them unsuitabl...
- LFC-DA: Logical Formula-Controlled Data Augmentation for Enhanced Logical Reasoning : Abstract: For complex logical data augmentation, heavy reliance on human annotation is costly, whereas direct generation with large language models yields uninterpretable and logically homogeneous exa...
- Segmentation Beyond Defaults: Asymmetrical Byte Pair Encoding for Optimal Machine Translation Performance : Abstract: Existing Machine Translation (MT) research often suggests a single, fixed set of hyperparameters for word segmentation models, symmetric Byte Pair Encoding (BPE), which applies the same numb...
- Overcoming the Generalization Limits of SLM Finetuning for Shape-Based Extraction of Datatype and Object Properties : Abstract: Small language models (SLMs) have shown promise for relation extraction (RE) when extracting RDF triples guided by SHACL shapes focused on common datatype properties. This paper investigate...
- Efficient Reasoning via Thought-Training and Thought-Free Inference : Abstract: Recent advances in large language models (LLMs) have leveraged explicit Chain-of-Thought (CoT) prompting to improve reasoning accuracy. However, most existing methods primarily compress verb...
- Knowledge-Augmented Question Error Correction for Chinese Question Answer System with QuestionRAG : Abstract: Input errors in question-answering (QA) systems often lead to incorrect responses. Large language models (LLMs) struggle with this task, frequently failing to interpret user intent (misinter...
- Kastor: Fine-tuned Small Language Models for Shape-based Active Relation Extraction : Abstract: RDF pattern-based extraction is a compelling approach for fine-tuning small language models (SLMs) by focusing a relation extraction task on a specified SHACL shape. This technique enables t...
- HaluMem: Evaluating Hallucinations in Memory Systems of Agents : Abstract: Memory systems are key components that enable AI systems such as LLMs and AI agents to achieve long-term learning and sustained interaction. However, during memory storage and retrieval, the...
- One Battle After Another: Probing LLMs' Limits on Multi-Turn Instruction Following with a Benchmark Evolving Framework : Abstract: Understanding how well large language models can follow users' instructions throughout a dialogue spanning multiple topics is of great importance for data-intensive conversational applicatio...
- Bearing Syntactic Fruit with Stack-Augmented Neural Networks : Abstract: Any finite set of training data is consistent with an infinite number of hypothetical algorithms that could have generated it. Studies have shown that when human children learn language, the...
- ASVRI-Legal: Fine-Tuning LLMs with Retrieval Augmented Generation for Enhanced Legal Regulation : Abstract: In this study, we explore the fine-tuning of Large Language Models (LLMs) to better support policymakers in their crucial work of understanding, analyzing, and crafting legal regulations. To...
- A systematic review of relation extraction task since the emergence of Transformers : Abstract: This article presents a systematic review of relation extraction (RE) research since the advent of Transformer-based models. Using an automated framework to collect and annotate publications...
- Do Androids Dream of Unseen Puppeteers? Probing for a Conspiracy Mindset in Large Language Models : Abstract: In this paper, we investigate whether Large Language Models (LLMs) exhibit conspiratorial tendencies, whether they display sociodemographic biases in this domain, and how easily they can be ...
- Let the Bees Find the Weak Spots: A Path Planning Perspective on Multi-Turn Jailbreak Attacks against LLMs : Abstract: Large Language Models (LLMs) have been widely deployed across various applications, yet their potential security and ethical risks have raised increasing concerns. Existing research employs ...
- Beyond Citations: Measuring Idea-level Knowledge Diffusion from Research to Journalism and Policy-making : Abstract: Despite the importance of social science knowledge for various stakeholders, measuring its diffusion into different domains remains a challenge. This study uses a novel text-based approach t...
- Retrieval-Augmented Feature Generation for Domain-Specific Classification : Abstract: Feature generation can significantly enhance learning outcomes, particularly for tasks with limited data. An effective way to improve feature generation is to expand the current feature spac...
- Verdict: A Library for Scaling Judge-Time Compute : Abstract: The use of LLMs as automated judges ("LLM-as-a-judge") is now widespread, yet standard judges suffer from a multitude of reliability issues. To address these challenges, we introduce Verdict...
- Does Synthetic Data Help Named Entity Recognition for Low-Resource Languages? : Abstract: Named Entity Recognition (NER) for low-resource languages aims to produce robust systems for languages where there is limited labeled training data available, and has been an area of increasi...
- The Case for Repeatable, Open, and Expert-Grounded Hallucination Benchmarks in Large Language Models : Abstract: Plausible, but inaccurate, tokens in model-generated text are widely believed to be pervasive and problematic for the responsible adoption of language models. Despite this concern, there is ...
- Read Your Own Mind: Reasoning Helps Surface Self-Confidence Signals in LLMs : Abstract: We study the source of uncertainty in DeepSeek R1-32B by analyzing its self-reported verbal confidence on question answering (QA) tasks. In the default answer-then-confidence setting, the mo...
- LexTime: A Benchmark for Temporal Ordering of Legal Events : Abstract: Understanding temporal relationships and accurately reconstructing the event timeline is important for case law analysis, compliance monitoring, and legal summarization. However, existing be...
- Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models : Abstract: Large language models (LLMs) have transformed natural language processing, but their reliable deployment requires effective uncertainty quantification (UQ). Existing UQ methods are often heu...
- Scalable Medication Extraction and Discontinuation Identification from Electronic Health Records Using Large Language Models : Abstract: Identifying medication discontinuations in electronic health records (EHRs) is vital for patient safety but is often hindered by information being buried in unstructured notes. This study ai...
- Post Persona Alignment for Multi-Session Dialogue Generation : Abstract: Multi-session persona-based dialogue generation presents challenges in maintaining long-term consistency and generating diverse, personalized responses. While large language models (LLMs) ex...
- Token Perturbation Guidance for Diffusion Models : Abstract: Classifier-free guidance (CFG) has become an essential component of modern diffusion models to enhance both generation quality and alignment with input conditions. However, CFG requires spec...
- Cropland Mapping using Geospatial Embeddings : Abstract: Accurate and up-to-date land cover maps are essential for understanding land use change, a key driver of climate change. Geospatial embeddings offer a more efficient and accessible way to ma...
- ProM3E: Probabilistic Masked MultiModal Embedding Model for Ecology : Abstract: We introduce ProM3E, a probabilistic masked multimodal embedding model for any-to-any generation of multimodal representations for ecology. ProM3E is based on masked modality reconstruction ...
- SCALE-VLP: Soft-Weighted Contrastive Volumetric Vision-Language Pre-training with Spatial-Knowledge Semantics : Abstract: Vision-language models (VLMs) have demonstrated strong cross-modal capabilities, yet most work remains limited to 2D data and assumes binary supervision (i.e., positive vs. negative pairs), ...
- Learning with less: label-efficient land cover classification at very high spatial resolution using self-supervised deep learning : Abstract: Deep learning semantic segmentation methods have shown promising performance for very high 1-m resolution land cover classification, but the challenge of collecting large volumes of represen...
- A Foundation Model for Brain MRI with Dynamic Modality Integration : Abstract: We present a foundation model for brain MRI that can work with different combinations of imaging sequences. The model uses one encoder with learnable modality embeddings, conditional layer n...
- A Plug-and-Play Framework for Volumetric Light-Sheet Image Reconstruction : Abstract: Cardiac contraction is a rapid, coordinated process that unfolds across three-dimensional tissue on millisecond timescales. Traditional optical imaging is often inadequate for capturing dyna...
- ISC-Perception: A Hybrid Computer Vision Dataset for Object Detection in Novel Steel Assembly : Abstract: The Intermeshed Steel Connection (ISC) system, when paired with robotic manipulators, can accelerate steel-frame assembly and improve worker safety by eliminating manual assembly. Dependable...
- DentalSplat: Dental Occlusion Novel View Synthesis from Sparse Intra-Oral Photographs : Abstract: In orthodontic treatment, particularly within telemedicine contexts, observing patients' dental occlusion from multiple viewpoints facilitates timely clinical decision-making. Recent advance...
- Online Learning to Rank under Corruption: A Robust Cascading Bandits Approach : Abstract: Online learning to rank (OLTR) studies how to recommend a short ranked list of items from a large pool and improves future rankings based on user clicks. This setting is commonly modeled as ...
- An Efficient Classification Model for Cyber Text : Abstract: The uprising of deep learning methodology and practice in recent years has brought about a severe consequence of increasing carbon footprint due to the insatiable demand for computational re...
- Towards Scalable Backpropagation-Free Gradient Estimation : Abstract: While backpropagation--reverse-mode automatic differentiation--has been extraordinarily successful in deep learning, it requires two passes (forward and backward) through the neural network ...
- From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation : Abstract: LLMs can provide substantial zero-shot performance on diverse tasks using a simple task prompt, eliminating the need for training or fine-tuning. However, when applying these models to sensi...
- Test Time Adaptation Using Adaptive Quantile Recalibration : Abstract: Domain adaptation is a key strategy for enhancing the generalizability of deep learning models in real-world scenarios, where test distributions often diverge significantly from the training...
- UnCLe: Towards Scalable Dynamic Causal Discovery in Non-linear Temporal Systems : Abstract: Uncovering cause-effect relationships from observational time series is fundamental to understanding complex systems. While many methods infer static causal graphs, real-world systems often ...
- Periodic Skill Discovery : Abstract: Unsupervised skill discovery in reinforcement learning (RL) aims to learn diverse behaviors without relying on external rewards. However, current methods often overlook the periodic nature o...
- Cross-Modal Alignment via Variational Copula Modelling : Abstract: Various data modalities are common in real-world applications (e.g., electronic health records, medical images and clinical notes in healthcare). It is essential to develop multimodal learni...
- A Probabilistic U-Net Approach to Downscaling Climate Simulations : Abstract: Climate models are limited by heavy computational costs, often producing outputs at coarse spatial resolutions, while many climate change impact studies require finer scales. Statistical dow...
- Incorporating Quality of Life in Climate Adaptation Planning via Reinforcement Learning : Abstract: Urban flooding is expected to increase in frequency and severity as a consequence of climate change, causing wide-ranging impacts that include a decrease in urban Quality of Life (QoL). Mean...
- A Feedback-Control Framework for Efficient Dataset Collection from In-Vehicle Data Streams : Abstract: Modern AI systems are increasingly constrained not by model capacity but by the quality and diversity of their data. Despite growing emphasis on data-centric AI, most datasets are still gath...
- A unified physics-informed generative operator framework for general inverse problems : Abstract: Solving inverse problems governed by partial differential equations (PDEs) is central to science and engineering, yet remains challenging when measurements are sparse, noisy, or when the und...
- Climate Adaptation with Reinforcement Learning: Economic vs. Quality of Life Adaptation Pathways : Abstract: Climate change will cause an increase in the frequency and severity of flood events, prompting the need for cohesive adaptation policymaking. Designing effective adaptation policies, however...
- Decoupled Entropy Minimization : Abstract: Entropy Minimization (EM) is beneficial to reducing class overlap, bridging domain gap, and restricting uncertainty for various tasks in machine learning, yet its potential is limited. To st...
- Diffusion Language Models are Super Data Learners : Abstract: Under strictly controlled pre-training settings, we observe a Crossover: when unique data is limited, diffusion language models (DLMs) consistently surpass autoregressive (AR) models by trai...
- Multi-Objective Adaptive Rate Limiting in Microservices Using Deep Reinforcement Learning : Abstract: As cloud computing and microservice architectures become increasingly prevalent, API rate limiting has emerged as a critical mechanism for ensuring system stability and service quality. Trad...
- A Probabilistic Approach to Pose Synchronization for Multi-Reference Alignment with Applications to MIMO Wireless Communication Systems : Abstract: From molecular imaging to wireless communications, the ability to align and reconstruct signals from multiple misaligned observations is crucial for system performance. We study the problem ...
- Graph Neural AI with Temporal Dynamics for Comprehensive Anomaly Detection in Microservices : Abstract: This study addresses the problem of anomaly detection and root cause tracing in microservice architectures and proposes a unified framework that combines graph neural networks with temporal ...
- SORTeD Rashomon Sets of Sparse Decision Trees: Anytime Enumeration : Abstract: Sparse decision tree learning provides accurate and interpretable predictive models that are ideal for high-stakes applications by finding the single most accurate tree within a (soft) size ...
- A Modular, Data-Free Pipeline for Multi-Label Intention Recognition in Transportation Agentic AI Applications : Abstract: In this study, a modular, data-free pipeline for multi-label intention recognition is proposed for agentic AI applications in transportation. Unlike traditional intent recognition systems th...
- TripleWin: Fixed-Point Equilibrium Pricing for Data-Model Coupled Markets : Abstract: The rise of the machine learning (ML) model economy has intertwined markets for training datasets and pre-trained models. However, most pricing approaches still separate data and model trans...
- POEMS: Product of Experts for Interpretable Multi-omic Integration using Sparse Decoding : Abstract: Integrating different molecular layers, i.e., multiomics data, is crucial for unraveling the complexity of diseases; yet, most deep generative models either prioritize predictive performance...
- Reinforcement Learning Using known Invariances : Abstract: In many real-world reinforcement learning (RL) problems, the environment exhibits inherent symmetries that can be exploited to improve learning efficiency. This paper develops a theoretical ...
- RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse : Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) with retrieved context but often suffers from downgraded prefill performance as modern applications demand longer a...
- NAP: Attention-Based Late Fusion for Automatic Sleep Staging : Abstract: Polysomnography signals are highly heterogeneous, varying in modality composition (e.g., EEG, EOG, ECG), channel availability (e.g., frontal, occipital EEG), and acquisition protocols across...
- Why Less is More (Sometimes): A Theory of Data Curation : Abstract: This paper introduces a theoretical framework to resolve a central paradox in modern machine learning: When is it better to use less data? This question has become critical as classical scal...
- Learning Without Critics? Revisiting GRPO in Classical Reinforcement Learning Environments : Abstract: Group Relative Policy Optimization (GRPO) has emerged as a scalable alternative to Proximal Policy Optimization (PPO) by eliminating the learned critic and instead estimating advantages thro...
- Byzantine-Robust Federated Learning with Learnable Aggregation Weights : Abstract: Federated Learning (FL) enables clients to collaboratively train a global model without sharing their private data. However, the presence of malicious (Byzantine) clients poses significant c...
- Flat Minima and Generalization: Insights from Stochastic Convex Optimization : Abstract: Understanding the generalization behavior of learning algorithms is a central goal of learning theory. A recently emerging explanation is that learning algorithms are successful in practice ...
- TabGemma: Text-Based Tabular ICL via LLM using Continued Pretraining and Retrieval : Abstract: We study LLMs for tabular prediction with mixed text, numeric, and categorical fields. We introduce TabGemma, a schema-agnostic in-context learner that treats rows as sequences and tackles t...
- Tensor-Efficient High-Dimensional Q-learning : Abstract: High-dimensional reinforcement learning faces challenges with complex calculations and low sample efficiency in large state-action spaces. Q-learning algorithms struggle particularly with th...
- Going Beyond Expert Performance via Deep Implicit Imitation Reinforcement Learning : Abstract: Imitation learning traditionally requires complete state-action demonstrations from optimal or near-optimal experts. These requirements severely limit practical applicability, as many real-w...
- Towards Formalizing Reinforcement Learning Theory : Abstract: In this paper, we formalize the almost sure convergence of $Q$-learning and linear temporal difference (TD) learning with Markovian samples using the Lean 4 theorem prover based on the Mathl...
- Financial Management System for SMEs: Real-World Deployment of Accounts Receivable and Cash Flow Prediction : Abstract: Small and Medium Enterprises (SMEs), particularly freelancers and early-stage businesses, face unique financial management challenges due to limited resources, small customer bases, and cons...
- nanoTabPFN: A Lightweight and Educational Reimplementation of TabPFN : Abstract: Tabular foundation models such as TabPFN have revolutionized predictive machine learning for tabular data. At the same time, the driving factors of this revolution are hard to understand. Ex...
- SHIELD: Securing Healthcare IoT with Efficient Machine Learning Techniques for Anomaly Detection : Abstract: The integration of IoT devices in healthcare introduces significant security and reliability challenges, increasing susceptibility to cyber threats and operational anomalies. This study prop...
- Behavior-Adaptive Q-Learning: A Unifying Framework for Offline-to-Online RL : Abstract: Offline reinforcement learning (RL) enables training from fixed data without online interaction, but policies learned offline often struggle when deployed in dynamic environments due to dist...
- Shrinking the Variance: Shrinkage Baselines for Reinforcement Learning with Verifiable Rewards : Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful paradigm for post-training large reasoning models (LRMs) using policy-gradient methods such as GRPO. To stabil...
- Supersimulators : Abstract: We prove that every randomized Boolean function admits a supersimulator: a randomized polynomial-size circuit whose output on random inputs cannot be efficiently distinguished from reality w...
- Association-sensory spatiotemporal hierarchy and functional gradient-regularised recurrent neural network with implications for schizophrenia : Abstract: The human neocortex is functionally organised at its highest level along a continuous sensory-to-association (AS) hierarchy. This study characterises the AS hierarchy of patients with schizo...
- ECGXtract: Deep Learning-based ECG Feature Extraction for Automated CVD Diagnosis : Abstract: This paper presents ECGXtract, a deep learning-based approach for interpretable ECG feature extraction, addressing the limitations of traditional signal processing and black-box machine lear...
- Automatic Machine Translation Detection Using a Surrogate Multilingual Translation Model : Abstract: Modern machine translation (MT) systems depend on large parallel corpora, often collected from the Internet. However, recent evidence indicates that (i) a substantial portion of these texts ...
- Scalable Single-Cell Gene Expression Generation with Latent Diffusion Models : Abstract: Computational modeling of single-cell gene expression is crucial for understanding cellular processes, but generating realistic expression profiles remains a major challenge. This difficulty...
- Hybrid Convolution and Vision Transformer NAS Search Space for TinyML Image Classification : Abstract: Hybrids of Convolutional Neural Network (CNN) and Vision Transformer (ViT) have outperformed pure CNN or ViT architecture. However, since these architectures require large parameters and inc...
- Unifying Information-Theoretic and Pair-Counting Clustering Similarity : Abstract: Comparing clusterings is central to evaluating unsupervised models, yet the many existing similarity measures can produce widely divergent, sometimes contradictory, evaluations. Clustering s...
- Exploratory Analysis of Cyberattack Patterns on E-Commerce Platforms Using Statistical Methods : Abstract: Cyberattacks on e-commerce platforms have grown in sophistication, threatening consumer trust and operational continuity. This research presents a hybrid analytical framework that integrates...
- The OpenHands Software Agent SDK: A Composable and Extensible Foundation for Production Agents : Abstract: Agents are now used widely in the process of software development, but building production-ready software engineering agents is a complex task. Deploying software agents effectively requires...
- AnaFlow: Agentic LLM-based Workflow for Reasoning-Driven Explainable and Sample-Efficient Analog Circuit Sizing : Abstract: Analog/mixed-signal circuits are key for interfacing electronics with the physical world. Their design, however, remains a largely handcrafted process, resulting in long and error-prone desi...
- Grounded Misunderstandings in Asymmetric Dialogue: A Perspectivist Annotation Scheme for MapTask : Abstract: Collaborative dialogue relies on participants incrementally establishing common ground, yet in asymmetric settings they may believe they agree while referring to different entities. We intro...
- Beyond Single Pass, Looping Through Time: KG-IRAG with Iterative Knowledge Retrieval : Abstract: Graph Retrieval-Augmented Generation (GraphRAG) has proven highly effective in enhancing the performance of Large Language Models (LLMs) on tasks that require external knowledge. By leveragi...
- Leveraging LLMs to Automate Energy-Aware Refactoring of Parallel Scientific Codes : Abstract: While large language models (LLMs) are increasingly used for generating parallel scientific codes, most efforts emphasize functional correctness, often overlooking performance, especially en...
- Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning : Abstract: Retrieval-Augmented Generation (RAG) systems empower large language models (LLMs) with external knowledge, yet struggle with efficiency-accuracy trade-offs when scaling to large knowledge gr...
- s3: You Don't Need That Much Data to Train a Search Agent via RL : Abstract: Retrieval-augmented generation (RAG) systems empower large language models (LLMs) to access external knowledge during inference. Recent advances have enabled LLMs to act as search agents via...
- Multi-Agent Reinforcement Learning for Autonomous Multi-Satellite Earth Observation: A Realistic Case Study : Abstract: The exponential growth of Low Earth Orbit (LEO) satellites has revolutionised Earth Observation (EO) missions, addressing challenges in climate monitoring, disaster management, and more. How...
- LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning : Abstract: Atomic commits, which address a single development concern, are a best practice in software development. In practice, however, developers often produce tangled commits that mix unrelated cha...
- Reinforcement Learning Foundations for Deep Research Systems: A Survey : Abstract: Deep research systems, agentic AI that solve complex, multi-step tasks by coordinating reasoning, search across the open web and user files, and tool use, are moving toward hierarchical depl...
- TabDSR: Decompose, Sanitize, and Reason for Complex Numerical Reasoning in Tabular Data : Abstract: Complex reasoning over tabular data is crucial in real-world data analysis, yet large language models (LLMs) often underperform due to complex queries, noisy data, and limited numerical capa...
- The ORCA Benchmark: Evaluating Real-World Calculation Accuracy in Large Language Models : Abstract: We present ORCA (Omni Research on Calculation in AI) Benchmark - a novel benchmark that evaluates large language models (LLMs) on multi-domain, real-life quantitative reasoning using verifie...
- Kosmos: An AI Scientist for Autonomous Discovery : Abstract: Data-driven scientific discovery requires iterative cycles of literature search, hypothesis generation, and data analysis. Substantial progress has been made towards AI agents that can autom...
- Emotion Detection From Social Media Posts : Abstract: Over the last few years, social media has evolved into a medium for expressing personal views, emotions, and even business and political proposals, recommendations, and advertisements. We ad...
- Transfer Learning-based Real-time Handgun Detection : Abstract: Traditional surveillance systems rely on human attention, limiting their effectiveness. This study employs convolutional neural networks and transfer learning to develop a real-time computer...
- Survey on AI Ethics: A Socio-technical Perspective : Abstract: The past decade has observed a significant advancement in AI with deep learning-based models being deployed in diverse scenarios, including safety-critical applications. As these AI systems ...
- Neural Physics: Using AI Libraries to Develop Physics-Based Solvers for Incompressible Computational Fluid Dynamics : Abstract: Numerical discretisations of partial differential equations (PDEs) can be written as discrete convolutions, which, themselves, are a key tool in AI libraries and used in convolutional neural...
- A Survey of Graph Neural Networks in Real world: Imbalance, Noise, Privacy and OOD Challenges : Abstract: Graph-structured data exhibits universality and widespread applicability across diverse domains, such as social network analysis, biochemistry, financial fraud detection, and network securit...
- A Reliable Cryptographic Framework for Empirical Machine Unlearning Evaluation : Abstract: Machine unlearning updates machine learning models to remove information from specific training samples, complying with data protection regulations that allow individuals to request the remo...
- Autonomous Robotic Drilling System for Mice Cranial Window Creation : Abstract: Robotic assistance for experimental manipulation in the life sciences is expected to enable favorable outcomes, regardless of the skill of the scientist. Experimental specimens in the life s...
- MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping : Abstract: Few-shot Semantic Segmentation addresses the challenge of segmenting objects in query images with only a handful of annotated examples. However, many previous state-of-the-art methods either...
- Inverse Entropic Optimal Transport Solves Semi-supervised Learning via Data Likelihood Maximization : Abstract: Learning conditional distributions $\pi^*(\cdot|x)$ is a central problem in machine learning, which is typically approached via supervised methods with paired data $(x,y) \sim \pi^*$. Howeve...
- Mastering Contact-rich Tasks by Combining Soft and Rigid Robotics with Imitation Learning : Abstract: Soft robots have the potential to revolutionize the use of robotic systems with their capability of establishing safe, robust, and adaptable interactions with their environment, but their pr...
- Intelligent Computing Social Modeling and Methodological Innovations in Political Science in the Era of Large Language Models : Abstract: The recent wave of artificial intelligence, epitomized by large language models (LLMs), has presented opportunities and challenges for methodological innovation in political science, sparking ...
- Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation : Abstract: Modern LLMs can now produce highly readable abstractive summaries, to the point that traditional automated metrics for evaluating summary quality, such as ROUGE, have saturated. However, LLM...
- RAG-IT: Retrieval-Augmented Instruction Tuning for Automated Financial Analysis : Abstract: Financial analysis relies heavily on the interpretation of earnings reports to assess company performance and guide decision-making. Traditional methods for generating such analyses demand s...
- REFA: Reference Free Alignment for multi-preference optimization : Abstract: To mitigate reward hacking from response verbosity, modern preference optimization methods are increasingly adopting length normalization (e.g., SimPO, ORPO, LN-DPO). While effective against...
- From Haystack to Needle: Label Space Reduction for Zero-shot Classification : Abstract: We present Label Space Reduction (LSR), a novel method for improving zero-shot classification performance of Large Language Models (LLMs). LSR iteratively refines the classification label sp...
- Beyond Covariance Matrix: The Statistical Complexity of Private Linear Regression : Abstract: We study the statistical complexity of private linear regression under an unknown, potentially ill-conditioned covariate distribution. Somewhat surprisingly, under privacy constraints the in...
- A Survey on Text-Driven 360-Degree Panorama Generation : Abstract: The advent of text-driven 360-degree panorama generation, enabling the synthesis of 360-degree panoramic images directly from textual descriptions, marks a transformative advancement in imme...
- Assessing the Macro and Micro Effects of Random Seeds on Fine-Tuning Large Language Models : Abstract: The impact of random seeds in fine-tuning large language models (LLMs) has been largely overlooked despite its potential influence on model performance. In this study, we systematically evalu...
- SecRepoBench: Benchmarking Code Agents for Secure Code Completion in Real-World Repositories : Abstract: This paper introduces SecRepoBench, a benchmark to evaluate code agents on secure code completion in real-world repositories. SecRepoBench has 318 code completion tasks in 27 C/C++ repositor...
- A data-driven framework for team selection in Fantasy Premier League : Abstract: Fantasy football is a billion-dollar industry with millions of participants. Under a fixed budget, managers select squads to maximize future Fantasy Premier League (FPL) points. This study f...
- Traversal Verification for Speculative Tree Decoding : Abstract: Speculative decoding is a promising approach for accelerating large language models. The primary idea is to use a lightweight draft model to speculate the output of the target model for mult...
- RoboRAN: A Unified Robotics Framework for Reinforcement Learning-Based Autonomous Navigation : Abstract: Autonomous robots must navigate and operate in diverse environments, from terrestrial and aquatic settings to aerial and space domains. While Reinforcement Learning (RL) has shown promise in...
- This Time is Different: An Observability Perspective on Time Series Foundation Models : Abstract: We introduce Toto, a time series forecasting foundation model with 151 million parameters. Toto uses a modern decoder-only architecture coupled with architectural innovations designed to acc...
- Distilling LLM Agent into Small Models with Retrieval and Code Tools : Abstract: Large language models (LLMs) excel at complex reasoning tasks but remain computationally expensive, limiting their practical deployment. To address this, recent works have focused on distill...
- Large Language Models Miss the Multi-Agent Mark : Abstract: Recent interest in Multi-Agent Systems of Large Language Models (MAS LLMs) has led to an increase in frameworks leveraging multiple LLMs to tackle complex tasks. However, much of this litera...
- R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing : Abstract: Large Language Models (LLMs) achieve impressive reasoning capabilities at the cost of substantial inference overhead, posing substantial deployment challenges. Although distilled Small Langu...
- SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving : Abstract: The growing gap between the increasing complexity of large language models (LLMs) and the limited computational budgets of edge devices poses a key challenge for efficient on-device inferenc...
- Balancing Tails when Comparing Distributions: Comprehensive Equity Index (CEI) with Application to Bias Evaluation in Operational Face Biometrics : Abstract: Demographic bias in high-performance face recognition (FR) systems often eludes detection by existing metrics, especially with respect to subtle disparities in the tails of the score distrib...
- AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs : Abstract: Weight decay is a standard regularization technique for training large language models (LLMs). While it is common to assign a uniform decay rate to every layer, this approach overlooks the s...
- Dense SAE Latents Are Features, Not Bugs : Abstract: Sparse autoencoders (SAEs) are designed to extract interpretable features from language models by enforcing a sparsity constraint. Ideally, training an SAE would yield latents that are both ...
- Benchmarking Foundation Models and Parameter-Efficient Fine-Tuning for Prognosis Prediction in Medical Imaging : Abstract: Despite the significant potential of Foundation Models (FMs) in medical imaging, their application to prognosis prediction remains challenging due to data scarcity, class imbalance, and task...
- Layer Importance for Mathematical Reasoning is Forged in Pre-Training and Invariant after Post-Training : Abstract: Large language models improve at math after instruction tuning, reinforcement learning, or knowledge distillation. We ask whether these gains come from major changes in the transformer layer...
- FedRef: Communication-Efficient Bayesian Fine-Tuning using a Reference Model : Abstract: Federated learning (FL) collaboratively trains artificial intelligence (AI) models to ensure user data privacy. Sharing only model updates generated from local training on client data with t...
- Omni-Router: Sharing Routing Decisions in Sparse Mixture-of-Experts for Speech Recognition : Abstract: Mixture-of-experts (MoE) architectures have expanded from language modeling to automatic speech recognition (ASR). Traditional MoE methods, such as the Switch Transformer, route experts inde...
- DiffSpectra: Molecular Structure Elucidation from Spectra using Diffusion Models : Abstract: Molecular structure elucidation from spectra is a fundamental challenge in molecular science. Conventional approaches rely heavily on expert interpretation and lack scalability, while retrie...
- Automatic Road Subsurface Distress Recognition from Ground Penetrating Radar Images using Deep Learning-based Cross-verification : Abstract: Ground penetrating radar (GPR) has become a rapid and non-destructive solution for road subsurface distress (RSD) detection. Deep learning-based automatic RSD recognition, though amelioratin...
- GDS Agent for Graph Algorithmic Reasoning : Abstract: Large language models (LLMs) have shown remarkable multimodal information processing and reasoning ability. When equipped with tools through function calling and enhanced with retrieval-augm...
- LA-MARRVEL: A Knowledge-Grounded and Language-Aware LLM Reranker for AI-MARRVEL in Rare Disease Diagnosis : Abstract: Diagnosing rare diseases often requires connecting variant-bearing genes to evidence that is written as unstructured clinical prose, which the current established pipelines still leave for c...
- Adaptive and Robust Data Poisoning Detection and Sanitization in Wearable IoT Systems using Large Language Models : Abstract: The widespread integration of wearable sensing devices in Internet of Things (IoT) ecosystems, particularly in healthcare, smart homes, and industrial applications, has required robust human...
- Digital Twin-Driven Pavement Health Monitoring and Maintenance Optimization Using Graph Neural Networks : Abstract: Pavement infrastructure monitoring is challenged by complex spatial dependencies, changing environmental conditions, and non-linear deterioration across road networks. Traditional Pavement M...
- Inference-Time Personalized Alignment with a Few User Preference Queries : Abstract: We study the problem of aligning a generative model's response with a user's preferences. Recent works have proposed several different formulations for personalized alignment; however, they ...
- Heterogeneous Metamaterials Design via Multiscale Neural Implicit Representation : Abstract: Metamaterials are engineered materials composed of specially designed unit cells that exhibit extraordinary properties beyond those of natural materials. Complex engineering tasks often requ...
- Discrete Bayesian Sample Inference for Graph Generation : Abstract: Generating graph-structured data is crucial in applications such as molecular generation, knowledge graphs, and network analysis. However, their discrete, unordered nature makes them difficu...
- Leveraging Discrete Function Decomposability for Scientific Design : Abstract: In the era of AI-driven science and engineering, we often want to design discrete objects in silico according to user-specified properties. For example, we may wish to design a protein to bi...
- Data-Efficient Realized Volatility Forecasting with Vision Transformers : Abstract: Recent work in financial machine learning has shown the virtue of complexity: the phenomenon by which deep learning methods capable of learning highly nonlinear relationships outperform simp...
- Unsupervised Evaluation of Multi-Turn Objective-Driven Interactions : Abstract: Large language models (LLMs) have seen increasing popularity in enterprise applications where AI agents and humans engage in objective-driven interactions. However, these systems are difficu...
- The Curved Spacetime of Transformer Architectures : Abstract: We present a geometric framework for understanding Transformer-based language models, drawing an explicit analogy to General Relativity. Queries and keys induce an effective metric on repres...
- Homomorphism distortion: A metric to distinguish them all and in the latent space bind them : Abstract: For far too long, expressivity of graph neural networks has been measured only in terms of combinatorial properties. In this work we stray away from this tradition and provide a princ...
- Evaluating Control Protocols for Untrusted AI Agents : Abstract: As AI systems become more capable and widely deployed as agents, ensuring their safe operation becomes critical. AI control offers one approach to mitigating the risk from untrusted AI agent...
- PublicAgent: Multi-Agent Design Principles From an LLM-Based Open Data Analysis Framework : Abstract: Open data repositories hold potential for evidence-based decision-making, yet are inaccessible to non-experts lacking expertise in dataset discovery, schema mapping, and statistical analysis...
- No-Human in the Loop: Agentic Evaluation at Scale for Recommendation : Abstract: Evaluating large language models (LLMs) as judges is increasingly critical for building scalable and trustworthy evaluation pipelines. We present ScalingEval, a large-scale benchmarking stud...
- Epidemiology of Large Language Models: A Benchmark for Observational Distribution Knowledge : Abstract: Artificial intelligence (AI) systems hold great promise for advancing various scientific disciplines, and are increasingly used in real-world applications. Despite their remarkable progress,...
- SnapStream: Efficient Long Sequence Decoding on Dataflow Accelerators : Abstract: The proliferation of 100B+ parameter Large Language Models (LLMs) with 100k+ context length support have resulted in increasing demands for on-chip memory to support large KV caches. Techniq...
- Large language models require a new form of oversight: capability-based monitoring : Abstract: The rapid adoption of large language models (LLMs) in healthcare has been accompanied by scrutiny of their oversight. Existing monitoring approaches, inherited from traditional machine learn...
- miniF2F-Lean Revisited: Reviewing Limitations and Charting a Path Forward : Abstract: We perform a thorough analysis of the formal and informal statements in the miniF2F benchmark from the perspective of an AI system that is tasked to participate in a math Olympiad consisting...
- Using Multi-modal Large Language Model to Boost Fireworks Algorithm's Ability in Settling Challenging Optimization Tasks : Abstract: As optimization problems grow increasingly complex and diverse, advancements in optimization techniques and paradigm innovations hold significant importance. The challenges posed by optimiza...
- A Proprietary Model-Based Safety Response Framework for AI Agents : Abstract: With the widespread application of Large Language Models (LLMs), their associated security issues have become increasingly prominent, severely constraining their trustworthy deployment in cr...
- Uncovering Bugs in Formal Explainers: A Case Study with PyXAI : Abstract: Formal explainable artificial intelligence (XAI) offers unique theoretical guarantees of rigor when compared to other non-formal methods of explainability. However, little attention has been...
- Toward Autonomous Engineering Design: A Knowledge-Guided Multi-Agent Framework : Abstract: The engineering design process often demands expertise from multiple domains, leading to complex collaborations and iterative refinements. Traditional methods can be resource-intensive and p...
- Adobe Summit Concierge Evaluation with Human in the Loop : Abstract: Generative AI assistants offer significant potential to enhance productivity, streamline information access, and improve user experience in enterprise contexts. In this work, we present Summ...
- From Five Dimensions to Many: Large Language Models as Precise and Interpretable Psychological Profilers : Abstract: Psychological constructs within individuals are widely believed to be interconnected. We investigated whether and how Large Language Models (LLMs) can model the correlational structure of hu...
- Towards Scalable Web Accessibility Audit with MLLMs as Copilots : Abstract: Ensuring web accessibility is crucial for advancing social welfare, justice, and equality in digital spaces, yet the vast majority of website user interfaces remain non-compliant, due in par...
- Explaining Decisions in ML Models: a Parameterized Complexity Analysis (Part I) : Abstract: This paper presents a comprehensive theoretical investigation into the parameterized complexity of explanation problems in various machine learning (ML) models. Contrary to the prevalent bla...
- Outbidding and Outbluffing Elite Humans: Mastering Liar's Poker via Self-Play and Reinforcement Learning : Abstract: AI researchers have long focused on poker-like games as a testbed for environments characterized by multi-player dynamics, imperfect information, and reasoning under uncertainty. While recen...
- An extended reality-based framework for user risk training in urban built environment : Abstract: In the context of increasing urban risks, particularly from climate change-induced flooding, this paper presents an extended Reality (XR)-based framework to improve user risk training within...
- Evaluating Generative AI as an Educational Tool for Radiology Resident Report Drafting : Abstract: Objective: Radiology residents require timely, personalized feedback to develop accurate image analysis and reporting skills. Increasing clinical workload often limits attendings' ability to...
- Digital Transformation Chatbot (DTchatbot): Integrating Large Language Model-based Chatbot in Acquiring Digital Transformation Needs : Abstract: Many organisations pursue digital transformation to enhance operational efficiency, reduce manual efforts, and optimise processes by automation and digital tools. To achieve this, a comprehe...
- AI-Enhanced Wi-Fi Sensing Through Single Transceiver Pair : Abstract: The advancement of next-generation Wi-Fi technology heavily relies on sensing capabilities, which play a pivotal role in enabling sophisticated applications. In response to the growing deman...
- Spatio-Temporal Attention Network for Epileptic Seizure Prediction : Abstract: In this study, we present a deep learning framework that learns complex spatio-temporal correlation structures of EEG signals through a Spatio-Temporal Attention Network (STAN) for accurate ...
- EEGReXferNet: A Lightweight Gen-AI Framework for EEG Subspace Reconstruction via Cross-Subject Transfer Learning and Channel-Aware Embedding : Abstract: Electroencephalography (EEG) is a widely used non-invasive technique for monitoring brain activity, but low signal-to-noise ratios (SNR) due to various artifacts often compromise its utility...
- Approaching Low-Cost Cardiac Intelligence with Semi-Supervised Knowledge Distillation : Abstract: Deploying advanced cardiac artificial intelligence for daily cardiac monitoring is hindered by its reliance on extensive medical data and high computational resources. Low-cost cardiac intel...
- Consciousness-ECG Transformer for Conscious State Estimation System with Real-Time Monitoring : Abstract: Conscious state estimation is important in various medical settings, including sleep staging and anesthesia management, to ensure patient safety and optimize health outcomes. Traditional met...
- SELF-REDRAFT: Eliciting Intrinsic Exploration-Exploitation Balance in Test-Time Scaling for Code Generation : Abstract: Test-time scaling without interpreter feedback is essential for real-world code generation scenarios where test cases are not readily available. While existing paradigms often rely on either...
- Digitizing Spermatogenesis Lineage at Nanoscale Resolution In Tissue-Level Electron Microscopy : Abstract: Recent advances in 2D large-scale and 3D volume electron microscopy have stimulated the rapid development of nanoscale functional analysis at the tissue and organ levels. Digitizing the cell...
- Mathematical exploration and discovery at scale : Abstract: AlphaEvolve is a generic evolutionary coding agent that combines the generative capabilities of LLMs with automated evaluation in an iterative evolutionary framework that proposes, tests, an...
- LM-Fix: Lightweight Bit-Flip Detection and Rapid Recovery Framework for Language Models : Abstract: This paper presents LM-Fix, a lightweight detection and rapid recovery framework for faults in large language models (LLMs). Existing integrity approaches are often heavy or slow for modern ...
- Proof-of-Spiking-Neurons (PoSN): Neuromorphic Consensus for Next-Generation Blockchains : Abstract: Blockchain systems face persistent challenges of scalability, latency, and energy inefficiency. Existing consensus protocols such as Proof-of-Work (PoW) and Proof-of-Stake (PoS) either consu...
- Analysis of AdvFusion: Adapter-based Multilingual Learning for Code Large Language Models : Abstract: Programming languages can benefit from one another by utilizing a language model for software engineering tasks. Full fine-tuning and Parameter Efficient Fine-Tuning (PEFT) of Code Language ...
- FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels : Abstract: Recent advances in large language models (LLMs) have demonstrated impressive capabilities in formal theorem proving, particularly on contest-based mathematical benchmarks like the IMO. Howev...
- Academics and Generative AI: Empirical and Epistemic Indicators of Policy-Practice Voids : Abstract: As generative AI diffuses through academia, policy-practice divergence becomes consequential, creating demand for auditable indicators of alignment. This study prototypes a ten-item, indirec...
- A Novel Reservoir Computing Framework for Chaotic Time Series Prediction Using Time Delay Embedding and Random Fourier Features : Abstract: Forecasting chaotic time series requires models that can capture the intrinsic geometry of the underlying attractor while remaining computationally efficient. We introduce a novel reservoir ...
- Stochastic Deep Graph Clustering for Practical Group Formation : Abstract: While prior work on group recommender systems (GRSs) has primarily focused on improving recommendation accuracy, most approaches assume static or predefined groups, making them unsuitable fo...
- NEF-NET+: Adapting Electrocardio panorama in the wild : Abstract: Conventional multi-lead electrocardiogram (ECG) systems capture cardiac signals from a fixed set of anatomical viewpoints defined by lead placement. However, certain cardiac conditions (e.g....
- AgentSLA : Towards a Service Level Agreement for AI Agents : Abstract: AI components are increasingly becoming a key element of all types of software systems to enhance their functionality. These AI components are often implemented as AI Agents, offering more a...
- Test-time Adaptation of Tiny Recursive Models : Abstract: Prior to the close of the 2025 ARC Prize competition, the leading open source approach - known as TRM, or Tiny Recursive Models - involved training a 7M parameter recursive neural network on...
- Predicting Weekly Fishing Concentration Zones through Deep Learning Integration of Heterogeneous Environmental Spatial Datasets : Abstract: The North Indian Ocean, including the Arabian Sea and the Bay of Bengal, represents a vital source of livelihood for coastal communities, yet fishermen often face uncertainty in locating pro...
- NABench: Large-Scale Benchmarks of Nucleotide Foundation Models for Fitness Prediction : Abstract: Nucleotide sequence variation can induce significant shifts in functional fitness. Recent nucleotide foundation models promise to predict such fitness effects directly from sequence, yet het...
- A Criminology of Machines : Abstract: While the possibility of reaching human-like Artificial Intelligence (AI) remains controversial, the likelihood that the future will be characterized by a society with a growing presence of ...
- Performance Evaluation of Bitstring Representations in a Linear Genetic Programming Framework : Abstract: Different bitstring representations can yield varying computational performance. This work compares three bitstring implementations in C++: std::bitset, boost::dynamic_bitset, and a custom d...
- Generative Hints : Abstract: Data augmentation is widely used in vision to introduce variation and mitigate overfitting, through enabling models to learn invariant properties, such as spatial invariance. However, these ...
- Zero-shot data citation function classification using transformer-based large language models (LLMs) : Abstract: Efforts have increased in recent years to identify associations between specific datasets and the scientific literature that incorporates them. Knowing that a given publication cites a given...
- From Narrow to Wide: Autoencoding Transformers for Ultrasound Bandwidth Recovery : Abstract: Conventional pulse-echo ultrasound suffers when low-cost probes deliver only narrow fractional bandwidths, elongating pulses and erasing high-frequency detail. We address this limitation by ...
- Power Constrained Nonstationary Bandits with Habituation and Recovery Dynamics : Abstract: A common challenge for decision makers is selecting actions whose rewards are unknown and evolve over time based on prior policies. For instance, repeated use may reduce an action's effectiv...
- EvtSlowTV - A Large and Diverse Dataset for Event-Based Depth Estimation : Abstract: Event cameras, with their high dynamic range (HDR) and low latency, offer a promising alternative for robust depth estimation in challenging environments. However, many event-based depth est...
- Value of Information-Enhanced Exploration in Bootstrapped DQN : Abstract: Efficient exploration in deep reinforcement learning remains a fundamental challenge, especially in environments characterized by high-dimensional states and sparse rewards. Traditional expl...
- Systematizing LLM Persona Design: A Four-Quadrant Technical Taxonomy for AI Companion Applications : Abstract: The design and application of LLM-based personas in AI companionship is a rapidly expanding but fragmented field, spanning from virtual emotional companions and game NPCs to embodied funct...
- SLIP: Structural-aware Language-Image Pretraining for Vision-Language Alignment : Abstract: Vision-Language Pretraining (VLP) has achieved remarkable success across various downstream tasks, but such gains are largely driven by scaling up on training data. Yet, literature methods t...
- Adaptive-Sensorless Monitoring of Shipping Containers : Abstract: Monitoring the internal temperature and humidity of shipping containers is essential to preventing quality degradation during cargo transportation. Sensorless monitoring -- machine learning ...
- Reading Between the Lines: The One-Sided Conversation Problem : Abstract: Conversational AI is constrained in many real-world settings where only one side of a dialogue can be recorded, such as telemedicine, call centers, and smart glasses. We formalize this as th...
- Sparse, self-organizing ensembles of local kernels detect rare statistical anomalies : Abstract: Modern artificial intelligence has revolutionized our ability to extract rich and versatile data representations across scientific disciplines. Yet, the statistical properties of these repre...
- Scaling Multi-Agent Environment Co-Design with Diffusion Models : Abstract: The agent-environment co-design paradigm jointly optimises agent policies and environment configurations in search of improved system performance. With application domains ranging from wareh...
- CARMA: Comprehensive Automatically-annotated Reddit Mental Health Dataset for Arabic : Abstract: Mental health disorders affect millions worldwide, yet early detection remains a major challenge, particularly for Arabic-speaking populations where resources are limited and mental health d...
- Adaptive Detection of Software Aging under Workload Shift : Abstract: Software aging is a phenomenon that affects long-running systems, leading to progressive performance degradation and increasing the risk of failures. To mitigate this problem, this work prop...
- FP-AbDiff: Improving Score-based Antibody Design by Capturing Nonequilibrium Dynamics through the Underlying Fokker-Planck Equation : Abstract: Computational antibody design holds immense promise for therapeutic discovery, yet existing generative models are fundamentally limited by two core challenges: (i) a lack of dynamical consis...
- An Augmentation Overlap Theory of Contrastive Learning : Abstract: Recently, self-supervised contrastive learning has achieved great success on various tasks. However, its underlying working mechanism is yet unclear. In this paper, we first provide the tigh...
- Image-Intrinsic Priors for Integrated Circuit Defect Detection and Novel Class Discovery via Self-Supervised Learning : Abstract: Integrated circuit manufacturing is highly complex, comprising hundreds of process steps. Defects can arise at any stage, causing yield loss and ultimately degrading product reliability. Sup...
- Control Barrier Function for Aligning Large Language Models : Abstract: This paper proposes a control-based framework for aligning large language models (LLMs) by leveraging a control barrier function (CBF) to ensure user-desirable text generation. The presented...
- EGMOF: Efficient Generation of Metal-Organic Frameworks Using a Hybrid Diffusion-Transformer Architecture : Abstract: Designing materials with targeted properties remains challenging due to the vastness of chemical space and the scarcity of property-labeled data. While recent advances in generative models o...
- Optimal Boundary Control of Diffusion on Graphs via Linear Programming : Abstract: We propose a linear programming (LP) framework for steady-state diffusion and flux optimization on geometric networks. The state variable satisfies a discrete diffusion law on a weighted, or...
- Deploying Rapid Damage Assessments from sUAS Imagery for Disaster Response : Abstract: This paper presents the first AI/ML system for automating building damage assessment in uncrewed aerial systems (sUAS) imagery to be deployed operationally during federally declared disaster...
- From Measurement to Expertise: Empathetic Expert Adapters for Context-Based Empathy in Conversational AI Agents : Abstract: Empathy is a critical factor in fostering positive user experiences in conversational AI. While models can display empathy, it is often generic rather than tailored to specific tasks and con...
- Forecast2Anomaly (F2A): Adapting Multivariate Time Series Foundation Models for Anomaly Prediction : Abstract: Forecasting anomalies (anomaly prediction) in multivariate time series from different real-world, dynamic, and complex systems is vital for preempting critical failures, leading to a substan...
- Who Sees the Risk? Stakeholder Conflicts and Explanatory Policies in LLM-based Risk Assessment : Abstract: Understanding how different stakeholders perceive risks in AI systems is essential for their responsible deployment. This paper presents a framework for stakeholder-grounded risk assessment ...
- RefAgent: A Multi-agent LLM-based Framework for Automatic Software Refactoring : Abstract: Large Language Models (LLMs) have substantially influenced various software engineering tasks. Indeed, in the case of software refactoring, traditional LLMs have shown the ability to reduce ...
- GraphCliff: Short-Long Range Gating for Subtle Differences but Critical Changes : Abstract: Quantitative structure-activity relationship assumes a smooth relationship between molecular structure and biological activity. However, activity cliffs defined as pairs of structurally simi...
- Optimizing Earth-Moon Transfer and Cislunar Navigation: Integrating Low-Energy Trajectories, AI Techniques and GNSS-R Technologies : Abstract: The rapid growth of cislunar activities, including lunar landings, the Lunar Gateway, and in-space refueling stations, requires advances in cost-efficient trajectory design and reliable inte...
- Efficient Linear Attention for Multivariate Time Series Modeling via Entropy Equality : Abstract: Attention mechanisms have been extensively employed in various applications, including time series modeling, owing to their capacity to capture intricate dependencies; however, their utility...
- A Quantized VAE-MLP Botnet Detection Model: A Systematic Evaluation of Quantization-Aware Training and Post-Training Quantization Strategies : Abstract: In an effort to counter the increasing IoT botnet-based attacks, state-of-the-art deep learning methods have been proposed and have achieved impressive detection accuracy. However, their com...
- QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models : Abstract: Recently, Multimodal Large Language Models (MLLMs) encounter two key issues in multi-image contexts: (1) a lack of fine-grained perception across disparate images, and (2) a diminished capab...
- Retrofitters, pragmatists and activists: Public interest litigation for accountable automated decision-making : Abstract: This paper examines the role of public interest litigation in promoting accountability for AI and automated decision-making (ADM) in Australia. Since ADM regulation faces geopolitical headwin...
- LGM: Enhancing Large Language Models with Conceptual Meta-Relations and Iterative Retrieval : Abstract: Large language models (LLMs) exhibit strong semantic understanding, yet struggle when user instructions involve ambiguous or conceptually misaligned terms. We propose the Language Graph Mode...
- Hybrid Fact-Checking that Integrates Knowledge Graphs, Large Language Models, and Search-Based Retrieval Agents Improves Interpretable Claim Verification : Abstract: Large language models (LLMs) excel in generating fluent utterances but can lack reliable grounding in verified information. At the same time, knowledge-graph-based fact-checkers deliver prec...
- Node-Based Editing for Multimodal Generation of Text, Audio, Image, and Video : Abstract: We present a node-based storytelling system for multimodal content generation. The system represents stories as graphs of nodes that can be expanded, edited, and iteratively refined through ...
- GMoPE: A Prompt-Expert Mixture Framework for Graph Foundation Models : Abstract: Graph Neural Networks (GNNs) have demonstrated impressive performance on task-specific benchmarks, yet their ability to generalize across diverse domains and tasks remains limited. Existing ...
- Generative deep learning for foundational video translation in ultrasound : Abstract: Deep learning (DL) has the potential to revolutionize image acquisition and interpretation across medicine; however, attention to data imbalance and missingness is required. Ultrasound data ...
- Comparing the Performance of LLMs in RAG-based Question-Answering: A Case Study in Computer Science Literature : Abstract: Retrieval Augmented Generation (RAG) is emerging as a powerful technique to enhance the capabilities of Generative AI models by reducing hallucination. Thus, the increasing prominence of RAG...
- When Generative Artificial Intelligence meets Extended Reality: A Systematic Review : Abstract: With the continuous advancement of technology, the application of generative artificial intelligence (AI) in various fields is gradually demonstrating great potential, particularly when comb...
- How to Evaluate Speech Translation with Source-Aware Neural MT Metrics : Abstract: Automatic evaluation of speech-to-text translation (ST) systems is typically performed by comparing translation hypotheses with one or more reference translations. While effective to some ex...
- Extending Fair Null-Space Projections for Continuous Attributes to Kernel Methods : Abstract: With the ongoing integration of machine learning systems into the everyday social life of millions, the notion of fairness becomes an ever-increasing priority in their development. Fairness ...
- Benchmarking the Thinking Mode of Multimodal Large Language Models in Clinical Tasks : Abstract: A recent advancement in Multimodal Large Language Models (MLLMs) research is the emergence of "reasoning MLLMs" that offer explicit control over their internal thinking processes (normally r...
- Discourse-Aware Scientific Paper Recommendation via QA-Style Summarization and Multi-Level Contrastive Learning : Abstract: The rapid growth of open-access (OA) publications has intensified the challenge of identifying relevant scientific papers. Due to privacy constraints and limited access to user interaction d...
- Generative Artificial Intelligence in Bioinformatics: A Systematic Review of Models, Applications, and Methodological Advances : Abstract: Generative artificial intelligence (GenAI) has become a transformative approach in bioinformatics that often enables advancements in genomics, proteomics, transcriptomics, structural biology...
- Open Source State-Of-the-Art Solution for Romanian Speech Recognition : Abstract: In this work, we present a new state-of-the-art Romanian Automatic Speech Recognition (ASR) system based on NVIDIA's FastConformer architecture--explored here for the first time in the conte...
- Decoupling Augmentation Bias in Prompt Learning for Vision-Language Models : Abstract: Recent advances in large-scale vision and language models have led to significant progress in zero-shot learning tasks. Methods such as CoOp and CoCoOp have shown that replacing handcrafted ...
- Computational Imaging Meets LLMs: Zero-Shot IDH Mutation Prediction in Brain Gliomas : Abstract: We present a framework that combines Large Language Models with computational image analytics for non-invasive, zero-shot prediction of IDH mutation status in brain gliomas. For each subject...
- Adaptable Hindsight Experience Replay for Search-Based Learning : Abstract: AlphaZero-like Monte Carlo Tree Search systems, originally introduced for two-player games, dynamically balance exploration and exploitation using neural network guidance. This combination m...
- Light over Heavy: Automated Performance Requirements Quantification with Linguistic Inducement : Abstract: Elicited performance requirements need to be quantified for compliance in different engineering tasks, e.g., configuration tuning and performance testing. Much existing work has relied on ma...
- Inter-Agent Trust Models: A Comparative Study of Brief, Claim, Proof, Stake, Reputation and Constraint in Agentic Web Protocol Design-A2A, AP2, ERC-8004, and Beyond : Abstract: As the "agentic web" takes shape-billions of AI agents (often LLM-powered) autonomously transacting and collaborating-trust shifts from human oversight to protocol design. In 2025, several i...
- CareMedEval dataset: Evaluating Critical Appraisal and Reasoning in the Biomedical Field : Abstract: Critical appraisal of scientific literature is an essential skill in the biomedical field. While large language models (LLMs) can offer promising support in this task, their reliability rema...
- Development of the Bioinspired Tendon-Driven DexHand 021 with Proprioceptive Compliance Control : Abstract: The human hand plays a vital role in daily life and industrial applications, yet replicating its multifunctional capabilities-including motion, sensing, and coordinated manipulation-with rob...
- ROSBag MCP Server: Analyzing Robot Data with LLMs for Agentic Embodied AI Applications : Abstract: Agentic AI systems and Physical or Embodied AI systems have been two key research verticals at the forefront of Artificial Intelligence and Robotics, with Model Context Protocol (MCP) increa...
- A Theoretical Framework for Environmental Similarity and Vessel Mobility as Coupled Predictors of Marine Invasive Species Pathways : Abstract: Marine invasive species spread through global shipping and generate substantial ecological and economic impacts. Traditional risk assessments require detailed records of ballast water and tr...
- Efficient Neural Networks with Discrete Cosine Transform Activations : Abstract: In this paper, we extend our previous work on the Expressive Neural Network (ENN), a multilayer perceptron with adaptive activation functions parametrized using the Discrete Cosine Transform...
- SOLVE-Med: Specialized Orchestration for Leading Vertical Experts across Medical Specialties : Abstract: Medical question answering systems face deployment challenges including hallucinations, bias, computational demands, privacy concerns, and the need for specialized expertise across diverse d...
- Uncovering Code Insights: Leveraging GitHub Artifacts for Deeper Code Understanding : Abstract: Understanding the purpose of source code is a critical task in software maintenance, onboarding, and modernization. While large language models (LLMs) have shown promise in generating code e...
- MultiZebraLogic: A Multilingual Logical Reasoning Benchmark : Abstract: Measuring the full abilities of large language models (LLMs) requires benchmarks representing multiple tasks. We aim to create large, high-quality datasets for comparison of logical reasonin...
- AILA--First Experiments with Localist Language Models : Abstract: This paper presents the first empirical demonstration of controllable locality in transformer language models, a novel architectural framework that enables continuous control over the degree...
- Imitation Learning in the Deep Learning Era: A Novel Taxonomy and Recent Advances : Abstract: Imitation learning (IL) enables agents to acquire skills by observing and replicating the behavior of one or multiple experts. In recent years, advances in deep learning have significantly e...
- Multi-User Personalisation in Human-Robot Interaction: Using Quantitative Bipolar Argumentation Frameworks for Preferences Conflict Resolution : Abstract: While personalisation in Human-Robot Interaction (HRI) has advanced significantly, most existing approaches focus on single-user adaptation, overlooking scenarios involving multiple stakehol...
- Learning Under Laws: A Constraint-Projected Neural PDE Solver that Eliminates Hallucinations : Abstract: Neural networks can approximate solutions to partial differential equations, but they often break the very laws they are meant to model-creating mass from nowhere, drifting shocks, or violat...
- PerfDojo: Automated ML Library Generation for Heterogeneous Architectures : Abstract: The increasing complexity of machine learning models and the proliferation of diverse hardware architectures (CPUs, GPUs, accelerators) make achieving optimal performance a significant chall...
- Step-Audio-EditX Technical Report : Abstract: We present Step-Audio-EditX, the first open-source LLM-based audio model excelling at expressive and iterative audio editing encompassing emotion, speaking style, and paralinguistics alongsi...
- Visualization Biases MLLM's Decision Making in Network Data Tasks : Abstract: We evaluate how visualizations can influence the judgment of MLLMs about the presence or absence of bridges in a network. We show that the inclusion of visualization improves confidence over...
- LiveTradeBench: Seeking Real-World Alpha with Large Language Models : Abstract: Large language models (LLMs) achieve strong performance across benchmarks--from knowledge quizzes and math reasoning to web-agent tasks--but these tests occur in static settings, lacking rea...
- Watermarking Large Language Models in Europe: Interpreting the AI Act in Light of Technology : Abstract: To foster trustworthy Artificial Intelligence (AI) within the European Union, the AI Act requires providers to mark and detect the outputs of their general-purpose models. The Article 50 and...
- Explaining Human Choice Probabilities with Simple Vector Representations : Abstract: When people pursue rewards in stochastic environments, they often match their choice frequencies to the observed target frequencies, even when this policy is demonstrably sub-optimal. We use...
- ChiMDQA: Towards Comprehensive Chinese Document QA with Fine-grained Evaluation : Abstract: With the rapid advancement of natural language processing (NLP) technologies, the demand for high-quality Chinese document question-answering datasets is steadily growing. To address this is...
- DQN Performance with Epsilon Greedy Policies and Prioritized Experience Replay : Abstract: We present a detailed study of Deep Q-Networks in finite environments, emphasizing the impact of epsilon-greedy exploration schedules and prioritized experience replay. Through systematic ex...
- Whisper Leak: a side-channel attack on Large Language Models : Abstract: Large Language Models (LLMs) are increasingly deployed in sensitive domains including healthcare, legal services, and confidential communications, where privacy is paramount. This paper intr...
- Structured Matrix Scaling for Multi-Class Calibration : Abstract: Post-hoc recalibration methods are widely used to ensure that classifiers provide faithful probability estimates. We argue that parametric recalibration functions based on logistic regressio...
Research Sources: 375 | Generated: 11/6/2025
