AI Research News Feeds for December 23rd, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

AC4: Algebraic Computation Checker for Circuit Constraints in ZKPs : Abstract: Zero-knowledge proof (ZKP) systems have surged attention and held a fundamental role in contemporary cryptography. Zero-knowledge succinct non-interactive argument of knowledge (zk-SNARK) pr...
Label Words as Local Task Vectors in In-Context Learning : Abstract: Large Language Models (LLMs) have demonstrated remarkable abilities, one of the most important being in-context learning (ICL). With ICL, LLMs can derive the underlying rule from a few demon...
Structured Language Generation Model: Loss Calibration and Formatted Decoding for Robust Structure Prediction and Knowledge Retrieval : Abstract: Modern generative pre-trained language models excel at open-ended text generation, yet continue to underperform on structure-related tasks such as NER, relation extraction, and semantic role...
Epistemological Fault Lines Between Human and Artificial Intelligence : Abstract: Large language models (LLMs) are widely described as artificial intelligence, yet their epistemic profile diverges sharply from human cognition. Here we show that the apparent alignment betw...
From Retrieval to Reasoning: A Framework for Cyber Threat Intelligence NER with Explicit and Adaptive Instructions : Abstract: The automation of Cyber Threat Intelligence (CTI) relies heavily on Named Entity Recognition (NER) to extract critical entities from unstructured text. Currently, Large Language Models (LLMs...
BanglaForge: LLM Collaboration with Self-Refinement for Bangla Code Generation : Abstract: Bangla is a low-resource language for code generation, lacking large-scale annotated datasets and tools to transform natural language specifications into executable programs. This makes Bang...
Watch Closely: Mitigating Object Hallucinations in Large Vision-Language Models with Disentangled Decoding : Abstract: Large Vision-Language Models (LVLMs) bridge the gap between visual and linguistic modalities, demonstrating strong potential across a variety of domains. However, despite significant progres...
Affordance RAG: Hierarchical Multimodal Retrieval with Affordance-Aware Embodied Memory for Mobile Manipulation : Abstract: In this study, we address the problem of open-vocabulary mobile manipulation, where a robot is required to carry a wide range of objects to receptacles based on free-form natural language in...
Merge on workspaces as Hopf algebra Markov chain : Abstract: We study the dynamical properties of a Hopf algebra Markov chain with state space the binary rooted forests with labelled leaves. This Markovian dynamical system describes the core computati...
brat: Aligned Multi-View Embeddings for Brain MRI Analysis : Abstract: We present brat (brain report alignment transformer), a multi-view representation learning framework for brain magnetic resonance imaging (MRI) trained on MRIs paired with clinical reports. ...
Explainable Transformer-CNN Fusion for Noise-Robust Speech Emotion Recognition : Abstract: Speech Emotion Recognition (SER) systems often degrade in performance when exposed to the unpredictable acoustic interference found in real-world environments. Additionally, the opacity of d...
Measuring Fine-Grained Negotiation Tactics of Humans and LLMs in Diplomacy : Abstract: The study of negotiation styles dates back to Aristotle's ethos-pathos-logos rhetoric. Prior efforts primarily studied the success of negotiation agents. Here, we shift the focus towards the...
Investigating Spatial Attention Bias in Vision-Language Models : Abstract: Vision-Language Models have demonstrated remarkable capabilities in understanding visual content, yet systematic biases in their spatial processing remain largely unexplored. This work ident...
Distributed Asymmetric Allocation: A Topic Model for Large Imbalanced Corpora in Social Sciences : Abstract: Social scientists employ latent Dirichlet allocation (LDA) to find highly specific topics in large corpora, but they often struggle in this task because (1) LDA, in general, takes a signific...
Layout-Aware Text Editing for Efficient Transformation of Academic PDFs to Markdown : Abstract: Academic documents stored in PDF format can be transformed into plain text structured markup languages to enhance accessibility and enable scalable digital library workflows. Markup language...
GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators : Abstract: Training capable Large Language Model (LLM) agents is critically bottlenecked by the high cost and static nature of real-world interaction data. We address this by introducing GenEnv, a fram...
Exploring Zero-Shot ACSA with Unified Meaning Representation in Chain-of-Thought Prompting : Abstract: Aspect-Category Sentiment Analysis (ACSA) provides granular insights by identifying specific themes within reviews and their associated sentiment. While supervised learning approaches domina...
Diacritic Restoration for Low-Resource Indigenous Languages: Case Study with Bribri and Cook Islands M\=aori : Abstract: We present experiments on diacritic restoration, a form of text normalization essential for natural language processing (NLP) tasks. Our study focuses on two extremely under-resourced langua...
MauBERT: Universal Phonetic Inductive Biases for Few-Shot Acoustic Units Discovery : Abstract: This paper introduces MauBERT, a multilingual extension of HuBERT that leverages articulatory features for robust cross-lingual phonetic representation learning. We continue HuBERT pre-train...
Increasing the Thinking Budget is Not All You Need : Abstract: Recently, a new wave of thinking-capable Large Language Models has emerged, demonstrating exceptional capabilities across a wide range of reasoning benchmarks. Early studies have begun to ex...
Algerian Dialect : Abstract: We present Algerian Dialect, a large-scale sentiment-annotated dataset consisting of 45,000 YouTube comments written in Algerian Arabic dialect. The comments were collected from more than 30...
Event Extraction in Large Language Model : Abstract: Large language models (LLMs) and multimodal LLMs are changing event extraction (EE): prompting and generation can often produce structured outputs in zero shot or few shot settings. Yet LLM ...
A Large-Language-Model Framework for Automated Humanitarian Situation Reporting : Abstract: Timely and accurate situational reports are essential for humanitarian decision-making, yet current workflows remain largely manual, resource intensive, and inconsistent. We present a fully ...
SiamGPT: Quality-First Fine-Tuning for Stable Thai Text Generation : Abstract: Open-weights large language models remain difficult to deploy for Thai due to unstable generation under complex instructions, despite strong English performance. To mitigate these limitation...
MobileWorld: Benchmarking Autonomous Mobile Agents in Agent-User Interactive, and MCP-Augmented Environments : Abstract: Among existing online mobile-use benchmarks, AndroidWorld has emerged as the dominant benchmark due to its reproducible environment and deterministic evaluation; however, recent agents achie...
CodeSimpleQA: Scaling Factuality in Code Large Language Models : Abstract: Large language models (LLMs) have made significant strides in code generation, achieving impressive capabilities in synthesizing code snippets from natural language instructions. However, a ...
Kunnafonidilaw ka Cadeau: an ASR dataset of present-day Bambara : Abstract: We present Kunkado, a 160-hour Bambara ASR dataset compiled from Malian radio archives to capture present-day spontaneous speech across a wide range of topics. It includes code-switching, di...
HATS: High-Accuracy Triple-Set Watermarking for Large Language Models : Abstract: Misuse of LLM-generated text can be curbed by watermarking techniques that embed implicit signals into the output. We propose a watermark that partitions the vocabulary at each decoding step...
CienaLLM: Generative Climate-Impact Extraction from News Articles with Autoregressive LLMs : Abstract: Understanding and monitoring the socio-economic impacts of climate hazards requires extracting structured information from heterogeneous news articles on a large scale. To that end, we have ...
CycleChart: A Unified Consistency-Based Learning Framework for Bidirectional Chart Understanding and Generation : Abstract: Current chart-specific tasks, such as chart question answering, chart parsing, and chart generation, are typically studied in isolation, preventing models from learning the shared semantics ...
JEPA-Reasoner: Decoupling Latent Reasoning from Token Generation : Abstract: While Joint-Embedding Predictive Architecture (JEPA) has emerged as a powerful architecture for learning rich latent representations, it fundamentally lacks generative abilities. Meanwhile, ...
From Speech to Subtitles: Evaluating ASR Models in Subtitling Italian Television Programs : Abstract: Subtitles are essential for video accessibility and audience engagement. Modern Automatic Speech Recognition (ASR) systems, built upon Encoder-Decoder neural network architectures and traine...
QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation : Abstract: Dynamic Retrieval-Augmented Generation adaptively determines when to retrieve during generation to mitigate hallucinations in large language models (LLMs). However, existing methods rely on ...
AWPO: Enhancing Tool-Use of Large Language Models through Explicit Integration of Reasoning Rewards : Abstract: While reinforcement learning (RL) shows promise in training tool-use large language models (LLMs) using verifiable outcome rewards, existing methods largely overlook the potential of explici...
Stop saying LLM: Large Discourse Models (LDM) and Artificial Discursive Agent (ADA)? : Abstract: This paper proposes an epistemological shift in the analysis of large generative models, replacing the category ''Large Language Models'' (LLM) with that of ''Large Discourse Models'' (LDM),...
A Large Language Model Based Method for Complex Logical Reasoning over Knowledge Graphs : Abstract: Reasoning over knowledge graphs (KGs) with first-order logic (FOL) queries is challenging due to the inherent incompleteness of real-world KGs and the compositional complexity of logical que...
DramaBench: A Six-Dimensional Evaluation Framework for Drama Script Continuation : Abstract: Drama script continuation requires models to maintain character consistency, advance plot coherently, and preserve dramatic structurecapabilities that existing benchmarks fail to evaluate co...
FASTRIC: Prompt Specification Language for Verifiable LLM Interactions : Abstract: Large Language Models (LLMs) execute complex multi-turn interaction protocols but lack formal specifications to verify execution against designer intent. We introduce FASTRIC, a Prompt Speci...
Remedy-R: Generative Reasoning for Machine Translation Evaluation without Error Annotations : Abstract: Over the years, automatic MT metrics have hillclimbed benchmarks and presented strong and sometimes human-level agreement with human ratings. Yet they remain black-box, offering little insig...
Toward Human-Centered AI-Assisted Terminology Work : Abstract: The rapid diffusion of generative artificial intelligence is transforming terminology work. While this technology promises gains in efficiency, its unstructured adoption risks weakening prof...
MDToC: Metacognitive Dynamic Tree of Concepts for Boosting Mathematical Problem-Solving of Large Language Models : Abstract: Despite advances in mathematical reasoning capabilities, Large Language Models (LLMs) still struggle with calculation verification when using established prompting techniques. We present MDT...
AraMix: Recycling, Refiltering, and Deduplicating to Deliver the Largest Arabic Pretraining Corpus : Abstract: We present AraMix, a deduplicated Arabic pretraining corpus containing approximately 178 billion tokens across 179 million documents. Rather than scraping the web again, AraMix demonstrates ...
From Word to World: Can Large Language Models be Implicit Text-based World Models? : Abstract: Agentic reinforcement learning increasingly relies on experience-driven scaling, yet real-world environments remain non-adaptive, limited in coverage, and difficult to scale. World models of...
From Natural Language to Control Signals: A Conceptual Framework for Semantic Channel Finding in Complex Experimental Infrastructure : Abstract: Modern experimental platforms such as particle accelerators, fusion devices, telescopes, and industrial process control systems expose tens to hundreds of thousands of control and diagnostic...
MemEvolve: Meta-Evolution of Agent Memory Systems : Abstract: Self-evolving memory systems are unprecedentedly reshaping the evolutionary paradigm of large language model (LLM)-based agents. Prior work has predominantly relied on manually engineered me...
Solver-Independent Automated Problem Formulation via LLMs for High-Cost Simulation-Driven Design : Abstract: In the high-cost simulation-driven design domain, translating ambiguous design requirements into a mathematical optimization formulation is a bottleneck for optimizing product performance. T...
Does It Tie Out? Towards Autonomous Legal Agents in Venture Capital : Abstract: Before closing venture capital financing rounds, lawyers conduct diligence that includes tying out the capitalization table: verifying that every security (for example, shares, options, warr...
A Comparative Study of Light-weight Language Models for PII Masking and their Deployment for Real Conversational Texts : Abstract: Automated masking of Personally Identifiable Information (PII) is critical for privacy-preserving conversational systems. While current frontier large language models demonstrate strong PII ...
On Finding Inconsistencies in Documents : Abstract: Professionals in academia, law, and finance audit their documents because inconsistencies can result in monetary, reputational, and scientific costs. Language models (LMs) have the potential...
Neologism Learning as a Parameter-Efficient Alternative to Fine-Tuning for Model Steering : Abstract: In language modeling, neologisms are new tokens trained to represent a concept not already included in a given model's vocabulary. Neologisms can be used to encourage specific behavior in mo...
LLMs on Drugs: Language Models Are Few-Shot Consumers : Abstract: Large language models (LLMs) are sensitive to the personas imposed on them at inference time, yet prompt-level "drug" interventions have never been benchmarked rigorously. We present the fir...
Teaching and Critiquing Conceptualization and Operationalization in NLP : Abstract: NLP researchers regularly invoke abstract concepts like "interpretability," "bias," "reasoning," and "stereotypes," without defining them. Each subfield has a shared understanding or concept...
SRS-Stories: Vocabulary-constrained multilingual story generation for language learning : Abstract: In this paper, we use large language models to generate personalized stories for language learners, using only the vocabulary they know. The generated texts are specifically written to teach...
DACE For Railway Acronym Disambiguation : Abstract: Acronym Disambiguation (AD) is a fundamental challenge in technical text processing, particularly in specialized sectors where high ambiguity complicates automated analysis. This paper addre...
Towards Efficient Agents: A Co-Design of Inference Architecture and System : Abstract: The rapid development of large language model (LLM)-based agents has unlocked new possibilities for autonomous multi-turn reasoning and tool-augmented decision-making. However, their real-wo...
LIR$^3$AG: A Lightweight Rerank Reasoning Strategy Framework for Retrieval-Augmented Generation : Abstract: Retrieval-Augmented Generation (RAG) effectively enhances Large Language Models (LLMs) by incorporating retrieved external knowledge into the generation process. Reasoning models improve LLM...
CTTA-T: Continual Test-Time Adaptation for Text Understanding via Teacher-Student with a Domain-aware and Generalized Teacher : Abstract: Text understanding often suffers from domain shifts. To handle testing domains, domain adaptation (DA) is trained to adapt to a fixed and observed testing domain; a more challenging paradigm...
InstructNet: A Novel Approach for Multi-Label Instruction Classification through Advanced Deep Learning : Abstract: People use search engines for various topics and items, from daily essentials to more aspirational and specialized objects. Therefore, search engines have taken over as peoples preferred res...
GeoSense-AI: Fast Location Inference from Crisis Microblogs : Abstract: This paper presents an applied AI pipeline for realtime geolocation from noisy microblog streams, unifying statistical hashtag segmentation, part-of-speech-driven proper-noun detection, depe...
Training LLMs with LogicReward for Faithful and Rigorous Reasoning : Abstract: Although LLMs exhibit strong reasoning capabilities, existing training methods largely depend on outcome-based feedback, which can produce correct answers with flawed reasoning. Prior work i...
Statistical laws and linguistics inform meaning in naturalistic and fictional conversation : Abstract: Conversation is a cornerstone of social connection and is linked to well-being outcomes. Conversations vary widely in type with some portion generating complex, dynamic stories. One approach...
CoPE: A Small Language Model for Steerable and Scalable Content Labeling : Abstract: This paper details the methodology behind CoPE, a policy-steerable small language model capable of fast and accurate content labeling. We present a novel training curricula called Contradict...
Q-KVComm: Efficient Multi-Agent Communication Via Adaptive KV Cache Compression : Abstract: Multi-agent Large Language Model (LLM) systems face a critical bottleneck: redundant transmission of contextual information between agents consumes excessive bandwidth and computational reso...
Any-Time Regret-Guaranteed Algorithm for Control of Linear Quadratic Systems : Abstract: We propose a computationally efficient algorithm that achieves anytime regret of order $\mathcal{O}(\sqrt{t})$, with explicit dependence on the system dimensions and on the solution of the D...
Training robust and generalizable quantum models : Abstract: Adversarial robustness and generalization are both crucial properties of reliable machine learning models. In this paper, we study these properties in the context of quantum machine learning...
Structure of Classifier Boundaries: Case Study for a Naive Bayes Classifier : Abstract: Classifiers assign complex input data points to one of a small number of output categories. For a Bayes classifier whose input space is a graph, we study the structure of the \emph{boundary}...
Variational Online Mirror Descent for Robust Learning in Schr\"odinger Bridge : Abstract: The Schrödinger bridge (SB) has evolved into a universal class of probabilistic generative models. In practice, however, estimated learning signals are innately uncertain, and the reliabilit...
Communication-Efficient and Privacy-Adaptable Mechanism for Federated Learning : Abstract: Training machine learning models on decentralized private data via federated learning (FL) poses two key challenges: communication efficiency and privacy protection. In this work, we address...
PowerMamba: A Deep State Space Model and Comprehensive Benchmark for Time Series Prediction in Electric Power Systems : Abstract: The electricity sector is undergoing substantial transformations due to the rising electrification of demand, enhanced integration of renewable energy resources, and the emergence of new tec...
Equivariant Polynomial Functional Networks : Abstract: Neural Functional Networks (NFNs) have gained increasing interest due to their wide range of applications, including extracting information from implicit representations of data, editing net...
A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler : Abstract: Code optimization is a crucial task that aims to enhance code performance. However, this process is often tedious and complex, highlighting the necessity for automatic code optimization tech...
Continuum Attention for Neural Operators : Abstract: Transformers, and the attention mechanism in particular, have become ubiquitous in machine learning. Their success in modeling nonlocal, long-range correlations has led to their widespread a...
IT Intrusion Detection Using Statistical Learning and Testbed Measurements : Abstract: We study automated intrusion detection in an IT infrastructure, specifically the problem of identifying the start of an attack, the type of attack, and the sequence of actions an attacker ta...
Averaging $n$-step Returns Reduces Variance in Reinforcement Learning : Abstract: Multistep returns, such as $n$-step returns and $λ$-returns, are commonly used to improve the sample efficiency of reinforcement learning (RL) methods. The variance of the multistep returns ...
HUTFormer: Hierarchical U-Net Transformer for Long-Term Traffic Forecasting : Abstract: Traffic forecasting, which aims to predict traffic conditions based on historical observations, has been an enduring research topic and is widely recognized as an essential component of inte...
Anti-Correlated Noise in Epoch-Based Stochastic Gradient Descent: Implications for Weight Variances in Flat Directions : Abstract: Stochastic Gradient Descent (SGD) has become a cornerstone of neural network optimization due to its computational efficiency and generalization capabilities. However, the gradient noise int...
Neural Exploitation and Exploration of Contextual Bandits : Abstract: In this paper, we study utilizing neural networks for the exploitation and exploration of contextual multi-armed bandits. Contextual multi-armed bandits have been studied for decades with va...
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning : Abstract: Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, but counteracting off-policy bias without exacerbating variance is challenging. Classically...
Pushing the Frontier of Audiovisual Perception with Large-Scale Multimodal Correspondence Learning : Abstract: We introduce Perception Encoder Audiovisual, PE-AV, a new family of encoders for audio and video understanding trained with scaled contrastive learning. Built on PE, PE-AV makes several key ...
Active Convolved Illumination with Deep Transfer Learning for Complex Beam Transmission through Atmospheric Turbulence : Abstract: Atmospheric turbulence imposes a fundamental limitation across a broad range of applications, including optical imaging, remote sensing, and free-space optical communication. Recent advances...
GLUE: Generative Latent Unification of Expertise-Informed Engineering Models : Abstract: Engineering complex systems (aircraft, buildings, vehicles) requires accounting for geometric and performance couplings across subsystems. As generative models proliferate for specialized do...
Real-Time Streamable Generative Speech Restoration with Flow Matching : Abstract: Diffusion-based generative models have greatly impacted the speech processing field in recent years, exhibiting high speech naturalness and spawning a new research direction. Their applicati...
A Critical Assessment of Pattern Comparisons Between POD and Autoencoders in Intraventricular Flows : Abstract: Understanding intraventricular hemodynamics requires compact and physically interpretable representations of the underlying flow structures, as characteristic flow patterns are closely assoc...
Cluster-Based Generalized Additive Models Informed by Random Fourier Features : Abstract: Explainable machine learning aims to strike a balance between prediction accuracy and model transparency, particularly in settings where black-box predictive models, such as deep neural netw...
Faster Distributed Inference-Only Recommender Systems via Bounded Lag Synchronous Collectives : Abstract: Recommender systems are enablers of personalized content delivery, and therefore revenue, for many large companies. In the last decade, deep learning recommender models (DLRMs) are the de-fa...
Orthogonal Approximate Message Passing with Optimal Spectral Initializations for Rectangular Spiked Matrix Models : Abstract: We propose an orthogonal approximate message passing (OAMP) algorithm for signal estimation in the rectangular spiked matrix model with general rotationally invariant (RI) noise. We establis...
GShield: Mitigating Poisoning Attacks in Federated Learning : Abstract: Federated Learning (FL) has recently emerged as a revolutionary approach to collaborative training Machine Learning models. In particular, it enables decentralized model training while prese...
Translating Flow to Policy via Hindsight Online Imitation : Abstract: Recent advances in hierarchical robot systems leverage a high-level planner to propose task plans and a low-level policy to generate robot actions. This design allows training the planner on...
Self-Consistent Probability Flow for High-Dimensional Fokker-Planck Equations : Abstract: Solving high-dimensional Fokker-Planck (FP) equations is a challenge in computational physics and stochastic dynamics, due to the curse of dimensionality (CoD) and the bottleneck of evaluati...
PEDESTRIAN: An Egocentric Vision Dataset for Obstacle Detection on Pavements : Abstract: Walking has always been a primary mode of transportation and is recognized as an essential activity for maintaining good health. Despite the need for safe walking conditions in urban environ...
Finite-sample guarantees for data-driven forward-backward operator methods : Abstract: We establish finite sample certificates on the quality of solutions produced by data-based forward-backward (FB) operator splitting schemes. As frequently happens in stochastic regimes, we c...
Evidential Trust-Aware Model Personalization in Decentralized Federated Learning for Wearable IoT : Abstract: Decentralized federated learning (DFL) enables collaborative model training across edge devices without centralized coordination, offering resilience against single points of failure. Howeve...
SAP: Syntactic Attention Pruning for Transformer-based Language Models : Abstract: This paper introduces Syntactic Attention Pruning (SAP), a novel method for effectively pruning attention heads in Transformer models. Unlike conventional approaches that rely solely on math...
Explicit and Non-asymptotic Query Complexities of Rank-Based Zeroth-order Algorithm on Stochastic Smooth Functions : Abstract: Zeroth-order (ZO) optimization with ordinal feedback has emerged as a fundamental problem in modern machine learning systems, particularly in human-in-the-loop settings such as reinforcement...
Auditing Significance, Metric Choice, and Demographic Fairness in Medical AI Challenges : Abstract: Open challenges have become the de facto standard for comparative ranking of medical AI methods. Despite their importance, medical AI leaderboards exhibit three persistent limitations: (1) s...
On Cost-Aware Sequential Hypothesis Testing with Random Costs and Action Cancellation : Abstract: We study a variant of cost-aware sequential hypothesis testing in which a single active Decision Maker (DM) selects actions with positive, random costs to identify the true hypothesis under ...
Elevating Intrusion Detection and Security Fortification in Intelligent Networks through Cutting-Edge Machine Learning Paradigms : Abstract: The proliferation of IoT devices and their reliance on Wi-Fi networks have introduced significant security vulnerabilities, particularly the KRACK and Kr00k attacks, which exploit weaknesses...
CETCAM: Camera-Controllable Video Generation via Consistent and Extensible Tokenization : Abstract: Achieving precise camera control in video generation remains challenging, as existing methods often rely on camera pose annotations that are difficult to scale to large and dynamic datasets ...
On Conditional Stochastic Interpolation for Generative Nonlinear Sufficient Dimension Reduction : Abstract: Identifying low-dimensional sufficient structures in nonlinear sufficient dimension reduction (SDR) has long been a fundamental yet challenging problem. Most existing methods lack theoretica...
Application of deep learning approaches for medieval historical documents transcription : Abstract: Handwritten text recognition and optical character recognition solutions show excellent results with processing data of modern era, but efficiency drops with Latin documents of medieval time...
RIS-Enabled Smart Wireless Environments: Fundamentals and Distributed Optimization : Abstract: This chapter overviews the concept of Smart Wireless Environments (SWEs) motivated by the emerging technology of Reconfigurable Intelligent Surfaces (RISs). The operating principles and stat...
Eff-GRot: Efficient and Generalizable Rotation Estimation with Transformers : Abstract: We introduce Eff-GRot, an approach for efficient and generalizable rotation estimation from RGB images. Given a query image and a set of reference images with known orientations, our method ...
InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search : Abstract: The ability for AI agents to "think with images" requires a sophisticated blend of reasoning and perception. However, current open multimodal agents still largely fall short on the reasoning...
Unsupervised Feature Selection via Robust Autoencoder and Adaptive Graph Learning : Abstract: Effective feature selection is essential for high-dimensional data analysis and machine learning. Unsupervised feature selection (UFS) aims to simultaneously cluster data and identify the mo...
Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis : Abstract: Recent advances in text-to-speech (TTS) have yielded remarkable improvements in naturalness and intelligibility. Building on these achievements, research has increasingly shifted toward enha...
PMPGuard: Catching Pseudo-Matched Pairs in Remote Sensing Image-Text Retrieval : Abstract: Remote sensing (RS) image-text retrieval faces significant challenges in real-world datasets due to the presence of Pseudo-Matched Pairs (PMPs), semantically mismatched or weakly aligned ima...
Scaling up Stability: Reinforcement Learning for Distributed Control of Networked Systems in the Space of Stabilizing Policies : Abstract: We study distributed control of networked systems through reinforcement learning, where neural policies must be simultaneously scalable, expressive and stabilizing. We introduce a policy par...
Generalization Gaps in Political Fake News Detection: An Empirical Study on the LIAR Dataset : Abstract: The proliferation of linguistically subtle political disinformation poses a significant challenge to automated fact-checking systems. Despite increasing emphasis on complex neural architectu...
Pushing the limits of one-dimensional NMR spectroscopy for automated structure elucidation using artificial intelligence : Abstract: One-dimensional NMR spectroscopy is one of the most widely used techniques for the characterization of organic compounds and natural products. For molecules with up to 36 non-hydrogen atoms,...
NASTaR: NovaSAR Automated Ship Target Recognition Dataset : Abstract: Synthetic Aperture Radar (SAR) offers a unique capability for all-weather, space-based maritime activity monitoring by capturing and imaging strong reflections from ships at sea. A well-defi...
Research on a hybrid LSTM-CNN-Attention model for text-based web content classification : Abstract: This study presents a hybrid deep learning architecture that integrates LSTM, CNN, and an Attention mechanism to enhance the classification of web content based on text. Pretrained GloVe emb...
Automated Mosaic Tesserae Segmentation via Deep Learning Techniques : Abstract: Art is widely recognized as a reflection of civilization and mosaics represent an important part of cultural heritage. Mosaics are an ancient art form created by arranging small pieces, call...
PSI3D: Plug-and-Play 3D Stochastic Inference with Slice-wise Latent Diffusion Prior : Abstract: Diffusion models are highly expressive image priors for Bayesian inverse problems. However, most diffusion models cannot operate on large-scale, high-dimensional data due to high training an...
Efficient Zero-Shot Inpainting with Decoupled Diffusion Guidance : Abstract: Diffusion models have emerged as powerful priors for image editing tasks such as inpainting and local modification, where the objective is to generate realistic content that remains consiste...
A two-stream network with global-local feature fusion for bone age assessment : Abstract: Bone Age Assessment (BAA) is a widely used clinical technique that can accurately reflect an individual's growth and development level, as well as maturity. In recent years, although deep le...
CrystalFormer-CSP: Thinking Fast and Slow for Crystal Structure Prediction : Abstract: Crystal structure prediction is a fundamental problem in materials science. We present CrystalFormer-CSP, an efficient framework that unifies data-driven heuristic and physics-driven optimiz...
AutoSchA: Automatic Hierarchical Music Representations via Multi-Relational Node Isolation : Abstract: Hierarchical representations provide powerful and principled approaches for analyzing many musical genres. Such representations have been broadly studied in music theory, for instance via Sc...
Dimensionality Reduction Considered Harmful (Some of the Time) : Abstract: Visual analytics now plays a central role in decision-making across diverse disciplines, but it can be unreliable: the knowledge or insights derived from the analysis may not accurately refl...
Toward Efficient Testing of Graph Neural Networks via Test Input Prioritization : Abstract: Graph Neural Networks (GNNs) have demonstrated remarkable efficacy in handling graph-structured data; however, they exhibit failures after deployment, which can cause severe consequences. He...
Estimating Solvation Free Energies with Boltzmann Generators : Abstract: Accurate calculations of solvation free energies remain a central challenge in molecular simulations, often requiring extensive sampling and numerous alchemical intermediates to ensure suffi...
Optimal Software Pipelining and Warp Specialization for Tensor Core GPUs : Abstract: GPU architectures have continued to grow in complexity, with recent incarnations introducing increasingly powerful fixed-function units for matrix multiplication and data movement to accompa...
Exploring polymer classification with a hybrid single-photon quantum approach : Abstract: Polymers exhibit complex architectures and diverse properties that place them at the center of contemporary research in chemistry and materials science. As conventional computational techniq...
From Coverage to Causes: Data-Centric Fuzzing for JavaScript Engines : Abstract: Context: Exhaustive fuzzing of modern JavaScript engines is infeasible due to the vast number of program states and execution paths. Coverage-guided fuzzers waste effort on low-risk inputs, ...
Causal Inference as Distribution Adaptation: Optimizing ATE Risk under Propensity Uncertainty : Abstract: Standard approaches to causal inference, such as Outcome Regression and Inverse Probability Weighted Regression Adjustment (IPWRA), are typically derived through the lens of missing data imp...
Graph-based Nearest Neighbors with Dynamic Updates via Random Walks : Abstract: Approximate nearest neighbor search (ANN) is a common way to retrieve relevant search results, especially now in the context of large language models and retrieval augmented generation. One ...
Approximation and learning with compositional tensor trains : Abstract: We introduce compositional tensor trains (CTTs) for the approximation of multivariate functions, a class of models obtained by composing low-rank functions in the tensor-train format. This f...
Narrative Consolidation: Formulating a New Task for Unifying Multi-Perspective Accounts : Abstract: Processing overlapping narrative documents, such as legal testimonies or historical accounts, often aims not for compression but for a unified, coherent, and chronologically sound text. Stan...
NodMAISI: Nodule-Oriented Medical AI for Synthetic Imaging : Abstract: Objective: Although medical imaging datasets are increasingly available, abnormal and annotation-intensive findings critical to lung cancer screening, particularly small pulmonary nodules, r...
Long-range electrostatics for machine learning interatomic potentials is easier than we thought : Abstract: The lack of long-range electrostatics is a key limitation of modern machine learning interatomic potentials (MLIPs), hindering reliable applications to interfaces, charge-transfer reactions,...
Shuttling Compiler for Trapped-Ion Quantum Computers Based on Large Language Models : Abstract: Trapped-ion quantum computers based on segmented traps rely on shuttling operations to establish connectivity between multiple sub-registers within a quantum processing unit. Several archite...
Enhancing Tea Leaf Disease Recognition with Attention Mechanisms and Grad-CAM Visualization : Abstract: Tea is among the most widely consumed drinks globally. Tea production is a key industry for many countries. One of the main challenges in tea harvesting is tea leaf diseases. If the spread o...
MEGState: Phoneme Decoding from Magnetoencephalography Signals : Abstract: Decoding linguistically meaningful representations from non-invasive neural recordings remains a central challenge in neural speech decoding. Among available neuroimaging modalities, magneto...
Sampling from multimodal distributions with warm starts: Non-asymptotic bounds for the Reweighted Annealed Leap-Point Sampler : Abstract: Sampling from multimodal distributions is a central challenge in Bayesian inference and machine learning. In light of hardness results for sampling -- classical MCMC methods, even with tempe...
SCS-SupCon: Sigmoid-based Common and Style Supervised Contrastive Learning with Adaptive Decision Boundaries : Abstract: Image classification is hindered by subtle inter-class differences and substantial intra-class variations, which limit the effectiveness of existing contrastive learning methods. Supervised ...
Risk-Aware Financial Forecasting Enhanced by Machine Learning and Intuitionistic Fuzzy Multi-Criteria Decision-Making : Abstract: In the face of increasing financial uncertainty and market complexity, this study presents a novel risk-aware financial forecasting framework that integrates advanced machine learning techni...
chatter: a Python library for applying information theory and AI/ML models to animal communication : Abstract: The study of animal communication often involves categorizing units into types (e.g. syllables in songbirds, or notes in humpback whales). While this approach is useful in many cases, it nec...
A curated UK rain radar data set for training and benchmarking nowcasting models : Abstract: This paper documents a data set of UK rain radar image sequences for use in statistical modeling and machine learning methods for nowcasting. The main dataset contains 1,000 randomly sampled...
QAISim: A Toolkit for Modeling and Simulation of AI in Quantum Cloud Computing Environments : Abstract: Quantum computing offers new ways to explore the theory of computation via the laws of quantum mechanics. Due to the rising demand for quantum computing resources, there is growing interest ...
Supplementary Resources and Analysis for Automatic Speech Recognition Systems Trained on the Loquacious Dataset : Abstract: The recently published Loquacious dataset aims to be a replacement for established English automatic speech recognition (ASR) datasets such as LibriSpeech or TED-Lium. The main goal of the L...
Deep Legendre Transform : Abstract: We introduce a novel deep learning algorithm for computing convex conjugates of differentiable convex functions, a fundamental operation in convex analysis with various applications in diffe...
The Best of Both Worlds: Hybridizing Neural Operators and Solvers for Stable Long-Horizon Inference : Abstract: Numerical simulation of time-dependent partial differential equations (PDEs) is central to scientific and engineering applications, but high-fidelity solvers are often prohibitively expensiv...
KerJEPA: Kernel Discrepancies for Euclidean Self-Supervised Learning : Abstract: Recent breakthroughs in self-supervised Joint-Embedding Predictive Architectures (JEPAs) have established that regularizing Euclidean representations toward isotropic Gaussian priors yields ...
DFORD: Directional Feedback based Online Ordinal Regression Learning : Abstract: In this paper, we introduce directional feedback in the ordinal regression setting, in which the learner receives feedback on whether the predicted label is on the left or the right side of ...
Deep Learning for Unrelated-Machines Scheduling: Handling Variable Dimensions : Abstract: Deep learning has been effectively applied to many discrete optimization problems. However, learning-based scheduling on unrelated parallel machines remains particularly difficult to design....
Initialization of a Polyharmonic Cascade, Launch and Testing : Abstract: This paper concludes a series of studies on the polyharmonic cascade, a deep machine learning architecture theoretically derived from indifference principles and the theory of random functio...
Toward Scalable and Valid Conditional Independence Testing with Spectral Representations : Abstract: Conditional independence (CI) is central to causal inference, feature selection, and graphical modeling, yet it is untestable in many settings without additional assumptions. Existing CI tes...
Learning from sanctioned government suppliers: A machine learning and network science approach to detecting fraud and corruption in Mexico : Abstract: Detecting fraud and corruption in public procurement remains a major challenge for governments worldwide. Most research to-date builds on domain-knowledge-based corruption risk indicators of...
Lightweight Intrusion Detection in IoT via SHAP-Guided Feature Pruning and Knowledge-Distilled Kronecker Networks : Abstract: The widespread deployment of Internet of Things (IoT) devices requires intrusion detection systems (IDS) with high accuracy while operating under strict resource constraints. Conventional de...
Binary Kernel Logistic Regression: a sparsity-inducing formulation and a convergent decomposition training algorithm : Abstract: Kernel logistic regression (KLR) is a widely used supervised learning method for binary and multi-class classification, which provides estimates of the conditional probabilities of class mem...
An Inverse Scattering Inspired Fourier Neural Operator for Time-Dependent PDE Learning : Abstract: Learning accurate and stable time-advancement operators for nonlinear partial differential equations (PDEs) remains challenging, particularly for chaotic, stiff, and long-horizon dynamical s...
Symplectic Reservoir Representation of Legendre Dynamics : Abstract: Modern learning systems act on internal representations of data, yet how these representations encode underlying physical or statistical structure is often left implicit. In physics, conserv...
Brain-Grounded Axes for Reading and Steering LLM States : Abstract: Interpretability methods for large language models (LLMs) typically derive directions from textual supervision, which can lack external grounding. We propose using human brain activity not a...
Real-Time Machine Learning for Embedded Anomaly Detection : Abstract: The spread of a resource-constrained Internet of Things (IoT) environment and embedded devices has put pressure on the real-time detection of anomalies occurring at the edge. This survey pre...
From Points to Coalitions: Hierarchical Contrastive Shapley Values for Prioritizing Data Samples : Abstract: How should we quantify the value of each training example when datasets are large, heterogeneous, and geometrically structured? Classical Data-Shapley answers in principle, but its O(n!) com...
Interpretable Hybrid Deep Q-Learning Framework for IoT-Based Food Spoilage Prediction with Synthetic Data Generation and Hardware Validation : Abstract: The need for an intelligent, real-time spoilage prediction system has become critical in modern IoT-driven food supply chains, where perishable goods are highly susceptible to environmental ...
A Logical View of GNN-Style Computation and the Role of Activation Functions : Abstract: We study the numerical and Boolean expressiveness of MPLang, a declarative language that captures the computation of graph neural networks (GNNs) through linear message passing and activatio...
Time-Vertex Machine Learning for Optimal Sensor Placement in Temporal Graph Signals: Applications in Structural Health Monitoring : Abstract: Structural Health Monitoring (SHM) plays a crucial role in maintaining the safety and resilience of infrastructure. As sensor networks grow in scale and complexity, identifying the most info...
Small Language Models as Compiler Experts: Auto-Parallelization for Heterogeneous Systems : Abstract: Traditional auto-parallelizing compilers, reliant on rigid heuristics, struggle with the complexity of modern heterogeneous systems. This paper presents a comprehensive evaluation of small (...
From Black-Box Tuning to Guided Optimization via Hyperparameters Interaction Analysis : Abstract: Hyperparameters tuning is a fundamental, yet computationally expensive, step in optimizing machine learning models. Beyond optimization, understanding the relative importance and interaction...
Regression generation adversarial network based on dual data evaluation strategy for industrial application : Abstract: Soft sensing infers hard-to-measure data through a large number of easily obtainable variables. However, in complex industrial scenarios, the issue of insufficient data volume persists, whic...
Phase-space entropy at acquisition reflects downstream learnability : Abstract: Modern learning systems work with data that vary widely across domains, but they all ultimately depend on how much structure is already present in the measurements before any model is traine...
Causal Heterogeneous Graph Learning Method for Chronic Obstructive Pulmonary Disease Prediction : Abstract: Due to the insufficient diagnosis and treatment capabilities at the grassroots level, there are still deficiencies in the early identification and early warning of acute exacerbation of Chro...
RP-CATE: Recurrent Perceptron-based Channel Attention Transformer Encoder for Industrial Hybrid Modeling : Abstract: Nowadays, industrial hybrid modeling which integrates both mechanistic modeling and machine learning-based modeling techniques has attracted increasing interest from scholars due to its high...
A Convex Loss Function for Set Prediction with Optimal Trade-offs Between Size and Conditional Coverage : Abstract: We consider supervised learning problems in which set predictions provide explicit uncertainty estimates. Using Choquet integrals (a.k.a. Lov{á}sz extensions), we propose a convex loss funct...
A Composable Channel-Adaptive Architecture for Seizure Classification : Abstract: Objective: We develop a channel-adaptive (CA) architecture that seamlessly processes multi-variate time-series with an arbitrary number of channels, and in particular intracranial electroenc...
Timely Parameter Updating in Over-the-Air Federated Learning : Abstract: Incorporating over-the-air computations (OAC) into the model training process of federated learning (FL) is an effective approach to alleviating the communication bottleneck in FL systems. U...
Dual Model Deep Learning for Alzheimer Prognostication : Abstract: Disease modifying therapies for Alzheimer's disease demand precise timing decisions, yet current predictive models require longitudinal observations and provide no uncertainty quantification...
Efficient Personalization of Generative Models via Optimal Experimental Design : Abstract: Preference learning from human feedback has the ability to align generative models with the needs of end-users. Human feedback is costly and time-consuming to obtain, which creates demand fo...
Time-series Forecast for Indoor Zone Air Temperature with Long Horizons: A Case Study with Sensor-based Data from a Smart Building : Abstract: With the press of global climate change, extreme weather and sudden weather changes are becoming increasingly common. To maintain a comfortable indoor environment and minimize the contributi...
A Surrogate-Augmented Symbolic CFD-Driven Training Framework for Accelerating Multi-objective Physical Model Development : Abstract: Computational Fluid Dynamics (CFD)-driven training combines machine learning (ML) with CFD solvers to develop physically consistent closure models with improved predictive accuracy. In the o...
Optimizer Dynamics at the Edge of Stability with Differential Privacy : Abstract: Deep learning models can reveal sensitive information about individual training examples, and while differential privacy (DP) provides guarantees restricting such leakage, it also alters opt...
OPBO: Order-Preserving Bayesian Optimization : Abstract: Bayesian optimization is an effective method for solving expensive black-box optimization problems. Most existing methods use Gaussian processes (GP) as the surrogate model for approximating...
Outlier detection in mixed-attribute data: a semi-supervised approach with fuzzy approximations and relative entropy : Abstract: Outlier detection is a critical task in data mining, aimed at identifying objects that significantly deviate from the norm. Semi-supervised methods improve detection performance by leveragin...
Consistency-guided semi-supervised outlier detection in heterogeneous data using fuzzy rough sets : Abstract: Outlier detection aims to find samples that behave differently from the majority of the data. Semi-supervised detection methods can utilize the supervision of partial labels, thus reducing f...
Lag Operator SSMs: A Geometric Framework for Structured State Space Modeling : Abstract: Structured State Space Models (SSMs), which are at the heart of the recently popular Mamba architecture, are powerful tools for sequence modeling. However, their theoretical foundation relie...
Scaling Online Distributionally Robust Reinforcement Learning: Sample-Efficient Guarantees with General Function Approximation : Abstract: The deployment of reinforcement learning (RL) agents in real-world applications is often hindered by performance degradation caused by mismatches between training and deployment environments...
Learning Through Little Eyes: Attribute Discrimination Beyond Objects : Abstract: Infants learn to recognize not only object categories but also fine grained attributes such as color, size, and texture within their first two years of life. Prior work explores Childs View ...
DPSR: Differentially Private Sparse Reconstruction via Multi-Stage Denoising for Recommender Systems : Abstract: Differential privacy (DP) has emerged as the gold standard for protecting user data in recommender systems, but existing privacy-preserving mechanisms face a fundamental challenge: the priva...
The Ensemble Schr{\"o}dinger Bridge filter for Nonlinear Data Assimilation : Abstract: This work puts forward a novel nonlinear optimal filter namely the Ensemble Schr{ö}dinger Bridge nonlinear filter. The proposed filter finds marriage of the standard prediction procedure and...
Merging of Kolmogorov-Arnold networks trained on disjoint datasets : Abstract: Training on disjoint datasets can serve two primary goals: accelerating data processing and enabling federated learning. It has already been established that Kolmogorov-Arnold networks (KANs...
Generative Modeling through Spectral Analysis of Koopman Operator : Abstract: We propose Koopman Spectral Wasserstein Gradient Descent (KSWGD), a generative modeling framework that combines operator-theoretic spectral analysis with optimal transport. The novel insight...
Label-Informed Outlier Detection Based on Granule Density : Abstract: Outlier detection, crucial for identifying unusual patterns with significant implications across numerous applications, has drawn considerable research interest. Existing semi-supervised met...
Gaussian-Mixture-Model Q-Functions for Policy Iteration in Reinforcement Learning : Abstract: Unlike their conventional use as estimators of probability density functions in reinforcement learning (RL), this paper introduces a novel function-approximation role for Gaussian mixture mo...
Is Your Conditional Diffusion Model Actually Denoising? : Abstract: We study the inductive biases of diffusion models with a conditioning-variable, which have seen widespread application as both text-conditioned generative image models and observation-condit...
A Theoretical Lens for RL-Tuned Language Models via Energy-Based Models : Abstract: Large language models (LLMs) trained via KL-regularized reinforcement learning demonstrate strong instruction following, self-correction, and reasoning abilities. Yet their theoretical under...
ML Inference Scheduling with Predictable Latency : Abstract: Machine learning (ML) inference serving systems can schedule requests to improve GPU utilization and to meet service level objectives (SLOs) or deadlines. However, improving GPU utilization ...
Generating Risky Samples with Conformity Constraints via Diffusion Models : Abstract: Although neural networks achieve promising performance in many tasks, they may still fail when encountering some examples and bring about risks to applications. To discover risky samples, pr...
Improving Pattern Recognition of Scheduling Anomalies through Structure-Aware and Semantically-Enhanced Graphs : Abstract: This paper proposes a structure-aware driven scheduling graph modeling method to improve the accuracy and representation capability of anomaly identification in scheduling behaviors of compl...
Demonstration-Guided Continual Reinforcement Learning in Dynamic Environments : Abstract: Reinforcement learning (RL) excels in various applications but struggles in dynamic environments where the underlying Markov decision process evolves. Continual reinforcement learning (CRL) ...
From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers : Abstract: Transformers can implement both generalizable algorithms (e.g., induction heads) and simple positional shortcuts (e.g., memorizing fixed output positions). In this work, we study how the cho...
The Procrustean Bed of Time Series: The Optimization Bias of Point-wise Loss : Abstract: Optimizing time series models via point-wise loss functions (e.g., MSE) relying on a flawed point-wise independent and identically distributed (i.i.d.) assumption that disregards the causal ...
Trajectory Planning for UAV-Based Smart Farming Using Imitation-Based Triple Deep Q-Learning : Abstract: Unmanned aerial vehicles (UAVs) have emerged as a promising auxiliary platform for smart agriculture, capable of simultaneously performing weed detection, recognition, and data collection fr...
EIA-SEC: Improved Actor-Critic Framework for Multi-UAV Collaborative Control in Smart Agriculture : Abstract: The widespread application of wireless communication technology has promoted the development of smart agriculture, where unmanned aerial vehicles (UAVs) play a multifunctional role. We targe...
Benchmarking neural surrogates on realistic spatiotemporal multiphysics flows : Abstract: Predicting multiphysics dynamics is computationally expensive and challenging due to the severe coupling of multi-scale, heterogeneous physical processes. While neural surrogates promise a p...
SD2AIL: Adversarial Imitation Learning from Synthetic Demonstrations via Diffusion Models : Abstract: Adversarial Imitation Learning (AIL) is a dominant framework in imitation learning that infers rewards from expert demonstrations to guide policy optimization. Although providing more expert...
Comparing Dynamical Models Through Diffeomorphic Vector Field Alignment : Abstract: Dynamical systems models such as recurrent neural networks (RNNs) are increasingly popular in theoretical neuroscience for hypothesis-generation and data analysis. Evaluating the dynamics in...
Feature-Enhanced Graph Neural Networks for Classification of Synthetic Graph Generative Models: A Benchmarking Study : Abstract: The ability to discriminate between generative graph models is critical to understanding complex structural patterns in both synthetic graphs and the real-world structures that they emulate....
APC-GNN++: An Adaptive Patient-Centric GNN with Context-Aware Attention and Mini-Graph Explainability for Diabetes Classification : Abstract: We propose APC-GNN++, an adaptive patient-centric Graph Neural Network for diabetes classification. Our model integrates context-aware edge attention, confidence-guided blending of node feat...
The Geometry of Abstraction: Continual Learning via Recursive Quotienting : Abstract: Continual learning systems operating in fixed-dimensional spaces face a fundamental geometric barrier: the flat manifold problem. When experience is represented as a linear trajectory in Euc...
Out-of-Distribution Detection in Molecular Complexes via Diffusion Models for Irregular Graphs : Abstract: Predictive machine learning models generally excel on in-distribution data, but their performance degrades on out-of-distribution (OOD) inputs. Reliable deployment therefore requires robust ...
NOVA: Discovering Well-Conditioned Winograd Transforms through Numerical Optimization of Vandermonde Arithmetic : Abstract: Winograd convolution is the standard algorithm for efficient inference, reducing arithmetic complexity by 2.25x for 3x3 kernels. However, it faces a critical barrier in the modern era of low...
On the Universality of Transformer Architectures; How Much Attention Is Enough? : Abstract: Transformers are crucial across many AI fields, such as large language models, computer vision, and reinforcement learning. This prominence stems from the architecture's perceived universali...
MoE Pathfinder: Trajectory-driven Expert Pruning : Abstract: Mixture-of-experts (MoE) architectures used in large language models (LLMs) achieve state-of-the-art performance across diverse tasks yet face practical challenges such as deployment complex...
Why Most Optimism Bandit Algorithms Have the Same Regret Analysis: A Simple Unifying Theorem : Abstract: Several optimism-based stochastic bandit algorithms -- including UCB, UCB-V, linear UCB, and finite-arm GP-UCB -- achieve logarithmic regret using proofs that, despite superficial difference...
The Challenger: When Do New Data Sources Justify Switching Machine Learning Models? : Abstract: We study the problem of deciding whether, and when an organization should replace a trained incumbent model with a challenger relying on newly available features. We develop a unified econom...
Towards Guided Descent: Optimization Algorithms for Training Neural Networks At Scale : Abstract: Neural network optimization remains one of the most consequential yet poorly understood challenges in modern AI research, where improvements in training algorithms can lead to enhanced featu...
FedSUM Family: Efficient Federated Learning Methods under Arbitrary Client Participation : Abstract: Federated Learning (FL) methods are often designed for specific client participation patterns, limiting their applicability in practical deployments. We introduce the FedSUM family of algori...
LeJOT: An Intelligent Job Cost Orchestration Solution for Databricks Platform : Abstract: With the rapid advancements in big data technologies, the Databricks platform has become a cornerstone for enterprises and research institutions, offering high computational efficiency and a...
On the Convergence Rate of LoRA Gradient Descent : Abstract: The low-rank adaptation (LoRA) algorithm for fine-tuning large models has grown popular in recent years due to its remarkable performance and low computational requirements. LoRA trains two ...
FairExpand: Individual Fairness on Graphs with Partial Similarity Information : Abstract: Individual fairness, which requires that similar individuals should be treated similarly by algorithmic systems, has become a central principle in fair machine learning. Individual fairness ...
Conscious Data Contribution via Community-Driven Chain-of-Thought Distillation : Abstract: The current era of AI development places a heavy emphasis on training large models on increasingly scaled-up datasets. This paradigm has catalyzed entirely new product categories, such as LL...
TraCeR: Transformer-Based Competing Risk Analysis with Longitudinal Covariates : Abstract: Survival analysis is a critical tool for modeling time-to-event data. Recent deep learning-based models have reduced various modeling assumptions including proportional hazard and linearity....
Learning Generalizable Neural Operators for Inverse Problems : Abstract: Inverse problems challenge existing neural operator architectures because ill-posed inverse maps violate continuity, uniqueness, and stability assumptions. We introduce B2B${}^{-1}$, an inve...
Microstructure-based Variational Neural Networks for Robust Uncertainty Quantification in Materials Digital Twins : Abstract: Aleatoric uncertainties - irremovable variability in microstructure morphology, constituent behavior, and processing conditions - pose a major challenge to developing uncertainty-robust digi...
Probabilistic Digital Twins of Users: Latent Representation Learning with Statistically Validated Semantics : Abstract: Understanding user identity and behavior is central to applications such as personalization, recommendation, and decision support. Most existing approaches rely on deterministic embeddings o...
Towards Benchmarking Privacy Vulnerabilities in Selective Forgetting with Large Language Models : Abstract: The rapid advancements in artificial intelligence (AI) have primarily focused on the process of learning from data to acquire knowledgeable learning systems. As these systems are increasingl...
FedOAED: Federated On-Device Autoencoder Denoiser for Heterogeneous Data under Limited Client Availability : Abstract: Over the last few decades, machine learning (ML) and deep learning (DL) solutions have demonstrated their potential across many applications by leveraging large amounts of high-quality data....
MoE-TransMov: A Transformer-based Model for Next POI Prediction in Familiar & Unfamiliar Movements : Abstract: Accurate prediction of the next point of interest (POI) within human mobility trajectories is essential for location-based services, as it enables more timely and personalized recommendation...
What's the Price of Monotonicity? A Multi-Dataset Benchmark of Monotone-Constrained Gradient Boosting for Credit PD : Abstract: Financial institutions face a trade-off between predictive accuracy and interpretability when deploying machine learning models for credit risk. Monotonicity constraints align model behavior...
Microsoft Academic Graph Information Retrieval for Research Recommendation and Assistance : Abstract: In today's information-driven world, access to scientific publications has become increasingly easy. At the same time, filtering through the massive volume of available research has become m...
The Universe Learning Itself: On the Evolution of Dynamics from the Big Bang to Machine Intelligence : Abstract: We develop a unified, dynamical-systems narrative of the universe that traces a continuous chain of structure formation from the Big Bang to contemporary human societies and their artificial...
Love, Lies, and Language Models: Investigating AI's Role in Romance-Baiting Scams : Abstract: Romance-baiting scams have become a major source of financial and emotional harm worldwide. These operations are run by organized crime syndicates that traffic thousands of people into force...
Deep Reinforcement Learning Optimization for Uncertain Nonlinear Systems via Event-Triggered Robust Adaptive Dynamic Programming : Abstract: This work proposes a unified control architecture that couples a Reinforcement Learning (RL)-driven controller with a disturbance-rejection Extended State Observer (ESO), complemented by an ...
Large Model Enabled Embodied Intelligence for 6G Integrated Perception, Communication, and Computation Network : Abstract: The advent of sixth-generation (6G) places intelligence at the core of wireless architecture, fusing perception, communication, and computation into a single closed-loop. This paper argues t...
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live : Abstract: KV cache management is essential for efficient LLM inference. To maximize utilization, existing inference engines evict finished requests' KV cache if new requests are waiting. This policy b...
Less is More: Unlocking Specialization of Time Series Foundation Models via Structured Pruning : Abstract: Scaling laws motivate the development of Time Series Foundation Models (TSFMs) that pre-train vast parameters and achieve remarkable zero-shot forecasting performance. Surprisingly, even aft...
An Exploration of Default Images in Text-to-Image Generation : Abstract: In the creative practice of text-to-image (TTI) generation, images are synthesized from textual prompts. By design, TTI models always yield an output, even if the prompt contains unknown ter...
GSRender: Deduplicated Occupancy Prediction via Weakly Supervised 3D Gaussian Splatting : Abstract: Weakly-supervised 3D occupancy perception is crucial for vision-based autonomous driving in outdoor environments. Previous methods based on NeRF often face a challenge in balancing the numbe...
What Human-Horse Interactions may Teach us About Effective Human-AI Interactions : Abstract: This article explores human-horse interactions as a metaphor for understanding and designing effective human-AI partnerships. Drawing on the long history of human collaboration with horses, ...
Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision : Abstract: Deep neural networks (DNNs) have recently achieved impressive success across a wide range of real-world vision and language processing tasks, spanning from image classification to many other...
AIDOVECL: AI-generated Dataset of Outpainted Vehicles for Eye-level Classification and Localization : Abstract: Image labeling is a critical bottleneck in the development of computer vision technologies, often constraining the potential of machine learning models due to the time-intensive nature of ma...
Automatic Detection of LLM-Generated Code: A Comparative Case Study of Contemporary Models Across Function and Class Granularities : Abstract: The adoption of Large Language Models (LLMs) for code generation risks incorporating vulnerable code into software systems. Existing detectors face two critical limitations: a lack of system...
Overcoming Growth-Induced Forgetting in Task-Agnostic Continual Learning : Abstract: In continual learning (CL), model growth enhances adaptability to new data. However, when model growth is applied improperly, especially in task-agnostic CL, where the entire grown model is ...
Generative Retrieval with Few-shot Indexing : Abstract: Existing generative retrieval (GR) methods rely on training-based indexing, which fine-tunes a model to memorise associations between queries and the document identifiers (docids) of relevan...
Graph Transformers: A Survey : Abstract: Graph transformers are a recent advancement in machine learning, offering a new class of neural network models for graph-structured data. The synergy between transformers and graph learning ...
ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling : Abstract: By combining natural language understanding, generation capabilities, and breadth of knowledge of large language models with image perception, recent large vision language models (LVLMs) hav...
CodeTF: One-stop Transformer Library for State-of-the-art Code LLMs : Abstract: Code intelligence plays a key role in transforming modern software engineering. Recently, deep learning-based models, especially Transformer-based large language models (LLMs), have demonstr...
Towards a resource for multilingual lexicons: an MT assisted and human-in-the-loop multilingual parallel corpus with multi-word expression annotation : Abstract: In this work, we introduce the construction of a machine translation (MT) assisted and human-in-the-loop multilingual parallel corpus with annotations of multi-word expressions (MWEs), named...
Discovering and Learning Probabilistic Models of Black-Box AI Capabilities : Abstract: Black-box AI (BBAI) systems such as foundational models are increasingly being used for sequential decision making. To ensure that such systems are safe to operate and deploy, it is imperati...
Dual Computational Horizons: Incompleteness and Unpredictability in Intelligent Systems : Abstract: We formalize two independent computational limitations that constrain algorithmic intelligence: formal incompleteness and dynamical unpredictability. The former limits the deductive power of...
Scaling Laws for Energy Efficiency of Local LLMs : Abstract: Deploying local large language models and vision-language models on edge devices requires balancing accuracy with constrained computational and energy budgets. Although graphics processors d...
PDE-Agent: A toolchain-augmented multi-agent framework for PDE solving : Abstract: Solving Partial Differential Equations (PDEs) is a cornerstone of engineering and scientific research. Traditional methods for PDE solving are cumbersome, relying on manual setup and domain ...
Neural Decoding of Overt Speech from ECoG Using Vision Transformers and Contrastive Representation Learning : Abstract: Speech Brain Computer Interfaces (BCIs) offer promising solutions to people with severe paralysis unable to communicate. A number of recent studies have demonstrated convincing reconstructio...
FreeAskWorld: An Interactive and Closed-Loop Simulator for Human-Centric Embodied AI : Abstract: As embodied intelligence emerges as a core frontier in artificial intelligence research, simulation platforms must evolve beyond low-level physical interactions to capture complex, human-cen...
TableMind: An Autonomous Programmatic Agent for Tool-Augmented Table Reasoning : Abstract: Table reasoning requires models to jointly perform comprehensive semantic understanding and precise numerical operations. Although recent large language model (LLM)-based methods have achiev...
Feature-Guided Metaheuristic with Diversity Management for Solving the Capacitated Vehicle Routing Problem : Abstract: We propose a feature-based guidance mechanism to enhance metaheuristic algorithms for solving the Capacitated Vehicle Routing Problem (CVRP). This mechanism leverages an Explainable AI (XAI)...
WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion : Abstract: Generating long-range, geometrically consistent video presents a fundamental dilemma: while consistency demands strict adherence to 3D geometry in pixel space, state-of-the-art generative mo...
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies : Abstract: Existing reinforcement learning (RL) approaches treat large language models (LLMs) as a single unified policy, overlooking their internal mechanisms. Understanding how policy evolves across ...
Beyond CLIP: Knowledge-Enhanced Multimodal Transformers for Cross-Modal Alignment in Diabetic Retinopathy Diagnosis : Abstract: Diabetic retinopathy (DR) is a leading cause of preventable blindness worldwide, demanding accurate automated diagnostic systems. While general-domain vision-language models like Contrastive...
Clustering with Label Consistency : Abstract: Designing efficient, effective, and consistent metric clustering algorithms is a significant challenge attracting growing attention. Traditional approaches focus on the stability of cluster ...
Exploring the features used for summary evaluation by Human and GPT : Abstract: Summary assessment involves evaluating how well a generated summary reflects the key ideas and meaning of the source text, requiring a deep understanding of the content. Large Language Model...
MapTrace: Scalable Data Generation for Route Tracing on Maps : Abstract: While Multimodal Large Language Models have achieved human-like performance on many visual and textual reasoning tasks, their proficiency in fine-grained spatial understanding, such as route...
LeLaR: The First In-Orbit Demonstration of an AI-Based Satellite Attitude Controller : Abstract: Attitude control is essential for many satellite missions. Classical controllers, however, are time-consuming to design and sensitive to model uncertainties and variations in operational bou...
The Epistemological Consequences of Large Language Models: Rethinking collective intelligence and institutional knowledge : Abstract: We examine epistemological threats posed by human and LLM interaction. We develop collective epistemology as a theory of epistemic warrant distributed across human collectives, using bounded...
Owning the Intelligence: Global AI Patents Landscape and Europe's Quest for Technological Sovereignty : Abstract: Artificial intelligence has become a key arena of global technological competition and a central concern for Europe's quest for technological sovereignty. This paper analyzes global AI paten...
Results of the 2024 CommonRoad Motion Planning Competition for Autonomous Vehicles : Abstract: Over the past decade, a wide range of motion planning approaches for autonomous vehicles has been developed to handle increasingly complex traffic scenarios. However, these approaches are ra...
REALM: A Real-to-Sim Validated Benchmark for Generalization in Robotic Manipulation : Abstract: Vision-Language-Action (VLA) models empower robots to understand and execute tasks described by natural language instructions. However, a key challenge lies in their ability to generalize be...
BabyFlow: 3D modeling of realistic and expressive infant faces : Abstract: Early detection of developmental disorders can be aided by analyzing infant craniofacial morphology, but modeling infant faces is challenging due to limited data and frequent spontaneous exp...
CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal : Abstract: Group-relative reinforcement learning with verifiable rewards (RLVR) often wastes the most informative data it already has the failures. When all rollouts are wrong, gradients stall; when on...
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion : Abstract: Vision-language models (VLMs) are commonly trained by inserting image tokens from a pretrained vision encoder into the textual stream of a language model. This allows text and image informat...
Learning Continuous Solvent Effects from Transient Flow Data: A Graph Neural Network Benchmark on Catechol Rearrangement : Abstract: Predicting reaction outcomes across continuous solvent composition ranges remains a critical challenge in organic synthesis and process chemistry. Traditional machine learning approaches oft...
LacaDM: A Latent Causal Diffusion Model for Multiobjective Reinforcement Learning : Abstract: Multiobjective reinforcement learning (MORL) poses significant challenges due to the inherent conflicts between objectives and the difficulty of adapting to dynamic environments. Traditional...
Anatomy-R1: Enhancing Anatomy Reasoning in Multimodal Large Language Models via Anatomical Similarity Curriculum and Group Diversity Augmentation : Abstract: Multimodal Large Language Models (MLLMs) have achieved impressive progress in natural image reasoning, yet their potential in medical imaging remains underexplored, especially in clinical an...
DK-STN: A Domain Knowledge Embedded Spatio-Temporal Network Model for MJO Forecast : Abstract: Understanding and predicting the Madden-Julian Oscillation (MJO) is fundamental for precipitation forecasting and disaster prevention. To date, long-term and accurate MJO prediction has rema...
Kolmogorov-Arnold Graph Neural Networks Applied to Inorganic Nanomaterials Dataset : Abstract: The recent development of Kolmogorov-Arnold Networks (KANs) introduced new discoveries in the field of Graph Neural Networks (GNNs), expanding the existing set of models with KAN-based versi...
A Dataset and Preliminary Study of Using GPT-5 for Code-change Impact Analysis : Abstract: Understanding source code changes and their impact on other code entities is a crucial skill in software development. However, the analysis of code changes and their impact is often performe...
Multi-Layer Confidence Scoring for Detection of Out-of-Distribution Samples, Adversarial Attacks, and In-Distribution Misclassifications : Abstract: The recent explosive growth in Deep Neural Networks applications raises concerns about the black-box usage of such models, with limited trasparency and trustworthiness in high-stakes domains...
Activations as Features: Probing LLMs for Generalizable Essay Scoring Representations : Abstract: Automated essay scoring (AES) is a challenging task in cross-prompt settings due to the diversity of scoring criteria. While previous studies have focused on the output of large language mod...
MT-Mark: Rethinking Image Watermarking via Mutual-Teacher Collaboration with Adaptive Feature Modulation : Abstract: Existing deep image watermarking methods follow a fixed embedding-distortion-extraction pipeline, where the embedder and extractor are weakly coupled through a final loss and optimized in is...
Attention Is Not What You Need : Abstract: We revisit a basic question in sequence modeling: is explicit self-attention actually necessary for strong performance and reasoning? We argue that standard multi-head attention is best seen...
Research Program: Theory of Learning in Dynamical Systems : Abstract: Modern learning systems increasingly interact with data that evolve over time and depend on hidden internal state. We ask a basic question: when is such a dynamical system learnable from obs...
DSTED: Decoupling Temporal Stabilization and Discriminative Enhancement for Surgical Workflow Recognition : Abstract: Purpose: Surgical workflow recognition enables context-aware assistance and skill assessment in computer-assisted interventions. Despite recent advances, current methods suffer from two crit...
OmniMER: Indonesian Multimodal Emotion Recognition via Auxiliary-Enhanced LLM Adaptation : Abstract: Indonesian, spoken by over 200 million people, remains underserved in multimodal emotion recognition research despite its dominant presence on Southeast Asian social media platforms. We intr...
Sprecher Networks: A Parameter-Efficient Kolmogorov-Arnold Architecture : Abstract: We present Sprecher Networks (SNs), a family of trainable neural architectures inspired by the classical Kolmogorov-Arnold-Sprecher (KAS) construction for approximating multivariate continuo...
Alternative positional encoding functions for neural transformers : Abstract: A key module in neural transformer-based deep architectures is positional encoding. This module enables a suitable way to encode positional information as input for transformer neural layers...
MAGIC: Achieving Superior Model Merging via Magnitude Calibration : Abstract: The proliferation of pre-trained models has given rise to a wide array of specialised, fine-tuned models. Model merging aims to merge the distinct capabilities of these specialised models in...
MixFlow Training: Alleviating Exposure Bias with Slowed Interpolation Mixture : Abstract: This paper studies the training-testing discrepancy (a.k.a. exposure bias) problem for improving the diffusion models. During training, the input of a prediction network at one training time...
Causal-Guided Detoxify Backdoor Attack of Open-Weight LoRA Models : Abstract: Low-Rank Adaptation (LoRA) has emerged as an efficient method for fine-tuning large language models (LLMs) and is widely adopted within the open-source community. However, the decentralized ...
Digital Twin-Driven Zero-Shot Fault Diagnosis of Axial Piston Pumps Using Fluid-Borne Noise Signals : Abstract: Axial piston pumps are crucial components in fluid power systems, where reliable fault diagnosis is essential for ensuring operational safety and efficiency. Traditional data-driven methods ...
Is Visual Realism Enough? Evaluating Gait Biometric Fidelity in Generative AI Human Animation : Abstract: Generative AI (GenAI) models have revolutionized animation, enabling the synthesis of humans and motion patterns with remarkable visual fidelity. However, generating truly realistic human an...
Machine Unlearning in the Era of Quantum Machine Learning: An Empirical Study : Abstract: We present the first comprehensive empirical study of machine unlearning (MU) in hybrid quantum-classical neural networks. While MU has been extensively explored in classical deep learning, ...
Auto-Prompting with Retrieval Guidance for Frame Detection in Logistics : Abstract: Prompt engineering plays a critical role in adapting large language models (LLMs) to complex reasoning and labeling tasks without the need for extensive fine-tuning. In this paper, we propos...
ChemATP: A Training-Free Chemical Reasoning Framework for Large Language Models : Abstract: Large Language Models (LLMs) exhibit strong general reasoning but struggle in molecular science due to the lack of explicit chemical priors in standard string representations. Current soluti...
Identifying Features Associated with Bias Against 93 Stigmatized Groups in Language Models and Guardrail Model Safety Mitigation : Abstract: Large language models (LLMs) have been shown to exhibit social bias, however, bias towards non-protected stigmatized identities remain understudied. Furthermore, what social features of stig...
From Pixels to Predicates Structuring urban perception with scene graphs : Abstract: Perception research is increasingly modelled using streetscapes, yet many approaches still rely on pixel features or object co-occurrence statistics, overlooking the explicit relations that ...
Towards Minimal Fine-Tuning of VLMs : Abstract: We introduce Image-LoRA, a lightweight parameter efficient fine-tuning (PEFT) recipe for transformer-based vision-language models (VLMs). Image-LoRA applies low-rank adaptation only to the v...
MixKVQ: Query-Aware Mixed-Precision KV Cache Quantization for Long-Context Reasoning : Abstract: Long Chain-of-Thought (CoT) reasoning has significantly advanced the capabilities of Large Language Models (LLMs), but this progress is accompanied by substantial memory and latency overhead...
On the Koopman-Based Generalization Bounds for Multi-Task Deep Learning : Abstract: The paper establishes generalization bounds for multitask deep neural networks using operator-theoretic techniques. The authors propose a tighter bound than those derived from conventional n...
Operator-Based Generalization Bound for Deep Learning: Insights on Multi-Task Learning : Abstract: This paper presents novel generalization bounds for vector-valued neural networks and deep kernel methods, focusing on multi-task learning through an operator-theoretic framework. Our key de...
Practical Quantum-Classical Feature Fusion for complex data Classification : Abstract: Hybrid quantum and classical learning aims to couple quantum feature maps with the robustness of classical neural networks, yet most architectures treat the quantum circuit as an isolated fe...
Vision-Language-Policy Model for Dynamic Robot Task Planning : Abstract: Bridging the gap between natural language commands and autonomous execution in unstructured environments remains an open challenge for robotics. This requires robots to perceive and reason o...
Beyond Sliding Windows: Learning to Manage Memory in Non-Markovian Environments : Abstract: Recent success in developing increasingly general purpose agents based on sequence models has led to increased focus on the problem of deploying computationally limited agents within the vas...
HyperLoad: A Cross-Modality Enhanced Large Language Model-Based Framework for Green Data Center Cooling Load Prediction : Abstract: The rapid growth of artificial intelligence is exponentially escalating computational demand, inflating data center energy use and carbon emissions, and spurring rapid deployment of green da...
DIVER-1 : Deep Integration of Vast Electrophysiological Recordings at Scale : Abstract: Electrophysiology signals such as EEG and iEEG are central to neuroscience, brain-computer interfaces, and clinical applications, yet existing foundation models remain limited in scale despi...
Fraud Detection Through Large-Scale Graph Clustering with Heterogeneous Link Transformation : Abstract: Collaborative fraud, where multiple fraudulent accounts coordinate to exploit online payment systems, poses significant challenges due to the formation of complex network structures. Traditi...
Finer-Personalization Rank: Fine-Grained Retrieval Examines Identity Preservation for Personalized Generation : Abstract: The rise of personalized generative models raises a central question: how should we evaluate identity preservation? Given a reference image (e.g., one's pet), we expect the generated image t...
The Erasure Illusion: Stress-Testing the Generalization of LLM Forgetting Evaluation : Abstract: Machine unlearning aims to remove specific data influences from trained models, a capability essential for adhering to copyright laws and ensuring AI safety. Current unlearning metrics typic...
IndoorUAV: Benchmarking Vision-Language UAV Navigation in Continuous Indoor Environments : Abstract: Vision-Language Navigation (VLN) enables agents to navigate in complex environments by following natural language instructions grounded in visual observations. Although most existing work ha...
Efficient Jailbreak Mitigation Using Semantic Linear Classification in a Multi-Staged Pipeline : Abstract: Prompt injection and jailbreaking attacks pose persistent security challenges to large language model (LLM)-based systems. We present an efficient and systematically evaluated defense archit...
The 6th International Verification of Neural Networks Competition (VNN-COMP 2025): Summary and Results : Abstract: This report summarizes the 6th International Verification of Neural Networks Competition (VNN-COMP 2025), held as a part of the 8th International Symposium on AI Verification (SAIV), that wa...
Context-Aware Initialization for Reducing Generative Path Length in Diffusion Language Models : Abstract: Diffusion Large Language Models (DLLMs) enable fully parallel token decoding but often remain impractical at inference time due to the many denoising iterations required to refine an informa...
Evaluating the Challenges of LLMs in Real-world Medical Follow-up: A Comparative Study and An Optimized Framework : Abstract: When applied directly in an end-to-end manner to medical follow-up tasks, Large Language Models (LLMs) often suffer from uncontrolled dialog flow and inaccurate information extraction due to...
ICP-4D: Bridging Iterative Closest Point and LiDAR Panoptic Segmentation : Abstract: Dominant paradigms for 4D LiDAR panoptic segmentation are usually required to train deep neural networks with large superimposed point clouds or design dedicated modules for instance associa...
R-GenIMA: Integrating Neuroimaging and Genetics with Interpretable Multimodal AI for Alzheimer's Disease Progression : Abstract: Early detection of Alzheimer's disease (AD) requires models capable of integrating macro-scale neuroanatomical alterations with micro-scale genetic susceptibility, yet existing multimodal ap...
Self-Attention with State-Object Weighted Combination for Compositional Zero Shot Learning : Abstract: Object recognition has become prevalent across various industries. However, most existing applications are limited to identifying objects alone, without considering their associated states. ...
Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement : Abstract: We present MACLA, a framework that decouples reasoning from learning by maintaining a frozen large language model while performing all adaptation in an external hierarchical procedural memor...
When Less is More: 8-bit Quantization Improves Continual Learning in Large Language Models : Abstract: Catastrophic forgetting poses a fundamental challenge in continual learning, particularly when models are quantized for deployment efficiency. We systematically investigate the interplay bet...
LouvreSAE: Sparse Autoencoders for Interpretable and Controllable Style Transfer : Abstract: Artistic style transfer in generative models remains a significant challenge, as existing methods often introduce style only via model fine-tuning, additional adapters, or prompt engineering...
An Empirical Study of Developer-Provided Context for AI Coding Assistants in Open-Source Projects : Abstract: While Large Language Models (LLMs) have demonstrated remarkable capabilities, research shows that their effectiveness depends not only on explicit prompts but also on the broader context pro...
Structural Reinforcement Learning for Heterogeneous Agent Macroeconomics : Abstract: We present a new approach to formulating and solving heterogeneous agent models with aggregate risk. We replace the cross-sectional distribution with low-dimensional prices as state variable...
Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction : Abstract: Accurate estimation of item (question or task) difficulty is critical for educational assessment but suffers from the cold start problem. While Large Language Models demonstrate superhuman p...
CrashChat: A Multimodal Large Language Model for Multitask Traffic Crash Video Analysis : Abstract: Automating crash video analysis is essential to leverage the growing availability of driving video data for traffic safety research and accountability attribution in autonomous driving. Cras...
Psychometric Validation of the Sophotechnic Mediation Scale and a New Understanding of the Development of GenAI Mastery: Lessons from 3,392 Adult Brazilian Workers : Abstract: The rapid diffusion of generative artificial intelligence (GenAI) systems has introduced new forms of human-technology interaction, raising the question of whether sustained engagement gives...
Hyperbolic Graph Embeddings: a Survey and an Evaluation on Anomaly Detection : Abstract: This survey reviews hyperbolic graph embedding models, and evaluate them on anomaly detection, highlighting their advantages over Euclidean methods in capturing complex structures. Evaluatin...
Controllable Probabilistic Forecasting with Stochastic Decomposition Layers : Abstract: AI weather prediction ensembles with latent noise injection and optimized with the continuous ranked probability score (CRPS) have produced both accurate and well-calibrated predictions with...
FedVideoMAE: Efficient Privacy-Preserving Federated Video Moderation : Abstract: The rapid growth of short-form video platforms increases the need for privacy-preserving moderation, as cloud-based pipelines expose raw videos to privacy risks, high bandwidth costs, and in...
Reliable Audio Deepfake Detection in Variable Conditions via Quantum-Kernel SVMs : Abstract: Detecting synthetic speech is challenging when labeled data are scarce and recording conditions vary. Existing end-to-end deep models often overfit or fail to generalize, and while kernel me...
Smark: A Watermark for Text-to-Speech Diffusion Models via Discrete Wavelet Transform : Abstract: Text-to-Speech (TTS) diffusion models generate high-quality speech, which raises challenges for the model intellectual property protection and speech tracing for legal use. Audio watermarkin...
Code2Doc: A Quality-First Curated Dataset for Code Documentation : Abstract: The performance of automatic code documentation generation models depends critically on the quality of the training data used for supervision. However, most existing code documentation datas...
IPCV: Information-Preserving Compression for MLLM Visual Encoders : Abstract: Multimodal Large Language Models (MLLMs) deliver strong vision-language performance but at high computational cost, driven by numerous visual tokens processed by the Vision Transformer (ViT)...
PIPCFR: Pseudo-outcome Imputation with Post-treatment Variables for Individual Treatment Effect Estimation : Abstract: The estimation of individual treatment effects (ITE) focuses on predicting the outcome changes that result from a change in treatment. A fundamental challenge in observational data is that w...
$M^3-Verse$: A "Spot the Difference" Challenge for Large Multimodal Models : Abstract: Modern Large Multimodal Models (LMMs) have demonstrated extraordinary ability in static image and single-state spatial-temporal understanding. However, their capacity to comprehend the dynam...
Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection : Abstract: Large language model (LLM)-based multi-agent systems (MAS) have shown strong capabilities in solving complex tasks. As MAS become increasingly autonomous in various safety-critical tasks, de...
CauTraj: A Causal-Knowledge-Guided Framework for Lane-Changing Trajectory Planning of Autonomous Vehicles : Abstract: Enhancing the performance of trajectory planners for lane - changing vehicles is one of the key challenges in autonomous driving within human - machine mixed traffic. Most existing studies h...
Fusion of Multiscale Features Via Centralized Sparse-attention Network for EEG Decoding : Abstract: Electroencephalography (EEG) signal decoding is a key technology that translates brain activity into executable commands, laying the foundation for direct brain-machine interfacing and intel...
Remoe: Towards Efficient and Low-Cost MoE Inference in Serverless Computing : Abstract: Mixture-of-Experts (MoE) has become a dominant architecture in large language models (LLMs) due to its ability to scale model capacity via sparse expert activation. Meanwhile, serverless com...
Geometric-Photometric Event-based 3D Gaussian Ray Tracing : Abstract: Event cameras offer a high temporal resolution over traditional frame-based cameras, which makes them suitable for motion and structure estimation. However, it has been unclear how event-bas...
ARC: Leveraging Compositional Representations for Cross-Problem Learning on VRPs : Abstract: Vehicle Routing Problems (VRPs) with diverse real-world attributes have driven recent interest in cross-problem learning approaches that efficiently generalize across problem variants. We pr...
LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction : Abstract: Large language models (LLMs) often generate hallucinated content that lacks factual or contextual grounding, limiting their reliability in critical applications. Existing approaches such as ...
A Multi-agent Text2SQL Framework using Small Language Models and Execution Feedback : Abstract: Text2SQL, the task of generating SQL queries from natural language text, is a critical challenge in data engineering. Recently, Large Language Models (LLMs) have demonstrated superior perfor...
DASH: Deception-Augmented Shared Mental Model for a Human-Machine Teaming System : Abstract: We present DASH (Deception-Augmented Shared mental model for Human-machine teaming), a novel framework that enhances mission resilience by embedding proactive deception into Shared Mental Mo...
PTTA: A Pure Text-to-Animation Framework for High-Quality Creation : Abstract: Traditional animation production involves complex pipelines and significant manual labor cost. While recent video generation models such as Sora, Kling, and CogVideoX achieve impressive resu...
Text2Graph VPR: A Text-to-Graph Expert System for Explainable Place Recognition in Changing Environments : Abstract: Visual Place Recognition (VPR) in long-term deployment requires reasoning beyond pixel similarity: systems must make transparent, interpretable decisions that remain robust under lighting, w...
The Interaction Bottleneck of Deep Neural Networks: Discovery, Proof, and Modulation : Abstract: Understanding what kinds of cooperative structures deep neural networks (DNNs) can represent remains a fundamental yet insufficiently understood problem. In this work, we treat interactions ...
From Scratch to Fine-Tuned: A Comparative Study of Transformer Training Strategies for Legal Machine Translation : Abstract: In multilingual nations like India, access to legal information is often hindered by language barriers, as much of the legal and judicial documentation remains in English. Legal Machine Tran...
Modality-Dependent Memory Mechanisms in Cross-Modal Neuromorphic Computing : Abstract: Memory-augmented spiking neural networks (SNNs) promise energy-efficient neuromorphic computing, yet their generalization across sensory modalities remains unexplored. We present the first c...
Placenta Accreta Spectrum Detection Using an MRI-based Hybrid CNN-Transformer Model : Abstract: Placenta Accreta Spectrum (PAS) is a serious obstetric condition that can be challenging to diagnose with Magnetic Resonance Imaging (MRI) due to variability in radiologists' interpretations...
AI Code in the Wild: Measuring Security Risks and Ecosystem Shifts of AI-Generated Code in Modern Software : Abstract: Large language models (LLMs) for code generation are becoming integral to modern software development, but their real-world prevalence and security impact remain poorly understood. We pres...
Adaptive Accountability in Networked MAS: Tracing and Mitigating Emergent Norms at Scale : Abstract: Large-scale networked multi-agent systems increasingly underpin critical infrastructure, yet their collective behavior can drift toward undesirable emergent norms that elude conventional gov...
Enhancing Medical Large Vision-Language Models via Alignment Distillation : Abstract: Medical Large Vision-Language Models (Med-LVLMs) have shown promising results in clinical applications, but often suffer from hallucinated outputs due to misaligned visual understanding. In ...
Toward Training Superintelligent Software Agents through Self-Play SWE-RL : Abstract: While current software agents powered by large language models (LLMs) and agentic reinforcement learning (RL) can boost programmer productivity, their training data (e.g., GitHub issues and ...
SecureCode v2.0: A Production-Grade Dataset for Training Security-Aware Code Generation Models : Abstract: AI assistants produce vulnerable code in 45% of security-relevant scenarios, introducing flaws into production systems at scale. Yet existing secure coding datasets fall short. They lack inc...
Detection of AI Generated Images Using Combined Uncertainty Measures and Particle Swarm Optimised Rejection Mechanism : Abstract: As AI-generated images become increasingly photorealistic, distinguishing them from natural images poses a growing challenge. This paper presents a robust detection framework that leverages ...
A Formal Descriptive Language for Learning Dynamics: A Five-Layer Structural Coordinate System : Abstract: Understanding learning as a dynamic process is challenging due to the interaction of multiple factors, including cognitive load, internal state change, and subjective evaluation. Existing ap...
Prediction and Forecast of Short-Term Drought Impacts Using Machine Learning to Support Mitigation and Adaptation Efforts : Abstract: Drought is a complex natural hazard that affects ecological and human systems, often resulting in substantial environmental and economic losses. Recent increases in drought severity, frequen...
The Illusion of Consistency: Selection-Induced Bias in Gated Kalman Innovation Statistics : Abstract: Validation gating is a fundamental component of classical Kalman-based tracking systems. Only measurements whose normalized innovation squared (NIS) falls below a prescribed threshold are co...
GTMA: Dynamic Representation Optimization for OOD Vision-Language Models : Abstract: Vision-language models (VLMs) struggle in open-world applications, where out-of-distribution (OOD) concepts often trigger cross-modal alignment collapse and severely degrade zero-shot perfor...
PlantDiseaseNet-RT50: A Fine-tuned ResNet50 Architecture for High-Accuracy Plant Disease Detection Beyond Standard CNNs : Abstract: Plant diseases pose a significant threat to agricultural productivity and global food security, accounting for 70-80% of crop losses worldwide. Traditional detection methods rely heavily on ...
Enhancing Decision-Making in Windows PE Malware Classification During Dataset Shifts with Uncertainty Estimation : Abstract: Artificial intelligence techniques have achieved strong performance in classifying Windows Portable Executable (PE) malware, but their reliability often degrades under dataset shifts, leadin...
SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios : Abstract: Existing benchmarks for AI coding agents focus on isolated, single-issue tasks such as fixing a bug or implementing a small feature. However, real-world software engineering is fundamentally...
Self-organizing maps for water quality assessment in reservoirs and lakes: A systematic literature review : Abstract: Sustainable water quality underpins ecological balance and water security. Assessing and managing lakes and reservoirs is difficult due to data sparsity, heterogeneity, and nonlinear relatio...
Mitigating Spurious Correlations in NLI via LLM-Synthesized Counterfactuals and Dynamic Balanced Sampling : Abstract: Natural Language Inference (NLI) models frequently rely on spurious correlations rather than semantic reasoning. Existing mitigation strategies often incur high annotation costs or trigger c...
SoK: Understanding (New) Security Issues Across AI4Code Use Cases : Abstract: AI-for-Code (AI4Code) systems are reshaping software engineering, with tools like GitHub Copilot accelerating code generation, translation, and vulnerability detection. Alongside these advan...
Secret mixtures of experts inside your LLM : Abstract: Despite being one of the earliest neural network layers, the Multilayer Perceptron (MLP) is arguably one of the least understood parts of the transformer architecture due to its dense comput...
Snowveil: A Framework for Decentralised Preference Discovery : Abstract: Aggregating subjective preferences of a large group is a fundamental challenge in computational social choice, traditionally reliant on central authorities. To address the limitations of thi...
A Distributed Hierarchical Spatio-Temporal Edge-Enhanced Graph Neural Network for City-Scale Dynamic Logistics Routing : Abstract: City-scale logistics routing has become increasingly challenging as metropolitan road networks grow to tens of millions of edges and traffic conditions evolve rapidly under high-volume mobil...
An Agentic AI Framework for Training General Practitioner Student Skills : Abstract: Advancements in large language models offer strong potential for enhancing virtual simulated patients (VSPs) in medical education by providing scalable alternatives to resource-intensive tra...
MeniMV: A Multi-view Benchmark for Meniscus Injury Severity Grading : Abstract: Precise grading of meniscal horn tears is critical in knee injury diagnosis but remains underexplored in automated MRI analysis. Existing methods often rely on coarse study-level labels or b...
VeruSAGE: A Study of Agent-Based Verification for Rust Systems : Abstract: Large language models (LLMs) have shown impressive capability to understand and develop code. However, their capability to rigorously reason about and prove code correctness remains in quest...
Federated Learning Based Decentralized Adaptive Intelligent Transmission Protocol for Privacy Preserving 6G Networks : Abstract: The move to 6th Generation (6G) wireless networks creates new issues with privacy, scalability, and adaptability. The data-intensive nature of 6G is not handled well by older, centralized ne...
AmPLe: Supporting Vision-Language Models via Adaptive-Debiased Ensemble Multi-Prompt Learning : Abstract: Multi-prompt learning methods have emerged as an effective approach for facilitating the rapid adaptation of vision-language models to downstream tasks with limited resources. Existing multi...
AraToken: Optimizing Arabic Tokenization with Normalization Pipeline and Language Extension for Qwen3 : Abstract: Tokenization is a critical preprocessing step for large language models (LLMs), directly impacting training efficiency and downstream performance. General-purpose tokenizers trained predomin...
Neural Proofs for Sound Verification and Control of Complex Systems : Abstract: This informal contribution presents an ongoing line of research that is pursuing a new approach to the construction of sound proofs for the formal verification and control of complex stochas...
Exploration vs. Fixation: Scaffolding Divergent and Convergent Thinking for Human-AI Co-Creation with Generative Models : Abstract: Generative AI has begun to democratize creative work, enabling novices to produce complex artifacts such as code, images, and videos. However, in practice, existing interaction paradigms oft...
Datasets for machine learning and for assessing the intelligence level of automatic patent search systems : Abstract: The key to success in automating prior art search in patent research using artificial intelligence lies in developing large datasets for machine learning and ensuring their availability. Thi...
LLM Agents Implement an NLG System from Scratch: Building Interpretable Rule-Based RDF-to-Text Generators : Abstract: We present a novel neurosymbolic framework for RDF-to-text generation, in which the model is "trained" through collaborative interactions among multiple LLM agents rather than traditional ba...
LLM-based Few-Shot Early Rumor Detection with Imitation Agent : Abstract: Early Rumor Detection (EARD) aims to identify the earliest point at which a claim can be accurately classified based on a sequence of social media posts. This is especially challenging in da...
MCVI-SANet: A lightweight semi-supervised model for LAI and SPAD estimation of winter wheat under vegetation index saturation : Abstract: Vegetation index (VI) saturation during the dense canopy stage and limited ground-truth annotations of winter wheat constrain accurate estimation of LAI and SPAD. Existing VI-based and textu...
Dynamic Entropy Tuning in Reinforcement Learning Low-Level Quadcopter Control: Stochasticity vs Determinism : Abstract: This paper explores the impact of dynamic entropy tuning in Reinforcement Learning (RL) algorithms that train a stochastic policy. Its performance is compared against algorithms that train a...
Reinforcement Learning Position Control of a Quadrotor Using Soft Actor-Critic (SAC) : Abstract: This paper proposes a new Reinforcement Learning (RL) based control architecture for quadrotors. With the literature focusing on controlling the four rotors' RPMs directly, this paper aims t...
Asynchronous Pipeline Parallelism for Real-Time Multilingual Lip Synchronization in Video Communication Systems : Abstract: This paper introduces a parallel and asynchronous Transformer framework designed for efficient and accurate multilingual lip synchronization in real-time video conferencing systems. The prop...
Trustworthy and Explainable Deep Reinforcement Learning for Safe and Energy-Efficient Process Control: A Use Case in Industrial Compressed Air Systems : Abstract: This paper presents a trustworthy reinforcement learning approach for the control of industrial compressed air systems. We develop a framework that enables safe and energy-efficient operatio...
On Efficient Adjustment in Causal Graphs : Abstract: Observational studies in fields such as epidemiology often rely on covariate adjustment to estimate causal effects. Classical graphical criteria, like the back-door criterion and the general...
Embedded Safety-Aligned Intelligence via Differentiable Internal Alignment Embeddings : Abstract: We introduce Embedded Safety-Aligned Intelligence (ESAI), a theoretical framework for multi-agent reinforcement learning that embeds alignment constraints directly into agents internal repre...
AL-GNN: Privacy-Preserving and Replay-Free Continual Graph Learning via Analytic Learning : Abstract: Continual graph learning (CGL) aims to enable graph neural networks to incrementally learn from a stream of graph structured data without forgetting previously acquired knowledge. Existing m...
Evolutionary BP+OSD Decoding for Low-Latency Quantum Error Correction : Abstract: We propose an evolutionary belief propagation (EBP) decoder for quantum error correction, which incorporates trainable weights into the BP algorithm and optimizes them via the differential e...
Who Can See Through You? Adversarial Shielding Against VLM-Based Attribute Inference Attacks : Abstract: As vision-language models (VLMs) become widely adopted, VLM-based attribute inference attacks have emerged as a serious privacy concern, enabling adversaries to infer private attributes from...
TICL+: A Case Study On Speech In-Context Learning for Children's Speech Recognition : Abstract: Children's speech recognition remains challenging due to substantial acoustic and linguistic variability, limited labeled data, and significant differences from adult speech. Speech foundati...
Software Vulnerability Management in the Era of Artificial Intelligence: An Industry Perspective : Abstract: Artificial Intelligence (AI) has revolutionized software development, particularly by automating repetitive tasks and improving developer productivity. While these advancements are well-docu...
Towards Ancient Plant Seed Classification: A Benchmark Dataset and Baseline Model : Abstract: Understanding the dietary preferences of ancient societies and their evolution across periods and regions is crucial for revealing human-environment interactions. Seeds, as important archaeo...
Offline Behavioral Data Selection : Abstract: Behavioral cloning is a widely adopted approach for offline policy learning from expert demonstrations. However, the large scale of offline behavioral datasets often results in computational...
Spectral Discrepancy and Cross-modal Semantic Consistency Learning for Object Detection in Hyperspectral Image : Abstract: Hyperspectral images with high spectral resolution provide new insights into recognizing subtle differences in similar substances. However, object detection in hyperspectral images faces sig...
Breaking Minds, Breaking Systems: Jailbreaking Large Language Models via Human-like Psychological Manipulation : Abstract: Large Language Models (LLMs) have gained considerable popularity and protected by increasingly sophisticated safety mechanisms. However, jailbreak attacks continue to pose a critical securit...
Stable and Efficient Single-Rollout RL for Multimodal Reasoning : Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has become a key paradigm to improve the reasoning capabilities of Multimodal Large Language Models (MLLMs). However, prevalent group-ba...
LLaViDA: A Large Language Vision Driving Assistant for Explicit Reasoning and Enhanced Trajectory Planning : Abstract: Trajectory planning is a fundamental yet challenging component of autonomous driving. End-to-end planners frequently falter under adverse weather, unpredictable human behavior, or complex ro...
When Does Learning Renormalize? Sufficient Conditions for Power Law Spectral Dynamics : Abstract: Empirical power--law scaling has been widely observed across modern deep learning systems, yet its theoretical origins and scope of validity remain incompletely understood. The Generalized R...
PROVEX: Enhancing SOC Analyst Trust with Explainable Provenance-Based IDS : Abstract: Modern intrusion detection systems (IDS) leverage graph neural networks (GNNs) to detect malicious activity in system provenance data, but their decisions often remain a black box to analyst...
On Swarm Leader Identification using Probing Policies : Abstract: Identifying the leader within a robotic swarm is crucial, especially in adversarial contexts where leader concealment is necessary for mission success. This work introduces the interactive S...
Grad: Guided Relation Diffusion Generation for Graph Augmentation in Graph Fraud Detection : Abstract: Nowadays, Graph Fraud Detection (GFD) in financial scenarios has become an urgent research topic to protect online payment security. However, as organized crime groups are becoming more prof...
Holistic Evaluation of State-of-the-Art LLMs for Code Generation : Abstract: This study presents a comprehensive empirical evaluation of six state-of-the-art large language models (LLMs) for code generation, including both general-purpose and code-specialized models....
Uncertainty-Gated Region-Level Retrieval for Robust Semantic Segmentation : Abstract: Semantic segmentation of outdoor street scenes plays a key role in applications such as autonomous driving, mobile robotics, and assistive technology for visually-impaired pedestrians. For t...
From Prompt to Product: A Human-Centered Benchmark of Agentic App Generation Systems : Abstract: Agentic AI systems capable of generating full-stack web applications from natural language prompts ("prompt- to-app") represent a significant shift in software development. However, evaluati...
Characterising Behavioural Families and Dynamics of Promotional Twitter Bots via Sequence-Based Modelling : Abstract: This paper asks whether promotional Twitter/X bots form behavioural families and whether members evolve similarly. We analyse 2,798,672 tweets from 2,615 ground-truth promotional bot account...
FOODER: Real-time Facial Authentication and Expression Recognition : Abstract: Out-of-distribution (OOD) detection is essential for the safe deployment of neural networks, as it enables the identification of samples outside the training domain. We present FOODER, a rea...
Securing Agentic AI Systems -- A Multilayer Security Framework : Abstract: Securing Agentic Artificial Intelligence (AI) systems requires addressing the complex cyber risks introduced by autonomous, decision-making, and adaptive behaviors. Agentic AI systems are in...
A Dataset and Benchmarks for Atrial Fibrillation Detection from Electrocardiograms of Intensive Care Unit Patients : Abstract: Objective: Atrial fibrillation (AF) is the most common cardiac arrhythmia experienced by intensive care unit (ICU) patients and can cause adverse health effects. In this study, we publish a ...
Specification and Detection of LLM Code Smells : Abstract: Large Language Models (LLMs) have gained massive popularity in recent years and are increasingly integrated into software systems for diverse purposes. However, poorly integrating them in so...
ReGal: A First Look at PPO-based Legal AI for Judgment Prediction and Summarization in India : Abstract: This paper presents an early exploration of reinforcement learning methodologies for legal AI in the Indian context. We introduce Reinforcement Learning-based Legal Reasoning (ReGal), a fram...
Seeing Justice Clearly: Handwritten Legal Document Translation with OCR and Vision-Language Models : Abstract: Handwritten text recognition (HTR) and machine translation continue to pose significant challenges, particularly for low-resource languages like Marathi, which lack large digitized corpora a...
The Subject of Emergent Misalignment in Superintelligence: An Anthropological, Cognitive Neuropsychological, Machine-Learning, and Ontological Perspective : Abstract: We examine the conceptual and ethical gaps in current representations of Superintelligence misalignment. We find throughout Superintelligence discourse an absent human subject, and an under-...
A Hybrid Inductive-Transductive Network for Traffic Flow Imputation on Unsampled Locations : Abstract: Accurately imputing traffic flow at unsensed locations is difficult: loop detectors provide precise but sparse measurements, speed from probe vehicles is widely available yet only weakly cor...
Parameter-Efficient Fine-Tuning for HAR: Integrating LoRA and QLoRA into Transformer Models : Abstract: Human Activity Recognition is a foundational task in pervasive computing. While recent advances in self-supervised learning and transformer-based architectures have significantly improved HA...
Adaptive Agents in Spatial Double-Auction Markets: Modeling the Emergence of Industrial Symbiosis : Abstract: Industrial symbiosis fosters circularity by enabling firms to repurpose residual resources, yet its emergence is constrained by socio-spatial frictions that shape costs, matching opportuniti...
Re-assessing the evidence for mental rotation abilities in children using computational models : Abstract: There is strong and diverse evidence for mental rotation (MR) abilities in adults. However, current evidence for MR in children rests on just a few behavioral paradigms adapted from the adul...
CodeGEMM: A Codebook-Centric Approach to Efficient GEMM in Quantized LLMs : Abstract: Weight-only quantization is widely used to mitigate the memory-bound nature of LLM inference. Codebook-based methods extend this trend by achieving strong accuracy in the extremely low-bit r...
Convolutional-neural-operator-based transfer learning for solving PDEs : Abstract: Convolutional neural operator is a CNN-based architecture recently proposed to enforce structure-preserving continuous-discrete equivalence and enable the genuine, alias-free learning of sol...
A Critical Review of Monte Carlo Algorithms Balancing Performance and Probabilistic Accuracy with AI Augmented Framework : Abstract: Monte Carlo algorithms are a foundational pillar of modern computational science, yet their effective application hinges on a deep understanding of their performance trade offs. This paper p...
Real-Time Human-Robot Interaction Intent Detection Using RGB-based Pose and Emotion Cues with Cross-Camera Model Generalization : Abstract: Service robots in public spaces require real-time understanding of human behavioral intentions for natural interaction. We present a practical multimodal framework for frame-accurate human-r...
Victor Calibration (VC): Multi-Pass Confidence Calibration and CP4.3 Governance Stress Test under Round-Table Orchestration : Abstract: Safety alignment can make frontier LMs overly conservative, degrading collaboration via hedging or false refusals. We present a lightweight toolkit with three parts: (1) Victor Calibration (...
Seeing Beyond the Scene: Analyzing and Mitigating Background Bias in Action Recognition : Abstract: Human action recognition models often rely on background cues rather than human movement and pose to make predictions, a behavior known as background bias. We present a systematic analysis o...
Will AI Trade? A Computational Inversion of the No-Trade Theorem : Abstract: Classic no-trade theorems attribute trade to heterogeneous beliefs. We re-examine this conclusion for AI agents, asking if trade can arise from computational limitations, under common belief...
Which Coauthor Should I Nominate in My 99 ICLR Submissions? A Mathematical Analysis of the ICLR 2026 Reciprocal Reviewer Nomination Policy : Abstract: The rapid growth of AI conference submissions has created an overwhelming reviewing burden. To alleviate this, recent venues such as ICLR 2026 introduced a reviewer nomination policy: each s...
Let the Model Learn to Feel: Mode-Guided Tonality Injection for Symbolic Music Emotion Recognition : Abstract: Music emotion recognition is a key task in symbolic music understanding (SMER). Recent approaches have shown promising results by fine-tuning large-scale pre-trained models (e.g., MIDIBERT, ...
NystagmusNet: Explainable Deep Learning for Photosensitivity Risk Prediction : Abstract: Nystagmus patients with photosensitivity face significant daily challenges due to involuntary eye movements exacerbated by environmental brightness conditions. Current assistive solutions ar...
Accelerated Digital Twin Learning for Edge AI: A Comparison of FPGA and Mobile GPU : Abstract: Digital twins (DTs) can enable precision healthcare by continually learning a mathematical representation of patient-specific dynamics. However, mission critical healthcare applications requ...
Comparative Evaluation of Explainable Machine Learning Versus Linear Regression for Predicting County-Level Lung Cancer Mortality Rate in the United States : Abstract: Lung cancer (LC) is a leading cause of cancer-related mortality in the United States. Accurate prediction of LC mortality rates is crucial for guiding targeted interventions and addressing h...
Reinforcement Learning for Monetary Policy Under Macroeconomic Uncertainty: Analyzing Tabular and Function Approximation Methods : Abstract: We study how a central bank should dynamically set short-term nominal interest rates to stabilize inflation and unemployment when macroeconomic relationships are uncertain and time-varying. ...
Efficient Beamforming Optimization for STAR-RIS-Assisted Communications: A Gradient-Based Meta Learning Approach : Abstract: Simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) has emerged as a promising technology to realize full-space coverage and boost spectral efficiency in...
Inferring Latent Market Forces: Evaluating LLM Detection of Gamma Exposure Patterns via Obfuscation Testing : Abstract: We introduce obfuscation testing, a novel methodology for validating whether large language models detect structural market patterns through causal reasoning rather than temporal association...
Separating Constraint Compliance from Semantic Accuracy: A Novel Benchmark for Evaluating Instruction-Following Under Compression : Abstract: Large language models (LLMs) exhibit degraded performance under prompt compression, but the mechanisms remain poorly understood. We introduce the Compression-Decay Comprehension Test (CDCT),...
KVReviver: Reversible KV Cache Compression with Sketch-Based Token Reconstruction : Abstract: As the context length of current large language models (LLMs) rapidly increases, the memory demand for the Key-Value (KV) cache is becoming a bottleneck for LLM deployment and batch processi...
Learning to Prioritize IT Tickets: A Comparative Evaluation of Embedding-based Approaches and Fine-Tuned Transformer Models : Abstract: Prioritizing service tickets in IT Service Management (ITSM) is critical for operational efficiency but remains challenging due to noisy textual inputs, subjective writing styles, and pronou...
Byzantine Fault-Tolerant Multi-Agent System for Healthcare: A Gossip Protocol Approach to Secure Medical Message Propagation : Abstract: Recent advances in generative AI have enabled sophisticated multi-agent architectures for healthcare, where large language models power collaborative clinical decision-making. However, these...
Graph-O1 : Monte Carlo Tree Search with Reinforcement Learning for Text-Attributed Graph Reasoning : Abstract: ChatGPT said: Text-attributed graphs, where nodes and edges contain rich textual information, are widely used across diverse domains. A central challenge in this setting is question answerin...
Towards Reasoning-Preserving Unlearning in Multimodal Large Language Models : Abstract: Machine unlearning aims to erase requested data from trained models without full retraining. For Reasoning Multimodal Large Language Models (RMLLMs), this is uniquely challenging: intermedia...
Efficient Multi-Adapter LLM Serving via Cross-Model KV-Cache Reuse with Activated LoRA : Abstract: Modern large language model (LLM) systems increasingly rely on multi-turn pipelines that are composed of multiple task-specific adapters, yet existing serving frameworks remain inefficient, ...
Scalably Enhancing the Clinical Validity of a Task Benchmark with Physician Oversight : Abstract: Automating the calculation of clinical risk scores offers a significant opportunity to reduce physician administrative burden and enhance patient care. The current standard for evaluating th...
Augmenting Intelligence: A Hybrid Framework for Scalable and Stable Explanations : Abstract: Current approaches to Explainable AI (XAI) face a "Scalability-Stability Dilemma." Post-hoc methods (e.g., LIME, SHAP) may scale easily but suffer from instability, while supervised explanat...
Towards Closed-Loop Embodied Empathy Evolution: Probing LLM-Centric Lifelong Empathic Motion Generation in Unseen Scenarios : Abstract: In the literature, existing human-centric emotional motion generation methods primarily focus on boosting performance within a single scale-fixed dataset, largely neglecting the flexible and...
QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models : Abstract: Understanding the physical world is essential for generalist AI agents. However, it remains unclear whether state-of-the-art vision perception models (e.g., large VLMs) can reason physical p...
An Agentic Framework for Autonomous Materials Computation : Abstract: Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery, yet their static knowledge and hallucination issues hinder autonomous research applications...
EchoTrail-GUI: Building Actionable Memory for GUI Agents via Critic-Guided Self-Exploration : Abstract: Contemporary GUI agents, while increasingly capable due to advances in Large Vision-Language Models (VLMs), often operate with a critical limitation: they treat each task in isolation, lacki...
Learning General Policies with Policy Gradient Methods : Abstract: While reinforcement learning methods have delivered remarkable results in a number of settings, generalization, i.e., the ability to produce policies that generalize in a reliable and system...
First-Order Representation Languages for Goal-Conditioned RL : Abstract: First-order relational languages have been used in MDP planning and reinforcement learning (RL) for two main purposes: specifying MDPs in compact form, and representing and learning policies...
PENDULUM: A Benchmark for Assessing Sycophancy in Multimodal Large Language Models : Abstract: Sycophancy, an excessive tendency of AI models to agree with user input at the expense of factual accuracy or in contradiction of visual evidence, poses a critical and underexplored challeng...
VIGOR+: Iterative Confounder Generation and Validation via LLM-CEVAE Feedback Loop : Abstract: Hidden confounding remains a fundamental challenge in causal inference from observational data. Recent advances leverage Large Language Models (LLMs) to generate plausible hidden confounders...
SafeMed-R1: Adversarial Reinforcement Learning for Generalizable and Robust Medical Reasoning in Vision-Language Models : Abstract: Vision--Language Models (VLMs) show significant promise for Medical Visual Question Answering (VQA), yet their deployment in clinical settings is hindered by severe vulnerability to adversar...
Helios: A Foundational Language Model for Smart Energy Knowledge Reasoning and Application : Abstract: In the global drive toward carbon neutrality, deeply coordinated smart energy systems underpin industrial transformation. However, the interdisciplinary, fragmented, and fast-evolving expert...
Vibe Reasoning: Eliciting Frontier AI Mathematical Capabilities -- A Case Study on IMO 2025 Problem 6 : Abstract: We introduce Vibe Reasoning, a human-AI collaborative paradigm for solving complex mathematical problems. Our key insight is that frontier AI models already possess the knowledge required to...
DeliveryBench: Can Agents Earn Profit in Real World? : Abstract: LLMs and VLMs are increasingly deployed as embodied agents, yet existing benchmarks largely revolve around simple short-term tasks and struggle to capture rich realistic constraints that sha...
Generation of Programmatic Rules for Document Forgery Detection Using Large Language Models : Abstract: Document forgery poses a growing threat to legal, economic, and governmental processes, requiring increasingly sophisticated verification mechanisms. One approach involves the use of plausib...
Observer, Not Player: Simulating Theory of Mind in LLMs through Game Observation : Abstract: We present an interactive framework for evaluating whether large language models (LLMs) exhibit genuine "understanding" in a simple yet strategic environment. As a running example, we focus ...
Can We Test Consciousness Theories on AI? Ablations, Markers, and Robustness : Abstract: The search for reliable indicators of consciousness has fragmented into competing theoretical camps (Global Workspace Theory (GWT), Integrated Information Theory (IIT), and Higher-Order Theo...
Understanding Chain-of-Thought in Large Language Models via Topological Data Analysis : Abstract: With the development of large language models (LLMs), particularly with the introduction of the long reasoning chain technique, the reasoning ability of LLMs in complex problem-solving has b...
FC-MIR: A Mobile Screen Awareness Framework for Intent-Aware Recommendation based on Frame-Compressed Multimodal Trajectory Reasoning : Abstract: Identifying user intent from mobile UI operation trajectories is critical for advancing UI understanding and enabling task automation agents. While Multimodal Large Language Models (MLLMs) e...
Conditioning Accept-Desirability models in the context of AGM-like belief change : Abstract: We discuss conditionalisation for Accept-Desirability models in an abstract decision-making framework, where uncertain rewards live in a general linear space, and events are special projecti...
Tool-Augmented Hybrid Ensemble Reasoning with Distillation for Bilingual Mathematical Problem Solving : Abstract: Bilingual mathematical problem solving needs a clear link between language reasoning and symbolic calculation. Large language models often handle language well but are weak in accurate compu...
$\gamma(3,4)$ `Attention' in Cognitive Agents: Ontology-Free Knowledge Representations With Promise Theoretic Semantics : Abstract: The semantics and dynamics of `attention' are closely related to promise theoretic notions developed for autonomous agents and can thus easily be written down in promise framework. In this w...
Population-Evolve: a Parallel Sampling and Evolutionary Method for LLM Math Reasoning : Abstract: Test-time scaling has emerged as a promising direction for enhancing the reasoning capabilities of Large Language Models in last few years. In this work, we propose Population-Evolve, a trai...
Can abstract concepts from LLM improve SLM performance? : Abstract: Large language models (LLMs) excel at diverse tasks, but their deployment on resource-constrained devices remains challenging. Existing methods like quantization, pruning, and distillation c...
Recontextualization Mitigates Specification Gaming without Modifying the Specification : Abstract: Developers often struggle to specify correct training labels and rewards. Perhaps they don't need to. We propose recontextualization, which reduces how often language models "game" training ...
ORPR: An OR-Guided Pretrain-then-Reinforce Learning Model for Inventory Management : Abstract: As the pursuit of synergy between Artificial Intelligence (AI) and Operations Research (OR) gains momentum in handling complex inventory systems, a critical challenge persists: how to effect...
Training Multimodal Large Reasoning Models Needs Better Thoughts: A Three-Stage Framework for Long Chain-of-Thought Synthesis and Selection : Abstract: Large Reasoning Models (LRMs) have demonstrated remarkable performance on complex reasoning tasks through long Chain-of-Thought (CoT) reasoning. Extending these successes to multimodal reaso...
Clustering-based Transfer Learning for Dynamic Multimodal MultiObjective Evolutionary Algorithm : Abstract: Dynamic multimodal multiobjective optimization presents the dual challenge of simultaneously tracking multiple equivalent pareto optimal sets and maintaining population diversity in time-var...
Multimodal Bayesian Network for Robust Assessment of Casualties in Autonomous Triage : Abstract: Mass Casualty Incidents can overwhelm emergency medical systems and resulting delays or errors in the assessment of casualties can lead to preventable deaths. We present a decision support f...
Gabliteration: Adaptive Multi-Directional Neural Weight Modification for Selective Behavioral Alteration in Large Language Models : Abstract: We present Gabliteration, a novel neural weight modification technique that advances beyond traditional abliteration methods by implementing adaptive multi-directional projections with regul...
CORE: Concept-Oriented Reinforcement for Bridging the Definition-Application Gap in Mathematical Reasoning : Abstract: Large language models (LLMs) often solve challenging math exercises yet fail to apply the concept right when the problem requires genuine understanding. Popular Reinforcement Learning with V...
HARBOR: Holistic Adaptive Risk assessment model for BehaviORal healthcare : Abstract: Behavioral healthcare risk assessment remains a challenging problem due to the highly multimodal nature of patient data and the temporal dynamics of mood and affective disorders. While large...
The Dead Salmons of AI Interpretability : Abstract: In a striking neuroscience study, the authors placed a dead salmon in an MRI scanner and showed it images of humans in social situations. Astonishingly, standard analyses of the time reporte...
MEEA: Mere Exposure Effect-Driven Confrontational Optimization for LLM Jailbreaking : Abstract: The rapid advancement of large language models (LLMs) has intensified concerns about the robustness of their safety alignment. While existing jailbreak studies explore both single-turn and m...
Counterfactual Basis Extension and Representational Geometry: An MDL-Constrained Model of Conceptual Growth : Abstract: Concept learning becomes possible only when existing representations fail to account for experience. Most models of learning and inference, however, presuppose a fixed representational basis...
KeenKT: Knowledge Mastery-State Disambiguation for Knowledge Tracing : Abstract: Knowledge Tracing (KT) aims to dynamically model a student's mastery of knowledge concepts based on their historical learning interactions. Most current methods rely on single-point estimate...
Social Comparison without Explicit Inference of Others' Reward Values: A Constructive Approach Using a Probabilistic Generative Model : Abstract: Social comparison -- the process of evaluating one's rewards relative to others -- plays a fundamental role in primate social cognition. However, it remains unknown from a computational pers...
IntelliCode: A Multi-Agent LLM Tutoring System with Centralized Learner Modeling : Abstract: LLM-based tutors are typically single-turn assistants that lack persistent representations of learner knowledge, making it difficult to provide principled, transparent, and long-term pedagog...
Automatic Adaptation to Concept Complexity and Subjective Natural Concepts: A Cognitive Model based on Chunking : Abstract: A key issue in cognitive science concerns the fundamental psychological processes that underlie the formation and retrieval of multiple types of concepts in short-term and long-term memory (...
ASTIF: Adaptive Semantic-Temporal Integration for Cryptocurrency Price Forecasting : Abstract: Financial time series forecasting is fundamentally an information fusion challenge, yet most existing models rely on static architectures that struggle to integrate heterogeneous knowledge s...
ChronoDreamer: Action-Conditioned World Model as an Online Simulator for Robotic Planning : Abstract: We present ChronoDreamer, an action-conditioned world model for contact-rich robotic manipulation. Given a history of egocentric RGB frames, contact maps, actions, and joint states, ChronoDr...
Assignment-Routing Optimization: Solvers for Problems Under Constraints : Abstract: We study the Joint Routing-Assignment (JRA) problem in which items must be assigned one-to-one to placeholders while simultaneously determining a Hamiltonian cycle visiting all nodes exactly...
Reflective Confidence: Correcting Reasoning Flaws via Online Self-Correction : Abstract: Large language models (LLMs) have achieved strong performance on complex reasoning tasks using techniques such as chain-of-thought and self-consistency. However, ensemble-based approaches, e...
ESearch-R1: Learning Cost-Aware MLLM Agents for Interactive Embodied Search via Reinforcement Learning : Abstract: Multimodal Large Language Models (MLLMs) have empowered embodied agents with remarkable capabilities in planning and reasoning. However, when facing ambiguous natural language instructions (...
Vox Deorum: A Hybrid LLM Architecture for 4X / Grand Strategy Game AI -- Lessons from Civilization V : Abstract: Large Language Models' capacity to reason in natural language makes them uniquely promising for 4X and grand strategy games, enabling more natural human-AI gameplay interactions such as coll...
Large Language Models as Discounted Bayesian Filters : Abstract: Large Language Models (LLMs) demonstrate strong few-shot generalization through in-context learning, yet their reasoning in dynamic and stochastic environments remains opaque. Prior studies ...
Insider Threat Detection Using GCN and Bi-LSTM with Explicit and Implicit Graph Representations : Abstract: Insider threat detection (ITD) is challenging due to the subtle and concealed nature of malicious activities performed by trusted users. This paper proposes a post-hoc ITD framework that int...
Agent-Based Output Drift Detection for Breast Cancer Response Prediction in a Multisite Clinical Decision Support System : Abstract: Modern clinical decision support systems can concurrently serve multiple, independent medical imaging institutions, but their predictive performance may degrade across sites due to variation...
Few-Shot Learning of a Graph-Based Neural Network Model Without Backpropagation : Abstract: We propose a structural-graph approach to classifying contour images in a few-shot regime without using backpropagation. The core idea is to make structure the carrier of explanations: an im...
Monitoring Monitorability : Abstract: Observability into the decision making of modern AI systems may be required to safely deploy increasingly capable agents. Monitoring the chain-of-thought (CoT) of today's reasoning models ha...
Intelligent Human-Machine Partnership for Manufacturing: Enhancing Warehouse Planning through Simulation-Driven Knowledge Graphs and LLM Collaboration : Abstract: Manufacturing planners face complex operational challenges that require seamless collaboration between human expertise and intelligent systems to achieve optimal performance in modern produc...
MSC-180: A Benchmark for Automated Formal Theorem Proving from Mathematical Subject Classification : Abstract: Automated Theorem Proving (ATP) represents a core research direction in artificial intelligence for achieving formal reasoning and verification, playing a significant role in advancing machi...
Sophia: A Persistent Agent Framework of Artificial Life : Abstract: The development of LLMs has elevated AI agents from task-specific tools to long-lived, decision-making entities. Yet, most architectures remain static and reactive, tethered to manually defi...
External Hippocampus: Topological Cognitive Maps for Guiding Large Language Model Reasoning : Abstract: This paper proposes the External Hippocampus framework, which models language model reasoning from a cognitive dynamics perspective as the flow of information energy in semantic space. Unlik...
NL2CA: Auto-formalizing Cognitive Decision-Making from Natural Language Using an Unsupervised CriticNL2LTL Framework : Abstract: Cognitive computing models offer a formal and interpretable way to characterize human's deliberation and decision-making, yet their development remains labor-intensive. In this paper, we pro...
NEURO-GUARD: Neuro-Symbolic Generalization and Unbiased Adaptive Routing for Diagnostics -- Explainable Medical AI : Abstract: Accurate yet interpretable image-based diagnosis remains a central challenge in medical AI, particularly in settings characterized by limited data, subtle visual cues, and high-stakes clinic...
Propose, Solve, Verify: Self-Play Through Formal Verification : Abstract: Training models through self-play alone (without any human data) has been a longstanding goal in AI, but its effectiveness for training large language models remains unclear, particularly in...
Unifying Causal Reinforcement Learning: Survey, Taxonomy, Algorithms and Applications : Abstract: Integrating causal inference (CI) with reinforcement learning (RL) has emerged as a powerful paradigm to address critical limitations in classical RL, including low explainability, lack of r...
Efficient Mixture-of-Agents Serving via Tree-Structured Routing, Adaptive Pruning, and Dependency-Aware Prefill-Decode Overlap : Abstract: Mixture-of-Agents (MoA) inference can suffer from dense inter-agent communication and low hardware utilization, which jointly inflate serving latency. We present a serving design that target...
Rethinking Multi-Agent Intelligence Through the Lens of Small-World Networks : Abstract: Large language models (LLMs) have enabled multi-agent systems (MAS) in which multiple agents argue, critique, and coordinate to solve complex tasks, making communication topology a first-cla...
Faithful and Stable Neuron Explanations for Trustworthy Mechanistic Interpretability : Abstract: Neuron identification is a popular tool in mechanistic interpretability, aiming to uncover the human-interpretable concepts represented by individual neurons in deep networks. While algorith...
Conflict-Driven Clause Learning with VSIDS Heuristics for Discrete Facility Layout : Abstract: This paper studies the use of Conflict-Driven Clause Learning (CDCL) with VSIDS heuristics as a computational engine for discrete facility layout problems. The facility layout problem is mod...

Research Sources: 488 | Generated: 12/23/2025