AI Research News Feeds for January 14th, 2026

AI RESEARCH PAPERS & ACADEMIC SOURCES

Generative Adversarial Networks for Image Super-Resolution: A Survey : Abstract: Single image super-resolution (SISR) has played an important role in the field of image processing. Recent generative adversarial networks (GANs) can achieve excellent results on low-resolut...
Latent Reconstruction from Generated Data for Multimodal Misinformation Detection : Abstract: Multimodal misinformation, such as miscaptioned images, where captions misrepresent an image's origin, context, or meaning, poses a growing challenge in the digital age. Due to the scarcity ...
MSSF: A 4D Radar and Camera Fusion Framework With Multi-Stage Sampling for 3D Object Detection in Autonomous Driving : Abstract: As one of the automotive sensors that have emerged in recent years, 4D millimeter-wave radar has a higher resolution than conventional 3D radar and provides precise elevation measurements. B...
Learning-based Multi-View Stereo: A Survey : Abstract: 3D reconstruction aims to recover the dense 3D structure of a scene. It plays an essential role in various applications such as Augmented/Virtual Reality (AR/VR), autonomous driving and robo...
M3CoTBench: Benchmark Chain-of-Thought of MLLMs in Medical Image Understanding : Abstract: Chain-of-Thought (CoT) reasoning has proven effective in enhancing large language models by encouraging step-by-step intermediate reasoning, and recent advances have extended this paradigm t...
A Single-Parameter Factor-Graph Image Prior : Abstract: We propose a novel piecewise smooth image model with piecewise constant local parameters that are automatically adapted to each image. Technically, the model is formulated in terms of factor...
Automated Lesion Segmentation of Stroke MRI Using nnU-Net: A Comprehensive External Validation Across Acute and Chronic Lesions : Abstract: Accurate and generalisable segmentation of stroke lesions from magnetic resonance imaging (MRI) is essential for advancing clinical research, prognostic modelling, and personalised intervent...
Blind Deconvolution in Astronomy: How Does a Standalone U-Net Perform? : Abstract: Aims: This study investigates whether a U-Net architecture can perform standalone end-to-end blind deconvolution of astronomical images without any prior knowledge of the Point Spread Functi...
VLingNav: Embodied Navigation with Adaptive Reasoning and Visual-Assisted Linguistic Memory : Abstract: VLA models have shown promising potential in embodied navigation by unifying perception and planning while inheriting the strong generalization abilities of large VLMs. However, most existin...
Keyframe-based Dense Mapping with the Graph of View-Dependent Local Maps : Abstract: In this article, we propose a new keyframe-based mapping system. The proposed method updates local Normal Distribution Transform (NDT) maps using data from an RGB-D sensor. The cells of the ...
Temporal-Enhanced Interpretable Multi-Modal Prognosis and Risk Stratification Framework for Diabetic Retinopathy (TIMM-ProRS) : Abstract: Diabetic retinopathy (DR), affecting millions globally with projections indicating a significant rise, poses a severe blindness risk and strains healthcare systems. Diagnostic complexity ari...
Robust Subpixel Localization of Diagonal Markers in Large-Scale Navigation via Multi-Layer Screening and Adaptive Matching : Abstract: This paper proposes a robust, high-precision positioning methodology to address localization failures arising from complex background interference in large-scale flight navigation and the co...
Fiducial Exoskeletons: Image-Centric Robot State Estimation : Abstract: We introduce Fiducial Exoskeletons, an image-based reformulation of 3D robot state estimation that replaces cumbersome procedures and motor-centric pipelines with single-image inference. Tra...
Application of Ideal Observer for Thresholded Data in Search Task : Abstract: This study advances task-based image quality assessment by developing an anthropomorphic thresholded visual-search model observer. The model is an ideal observer for thresholded data inspire...
MLLM-VADStory: Domain Knowledge-Driven Multimodal LLMs for Video Ad Storyline Insights : Abstract: We propose MLLM-VADStory, a novel domain knowledge-guided multimodal large language model (MLLM) framework to systematically quantify and generate insights for video ad storyline understand...
RAVEN: Erasing Invisible Watermarks via Novel View Synthesis : Abstract: Invisible watermarking has become a critical mechanism for authenticating AI-generated image content, with major platforms deploying watermarking schemes at scale. However, evaluating the vu...
3AM: Segment Anything with Geometric Consistency in Videos : Abstract: Video object segmentation methods like SAM2 achieve strong performance through memory-based architectures but struggle under large viewpoint changes due to reliance on appearance features. T...
Near-perfect photo-ID of the Hula painted frog with zero-shot deep local-feature matching : Abstract: Accurate individual identification is essential for monitoring rare amphibians, yet invasive marking is often unsuitable for critically endangered species. We evaluate state-of-the-art compu...
DentalX: Context-Aware Dental Disease Detection with Radiographs : Abstract: Diagnosing dental diseases from radiographs is time-consuming and challenging due to the subtle nature of diagnostic evidence. Existing methods, which rely on object detection models designe...
Aggregating Diverse Cue Experts for AI-Generated Image Detection : Abstract: The rapid emergence of image synthesis models poses challenges to the generalization of AI-generated image detectors. However, existing methods often rely on model-specific features, leading...
Salience-SGG: Enhancing Unbiased Scene Graph Generation with Iterative Salience Estimation : Abstract: Scene Graph Generation (SGG) suffers from a long-tailed distribution, where a few predicate classes dominate while many others are underrepresented, leading to biased models that underperfor...
CtrlFuse: Mask-Prompt Guided Controllable Infrared and Visible Image Fusion : Abstract: Infrared and visible image fusion generates all-weather perception-capable images by combining complementary modalities, enhancing environmental awareness for intelligent unmanned systems. E...
SoC: Semantic Orthogonal Calibration for Test-Time Prompt Tuning : Abstract: With the increasing adoption of vision-language models (VLMs) in critical decision-making systems such as healthcare or autonomous driving, the calibration of their uncertainty estimates bec...
SfMamba: Efficient Source-Free Domain Adaptation via Selective Scan Modeling : Abstract: Source-free domain adaptation (SFDA) tackles the critical challenge of adapting source-pretrained models to unlabeled target domains without access to source data, overcoming data privacy an...
End-to-End Video Character Replacement without Structural Guidance : Abstract: Controllable video character replacement with a user-provided identity remains a challenging problem due to the lack of paired video data. Prior works have predominantly relied on a reconstr...
REVNET: Rotation-Equivariant Point Cloud Completion via Vector Neuron Anchor Transformer : Abstract: Incomplete point clouds captured by 3D sensors often result in the loss of both geometric and semantic information. Most existing point cloud completion methods are built on rotation-variant...
Closed-Loop LLM Discovery of Non-Standard Channel Priors in Vision Models : Abstract: Channel configuration search, the optimization of layer specifications such as layer widths in deep neural networks, presents a complex combinatorial challenge constrained by tensor shape comp...
An IoT-Enabled Smart Aquarium System for Real-Time Water Quality Monitoring and Automated Feeding : Abstract: Maintaining optimal water quality in aquariums is critical for aquatic health but remains challenging due to the need for continuous monitoring of multiple parameters. Traditional manual met...
Cross-modal Proxy Evolving for OOD Detection with Vision-Language Models : Abstract: Reliable zero-shot detection of out-of-distribution (OOD) inputs is critical for deploying vision-language models in open-world settings. However, the lack of labeled negatives in zero-shot ...
Towards Safer Mobile Agents: Scalable Generation and Evaluation of Diverse Scenarios for VLMs : Abstract: Vision Language Models (VLMs) are increasingly deployed in autonomous vehicles and mobile systems, making it crucial to evaluate their ability to support safer decision-making in complex env...
Modality-Decoupled RGB-Thermal Object Detector via Query Fusion : Abstract: The advantage of RGB-Thermal (RGB-T) detection lies in its ability to perform modality fusion and integrate cross-modality complementary information, enabling robust detection under diverse ...
Developing Predictive and Robust Radiomics Models for Chemotherapy Response in High-Grade Serous Ovarian Carcinoma : Abstract: Objectives: High-grade serous ovarian carcinoma (HGSOC) is typically diagnosed at an advanced stage with extensive peritoneal metastases, making treatment challenging. Neoadjuvant chemothera...
Incentivizing Cardiologist-Like Reasoning in MLLMs for Interpretable Echocardiographic Diagnosis : Abstract: Echocardiographic diagnosis is vital for cardiac screening yet remains challenging. Existing echocardiography foundation models do not effectively capture the relationships between quantitat...
Deep Learning Based Facial Retargeting Using Local Patches : Abstract: In the era of digital animation, the quest to produce lifelike facial animations for virtual characters has led to the development of various retargeting methods. While the retargeting facia...
MMLGNet: Cross-Modal Alignment of Remote Sensing Data using CLIP : Abstract: In this paper, we propose a novel multimodal framework, Multimodal Language-Guided Network (MMLGNet), to align heterogeneous remote sensing modalities like Hyperspectral Imaging (HSI) and Li...
SPARK: Scalable Real-Time Point Cloud Aggregation with Multi-View Self-Calibration : Abstract: Real-time multi-camera 3D reconstruction is crucial for 3D perception, immersive interaction, and robotics. Existing methods struggle with multi-view fusion, camera extrinsic uncertainty, an...
Edge-Optimized Multimodal Learning for UAV Video Understanding via BLIP-2 : Abstract: The demand for real-time visual understanding and interaction in complex scenarios is increasingly critical for unmanned aerial vehicles. However, a significant challenge arises from the con...
Design and Development of a Low-Cost Scalable GSM-IoT Smart Pet Feeder with a Remote Mobile Application : Abstract: Pet ownership is increasingly common in modern households, yet maintaining a consistent feeding schedule remains challenging for owners, particularly those who live in cities and have bus...
Source-Free Domain Adaptation for Geospatial Point Cloud Semantic Segmentation : Abstract: Semantic segmentation of 3D geospatial point clouds is pivotal for remote sensing applications. However, variations in geographic patterns across regions and data acquisition strategies indu...
Semantic Misalignment in Vision-Language Models under Perceptual Degradation : Abstract: Vision-Language Models (VLMs) are increasingly deployed in autonomous driving and embodied AI systems, where reliable perception is critical for safe semantic reasoning and decision-making. ...
From Local Windows to Adaptive Candidates via Individualized Exploratory: Rethinking Attention for Image Super-Resolution : Abstract: Single Image Super-Resolution (SISR) is a fundamental computer vision task that aims to reconstruct a high-resolution (HR) image from a low-resolution (LR) input. Transformer-based methods h...
Tissue Classification and Whole-Slide Images Analysis via Modeling of the Tumor Microenvironment and Biological Pathways : Abstract: Automatic integration of whole slide images (WSIs) and gene expression profiles has demonstrated substantial potential in precision clinical diagnosis and cancer progression studies. However...
UM-Text: A Unified Multimodal Model for Image Understanding : Abstract: With the rapid advancement of image generation, visual text editing using natural language instructions has received increasing attention. The main challenge of this task is to fully underst...
YOLOBirDrone: Dataset for Bird vs Drone Detection and Classification and a YOLO based enhanced learning architecture : Abstract: The use of aerial drones for commercial and defense applications has benefited in many ways and is therefore utilized in several different application domains. However, they are also increas...
SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices : Abstract: Recent advances in diffusion transformers (DiTs) have set new standards in image generation, yet remain impractical for on-device deployment due to their high computational and memory costs....
ReCo-KD: Region- and Context-Aware Knowledge Distillation for Efficient 3D Medical Image Segmentation : Abstract: Accurate 3D medical image segmentation is vital for diagnosis and treatment planning, but state-of-the-art models are often too large for clinics with limited computing resources. Lightweigh...
M3SR: Multi-Scale Multi-Perceptual Mamba for Efficient Spectral Reconstruction : Abstract: The Mamba architecture has been widely applied to various low-level vision tasks due to its exceptional adaptability and strong performance. Although the Mamba architecture has been adopted ...
KidVis: Do Multimodal Large Language Models Possess the Visual Perceptual Capabilities of a 6-Year-Old? : Abstract: While Multimodal Large Language Models (MLLMs) have demonstrated impressive proficiency in high-level reasoning tasks, such as complex diagrammatic interpretation, it remains an open questio...
AIMC-Spec: A Benchmark Dataset for Automatic Intrapulse Modulation Classification under Variable Noise Conditions : Abstract: A lack of standardized datasets has long hindered progress in automatic intrapulse modulation classification (AIMC) - a critical task in radar signal analysis for electronic support systems,...
Improving Zero-shot ADL Recognition with Large Language Models through Event-based Context and Confidence : Abstract: Unobtrusive sensor-based recognition of Activities of Daily Living (ADLs) in smart homes by processing data collected from IoT sensing devices supports applications such as healthcare, safet...
MobiDiary: Autoregressive Action Captioning with Wearable Devices and Wireless Signals : Abstract: Human Activity Recognition (HAR) in smart homes is critical for health monitoring and assistive living. While vision-based systems are common, they face privacy concerns and environmental li...
Unified Multi-Site Multi-Sequence Brain MRI Harmonization Enriched by Biomedical Semantic Style : Abstract: Aggregating multi-site brain MRI data can enhance deep learning model training, but also introduces non-biological heterogeneity caused by site-specific variations (e.g., differences in scan...
Route, Retrieve, Reflect, Repair: Self-Improving Agentic Framework for Visual Detection and Linguistic Reasoning in Medical Imaging : Abstract: Medical image analysis increasingly relies on large vision-language models (VLMs), yet most systems remain single-pass black boxes that offer limited control over reasoning, safety, and spat...
Human-inspired Global-to-Parallel Multi-scale Encoding for Lightweight Vision Models : Abstract: Lightweight vision networks have witnessed remarkable progress in recent years, yet achieving a satisfactory balance among parameter scale, computational overhead, and task performance remai...
Second-order Gaussian directional derivative representations for image high-resolution corner detection : Abstract: Corner detection is widely used in various computer vision tasks, such as image matching and 3D reconstruction. Our research indicates that there are theoretical flaws in Zhang et al.'s use ...
CogniMap3D: Cognitive 3D Mapping and Rapid Retrieval : Abstract: We present CogniMap3D, a bioinspired framework for dynamic 3D scene understanding and reconstruction that emulates human cognitive processes. Our approach maintains a persistent memory bank ...
Towards Cross-Platform Generalization: Domain Adaptive 3D Detection with Augmentation and Pseudo-Labeling : Abstract: This technical report represents the award-winning solution to the Cross-platform 3D Object Detection task in the RoboSense2025 Challenge. Our approach is built upon PVRCNN++, an efficient 3...
Representation Learning with Semantic-aware Instance and Sparse Token Alignments : Abstract: Medical contrastive vision-language pre-training (VLP) has demonstrated significant potential in improving performance on downstream tasks. Traditional approaches typically employ contrastiv...
A Hardware-Algorithm Co-Designed Framework for HDR Imaging and Dehazing in Extreme Rocket Launch Environments : Abstract: Quantitative optical measurement of critical mechanical parameters, such as plume flow fields, shock wave structures, and nozzle oscillations, during rocket launch faces severe challenge...
Instance-Aligned Captions for Explainable Video Anomaly Detection : Abstract: Explainable video anomaly detection (VAD) is crucial for safety-critical applications, yet even with recent progress, much of the research still lacks spatial grounding, making the explanati...
Where Does Vision Meet Language? Understanding and Refining Visual Fusion in MLLMs via Contrastive Attention : Abstract: Multimodal Large Language Models (MLLMs) have achieved remarkable progress in vision-language understanding, yet how they internally integrate visual and textual information remains poorly u...
From Prompts to Deployment: Auto-Curated Domain-Specific Dataset Generation via Diffusion Models : Abstract: In this paper, we present an automated pipeline for generating domain-specific synthetic datasets with diffusion models, addressing the distribution shift between pre-trained models and real...
Rescind: Countering Image Misconduct in Biomedical Publications with Vision-Language and State-Space Modeling : Abstract: Scientific image manipulation in biomedical publications poses a growing threat to research integrity and reproducibility. Unlike natural image forensics, biomedical forgery detection is uni...
A Highly Efficient Diversity-based Input Selection for DNN Improvement Using VLMs : Abstract: Maintaining or improving the performance of Deep Neural Networks (DNNs) through fine-tuning requires labeling newly collected inputs, a process that is often costly and time-consuming. To al...
Training Free Zero-Shot Visual Anomaly Localization via Diffusion Inversion : Abstract: Zero-Shot image Anomaly Detection (ZSAD) aims to detect and localise anomalies without access to any normal training samples of the target data. While recent ZSAD approaches leverage additio...
Decoder Generates Manufacturable Structures: A Framework for 3D-Printable Object Synthesis : Abstract: This paper presents a novel decoder-based approach for generating manufacturable 3D structures optimized for additive manufacturing. We introduce a deep learning framework that decodes laten...
CASHEW: Stabilizing Multimodal Reasoning via Iterative Trajectory Aggregation : Abstract: Vision-language models achieve strong performance across a wide range of multimodal understanding and reasoning tasks, yet their multi-step reasoning remains unstable. Repeated sampling over...
Predicting Region of Interest in Human Visual Search Based on Statistical Texture and Gabor Features : Abstract: Understanding human visual search behavior is a fundamental problem in vision science and computer vision, with direct implications for modeling how observers allocate attention in location-...
Likelihood ratio for a binary Bayesian classifier under a noise-exclusion model : Abstract: We develop a new statistical ideal observer model that performs holistic visual search (or gist) processing in part by placing thresholds on minimum extractable image features. In this model...
An Efficient Additive Kolmogorov-Arnold Transformer for Point-Level Maize Localization in Unmanned Aerial Vehicle Imagery : Abstract: High-resolution UAV photogrammetry has become a key technology for precision agriculture, enabling centimeter-level crop monitoring and point-level plant localization. However, point-level m...
Sesame Plant Segmentation Dataset: A YOLO Formatted Annotated Dataset : Abstract: This paper presents the Sesame Plant Segmentation Dataset, an open source annotated image dataset designed to support the development of artificial intelligence models for agricultural appli...
3DGS-Drag: Dragging Gaussians for Intuitive Point-Based 3D Editing : Abstract: The transformative potential of 3D content creation has been progressively unlocked through advancements in generative models. Recently, intuitive drag editing with geometric changes has att...
Edge-AI Perception Node for Cooperative Road-Safety Enforcement and Connected-Vehicle Integration : Abstract: Rapid motorization in emerging economies such as India has created severe enforcement asymmetries, with over 11 million recorded violations in 2023 against a human policing density of roughl...
Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models : Abstract: As large language models (LLMs) increasingly permeate daily lives, there is a growing demand for real-time interactions that mirror human conversations. Traditional turn-based chat systems d...
PosIR: Position-Aware Heterogeneous Information Retrieval Benchmark : Abstract: While dense retrieval models have achieved remarkable success, rigorous evaluation of their sensitivity to the position of relevant information (i.e., position bias) remains largely unexplor...
When KV Cache Reuse Fails in Multi-Agent Systems: Cross-Candidate Interaction is Crucial for LLM Judges : Abstract: Multi-agent LLM systems routinely generate multiple candidate responses that are aggregated by an LLM judge. To reduce the dominant prefill cost in such pipelines, recent work advocates KV c...
Exploiting DINOv3-Based Self-Supervised Features for Robust Few-Shot Medical Image Segmentation : Abstract: Deep learning-based automatic medical image segmentation plays a critical role in clinical diagnosis and treatment planning but remains challenging in few-shot scenarios due to the scarcity ...
Spatial Context Improves the Integration of Text with Remote Sensing for Mapping Environmental Variables : Abstract: Recent developments in natural language processing highlight text as an emerging data source for ecology. Textual resources carry unique information that can be used in complementarity with ...
Inferring Latent Intentions: Attributional Natural Language Inference in LLM Agents : Abstract: Attributional inference, the ability to predict latent intentions behind observed actions, is a critical yet underexplored capability for large language models (LLMs) operating in multi-agen...
From Rows to Reasoning: A Retrieval-Augmented Multimodal Framework for Spreadsheet Understanding : Abstract: Large Language Models (LLMs) struggle to reason over large-scale enterprise spreadsheets containing thousands of numeric rows, multiple linked sheets, and embedded visual content such as cha...
PrivGemo: Privacy-Preserving Dual-Tower Graph Retrieval for Empowering LLM Reasoning with Memory Augmentation : Abstract: Knowledge graphs (KGs) provide structured evidence that can ground large language model (LLM) reasoning for knowledge-intensive question answering. However, many practical KGs are private, a...
RAGShaper: Eliciting Sophisticated Agentic RAG Skills via Automated Data Synthesis : Abstract: Agentic Retrieval-Augmented Generation (RAG) empowers large language models to autonomously plan and retrieve information for complex problem-solving. However, the development of robust agen...
Nationality and Region Prediction from Names: A Comparative Study of Neural Models and Large Language Models : Abstract: Predicting nationality from personal names has practical value in marketing, demographic research, and genealogical studies. Conventional neural models learn statistical correspondences betw...
QuantEval: A Benchmark for Financial Quantitative Tasks in Large Language Models : Abstract: Large Language Models (LLMs) have shown strong capabilities across many domains, yet their evaluation in financial quantitative tasks remains fragmented and mostly limited to knowledge-centr...
Analyzing Bias in False Refusal Behavior of Large Language Models for Hate Speech Detoxification : Abstract: While large language models (LLMs) have increasingly been applied to hate speech detoxification, the prompts often trigger safety alerts, causing LLMs to refuse the task. In this study, we s...
A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding : Abstract: Potentially idiomatic expressions (PIEs) construe meanings inherently tied to the everyday experience of a given language community. As such, they constitute an interesting challenge for ass...
Get away with less: Need of source side data curation to build parallel corpus for low resource Machine Translation : Abstract: Data curation is a critical yet under-researched step in the machine translation training paradigm. To train translation systems, data acquisition relies primarily on human translations and ...
How Order-Sensitive Are LLMs? OrderProbe for Deterministic Structural Reconstruction : Abstract: Large language models (LLMs) excel at semantic understanding, yet their ability to reconstruct internal structure from scrambled inputs remains underexplored. Sentence-level restoration is i...
GraphSearch: Agentic Search-Augmented Reasoning for Zero-Shot Graph Learning : Abstract: Recent advances in search-augmented large reasoning models (LRMs) enable the retrieval of external knowledge to reduce hallucinations in multistep reasoning. However, their ability to operat...
Ministral 3 : Abstract: We introduce the Ministral 3 series, a family of parameter-efficient dense language models designed for compute and memory constrained applications, available in three model sizes: 3B, 8B, a...
DeepResearch Bench II: Diagnosing Deep Research Agents via Rubrics from Expert Report : Abstract: Deep Research Systems (DRS) aim to help users search the web, synthesize information, and deliver comprehensive investigative reports. However, how to rigorously evaluate these systems remai...
Algorithmic Stability in Infinite Dimensions: Characterizing Unconditional Convergence in Banach Spaces : Abstract: The distinction between conditional, unconditional, and absolute convergence in infinite-dimensional spaces has fundamental implications for computational algorithms. While these concepts co...
It's All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models : Abstract: Despite the recent advancements in NLP with the advent of Large Language Models (LLMs), Entity Linking (EL) for historical texts remains challenging due to linguistic variation, noisy inputs...
Surgical Refusal Ablation: Disentangling Safety from Intelligence via Concept-Guided Spectral Cleaning : Abstract: Safety-aligned language models systematically refuse harmful requests. While activation steering can modulate refusal, ablating the raw "refusal vector" calculated from contrastive harmful a...
Do You Understand How I Feel?: Towards Verified Empathy in Therapy Chatbots : Abstract: Conversational agents are increasingly used as support tools along mental therapeutic pathways with significant societal impacts. In particular, empathy is a key non-functional requirement i...
Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management : Abstract: Effective memory management is essential for large language model agents to navigate long-horizon tasks. Recent research has explored using Reinforcement Learning to develop specialized memo...
Detecting Mental Manipulation in Speech via Synthetic Multi-Speaker Dialogue : Abstract: Mental manipulation, the strategic use of language to covertly influence or exploit others, is a newly emerging task in computational social reasoning. Prior work has focused exclusively on ...
CLaS-Bench: A Cross-Lingual Alignment and Steering Benchmark : Abstract: Understanding and controlling the behavior of large language models (LLMs) is an increasingly important topic in multilingual NLP. Beyond prompting or fine-tuning, steering, i.e., manipulating inter...
AgriAgent: Contract-Driven Planning and Capability-Aware Tool Orchestration in Real-World Agriculture : Abstract: Intelligent agent systems in real-world agricultural scenarios must handle diverse tasks under multimodal inputs, ranging from lightweight information understanding to complex multi-step exe...
D$^2$Plan: Dual-Agent Dynamic Global Planning for Complex Retrieval-Augmented Reasoning : Abstract: Recent search-augmented LLMs trained with reinforcement learning (RL) can interleave searching and reasoning for multi-hop reasoning tasks. However, they face two critical failure modes as t...
Discovery and Reinforcement of Tool-Integrated Reasoning Chains via Rollout Trees : Abstract: Tool-Integrated Reasoning has emerged as a key paradigm to augment Large Language Models (LLMs) with computational capabilities, yet integrating tool-use into long Chain-of-Thought (long CoT...
Med-CoReasoner: Reducing Language Disparities in Medical Reasoning via Language-Informed Co-Reasoning : Abstract: While reasoning-enhanced large language models perform strongly on English medical tasks, a persistent multilingual gap remains, with substantially weaker reasoning in local languages, limit...
User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale : Abstract: The recent paradigm shift toward large reasoning models (LRMs) as autonomous agents has intensified the demand for sophisticated, multi-turn tool-use capabilities. Yet, existing datasets and...
Generation-Augmented Generation: A Plug-and-Play Framework for Private Knowledge Injection in Large Language Models : Abstract: In domains such as biomedicine, materials, and finance, high-stakes deployment of large language models (LLMs) requires injecting private, domain-specific knowledge that is proprietary, fast...
WISE-Flow: Workflow-Induced Structured Experience for Self-Evolving Conversational Service Agents : Abstract: Large language model (LLM)-based agents are widely deployed in user-facing services but remain error-prone in new tasks, tend to repeat the same failure patterns, and show substantial run-to...
How Reliable are Confidence Estimators for Large Reasoning Models? A Systematic Benchmark on High-Stakes Domains : Abstract: The miscalibration of Large Reasoning Models (LRMs) undermines their reliability in high-stakes domains, necessitating methods to accurately estimate the confidence of their long-form, multi...
Attention Projection Mixing and Exogenous Anchors : Abstract: Transformers that reuse early-layer attention projections as residuals face a fundamental tension: the first layer must simultaneously serve as a stable reference for all deeper layers and a...
Query Suggestion for Retrieval-Augmented Generation via Dynamic In-Context Learning : Abstract: Retrieval-augmented generation with tool-calling agents (agentic RAG) has become increasingly powerful in understanding, processing, and responding to user queries. However, the scope of the...
Calibration Is Not Enough: Evaluating Confidence Estimation Under Language Variations : Abstract: Confidence estimation (CE) indicates how reliable the answers of large language models (LLMs) are, and can impact user trust and decision-making. Existing work evaluates CE methods almost ex...
Universal computation is intrinsic to language model decoding : Abstract: Language models now provide an interface to express and often solve general problems in natural language, yet their ultimate computational capabilities remain a major topic of scientific deb...
Is Sentiment Banana-Shaped? Exploring the Geometry and Portability of Sentiment Concept Vectors : Abstract: Use cases of sentiment analysis in the humanities often require contextualized, continuous scores. Concept Vector Projections (CVP) offer a recent solution: by modeling sentiment as a direct...
VULCA-Bench: A Multicultural Vision-Language Benchmark for Evaluating Cultural Understanding : Abstract: We introduce VULCA-Bench, a multicultural art-critique benchmark for evaluating Vision-Language Models' (VLMs) cultural understanding beyond surface-level visual perception. Existing VLM ben...
Multilingual, Multimodal Pipeline for Creating Authentic and Structured Fact-Checked Claim Dataset : Abstract: The rapid proliferation of misinformation across online platforms underscores the urgent need for robust, up-to-date, explainable, and multilingual fact-checking resources. However, existing...
Cross-Cultural Expert-Level Art Critique Evaluation with Vision-Language Models : Abstract: Vision-Language Models (VLMs) excel at visual perception, yet their ability to interpret cultural meaning in art remains under-validated. We present a tri-tier evaluation framework for cross...
Explaining Generalization of AI-Generated Text Detectors Through Linguistic Analysis : Abstract: AI-text detectors achieve high accuracy on in-domain benchmarks, but often struggle to generalize across different generation conditions such as unseen prompts, model families, or domains. W...
Knowing But Not Doing: Convergent Morality and Divergent Action in LLMs : Abstract: Value alignment is central to the development of safe and socially compatible artificial intelligence. However, how Large Language Models (LLMs) represent and enact human values in real-worl...
A Human-Centric Pipeline for Aligning Large Language Models with Chinese Medical Ethics : Abstract: Recent advances in large language models have enabled their application to a range of healthcare tasks. However, aligning LLMs with the nuanced demands of medical ethics, especially under co...
EmbeddingRWKV: State-Centric Retrieval with Reusable States : Abstract: Current Retrieval-Augmented Generation (RAG) systems typically employ a traditional two-stage pipeline: an embedding model for initial retrieval followed by a reranker for refinement. Howeve...
Accelerated Gradient Methods with Biased Gradient Estimates: Risk Sensitivity, High-Probability Guarantees, and Large Deviation Bounds : Abstract: We study trade-offs between convergence rate and robustness to gradient errors in the context of first-order methods. Our focus is on generalized momentum methods (GMMs)--a broad class that ...
Engineering Spatial and Molecular Features from Cellular Niches to Inform Predictions of Inflammatory Bowel Disease : Abstract: Differentiating between the two main subtypes of Inflammatory Bowel Disease (IBD), Crohn's disease (CD) and ulcerative colitis (UC), is a persistent clinical challenge due to overlapping prese...
Gradient-free online learning of subgrid-scale dynamics with neural emulators : Abstract: In this paper, we propose a generic algorithm to train machine learning-based subgrid parametrizations online, i.e., with a posteriori loss functions, but for non-differentiable num...
ROSS: RObust decentralized Stochastic learning based on Shapley values : Abstract: In the paradigm of decentralized learning, a group of agents collaborate to learn a global model using a distributed dataset without a central server; nevertheless, it is severely challenged...
Efficient and Scalable Implementation of Differentially Private Deep Learning without Shortcuts : Abstract: Differentially private stochastic gradient descent (DP-SGD) is the standard algorithm for training machine learning models under differential privacy (DP). The most common DP-SGD privacy acc...
A New Formulation for Zeroth-Order Optimization of Adversarial EXEmples in Malware Detection : Abstract: Machine learning malware detectors are vulnerable to adversarial EXEmples, i.e., carefully-crafted Windows programs tailored to evade detection. Unlike other adversarial problems, attacks in...
Attacks on fairness in Federated Learning : Abstract: Federated Learning is an important emerging distributed training paradigm that keeps data private on clients. It is now well understood that by controlling only a small subset of FL clients,...
On the use of graph models to achieve individual and group fairness : Abstract: Machine Learning algorithms are ubiquitous in key decision-making contexts such as justice, healthcare and finance, which has spawned a great demand for fairness in these procedures. However...
Kernel Learning for Regression via Quantum Annealing Based Spectral Sampling : Abstract: While quantum annealing (QA) has been developed for combinatorial optimization, practical QA devices operate at finite temperature and under noise, and their outputs can be regarded as stoch...
Multi-Preconditioned LBFGS for Training Finite-Basis PINNs : Abstract: A multi-preconditioned LBFGS (MP-LBFGS) algorithm is introduced for training finite-basis physics-informed neural networks (FBPINNs). The algorithm is motivated by the nonlinear additive Sch...
RMBRec: Robust Multi-Behavior Recommendation towards Target Behaviors : Abstract: Multi-behavior recommendation faces a critical challenge in practice: auxiliary behaviors (e.g., clicks, carts) are often noisy, weakly correlated, or semantically misaligned with the target...
Enabling Population-Based Architectures for Neural Combinatorial Optimization : Abstract: Neural Combinatorial Optimization (NCO) has mostly focused on learning policies, typically neural networks, that operate on a single candidate solution at a time, either by constructing one ...
Além do Desempenho: Um Estudo da Confiabilidade de Detectores de Deepfakes : Abstract: Deepfakes are synthetic media generated by artificial intelligence, with positive applications in education and creativity, but also serious negative impacts such as fraud, misinformation, a...
Safe Language Generation in the Limit : Abstract: Recent results in learning a language in the limit have shown that, although language identification is impossible, language generation is tractable. As this foundational area expands, we ne...
Robust low-rank estimation with multiple binary responses using pairwise AUC loss : Abstract: Multiple binary responses arise in many modern data-analytic problems. Although fitting separate logistic regressions for each response is computationally attractive, it ignores shared struc...
Accelerated Methods with Complexity Separation Under Data Similarity for Federated Learning Problems : Abstract: Heterogeneity within data distribution poses a challenge in many modern federated learning tasks. We formalize it as an optimization problem involving a computationally heavy composite under...
Interpretability and Individuality in Knee MRI: Patient-Specific Radiomic Fingerprint with Reconstructed Healthy Personas : Abstract: For automated assessment of knee MRI scans, both accuracy and interpretability are essential for clinical use and adoption. Traditional radiomics rely on predefined features chosen at the po...
Sample Complexity of Composite Quantum Hypothesis Testing : Abstract: This paper investigates symmetric composite binary quantum hypothesis testing (QHT), where the goal is to determine which of two uncertainty sets contains an unknown quantum state. While asy...
Convergence of gradient flow for learning convolutional neural networks : Abstract: Convolutional neural networks are widely used in imaging and image recognition. Learning such networks from training data leads to the minimization of a non-convex function. This makes the a...
Reducing Compute Waste in LLMs through Kernel-Level DVFS : Abstract: The rapid growth of AI has fueled the expansion of accelerator- or GPU-based data centers. However, the rising operational energy consumption has emerged as a critical bottleneck and a major...
Sampling via Stochastic Interpolants by Langevin-based Velocity and Initialization Estimation in Flow ODEs : Abstract: We propose a novel method for sampling from unnormalized Boltzmann densities based on a probability-flow ordinary differential equation (ODE) derived from linear stochastic interpolants. The...
Supervised Spike Agreement Dependent Plasticity for Fast Local Learning in Spiking Neural Networks : Abstract: Spike-Timing-Dependent Plasticity (STDP) provides a biologically grounded learning rule for spiking neural networks (SNNs), but its reliance on precise spike timing and pairwise updates limi...
STAR: Detecting Inference-time Backdoors in LLM Reasoning via State-Transition Amplification Ratio : Abstract: Recent LLMs increasingly integrate reasoning mechanisms like Chain-of-Thought (CoT). However, this explicit reasoning exposes a new attack surface for inference-time backdoors, which inject ...
GraphFusionSBR: Denoising Multi-Channel Graphs for Session-Based Recommendation : Abstract: Session-based recommendation systems must capture implicit user intents from sessions. However, existing models suffer from issues such as item interaction dominance and noisy sessions. We p...
AUV Trajectory Learning for Underwater Acoustic Energy Transfer and Age Minimization : Abstract: Internet of underwater things (IoUT) is increasingly gathering attention with the aim of monitoring sea life and deep ocean environment, underwater surveillance as well as maintenance of und...
Zero-Shot Distracted Driver Detection via Vision Language Models with Double Decoupling : Abstract: Distracted driving is a major cause of traffic collisions, calling for robust and scalable detection methods. Vision-language models (VLMs) enable strong zero-shot image classification, but ...
Noise-Adaptive Regularization for Robust Multi-Label Remote Sensing Image Classification : Abstract: The development of reliable methods for multi-label classification (MLC) has become a prominent research direction in remote sensing (RS). As the scale of RS data continues to expand, annota...
Silence the Judge: Reinforcement Learning with Self-Verifier via Latent Geometric Clustering : Abstract: Group Relative Policy Optimization (GRPO) significantly enhances the reasoning performance of Large Language Models (LLMs). However, this success heavily relies on expensive external verifie...
MLPlatt: Simple Calibration Framework for Ranking Models : Abstract: Ranking models are extensively used in e-commerce for relevance estimation. These models often suffer from poor interpretability and no scale calibration, particularly when trained with typi...
Disentangling History and Propagation Dependencies in Cross-Subject Knee Contact Stress Prediction Using a Shared MeshGraphNet Backbone : Abstract: Background: Subject-specific finite element analysis accurately characterizes knee joint mechanics but is computationally expensive. Deep surrogate models provide a rapid alternative, yet the...
AgriLens: Semantic Retrieval in Agricultural Texts Using Topic Modeling and Language Models : Abstract: As the volume of unstructured text continues to grow across domains, there is an urgent need for scalable methods that enable interpretable organization, summarization, and retrieval of info...
One-Shot Identification with Different Neural Network Approaches : Abstract: Convolutional neural networks (CNNs) have been widely used in the computer vision community, significantly improving the state-of-the-art. But learning good features often is computationally...
Structural Dimension Reduction in Bayesian Networks : Abstract: This work introduces a novel technique, named structural dimension reduction, to collapse a Bayesian network onto a minimum and localized one while ensuring that probabilistic inferences bet...
Towards Principled Design of Mixture-of-Experts Language Models under Memory and Inference Constraints : Abstract: Modern Mixture-of-Experts (MoE) language models are designed based on total parameters (memory footprint) and active parameters (inference cost). However, we find these two factors alone are...
FUME: Fused Unified Multi-Gas Emission Network for Livestock Rumen Acidosis Detection : Abstract: Ruminal acidosis is a prevalent metabolic disorder in dairy cattle causing significant economic losses and animal welfare concerns. Current diagnostic methods rely on invasive pH measurement...
Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs : Abstract: Recently, self-play fine-tuning (SPIN) has been proposed to adapt large language models to downstream applications with scarce expert-annotated data, by iteratively generating synthetic resp...
Wasserstein-p Central Limit Theorem Rates: From Local Dependence to Markov Chains : Abstract: Finite-time central limit theorem (CLT) rates play a central role in modern machine learning (ML). In this paper, we study CLT rates for multivariate dependent data in Wasserstein-$p$ ($\mat...
Relational Knowledge Distillation Using Fine-tuned Function Vectors : Abstract: Representing relations between concepts is a core prerequisite for intelligent systems to make sense of the world. Recent work using causal mediation analysis has shown that a small set of a...
Hierarchical Online-Scheduling for Energy-Efficient Split Inference with Progressive Transmission : Abstract: Device-edge collaborative inference with Deep Neural Networks (DNNs) faces fundamental trade-offs among accuracy, latency and energy consumption. Current scheduling exhibits two drawbacks: a...
Towards A Unified PAC-Bayesian Framework for Norm-based Generalization Bounds : Abstract: Understanding the generalization behavior of deep neural networks remains a fundamental challenge in modern statistical learning theory. Among existing approaches, PAC-Bayesian norm-based bo...
AdaJudge: Adaptive Multi-Perspective Judging for Reward Modeling : Abstract: Reward modeling is essential for aligning large language models with human preferences, yet predominant architectures rely on a static pooling strategy to condense sequences into scalar scor...
Operator learning for models of tear film breakup : Abstract: Tear film (TF) breakup is a key driver of understanding dry eye disease, yet estimating TF thickness and osmolarity from fluorescence (FL) imaging typically requires solving computationally ...
A Statistical Assessment of Amortized Inference Under Signal-to-Noise Variation and Distribution Shift : Abstract: Since the turn of the century, approximate Bayesian inference has steadily evolved as new computational techniques have been incorporated to handle increasingly complex and large-scale predi...
Enhancing Portfolio Optimization with Deep Learning Insights : Abstract: Our work focuses on deep learning (DL) portfolio optimization, tackling challenges in long-only, multi-asset strategies across market cycles. We propose training models with limited regime d...
A Sensing Dataset Protocol for Benchmarking and Multi-Task Wireless Sensing : Abstract: Wireless sensing has become a fundamental enabler for intelligent environments, supporting applications such as human detection, activity recognition, localization, and vital sign monitoring...
Fast and explainable clustering in the Manhattan and Tanimoto distance : Abstract: The CLASSIX algorithm is a fast and explainable approach to data clustering. In its original form, this algorithm exploits the sorting of the data points by their first principal component t...
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs : Abstract: Reinforcement learning (RL) has become a central paradigm for post-training large language models (LLMs), particularly for complex reasoning tasks, yet it often suffers from exploration coll...
Adaptive Requesting in Decentralized Edge Networks via Non-Stationary Bandits : Abstract: We study a decentralized collaborative requesting problem that aims to optimize the information freshness of time-sensitive clients in edge networks consisting of multiple clients, access no...
A Novel Approach to Explainable AI with Quantized Active Ingredients in Decision Making : Abstract: Artificial Intelligence (AI) systems have shown good success at classifying. However, the lack of explainability is a true and significant challenge, especially in high-stakes domains, such ...
Model-Agnostic Solutions for Deep Reinforcement Learning in Non-Ergodic Contexts : Abstract: Reinforcement Learning (RL) remains a central optimisation framework in machine learning. Although RL agents can converge to optimal solutions, the definition of "optimality" depends on th...
Soft Partition-based KAPI-ELM for Multi-Scale PDEs : Abstract: Physics-informed machine learning holds great promise for solving differential equations, yet existing methods struggle with highly oscillatory, multiscale, or singularly perturbed PDEs due ...
Provably Safe Reinforcement Learning using Entropy Regularizer : Abstract: We consider the problem of learning the optimal policy for Markov decision processes with safety constraints. We formulate the problem in a reach-avoid setup. Our goal is to design online re...
EviNAM: Intelligibility and Uncertainty via Evidential Neural Additive Models : Abstract: Intelligibility and accurate uncertainty estimation are crucial for reliable decision-making. In this paper, we propose EviNAM, an extension of evidential learning that integrates the interp...
Your Group-Relative Advantage Is Biased : Abstract: Reinforcement Learning from Verifier Rewards (RLVR) has emerged as a widely used approach for post-training large language models on reasoning tasks, with group-based methods such as GRPO an...
DiffMM: Efficient Method for Accurate Noisy and Sparse Trajectory Map Matching via One Step Diffusion : Abstract: Map matching for sparse trajectories is a fundamental problem for many trajectory-based applications, e.g., traffic scheduling and traffic flow analysis. Existing methods for map matching ar...
Coverage Improvement and Fast Convergence of On-policy Preference Learning : Abstract: Online on-policy preference learning algorithms for language model alignment such as online direct preference optimization (DPO) can significantly outperform their offline counterparts. We provi...
Out-of-distribution generalization of deep-learning surrogates for 2D PDE-generated dynamics in the small-data regime : Abstract: Partial differential equations (PDEs) are a central tool for modeling the dynamics of physical, engineering, and materials systems, but high-fidelity simulations are often computationally ex...
Decodable but not structured: linear probing enables Underwater Acoustic Target Recognition with pretrained audio embeddings : Abstract: Increasing levels of anthropogenic noise from ships contribute significantly to underwater sound pollution, posing risks to marine ecosystems. This makes monitoring crucial to understand and...
Automated Machine Learning in Radiomics: A Comparative Evaluation of Performance, Efficiency and Accessibility : Abstract: Automated machine learning (AutoML) frameworks can lower technical barriers for predictive and prognostic model development in radiomics by enabling researchers without programming expertise...
Deep Exploration of Epoch-wise Double Descent in Noisy Data: Signal Separation, Large Activation, and Benign Overfitting : Abstract: Deep double descent is one of the key phenomena underlying the generalization capability of deep learning models. In this study, epoch-wise double descent, which is delayed generalization fo...
A Usable GAN-Based Tool for Synthetic ECG Generation in Cardiac Amyloidosis Research : Abstract: Cardiac amyloidosis (CA) is a rare and underdiagnosed infiltrative cardiomyopathy, and available datasets for machine-learning models are typically small, imbalanced and heterogeneous. This ...
LDLT L-Lipschitz Network Weight Parameterization Initialization : Abstract: We analyze initialization dynamics for LDLT-based $\mathcal{L}$-Lipschitz layers by deriving the exact marginal output variance when the underlying parameter matrix $W_0\in \mathbb{R}^{m\tim...
Incorporating Cognitive Biases into Reinforcement Learning for Financial Decision-Making : Abstract: Financial markets are influenced by human behavior that deviates from rationality due to cognitive biases. Traditional reinforcement learning (RL) models for financial decision-making assume...
A Preliminary Agentic Framework for Matrix Deflation : Abstract: Can a small team of agents peel a matrix apart, one rank-1 slice at a time? We propose an agentic approach to matrix deflation in which a solver Large Language Model (LLM) generates rank-1 S...
One-Shot Federated Ridge Regression: Exact Recovery via Sufficient Statistic Aggregation : Abstract: Federated learning protocols require repeated synchronization between clients and a central server, with convergence rates depending on learning rates, data heterogeneity, and client samplin...
Scalable Multiagent Reinforcement Learning with Collective Influence Estimation : Abstract: Multiagent reinforcement learning (MARL) has attracted considerable attention due to its potential in addressing complex cooperative tasks. However, existing MARL approaches often rely on fr...
TabPFN Through The Looking Glass: An interpretability study of TabPFN and its internal representations : Abstract: Tabular foundational models are pre-trained models designed for a wide range of tabular data tasks. They have shown strong performance across domains, yet their internal representations and ...
VBO-MI: A Fully Gradient-Based Bayesian Optimization Framework Using Variational Mutual Information Estimation : Abstract: Many real-world tasks require optimizing expensive black-box functions accessible only through noisy evaluations, a setting commonly addressed with Bayesian optimization (BO). While Bayesian...
Reverse Flow Matching: A Unified Framework for Online Reinforcement Learning with Diffusion and Flow Policies : Abstract: Diffusion and flow policies are gaining prominence in online reinforcement learning (RL) due to their expressive power, yet training them efficiently remains a critical challenge. A fundamen...
Generalization Analysis and Method for Domain Generalization for a Family of Recurrent Neural Networks : Abstract: Deep learning (DL) has driven broad advances across scientific and engineering domains. Despite its success, DL models often exhibit limited interpretability and generalization, which can un...
Intra-tree Column Subsampling Hinders XGBoost Learning of Ratio-like Interactions : Abstract: Many applied problems contain signal that becomes clear only after combining multiple raw measurements. Ratios and rates are common examples. In gradient boosted trees, this combination is n...
Structure Detection for Contextual Reinforcement Learning : Abstract: Contextual Reinforcement Learning (CRL) tackles the problem of solving a set of related Contextual Markov Decision Processes (CMDPs) that vary across different context variables. Traditional...
Learning a Stochastic Differential Equation Model of Tropical Cyclone Intensification from Reanalysis and Observational Data : Abstract: Tropical cyclones are dangerous natural hazards, but their hazard is challenging to quantify directly from historical datasets due to limited dataset size and quality. Models of cyclone inte...
LUT-Compiled Kolmogorov-Arnold Networks for Lightweight DoS Detection on IoT Edge Devices : Abstract: Denial-of-Service (DoS) attacks pose a critical threat to Internet of Things (IoT) ecosystems, yet deploying effective intrusion detection on resource-constrained edge devices remains challe...
Riemannian Zeroth-Order Gradient Estimation with Structure-Preserving Metrics for Geodesically Incomplete Manifolds : Abstract: In this paper, we study Riemannian zeroth-order optimization in settings where the underlying Riemannian metric $g$ is geodesically incomplete, and the goal is to approximate stationary poin...
InfGraND: An Influence-Guided GNN-to-MLP Knowledge Distillation : Abstract: Graph Neural Networks (GNNs) are the go-to model for graph data analysis. However, GNNs rely on two key operations - aggregation and update, which can pose challenges for low-latency inferen...
Beyond the Next Port: A Multi-Task Transformer for Forecasting Future Voyage Segment Durations : Abstract: Accurate forecasts of segment-level sailing durations are fundamental to enhancing maritime schedule reliability and optimizing long-term port operations. However, conventional estimated tim...
DataScribe: An AI-Native, Policy-Aligned Web Platform for Multi-Objective Materials Design and Discovery : Abstract: The acceleration of materials discovery requires digital platforms that go beyond data repositories to embed learning, optimization, and decision-making directly into research workflows. We ...
Transformer-Based Approach for Automated Functional Group Replacement in Chemical Compounds : Abstract: Functional group replacement is a pivotal approach in cheminformatics to enable the design of novel chemical compounds with tailored properties. Traditional methods for functional group remo...
Max-Min Neural Network Operators For Approximation of Multivariate Functions : Abstract: In this paper, we develop a multivariate framework for approximation by max-min neural network operators. Building on the recent advances in approximation theory by neural network operators,...
HOSC: A Periodic Activation with Saturation Control for High-Fidelity Implicit Neural Representations : Abstract: Periodic activations such as sine preserve high-frequency information in implicit neural representations (INRs) through their oscillatory structure, but often suffer from gradient instabilit...
MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences : Abstract: While autonomous software engineering (SWE) agents are reshaping programming paradigms, they currently suffer from a "closed-world" limitation: they attempt to fix bugs from scratch or solel...
GenAITEd Ghana: A First-of-Its-Kind Context-Aware and Curriculum-Aligned Conversational AI Agent for Teacher Education : Abstract: Global frameworks increasingly advocate for Responsible Artificial Intelligence (AI) in education, yet they provide limited guidance on how ethical, culturally responsive, and curriculum-ali...
Data Work in Egypt: Who Are the Workers Behind Artificial Intelligence? : Abstract: The report highlights the role of Egyptian data workers in the global value chains of Artificial Intelligence (AI). These workers generate and annotate data for machine learning, check outpu...
KaLM: Knowledge-aligned Autoregressive Language Modeling via Dual-view Knowledge Graph Contrastive Learning : Abstract: Autoregressive large language models (LLMs) pre-trained by next token prediction are inherently proficient in generative tasks. However, their performance on knowledge-driven tasks such as f...
CausAdv: A Causal-based Framework for Detecting Adversarial Examples : Abstract: Deep learning has led to tremendous success in computer vision, largely due to Convolutional Neural Networks (CNNs). However, CNNs have been shown to be vulnerable to crafted adversarial per...
Beyond Backpropagation: Optimization with Multi-Tangent Forward Gradients : Abstract: The gradients used to train neural networks are typically computed using backpropagation. While an efficient way to obtain exact gradients, backpropagation is computationally expensive, hind...
Explainable Molecular Property Prediction: Aligning Chemical Concepts with Predictions via Language Models : Abstract: Providing explainable molecular property predictions is critical for many scientific domains, such as drug discovery and material science. Though transformer-based language models have shown...
Feed-Forward Optimization With Delayed Feedback for Neural Network Training : Abstract: Backpropagation has long been criticized for being biologically implausible due to its reliance on concepts that are not viable in natural learning processes. Two core issues are the weight ...
Cross-Domain Imitation Learning via Optimal Transport : Abstract: Cross-domain imitation learning studies how to leverage expert demonstrations of one agent to train an imitation agent with a different embodiment or morphology. Comparing trajectories and s...
SafePro: Evaluating the Safety of Professional-Level AI Agents : Abstract: Large language model-based agents are rapidly evolving from simple conversational assistants into autonomous systems capable of performing complex, professional-level tasks in various domain...
DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword Extraction : Abstract: Natural Language to SQL (NL2SQL) provides a new model-centric paradigm that simplifies database access for non-technical users by converting natural language queries into SQL commands. Recen...
Explaining with trees: interpreting CNNs using hierarchies : Abstract: Challenges persist in providing interpretable explanations for neural network reasoning in explainable AI (xAI). Existing methods like Integrated Gradients produce noisy maps, and LIME, whil...
Generative Semantic Communication: Diffusion Models Beyond Bit Recovery : Abstract: Semantic communication is expected to be one of the cores of next-generation AI-based communications. One of the possibilities offered by semantic communication is the capability to regenera...
Modeling LLM Agent Reviewer Dynamics in Elo-Ranked Review System : Abstract: In this work, we explore the Large Language Model (LLM) agent reviewer dynamics in an Elo-ranked review system using real-world conference paper submissions. Multiple LLM agent reviewers wit...
Motion Attribution for Video Generation : Abstract: Despite the rapid progress of video generation models, the role of data in influencing motion is poorly understood. We present Motive (MOTIon attribution for Video gEneration), a motion-cent...
MemRec: Collaborative Memory-Augmented Agentic Recommender System : Abstract: The evolution of recommender systems has shifted preference storage from rating matrices and dense embeddings to semantic memory in the agentic era. Yet existing agents rely on isolated memo...
Reasoning Matters for 3D Visual Grounding : Abstract: The recent development of Large Language Models (LLMs) with strong reasoning ability has driven research in various domains such as mathematics, coding, and scientific discovery. Meanwhile, ...
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge : Abstract: Large language models often solve complex reasoning tasks more effectively with Chain-of-Thought (CoT), but at the cost of long, low-bandwidth token sequences. Humans, by contrast, often rea...
S3-CLIP: Video Super Resolution for Person-ReID : Abstract: Tracklet quality is often treated as an afterthought in most person re-identification (ReID) methods, with the majority of research presenting architectural modifications to foundational mod...
APEX-SWE : Abstract: We introduce the AI Productivity Index for Software Engineering (APEX-SWE), a benchmark for assessing whether frontier AI models can execute economically valuable software engineering work. ...
Asymptotic Universal Alignment: A New Alignment Framework via Test-Time Scaling : Abstract: Aligning large language models (LLMs) to serve users with heterogeneous and potentially conflicting preferences is a central challenge for personalized and trustworthy AI. We formalize an id...
Translating Light-Sheet Microscopy Images to Virtual H&E Using CycleGAN : Abstract: Histopathology analysis relies on Hematoxylin and Eosin (H&E) staining, but fluorescence microscopy offers complementary information. Converting fluorescence images to H&E-like appearance ca...
Reliable Graph-RAG for Codebases: AST-Derived Graphs vs LLM-Extracted Knowledge Graphs : Abstract: Retrieval-Augmented Generation for software engineering often relies on vector similarity search, which captures topical similarity but can fail on multi-hop architectural reasoning such as ...
Grid-Aware Charging and Operational Optimization for Mixed-Fleet Public Transit : Abstract: The rapid growth of urban populations and the increasing need for sustainable transportation solutions have prompted a shift towards electric buses in public transit systems. However, the ef...
UR-Bench: A Benchmark for Multi-Hop Reasoning over Ultra-High-Resolution Images : Abstract: Recent multimodal large language models (MLLMs) show strong capabilities in visual-language reasoning, yet their performance on ultra-high-resolution imagery remains largely unexplored. Exis...
To Retrieve or To Think? An Agentic Approach for Context Evolution : Abstract: Current context augmentation methods, such as retrieval-augmented generation, are essential for solving knowledge-intensive reasoning tasks. However, they typically adhere to a rigid, brute-f...
TableCache: Primary Foreign Key Guided KV Cache Precomputation for Low Latency Text-to-SQL : Abstract: In Text-to-SQL tasks, existing LLM-based methods often include extensive database schemas in prompts, leading to long context lengths and increased prefilling latency. While user queries typ...
TerraFormer: Automated Infrastructure-as-Code with LLMs Fine-Tuned via Policy-Guided Verifier Feedback : Abstract: Automating Infrastructure-as-Code (IaC) is challenging, and large language models (LLMs) often produce incorrect configurations from natural language (NL). We present TerraFormer, a neuro-sy...
ISLA: A U-Net for MRI-based acute ischemic stroke lesion segmentation with deep supervision, attention, domain adaptation, and ensemble learning : Abstract: Accurate delineation of acute ischemic stroke lesions in MRI is a key component of stroke diagnosis and management. In recent years, deep learning models have been successfully applied to th...
Real-Time Localization Framework for Autonomous Basketball Robots : Abstract: Localization is a fundamental capability for autonomous robots, enabling them to operate effectively in dynamic environments. In Robocon 2025, accurate and reliable localization is crucial f...
Auditing Student-AI Collaboration: A Case Study of Online Graduate CS Students : Abstract: As generative AI becomes embedded in higher education, it increasingly shapes how students complete academic tasks. While these systems offer efficiency and support, concerns persist regardi...
Region of interest detection for efficient aortic segmentation : Abstract: Thoracic aortic dissection and aneurysms are the most lethal diseases of the aorta. The major hindrance to treatment lies in the accurate analysis of the medical images. More particularly, a...
Lessons from the Field: An Adaptable Lifecycle Approach to Applied Dialogue Summarization : Abstract: Summarization of multi-party dialogues is a critical capability in industry, enhancing knowledge transfer and operational effectiveness across many domains. However, automatically generating...
TRACE: Reconstruction-Based Anomaly Detection in Ensemble and Time-Dependent Simulations : Abstract: Detecting anomalies in high-dimensional, time-dependent simulation data is challenging due to complex spatial and temporal dynamics. We study reconstruction-based anomaly detection for ensem...
RULERS: Locked Rubrics and Evidence-Anchored Scoring for Robust LLM Evaluation : Abstract: The LLM-as-a-Judge paradigm promises scalable rubric-based evaluation, yet aligning frozen black-box models with human standards remains a challenge due to inherent generation stochasticity....
Moral Lenses, Political Coordinates: Towards Ideological Positioning of Morally Conditioned LLMs : Abstract: While recent research has systematically documented political orientation in large language models (LLMs), existing evaluations rely primarily on direct probing or demographic persona engine...
M$^2$FMoE: Multi-Resolution Multi-View Frequency Mixture-of-Experts for Extreme-Adaptive Time Series Forecasting : Abstract: Forecasting time series with extreme events is critical yet challenging due to their high variance, irregular dynamics, and sparse but high-impact nature. While existing methods excel in mod...
SafeRedir: Prompt Embedding Redirection for Robust Unlearning in Image Generation Models : Abstract: Image generation models (IGMs), while capable of producing impressive and creative content, often memorize a wide range of undesirable concepts from their training data, leading to the repro...
VeriTaS: The First Dynamic Benchmark for Multimodal Automated Fact-Checking : Abstract: The growing scale of online misinformation urgently demands Automated Fact-Checking (AFC). Existing benchmarks for evaluating AFC systems, however, are largely limited in terms of task scope...
ExpSeek: Self-Triggered Experience Seeking for Web Agents : Abstract: Experience intervention in web agents emerges as a promising technical paradigm, enhancing agent interaction capabilities by providing valuable insights from accumulated experiences. However...
WaveFormer: Frequency-Time Decoupled Vision Modeling with Wave Equation : Abstract: Vision modeling has advanced rapidly with Transformers, whose attention mechanisms capture visual dependencies but lack a principled account of how semantic information propagates spatially....
Rewriting Video: Text-Driven Reauthoring of Video Footage : Abstract: Video is a powerful medium for communication and storytelling, yet reauthoring existing footage remains challenging. Even simple edits often demand expertise, time, and careful planning, con...
VideoHEDGE: Entropy-Based Hallucination Detection for Video-VLMs via Semantic Clustering and Spatiotemporal Perturbations : Abstract: Hallucinations in video-capable vision-language models (Video-VLMs) remain frequent and high-confidence, while existing uncertainty metrics often fail to align with correctness. We introduce...
Contrastive and Multi-Task Learning on Noisy Brain Signals with Nonlinear Dynamical Signatures : Abstract: We introduce a two-stage multitask learning framework for analyzing Electroencephalography (EEG) signals that integrates denoising, dynamical modeling, and representation learning. In the fi...
CD^2: Constrained Dataset Distillation for Few-Shot Class-Incremental Learning : Abstract: Few-shot class-incremental learning (FSCIL) receives significant attention from the public to perform classification continuously with a few training samples, which suffers from the key cata...
STAGE: A Benchmark for Knowledge Graph Construction, Question Answering, and In-Script Role-Playing over Movie Screenplays : Abstract: Movie screenplays are rich long-form narratives that interleave complex character relationships, temporally ordered events, and dialogue-driven interactions. While prior benchmarks target in...
Temporal Fusion Nexus: A task-agnostic multi-modal embedding model for clinical narratives and irregular time series in post-kidney transplant care : Abstract: We introduce Temporal Fusion Nexus (TFN), a multi-modal and task-agnostic embedding model to integrate irregular time series and unstructured clinical narratives. We analysed TFN in post-kid...
EfficientFSL: Enhancing Few-Shot Classification via Query-Only Tuning in Vision Transformers : Abstract: Large models such as Vision Transformers (ViTs) have demonstrated remarkable superiority over smaller architectures like ResNet in few-shot classification, owing to their powerful representa...
PKI: Prior Knowledge-Infused Neural Network for Few-Shot Class-Incremental Learning : Abstract: Few-shot class-incremental learning (FSCIL) aims to continually adapt a model on a limited number of new-class examples, facing two well-known challenges: catastrophic forgetting and overfit...
BenchOverflow: Measuring Overflow in Large Language Models via Plain-Text Prompts : Abstract: We investigate a failure mode of large language models (LLMs) in which plain-text prompts elicit excessive outputs, a phenomenon we term Overflow. Unlike jailbreaks or prompt injection, Over...
sui-1: Grounded and Verifiable Long-Form Summarization : Abstract: Large language models frequently generate plausible but unfaithful summaries that users cannot verify against source text, a critical limitation in compliance-sensitive domains such as gover...
JudgeRLVR: Judge First, Generate Second for Efficient Reasoning : Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has become a standard paradigm for reasoning in Large Language Models. However, optimizing solely for final-answer correctness often dri...
CoMa: Contextual Massing Generation with Vision-Language Models : Abstract: The conceptual design phase in architecture and urban planning, particularly building massing, is complex and heavily reliant on designer intuition and manual effort. To address this, we pro...
A Formal Proof of a Continued Fraction Conjecture for $\pi$ Originating from the Ramanujan Machine : Abstract: We provide a formal analytic proof for a class of non-canonical polynomial continued fractions representing π/4, originally conjectured by the Ramanujan Machine using algorithmic induction [...
Decoding Order Matters in Autoregressive Speech Synthesis : Abstract: Autoregressive speech synthesis often adopts a left-to-right order, yet generation order is a modelling choice. We investigate decoding order through a masked diffusion framework, which progre...
Divide and Conquer: Static-Dynamic Collaboration for Few-Shot Class-Incremental Learning : Abstract: Few-shot class-incremental learning (FSCIL) aims to continuously recognize novel classes under limited data, which suffers from the key stability-plasticity dilemma: balancing the retention ...
Large Multimodal Models for Embodied Intelligent Driving: The Next Frontier in Self-Driving? : Abstract: The advent of Large Multimodal Models (LMMs) offers a promising technology to tackle the limitations of modular design in autonomous driving, which often falters in open-world scenarios requ...
Taxon: Hierarchical Tax Code Prediction with Semantically Aligned LLM Expert Guidance : Abstract: Tax code prediction is a crucial yet underexplored task in automating invoicing and compliance management for large-scale e-commerce platforms. Each product must be accurately mapped to a no...
Regulatory gray areas of LLM Terms : Abstract: Large Language Models (LLMs) are increasingly integrated into academic research pipelines; however, the Terms of Service governing their use remain under-examined. We present a comparative a...
PATS: Personality-Aware Teaching Strategies with Large Language Model Tutors : Abstract: Recent advances in large language models (LLMs) demonstrate their potential as educational tutors. However, different tutoring strategies benefit different student personalities, and mismatc...
An Explainable Two Stage Deep Learning Framework for Pericoronitis Assessment in Panoramic Radiographs Using YOLOv8 and ResNet-50 : Abstract: Objectives: To overcome challenges in diagnosing pericoronitis on panoramic radiographs, an AI-assisted assessment system integrating anatomical localization, pathological classification, an...
Controlled LLM Training on Spectral Sphere : Abstract: Scaling large models requires optimization strategies that ensure rapid convergence grounded in stability. Maximal Update Parametrization ($\boldsymbol{\mu}$P) provides a theoretical safeguard f...
Training-Free Distribution Adaptation for Diffusion Models via Maximum Mean Discrepancy Guidance : Abstract: Pre-trained diffusion models have emerged as powerful generative priors for both unconditional and conditional sample generation, yet their outputs often deviate from the characteristics of ...
Geo-NVS-w: Geometry-Aware Novel View Synthesis In-the-Wild with an SDF Renderer : Abstract: We introduce Geo-NVS-w, a geometry-aware framework for high-fidelity novel view synthesis from unstructured, in-the-wild image collections. While existing in-the-wild methods already excel a...
Scalable Sequential Recommendation under Latency and Memory Constraints : Abstract: Sequential recommender systems must model long-range user behavior while operating under strict memory and latency constraints. Transformer-based approaches achieve strong accuracy but suffe...
IGAN: A New Inception-based Model for Stable and High-Fidelity Image Synthesis Using Generative Adversarial Networks : Abstract: Generative Adversarial Networks (GANs) face a significant challenge of striking an optimal balance between high-quality image generation and training stability. Recent techniques, such as DC...
Safe Heterogeneous Multi-Agent RL with Communication Regularization for Coordinated Target Acquisition : Abstract: This paper introduces a decentralized multi-agent reinforcement learning framework enabling structurally heterogeneous teams of agents to jointly discover and acquire randomly located target...
Enhancing Image Quality Assessment Ability of LMMs via Retrieval-Augmented Generation : Abstract: Large Multimodal Models (LMMs) have recently shown remarkable promise in low-level visual perception tasks, particularly in Image Quality Assessment (IQA), demonstrating strong zero-shot cap...
ORBIT: On-policy Exploration-Exploitation for Controllable Multi-Budget Reasoning : Abstract: Recent Large Reasoning Models (LRMs) achieve strong performance by leveraging long-form Chain-of-Thought (CoT) reasoning, but uniformly applying overlong reasoning at inference time incurs s...
Enhancing Sentiment Classification and Irony Detection in Large Language Models through Advanced Prompt Engineering Techniques : Abstract: This study investigates the use of prompt engineering to enhance large language models (LLMs), specifically GPT-4o-mini and gemini-1.5-flash, in sentiment analysis tasks. It evaluates advanc...
Demystifying the Slash Pattern in Attention: The Role of RoPE : Abstract: Large Language Models (LLMs) often exhibit slash attention patterns, where attention scores concentrate along the $Δ$-th sub-diagonal for some offset $Δ$. These patterns play a key role in p...
HIPPO: Accelerating Video Large Language Models Inference via Holistic-aware Parallel Speculative Decoding : Abstract: Speculative decoding (SD) has emerged as a promising approach to accelerate LLM inference without sacrificing output quality. Existing SD methods tailored for video-LLMs primarily focus on p...
On Evaluation of Unsupervised Feature Selection for Pattern Classification : Abstract: Unsupervised feature selection aims to identify a compact subset of features that captures the intrinsic structure of data without supervised labels. Most existing studies evaluate the perfor...
Hyperbolic Heterogeneous Graph Transformer : Abstract: In heterogeneous graphs, we can observe complex structures such as tree-like or hierarchical structures. Recently, the hyperbolic space has been widely adopted in many studies to effectively...
GADPN: Graph Adaptive Denoising and Perturbation Networks via Singular Value Decomposition : Abstract: While Graph Neural Networks (GNNs) excel on graph-structured data, their performance is fundamentally limited by the quality of the observed graph, which often contains noise, missing links,...
Knowledge-based learning in Text-RAG and Image-RAG : Abstract: This research analyzed and compared the multi-modal approach in the Vision Transformer (EVA-ViT) based image encoder with the LLaMA or ChatGPT LLM to reduce the hallucination problem and dete...
DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection : Abstract: The rapid growth of large language models raises pressing concerns about intellectual property protection under black-box deployment. Existing backdoor-based fingerprints either rely on rare...
Evaluating Implicit Regulatory Compliance in LLM Tool Invocation via Logic-Guided Synthesis : Abstract: The integration of large language models (LLMs) into autonomous agents has enabled complex tool use, yet in high-stakes domains, these systems must strictly adhere to regulatory standards be...
ForgetMark: Stealthy Fingerprint Embedding via Targeted Unlearning in Language Models : Abstract: Existing invasive (backdoor) fingerprints suffer from high-perplexity triggers that are easily filtered, fixed response patterns exposed by heuristic detectors, and spurious activations on b...
Autonomous Materials Exploration by Integrating Automated Phase Identification and AI-Assisted Human Reasoning : Abstract: Autonomous experimentation holds the potential to accelerate materials development by combining artificial intelligence (AI) with modular robotic platforms to explore extensive combinatorial...
GI-Bench: A Panoramic Benchmark Revealing the Knowledge-Experience Dissociation of Multimodal Large Language Models in Gastrointestinal Endoscopy Against Clinical Standards : Abstract: Multimodal Large Language Models (MLLMs) show promise in gastroenterology, yet their performance against comprehensive clinical workflows and human benchmarks remains unverified. To systemat...
Instruction-Driven 3D Facial Expression Generation and Transition : Abstract: A 3D avatar typically has one of six cardinal facial expressions. To simulate realistic emotional variation, we should be able to render a facial transition between two arbitrary expressions...
Prompt-Based Clarity Evaluation and Topic Detection in Political Question Answering : Abstract: Automatic evaluation of large language model (LLM) responses requires not only factual correctness but also clarity, particularly in political question-answering. While recent datasets provi...
SwiftMem: Fast Agentic Memory via Query-aware Indexing : Abstract: Agentic memory systems have become critical for enabling LLM agents to maintain long-term context and retrieve relevant information efficiently. However, existing memory frameworks suffer fr...
Dynamic Graph Structure Learning via Resistance Curvature Flow : Abstract: Geometric Representation Learning (GRL) aims to approximate the non-Euclidean topology of high-dimensional data through discrete graph structures, grounded in the manifold hypothesis. Howeve...
Enriching Semantic Profiles into Knowledge Graph for Recommender Systems Using Large Language Models : Abstract: Rich and informative profiling to capture user preferences is essential for improving recommendation quality. However, there is still no consensus on how best to construct and utilize such p...
Mechanisms are Transferable: Data-Efficient Low-Resource Adaptation via Circuit-Targeted Supervised Fine-Tuning : Abstract: Adapting LLMs to low-resource languages is difficult: labeled data is scarce, full-model fine-tuning is unstable, and continued cross-lingual tuning can cause catastrophic forgetting. We pro...
Qalb: Largest State-of-the-Art Urdu Large Language Model for 230M Speakers with Systematic Continued Pre-training : Abstract: Despite remarkable progress in large language models, Urdu, a language spoken by over 230 million people, remains critically underrepresented in modern NLP systems. Existing multilingual model...
Subspace Alignment for Vision-Language Model Test-time Adaptation : Abstract: Vision-language models (VLMs), despite their extraordinary zero-shot capabilities, are vulnerable to distribution shifts. Test-time adaptation (TTA) emerges as a predominant strategy to adap...
How Do Optical Flow and Textual Prompts Collaborate to Assist in Audio-Visual Semantic Segmentation? : Abstract: Audio-visual semantic segmentation (AVSS) represents an extension of the audio-visual segmentation (AVS) task, necessitating a semantic understanding of audio-visual scenes beyond merely ide...
PathoGen: Diffusion-Based Synthesis of Realistic Lesions in Histopathology Images : Abstract: The development of robust artificial intelligence models for histopathology diagnosis is severely constrained by the scarcity of expert-annotated lesion data, particularly for rare pathologi...
CSQL: Mapping Documents into Causal Databases : Abstract: We describe a novel system, CSQL, which automatically converts a collection of unstructured text documents into an SQL-queryable causal database (CDB). A CDB differs from a traditional DB: i...
Debiasing Large Language Models via Adaptive Causal Prompting with Sketch-of-Thought : Abstract: Despite notable advancements in prompting methods for Large Language Models (LLMs), such as Chain-of-Thought (CoT), existing strategies still suffer from excessive token usage and limited ge...
STO-RL: Offline RL under Sparse Rewards via LLM-Guided Subgoal Temporal Order : Abstract: Offline reinforcement learning (RL) enables policy learning from pre-collected datasets, avoiding costly and risky online interactions, but it often struggles with long-horizon tasks involvi...
High-Fidelity Modeling of Stochastic Chemical Dynamics on Complex Manifolds: A Multi-Scale SIREN-PINN Framework for the Curvature-Perturbed Ginzburg-Landau Equation : Abstract: The accurate identification and control of spatiotemporal chaos in reaction-diffusion systems remains a grand challenge in chemical engineering, particularly when the underlying catalytic su...
Local-Global Feature Fusion for Subject-Independent EEG Emotion Recognition : Abstract: Subject-independent EEG emotion recognition is challenged by pronounced inter-subject variability and the difficulty of learning robust representations from short, noisy recordings. To addre...
Q-realign: Piggybacking Realignment on Quantization for Safe and Efficient LLM Deployment : Abstract: Public large language models (LLMs) are typically safety-aligned during pretraining, yet task-specific fine-tuning required for deployment often erodes this alignment and introduces safety r...
Reasoning Beyond Chain-of-Thought: A Latent Computational Mode in Large Language Models : Abstract: Chain-of-Thought (CoT) prompting has improved the reasoning performance of large language models (LLMs), but it remains unclear why it works and whether it is the unique mechanism for trigge...
The Role of Noisy Data in Improving CNN Robustness for Image Classification : Abstract: Data quality plays a central role in the performance and robustness of convolutional neural networks (CNNs) for image classification. While high-quality data is often preferred for training,...
FigEx2: Visual-Conditioned Panel Detection and Captioning for Scientific Compound Figures : Abstract: Scientific compound figures combine multiple labeled panels into a single image, but captions in real pipelines are often missing or only provide figure-level summaries, making panel-level u...
Representations of Text and Images Align From Layer One : Abstract: We show that for a variety of concepts in adapter-based vision-language models, the representations of their images and their text descriptions are meaningfully aligned from the very first l...
TP-Blend: Textual-Prompt Attention Pairing for Precise Object-Style Blending in Diffusion Models : Abstract: Current text-conditioned diffusion editors handle single object replacement well but struggle when a new object and a new style must be introduced simultaneously. We present Twin-Prompt Atte...
LLM Review: Enhancing Creative Writing via Blind Peer Review Feedback : Abstract: Large Language Models (LLMs) often struggle with creative generation, and multi-agent frameworks that improve reasoning through interaction can paradoxically hinder creativity by inducing co...
DYCP: Dynamic Context Pruning for Long-Form Dialogue with LLMs : Abstract: Large Language Models (LLMs) often exhibit increased response latency and degraded answer quality as dialogue length grows, making effective context management essential. However, existing m...
From Word Sequences to Behavioral Sequences: Adapting Modeling and Evaluation Paradigms for Longitudinal NLP : Abstract: While NLP typically treats documents as independent and unordered samples, in longitudinal studies, this assumption rarely holds: documents are nested within authors and ordered in time, for...
Cultural Compass: A Framework for Organizing Societal Norms to Detect Violations in Human-AI Conversations : Abstract: Generative AI models ought to be useful and safe across cross-cultural contexts. One critical step toward this goal is understanding how AI models adhere to sociocultural norms. While this c...
Tuberculosis Screening from Cough Audio: Baseline Models, Clinical Variables, and Uncertainty Quantification : Abstract: In this paper, we propose a standardized framework for automatic tuberculosis (TB) detection from cough audio and routinely collected clinical data using machine learning. While TB screening...
LJ-Spoof: A Generatively Varied Corpus for Audio Anti-Spoofing and Synthesis Source Tracing : Abstract: Speaker-specific anti-spoofing and synthesis-source tracing are central challenges in audio anti-spoofing. Progress has been hampered by the lack of datasets that systematically vary model a...
LWMSCNN-SE: A Lightweight Multi-Scale Network for Efficient Maize Disease Classification on Edge Devices : Abstract: Maize disease classification plays a vital role in mitigating yield losses and ensuring food security. However, the deployment of traditional disease detection models in resource-constrained...
Quantum automated theorem proving : Abstract: Automated theorem proving, or more broadly automated reasoning, aims at using computer programs to automatically prove or disprove mathematical theorems and logical statements. It takes on a...
Hybrid SARIMA LSTM Model for Local Weather Forecasting: A Residual Learning Approach for Data Driven Meteorological Prediction : Abstract: Accurately forecasting long-term atmospheric variables remains a defining challenge in meteorological science due to the chaotic nature of atmospheric systems. Temperature data represents a ...
Reinforcement Learning Methods for Neighborhood Selection in Local Search : Abstract: Reinforcement learning has recently gained traction as a means to improve combinatorial optimization methods, yet its effectiveness within local search metaheuristics specifically remains co...
Coupled Diffusion-Encoder Models for Reconstruction of Flow Fields : Abstract: Data-driven flow-field reconstruction typically relies on autoencoder architectures that compress high-dimensional states into low-dimensional latent representations. However, classical appr...
Moonworks Lunara Aesthetic Dataset : Abstract: The dataset spans diverse artistic styles, including regionally grounded aesthetics from the Middle East, Northern Europe, East Asia, and South Asia, alongside general categories such as ske...
SECite: Analyzing and Summarizing Citations in Software Engineering Literature : Abstract: Identifying the strengths and limitations of a research paper is a core component of any literature review. However, traditional summaries reflect only the authors' self-presented perspectiv...
Towards Specialized Generalists: A Multi-Task MoE-LoRA Framework for Domain-Specific LLM Adaptation : Abstract: The rapid evolution of Large Language Models (LLMs) has shifted focus from general-purpose capabilities to domain-specific expertise. However, adapting LLMs to specialized fields such as med...
Enhancing Large Language Models for Time-Series Forecasting via Vector-Injected In-Context Learning : Abstract: The World Wide Web needs reliable predictive capabilities to respond to changes in user behavior and usage patterns. Time series forecasting (TSF) is a key means to achieve this goal. In rec...
Decentralized Online Convex Optimization with Unknown Feedback Delays : Abstract: Decentralized online convex optimization (D-OCO), where multiple agents within a network collaboratively learn optimal decisions in real-time, arises naturally in applications such as federa...
Large Language Models and Algorithm Execution: Application to an Arithmetic Function : Abstract: Large Language Models (LLMs) have recently developed new advanced functionalities. Their effectiveness relies on statistical learning and generalization capabilities. However, they face limi...
Revealing the Attention Floating Mechanism in Masked Diffusion Models : Abstract: Masked diffusion models (MDMs), which leverage bidirectional attention and a denoising process, are narrowing the performance gap with autoregressive models (ARMs). However, their internal a...
Sherry: Hardware-Efficient 1.25-Bit Ternary Quantization via Fine-grained Sparsification : Abstract: The deployment of Large Language Models (LLMs) on resource-constrained edge devices is increasingly hindered by prohibitive memory and computational requirements. While ternary quantization ...
KVzap: Fast, Adaptive, and Faithful KV Cache Pruning : Abstract: Growing context lengths in transformer-based language models have made the key-value (KV) cache a critical inference bottleneck. While many KV cache pruning methods have been proposed, they ...
Small Symbols, Big Risks: Exploring Emoticon Semantic Confusion in Large Language Models : Abstract: Emoticons are widely used in digital communication to convey affective intent, yet their safety implications for Large Language Models (LLMs) remain largely unexplored. In this paper, we ide...
Ideological Isolation in Online Social Networks: A Survey of Computational Definitions, Metrics, and Mitigation Strategies : Abstract: The proliferation of online social networks has significantly reshaped the way individuals access and engage with information. While these platforms offer unprecedented connectivity, they ma...
Tackling Heterogeneity in Quantum Federated Learning: An Integrated Sporadic-Personalized Approach : Abstract: Quantum federated learning (QFL) emerges as a powerful technique that combines quantum computing with federated learning to efficiently process complex data across distributed quantum device...
Sola-Visibility-ISPM: Benchmarking Agentic AI for Identity Security Posture Management Visibility : Abstract: Identity Security Posture Management (ISPM) is a core challenge for modern enterprises operating across cloud and SaaS environments. Answering basic ISPM visibility questions, such as unders...
Sliced-Wasserstein Distribution Alignment Loss Improves the Ultra-Low-Bit Quantization of Large Language Models : Abstract: The benefits of most large language models come with steep and often hidden economic and environmental costs due to their resource usage inefficiency during deployment. Model quantization im...
E^2-LLM: Bridging Neural Signals and Interpretable Affective Analysis : Abstract: Emotion recognition from electroencephalography (EEG) signals remains challenging due to high inter-subject variability, limited labeled data, and the lack of interpretable reasoning in exis...
NOVAK: Unified adaptive optimizer for deep neural networks : Abstract: This work introduces NOVAK, a modular gradient-based optimization algorithm that integrates adaptive moment estimation, rectified learning-rate scheduling, decoupled weight regularization, m...
Multiplicative Orthogonal Sequential Editing for Language Models : Abstract: Knowledge editing aims to efficiently modify the internal knowledge of large language models (LLMs) without compromising their other capabilities. The prevailing editing paradigm, which appe...
Imaging-anchored Multiomics in Cardiovascular Disease: Integrating Cardiac Imaging, Bulk, Single-cell, and Spatial Transcriptomics : Abstract: Cardiovascular disease arises from interactions between inherited risk, molecular programmes, and tissue-scale remodelling that are observed clinically through imaging. Health systems now ro...
RewriteNets: End-to-End Trainable String-Rewriting for Generative Sequence Modeling : Abstract: Dominant sequence models like the Transformer represent structure implicitly through dense attention weights, incurring quadratic complexity. We propose RewriteNets, a novel neural architect...
Affect and Effect: Limitations of regularisation-based continual learning in EEG-based emotion classification : Abstract: Generalisation to unseen subjects in EEG-based emotion classification remains a challenge due to high inter- and intra-subject variability. Continual learning (CL) poses a promising solution ...
Feature Entanglement-based Quantum Multimodal Fusion Neural Network : Abstract: Multimodal learning aims to enhance perceptual and decision-making capabilities by integrating information from diverse sources. However, classical deep learning approaches face a critical t...
An Empirical Study on Knowledge Transfer under Domain and Label Shifts in 3D LiDAR Point Clouds : Abstract: For 3D perception systems to be practical in real-world applications -- from autonomous driving to embodied AI -- models must adapt to continuously evolving object definitions and sensor dom...
Immunological Density Shapes Recovery Trajectories in Long COVID : Abstract: Post-acute sequelae of SARS-CoV-2 infection (Long COVID) frequently persists for months, yet drivers of clinical remission remain incompletely defined. Here we analyzed 97,564 longitudinal...
FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments : Abstract: Financial agents powered by large language models (LLMs) are increasingly deployed for investment analysis, risk assessment, and automated decision-making, where their abilities to plan, inv...
Hierarchical Sparse Plus Low Rank Compression of LLM : Abstract: Modern large language models (LLMs) place extraordinary pressure on memory and compute budgets, making principled compression indispensable for both deployment and continued training. We pre...
A survey: Information search time optimization based on RAG (Retrieval Augmentation Generation) chatbot : Abstract: Retrieval-Augmented Generation (RAG) based chatbots are not only useful for information retrieval through question-answering but also for making complex decisions based on injected private da...
Photometric Redshift Estimation Using Scaled Ensemble Learning : Abstract: The development of the state-of-the-art telescopic systems capable of performing expansive sky surveys such as the Sloan Digital Sky Survey, Euclid, and the Rubin Observatory's Legacy Survey...
Uncovering Political Bias in Large Language Models using Parliamentary Voting Records : Abstract: As large language models (LLMs) become deeply embedded in digital platforms and decision-making systems, concerns about their political biases have grown. While substantial work has examined...
Pervasive Annotation Errors Break Text-to-SQL Benchmarks and Leaderboards : Abstract: Researchers have proposed numerous text-to-SQL techniques to streamline data analytics and accelerate the development of database-driven applications. To compare these techniques and select ...
AI as Entertainment : Abstract: Generative AI systems are predominantly designed, evaluated, and marketed as intelligent systems which will benefit society by augmenting or automating human cognitive labor, promising to in...
Learning from Demonstrations via Capability-Aware Goal Sampling : Abstract: Despite its promise, imitation learning often fails in long-horizon environments where perfect replication of demonstrations is unrealistic and small errors can accumulate catastrophically. ...
Evaluating the Ability of Explanations to Disambiguate Models in a Rashomon Set : Abstract: Explainable artificial intelligence (XAI) is concerned with producing explanations indicating the inner workings of models. For a Rashomon set of similarly performing models, explanations pr...
All Required, In Order: Phase-Level Evaluation for AI-Human Dialogue in Healthcare and Beyond : Abstract: Conversational AI is starting to support real clinical work, but most evaluation methods miss how compliance depends on the full course of a conversation. We introduce Obligatory-Information...
MEMEWEAVER: Inter-Meme Graph Reasoning for Sexism and Misogyny Detection : Abstract: Women are twice as likely as men to face online harassment due to their gender. Despite recent advances in multimodal content moderation, most approaches still overlook the social dynamics b...
PersonaDual: Balancing Personalization and Objectivity via Adaptive Reasoning : Abstract: As users increasingly expect LLMs to align with their preferences, personalized information becomes valuable. However, personalized information can be a double-edged sword: it can improve in...
Advancing ESG Intelligence: An Expert-level Agent and Comprehensive Benchmark for Sustainable Finance : Abstract: Environmental, social, and governance (ESG) criteria are essential for evaluating corporate sustainability and ethical performance. However, professional ESG analysis is hindered by data fra...
Why AI Alignment Failure Is Structural: Learned Human Interaction Structures and AGI as an Endogenous Evolutionary Shock : Abstract: Recent reports of large language models (LLMs) exhibiting behaviors such as deception, threats, or blackmail are often interpreted as evidence of alignment failure or emergent malign agency....
Parallel Context-of-Experts Decoding for Retrieval Augmented Generation : Abstract: Retrieval Augmented Generation faces a trade-off: concatenating documents in a long prompt enables multi-document reasoning but creates prefill bottlenecks, while encoding document KV caches...
From Classical to Quantum Reinforcement Learning and Its Applications in Quantum Control: A Beginner's Tutorial : Abstract: This tutorial is designed to make reinforcement learning (RL) more accessible to undergraduate students by offering clear, example-driven explanations. It focuses on bridging the gap between...
Prism: Towards Lowering User Cognitive Load in LLMs via Complex Intent Understanding : Abstract: Large Language Models are rapidly emerging as web-native interfaces to social platforms. On the social web, users frequently have ambiguous and dynamic goals, making complex intent understan...
Resisting Manipulative Bots in Memecoin Copy Trading: A Multi-Agent Approach with Chain-of-Thought Reasoning : Abstract: The launch of $Trump coin ignited a wave in meme coin investment. Copy trading, as a strategy-agnostic approach that eliminates the need for deep trading knowledge, quickly gains widespread...
ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios : Abstract: Retrieval-Augmented Generation (RAG) pipelines must address challenges beyond simple single-document retrieval, such as interpreting visual elements (tables, charts, images), synthesizing in...
WaterCopilot: An AI-Driven Virtual Assistant for Water Management : Abstract: Sustainable water resource management in transboundary river basins is challenged by fragmented data, limited real-time access, and the complexity of integrating diverse information sources....
Learner-Tailored Program Repair: A Solution Generator with Iterative Edit-Driven Retrieval Enhancement : Abstract: With the development of large language models (LLMs) in the field of programming, intelligent programming coaching systems have gained widespread attention. However, most research focuses on...
Sketch-Based Facade Renovation With Generative AI: A Streamlined Framework for Bypassing As-Built Modelling in Industrial Adaptive Reuse : Abstract: Facade renovation offers a more sustainable alternative to full demolition, yet producing design proposals that preserve existing structures while expressing new intent remains challenging. ...
What If TSF: A Benchmark for Reframing Forecasting as Scenario-Guided Multimodal Forecasting : Abstract: Time series forecasting is critical to real-world decision making, yet most existing approaches remain unimodal and rely on extrapolating historical patterns. While recent progress in large ...
SUMMPILOT: Bridging Efficiency and Customization for Interactive Summarization System : Abstract: This paper incorporates the efficiency of automatic summarization and addresses the challenge of generating personalized summaries tailored to individual users' interests and requirements. T...
M3-BENCH: Process-Aware Evaluation of LLM Agents Social Behaviors in Mixed-Motive Games : Abstract: As the capabilities of large language model (LLM) agents continue to advance, their advanced social behaviors, such as cooperation, deception, and collusion, call for systematic evaluation. ...
An Under-Explored Application for Explainable Multimodal Misogyny Detection in code-mixed Hindi-English : Abstract: Digital platforms have an ever-expanding user base, and act as a hub for communication, business, and connectivity. However, this has also allowed for the spread of hate speech and misogyny....
Beyond Linearization: Attributed Table Graphs for Table Reasoning : Abstract: Table reasoning, a task to answer questions by reasoning over data presented in tables, is an important topic due to the prevalence of knowledge stored in tabular formats. Recent solutions u...
YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation : Abstract: Steering Large Language Models (LLMs) through activation interventions has emerged as a lightweight alternative to fine-tuning for alignment and personalization. Recent work on Bi-directiona...
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation : Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has driven substantial progress in reasoning-intensive domains like mathematics. However, optimizing open-ended generation remains chall...
Hybrid Distillation with CoT Guidance for Edge-Drone Control Code Generation : Abstract: With large language models demonstrating significant potential in code generation tasks, their application to onboard control of resource-constrained Unmanned Aerial Vehicles has emerged as ...
WebTrap Park: An Automated Platform for Systematic Security Evaluation of Web Agents : Abstract: Web Agents are increasingly deployed to perform complex tasks in real web environments, yet their security evaluation remains fragmented and difficult to standardize. We present WebTrap Park...
Owen-Shapley Policy Optimization (OSPO): A Principled RL Algorithm for Generative Search LLMs : Abstract: Large language models are increasingly trained via reinforcement learning for personalized recommendation tasks, but standard methods like GRPO rely on sparse, sequence-level rewards that cr...
Creativity in AI as Emergence from Domain-Limited Generative Models : Abstract: Creativity in artificial intelligence is most often addressed through evaluative frameworks that aim to measure novelty, diversity, or usefulness in generated outputs. While such approaches ...
Deconstructing Pre-training: Knowledge Attribution Analysis in MoE and Dense Models : Abstract: Mixture-of-Experts (MoE) architectures decouple model capacity from per-token computation, enabling scaling beyond the computational limits imposed by dense scaling laws. Yet how MoE archite...
A Qualitative Model to Reason about Object Rotations (QOR) applied to solve the Cube Comparison Test (CCT) : Abstract: This paper presents a Qualitative model for Reasoning about Object Rotations (QOR) which is applied to solve the Cube Comparison Test (CCT) by Ekstrom et al. (1976). A conceptual neighborhoo...
Thematic Working Group 5 -- Artificial Intelligence (AI) literacy for teaching and learning: design and implementation : Abstract: TWG 5 focused on developing and implementing effective strategies for enhancing AI literacy and agency of teachers, equipping them with the knowledge and skills necessary to integrate AI int...
Semantic Laundering in AI Agent Architectures: Why Tool Boundaries Do Not Confer Epistemic Warrant : Abstract: LLM-based agent architectures systematically conflate information transport mechanisms with epistemic justification mechanisms. We formalize this class of architectural failures as semantic ...
AtomMem : Learnable Dynamic Agentic Memory with Atomic Memory Operation : Abstract: Equipping agents with memory is essential for solving real-world long-horizon problems. However, most existing agent memory mechanisms rely on static and hand-crafted workflows. This limits ...
OpenMic: A Multi-Agent-Based Stand-Up Comedy Generation System : Abstract: Chinese stand-up comedy generation goes beyond plain text generation, requiring culturally grounded humor, precise timing, stage-performance cues, and implicit multi-step reasoning. Moreover...
Greedy Is Enough: Sparse Action Discovery in Agentic LLMs : Abstract: Modern agentic systems operate in environments with extremely large action spaces, such as tool-augmented language models with thousands of available APIs or retrieval operations. Despite th...
ToolACE-MCP: Generalizing History-Aware Routing from MCP Tools to the Agent Web : Abstract: With the rise of the Agent Web and Model Context Protocol (MCP), the agent ecosystem is evolving into an open collaborative network, exponentially increasing accessible tools. However, curre...
Sparsity Is Necessary: Polynomial-Time Stability for Agentic LLMs in Large Action Spaces : Abstract: Tool-augmented LLM systems expose a control regime that learning theory has largely ignored: sequential decision-making with a massive discrete action universe (tools, APIs, documents) in wh...
VGG Induced Deep Hand Sign Language Detection : Abstract: Hand gesture recognition is an important aspect of human-computer interaction. It forms the basis of sign language for the visually impaired people. This work proposes a novel hand gesture r...
T3: Benchmarking Sycophancy and Skepticism in Causal Judgment : Abstract: We introduce T3 (Testing Trustworthy Thinking), a diagnostic benchmark designed to rigorously evaluate LLM causal judgment across Pearl's Ladder of Causality. Comprising 454 expert-curated v...
Large Artificial Intelligence Model Guided Deep Reinforcement Learning for Resource Allocation in Non Terrestrial Networks : Abstract: Large AI Model (LAM) have been proposed to applications of Non-Terrestrial Networks (NTN), that offer better performance with its great generalization and reduced task specific trainings. In...
The End of Reward Engineering: How LLMs Are Redefining Multi-Agent Coordination : Abstract: Reward engineering, the manual specification of reward functions to induce desired agent behavior, remains a fundamental challenge in multi-agent reinforcement learning. This difficulty is a...
MPCI-Bench: A Benchmark for Multimodal Pairwise Contextual Integrity Evaluation of Language Model Agents : Abstract: As language-model agents evolve from passive chatbots into proactive assistants that handle personal data, evaluating their adherence to social norms becomes increasingly critical, often thr...
An Axiomatic Approach to General Intelligence: SANC(E3) -- Self-organizing Active Network of Concepts with Energy E3 : Abstract: General intelligence must reorganize experience into internal structures that enable prediction and action under finite resources. Existing systems implicitly presuppose fixed primitive unit...
Adapting Rules of Official International Mahjong for Online Players : Abstract: As one of the worldwide spread traditional game, Official International Mahjong can be played and promoted online through remote devices instead of requiring face-to-face interaction. Howeve...
Improving LLM Reasoning with Homophily-aware Structural and Semantic Text-Attributed Graph Compression : Abstract: Large language models (LLMs) have demonstrated promising capabilities in Text-Attributed Graph (TAG) understanding. Recent studies typically focus on verbalizing the graph structures via han...
The Agent's First Day: Benchmarking Learning, Exploration, and Scheduling in the Workplace Scenarios : Abstract: The rapid evolution of Multi-modal Large Language Models (MLLMs) has advanced workflow automation; however, existing research mainly targets performance upper bounds in static environments, ...
ZeroDVFS: Zero-Shot LLM-Guided Core and Frequency Allocation for Embedded Platforms : Abstract: Dynamic voltage and frequency scaling (DVFS) and task-to-core allocation are critical for thermal management and balancing energy and performance in embedded systems. Existing approaches eit...
Project Synapse: A Hierarchical Multi-Agent Framework with Hybrid Memory for Autonomous Resolution of Last-Mile Delivery Disruptions : Abstract: This paper introduces Project Synapse, a novel agentic framework designed for the autonomous resolution of last-mile delivery disruptions. Synapse employs a hierarchical multi-agent architec...
Embedded AI Companion System on Edge Devices : Abstract: Computational resource constraints on edge devices make it difficult to develop a fully embedded AI companion system with a satisfactory user experience. AI companion and memory systems deta...
How vehicles change lanes after encountering crashes: Empirical analysis and modeling : Abstract: When a traffic crash occurs, following vehicles need to change lanes to bypass the obstruction. We define these maneuvers as post crash lane changes. In such scenarios, vehicles in the targe...
MirrorBench: An Extensible Framework to Evaluate User-Proxy Agents for Human-Likeness : Abstract: Large language models (LLMs) are increasingly used as human simulators, both for evaluating conversational systems and for generating fine-tuning data. However, naive "act-as-a-user" prompti...
MemoBrain: Executive Memory as an Agentic Brain for Reasoning : Abstract: Complex reasoning in tool-augmented agent frameworks is inherently long-horizon, causing reasoning traces and transient tool artifacts to accumulate and strain the bounded working context of...
Semantic Gravity Wells: Why Negative Constraints Backfire : Abstract: Negative constraints (instructions of the form "do not use word X") represent a fundamental test of instruction-following capability in large language models. Despite their apparent simplici...
A New Strategy for Verifying Reach-Avoid Specifications in Neural Feedback Systems : Abstract: Forward reachability analysis is the predominant approach for verifying reach-avoid properties in neural feedback systems (dynamical systems controlled by neural networks). This dominance st...
Forecast Aware Deep Reinforcement Learning for Efficient Electricity Load Scheduling in Dairy Farms : Abstract: Dairy farming is an energy intensive sector that relies heavily on grid electricity. With increasing renewable energy integration, sustainable energy management has become essential for redu...
Integrating Attendance Tracking and Emotion Detection for Enhanced Student Engagement in Smart Classrooms : Abstract: The increasing adoption of smart classroom technologies in higher education has mainly focused on automating attendance, with limited attention given to students' emotional and cognitive eng...
Internal Deployment Gaps in AI Regulation : Abstract: Frontier AI regulations primarily focus on systems deployed to external users, where deployment is more visible and subject to outside scrutiny. However, high-stakes applications can occur i...
Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety : Abstract: Ensuring that Large Language Models (LLMs) adhere to safety principles without refusing benign requests remains a significant challenge. While OpenAI introduces deliberative alignment (DA) t...
When Models Know When They Do Not Know: Calibration, Cascading, and Cleaning : Abstract: When a model knows when it does not know, many possibilities emerge. The first question is how to enable a model to recognize that it does not know. A promising approach is to use confidence...
Executable Ontologies in Game Development: From Algorithmic Control to Semantic World Modeling : Abstract: This paper examines the application of Executable Ontologies (EO), implemented through the boldsea framework, to game development. We argue that EO represents a paradigm shift: a transition ...
Bridging the Trust Gap: Clinician-Validated Hybrid Explainable AI for Maternal Health Risk Assessment in Bangladesh : Abstract: While machine learning shows promise for maternal health risk prediction, clinical adoption in resource-constrained settings faces a critical barrier: lack of explainability and trust. This ...

Research Sources: 401 | Generated: 1/14/2026