AI RESEARCH PAPERS & ACADEMIC SOURCES
- 3D Point Cloud Object Detection on Edge Devices for Split Computing : Abstract: The field of autonomous driving technology is rapidly advancing, with deep learning being a key component. Particularly in the field of sensing, 3D point cloud data collected by LiDAR is uti...
- A Kullback-Leibler divergence method for input-system-state identification : Abstract: The capability of a novel Kullback-Leibler divergence method is examined herein within the Kalman filter framework to select the input-parameter-state estimation execution with the most plau...
- HAGI++: Head-Assisted Gaze Imputation and Generation : Abstract: Mobile eye tracking plays a vital role in capturing human visual attention across both real-world and extended reality (XR) environments, making it an essential tool for applications ranging...
- SigmaCollab: An Application-Driven Dataset for Physically Situated Collaboration : Abstract: We introduce SigmaCollab, a dataset enabling research on physically situated human-AI collaboration. The dataset consists of a set of 85 sessions in which untrained participants were guided ...
- Resource-efficient Automatic Refinement of Segmentations via Weak Supervision from Light Feedback : Abstract: Delineating anatomical regions is a key task in medical image analysis. Manual segmentation achieves high accuracy but is labor-intensive and prone to variability, thus prompting the develop...
- An unscented Kalman filter method for real time input-parameter-state estimation : Abstract: The input-parameter-state estimation capabilities of a novel unscented Kalman filter are examined herein on both linear and nonlinear systems. The unknown input is estimated in two stages wit...
- Robust Identity Perceptual Watermark Against Deepfake Face Swapping : Abstract: Notwithstanding offering convenience and entertainment to society, Deepfake face swapping has caused critical privacy issues with the rapid development of deep generative models. Due to impe...
- Training Convolutional Neural Networks with the Forward-Forward algorithm : Abstract: Recent successes in image analysis with deep neural networks are achieved almost exclusively with Convolutional Neural Networks (CNNs), typically trained using the backpropagation (BP) algor...
- Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection : Abstract: The rapid development of deep learning has significantly improved salient object detection (SOD) combining both RGB and thermal (RGB-T) images. However, existing Transformer-based RGB-T SOD ...
- Mobile Robotic Multi-View Photometric Stereo : Abstract: Multi-View Photometric Stereo (MVPS) is a popular method for fine-detailed 3D acquisition of an object from images. Despite its outstanding results on diverse material objects, a typical MVP...
- Prompt to Restore, Restore to Prompt: Cyclic Prompting for Universal Adverse Weather Removal : Abstract: Universal adverse weather removal (UAWR) seeks to address various weather degradations within a unified framework. Recent methods are inspired by prompt learning using pre-trained vision-lan...
- RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing : Abstract: Recent advances in self-supervised learning for Vision Transformers (ViTs) have fueled breakthroughs in remote sensing (RS) foundation models. However, the quadratic complexity of self-atten...
- 3DBonsai: Structure-Aware Bonsai Modeling Using Conditioned 3D Gaussian Splatting : Abstract: Recent advancements in text-to-3D generation have shown remarkable results by leveraging 3D priors in combination with 2D diffusion. However, previous methods utilize 3D priors that lack det...
- FractalForensics: Proactive Deepfake Detection and Localization via Fractal Watermarks : Abstract: Proactive Deepfake detection via robust watermarks has seen interest ever since passive Deepfake detectors encountered challenges in identifying high-quality synthetic images. However, while...
- Breaking Down Monocular Ambiguity: Exploiting Temporal Evolution for 3D Lane Detection : Abstract: Monocular 3D lane detection aims to estimate the 3D position of lanes from frontal-view (FV) images. However, existing methods are fundamentally constrained by the inherent ambiguity of sing...
- GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution : Abstract: Ultra-high-resolution (UHR) remote sensing (RS) imagery offers valuable data for Earth observation but poses challenges for existing multimodal foundation models due to two key bottlenecks: (...
- GeoSDF: Plane Geometry Diagram Synthesis via Signed Distance Field : Abstract: Plane Geometry Diagram Synthesis has been a crucial task in computer graphics, with applications ranging from educational tools to AI-driven mathematical reasoning. Traditionally, we rely on...
- Crucial-Diff: A Unified Diffusion Model for Crucial Image and Annotation Synthesis in Data-scarce Scenarios : Abstract: The scarcity of data in various scenarios, such as medical, industry and autonomous driving, leads to model overfitting and dataset imbalance, thus hindering effective detection and segmenta...
- Light Future: Multimodal Action Frame Prediction via InstructPix2Pix : Abstract: Predicting future motion trajectories is a critical capability across domains such as robotics, autonomous systems, and human activity forecasting, enabling safer and more intelligent decisi...
- A Practical Investigation of Spatially-Controlled Image Generation with Transformers : Abstract: Enabling image generation models to be spatially controlled is an important area of research, empowering users to better generate images according to their own fine-grained specifications vi...
- Label tree semantic losses for rich multi-class medical image segmentation : Abstract: Rich and accurate medical image segmentation is poised to underpin the next generation of AI-defined clinical practice by delineating critical anatomy for pre-operative planning, guiding rea...
- Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras : Abstract: Event cameras offer microsecond-level latency and robustness to motion blur, making them ideal for understanding dynamic environments. Yet, connecting these asynchronous streams to human lan...
- Real World Federated Learning with a Knowledge Distilled Transformer for Cardiac CT Imaging : Abstract: Federated learning is a renowned technique for utilizing decentralized data while preserving privacy. However, real-world applications often face challenges like partially labeled datasets, ...
- The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs : Abstract: Coral reefs are declining worldwide due to climate change and local stressors. To inform effective conservation or restoration, monitoring at the highest possible spatial and temporal resolu...
- Large Language Models are Unreliable for Cyber Threat Intelligence : Abstract: Several recent works have argued that Large Language Models (LLMs) can be used to tame the data deluge in the cybersecurity field, by improving the automation of Cyber Threat Intelligence (C...
- Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance : Abstract: Although synthetic data has changed various aspects of information retrieval (IR) pipelines, the main training paradigm remains: contrastive learning with binary relevance labels, where one ...
- Repetitions are not all alike: distinct mechanisms sustain repetition in language models : Abstract: Large Language Models (LLMs) can sometimes degrade into repetitive loops, persistently generating identical word sequences. Because repetition is rare in natural human language, its frequent...
- Evolutionary Machine Learning meets Self-Supervised Learning: a comprehensive survey : Abstract: The number of studies that combine Evolutionary Machine Learning and self-supervised learning has been growing steadily in recent years. Evolutionary Machine Learning has been shown to help ...
- NMCSE: Noise-Robust Multi-Modal Coupling Signal Estimation Method via Optimal Transport for Cardiovascular Disease Detection : Abstract: The coupling signal refers to a latent physiological signal that characterizes the transformation from cardiac electrical excitation, captured by the electrocardiogram (ECG), to mechanical c...
- OmniEarth-Bench: Towards Holistic Evaluation of Earth's Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data : Abstract: Existing benchmarks for multimodal learning in Earth science offer limited, siloed coverage of Earth's spheres and their cross-sphere interactions, typically restricting evaluation to the hu...
- Unsupervised Evolutionary Cell Type Matching via Entropy-Minimized Optimal Transport : Abstract: Identifying evolutionary correspondences between cell types across species is a fundamental challenge in comparative genomics and evolutionary biology. Existing approaches often rely on eith...
- DIsoN: Decentralized Isolation Networks for Out-of-Distribution Detection in Medical Imaging : Abstract: Safe deployment of machine learning (ML) models in safety-critical domains such as medical imaging requires detecting inputs with characteristics not seen during training, known as out-of-di...
- Scalable and Cost-Efficient de Novo Template-Based Molecular Generation : Abstract: Template-based molecular generation offers a promising avenue for drug design by ensuring generated compounds are synthetically accessible through predefined reaction templates and building ...
- MediQ-GAN: Quantum-Inspired GAN for High Resolution Medical Image Generation : Abstract: Machine learning-assisted diagnosis shows promise, yet medical imaging datasets are often scarce, imbalanced, and constrained by privacy, making data augmentation essential. Classical genera...
- Weakly Supervised Object Segmentation by Background Conditional Divergence : Abstract: As a computer vision task, automatic object segmentation remains challenging in specialized image domains without massive labeled data, such as synthetic aperture sonar images, remote sensin...
- AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench : Abstract: AI research agents are demonstrating great potential to accelerate scientific progress by automating the design, implementation, and training of machine learning models. We focus on methods ...
- Multi-Personality Generation of LLMs at Decoding-time : Abstract: Multi-personality generation for LLMs, enabling simultaneous embodiment of multiple personalization attributes, is a fundamental challenge. Existing retraining-based approaches are costly an...
- Rethinking LLM Human Simulation: When a Graph is What You Need : Abstract: Large language models (LLMs) are increasingly used to simulate humans, with applications ranging from survey prediction to decision-making. However, are LLMs strictly necessary, or can small...
- IG-Pruning: Input-Guided Block Pruning for Large Language Models : Abstract: With the growing computational demands of large language models (LLMs), efficient inference has become increasingly critical for practical deployment. Depth pruning has emerged as a promisin...
- LTD-Bench: Evaluating Large Language Models by Letting Them Draw : Abstract: Current evaluation paradigms for large language models (LLMs) represent a critical blind spot in AI research--relying on opaque numerical metrics that conceal fundamental limitations in spat...
- LiveSecBench: A Dynamic and Culturally-Relevant AI Safety Benchmark for LLMs in Chinese Context : Abstract: In this work, we propose LiveSecBench, a dynamic and continuously updated safety benchmark specifically for Chinese-language LLM application scenarios. LiveSecBench evaluates models across s...
- AyurParam: A State-of-the-Art Bilingual Language Model for Ayurveda : Abstract: Current large language models excel at broad, general-purpose tasks, but consistently underperform when exposed to highly specialized domains that require deep cultural, linguistic, and subj...
- Merging Continual Pretraining Models for Domain-Specialized LLMs: A Case Study in Finance : Abstract: While LLMs excel at general tasks, they struggle in specialized domains like finance, requiring diverse skills in domain knowledge, mathematical reasoning, and multilingual processing. Mergi...
- Prompting for Policy: Forecasting Macroeconomic Scenarios with Synthetic LLM Personas : Abstract: We evaluate whether persona-based prompting improves Large Language Model (LLM) performance on macroeconomic forecasting tasks. Using 2,368 economics-related personas from the PersonaHub cor...
- Smart-Hiring: An Explainable end-to-end Pipeline for CV Information Extraction and Job Matching : Abstract: Hiring processes often involve the manual screening of hundreds of resumes for each job, a task that is time and effort consuming, error-prone, and subject to human bias. This paper presents...
- The Analysis of Lexical Errors in Machine Translation from English into Romanian : Abstract: The research explores error analysis in the performance of translating by Machine Translation from English into Romanian, and it focuses on lexical errors found in texts which include offici...
- Next Token Knowledge Tracing: Exploiting Pretrained LLM Representations to Decode Student Behaviour : Abstract: Modelling student knowledge is a key challenge when leveraging AI in education, with major implications for personalised learning. The Knowledge Tracing (KT) task aims to predict how student...
- CGES: Confidence-Guided Early Stopping for Efficient and Accurate Self-Consistency : Abstract: Large language models (LLMs) are often queried multiple times at test time, with predictions aggregated by majority vote. While effective, this self-consistency strategy (arXiv:2203.11171) r...
- The Realignment Problem: When Right becomes Wrong in LLMs : Abstract: The alignment of Large Language Models (LLMs) with human values is central to their safe deployment, yet current practice produces static, brittle, and costly-to-maintain models that fail to...
- Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis, Solution, and Interpretation : Abstract: Previous studies show that introducing new knowledge during large language models (LLMs) fine-tuning can lead to the generation of erroneous output when tested on known information, thereby ...
- PragExTra: A Multilingual Corpus of Pragmatic Explicitation in Translation : Abstract: Translators often enrich texts with background details that make implicit cultural meanings explicit for new audiences. This phenomenon, known as pragmatic explicitation, has been widely dis...
- AI Diffusion in Low Resource Language Countries : Abstract: Artificial intelligence (AI) is diffusing globally at unprecedented speed, but adoption remains uneven. Frontier Large Language Models (LLMs) are known to perform poorly on low-resource lang...
- Controlling Performance and Budget of a Centralized Multi-agent LLM System with Reinforcement Learning : Abstract: Large language models (LLMs) exhibit complementary strengths across domains and come with varying inference costs, motivating the design of multi-agent LLM systems where specialized models c...
- Beyond Single Embeddings: Capturing Diverse Targets with Multi-Query Retrieval : Abstract: Most text retrievers generate one query vector to retrieve relevant documents. Yet, the conditional distribution of relevant documents for the query may be multimodal, e.g., represent...
- MemSearcher: Training LLMs to Reason, Search and Manage Memory via End-to-End Reinforcement Learning : Abstract: Typical search agents concatenate the entire interaction history into the LLM context, preserving information integrity but producing long, noisy contexts, resulting in high computation and ...
- Oolong: Evaluating Long Context Reasoning and Aggregation Capabilities : Abstract: As model context lengths continue to grow, concerns about whether models effectively use the full context length have persisted. While several carefully designed long-context evaluations hav...
- SciDaSynth: Interactive Structured Data Extraction from Scientific Literature with Large Language Model : Abstract: The explosion of scientific literature has made the efficient and accurate extraction of structured data a critical component for advancing scientific knowledge and supporting evidence-based...
- Complete asymptotic type-token relationship for growing complex systems with inverse power-law count rankings : Abstract: The growth dynamics of complex systems often exhibit statistical regularities involving power-law relationships. For real finite complex systems formed by countable tokens (animals, words) a...
- Deep Value Benchmark: Measuring Whether Models Generalize Deep values or Shallow Preferences : Abstract: We introduce the Deep Value Benchmark (DVB), an evaluation framework that directly tests whether large language models (LLMs) learn fundamental human values or merely surface-level preferenc...
- InsurAgent: A Large Language Model-Empowered Agent for Simulating Individual Behavior in Purchasing Flood Insurance : Abstract: Flood insurance is an effective strategy for individuals to mitigate disaster-related losses. However, participation rates among at-risk populations in the United States remain strikingly lo...
- An Evaluation of Interleaved Instruction Tuning on Semantic Reasoning Performance in an Audio MLLM : Abstract: Standard training for Multi-modal Large Language Models (MLLMs) involves concatenating non-textual information, like vision or audio, with a text prompt. This approach may not encourage deep...
- SAIL-RL: Guiding MLLMs in When and How to Think via Dual-Reward RL Tuning : Abstract: We introduce SAIL-RL, a reinforcement learning (RL) post-training framework that enhances the reasoning capabilities of multimodal large language models (MLLMs) by teaching them when and how...
- Link prediction Graph Neural Networks for structure recognition of Handwritten Mathematical Expressions : Abstract: We propose a Graph Neural Network (GNN)-based approach for Handwritten Mathematical Expression (HME) recognition by modeling HMEs as graphs, where nodes represent symbols and edges capture s...
- Unlocking the Power of Multi-Agent LLM for Reasoning: From Lazy Agents to Deliberation : Abstract: Large Language Models (LLMs) trained with reinforcement learning and verifiable rewards have achieved strong results on complex reasoning tasks. Recent work extends this paradigm to a multi-...
- CoCoVa: Chain of Continuous Vision-Language Thought for Latent Space Reasoning : Abstract: In human cognition, there exist numerous thought processes that are tacit and beyond verbal expression, enabling us to understand and interact with the world in multiple ways. However, conte...
- DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding : Abstract: Recent advances in multi-modal models have demonstrated strong performance in tasks such as image generation and reasoning. However, applying these models to the fire domain remains challeng...
- UniChange: Unifying Change Detection with Multimodal Large Language Model : Abstract: Change detection (CD) is a fundamental task for monitoring and analyzing land cover dynamics. While recent high performance models and high quality datasets have significantly advanced the f...
- The Collaboration Gap : Abstract: The trajectory of AI development suggests that we will increasingly rely on agent-based systems composed of independently developed agents with different information, privileges, and tools. ...
- CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents : Abstract: Current evaluations of Large Language Model (LLM) agents primarily emphasize task completion, often overlooking resource efficiency and adaptability. This neglects a crucial capability: agen...
- VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation : Abstract: Code has emerged as a precise and executable medium for reasoning and action in the agent era. Yet, progress has largely focused on language-centric tasks such as program synthesis and debug...
- How Teachers Can Use Large Language Models and Bloom's Taxonomy to Create Educational Quizzes : Abstract: Question generation (QG) is a natural language processing task with an abundance of potential benefits and use cases in the educational domain. In order for this potential to be realized, QG...
- Path-Consistency with Prefix Enhancement for Efficient Inference in LLMs : Abstract: To enhance the reasoning capabilities of large language models (LLMs), self-consistency has become a popular approach, combining multiple samplings with majority voting. However, current met...
- On Extending Direct Preference Optimization to Accommodate Ties : Abstract: We derive and investigate two DPO variants that explicitly model the possibility of declaring a tie in pair-wise comparisons. We replace the Bradley-Terry model in DPO with two well-known mo...
- I Want to Break Free! Persuasion and Anti-Social Behavior of LLMs in Multi-Agent Settings with Social Hierarchy : Abstract: As LLM-based agents become increasingly autonomous and will more freely interact with each other, studying the interplay among them becomes crucial to anticipate emergent phenomena and poten...
- ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding : Abstract: Multimodal systems have great potential to assist humans in procedural activities, where people follow instructions to achieve their goals. Despite diverse application scenarios, systems are...
- Composing or Not Composing? Towards Distributional Construction Grammars : Abstract: The mechanisms of comprehension during language processing remain an open question. Classically, building the meaning of a linguistic utterance is said to be incremental, step-by-step, base...
- The exponential distribution of the order of demonstrative, numeral, adjective and noun : Abstract: The frequency of the preferred order for a noun phrase formed by demonstrative, numeral, adjective and noun has received significant attention over the last two decades. We investigate the a...
- Mixture of Routers : Abstract: Supervised fine-tuning (SFT) is a milestone in aligning large language models with human instructions and adapting them to downstream tasks. In particular, Low-Rank Adaptation (LoRA) has gai...
- TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers' Guidance : Abstract: Large Language Models (LLMs) have made significant strides in problem-solving by incorporating reasoning processes. However, this enhanced reasoning capability results in an increased number...
- Identifying Aspects in Peer Reviews : Abstract: Peer review is central to academic publishing, but the growing volume of submissions is straining the process. This motivates the development of computational approaches to support peer revi...
- Rethinking the Relationship between the Power Law and Hierarchical Structures : Abstract: Statistical analysis of corpora provides an approach to quantitatively investigate natural languages. This approach has revealed that several power laws consistently emerge across different ...
- Beyond the Link: Assessing LLMs' ability to Classify Political Content across Global Media : Abstract: The use of large language models (LLMs) is becoming common in political science and digital media research. While LLMs have demonstrated ability in labelling tasks, their effectiveness to cl...
- ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs : Abstract: As AI systems become more advanced, ensuring their alignment with a diverse range of individuals and societal values becomes increasingly critical. But how can we capture fundamental human v...
- Visual Program Distillation with Template-Based Augmentation : Abstract: Adapting visual programming or prompting large language models (LLMs) to generate executable code for visual tasks like visual question answering (VQA) for specialized tasks or domains remai...
- Understanding and Optimizing Agentic Workflows via Shapley value : Abstract: Agentic workflows have become the dominant paradigm for building complex AI systems, orchestrating specialized components, such as planning, reasoning, action execution, and reflection, to t...
- SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents : Abstract: LLM-based agents have shown promising capabilities in a growing range of software engineering (SWE) tasks. However, advancing this field faces two critical challenges. First, high-quality tr...
- iFlyBot-VLA Technical Report : Abstract: We introduce iFlyBot-VLA, a large-scale Vision-Language-Action (VLA) model trained under a novel framework. The main contributions are listed as follows: (1) a latent action model thoroughly...
- Challenging DINOv3 Foundation Model under Low Inter-Class Variability: A Case Study on Fetal Brain Ultrasound : Abstract: Purpose: This study provides the first comprehensive evaluation of foundation models in fetal ultrasound (US) imaging under low inter-class variability conditions. While recent vision founda...
- Assessing the value of Geo-Foundational Models for Flood Inundation Mapping: Benchmarking models for Sentinel-1, Sentinel-2, and Planetscope for end-users : Abstract: Geo-Foundational Models (GFMs) enable fast and reliable extraction of spatiotemporal information from satellite imagery, improving flood inundation mapping by leveraging location and time em...
- Locally-Supervised Global Image Restoration : Abstract: We address the problem of image reconstruction from incomplete measurements, encompassing both upsampling and inpainting, within a learning-based framework. Conventional supervised approache...
- Towards Selection of Large Multimodal Models as Engines for Burned-in Protected Health Information Detection in Medical Images : Abstract: The detection of Protected Health Information (PHI) in medical imaging is critical for safeguarding patient privacy and ensuring compliance with regulatory frameworks. Traditional detection ...
- StrengthSense: A Dataset of IMU Signals Capturing Everyday Strength-Demanding Activities : Abstract: Tracking strength-demanding activities with wearable sensors like IMUs is crucial for monitoring muscular strength, endurance, and power. However, there is a lack of comprehensive datasets c...
- Text-VQA Aug: Pipelined Harnessing of Large Multimodal Models for Automated Synthesis : Abstract: Creation of large-scale databases for Visual Question Answering tasks pertaining to the text data in a scene (text-VQA) involves skilful human annotation, which is tedious and challenging. W...
- Markerless Augmented Reality Registration for Surgical Guidance: A Multi-Anatomy Clinical Accuracy Study : Abstract: Purpose: In this paper, we develop and clinically evaluate a depth-only, markerless augmented reality (AR) registration pipeline on a head-mounted display, and assess accuracy across small o...
- From Instance Segmentation to 3D Growth Trajectory Reconstruction in Planktonic Foraminifera : Abstract: Planktonic foraminifera, marine protists characterized by their intricate chambered shells, serve as valuable indicators of past and present environmental conditions. Understanding their cha...
- Fast Measuring Pavement Crack Width by Cascading Principal Component Analysis : Abstract: Accurate quantification of pavement crack width plays a pivotal role in assessing structural integrity and guiding maintenance interventions. However, achieving precise crack width measureme...
- Autobiasing Event Cameras for Flickering Mitigation : Abstract: Understanding and mitigating flicker effects caused by rapid variations in light intensity is critical for enhancing the performance of event cameras in diverse environments. This paper intr...
- Pinpointing Trigger Moment for Grounded Video QA: Enhancing Spatio-temporal Grounding in Multimodal Large Language Models : Abstract: In this technical report, we introduce a framework to address Grounded Video Question Answering (GVQA) task for the ICCV 2025 Perception Test Challenge. The GVQA task demands robust multimod...
- MM-UNet: Morph Mamba U-shaped Convolutional Networks for Retinal Vessel Segmentation : Abstract: Accurate detection of retinal vessels plays a critical role in reflecting a wide range of health status indicators in the clinical diagnosis of ocular diseases. Recently, advances in deep le...
- Language-Enhanced Generative Modeling for PET Synthesis from MRI and Blood Biomarkers : Abstract: Background: Alzheimer's disease (AD) diagnosis heavily relies on amyloid-beta positron emission tomography (Abeta-PET), which is limited by high cost and limited accessibility. This study ex...
- Object-Centric 3D Gaussian Splatting for Strawberry Plant Reconstruction and Phenotyping : Abstract: Strawberries are among the most economically significant fruits in the United States, generating over $2 billion in annual farm-gate sales and accounting for approximately 13% of the total f...
- Estimation of Segmental Longitudinal Strain in Transesophageal Echocardiography by Deep Learning : Abstract: Segmental longitudinal strain (SLS) of the left ventricle (LV) is an important prognostic indicator for evaluating regional LV dysfunction, in particular for diagnosing and managing myocardi...
- Can Foundation Models Revolutionize Mobile AR Sparse Sensing? : Abstract: Mobile sensing systems have long faced a fundamental trade-off between sensing quality and efficiency due to constraints in computation, power, and other limitations. Sparse sensing, which a...
- Collaborative Attention and Consistent-Guided Fusion of MRI and PET for Alzheimer's Disease Diagnosis : Abstract: Alzheimer's disease (AD) is the most prevalent form of dementia, and its early diagnosis is essential for slowing disease progression. Recent studies on multimodal neuroimaging fusion using ...
- Monocular absolute depth estimation from endoscopy via domain-invariant feature learning and latent consistency : Abstract: Monocular depth estimation (MDE) is a critical task to guide autonomous medical robots. However, obtaining absolute (metric) depth from an endoscopy camera in surgical scenes is difficult, w...
- Medical Report Generation: A Hierarchical Task Structure-Based Cross-Modal Causal Intervention Framework : Abstract: Medical Report Generation (MRG) is a key part of modern medical diagnostics, as it automatically generates reports from radiological images to reduce radiologists' burden. However, reliable ...
- Are Euler angles a useful rotation parameterisation for pose estimation with Normalizing Flows? : Abstract: Object pose estimation is a task that is of central importance in 3D Computer Vision. Given a target image and a canonical pose, a single point estimate may very often be sufficient; however...
- Cycle-Sync: Robust Global Camera Pose Estimation through Enhanced Cycle-Consistent Synchronization : Abstract: We introduce Cycle-Sync, a robust and global framework for estimating camera poses (both rotations and locations). Our core innovation is a location solver that adapts message-passing least ...
- GAFD-CC: Global-Aware Feature Decoupling with Confidence Calibration for OOD Detection : Abstract: Out-of-distribution (OOD) detection is paramount to ensuring the reliability and robustness of learning models in real-world applications. Existing post-hoc OOD detection methods detect OOD ...
- M3PD Dataset: Dual-view Photoplethysmography (PPG) Using Front-and-rear Cameras of Smartphones in Lab and Clinical Settings : Abstract: Portable physiological monitoring is essential for early detection and management of cardiovascular disease, but current methods often require specialized equipment that limits accessibility...
- RxnCaption: Reformulating Reaction Diagram Parsing as Visual Prompt Guided Captioning : Abstract: Large-scale chemical reaction datasets are crucial for AI research in chemistry. However, existing chemical reaction data often exist as images within papers, making them not machine-readabl...
- A Novel Grouping-Based Hybrid Color Correction Algorithm for Color Point Clouds : Abstract: Color consistency correction for color point clouds is a fundamental yet important task in 3D rendering and compression applications. In the past, most previous color correction methods aime...
- Purrturbed but Stable: Human-Cat Invariant Representations Across CNNs, ViTs and Self-Supervised ViTs : Abstract: Cats and humans differ in ocular anatomy. Most notably, Felis catus (domestic cats) have vertically elongated pupils linked to ambush predation; yet, how such specializations manifest in dow...
- IllumFlow: Illumination-Adaptive Low-Light Enhancement via Conditional Rectified Flow and Retinex Decomposition : Abstract: We present IllumFlow, a novel framework that synergizes conditional Rectified Flow (CRF) with Retinex theory for low-light image enhancement (LLIE). Our model addresses low-light enhancement...
- ChartM$^3$: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension : Abstract: Complex chart understanding tasks demand advanced visual recognition and reasoning capabilities from multimodal large language models (MLLMs). However, current research provides limited cove...
- Synthetic Crop-Weed Image Generation and its Impact on Model Generalization : Abstract: Precise semantic segmentation of crops and weeds is necessary for agricultural weeding robots. However, training deep learning models requires large annotated datasets, which are costly to o...
- From the Laboratory to Real-World Application: Evaluating Zero-Shot Scene Interpretation on Edge Devices for Mobile Robotics : Abstract: Video Understanding, Scene Interpretation and Commonsense Reasoning are highly challenging tasks enabling the interpretation of visual information, allowing agents to perceive, interact with...
- KAO: Kernel-Adaptive Optimization in Diffusion for Satellite Image : Abstract: Satellite image inpainting is a crucial task in remote sensing, where accurately restoring missing or occluded regions is essential for robust image analysis. In this paper, we propose KAO, ...
- MVAFormer: RGB-based Multi-View Spatio-Temporal Action Recognition with Transformer : Abstract: Multi-view action recognition aims to recognize human actions using multiple camera views and deals with occlusion caused by obstacles or crowds. In this task, cooperation among views, which...
- OLATverse: A Large-scale Real-world Object Dataset with Precise Lighting Control : Abstract: We introduce OLATverse, a large-scale dataset comprising around 9M images of 765 real-world objects, captured from multiple viewpoints under a diverse set of precisely controlled lighting co...
- Object Detection as an Optional Basis: A Graph Matching Network for Cross-View UAV Localization : Abstract: With the rapid growth of the low-altitude economy, UAVs have become crucial for measurement and tracking in patrol systems. However, in GNSS-denied areas, satellite-based localization method...
- Adapting General-Purpose Foundation Models for X-ray Ptychography in Low-Data Regimes : Abstract: The automation of workflows in advanced microscopy is a key goal where foundation models like Large Language Models (LLMs) and Vision-Language Models (VLMs) show great potential. However, adapting...
- ESA: Energy-Based Shot Assembly Optimization for Automatic Video Editing : Abstract: Shot assembly is a crucial step in film production and video editing, involving the sequencing and arrangement of shots to construct a narrative, convey information, or evoke emotions. Tradi...
- Keeping it Local, Tiny and Real: Automated Report Generation on Edge Computing Devices for Mechatronic-Based Cognitive Systems : Abstract: Recent advancements in Deep Learning enable hardware-based cognitive systems, that is, mechatronic systems in general and robotics in particular with integrated Artificial Intelligence, to i...
- LiteVoxel: Low-memory Intelligent Thresholding for Efficient Voxel Rasterization : Abstract: Sparse-voxel rasterization is a fast, differentiable alternative for optimization-based scene reconstruction, but it tends to underfit low-frequency content, depends on brittle pruning heuri...
- Unsupervised Learning for Industrial Defect Detection: A Case Study on Shearographic Data : Abstract: Shearography is a non-destructive testing method for detecting subsurface defects, offering high sensitivity and full-field inspection capabilities. However, its industrial adoption remains ...
- The Urban Vision Hackathon Dataset and Models: Towards Image Annotations and Accurate Vision Models for Indian Traffic : Abstract: This report describes the UVH-26 dataset, the first public release by AIM@IISc of a large-scale dataset of annotated traffic-camera images from India. The dataset comprises 26,646 high-resol...
- Seeing Across Time and Views: Multi-Temporal Cross-View Learning for Robust Video Person Re-Identification : Abstract: Video-based person re-identification (ReID) in cross-view domains (for example, aerial-ground surveillance) remains an open problem because of extreme viewpoint shifts, scale disparities, an...
- A Cognitive Process-Inspired Architecture for Subject-Agnostic Brain Visual Decoding : Abstract: Subject-agnostic brain decoding, which aims to reconstruct continuous visual experiences from fMRI without subject-specific training, holds great potential for clinical applications. However...
- Zero-Shot Multi-Animal Tracking in the Wild : Abstract: Multi-animal tracking is crucial for understanding animal ecology and behavior. However, it remains a challenging task due to variations in habitat, motion patterns, and species appearance. ...
- Robust Face Liveness Detection for Biometric Authentication using Single Image : Abstract: Biometric technologies are widely adopted in security, legal, and financial systems. Face recognition can authenticate a person based on the unique facial features such as shape and texture....
- Can Visual Input Be Compressed? A Visual Token Compression Benchmark for Large Multimodal Models : Abstract: Large multimodal models (LMMs) often suffer from severe inference inefficiency due to the large number of visual tokens introduced by image encoders. While recent token compression methods, ...
- Differentiable Hierarchical Visual Tokenization : Abstract: Vision Transformers rely on fixed patch tokens that ignore the spatial and semantic structure of images. In this work, we introduce an end-to-end differentiable tokenizer that adapts to imag...
- Modality-Transition Representation Learning for Visible-Infrared Person Re-Identification : Abstract: Visible-infrared person re-identification (VI-ReID) technique could associate the pedestrian images across visible and infrared modalities in the practical scenarios of background illuminati...
- VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models : Abstract: Understanding and predicting emotion from videos has gathered significant attention in recent studies, driven by advancements in video large language models (VideoLLMs). While advanced metho...
- LLEXICORP: End-user Explainability of Convolutional Neural Networks : Abstract: Convolutional neural networks (CNNs) underpin many modern computer vision systems. With applications ranging from common to critical areas, a need to explain and understand the model and its...
- Dynamic Reflections: Probing Video Representations with Text Alignment : Abstract: The alignment of representations from different modalities has recently been shown to provide insights on the structural similarities and downstream capabilities of different encoders across...
- PercHead: Perceptual Head Model for Single-Image 3D Head Reconstruction & Editing : Abstract: We present PercHead, a method for single-image 3D head reconstruction and semantic 3D editing - two tasks that are inherently challenging due to severe view occlusions, weak perceptual super...
- When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought : Abstract: We propose MIRA, a new benchmark designed to evaluate models in scenarios where generating intermediate visual images is essential for successful reasoning. Unlike traditional CoT methods th...
- AI-Generated Image Detection: An Empirical Study and Future Research Directions : Abstract: The threats posed by AI-generated media, particularly deepfakes, are now raising significant challenges for multimedia forensics, misinformation detection, and biometric system resulting in ...
- PLUTO-4: Frontier Pathology Foundation Models : Abstract: Foundation models trained on large-scale pathology image corpora have demonstrated strong transfer capabilities across diverse histopathology tasks. Building on this progress, we introduce P...
- Densemarks: Learning Canonical Embeddings for Human Heads Images via Point Tracks : Abstract: We propose DenseMarks - a new learned representation for human heads, enabling high-quality dense correspondences of human head images. For a 2D image of a human head, a Vision Transformer n...
- Opto-Electronic Convolutional Neural Network Design Via Direct Kernel Optimization : Abstract: Opto-electronic neural networks integrate optical front-ends with electronic back-ends to enable fast and energy-efficient vision. However, conventional end-to-end optimization of both the o...
- A Step Toward World Models: A Survey on Robotic Manipulation : Abstract: Autonomous agents are increasingly expected to operate in complex, dynamic, and uncertain environments, performing tasks such as manipulation, navigation, and decision-making. Achieving thes...
- High-Resolution Magnetic Particle Imaging System Matrix Recovery Using a Vision Transformer with Residual Feature Network : Abstract: This study presents a hybrid deep learning framework, the Vision Transformer with Residual Feature Network (VRF-Net), for recovering high-resolution system matrices in Magnetic Particle Imag...
- Are Foundational Atomistic Models Reliable for Finite-Temperature Molecular Dynamics? : Abstract: Machine learning force fields have emerged as promising tools for molecular dynamics (MD) simulations, potentially offering quantum-mechanical accuracy with the efficiency of classical MD. I...
- Rethinking Bimanual Robotic Manipulation: Learning with Decoupled Interaction Framework : Abstract: Bimanual robotic manipulation is an emerging and critical topic in the robotics community. Previous works primarily rely on integrated control models that take the perceptions and states of ...
- MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents : Abstract: Recent advances in operating system (OS) agents have enabled vision-language models (VLMs) to directly control a user's computer. Unlike conventional VLMs that passively output text, OS agen...
- Enhancing Federated Learning Privacy with QUBO : Abstract: Federated learning (FL) is a widely used method for training machine learning (ML) models in a scalable way while preserving privacy (i.e., without centralizing raw data). Prior research sho...
- Can LLMs subtract numbers? : Abstract: We present a systematic study of subtraction in large language models (LLMs). While prior benchmarks emphasize addition and multiplication, subtraction has received comparatively little atte...
- Fast, Private, and Protected: Safeguarding Data Privacy and Defending Against Model Poisoning Attacks in Federated Learning : Abstract: Federated Learning (FL) is a distributed training paradigm wherein participants collaborate to build a global model while ensuring the privacy of the involved data, which remains stored on p...
- TabTune: A Unified Library for Inference and Fine-Tuning Tabular Foundation Models : Abstract: Tabular foundation models represent a growing paradigm in structured data learning, extending the benefits of large-scale pretraining to tabular domains. However, their adoption remains limi...
- Assessing win strength in MLB win prediction models : Abstract: In Major League Baseball, strategy and planning are major factors in determining the outcome of a game. Previous studies have aided this by building machine learning models for predicting th...
- GeoCrossBench: Cross-Band Generalization for Remote Sensing : Abstract: The number and diversity of remote sensing satellites grows over time, while the vast majority of labeled data comes from older satellites. As the foundation models for Earth observation sca...
- In Good GRACEs: Principled Teacher Selection for Knowledge Distillation : Abstract: Knowledge distillation is an efficient strategy to use data generated by large "teacher" language models to train smaller capable "student" models, but selecting the optimal teacher for a sp...
- Learning phases with Quantum Monte Carlo simulation cell : Abstract: We propose the use of the "spin-opstring", derived from Stochastic Series Expansion Quantum Monte Carlo (QMC) simulations, as machine learning (ML) input data. It offers a compact, memory-ef...
- Condition-Invariant fMRI Decoding of Speech Intelligibility with Deep State Space Model : Abstract: Clarifying the neural basis of speech intelligibility is critical for computational neuroscience and digital speech processing. Recent neuroimaging studies have shown that intelligibility mo...
- BondBERT: What we learn when assigning sentiment in the bond market : Abstract: Bond markets respond differently to macroeconomic news compared to equity markets, yet most sentiment models, including FinBERT, are trained primarily on general financial or equity news dat...
- CytoNet: A Foundation Model for the Human Cerebral Cortex : Abstract: To study how the human brain works, we need to explore the organization of the cerebral cortex and its detailed cellular architecture. We introduce CytoNet, a foundation model that encodes h...
- Learned Cost Model for Placement on Reconfigurable Dataflow Hardware : Abstract: Mapping a dataflow-graph of an ML model onto a reconfigurable system is difficult, as different mappings have different throughputs and consume resource constraints differently. To solve thi...
- Effectiveness of High-Dimensional Distance Metrics on Solar Flare Time Series : Abstract: Solar-flare forecasting has been extensively researched yet remains an open problem. In this paper, we investigate the contributions of elastic distance measures for detecting patterns in th...
- Affordable EEG, Actionable Insights: An Open Dataset and Evaluation Framework for Epilepsy Patient Stratification : Abstract: Access to clinical multi-channel EEG remains limited in many regions worldwide. We present NEUROSKY-EPI, the first open dataset of single-channel, consumer-grade EEG for epilepsy, collected ...
- Mirror-Neuron Patterns in AI Alignment : Abstract: As artificial intelligence (AI) advances toward superhuman capabilities, aligning these systems with human values becomes increasingly critical. Current alignment strategies rely largely on ...
- LGCC: Enhancing Flow Matching Based Text-Guided Image Editing with Local Gaussian Coupling and Context Consistency : Abstract: Recent advancements have demonstrated the great potential of flow matching-based Multimodal Large Language Models (MLLMs) in image editing. However, state-of-the-art works like BAGEL face li...
- EvoMem: Improving Multi-Agent Planning with Dual-Evolving Memory : Abstract: Planning has been a cornerstone of artificial intelligence for solving complex problems, and recent progress in LLM-based multi-agent frameworks has begun to extend this capability. However...
- Delta-learned force fields for nonbonded interactions: Addressing the strength mismatch between covalent-nonbonded interaction for global models : Abstract: Noncovalent interactions--vdW dispersion, hydrogen/halogen bonding, ion-$\pi$, and $\pi$-stacking--govern structure, dynamics, and emergent phenomena in materials and molecular systems, yet ...
- Improving Bayesian inference in PTA data analysis: importance nested sampling with Normalizing Flows : Abstract: We present a detailed study of Bayesian inference workflows for pulsar timing array data with a focus on enhancing efficiency, robustness and speed through the use of normalizing flow-based ...
- Addressing prior dependence in hierarchical Bayesian modeling for PTA data analysis II: Noise and SGWB inference through parameter decorrelation : Abstract: Pulsar Timing Arrays provide a powerful framework to measure low-frequency gravitational waves, but accuracy and robustness of the results are challenged by complex noise processes that must...
- Stability of mixed-state phases under weak decoherence : Abstract: We prove that the Gibbs states of classical, and commuting-Pauli, Hamiltonians are stable under weak local decoherence: i.e., we show that the effect of the decoherence can be locally revers...
- SEAL - A Symmetry EncourAging Loss for High Energy Physics : Abstract: Physical symmetries provide a strong inductive bias for constructing functions to analyze data. In particular, this bias may improve robustness, data efficiency, and interpretability of mach...
- Solving cold start in news recommendations: a RippleNet-based system for large scale media outlet : Abstract: We present a scalable recommender system implementation based on RippleNet, tailored for the media domain with a production deployment in Onet.pl, one of Poland's largest online media platfo...
- Data-driven Learning of Interaction Laws in Multispecies Particle Systems with Gaussian Processes: Convergence Theory and Applications : Abstract: We develop a Gaussian process framework for learning interaction kernels in multi-species interacting particle systems from trajectory data. Such systems provide a canonical setting for mult...
- Re-FORC: Adaptive Reward Prediction for Efficient Chain-of-Thought Reasoning : Abstract: We propose Re-FORC, an adaptive reward prediction method that, given a context, enables prediction of the expected future rewards as a function of the number of future thinking tokens. Re-FO...
- Optimizing Attention on GPUs by Exploiting GPU Architectural NUMA Effects : Abstract: The rise of disaggregated AI GPUs has exposed a critical bottleneck in large-scale attention workloads: non-uniform memory access (NUMA). As multi-chiplet designs become the norm for scaling...
- DoFlow: Causal Generative Flows for Interventional and Counterfactual Time-Series Prediction : Abstract: Time-series forecasting increasingly demands not only accurate observational predictions but also causal forecasting under interventional and counterfactual queries in multivariate systems. ...
- Near Optimal Convergence to Coarse Correlated Equilibrium in General-Sum Markov Games : Abstract: No-regret learning dynamics play a central role in game theory, enabling decentralized convergence to equilibrium for concepts such as Coarse Correlated Equilibrium (CCE) or Correlated Equil...
- ScenicProver: A Framework for Compositional Probabilistic Verification of Learning-Enabled Systems : Abstract: Full verification of learning-enabled cyber-physical systems (CPS) has long been intractable due to challenges including black-box components and complex real-world environments. Existing to...
- Eliminating Multi-GPU Performance Taxes: A Systems Approach to Efficient Distributed LLMs : Abstract: As large language models (LLMs) continue to scale, their workloads increasingly rely on distributed execution across multiple GPUs. However, the conventional bulk synchronous parallel (BSP) ...
- PrivGNN: High-Performance Secure Inference for Cryptographic Graph Neural Networks : Abstract: Graph neural networks (GNNs) are powerful tools for analyzing and learning from graph-structured (GS) data, facilitating a wide range of services. Deploying such services in privacy-critical...
- Personalized Decision Modeling: Utility Optimization or Textualized-Symbolic Reasoning : Abstract: Decision-making models for individuals, particularly in high-stakes scenarios like vaccine uptake, often diverge from population optimal predictions. This gap arises from the uniqueness of t...
- Training Proactive and Personalized LLM Agents : Abstract: While existing work focuses primarily on task success, we argue that effective real-world agents require optimizing three dimensions: productivity (task completion), proactivity (asking esse...
- Optimizing Multi-Lane Intersection Performance in Mixed Autonomy Environments : Abstract: One of the main challenges in managing traffic at multilane intersections is ensuring smooth coordination between human-driven vehicles (HDVs) and connected autonomous vehicles (CAVs). This ...
- Structural Plasticity as Active Inference: A Biologically-Inspired Architecture for Homeostatic Control : Abstract: Traditional neural networks, while powerful, rely on biologically implausible learning mechanisms such as global backpropagation. This paper introduces the Structurally Adaptive Predictive I...
- Demo: Statistically Significant Results On Biases and Errors of LLMs Do Not Guarantee Generalizable Results : Abstract: Recent research has shown that hallucinations, omissions, and biases are prevalent in everyday use-cases of LLMs. However, chatbots used in medical contexts must provide consistent advice in...
- From Models to Operators: Rethinking Autoscaling Granularity for Large Generative Models : Abstract: Serving large generative models such as LLMs and multimodal transformers requires balancing user-facing SLOs (e.g., time-to-first-token, time-between-tokens) with provider goals of efficie...
- Limit Theorems for Stochastic Gradient Descent in High-Dimensional Single-Layer Networks : Abstract: This paper studies the high-dimensional scaling limits of online stochastic gradient descent (SGD) for single-layer networks. Building on the seminal work of Saad and Solla, which analyzed t...
- Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning : Abstract: We study the problem of learning multi-task, multi-agent policies for cooperative, temporal objectives, under centralized training, decentralized execution. In this setting, using automata t...
- An Automated Framework for Strategy Discovery, Retrieval, and Evolution in LLM Jailbreak Attacks : Abstract: The widespread deployment of Large Language Models (LLMs) as public-facing web services and APIs has made their security a core concern for the web ecosystem. Jailbreak attacks, as one of th...
- Let Multimodal Embedders Learn When to Augment Query via Adaptive Query Augmentation : Abstract: Query augmentation makes queries more meaningful by appending further information to the queries to find relevant documents. Current studies have proposed Large Language Model (LLM)-based em...
- A new class of Markov random fields enabling lightweight sampling : Abstract: This work addresses the problem of efficient sampling of Markov random fields (MRF). The sampling of Potts or Ising MRF is most often based on Gibbs sampling, and is thus computationally exp...
- AutoAdv: Automated Adversarial Prompting for Multi-Turn Jailbreaking of Large Language Models : Abstract: Large Language Models (LLMs) remain vulnerable to jailbreaking attacks where adversarial prompts elicit harmful outputs, yet most evaluations focus on single-turn interactions while real-wor...
- Self-Supervised Moving Object Segmentation of Sparse and Noisy Radar Point Clouds : Abstract: Moving object segmentation is a crucial task for safe and reliable autonomous mobile systems like self-driving cars, improving the reliability and robustness of subsequent tasks like SLAM or...
- MammoClean: Toward Reproducible and Bias-Aware AI in Mammography through Dataset Harmonization : Abstract: The development of clinically reliable artificial intelligence (AI) systems for mammography is hindered by profound heterogeneity in data quality, metadata standards, and population distribu...
- Arithmetic Circuits and Neural Networks for Regular Matroids : Abstract: We prove that there exist uniform $(+,\times,/)$-circuits of size $O(n^3)$ to compute the basis generating polynomial of regular matroids on $n$ elements. By tropicalization, this implies th...
- An Adaptive Sampling Framework for Detecting Localized Concept Drift under Label Scarcity : Abstract: Concept drift and label scarcity are two critical challenges limiting the robustness of predictive models in dynamic industrial environments. Existing drift detection methods often assume gl...
- Learning CNF formulas from uniform random solutions in the local lemma regime : Abstract: We study the problem of learning an $n$-variable $k$-CNF formula $\Phi$ from its i.i.d. uniform random solutions, which is equivalent to learning a Boolean Markov random field (MRF) with $k$...
- Many-vs-Many Missile Guidance via Virtual Targets : Abstract: This paper presents a novel approach to many-vs-many missile guidance using virtual targets (VTs) generated by a Normalizing Flows-based trajectory predictor. Rather than assigning n interce...
- Agentic AI for Mobile Network RAN Management and Optimization : Abstract: Agentic AI represents a new paradigm for automating complex systems by using Large AI Models (LAMs) to provide human-level cognitive abilities with multimodal perception, planning, memory, a...
- Forecasting Future Anatomies: Longitudinal Brain MRI-to-MRI Prediction : Abstract: Predicting future brain state from a baseline magnetic resonance image (MRI) is a central challenge in neuroimaging and has important implications for studying neurodegenerative diseases suc...
- RIS-Assisted 3D Spherical Splatting for Object Composition Visualization using Detection Transformers : Abstract: The pursuit of immersive and structurally aware multimedia experiences has intensified interest in sensing modalities that reconstruct objects beyond the limits of visible light. Conventiona...
- TAUE: Training-free Noise Transplant and Cultivation Diffusion Model : Abstract: Despite the remarkable success of text-to-image diffusion models, their output of a single, flattened image remains a critical bottleneck for professional applications requiring layer-wise c...
- Redundancy Maximization as a Principle of Associative Memory Learning : Abstract: Associative memory, traditionally modeled by Hopfield networks, enables the retrieval of previously stored patterns from partial or noisy cues. Yet, the local computational principles which ...
- Verifying LLM Inference to Prevent Model Weight Exfiltration : Abstract: As large AI models become increasingly valuable assets, the risk of model weight exfiltration from inference servers grows accordingly. An attacker controlling an inference server may exfilt...
- The stability of shallow neural networks on spheres: A sharp spectral analysis : Abstract: We present an estimation of the condition numbers of the mass and stiffness matrices arising from shallow ReLU$^k$ neural networks defined on the unit sphere $\mathbb{S}^d$. In...
- Federated Attention: A Distributed Paradigm for Collaborative LLM Inference over Edge Networks : Abstract: Large language models (LLMs) are proliferating rapidly at the edge, delivering intelligent capabilities across diverse application scenarios. However, their practical deployment in collabora...
- RL-Aided Cognitive ISAC: Robust Detection and Sensing-Communication Trade-offs : Abstract: This paper proposes a reinforcement learning (RL)-aided cognitive framework for massive MIMO-based integrated sensing and communication (ISAC) systems employing a uniform planar array (UPA)....
- Optimal Singular Damage: Efficient LLM Inference in Low Storage Regimes : Abstract: Large language models (LLMs) are increasingly prevalent across diverse applications. However, their enormous size limits storage and processing capabilities to a few well-resourced stakehold...
- Optimizing Kernel Discrepancies via Subset Selection : Abstract: Kernel discrepancies are a powerful tool for analyzing worst-case errors in quasi-Monte Carlo (QMC) methods. Building on recent advances in optimizing such discrepancy measures, we extend th...
- Agentic World Modeling for 6G: Near-Real-Time Generative State-Space Reasoning : Abstract: We argue that sixth-generation (6G) intelligence is not fluent token prediction but the capacity to imagine and choose -- to simulate future scenarios, weigh trade-offs, and act with calibra...
- DANIEL: A Distributed and Scalable Approach for Global Representation Learning with EHR Applications : Abstract: Classical probabilistic graphical models face fundamental challenges in modern data environments, which are characterized by high dimensionality, source heterogeneity, and stringent data-sha...
- Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning : Abstract: Tabular data remain the predominant format for real-world applications. Yet, developing effective neural models for tabular data remains challenging due to heterogeneous feature types and co...
- Accelerated Frank-Wolfe Algorithms: Complementarity Conditions and Sparsity : Abstract: We develop new accelerated first-order algorithms in the Frank-Wolfe (FW) family for minimizing smooth convex functions over compact convex sets, with a focus on two prominent constraint cla...
- TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System : Abstract: Large-scale data has driven breakthroughs in robotics, from language models to vision-language-action models in bimanual manipulation. However, humanoid robotics lacks equally effective data...
- Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything : Abstract: Multimodal large language models (MLLMs) have shown strong capabilities but remain limited to fixed modality pairs and require costly fine-tuning with large aligned datasets. Building fully ...
- Explainable Graph Neural Architecture Search via Monte-Carlo Tree Search (Full version) : Abstract: The number of graph neural network (GNN) architectures has increased rapidly due to the growing adoption of graph analysis. Although we use GNNs in wide application scenarios, it is a labori...
- Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space : Abstract: Recently, optimization on the Riemannian manifold has provided valuable insights to the optimization community. In this regard, extending these methods to the Wasserstein space is of par...
- Reset & Distill: A Recipe for Overcoming Negative Transfer in Continual Reinforcement Learning : Abstract: We argue that the negative transfer problem occurring when the new task to learn arrives is an important problem that must not be overlooked when developing effective Continual Reinforcemen...
- Energy-Based Model for Accurate Estimation of Shapley Values in Feature Attribution : Abstract: Shapley value is a widely used tool in explainable artificial intelligence (XAI), as it provides a principled way to attribute contributions of input features to model outputs. However, esti...
- Improving Uncertainty Estimation through Semantically Diverse Language Generation : Abstract: Large language models (LLMs) can suffer from hallucinations when generating text. These hallucinations impede various applications in society and industry by making LLMs untrustworthy. Curre...
- RASPNet: A Benchmark Dataset for Radar Adaptive Signal Processing Applications : Abstract: We present a large-scale dataset called RASPNet for radar adaptive signal processing (RASP) applications to support the development of data-driven models within the adaptive radar community....
- Feature compression is the root cause of adversarial fragility in neural network classifiers : Abstract: In this paper, we uniquely study the adversarial robustness of deep neural networks (NN) for classification tasks against that of optimal classifiers. We look at the smallest magnitude of po...
- Link Prediction with Untrained Message Passing Layers : Abstract: Message passing neural networks (MPNNs) operate on graphs by exchanging information between neighbouring nodes. MPNNs have been successfully applied to various node-, edge-, and graph-level t...
- Network Anomaly Traffic Detection via Multi-view Feature Fusion : Abstract: Traditional anomalous traffic detection methods are based on single-view analysis, which has obvious limitations in dealing with complex attacks and encrypted communications. In this regard,...
- Lower-dimensional projections of cellular expression improves cell type classification from single-cell RNA sequencing : Abstract: Single-cell RNA sequencing (scRNA-seq) enables the study of cellular diversity at single cell level. It provides a global view of cell-type specification during the onset of biological mecha...
- Constrained Optimal Fuel Consumption of HEVs under Observational Noise : Abstract: In our prior work, we investigated the minimum fuel consumption of a hybrid electric vehicle (HEV) under a state-of-charge (SOC) balance constraint, assuming perfect SOC measurements and acc...
- Multiscale spatiotemporal heterogeneity analysis of bike-sharing system's self-loop phenomenon: Evidence from Shanghai : Abstract: Bike-sharing is an environmentally friendly shared mobility mode, but its self-loop phenomenon, where bikes are returned to the same station after several uses, significantly impacts e...
- MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems : Abstract: The sparse Mixture-of-Experts (MoE) architecture is increasingly favored for scaling Large Language Models (LLMs) efficiently, but it depends on heterogeneous compute and memory resources. T...
- LoLaFL: Low-Latency Federated Learning via Forward-only Propagation : Abstract: Federated learning (FL) has emerged as a widely adopted paradigm for enabling edge learning with distributed data while ensuring data privacy. However, the traditional FL with deep neural ne...
- LEASE: Offline Preference-based Reinforcement Learning with High Sample Efficiency : Abstract: Offline preference-based reinforcement learning (PbRL) provides an effective way to overcome the challenges of designing reward and the high costs of online interaction. However, since label...
- UFGraphFR: Graph Federation Recommendation System based on User Text description features : Abstract: Federated learning offers a privacy-preserving framework for recommendation systems by enabling local data processing; however, data localization introduces substantial obstacles. Traditiona...
- Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders : Abstract: Gene regulatory network inference (GRNI) aims to discover how genes causally regulate each other from gene expression data. It is well-known that statistical dependencies in observed data do...
- Personalized Interpolation: Achieving Efficient Conversion Estimation with Flexible Optimization Windows : Abstract: Optimizing conversions is crucial in modern online advertising systems, enabling advertisers to deliver relevant products to users and drive business outcomes. However, accurately predicting...
- Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention : Abstract: State-space models (SSMs) have recently emerged as a compelling alternative to Transformers for sequence modeling tasks. This paper presents a theoretical generalization analysis of selectiv...
- Training Language Models to Reason Efficiently : Abstract: Scaling model size and training data has led to great advances in the performance of Large Language Models (LLMs). However, the diminishing returns of this approach necessitate alternative m...
- Universal Sequence Preconditioning : Abstract: We study the problem of preconditioning in sequential prediction. From the theoretical lens of linear dynamical systems, we show that convolving the target sequence corresponds to applying a...
- Position: Bridge the Gaps between Machine Unlearning and AI Regulation : Abstract: The "right to be forgotten" and the data privacy laws that encode it have motivated machine unlearning since its earliest days. Now, some argue that an inbound wave of artificial intellige...
- Remasking Discrete Diffusion Models with Inference-Time Scaling : Abstract: Part of the success of diffusion models stems from their ability to perform iterative refinement, i.e., repeatedly correcting outputs during generation. However, modern masked discrete diffu...
- Overcoming Non-stationary Dynamics with Evidential Proximal Policy Optimization : Abstract: Continuous control of non-stationary environments is a major challenge for deep reinforcement learning algorithms. The time-dependency of the state transition dynamics aggravates the notorio...
- Closing the Intent-to-Behavior Gap via Fulfillment Priority Logic : Abstract: Practitioners designing reinforcement learning policies face a fundamental challenge: translating intended behavioral objectives into representative reward functions. This challenge stems fr...
- Dense Backpropagation Improves Training for Sparse Mixture-of-Experts : Abstract: Mixture of Experts (MoE) pretraining is more scalable than dense Transformer pretraining, because MoEs learn to route inputs to a sparse set of their feedforward parameters. However, this me...
- Emergence and scaling laws in SGD learning of shallow neural networks : Abstract: We study the complexity of online stochastic gradient descent (SGD) for learning a two-layer neural network with $P$ neurons on isotropic Gaussian data: $f_*(\boldsymbol{x}) = \sum_{p=1}^P a...
- AI-driven software for automated quantification of skeletal metastases and treatment response evaluation using Whole-Body Diffusion-Weighted MRI (WB-DWI) in Advanced Prostate Cancer : Abstract: Quantitative assessment of treatment response in Advanced Prostate Cancer (APC) with bone metastases remains an unmet clinical need. Whole-Body Diffusion-Weighted MRI (WB-DWI) provides two r...
- Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning : Abstract: Offline goal-conditioned reinforcement learning (GCRL) offers a practical learning paradigm in which goal-reaching policies are trained from abundant state-action trajectory datasets without...
- Contrastive Consolidation of Top-Down Modulations Achieves Sparsely Supervised Continual Learning : Abstract: Biological brains learn continually from a stream of unlabeled data, while integrating specialized information from sparsely labeled examples without compromising their ability to generalize...
- Imagine Beyond! Distributionally Robust Auto-Encoding for State Space Coverage in Online Reinforcement Learning : Abstract: Goal-Conditioned Reinforcement Learning (GCRL) enables agents to autonomously acquire diverse behaviors, but faces major challenges in visual environments due to high-dimensional, semantical...
- DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning : Abstract: Reasoning has substantially improved the performance of large language models (LLMs) on complicated tasks. Central to the current reasoning studies, Process Reward Models (PRMs) offer a fine...
- Strategic Classification with Non-Linear Classifiers : Abstract: In strategic classification, the standard supervised learning setting is extended to support the notion of strategic user behavior in the form of costly feature manipulations made in respons...
- From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit : Abstract: Motivated by the hypothesis that neural network representations encode abstract, interpretable features as linearly accessible, approximately orthogonal directions, sparse autoencoders (SAEs...
- Evaluating Sparse Autoencoders: From Shallow Design to Matching Pursuit : Abstract: Sparse autoencoders (SAEs) have recently become central tools for interpretability, leveraging dictionary learning principles to extract sparse, interpretable features from neural representa...
- Neural Collapse in Cumulative Link Models for Ordinal Regression: An Analysis with Unconstrained Feature Model : Abstract: A phenomenon known as "Neural Collapse (NC)" in deep classification tasks, in which the penultimate-layer features and the final classifiers exhibit an extremely simple geometric structure...
- End-to-End Probabilistic Framework for Learning with Hard Constraints : Abstract: We present ProbHardE2E, a probabilistic forecasting framework that incorporates hard operational/physical constraints, and provides uncertainty quantification. Our methodology uses a novel d...
- ABS: Enforcing Constraint Satisfaction On Generated Sequences Via Automata-Guided Beam Search : Abstract: Sequence generation and prediction form a cornerstone of modern machine learning, with applications spanning natural language processing, program synthesis, and time-series forecasting. Thes...
- Two-Player Zero-Sum Games with Bandit Feedback : Abstract: We study a two-player zero-sum game in which the row player aims to maximize their payoff against an adversarial column player, under an unknown payoff matrix estimated through bandit feedba...
- Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models : Abstract: In recent years, diffusion models trained on equilibrium molecular distributions have proven effective for sampling biomolecules. Beyond direct sampling, the score of such a model can also b...
- Modeling Hierarchical Spaces: A Review and Unified Framework for Surrogate-Based Architecture Design : Abstract: Simulation-based problems involving mixed-variable inputs frequently feature domains that are hierarchical, conditional, heterogeneous, or tree-structured. These characteristics pose challen...
- Aggregation of Published Non-Uniform Axial Power Data for Phase II of the OECD/NEA AI/ML Critical Heat Flux Benchmark : Abstract: Critical heat flux (CHF) marks the onset of boiling crisis in light-water reactors, defining safe thermal-hydraulic operating limits. To support Phase II of the OECD/NEA AI/ML CHF benchmark,...
- Relational Causal Discovery with Latent Confounders : Abstract: Estimating causal effects from real-world relational data can be challenging when the underlying causal model and potential confounders are unknown. While several causal discovery algorithms...
- Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate : Abstract: The prevailing paradigm for scaling large language models (LLMs) involves monolithic, end-to-end training, a resource-intensive process that lacks flexibility. This paper explores an alterna...
- Prior-Guided Flow Matching for Target-Aware Molecule Design with Learnable Atom Number : Abstract: Structure-based drug design (SBDD), aiming to generate 3D molecules with high binding affinity toward target proteins, is a vital approach in novel drug discovery. Although recent generative...
- A Compositional Kernel Model for Feature Learning : Abstract: We study a compositional variant of kernel ridge regression in which the predictor is applied to a coordinate-wise reweighting of the inputs. Formulated as a variational problem, this model ...
- Expertise and confidence explain how social influence evolves along intellective tasks : Abstract: Discovering the antecedents of individuals' influence in collaborative environments is an important, practical, and challenging problem. In this paper, we study interpersonal influence in sm...
- Efficient Learning of Quantum States Prepared With Few Non-Clifford Gates : Abstract: We give a pair of algorithms that efficiently learn a quantum state prepared by Clifford gates and $O(\log n)$ non-Clifford gates. Specifically, for an $n$-qubit state $|\psi\rangle$ prepare...
- Testing with Non-identically Distributed Samples : Abstract: We examine the extent to which sublinear-sample property testing and estimation apply to settings where samples are independently but not identically distributed. Specifically, we consider t...
- Cross-modal Diffusion Modelling for Super-resolved Spatial Transcriptomics : Abstract: The recent advancement of spatial transcriptomics (ST) allows the characterization of spatial gene expression within tissue for discovery research. However, current ST platforms suffer from low reso...
- Tracking solutions of time-varying variational inequalities : Abstract: Tracking the solution of time-varying variational inequalities is an important problem with applications in game theory, optimization, and machine learning. Existing work considers time-vary...
- Scaffolded Language Models with Language Supervision for Mixed-Autonomy: A Survey : Abstract: This survey organizes the intricate literature on the design and optimization of emerging structures around post-trained LMs. We refer to this overarching structure as scaffolded LMs and foc...
- Language-Agnostic Modeling of Source Reliability on Wikipedia : Abstract: Over the last few years, verifying the credibility of information sources has become a fundamental need to combat disinformation. Here, we present a language-agnostic model designed to asses...
- Detection Augmented Bandit Procedures for Piecewise Stationary MABs: A Modular Approach : Abstract: Conventional Multi-Armed Bandit (MAB) algorithms are designed for stationary environments, where the reward distributions associated with the arms do not change with time. In many applicatio...
- Interpreting Emergent Features in Deep Learning-based Side-channel Analysis : Abstract: Side-channel analysis (SCA) poses a real-world threat by exploiting unintentional physical signals to extract secret information from secure devices. Evaluation labs also use the same techni...
- Astromer 2 : Abstract: Foundational models have emerged as a powerful paradigm in the deep learning field, leveraging their capacity to learn robust representations from large-scale datasets and effectively to diverse...
- Bayesian Optimization by Kernel Regression and Density-based Exploration : Abstract: Bayesian optimization is highly effective for optimizing expensive-to-evaluate black-box functions, but it faces significant computational challenges due to the high computational complexity...
- Image Super-Resolution with Guarantees via Conformalized Generative Models : Abstract: The increasing use of generative ML foundation models for image restoration tasks such as super-resolution calls for robust and interpretable uncertainty quantification methods. We address t...
- Gradient GA: Gradient Genetic Algorithm for Drug Molecular Design : Abstract: Molecular discovery has brought great benefits to the chemical industry. Various molecule design techniques are developed to identify molecules with desirable properties. Traditional optimiz...
- Detection and Geographic Localization of Natural Objects in the Wild: A Case Study on Palms : Abstract: Palms are ecological and economic indicators of tropical forest health, biodiversity, and human impact that support local economies and global forest product supply chains. While palm ...
- Towards efficient quantum algorithms for diffusion probabilistic models : Abstract: A diffusion probabilistic model (DPM) is a generative model renowned for its ability to produce high-quality outputs in tasks such as image and audio generation. However, training DPMs on la...
- Rethinking Video Super-Resolution: Towards Diffusion-Based Methods without Motion Alignment : Abstract: In this work, we rethink the approach to video super-resolution by introducing a method based on the Diffusion Posterior Sampling framework, combined with an unconditional video diffusion tr...
- FORTALESA: Fault-Tolerant Reconfigurable Systolic Array for DNN Inference : Abstract: The emergence of Deep Neural Networks (DNNs) in mission- and safety-critical applications brings their reliability to the front. High performance demands of DNNs require the use of specializ...
- Learning Interactive World Model for Object-Centric Reinforcement Learning : Abstract: Agents that understand objects and their interactions can learn policies that are more robust and transferable. However, most object-centric RL methods factor state by individual objects whi...
- Opportunistic Expert Activation: Batch-Aware Expert Routing for Faster Decode Without Retraining : Abstract: An increasing number of LLMs employ Mixture-of-Experts (MoE) architectures where the feed-forward layer is replaced by a pool of experts and each token only activates a small subset of them....
- Neural network initialization with nonlinear characteristics and information on spectral bias : Abstract: Initialization of neural network parameters, such as weights and biases, has a crucial impact on learning performance; if chosen well, we can even avoid the need for additional training with...
- Probabilistic Graph Cuts : Abstract: Probabilistic relaxations of graph cuts offer a differentiable alternative to spectral clustering, enabling end-to-end and online learning without eigendecompositions, yet prior work centere...
- Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness : Abstract: Smoothness is known to be crucial for acceleration in offline optimization, and for gradient-variation regret minimization in online learning. Interestingly, these two problems are actually ...
- Reinforcement learning based data assimilation for unknown state model : Abstract: Data assimilation (DA) has increasingly emerged as a critical tool for state estimation across a wide range of applications. It is significantly challenging when the governing equations of...
- Federated Quantum Kernel Learning for Anomaly Detection in Multivariate IoT Time-Series : Abstract: The rapid growth of industrial Internet of Things (IIoT) systems has created new challenges for anomaly detection in high-dimensional, multivariate time-series, where privacy, scalability, a...
- FP8-Flow-MoE: A Casting-Free FP8 Recipe without Double Quantization Error : Abstract: Training large Mixture-of-Experts (MoE) models remains computationally prohibitive due to their extreme compute and memory demands. Although low-precision training promises to accelerate com...
- The Sequential Edge: Inverse-Entropy Voting Beats Parallel Self-Consistency at Matched Compute : Abstract: We revisit test-time scaling for language model reasoning and ask a fundamental question: at equal token budget and compute, is it better to run multiple independent chains in parallel, or t...
- Large-scale automatic carbon ion treatment planning for head and neck cancers via parallel multi-agent reinforcement learning : Abstract: Head-and-neck cancer (HNC) planning is difficult because multiple critical organs-at-risk (OARs) are close to complex targets. Intensity-modulated carbon-ion therapy (IMCT) offers superior d...
- RoME: Domain-Robust Mixture-of-Experts for MILP Solution Prediction across Domains : Abstract: Mixed-Integer Linear Programming (MILP) is a fundamental and powerful framework for modeling complex optimization problems across diverse domains. Recently, learning-based methods have shown...
- Learning A Universal Crime Predictor with Knowledge-guided Hypernetworks : Abstract: Predicting crimes in urban environments is crucial for public safety, yet existing prediction methods often struggle to align the knowledge across diverse cities that vary dramatically in da...
- Reducing normalizing flow complexity for MCMC preconditioning : Abstract: Preconditioning is a key component of MCMC algorithms that improves sampling efficiency by facilitating exploration of geometrically complex target distributions through an invertible map. W...
- Human-Machine Ritual: Synergic Performance through Real-Time Motion Recognition : Abstract: We introduce a lightweight, real-time motion recognition system that enables synergic human-machine performance through wearable IMU sensor data, MiniRocket time-series classification, and r...
- Evolving Graph Learning for Out-of-Distribution Generalization in Non-stationary Environments : Abstract: Graph neural networks have shown remarkable success in exploiting the spatial and temporal patterns on dynamic graphs. However, existing GNNs exhibit poor generalization ability under distri...
- LUMA-RAG: Lifelong Multimodal Agents with Provably Stable Streaming Alignment : Abstract: Retrieval-Augmented Generation (RAG) has emerged as the dominant paradigm for grounding large language model outputs in verifiable evidence. However, as modern AI agents transition from stat...
- H-Infinity Filter Enhanced CNN-LSTM for Arrhythmia Detection from Heart Sound Recordings : Abstract: Early detection of heart arrhythmia can prevent severe future complications in cardiac patients. While manual diagnosis still remains the clinical standard, it relies heavily on visual inter...
- A Spatially Informed Gaussian Process UCB Method for Decentralized Coverage Control : Abstract: We present a novel decentralized algorithm for coverage control in unknown spatial environments modeled by Gaussian Processes (GPs). To trade off between exploration and exploitation, each a...
- Improving Unlearning with Model Updates Probably Aligned with Gradients : Abstract: We formulate the machine unlearning problem as a general constrained optimization problem. It unifies the first-order methods from the approximate machine unlearning literature. This paper t...
- Accounting for Underspecification in Statistical Claims of Model Superiority : Abstract: Machine learning methods are increasingly applied in medical imaging, yet many reported improvements lack statistical robustness: recent works have highlighted that small but significant per...
- SKGE: Spherical Knowledge Graph Embedding with Geometric Regularization : Abstract: Knowledge graph embedding (KGE) has become a fundamental technique for representation learning on multi-relational data. Many seminal models, such as TransE, operate in an unbounded Euclidea...
- NOWS: Neural Operator Warm Starts for Accelerating Iterative Solvers : Abstract: Partial differential equations (PDEs) underpin quantitative descriptions across the physical sciences and engineering, yet high-fidelity simulation remains a major computational bottleneck f...
- BRAINS: A Retrieval-Augmented System for Alzheimer's Detection and Monitoring : Abstract: As the global burden of Alzheimer's disease (AD) continues to grow, early and accurate detection has become increasingly critical, especially in regions with limited access to advanced diagn...
- Variational Geometric Information Bottleneck: Learning the Shape of Understanding : Abstract: We propose a unified information-geometric framework that formalizes understanding in learning as a trade-off between informativeness and geometric simplicity. An encoder phi is evaluated by...
- An End-to-End Learning Approach for Solving Capacitated Location-Routing Problems : Abstract: The capacitated location-routing problems (CLRPs) are classical problems in combinatorial optimization, which require simultaneously making location and routing decisions. In CLRPs, the comp...
- Causal Graph Neural Networks for Healthcare : Abstract: Healthcare artificial intelligence systems routinely fail when deployed across institutions, with documented performance drops and perpetuation of discriminatory patterns embedded in histori...
- Rawlsian many-to-one matching with non-linear utility : Abstract: We study a many-to-one matching problem, such as the college admission problem, where each college can admit multiple students. Unlike classical models, colleges evaluate sets of students th...
- Theoretical Guarantees for Causal Discovery on Large Random Graphs : Abstract: We investigate theoretical guarantees for the false-negative rate (FNR) -- the fraction of true causal edges whose orientation is not recovered, under single-variable random interventions an...
- Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning : Abstract: Offline reinforcement learning (RL) suffers from extrapolation errors induced by out-of-distribution (OOD) actions. To address this, offline RL algorithms typically impose constraints on act...
- Dynamic Priors in Bayesian Optimization for Hyperparameter Optimization : Abstract: Hyperparameter optimization (HPO), for example, based on Bayesian optimization (BO), supports users in designing models well-suited for a given dataset. HPO has proven its effectiveness on s...
- Directional-Clamp PPO : Abstract: Proximal Policy Optimization (PPO) is widely regarded as one of the most successful deep reinforcement learning algorithms, known for its robustness and effectiveness across a range of probl...
- A Large Language Model for Corporate Credit Scoring : Abstract: We introduce Omega^2, a Large Language Model-driven framework for corporate credit scoring that combines structured financial data with advanced machine learning to improve predictive reliab...
- Neural Network Interoperability Across Platforms : Abstract: The development of smart systems (i.e., systems enhanced with AI components) has thrived thanks to the rapid advancements in neural networks (NNs). A wide range of libraries and frameworks h...
- A Non-Adversarial Approach to Idempotent Generative Modelling : Abstract: Idempotent Generative Networks (IGNs) are deep generative models that also function as local data manifold projectors, mapping arbitrary inputs back onto the manifold. They are trained to ac...
- Recursively Enumerably Representable Classes and Computable Versions of the Fundamental Theorem of Statistical Learning : Abstract: We study computable probably approximately correct (CPAC) learning, where learners are required to be computable functions. It had been previously observed that the Fundamental Theorem of St...
- Natural-gas storage modelling by deep reinforcement learning : Abstract: We introduce GasRL, a simulator that couples a calibrated representation of the natural gas market with a model of storage-operator policies trained with deep reinforcement learning (RL). We...
- Apriel-H1: Towards Efficient Enterprise Reasoning Models : Abstract: Large Language Models (LLMs) achieve remarkable reasoning capabilities through transformer architectures with attention mechanisms. However, transformers suffer from quadratic time and memor...
- Nesterov-Accelerated Robust Federated Learning Over Byzantine Adversaries : Abstract: We investigate robust federated learning, where a group of workers collaboratively train a shared model under the orchestration of a central server in the presence of Byzantine adversaries c...
- In Situ Training of Implicit Neural Compressors for Scientific Simulations via Sketch-Based Regularization : Abstract: Focusing on implicit neural representations, we present a novel in situ training protocol that employs limited memory buffers of full and sketched data samples, where the sketched data are l...
- Scalable Evaluation and Neural Models for Compositional Generalization : Abstract: Compositional generalization, a key open challenge in modern machine learning, requires models to predict unknown combinations of known concepts. However, assessing compositional generalizatio...
- Curriculum Design for Trajectory-Constrained Agent: Compressing Chain-of-Thought Tokens in LLMs : Abstract: Training agents to operate under strict constraints during deployment, such as limited resource budgets or stringent safety requirements, presents significant challenges, especially when the...
- Does Interpretability of Knowledge Tracing Models Support Teacher Decision Making? : Abstract: Knowledge tracing (KT) models are a crucial basis for pedagogical decision-making, namely which task to select next for a learner and when to stop teaching a particular skill. Given the high...
- Calibration improves detection of mislabeled examples : Abstract: Mislabeled data is a pervasive issue that undermines the performance of machine learning systems in real-world applications. An effective approach to mitigate this problem is to detect misla...
- ConMeZO: Adaptive Descent-Direction Sampling for Gradient-Free Finetuning of Large Language Models : Abstract: Zeroth-order or derivative-free optimization (MeZO) is an attractive strategy for finetuning large language models (LLMs) because it eliminates the memory overhead of backpropagation. Howeve...
- From Solo to Symphony: Orchestrating Multi-Agent Collaboration with Single-Agent Demos : Abstract: Training a team of agents from scratch in multi-agent reinforcement learning (MARL) is highly inefficient, much like asking beginners to play a symphony together without first practicing sol...
- VecComp: Vector Computing via MIMO Digital Over-the-Air Computation : Abstract: Recently, the ChannelComp framework has proposed digital over-the-air computation by designing digital modulations that enable the computation of arbitrary functions. Unlike traditional anal...
- STAR-VAE: Latent Variable Transformers for Scalable and Controllable Molecular Generation : Abstract: The chemical space of drug-like molecules is vast, motivating the development of generative models that must learn broad chemical distributions, enable conditional generation by capturing st...
- Adam Reduces a Unique Form of Sharpness: Theoretical Insights Near the Minimizer Manifold : Abstract: Despite the popularity of the Adam optimizer in practice, most theoretical analyses study Stochastic Gradient Descent (SGD) as a proxy for Adam, and little is known about how the solutions f...
- Efficient Vector Symbolic Architectures from Histogram Recovery : Abstract: Vector symbolic architectures (VSAs) are a family of information representation techniques which enable composition, i.e., creating complex information structures from atomic vectors via bin...
- A Detailed Study on LLM Biases Concerning Corporate Social Responsibility and Green Supply Chains : Abstract: Organizations increasingly use Large Language Models (LLMs) to improve supply chain processes and reduce environmental impacts. However, LLMs have been shown to reproduce biases regarding th...
- SmartMLOps Studio: Design of an LLM-Integrated IDE with Automated MLOps Pipelines for Model Development and Monitoring : Abstract: The rapid expansion of artificial intelligence and machine learning (ML) applications has intensified the demand for integrated environments that unify model development, deployment, and mon...
- Trove: A Flexible Toolkit for Dense Retrieval : Abstract: We introduce Trove, an easy-to-use open-source retrieval toolkit that simplifies research experiments without sacrificing flexibility or speed. For the first time, we introduce efficient dat...
- Learning Complementary Policies for Human-AI Teams : Abstract: This paper tackles the critical challenge of human-AI complementarity in decision-making. Departing from the traditional focus on algorithmic performance in favor of performance of the human...
- Interpretable end-to-end Neurosymbolic Reinforcement Learning agents : Abstract: Deep reinforcement learning (RL) agents rely on shortcut learning, preventing them from generalizing to slightly different environments. To address this problem, symbolic methods that use ob...
- The Digital Ecosystem of Beliefs: does evolution favour AI over humans? : Abstract: As AI systems are integrated into social networks, there are AI safety concerns that AI-generated content may dominate the web, e.g. in popularity or impact on beliefs. To understand such qu...
- Survey Transfer Learning: Recycling Data with Silicon Responses : Abstract: As researchers increasingly turn to large language models (LLMs) to generate synthetic survey data, less attention has been paid to alternative AI paradigms given environmental costs of LLMs...
- The Limits of AI Explainability: An Algorithmic Information Theory Approach : Abstract: This paper establishes a theoretical foundation for understanding the fundamental limits of AI explainability through algorithmic information theory. We formalize explainability as the appro...
- LiteCUA: Computer as MCP Server for Computer-Use Agent on AIOS : Abstract: We present AIOS 1.0, a novel platform designed to advance computer-use agent (CUA) capabilities through environmental contextualization. While existing approaches primarily focus on building...
- CoP: Agentic Red-teaming for Large Language Models using Composition of Principles : Abstract: Recent advances in Large Language Models (LLMs) have spurred transformative applications in various domains, ranging from open-source to proprietary LLMs. However, jailbreak attacks, which a...
- AnyMAC: Cascading Flexible Multi-Agent Collaboration via Next-Agent Prediction : Abstract: Recent progress in large language model (LLM)-based multi-agent collaboration highlights the power of structured communication in enabling collective intelligence. However, existing methods ...
- Limits of Safe AI Deployment: Differentiating Oversight and Control : Abstract: Oversight and control, which we collectively call supervision, are often discussed as ways to ensure that AI systems are accountable, reliable, and able to fulfill governance and management ...
- Agentic Large Language Models for Conceptual Systems Engineering and Design : Abstract: Early-stage engineering design involves complex, iterative reasoning, yet existing large language model (LLM) workflows struggle to maintain task continuity and generate executable models. W...
- Prevailing Research Areas for Music AI in the Era of Foundation Models : Abstract: Parallel to rapid advancements in foundation model research, the past few years have witnessed a surge in music AI applications. As AI-generated and AI-augmented music become increasingly ma...
- FTSmartAudit: A Knowledge Distillation-Enhanced Framework for Automated Smart Contract Auditing Using Fine-Tuned LLMs : Abstract: The rapid growth of blockchain technology has driven the widespread adoption of smart contracts. However, their inherent vulnerabilities have led to significant financial losses. Traditional...
- Runtime Analysis of Evolutionary Algorithms for Multi-party Multi-objective Optimization : Abstract: In scenarios where multiple decision-makers operate within a common decision space, each focusing on their own multi-objective optimization problem (e.g., bargaining games), the problem can ...
- Generative AI and Empirical Software Engineering: A Paradigm Shift : Abstract: The adoption of large language models (LLMs) and autonomous agents in software engineering marks an enduring paradigm shift. These systems create new opportunities for tool design, workflow ...
- HCT-QA: A Benchmark for Question Answering on Human-Centric Tables : Abstract: Tabular data embedded within PDF files, web pages, and other document formats are prevalent across numerous sectors such as government, engineering, science, and business. These human-centri...
- Memory Assisted LLM for Personalized Recommendation System : Abstract: Large language models (LLMs) have demonstrated significant potential in solving recommendation tasks. With proven capabilities in understanding user preferences, LLM personalization has emer...
- In Dialogue with Intelligence: Rethinking Large Language Models as Collective Knowledge : Abstract: Large Language Models (LLMs) can be understood as Collective Knowledge (CK): a condensation of human cultural and technical output, whose apparent intelligence emerges in dialogue. This pers...
- Balancing Caregiving and Self-Care: Exploring Mental Health Needs of Alzheimer's and Dementia Caregivers : Abstract: Alzheimer's Disease and Related Dementias (AD/ADRD) are progressive neurodegenerative conditions that impair memory, thought processes, and functioning. Family caregivers of individuals with...
- PPMI: Privacy-Preserving LLM Interaction with Socratic Chain-of-Thought Reasoning and Homomorphically Encrypted Vector Databases : Abstract: Large language models (LLMs) are increasingly used as personal agents, accessing sensitive user data such as calendars, emails, and medical records. Users currently face a trade-off: They ca...
- A Collectivist, Economic Perspective on AI : Abstract: Information technology is in the midst of a revolution in which omnipresent data collection and machine learning are impacting the human world as never before. The word "intelligence" is bei...
- H-NeiFi: Non-Invasive and Consensus-Efficient Multi-Agent Opinion Guidance : Abstract: The openness of social media enables the free exchange of opinions, but it also presents challenges in guiding opinion evolution towards global consensus. Existing methods often directly mod...
- CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization : Abstract: Developing efficient CUDA kernels is increasingly critical for AI applications such as large-scale LLM training. However, manual kernel design is both costly and time-consuming, motivating a...
- Retrieval-Augmented Multimodal Depression Detection : Abstract: Multimodal deep learning has shown promise in depression detection by integrating text, audio, and video signals. Recent work leverages sentiment analysis to enhance emotional understanding,...
- The Eigenvalues Entropy as a Classifier Evaluation Measure : Abstract: Classification is a machine learning method used in many practical applications: text mining, handwritten character recognition, face recognition, pattern classification, scene labeling, com...
- Variational Geometry-aware Neural Network based Method for Solving High-dimensional Diffeomorphic Mapping Problems : Abstract: Traditional methods for high-dimensional diffeomorphic mapping often struggle with the curse of dimensionality. We propose a mesh-free learning framework designed for $n$-dimensional mapping...
- Superpositional Gradient Descent: Harnessing Quantum Principles for Model Training : Abstract: Large language models (LLMs) are increasingly trained with classical optimization techniques like AdamW to improve convergence and generalization. However, the mechanisms by which quantum-in...
- Neural Green's Functions : Abstract: We introduce Neural Green's Function, a neural solution operator for linear partial differential equations (PDEs) whose differential operators admit eigendecompositions. Inspired by Green's ...
- DeepContour: A Hybrid Deep Learning Framework for Accelerating Generalized Eigenvalue Problem Solving via Efficient Contour Design : Abstract: Solving large-scale Generalized Eigenvalue Problems (GEPs) is a fundamental yet computationally prohibitive task in science and engineering. As a promising direction, contour integral (CI) m...
- Dynamic Population Distribution Aware Human Trajectory Generation with Diffusion Model : Abstract: Human trajectory data is crucial in urban planning, traffic engineering, and public health. However, directly using real-world trajectory data often faces challenges such as privacy concerns...
- Deciphering Personalization: Towards Fine-Grained Explainability in Natural Language for Personalized Image Generation Models : Abstract: Image generation models are usually personalized in practical uses in order to better meet the individual users' heterogeneous needs, but most personalized models lack explainability about h...
- Tool Zero: Training Tool-Augmented LLMs via Pure RL from Scratch : Abstract: Training tool-augmented LLMs has emerged as a promising approach to enhancing language models' capabilities for complex tasks. The current supervised fine-tuning paradigm relies on construct...
- Q-Sat AI: Machine Learning-Based Decision Support for Data Saturation in Qualitative Studies : Abstract: The determination of sample size in qualitative research has traditionally relied on the subjective and often ambiguous principle of data saturation, which can lead to inconsistencies and th...
- Shorter but not Worse: Frugal Reasoning via Easy Samples as Length Regularizers in Math RLVR : Abstract: Large language models (LLMs) trained for step-by-step reasoning often become excessively verbose, raising inference cost. Standard Reinforcement Learning with Verifiable Rewards (RLVR) pipel...
- The Geometry of Grokking: Norm Minimization on the Zero-Loss Manifold : Abstract: Grokking is a puzzling phenomenon in neural networks where full generalization occurs only after a substantial delay following the complete memorization of the training data. Previous resear...
- Learning a Distance for the Clustering of Patients with Amyotrophic Lateral Sclerosis : Abstract: Amyotrophic lateral sclerosis (ALS) is a severe disease with a typical survival of 3-5 years after symptom onset. Current treatments offer only limited life extension, and the variability in...
- COFAP: A Universal Framework for COFs Adsorption Prediction through Designed Multi-Modal Extraction and Cross-Modal Synergy : Abstract: Covalent organic frameworks (COFs) are promising adsorbents for gas adsorption and separation, while identifying the optimal structures among their vast design space requires efficient high-...
- Interpretable Heart Disease Prediction via a Weighted Ensemble Model: A Large-Scale Study with SHAP and Surrogate Decision Trees : Abstract: Cardiovascular disease (CVD) remains a critical global health concern, demanding reliable and interpretable predictive models for early risk assessment. This study presents a large-scale ana...
- EchoLSTM: A Self-Reflective Recurrent Network for Stabilizing Long-Range Memory : Abstract: Standard Recurrent Neural Networks, including LSTMs, struggle to model long-range dependencies, particularly in sequences containing noisy or misleading information. We propose a new archite...
- NeuroClean: A Generalized Machine-Learning Approach to Neural Time-Series Conditioning : Abstract: Electroencephalography (EEG) and local field potentials (LFP) are two widely used techniques to record electrical activity from the brain. These signals are used in both the clinical and res...
- Bulk-boundary decomposition of neural networks : Abstract: We present the bulk-boundary decomposition as a new framework for understanding the training dynamics of deep neural networks. Starting from the stochastic gradient descent formulation, we s...
- TapOut: A Bandit-Based Approach to Dynamic Speculative Decoding : Abstract: Speculative decoding accelerates LLMs by using a lightweight draft model to generate tokens autoregressively before verifying them in parallel with a larger target model. However, determinin...
- Shared Parameter Subspaces and Cross-Task Linearity in Emergently Misaligned Behavior : Abstract: Recent work has discovered that large language models can develop broadly misaligned behaviors after being fine-tuned on narrowly harmful datasets, a phenomenon known as emergent misalignmen...
- Path-Coordinated Continual Learning with Neural Tangent Kernel-Justified Plasticity: A Theoretical Framework with Near State-of-the-Art Performance : Abstract: Catastrophic forgetting is one of the fundamental issues of continual learning because neural networks forget the tasks learned previously when trained on new tasks. The proposed framework i...
- RobustFSM: Submodular Maximization in Federated Setting with Malicious Clients : Abstract: Submodular maximization is an optimization problem benefiting many machine learning applications, where we seek a small subset best representing an extremely large dataset. We focus on the f...
- Predicting Microbial Interactions Using Graph Neural Networks : Abstract: Predicting interspecies interactions is a key challenge in microbial ecology, as these interactions are critical to determining the structure and activity of microbial communities. In this w...
- Quantum-Enhanced Generative Models for Rare Event Prediction : Abstract: Rare events such as financial crashes, climate extremes, and biological anomalies are notoriously difficult to model due to their scarcity and heavy-tailed distributions. Classical deep gene...
- Flashlight: PyTorch Compiler Extensions to Accelerate Attention Variants : Abstract: Attention is a fundamental building block of large language models (LLMs), so there have been many efforts to implement it efficiently. For example, ...
- Regularization Through Reasoning: Systematic Improvements in Language Model Classification via Explanation-Enhanced Fine-Tuning : Abstract: Fine-tuning LLMs for classification typically maps inputs directly to labels. We ask whether attaching brief explanations to each label during fine-tuning yields better models. We evaluate c...
- A Dual-Use Framework for Clinical Gait Analysis: Attention-Based Sensor Optimization and Automated Dataset Auditing : Abstract: Objective gait analysis using wearable sensors and AI is critical for managing neurological and orthopedic conditions. However, models are vulnerable to hidden dataset biases, and task-speci...
- Finding Probably Approximate Optimal Solutions by Training to Estimate the Optimal Values of Subproblems : Abstract: The paper is about developing a solver for maximizing a real-valued function of binary variables. The solver relies on an algorithm that estimates the optimal objective-function value of ins...
- Beyond Static Cutoffs: One-Shot Dynamic Thresholding for Diffusion Language Models : Abstract: Masked diffusion language models (MDLMs) are becoming competitive with their autoregressive counterparts but typically decode with fixed steps and sequential unmasking. To accelerate decodin...
- Energy Loss Functions for Physical Systems : Abstract: Effectively leveraging prior knowledge of a system's physics is crucial for applications of machine learning to scientific domains. Previous approaches mostly focused on incorporating physic...
- LLM Probing with Contrastive Eigenproblems: Improving Understanding and Applicability of CCS : Abstract: Contrast-Consistent Search (CCS) is an unsupervised probing method able to test whether large language models represent binary features, such as sentence truth, in their internal activations...
- Natural Building Blocks for Structured World Models: Theory, Evidence, and Scaling : Abstract: The field of world modeling is fragmented, with researchers developing bespoke architectures that rarely build upon each other. We propose a framework that specifies the natural building blo...
- Uncertainty Guided Online Ensemble for Non-stationary Data Streams in Fusion Science : Abstract: Machine Learning (ML) is poised to play a pivotal role in the development and operation of next-generation fusion devices. Fusion data shows non-stationary behavior with distribution drifts,...
- Geometric Data Valuation via Leverage Scores : Abstract: Shapley data valuation provides a principled, axiomatic framework for assigning importance to individual datapoints, and has gained traction in dataset curation, pruning, and pricing. Howeve...
- Measuring the Intrinsic Dimension of Earth Representations : Abstract: Within the context of representation learning for Earth observation, geographic Implicit Neural Representations (INRs) embed low-dimensional location inputs (longitude, latitude) into high-d...
- Matrix Sensing with Kernel Optimal Loss: Robustness and Optimization Landscape : Abstract: In this paper we study how the choice of loss functions of non-convex optimization problems affects their robustness and optimization landscape, through the study of noisy matrix sensing. In...
- Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits : Abstract: Variance-dependent regret bounds have received increasing attention in recent studies on contextual bandits. However, most of these studies are focused on upper confidence bound (UCB)-based ...
- QuPCG: Quantum Convolutional Neural Network for Detecting Abnormal Patterns in PCG Signals : Abstract: Early identification of abnormal physiological patterns is essential for the timely detection of cardiac disease. This work introduces a hybrid quantum-classical convolutional neural network...
- Disentangling Causal Substructures for Interpretable and Generalizable Drug Synergy Prediction : Abstract: Drug synergy prediction is a critical task in the development of effective combination therapies for complex diseases, including cancer. Although existing methods have shown promising result...
- CFL: On the Use of Characteristic Function Loss for Domain Alignment in Machine Learning : Abstract: Machine Learning (ML) models are extensively used in various applications due to their significant advantages over traditional learning methods. However, the developed ML models often underp...
- ProtoTSNet: Interpretable Multivariate Time Series Classification With Prototypical Parts : Abstract: Time series data is one of the most popular data modalities in critical domains such as industry and medicine. The demand for algorithms that not only exhibit high accuracy but also offer in...
- Tackling Incomplete Data in Air Quality Prediction: A Bayesian Deep Learning Framework for Uncertainty Quantification : Abstract: Accurate air quality forecasts are vital for public health alerts, exposure assessment, and emissions control. In practice, observational data are often missing in varying proportions and pa...
- OmniField: Conditioned Neural Fields for Robust Multimodal Spatiotemporal Learning : Abstract: Multimodal spatiotemporal learning on real-world experimental data is constrained by two challenges: within-modality measurements are sparse, irregular, and noisy (QA/QC artifacts) but cross...
- GEPOC Parameters -- Open Source Parametrisation and Validation for Austria, Version 2.0 : Abstract: GEPOC, short for Generic Population Concept, is a collection of models and methods for analysing population-level research questions. For the valid application of the models for a specific c...
- Engineering.ai: A Platform for Teams of AI Engineers in Computational Design : Abstract: In modern engineering practice, human engineers collaborate in specialized teams to design complex products, with each expert completing their respective tasks while communicating and exchan...
- Incremental Selection of Most-Filtering Conjectures and Proofs of the Selected Conjectures : Abstract: We present an improved, incremental version of the selection algorithm presented in [1] and prove all the selected conjectures.
- Better Call CLAUSE: A Discrepancy Benchmark for Auditing LLMs Legal Reasoning Capabilities : Abstract: The rapid integration of large language models (LLMs) into high-stakes legal work has exposed a critical gap: no benchmark exists to systematically stress-test their reliability against the ...
- A Multimodal Framework for Depression Detection during Covid-19 via Harvesting Social Media: A Novel Dataset and Method : Abstract: The recent coronavirus disease (Covid-19) has become a pandemic and has affected the entire globe. During the pandemic, we have observed a spike in cases related to mental health, such as an...
- GraphChain: Large Language Models for Large-scale Graph Analysis via Tool Chaining : Abstract: Large Language Models (LLMs) face significant limitations when applied to large-scale graphs, struggling with context constraints and inflexible reasoning. We present GraphChain, a framework...
- Reimagining Safety Alignment with An Image : Abstract: Large language models (LLMs) excel in diverse applications but face dual challenges: generating harmful content under jailbreak attacks and over-refusal of benign queries due to rigid safety...
- Efficient Generation of Binary Magic Squares : Abstract: We propose a simple algorithm for generating Binary Magic Squares (BMS), i.e., square binary matrices where the sums of all rows and all columns are equal. We show by induction that our algor...
- PreferThinker: Reasoning-based Personalized Image Preference Assessment : Abstract: Personalized image preference assessment aims to evaluate an individual user's image preferences by relying only on a small set of reference images as prior information. Existing methods mai...
- Lifted Successor Generation in Numeric Planning : Abstract: Most planners ground numeric planning tasks, given in a first-order-like language, into a ground task representation. However, this can lead to an exponential blowup in task representation s...
- Ariadne: A Controllable Framework for Probing and Extending VLM Reasoning Boundaries : Abstract: While Vision-Language Models (VLMs) post-trained with Reinforcement Learning (RL) show impressive general reasoning, their evaluation is often confined to language-dominant tasks (e.g., math...
- Active Thinking Model: A Goal-Directed Self-Improving Framework for Real-World Adaptive Intelligence : Abstract: Real-world artificial intelligence (AI) systems are increasingly required to operate autonomously in dynamic, uncertain, and continuously changing environments. However, most existing AI mod...
- How Focused Are LLMs? A Quantitative Study via Repetitive Deterministic Prediction Tasks : Abstract: We investigate the performance of large language models on repetitive deterministic prediction tasks and study how the sequence accuracy rate scales with output length. Each such task involv...
- Count-Based Approaches Remain Strong: A Benchmark Against Transformer and LLM Pipelines on Structured EHR : Abstract: Structured electronic health records (EHR) are essential for clinical prediction. While count-based learners continue to perform strongly on such data, no benchmarking has directly compared ...
- Do Math Reasoning LLMs Help Predict the Impact of Public Transit Events? : Abstract: Predicting public transit incident duration from unstructured text alerts is a critical but challenging task. Addressing the domain sparsity of transit operations with standard Supervised Fi...
- AI for pRedicting Exacerbations in KIDs with aSthma (AIRE-KIDS) : Abstract: Recurrent exacerbations remain a common yet preventable outcome for many children with asthma. Machine learning (ML) algorithms using electronic medical records (EMR) could allow accurate id...
- Knowledge Elicitation with Large Language Models for Interpretable Cancer Stage Identification from Pathology Reports : Abstract: Cancer staging is critical for patient prognosis and treatment planning, yet extracting pathologic TNM staging from unstructured pathology reports poses a persistent challenge. Existing natu...
- Efficient Test-Time Retrieval Augmented Generation : Abstract: Although Large Language Models (LLMs) demonstrate significant capabilities, their reliance on parametric knowledge often leads to inaccuracies. Retrieval Augmented Generation (RAG) mitigates...
- Modular Task Decomposition and Dynamic Collaboration in Multi-Agent Systems Driven by Large Language Models : Abstract: This paper addresses the limitations of a single agent in task decomposition and collaboration during complex task execution, and proposes a multi-agent architecture for modular task decompo...
- DART: Difficulty-Adaptive Reasoning Truncation for Efficient Large Language Models : Abstract: Adaptive reasoning is essential for aligning the computational effort of large language models (LLMs) with the intrinsic difficulty of problems. Current chain-of-thought methods boost reason...
- MiRAGE: Misconception Detection with Retrieval-Guided Multi-Stage Reasoning and Ensemble Fusion : Abstract: Detecting student misconceptions in open-ended responses is a longstanding challenge, demanding semantic precision and logical reasoning. We propose MiRAGE - Misconception Detection with Ret...
- QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code : Abstract: Compilers, while essential, are notoriously complex systems that demand prohibitively expensive human expertise to develop and maintain. The recent advancements in Large Language Models (LLM...
- Graph Neural Network-Based Semi-Supervised Open-Set Fault Diagnosis for Marine Machinery Systems : Abstract: Recently, fault diagnosis methods for marine machinery systems based on deep learning models have attracted considerable attention in the shipping industry. Most existing studies assume faul...
- llmSHAP: A Principled Approach to LLM Explainability : Abstract: Feature attribution methods help make machine learning-based inference explainable by determining how much one or several features have contributed to a model's output. A particularly popula...
- OmniFuser: Adaptive Multimodal Fusion for Service-Oriented Predictive Maintenance : Abstract: Accurate and timely prediction of tool conditions is critical for intelligent manufacturing systems, where unplanned tool failures can lead to quality degradation and production downtime. In...
- Unbiased Platform-Level Causal Estimation for Search Systems: A Competitive Isolation PSM-DID Framework : Abstract: Evaluating platform-level interventions in search-based two-sided marketplaces is fundamentally challenged by systemic effects such as spillovers and network interference. While widely used ...
- Automatic Minds: Cognitive Parallels Between Hypnotic States and Large Language Model Processing : Abstract: The cognitive processes of the hypnotized mind and the computational operations of large language models (LLMs) share deep functional parallels. Both systems generate sophisticated, contextu...
- Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges : Abstract: Identifying the vulnerabilities of large language models (LLMs) is crucial for improving their safety by addressing inherent weaknesses. Jailbreaks, in which adversaries bypass safeguards wi...
- Relaxing partition admissibility in Cluster-DAGs: a causal calculus with arbitrary variable clustering : Abstract: Cluster DAGs (C-DAGs) provide an abstraction of causal graphs in which nodes represent clusters of variables, and edges encode both cluster-level causal relationships and dependencies arisen...
- Modulation of temporal decision-making in a deep reinforcement learning agent under the dual-task paradigm : Abstract: This study explores the interference in temporal processing within a dual-task paradigm from an artificial intelligence (AI) perspective. In this context, the dual-task setup is implemented ...
- Robust Multimodal Sentiment Analysis via Double Information Bottleneck : Abstract: Multimodal sentiment analysis has received significant attention across diverse research domains. Despite advancements in algorithm design, existing approaches suffer from two critical limit...
- From Passive to Proactive: A Multi-Agent System with Dynamic Task Orchestration for Intelligent Medical Pre-Consultation : Abstract: Global healthcare systems face critical challenges from increasing patient volumes and limited consultation times, with primary care visits averaging under 5 minutes in many countries. While...
- TPS-Bench: Evaluating AI Agents' Tool Planning & Scheduling Abilities in Compounding Tasks : Abstract: Large language model (LLM) agents have exhibited strong problem-solving competence across domains like research and coding. Yet, it remains underexplored whether LLM agents can tackle compou...
- Analyzing Sustainability Messaging in Large-Scale Corporate Social Media : Abstract: In this work, we introduce a multimodal analysis pipeline that leverages large foundation models in vision and language to analyze corporate social media content, with a focus on sustainabil...
- ExplicitLM: Decoupling Knowledge from Parameters via Explicit Memory Banks : Abstract: Large language models suffer from knowledge staleness and lack of interpretability due to implicit knowledge storage across entangled network parameters, preventing targeted updates and reas...
- IVGAE-TAMA-BO: A novel temporal dynamic variational graph model for link prediction in global food trade networks with momentum structural memory and Bayesian optimization : Abstract: Global food trade plays a crucial role in ensuring food security and maintaining supply chain stability. However, its network structure evolves dynamically under the influence of geopolitica...
- Hybrid Retrieval-Augmented Generation Agent for Trustworthy Legal Question Answering in Judicial Forensics : Abstract: As artificial intelligence permeates judicial forensics, ensuring the veracity and traceability of legal question answering (QA) has become critical. Conventional large language models (LLMs...
- A Two Level Neural Approach Combining Off-Chip Prediction with Adaptive Prefetch Filtering : Abstract: To alleviate the performance and energy overheads of contemporary applications with large data footprints, we propose the Two Level Perceptron (TLP) predictor, a neural mechanism that effect...
- Sorting by Strip Swaps is NP-Hard : Abstract: We show that Sorting by Strip Swaps (SbSS) is NP-hard by a polynomial reduction of Block Sorting. The key idea is a local gadget, a cage, that replaces every decreasing ...
- STRIDER: Navigation via Instruction-Aligned Structural Decision Space Optimization : Abstract: The Zero-shot Vision-and-Language Navigation in Continuous Environments (VLN-CE) task requires agents to navigate previously unseen 3D environments using natural language instructions, witho...
- Endowing GPT-4 with a Humanoid Body: Building the Bridge Between Off-the-Shelf VLMs and the Physical World : Abstract: Humanoid agents often struggle to handle flexible and diverse interactions in open environments. A common solution is to collect massive datasets to train a highly capable model, but this ap...
- RailEstate: An Interactive System for Metro Linked Property Trends : Abstract: Access to metro systems plays a critical role in shaping urban housing markets by enhancing neighborhood accessibility and driving property demand. We present RailEstate, a novel web based s...
- Adding New Capability in Existing Scientific Application with LLM Assistance : Abstract: With the emergence and rapid evolution of large language models (LLM), automating coding tasks has become an important research topic. Many efforts are underway and literature abounds about ...
- Digital Twin based Automatic Reconfiguration of Robotic Systems in Smart Environments : Abstract: Robotic systems have become integral to smart environments, enabling applications ranging from urban surveillance and automated agriculture to industrial automation. However, their effective...
- Urban-MAS: Human-Centered Urban Prediction with LLM-Based Multi-Agent System : Abstract: Urban Artificial Intelligence (Urban AI) has advanced human-centered urban tasks such as perception prediction and human dynamics. Large Language Models (LLMs) can integrate multimodal input...
- Artificial Intelligence in Elementary STEM Education: A Systematic Review of Current Applications and Future Challenges : Abstract: Artificial intelligence (AI) is transforming elementary STEM education, yet evidence remains fragmented. This systematic review synthesizes 258 studies (2020-2025) examining AI applications ...
- Real-DRL: Teach and Learn in Reality : Abstract: This paper introduces the Real-DRL framework for safety-critical autonomous systems, enabling runtime learning of a deep reinforcement learning (DRL) agent to develop safe and high-performan...
- Inferring multiple helper Dafny assertions with LLMs : Abstract: The Dafny verifier provides strong correctness guarantees but often requires numerous manual helper assertions, creating a significant barrier to adoption. We investigate the use of Large La...
- End-to-End Dexterous Arm-Hand VLA Policies via Shared Autonomy: VR Teleoperation Augmented by Autonomous Hand VLA Policy for Efficient Data Collection : Abstract: Achieving human-like dexterous manipulation remains a major challenge for general-purpose robots. While Vision-Language-Action (VLA) models show potential in learning skills from demonstrati...
- What a diff makes: automating code migration with large language models : Abstract: Modern software programs are built on stacks that are often undergoing changes that introduce updates and improvements, but may also break any project that depends upon them. In this paper w...
- Effectiveness of LLMs in Temporal User Profiling for Recommendation : Abstract: Effectively modeling the dynamic nature of user preferences is crucial for enhancing recommendation accuracy and fostering transparency in recommender systems. Traditional user profiling oft...
- Understanding Code Agent Behaviour: An Empirical Study of Success and Failure Trajectories : Abstract: The increasing deployment of Large Language Model (LLM) agents for complex software engineering tasks has created a need to understand their problem-solving behaviours beyond simple success ...
- Neural Transparency: Mechanistic Interpretability Interfaces for Anticipating Model Behaviors for Personalized AI : Abstract: Millions of users now design personalized LLM-based chatbots that shape their daily interactions, yet they can only loosely anticipate how their design choices will manifest as behaviors in ...
- Scalable Processing-Near-Memory for 1M-Token LLM Inference: CXL-Enabled KV-Cache Management Beyond GPU Limits : Abstract: The expansion of context windows in large language models (LLMs) to multi-million tokens introduces severe memory and compute bottlenecks, particularly in managing the growing Key-Value (KV)...
- Emotion Detection in Speech Using Lightweight and Transformer-Based Models: A Comparative and Ablation Study : Abstract: Emotion recognition from speech plays a vital role in the development of empathetic human-computer interaction systems. This paper presents a comparative analysis of lightweight transformer-...
- Quantum Machine Unlearning: Foundations, Mechanisms, and Taxonomy : Abstract: Quantum Machine Unlearning has emerged as a foundational challenge at the intersection of quantum information theory, privacy-preserving computation, and trustworthy artificial intelligence. Thi...
- Human-AI Programming Role Optimization: Developing a Personality-Driven Self-Determination Framework : Abstract: As artificial intelligence transforms software development, a critical question emerges: how can developers and AI systems collaborate most effectively? This dissertation optimizes human-AI ...
- LIR: The First Workshop on Late Interaction and Multi Vector Retrieval @ ECIR 2026 : Abstract: Late interaction retrieval methods, pioneered by ColBERT, have emerged as a powerful alternative to single-vector neural IR. By leveraging fine-grained, token-level representations, they hav...
- DRIP: Defending Prompt Injection via De-instruction Training and Residual Fusion Model Architecture : Abstract: Large language models (LLMs) have demonstrated impressive instruction-following capabilities. However, these capabilities also expose models to prompt injection attacks, where maliciously cr...
- Proactive DDoS Detection and Mitigation in Decentralized Software-Defined Networking via Port-Level Monitoring and Zero-Training Large Language Models : Abstract: Centralized Software-Defined Networking (cSDN) offers flexible and programmable control of networks but suffers from scalability and reliability issues due to its reliance on centralized con...
- A Multimodal Dataset for Indoor Radio Mapping with 3D Point Clouds and RSSI : Abstract: The growing number of smart devices supporting bandwidth-intensive and latency-sensitive applications, such as real-time video analytics, smart sensing, and Extended Reality (XR), necessitat...
- HIP-LLM: A Hierarchical Imprecise Probability Approach to Reliability Assessment of Large Language Models : Abstract: Large Language Models (LLMs) are increasingly deployed across diverse domains, raising the need for rigorous reliability assessment methods. Existing benchmark-based evaluations primarily of...
- On Improvisation and Open-Endedness: Insights for Experiential AI : Abstract: Improvisation, the art of spontaneous creation that unfolds moment-to-moment without a scripted outcome, requires practitioners to continuously sense, adapt, and create anew. It is a fundament...
- EPARA: Parallelizing Categorized AI Inference in Edge Clouds : Abstract: With the increasing adoption of AI applications such as large language models and computer vision AI, the computational demands on AI inference systems are continuously rising, making the en...
- AgentGit: A Version Control Framework for Reliable and Scalable LLM-Powered Multi-Agent Systems : Abstract: With the rapid progress of large language models (LLMs), LLM-powered multi-agent systems (MAS) are drawing increasing interest across academia and industry. However, many current MAS framewo...
- More Than A Shortcut: A Hyperbolic Approach To Early-Exit Networks : Abstract: Deploying accurate event detection on resource-constrained devices is challenged by the trade-off between performance and computational cost. While Early-Exit (EE) networks offer a solution ...
- Lessons Learned from the Use of Generative AI in Engineering and Quality Assurance of a WEB System for Healthcare : Abstract: The advances and availability of technologies involving Generative Artificial Intelligence (AI) are evolving clearly and explicitly, driving immediate changes in various work activities. Sof...
- ShadowLogic: Backdoors in Any Whitebox LLM : Abstract: Large language models (LLMs) are widely deployed across various applications, often with safeguards to prevent the generation of harmful or restricted content. However, these safeguards can ...
- A Voice-Enabled Virtual Patient System for Interactive Training in Standardized Clinical Assessment : Abstract: Training mental health clinicians to conduct standardized clinical assessments is challenging due to a lack of scalable, realistic practice opportunities, which can impact data quality in cl...
- FeNN-DMA: A RISC-V SoC for SNN acceleration : Abstract: Spiking Neural Networks (SNNs) are a promising, energy-efficient alternative to standard Artificial Neural Networks (ANNs) and are particularly well-suited to spatio-temporal tasks such as k...
- EP-HDC: Hyperdimensional Computing with Encrypted Parameters for High-Throughput Privacy-Preserving Inference : Abstract: While homomorphic encryption (HE) provides strong privacy protection, its high computational cost has restricted its application to simple tasks. Recently, hyperdimensional computing (HDC) a...
- Quantifying truth and authenticity in AI-assisted candidate evaluation: A multi-domain pilot analysis : Abstract: This paper presents a retrospective analysis of anonymized candidate-evaluation data collected during pilot hiring campaigns conducted through AlteraSF, an AI-native resume-verification plat...
- Towards Ultra-Low Latency: Binarized Neural Network Architectures for In-Vehicle Network Intrusion Detection : Abstract: The Controller Area Network (CAN) protocol is essential for in-vehicle communication, facilitating high-speed data exchange among Electronic Control Units (ECUs). However, its inherent design l...
- CodeClash: Benchmarking Goal-Oriented Software Engineering : Abstract: Current benchmarks for coding evaluate language models (LMs) on concrete, well-specified tasks such as fixing specific bugs or writing targeted tests. However, human programmers do not spend...
- Pay for The Second-Best Service: A Game-Theoretic Approach Against Dishonest LLM Providers : Abstract: The widespread adoption of Large Language Models (LLMs) through Application Programming Interfaces (APIs) induces a critical vulnerability: the potential for dishonest manipulation by servic...
- Fast Stochastic Greedy Algorithm for $k$-Submodular Cover Problem : Abstract: We study the $k$-Submodular Cover ($kSC$) problem, a natural generalization of the classical Submodular Cover problem that arises in artificial intelligence and combinatorial optimization ta...
- Dynamic Logic of Trust-Based Beliefs : Abstract: Traditionally, an agent's beliefs would come from what the agent can see, hear, or sense. In the modern world, beliefs are often based on the data available to the agents. In this work, we i...
- Maestro: Orchestrating Robotics Modules with Vision-Language Models for Zero-Shot Generalist Robots : Abstract: Today's best-explored routes towards generalist robots center on collecting ever larger "observations-in actions-out" robotics datasets to train large end-to-end models, copying a recipe tha...
- URDF-Anything: Constructing Articulated Objects with 3D Multimodal Language Model : Abstract: Constructing accurate digital twins of articulated objects is essential for robotic simulation training and embodied AI world model building, yet historically requires painstaking manual mod...
- Keys in the Weights: Transformer Authentication Using Model-Bound Latent Representations : Abstract: We introduce Model-Bound Latent Exchange (MoBLE), a decoder-binding property in Transformer autoencoders formalized as Zero-Shot Decoder Non-Transferability (ZSDN). In identity tasks using i...
- HAFixAgent: History-Aware Automated Program Repair Agent : Abstract: Automated program repair (APR) has recently shifted toward large language models and agent-based systems, yet most systems rely on local snapshot context, overlooking repository history. Pri...
- AthenaBench: A Dynamic Benchmark for Evaluating LLMs in Cyber Threat Intelligence : Abstract: Large Language Models (LLMs) have demonstrated strong capabilities in natural language reasoning, yet their application to Cyber Threat Intelligence (CTI) remains limited. CTI analysis invol...
- A High-Throughput Spiking Neural Network Processor Enabling Synaptic Delay Emulation : Abstract: Synaptic delay has attracted significant attention in neural network dynamics for integrating and processing complex spatiotemporal information. This paper introduces a high-throughput Spiki...
- Forget BIT, It is All about TOKEN: Towards Semantic Information Theory for LLMs : Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in numerous real-world applications. While the vast majority of research conducted from an experimental perspective is ...
- Influence-aware Causal Autoencoder Network for Node Importance Ranking in Complex Networks : Abstract: Node importance ranking is a fundamental problem in graph data analysis. Existing approaches typically rely on node features derived from either traditional centrality measures or advanced g...
- Speech-DRAME: A Framework for Human-Aligned Benchmarks in Speech Role-Play : Abstract: Role-play has become a key testbed for generative models, expanding from text-only dialogue to multimodal interaction. Extending role-play to speech captures prosody, emotion, and delivery, ...
- Rescuing the Unpoisoned: Efficient Defense against Knowledge Corruption Attacks on RAG Systems : Abstract: Large language models (LLMs) are reshaping numerous facets of our daily lives, leading to widespread adoption as web-based services. Despite their versatility, LLMs face notable challenges, suc...
- Exploring and Unleashing the Power of Large Language Models in CI/CD Configuration Translation : Abstract: Continuous Integration (CI) is a cornerstone of modern collaborative software development, and numerous CI platforms are available. Differences in maintenance overhead, reliability, and inte...
- AI for Requirements Engineering: Industry adoption and Practitioner perspectives : Abstract: The integration of AI for Requirements Engineering (RE) presents significant benefits but also poses real challenges. Although RE is fundamental to software engineering, limited research has...
- Embodied Cognition Augmented End2End Autonomous Driving : Abstract: In recent years, vision-based end-to-end autonomous driving has emerged as a new paradigm. However, popular end-to-end approaches typically rely on visual feature extraction networks trained...
- Beyond Permissions: Investigating Mobile Personalization with Simulated Personas : Abstract: Mobile applications increasingly rely on sensor data to infer user context and deliver personalized experiences. Yet the mechanisms behind this personalization remain opaque to users and res...
- The Future of Generative AI in Software Engineering: A Vision from Industry and Academia in the European GENIUS Project : Abstract: Generative AI (GenAI) has recently emerged as a groundbreaking force in Software Engineering, capable of generating code, suggesting fixes, and supporting quality assurance. While its use in...
- AI Literacy in UAE Libraries: Assessing Competencies, Training Needs, and Ethical Considerations for the Digital Age : Abstract: The study explores the current state of artificial intelligence (AI) literacy levels among library professionals employing a quantitative approach consisting of 92 surveys of LIS professiona...
- FoldPath: End-to-End Object-Centric Motion Generation via Modulated Implicit Paths : Abstract: Object-Centric Motion Generation (OCMG) is instrumental in advancing automated manufacturing processes, particularly in domains requiring high-precision expert robotic motions, such as spray...
- MO-SeGMan: Rearrangement Planning Framework for Multi Objective Sequential and Guided Manipulation in Constrained Environments : Abstract: In this work, we introduce MO-SeGMan, a Multi-Objective Sequential and Guided Manipulation planner for highly constrained rearrangement problems. MO-SeGMan generates object placement sequenc...
- Prompt Injection as an Emerging Threat: Evaluating the Resilience of Large Language Models : Abstract: Large Language Models (LLMs) are increasingly used in intelligent systems that perform reasoning, summarization, and code generation. Their ability to follow natural-language instructions, w...
- The Ghost in the Keys: A Disklavier Demo for Human-AI Musical Co-Creativity : Abstract: While generative models for music composition are increasingly capable, their adoption by musicians is hindered by text-prompting, an asynchronous workflow disconnected from the embodied, re...
- Spin-Adapted Neural Network Wavefunctions in Real Space : Abstract: Spin plays a fundamental role in understanding electronic structure, yet many real-space wavefunction methods fail to adequately consider it. We introduce the Spin-Adapted Antisymmetrization...
- Student Engagement in AI Assisted Complex Problem Solving: A Pilot Study of Human AI Rubik's Cube Collaboration : Abstract: Games and puzzles play important pedagogical roles in STEM learning. New AI algorithms that can solve complex problems offer opportunities for scaffolded instruction in puzzle solving. This ...
- Scam Shield: Multi-Model Voting and Fine-Tuned LLMs Against Adversarial Attacks : Abstract: Scam detection remains a critical challenge in cybersecurity as adversaries craft messages that evade automated filters. We propose a Hierarchical Scam Detection System (HSDS) that combines ...
- SM-based Semantics for Answer Set Programs Containing Conditional Literals and Arithmetic : Abstract: Modern answer set programming solvers such as CLINGO support advanced language constructs that improve the expressivity and conciseness of logic programs. Conditional literals are one such c...
- Context-Guided Decompilation: A Step Towards Re-executability : Abstract: Binary decompilation plays an important role in software security analysis, reverse engineering, and malware understanding when source code is unavailable. However, existing decompilation te...
- GenDexHand: Generative Simulation for Dexterous Hands : Abstract: Data scarcity remains a fundamental bottleneck for embodied intelligence. Existing approaches use large language models (LLMs) to automate gripper-based simulation generation, but they trans...
Research Sources: 497 | Generated: 11/5/2025
