AI RESEARCH PAPERS & ACADEMIC SOURCES
- "Let's Agree to Disagree": Investigating the Disagreement Problem in Explainable AI for Text Summarization : Abstract: Explainable Artificial Intelligence (XAI) methods in text summarization are essential for understanding the model behavior and fostering trust in model-generated summaries. Despite the effec...
- PoCo: Agentic Proof-of-Concept Exploit Generation for Smart Contracts : Abstract: Smart contracts operate in a highly adversarial environment, where vulnerabilities can lead to substantial financial losses. Thus, smart contracts are subject to security audits. In auditing...
- Building Altruistic and Moral AI Agent with Brain-inspired Emotional Empathy Mechanisms : Abstract: As AI closely interacts with human society, it is crucial to ensure that its behavior is safe, altruistic, and aligned with human ethical and moral values. However, existing research on embe...
- Style2Code: A Style-Controllable Code Generation Framework with Dual-Modal Contrastive Representation Learning : Abstract: Controllable code generation, the ability to synthesize code that follows a specified style while maintaining functionality, remains a challenging task. We propose a two-stage training frame...
- Evaluating LLM-Contaminated Crowdsourcing Data Without Ground Truth : Abstract: The recent success of generative AI highlights the crucial role of high-quality human feedback in building trustworthy AI systems. However, the increasing use of large language models (LLMs)...
- Transferable & Stealthy Ensemble Attacks: A Black-Box Jailbreaking Framework for Large Language Models : Abstract: We present a novel black-box jailbreaking framework that integrates multiple LLM-as-Attacker strategies to deliver highly transferable and effective attacks. The framework is grounded in thr...
- SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning : Abstract: Vision-language-action models (VLAs) show potential as generalist robot policies. However, these models pose extreme safety challenges during real-world deployment, including the risk of har...
- Back to Ear: Perceptually Driven High Fidelity Music Reconstruction : Abstract: Variational Autoencoders (VAEs) are essential for large-scale audio tasks like diffusion-based generation. However, existing open-source models often neglect auditory perceptual aspects duri...
- Agentmandering: A Game-Theoretic Framework for Fair Redistricting via Large Language Model Agents : Abstract: Redistricting plays a central role in shaping how votes are translated into political power. While existing computational methods primarily aim to generate large ensembles of legally valid d...
- Testing the Testers: Human-Driven Quality Assessment of Voice AI Testing Platforms : Abstract: Voice AI agents are rapidly transitioning to production deployments, yet systematic methods for ensuring testing reliability remain underdeveloped. Organizations cannot objectively assess wh...
- When Empowerment Disempowers : Abstract: Empowerment, a measure of an agent's ability to control its environment, has been proposed as a universal goal-agnostic objective for motivating assistive behavior in AI agents. While multi-...
- Opus: A Quantitative Framework for Workflow Evaluation : Abstract: This paper introduces the Opus Workflow Evaluation Framework, a probabilistic-normative formulation for quantifying Workflow quality and efficiency. It integrates notions of correctness, rel...
- Shared Spatial Memory Through Predictive Coding : Abstract: Sharing and reconstructing a consistent spatial memory is a critical challenge in multi-agent systems, where partial observability and limited bandwidth often lead to catastrophic failures i...
- RLoop: An Self-Improving Framework for Reinforcement Learning with Iterative Policy Initialization : Abstract: While Reinforcement Learning for Verifiable Rewards (RLVR) is powerful for training large reasoning models, its training dynamics harbor a critical challenge: RL overfitting, where models ga...
- GUI-360: A Comprehensive Dataset and Benchmark for Computer-Using Agents : Abstract: We introduce GUI-360$^\circ$, a large-scale, comprehensive dataset and benchmark suite designed to advance computer-using agents (CUAs). CUAs present unique challenges and are constrained by ...
- AdversariaLLM: A Unified and Modular Toolbox for LLM Robustness Research : Abstract: The rapid expansion of research on Large Language Model (LLM) safety and robustness has produced a fragmented and oftentimes buggy ecosystem of implementations, datasets, and evaluation meth...
- RxSafeBench: Identifying Medication Safety Issues of Large Language Models in Simulated Consultation : Abstract: Numerous medical systems powered by Large Language Models (LLMs) have achieved remarkable progress in diverse healthcare tasks. However, research on their medication safety remains limited d...
- Monitor-Generate-Verify (MGV): Formalising Metacognitive Theory for Language Model Reasoning : Abstract: Test-time reasoning architectures such as those following the Generate-Verify paradigm -- where a model iteratively refines or verifies its own generated outputs -- prioritise generation and...
- Post-Training LLMs as Better Decision-Making Agents: A Regret-Minimization Approach : Abstract: Large language models (LLMs) are increasingly deployed as "agents" for decision-making (DM) in interactive and dynamic environments. Yet, since they were not originally designed for DM, rece...
- Beyond Shortest Path: Agentic Vehicular Routing with Semantic Context : Abstract: Traditional vehicle routing systems efficiently optimize singular metrics like time or distance, and when considering multiple metrics, they need more processes to optimize. However, they l...
- Promoting Sustainable Web Agents: Benchmarking and Estimating Energy Consumption through Empirical and Theoretical Analysis : Abstract: Web agents, like OpenAI's Operator and Google's Project Mariner, are powerful agentic systems pushing the boundaries of Large Language Models (LLM). They can autonomously interact with the i...
- Optimizing Sensor Placement in Urban Storm Sewers: A Data-Driven Sparse Sensing Approach : Abstract: Urban surface water flooding, triggered by intense rainfall overwhelming drainage systems, is increasingly frequent and widespread. While flood prediction and monitoring in high spatial-temp...
- Question the Questions: Auditing Representation in Online Deliberative Processes : Abstract: A central feature of many deliberative processes, such as citizens' assemblies and deliberative polls, is the opportunity for participants to engage directly with experts. While participants...
- MazeMate: An LLM-Powered Chatbot to Support Computational Thinking in Gamified Programming Learning : Abstract: Computational Thinking (CT) is a foundational problem-solving skill, and gamified programming environments are a widely adopted approach to cultivating it. While large language models (LLMs)...
- Efficient On-Device Agents via Adaptive Context Management : Abstract: On-device AI agents offer the potential for personalized, low-latency assistance, but their deployment is fundamentally constrained by limited memory capacity, which restricts usable context...
- Beyond Chat: a Framework for LLMs as Human-Centered Support Systems : Abstract: Large language models are moving beyond transactional question answering to act as companions, coaches, mediators, and curators that scaffold human growth, decision-making, and well-being. T...
- Not All Explanations are Created Equal: Investigating the Pitfalls of Current XAI Evaluation : Abstract: Explainable Artificial Intelligence (XAI) aims to create transparency in modern AI models by offering explanations of the models to human users. There are many ways in which researchers have...
- Conversational Collective Intelligence (CCI) using Hyperchat AI in an Authentic Forecasting Task : Abstract: Hyperchat AI is a novel agentic technology that enables thoughtful conversations among networked human groups of potentially unlimited size. It allows large teams to discuss complex issues, ...
- OpenMENA: An Open-Source Memristor Interfacing and Compute Board for Neuromorphic Edge-AI Applications : Abstract: Memristive crossbars enable in-memory multiply-accumulate and local plasticity learning, offering a path to energy-efficient edge AI. To this end, we present Open-MENA (Open Memristor-in-Mem...
- Leveraging LLM-based agents for social science research: insights from citation network simulations : Abstract: The emergence of Large Language Models (LLMs) demonstrates their potential to encapsulate the logic and patterns inherent in human behavior simulation by leveraging extensive web data pre-tr...
- OptiMA: A Transaction-Based Framework with Throughput Optimization for Very Complex Multi-Agent Systems : Abstract: In recent years, the research of multi-agent systems has taken a direction to explore larger and more complex models to fulfill sophisticated tasks. We point out two possible pitfalls that m...
- Probing the Probes: Methods and Metrics for Concept Alignment : Abstract: In explainable AI, Concept Activation Vectors (CAVs) are typically obtained by training linear classifier probes to detect human-understandable concepts as directions in the activation space...
- CORE - A Cell-Level Coarse-to-Fine Image Registration Engine for Multi-stain Image Alignment : Abstract: Accurate and efficient registration of whole slide images (WSIs) is essential for high-resolution, nuclei-level analysis in multi-stained tissue slides. We propose a novel coarse-to-fine fra...
- Levers of Power in the Field of AI : Abstract: This paper examines how decision makers in academia, government, business, and civil society navigate questions of power in implementations of artificial intelligence. The study explores how...
- Secure Code Generation at Scale with Reflexion : Abstract: Large language models (LLMs) are now widely used to draft and refactor code, but code that works is not necessarily secure. We evaluate secure code generation using the Instruct Prime, which...
- SnappyMeal: Design and Longitudinal Evaluation of a Multimodal AI Food Logging Application : Abstract: Food logging, both self-directed and prescribed, plays a critical role in uncovering correlations between diet, medical, fitness, and health outcomes. Through conversations with nutritional ...
- Evolutionary Optimization Trumps Adam Optimization on Embedding Space Exploration : Abstract: Deep generative models, especially diffusion architectures, have transformed image generation; however, they are challenging to control and optimize for specific goals without expensive retr...
- Collaborative Agents for Automated Program Repair in Ruby : Abstract: Automated Program Repair (APR) has advanced rapidly with Large Language Models (LLMs), but most existing methods remain computationally expensive, and focused on a small set of languages. Ru...
- PEFA-AI: Advancing Open-source LLMs for RTL generation using Progressive Error Feedback Agentic-AI : Abstract: We present an agentic flow consisting of multiple agents that combine specialized LLMs and hardware simulation tools to collaboratively complete the complex task of Register Transfer Level (...
- Hybrid Fuzzing with LLM-Guided Input Mutation and Semantic Feedback : Abstract: Software fuzzing has become a cornerstone in automated vulnerability discovery, yet existing mutation strategies often lack semantic awareness, leading to redundant test cases and slow explo...
- An LLM-based Framework for Human-Swarm Teaming Cognition in Disaster Search and Rescue : Abstract: Large-scale disaster Search And Rescue (SAR) operations are persistently challenged by complex terrain and disrupted communications. While Unmanned Aerial Vehicle (UAV) swarms offer a promis...
- Advancing Equitable AI: Evaluating Cultural Expressiveness in LLMs for Latin American Contexts : Abstract: Artificial intelligence (AI) systems often reflect biases from economically advanced regions, marginalizing contexts in economically developing regions like Latin America due to imbalanced d...
- An Automated Theorem Generator with Theoretical Foundation Based on Rectangular Standard Contradiction : Abstract: Currently, there is a lack of rigorous theoretical system for systematically generating non-trivial and logically valid theorems. Addressing this critical gap, this paper conducts research t...
- Scaffolding Metacognition in Programming Education: Understanding Student-AI Interactions and Design Implications : Abstract: Generative AI tools such as ChatGPT now provide novice programmers with unprecedented access to instant, personalized support. While this holds clear promise, their influence on students' me...
- Are We Aligned? A Preliminary Investigation of the Alignment of Responsible AI Values between LLMs and Human Judgment : Abstract: Large Language Models (LLMs) are increasingly employed in software engineering tasks such as requirements elicitation, design, and evaluation, raising critical questions regarding their alig...
- Explaining Software Vulnerabilities with Large Language Models : Abstract: The prevalence of security vulnerabilities has prompted companies to adopt static application security testing (SAST) tools for vulnerability detection. Nevertheless, these tools frequently ...
- A Reinforced Evolution-Based Approach to Multi-Resource Load Balancing : Abstract: This paper presents a reinforced genetic approach to a defined d-resource system optimization problem. The classical evolution schema was ineffective due to a very strict feasibility functio...
- On the Brittleness of CLIP Text Encoders : Abstract: Multimodal co-embedding models, especially CLIP, have advanced the state of the art in zero-shot classification and multimedia information retrieval in recent years by aligning images and te...
- Speed at the Cost of Quality? The Impact of LLM Agent Assistance on Software Development : Abstract: Large language models (LLMs) have demonstrated the promise to revolutionize the field of software engineering. Among other things, LLM agents are rapidly gaining momentum in their applicatio...
- Generate, Evaluate, Iterate: Synthetic Data for Human-in-the-Loop Refinement of LLM Judges : Abstract: The LLM-as-a-judge paradigm enables flexible, user-defined evaluation, but its effectiveness is often limited by the scarcity of diverse, representative data for refining criteria. We presen...
- LLM-as-a-Judge: Toward World Models for Slate Recommendation Systems : Abstract: Modeling user preferences across domains remains a key challenge in slate recommendation (i.e. recommending an ordered sequence of items) research. We investigate how Large Language Models (...
- Discussion Graph Semantics of First-Order Logic with Equality for Reasoning about Discussion and Argumentation : Abstract: We make three contributions. First, we formulate a discussion-graph semantics for first-order logic with equality, enabling reasoning about discussion and argumentation in AI more generally ...
- Collaboration Dynamics and Reliability Challenges of Multi-Agent LLM Systems in Finite Element Analysis : Abstract: Large Language Model (LLM)-based multi-agent systems are increasingly applied to automate computational workflows in science and engineering. However, how inter-agent dynamics influence reas...
- Expert Evaluation of LLM World Models: A High-$T_c$ Superconductivity Case Study : Abstract: Large Language Models (LLMs) show great promise as a powerful tool for scientific literature exploration. However, their effectiveness in providing scientifically accurate and comprehensive ...
- KGFR: A Foundation Retriever for Generalized Knowledge Graph Question Answering : Abstract: Large language models (LLMs) excel at reasoning but struggle with knowledge-intensive questions due to limited context and parametric knowledge. However, existing methods that rely on finetu...
- Scaling Agent Learning via Experience Synthesis : Abstract: While reinforcement learning (RL) can empower large language model (LLM) agents by enabling self-improvement through interaction, its practical adoption remains challenging due to costly rol...
- Extracting Causal Relations in Deep Knowledge Tracing : Abstract: A longstanding goal in computational educational research is to develop explainable knowledge tracing (KT) models. Deep Knowledge Tracing (DKT), which leverages a Recurrent Neural Network (R...
- ArchPilot: A Proxy-Guided Multi-Agent Approach for Machine Learning Engineering : Abstract: Recent LLM-based agents have demonstrated strong capabilities in automated ML engineering. However, they heavily rely on repeated full training runs to evaluate candidate solutions, resultin...
- Detecting Silent Failures in Multi-Agentic AI Trajectories : Abstract: Multi-Agentic AI systems, powered by large language models (LLMs), are inherently non-deterministic and prone to silent failures such as drift, cycles, and missing details in outputs, which ...
- Interpreting Multi-Attribute Confounding through Numerical Attributes in Large Language Models : Abstract: Although behavioral studies have documented numerical reasoning errors in large language models (LLMs), the underlying representational mechanisms remain unclear. We hypothesize that numeric...
- Denoised Recommendation Model with Collaborative Signal Decoupling : Abstract: Although the collaborative filtering (CF) algorithm has achieved remarkable performance in recommendation systems, it suffers from suboptimal recommendation performance due to noise in the u...
- Cross-modal Causal Intervention for Alzheimer's Disease Prediction : Abstract: Mild Cognitive Impairment (MCI) serves as a prodromal stage of Alzheimer's Disease (AD), where early identification and intervention can effectively slow the progression to dementia. However...
- Benchmark Designers Should "Train on the Test Set" to Expose Exploitable Non-Visual Shortcuts : Abstract: Robust benchmarks are crucial for evaluating Multimodal Large Language Models (MLLMs). Yet we find that models can ace many multimodal benchmarks without strong visual understanding, instead...
- SIMS-V: Simulated Instruction-Tuning for Spatial Video Understanding : Abstract: Despite impressive high-level video comprehension, multimodal language models struggle with spatial reasoning across time and space. While current spatial training approaches rely on real-wo...
- Cambrian-S: Towards Spatial Supersensing in Video : Abstract: We argue that progress in true multimodal intelligence calls for a shift from reactive, task-driven systems and brute-force long context towards a broader paradigm of supersensing. We frame ...
- InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation : Abstract: We introduce InfinityStar, a unified spacetime autoregressive framework for high-resolution image and dynamic video synthesis. Building on the recent success of autoregressive modeling in bo...
- Tracking and Understanding Object Transformations : Abstract: Real-world objects frequently undergo state transformations. From an apple being cut into pieces to a butterfly emerging from its cocoon, tracking through these changes is important for unde...
- Carousel: A High-Resolution Dataset for Multi-Target Automatic Image Cropping : Abstract: Automatic image cropping is a method for maximizing the human-perceived quality of cropped regions in photographs. Although several works have proposed techniques for producing singular crop...
- GraSP-VLA: Graph-based Symbolic Action Representation for Long-Horizon Planning with VLA Policies : Abstract: Deploying autonomous robots that can learn new skills from demonstrations is an important challenge of modern robotics. Existing solutions often apply end-to-end imitation learning with Visi...
- $\mu$NeuFMT: Optical-Property-Adaptive Fluorescence Molecular Tomography via Implicit Neural Representation : Abstract: Fluorescence Molecular Tomography (FMT) is a promising technique for non-invasive 3D visualization of fluorescent probes, but its reconstruction remains challenging due to the inherent ill-p...
- Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment : Abstract: Vision-Language-Action (VLA) models have emerged as a powerful framework that unifies perception, language, and control, enabling robots to perform diverse tasks through multimodal understan...
- X-Diffusion: Training Diffusion Policies on Cross-Embodiment Human Demonstrations : Abstract: Human videos can be recorded quickly and at scale, making them an appealing source of training data for robot learning. However, humans and robots differ fundamentally in embodiment, resulti...
- GentleHumanoid: Learning Upper-body Compliance for Contact-rich Human and Object Interaction : Abstract: Humanoid robots are expected to operate in human-centered environments where safe and natural physical interaction is essential. However, most recent reinforcement learning (RL) policies emp...
- Practical solutions to the relative pose of three calibrated cameras : Abstract: We study the challenging problem of estimating the relative pose of three calibrated cameras from four point correspondences. We propose novel efficient solutions to this problem that are ba...
- Robust Self-calibration of Focal Lengths from the Fundamental Matrix : Abstract: The problem of self-calibration of two cameras from a given fundamental matrix is one of the basic problems in geometric computer vision. Under the assumption of known principal points and s...
- LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry : Abstract: Visual odometry estimates the motion of a moving camera based on visual input. Existing methods, mostly focusing on two-view point tracking, often ignore the rich temporal context in the ima...
- Revealing the structure-property relationships of copper alloys with FAGC : Abstract: Cu-Cr-Zr alloys play a crucial role in electronic devices and the electric power industry, where their electrical conductivity and hardness are of great importance. However, due to the scarc...
- EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs : Abstract: Egocentric human pose estimation (HPE) using wearable sensors is essential for VR/AR applications. Most methods rely solely on either egocentric-view images or sparse Inertial Measurement Un...
- Pseudo-Stereo Inputs: A Solution to the Occlusion Challenge in Self-Supervised Stereo Matching : Abstract: Self-supervised stereo matching holds great promise by eliminating the reliance on expensive ground-truth data. Its dominant paradigm, based on photometric consistency, is however fundamenta...
- Are Minimal Radial Distortion Solvers Necessary for Relative Pose Estimation? : Abstract: Estimating the relative pose between two cameras is a fundamental step in many applications such as Structure-from-Motion. The common approach to relative pose estimation is to apply a minim...
- Three-view Focal Length Recovery From Homographies : Abstract: In this paper, we propose a novel approach for recovering focal lengths from three-view homographies. By examining the consistency of normal vectors between two homographies, we derive new e...
- Optimized Minimal 3D Gaussian Splatting : Abstract: 3D Gaussian Splatting (3DGS) has emerged as a powerful representation for real-time, high-performance rendering, enabling a wide range of applications. However, representing 3D scenes with n...
- What Time Tells Us? An Explorative Study of Time Awareness Learned from Static Images : Abstract: Time becomes visible through illumination changes in what we see. Inspired by this, in this paper we explore the potential to learn time awareness from static images, trying to answer: *what...
- CFReID: Continual Few-shot Person Re-Identification : Abstract: Real-world surveillance systems are dynamically evolving, requiring a person Re-identification model to continuously handle newly incoming data from various domains. To cope with these dynam...
- CREA: A Collaborative Multi-Agent Framework for Creative Image Editing and Generation : Abstract: Creativity in AI imagery remains a fundamental challenge, requiring not only the generation of visually compelling content but also the capacity to add novel, expressive, and artistically ri...
- EarthGPT-X: A Spatial MLLM for Multi-level Multi-Source Remote Sensing Imagery Understanding with Visual Prompting : Abstract: Recent advances in natural-domain multi-modal large language models (MLLMs) have demonstrated effective spatial reasoning through visual and textual prompting. However, their direct transfer...
- Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction : Abstract: Traditional SLAM systems, which rely on bundle adjustment, struggle with highly dynamic scenes commonly found in casual videos. Such videos entangle the motion of dynamic elements, undermini...
- WaveGuard: Robust Deepfake Detection and Source Tracing via Dual-Tree Complex Wavelet and Graph Neural Networks : Abstract: Deepfake technology poses increasing risks such as privacy invasion and identity theft. To address these threats, we propose WaveGuard, a proactive watermarking framework that enhances robus...
- TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models : Abstract: Image-text models excel at image-level tasks but struggle with detailed visual understanding. While these models provide strong visual-language alignment, segmentation models like SAM2 offer...
- UMA: Ultra-detailed Human Avatars via Multi-level Surface Alignment : Abstract: Learning an animatable and clothed human avatar model with vivid dynamics and photorealistic appearance from multi-view videos is an important foundational research problem in computer graph...
- MIND: Material Interface Generation from UDFs for Non-Manifold Surface Reconstruction : Abstract: Unsigned distance fields (UDFs) are widely used in 3D deep learning due to their ability to represent shapes with arbitrary topology. While prior work has largely focused on learning UDFs fr...
- Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization : Abstract: Sign Language Video Generation (SLVG) seeks to generate identity-preserving sign language videos from spoken language texts. Existing methods primarily rely on the single coarse condition (\...
- Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation : Abstract: Synthetic video generation is progressing very rapidly. The latest models can produce very realistic high-resolution videos that are virtually indistinguishable from real ones. Although seve...
- Zero-Shot Referring Expression Comprehension via Vison-Language True/False Verification : Abstract: Referring Expression Comprehension (REC) is usually addressed with task-trained grounding models. We show that a zero-shot workflow, without any REC-specific training, can achieve competitiv...
- X-Diffusion: Generating Detailed 3D MRI Volumes From a Single Image Using Cross-Sectional Diffusion Models : Abstract: Magnetic Resonance Imaging (MRI) is a crucial diagnostic tool, but high-resolution scans are often slow and expensive due to extensive data acquisition requirements. Traditional MRI reconstr...
- Information-driven design of imaging systems : Abstract: Imaging systems have traditionally been designed to mimic the human eye and produce visually interpretable measurements. Modern imaging systems, however, process raw measurements computation...
- Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis : Abstract: Purpose: To explore best-practice approaches for generating synthetic chest X-ray images and augmenting medical imaging datasets to optimize the performance of deep learning models in downst...
- SLAM&Render: A Benchmark for the Intersection Between Neural Rendering, Gaussian Splatting and SLAM : Abstract: Models and methods originally developed for Novel View Synthesis and Scene Rendering, such as Neural Radiance Fields (NeRF) and Gaussian Splatting, are increasingly being adopted as represen...
- Higher-Order Singular-Value Derivatives of Rectangular Real Matrices : Abstract: We present a theoretical framework for deriving the general $n$-th order Fréchet derivatives of singular values in real rectangular matrices, by leveraging reduced resolvent operators from...
- Particle-Grid Neural Dynamics for Learning Deformable Object Models from RGB-D Videos : Abstract: Modeling the dynamics of deformable objects is challenging due to their diverse physical properties and the difficulty of estimating states from limited visual information. We address these ...
- VERA: Variational Inference Framework for Jailbreaking Large Language Models : Abstract: The rise of API-only access to state-of-the-art LLMs highlights the need for effective black-box jailbreak methods to identify model vulnerabilities in real-world settings. Without a princip...
- Towards Efficient and Accurate Spiking Neural Networks via Adaptive Bit Allocation : Abstract: Multi-bit spiking neural networks (SNNs) have recently become a heated research spot, pursuing energy-efficient and high-accurate AI. However, with more bits involved, the associated memory ...
- A LoD of Gaussians: Unified Training and Rendering for Ultra-Large Scale Reconstruction with External Memory : Abstract: Gaussian Splatting has emerged as a high-performance technique for novel view synthesis, enabling real-time rendering and high-quality reconstruction of small scenes. However, scaling to lar...
- MCTED: A Machine-Learning-Ready Dataset for Digital Elevation Model Generation From Mars Imagery : Abstract: This work presents a new dataset for the Martian digital elevation model prediction task, ready for machine learning applications called MCTED. The dataset has been generated using a compreh...
- Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs : Abstract: Large Language Models exhibit implicit personalities in their generation, but reliably controlling or aligning these traits to meet specific needs remains an open challenge. The need for eff...
- TextualVerifier: Verify TextGrad Step-by-Step : Abstract: TextGrad is a novel approach to text-based automatic differentiation that enables composite AI systems to perform optimization without explicit numerical equations. However, it currently lac...
- GRDD+: An Extended Greek Dialectal Dataset with Cross-Architecture Fine-tuning Evaluation : Abstract: We present an extended Greek Dialectal Dataset (GRDD+) that complements the existing GRDD dataset with more data from Cretan, Cypriot, Pontic and Northern Greek, while we add six new variet...
- PLLuM: A Family of Polish Large Language Models : Abstract: Large Language Models (LLMs) play a central role in modern artificial intelligence, yet their development has been primarily focused on English, resulting in limited support for other langua...
- STARS: Segment-level Token Alignment with Rejection Sampling in Large Language Models : Abstract: Aligning large language models with human values is crucial for their safe deployment; however, existing methods, such as fine-tuning, are computationally expensive and suboptimal. In contra...
- Divide, Cache, Conquer: Dichotomic Prompting for Efficient Multi-Label LLM-Based Classification : Abstract: We introduce a method for efficient multi-label text classification with large language models (LLMs), built on reformulating classification tasks as sequences of dichotomic (yes/no) decisio...
- Evaluating Machine Translation Datasets for Low-Web Data Languages: A Gendered Lens : Abstract: As low-resourced languages are increasingly incorporated into NLP research, there is an emphasis on collecting large-scale datasets. But in prioritizing quantity over quality, we risk 1) bui...
- Context informs pragmatic interpretation in vision-language models : Abstract: Iterated reference games - in which players repeatedly pick out novel referents using language - present a test case for agents' ability to perform context-sensitive pragmatic reasoning in m...
- The Human Flourishing Geographic Index: A County-Level Dataset for the United States, 2013-2023 : Abstract: Quantifying human flourishing, a multidimensional construct including happiness, health, purpose, virtue, relationships, and financial stability, is critical for understanding societal well-...
- Direct Semantic Communication Between Large Language Models via Vector Translation : Abstract: In multi-agent settings, such as debate, reflection, or tool-calling, large language models (LLMs) pass messages as plain tokens, discarding most latent semantics. This constrains informatio...
- Abductive Inference in Retrieval-Augmented Language Models: Generating and Validating Missing Premises : Abstract: Large Language Models (LLMs) enhanced with retrieval -- commonly referred to as Retrieval-Augmented Generation (RAG) -- have demonstrated strong performance in knowledge-intensive tasks. How...
- WST: Weakly Supervised Transducer for Automatic Speech Recognition : Abstract: The Recurrent Neural Network-Transducer (RNN-T) is widely adopted in end-to-end (E2E) automatic speech recognition (ASR) tasks but depends heavily on large-scale, high-quality annotated data...
- T-FIX: Text-Based Explanations with Features Interpretable to eXperts : Abstract: As LLMs are deployed in knowledge-intensive settings (e.g., surgery, astronomy, therapy), users expect not just answers, but also meaningful explanations for those answers. In these settings...
- Plan of Knowledge: Retrieval-Augmented Large Language Models for Temporal Knowledge Graph Question Answering : Abstract: Temporal Knowledge Graph Question Answering (TKGQA) aims to answer time-sensitive questions by leveraging factual information from Temporal Knowledge Graphs (TKGs). While previous studies ha...
- The truth is no diaper: Human and AI-generated associations to emotional words : Abstract: Human word associations are a well-known method of gaining insight into the internal mental lexicon, but the responses spontaneously offered by human participants to word cues are not always...
- Improving the Performance of Radiology Report De-identification with Large-Scale Training and Benchmarking Against Cloud Vendor Methods : Abstract: Objective: To enhance automated de-identification of radiology reports by scaling transformer-based models through extensive training datasets and benchmarking performance against commercial...
- Batch Prompting Suppresses Overthinking Reasoning Under Constraint: How Batch Prompting Suppresses Overthinking in Reasoning Models : Abstract: Recent work has explored batch prompting as a strategy to amortize inference cost in large language models (LLMs). In this paper, we show that batching offers an additional, underappreciated...
- RIDE: Difficulty Evolving Perturbation with Item Response Theory for Mathematical Reasoning : Abstract: Large language models (LLMs) achieve high performance on mathematical reasoning, but these results can be inflated by training data leakage or superficial pattern matching rather than genuin...
- CantoASR: Prosody-Aware ASR-LALM Collaboration for Low-Resource Cantonese : Abstract: Automatic speech recognition (ASR) is critical for language accessibility, yet low-resource Cantonese remains challenging due to limited annotated data, six lexical tones, tone sandhi, and a...
- BAPPA: Benchmarking Agents, Plans, and Pipelines for Automated Text-to-SQL Generation : Abstract: Text-to-SQL systems provide a natural language interface that can enable even laymen to access information stored in databases. However, existing Large Language Models (LLM) struggle with SQ...
- Trustworthy LLM-Mediated Communication: Evaluating Information Fidelity in LLM as a Communicator (LAAC) Framework in Multiple Application Domains : Abstract: The proliferation of AI-generated content has created an absurd communication theater where senders use LLMs to inflate simple ideas into verbose content, recipients use LLMs to compress the...
- Computational Turing Test Reveals Systematic Differences Between Human and AI Language : Abstract: Large language models (LLMs) are increasingly used in the social sciences to simulate human behavior, based on the assumption that they can generate realistic, human-like text. Yet this assu...
- LLM-as-a-Judge is Bad, Based on AI Attempting the Exam Qualifying for the Member of the Polish National Board of Appeal : Abstract: This study provides an empirical assessment of whether current large language models (LLMs) can pass the official qualifying examination for membership in Poland's National Appeal Chamber (K...
- Reusing Pre-Training Data at Test Time is a Compute Multiplier : Abstract: Large language models learn from their vast pre-training corpora, gaining the ability to solve an ever increasing variety of tasks; yet although researchers work to improve these datasets, t...
- Efficient Topic Extraction via Graph-Based Labeling: A Lightweight Alternative to Deep Models : Abstract: Extracting topics from text has become an essential task, especially with the rapid growth of unstructured textual data. Most existing works rely on highly computational methods to address t...
- SSPO: Subsentence-level Policy Optimization : Abstract: As a significant part of post-training of the Large Language Models (LLMs), Reinforcement Learning from Verifiable Reward (RLVR) has greatly improved LLMs' reasoning skills. However, some RL...
- Dynamic Jointly Batch Selection for Data Efficient Machine Translation Fine-Tuning : Abstract: Data quality and its effective selection are fundamental to improving the performance of machine translation models, serving as cornerstones for achieving robust and reliable translation sys...
- If I Could Turn Back Time: Temporal Reframing as a Historical Reasoning Task for LLMs : Abstract: In this study, we experiment with the ability of LLMs to do temporal reasoning. Using a Norwegian book from 1940 containing trivia questions, we prompt the LLMs to answer the questions as if...
- Probabilistic Textual Time Series Depression Detection : Abstract: Accurate and interpretable predictions of depression severity are essential for clinical decision support, yet existing models often lack uncertainty estimates and temporal modeling. We prop...
- ThaiOCRBench: A Task-Diverse Benchmark for Vision-Language Understanding in Thai : Abstract: We present ThaiOCRBench, the first comprehensive benchmark for evaluating vision-language models (VLMs) on Thai text-rich visual understanding tasks. Despite recent progress in multimodal mo...
- OUNLP at TSAR 2025 Shared Task: Multi-Round Text Simplifier via Code Generation : Abstract: This paper describes the OUNLP system submitted to the TSAR-2025 Shared Task (Alva-Manchego et al., 2025), designed for readability-controlled text simplification using LLM-prompting-based g...
- Decoding Emergent Big Five Traits in Large Language Models: Temperature-Dependent Expression and Architectural Clustering : Abstract: As Large Language Models (LLMs) become integral to human-centered applications, understanding their personality-like behaviors is increasingly important for responsible development and deplo...
- RAGalyst: Automated Human-Aligned Agentic Evaluation for Domain-Specific RAG : Abstract: Retrieval-Augmented Generation (RAG) is a critical technique for grounding Large Language Models (LLMs) in factual evidence, yet evaluating RAG systems in specialized, safety-critical domain...
- Modeling Clinical Uncertainty in Radiology Reports: from Explicit Uncertainty Markers to Implicit Reasoning Pathways : Abstract: Radiology reports are invaluable for clinical decision-making and hold great potential for automated analysis when structured into machine-readable formats. These reports often contain uncer...
- Are language models aware of the road not taken? Token-level uncertainty and hidden state dynamics : Abstract: When a language model generates text, the selection of individual tokens might lead it down very different reasoning paths, making uncertainty difficult to quantify. In this work, we conside...
- IntelliProof: An Argumentation Network-based Conversational Helper for Organized Reflection : Abstract: We present IntelliProof, an interactive system for analyzing argumentative essays through LLMs. IntelliProof structures an essay as an argumentation graph, where claims are represented as no...
- From Model to Breach: Towards Actionable LLM-Generated Vulnerabilities Reporting : Abstract: As the role of Large Language Models (LLM)-based coding assistants in software development becomes more critical, so does the role of the bugs they generate in the overall cybersecurity land...
- BanglaMedQA and BanglaMMedBench: Evaluating Retrieval-Augmented Generation Strategies for Bangla Biomedical Question Answering : Abstract: Developing accurate biomedical Question Answering (QA) systems in low-resource languages remains a major challenge, limiting equitable access to reliable medical knowledge. This paper introd...
- When retrieval outperforms generation: Dense evidence retrieval for scalable fake news detection : Abstract: The proliferation of misinformation necessitates robust yet computationally efficient fact verification systems. While current state-of-the-art approaches leverage Large Language Models (LLM...
- Logit-Entropy Adaptive Stopping Heuristic for Efficient Chain-of-Thought Reasoning : Abstract: Chain-of-Thought (CoT) prompting is a key technique for enabling complex reasoning in large language models. However, generating full, fixed-length rationales is computationally wasteful, in...
- MimiTalk: Revolutionizing Qualitative Research with Dual-Agent AI : Abstract: We present MimiTalk, a dual-agent constitutional AI framework designed for scalable and ethical conversational data collection in social science research. The framework integrates a supervis...
- MIDI-LLM: Adapting Large Language Models for Text-to-MIDI Music Generation : Abstract: We present MIDI-LLM, an LLM for generating multitrack MIDI music from free-form text prompts. Our approach expands a text LLM's vocabulary to include MIDI tokens, and uses a two-stage traini...
- Multi-Agent Collaborative Framework For Math Problem Generation : Abstract: Automatic question generation (AQG) for mathematics education remains an elusive goal for Intelligent Tutoring Systems and educators. While pre-trained transformer-based language models have...
- LLMs and Cultural Values: the Impact of Prompt Language and Explicit Cultural Framing : Abstract: Large Language Models (LLMs) are rapidly being adopted by users across the globe, who interact with them in a diverse range of languages. At the same time, there are well-documented imbalanc...
- Explorability in Pushdown Automata : Abstract: We study explorability, a measure of nondeterminism in pushdown automata, which generalises history-determinism. An automaton is k-explorable if, while reading the input, it suffices to foll...
- Sub-exponential Growth in Online Word Usage: A Piecewise Power-Law Model : Abstract: The diffusion of ideas and language in society has conventionally been described by S-shaped models, such as the logistic curve. However, the role of sub-exponential growth - a slower than ex...
- Seeing Straight: Document Orientation Detection for Efficient OCR : Abstract: Despite significant advances in document understanding, determining the correct orientation of scanned or photographed documents remains a critical pre-processing step in the real world sett...
- Transforming Mentorship: An AI Powered Chatbot Approach to University Guidance : Abstract: University students face immense challenges during their undergraduate lives, often being deprived of personalized on-demand guidance that mentors fail to provide at scale. Digital tools exi...
- Black-Box Guardrail Reverse-engineering Attack : Abstract: Large language models (LLMs) increasingly employ guardrails to enforce ethical, legal, and application-specific constraints on their outputs. While effective at mitigating harmful responses,...
- Large language models replicate and predict human cooperation across experiments in game theory : Abstract: Large language models (LLMs) are increasingly used both to make decisions in domains such as health, education and law, and to simulate human behavior. Yet how closely LLMs mirror actual hum...
- Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm : Abstract: The "Thinking with Text" and "Thinking with Images" paradigms significantly improve the reasoning ability of large language models (LLMs) and Vision Language Models (VLMs). However, these paradig...
- Are We Asking the Right Questions? On Ambiguity in Natural Language Queries for Tabular Data Analysis : Abstract: Natural language interfaces to tabular data must handle ambiguities inherent to queries. Instead of treating ambiguity as a deficiency, we reframe it as a feature of cooperative interaction,...
- VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks : Abstract: LLMs can perform multi-step reasoning through Chain-of-Thought (CoT), but they cannot reliably verify their own logic. Even when they reach correct answers, the underlying reasoning may be f...
- Decomposed Prompting: Probing Multilingual Linguistic Structure Knowledge in Large Language Models : Abstract: Probing the multilingual knowledge of linguistic structure in LLMs, often characterized as sequence labeling, faces challenges with maintaining output templates in current text-to-text promp...
- Legal Fact Prediction: The Missing Piece in Legal Judgment Prediction : Abstract: Legal judgment prediction (LJP), which enables litigants and their lawyers to forecast judgment outcomes and refine litigation strategies, has emerged as a crucial legal NLP task. Existing s...
- DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination : Abstract: Despite the great success of Large Vision-Language Models (LVLMs), they inevitably suffer from hallucination. As we know, both the visual encoder and the Large Language Model (LLM) decoder i...
- Who is the root in a syntactic dependency structure? : Abstract: The syntactic structure of a sentence can be described as a tree that indicates the syntactic relationships between words. In spite of significant progress in unsupervised methods that retri...
- Pragmatic Reasoning improves LLM Code Generation : Abstract: Large Language Models (LLMs) have demonstrated impressive potential in translating natural language (NL) instructions into program code. However, user instructions often contain inherent amb...
- GraphCheck: Multipath Fact-Checking with Entity-Relationship Graphs : Abstract: Automated fact-checking aims to assess the truthfulness of textual claims based on relevant evidence. However, verifying complex claims that require multi-hop reasoning remains a significant...
- Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards : Abstract: Retrieval-augmented generation (RAG) aims to reduce hallucinations by grounding responses in external context, yet large language models (LLMs) still frequently introduce unsupported informa...
- On Multilingual Encoder Language Model Compression for Low-Resource Languages : Abstract: In this paper, we combine two-step knowledge distillation, structured pruning, truncation, and vocabulary trimming for extremely compressing multilingual encoder-only language models for low...
- Compression Hacking: A Supplementary Perspective on Informatics Properties of Language Models from Geometric Distortion : Abstract: Recently, the concept of "compression as intelligence" has provided a novel informatics metric perspective for language models (LMs), emphasizing that highly structured representations sig...
- Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models : Abstract: Large language models (LLMs) have significantly advanced in reasoning tasks through reinforcement learning (RL) optimization, achieving impressive capabilities across various challenging ben...
- Homogeneous Keys, Heterogeneous Values: Exploiting Local KV Cache Asymmetry for Long-Context LLMs : Abstract: Recent advances in Large Language Models (LLMs) have highlighted the critical importance of extending context length, yet the quadratic complexity of attention mechanisms poses significant c...
- TurBLiMP: A Turkish Benchmark of Linguistic Minimal Pairs : Abstract: We introduce TurBLiMP, the first Turkish benchmark of linguistic minimal pairs, designed to evaluate the linguistic abilities of monolingual and multilingual language models (LMs). Covering ...
- FinEval-KR: A Financial Domain Evaluation Framework for Large Language Models' Knowledge and Reasoning : Abstract: Large Language Models (LLMs) demonstrate significant potential but face challenges in complex financial reasoning tasks requiring both domain knowledge and sophisticated reasoning. Current e...
- Text2VectorSQL: Towards a Unified Interface for Vector Search and SQL Queries : Abstract: The proliferation of unstructured data poses a fundamental challenge to traditional database interfaces. While Text-to-SQL has democratized access to structured data, it remains incapable of...
- XL-DURel: Finetuning Sentence Transformers for Ordinal Word-in-Context Classification : Abstract: We propose XL-DURel, a finetuned, multilingual Sentence Transformer model optimized for ordinal Word-in-Context classification. We test several loss functions for regression and ranking task...
- Balancing Quality and Variation: Spam Filtering Distorts Data Label Distributions : Abstract: For machine learning datasets to accurately represent diverse opinions in a population, they must preserve variation in data labels while filtering out spam or low-quality responses. How can...
- Will Large Language Models Transform Clinical Prediction? : Abstract: Objective: Large language models (LLMs) are attracting increasing interest in healthcare. This commentary evaluates the potential of LLMs to improve clinical prediction models (CPMs) for dia...
- Hierarchical Retrieval with Evidence Curation for Open-Domain Financial Question Answering on Standardized Documents : Abstract: Retrieval-augmented generation (RAG) based large language models (LLMs) are widely used in finance for their excellent performance on knowledge-intensive tasks. However, standardized documen...
- LoRA-Edge: Tensor-Train-Assisted LoRA for Practical CNN Fine-Tuning on Edge Devices : Abstract: On-device fine-tuning of CNNs is essential to withstand domain shift in edge applications such as Human Activity Recognition (HAR), yet full fine-tuning is infeasible under strict memory, co...
- SILVI: Simple Interface for Labeling Video Interactions : Abstract: Computer vision methods are increasingly used for the automated analysis of large volumes of video data collected through camera traps, drones, or direct observations of animals in the wild....
- Noise Injection: Improving Out-of-Distribution Generalization for Limited Size Datasets : Abstract: Deep learned (DL) models for image recognition have been shown to fail to generalize to data from different devices, populations, etc. COVID-19 detection from Chest X-rays (CXRs), in particu...
- Improving Diagnostic Performance on Small and Imbalanced Datasets Using Class-Based Input Image Composition : Abstract: Small, imbalanced datasets and poor input image quality can lead to high false prediction rates with deep learning models. This paper introduces Class-Based Image Composition, an approach t...
- I Detect What I Don't Know: Incremental Anomaly Learning with Stochastic Weight Averaging-Gaussian for Oracle-Free Medical Imaging : Abstract: Unknown anomaly detection in medical imaging remains a fundamental challenge due to the scarcity of labeled anomalies and the high cost of expert supervision. We introduce an unsupervised, o...
- Adaptive Temporal Refinement: Continuous Depth Allocation and Distance Regression for Efficient Action Localization : Abstract: Temporal action localization requires precise boundary detection; however, current methods apply uniform computation despite significant variations in difficulty across boundaries. We presen...
- Improving Multi-View Reconstruction via Texture-Guided Gaussian-Mesh Joint Optimization : Abstract: Reconstructing real-world objects from multi-view images is essential for applications in 3D editing, AR/VR, and digital content creation. Existing methods typically prioritize either geomet...
- A Linear Fractional Transformation Model and Calibration Method for Light Field Camera : Abstract: Accurate calibration of internal parameters is a crucial yet challenging prerequisite for 3D reconstruction using light field cameras. In this paper, we propose a linear fractional transform...
- Room Envelopes: A Synthetic Dataset for Indoor Layout Reconstruction from Images : Abstract: Modern scene reconstruction methods are able to accurately recover 3D surfaces that are visible in one or more images. However, this leads to incomplete reconstructions, missing all occluded...
- Simple 3D Pose Features Support Human and Machine Social Scene Understanding : Abstract: Humans can quickly and effortlessly extract a variety of information about others' social interactions from visual input, ranging from visuospatial cues like whether two people are facing ea...
- CaRF: Enhancing Multi-View Consistency in Referring 3D Gaussian Splatting Segmentation : Abstract: Referring 3D Gaussian Splatting Segmentation (R3DGS) aims to interpret free-form language expressions and localize the corresponding 3D regions in Gaussian fields. While recent advances have...
- PhysCorr: Dual-Reward DPO for Physics-Constrained Text-to-Video Generation with Automated Preference Selection : Abstract: Recent advances in text-to-video generation have achieved impressive perceptual quality, yet generated content often violates fundamental principles of physical plausibility - manifesting as...
- GNN-MoE: Context-Aware Patch Routing using GNNs for Parameter-Efficient Domain Generalization : Abstract: Domain generalization (DG) seeks robust Vision Transformer (ViT) performance on unseen domains. Efficiently adapting pretrained ViTs for DG is challenging; standard fine-tuning is costly and...
- MedDChest: A Content-Aware Multimodal Foundational Vision Model for Thoracic Imaging : Abstract: The performance of vision models in medical imaging is often hindered by the prevailing paradigm of fine-tuning backbones pre-trained on out-of-domain natural images. To address this fundame...
- Near-Lossless 3D Voxel Representation Free from Iso-surface : Abstract: Accurate and efficient voxelized representations of 3D meshes are the foundation of 3D reconstruction and generation. However, existing representations based on iso-surface heavily rely on w...
- A Hybrid Deep Learning Model for Robust Biometric Authentication from Low-Frame-Rate PPG Signals : Abstract: Photoplethysmography (PPG) signals, which measure changes in blood volume in the skin using light, have recently gained attention in biometric authentication because of their non-invasive ac...
- Unveiling Deep Semantic Uncertainty Perception for Language-Anchored Multi-modal Vision-Brain Alignment : Abstract: Unveiling visual semantics from neural signals such as EEG, MEG, and fMRI remains a fundamental challenge due to subject variability and the entangled nature of visual features. Existing app...
- Adversarial and Score-Based CT Denoising: CycleGAN vs Noise2Score : Abstract: We study CT image denoising in the unpaired and self-supervised regimes by evaluating two strong, training-data-efficient paradigms: a CycleGAN-based residual translator and a Noise2Score (N...
- When Swin Transformer Meets KANs: An Improved Transformer Architecture for Medical Image Segmentation : Abstract: Medical image segmentation is critical for accurate diagnostics and treatment planning, but remains challenging due to complex anatomical structures and limited annotated training data. CNN-...
- SpatialLock: Precise Spatial Control in Text-to-Image Synthesis : Abstract: Text-to-Image (T2I) synthesis has made significant advancements in recent years, driving applications such as generating datasets automatically. However, precise control over object localiza...
- Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration : Abstract: In this paper, we propose Tortoise and Hare Guidance (THG), a training-free strategy that accelerates diffusion sampling while maintaining high-fidelity generation. We demonstrate that the n...
- Text to Sketch Generation with Multi-Styles : Abstract: Recent advances in vision-language models have facilitated progress in sketch generation. However, existing specialized methods primarily focus on generic synthesis and lack mechanisms for p...
- Automated Tennis Player and Ball Tracking with Court Keypoints Detection (Hawk Eye System) : Abstract: This study presents a complete pipeline for automated tennis match analysis. Our framework integrates multiple deep learning models to detect and track players and the tennis ball in real ti...
- DMSORT: An efficient parallel maritime multi-object tracking architecture for unmanned vessel platforms : Abstract: Accurate perception of the marine environment through robust multi-object tracking (MOT) is essential for ensuring safe vessel navigation and effective maritime surveillance. However, the co...
- Learning from Online Videos at Inference Time for Computer-Use Agents : Abstract: Computer-use agents can operate computers and automate laborious tasks, but despite recent rapid progress, they still lag behind human users, especially when tasks require domain-specific pr...
- Systematic Evaluation of Preprocessing Techniques for Accurate Image Registration in Digital Pathology : Abstract: Image registration refers to the process of spatially aligning two or more images by mapping them into a common coordinate system, so that corresponding anatomical or tissue structures are m...
- Covariance Descriptors Meet General Vision Encoders: Riemannian Deep Learning for Medical Image Classification : Abstract: Covariance descriptors capture second-order statistics of image features. They have shown strong performance in general computer vision tasks, but remain underexplored in medical imaging. We...
- AStF: Motion Style Transfer via Adaptive Statistics Fusor : Abstract: Human motion style transfer allows characters to appear less rigid and more realistic with a specific style. Traditional arbitrary image style transfer typically processes mean and variance whi...
- Proto-LeakNet: Towards Signal-Leak Aware Attribution in Synthetic Human Face Imagery : Abstract: The growing sophistication of synthetic image and deepfake generation models has turned source attribution and authenticity verification into a critical challenge for modern computer vision ...
- DINOv2 Driven Gait Representation Learning for Video-Based Visible-Infrared Person Re-identification : Abstract: Video-based Visible-Infrared person re-identification (VVI-ReID) aims to retrieve the same pedestrian across visible and infrared modalities from video sequences. Existing methods tend to ex...
- FastGS: Training 3D Gaussian Splatting in 100 Seconds : Abstract: The dominant 3D Gaussian splatting (3DGS) acceleration methods fail to properly regulate the number of Gaussians during training, causing redundant computational time overhead. In this paper...
- Vision Foundation Models in Agriculture: Toward Domain-Specific Adaptation for Weed Herbicide Trials Assessment : Abstract: Herbicide field trials require accurate identification of plant species and assessment of herbicide-induced damage across diverse environments. While general-purpose vision foundation models...
- Deep learning-based object detection of offshore platforms on Sentinel-1 Imagery and the impact of synthetic training data : Abstract: The recent and ongoing expansion of marine infrastructure, including offshore wind farms, oil and gas platforms, artificial islands, and aquaculture facilities, highlights the need for effec...
- RISE-T2V: Rephrasing and Injecting Semantics with LLM for Expansive Text-to-Video Generation : Abstract: Most text-to-video (T2V) diffusion models depend on pre-trained text encoders for semantic alignment, yet they often fail to maintain video quality when provided with concise prompts rather t...
- Comparative Study of CNN Architectures for Binary Classification of Horses and Motorcycles in the VOC 2008 Dataset : Abstract: This paper presents a comprehensive evaluation of nine convolutional neural network architectures for binary classification of horses and motorcycles in the VOC 2008 dataset. We address the ...
- Evaluating the Impact of Weather-Induced Sensor Occlusion on BEVFusion for 3D Object Detection : Abstract: Accurate 3D object detection is essential for automated vehicles to navigate safely in complex real-world environments. Bird's Eye View (BEV) representations, which project multi-sensor data...
- A MATLAB tutorial on deep feature extraction combined with chemometrics for analytical applications : Abstract: Background In analytical chemistry, spatial information about materials is commonly captured through imaging techniques, such as traditional color cameras or with advanced hyperspectral came...
- BoRe-Depth: Self-supervised Monocular Depth Estimation with Boundary Refinement for Embedded Systems : Abstract: Depth estimation is one of the key technologies for realizing 3D perception in unmanned systems. Monocular depth estimation has been widely researched because of its low-cost advantage, but ...
- DORAEMON: A Unified Library for Visual Object Modeling and Representation Learning at Scale : Abstract: DORAEMON is an open-source PyTorch library that unifies visual object modeling and representation learning across diverse scales. A single YAML-driven workflow covers classification, retriev...
- HideAndSeg: an AI-based tool with automated prompting for octopus segmentation in natural habitats : Abstract: Analyzing octopuses in their natural habitats is challenging due to their camouflage capability, rapid changes in skin texture and color, non-rigid body deformations, and frequent occlusions...
- Solving Convex Partition Visual Jigsaw Puzzles : Abstract: Jigsaw puzzle solving requires the rearrangement of unordered pieces into their original pose in order to reconstruct a coherent whole, often an image, and is known to be an intractable prob...
- V-Thinker: Interactive Thinking with Images : Abstract: Empowering Large Multimodal Models (LMMs) to deeply integrate image interaction with long-horizon reasoning capabilities remains a long-standing challenge in this field. Recent advances in v...
- Landslide Hazard Mapping with Geospatial Foundation Models: Geographical Generalizability, Data Scarcity, and Band Adaptability : Abstract: Landslides cause severe damage to lives, infrastructure, and the environment, making accurate and timely mapping essential for disaster preparedness and response. However, conventional deep ...
- THEval. Evaluation Framework for Talking Head Video Generation : Abstract: Video generation has achieved remarkable progress, with generated videos increasingly resembling real ones. However, the rapid advance in generation has outpaced the development of adequate ...
- Learning from Single Timestamps: Complexity Estimation in Laparoscopic Cholecystectomy : Abstract: Purpose: Accurate assessment of surgical complexity is essential in Laparoscopic Cholecystectomy (LC), where severe inflammation is associated with longer operative times and increased risk ...
- UniSplat: Unified Spatio-Temporal Fusion via 3D Latent Scaffolds for Dynamic Driving Scene Reconstruction : Abstract: Feed-forward 3D reconstruction for autonomous driving has advanced rapidly, yet existing methods struggle with the joint challenges of sparse, non-overlapping camera views and complex scene ...
- PixCLIP: Achieving Fine-grained Visual Language Understanding via Any-granularity Pixel-Text Alignment Learning : Abstract: While the Contrastive Language-Image Pretraining (CLIP) model has achieved remarkable success in a variety of downstream vision-language understanding tasks, enhancing its capability for fine-...
- Building Trust in Virtual Immunohistochemistry: Automated Assessment of Image Quality : Abstract: Deep learning models can generate virtual immunohistochemistry (IHC) stains from hematoxylin and eosin (H&E) images, offering a scalable and low-cost alternative to laboratory IHC. However, ...
- NovisVQ: A Streaming Convolutional Neural Network for No-Reference Opinion-Unaware Frame Quality Assessment : Abstract: Video quality assessment (VQA) is vital for computer vision tasks, but existing approaches face major limitations: full-reference (FR) metrics require clean reference videos, and most no-ref...
- Polarization-resolved imaging improves eye tracking : Abstract: Polarization-resolved near-infrared imaging adds a useful optical contrast mechanism to eye tracking by measuring the polarization state of light reflected by ocular tissues in addition to i...
- TIMESAFE: Timing Interruption Monitoring and Security Assessment for Fronthaul Environments : Abstract: 5G and beyond cellular systems embrace the disaggregation of Radio Access Network (RAN) components, exemplified by the evolution of the fronthaul (FH) connection between cellular baseband an...
- coverforest: Conformal Predictions with Random Forest in Python : Abstract: Conformal prediction provides a framework for uncertainty quantification, specifically in the forms of prediction intervals and sets with distribution-free guaranteed coverage. While recent ...
- KGGen: Extracting Knowledge Graphs from Plain Text with Language Models : Abstract: Recent interest in building foundation models for KGs has highlighted a fundamental challenge: knowledge-graph data is relatively scarce. The best-known KGs are primarily human-labeled, crea...
- Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality : Abstract: Data filtering has become a powerful tool for improving model performance while reducing computational cost. However, as large language model compute budgets continue to grow, the limited da...
- Efficient Model Development through Fine-tuning Transfer : Abstract: Modern LLMs struggle with efficient updates, as each new pretrained model version requires repeating expensive alignment processes. This challenge also applies to domain- or language-specific...
- TathyaNyaya and FactLegalLlama: Advancing Factual Judgment Prediction and Explanation in the Indian Legal Context : Abstract: In the landscape of Fact-based Judgment Prediction and Explanation (FJPE), reliance on factual data is essential for developing robust and realistic AI-driven decision-making tools. This pap...
- DashCLIP: Leveraging multimodal models for generating semantic embeddings for DoorDash : Abstract: Despite the success of vision-language models in various generative tasks, obtaining high-quality semantic representations for products and user intents is still challenging due to the inabi...
- RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Chest X-ray with Zero-Shot Multi-Task Capability : Abstract: Recent advancements in multimodal models have significantly improved vision-language (VL) alignment in radiology. However, existing approaches struggle to effectively utilize complex radiolo...
- Robustness in Large Language Models: A Survey of Mitigation Strategies and Evaluation Metrics : Abstract: Large Language Models (LLMs) have emerged as a promising cornerstone for the development of natural language processing (NLP) and artificial intelligence (AI). However, ensuring the robustne...
- Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables : Abstract: Latent variables (LVs) play a crucial role in encoder-decoder models by enabling effective data compression, prediction, and generation. Although their theoretical properties, such as genera...
- Efficient Reinforcement Learning from Human Feedback via Bayesian Preference Inference : Abstract: Learning from human preferences is a cornerstone of aligning machine learning models with subjective human judgments. Yet, collecting such preference data is often costly and time-consuming,...
- Differentially Private In-Context Learning with Nearest Neighbor Search : Abstract: Differentially private in-context learning (DP-ICL) has recently become an active research topic due to the inherent privacy risks of in-context learning. However, existing approaches overlo...
- LUME-DBN: Full Bayesian Learning of DBNs from Incomplete data in Intensive Care : Abstract: Dynamic Bayesian networks (DBNs) are increasingly used in healthcare due to their ability to model complex temporal relationships in patient data while maintaining interpretability, an essen...
- Spurious Correlation-Aware Embedding Regularization for Worst-Group Robustness : Abstract: Deep learning models achieve strong performance across various domains but often rely on spurious correlations, making them vulnerable to distribution shifts. This issue is particularly seve...
- The Illusion of Certainty: Uncertainty quantification for LLMs fails under ambiguity : Abstract: Accurate uncertainty quantification (UQ) in Large Language Models (LLMs) is critical for trustworthy deployment. While real-world language is inherently ambiguous, reflecting aleatoric uncer...
- On the Equivalence of Regression and Classification : Abstract: A formal link between regression and classification has been tenuous. Even though the margin maximization term $\|w\|$ is used in support vector regression, it has at best been justified as ...
- ForecastGAN: A Decomposition-Based Adversarial Framework for Multi-Horizon Time Series Forecasting : Abstract: Time series forecasting is essential across domains from finance to supply chain management. This paper introduces ForecastGAN, a novel decomposition based adversarial framework addressing l...
- Federated Stochastic Minimax Optimization under Heavy-Tailed Noises : Abstract: Heavy-tailed noise has attracted growing attention in nonconvex stochastic optimization, as numerous empirical studies suggest it offers a more realistic assumption than standard bounded var...
- Towards Causal Market Simulators : Abstract: Market generators using deep generative models have shown promise for synthetic financial data generation, but existing approaches lack causal reasoning capabilities essential for counterfac...
- Ground-Truth Subgraphs for Better Training and Evaluation of Knowledge Graph Augmented LLMs : Abstract: Retrieval of information from graph-structured knowledge bases represents a promising direction for improving the factuality of LLMs. While various solutions have been proposed, a comparison...
- Q3R: Quadratic Reweighted Rank Regularizer for Effective Low-Rank Training : Abstract: Parameter-efficient training, based on low-rank optimization, has become a highly successful tool for fine-tuning large deep-learning models. However, these methods fail at low-rank pre-trai...
- Distribution-Aware Tensor Decomposition for Compression of Convolutional Neural Networks : Abstract: Neural networks are widely used for image-related tasks but typically demand considerable computing power. Once a network has been trained, however, its memory- and compute-footprint can be ...
- Alternative Fairness and Accuracy Optimization in Criminal Justice : Abstract: Algorithmic fairness has grown rapidly as a research area, yet key concepts remain unsettled, especially in criminal justice. We review group, individual, and process fairness and map the co...
- Linear Mode Connectivity under Data Shifts for Deep Ensembles of Image Classifiers : Abstract: The phenomenon of linear mode connectivity (LMC) links several aspects of deep learning, including training stability under noisy stochastic gradients, the smoothness and generalization of l...
- Comparing EPGP Surrogates and Finite Elements Under Degree-of-Freedom Parity : Abstract: We present a new benchmarking study comparing a boundary-constrained Ehrenpreis–Palamodov Gaussian Process (B-EPGP) surrogate with a classical finite element method combined with Crank–Nic...
- End-to-End Reinforcement Learning of Koopman Models for eNMPC of an Air Separation Unit : Abstract: With our recently proposed method based on reinforcement learning (Mayfrank et al. (2024), Comput. Chem. Eng. 190), Koopman surrogate models can be trained for optimal performance in specifi...
- Uncertainty Quantification for Reduced-Order Surrogate Models Applied to Cloud Microphysics : Abstract: Reduced-order models (ROMs) can efficiently simulate high-dimensional physical systems, but lack robust uncertainty quantification methods. Existing approaches are frequently architecture- o...
- Integrating Temporal and Structural Context in Graph Transformers for Relational Deep Learning : Abstract: In domains such as healthcare, finance, and e-commerce, the temporal dynamics of relational data emerge from complex interactions, such as those between patients and providers, or users and p...
- ARETE: an R package for Automated REtrieval from TExt with large language models : Abstract: A hard stop for the implementation of rigorous conservation initiatives is our lack of key species data, especially occurrence data. Furthermore, researchers have to contend with an accel...
- Complexity as Advantage: A Regret-Based Perspective on Emergent Structure : Abstract: We introduce Complexity as Advantage (CAA), a framework that defines the complexity of a system relative to a family of observers. Instead of measuring complexity as an intrinsic property, w...
- Regret Lower Bounds for Decentralized Multi-Agent Stochastic Shortest Path Problems : Abstract: Multi-agent systems (MAS) are central to applications such as swarm robotics and traffic routing, where agents must coordinate in a decentralized manner to achieve a common objective. Stocha...
- Environment Agnostic Goal-Conditioning, A Study of Reward-Free Autonomous Learning : Abstract: In this paper we study how transforming regular reinforcement learning environments into goal-conditioned environments can let agents learn to solve tasks autonomously and reward-free. We sh...
- Addressing divergent representations from causal interventions on neural networks : Abstract: A common approach to mechanistic interpretability is to causally manipulate model representations via targeted interventions in order to understand what those representations encode. Here we...
- Efficient probabilistic surrogate modeling techniques for partially-observed large-scale dynamical systems : Abstract: This paper is concerned with probabilistic techniques for forecasting dynamical systems described by partial differential equations (such as, for example, the Navier-Stokes equations). In pa...
- Optimal Inference Schedules for Masked Diffusion Models : Abstract: A major bottleneck of standard auto-regressive large language models is that their inference process is inherently sequential, resulting in very long and costly inference times. To circumven...
- TT-Prune: Joint Model Pruning and Resource Allocation for Communication-efficient Time-triggered Federated Learning : Abstract: Federated learning (FL) offers new opportunities in machine learning, particularly in addressing data privacy concerns. In contrast to conventional event-based federated learning, time-trigg...
- Nowcast3D: Reliable precipitation nowcasting via gray-box learning : Abstract: Extreme precipitation nowcasting demands high spatiotemporal fidelity and extended lead times, yet existing approaches remain limited. Numerical Weather Prediction (NWP) and its deep-learnin...
- Forgetting is Everywhere : Abstract: A fundamental challenge in developing general learning algorithms is their tendency to forget past knowledge when adapting to new data. Addressing this problem requires a principled understa...
- Multi-Method Analysis of Mathematics Placement Assessments: Classical, Machine Learning, and Clustering Approaches : Abstract: This study evaluates a 40-item mathematics placement examination administered to 198 students using a multi-method framework combining Classical Test Theory, machine learning, and unsupervis...
- Simulation-Based Validation of an Integrated 4D/5D Digital-Twin Framework for Predictive Construction Control : Abstract: Persistent cost and schedule deviations remain a major challenge in the U.S. construction industry, revealing the limitations of deterministic CPM and static document-based estimating. This ...
- Friction on Demand: A Generative Framework for the Inverse Design of Metainterfaces : Abstract: Designing frictional interfaces to exhibit prescribed macroscopic behavior is a challenging inverse problem, made difficult by the non-uniqueness of solutions and the computational cost of c...
- A convolutional neural network deep learning method for model class selection : Abstract: The response-only model class selection capability of a novel deep convolutional neural network method is examined herein in a simple, yet effective, manner. Specifically, the responses from...
- A Dynamic Recurrent Adjacency Memory Network for Mixed-Generation Power System Stability Forecasting : Abstract: Modern power systems with high penetration of inverter-based resources exhibit complex dynamic behaviors that challenge the scalability and generalizability of traditional stability assessme...
- Bifidelity Karhunen-Loève Expansion Surrogate with Active Learning for Random Fields : Abstract: We present a bifidelity Karhunen-Loève expansion (KLE) surrogate model for field-valued quantities of interest (QoIs) under uncertain inputs. The approach combines the spectral efficiency ...
- Deep Learning-Driven Downscaling for Climate Risk Assessment of Projected Temperature Extremes in the Nordic Region : Abstract: Rapid changes and increasing climatic variability across the widely varied Koppen-Geiger regions of northern Europe generate significant needs for adaptation. Regional planning needs high-re...
- Climbing the label tree: Hierarchy-preserving contrastive learning for medical imaging : Abstract: Medical image labels are often organized by taxonomies (e.g., organ - tissue - subtype), yet standard self-supervised learning (SSL) ignores this structure. We present a hierarchy-preserving...
- Learning Paths for Dynamic Measure Transport: A Control Perspective : Abstract: We bring a control perspective to the problem of identifying paths of measures for sampling via dynamic measure transport (DMT). We highlight the fact that commonly used paths may be poor ch...
- How Different Tokenization Algorithms Impact LLMs and Transformer Models for Binary Code Analysis : Abstract: Tokenization is fundamental in assembly code analysis, impacting intrinsic characteristics like vocabulary size, semantic coverage, and extrinsic performance in downstream tasks. Despite its...
- To See or To Read: User Behavior Reasoning in Multimodal LLMs : Abstract: Multimodal Large Language Models (MLLMs) are reshaping how modern agentic systems reason over sequential user-behavior data. However, whether textual or image representations of user behavio...
- Which Similarity-Sensitive Entropy? : Abstract: A canonical step in quantifying a system is to measure its entropy. Shannon entropy and other traditional entropy measures capture only the information encoded in the frequencies of a system...
- OMPILOT: Harnessing Transformer Models for Auto Parallelization to Shared Memory Computing Paradigms : Abstract: Recent advances in large language models (LLMs) have significantly accelerated progress in code translation, enabling more accurate and efficient transformation across programming languages....
- Computed Tomography (CT)-derived Cardiovascular Flow Estimation Using Physics-Informed Neural Networks Improves with Sinogram-based Training: A Simulation Study : Abstract: Background: Non-invasive imaging-based assessment of blood flow plays a critical role in evaluating heart function and structure. Computed Tomography (CT) is a widely-used imaging modality t...
- KnowThyself: An Agentic Assistant for LLM Interpretability : Abstract: We develop KnowThyself, an agentic assistant that advances large language model (LLM) interpretability. Existing tools provide useful insights but remain fragmented and code-intensive. KnowT...
- Investigating Robot Control Policy Learning for Autonomous X-ray-guided Spine Procedures : Abstract: Imitation learning-based robot control policies are enjoying renewed interest in video-based robotics. However, it remains unclear whether this approach applies to X-ray-guided procedures, s...
- Desert Waste Detection and Classification Using Data-Based and Model-Based Enhanced YOLOv12 DL Model : Abstract: The global waste crisis is escalating, with solid waste generation expected to increase by 70% by 2050. Traditional waste collection methods, particularly in remote or harsh environments lik...
- Shape Deformation Networks for Automated Aortic Valve Finite Element Meshing from 3D CT Images : Abstract: Accurate geometric modeling of the aortic valve from 3D CT images is essential for biomechanical analysis and patient-specific simulations to assess valve health or make a preoperative plan....
- A general technique for approximating high-dimensional empirical kernel matrices : Abstract: We present simple, user-friendly bounds for the expected operator norm of a random kernel matrix under general conditions on the kernel function $k(\cdot,\cdot)$. Our approach uses decouplin...
- GRAD: Graph-Retrieved Adaptive Decoding for Hallucination Mitigation : Abstract: Hallucination mitigation remains a persistent challenge for large language models (LLMs), even as model scales grow. Existing approaches often rely on external knowledge sources, such as str...
- Vectorized Computation of Euler Characteristic Functions and Transforms : Abstract: The weighted Euler characteristic transform (WECT) and Euler characteristic function (ECF) have proven to be useful tools in a variety of applications. However, current methods for computing...
- High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes : Abstract: We develop a high-dimensional scaling limit for Stochastic Gradient Descent with Polyak Momentum (SGD-M) and adaptive step-sizes. This provides a framework to rigorously compare online SGD ...
- Robust inference using density-powered Stein operators : Abstract: We introduce a density-power weighted variant for the Stein operator, called the $\gamma$-Stein operator. This is a novel class of operators derived from the $\gamma$-divergence, designed to...
- A Characterization of List Language Identification in the Limit : Abstract: We study the problem of language identification in the limit, where given a sequence of examples from a target language, the goal of the learner is to output a sequence of guesses for the ta...
- Automated and Explainable Denial of Service Analysis for AI-Driven Intrusion Detection Systems : Abstract: With the increasing frequency and sophistication of Distributed Denial of Service (DDoS) attacks, it has become critical to develop more efficient and interpretable detection methods. Tradit...
- REMIND: Input Loss Landscapes Reveal Residual Memorization in Post-Unlearning LLMs : Abstract: Machine unlearning aims to remove the influence of specific training data from a model without requiring full retraining. This capability is crucial for ensuring privacy, safety, and regulat...
- Twirlator: A Pipeline for Analyzing Subgroup Symmetry Effects in Quantum Machine Learning Ansatzes : Abstract: Leveraging data symmetries has been a key driver of performance gains in geometric deep learning and geometric and equivariant quantum machine learning. While symmetrization appears to be a ...
- MedSapiens: Taking a Pose to Rethink Medical Imaging Landmark Detection : Abstract: This paper does not introduce a novel architecture; instead, it revisits a fundamental yet overlooked baseline: adapting human-centric foundation models for anatomical landmark detection in ...
- Online Conformal Inference with Retrospective Adjustment for Faster Adaptation to Distribution Shift : Abstract: Conformal prediction has emerged as a powerful framework for constructing distribution-free prediction sets with guaranteed coverage assuming only the exchangeability assumption. However, th...
- Robustness of Minimum-Volume Nonnegative Matrix Factorization under an Expanded Sufficiently Scattered Condition : Abstract: Minimum-volume nonnegative matrix factorization (min-vol NMF) has been used successfully in many applications, such as hyperspectral imaging, chemical kinetics, spectroscopy, topic modeling,...
- DeepPAAC: A New Deep Galerkin Method for Principal-Agent Problems : Abstract: We consider numerical resolution of principal-agent (PA) problems in continuous time. We formulate a generic PA model with continuous and lump payments and a multi-dimensional strategy of th...
- AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM : Abstract: SRAM Processing-in-Memory (PIM) has emerged as the most promising implementation for high-performance PIM, delivering superior computing density, energy efficiency, and computational precisi...
- Submanifold Sparse Convolutional Networks for Automated 3D Segmentation of Kidneys and Kidney Tumours in Computed Tomography : Abstract: The accurate delineation of tumours in radiological images like Computed Tomography is a very specialised and time-consuming task, and currently a bottleneck preventing quantitative analyses...
- Where Do LLMs Still Struggle? An In-Depth Analysis of Code Generation Benchmarks : Abstract: Large Language Models (LLMs) have achieved remarkable success in code generation, and the race to improve their performance has become a central focus of AI research. Benchmarks and leaderbo...
- Causal Regime Detection in Energy Markets With Augmented Time Series Structural Causal Models : Abstract: Energy markets exhibit complex causal relationships between weather patterns, generation technologies, and price formation, with regime changes occurring continuously rather than at discrete...
- MusRec: Zero-Shot Text-to-Music Editing via Rectified Flow and Diffusion Transformers : Abstract: Music editing has emerged as an important and practical area of artificial intelligence, with applications ranging from video game and film music production to personalizing existing tracks ...
- Multi-Task Learning for Visually Grounded Reasoning in Gastrointestinal VQA : Abstract: We present a multi-task framework for the MediaEval Medico 2025 challenge, leveraging a LoRA-tuned Florence-2 model for simultaneous visual question answering (VQA), explanation generation, ...
- Online Bayesian Experimental Design for Partially Observed Dynamical Systems : Abstract: Bayesian experimental design (BED) provides a principled framework for optimizing data collection, but existing approaches do not apply to crucial real-world settings such as dynamical syste...
- Deep Koopman Economic Model Predictive Control of a Pasteurisation Unit : Abstract: This paper presents a deep Koopman-based Economic Model Predictive Control (EMPC) for efficient operation of a laboratory-scale pasteurization unit (PU). The method uses Koopman operator the...
- The Peril of Preference: Why GRPO fails on Ordinal Rewards : Abstract: Group-relative Policy Optimization's (GRPO) simplicity makes it highly desirable for adapting LLMs to become experts at specific tasks. But this simplicity also makes it ill-specified as we ...
- Deep Dictionary-Free Method for Identifying Linear Model of Nonlinear System with Input Delay : Abstract: Nonlinear dynamical systems with input delays pose significant challenges for prediction, estimation, and control due to their inherent complexity and the impact of delays on system behavior...
- Fitting Reinforcement Learning Model to Behavioral Data under Bandits : Abstract: We consider the problem of fitting a reinforcement learning (RL) model to some given behavioral data under a multi-armed bandit environment. These models have received much attention in rece...
- Data-driven uncertainty-aware seakeeping prediction of the Delft 372 catamaran using ensemble Hankel dynamic mode decomposition : Abstract: In this study, we present and validate an ensemble-based Hankel Dynamic Mode Decomposition with control (HDMDc) for uncertainty-aware seakeeping predictions of a high-speed catamaran, namely...
- Fraud-Proof Revenue Division on Subscription Platforms : Abstract: We study a model of subscription-based platforms where users pay a fixed fee for unlimited access to content, and creators receive a share of the revenue. Existing approaches to detecting fr...
- Online Algorithms for Repeated Optimal Stopping: Achieving Both Competitive Ratio and Regret Bounds : Abstract: We study the repeated optimal stopping problem, which generalizes the classical optimal stopping problem with an unknown distribution to a setting where the same problem is solved repeatedly...
- RUST-BENCH: Benchmarking LLM Reasoning on Unstructured Text within Structured Tables : Abstract: Existing tabular reasoning benchmarks mostly test models on small, uniform tables, underrepresenting the complexity of real-world data and giving an incomplete view of Large Language Models'...
- Unified Generative Latent Representation for Functional Brain Graphs : Abstract: Functional brain graphs are often characterized with separate graph-theoretic or spectral descriptors, overlooking how these properties covary and partially overlap across brains and conditi...
- Confidential Computing for Cloud Security: Exploring Hardware based Encryption Using Trusted Execution Environments : Abstract: The growth of cloud computing has brought data processing and storage capacities to new levels of scalability and flexibility. But in the process, it has created a huge challenge ...
- Uncertainties in Physics-informed Inverse Problems: The Hidden Risk in Scientific AI : Abstract: Physics-informed machine learning (PIML) integrates partial differential equations (PDEs) into machine learning models to solve inverse problems, such as estimating coefficient functions (e....
- Machine Learning for Electron-Scale Turbulence Modeling in W7-X : Abstract: Constructing reduced models for turbulent transport is essential for accelerating profile predictions and enabling many-query tasks such as uncertainty quantification, parameter scans, and d...
- Riesz Regression As Direct Density Ratio Estimation : Abstract: Riesz regression has garnered attention as a tool in debiased machine learning for causal and structural parameter estimation (Chernozhukov et al., 2021). This study shows that Riesz regress...
- Physics-Informed Neural Networks and Neural Operators for Parametric PDEs: A Human-AI Collaborative Analysis : Abstract: PDEs arise ubiquitously in science and engineering, where solutions depend on parameters (physical properties, boundary conditions, geometry). Traditional numerical methods require re-solvin...
- Jr. AI Scientist and Its Risk Report: Autonomous Scientific Exploration from a Baseline Paper : Abstract: Understanding the current capabilities and risks of AI Scientist systems is essential for ensuring trustworthy and sustainable AI-driven scientific progress while preserving the integrity of...
- evomap: A Toolbox for Dynamic Mapping in Python : Abstract: This paper presents evomap, a Python package for dynamic mapping. Mapping methods are widely used across disciplines to visualize relationships among objects as spatial representations, or m...
- Dynamic causal discovery in Alzheimer's disease through latent pseudotime modelling : Abstract: The application of causal discovery to diseases like Alzheimer's (AD) is limited by the static graph assumptions of most methods; such models cannot account for an evolving pathophysiology, ...
- ODE approximation for the Adam algorithm: General and overparametrized setting : Abstract: The Adam optimizer is currently presumably the most popular optimization method in deep learning. In this article we develop an ODE based method to study the Adam optimizer in a fast-slow sc...
- DR. WELL: Dynamic Reasoning and Learning with Symbolic World Model for Embodied LLM-Based Multi-Agent Collaboration : Abstract: Cooperative multi-agent planning requires agents to make joint decisions with partial information and limited communication. Coordination at the trajectory level often fails, as small deviat...
- Real-to-Sim Robot Policy Evaluation with Gaussian Splatting Simulation of Soft-Body Interactions : Abstract: Robotic manipulation policies are advancing rapidly, but their direct evaluation in the real world remains costly, time-consuming, and difficult to reproduce, particularly for tasks involvin...
- Dark Energy Survey Year 3 results: Simulation-based $w$CDM inference from weak lensing and galaxy clustering maps with deep learning. I. Analysis design : Abstract: Data-driven approaches using deep learning are emerging as powerful techniques to extract non-Gaussian information from cosmological large-scale structure. This work presents the first simul...
- Rater Equivalence: Evaluating Classifiers in Human Judgment Settings : Abstract: In many decision settings, the definitive ground truth is either non-existent or inaccessible. We introduce a framework for evaluating classifiers based solely on human judgments. In such ca...
- Local Fragments, Global Gains: Subgraph Counting using Graph Neural Networks : Abstract: Subgraph counting is a fundamental task for analyzing structural patterns in graph-structured data, with important applications in domains such as computational biology and social network an...
- A Unified Kernel for Neural Network Learning : Abstract: Past decades have witnessed a great interest in the distinction and connection between neural network learning and kernel learning. Recent advancements have made theoretical progress in conn...
- Stochastic Diffusion: A Diffusion Probabilistic Model for Stochastic Time Series Forecasting : Abstract: Recent innovations in diffusion probabilistic models have paved the way for significant progress in image, text and audio generation, leading to their applications in generative time series ...
- Generalizing Graph Transformers Across Diverse Graphs and Tasks via Pre-training : Abstract: Graph pre-training has been concentrated on graph-level tasks involving small graphs (e.g., molecular graphs) or learning node representations on a fixed graph. Extending graph pre-trained m...
- FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher : Abstract: Federated Learning (FL) systems enable the collaborative training of machine learning models without requiring centralized collection of individual data. FL participants should have the abil...
- Diffusion & Adversarial Schrödinger Bridges via Iterative Proportional Markovian Fitting : Abstract: The Iterative Markovian Fitting (IMF) procedure, which iteratively projects onto the space of Markov processes and the reciprocal class, successfully solves the Schrödinger Bridge (SB) pro...
- Small Singular Values Matter: A Random Matrix Analysis of Transformer Models : Abstract: This work analyzes singular-value spectra of weight matrices in pretrained transformer models to understand how information is stored at both ends of the spectrum. Using Random Matrix Theory...
- Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models : Abstract: The widespread use of AI-generated content from diffusion models has raised significant concerns regarding misinformation and copyright infringement. Watermarking is a crucial technique for ...
- Scaling Laws for Task-Optimized Models of the Primate Visual Ventral Stream : Abstract: When trained on large-scale object classification datasets, certain artificial neural network models begin to approximate core object recognition behaviors and neural response patterns in th...
- scMEDAL for the interpretable analysis of single-cell transcriptomics data with batch effect visualization using a deep mixed effects autoencoder : Abstract: Single-cell RNA sequencing enables high-resolution analysis of cellular heterogeneity, yet disentangling biological signal from batch effects remains a major challenge. Existing batch-correc...
- AnomalyAID: Reliable Interpretation for Semi-supervised Network Anomaly Detection : Abstract: Semi-supervised learning plays a crucial role in network anomaly detection applications; however, learning anomaly patterns with limited labeled samples is not easy. Additionally, the lack o...
- GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs : Abstract: LLMs have shown impressive capabilities across various natural language processing tasks, yet remain vulnerable to input prompts, known as jailbreak attacks, carefully designed to bypass saf...
- Revisiting Federated Fine-Tuning: A Single Communication Round is Enough for Foundation Models : Abstract: The recent advancement of foundation models (FMs) has increased the demand for fine-tuning these models on large-scale cross-domain datasets. To address this, federated fine-tuning has emerg...
- How Memory in Optimization Algorithms Implicitly Modifies the Loss : Abstract: In modern optimization methods used in deep learning, each update depends on the history of previous iterations, often referred to as memory, and this dependence decays fast as the iterates ...
- A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers : Abstract: Recent theoretical results show transformers cannot express sequential reasoning problems over long inputs, intuitively because their computational depth is bounded. However, prior work trea...
- Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models : Abstract: State Space Models (SSMs) are emerging as a compelling alternative to Transformers because of their consistent memory usage and high performance. Despite this, scaling up SSMs on cloud servi...
- Explanations Go Linear: Interpretable and Individual Latent Encoding for Post-hoc Explainability : Abstract: Post-hoc explainability is essential for understanding black-box machine learning models. Surrogate-based techniques are widely used for local and global model-agnostic explanations but have...
- Multimodal Cancer Modeling in the Age of Foundation Model Embeddings : Abstract: The Cancer Genome Atlas (TCGA) has enabled novel discoveries and served as a large-scale reference dataset in cancer through its harmonized genomics, clinical, and imaging data. Numerous pri...
- Learning Dynamics of RNNs in Closed-Loop Environments : Abstract: Recurrent neural networks (RNNs) trained on neuroscience-inspired tasks offer powerful models of brain computation. However, typical training paradigms rely on open-loop, supervised settings...
- Regularized least squares learning with heavy-tailed noise is minimax optimal : Abstract: This paper examines the performance of ridge regression in reproducing kernel Hilbert spaces in the presence of noise that exhibits a finite number of higher moments. We establish excess ris...
- But what is your honest answer? Aiding LLM-judges with honest alternatives using steering vectors : Abstract: Detecting subtle forms of dishonesty like sycophancy and manipulation in Large Language Models (LLMs) remains challenging for both humans and automated evaluators, as these behaviors often a...
- Exact Expressive Power of Transformers with Padding : Abstract: Chain of thought is a natural inference-time method for increasing the computational power of transformer-based large language models (LLMs), but comes at the cost of sequential decoding. Ar...
- Mustafar: Promoting Unstructured Sparsity for KV Cache Pruning in LLM Inference : Abstract: We demonstrate that unstructured sparsity significantly improves KV cache compression for LLMs, enabling sparsity levels up to 70% without compromising accuracy or requiring fine-tuning. We ...
- Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data : Abstract: Incorporating pre-collected offline data from a source environment can significantly improve the sample efficiency of reinforcement learning (RL), but this benefit is often challenged by dis...
- How do Transformers Learn Implicit Reasoning? : Abstract: Recent work suggests that large language models (LLMs) can perform multi-hop reasoning implicitly -- producing correct answers without explicitly verbalizing intermediate steps -- but the un...
- Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training : Abstract: The right batch size is important when training language models at scale: a large batch size is necessary for fast training, but a batch size that is too large will harm token efficiency. To...
- HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts : Abstract: Large language models (LLMs) have shown great success in text modeling tasks across domains. However, natural language exhibits inherent semantic hierarchies and nuanced geometric structure,...
- Learning-at-Criticality in Large Language Models for Quantum Field Theory and Beyond : Abstract: Fundamental physics often confronts complex symbolic problems with few guiding exemplars or established principles. While artificial intelligence (AI) offers promise, its typical need for va...
- Explicit Density Approximation for Neural Implicit Samplers Using a Bernstein-Based Convex Divergence : Abstract: Rank-based statistical metrics, such as the invariant statistical loss (ISL), have recently emerged as robust and practically effective tools for training implicit generative models. In this...
- Approximate non-linear model predictive control with safety-augmented neural networks : Abstract: Model predictive control (MPC) achieves stability and constraint satisfaction for general nonlinear systems, but requires computationally expensive online optimization. This paper studies ap...
- Bridging Generative and Discriminative Noisy-Label Learning via Direction-Agnostic EM Formulation : Abstract: Although noisy-label learning is often approached with discriminative methods for simplicity and speed, generative modeling offers a principled alternative by capturing the joint mechanism t...
- SySMOL: Co-designing Algorithms and Hardware for Neural Networks with Heterogeneous Precisions : Abstract: Ultra-low-precision inference can sharply reduce memory and latency but often degrades accuracy and relies on specialized hardware. We present SONIQ, a system-optimized, noise-injected quant...
- EERO: Early Exit with Reject Option for Efficient Classification with limited budget : Abstract: The increasing complexity of advanced machine learning models requires innovative approaches to manage computational resources effectively. One such method is the Early Exit strategy, which ...
- Beyond State Space Representation: A General Theory for Kernel Packets : Abstract: Gaussian process (GP) regression provides a flexible, nonparametric framework for probabilistic modeling, yet remains computationally demanding in large-scale applications. For one-dimension...
- Projection Methods for Operator Learning and Universal Approximation : Abstract: We obtain a new universal approximation theorem for continuous (possibly nonlinear) operators on arbitrary Banach spaces using the Leray-Schauder mapping. Moreover, we introduce and study a ...
- LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users : Abstract: While state-of-the-art large language models (LLMs) have shown impressive performance on many tasks, there has been extensive research on undesirable model behavior such as hallucinations an...
- Measure-Theoretic Time-Delay Embedding : Abstract: The celebrated Takens' embedding theorem provides a theoretical foundation for reconstructing the full state of a dynamical system from partial observations. However, the classical theorem a...
- QCircuitBench: A Large-Scale Dataset for Benchmarking Quantum Algorithm Design : Abstract: Quantum computing is an emerging field recognized for the significant speedup it offers over classical computing through quantum algorithms. However, designing and implementing quantum algor...
- Dispersion based Recurrent Neural Network Model for Methane Monitoring in Albertan Tailings Ponds : Abstract: Bitumen extraction for the production of synthetic crude oil in Canada's Athabasca Oil Sands industry has recently come under the spotlight for being a significant source of greenhouse gas emiss...
- Applying Time Series Deep Learning Models to Forecast the Growth of Perennial Ryegrass in Ireland : Abstract: Grasslands, constituting the world's second-largest terrestrial carbon sink, play a crucial role in biodiversity and the regulation of the carbon cycle. Currently, the Irish dairy sector, a ...
- Federated Learning with Gramian Angular Fields for Privacy-Preserving ECG Classification on Heterogeneous IoT Devices : Abstract: This study presents a federated learning (FL) framework for privacy-preserving electrocardiogram (ECG) classification in Internet of Things (IoT) healthcare environments. By transforming 1D ...
- Laugh, Relate, Engage: Stylized Comment Generation for Short Videos : Abstract: Short-video platforms have become a central medium in the modern Internet landscape, where efficient information delivery and strong interactivity are reshaping user engagement and cultural ...
- What's in Common? Multimodal Models Hallucinate When Reasoning Across Scenes : Abstract: Multimodal language models possess a remarkable ability to handle an open-vocabulary's worth of objects. Yet the best models still suffer from hallucinations when reasoning about scenes in t...
- Contamination Detection for VLMs using Multi-Modal Semantic Perturbation : Abstract: Recent advances in Vision-Language Models (VLMs) have achieved state-of-the-art performance on numerous benchmark tasks. However, the use of internet-scale, often proprietary, pretraining co...
- FusionDP: Foundation Model-Assisted Differentially Private Learning for Partially Sensitive Features : Abstract: Ensuring the privacy of sensitive training data is crucial in privacy-preserving machine learning. However, in practical scenarios, privacy protection may be required for only a subset of fe...
- Fair and Explainable Credit-Scoring under Concept Drift: Adaptive Explanation Frameworks for Evolving Populations : Abstract: Evolving borrower behaviors, shifting economic conditions, and changing regulatory landscapes continuously reshape the data distributions underlying modern credit-scoring systems. Convention...
- Optimizing Reasoning Efficiency through Prompt Difficulty Prediction : Abstract: Reasoning language models perform well on complex tasks but are costly to deploy due to their size and long reasoning traces. We propose a routing approach that assigns each problem to the s...
- One Size Does Not Fit All: Architecture-Aware Adaptive Batch Scheduling with DEBA : Abstract: Adaptive batch size methods aim to accelerate neural network training, but existing approaches apply identical adaptation strategies across all architectures, assuming a one-size-fits-all so...
- Sketch-Augmented Features Improve Learning Long-Range Dependencies in Graph Neural Networks : Abstract: Graph Neural Networks learn on graph-structured data by iteratively aggregating local neighborhood information. While this local message passing paradigm imparts a powerful inductive bias an...
- From Static to Dynamic: Enhancing Offline-to-Online Reinforcement Learning via Energy-Guided Diffusion Stratification : Abstract: Transitioning from offline to online reinforcement learning (RL) poses critical challenges due to distributional shifts between the fixed behavior policy in the offline dataset and the evolv...
- Higher-Order Causal Structure Learning with Additive Models : Abstract: Causal structure learning has long been the central task of inferring causal insights from data. Despite the abundance of real-world processes exhibiting higher-order mechanisms, however, an...
- Enhancing Q-Value Updates in Deep Q-Learning via Successor-State Prediction : Abstract: Deep Q-Networks (DQNs) estimate future returns by learning from transitions sampled from a replay buffer. However, the target updates in DQN often rely on next states generated by actions fr...
- Benchmark Datasets for Lead-Lag Forecasting on Social Platforms : Abstract: Social and collaborative platforms emit multivariate time-series traces in which early interactions -- such as views, likes, or downloads -- are followed, sometimes months or years later, by highe...
- DecoHD: Decomposed Hyperdimensional Classification under Extreme Memory Budgets : Abstract: Decomposition is a proven way to shrink deep networks without changing I/O. We bring this idea to hyperdimensional computing (HDC), where footprint cuts usually shrink the feature axis and e...
- On Predicting Sociodemographics from Mobility Signals : Abstract: Inferring sociodemographic attributes from mobility data could help transportation planners better leverage passively collected datasets, but this task remains difficult due to weak and inco...
- SynQuE: Estimating Synthetic Dataset Quality Without Annotations : Abstract: We introduce and formalize the Synthetic Dataset Quality Estimation (SynQuE) problem: ranking synthetic datasets by their expected real-world task performance using only limited unannotated ...
- NVIDIA Nemotron Nano V2 VL : Abstract: We introduce Nemotron Nano V2 VL, the latest model of the Nemotron vision-language series designed for strong real-world document understanding, long video comprehension, and reasoning tasks...
- LogHD: Robust Compression of Hyperdimensional Classifiers via Logarithmic Class-Axis Reduction : Abstract: Hyperdimensional computing (HDC) suits memory, energy, and reliability-constrained systems, yet the standard "one prototype per class" design requires $O(CD)$ memory (with $C$ classes and di...
- RLHF: A comprehensive Survey for Cultural, Multimodal and Low Latency Alignment Methods : Abstract: Reinforcement Learning from Human Feedback (RLHF) is the standard for aligning Large Language Models (LLMs), yet recent progress has moved beyond canonical text-based methods. This survey sy...
- Conditional Score Learning for Quickest Change Detection in Markov Transition Kernels : Abstract: We address the problem of quickest change detection in Markov processes with unknown transition kernels. The key idea is to learn the conditional score $\nabla_{\mathbf{y}} \log p(\mathbf{y}...
- PrivacyCD: Hierarchical Unlearning for Protecting Student Privacy in Cognitive Diagnosis : Abstract: The need to remove specific student data from cognitive diagnosis (CD) models has become a pressing requirement, driven by users' growing assertion of their "right to be forgotten". However,...
- Non-Asymptotic Optimization and Generalization Bounds for Stochastic Gauss-Newton in Overparameterized Models : Abstract: An important question in deep learning is how higher-order optimization methods affect generalization. In this work, we analyze a stochastic Gauss-Newton (SGN) method with Levenberg-Marquard...
- PETRA: Pretrained Evolutionary Transformer for SARS-CoV-2 Mutation Prediction : Abstract: Since its emergence, SARS-CoV-2 has demonstrated a rapid and unpredictable evolutionary trajectory, characterized by the continual emergence of immune-evasive variants. This poses persistent...
- Structural Priors and Modular Adapters in the Composable Fine-Tuning Algorithm of Large-Scale Models : Abstract: This paper proposes a composable fine-tuning method that integrates graph structural priors with modular adapters to address the high computational cost and structural instability faced by l...
- TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training : Abstract: We introduce TwIST, a distributed training framework for efficient large language model (LLM) sparsification. TwIST trains multiple subnetworks in parallel, periodically aggregates their par...
- Use of Continuous Glucose Monitoring with Machine Learning to Identify Metabolic Subphenotypes and Inform Precision Lifestyle Changes : Abstract: The classification of diabetes and prediabetes by static glucose thresholds obscures the pathophysiological dysglycemia heterogeneity, primarily driven by insulin resistance (IR), beta-cell ...
- Multiscale Astrocyte Network Calcium Dynamics for Biologically Plausible Intelligence in Anomaly Detection : Abstract: Network anomaly detection systems encounter several challenges with traditional detectors trained offline. They become susceptible to concept drift and new threats such as zero-day or polymo...
- Towards Scalable Meta-Learning of near-optimal Interpretable Models via Synthetic Model Generations : Abstract: Decision trees are widely used in high-stakes fields like finance and healthcare due to their interpretability. This work introduces an efficient, scalable method for generating synthetic pr...
- Accelerating scientific discovery with the common task framework : Abstract: Machine learning (ML) and artificial intelligence (AI) algorithms are transforming and empowering the characterization and control of dynamic systems in the engineering, physical, and biolog...
- Memory- and Latency-Constrained Inference of Large Language Models via Adaptive Split Computing : Abstract: Large language models (LLMs) have achieved near-human performance across diverse reasoning tasks, yet their deployment on resource-constrained Internet-of-Things (IoT) devices remains imprac...
- Enhancing Multimodal Protein Function Prediction Through Dual-Branch Dynamic Selection with Reconstructive Pre-Training : Abstract: Multimodal protein features play a crucial role in protein function prediction. However, these features encompass a wide range of information, ranging from structural data and sequence featu...
- DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization : Abstract: Quantization plays a crucial role in accelerating the inference of large-scale models, and rotational matrices have been shown to effectively improve quantization performance by smoothing ou...
- Pediatric Appendicitis Detection from Ultrasound Images : Abstract: Pediatric appendicitis remains one of the most common causes of acute abdominal pain in children, and its diagnosis continues to challenge clinicians due to overlapping symptoms and variable...
- Left Atrial Segmentation with nnU-Net Using MRI : Abstract: Accurate segmentation of the left atrium (LA) from cardiac MRI is critical for guiding atrial fibrillation (AF) ablation and constructing biophysical cardiac models. Manual delineation is ti...
- Learning Filter-Aware Distance Metrics for Nearest Neighbor Search with Multiple Filters : Abstract: Filtered Approximate Nearest Neighbor (ANN) search retrieves the closest vectors for a query vector from a dataset. It enforces that a specified set of discrete labels $S$ for the query must...
- DeNoise: Learning Robust Graph Representations for Unsupervised Graph-Level Anomaly Detection : Abstract: With the rapid growth of graph-structured data in critical domains, unsupervised graph-level anomaly detection (UGAD) has become a pivotal task. UGAD seeks to identify entire graphs that dev...
- KoTaP: A Panel Dataset for Corporate Tax Avoidance, Performance, and Governance in Korea : Abstract: This study introduces the Korean Tax Avoidance Panel (KoTaP), a long-term panel dataset of non-financial firms listed on KOSPI and KOSDAQ between 2011 and 2024. After excluding financial fir...
- Decomposable Neuro Symbolic Regression : Abstract: Symbolic regression (SR) models complex systems by discovering mathematical expressions that capture underlying relationships in observed data. However, most SR methods prioritize minimizing...
- Exploring the Feasibility of End-to-End Large Language Model as a Compiler : Abstract: In recent years, end-to-end Large Language Model (LLM) technology has shown substantial advantages across various domains. As critical system software and infrastructure, compilers are respo...
- Exchange Policy Optimization Algorithm for Semi-Infinite Safe Reinforcement Learning : Abstract: Safe reinforcement learning (safe RL) aims to respect safety requirements while optimizing long-term performance. In many practical applications, however, the problem involves an infinite nu...
- Learning to Land Anywhere: Transferable Generative Models for Aircraft Trajectories : Abstract: Access to trajectory data is a key requirement for developing and validating Air Traffic Management (ATM) solutions, yet many secondary and regional airports face severe data scarcity. This ...
- Deep Learning Approach for Clinical Risk Identification Using Transformer Modeling of Heterogeneous EHR Data : Abstract: This study proposes a Transformer-based longitudinal modeling method to address challenges in clinical risk classification with heterogeneous Electronic Health Record (EHR) data, including i...
- On Joint Regularization and Calibration in Deep Ensembles : Abstract: Deep ensembles are a powerful tool in machine learning, improving both model performance and uncertainty calibration. While ensembles are typically formed by training and tuning models indiv...
- ScaleDL: Towards Scalable and Efficient Runtime Prediction for Distributed Deep Learning Workloads : Abstract: Deep neural networks (DNNs) form the cornerstone of modern AI services, supporting a wide range of applications, including autonomous driving, chatbots, and recommendation systems. As models...
- Block Rotation is All You Need for MXFP4 Quantization : Abstract: Large language models (LLMs) have achieved remarkable success, but their rapidly growing scale imposes prohibitive costs in memory, computation, and energy. Post-training quantization (PTQ) ...
- The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms : Abstract: The strong lottery ticket hypothesis (SLTH) conjectures that high-performing subnetworks, called strong lottery tickets (SLTs), are hidden in randomly initialized neural networks. Although r...
- seqme: a Python library for evaluating biological sequence design : Abstract: Recent advances in computational methods for designing biological sequences have sparked the development of metrics to evaluate these methods' performance in terms of the fidelity of the desi...
- Guided by Stars: Interpretable Concept Learning Over Time Series via Temporal Logic Semantics : Abstract: Time series classification is a task of paramount importance, as this kind of data often arises in safety-critical applications. However, it is typically tackled with black-box deep learning...
Research Sources: 411 | Generated: 11/7/2025
