AI RESEARCH PAPERS & ACADEMIC SOURCES
- So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection
- Revisiting Reweighted Risk for Calibration: AURC, Focal, and Inverse Focal Loss
- RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation
- WaveNet-SF: A Hybrid Network for Retinal Disease Detection Based on Wavelet Transform in Spatial-Frequency Domain
- GCVAMD: A Modified CausalVAE Model for Causal Age-related Macular Degeneration Risk Factor Detection and Prediction
- PyRadiomics-cuda: a GPU-accelerated 3D features extraction from medical images within PyRadiomics
- Neural Posterior Estimation with Autoregressive Tiling for Detecting Objects in Astronomical Images
- Filter-Guided Diffusion for Controllable Image Generation
- Comparing YOLOv8 and Mask R-CNN for instance segmentation in complex orchard environments
- A Survey of Defenses against AI-generated Visual Media: Detection, Disruption, and Authentication
- Toward a Holistic Evaluation of Robustness in CLIP Models
- Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection
- Ranked from Within: Ranking Large Multimodal Models Without Labels
- SoccerSynth-Detection: A Synthetic Dataset for Soccer Player Detection
- Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information
- EFC++: Elastic Feature Consolidation with Prototype Re-balancing for Cold Start Exemplar-free Incremental Learning
- Vehicle-Scene Interaction: A Text-Driven 3D Lidar Place Recognition Method for Autonomous Driving
- Enhancing Monocular Height Estimation via Sparse LiDAR-Guided Correction
- Not every day is a sunny day: Synthetic cloud injection for deep land cover segmentation robustness evaluation across data sources
- PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
- InsideOut: An EfficientNetV2-S Based Deep Learning Framework for Robust Multi-Class Facial Emotion Recognition
- Latent Diffusion Unlearning: Protecting Against Unauthorized Personalization Through Trajectory Shifted Perturbations
- GeoComplete: Geometry-Aware Diffusion for Reference-Driven Image Completion
- Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction
- Dynamic Prompt Generation for Interactive 3D Medical Image Segmentation Training
- Product-Quantised Image Representation for High-Quality Image Synthesis
- Memory Forcing: Spatio-Temporal Memory for Consistent Scene Generation on Minecraft
- MonSTeR: a Unified Model for Motion, Scene, Text Retrieval
- MIXER: Mixed Hyperspherical Random Embedding Neural Network for Texture Recognition
- LEAML: Label-Efficient Adaptation to Out-of-Distribution Visual Tasks for Multimodal Large Language Models
- A UAV-Based VNIR Hyperspectral Benchmark Dataset for Landmine and UXO Detection
- Image Enhancement Based on Pigment Representation
- Retrv-R1: A Reasoning-Driven MLLM Framework for Universal and Efficient Multimodal Retrieval
- Bayesian Test-time Adaptation for Object Recognition and Detection with Vision-language Models
- AdaRD-key: Adaptive Relevance-Diversity Keyframe Sampling for Long-form Video understanding
- Reasoning Riddles: How Explainability Reveals Cognitive Limits in Vision-Language Models
- OTR: Synthesizing Overlay Text Dataset for Text Removal
- Med-K2N: Flexible K-to-N Modality Translation for Medical Image Synthesis
- One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework
- Training-Free Out-Of-Distribution Segmentation With Foundation Models
- Don't Just Chase "Highlighted Tokens" in MLLMs: Revisiting Visual Holistic Context Retention
- Zero-Shot Robustness of Vision Language Models Via Confidence-Aware Weighting
- Flip Distribution Alignment VAE for Multi-Phase MRI Synthesis
- TIT-Score: Evaluating Long-Prompt Based Text-to-Image Alignment via Text-to-Image-to-Text Consistency
- Towards Scalable and Consistent 3D Editing
- Exploring OCR-augmented Generation for Bilingual VQA
- PhysHMR: Learning Humanoid Control Policies from Vision for Physically Plausible Human Motion Reconstruction
- Unlocking the power of partnership: How humans and machines can work together to improve face recognition
- PEO: Training-Free Aesthetic Quality Enhancement in Pre-Trained Text-to-Image Diffusion Models with Prompt Embedding Optimization
- Ego-Exo 3D Hand Tracking in the Wild with a Mobile Multi-Camera Rig
- Input-Aware Sparse Attention for Real-Time Co-Speech Video Generation
- Deep Generative Continual Learning using Functional LoRA: FunLoRA
- Sequence-Preserving Dual-FoV Defense for Traffic Sign and Light Recognition in Autonomous Vehicles
- Smart-GRPO: Smartly Sampling Noise for Efficient RL of Flow-Matching Models
- MoGIC: Boosting Motion Generation via Intention Understanding and Visual Context
- From Tokens to Nodes: Semantic-Guided Motion Control for Dynamic 3D Gaussian Splatting
- Net2Net: When Un-trained Meets Pre-trained Networks for Robust Real-World Denoising
- On the Diminishing Returns of Complex Robust RAG Training in the Era of Powerful LLMs
- BottleHumor: Self-Informed Humor Explanation using the Information Bottleneck Principle
- From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM
- Same evaluation, more tokens: On the effect of input length for machine translation evaluation using Large Language Models
- EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments
- Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
- Did I Faithfully Say What I Thought? Bridging the Gap Between Neural Activity and Self-Explanations in Large Language Models
- Query-Level Uncertainty in Large Language Models
- Triadic Multi-party Voice Activity Projection for Turn-taking in Spoken Dialogue Systems
- The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models
- From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding
- Leave No TRACE: Black-box Detection of Copyrighted Dataset Usage in Large Language Models via Watermarking
- Revisiting Direct Speech-to-Text Translation with Speech LLMs: Better Scaling than CoT Prompting?
- Semantic Similarity in Radiology Reports via LLMs and NER
- Listening or Reading? Evaluating Speech Awareness in Chain-of-Thought Speech-to-Text Translation
- SurveyBench: How Well Can LLM(-Agents) Write Academic Surveys?
- Beyond the Final Layer: Intermediate Representations for Better Multilingual Calibration in Large Language Models
- EditLens: Quantifying the Extent of AI Editing in Text
- Neural Correlates of Language Models Are Specific to Human Language
- Model-Based Ranking of Source Languages for Zero-Shot Cross-Lingual Transfer
- FocusAgent: Simple Yet Effective Ways of Trimming the Large Context of Web Agents
- SpeechCT-CLIP: Distilling Text-Image Knowledge to Speech for Voice-Native Multimodal CT Analysis
- When Names Disappear: Revealing What LLMs Actually Understand About Code
- Did Translation Models Get More Robust Without Anyone Even Noticing?
- ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference
- KnowledgeSmith: Uncovering Knowledge Updating in LLMs with Model Editing and Unlearning
- Retrieval and Augmentation of Domain Knowledge for Text-to-SQL Semantic Parsing
- Transcribe, Translate, or Transliterate: An Investigation of Intermediate Representations in Spoken Language Models
- Evaluation Framework for Highlight Explanations of Context Utilisation in Language Models
- Mind the Gap: Linguistic Divergence and Adaptation Strategies in Human-LLM Assistant vs. Human-Human Interactions
- SoT: Structured-of-Thought Prompting Guides Multilingual Reasoning in Large Language Models
- Self-Improvement in Multimodal Large Language Models: A Survey
- PGMEL: Policy Gradient-based Generative Adversarial Network for Multimodal Entity Linking
- IndiCASA: A Dataset and Bias Evaluation Framework in LLMs Using Contrastive Embedding Similarity in the Indian Context
- The Path of Self-Evolving Large Language Models: Achieving Data-Efficient Learning via Intrinsic Feedback
- XTRA: Cross-Lingual Topic Modeling with Topic and Representation Alignments
- Self-Reflective Generation at Test Time
- Finding Diamonds in Conversation Haystacks: A Benchmark for Conversational Data Retrieval
- Learning to Route: A Rule-Driven Agent Framework for Hybrid-Source Retrieval-Augmented Generation
- Understanding How CodeLLMs (Mis)Predict Types with Activation Steering
- Programming with Pixels: Can Computer-Use Agents do Software Engineering?
- DeepGDel: Deep Learning-based Gene Deletion Prediction Framework for Growth-Coupled Production in Genome-Scale Metabolic Models
- On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm
- Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward
- Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO
- QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation
- Risk-Sensitive Agent Compositions
- Generative Modeling of Weights: Generalization or Memorization?
- Modern Methods in Associative Memory
- FR-LUX: Friction-Aware, Regime-Conditioned Policy Optimization for Implementable Portfolio Management
- The Computational Complexity of Almost Stable Clustering with Penalties
- ReeMark: Reeb Graphs for Simulating Patterns of Life in Spatiotemporal Trajectories
- Improving Online-to-Nonconvex Conversion for Smooth Optimization via Double Optimism
- Automatic Generation of Digital Twins for Network Testing
- Joint Bidding on Intraday and Frequency Containment Reserve Markets
- Cache-to-Cache: Direct Semantic Communication Between Large Language Models
- The Challenges of Hyperparameter Tuning for Accurate Causal Effect Estimation
- Wasserstein Bounds for generative diffusion models with Gaussian tail targets
- ColNet: Collaborative Optimization in Decentralized Federated Multi-task Learning Systems
- Best Policy Learning from Trajectory Preference Feedback
- On the Effect of Sampling Diversity in Scaling LLM Inference
- To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning
- Heterogeneous Graph Representation of Stiffened Panels with Non-Uniform Boundary Conditions and Loads
- Unraveling Syntax: How Language Models Learn Context-Free Grammars
- Self-supervised diffusion model fine-tuning for costate initialization using Markov chain Monte Carlo
- Even Faster Kernel Matrix Linear Algebra via Density Estimation
- FLOWR.root: A flow matching based foundation model for joint multi-purpose structure-aware 3D ligand generation and affinity prediction
- Uncertainty as Feature Gaps: Epistemic Uncertainty Quantification of LLMs in Contextual Question-Answering
- Quantitative Convergence Analysis of Projected Stochastic Gradient Descent for Non-Convex Losses via the Goldstein Subdifferential
- The land use-climate change-biodiversity nexus in European islands stakeholders
- ELMF4EggQ: Ensemble Learning with Multimodal Feature Fusion for Non-Destructive Egg Quality Assessment
- SALSA-V: Shortcut-Augmented Long-form Synchronized Audio from Videos
- Mechanistic Interpretability of Code Correctness in LLMs via Sparse Autoencoders
- Scalable Quantum Optimisation using HADOF: Hamiltonian Auto-Decomposition Optimisation Framework
- oRANS: Online optimisation of RANS machine learning models with embedded DNS data generation
- Q-Learning with Shift-Aware Upper Confidence Bound in Non-Stationary Reinforcement Learning
- PRISM-Physics: Causal DAG-Based Process Evaluation for Physics Reasoning
- Superposition disentanglement of neural representations reveals hidden alignment
- Estimation of Resistance Training RPE using Inertial Sensors and Electromyography
- To Distill or Decide? Understanding the Algorithmic Trade-off in Partially Observable Reinforcement Learning
- Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
- WEE-Therapy: A Mixture of Weak Encoders Framework for Psychological Counseling Dialogue Analysis
- Can Prompts Rewind Time for LLMs? Evaluating the Effectiveness of Prompted Knowledge Cutoffs
- An Senegalese Legal Texts Structuration Using LLM-augmented Knowledge Graph
- Modeling the language cortex with form-independent and enriched representations of sentence meaning reveals remarkable semantic abstractness
- An Encoder-Decoder Network for Beamforming over Sparse Large-Scale MIMO Channels
- Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems
- The Equilibrium Response of Atmospheric Machine-Learning Models to Uniform Sea Surface Temperature Warming
- Words That Make Language Models Perceive
- Confidence and Dispersity as Signals: Unsupervised Model Evaluation and Ranking
- Distributional Inverse Reinforcement Learning
- Lightweight Transformer for EEG Classification via Balanced Signed Graph Algorithm Unrolling
- Bayesian E(3)-Equivariant Interatomic Potential with Iterative Restratification of Many-body Message Passing
- Bootstrap Learning for Combinatorial Graph Alignment with Sequential GNNs
- Adaptive Node Feature Selection For Graph Neural Networks
- AdaBet: Gradient-free Layer Selection for Efficient Training of Deep Neural Networks
- Real Time Headway Predictions in Urban Rail Systems and Implications for Service Control: A Deep Learning Approach
- Enhancing XAI Narratives through Multi-Narrative Refinement and Knowledge Distillation
- Taming Imperfect Process Verifiers: A Sampling Perspective on Backtracking
- Mixture of Many Zero-Compute Experts: A High-Rate Quantization Theory Perspective
- Calibrated Uncertainty Sampling for Active Learning
- FTTE: Federated Learning on Resource-Constrained Devices
- TokenFlow: Responsive LLM Text Streaming Serving under Request Burst via Preemptive Scheduling
- Curl Descent: Non-Gradient Learning Dynamics with Sign-Diverse Plasticity
- A Granular Study of Safety Pretraining under Model Abliteration
- Optimal Rates for Generalization of Gradient Descent for Deep ReLU Classification
- Mitigating Spurious Correlation via Distributionally Robust Learning with Hierarchical Ambiguity Sets
- Online Learning in the Random Order Model
- FlexiQ: Adaptive Mixed-Precision Quantization for Latency/Accuracy Trade-Offs in Deep Neural Networks
- The Curious Case of In-Training Compression of State Space Models
- Multi-scale Autoregressive Models are Laplacian, Discrete, and Latent Diffusion Models in Disguise
- Subject-Adaptive Sparse Linear Models for Interpretable Personalized Health Prediction from Multimodal Lifelog Data
- RoiRL: Efficient, Self-Supervised Reasoning with Offline Iterative Reinforcement Learning
- Learning Explicit Single-Cell Dynamics Using ODE Representations
- RAxSS: Retrieval-Augmented Sparse Sampling for Explainable Variable-Length Medical Time Series Classification
- ContextFlow: Context-Aware Flow Matching For Trajectory Inference From Spatial Omics Data
- AttentiveGRUAE: An Attention-Based GRU Autoencoder for Temporal Clustering and Behavioral Characterization of Depression from Wearable Data
- On The Expressive Power of GNN Derivatives
- Geospatial Machine Learning Libraries
- Use the Online Network If You Can: Towards Fast and Stable Reinforcement Learning
- Towards CONUS-Wide ML-Augmented Conceptually-Interpretable Modeling of Catchment-Scale Precipitation-Storage-Runoff Dynamics
- TabImpute: Accurate and Fast Zero-Shot Missing-Data Imputation with a Pre-Trained Transformer
- HyperAdaLoRA: Accelerating LoRA Rank Allocation During Training via Hypernetworks without Sacrificing Performance
- Optimal Characteristics of Inspection Vehicle for Drive-by Bridge Inspection
- Topological Invariance and Breakdown in Learning
- EvoSpeak: Large Language Models for Interpretable Genetic Programming-Evolved Heuristics
- Accuracy Law for the Future of Deep Time Series Forecasting
- Dale meets Langevin: A Multiplicative Denoising Diffusion Model
- Hybrid-Collaborative Augmentation and Contrastive Sample Adaptive-Differential Awareness for Robust Attributed Graph Clustering
- OpenTSLM: Time-Series Language Models for Reasoning over Multivariate Medical Text- and Time-Series Data
- Assessing the Potential for Catastrophic Failure in Dynamic Post-Training Quantization
- SAGE: Streaming Agreement-Driven Gradient Sketches for Representative Subset Selection
- Uncertainty-Guided Model Selection for Tabular Foundation Models in Biomolecule Efficacy Prediction
- Improved Robustness of Deep Reinforcement Learning for Control of Time-Varying Systems by Bounded Extremum Seeking
- Beyond Imitation: Recovering Dense Rewards from Demonstrations
- In-memory Training on Analog Devices with Limited Conductance States via Multi-tile Residual Learning
- Graph Generation with Spectral Geodesic Flow Matching
- Model-brain comparison using inter-animal transforms
- STORI: A Benchmark and Taxonomy for Stochastic Environments
- PropRAG: Guiding Retrieval with Beam Search over Proposition Paths
- AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
- Continuous Thought Machines
- A Survey of Deep Learning for Complex Speech Spectrograms
- OT Score: An OT based Confidence Score for Source Free Unsupervised Domain Adaptation
- Pre-training Limited Memory Language Models with Internal and External Knowledge
- NeSyGeo: A Neuro-Symbolic Framework for Multimodal Geometric Reasoning Data Generation
- Manipulating 3D Molecules in a Fixed-Dimensional E(3)-Equivariant Latent Space
- SP-VLA: A Joint Model Scheduling and Token Pruning Approach for VLA Model Acceleration
- A Survey of Pun Generation: Datasets, Evaluations and Methodologies
- Unified Domain Adaptive Semantic Segmentation
- RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives
- Optimizing Container Loading and Unloading through Dual-Cycling and Dockyard Rehandle Reduction Using a Hybrid Genetic Algorithm
- Inferring Pluggable Types with Machine Learning
- CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification
- MarketSenseAI 2.0: Enhancing Stock Analysis through LLM Agents
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
- DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation
- Towards Quantifying Long-Range Interactions in Graph Machine Learning: a Large Graph Dataset and a Measurement
- Verbosity Tradeoffs and the Impact of Scale on the Faithfulness of LLM Self-Explanations
- Not a nuisance but a useful heuristic: Outlier dimensions favor frequent tokens in language models
- Activated LoRA: Fine-tuned LLMs for Intrinsics
- Abstain and Validate: A Dual-LLM Policy for Reducing Noise in Agentic Program Repair
- Self-Anchor: Large Language Model Reasoning via Step-by-step Attention Alignment
- Test-Time Defense Against Adversarial Attacks via Stochastic Resonance of Latent Ensembles
- Improving GUI Grounding with Explicit Position-to-Coordinate Mapping
- Reward Models are Metrics in a Trench Coat
- Improved Monte Carlo Planning via Causal Disentanglement for Structurally-Decomposed Markov Decision Processes
- MIRROR: Modular Internal Processing for Personalized Safety in LLM Dialogue
- V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving
- LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers
- THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning
- ZeroShotOpt: Towards Zero-Shot Pretrained Models for Efficient Black-Box Optimization
- Semantic Differentiation in Speech Emotion Recognition: Insights from Descriptive and Expressive Speech Roles
- Comparative Analysis of Parameterized Action Actor-Critic Reinforcement Learning Algorithms for Web Search Match Plan Generation
- A Unified Deep Reinforcement Learning Approach for Close Enough Traveling Salesman Problem
- A Study of Neural Polar Decoders for Communication
- What Drives Compositional Generalization in Visual Generative Models?
- HAVIR: HierArchical Vision to Image Reconstruction using CLIP-Guided Versatile Diffusion
- Signature-Informed Transformer for Asset Allocation
- Stimulus-Voltage-Based Prediction of Action Potential Onset Timing: Classical vs. Quantum-Inspired Approaches
- SpineBench: A Clinically Salient, Level-Aware Benchmark Powered by the SpineMed-450k Corpus
- UniShield: An Adaptive Multi-Agent Framework for Unified Forgery Image Detection and Localization
- Topic Modeling as Long-Form Generation: Can Long-Context LLMs revolutionize NTM via Zero-Shot Prompting?
- Wave-GMS: Lightweight Multi-Scale Generative Model for Medical Image Segmentation
- Constraint Satisfaction Approaches to Wordle: Novel Heuristics and Cross-Lexicon Validation
- Representing Beauty: Towards a Participatory but Objective Latent Aesthetics
- Global Convergence of Policy Gradient for Entropy Regularized Linear-Quadratic Control with multiplicative noise
- FinReflectKG - MultiHop: Financial QA Benchmark for Reasoning with Knowledge Graph Evidence
- FeDABoost: Fairness Aware Federated Learning with Adaptive Boosting
- Multimodal Carotid Risk Stratification with Large Vision-Language Models: Benchmarking, Fine-Tuning, and Clinical Insights
- Ergodic Risk Measures: Towards a Risk-Aware Foundation for Continual Reinforcement Learning
- Corrosion Risk Estimation for Heritage Preservation: An Internet of Things and Machine Learning Approach Using Temperature and Humidity
- From high-frequency sensors to noon reports: Using transfer learning for shaft power prediction in maritime
- BrainIB++: Leveraging Graph Neural Networks and Information Bottleneck for Functional Brain Biomarkers in Schizophrenia
- Learning Robust Diffusion Models from Imprecise Supervision
- Investigating The Smells of LLM Generated Code
- When and Where do Events Switch in Multi-Event Video Generation?
- TravelBench: Exploring LLM Performance in Low-Resource Domains
- SAE-RNA: A Sparse Autoencoder Model for Interpreting RNA Language Model Representations
- Hierarchical Generalized Category Discovery for Brain Tumor Classification in Digital Pathology
- Fusing Multi- and Hyperspectral Satellite Data for Harmful Algal Bloom Monitoring with Self-Supervised and Hierarchical Deep Learning
- Align Your Query: Representation Alignment for Multimodality Medical Object Detection
- MaskCD: Mitigating LVLM Hallucinations by Image Head Masked Contrastive Decoding
- Pareto-optimal Non-uniform Language Generation
- OptunaHub: A Platform for Black-Box Optimization
- Relevance-Aware Thresholding in Online Conformal Prediction for Time Series
- Dissecting Transformers: A CLEAR Perspective towards Green AI
- A Computational Framework for Interpretable Text-Based Personality Assessment from Social Media
- Evaluating Large Language Models for IUCN Red List Species Information
- Knowledge-Aware Modeling with Frequency Adaptive Learning for Battery Health Prognostics
- Flamed-TTS: Flow Matching Attention-Free Models for Efficient Generating and Dynamic Pacing Zero-shot Text-to-Speech
- Knowledge-Graph Based RAG System Evaluation Framework
- Oracle-RLAIF: An Improved Fine-Tuning Framework for Multi-modal Video Models through Reinforcement Learning from Ranking Feedback
- How Confident are Video Models? Empowering Video Models to Express their Uncertainty
- MINERVA: Mutual Information Neural Estimation for Supervised Feature Selection
- Automatic Building Code Review: A Case Study
- TutorBench: A Benchmark To Assess Tutoring Capabilities Of Large Language Models
- HALO: Memory-Centric Heterogeneous Accelerator with 2.5D Integration for Low-Batch LLM Inference
- To Compress or Not? Pushing the Frontier of Lossless GenAI Model Weights Compression with Exponent Concentration
- Can Data-Driven Dynamics Reveal Hidden Physics? There Is A Need for Interpretable Neural Operators
- Fine-Tuning Diffusion Models via Intermediate Distribution Shaping
- RAMAC: Multimodal Risk-Aware Offline Reinforcement Learning and the Role of Behavior Regularization
- Time-To-Inconsistency: A Survival Analysis of Large Language Model Robustness to Adversarial Attacks
- Fully automated inverse co-optimization of templates and block copolymer blending recipes for DSA lithography
- CWM: An Open-Weights LLM for Research on Code Generation with World Models
- Linear RNNs for autoregressive generation of long music samples
- Glaucoma Detection and Structured OCT Report Generation via a Fine-tuned Multimodal Large Language Model
- Extreme value forecasting using relevance-based data augmentation with deep learning models
- RainSeer: Fine-Grained Rainfall Reconstruction via Physics-Guided Modeling
- Cross-Platform DNA Methylation Classifier for the Eight Molecular Subtypes of Group 3 & 4 Medulloblastoma
- NEURODNAAI: Neural pipeline approaches for the advancing dna-based information storage as a sustainable digital medium using deep learning framework
- How to Train Your Advisor: Steering Black-Box LLMs with Advisor Models
- Market-Based Data Subset Selection -- Principled Aggregation of Multi-Criteria Example Utility
- CLARITY: Clinical Assistant for Routing, Inference, and Triage
- Litespark Technical Report: High-Throughput, Energy-Efficient LLM Training Framework
- From Pixels to Factors: Learning Independently Controllable State Variables for Reinforcement Learning
- PHORECAST: Enabling AI Understanding of Public Health Outreach Across Populations
- Breaking the MoE LLM Trilemma: Dynamic Expert Clustering with Structured Compression
- Small Language Models for Curriculum-based Guidance
- mini-vec2vec: Scaling Universal Geometry Alignment with Linear Transformations
- LLMSQL: Upgrading WikiSQL for the LLM Era of Text-to-SQL
- Language, Culture, and Ideology: Personalizing Offensiveness Detection in Political Tweets with Reasoning LLMs
- Evaluating Bias in Spoken Dialogue LLMs for Real-World Decisions and Recommendations
- DiffuSpec: Unlocking Diffusion Language Models for Speculative Decoding
- Emission-GPT: A domain-specific language model agent for knowledge retrieval, emission inventory and data analysis
- Spiral of Silence in Large Language Model Agents
- ChunkLLM: A Lightweight Pluggable Framework for Accelerating LLMs Inference
- A Cross-Lingual Analysis of Bias in Large Language Models Using Romanian History
- Beyond Manuals and Tasks: Instance-Level Context Learning for LLM Agents
- Training Dynamics of Parametric and In-Context Knowledge Utilization in Language Models
- Pretraining with hierarchical memories: separating long-tail and common knowledge
- KAME: Tandem Architecture for Enhancing Knowledge in Real-Time Speech-to-Speech Conversational AI
- AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question Answering
- SelfJudge: Faster Speculative Decoding via Self-Supervised Judge Verification
- EntropyLong: Effective Long-Context Training via Predictive Uncertainty
- Synthetic Dialogue Generation for Interactive Conversational Elicitation & Recommendation (ICER)
- Human Mobility Datasets Enriched With Contextual and Social Dimensions
- Where Did It Go Wrong? Attributing Undesirable LLM Behaviors via Representation Gradient Tracing
- FormalML: A Benchmark for Evaluating Formal Subgoal Completion in Machine Learning Theory
- KurdSTS: The Kurdish Semantic Textual Similarity
- CRACQ: A Multi-Dimensional Approach To Automated Document Assessment
- Optimizing Long-Form Clinical Text Generation with Claim-Based Rewards
- Evaluating Uncertainty Quantification Methods in Argumentative Large Language Models
- DRIFT: Learning from Abundant User Dissatisfaction in Real-World Preference Learning
- $\texttt{BluePrint}$: A Social Media User Dataset for LLM Persona Evaluation and Training
- Onto-Epistemological Analysis of AI Explanations
- From Facts to Foils: Designing and Evaluating Counterfactual Explanations for Smart Environments
- A Study of Rule Omission in Raven's Progressive Matrices
- CoDA: Agentic Systems for Collaborative Data Visualization
- Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner
- Representation Learning for Compressed Video Action Recognition via Attentive Cross-modal Interaction with Motion Enhancement
- Hallucination reduction with CASAL: Contrastive Activation Steering For Amortized Learning
- Hallucination-Resistant, Domain-Specific Research Assistant with Self-Evaluation and Vector-Grounded Retrieval
- NCV: A Node-Wise Consistency Verification Approach for Low-Cost Structured Error Localization in LLM Reasoning
- Beyond the Final Answer: Evaluating the Reasoning Trajectories of Tool-Augmented Agents
- Take Goodhart Seriously: Principled Limit on General-Purpose AI Optimization
- Reward Model Routing in Alignment
- Consolidating Reinforcement Learning for Multimodal Discrete Diffusion Models
- BrowserArena: Evaluating LLM Agents on Real-World Web Navigation Tasks
- RefineShot: Rethinking Cinematography Understanding with Foundational Skill Evaluation
- Safe and Efficient In-Context Learning via Risk Control
- Multimodal Function Vectors for Spatial Relations
- Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge
- Agentic Additive Manufacturing Alloy Discovery
- A Benchmark Study of Deep Reinforcement Learning Algorithms for the Container Stowage Planning Problem
- Multimodal Large Language Model Framework for Safe and Interpretable Grid-Integrated EVs
- Mitigating Modal Imbalance in Multimodal Reasoning
- On the Role of Temperature Sampling in Test-Time Scaling
- A Concept of Possibility for Real-World Events
- ARMs: Adaptive Red-Teaming Agent against Multimodal Models with Plug-and-Play Attacks
- Automated Constraint Specification for Job Scheduling by Regulating Generative Model with Domain-Specific Representation
- Efficient Preimage Approximation for Neural Network Certification
- Smart Contract Intent Detection with Pre-trained Programming Language Model
- XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs
- JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models
- Permissioned LLMs: Enforcing Access Control in Large Language Models
- Semantic Preprocessing for LLM-based Malware Analysis
- Diffusion-aided Task-oriented Semantic Communications with Model Inversion Attack
- GATEBLEED: Exploiting On-Core Accelerator Power Gating for High Performance & Stealthy Attacks on AI
- ShikkhaChain: A Blockchain-Powered Academic Credential Verification System for Bangladesh
- Secure and Scalable Blockchain Voting: A Comparative Framework and the Role of Large Language Models
- Scam2Prompt: A Scalable Framework for Auditing Malicious Scam Endpoints in Production LLMs
- Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders
- OML: A Primitive for Reconciling Open Access with Owner Control in AI Model Distribution
- Rethinking the Vulnerability of Concept Erasure and a New Method
- FinP: Fairness-in-Privacy in Federated Learning by Addressing Disparities in Privacy Risk
- Interplay between Security, Privacy and Trust in 6G-enabled Intelligent Transportation Systems
- A Bilevel Optimization Framework for Adversarial Control of Gas Pipeline Operations
- A Novel Unified Lightweight Temporal-Spatial Transformer Approach for Intrusion Detection in Drone Networks
- CST-AFNet: A dual attention-based deep learning framework for intrusion detection in IoT networks
- Automated Repair of OpenID Connect Programs (Extended Version)
- DMark: Order-Agnostic Watermarking for Diffusion Large Language Models
- WavInWav: Time-domain Speech Hiding via Invertible Neural Network
- Cheat-Penalised Quantum Weak Coin-Flipping
- LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing
- RSFuzz: A Robustness-Guided Swarm Fuzzing Framework Based on Behavioral Constraints
- PrisonBreak: Jailbreaking Large Language Models with at Most Twenty-Five Targeted Bit-flips
- Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training
- Activation Functions Considered Harmful: Recovering Neural Network Weights through Controlled Channels
- Using Preformed Resistive Random Access Memory to Create a Strong Physically Unclonable Function
- MALF: A Multi-Agent LLM Framework for Intelligent Fuzzing of Industrial Control Protocols
- A Statistical Method for Attack-Agnostic Adversarial Attack Detection with Compressive Sensing Comparison
- Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs
- Improved Search-to-Decision Reduction for Random Local Functions
- SoK: Preconfirmations
- SoK: Kicking CAN Down the Road. Systematizing CAN Security Knowledge
- External Data Extraction Attacks against Retrieval-Augmented Large Language Models
- Untargeted Jailbreak Attack
- Protecting Persona Biometric Data: The Case of Facial Privacy
- TPM-Based Continuous Remote Attestation and Integrity Verification for 5G VNFs on Kubernetes
- A High-Capacity and Secure Disambiguation Algorithm for Neural Linguistic Steganography
- From Trace to Line: LLM Agent for Real-World OSS Vulnerability Localization
- Apply Bayes Theorem to Optimize IVR Authentication Process
- Hybrid Schemes of NIST Post-Quantum Cryptography Standard Algorithms and Quantum Key Distribution for Key Exchange and Digital Signature
- Selmer-Inspired Elliptic Curve Generation
- Secure and Robust Watermarking for AI-generated Images: A Comprehensive Survey
- On The Fragility of Benchmark Contamination Detection in Reasoning Models
- LLM-Generated Samples for Android Malware Detection
- PolyLink: A Blockchain Based Decentralized Edge AI Platform for LLM Inference
- Dynamic Target Attack
- Adaptive Deception Framework with Behavioral Analysis for Enhanced Cybersecurity Defense
- Rigorous Evaluation of Microarchitectural Side-Channels with Statistical Model Checking
- TLoRa: Implementing TLS Over LoRa for Secure HTTP Communication in IoT
- ToolTweak: An Attack on Tool Selection in LLM-based Agents
- Hybrid Horizons: Policy for Post-Quantum Security
- Modeling the Attack: Detecting AI-Generated Text by Quantifying Adversarial Perturbations
- Agentic-AI Healthcare: Multilingual, Privacy-First Framework with MCP Agents
- CATMark: A Context-Aware Thresholding Framework for Robust Cross-Task Watermarking in Large Language Models
- An Investigation into the Performance of Non-Contrastive Self-Supervised Learning Methods for Network Intrusion Detection
- Measuring Physical-World Privacy Awareness of Large Language Models: An Evaluation Benchmark
- Privacy in the Age of AI: A Taxonomy of Data Risks
- Bootstrapping as a Morphism: An Arithmetic Geometry Approach to Asymptotically Faster Homomorphic Encryption
- Federated Spatiotemporal Graph Learning for Passive Attack Detection in Smart Grids
- A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory
- A Hybrid CAPTCHA Combining Generative AI with Keystroke Dynamics for Enhanced Bot Detection
- Scaling Homomorphic Applications in Deployment
- Less LLM, More Documents: Searching for Improved RAG
- AgenticRAG: Tool-Augmented Foundation Models for Zero-Shot Explainable Recommender Systems
- OpenZL: A Graph-Based Model for Compression
- Hierarchical Semantic Retrieval with Cobweb
- Geolog-IA: Conversational System for Academic Theses
- StepChain GraphRAG: Reasoning Over Knowledge Graphs for Multi-Hop Question Answering
- Grounding Large Language Models in Clinical Evidence: A Retrieval-Augmented Generation System for Querying UK NICE Clinical Guidelines
- CHORD: Customizing Hybrid-precision On-device Model for Sequential Recommendation with Device-cloud Collaboration
- cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
- MHier-RAG: Multi-Modal RAG for Visual-Rich Document Question-Answering via Hierarchical and Multi-Granularity Reasoning
- When Large Language Models are Reliable for Judging Empathic Communication
- Hyperparameters are all you need: Using five-step inference for an original diffusion model to generate images comparable to the latest distillation model
- Visualizing Spatial Point Clouds: A Task-Oriented Taxonomy
- GS-Share: Enabling High-fidelity Map Sharing with Incremental Gaussian Splatting
- PCG-Informed Neural Solvers for High-Resolution Homogenization of Periodic Microstructures
- FSFSplatter: Build Surface and Novel Views with Sparse-Views within 3min
- ROGR: Relightable 3D Objects using Generative Relighting
- Topological Autoencoders++: Fast and Accurate Cycle-Aware Dimensionality Reduction
- Revisiting Query Variants: The Advantage of Retrieval Over Generation of Query Variants for Effective QPP
- A Simple but Effective Elaborative Query Reformulation Approach for Natural Language Recommendation
- Open WebUI: An Open, Extensible, and Usable Interface for AI Interaction
- When Researchers Say Mental Model/Theory of Mind of AI, What Are They Really Talking About?
- "It Felt Real" Victim Perspectives on Platform Design and Longer-Running Scams
- Prototyping Digital Social Spaces through Metaphor-Driven Design: Translating Spatial Concepts into an Interactive Social Simulation
- Fostering Collective Discourse: A Distributed Role-Based Approach to Online News Commenting
- PromptMap: Supporting Exploratory Text-to-Image Generation
- VR as a "Drop-In" Well-being Tool for Knowledge Workers
- Who's Wearing? Ear Canal Biometric Key Extraction for User Authentication on Wireless Earbuds
- AutoMaAS: Self-Evolving Multi-Agent Architecture Search for Large Language Models
- AI Generated Child Sexual Abuse Material - What's the Harm?
- Reading.help: Supporting EFL Readers with Proactive and On-Demand Explanation of English Grammar and Semantics
- Vibe coding: programming through conversation with artificial intelligence
- Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Experiments
- Learning High-Fidelity Robot Self-Model with Articulated 3D Gaussian Splatting
- A Benchmarking Study of Vision-based Robotic Grasping Algorithms
- Learned IMU Bias Prediction for Invariant Visual Inertial Odometry
- Latent Action Diffusion for Cross-Embodiment Manipulation
- Beyond Anthropomorphism: Enhancing Grasping and Eliminating a Degree of Freedom by Fusing the Abduction of Digits Four and Five
- An Intention-driven Lane Change Framework Considering Heterogeneous Dynamic Cooperation in Mixed-traffic Environment
- A Multi-Fidelity Control Variate Approach for Policy Gradient Estimation
- Optimal Modified Feedback Strategies in LQ Games under Control Imperfections
- HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios
- Decoupling Geometry from Optimization in 2D Irregular Cutting and Packing Problems: an Open-Source Collision Detection Engine
- Vector Autoregression (VAR) of Longitudinal Sleep and Self-report Mood Data
- Learning Stability Certificate for Robotics in Real-World Environments
- MM-Nav: Multi-View VLA Model for Robust Visual Navigation via Multi-Expert Learning
- Optimal Smooth Coverage Trajectory Planning for Quadrotors in Cluttered Environment
- Simulation to Rules: A Dual-VLM Framework for Formal Visual Planning
- Conceptualizing and Modeling Communication-Based Cyberattacks on Automated Vehicles
- Periodic Event-Triggered Prescribed Time Control of Euler-Lagrange Systems under State and Input Constraints
- VERNIER: an open-source software pushing marker pose estimation down to the micrometer and nanometer scales
- A Dimension-Decomposed Learning Framework for Online Disturbance Identification in Quadrotor SE(3) Control
- Geometry Meets Vision: Revisiting Pretrained Semantics in Distilled Fields
- Mask2IV: Interaction-Centric Video Generation via Mask Trajectories
- Improving Cooperation in Collaborative Embodied AI
- S-Graphs 2.0 -- A Hierarchical-Semantic Optimization and Loop Closure for SLAM
- Active Alignments of Lens Systems with Reinforcement Learning
- Assist-as-needed Control for FES in Foot Drop Management
- Action Deviation-Aware Inference for Low-Latency Wireless Robots
- Novel UWB Synthetic Aperture Radar Imaging for Mobile Robot Mapping
- Point Cloud-Based Control Barrier Functions for Model Predictive Control in Safety-Critical Navigation of Autonomous Mobile Robots
- Metrics vs Surveys: Can Quantitative Measures Replace Human Surveys in Social Robot Navigation? A Correlation Analysis
- Single-Rod Brachiation Robot: Mechatronic Control Design and Validation of Prejump Phases
- YawSitter: Modeling and Controlling a Tail-Sitter UAV with Enhanced Yaw Control
- AI-Enhanced Kinematic Modeling of Flexible Manipulators Using Multi-IMU Sensor Fusion
- Real-Time Nonlinear Model Predictive Control of Heavy-Duty Skid-Steered Mobile Platform for Trajectory Tracking Tasks
- 3D-CovDiffusion: 3D-Aware Diffusion Policy for Coverage Path Planning
- HumanoidExo: Scalable Whole-Body Humanoid Manipulation via Wearable Exoskeleton
- Long-Term Human Motion Prediction Using Spatio-Temporal Maps of Dynamics
- Embracing Evolution: A Call for Body-Control Co-Design in Embodied Humanoid Robot
- Whisker-based Tactile Flight for Tiny Drones
- A Recipe for Efficient Sim-to-Real Transfer in Manipulation with Online Imitation-Pretrained World Models
- Efficient Optimal Path Planning in Dynamic Environments Using Koopman MPC
- SubSense: VR-Haptic and Motor Feedback for Immersive Control in Subsea Telerobotics
- UMI-on-Air: Embodiment-Aware Guidance for Embodiment-Agnostic Visuomotor Policies
- RSV-SLAM: Toward Real-Time Semantic Visual SLAM in Indoor Dynamic Environments
- Reachable Predictive Control: A Novel Control Algorithm for Nonlinear Systems with Unknown Dynamics and its Practical Applications
- Multi-robot Rigid Formation Navigation via Synchronous Motion and Discrete-time Communication-Control Optimization
- A Trajectory Generator for High-Density Traffic and Diverse Agent-Interaction Scenarios
- A 1000× Faster LLM-enhanced Algorithm For Path Planning in Large-scale Grid Maps
- Team Xiaomi EV-AD VLA: Caption-Guided Retrieval System for Cross-Modal Drone Navigation - Technical Report for IROS 2025 RoboSense Challenge Track 4
- Flow with the Force Field: Learning 3D Compliant Flow Matching Policies from Force and Demonstration-Guided Simulation Data
- Work Zones challenge VLM Trajectory Planning: Toward Mitigation and Robust Autonomous Driving
- Graph Neural Networks for Transmission Grid Topology Control: Busbar Information Asymmetry and Heterogeneous Representations
- Learning Counterfactual Outcomes Under Rank Preservation
- Dynamical local Fréchet curve regression in manifolds
- Theoretical Investigation on Inductive Bias of Isolation Forest
- A Malliavin-Gamma calculus approach to Score Based Diffusion Generative models for random fields
- Online Decision-Focused Learning
- Restricted Spectral Gap Decomposition for Simulated Tempering Targeting Mixture Distributions
- DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation
- Optimal structure learning and conditional independence testing
- ERUPT: An Open Toolkit for Interfacing with Robot Motion Planners in Extended Reality
- SIMSplat: Predictive Driving Scene Editing with Language-aligned 4D Gaussian Splatting
- U-LAG: Uncertainty-Aware, Lag-Adaptive Goal Retargeting for Robotic Manipulation
- A fast non-reversible sampler for Bayesian finite mixture models
- Discrimination in machine learning algorithms
- Post Reinforcement Learning Inference
- Extending Mean-Field Variational Inference via Entropic Regularization: Theory and Computation
- Fractional signature: a generalisation of the signature inspired by fractional calculus
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation
- Iteratively reweighted kernel machines efficiently learn sparse functions
- Highly Efficient and Effective LLMs with Multi-Boolean Architectures
- A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation
- Variance Reduction and Low Sample Complexity in Stochastic Optimization via Proximal Point Method
- Batched Nonparametric Contextual Bandits
- On diffusion posterior sampling via sequential Monte Carlo for zero-shot scaffolding of protein motifs
- A semiconcavity approach to stability of entropic plans and exponential convergence of Sinkhorn's algorithm
- Orthogonal Procrustes problem preserves correlations in synthetic data
- Learning a distance measure from the information-estimation geometry of data
- What is in the model? A Comparison of variable selection criteria and model search approaches
- VisitHGNN: Heterogeneous Graph Neural Networks for Modeling Point-of-Interest Visit Patterns
- Hyperparameter Loss Surfaces Are Simple Near their Optima
- Oracle-based Uniform Sampling from Convex Bodies
- Differentially Private Wasserstein Barycenters
- Gradient-enhanced global sensitivity analysis with Poincaré chaos expansions
- Distilled Protein Backbone Generation
- Rates of Convergence of Generalised Variational Inference Posteriors under Prior Misspecification
- Total Robustness in Bayesian Nonlinear Regression for Measurement Error Problems under Model Misspecification
- Why Do We Need Warm-up? A Theoretical Perspective
- Best-of-Majority: Minimax-Optimal Strategy for Pass@k Inference Scaling
- Higher-arity PAC learning, VC dimension and packing lemma
- Predictive inference for time series: why is split conformal effective despite temporal dependence?
- Beyond Linear Diffusions: Improved Representations for Rare Conditional Generative Modeling
- Adaptive randomized pivoting and volume sampling
- Learning Multi-Index Models with Hyper-Kernel Ridge Regression
- Neural Jump ODEs as Generative Models
Research Sources: 541 | Generated: 10/6/2025