AI RESEARCH PAPERS & ACADEMIC SOURCES
- Flow-based generative models as iterative algorithms in probability space
- The feasibility of multi-graph alignment: a Bayesian approach
- Randomized Quasi-Monte Carlo Features for Kernel Approximation
- The Ground Cost for Optimal Transport of Angular Velocity
- Efficient $Q$-Learning and Actor-Critic Methods for Robust Average Reward Reinforcement Learning
- Precise Bayesian Neural Networks
- GenAI-Powered Inference
- Beyond ATE: Multi-Criteria Design for A/B Testing
- Predicting Market Troughs: A Machine Learning Approach with Causal Interpretation
- On Rate-Optimal Partitioning Classification from Observable and from Privatised Data
- Variational Inference for Uncertainty Quantification: an Analysis of Trade-offs
- Robust Generative Learning with Lipschitz-Regularized $\alpha$-Divergences Allows Minimal Assumptions on Target Distributions
- Effect of Random Learning Rate: Theoretical Analysis of SGD Dynamics in Non-Convex Optimization via Stationary Distribution
- Autoencoders in Function Space
- Confirmation Bias in Gaussian Mixture Models
- Limit Theorems for Stochastic Gradient Descent with Infinite Variance
- Sequential Controlled Langevin Diffusions
- Error-quantified Conformal Inference for Time Series
- Uncertainty quantification for Markov chain induced martingales with application to temporal difference learning
- KD$^{2}$M: A unifying framework for feature knowledge distillation
- Quantum-inspired probability metrics define a complete, universal space for statistical learning
- Catapult Dynamics and Phase Transitions in Quadratic Nets
- Sequential Gibbs Posteriors with Applications to Principal Component Analysis
- Probabilistic Shapley Value Modeling and Inference
- The Over-Certainty Phenomenon in Modern Test-Time Adaptation Algorithms
- Towards a General Time Series Forecasting Model with Unified Representation and Adaptive Transfer
- Optimality of Approximate Message Passing Algorithms for Spiked Matrix Models with Rotationally Invariant Noise
- A Fully Parameter-Free Second-Order Algorithm for Convex-Concave Minimax Problems
- Off-Policy Maximum Entropy RL with Future State and Action Visitation Measures
- Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models
- A Theoretical Justification for Asymmetric Actor-Critic Algorithms
- Statistical description and dimension reduction of categorical trajectories with multivariate functional principal components
- Stereovision Image Processing for Planetary Navigation Maps with Semi-Global Matching and Superpixel Segmentation
- Brain Tumor Detection Through Diverse CNN Architectures in IoT Healthcare Industries: Fast R-CNN, U-Net, Transfer Learning-Based CNN, and Fully Connected CNN
- eKalibr-Inertial: Continuous-Time Spatiotemporal Calibration for Event-Based Visual-Inertial Systems
- O$^3$Afford: One-Shot 3D Object-to-Object Affordance Grounding for Generalizable Robotic Manipulation
- Contrastive Anatomy-Contrast Disentanglement: A Domain-General MRI Harmonization Method
- From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans
- Towards In-Air Ultrasonic QR Codes: Deep Learning for Classification of Passive Reflector Constellations
- MM-DINOv2: Adapting Foundation Models for Multi-Modal Medical Image Analysis
- LLaDA-VLA: Vision Language Diffusion Action Models
- Scaling Transformer-Based Novel View Synthesis Models with Token Disentanglement and Synthetic Data
- F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
- Deep Reactive Policy: Learning Reactive Manipulator Motion Planning for Dynamic Environments
- Can Machines Imitate Humans? Integrative Turing-like tests for Language and Vision Demonstrate a Narrowing Gap
- ADIR: Adaptive Diffusion for Image Reconstruction
- The GOOSE Dataset for Perception in Unstructured Environments
- EdgeSAM: Prompt-In-the-Loop Distillation for SAM
- LD-SDM: Language-Driven Hierarchical Species Distribution Modeling
- Osprey: Pixel Understanding with Visual Instruction Tuning
- Sequential Least-Squares Estimators with Fast Randomized Sketching for Linear Statistical Models
- Learning from one graph: transductive learning guarantees via the geometry of small random worlds
- Interpretable dimension reduction for compositional data
- Event Spectroscopy: Event-based Multispectral and Depth Sensing using Structured Light
- Pothole Detection and Recognition based on Transfer Learning
- Raw2Event: Converting Raw Frame Camera into Event Camera
- D-HUMOR: Dark Humor Understanding via Multimodal Open-ended Reasoning
- UrbanTwin: High-Fidelity Synthetic Replicas of Roadside Lidar Datasets
- P3-SAM: Native 3D Part Segmentation
- SynthDrive: Scalable Real2Sim2Real Sensor Simulation Pipeline for High-Fidelity Asset Generation and Driving Data Synthesis
- MIORe & VAR-MIORe: Benchmarks to Push the Boundaries of Restoration
- UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward
- Video-Based MPAA Rating Prediction: An Attention-Driven Hybrid Architecture Using Contrastive Learning
- Curia: A Multi-Modal Foundation Model for Radiology
- Leveraging Generic Foundation Models for Multimodal Surgical Data Analysis
- Evaluating the Impact of Adversarial Attacks on Traffic Sign Classification using the LISA Dataset
- ToonOut: Fine-tuned Background-Removal for Anime Characters
- Automated Radiographic Total Sharp Score (ARTSS) in Rheumatoid Arthritis: A Solution to Reduce Inter-Intra Reader Variation and Enhancing Clinical Practice
- Matching Shapes Under Different Topologies: A Topology-Adaptive Deformation Guided Approach
- A New Hybrid Model of Generative Adversarial Network and You Only Look Once Algorithm for Automatic License-Plate Recognition
- Barlow-Swin: Toward a novel siamese-based segmentation architecture using Swin-Transformers
- Intraoperative 2D/3D Registration via Spherical Similarity Learning and Inference-Time Differentiable Levenberg-Marquardt Optimization
- BIR-Adapter: A Low-Complexity Diffusion Model Adapter for Blind Image Restoration
- FoMo4Wheat: Toward reliable crop vision foundation models with globally curated data
- H$_{2}$OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers
- Layer-Wise Anomaly Detection in Directed Energy Deposition using High-Fidelity Fringe Projection Profilometry
- A Synthetic-to-Real Dehazing Method based on Domain Unification
- Phantom-Insight: Adaptive Multi-cue Fusion for Video Camouflaged Object Detection with Multimodal LLM
- When Language Model Guides Vision: Grounding DINO for Cattle Muzzle Detection
- Perception-oriented Bidirectional Attention Network for Image Super-resolution Quality Assessment
- Cross3DReg: Towards a Large-scale Real-world Cross-source Point Cloud Registration Benchmark
- A Statistical 3D Stomach Shape Model for Anatomical Analysis
- Does DINOv3 Set a New Medical Vision Standard?
- FSG-Net: Frequency-Spatial Synergistic Gated Network for High-Resolution Remote Sensing Change Detection
- WS$^2$: Weakly Supervised Segmentation using Before-After Supervision in Waste Sorting
- TIDE: Achieving Balanced Subject-Driven Image Generation via Target-Instructed Diffusion Enhancement
- Predicting Brain Tumor Response to Therapy using a Hybrid Deep Learning and Radiomics Approach
- Benchmarking EfficientTAM on FMO datasets
- Back To The Drawing Board: Rethinking Scene-Level Sketch-Based Image Retrieval
- Evolving from Unknown to Known: Retentive Angular Representation Learning for Incremental Open Set Recognition
- CausNVS: Autoregressive Multi-view Diffusion for Flexible 3D Novel View Synthesis
- Hybrid Swin Attention Networks for Simultaneously Low-Dose PET and CT Denoising
- Investigating Location-Regularised Self-Supervised Feature Learning for Seafloor Visual Imagery
- Online Clustering of Seafloor Imagery for Interpretation during Long-Term AUV Operations
- VIM-GS: Visual-Inertial Monocular Gaussian Splatting via Object-level Guidance in Large Scenes
- BioLite U-Net: Edge-Deployable Semantic Segmentation for In Situ Bioprinting Monitoring
- STAGE: Segmentation-oriented Industrial Anomaly Synthesis via Graded Diffusion with Explicit Mask Alignment
- Cortex-Synth: Differentiable Topology-Aware 3D Skeleton Synthesis with Hierarchical Graph Attention
- MRI-Based Brain Tumor Detection through an Explainable EfficientNetV2 and MLP-Mixer-Attention Architecture
- Zero-shot 3D-Aware Trajectory-Guided image-to-video generation via Test-Time Training
- Co-Seg: Mutual Prompt-Guided Collaborative Learning for Tissue and Nuclei Segmentation
- Analysis of Blood Report Images Using General Purpose Vision-Language Models
- Multi-Stage Graph Neural Networks for Data-Driven Prediction of Natural Convection in Enclosed Cavities
- Home-made Diffusion Model from Scratch to Hatch
- High-Quality Tomographic Image Reconstruction Integrating Neural Networks and Mathematical Optimization
- MedSeqFT: Sequential Fine-tuning Foundation Models for 3D Medical Image Segmentation
- PathoHR: Hierarchical Reasoning for Vision-Language Models in Pathology
- CARDIE: clustering algorithm on relevant descriptors for image enhancement
- RetinaGuard: Obfuscating Retinal Age in Fundus Images for Biometric Privacy Preserving
- UniVerse-1: Unified Audio-Video Generation via Stitching of Experts
- AI-Based Applied Innovation for Fracture Detection in X-rays Using Custom CNN and Transfer Learning Models
- Exploring Light-Weight Object Recognition for Real-Time Document Detection
- Spatial Reasoning with Vision-Language Models in Ego-Centric Multi-View Scenes
- AI-driven Remote Facial Skin Hydration and TEWL Assessment from Selfie Images: A Systematic Solution
- Prototype-Aware Multimodal Alignment for Open-Vocabulary Visual Grounding
- Video-based Generalized Category Discovery via Memory-Guided Consistency-Aware Contrastive Learning
- Text4Seg++: Advancing Image Segmentation via Generative Language Modeling
- Towards scalable organ level 3D plant segmentation: Bridging the data algorithm computing gap
- Quantitative Currency Evaluation in Low-Resource Settings through Pattern Analysis to Assist Visually Impaired Users
- Multi-Modal Camera-Based Detection of Vulnerable Road Users
- Harnessing Object Grounding for Time-Sensitive Video Understanding
- Your Super Resolution Model is not Enough for Tackling Real-World Scenarios
- AI-based response assessment and prediction in longitudinal imaging for brain metastases treated with stereotactic radiosurgery
- 3DOF+Quantization: 3DGS quantization for large scenes with limited Degrees of Freedom
- JRN-Geo: A Joint Perception Network based on RGB and Normal images for Cross-view Geo-localization
- Multi-LVI-SAM: A Robust LiDAR-Visual-Inertial Odometry for Multiple Fisheye Cameras
- Depth-Aware Super-Resolution via Distance-Adaptive Variational Formulation
- PictOBI-20k: Unveiling Large Multimodal Models in Visual Decipherment for Pictographic Oracle Bone Characters
- Posterior shape models revisited: Improving 3D reconstructions from partial data using target specific models
- 3DPillars: Pillar-based two-stage 3D object detection
- CRAB: Camera-Radar Fusion for Reducing Depth Ambiguity in Backward Projection based View Transformation
- A Probabilistic Segment Anything Model for Ambiguity-Aware Medical Image Segmentation
- BTCChat: Advancing Remote Sensing Bi-temporal Change Captioning with Multimodal Large Language Model
- A Fine-Grained Attention and Geometric Correspondence Model for Musculoskeletal Risk Classification in Athletes Using Multimodal Visual and Skeletal Features
- Compression Beyond Pixels: Semantic Compression with Multimodal Foundation Models
- AttriPrompt: Dynamic Prompt Composition Learning for CLIP
- Coefficients-Preserving Sampling for Reinforcement Learning with Flow Matching
- Dual Interaction Network with Cross-Image Attention for Medical Image Segmentation
- StripDet: Strip Attention-Based Lightweight 3D Object Detection from Point Cloud
- Neural Bloom: A Deep Learning Approach to Real-Time Lighting
- Spatial-Aware Self-Supervision for Medical 3D Imaging with Multi-Granularity Observable Tasks
- OmniStyle2: Scalable and High Quality Artistic Style Transfer Data Generation via Destylization
- Multi-Strategy Guided Diffusion via Sparse Masking Temporal Reweighting Distribution Correction
- Motion Aware ViT-based Framework for Monocular 6-DoF Spacecraft Pose Estimation
- BLaVe-CoT: Consistency-Aware Visual Question Answering for Blind and Low Vision Users
- Cross-Modal Enhancement and Benchmark for UAV-based Open-Vocabulary Object Detection
- Micro-Expression Recognition via Fine-Grained Dynamic Perception
- DVLO4D: Deep Visual-Lidar Odometry with Sparse Spatial-temporal Fusion
- Context-Aware Knowledge Distillation with Adaptive Weighting for Image Classification
- A Real-Time, Vision-Based System for Badminton Smash Speed Estimation on Mobile Devices
- A Stroke-Level Large-Scale Database of Chinese Character Handwriting and the OpenHandWrite_Toolbox for Handwriting Research
- Anticipatory Fall Detection in Humans with Hybrid Directed Graph Neural Networks and Long Short-Term Memory
- Systematic Integration of Attention Modules into CNNs for Accurate and Generalizable Medical Image Diagnosis
- Vision-Based Object Detection for UAV Solar Panel Inspection Using an Enhanced Defects Dataset
- Dynamic Sensitivity Filter Pruning using Multi-Agent Reinforcement Learning For DCNN's
- Veriserum: A dual-plane fluoroscopic dataset with knee implant phantoms for deep learning in medical imaging
- Quaternion Approximation Networks for Enhanced Image Classification and Oriented Object Detection
- Visibility-Aware Language Aggregation for Open-Vocabulary Segmentation in 3D Gaussian Splatting
- DuoCLR: Dual-Surrogate Contrastive Learning for Skeleton-based Human Action Segmentation
- RED: Robust Event-Guided Motion Deblurring with Modality-Specific Disentangled Representation
- Sensitivity-Aware Post-Training Quantization for Deep Neural Networks
- Reconstruction and Reenactment Separated Method for Realistic Gaussian Head
- MFFI: Multi-Dimensional Face Forgery Image Dataset for Real-World Scenarios
- Patch-level Kernel Alignment for Self-Supervised Dense Representation Learning
- SuMa: A Subspace Mapping Approach for Robust and Effective Concept Erasure in Text-to-Image Diffusion Models
- Evaluating YOLO Architectures: Implications for Real-Time Vehicle Detection in Urban Environments of Bangladesh
- EditIDv2: Editable ID Customization with Data-Lubricated ID Feature Integration for Text-to-Image Generation
- OOTSM: A Decoupled Linguistic Framework for Effective Scene Graph Anticipation
- WIPUNet: A Physics-inspired Network with Weighted Inductive Biases for Image Denoising
- Context-Aware Multi-Turn Visual-Textual Reasoning in LVLMs via Dynamic Memory and Adaptive Visual Guidance
- MeshMetrics: A Precise Implementation of Distance-Based Image Segmentation Metrics
- Leveraging Vision-Language Large Models for Interpretable Video Action Recognition with Semantic Tokenization
- SUDER: Self-Improving Unified Large Multimodal Models for Understanding and Generation with Dual Self-Rewards
- Sticker-TTS: Learn to Utilize Historical Experience with a Sticker-driven Test-Time Scaling Framework
- Label Smoothing++: Enhanced Label Regularization for Training Neural Networks
- BeSimulator: A Large Language Model Powered Text-based Behavior Simulator
- Causal Representation Learning with Generative Artificial Intelligence: Application to Texts as Treatments
- ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries
- ElectroVizQA: How well do Multi-modal LLMs perform in Electronics Visual Question Answering?
- ChinaTravel: An Open-Ended Benchmark for Language Agents in Chinese Travel Planning
- AI Sees Your Location, But With A Bias Toward The Wealthy World
- VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models
- MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs
- OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation
- A Minimum Description Length Approach to Regularization in Neural Networks
- InterFeat: A Pipeline for Finding Interesting Scientific Features
- Project Riley: Multimodal Multi-Agent LLM Collaboration with Emotional Reasoning and Voting
- TreeReview: A Dynamic Tree of Questions Framework for Deep and Efficient LLM-based Scientific Peer Review
- Persona-driven Simulation of Voting Behavior in the European Parliament with Large Language Models
- RADIANT: Retrieval AugmenteD entIty-context AligNmenT -- Introducing RAG-ability and Entity-Context Divergence
- Dynamic Injection of Entity Knowledge into Dense Retrievers
- Step-level Verifier-guided Hybrid Test-Time Scaling for Large Language Models
- The Good, the Bad and the Constructive: Automatically Measuring Peer Review's Utility for Authors
- MedualTime: A Dual-Adapter Language Model for Medical Time Series-Text Multimodal Learning
- ResearchArena: Benchmarking Large Language Models' Ability to Collect and Organize Information as Research Agents
- Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework
- Through the Prism of Culture: Evaluating LLMs' Understanding of Indian Subcultures and Traditions
- Premise-Augmented Reasoning Chains Improve Error Identification in Math reasoning with LLMs
- Position: LLMs Can be Good Tutors in English Education
- Reinforced Lifelong Editing for Language Models
- Improve LLM-as-a-Judge Ability as a General Ability
- Soft Token Attacks Cannot Reliably Audit Unlearning in Large Language Models
- Evaluating the Robustness and Accuracy of Text Watermarking Under Real-World Cross-Lingual Manipulations
- PlainQAFact: Retrieval-augmented Factual Consistency Evaluation Metric for Biomedical Plain Language Summarization
- LinkAlign: Scalable Schema Linking for Real-World Large-Scale Multi-Database Text-to-SQL
- Learning to Reason for Long-Form Story Generation
- Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models
- Advancing Scientific Text Classification: Fine-Tuned Models with Dataset Expansion and Hard-Voting
- Assessing and Mitigating Medical Knowledge Drift and Conflicts in Large Language Models
- Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis
- Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes
- VocalBench: Benchmarking the Vocal Conversational Abilities for Speech Interaction Models
- Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs
- Fast Quiet-STaR: Thinking Without Thought Tokens
- Rhapsody: A Dataset for Highlight Detection in Podcasts
- Self-Critique and Refinement for Faithful Natural Language Explanations
- ChatCFD: An LLM-Driven Agent for End-to-End CFD Automation with Domain-Specific Structured Reasoning
- Interleaving Reasoning for Better Text-to-Image Generation
- Multiple Noises in Diffusion Model for Semi-Supervised Multi-Domain Translation
- Support or Refute: Analyzing the Stance of Evidence to Detect Out-of-Context Mis- and Disinformation
- Grammaticality illusion or ambiguous interpretation? Event-related potentials reveal the nature of the missing-NP effect in Mandarin centre-embedded structures
- Repetition Improves Language Model Embeddings
- Linearly Controlled Language Generation with Performative Guarantees
- Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text
- A Principled Framework for Evaluating on Typologically Diverse Languages
- Affective Computing in the Era of Large Language Models: A Survey from the NLP Perspective
- Self-Alignment: Improving Alignment of Cultural Values in LLMs via In-Context Learning
- Extracting and Combining Abilities For Building Multi-lingual Ability-enhanced Large Language Models
- Conversational Code Generation: a Case Study of Designing a Dialogue System for Generating Driving Scenarios for Testing Autonomous Vehicles
- GASE: Generatively Augmented Sentence Encoding
- Exploring the Limits of Large Language Models: A Systematic Evaluation of Masked Text Processing Ability through MskQA and MskCal
- HierTOD: A Task-Oriented Dialogue System Driven by Hierarchical Goals
- Lessons from Studying Two-Hop Latent Reasoning
- Think-to-Talk or Talk-to-Think? When LLMs Come Up with an Answer in Multi-Hop Arithmetic Reasoning
- Concept Bottleneck Large Language Models
- Process-Supervised Reward Models for Verifying Clinical Note Generation: A Scalable Approach Guided by Domain Expertise
- Revealing the impact of synthetic native samples and multi-tasking strategies in Hindi-English code-mixed humour and sarcasm detection
- Knowledge Editing through Chain-of-Thought
- OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
- LAMDAS: LLM as an Implicit Classifier for Domain-specific Data Selection
- SLiNT: Structure-aware Language Model with Injection and Contrastive Training for Knowledge Graph Completion
- HAVE: Head-Adaptive Gating and ValuE Calibration for Hallucination Mitigation in Large Language Models
- Guided Decoding and Its Critical Role in Retrieval-Augmented Generation
- Modelling Intertextuality with N-gram Embeddings
- Domain-Aware RAG: MoL-Enhanced RL for Efficient Training and Scalable Retrieval
- IntrEx: A Dataset for Modeling Engagement in Educational Conversations
- ParCzech4Speech: A New Speech Corpus Derived from Czech Parliamentary Data
- Will Annotators Disagree? Identifying Subjectivity in Value-Laden Arguments
- Anchoring Refusal Direction: Mitigating Safety Risks in Tuning via Projection Constraint
- MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML
- MoGU V2: Toward a Higher Pareto Frontier Between Model Usability and Security
- Saturation-Driven Dataset Generation for LLM Mathematical Reasoning in the TPTP Ecosystem
- A Comparative Benchmark of Large Language Models for Labelling Wind Turbine Maintenance Logs
- COMPACT: Common-token Optimized Model Pruning Across Channels and Tokens
- EPT Benchmark: Evaluation of Persian Trustworthiness in Large Language Models
- The Majority is not always right: RL training for solution aggregation
- UNH at CheckThat! 2025: Fine-tuning Vs Prompting in Claim Extraction
- mmBERT: A Modern Multilingual Encoder with Annealed Language Learning
- Proof-Carrying Numbers (PCN): A Protocol for Trustworthy Numeric Answers from LLMs via Claim Verification
- Beyond Two-Stage Training: Cooperative SFT and RL for LLM Reasoning
- Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models
- On the Same Wavelength? Evaluating Pragmatic Reasoning in Language Models across Broad Concepts
- On the Contribution of Lexical Features to Speech Emotion Recognition
- An Ethically Grounded LLM-Based Approach to Insider Threat Synthesis and Detection
- From Staff Messages to Actionable Insights: A Multi-Stage LLM Classification Framework for Healthcare Analytics
- Ad hoc conventions generalize to new referents
- Mitigating Spurious Correlations Between Question and Answer via Chain-of-Thought Correctness Perception Distillation
- Beyond Keywords: Driving Generative Search Engine Optimization with Content-Centric Agents
- Few-Shot Query Intent Detection via Relation-Aware Prompt Learning
- Cross-Question Method Reuse in Large Language Models: From Word-Level Prediction to Rational Logical-Layer Reasoning
- Exploring Subjective Tasks in Farsi: A Survey Analysis and Evaluation of Language Models
- QCSE: A Pretrained Quantum Context-Sensitive Word Embedding for Natural Language Processing
- Enhancing Factual Accuracy and Citation Generation in LLMs via Multi-Stage Self-Verification
- LatinX: Aligning a Multilingual TTS Model with Direct Preference Optimization
- MedFactEval and MedAgentBrief: A Framework and Workflow for Generating and Evaluating Factual Clinical Summaries
- Enhancing the Robustness of Contextual ASR to Varying Biasing Information Volumes Through Purified Semantic Correlation Joint Modeling
- Accelerating Large Language Model Inference via Early-Exiting Algorithms
- KatotohananQA: Evaluating Truthfulness of Large Language Models in Filipino
- Multimodal Fine-grained Context Interaction Graph Modeling for Conversational Speech Synthesis
- Multimodal Reasoning for Science: Technical Report and 1st Place Solution to the ICML 2025 SeePhys Challenge
- Orthogonal Low-rank Adaptation in Lie Groups for Continual Learning of Large Language Models
- Understanding the Influence of Synthetic Data for Text Embedders
- Augmented Fine-Tuned LLMs for Enhanced Recruitment Automation
- MSLEF: Multi-Segment LLM Ensemble Finetuning in Recruitment
- No Encore: Unlearning as Opt-Out in Music Generation
- Do LLMs exhibit the same commonsense capabilities across languages?
- WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents
- Crown, Frame, Reverse: Layer-Wise Scaling Variants for LLM Pre-Training
- Detection of trade in products derived from threatened species using machine learning and a smartphone
- Integrating Spatial and Semantic Embeddings for Stereo Sound Event Localization in Videos
- Improved Classification of Nitrogen Stress Severity in Plants Under Combined Stress Conditions Using Spatio-Temporal Deep Learning Framework
- Neural ARFIMA model for forecasting BRIC exchange rates with long memory under oil shocks and policy uncertainties
- When Secure Isn't: Assessing the Security of Machine Learning Model Sharing
- Automating API Documentation with LLMs: A BERTopic Approach
- Risk-averse Fair Multi-class Classification
- Causal Clustering for Conditional Average Treatment Effects Estimation and Subgroup Discovery
- Vector-based loss functions for turbulent flow field inpainting
- Fisher Random Walk: Automatic Debiasing Contextual Preference Inference for Large Language Model Evaluation
- Near Real-Time Dust Aerosol Detection with 3D Convolutional Neural Networks on MODIS Data
- Machine learning magnetism from simple global descriptors
- ALPHA: LLM-Enabled Active Learning for Human-Free Network Anomaly Detection
- Using Reinforcement Learning to Optimize the Global and Local Crossing Number
- Robust Analysis for Resilient AI System
- Learning in ImaginationLand: Omnidirectional Policies through 3D Generative Models (OP-Gen)
- Repeating vs. Non-Repeating FRBs: A Deep Learning Approach To Morphological Characterization
- FineServe: Precision-Aware KV Slab and Two-Level Scheduling for Heterogeneous Precision LLM Serving
- PLRV-O: Advancing Differentially Private Deep Learning via Privacy Loss Random Variable Optimization
- An Explainable Framework for Particle Swarm Optimization using Landscape Analysis and Machine Learning
- MOSAIC: Minimax-Optimal Sparsity-Adaptive Inference for Change Points in Dynamic Networks
- Enhancing Low-Altitude Airspace Security: MLLM-Enabled UAV Intent Recognition
- Embedding Poisoning: Bypassing Safety Alignment via Embedding Semantic Shift
- A Multi-Modal Deep Learning Framework for Colorectal Pathology Diagnosis: Integrating Histological and Colonoscopy Data in a Pilot Study
- IGAff: Benchmarking Adversarial Iterative and Genetic Affine Algorithms on Deep Neural Networks
- On the Reproducibility of "FairCLIP: Harnessing Fairness in Vision-Language Learning"
- Concolic Testing on Individual Fairness of Neural Network Models
- From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers
- Outcome-based Exploration for LLM Reasoning
- Predicting Brain Morphogenesis via Physics-Transfer Learning
- Privacy-Preserving Offloading for Large Language Models in 6G Vehicular Networks
- Application of discrete Ricci curvature in pruning randomly wired neural networks: A case study with chest x-ray classification of COVID-19
- Handling imbalance and few-sample size in ML based Onion disease classification
- Ensembling Membership Inference Attacks Against Tabular Generative Models
- Beyond ROUGE: N-Gram Subspace Features for LLM Hallucination Detection
- RoboBallet: Planning for Multi-Robot Reaching with Graph Neural Networks and Reinforcement Learning
- Distributed Link Sparsification for Scalable Scheduling Using Graph Neural Networks (Journal Version)
- Biomedical Literature Q&A System Using Retrieval-Augmented Generation (RAG)
- New Insights into Optimal Alignment of Acoustic and Linguistic Representations for Knowledge Transfer in ASR
- Audits Under Resource, Data, and Access Constraints: Scaling Laws For Less Discriminatory Alternatives
- Robust variational neural posterior estimation for simulation-based inference
- NeuroDeX: Unlocking Diverse Support in Decompiling Deep Neural Network Executables
- QualityFM: a Multimodal Physiological Signal Foundation Model with Self-Distillation for Signal Quality Challenges in Critically Ill Patients
- Lane Change Intention Prediction of two distinct Populations using a Transformer
- Contrastive Self-Supervised Network Intrusion Detection using Augmented Negative Pairs
- AI for Scientific Discovery is a Social Problem
- A Survey of Generalization of Graph Anomaly Detection: From Transfer Learning to Foundation Models
- BEAM: Brainwave Empathy Assessment Model for Early Childhood
- Knowledge-Guided Machine Learning for Stabilizing Near-Shortest Path Routing
- Group Effect Enhanced Generative Adversarial Imitation Learning for Individual Travel Behavior Modeling under Incentives
- Barycentric Neural Networks and Length-Weighted Persistent Entropy Loss: A Green Geometric and Topological Framework for Function Approximation
- Probabilistic Modeling of Latent Agentic Substructures in Deep Neural Networks
- RT-HCP: Dealing with Inference Delays and Sample Efficiency to Learn Directly on Robotic Platforms
- Aligning Large Vision-Language Models by Deep Reinforcement Learning and Direct Preference Optimization
- Asynchronous Message Passing for Addressing Oversquashing in Graph Neural Networks
- Physics-informed Value Learner for Offline Goal-Conditioned Reinforcement Learning
- R$^{2}$AI: Towards Resistant and Resilient AI in an Evolving World
- floq: Training Critics via Flow-Matching for Scaling Compute in Value-Based RL
- Performance of Conformal Prediction in Capturing Aleatoric Uncertainty
- Finetuning LLMs for Human Behavior Prediction in Social Science Experiments
- SPINN: An Optimal Self-Supervised Physics-Informed Neural Network Framework
- X-SQL: Expert Schema Linking and Understanding of Text-to-SQL with Multi-LLMs
- A novel biomass fluidized bed gasification model coupled with machine learning and CFD simulation
- A Surrogate model for High Temperature Superconducting Magnets to Predict Current Distribution with Neural Network
- If generative AI is the answer, what is the question?
- Metric Embedding Initialization-Based Differentially Private and Explainable Graph Clustering
- MCIGLE: Multimodal Exemplar-Free Class-Incremental Graph Learning
- IPR: Intelligent Prompt Routing with User-Controlled Quality-Cost Trade-offs
- RecMind: LLM-Enhanced Graph Neural Networks for Personalized Consumer Recommendations
- A Spatio-Temporal Graph Neural Networks Approach for Predicting Silent Data Corruption inducing Circuit-Level Faults
- LoaQ: Layer-wise Output Approximation Quantization
- WindFM: An Open-Source Foundation Model for Zero-Shot Wind Power Forecasting
- Text-Trained LLMs Can Zero-Shot Extrapolate PDE Dynamics
- Breaking SafetyCore: Exploring the Risks of On-Device AI Deployment
- Graph Neural Networks for Resource Allocation in Interference-limited Multi-Channel Wireless Networks with QoS Constraints
- Ban&Pick: Achieving Free Performance Gains and Inference Speedup via Smarter Routing in MoE-LLMs
- Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?
- MeanFlow-Accelerated Multimodal Video-to-Audio Synthesis via One-Step Generation
- Explained, yet misunderstood: How AI Literacy shapes HR Managers' interpretation of User Interfaces in Recruiting Recommender Systems
- Safeguarding Graph Neural Networks against Topology Inference Attacks
- STL-based Optimization of Biomolecular Neural Networks for Regression and Control
- DreamPRM-1.5: Unlocking the Potential of Each Instance for Multimodal Process Reward Model Training
- Reinforcement Learning with Anticipation: A Hierarchical Approach for Long-Horizon Tasks
- Distributed Deep Learning using Stochastic Gradient Staleness
- Morphological Perceptron with Competitive Layer: Training Using Convex-Concave Procedure
- DCMI: A Differential Calibration Membership Inference Attack Against Retrieval-Augmented Generation
- DreamAudio: Customized Text-to-Audio Generation with Diffusion Models
- BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models
- Empirical Study of Code Large Language Models for Binary Security Patch Detection
- PolicyEvolve: Evolving Programmatic Policies by LLMs for multi-player games via Population-Based Training
- Software Dependencies 2.0: An Empirical Study of Reuse and Integration of Pre-Trained Models in Open-Source Projects
- Language Native Lightly Structured Databases for Large Language Model Driven Composite Materials Research
- Teaching Precommitted Agents: Model-Free Policy Evaluation and Control in Quasi-Hyperbolic Discounted MDPs
- SpecSwin3D: Generating Hyperspectral Imagery from Multispectral Data via Transformer Networks
- Tracking daily paths in home contexts with RSSI fingerprinting based on UWB through deep learning models
- Benchmarking Gender and Political Bias in Large Language Models
- AI Governance in Higher Education: A course design exploring regulatory, ethical and practical considerations
- Toward a Metrology for Artificial Intelligence: Hidden-Rule Environments and Reinforcement Learning
- Beamforming-LLM: What, Where and When Did I Miss?
- Statistical Inference for Misspecified Contextual Bandits
- AttestLLM: Efficient Attestation Framework for Billion-scale On-device LLMs
- A Fragile Number Sense: Probing the Elemental Limits of Numerical Reasoning in LLMs
- Simulation Priors for Data-Efficient Deep Learning
- InterAct: A Large-Scale Dataset of Dynamic, Expressive and Interactive Activities between Two People in Daily Scenarios
- Unleashing Hierarchical Reasoning: An LLM-Driven Framework for Training-Free Referring Video Object Segmentation
- Exploit Tool Invocation Prompt for Tool Behavior Hijacking in LLM-Based Agentic System
- time2time: Causal Intervention in Hidden States to Simulate Rare Events in Time Series Foundation Models
- Decoding Latent Attack Surfaces in LLMs: Prompt Injection via HTML in Web Summarization
- GenAI on Wall Street -- Opportunities and Risk Controls
- ZhiFangDanTai: Fine-tuning Graph-based Retrieval-Augmented Generation Model for Traditional Chinese Medicine Formula
- Learning to Construct Knowledge through Sparse Reference Selection with Reinforcement Learning
- Uncertainty Quantification in Probabilistic Machine Learning Models: Theory, Methods, and Insights
- GeoAnalystBench: A GeoAI benchmark for assessing large language models for spatial analysis workflow and code generation
- Let's Roleplay: Examining LLM Alignment in Collaborative Dialogues
- Multimodal Prompt Injection Attacks: Risks and Defenses for Modern LLMs
- Challenges in Deep Learning-Based Small Organ Segmentation: A Benchmarking Perspective for Medical Research with Limited Datasets
- Meta-training of diffractive meta-neural networks for super-resolution direction of arrival estimation
- ConstStyle: Robust Domain Generalization with Unified Style Transformation
- Operationalising AI Regulatory Sandboxes under the EU AI Act: The Triple Challenge of Capacity, Coordination and Attractiveness to Providers
- Neural Breadcrumbs: Membership Inference Attacks on LLMs Through Hidden State and Attention Pattern Analysis
- Behind the Mask: Benchmarking Camouflaged Jailbreaks in Large Language Models
- From Vision to Validation: A Theory- and Data-Driven Construction of a GCC-Specific AI Adoption Index
- MambaLite-Micro: Memory-Optimized Mamba Inference on MCUs
- OpenEgo: A Large-Scale Multimodal Egocentric Dataset for Dexterous Manipulation
- Combining TSL and LLM to Automate REST API Testing: A Comparative Study
- Using Contrastive Learning to Improve Two-Way Reasoning in Large Language Models: The Obfuscation Task as a Case Study
- Learning to Walk in Costume: Adversarial Motion Priors for Aesthetically Constrained Humanoids
- Icon$^{2}$: Aligning Large Language Models Using Self-Synthetic Preference Data via Inherent Regulation
- Cross-Service Threat Intelligence in LLM Services using Privacy-Preserving Fingerprints
- Causal Debiasing Medical Multimodal Representation Learning with Missing Modalities
- Orchestrator: Active Inference for Multi-Agent Systems in Long-Horizon Tasks
- LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding
- Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian
- GraMFedDHAR: Graph Based Multimodal Differentially Private Federated HAR
- SEASONED: Semantic-Enhanced Self-Counterfactual Explainable Detection of Adversarial Exploiter Contracts
- Authorship Without Writing: Large Language Models and the Senior Author Analogy
- Talk Isn't Always Cheap: Understanding Failure Modes in Multi-Agent Debate
- Universality of physical neural networks with multivariate nonlinearity
- Advanced Brain Tumor Segmentation Using EMCAD: Efficient Multi-scale Convolutional Attention Decoding
- Direct-Scoring NLG Evaluators Can Use Pairwise Comparisons Too
- ForensicsData: A Digital Forensics Dataset for Large Language Models
- Plantbot: Integrating Plant and Robot through LLM Modular Agent Networks
- Comparative Evaluation of Hard and Soft Clustering for Precise Brain Tumor Segmentation in MR Imaging
- Spiking Neural Networks for Continuous Control via End-to-End Model-Based Learning
- An Empirical Analysis of Discrete Unit Representations in Speech Language Modeling Pre-training
- Governing AI R&D: A Legal Framework for Constraining Dangerous AI
- AI-in-the-Loop: Privacy Preserving Real-Time Scam Detection and Conversational Scambaiting by Leveraging LLMs and Federated Learning
- Prototyping an AI-powered Tool for Energy Efficiency in New Zealand Homes
- Between a Rock and a Hard Place: Exploiting Ethical Reasoning to Jailbreak LLMs
- Privacy Preservation and Identity Tracing Prevention in AI-Driven Eye Tracking for Interactive Learning Environments
- ThreatGPT: An Agentic AI Framework for Enhancing Public Safety through Threat Modeling
- User Privacy and Large Language Models: An Analysis of Frontier Developers' Privacy Policies
- A Lightweight Framework for Trigger-Guided LoRA-Based Self-Adaptation in LLMs
- Augmented Structure Preserving Neural Networks for cell biomechanics
- HyFedRAG: A Federated Retrieval-Augmented Generation Framework for Heterogeneous and Privacy-Sensitive Data
- Accelerate Scaling of LLM Alignment via Quantifying the Coverage and Depth of Instruction Set
- MAS-Bench: A Unified Benchmark for Shortcut-Augmented Hybrid Mobile GUI Agents
- MORSE: Multi-Objective Reinforcement Learning via Strategy Evolution for Supply Chain Optimization
- Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers
- An AI system to help scientists write expert-level empirical software
- Another Turn, Better Output? A Turn-Wise Analysis of Iterative LLM Prompting
- RAFFLES: Reasoning-based Attribution of Faults for LLM Systems
- Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
- Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
- AdCare-VLM: Leveraging Large Vision Language Model (LVLM) to Monitor Long-Term Medication Adherence and Care
- LocoMamba: Vision-Driven Locomotion via End-to-End Deep Reinforcement Learning with Mamba
- Livia: An Emotion-Aware AR Companion Powered by Modular AI Agents and Progressive Memory Compression
- Multi-IaC-Eval: Benchmarking Cloud Infrastructure as Code Across Multiple Formats
- Towards Log Analysis with AI Agents: Cowrie Case Study
- Large Language Model Integration with Reinforcement Learning to Augment Decision-Making in Autonomous Cyber Operations
- Evaluation of Large Language Models for Anomaly Detection in Autonomous Vehicles
- Standard vs. Modular Sampling: Best Practices for Reliable LLM Unlearning
- Backdoor Samples Detection Based on Perturbation Discrepancy Consistency in Pre-trained Language Models
- A Dataset Generation Scheme Based on Video2EEG-SPGN-Diffusion for SEED-VD
- Characterizing Fitness Landscape Structures in Prompt Engineering
- Murphys Laws of AI Alignment: Why the Gap Always Wins
- Towards Meta-Cognitive Knowledge Editing for Multimodal LLMs
- Hyperbolic Large Language Models
- DRF: LLM-AGENT Dynamic Reputation Filtering Framework
- Decision-Focused Learning Enhanced by Automated Feature Engineering for Energy Storage Optimisation
- Rethinking Reasoning Quality in Large Language Models through Enhanced Chain-of-Thought via RL
- From Long to Short: LLMs Excel at Trimming Own Reasoning Chains
- PillagerBench: Benchmarking LLM-Based Agents in Competitive Minecraft Team Environments
- Proof2Silicon: Prompt Repair for Verified Code and Hardware Generation via Reinforcement Learning
- REMI: A Novel Causal Schema Memory Architecture for Personalized Lifestyle Recommendation Agents
- SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
- From Implicit Exploration to Structured Reasoning: Leveraging Guideline and Refinement for LLMs
- Can AI Make Energy Retrofit Decisions? An Evaluation of Large Language Models
- Large Language Models as Virtual Survey Respondents: Evaluating Sociodemographic Response Generation
- Evaluating Multi-Turn Bargain Skills in LLM-Based Seller Agent
- Teaching AI Stepwise Diagnostic Reasoning with Report-Guided Chain-of-Thought Learning
- Tree of Agents: Improving Long-Context Capabilities of Large Language Models through Multi-Perspective Reasoning
- Attention of a Kiss: Exploring Attention Maps in Video Diffusion for XAIxArts
- MVRS: The Multimodal Virtual Reality Stimuli-based Emotion Recognition Dataset
- Benchmarking Large Language Models for Personalized Guidance in AI-Enhanced Learning
- SasAgent: Multi-Agent AI System for Small-Angle Scattering Data Analysis
Research Sources: 484 | Generated: 9/9/2025