AI RESEARCH PAPERS & ACADEMIC SOURCES
- Robust Decision-Making Via Free Energy Minimization
- Contra4: Evaluating Contrastive Cross-Modal Reasoning in Audio, Video, Image, and 3D
- Executable Ontologies: Synthesizing Event Semantics with Dataflow Architecture
- Agentic Lybic: Multi-Agent Execution System with Tiered Reasoning and Orchestration
- Learning Environment-Aware Affordance for 3D Articulated Object Manipulation under Occlusions
- A Statistical Analysis of Deep Federated Learning for Intrinsically Low-dimensional Data
- Multi-task and few-shot learning in virtual flow metering
- Minimax optimal transfer learning for high-dimensional additive regression
- Understanding Boolean Function Learnability on Deep Neural Networks: PAC Learning Meets Neurosymbolic Models
- Learning from a Biased Sample
- Finite Neural Networks as Mixtures of Gaussian Processes: From Provable Error Bounds to Prior Selection
- Convex Regularization and Convergence of Policy Gradient Flows under Safety Constraints
- A Stable Measure for Conditional Periodicity of Time Series using Persistent Homology
- Understanding Generalization in Physics Informed Models through Affine Variety Dimensions
- InfoGain Wavelets: Furthering the Design of Graph Diffusion Wavelets
- Diagnosis for Less-Prevalent Thyroid Carcinoma Subtype Using a Dual-Branch Attention Deep Network with Ultrasound Images
- TransDiffuser: Diverse Trajectory Generation with Decorrelated Multi-modal Representation for End-to-end Autonomous Driving
- ForceVLA: Enhancing VLA Models with a Force-aware MoE for Contact-rich Manipulation
- Can Generalist Vision Language Models (VLMs) Rival Specialist Medical VLMs? Benchmarking and Strategic Insights
- PBPK-iPINNs: Inverse Physics-Informed Neural Networks for Physiologically Based Pharmacokinetic Brain Models
- SURGIN: SURrogate-guided Generative INversion for subsurface multiphase flow with quantified uncertainty
- Jackknife Variance Estimation for Hájek-Dominated Generalized U-Statistics
- Reduced Order Modeling of Energetic Materials Using Physics-Aware Recurrent Convolutional Neural Networks in a Latent Space (LatentPARC)
- Bayesian Parametric Matrix Models: Principled Uncertainty Quantification for Spectral Learning
- Selective Risk Certification for LLM Outputs via Information-Lift Statistics: PAC-Bayes, Robustness, and Skeleton Design
- Power-Dominance in Estimation Theory: A Third Pathological Axis
- Fast reconstruction of degenerate populations of conductance-based neuron models from spike times
- Modeling nonstationary spatial processes with normalizing flows
- Gaussian Mixture Model with unknown diagonal covariances via continuous sparse regularization
- Reversible Deep Equilibrium Models
- Causal Discovery via Quantile Partial Effect
- Optimal Conformal Prediction, E-values, Fuzzy Prediction Sets and Subsequent Decisions
- Learning Discrete Bayesian Networks with Hierarchical Dirichlet Shrinkage
- Palmprint De-Identification Using Diffusion Model for High-Quality and Diverse Synthesis
- HoloDx: Knowledge- and Data-Driven Multimodal Diagnosis of Alzheimer's Disease
- Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation
- WorldExplorer: Towards Generating Fully Navigable 3D Scenes
- MedEBench: Diagnosing Reliability in Text-Guided Medical Image Editing
- AMF-MedIT: An Efficient Align-Modulation-Fusion Framework for Medical Image-Tabular Data
- Taming Anomalies with Down-Up Sampling Networks: Group Center Preserving Reconstruction for 3D Anomaly Detection
- ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way
- SlumpGuard: An AI-Powered Real-Time System for Automated Concrete Slump Prediction via Video Analysis
- Test-Time Canonicalization by Foundation Models for Robust Perception
- Revisiting Transferable Adversarial Images: Systemization, Evaluation, and New Insights
- Pitfalls of defacing whole-head MRI: re-identification risk with diffusion models and compromised research potential
- StyleSculptor: Zero-Shot Style-Controllable 3D Asset Generation with Texture-Geometry Dual Guidance
- 3D Aware Region Prompted Vision Language Model
- Neural Diffeomorphic-Neural Operator for Residual Stress-Induced Deformation Prediction
- InJecteD: Analyzing Trajectories and Drift Dynamics in Denoising Diffusion Probabilistic Models for 2D Point Cloud Generation
- Enhancing Radiographic Disease Detection with MetaCheX, a Context-Aware Multimodal Model
- Universal Gröbner Bases of (Universal) Multiview Ideals
- Neural 3D Object Reconstruction with Small-Scale Unmanned Aerial Vehicles
- iCD: An Implicit Clustering Distillation Method for Structural Information Mining
- Generalizable Holographic Reconstruction via Amplitude-Only Diffusion Priors
- Unleashing the Power of Discrete-Time State Representation: Ultrafast Target-based IMU-Camera Spatial-Temporal Calibration
- Tool-R1: Sample-Efficient Reinforcement Learning for Agentic Tool Use
- QDFlow: A Python package for physics simulations of quantum dot devices
- MEIL-NeRF: Memory-Efficient Incremental Learning of Neural Radiance Fields
- Optimal Transport Based Unsupervised Restoration Learning Exploiting Degradation Sparsity
- Detection of Synthetic Face Images: Accuracy, Robustness, Generalization
- 3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark
- Gradient-Free Adversarial Purification with Diffusion Models
- IMPROVE: Iterative Model Pipeline Refinement and Optimization Leveraging LLM Experts
- Semantic-ICP: Iterative Closest Point for Non-rigid Multi-Organ Point Cloud Registration
- HierRelTriple: Guiding Indoor Layout Generation with Hierarchical Relationship Triplet Losses
- ICDAR 2025 Competition on FEw-Shot Text line segmentation of ancient handwritten documents (FEST)
- SHREC 2025: Protein surface shape retrieval including electrostatic potential
- Improving Accuracy and Efficiency of Implicit Neural Representations: Making SIREN a WINNER
- PANORAMA: The Rise of Omnidirectional Vision in the Embodied AI Era
- Brought a Gun to a Knife Fight: Modern VFM Baselines Outgun Specialized Detectors on In-the-Wild AI Image Detection
- Drone Detection Using a Low-Power Neuromorphic Virtual Tripwire
- Dream3DAvatar: Text-Controlled 3D Avatar Reconstruction from a Single Image
- HERO: Rethinking Visual Token Early Dropping in High-Resolution Large Vision-Language Models
- Using KL-Divergence to Focus Frequency Information in Low-Light Image Enhancement
- Enhancing Dual Network Based Semi-Supervised Medical Image Segmentation with Uncertainty-Guided Pseudo-Labeling
- A Synthetic Data Pipeline for Supporting Manufacturing SMEs in Visual Assembly Control
- Weakly and Self-Supervised Class-Agnostic Motion Prediction for Autonomous Driving
- Advancing Real-World Parking Slot Detection with Large-Scale Dataset and Semi-Supervised Baseline
- MSDNet: Efficient 4D Radar Super-Resolution via Multi-Stage Distillation
- TexTAR: Textual Attribute Recognition in Multi-domain and Multi-lingual Document Images
- Enhancing Video Large Language Models with Structured Multi-Video Collaborative Reasoning (early version)
- WHU-STree: A Multi-modal Benchmark Dataset for Street Tree Inventory
- More performant and scalable: Rethinking contrastive vision-language pre-training of radiology in the LLM era
- Road Obstacle Video Segmentation
- Vi-SAFE: A Spatial-Temporal Framework for Efficient Violence Detection in Public Surveillance
- End4: End-to-end Denoising Diffusion for Diffusion-Based Inpainting Detection
- Intelligent Vacuum Thermoforming Process
- Image Realness Assessment and Localization with Multimodal Features
- BATR-FST: Bi-Level Adaptive Token Refinement for Few-Shot Transformers
- Modeling the Multivariate Relationship with Contextualized Representations for Effective Human-Object Interaction Detection
- Double Helix Diffusion for Cross-Domain Anomaly Image Generation
- Superpixel Anything: A general object-based framework for accurate yet regular superpixel segmentation
- Hunyuan3D Studio: End-to-End AI Pipeline for Game-Ready 3D Asset Generation
- SAGA: Selective Adaptive Gating for Efficient and Expressive Linear Attention
- Exploring Metric Fusion for Evaluation of NeRFs
- Leveraging Large Language Models to Effectively Generate Visual Data for Canine Musculoskeletal Diagnoses
- Cumulative Consensus Score: Label-Free and Model-Agnostic Evaluation of Object Detectors in Deployment
- Few to Big: Prototype Expansion Network via Diffusion Learner for Point Cloud Few-shot Semantic Segmentation
- Lego-Edit: A General Image Editing Framework with Model-Level Bricks and MLLM Builder
- MEJO: MLLM-Engaged Surgical Triplet Recognition via Inter- and Intra-Task Joint Optimization
- DialNav: Multi-turn Dialog Navigation with a Remote Guide
- MSGFusion: Multimodal Scene Graph-Guided Infrared and Visible Image Fusion
- AREPAS: Anomaly Detection in Fine-Grained Anatomy with Reconstruction-Based Semantic Patch-Scoring
- T-SiamTPN: Temporal Siamese Transformer Pyramid Networks for Robust and Efficient UAV Tracking
- A Novel Compression Framework for YOLOv8: Achieving Real-Time Aerial Object Detection on Edge Devices via Structured Pruning and Channel-Wise Distillation
- MATTER: Multiscale Attention for Registration Error Regression
- 4DRadar-GS: Self-Supervised Dynamic Driving Scene Reconstruction with 4D Radar
- Beyond Averages: Open-Vocabulary 3D Scene Understanding with Gaussian Splatting and Bag of Embeddings
- Time-step Mixup for Efficient Spiking Knowledge Transfer from Appearance to Event Domain
- MMMS: Multi-Modal Multi-Surface Interactive Segmentation
- Evaluating Robustness of Vision-Language Models Under Noisy Conditions
- Instance-Guided Class Activation Mapping for Weakly Supervised Semantic Segmentation
- Artist-Created Mesh Generation from Raw Observation
- Axis-Aligned 3D Stalk Diameter Estimation from RGB-D Imagery
- Neural Collapse-Inspired Multi-Label Federated Learning under Label-Distribution Skew
- Agent4FaceForgery: Multi-Agent LLM Framework for Realistic Face Forgery Detection
- Explicit Multimodal Graph Modeling for Human-Object Interaction Detection
- VQT-Light: Lightweight HDR Illumination Map Prediction with Richer Texture
- Exploring Spectral Characteristics for Single Image Reflection Removal
- Maps for Autonomous Driving: Full-process Survey and Frontiers
- StereoCarla: A High-Fidelity Driving Dataset for Generalizable Stereo
- SmokeBench: A Real-World Dataset for Surveillance Image Desmoking in Early-Stage Fire Scenes
- RIS-FUSION: Rethinking Text-Driven Infrared and Visible Image Fusion from the Perspective of Referring Image Segmentation
- Learning by Imagining: Debiased Feature Augmentation for Compositional Zero-Shot Learning
- AsyMoE: Leveraging Modal Asymmetry for Enhanced Expert Specialization in Large Vision-Language Models
- EvoEmpirBench: Dynamic Spatial Reasoning with Agent-ExpVer
- SPGen: Spherical Projection as Consistent and Flexible Representation for Single Image 3D Shape Generation
- Effective Gaussian Management for High-fidelity Object Reconstruction
- Modelling and analysis of the 8 filters from the "master key filters hypothesis" for depthwise-separable deep networks in relation to idealized receptive fields based on scale-space theory
- What Makes a Good Generated Image? Investigating Human and Multimodal LLM Image Preference Alignment
- Recurrent Cross-View Object Geo-Localization
- A-TDOM: Active TDOM via On-the-Fly 3DGS
- DyGLNet: Hybrid Global-Local Feature Fusion with Dynamic Upsampling for Medical Image Segmentation
- SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning
- Evaluating the Robustness of Open-Source Vision-Language Models to Domain Shift in Object Captioning
- Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection
- Teaching Vision-Language Models to Ask: Resolving Ambiguity in Visual Questions
- TSPC: A Two-Stage Phoneme-Centric Architecture for code-switching Vietnamese-English Speech Recognition
- Artificial Intelligence in Breast Cancer Care: Transforming Preoperative Planning and Patient Education with 3D Reconstruction
- EfficientNet-Based Multi-Class Detection of Real, Deepfake, and Plastic Surgery Faces
- Uncertainty-Aware Hourly Air Temperature Mapping at 2 km Resolution via Physics-Guided Deep Learning
- DS@GT AnimalCLEF: Triplet Learning over ViT Manifolds with Nearest Neighbor Classification for Animal Re-identification
- From Orthomosaics to Raw UAV Imagery: Enhancing Palm Detection and Crown-Center Localization
- DYNAMO: Dependency-Aware Deep Learning Framework for Articulated Assembly Motion Prediction
- Cott-ADNet: Lightweight Real-Time Cotton Boll and Flower Detection Under Field Conditions
- Deep learning for 3D point cloud processing - from approaches, tasks to its implications on urban and environmental applications
- Two-Stage Decoupling Framework for Variable-Length Glaucoma Prognosis
- Image Tokenizer Needs Post-Training
- Towards Foundational Models for Single-Chip Radar
- The Strawberry Problem: Emergence of Character-level Understanding in Tokenized Language Models
- Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering
- From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs
- Reading Between the Prompts: How Stereotypes Shape LLM's Implicit Personalization
- PatentScore: Multi-dimensional Evaluation of LLM-Generated Patent Claims
- Counterfactual Simulatability of LLM Explanations for Generation Tasks
- UniversalCEFR: Enabling Open Multilingual Research on Language Proficiency Assessment
- From Understanding to Generation: An Efficient Shortcut for Evaluating Language Models
- EIFBENCH: Extremely Complex Instruction Following Benchmark for Large Language Models
- Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-k
- Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation
- TAPS: Tool-Augmented Personalisation via Structured Tagging
- ToM-SSI: Evaluating Theory of Mind in Situated Social Interactions
- ICR: Iterative Clarification and Rewriting for Conversational Search
- A Novel Recurrent Neural Network Framework for Prediction and Treatment of Oncogenic Mutation Progression
- Similarity-Distance-Magnitude Activations
- Rethinking the Evaluation of Alignment Methods: Insights into Diversity, Generalisation, and Safety
- When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-Tuning
- Textarium: Entangling Annotation, Abstraction and Argument
- Podcasts as a Medium for Participation in Collective Action: A Case Study of Black Lives Matter
- WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning
- Do predictability factors towards signing avatars hold across cultures?
- Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue Evaluation
- Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching
- Break the Checkbox: Challenging Closed-Style Evaluations of Cultural Alignment in LLMs
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs
- How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild
- Teaching Your Models to Understand Code via Focal Preference Alignment
- Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
- Dynamic Relation Inference via Verb Embeddings
- Why Stop at One Error? Benchmarking LLMs as Data Science Code Debuggers for Multi-Hop and Multi-Bug Errors
- Is the Top Still Spinning? Evaluating Subjectivity in Narrative Understanding
- Do Large Language Models Truly Grasp Addition? A Rule-Focused Diagnostic Using Two-Integer Arithmetic
- Game-RL: Synthesizing Verifiable Game Tasks at Scale to Boost VLMs General Reasoning
- The Better You Learn, The Smarter You Prune: Towards Efficient Vision-language-action Models via Differentiable Token Pruning
- Benchmarking and Improving LVLMs on Event Extraction from Multimedia Documents
- Automated Generation of Research Workflows from Academic Papers: A Full-text Mining Framework
- Do LLMs Understand Wine Descriptors Across Cultures? A Benchmark for Cultural Adaptations of Wine Reviews
- SitLLM: Large Language Models for Sitting Posture Health Understanding via Pressure Sensor Data
- Empowering LLMs with Parameterized Skills for Adversarial Long-Horizon Planning
- LLM Hallucination Detection: A Fast Fourier Transform Method Based on Hidden Layer Temporal Signals
- The Few-shot Dilemma: Over-prompting Large Language Models
- Evaluating LLM Alignment on Personality Inference from Real-World Interview Data
- ChartGaze: Enhancing Chart Understanding in LVLMs with Eye-Tracking Guided Attention Refinement
- WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents
- Scaling Agents via Continual Pre-training
- Towards General Agentic Intelligence via Environment Scaling
- WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research
- ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
- Do Natural Language Descriptions of Model Activations Convey Privileged Information?
- Exact Coset Sampling for Quantum Lattice Algorithms
- Context-Aware Language Models for Forecasting Market Impact from Sequences of Financial News
- The Adaptation Paradox: Agency vs. Mimicry in Companion Chatbots
- LEAF: Knowledge Distillation of Text Embedding Models with Teacher-Aligned Representations
- Yet Another Watermark for Large Language Models
- MTEB-NL and E5-NL: Embedding Benchmark and Models for Dutch
- LLM-as-a-Judge: Rapid Evaluation of Legal Document Recommendation for Retrieval-Augmented Generation
- SENTRA: Selected-Next-Token Transformer for LLM Text Detection
- MORQA: Benchmarking Evaluation Metrics for Medical Open-Ended Question Answering
- Topic Coverage-based Demonstration Retrieval for In-Context Learning
- Does Language Model Understand Language?
- Audited Reasoning Refinement: Fine-Tuning Language Models via LLM-Guided Step-Wise Evaluation and Correction
- A comparison of pipelines for the translation of a low resource language based on transformers
- MAGIC-Enhanced Keyword Prompting for Zero-Shot Audio Captioning with CLIP Models
- PAC: Pronunciation-Aware Contextualized Large Language Model-based Automatic Speech Recognition
- Mitigating Strategy Preference Bias in Emotional Support Conversation via Uncertainty Estimations
- Chat-Driven Text Generation and Interaction for Person Retrieval
- Towards Inclusive Toxic Content Moderation: Addressing Vulnerabilities to Adversarial Attacks in Toxicity Classifiers Tackling LLM-generated Content
- Case-Based Decision-Theoretic Decoding with Quality Memories
- HistoryBankQA: Multilingual Temporal Question Answering on Historical Events
- Contrastive Learning with Enhanced Abstract Representations using Grouped Loss of Abstract Semantic Supervision
- ConvergeWriter: Data-Driven Bottom-Up Article Construction
- Data Augmentation for Maltese NLP using Transliterated and Machine Translated Arabic Data
- EMOE: A Framework for Out-of-distribution Uncertainty Based Rejection via Model-Agnostic Expansive Matching of Experts
- Informed Correctors for Discrete Diffusion Models
- RingMo-Aerial: An Aerial Remote Sensing Foundation Model With Affine Transformation Contrastive Learning
- Context-Aware Membership Inference Attacks against Pre-trained Large Language Models
- TRANSAGENT: An LLM-Based Multi-Agent System for Code Translation
- T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design
- Responsible AI in NLP: GUS-Net Span-Level Bias Detection Dataset and Benchmark for Generalizations, Unfairness, and Stereotypes
- The Belief State Transformer
- Adversarial Prompt Distillation for Vision-Language Models
- On the Correlation between Individual Fairness and Predictive Accuracy in Probabilistic Models
- B-TGAT: A Bi-directional Temporal Graph Attention Transformer for Clustering Multivariate Spatiotemporal Data
- Rich Vehicle Routing Problem with diverse Vertices allowing Hierarchical and Multimodal Time-Dependant Transhipment of multiple Node- Vehicle- compatible Cargo with Cascaded Time-Minimization Objective for Emergency Decision Support Systems
- Metacognitive Reuse: Turning Recurring LLM Reasoning Into Concise Behaviors
- JANUS: A Dual-Constraint Generative Framework for Stealthy Node Injection Attacks
- RadGame: An AI-Powered Platform for Radiology Education
- Concurrent Linguistic Error Detection (CLED): a New Methodology for Error Detection in Large Language Models
- Probing LLM Hallucination from Within: Perturbation-Driven Approach via Internal Knowledge
- CredID: Credible Multi-Bit Watermark for Large Language Models Identification
- Comprehend, Divide, and Conquer: Feature Subspace Exploration via Multi-Agent Hierarchical Reinforcement Learning
- Random Rule Forest (RRF): Interpretable Ensembles of LLM-Generated Questions for Predicting Startup Success
- Small Language Models are the Future of Agentic AI
- TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems
- The LLM Already Knows: Estimating LLM-Perceived Question Difficulty via Hidden Representations
- Conan-Embedding-v2: Training an LLM from Scratch for Text Embeddings
- Cross-Layer Vision Smoothing: Enhancing Visual Understanding via Sustained Focus on Key Objects in Large Vision-Language Models
- All Roads Lead to Rome: Graph-Based Confidence Estimation for Large Language Model Reasoning
- Jailbreaking Large Language Models Through Content Concretization
- Sy-FAR: Symmetry-based Fair Adversarial Robustness
- FusionMAE: large-scale pretrained model to optimize and simplify diagnostic and control of fusion plasma
- Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models
- Out of Distribution Detection in Self-adaptive Robots with AI-powered Digital Twins
- Dual-Stage Reweighted MoE for Long-Tailed Egocentric Mistake Detection
- Bridging Performance Gaps for Foundation Models: A Post-Training Strategy for ECGFounder
- xOffense: An AI-driven autonomous penetration testing framework with offensive knowledge-enhanced LLMs and multi agent systems
- Validating Solidity Code Defects using Symbolic and Concrete Execution powered by Large Language Models
- GView: A Survey of Binary Forensics via Visual, Semantic, and AI-Enhanced Analysis
- Perception Before Reasoning: Two-Stage Reinforcement Learning for Visual Reasoning in Vision-Language Models
- MIA-EPT: Membership Inference Attack via Error Prediction for Tabular Data
- Multi-Model Synthetic Training for Mission-Critical Small Language Models
- TFANet: Three-Stage Image-Text Feature Alignment Network for Robust Referring Image Segmentation
- A Design Co-Pilot for Task-Tailored Manipulators
- Shaping Explanations: Semantic Reward Modeling with Encoder-Only Transformers for GRPO
- An Uncertainty-Weighted Decision Transformer for Navigation in Dense, Complex Driving Scenarios
- Beyond Artificial Misalignment: Detecting and Grounding Semantic-Coordinated Multimodal Manipulations
- Exact alternative optima for nonlinear optimization problems defined with maximum component objective function constrained by the Sugeno-Weber fuzzy relational inequalities
- Instance-level Randomization: Toward More Stable LLM Evaluations
- Joint AoI and Handover Optimization in Space-Air-Ground Integrated Network
- Defense-to-Attack: Bypassing Weak Defenses Enables Stronger Jailbreaks in Vision-Language Models
- A Graph Machine Learning Approach for Detecting Topological Patterns in Transactional Graphs
- Deep Learning for Model-Free Prediction of Thermal States of Robot Joint Motors
- Deep Generative and Discriminative Digital Twin endowed with Variational Autoencoder for Unsupervised Predictive Thermal Condition Monitoring of Physical Robots in Industry 6.0 and Society 6.0
- Toward Ownership Understanding of Objects: Active Question Generation with Large Language Model and Probabilistic Generative Model
- InfoGain-RAG: Boosting Retrieval-Augmented Generation via Document Information Gain-based Reranking and Filtering
- MEGAN: Mixture of Experts for Robust Uncertainty Estimation in Endoscopy Videos
- EmbeddedML: A New Optimized and Fast Machine Learning Library
- LLM-Based Approach for Enhancing Maintainability of Automotive Architectures
- Data Scaling Laws for Radiology Foundation Models
- A Pressure-Based Diffusion Model for Influence Maximization on Social Networks
- Multi-Robot Task Planning for Multi-Object Retrieval Tasks with Distributed On-Site Knowledge via Large Language Models
- Improving Anomalous Sound Detection with Attribute-aware Representation from Domain-adaptive Pre-training
- AI Factories: It's time to rethink the Cloud-HPC divide
- Evaluating Large Language Models for Functional and Maintainable Code in Industrial Settings: A Case Study at ASML
- Understanding Prompt Management in GitHub Repositories: A Call for Best Practices
- MedFact: Benchmarking the Fact-Checking Capabilities of Large Language Models on Chinese Medical Texts
- PromptSculptor: Multi-Agent Based Text-to-Image Prompt Optimization
- Reinforcement Learning-Based Market Making as a Stochastic Control on Non-Stationary Limit Order Book Dynamics
- DinoAtten3D: Slice-Level Attention Aggregation of DinoV2 for 3D Brain MRI Anomaly Classification
- Pre-trained Visual Representations Generalize Where it Matters in Model-Based Reinforcement Learning
- A Multimodal Foundation Model to Enhance Generalizability and Data Efficiency for Pan-cancer Prognosis Prediction
- ScaleDoc: Scaling LLM-based Predicates over Large Document Collections
- ActiveVLN: Towards Active Exploration via Multi-Turn RL in Vision-and-Language Navigation
- DoubleAgents: Exploring Mechanisms of Building Trust with Proactive AI
- A Systematic Evaluation of Parameter-Efficient Fine-Tuning Methods for the Security of Code LLMs
- Leveraging Intermediate Representations of Time Series Foundation Models for Anomaly Detection
- Don't Change My View: Ideological Bias Auditing in Large Language Models
- Modular, On-Site Solutions with Lightweight Anomaly Detection for Sustainable Nutrient Management in Agriculture
- Humor in Pixels: Benchmarking Large Multimodal Models Understanding of Online Comics
- Physics-Informed Neural Networks vs. Physics Models for Non-Invasive Glucose Monitoring: A Comparative Study Under Realistic Synthetic Conditions
- A Variational Physics-Informed Neural Network Framework Using Petrov-Galerkin Method for Solving Singularly Perturbed Boundary Value Problems
- Omni-CLST: Error-aware Curriculum Learning with guided Selective chain-of-Thought for audio question answering
- Domain Adaptive SAR Wake Detection: Leveraging Similarity Filtering and Memory Guidance
- An End to End Edge to Cloud Data and Analytics Strategy
- Linear Dimensionality Reduction for Word Embeddings in Tabular Data Classification
- Enhancing Smart Farming Through Federated Learning: A Secure, Scalable, and Efficient Approach for AI-Driven Agriculture
- An integrated process for design and control of lunar robotics using AI and simulation
- MORABLES: A Benchmark for Assessing Abstract Moral Reasoning in LLMs with Fables
- GhostNetV3-Small: A Tailored Architecture and Comparative Study of Distillation Strategies for Tiny Images
- Amulet: a Python Library for Assessing Interactions Among ML Defenses and Risks
- Causal-Symbolic Meta-Learning (CSML): Inducing Causal World Models for Few-Shot Generalization
- Evaluating the printability of stl files with ML
- The Anatomy of Alignment: Decomposing Preference Optimization by Steering Sparse Features
- Toward PDDL Planning Copilot
- A Visualized Framework for Event Cooperation with Generative Agents
- Reasoning with Preference Constraints: A Benchmark for Language Models in Many-to-One Matching Markets
- Agentic AI for Financial Crime Compliance
- Simulating Clinical AI Assistance using Multimodal LLMs: A Case Study in Diabetic Retinopathy
- A Scenario-Driven Cognitive Approach to Next-Generation AI Memory
- TinyServe: Query-Aware Cache Selection for Efficient LLM Serving
- Scaling Up Data Parallelism in Decentralized Deep Learning
- MEUV: Achieving Fine-Grained Capability Activation in Large Language Models via Mutually Exclusive Unlock Vectors
- Ratio1 -- AI meta-OS
- Learning to Route: Per-Sample Adaptive Routing for Multimodal Multitask Prediction
- Profiling LoRA/QLoRA Fine-Tuning Efficiency on Consumer GPUs: An RTX 4060 Case Study
- Towards Trustworthy Agentic IoEV: AI Agents for Explainable Cyberthreat Mitigation and State Analytics
- Flexible Multimodal Neuroimaging Fusion for Alzheimer's Disease Progression Prediction
- LLMAP: LLM-Assisted Multi-Objective Route Planning with User Preferences
- Developing an aeroponic smart experimental greenhouse for controlling irrigation and plant disease detection using deep learning and IoT
- AIssistant: An Agentic Approach for Human-AI Collaborative Scientific Work on Reviews and Perspectives in Machine Learning
- Building Coding Agents via Entropy-Enhanced Multi-Turn Preference Optimization
- Reasoning Models Can be Accurately Pruned Via Chain-of-Thought Reconstruction
- Empowering Clinical Trial Design through AI: A Randomized Evaluation of PowerGPT
- A Dimensionality-Reduced XAI Framework for Roundabout Crash Severity Insights
- zELO: ELO-inspired Training Method for Rerankers and Embedding Models
- Human + AI for Accelerating Ad Localization Evaluation
- Redefining CX with Agentic AI: Minerva CQ Case Study
- Match Chat: Real Time Generative AI and Generative Computing for Tennis
- DaSAThco: Data-Aware SAT Heuristics Combinations Optimization via Large Language Models
- Analogy-Driven Financial Chain-of-Thought (AD-FCoT): A Prompting Approach for Financial Sentiment Analysis
- Mob-based cattle weight gain forecasting using ML models
- ECG-aBcDe: Overcoming Model Dependence, Encoding ECG into a Universal Language for Any LLM
- Learn to Relax with Large Language Models: Solving Nonlinear Combinatorial Optimization Problems via Bidirectional Coevolution
- Large Language Models Imitate Logical Reasoning, but at what Cost?
- Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs
- H$^2$R: Hierarchical Hindsight Reflection for Multi-Task LLM Agents
- LTA-thinker: Latent Thought-Augmented Training Framework for Large Language Models on Complex Reasoning
- Stochastic Streets: A Walk Through Random LLM Address Generation in four European Cities
- Population Estimation using Deep Learning over Gandhinagar Urban Area
- DISPLIB: a library of train dispatching problems
- InPhyRe Discovers: Large Multimodal Models Struggle in Inductive Physical Reasoning
Research Sources: 351 | Generated: 9/17/2025