Apple ML Research

🍎 AI Labs

What Matters in Practical Learned Image Compression

One of the major differentiators unlocked by learned codecs relative to their hard-coded traditional counterparts is their ability to be optimized directly to appeal to…

#image compression #learned codecs #neural architecture search

🕐 22 hours ago

🍎 AI Labs

Text-Conditional JEPA for Learning Semantically Rich Visual Representations

Image-based Joint-Embedding Predictive Architecture (I-JEPA) offers a promising approach to visual self-supervised learning through masked feature prediction. However, with the inherent visual uncertainty at masked…

#self-supervised learning #vision-language models #visual representations

🕐 22 hours ago

🍎 AI Labs

SpecMD: A Comprehensive Study on Speculative Expert Prefetching

Mixture-of-Experts (MoE) models enable sparse expert activation, meaning that only a subset of the model’s parameters is used during each inference. However, to translate this…

#moe #caching #inference

🕐 a day ago

🍎 AI Labs

Normalizing Flows with Iterative Denoising

Normalizing Flows (NFs) are a classical family of likelihood-based methods that have received revived attention. Recent efforts such as TARFlow have shown that NFs are…

#normalizing flows #generative models #image synthesis

🕐 a day ago

🍎 AI Labs

From Where Things Are to What They’re For: Benchmarking Spatial–Functional Intelligence for Multimodal LLMs

True spatial intelligence for multimodal agents transcends low-level geometric perception, evolving from knowing where things are to understanding what they are for. While existing benchmarks,…

#multimodal llm #spatial reasoning #benchmark

🕐 a day ago

🍎 AI Labs

Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing

Serving transformer language models with high throughput requires caching Key-Values (KVs) to avoid redundant computation during autoregressive generation. The memory footprint of KV caching is…

#llm #kv-cache #memory-optimization

🕐 2 days ago

🍎 AI Labs

PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning

Multi-tool-integrated reasoning enables LLM-empowered tool-use agents to solve complex tasks by interleaving natural-language reasoning with calls to external tools. However, training such agents using outcome-only…

#multi-tool reasoning #reinforcement learning #policy optimization

🕐 3 days ago

🍎 AI Labs

Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents

This paper was accepted at the Fifth Workshop on Natural Language Generation, Evaluation, and Metrics at ACL 2026. Tool-calling agents are evaluated on tool selection,…

#tool-calling agents #agent feedback #inference-time evaluation

🕐 6 days ago

🍎 AI Labs

International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2026

Apple is presenting new research at the annual International Conference on Acoustics, Speech and Signal Processing (ICASSP), which takes place in person in Barcelona,…

#speech processing #signal processing #acoustics

🕐 7 days ago

🍎 AI Labs

STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flows

Normalizing flows (NFs) are end-to-end likelihood-based generative models for continuous data, and have recently regained attention with encouraging progress on image generation. Yet in the…

#video generation #normalizing flows #generative models

🕐 7 days ago

Latest from Apple ML Research