Adrián Bazaga's Homepage

AI Research Scientist @ Microsoft

Welcome to my website!

Background:
I’m a Research Scientist at Microsoft, where I develop cutting-edge foundational Small Language Models (SLMs) that power intelligent, agentic AI experiences directly on users’ devices for millions of people worldwide. My work focuses on advancing the frontier of on-device intelligence, bringing fast, capable AI to life at global scale.

I’m a core contributor to Mu, Microsoft’s blazing-fast on-device SLM, where I played a central role in developing the pretraining, mid-training, and post-training pipelines. I also co-led the development of the Windows Settings AI agent, which is already live on Windows Copilot+ devices. These efforts are part of my broader vision of enabling seamless, deeply integrated AI experiences for everyone.

I hold a Ph.D. in Machine Learning from the University of Cambridge, where I conducted research under the supervision of Prof. Pietro Liò and Prof. Gos Micklem. My work has been published at leading Machine Learning conferences such as ICLR, ICML, ACL, and EMNLP, as well as in Nature journals. Previously, I gained research experience through internships at Microsoft Research and Amazon AGI, where I explored novel training schemes to enhance few-step generation in diffusion models, and test-time scaling for temporal reasoning with Large Language Models (LLMs). Before that, I spent ~5 years at various startups, working at the intersection of AI and biology.

Research Interests:
My research centers on advancing AI in the areas of foundational LLMs, reasoning, and multimodality. I’m particularly interested in developing architectures that broaden the applicability of generative models and integrate diverse data modalities to tackle both core challenges and real-world problems 🧪. My current focus is on building small-scale foundational language models with agentic capabilities: extending them to new modalities, refining training methodologies, improving inference efficiency, and ensuring robustness and alignment. I’m passionate about impactful, global-scale AI and open to collaborations that push its boundaries 👐.

Beyond Research:
In addition to my core research, I’m interested in how advances in generative models can revolutionize education and governance. I’m deeply committed to conducting research with significant real-world impact and am eager to engage in discussions and collaborations that align with these goals.

News

Jun 23, 2025 We have launched Mu, our 0.3B ‘micro-size’ language model, built for blazing-fast on-device inference and already powering native agentic experiences on Windows devices. 🚀
Jun 1, 2025 [Paper] Our paper “Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models” has been accepted at ACL 2025 (Main) 🎉
Jan 3, 2025 I joined Microsoft as an AI Research Scientist in London (UK). Excited to work on delivering on-device LLM-based AI experiences for millions of users worldwide. ⭐
Sep 20, 2024 [Paper] Our paper “HyperBERT: Mixing Hypergraph-Aware Layers with Language Models for Node Classification on Text-Attributed Hypergraphs” has been accepted at EMNLP 2024 🎉
Aug 15, 2024 I joined the Amazon AGI team as a Research Scientist Intern to work on test-time scaling for temporal reasoning with LLMs alongside Bill Byrne, Rexhina Blloshmi and Adrià de Gispert, in Berlin (Germany). ⭐
Aug 13, 2024 I’m now serving as a reviewer for the International Conference on Learning Representations (ICLR) and the ACL conferences. 👍
Jun 16, 2024 Our paper “FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models” is now on arXiv. 📋
Jun 5, 2024 [Paper] Our paper “TabMDA: Tabular Manifold Data Augmentation for Any Classifier using Transformers with In-context Subsetting” has been accepted at ICML 2024 🎉
Jun 1, 2024 I received a PhD Student Award from the Cambridge Society for the Application of Research in recognition of outstanding research with real-world application, for my work on language-graph weakly supervised distillation for dense retrieval. 🏅
May 1, 2024 I joined Microsoft Research as a Research Scientist Intern to work on improved few-step generation for diffusion models with Javier Zazo, Richard Turner and Ted Meeds, in Cambridge (UK). ⭐

Selected Publications

  1. ICLR
    Unsupervised Pretraining for Fact Verification by Language Model Distillation
    Adrián Bazaga, Pietro Liò, and Gos Micklem
    In ICLR (International Conference on Learning Representations), 2024
  2. arXiv
    SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation
    Adrián Bazaga, Pietro Liò, and Gos Micklem
    arXiv:2310.18376, 2023
  3. ICLR
    Language Model Knowledge Distillation for Efficient Question Answering in Spanish
    Adrián Bazaga, Pietro Liò, and Gos Micklem
    In ICLR (International Conference on Learning Representations), 2024
  4. EMNLP
    HyperBERT: Mixing Hypergraph-Aware Layers with Language Models for Node Classification on Text-Attributed Hypergraphs
    Adrián Bazaga, Pietro Liò, and Gos Micklem
    In EMNLP (Empirical Methods in Natural Language Processing), 2024
  5. ICML
    TabMDA: Tabular Manifold Data Augmentation for Any Classifier using Transformers with In-context Subsetting
    Andrei Margeloiu, Adrián Bazaga, Nikola Simidjievski, Pietro Liò, and Mateja Jamnik
    In ICML (International Conference on Machine Learning), 2024
  6. arXiv
    FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models
    Max Zhu, Adrián Bazaga, and Pietro Liò
    arXiv:2406.04501, 2024
  7. ACL
    Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models
    Adrián Bazaga, Rexhina Blloshmi, Bill Byrne, and Adrià de Gispert
    In ACL (Association for Computational Linguistics), 2025