Adrián Bazaga's Homepage

PhD Candidate @ University of Cambridge

Welcome to my website!

I am a final-year PhD student at the University of Cambridge, working on multimodality and language models, supervised by Prof. Pietro Liò and Prof. Gos Micklem. Before this, I obtained my Bachelor's degree in Computer Science from the University of La Laguna in 2017 and my Master's degree in Machine Learning from the Polytechnic University of Catalonia (UPC) in 2019.

My research focuses on devising architectures for NLP systems, with an emphasis on multi-modality and Large Language Models 🤖 for solving both fundamental and real-world problems 🧪. On the engineering side, I design software systems that streamline research and optimize and scale ML pipelines. I am proficient at translating user requirements into software, and my diverse experience enables me to thrive in cross-disciplinary environments. I am committed to conducting high-impact research and open to collaborations 👐.

Tangentially, I've been exploring how advancements in natural language processing, specifically generative models, can enhance and support modern education systems and government. If you are interested in discussing this, please contact me.

In my free time, you can catch me training in the gym, enjoying nature, or tasting new dishes.

News

Apr 3, 2024 I will join Microsoft Research as a Research Scientist Intern next month. Thrilled to embark on this exciting journey, working with Javier Zazo and Ed Meeds, as well as the fantastic team at Cambridge, UK. ⭐
Mar 16, 2024 [Paper] Our paper “Language Model Knowledge Distillation for Efficient Question Answering in Spanish” has been accepted at ICLR 2024 Tiny Papers Workshop 🎉
Mar 6, 2024 [Paper] Our paper “Unsupervised Pretraining for Fact Verification by Language Model Distillation” has been accepted at ICLR 2024 🎉
Feb 15, 2024 I was granted “Visiting Student” status in the Computer Science & Technology department under Pietro Liò's group. I'm working on tabular LLMs for small tabular classification problems and on LLMs for modeling partial differential equations 🚀
Feb 13, 2024 Our paper “HyperBERT: Mixing Hypergraph-Aware Layers with Language Models for Node Classification on Text-Attributed Hypergraphs” is now on arXiv. 📋
Oct 27, 2023 Our paper “SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation” is now on arXiv. 📋
Sep 16, 2021 [Paper] Our paper “Translating synthetic natural language to database queries with a polyglot deep learning framework” has been accepted at Nature Scientific Reports 🎉
Dec 1, 2020 I was granted a Senior Scholarship Award by Fitzwilliam College in recognition of the progress on my PhD research. 🏆

Selected Publications

  1. ICLR
    Unsupervised Pretraining for Fact Verification by Language Model Distillation
    Adrián Bazaga, Pietro Liò, and Gos Micklem
    In International Conference on Learning Representations (ICLR 2024), 2024
  2. arXiv
    SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation
    Adrián Bazaga, Pietro Liò, and Gos Micklem
    In arXiv:2310.18376, 2023
  3. ICLR
    Language Model Knowledge Distillation for Efficient Question Answering in Spanish
    Adrián Bazaga, Pietro Liò, and Gos Micklem
    In International Conference on Learning Representations (ICLR 2024 Tiny Papers), 2024
  4. arXiv
    HyperBERT: Mixing Hypergraph-Aware Layers with Language Models for Node Classification on Text-Attributed Hypergraphs
    Adrián Bazaga, Pietro Liò, and Gos Micklem
    In arXiv:2402.07309, 2024