👋 Hello!

I'm Nan Yang.

Yes, my name may read as 'NaN'—Not a Number—but I work with data and numbers every single day as a researcher in empirical software engineering!

About Me

I am a researcher at the Dutch National Applied Research Organization (TNO-ESI) and a guest researcher at Eindhoven University of Technology. My work lies at the intersection of empirical software engineering, legacy system modernization, and open-source ecosystems. I mine software repositories, conduct surveys and interviews, and perform experiments, exploring both social and technical aspects of software development in industry and open-source contexts. I obtained my Ph.D. in Computer Science from TU Eindhoven in 2023.

Researcher investigating logs and automation data

This illustration, used as the cover of my Ph.D. thesis, shows a curious engineer inspecting execution data generated by an industrial production system.

To understand how engineers derive insights from execution data, I interviewed 39 engineers working on large-scale production software systems. One of them said something that stayed with me:

“At times the machines could be like a giant blackbox for us and we have to use all kinds of tools to get some insight into what’s going on inside.”

That quote captures the heart of my research. I’ve been trying to understand what tools and processes really support engineers in making sense of these complex systems.

I started with log analysis and model inference techniques to uncover behavioral insights from execution data — and now, I continue this passion by applying static analysis and large language models to help make complex systems more understandable, explainable, and eventually more maintainable.

Research

Industry-Driven Projects

LLMs for Legacy Systems (Philips, Feb 2024 – Present)

Apply large language models with static analysis to support developer queries on complex legacy codebases.

[Project description]

Synthesis-Based Engineering (ASML, Jul 2023 – Feb 2024)

Investigated the feasibility of synthesis-based engineering approaches in an industry setting.

[Publication]

Log Analysis Practices (ASML, May 2019 – Mar 2022)

Conducted interviews and analysis across five companies to understand current practices and challenges in log analysis for embedded systems.

[Publication]

Model-Driven Software Engineering Practices (ASML, May 2019 – Mar 2022)

Mined industrial model repositories to understand how developers model their software with an industrial MDSE tool (called ASD) and what limitations such MDSE tools have.

[Publication]

Model Learning (ASML, Feb 2017 – Mar 2018)

Combined active learning and passive learning to infer behavioral models from software systems.

[Publication]

Academic Project

Open Source Ecosystems and Software Foundations (Ongoing hobby project)

Study collaboration dynamics, project survivability, and the impact of governance structures in large-scale open source ecosystems such as the Apache and Eclipse foundations.

[Publication]

Service

🧑‍🏫 Teaching

  • Data Analytics for Engineers (2023) – Bachelor’s course, Eindhoven University of Technology (Teaching Assistant)
  • Real-time Systems (2019–2021) – Master’s course, Eindhoven University of Technology (Teaching Assistant)

🧪 Reviewing

  • ICSE 2025 – Program Committee (Software Engineering in Practice track)
  • MSR 2024 – Program Committee (Data and Tool Showcase track)
  • NWO ICT.OPEN 2024 – Program Committee (Mastering Complexity for Cyber-Physical Systems track)
  • MSR 2023 – Junior Program Committee
  • MSR 2021 – Shadow Program Committee

🎤 Organizing

  • NWO ICT.OPEN 2025 – Track Chair (Mastering Complexity for Cyber-Physical Systems track)
  • ESI Symposium 2025 – Organizing Committee

🤝 Volunteering

  • ICSE 2019–2020 – Student Volunteer at the 41st and 42nd International Conference on Software Engineering

Publications

Contact Me

☕ I’m always up for a coffee and a good chat! 📬 Feel free to reach out via email or connect with me on LinkedIn!