I am Sam! 👋 I'm an Assistant Teaching Professor in the Halıcıoğlu Data Science Institute at UC San Diego. I use methods from human-computer interaction (HCI) to design software tools like Pandas Tutor that help programming, statistics, and data science instructors prepare their lessons. I also wrote a textbook called Learning Data Science, published by O'Reilly Media in 2023.

My teaching faculty application materials for the 2022-2023 application cycle are linked below. CV | Teaching Statement | Research Statement | Diversity Statement

lau@ucsd.edu

github.com/samlau95 @samlau95

Site last updated: Nov 2024

projects

Pandas Tutor

makes step-by-step diagrams to show how Python pandas code transforms data.

teaching

UC San Diego

  • DSC 80: Practice and Application of Data Science

    Instructor: 24fa | 24sp | 23fa

  • DSC 106: Data Visualization

    Instructor: 24wi

UC Berkeley

papers

“I’m not sure, but...”: Expert Practices that Enable Effective Code Comprehension in Data Science.

Christopher Lum*, Guoxuan Xu*, Sam Lau (*equal contribution).

ACM Technical Symposium on Computer Science Education (SIGCSE),

2025.

Expert data scientists apply a range of metacognitive practices to develop deeper understanding of complicated dataframe code.

How Novices Use Program Visualizations to Understand Code that Manipulates Data Tables.

Ylesia Wu*, Qirui Zheng*, Sam Lau (*equal contribution).

ACM Technical Symposium on Computer Science Education (SIGCSE),

2025.

Program visualization tools like Pandas Tutor can help novices verify assumptions and see salient information, but diagrams can become difficult to understand without methods for managing cognitive load.

From "Ban It Till We Understand It" to "Resistance is Futile": How University Programming Instructors Plan to Adapt as More Students Use AI Code Generation and Explanation Tools such as ChatGPT and GitHub Copilot.

Sam Lau, Philip J. Guo.

ACM Conference on International Computing Education Research (ICER),

2023.

Interviews with a diverse range of instructors reveal a broad range of perspectives on how to approach effective teaching with – or in spite of – AI coding tools.

Teaching Data Science by Visualizing Data Table Transformations: Pandas Tutor for Python, Tidy Data Tutor for R, and SQL Tutor.

Sam Lau*, Sean Kross*, Eugene Wu, Philip J. Guo (*equal contribution).

International Workshop on Data Systems Education (DataEd),

2023.

The Pandas/Tidy Data/SQL Tutor tools automatically create diagrams for data science code.

Codehound: Helping Instructors Track Pedagogical Code Dependencies in Course Materials.

Sam Lau, Philip J. Guo.

ACM SIGPLAN SPLASH-E Symposium (SPLASH-E),

2022.

Tracking code use within course materials enables tools that give automated help while refactoring a course.

The Challenges of Evolving Technical Courses at Scale: Four Case Studies of Updating Large Data Science Courses.

Sam Lau, Justin Eldridge, Shannon Ellis, Aaron Fraenkel, Marina Langlois, Suraj Rampure, Janine Tiefenbruck, Philip J. Guo.

ACM Conference on Learning @ Scale (L@S),

2022.

Instructors of large technical courses run into major logistical challenges that take time away from teaching.

How Computer Science and Statistics Instructors Approach Data Science Pedagogy Differently: Three Case Studies.

Sam Lau, Deborah Nolan, Joseph Gonzalez, Philip J. Guo.

ACM Technical Symposium on Computer Science Education (SIGCSE),

2022.

Real-world case studies show tradeoffs in balancing computer science and statistics in data science teaching.

TweakIt: Supporting End-User Programmers Who Transmogrify Code.

Sam Lau, Sruti Srinivasa Ragavan, Ken Milne, Titus Barik, Advait Sarkar.

ACM Conference on Human Factors in Computing Systems (CHI),

2021.

Placing live previews of code outputs directly in spreadsheets enables data analysts to tweak and reuse Python examples without needing Python expertise.

The Design Space of Computational Notebooks: An Analysis of 60 Systems in Academia and Industry.

Sam Lau, Ian Drosos, Julia M. Markel, Philip J. Guo.

IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC),

2020.

Computational notebooks vary widely in data import, code editing, code execution, and output format.

workshop and poster papers

Data Theater: A Live Programming Environment for Prototyping Data-Driven Explorable Explanations.

Sam Lau, Philip Guo.

Workshop on Live Programming (LIVE),

2020.

Separating logic from presentation simplifies the process of creating explorable explanations.

Experiment Reconstruction Reduces Fixation on Surface Details of Explanations.

Sam Lau, Tricia Ngoon, Vineet Pandey, Scott Klemmer.

Poster in ACM Conference on Creativity and Cognition (C&C),

2019.

Asking people to mentally recreate an experiment briefly reduces the allure of scientific terminology.

news

Congratulations to four members of our lab – Christopher Lum, Guoxuan (Jason) Xu, Ylesia Wu, and Qirui (Sara) Zheng – for getting their very first papers published. See you at SIGCSE 2025! – Nov 2024
Started as an Assistant Teaching Professor in Data Science at UCSD. – Sep 2023
Pandas Tutor v2 released and featured on the official Pyodide blog. – May 2022