Steven Feng

I'm an incoming Stanford Computer Science PhD student, working with the Stanford NLP Group. I'm currently a master's student at Carnegie Mellon University (CMU) and previously an undergraduate at the University of Waterloo and Wilfrid Laurier University. I have a strong passion for data science, machine learning, and natural language processing (NLP).

My goal is to teach machines how to understand and generate human language. To do so, I have explored ways to improve the controllability of language generation models, incorporate and assess their commonsense reasoning capabilities, and integrate structured, multimodal, and linguistic information to enhance them.

I am working with Eduard Hovy at CMU's Language Technologies Institute and Malihe Alikhani at the University of Pittsburgh on research projects involving language generation, semantics, and data augmentation (podcast, talk). Earlier, I worked at the University of Waterloo with Jesse Hoey. Throughout my research, I have learned the importance of designing grounded, controllable, and robust machine learning models for effective language generation.

My research contributions have been recognized with several publications at major conferences such as EMNLP, ACL, and AAAI, and a best paper award at INLG 2021. I also received Honorable Mention for the Jessie W.H. Zou Memorial Award and the CRA Outstanding Undergraduate Researcher Award.

I am also active in the research community through leadership and community-building activities. For example, I led the organization of CtrlGen, a controllable generation workshop at NeurIPS 2021, and am involved in the GEM benchmark and workshop for NLG evaluation. I am also mentoring and advising several students on different NLP projects.

Other than research, I enjoy machine learning projects and hackathons. In my free time, I like gaming, playing the piano, and playing table tennis. I applied to PhD programs in Fall 2021.

Email  /  CV (Dec. 2021)  /  Google Scholar  /  LinkedIn  /  Twitter  /  GitHub


Recent News

Peer-Reviewed Publications and Conference Proceedings

Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models
Steven Y. Feng, Kevin Lu, Zhuofu Tao, Malihe Alikhani, Teruko Mitamura, Eduard Hovy, Varun Gangal
Accepted to AAAI Conference on Artificial Intelligence 2022 (Acceptance rate: 15%)
Accepted to AKBC 2021 Commonsense Reasoning and Knowledge Bases (CSKB) Workshop.
Abstract / Bibtex / GitHub / Presentation Slides

NAREOR: The Narrative Reordering Problem
Varun Gangal*, Steven Y. Feng*, Malihe Alikhani, Teruko Mitamura, Eduard Hovy
Accepted to AAAI Conference on Artificial Intelligence 2022 (Acceptance rate: 15%)
Abstract / Bibtex / GitHub

SAPPHIRE: Approaches for Enhanced Concept-to-Text Generation
Steven Y. Feng, Jessica Huynh, Chaitanya Narisetty, Eduard Hovy, Varun Gangal
Proceedings of International Conference on Natural Language Generation (INLG) 2021 [Best Long Paper]
Abstract / Bibtex / GitHub / Poster / Chinese News Article

A Survey of Data Augmentation Approaches for NLP
Steven Y. Feng*, Varun Gangal*, Jason Wei, Sarath Chandar, Soroush Vosoughi, Teruko Mitamura, Eduard Hovy
Proceedings of Association for Computational Linguistics (ACL) 2021 Findings [Long Paper]
Abstract / Bibtex / GitHub / Podcast (with Ed Hovy) / Talk (for Google Research) / Presentation Slides / Poster / Chinese News Articles [1,2,3]

GenAug: Data Augmentation for Finetuning Text Generators
Steven Y. Feng*, Varun Gangal*, Dongyeop Kang, Teruko Mitamura, Eduard Hovy
Proceedings of EMNLP 2020 Deep Learning Inside Out (DeeLIO) Workshop [Long Paper]
Abstract / Bibtex / GitHub / Talk / Presentation Slides

ALOHA: Artificial Learning of Human Attributes for Dialogue Agents
Aaron W. Li, Veronica Jiang*, Steven Y. Feng*, Julia Sprague, Wei Zhou, Jesse Hoey
Proceedings of AAAI Conference on Artificial Intelligence 2020 (Acceptance rate: 20.6%) [Oral]
Abstract / Bibtex / GitHub / Talk / Presentation Slides / Poster

Keep Calm and Switch On! Preserving Sentiment and Fluency in Semantic Text Exchange
Steven Y. Feng*, Aaron W. Li*, Jesse Hoey
Proceedings of Empirical Methods in Natural Language Processing (EMNLP) 2019 (Acceptance rate: 23.8%) [Long Paper]
Abstract / Bibtex / GitHub / Poster / News Article

* Equal Contribution

Talks and Interviews

July 2021: Eduard Hovy and I were on The Data Exchange Podcast with Ben Lorica. We discussed data augmentation for NLP (inspired by our survey paper) and challenges and future directions in NLP and machine learning research. Audio and notes here.



Aug. 2021: Varun and I gave a talk (to over 100 attendees) for Google Research about data augmentation for NLP (inspired by our survey paper). We also touched upon NL-Augmenter and our CtrlGen Workshop at NeurIPS 2021.



Feb. 2020: I presented our work ALOHA: Artificial Learning of Human Attributes for Dialogue Agents at the AAAI Conference on Artificial Intelligence 2020 in New York. The room was packed full of listeners!



Nov. 2020: I discussed my tips and advice for finding undergraduate research opportunities and preparing for graduate school applications at the University of Waterloo Data Science Club lightning talks.



Mentorship and Advising

  • Kevin Lu [University of Waterloo Undergrad, Computer Science, Class of 2026]
      • Mentoring several research projects on controllable, creative, and visually-grounded text generation [e.g. paper1, paper2].
  • Sedrick Scott Keh [CMU Master's of Machine Learning (MSML), Class of 2022]
      • Mentoring research projects on creative text generation.
  • Jerry Huang, Hongru Xiang, Xintao (Cynthia) Zhu, Saidi Tang [University of Waterloo Undergrads, Software Engineering, Class of 2022]
      • Advising their software engineering capstone project on text simplification for ESL students.
  • Zhuofu (Derek) Tao [UCLA Ph.D. in Electrical Engineering, Class of 2025]
      • Mentored a research project on controllable and visually-grounded text generation [paper].

Piano and Music

I have been playing piano since I was 6 years old, and also play a bit of guitar. I sometimes upload to my YouTube and TikTok channels.

Aoi Tori 「蒼い鳥」 - IDOLM@STER [Piano]



Swordland - Sword Art Online (SAO) Main Theme [Piano]



Colors of the Wind (Pocahontas) & I Dreamed a Dream (Les Misérables) [Piano, Singing, Violin]



Laputa: Castle in the Sky - Carrying You / Innocent [Piano]



Fairy Tail Main Theme (Slow Piano Version)



Unravel - Tokyo Ghoul OP [Piano]



Last Updated: April 17, 2022 Site Template