I’m currently a research assistant at the Vision and Learning Lab in the Department of Electrical Engineering at National Taiwan University.

My research interests lie at the intersection of Computer Vision (CV) and Natural Language Processing (NLP), with the aim of equipping computers with the ability to understand and relate data across different modalities. Specifically, I am interested in the following topics:

  • Multimodal Generation: Text-to-Image Synthesis, Audio-Visual Manipulation
  • Multimodal Grounding & Reasoning: Image Captioning, VQA, Vision-and-Language Navigation (VLN)
  • Multimodal Representation Learning: Multimodal Contrastive Learning, Multimodal Self-Training

I’m looking for PhD opportunities starting in Fall 2022.