Welcome, Professor Siyu Tang

28.01.2020 | Anna Ettlin

Siyu Tang officially joined the Department of Computer Science of ETH Zurich in January 2020 as Tenure Track Assistant Professor of Computer Vision. Get to know her in this short interview.


Professor Tang, welcome to ETH! What are your current research interests?
My team is called the Computer Vision and Learning Group (VLG). Our research interests lie in computer vision and its combination with machine learning. We work on discovering and proposing algorithms and implementations for solving high-level visual recognition problems. The goal is to advance the frontier of robust machine perception in real-world settings. Our research agenda focuses on three related domains.

The first one concerns visual understanding of people in unconstrained environments where the complexity of visual scenes impacts the robustness and the generalisation of the underlying model. We are working on improving the state-of-the-art performance on people tracking, re-identification, pose estimation and fine-grained activity recognition. A long-term goal is to infer detailed representations of pose, shape, expression and social interaction of humans from images and videos so that computers are able to communicate in meaningful ways.

The second thrust of our research focuses on learning holistic scene representations. We try to understand what kind of representations allow complex reasoning about the real world. The goal is to study the algorithmic foundation so that machines can learn holistic representations from different levels of visual granularity and multiple sensory inputs such as images and language.

The third area is efficient and scalable learning and optimisation techniques. Here, we focus on studying computational models that enable machines to perceive large-scale visual input. A long-term goal is to automate the learning and inference process and make it more accessible for large-scale real-world settings.

What is the impact of your research on society?
Humans have the remarkable ability to perceive visual scenes, recognise objects and understand activities within fractions of a second. Meanwhile, demand for automated machine perception of visual data is rapidly increasing, ranging from autonomous driving to personal robots. Despite the tremendous research progress made in recent years, human vision is still a far more robust and error-tolerant system than machine vision when it comes to challenging visual tasks in a real-world setting. The research question we are trying to answer is: how do we create robust computer vision systems that can perceive the world as well as humans do? The solution to this problem will have a profound social and economic impact on a global scale.

Where were you working before you came to ETH?
For the last two years, I was a research group leader in the Department of Perceiving Systems at the Max Planck Institute for Intelligent Systems in Tübingen, Germany. Before that, I completed my PhD at the Max Planck Institute for Informatics in Saarbrücken in 2017.

Which courses will you be teaching at ETH?
This has not been decided yet. I would like to contribute to the courses related to high-level computer vision, most likely on the topics of understanding humans and their actions in the visual world. In my lectures, I will try to convey my excitement about the topics and to explain different approaches and algorithms from both theoretical and practical perspectives.

Name an interesting fact about your research.
Our research has strong connections with machine learning, optimisation, computer graphics and AR/VR research. For instance, learning from visual data is a classical problem of machine learning and drives practical applications of machine learning research. The integration of computer vision technology in real-time AR/VR devices allows a more immersive interaction and defines new research and engineering challenges. In general, I think computer vision is one of the most important aspects of building intelligent systems. It provides us with a source of challenging and fascinating problems, while being of tremendous practical importance.