TSVP Talk: "Advancing LLM Post-training via Bilevel Optimization" by Tianyi Chen

Date
Location
Description
Title: Advancing LLM Post-training via Bilevel Optimization
Speaker: Tianyi Chen, Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute
Abstract: The rapid advancements in large AI models have underscored the importance of scaling laws, which demonstrate that model capability increases with larger architectures and richer datasets. However, these advancements pose dual challenges: i) on the learning front, ensuring that, in addition to accuracy, new evaluation metrics (such as fairness, safety, and robustness) are met; ii) on the computing front, meeting stringent demands for efficient sensing, communication, and processing. Addressing these challenges necessitates principled methods to handle multiple (possibly competing) performance metrics and resource constraints. In this talk, I will introduce a unified framework to tackle these challenges in learning and computing problems, grounded in bilevel optimization. This framework provides theoretical guarantees on optimality and complexity, as well as algorithmic scalability for large-scale AI models and systems. I will conclude by showcasing applications of bilevel optimization techniques in fine-tuning LLMs and enabling efficient analog in-memory training.
Profile: Tianyi Chen is an Associate Professor in the Department of Electrical, Computer, and Systems Engineering at Rensselaer Polytechnic Institute (RPI), where he is jointly supported by the RPI-IBM Artificial Intelligence Research Partnership. Before joining RPI in 2019, Dr. Chen received his B.Eng. degree from Fudan University in 2014 and his Ph.D. degree from the University of Minnesota in 2019. His research focuses on the theoretical foundations of bilevel and multi-objective optimization, with applications in LLM fine-tuning, wireless computing, and analog computing systems. His work bridges theory and practice, contributing to IBM’s industrial products and resulting in patents.
Dr. Chen is the inaugural recipient of the IEEE Signal Processing Society (SPS) Best PhD Dissertation Award in 2020, a recipient of the NSF CAREER Award in 2021, and a recipient of several industrial research awards, including the Amazon Research Award and the Cisco Research Award. He has also received several best (student) paper awards, including one at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) in 2021, and the IEEE SPS Young Author Best Paper Award in 2024.
Language: English
Target audience: General audience/everyone at OIST and beyond.
Freely accessible to all OIST members and guests without registration.
This talk will also be broadcast online via Zoom:
Meeting ID: 997 1631 6423
Passcode: 491827
※ Please note that this event may be recorded and the videos uploaded. In addition, photos may be taken during the event. These are intended for publication online (the OIST website, social media, etc.)※