Hi everyone, there are a ton of language models out there today, many of which have their own unique way of learning “self-supervised” language representations that can be used by downstream tasks. In this article, I summarize the current trends and share some key insights to glue all these novel approaches together. 😃 (Slide credits: Devlin et al., Stanford CS224n)