The "context window" limits the number of tokens (roughly, words) that a given Large Language Model can take as input or produce as output. Today's episode introduces FlashAttention, an exact-attention algorithm that computes attention faster and with far less memory, enabling much larger context windows.
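For the curious: the core trick behind FlashAttention is computing softmax attention over keys and values in blocks, keeping running statistics so the full attention matrix never has to be materialized. Here is a minimal single-head NumPy sketch of that online-softmax tiling idea; the function names are illustrative, and this toy version omits the GPU memory-hierarchy engineering that makes the real kernel fast.

```python
import numpy as np

def naive_attention(Q, K, V):
    # Standard attention: materializes the full n_q x n_k score matrix.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def tiled_attention(Q, K, V, block=4):
    # Same result, but K/V are processed one block at a time while
    # maintaining a running max (m), softmax denominator (l), and
    # unnormalized output (O) -- the online-softmax trick.
    d = Q.shape[-1]
    n_q = Q.shape[0]
    m = np.full(n_q, -np.inf)          # running row max
    l = np.zeros(n_q)                  # running softmax denominator
    O = np.zeros((n_q, V.shape[-1]))   # running unnormalized output
    for start in range(0, K.shape[0], block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        S = Q @ Kb.T / np.sqrt(d)
        m_new = np.maximum(m, S.max(axis=-1))
        scale = np.exp(m - m_new)      # rescale old stats to the new max
        P = np.exp(S - m_new[:, None])
        l = l * scale + P.sum(axis=-1)
        O = O * scale[:, None] + P @ Vb
        m = m_new
    return O / l[:, None]
```

Because each block's contribution is folded into the running statistics, peak memory depends on the block size rather than the full sequence length, which is what opens the door to longer contexts.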
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.