{"@attributes":{"version":"2.0"},"channel":{"title":"Seq2seq on Oriol Al\u00e0s Cerc\u00f3s","link":"https:\/\/oriolac.github.io\/tags\/seq2seq\/","description":"Recent content in Seq2seq on Oriol Al\u00e0s Cerc\u00f3s","generator":"Hugo -- 0.150.0","language":"en-us","copyright":"Oriol Al\u00e0s Cerc\u00f3s","lastBuildDate":"Mon, 17 Feb 2025 12:31:23 +0100","item":{"title":"Introduction to Attention Mechanism and Transformers","link":"https:\/\/oriolac.github.io\/posts\/20241029-attention\/","pubDate":"Mon, 17 Feb 2025 12:31:23 +0100","guid":"https:\/\/oriolac.github.io\/posts\/20241029-attention\/","description":"<p>Transformers have demonstrated excellent capabilities, tackling challenges such as <em>NLP<\/em>, <em>Text-To-Image Generation<\/em> and <em>Image Completion<\/em>\ngiven large datasets, large models and enough compute.\nTalking about transformers nowadays is as casual as talking about <em>CNNs<\/em>, <em>MLPs<\/em> or <em>Linear Regressions<\/em>. Why not take a closer look at this state-of-the-art architecture?<\/p>\n<p>In this post, we\u2019ll introduce the Sequence-to-Sequence (Seq2Seq) paradigm, explore the attention mechanism, and provide a detailed,\nstep-by-step explanation of the components that make up transformer architectures.<\/p>"}}}