
[coursera] Sequence Models: Week 4

RO_KO 2024. 7. 9. 17:57

Transformer

  • Attention + CNN-style parallel processing
    • Self-attention: compute attention-based representations for all words in parallel (see the sketch below)
    • Multi-head attention: run self-attention several times with different learned projections to build richer representations

Self-attention
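
For each word, attention is a softmax-weighted sum of values, and because every word's query attends to every key at once, the whole computation is one matrix product: Attention(Q, K, V) = softmax(Q Kᵀ / √d_k) V. A minimal NumPy sketch of this (the function and weight names are my own, not from the course):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention.
    X: (seq_len, d_model) word embeddings.
    W_q, W_k, W_v: learned projection matrices (hypothetical shapes)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (seq_len, seq_len), all pairs at once
    return softmax(scores) @ V        # attention-weighted sum of values
```

The √d_k scaling keeps the dot products from growing with the key dimension, which would otherwise push the softmax into a regime with near-zero gradients.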

Multi-head attention
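
Multi-head attention is essentially a loop over self-attention: each head gets its own Q/K/V projections (so it can ask a different "question" of the sequence), and the heads' outputs are concatenated and mixed by an output projection. A sketch building on the function above (again, names and shapes are assumptions for illustration):

```python
def multi_head_attention(X, heads, W_o):
    """heads: list of (W_q, W_k, W_v) tuples, one per head.
    Each head attends with its own projections; outputs are
    concatenated and combined by the output projection W_o."""
    outputs = [self_attention(X, W_q, W_k, W_v) for (W_q, W_k, W_v) in heads]
    return np.concatenate(outputs, axis=-1) @ W_o

# Example: 4 heads of size 4 over 5 words with d_model = 16.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))
heads = [tuple(rng.normal(size=(16, 4)) for _ in range(3)) for _ in range(4)]
W_o = rng.normal(size=(4 * 4, 16))
print(multi_head_attention(X, heads, W_o).shape)  # (5, 16)
```

In practice all heads are computed in parallel as one batched matrix multiply rather than a Python loop; the list comprehension here just makes the per-head structure explicit.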

Transformer architecture
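
The full architecture stacks the pieces above: an encoder block (multi-head self-attention followed by a feed-forward network, repeated N times) and a decoder block (masked multi-head attention over the output generated so far, then multi-head attention using the encoder's output as K and V, then a feed-forward network), with residual connections and layer normalization around each sub-layer. Since attention alone is order-agnostic, sinusoidal positional encodings are added to the input embeddings. A sketch of the encoding (assuming an even d_model):

```python
def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding added to the embeddings so the
    model can use word order: sin on even dims, cos on odd dims."""
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]       # (1, d_model / 2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dimensions
    pe[:, 1::2] = np.cos(angles)               # odd dimensions
    return pe
```

Because each position gets a unique pattern of phases, nearby positions have similar encodings and the model can learn relative-position relationships from them.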
