Transformer

A Transformer processes text through the following stages:

- Tokenization
- Embedding
- Positional encoding
- Transformer block
- Attention

References:

- What Are Transformer Models and How Do They Work?
- Attention Is All You Need (2017)
- The Illustrated Transformer
- The Transformer Family (2020)
- https://www.zhihu.com/question/445556653/answer/3254012065
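The attention stage listed above can be illustrated with a minimal sketch of scaled dot-product attention, the operation defined in "Attention Is All You Need": softmax(QK^T / sqrt(d_k))V. The function name, shapes, and toy data below are illustrative, not taken from any particular library.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    # Similarity of each query to each key, scaled to keep gradients stable
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax (subtract the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors
    return weights @ V

# Toy example: 2 queries, 3 key/value pairs, dimension 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4): one output vector per query
```

In a full Transformer block this operation runs across multiple heads in parallel, with Q, K, and V produced by learned linear projections of the embedded (and positionally encoded) tokens.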