1. Core Transformer derivatives

  • BERT / RoBERTa / DeBERTa: context-aware (bidirectional) language models
  • ALBERT: a lite BERT with cross-layer parameter sharing
  • ELECTRA: pretrained with replaced-token detection
  • DistilBERT / TinyBERT / MiniLM: distilled (compressed) BERT variants
  • T5 / mT5 / Flan-T5: universal "text-to-text" models
  • GPT / GPT-NeoX / LLaMA: autoregressive text-generation models (see the sketch after this list)
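
A minimal sketch of how the three families above are typically called, using the Hugging Face transformers pipeline API; the checkpoint names (bert-base-uncased, google/flan-t5-small, gpt2) are illustrative choices, not part of the original list.

      from transformers import pipeline

      # Masked-LM encoder (BERT family): fills in a hidden token
      # using bidirectional context.
      fill = pipeline("fill-mask", model="bert-base-uncased")
      print(fill("The capital of France is [MASK].")[0]["token_str"])

      # Text-to-text model (T5 family): every task is cast as
      # string in, string out.
      t2t = pipeline("text2text-generation", model="google/flan-t5-small")
      print(t2t("translate English to German: The house is wonderful.")[0]["generated_text"])

      # Autoregressive decoder (GPT family): continues a prompt left to right.
      gen = pipeline("text-generation", model="gpt2")
      print(gen("Transformers are", max_new_tokens=20)[0]["generated_text"])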

2. Sparse and efficient computation

3. Memory and long context

  • Transformer-XL / Compressive Transformer: recurrent transformers with an extended memory of past activations
  • RETRO: retrieval-enhanced transformer that conditions on an external database
  • RMT (Recurrent Memory Transformer): transformer that carries memory tokens across segments (see the sketch after this list)
  • Hyena / Hyena Hierarchy: attention-free models built from implicitly parameterized long convolutions and gating
  • RWKV / Mamba: recurrent / state-space architectures that train in parallel like Transformers but run as RNNs at inference
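
A minimal sketch of segment-level recurrence in the RMT / Transformer-XL spirit, written in plain PyTorch; the single encoder layer, the dimensions, and the detach-based memory update are illustrative assumptions, not any paper's exact recipe.

      import torch
      import torch.nn as nn

      d_model, mem_len, seg_len = 64, 16, 32
      layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)

      def forward_segment(segment, memory):
          # Prepend cached states so attention can reach past the current window.
          extended = torch.cat([memory, segment], dim=1)
          out = layer(extended)
          new_memory = out[:, -mem_len:].detach()  # no gradients across segments
          return out[:, memory.size(1):], new_memory

      x = torch.randn(1, 4 * seg_len, d_model)   # a "long" input
      memory = torch.zeros(1, mem_len, d_model)  # empty initial memory
      outputs = []
      for seg in x.split(seg_len, dim=1):        # process segment by segment
          out, memory = forward_segment(seg, memory)
          outputs.append(out)
      y = torch.cat(outputs, dim=1)              # same length as the input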

4. Understanding language and structure

  • Syntax-aware Transformer: transformers that incorporate grammatical structure
  • Graph Neural Transformers: transformers for graph-structured data
  • AdapterFusion / LoRA / Prefix-Tuning: modular, parameter-efficient fine-tuning methods (see the LoRA sketch after this list)
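
A minimal sketch of the LoRA idea in plain PyTorch: the pretrained weight is frozen and only a low-rank update B·A is trained. The class name, rank r, and scaling alpha are illustrative assumptions, not a library API.

      import torch
      import torch.nn as nn

      class LoRALinear(nn.Module):
          def __init__(self, in_dim, out_dim, r=8, alpha=16):
              super().__init__()
              self.base = nn.Linear(in_dim, out_dim)  # stands in for a pretrained layer
              self.base.weight.requires_grad_(False)  # frozen
              self.base.bias.requires_grad_(False)
              self.A = nn.Parameter(torch.randn(r, in_dim) * 0.01)  # down-projection
              self.B = nn.Parameter(torch.zeros(out_dim, r))  # zero init: starts as a no-op
              self.scale = alpha / r

          def forward(self, x):
              # Frozen path plus the trainable low-rank correction.
              return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

      layer = LoRALinear(128, 128)
      y = layer(torch.randn(2, 128))
      trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
      print(trainable)  # only the small A/B factors are trainable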

5. Multimodal models

  • CLIP / BLIP / ALIGN: text-image contrastive learning models (see the sketch after this list)
  • Whisper / SpeechT5 / SeamlessM4T: speech-to-text and speech-to-speech models
  • Flamingo / Kosmos / Gemini: multimodal systems spanning text, images, and video
  • Perceiver IO / Perceiver AR: general-purpose architectures for arbitrary modalities
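
A minimal sketch of CLIP-style zero-shot image classification with the Hugging Face transformers API; the checkpoint name, local image path, and candidate captions are illustrative.

      from PIL import Image
      from transformers import CLIPModel, CLIPProcessor

      model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
      processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

      image = Image.open("photo.jpg")  # any local image
      labels = ["a photo of a cat", "a photo of a dog"]
      inputs = processor(text=labels, images=image,
                         return_tensors="pt", padding=True)

      outputs = model(**inputs)
      # Image-text similarity logits -> probabilities over the captions.
      probs = outputs.logits_per_image.softmax(dim=1)
      print(dict(zip(labels, probs[0].tolist())))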