The GenAI Learning Guide
From Basics to Agents
By Rajesh Pal
Table of Contents
1. Deepen Model Intuition
2. Jump Into Models Early
3. Master the RAG Workflow
4. Evaluation & Guardrails
5. Mini Projects
6. Reliability & MLOps Basics
7. Agents
8. Frameworks
Appendix A: GenAI Cheat Sheet
1. Deepen Model Intuition
• Learn how embeddings are generated and compared (cosine similarity, dot product, dimensionality trade-offs).
• Understand attention in detail: query/key/value projections and why dot products are scaled by 1/√d_k.
• Read key papers such as 'Attention Is All You Need', plus blog posts that break down GPT and LLaMA.
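The two ideas above, comparing embeddings and scaled dot-product attention, can be sketched in a few lines of plain Python. This is a minimal illustration, not how any production model implements them: the vectors are made-up toy values, and the function names are chosen here for readability.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score the query against every key, scaled by sqrt(d_k).
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ])
    return outputs

# Toy 2-dimensional vectors (made up for illustration).
a, b = [1.0, 0.0], [2.0, 0.0]
print(dot(a, b))                # sensitive to vector length
print(cosine_similarity(a, b))  # depends on direction only
print(attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

The dot product grows with vector length while cosine similarity does not, which is one reason the choice of metric matters when embeddings are not normalized.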
2. Jump Into Models Early
• Use Hugging Face transformers to load, tokenize, and generate.
• Run local models with Ollama (Llama 3, Mistral).
• Compare responses from hosted APIs with those from open models.
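As one possible starting point for the local-model step, the sketch below talks to an Ollama server over its HTTP API using only the standard library. It assumes Ollama is running on its default port (11434) and that the models you name have already been pulled; the `compare` helper and its `generate_fn` parameter are illustrative names introduced here, so a hosted-API client can be swapped in for the side-by-side comparison.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model, prompt):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def parse_response(raw_bytes):
    """Extract the generated text from the server's JSON reply."""
    return json.loads(raw_bytes)["response"]

def generate(model, prompt, timeout=120):
    """Send the prompt and return the model's text (needs a running server)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return parse_response(resp.read())

def compare(models, prompt, generate_fn=generate):
    """Collect one answer per model so they can be read side by side."""
    return {model: generate_fn(model, prompt) for model in models}
```

With a server running, `compare(["llama3", "mistral"], "Explain RAG in one sentence.")` would return a dictionary keyed by model name; the model tags are assumptions and must match what is installed locally.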
3. Master the RAG Workflow
• Build a manual pipeline: chunk → embed → store (FAISS/Chroma) → retrieve → generate.
• Add logging, retries, caching.
• Apply RAG on real-world data (e.g., supply chain docs, IoT manuals).
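The manual pipeline can be prototyped end to end before any vector database is involved. The sketch below is a toy under stated assumptions: the "embedding" is a plain word count standing in for a real embedding model, `VectorStore` is an in-memory stand-in for FAISS or Chroma, chunking is one chunk per sentence, and the sample document is made up. The generate step is left as the prompt you would send to a model.

```python
import math
import re
from collections import Counter

def chunk(text):
    """Simplified chunking: one chunk per sentence."""
    return re.split(r"(?<=[.!?])\s+", text.strip())

def embed(text):
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class VectorStore:
    """In-memory stand-in for FAISS or Chroma."""
    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def build_prompt(question, contexts):
    """Assemble the prompt that the generate step would send to a model."""
    joined = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

# Made-up manual text for illustration.
doc = ("The conveyor motor must be lubricated every 500 hours. "
       "Replace the air filter when the pressure drop exceeds 2 bar.")
store = VectorStore()
for piece in chunk(doc):
    store.add(piece)

question = "When should the air filter be replaced?"
prompt = build_prompt(question, store.search(question, k=1))
print(prompt)
```

Keeping each stage as its own function makes it straightforward to add logging, retries, or caching around a single step later.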
4. Evaluation & Guardrails
• Build a ground-truth dataset and measure accuracy/precision.
• Log latency and cost per query.
• Implement PII redaction with regex/spaCy.
• Create hallucination tests with known-answer queries.
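A small harness covers several of these items at once. The sketch below is illustrative only: the regular expressions catch common e-mail and phone shapes but are not a complete PII detector, the substring match is a deliberately lenient correctness check, and the ground-truth pairs and `fake_model` are hypothetical stand-ins for your own dataset and model call.

```python
import re
import time

# Rough patterns; real PII detection needs broader coverage (e.g. spaCy NER).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact_pii(text):
    """Mask e-mail addresses and phone-like digit runs."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

def evaluate(answer_fn, dataset):
    """Run known-answer queries; report accuracy and mean latency in seconds."""
    correct, latencies = 0, []
    for question, expected in dataset:
        start = time.perf_counter()
        answer = answer_fn(question)
        latencies.append(time.perf_counter() - start)
        # Lenient check: the expected string must appear in the answer.
        if expected.lower() in answer.lower():
            correct += 1
    return {
        "accuracy": correct / len(dataset),
        "mean_latency_s": sum(latencies) / len(latencies),
    }

# Hypothetical ground truth and a fake model that gets one answer wrong.
ground_truth = [("Capital of France?", "Paris"), ("2 + 2?", "4")]

def fake_model(question):
    return "Paris" if "France" in question else "5"

report = evaluate(fake_model, ground_truth)
print(report["accuracy"])
print(redact_pii("Mail jane.doe@example.com or call +1 555-123-4567."))
```

Known-answer queries like these double as hallucination tests: a confident wrong answer shows up as a drop in accuracy rather than going unnoticed.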
5. Mini Projects
• Document Q&A (factory specs, HR policies).
• Summarizer with fidelity benchmarks.
• Structured extraction (BOM details → JSON).
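For the structured-extraction project, much of the work is validating what the model returns. The sketch below assumes the model was prompted to emit a JSON array of BOM items; the field names (`part_number`, `description`, `quantity`) and the sample output are hypothetical, chosen only to show the validation step.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class BomItem:
    # Hypothetical schema; adapt the fields to your own BOM format.
    part_number: str
    description: str
    quantity: int

def parse_bom(raw):
    """Parse and validate a JSON array of BOM items returned by a model."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of items")
    items = []
    for entry in data:
        item = BomItem(
            part_number=str(entry["part_number"]),
            description=str(entry["description"]),
            quantity=int(entry["quantity"]),  # coerce "4" to 4
        )
        if item.quantity <= 0:
            raise ValueError(f"non-positive quantity for {item.part_number}")
        items.append(item)
    return items

# Made-up model output; note the quantity arrives as a string.
model_output = '[{"part_number": "M8-BOLT", "description": "Hex bolt", "quantity": "4"}]'
items = parse_bom(model_output)
print(json.dumps([asdict(i) for i in items], indent=2))
```

Rejecting bad output early gives you a clean signal for retries and for measuring extraction accuracy.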
6. Reliability & MLOps Basics
• CI/CD pipeline for prompt configs.
• Use LangSmith or Weights & Biases for tracing & experiment tracking.
• Build a simple cost dashboard.
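A cost dashboard can start as a plain aggregation over query logs. In the sketch below the model name, the per-1,000-token prices, and the log records are all hypothetical; substitute your provider's actual rates and your own logging format.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens.
PRICES = {"model-a": {"input": 0.50, "output": 1.50}}

def query_cost(record, prices=PRICES):
    """Cost of one logged query from its token counts."""
    rate = prices[record["model"]]
    return (record["input_tokens"] * rate["input"]
            + record["output_tokens"] * rate["output"]) / 1000

def summarize(records):
    """Aggregate query count and total cost per model."""
    totals = defaultdict(lambda: {"queries": 0, "cost_usd": 0.0})
    for record in records:
        row = totals[record["model"]]
        row["queries"] += 1
        row["cost_usd"] += query_cost(record)
    return dict(totals)

def render(summary):
    """Format the summary as a fixed-width text table."""
    lines = [f"{'model':<10}{'queries':>8}{'cost_usd':>10}{'avg_usd':>10}"]
    for model, row in sorted(summary.items()):
        avg = row["cost_usd"] / row["queries"]
        lines.append(f"{model:<10}{row['queries']:>8}{row['cost_usd']:>10.4f}{avg:>10.4f}")
    return "\n".join(lines)

# Made-up log records.
records = [
    {"model": "model-a", "input_tokens": 1000, "output_tokens": 500},
    {"model": "model-a", "input_tokens": 2000, "output_tokens": 0},
]
summary = summarize(records)
print(render(summary))
```

Once the aggregation is trusted, the same `summarize` output can feed a chart or a tracing tool instead of the text table.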
7. Agents
• Start with one-tool agents (search + LLM).
• Add memory and planning once metrics show value.
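A one-tool agent is essentially a short loop around the model. The sketch below is a minimal illustration: the `SEARCH:` prefix is a convention invented here for the model to request the tool, and `fake_llm` and `fake_search` are stubs with made-up content standing in for a real model call and a real search backend.

```python
def run_agent(question, llm, search, max_steps=3):
    """Ask the LLM; if it requests a search, run it and feed the result back."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        reply = llm("\n".join(transcript))
        if reply.startswith("SEARCH:"):
            # The model asked for the tool: run it and append the observation.
            query = reply[len("SEARCH:"):].strip()
            transcript.append(f"Search results for '{query}': {search(query)}")
        else:
            # Anything else is treated as the final answer.
            return reply
    return "No answer within the step limit."

# Stubs with made-up content, standing in for real components.
def fake_search(query):
    return "Line 3 conveyor runs at 1.2 m/s."

def fake_llm(prompt):
    if "Search results" in prompt:
        return "The conveyor runs at 1.2 m/s."
    return "SEARCH: line 3 conveyor speed"

print(run_agent("How fast is the line 3 conveyor?", fake_llm, fake_search))
```

The step limit keeps a confused model from looping forever, and the transcript list is the natural place to add memory later if the metrics justify it.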
8. Frameworks
• Layer in LangGraph, CrewAI, or LlamaIndex for orchestration.
• Keep your core logic framework-agnostic.
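One way to keep core logic framework-agnostic is to code against a small interface and write thin adapters per framework. The sketch below uses `typing.Protocol` for that; `TextGenerator`, `summarize`, and `StubGenerator` are names invented for this illustration, and a real adapter would wrap whichever framework or API client you adopt.

```python
from typing import Protocol

class TextGenerator(Protocol):
    """The only capability the core logic needs from any backend."""
    def generate(self, prompt: str) -> str: ...

def summarize(document: str, llm: TextGenerator) -> str:
    """Core logic: depends on the TextGenerator interface, not on a framework."""
    prompt = f"Summarize in one sentence:\n{document}"
    return llm.generate(prompt).strip()

class StubGenerator:
    """Test double; a real adapter would wrap a framework or raw API client."""
    def generate(self, prompt: str) -> str:
        return " stub summary "

print(summarize("Some long document text.", StubGenerator()))
```

Because `summarize` never imports a framework, swapping orchestration layers means writing a new adapter class rather than rewriting the logic.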
Appendix A: GenAI Cheat Sheet
Concept | Key Notes
Embeddings | Numeric vector representations of text. Compare with cosine similarity or dot product.
Attention | Mechanism to focus on relevant tokens (Q, K, V).
RAG | Ingest → Chunk → Embed → Store → Retrieve → Re-rank → Generate.
Evaluation | Track accuracy, latency, cost. Create a ground-truth dataset.
Guardrails | Handle hallucinations, PII redaction, content filters.
Mini-Projects | Doc Q&A, summarization, structured extraction.
MLOps | CI/CD for prompts, observability, cost dashboards.
Agents | Start simple. Add memory/planning later.
Frameworks | LangGraph, LlamaIndex: use after manual mastery.