Generative AI:
An Overview
Understanding Recurrent Neural Networks (RNNs)
RNNs are a type of neural network designed to process sequential data.
These architectures were widely used for NLP tasks, speech processing, and time series.
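To make "processing sequential data" concrete, here is a minimal sketch of a single-layer RNN in plain Python. The update rule h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h) is the standard textbook form; the weight values and the names `rnn_step` and `run_rnn` are illustrative assumptions, not something taken from the slides.

```python
import math

def rnn_step(x, h_prev, W_xh, W_hh, b_h):
    """One recurrent step: h_t = tanh(W_xh x_t + W_hh h_prev + b_h)."""
    h_new = []
    for i in range(len(b_h)):
        total = b_h[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def run_rnn(inputs, W_xh, W_hh, b_h):
    """Process a sequence one element at a time, carrying the hidden state forward."""
    h = [0.0] * len(b_h)  # initial hidden state
    states = []
    for x in inputs:
        h = rnn_step(x, h, W_xh, W_hh, b_h)
        states.append(h)
    return states

# Toy weights (hypothetical values, chosen only for illustration).
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.2, 0.0], [0.0, 0.2]]
b_h = [0.0, 0.0]

sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = run_rnn(sequence, W_xh, W_hh, b_h)
```

Because each hidden state depends on the previous one, the steps have to run in order, and feeding the same inputs in a different order gives a different final state.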
Challenge?
The Rise of Transformers: Self-Attention
In 2017, researchers at Google published a paper that proposed a novel neural network architecture for sequence modeling known as the Transformer.
It outperformed recurrent neural networks (RNNs) on machine translation tasks, both in terms of translation quality and training cost.
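The self-attention named in the slide title can be sketched as scaled dot-product attention in plain Python. This is a simplified illustration: a real Transformer first maps the inputs through learned query, key, and value projections, whereas here the raw inputs are used for all three (an assumption made to keep the example short). The function names are illustrative.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    outputs, all_weights = [], []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy token vectors (hypothetical values); Q = K = V = X for simplicity.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(X, X, X)
```

Each token attends to every other token in a single step, so unlike an RNN the positions can be processed in parallel.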
A Timeline of Large Language Models
2022: ChatGPT
Generative Pre-trained Transformer 2.
2024: Meta's Llama 3, Claude 3, Q2, and Mistral's Mixtral 8x7B
Larger and more powerful model.
2025: DeepSeek-R1
Multimodality: Text, Image, Video
Diving into ChatGPT
Generative Pre-trained Transformer
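Generative: next-word prediction
Pre-trained: the LLM is pre-trained on a massive amount of text
Transformer: built on the Transformer, originally an encoder-decoder architecture (GPT models use only the decoder)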
Why couldn't ChatGPT replace Google Search?
How was ChatGPT trained?
Large Language Models
What do LLMs essentially do?
LLMs as a Machine Learning Task?
LLMs as a Deep Learning Task?
Training Data for LLMs
Next word Generation
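Next-word generation can be illustrated with a very small bigram model: count which word follows which, then repeatedly pick the most frequent follower. This is only a toy stand-in for the idea; an actual LLM uses a neural network over subword tokens rather than word counts. The corpus and function names are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every word, how often each other word follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(counts, start, max_words=5):
    """Generate text by repeatedly appending the predicted next word."""
    out = [start]
    for _ in range(max_words):
        nxt = predict_next(counts, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

# Tiny hypothetical training corpus.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)
```

With this corpus, `predict_next(model, "the")` returns `"cat"`, because "cat" follows "the" more often than "mat" or "fish" does.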
Phases of LLM Training
1. Pre-training
Massive amount of text data from the internet: books, research papers, websites.
The model learns to predict the next word.
2. Instruction fine-tuning
Curating a Q&A dataset to train the model to answer questions or instructions.
The model learns to become a helpful assistant.
3. Reinforcement Learning from Human Feedback (RLHF)
Align the output closer to human-like responses.
Responses are updated considering human feedback and preference.
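One way to picture the three phases is by the shape of the data each one consumes. The sketch below is an assumption-laden illustration: the example records, the "### Instruction / ### Response" template, and the function names are all hypothetical and do not describe how any particular model was trained.

```python
def next_word_pairs(text):
    """Pre-training: pair every prefix of the text with the word that follows it."""
    words = text.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]

def format_instruction(example):
    """Instruction fine-tuning: render a Q&A record as one training string.

    The template is a hypothetical choice for illustration.
    """
    return (
        "### Instruction:\n" + example["instruction"] + "\n"
        "### Response:\n" + example["response"]
    )

def preference_pair(example):
    """RLHF: return (preferred, rejected) responses for the same prompt."""
    return example["chosen"], example["rejected"]

# Hypothetical records, one per phase.
pretraining_text = "the model learns to predict the next word"
instruction_example = {
    "instruction": "Name one limitation of LLMs.",
    "response": "They can hallucinate facts.",
}
preference_example = {
    "prompt": "Explain what an RNN is.",
    "chosen": "An RNN is a neural network that processes sequences step by step.",
    "rejected": "RNN.",
}

pairs = next_word_pairs(pretraining_text)
sft_text = format_instruction(instruction_example)
preferred, rejected = preference_pair(preference_example)
```

Pre-training needs only raw text, instruction fine-tuning needs curated question and answer pairs, and RLHF needs human judgments about which of two responses is better.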
Limitations of LLMs
1. Hallucination
2. Mathematical Problem solving
3. Context window
4. Cost
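The context window and cost limitations can be made tangible with a short sketch. It assumes token counts can be approximated by splitting on whitespace (real tokenizers count differently) and uses made-up per-1,000-token prices; the function names are illustrative.

```python
def count_tokens(text):
    """Rough token count: whitespace-separated words (an approximation)."""
    return len(text.split())

def fit_to_context(messages, max_tokens):
    """Keep the most recent messages whose combined size fits the context window."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break  # older messages no longer fit and are dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order

def estimate_cost(prompt_tokens, completion_tokens, price_in_per_1k, price_out_per_1k):
    """Estimate the price of one request from token counts and per-1k prices."""
    return (prompt_tokens / 1000) * price_in_per_1k + (completion_tokens / 1000) * price_out_per_1k

# Hypothetical conversation history, oldest first.
history = ["one two three", "four five", "six"]
trimmed = fit_to_context(history, max_tokens=3)

# Hypothetical prices: 0.5 per 1k input tokens, 1.5 per 1k output tokens.
cost = estimate_cost(1000, 500, price_in_per_1k=0.5, price_out_per_1k=1.5)
```

Anything that does not fit in the window is simply invisible to the model, and every token that does fit is paid for, which is why long prompts are both lossy and expensive.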
How to make LLMs respond better?
Zero-Shot: give some instructions to solve a task.
Few-Shot: give some examples of how to solve a task.
Chain-of-Thought (CoT): for complex tasks, prompt an LLM to "think step by step".
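The three prompting styles differ only in how the prompt string is assembled, which a few helper functions can show. The "Q:" and "A:" layout and the function names are illustrative assumptions; any consistent format would do.

```python
def zero_shot(task, question):
    """Zero-shot: instructions only, no worked examples."""
    return f"{task}\n\nQ: {question}\nA:"

def few_shot(task, examples, question):
    """Few-shot: instructions plus a handful of solved examples."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{task}\n\n{shots}\n\nQ: {question}\nA:"

def chain_of_thought(task, question):
    """Chain-of-thought: ask the model to reason before answering."""
    return f"{task}\n\nQ: {question}\nA: Let's think step by step."

# Hypothetical task and examples.
task = "Answer the arithmetic question."
examples = [("What is 2 + 2?", "4"), ("What is 3 + 5?", "8")]
question = "What is 7 + 6?"

zs_prompt = zero_shot(task, question)
fs_prompt = few_shot(task, examples, question)
cot_prompt = chain_of_thought(task, question)
```

The resulting strings can be sent to any LLM; only the prompt changes, not the model.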
Latest LLMs & Frameworks
LLMs
Mistral
Mixtral
Llama
Gemini
DeepSeek
Frameworks
Together AI - [Link]
Groq - [Link]
Replicate - [Link]
LiteLLM - [Link]
Hugging Face - [Link]
Generative AI Project Lifecycle