PEC GEN AI NOTES
MODULE 1
AI is a discipline and Machine Learning(ML) is a subfield of this discipline and deep
learning (DL) is a further subfield of ML
DL is a type of ML that uses Artificial Neural Networks to learn complex patterns
from the data. It contains two further types of techniques:
Discriminative Technique: Discriminative model (classifies/differentiates between
classes e.g. spam or not spam)
Generative Technique: Generative AI is a subset of AI that aims to generate new
content from given instructions (prompt)
AI-Machines doing tasks intelligently, ML-Machines that learn from patterns of data
to make predictions, Gen-AI-Machines creating new, original content (ChatGPT,
DALL-E)
Large Language Model (LLM): A language model is a type of Gen-AI that generates
text e.g autocomplete
Machine Learning Model Types:
1) Supervised Learning: Learns from labelled data e.g. predicting house prices.
2) Unsupervised Learning: Finds patterns in unlabeled data e.g. clustering
customers
3) Reinforcement Learning: Learns by trial and error e.g. game-playing AI.
4) Deep learning: Identifies patterns in data using neural networks.
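As a rough illustration of supervised learning (type 1 above), the sketch below fits a straight line to a few labelled examples and then predicts the price of an unseen house. The numbers are made up purely for illustration, and `fit_line` and `predict` are hypothetical helper names, not part of any library.

```python
# Toy supervised learning: learn price from size using labelled examples.
# The data below is invented purely for illustration.
sizes = [50, 100, 150]      # input feature (house size)
prices = [100, 200, 300]    # labels (known prices)

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Use the learned line to predict a label for a new input."""
    return slope * x + intercept

slope, intercept = fit_line(sizes, prices)
print(predict(120, slope, intercept))  # 240.0
```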
Large Language Models: Trained on vast amounts of text data to understand
context, grammar, and nuances of language.
How to interact with LLMs:
Chat interfaces (ChatGPT, Gemini), LLM APIs (GPT 3.5, 4.0 and Gemini), Specialized
models (YOLO, Whisper)
Chat Interfaces: Simplest way to interact with LLMs. Most LLM services provide one
(ChatGPT, Gemini). Most of these services have a free plan that limits usage
(restriction in no. of prompts) or limits the models available (GPT 3.5 vs 4.0)
API: Application Programming Interface (similar to waiters at restaurants). It
takes the prompt from you and delivers it to the server. The server, not the API,
processes the data, and the API brings the result back to the user. It acts as an
agent b/w you and the server. Another example is the CR in uni, who is the only
medium b/w teachers and students.
Adv of API: Easy to set up and use, faster prototyping, don’t need an expensive GPU
as all processing runs on the company’s servers.
Disadv of API: Costs money to use, application can be slower as each request has to
go to the company’s server and then come back. Any device using a company’s LLM API
needs to be connected to the internet, and its speed can be affected by the
internet’s speed.
Pre-trained Specialized models: Smaller models that are trained and fine-tuned to
perform specific tasks e.g. YOLO: object detection, Whisper: speech to text. These
smaller models are often more accurate at their specific tasks. These models
can be further fine-tuned using your own data to increase accuracy.
Where can specialized models be found? Most of them are open source meaning
they are freely available on the internet for anyone to download, use, and modify.
There are specific platforms where trained models are uploaded alongside
instructions on how to use them. Hugging Face is the most popular source for
pre-trained models and has easy-to-use instructions for most models. These platforms
have communities and forums where questions can be asked as well.
GPT: Generative Pre-trained Transformer – An LLM trained to generate answers to
our queries using the transformer architecture. The underlying datasets and
transformer architecture are largely the same; there are simply multiple models
trained from them. ChatGPT is built on these models. These GPTs are already trained
on large datasets (e.g. GPT-3’s training data included English Wikipedia alongside
other web and book text) but there’s still room for additional training
(fine-tuning) for specific tasks.
Transformers are the core neural network architecture that GPT-style LLMs are
built on. All data processing occurs in these transformers, from the decoding of
your query, changing it into computer language (numbers), processing, and changing
it back into human language.
HOW CHATGPT WORKS:
Prompt-> pre-processing-> tokenization (breakdown of text into smaller units like
words)-> embedding (conversion of tokens into numerical representations i.e.
vectors)-> self-attention (understanding the meaning of the prompt)-> picks
appropriate model (GPT, DALL-E)-> transformer then generates the next
information-> un-embedding and post-processing-> response in word form.
Tokenizing: Breaking inputs into chunks (tokens) such as words or parts of words,
so the meaning of the input prompt can be worked out piece by piece. Removing
unnecessary or redundant data is part of pre-processing (cleaning).
Embedding: Converting tokenized chunks into vectors in an N-dimensional vector
space. Can do vector operations on them like add, subtract, and dot product.
Un-embedding: After many rounds of attention and MLPs, a final vector is
produced which represents the next bit of info. Now we reverse the embedding,
turning this vector back into a token in the same form (text) as the input prompt.
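The three steps above can be sketched with a toy example. The vocabulary and the 2-dimensional vectors in `embedding_table` are made up for illustration (real models learn vectors with hundreds or thousands of dimensions), and all function names here are hypothetical.

```python
# Toy pipeline: tokenize -> embed -> un-embed.
# The vocabulary and 2-D vectors are invented; real models learn them.
embedding_table = {
    "cat": [1.0, 0.0],
    "dog": [0.0, 1.0],
    "pet": [1.0, 1.0],
}

def tokenize(text):
    """Split text into lowercase word tokens (real tokenizers use sub-words)."""
    return text.lower().split()

def embed(tokens):
    """Look up the vector for each known token."""
    return [embedding_table[t] for t in tokens if t in embedding_table]

def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]

def unembed(vector):
    """Return the token whose embedding has the largest dot product with vector."""
    return max(embedding_table, key=lambda tok: dot(embedding_table[tok], vector))

tokens = tokenize("Cat dog")
vectors = embed(tokens)
combined = add(vectors[0], vectors[1])
print(tokens)             # ['cat', 'dog']
print(combined)           # [1.0, 1.0]
print(unembed(combined))  # pet
```

Scoring every vocabulary entry against the final vector and picking the best match is a simplified stand-in for what un-embedding does in a real model.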
Cleaning inputs: Human language has a lot of filler words like ‘a’, ‘the’, ‘an’ called
stop words that don’t really add value to the text. Removing these words lowers
the number of tokens used and can help get more accurate results in tasks like
search. NLTK is a Python package commonly used to remove stop words.
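A minimal sketch of stop-word removal is shown below. It uses a small hand-written stop-word list as an assumption so that it runs without any downloads; NLTK ships a much longer list. `STOP_WORDS` and `remove_stop_words` are names invented for this example.

```python
# Small hand-written stop-word list (an assumption for this sketch;
# NLTK provides a much longer one).
STOP_WORDS = {"a", "an", "the", "is", "on", "of"}

def remove_stop_words(text):
    """Return the words of text that are not in STOP_WORDS."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]

cleaned = remove_stop_words("The cat is on a mat")
print(cleaned)  # ['cat', 'mat']
```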
STEMMING: A way to remove prefixes and suffixes from words to distill meaning
e.g. ‘walks’ becomes ‘walk’. Common suffixes like ‘ed’ and ‘ing’ are removed,
plurals are reduced to singular, and comparative forms are cut back (‘happier’
toward ‘happy’) using common algorithms like PorterStemmer. The result (stem) need
not be a real word, e.g. ‘retrieval’ may come out as ‘retriev’ rather than
‘retrieve’.
Stemming is useful in information retrieval, search, and data mining.
Be careful not to overstem and reduce words to a meaningless form e.g.
‘university’ to ‘univers’. Stemming words can make them lose their contextual
meaning. Compound words like ‘whiteboard’ are usually not well handled.
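The over-stemming risk described above can be seen with a deliberately naive suffix stripper. This is not the Porter algorithm, only a rough sketch of the idea; `SUFFIXES` and `naive_stem` are names invented for this example.

```python
# Naive stemmer: strip the first matching suffix, keeping at least 3 letters.
# This is a simplified illustration, not the Porter algorithm.
SUFFIXES = ("ing", "ity", "ed", "s")

def naive_stem(word):
    """Remove one known suffix from word if enough of the word remains."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for w in ["walks", "walking", "walked", "university"]:
    print(w, "->", naive_stem(w))
# walks -> walk
# walking -> walk
# walked -> walk
# university -> univers
```

The last line shows over-stemming: the rule that helps with some words turns ‘university’ into something that is no longer a word.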
Lemmatization: More sophisticated than stemming (better contextual awareness).
It reduces a word to its dictionary base form (lemma) e.g. ‘running’ and ‘ran’ will
both be reduced to ‘run’ and ‘university’ remains ‘university’.
This helps models group related words and sentences together.
Helps reduce overhead and reduce dimensionality of the vectors that words produce.
Helps in info retrieval e.g. ‘Best coffee’ can also retrieve results for ‘good coffee’
Also done using NLTK.
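To show the difference from stemming, here is a toy lemmatizer built on a tiny hand-made lookup table. The table `LEMMAS` is an assumption for this sketch; real lemmatizers (such as the one in NLTK) rely on a full dictionary and on the word's part of speech.

```python
# Toy lemmatizer: look the word up in a small hand-made dictionary.
# Real lemmatizers use a full dictionary and part-of-speech information.
LEMMAS = {"running": "run", "ran": "run", "runs": "run"}

def lemmatize(word):
    """Return the dictionary base form if known, else the word unchanged."""
    word = word.lower()
    return LEMMAS.get(word, word)

lemmas = [lemmatize(w) for w in ["running", "ran", "university"]]
print(lemmas)  # ['run', 'run', 'university']
```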
RAG (Retrieval Augmented Generation)
A technique that allows AI models to access and incorporate external information
to generate more accurate and informative responses. Essentially, it's like giving an
AI model access to a vast library of knowledge, enabling it to provide more
comprehensive and reliable answers.
Hybrid Approach: Combines LLMs with information retrieval systems to enhance
response accuracy and relevance.
Why RAG? LLMs face limitations in certain tasks:
Static Knowledge: LLMs are limited to knowledge up to their last training and lack
real-time updates
Contextual Limits: Struggle with highly specific or less common topics without
sufficient context
Large scale data handling: Handling large amounts of information and ensuring
relevancy & accuracy of responses.
Hallucination: Incorrect or misleading output generated by an AI model.
Data Staleness: Model’s inability to provide updated info because it was trained
on a fixed dataset that does not include updated data.
How RAG works:
Retrieval Component: Fetches relevant documents or data from an external
database. It performs these steps on the document:
1) Chunking: Divides large documents into smaller chunks. For example, if a doc is
of 100 words, then the entire text is split into chunks of 10 words each-> then
tokenization of these chunks occurs-> then embedding occurs and these
embeddings are stored in a vector database.
2) Semantic Understanding: The user then writes a prompt asking something
from the document, which is also embedded. The query embedding is then
matched against the document embeddings, and the chunks most semantically
related to the query are returned. The most relevant chunks are found through
distance-based retrieval: it identifies relevant data by calculating the distance
b/w vectors and filters out irrelevant data by setting a distance threshold. The
smaller the distance b/w the vectors, the more relevant the data.
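The chunking and distance-based retrieval steps above can be sketched as follows. The "embedding" here is only a word count over a small hand-picked vocabulary, which is an assumption standing in for a real embedding model; the document text, `vocab`, and the threshold value are all made up for illustration.

```python
import math

def chunk_words(text, size):
    """Split text into chunks of `size` words each."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text, vocab):
    """Toy embedding: count how often each vocabulary word appears."""
    words = text.lower().split()
    return [words.count(term) for term in vocab]

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, chunks, vocab, threshold):
    """Return chunks within `threshold` of the query, closest first."""
    query_vector = embed(query, vocab)
    scored = [(distance(query_vector, embed(c, vocab)), c) for c in chunks]
    return [chunk for dist, chunk in sorted(scored) if dist <= threshold]

document = ("python is a popular programming language "
            "paris is the capital of france")
vocab = ["python", "programming", "language", "paris", "capital", "france"]

chunks = chunk_words(document, 6)
results = retrieve("capital of france", chunks, vocab, threshold=1.5)
print(chunks)   # ['python is a popular programming language', 'paris is the capital of france']
print(results)  # ['paris is the capital of france']
```

The first chunk sits further from the query than the threshold allows, so it is filtered out as irrelevant.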
Generation Component: LLM generates responses based on both retrieved data
and own capabilities.
The retrieval step returns both the most relevant text from the document provided
and the user’s query. Together these form the augmented input (augmented prompt).
This augmented input then goes to the LLM, which generates a coherent response.
Here's how RAG works:
1) Query: A user submits a query to the language model.
2) Retrieval: The model accesses a knowledge base and retrieves relevant
information related to the query.
3) Augmentation: The retrieved information is integrated into the model's
response generation process.
4) Response Generation: The model generates a response that incorporates both
its own knowledge and the information from the knowledge base
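The four steps above can be sketched end to end. This is a simplified illustration under several assumptions: retrieval uses plain word overlap instead of embeddings, the knowledge base is two invented sentences, and `generate` is a hypothetical placeholder where a real LLM call would go.

```python
# Invented knowledge base for illustration only.
knowledge_base = [
    "The library opens at 9 am on weekdays.",
    "The cafeteria serves lunch from 12 pm to 2 pm.",
]

def retrieve(query, documents):
    """Step 2: pick the document sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(documents, key=lambda d: len(query_words & set(d.lower().split())))

def augment(query, context):
    """Step 3: combine the retrieved text and the query into one prompt."""
    return f"Context: {context}\nQuestion: {query}\nAnswer using the context."

def generate(prompt):
    """Step 4: hypothetical stub; a real system would call an LLM here."""
    return f"[LLM response to a prompt of {len(prompt)} characters]"

query = "When does the library open?"      # Step 1: the user's query
context = retrieve(query, knowledge_base)
prompt = augment(query, context)
print(prompt)
print(generate(prompt))
```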
Benefits of RAG:
Updated knowledge: Access to real-time or recent information
Enhanced Accuracy: More precise and contextually relevant answers
Scalability: Handles large data more effectively.
In conclusion RAG improves a model’s ability to generate accurate responses for
queries that require knowledge which isn’t present in a model’s pre-trained data.
* LOOK UP HOW IT USES WEIGHTS TO PRIORITIZE STUFF
MODULE 2-PYTHON FOR BEGINNERS
Python: One of the most popular programming languages, used in Web
development, data science, AI, scientific computing, automation, and more. Used
by Google, Netflix, Fb
Real World Applications of Python: 1) Web Development: Frameworks like Django
and Flask make building web applications easier. 2) Data Science and ML: Libraries
like Pandas, NumPy, Scikit-learn, and TensorFlow, 3) Automation: Python scripts
automate repetitive tasks, 4) Game Development: Libraries like Pygame are used
to create games.
Examples: YouTube: Uses Python, Instagram: Uses Django, a Python web
framework, Spotify: Uses Python for Data Analysis and Backend Services.
Why Python? 1) Its syntax is easy to write and understand (similar to plain
English), 2) Readable Code: Python emphasizes readability, making it easier to
learn and debug, 3) Large community and resources: Vast no. of tutorials, forums,
and documentation available, 4) Extensive Libraries and Frameworks: Makes
development easier and faster.
GOOGLE COLAB: (ONLY RUNS PYTHON CODE)- We’ll be using it to write code.
Open it in the browser and connect to a runtime first. By default this connects to a
CPU runtime (in these notes: disk space of 107.7GB and RAM of 12.7GB).
We can also connect it to a GPU, which makes ML (Machine Learning)
computations much faster because it can run many calculations in parallel.
There are two blocks labelled Code and Text under the formatting bar in Google
Colab. Code is used for writing code and Text for writing comments/notes/
headings.
The purpose of Google Colab is that you’ll be copying the code written by
ChatGPT and pasting it here to run and check for errors. If errors do arise you’ll
be debugging it by sending the code back to ChatGPT alongside the errors that
have occurred and ask it to debug and fix it.
To write a comment in Python use the symbol # before writing the comment.
Variable Names in Python: 1) Must start with a letter or underscore _ , 2) there
must be no white space in a variable name, 3) Cannot start with a number, 4)
Variable name can’t have special characters (only letters, digits, and _ are
allowed), 5) Names are case sensitive i.e. Age is not equal to age, 6) Keywords
cannot be used as variable names
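The naming rules above can be checked in code. The sketch below uses the built-in string method `isidentifier()` and the standard `keyword` module; `is_valid_variable_name` is a helper name invented for this example.

```python
import keyword

def is_valid_variable_name(name):
    """True if name follows the naming rules and is not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

for name in ["age", "_total", "2cool", "my var", "price$", "for"]:
    print(name, is_valid_variable_name(name))
# age True
# _total True
# 2cool False
# my var False
# price$ False
# for False

# Names are case sensitive: Age and age are two different variables.
Age = 30
age = 20
print(Age == age)  # False
```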
Arithmetic Operators in Python: +, -, *, /, % (modulus operator-gives remainder
e.g. 9%2=1), // (floor division- divides two numbers and returns the largest integer
less than or equal to the result of the division. It's like regular division, but it
rounds down to the nearest whole number e.g. 9//2=4), ** (power operator e.g.
3**2=9; note that ^ is not power in Python, it is bitwise XOR)
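A quick way to get a feel for these operators is to run each one on the same pair of numbers. The variable names below are chosen only for this example.

```python
total = 9 + 2        # 11
difference = 9 - 2   # 7
product = 9 * 2      # 18
quotient = 9 / 2     # 4.5  (regular division always gives a float)
remainder = 9 % 2    # 1
floored = 9 // 2     # 4    (rounds down)
floored_neg = -9 // 2  # -5 (rounds down, not toward zero)
power = 9 ** 2       # 81

print(total, difference, product, quotient)
print(remainder, floored, floored_neg, power)
```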
Logical Operators in Python: and, or, not (written in lowercase; DON’T USE SYMBOLS
like && and || IN PYTHON)
Comparison Operators: <, >, != (Not equal to), <=, >=, == (equal to)
Conditional Statements: instead of else if, Python uses elif.
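The sketch below combines comparison operators, logical operators, and an if / elif / else chain. The `grade` function and its cut-off marks are made up for illustration.

```python
def grade(marks):
    """Return a letter grade using if / elif / else."""
    if marks >= 80:
        return "A"
    elif marks >= 50 and marks < 80:
        return "B"
    else:
        return "F"

age = 20
has_id = True

print(age >= 18 and has_id)     # True
print(age < 18 or not has_id)   # False
print(age != 18)                # True
print(grade(85), grade(60), grade(30))  # A B F
```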
LOOPS IN PYTHON:
1) For Loop:
- For each loop (value based loop)
- Index based loop
2) While Loop
For Loop: (index based loop)
for i in range(10):        # index starts from 0 so count will be from 0 to 9
    print(i)               # prints out a series from 0 to 9
for i in range(3, 10):     # it now starts from 3 and ends at 9
    print(i)               # output 3,4,5,…,9
for i in range(3, 10, 2):  # the third entry indicates the jump between integers
    print(i)               # output 3,5,7,9
So the general form is range(start, end (exclusive), jump).
While Loop
i = 0
while i < 10:   # REMEMBER: THIS LINE ENDS WITH A COLON : NOT A SEMICOLON ;
    print(i)
    i += 2      # output 0,2,4,6,8
FUNCTIONS IN PYTHON: 1) Built in functions, 2) User Defined Functions (which we
will be generating with ChatGPT)
Built in functions: max(), min(). The math module adds more, e.g. math.sqrt()
(calculates the square root of a given number e.g. math.sqrt(16)=4.0; needs
import math first)
Built in functions don’t need to be defined prior to use e.g.
output = max(3, 4)
print(output)   # gives 4
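A short runnable version of the above is given below. Note that `max()` and `min()` work straight away, while `sqrt()` lives in the standard `math` module and needs an import.

```python
import math  # needed for math.sqrt; max() and min() need no import

largest = max(3, 4)
smallest = min(3, 4)
root = math.sqrt(16)

print(largest)   # 4
print(smallest)  # 3
print(root)      # 4.0
```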
User Defined Function Syntax:
def function_name(parameters):   # here def is a keyword
    # body of function
    return expression
e.g.
def multiply(x, y):
    output = x * y
    return output
x = 5
y = 10
output = multiply(x, y)
print(output)   # gives 50
LISTS IN PYTHON: denoted by square brackets e.g. clothes = ["shirts", "tie", "pants"]
print(clothes[1])   # gives back tie as answer (indexes start from 0)
# applying for loops on lists
1) Index based for loop
# len(clothes) gives the number of items in the list (which is 3), and range(3)
# creates a sequence of numbers from 0 to 2, so the loop runs once for each item.
for i in range(len(clothes)):
    print(clothes[i])   # prints the item at the current position i in the list
2) Value based for loop
# This loop goes through each item in the clothes list one by one.
for i in clothes:
    print(i)            # prints the current item i from the clothes list
**LISTS CAN BE EDITED AFTER THEY HAVE BEEN MADE**
cloth = ["shirts", "ties", "Pants"]
cloth[0] = "Jeans"
print(cloth)   # ['Jeans', 'ties', 'Pants']
TUPLE IN PYTHON: Similar to a list but it cannot be changed once it is created and
it uses parentheses () instead of square brackets []
e.g. cloth = ("Jeans", "ties", "Pants")
print(type(cloth))   # <class 'tuple'>
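To see what "cannot be changed" means in practice, the sketch below reads an item from a tuple (allowed) and then tries to replace one (not allowed). The `try` / `except` block is used here only so the error can be printed instead of stopping the program.

```python
cloth = ("Jeans", "ties", "Pants")
print(cloth[1])  # ties  (reading by index works, just like a list)

try:
    cloth[0] = "Shirts"          # trying to edit a tuple
except TypeError as error:
    message = str(error)
    print("TypeError:", message)  # TypeError: 'tuple' object does not support item assignment
```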
SETS IN PYTHON: A set is a collection of unique data, meaning elements within a
set cannot be duplicated. ELEMENTS IN A SET ARE UNORDERED. SETS USE CURLY
BRACKETS {}
# even if you add multiple 3’s, the set will contain only one 3
e.g. numbers = {1, 2, 3, 4, 5, "numbers", 2.45, 3}
print(numbers)   # e.g. {1, 2, 3, 4, 5, 2.45, 'numbers'} (order is not guaranteed)
WE CANNOT USE INDEXES TO FETCH ITEMS IN SETS
e.g. print(numbers[2])   # TypeError: 'set' object is not subscriptable
BUT WE CAN APPLY LOOPS
e.g. for element in numbers:
         print(element)   # 1, 2, 3, 4, 5, 2.45, numbers (one per line, in no fixed order)
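The set behaviour described above can be checked with a small runnable sketch. The values are chosen only for illustration.

```python
numbers = {1, 2, 3, 3, 3}   # duplicates are dropped automatically
size = len(numbers)
print(size)          # 3
print(3 in numbers)  # True (membership checks work)

try:
    numbers[2]                  # sets have no positions, so indexing fails
except TypeError as error:
    set_error = str(error)
    print(set_error)            # 'set' object is not subscriptable

seen = []
for element in numbers:         # looping is allowed
    seen.append(element)
print(len(seen))     # 3
```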
DICTIONARY IN PYTHON: ALSO REPRESENTED BY CURLY BRACKETS {} JUST LIKE
SETS. The difference is that a dictionary stores key-value pairs (each element
consists of a key and a value). KEYS are always unique but values may not be.
e.g. country_capitals = {'Germany': 'Berlin', 'Canada': 'Ottawa', 'England': 'London'}
Here both Germany and Berlin form 1 element
We can access values from dictionaries with this syntax:
print(country_capitals['Canada'])   # Ottawa
IN DICTIONARIES KEYS CAN’T BE REPEATED: IF A KEY APPEARS TWICE, THE LATEST VALUE
REPLACES THE FIRST ONE. e.g.
products = {'cans': 1, 'bins': 3, 'cans': 5}
print(products['cans'])   # 5
Applying loops to Dictionaries
for element in products.items():   # .items() is new
    print(element)   # ('cans', 5) (next line) ('bins', 3)
To get each item separately
for product_type, rate in products.items():
    print(product_type, "=", rate)   # cans = 5 (next line) bins = 3
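A runnable version of the dictionary notes above is sketched here, using the same example data. It shows a lookup by key, what happens to a repeated key, and looping with `.items()`.

```python
country_capitals = {"Germany": "Berlin", "Canada": "Ottawa", "England": "London"}
print(country_capitals["Canada"])  # Ottawa

# A repeated key keeps only the latest value.
products = {"cans": 1, "bins": 3, "cans": 5}
print(products)  # {'cans': 5, 'bins': 3}

lines = []
for product_type, rate in products.items():
    line = f"{product_type} = {rate}"
    lines.append(line)
    print(line)
# cans = 5
# bins = 3
```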
HOW TO GET USER INPUT IN PYTHON:
user_input = input('enter a number: ')   # input() always returns text (a string)
print(user_input)
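Because `input()` always returns text, the value has to be converted before doing maths with it. The sketch below shows the conversion on a fixed string so it runs without waiting for the keyboard; `to_number` is a helper name invented for this example.

```python
def to_number(raw_text):
    """Convert text typed by the user into a whole number."""
    return int(raw_text.strip())

# In a notebook you would write:
#     number = to_number(input("enter a number: "))
number = to_number(" 42 ")
print(number + 1)  # 43
```

If the text is not a whole number (for example "abc"), `int()` raises a ValueError.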
LEET CODE WEBSITE