0% found this document useful (0 votes)
17 views10 pages

Machine Learning: Bayes' Theorem in Python

The document is a practical file for a Machine Learning Lab course, detailing experiments related to Bayes' theorem and its applications in Python. It includes examples of calculating probabilities using Bayes' rule, explanations of conditional probability, and the implementation of Naive Bayes classifiers. Additionally, it covers data extraction from a MySQL database using Python, along with relevant source code and viva questions.

Uploaded by

abhisheksinghc88
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views10 pages

Machine Learning: Bayes' Theorem in Python

The document is a practical file for a Machine Learning Lab course, detailing experiments related to Bayes' theorem and its applications in Python. It includes examples of calculating probabilities using Bayes' rule, explanations of conditional probability, and the implementation of Naive Bayes classifiers. Additionally, it covers data extraction from a MySQL database using Python, along with relevant source code and viva questions.

Uploaded by

abhisheksinghc88
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

GWALIOR INSTITUTE OF INFORMATION

TECHNOLOGY

Practical File on
Machine Learning Lab (IT-804)
Session:…………………

Submitted To: Submitted By:


Name:
Semester:
Enrollment No:
Branch:
LIST OF EXPERIMENTS

PROGRAM
1. The probability thatitisFridayandthatastudentisabsentis3%.Sincethereare5school
daysinaweek,theprobabilitythatitisFridayis20%.Whatisthe probability that a student
is absent given that today is Friday? Apply Baye’s rule in python to get the result.
(Ans: 15%)

2. EXTRACT THE DATA FROM DATABASE USING PYTHON


The probability thatitisFridayandthatastudentisabsentis3%.Sincethereare5school
daysinaweek,theprobabilitythatitisFridayis20%.Whatisthe probabilitythatastudent is
absent given that today is Friday? Apply Baye’s rule in python to get the result. (Ans:
15%)

AIM: To find the probability that a student is absent given that today is Friday.

DESCRIPTION:

Machine learning is a method of data analysis that automates analytical model building of data set.
Using the implemented algorithms that iteratively learn from data, machine learning allows computers
to find hidden insights without being explicitly programmed where to look. Naive bayes algorithm is
one of the most popular machines learning technique. In this article we will look how to implement
Naive bayes algorithm using python.

Before someone can understand Bayes’ theorem, they need to know a couple of related concepts first,
namely, the idea of Conditional Probability, and Bayes’ Rule.

Conditional Probability is just what is the probability that something will happen, given that something
else has already happened.

Let say we have a collection of people. Some of them are singers. They are either male or female. If we
select a random sample, what is the probability that this person is a male? what is the probability that
this person is a male and singer? Conditional Probability is the best option here. We can calculate
probability like,

P(Singer & Male) = P(Male) x P(Singer / Male)

What is Bayes rule ?

We can simply define Bayes rule like this. Let A1, A2, …, An be a set of mutually exclusive events that
together form the sample space S. Let B be any event from the same sample space, such that P(B) > 0.
Then, P( Ak | B ) = P( Ak ∩ B ) / P( A1 ∩ B ) + P( A2 ∩ B ) + . . . + P( An ∩ B )
What is Bayes classifier?

Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’
theorem with strong (naive) independence assumptions between the features in machine learning.
Basically we can use above theories and equations for classification problem.

SOURCE CODE:

probAbsentFriday=0.0

3 probFriday=0.2

# bayes Formula

#p(Absent|Friday)=p(Friday|Absent)p(Absent)/p(Friday)

#p(Friday|Absent)=p(Friday∩Absent)/p(Absent)
# Therefore the result is:
bayesResult=(probAbsentFriday/probFriday)
print(bayesResult * 100)

Output: 15

[Link] QUESTIONS & ANSWERS


1. What are Bayesian Networks (BN) ?

Bayesian Network is used to represent the graphical model for probability relationship among a set of
variables. Bayes’ theorem is a way to figure out conditional probability. Conditional probability is the
probability of an event happening, given that it has some relationship to one or more other events. For
example, your probability of getting a parking space is connected to the time of day you park, where
you park, and what conventions are going on at any time. Bayes’ theorem is slightly more nuanced. In a
nutshell, it gives you the actual probability of an event given information about tests.
• “Events” Are different from “tests.” For example, there is a test for liver disease, but that’s
separate from the event of actually having liver disease.
• Tests are flawed: just because you have a positive test does not mean you actually have the
disease. Many tests have a high false positive rate. Rare events tend to have higher false positive
rates than more common events. We’re not just talking about medical tests here. For example, spam
filtering can have high false positive rates. Bayes’ theorem takes the test results and calculates your real
probability that the test has identified the event.

2. Can you give any real time example using Bayes’ Theorem (liver disease).

You might be interested in finding out a patient’s probability of having liver disease if they are an
alcoholic. “Being an alcoholic” is the test (kind of like a litmus test) for liver disease.

• A could mean the event “Patient has liver disease.” Past data tells you that 10% of patients
entering your clinic have liver disease. P(A) = 0.10.
• B could mean the litmus test that “Patient is an alcoholic.” Five percent of the clinic’s patients are
alcoholics. P(B) = 0.05.
• You might also know that among those patients diagnosed with liver disease, 7% are alcoholics.
This is your B|A: the probability that a patient is alcoholic, given that they have liver disease, is 7%.

Bayes’ theorem tells you:

P(A|B)=(0.07*0.1)/0.05=0.14

In other words, if the patient is an alcoholic, their chances of having liver disease is 0.14 (14%).

This is a large increase from the 10% suggested by past data. But it’s still unlikely that any particular
patient has liver disease.

3. Bayes’ Theorem Examples 2: what is the probability that they will be prescribed pain pills?

Another way to look at the theorem is to say that one event follows another. Above I said “tests” and
“events”, but it’s also legitimate to think of it as the “first event” that leads to the “second event.”
There’s no one right way to do this: use the terminology that makes most sense to you.

In a particular pain clinic, 10% of patients are prescribed narcotic pain killers. Overall, five percent of
the clinic’s patients are addicted to narcotics (including pain killers and illegal substances). Out of all
the people prescribed pain pills, 8% are addicts. If a patient is an addict, what is the probability that
they will be prescribed pain pills?
Step 1:Figure out what your event “A” is from the question. That information is in the italicized part
of this particular question. The event that happens first (A) is being prescribed pain pills. That’s given as
10%.

Step 2:Figure out what your event “B” is from the question. That information is also in the italicized
part of this particular question. Event B is being an addict. That’s given as 5%.

Step 3:Figure out what the probability of event B (Step 2) given event A (Step 1). In other words,
find what (B|A) is. We want to know “Given that people are prescribed pain pills, what’s the probability
they are an addict?” That is given in the question as 8%, or .8.

Step 4:Insert your answers from Steps 1, 2 and 3 into the formula and solve.
P(A|B) = P(B|A) * P(A) / P(B) = (0.08 * 0.1)/0.05 = 0.16

The probability of an addict being prescribed pain pills is 0.16 (16%).

4. Bayes’ Theorem Examples 3: the Medical Test if a person gets a positive test result.

what are the odds they actually have the genetic defect?

A slightly more complicated example involves a medical test (in this case, a genetic test):

There are several forms of Bayes’ Theorem out there, and they are all equivalent (they are just written
in slightly different ways). In this next equation, “X” is used in place of “B.” In addition, you’ll see
some changes in the denominator. The proof of why we can rearrange the equation like this is beyond
the scope of this article (otherwise it would be 5,000 words instead of 2,000!). However, if you come
across a question involving medical tests, you’ll likely be using this alternative formula to find the
answer:

1% of people have a certain genetic defect.


90% of tests for the gene detect the defect (true positives).
9.6% of the tests are false positives.
If a person gets a positive test result, what are the odds they actually have the genetic defect?
The first step into solving Bayes’ theorem problems is to assign letters to events:

• A = chance of having the faulty gene. That was given in the question as 1%. That also means the
probability of not having the gene (~A) is 99%.
• X = A positive test result.

So:

1. P(A|X) = Probability of having the gene given a positive test result.


2. P(X|A) = Chance of a positive test result given that the person actually has the gene. That was
given in the question as 90%.
3. p(X|~A) = Chance of a positive test if the person doesn’t have the gene. That was given in the
question as 9.6%

Now we have all of the information we need to put into the equation:
P(A|X) = (.9 * .01) / (.9 * .01 + .096 * .99) = 0.0865 (8.65%).

The probability of having the faulty gene on the test is 8.65%.

[Link] the following statistics, what is the probability that a woman has cancer if she has a
positive mammogram result?

• One percent of women over 50 have breast cancer.


• Ninety percent of women who have breast cancer test positive on mammograms.
• Eight percent of women will have false positives.

Step 1: Assign events to A or X. You want to know what a woman’s probability of having cancer is,
given a positive mammogram. For this problem, actually having cancer is A and a positive test result is
X.
Step 2: List out the parts of the equation (this makes it easier to work the actual equation):
P(A)=0.01
P(~A)=0.99
P(X|A)=0.9
P(X|~A)=0.08
Step 3: Insert the parts into the equation and solve. Note that as this is a medical test, we’re using the
form of the equation from example #2:
(0.9 * 0.01) / ((0.9 * 0.01) + (0.08 * 0.99) = 0.10.

The probability of a woman having cancer, given a positive test result, is 10%.
PROGRAM

3. EXTRACT THE DATA FROM DATABASE USING PYTHON

You’ll learn the following MySQL SELECT operations from Python using a ‘MySQL
Connector Python’ module.

• Execute the SELECT query and process the result set returned by the queryin Python.
• Use Python variables in a where clause of a SELECT query to pass dynamic
values.
• Use fetchall(), fetchmany(), and fetchone() methods of a cursor class to fetch all or
limited rows from atable.
• Python Select from MySQL Table

This article demonstrates how to select rows of a MySQL table in Python.

You’ll learn the following MySQL SELECT operations from Python using a ‘MySQL Connector
Python’ module
.
• Execute the SELECT query and process the result set returned by the queryin Python.
• Use Python variables in a where clause of a SELECT query to pass dynamic values.
• Use fetchall(),fetchmany(), and fetchone() methods of a cursor class to fetch all or limited rows
from atable.
• Steps to fetch rows from a MySQL database table Follow these steps:–

SOURCE CODE:

importpymysql
defmysqlconnect():
# To connect MySQL
database conn = [Link](
host='localhost', user='root',
password = "pass",
db='College',

)
cur = [Link]()
[Link]("select @@version")
output = [Link]() print(output)

# To close the connection [Link](


)

OUTPUT:
VIVA QUESTIONS AND ANSWERS

1. How to select from a MySQL table using Python?

Connect to MySQL from Python

Refer to Python MySQL database connection to connect to MySQL database from Python using
MySQL Connector module

Define a SQL SELECT Query

Next, prepare a SQL SELECT query to fetch rows from a table. You can select all or limited rows based
on your requirement. If the where condition is used, then it decides the number of rows to fetch.
For example, SELECT col1, col2,…colnN FROM MySQL_table WHERE id = 10;. This will return row number
10.

Get Cursor Object from Connection

Next, use a [Link]() method to create a cursor object. This method creates a new MySQLCursor
object.
Execute the SELECT query using

execute() method Execute the select

query using the [Link]()

[Link] all rows from a

result

After successfully executing a Select operation, Use the fetchall() method of a cursor
object to get allrows from a query result. it returns a list of rows.

Iterate each row

Iterate a row list using a for loop and access each row individually (Access each
row’s column datausing a column name or index number.)

Close the cursor object and database connection object

[Link]() and [Link]() method to close open connections after your work
completes.

2. What are the methods to fetch data returned by a [Link]( )

• [Link]() to fetch all rows


• [Link]() to fetch a single row
• [Link](SIZE) to fetch limited rows

Common questions

Powered by AI

Conditional probability considers the probability of an event given another event, whereas Bayes' Theorem refines this by incorporating prior probabilities and the likelihood of test results to update the probability of an actual event. For instance, a medical test (an event) might show positive results, but Bayes' Theorem calculates the actual probability of having the disease by accounting for false positives and the base rate of the disease, distinguishing test events from actual conditions .

The Naive Bayes classifier in machine learning employs Bayes' theorem with strong independence assumptions between the features. It calculates the probability of each class given the feature values and selects the class with the highest probability. This simplifies computation and makes it efficient for large datasets. By assuming that all feature pairs are independent given the class, it can apply Bayes' rule iteratively to find useful insights from data .

In the context of pain pill prescriptions, Bayes' Theorem is used to determine the probability that an addict will be prescribed pain pills. Given that 5% of the clinic's patients are addicts and 10% are prescribed narcotics, with 8% of prescribed individuals being addicts, Bayes' Theorem finds the probability of an addict being prescribed pain pills as 16% . This highlights how the theorem integrates various probability components to estimate an informed likelihood based on observed data.

Independence is crucial in Naive Bayes models because it allows the simplification of probability computation by assuming that the presence of a particular feature in a class is not related to the presence of any other feature. This reduces the computational complexity from exponential to linear in the number of features, making it feasible to compute posteriors and classify data efficiently, even in datasets with many features or instances . This also enables the practical application of Naive Bayes in real-time scenarios.

To implement a Naive Bayes algorithm in Python, you first calculate the prior probabilities of each class in the dataset. Then, compute the likelihood of each feature value given each class. Combine these using Bayes' Theorem, under the naive assumption of independence between features, to estimate the posterior probability for each class. Finally, classify data points by selecting the class with the highest posterior probability . This involves coding the calculations and leveraging libraries such as scikit-learn for simplification.

Bayes' Theorem helps improve the accuracy of medical diagnostics by incorporating information about false positives. For instance, in the case of a genetic test where 1% of people have a certain defect, but the test has a 9.6% false positive rate, Bayes' Theorem calculates the actual probability of having the defect as 8.65% if tested positive, despite the high false positive rate . This highlights how the theorem uses the base rates of a condition and the test characteristics to give a more accurate probability of the condition's presence in individuals with positive test results.

Bayesian Networks are effective for representing probabilistic relationships among a set of variables through graphical models. They offer a framework for reasoning about uncertainty by encoding dependencies using directed acyclic graphs. A real-world example includes medical diagnosis systems, where Bayesian Networks model symptom-disease relationships and update probabilities as new symptoms are observed, thus supporting decision-making processes about diagnosis and treatment strategies .

Data extraction from a MySQL database using Python involves connecting using a library like pymysql, executing SQL SELECT queries, and processing the results with cursor methods like fetchall(), fetchmany(), or fetchone(). This capability is crucial for machine learning as it allows accessing and manipulating large datasets efficiently from a database, enabling preprocessing, analysis, and model feeding with real-world data . The connection and querying steps must be carefully coded to reliably handle database interactions.

Bayes' Theorem can be applied to real-time prediction problems in domains such as finance, where it predicts market movements based on new information, or in recommendation systems, where it updates user preferences as new behavior data is received. For example, a streaming service could use it to better infer a user's likelihood of enjoying a new release based on past viewing history and ratings, continuously refining recommendations with each interaction . These examples showcase the theorem's adaptability to dynamic datasets.

In spam filtering systems, high false positive rates can significantly affect the reliability and user satisfaction. Bayes’ Theorem helps mitigate this by recalculating the probability of an email being spam after a positive filter result, based on the prior probability of spam and the likelihood ratios. Quantitatively, if 1% of emails are spam and the filter has a 5% false positive rate, Bayes' Theorem adjusts the spam probability, reducing unnecessary false alarms . Understanding these dynamics helps optimize filter settings to balance sensitivity and specificity.

You might also like