Practical Big Data Analytics Guide

The document outlines a series of practical exercises focused on data science techniques, including the installation and configuration of Hadoop, various classification and regression models, and clustering methods. Each practical includes specific aims, installation instructions, code snippets, and data analysis steps using R programming. The exercises cover decision trees, SVM, linear regression, logistic regression, and k-means clustering, providing a comprehensive overview of machine learning applications.

Uploaded by

pranavmhatre.mscit

INDEX

Sr. No.  Practical  Date  Sign

1. Install, configure and run Hadoop and HDFS and explore HDFS.
2. Implement Decision tree classification techniques.
3. Implement SVM classification techniques.
4. Implementation of Regression Model.
5. Implementation of Simple Linear Regression.
6. Implementation of Multiple Linear Regression.
7. Implementation of Logistic regression.
8. Read a datafile grades_km_input.csv and apply k-means clustering.
9. Perform Apriori algorithm using Groceries dataset from the R arules package.

Shanti Chourasiya

Roll no. 2023ITI1

Practical 1

Aim: - Install, configure and run Hadoop and HDFS and explore HDFS on Windows

Code:

Steps to Install Hadoop

1. Install Java JDK 1.8
2. Download Hadoop and extract and place under C drive
3. Set Path in Environment Variables
4. Config files under Hadoop directory
5. Create folder datanode and namenode under data directory
6. Edit HDFS and YARN files
7. Set Java Home environment in Hadoop environment
8. Setup Complete. Test by executing start-all.cmd

There are two ways to install Hadoop, i.e.

– Single node
– Multi node

Here, every daemon runs on one machine (localhost), i.e. a single-node setup.

1. Install Java
– Java JDK link to download:
[Link]
– Extract and install Java in C:\Java
– Open cmd and type -> javac -version

2. Download Hadoop
[Link]
[Link]

– Right click the downloaded .tar.gz file -> show more options -> 7-zip -> and extract to C:\Hadoop-3.3.0\
3. Set the JAVA_HOME environment variable

4. Set the HADOOP_HOME environment variable

Click on New for both user variables and system variables.

Click on user variable -> Path -> Edit -> add the paths for Hadoop and Java up to ‘bin’.
Click Ok, Ok, Ok.
5. Configurations
Edit file C:/Hadoop-3.3.0/etc/hadoop/core-site.xml,
paste the xml code into the file and save

======================================================

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
======================================================

Rename “mapred-site.xml.template” to “mapred-site.xml” and edit this file C:/Hadoop-3.3.0/etc/hadoop/mapred-site.xml, paste xml code and save this file.
======================================================

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
======================================================

Create folder “data” under “C:\Hadoop-3.3.0”

Create folder “datanode” under “C:\Hadoop-3.3.0\data”

Create folder “namenode” under “C:\Hadoop-3.3.0\data”

======================================================

Edit file C:/Hadoop-3.3.0/etc/hadoop/hdfs-site.xml,

paste xml code and save this file.

<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/hadoop-3.3.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/hadoop-3.3.0/data/datanode</value>
</property>
</configuration>
======================================================

Edit file C:/Hadoop-3.3.0/etc/hadoop/yarn-site.xml,

paste xml code and save this file.

</configuration>
======================================================

6. Edit file C:/Hadoop-3.3.0/etc/hadoop/hadoop-env.cmd

Find “JAVA_HOME=%JAVA_HOME%” and replace it with
set JAVA_HOME="C:\Java\jdk1.8.0_361"
======================================================

7. Download “redistributable” package

Download and run VC_redist.x64.exe

This is a “redistributable” package of the Visual C++ runtime code for 64-bit applications, from
Microsoft. It contains shared code that every application built with Visual C++ expects to
have available on the Windows computer it runs on.

8. Hadoop Configurations
Download bin folder from

[Link]

– Copy the bin folder to c:\hadoop-3.3.0. Replace the existing bin folder.

9. Copy "[Link]" from ~\hadoop-3.3.0\share\hadoop\yarn\timelineservice to the ~\hadoop-3.3.0\share\hadoop\yarn folder.

10. Format the NameNode

– Open cmd ‘Run as Administrator’ and type command “hdfs namenode -format”
11. Testing

– Open cmd ‘Run as Administrator’ and change directory to C:\Hadoop-3.3.0\sbin

– type start-dfs.cmd

– type start-yarn.cmd

– or type start-all.cmd to launch both at once

– You will get 4 more running processes for the DataNode, NameNode, ResourceManager and NodeManager
Output:

12. Type the jps command in the command prompt; you will get the following output.

13. Run [Link] from any browser

Or [Link]
Practical 2
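Aim: - Implement Decision tree classification techniques.

# Install the required packages (run once); datasets is part of base R and needs no installation
install.packages('caTools')
install.packages('party')
install.packages('dplyr')
install.packages('magrittr')

# Load the packages
library(datasets)
library(caTools)
library(party)
library(dplyr)
library(magrittr)

# readingSkills ships with the party package
data("readingSkills")
head(readingSkills)

# Split on the label vector so that 80% of the rows go to training
sample_data = sample.split(readingSkills$nativeSpeaker, SplitRatio = 0.8)
train_data <- subset(readingSkills, sample_data == TRUE)
test_data <- subset(readingSkills, sample_data == FALSE)

# Fit a conditional inference tree and plot it
model <- ctree(nativeSpeaker ~ ., train_data)
plot(model)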
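
The practical stops at plotting the tree. A minimal sketch of how the held-out rows could be scored is given here; the seed value and the names in_train, predicted, conf_mat and accuracy are illustrative assumptions, not part of the original practical.

```r
library(party)    # ctree() and the readingSkills data
library(caTools)  # sample.split()

data("readingSkills", package = "party")

set.seed(42)  # assumed seed, only to make the split repeatable
in_train <- sample.split(readingSkills$nativeSpeaker, SplitRatio = 0.8)
train_data <- subset(readingSkills, in_train)
test_data <- subset(readingSkills, !in_train)

# Fit the tree on the training rows only
model <- ctree(nativeSpeaker ~ ., data = train_data)

# Predict the class of each held-out row
predicted <- predict(model, newdata = test_data)

# Cross-tabulate actual against predicted classes
conf_mat <- table(actual = test_data$nativeSpeaker, predicted = predicted)
print(conf_mat)

# Share of held-out rows classified correctly
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

The diagonal of conf_mat counts the correctly classified rows, so accuracy is the diagonal sum divided by the number of test rows.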
Practical 3
Aim: - Implement SVM classification techniques.

# Code for installation of all necessary packages
install.packages("caret")
install.packages("ggplot2")
install.packages("GGally")
install.packages("psych")
install.packages("ggpubr")
install.packages("reshape")

# Code for importation of all necessary packages

library(caret)

library(ggplot2)

library(GGally)

library(psych)

library(ggpubr)

library(reshape)

# Load the dataset
df <- read.csv("D:\\[Link]")

head(df)

# Count the missing values

sum(is.na(df))

# Dimensions of the dataset

dim(df)

# Class of each column

sapply(df, class)

# Summary statistics of the dataset

summary(df)

# Histogram and density plot of each predictor

a <- ggplot(data = df, aes(x = Pregnancies)) +

geom_histogram( color = "red", fill = "blue", alpha = 0.1) +

geom_density()

b <- ggplot(data = df, aes(x = Glucose)) +

geom_histogram( color = "red", fill = "blue", alpha = 0.1) +

geom_density()

c <- ggplot(data = df, aes(x = BloodPressure)) +

geom_histogram( color = "red", fill = "blue", alpha = 0.1) +

geom_density()

d <- ggplot(data = df, aes(x = SkinThickness)) +

geom_histogram( color = "red", fill = "blue", alpha = 0.1) +

geom_density()

e <- ggplot(data = df, aes(x = Insulin)) +

geom_histogram( color = "red", fill = "blue", alpha = 0.1) +

geom_density()

f <- ggplot(data = df, aes(x = BMI)) +

geom_histogram( color = "red", fill = "blue", alpha = 0.1) +

geom_density()

g <- ggplot(data = df, aes(x = DiabetesPedigreeFunction)) +

geom_histogram( color = "red", fill = "blue", alpha = 0.1) +

geom_density()

h <- ggplot(data = df, aes(x = Age)) +

geom_histogram( color = "red", fill = "blue", alpha = 0.1) +

geom_density()

# Arrange the eight plots on a 3 x 3 grid

ggarrange(a, b, c, d, e, f, g, h + rremove("x.text"),

labels = c("a", "b", "c", "d", "e", "f", "g", "h"),

ncol = 3, nrow = 3)
# Bar chart of the outcome classes

ggplot(data = df, aes(x =Outcome, fill = Outcome)) +

geom_bar()
# Code to label our categorical variable as a factor

df$Outcome<- factor(df$Outcome,

levels = c(0, 1),

labels = c("Negative", "Positive"))

out <- subset(df,

select = c(Pregnancies,Glucose,

BloodPressure,SkinThickness,

Insulin,BMI,

DiabetesPedigreeFunction,Age))

# Code for boxplot

ggplot(data = melt(out),

aes(x=variable, y=value)) +

geom_boxplot(aes(fill=variable))
# Correlation plot of the eight predictors
corPlot(df[, 1:8])

# Select 85% of the rows, stratified by Outcome
cutoff <- createDataPartition(df$Outcome, p=0.85, list=FALSE)

# the 15% of rows not selected form the test set

testdf <- df[-cutoff,]

# the remaining 85% is used to train and tune the model

traindf <- df[cutoff,]

# Code to train the SVM

set.seed(1234)

# set up 10-fold cross-validation and let caret
# pick the best model by accuracy

control <- trainControl(method="cv", number=10, classProbs = TRUE)

metric <- "Accuracy"

model <- train(Outcome ~ ., data = traindf, method = "svmRadial",

tuneLength = 8, preProc = c("center","scale"),

metric=metric, trControl=control)

# Code for model summary

model

# Plot cross-validated accuracy against the tuning parameter

plot(model)
Practical 4
Aim: - Implementation of Regression Model.

# Generate random IQ values with mean = 30 and sd = 2
IQ <- rnorm(40, 30, 2)

# Sort IQ levels in ascending order
IQ <- sort(IQ)

# Generate vector with pass and fail values of 40 students
result <- c(0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
1, 0, 0, 0, 1, 1, 0, 0, 1, 0,
0, 0, 1, 0, 0, 1, 1, 0, 1, 1,
1, 1, 1, 0, 1, 1, 1, 1, 0, 1)

# Data frame
df <- as.data.frame(cbind(IQ, result))

# Print data frame
print(df)

# Plot IQ on the x-axis and result on the y-axis
plot(IQ, result, xlab = "IQ Level",
ylab = "Probability of Passing")

# Create a logistic model
g = glm(result~IQ, family=binomial, df)

# Draw the fitted probability curve from the model's predictions
curve(predict(g, data.frame(IQ=x), type="response"), add=TRUE)

# Mark the fitted probability of each observation (pch values 26 to 31 draw nothing, so use 20)
points(IQ, fitted(g), pch=20)

# Summary of the regression model
summary(g)
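
To make the link between the coefficients in summary(g) and the plotted curve explicit, the sketch below recomputes one predicted probability by hand. The seed, the example value iq_new = 31 and the names p_manual, p_predict and odds_ratio are assumptions added for illustration.

```r
set.seed(1)  # assumed seed so the simulated IQ values are repeatable
IQ <- sort(rnorm(40, 30, 2))
result <- c(0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
            1, 0, 0, 0, 1, 1, 0, 0, 1, 0,
            0, 0, 1, 0, 0, 1, 1, 0, 1, 1,
            1, 1, 1, 0, 1, 1, 1, 1, 0, 1)

g <- glm(result ~ IQ, family = binomial)
b <- coef(g)  # b[1] is the intercept, b[2] the IQ slope (log-odds scale)

iq_new <- 31  # hypothetical student

# Logistic model: P(pass) = 1 / (1 + exp(-(b0 + b1 * IQ)))
p_manual <- unname(plogis(b[1] + b[2] * iq_new))

# The same probability obtained through predict()
p_predict <- unname(predict(g, data.frame(IQ = iq_new), type = "response"))

# exp(slope) is the factor by which the odds of passing change per IQ point
odds_ratio <- unname(exp(b["IQ"]))

print(c(manual = p_manual, predict = p_predict, odds_ratio = odds_ratio))
```

The two probabilities agree because predict() with type = "response" applies the same inverse-logit transformation to the linear predictor.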

Practical 5
Aim :- Implementation of Simple Linear Regression.

years_of_exp = c(7,5,1,3)
salary_in_lakhs = c(21,13,6,8)

employee.data = data.frame(years_of_exp, salary_in_lakhs)
employee.data

# Estimate the salary of an employee from years of experience.
model <- lm(salary_in_lakhs ~ years_of_exp, data = employee.data)
summary(model)

# The fitted regression equation is
# salary_in_lakhs = 2 + 2.5*years_of_exp

# Visualization of Regression
plot(salary_in_lakhs ~ years_of_exp, data = employee.data)
abline(model)
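
The equation quoted above can be checked by hand with the least-squares formulas. The sketch below is an illustrative addition; the names x_bar, y_bar, slope and intercept and the 4-year example are not from the original practical.

```r
years_of_exp <- c(7, 5, 1, 3)
salary_in_lakhs <- c(21, 13, 6, 8)

x_bar <- mean(years_of_exp)     # 4
y_bar <- mean(salary_in_lakhs)  # 12

# slope = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)^2) = 50 / 20
slope <- sum((years_of_exp - x_bar) * (salary_in_lakhs - y_bar)) /
  sum((years_of_exp - x_bar)^2)

# intercept = y_bar - slope * x_bar = 12 - 2.5 * 4
intercept <- y_bar - slope * x_bar

print(c(intercept = intercept, slope = slope))  # 2 and 2.5

# lm() gives the same coefficients
model <- lm(salary_in_lakhs ~ years_of_exp)
print(coef(model))

# Predicted salary for a hypothetical employee with 4 years of experience
pred_4 <- unname(predict(model, data.frame(years_of_exp = 4)))
print(pred_4)  # 2 + 2.5 * 4 = 12
```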
Practical 6
Aim :- Implementation of Multiple Linear Regression.

# Importing the dataset
dataset = read.csv('D:\\[Link]')

# Encoding categorical data
dataset$State = factor(dataset$State,
levels = c('New York', 'California', 'Florida'),
labels = c(1, 2, 3))
dataset$State

# Splitting the dataset into the Training set and Test set
install.packages('caTools')
library(caTools)
set.seed(123)
split = sample.split(dataset$Profit, SplitRatio = 0.8)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)

# Feature Scaling (left disabled; the fit below uses the unscaled columns)
# training_set = scale(training_set)
# test_set = scale(test_set)

# Fitting Multiple Linear Regression to the Training set
regressor = lm(formula = Profit ~ .,
data = training_set)

# Predicting the Test set results
y_pred = predict(regressor, newdata = test_set)
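
The practical ends with y_pred and does not compare it with the actual profits. A minimal sketch of that comparison follows. Because the CSV file is not available here, it uses simulated data: the Spend column, the coefficients used to generate Profit, and the names train_rows and rmse are all assumptions for illustration.

```r
set.seed(123)
n <- 50

# Simulated stand-in for the dataset (hypothetical columns and values)
dataset <- data.frame(
  Spend = runif(n, 0, 100),
  State = factor(sample(c("New York", "California", "Florida"), n, replace = TRUE))
)
dataset$Profit <- 50 + 0.8 * dataset$Spend + rnorm(n, sd = 5)

# 80/20 split using base R sampling
train_rows <- sample(seq_len(n), size = round(0.8 * n))
training_set <- dataset[train_rows, ]
test_set <- dataset[-train_rows, ]

# Fit on the training rows, predict the held-out rows
regressor <- lm(Profit ~ ., data = training_set)
y_pred <- predict(regressor, newdata = test_set)

# Root mean squared error: typical size of a prediction error, in Profit units
rmse <- sqrt(mean((test_set$Profit - y_pred)^2))
print(rmse)

# Side-by-side view of actual and predicted values
print(head(data.frame(actual = test_set$Profit, predicted = y_pred)))
```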

Practical 7
Aim :- Implementation of Logistic regression.

Source code:

install.packages("ISLR")
library(ISLR)

#load dataset
data <- ISLR::Default
print(head(data))

#view summary of dataset
summary(data)

#find total observations in dataset
nrow(data)

#Create Training and Test Samples
#split the dataset into a training set to train the model on and a testing set to test the model
set.seed(1)

#Use 70% of dataset as training set and remaining 30% as testing set
sample <- sample(c(TRUE, FALSE), nrow(data), replace=TRUE, prob=c(0.7,0.3))
print (sample)
train <- data[sample, ]
test <- data[!sample, ]
nrow(train)
nrow(test)

# Fit the Logistic Regression Model
# use the glm (generalized linear model) function and specify family="binomial"
# so that R fits a logistic regression model to the dataset
model <- glm(default~student+balance+income, family="binomial", data=train)

#view model summary
summary(model)

#Model Diagnostics
install.packages("InformationValue")
library(InformationValue)

# predicted probability of default for each test row
predicted <- predict(model, test, type="response")

# confusion matrix of actual vs. predicted classes
confusionMatrix(test$default, predicted)
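
The confusion matrix step can also be written in base R, which makes the classification threshold explicit. The sketch below is an illustration on simulated data, not the ISLR Default data: the balance values, the coefficients used to simulate defaults, the 0.5 threshold and the names in_train, predicted_class and conf_mat are assumptions.

```r
set.seed(1)
n <- 1000

# Simulated stand-in: default becomes more likely as balance grows (hypothetical)
balance <- runif(n, 0, 2500)
default <- factor(ifelse(runif(n) < plogis(-6 + 0.004 * balance), "Yes", "No"),
                  levels = c("No", "Yes"))
data <- data.frame(balance, default)

# 70/30 split, as in the practical
in_train <- sample(c(TRUE, FALSE), n, replace = TRUE, prob = c(0.7, 0.3))
train <- data[in_train, ]
test <- data[!in_train, ]

model <- glm(default ~ balance, family = "binomial", data = train)

# Probabilities for the test rows, turned into classes at a 0.5 threshold
predicted <- predict(model, test, type = "response")
predicted_class <- factor(ifelse(predicted > 0.5, "Yes", "No"),
                          levels = c("No", "Yes"))

# Rows are actual classes, columns are predicted classes
conf_mat <- table(actual = test$default, predicted = predicted_class)
print(conf_mat)

accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

Lowering the threshold below 0.5 would flag more rows as "Yes", trading false negatives for false positives.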
Practical 8
Aim: Read a datafile grades_km_input.csv and apply k-means clustering.

Datafile:

# install required packages (grid ships with base R and needs no installation)
install.packages("plyr")
install.packages("ggplot2")
install.packages("cluster")
install.packages("lattice")
install.packages("gridExtra")

# Load the package

library(plyr)

library(ggplot2)

library(cluster)

library(lattice)

library(grid)

library(gridExtra)

# A data frame is a two-dimensional array-like structure in which each column contains values of one
# variable and each row contains one set of values from each column.

grade_input=as.data.frame(read.csv("D:\\grades_km_input.csv"))

kmdata_orig=as.matrix(grade_input[, c("Student","English","Math","Science")])

kmdata=kmdata_orig[,2:4]

kmdata[1:10,]

# The k-means algorithm is used to identify clusters for k = 1, 2, ..., 15. For each value of k, the WSS
# is calculated.

wss=numeric(15)

# The option nstart=25 specifies that the k-means algorithm will be repeated 25 times, each starting
# with k random initial centroids.

for(k in 1:15) wss[k]=sum(kmeans(kmdata,centers=k,nstart=25)$withinss)

plot(1:15,wss,type="b",xlab="Number of Clusters",ylab="Within sum of squares")

# As can be seen, the WSS is greatly reduced when k increases from one to two. Another substantial
# reduction in WSS occurs at k = 3. However, the improvement in WSS is fairly linear for k > 3.

km = kmeans(kmdata,3,nstart=25)

c( wss[3] , sum(km$withinss))

df=as.data.frame(kmdata_orig[,2:4])

df$cluster=factor(km$cluster)

centers=as.data.frame(km$centers)

g1=ggplot(data=df, aes(x=English, y=Math, color=cluster )) +
geom_point() + theme(legend.position="right") +
geom_point(data=centers,aes(x=English,y=Math, color=as.factor(c(1,2,3))),size=10, alpha=.3,
show.legend=FALSE)

g2=ggplot(data=df, aes(x=English, y=Science, color=cluster )) +
geom_point() + geom_point(data=centers,aes(x=English,y=Science,
color=as.factor(c(1,2,3))),size=10, alpha=.3, show.legend=FALSE)

g3 = ggplot(data=df, aes(x=Math, y=Science, color=cluster )) +
geom_point() + geom_point(data=centers,aes(x=Math,y=Science,
color=as.factor(c(1,2,3))),size=10, alpha=.3, show.legend=FALSE)

tmp=ggplot_gtable(ggplot_build(g1))

grid.arrange(arrangeGrob(g1 + theme(legend.position="none"),g2 +
theme(legend.position="none"),g3 + theme(legend.position="none"),
top="High School Student Cluster Analysis", ncol=1))
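
Since grades_km_input.csv is not reproduced in this file, the elbow procedure can be tried on simulated scores. The sketch below is an illustration only: the three group centres, the group size of 40, the seed and the names centres, scores and wss_drop are assumptions.

```r
set.seed(25)

# Three hypothetical groups of 40 students with different average scores
centres <- rbind(c(60, 60, 60), c(80, 75, 80), c(95, 95, 92))
scores <- do.call(rbind, lapply(1:3, function(i) {
  matrix(rnorm(40 * 3, mean = rep(centres[i, ], each = 40), sd = 3), ncol = 3)
}))
colnames(scores) <- c("English", "Math", "Science")

# Total within-cluster sum of squares for k = 1, ..., 8
wss <- sapply(1:8, function(k) {
  kmeans(scores, centers = k, nstart = 25)$tot.withinss
})

# Drop in WSS gained by each extra cluster; it flattens after the elbow
wss_drop <- -diff(wss)
print(round(wss_drop))

plot(1:8, wss, type = "b",
     xlab = "Number of Clusters", ylab = "Within sum of squares")

# Final clustering at the chosen k
km <- kmeans(scores, centers = 3, nstart = 25)
print(table(km$cluster))
```

tot.withinss is the sum of the per-cluster withinss values, so it matches the sum(...$withinss) expression used in the practical.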
Practical 9
Aim: Perform Apriori algorithm using Groceries dataset from the R arules package.

install.packages("arules")

install.packages("arulesViz")

install.packages("RColorBrewer")

# Loading Libraries

library(arules)

library(arulesViz)

library(RColorBrewer)

# import dataset

data(Groceries)

Groceries

summary(Groceries)

class(Groceries)

# using apriori() function

rules = apriori(Groceries, parameter = list(supp = 0.02, conf = 0.2))

summary (rules)

# using inspect() function

inspect(rules[1:10])

# using itemFrequencyPlot() function

arules::itemFrequencyPlot(Groceries, topN = 20,

col = brewer.pal(8, 'Pastel2'),

main = 'Relative Item Frequency Plot',

type = "relative",

ylab = "Item Frequency (Relative)")

itemsets = apriori(Groceries, parameter = list(minlen=2, maxlen=2, support=0.02, target="frequent itemsets"))

summary(itemsets)

# using inspect() function

inspect(itemsets[1:10])

itemsets_3 = apriori(Groceries, parameter = list(minlen=3, maxlen=3, support=0.02, target="frequent itemsets"))

summary(itemsets_3)

# using inspect() function

inspect(itemsets_3)
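
The supp and conf thresholds passed to apriori() are easier to read once the measures are computed by hand. The sketch below does this in base R on five made-up baskets; the baskets and the helper functions support, confidence and lift are illustrative assumptions and are not part of the arules package.

```r
# Five hypothetical transactions
baskets <- list(
  c("milk", "bread"),
  c("milk", "butter"),
  c("bread", "butter"),
  c("milk", "bread", "butter"),
  c("bread")
)

# support(X): share of baskets that contain every item in X
support <- function(items) {
  mean(sapply(baskets, function(b) all(items %in% b)))
}

# confidence(X -> Y) = support(X and Y together) / support(X)
confidence <- function(lhs, rhs) {
  support(c(lhs, rhs)) / support(lhs)
}

# lift(X -> Y) = confidence(X -> Y) / support(Y); values below 1 mean
# Y is bought less often with X than it is overall
lift <- function(lhs, rhs) {
  confidence(lhs, rhs) / support(rhs)
}

print(support("milk"))              # 3 of 5 baskets = 0.6
print(support(c("milk", "bread")))  # 2 of 5 baskets = 0.4
print(confidence("milk", "bread"))  # 0.4 / 0.6 = 0.667
print(lift("milk", "bread"))        # 0.667 / 0.8 = 0.833
```

With supp = 0.02 on Groceries, an itemset must appear in at least 2% of the transactions to be kept, and conf = 0.2 keeps only rules whose confidence is at least 0.2.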

Common questions

Visualization techniques like ggplot histograms and density plots are invaluable for understanding dataset distributions before applying machine learning models. They enable the identification of data skewness, potential outliers, and overall shape, which informs necessary preprocessing steps such as normalization or transformation. By visualizing variables, one can assess whether assumptions like normality are met, which is crucial for certain models. These insights guide the choice and tuning of models appropriate for the data's underlying structure and help in anticipating model behavior and potential biases .

The essential packages required for implementing decision tree classification in R include 'datasets', 'caTools', 'party', 'dplyr', and 'magrittr'. Initial steps involve loading these libraries, loading the data using the 'data()' function, then splitting the dataset into training and testing sets using 'sample.split()'. Following this, a decision tree model can be created using the 'ctree()' function on the training data .

Simple linear regression involves modeling the relationship between a single predictor variable and the response variable, exemplified by estimating salary against years of experience using 'lm()'. Multiple linear regression, on the other hand, involves more than one predictor, as seen in the example where state and other factors predict profit. While simple linear regression is useful for straightforward correlations, multiple regression considers the effect of multiple variables simultaneously, providing a more comprehensive analysis of factors influencing the dependent variable .

Installing Hadoop on a multi-node cluster involves configuring namenode and datanode directories on separate nodes, as opposed to a single node setup where everything runs on one machine. Key steps include setting environment paths for Java and Hadoop, configuration of core-site.xml to specify the cluster's NameNode, and modifying mapred-site.xml and yarn-site.xml for resource management. Additionally, setting up datanode and namenode directories, and formatting the NameNode with the command 'hdfs namenode -format' are crucial before starting the Hadoop services using 'start-dfs' and 'start-yarn' commands .

Before training an SVM model in R, preliminary data processing steps include checking for missing values using 'sum(is.na())', determining the dimensions of the data with 'dim()', and obtaining a statistical summary using 'summary()'. Visualization steps include generating histograms and density plots for different variables using 'ggplot()' to understand their distributions. Additionally, labels for categorical variables can be added using 'factor()'. These steps help in understanding the data better and preparing it for model development .

Logistic regression is pivotal in analyzing the relationship between IQ and student pass rates because it models the probability of a binary outcome (pass or fail) based on the predictor variables (in this case, IQ). By fitting the logistic regression model with 'glm()', the analysis evaluates the likelihood of passing as IQ levels vary, highlighting non-linear relationships not captured by linear regression. This technique efficiently handles dichotomous outcomes and provides insights into how changes in IQ levels impact passing probability through the derived model coefficients and fitted values overlaying actual outcomes .

Preparing a dataset for logistic regression analysis involves several steps such as examining the dataset's contents with 'head()' and 'summary()', splitting it into training and testing sets by sampling, and ensuring the sample is representative. The logistic regression model is then fitted using 'glm()', with the formula specifying the predictor and response variables. It is crucial to ensure categorical variables are correctly labeled, and diagnostics are performed to evaluate model accuracy, often utilizing functions like 'predict()' and confusion matrix tools for validation .

A package like 'caTools' is essential in data splitting operations for multiple linear regression because it allows for creating a random sampling for dividing data into training and testing sets, which is critical for avoiding overfitting and ensuring the model's generalizability. The 'sample.split()' function enables the allocation of a predetermined split ratio, facilitating an effective separation of data into subsets, ensuring that model training and evaluation accurately reflect out-of-sample performance. This process is foundational in building robust models that effectively predict outcomes on unseen data .

Applying the Apriori algorithm to the Groceries dataset generates association rules that reveal frequent itemsets purchased together, providing valuable insights for market basket analysis. The support parameter determines how frequently an itemset appears in the dataset, while confidence evaluates the likelihood of an item Y being bought given that item X is bought. Adjusting these parameters affects the number and specificity of the generated rules; higher support reduces rule numbers but increases significance, while lower thresholds may expand potential insights albeit with less robustness .

The k-means clustering algorithm helps determine the optimal number of clusters through the calculation of the within-cluster sum of squares (WSS) for different values of k. In the example, WSS is calculated for k ranging from 1 to 15, and plotted to create an Elbow plot. The 'elbow' point in this plot, where the rate of decrease sharply drops, suggests an optimal number of clusters. This is critical because it indicates a balance between having a compact internal cluster structure and distinct separation between clusters .
