SRM UNIVERSITY
ASSIGNMENT WORK
NAME: GUNJAN
YEAR/SEMESTER: 3rd /6th
PROGRAM: [Link].B (2021)
REGISTRATION NO.: 46021210022
SUBJECT: BIG DATA AND ANALYTICS
QUESTION: REAL-LIFE EXAMPLES ILLUSTRATED WITH A DISCUSSION ON THE
SIGNIFICANCE OF BIG DATA
Answer: BIG DATA: Because almost every aspect of our day-to-day life is
gadget-oriented, a huge volume of data emanates from various digital sources.
Big data refers to the large, diverse sets of information that grow at
ever-increasing rates. It encompasses the volume of information, the velocity
or speed at which it is created and collected, and the variety or scope of the
data points being covered.
REAL-LIFE EXAMPLES:
EDUCATION INDUSTRY: The education industry is flooded with huge
amounts of data related to students, faculty, courses, results, and whatnot. Now,
we have realized that proper study and analysis of this data can provide insights
that can be used to improve the operational effectiveness and working of
educational institutes. Following are some of the fields in the education industry
that have been transformed by big data-motivated changes:
Customized and Dynamic Learning Programs: Customized programs
and schemes to benefit individual students can be created using the data
collected based on each student’s learning history. This improves the
overall student results.
Reframing Course Material: Course material can be reframed using data
collected through real-time monitoring of the components of a course,
which shows what a student learns and to what extent. This is beneficial
for the students.
Grading Systems: New advancements in grading systems have been
introduced as a result of a proper analysis of student data.
Career Prediction: Appropriate analysis and study of every student’s
records will help understand each student’s progress, strengths, weaknesses,
interests, and more. It would also help in determining which career would
be the most suitable for the student in the future. The applications of big
data have provided a solution to one of the biggest pitfalls in the education
system, that is, the one-size-fits-all fashion of academic set-up, by
contributing to e-learning solutions.
HEALTHCARE INDUSTRY: Healthcare is yet another industry that is bound
to generate a huge amount of data. Following are some of the ways in which big
data has contributed to healthcare:
Big data reduces the costs of a treatment since there are fewer chances of
having to perform unnecessary diagnoses.
It helps in predicting outbreaks of epidemics and also in deciding what
preventive measures could be taken to minimize the effects of the same.
It helps avoid preventable diseases by detecting them in the early stages. It
prevents them from getting any worse which in turn makes their treatment
easy and effective.
Patients can be provided with evidence-based medicine identified and
prescribed after researching past medical results.
Wearable devices and sensors introduced in the healthcare industry can provide
a real-time feed to the electronic health record of a patient. One example
comes from Apple, which has come up with Apple HealthKit, CareKit, and
ResearchKit. The main goal is to empower iPhone users to store and access
their real-time health records on their phones.
BIG DATA IN GOVERNMENT SECTOR: Governments of every country come face to
face with a huge amount of data almost daily, because they have to keep track
of various records and databases regarding their citizens, their growth, energy
resources, geographical surveys, and much more. All this data contributes to
big data. The proper study and analysis of this data therefore helps
governments in many ways. A few of them are as follows:
Welfare Schemes
In making faster and more informed decisions regarding various political
programs
To identify areas that are in immediate need of attention
To stay up to date in the field of agriculture by keeping track of all existing
land and livestock.
To overcome national challenges such as unemployment, terrorism, energy
resources exploration, and much more.
Cyber Security
Big data is widely used for fraud detection in the domain of cyber
security.
It is also used in catching tax evaders.
Cyber security engineers protect networks and data from unauthorized
access.
The Food and Drug Administration (FDA), which runs under the jurisdiction of
the Federal Government of the USA, leverages the analysis of big data to
discover patterns and associations that help identify and examine expected or
unexpected occurrences of food-based infections.
BIG DATA IN MEDIA AND ENTERTAINMENT: With people having access
to various digital gadgets, the generation of a large amount of data is inevitable
and this is the main cause of the rise in big data in the media and entertainment
industry.
Other than this, social media platforms are another way in which a huge amount
of data is generated. Businesses in the media and entertainment industry have
realized the importance of this data, and they have been able to benefit from
it for their growth.
Some of the benefits extracted from big data in the media and
entertainment industry are given below:
Predicting the interests of audiences
Optimized or on-demand scheduling of media streams in digital media
distribution platforms
Getting insights from customer reviews
Effective targeting of the advertisements
Spotify, an on-demand music platform, uses big data analytics: it collects
data from all its users around the globe and then uses the analyzed data to give
informed music recommendations and suggestions to every individual user.
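The idea of recommending music from the listening data of other users can be sketched in a few lines. This is only an illustration, not Spotify's actual method: it uses a simple user-based collaborative filter with cosine similarity, and the user names, track names, and play counts are all made up for the example.

```python
from math import sqrt

# Hypothetical play counts: user -> {track: number of plays}
plays = {
    "asha": {"track_a": 5, "track_b": 3},
    "ben": {"track_a": 4, "track_b": 2, "track_c": 5},
    "chen": {"track_d": 5, "track_e": 2},
}


def cosine(u, v):
    """Cosine similarity between two sparse play-count vectors."""
    shared = set(u) & set(v)
    dot = sum(u[t] * v[t] for t in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user, plays, top_n=2):
    """Rank unheard tracks by the similarity-weighted plays of other users."""
    scores = {}
    for other, their_plays in plays.items():
        if other == user:
            continue
        sim = cosine(plays[user], their_plays)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for track, count in their_plays.items():
            if track not in plays[user]:
                scores[track] = scores.get(track, 0.0) + sim * count
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [track for track, _ in ranked[:top_n]]


print(recommend("asha", plays))
```

Here "asha" and "ben" share two tracks, so the track that only "ben" has played is suggested to "asha", while "chen", who has nothing in common with her, has no influence.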
Amazon Prime, which offers videos, music, and Kindle books in a one-stop
shop, also makes heavy use of big data.
BIG DATA IN WEATHER PATTERNS: There are weather sensors and
satellites deployed all around the globe. A huge amount of data is collected
from them, and then this data is used to monitor the weather and environmental
conditions.
All of the data collected from these sensors and satellites contributes to big
data and can be used in different ways, such as:
In weather forecasting
To study global warming
In understanding the patterns of natural disasters
To make necessary preparations in the case of crises
To predict the availability of usable water around the world
IBM Deep Thunder, which is a research project by IBM, provides weather
forecasting through high-performance computing of big data. IBM is also
assisting Tokyo with improved weather forecasting for natural disasters or
predicting the probability of damaged power lines.
BIG DATA IN TRANSPORTATION INDUSTRY: Since the rise of big data, it
has been used in various ways to make transportation more efficient and easy.
Following are some of the areas where big data contributes to transportation.
Route planning: Big data can be used to understand and estimate users’
needs on different routes and multiple modes of transportation and then
utilize route planning to reduce their wait time.
Congestion management and traffic control: Using big data, real-time
estimation of congestion and traffic patterns is now possible. For example,
people are using Google Maps to locate the least traffic-prone routes.
Traffic safety: Using the real-time processing of big data and
predictive analysis to identify accident-prone areas can help reduce
accidents and increase the safety level of traffic.
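The route planning and congestion points above can be illustrated with a small sketch. It assumes a toy road network in which every road has a base travel time and a congestion factor (both invented here), and it uses Dijkstra's shortest-path algorithm to pick the fastest route. It is not how Google Maps works internally, only a minimal example of the idea.

```python
import heapq


def fastest_route(graph, start, goal):
    """Dijkstra's algorithm over edges weighted by congested travel time.

    graph maps a node to a list of (neighbour, base_minutes, congestion)
    tuples, where congestion >= 1.0 scales the base travel time.
    """
    queue = [(0.0, start, [start])]
    best = {start: 0.0}
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == goal:
            return minutes, path
        if minutes > best.get(node, float("inf")):
            continue  # a faster way to this node was already found
        for neighbour, base_minutes, congestion in graph.get(node, []):
            total = minutes + base_minutes * congestion
            if total < best.get(neighbour, float("inf")):
                best[neighbour] = total
                heapq.heappush(queue, (total, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical network: the highway is shorter but heavily congested.
roads = {
    "home": [("highway", 10, 2.5), ("side_street", 15, 1.0)],
    "highway": [("office", 10, 1.0)],
    "side_street": [("office", 12, 1.0)],
}

print(fastest_route(roads, "home", "office"))
```

With these numbers the congested highway costs 35 minutes in total, so the side street at 27 minutes is chosen even though it is the longer road on paper.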
Example
Let’s take Uber as an example here. Uber generates and uses a huge amount of
data regarding drivers, their vehicles, locations, every trip from every vehicle,
etc. All this data is analyzed and then used to predict supply, demand, location
of drivers, and fares that will be set for every trip.
And guess what? We use the same idea ourselves when we choose a route to save
fuel and time, based on our knowledge of having taken that particular route
sometime in the past. In this case, we analyze the data that we previously
acquired through our experience and then use it to make a smart decision. It is
pretty cool that big data plays a part not only in big fields but also in our
smallest day-to-day decisions.
BIG DATA IN BANKING SECTOR: The amount of data in the banking sector
is skyrocketing every second. According to the GDC prognosis, this data is
estimated to grow 700 percent by the end of the next year. Proper study and
analysis of this data can help detect illegal activities and support banking
functions such as:
Misuse of credit/debit cards
Venture credit hazard treatment
Business clarity
Customer statistics alteration
Money laundering
Risk mitigation
Anti-money laundering software such as SAS AML uses data analytics to
detect suspicious transactions and analyze customer data. Bank of America
has been a SAS AML customer for more than 25 years.
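As a rough illustration of how analytics can flag a suspicious transaction, the sketch below marks amounts that sit far away from a customer's usual spending. It is not how SAS AML works; it is a generic outlier rule (the modified z-score based on the median absolute deviation), and the transaction amounts and the 3.5 threshold are assumptions chosen for the example.

```python
from statistics import median


def flag_suspicious(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds threshold."""
    med = median(amounts)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []  # all amounts are (almost) identical, nothing stands out
    return [
        i
        for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical card transactions for one customer.
amounts = [120, 95, 130, 110, 105, 9800, 115]
print(flag_suspicious(amounts))
```

Only the 9800 transaction is flagged here. A real system would combine many more signals, such as location, merchant, and time of day.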
BIG DATA IN MARKETING SECTOR: Traditional marketing techniques
were based on surveys and one-on-one interactions with the customers.
Companies would run advertisements on radios, TV channels, and newspapers,
and put huge banners on the roadside. Little did they know about the impact of
their ads on the customer.
With the evolution of the internet and technologies like big data, marketing
also went digital, giving rise to digital marketing. Today, with big data,
you can collect huge amounts of data and get to know the choices of millions of
customers in a few seconds. Business analysts analyze the data to help
marketers run campaigns, increase click-through rates, place relevant
advertisements, improve the product, and cover the nuances to reach the desired
target.
For example, Amazon collected data about the purchases made by millions of
people around the world. It analyzed the purchase patterns and payment
methods used by the customers and used the results to design new offers and
advertisements.
BIG DATA IN INSIGHT SPACE: One of the best Big Data applications we
can see in modern industries is generating business insights. Around 60 percent
of the total data collected by various enterprises and social media websites is
either unstructured or never analyzed. This data, if used correctly, can solve
a lot of problems related to profits, customer satisfaction, and product
development. Luckily, companies are now becoming aware of the importance of
using the latest technologies to manage and analyze this data more effectively.
Netflix, for example, uses big data to understand user behavior, the type of
content users like, the popular movies on the website, similar content that
can be suggested to a user, and which series or movies it should invest in.
BIG DATA IN SPACE SECTOR: Space agencies of different countries collect
huge amounts of data every day by observing outer space and by receiving
information from satellites orbiting the earth, probes studying outer space, and
rovers on other planets. They analyze petabytes of data and use it to simulate
the flight path before launching the actual payload in space. Before launching
any rocket, it is necessary to run complex simulations and consider various
factors like weather, payload, orbit location, trajectory, etc.
For example, NASA is collecting data from different satellites and rovers about
the geography, atmospheric conditions, and other factors of Mars for its
upcoming mission. It uses big data to manage all that data and analyzes it to
run simulations.
HOW TO INSTALL HADOOP?
Hadoop is a well-known big data processing system for storing and
analysing enormous volumes of data. The steps below describe how to install
Hadoop on Windows.
Step 1: Download and install Java:
Hadoop is built on Java, so you must have Java installed on your PC. You
can get the most recent version of Java from the official website. After
downloading, follow the installation wizard to install Java on your system.
JDK: [Link]
Step 2: Download Hadoop
Hadoop can be downloaded from the Apache Hadoop website. Make sure to
have the latest stable release of Hadoop. Once downloaded, extract the
contents to a convenient location.
Hadoop: [Link]
Step 3: Set Environment Variables
You must configure environment variables after downloading and unpacking
Hadoop. Launch the Start menu, type “Edit the system environment
variables,” and select the result. This will launch the System Properties
dialogue box. Click the “Environment Variables” button to open the list of variables.
Click “New” under System Variables to add a new variable. Enter the
variable name “HADOOP_HOME” and the path to the Hadoop folder as the
variable value. Then press “OK.”
Then, under System Variables, locate the “Path” variable and click “Edit.”
Click “New” in the Edit Environment Variable window and enter
“%HADOOP_HOME%\bin” as the variable value. To close all the windows,
use the “OK” button.
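After setting the variables, it can help to confirm that they are visible to new programs. The following is a minimal sketch in Python, assuming Python is installed on the machine; the function name check_hadoop_env is made up for this example and is not part of Hadoop. Open a new command prompt before running it so that the updated variables are picked up.

```python
import os


def check_hadoop_env(env=None):
    """Return a list of problems found in the Hadoop environment variables."""
    env = os.environ if env is None else env
    problems = []
    home = env.get("HADOOP_HOME")
    if not home:
        problems.append("HADOOP_HOME is not set")
        return problems
    # The bin folder inside HADOOP_HOME should appear as one PATH entry.
    expected_bin = os.path.normcase(os.path.normpath(os.path.join(home, "bin")))
    path_entries = [
        os.path.normcase(os.path.normpath(entry))
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry
    ]
    if expected_bin not in path_entries:
        problems.append("PATH does not contain " + expected_bin)
    return problems


if __name__ == "__main__":
    issues = check_hadoop_env()
    print("Environment looks fine" if not issues else "\n".join(issues))
```

An empty list means that HADOOP_HOME is set and its bin folder is on the Path. It does not check that Hadoop itself is installed correctly.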
Step 4: Setup Hadoop
You must configure Hadoop in this phase by modifying several
configuration files. Navigate to the “etc/hadoop” folder in the Hadoop
folder. You must make changes to three files:
[Link]
[Link]
[Link]
Open each file in a text editor and edit the following properties:
In [Link]
<configuration>
  <property>
    <name>[Link]</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
In [Link]
<configuration>
  <property>
    <name>[Link]</name>
    <value>1</value>
  </property>
  <property>
    <name>[Link]</name>
    <value>file:/hadoop-3.3.1/data/namenode</value>
  </property>
  <property>
    <name>[Link]</name>
    <value>file:/hadoop-3.3.1/data/datanode</value>
  </property>
</configuration>
In [Link]
<configuration>
  <property>
    <name>[Link]</name>
    <value>localhost:54311</value>
  </property>
</configuration>
Save the changes in each file.
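All three files share the same layout: a configuration element holding property entries, each with a name and a value. The sketch below shows how such a file could be generated from a Python dictionary instead of being typed by hand. The property names used in it (fs.defaultFS and dfs.replication) are assumptions: they are standard Hadoop keys that match the values shown above, but the names in the listings above are not readable, so check them against the Hadoop documentation before use.

```python
import xml.etree.ElementTree as ET


def build_configuration(properties):
    """Render a dict of settings as Hadoop-style <configuration> XML text."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or newer
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Assumed property names; the values are the ones used in this step.
core_site = build_configuration({"fs.defaultFS": "hdfs://localhost:9000"})
hdfs_site = build_configuration({"dfs.replication": 1})

print(core_site)
```

The returned text can be written to the matching file in the “etc/hadoop” folder. Building the XML with a library avoids unclosed tags, which are a common cause of start-up errors.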
Step 5: Format Hadoop NameNode
You must format the NameNode before you can start Hadoop. Navigate to
the Hadoop bin folder using a command prompt. Execute this command:
hadoop namenode -format
Step 6: Start Hadoop
To start Hadoop, open a command prompt and navigate to the Hadoop bin
folder. Run the following command:
[Link]
This command will start all the required Hadoop services, including the
NameNode and DataNode, along with the YARN ResourceManager and NodeManager
that replace the older JobTracker in Hadoop 2.x and later. Wait for a few
minutes until all the services are started.
Step 7: Verify Hadoop Installation
To ensure that Hadoop is properly installed, open a web browser and go
to [Link]. This will launch the web interface for the Hadoop
NameNode. You should see a page with Hadoop cluster information.
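The same check can be done from a script. The sketch below is a minimal example, assuming Python is installed; the function name namenode_ui_reachable is made up here. The address is passed in by the caller. On Hadoop 3.x the NameNode web interface listens on port 9870 by default, so http://localhost:9870 is a reasonable value to try, but that is an assumption about the setup rather than something stated in the steps above.

```python
import urllib.error
import urllib.request


def namenode_ui_reachable(url, timeout=5):
    """Return True if the web interface at url answers with HTTP 200."""
    # The target is a local service, so any system proxy is bypassed.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Calling namenode_ui_reachable("http://localhost:9870") returns True only when the page loads. A False result usually means that the services have not finished starting or that the port is different.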
Wrapping Up
By following the instructions provided in this article, you should be able to
get Hadoop up and operating on your machine. Remember to get the most
recent stable version of Hadoop, install Java, configure Hadoop, format the
NameNode, and start Hadoop services. Finally, check the NameNode web
interface to ensure that Hadoop is properly installed.