NoSQL Database Management Overview

NoSQL is a class of database management systems designed for large volumes of unstructured and semi-structured data, offering flexible data models and horizontal scalability. It is classified into four main categories: document databases, key-value stores, column-family stores, and graph databases, each with unique features and use cases. While NoSQL provides advantages like high scalability and flexibility, it also has disadvantages such as a lack of standardization and limited ACID compliance.

Uploaded by

desika1636

UNIT II

NOSQL Data Management

• NoSQL is a type of database management
system (DBMS) that is designed to handle
and store large volumes of unstructured and
semi-structured data.
• Unlike traditional relational databases that
use tables with pre-defined schemas to store
data, NoSQL databases use flexible data
models that can adapt to changes in data
structures and are capable of scaling
horizontally to handle growing amounts of
data.
• Horizontal scaling (scaling out) involves adding
more machines to a system to handle
increased workload, while vertical scaling
(scaling up) involves increasing the resources
(CPU, RAM, etc.) of a single machine.
• Horizontal scaling offers greater flexibility and
fault tolerance, while vertical scaling can be
simpler to implement for some applications.
• NoSQL databases are generally classified into
four main categories:
Document databases
Key-value stores
Column-family stores
Graph databases
Document databases
• Store data in JSON, BSON or XML format.
• Data are stored as documents that can contain
varying attributes.
• Examples: MongoDB, Cloudant
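To make "documents that can contain varying attributes" concrete, here is a minimal Python sketch that treats each document as a dictionary and serializes it to JSON. The collection name, the field names, and the `find` helper are hypothetical and are not the API of MongoDB or any other product.

```python
import json

# A "collection" of documents; each document is a dict and may carry
# different attributes from its neighbours (there is no shared schema).
products = [
    {"_id": 1, "name": "Laptop", "price": 900, "specs": {"ram_gb": 16}},
    {"_id": 2, "name": "T-shirt", "price": 15, "sizes": ["S", "M", "L"]},
]

def find(collection, **criteria):
    """Return documents whose top-level fields match every criterion."""
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in criteria.items())]

# Documents round-trip through JSON, one of the formats named above.
as_json = json.dumps(products[0])
restored = json.loads(as_json)

matches = find(products, name="T-shirt")
```

Note that the second document has a `sizes` attribute the first one lacks, and nothing had to be declared in advance for that to work.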
Key-value stores
• Data is stored as key-value pairs, making retrieval
extremely fast.
• Optimized for caching and session storage.
• Examples: Redis, Memcached, Amazon
DynamoDB
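As an illustration of why key-value stores suit caching and session storage, the following toy Python store looks values up by key and lets entries expire. The class name, the `ttl` parameter, and the explicit `now` argument (used instead of a real clock so the behaviour is repeatable) are assumptions made for this sketch; it does not reproduce the interface of Redis or Memcached.

```python
class KeyValueStore:
    """Toy in-memory key-value store with optional expiry."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def put(self, key, value, ttl=None, now=0.0):
        expires_at = now + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)

    def get(self, key, now=0.0):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and now >= expires_at:
            del self._data[key]  # lazily evict the expired entry
            return None
        return value

store = KeyValueStore()
store.put("session:42", {"user": "alice"}, ttl=30, now=0.0)
```

Retrieval is a single dictionary lookup on the key, which is the property the slide refers to as fast retrieval.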
Column-family stores
• Data are stored in columns rather than rows,
enabling high-speed analytics and distributed
computations.
• Efficient for handling large-scale data with high
write/read demands.
• Examples: Apache Cassandra, HBase, Google
Bigtable
Graph databases
• Data are stored as nodes and edges, enabling
complex relationship management.
• Best suited for social networks, fraud detection,
and recommendation engines.
• Examples: Neo4j, Amazon Neptune, ArangoDB
Key Features of NoSQL:
• Dynamic schema
• Horizontal scalability
• Document-based
• Key-value-based
• Column-based
• Distributed and high availability - They are
designed to be highly available and to
automatically handle node failures and data
replication across multiple nodes in a database
cluster
• Flexibility - Allow developers to store and
retrieve data in a flexible and dynamic manner
• Performance
Advantages of NoSQL
• High scalability
• Flexibility
• High availability
• Performance
• Cost-effectiveness - deployed on commodity
hardware
• Agility - quickly and effectively extract value
and translate that value into actionable
insights and business outcomes.
Commodity hardware
• readily available, off-the-shelf computer
components that are widely used and
relatively inexpensive
Disadvantages of NoSQL
• Lack of standardization
• Lack of ACID compliance
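• Narrow focus – designed mainly for storage, with
limited transaction management (TM)
• Lack of support for complex queries
• Lack of maturity – reliability and security
features are less established
• Management challenge – maintaining a NoSQL
deployment can be difficult
• GUI tools are not readily available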
• Backup
• Large document size
When should NoSQL be used
• When a huge amount of data needs to be stored
and retrieved.
• The relationship between the data you store is
not that important
• The data changes over time and is not structured.
• Support of Constraints and Joins is not required
at the database level
• The data is growing continuously and you need to
scale the database regularly to handle the data.
Feature      | SQL (Relational DB)        | NoSQL (Non-Relational DB)
-------------+----------------------------+----------------------------------------
Data Model   | Structured, Tabular        | Flexible (Documents, Key-Value, Graphs)
Scalability  | Vertical Scaling           | Horizontal Scaling
Schema       | Predefined                 | Dynamic & Schema-less
ACID Support | Strong                     | Limited or Eventual Consistency
Best For     | Transactional applications | Big data, real-time analytics
Examples     | MySQL, PostgreSQL, Oracle  | MongoDB, Cassandra, Redis
Aggregate Data Models
• NoSQL data models fall into four categories
widely used in the NoSQL ecosystem: key-value,
document, column-family, and graph. Of these,
the first three share a common characteristic
of their data models, which we will call
aggregate orientation.
• An aggregate is a collection of objects that
are treated as a single unit.
• Aggregate is a term that comes from
Domain-Driven Design.
• Domain-Driven Design (DDD) is a software
development approach that emphasizes the
importance of understanding and modeling the
business domain.
• In Domain-Driven Design, an aggregate is a
collection of related objects that are treated
as a unit.
• Aggregate-oriented databases generally do not
support ACID transactions that span multiple
aggregates.
• With an aggregate data model (ADM), OLAP
operations can be performed easily.
Example: assume we are building an e-commerce website.
Data model for a relational database
An aggregate data model
• The diamond shows how data fit into the
aggregate structure
• Customer contains a list of billing addresses
• Payment also contains the billing address
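The e-commerce aggregates described above can be sketched as nested Python dictionaries. All field names and sample values here are hypothetical, chosen only to show the shape: the customer aggregate holds its billing addresses, and the order aggregate nests its line items and a payment that carries its own copy of the billing address.

```python
# Customer aggregate: the customer and its billing addresses travel together.
customer = {
    "id": 1,
    "name": "Martin",
    "billing_addresses": [{"city": "Chicago"}],
}

# Order aggregate: line items and payment are nested inside the order,
# and the payment carries its own copy of the billing address.
order = {
    "id": 99,
    "customer_id": 1,
    "items": [
        {"product_id": 27, "name": "NoSQL Distilled", "price": 32.45},
    ],
    "payment": {"card": "1000-1000", "billing_address": {"city": "Chicago"}},
}

def order_total(order_aggregate):
    """Everything needed is inside the aggregate, so no join is required."""
    return sum(item["price"] for item in order_aggregate["items"])

total = order_total(order)
```

The duplicated address is deliberate in this style of modeling: each aggregate can be stored and fetched as one unit, at the cost of keeping the copies consistent.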
• It provides fast performance and horizontal
scalability.
• Limited query capabilities.
• Doesn’t work well with relational data.
• As the volume of data grows, it becomes
difficult to keep keys unique.
Column-Family Stores
Graph Database
• A graph database (GDB) is a database that
uses graph structures for storing data.
• It uses nodes, edges, and properties.
• The edges represent relationships between
the nodes.
• The data is stored in the nodes of the graph,
and the relationships between the data are
represented by the edges between the
nodes.
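The node, edge, and property vocabulary above can be sketched in a few lines of Python. The names, the `FRIEND` relationship type, and the two helper functions are hypothetical; real graph databases such as Neo4j expose their own query languages rather than this interface.

```python
from collections import deque

# Nodes hold the data (properties); edges hold the relationships.
nodes = {
    "alice": {"label": "Person"},
    "bob": {"label": "Person"},
    "carol": {"label": "Person"},
}
edges = [
    ("alice", "FRIEND", "bob"),
    ("bob", "FRIEND", "carol"),
]

def neighbours(node, relation):
    """Follow outgoing edges of one relationship type from a node."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]

def reachable(start, relation):
    """Breadth-first traversal: every node reachable via the given relation."""
    seen, queue = [], deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current, relation):
            if nxt not in seen and nxt != start:
                seen.append(nxt)
                queue.append(nxt)
    return seen

friends_of_friends = reachable("alice", "FRIEND")
```

Traversals like `reachable` follow edges directly instead of joining tables, which is why relationship-heavy workloads are a natural fit for this model.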
Advantages
• It solves many-to-many relationship problems
• A good fit when the relationships between data
elements matter most
• Low latency with large-scale data
Disadvantages
• Graph databases may not be a better choice
than the other NoSQL variations.
• If the application needs to scale horizontally,
performance may suffer.
• Not very efficient when it needs to update all
nodes with a given parameter.
Materialized view
• A materialized view is a database object that
stores the results of a query as a physical
table.
• SELECT c.customer_id,
SUM(o.order_total) AS lifetime_value
FROM Customers c
JOIN Orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id;
• CREATE MATERIALIZED VIEW customer_lifetime_value AS
SELECT c.customer_id,
SUM(o.order_total) AS lifetime_value
FROM Customers c
JOIN Orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id;
• SELECT * FROM customer_lifetime_value;
• Advantages:
Improves performance
Increases the speed of queries
Efficient
• Disadvantages:
Not every database type supports them
Read only
Keys, constraints, and triggers cannot be created on them
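The idea of storing a query result as a physical table can be tried out with Python's built-in `sqlite3` module. SQLite has no `CREATE MATERIALIZED VIEW` statement, so this sketch emulates one with `CREATE TABLE ... AS SELECT` and a hypothetical `refresh_view` helper; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE Orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER, order_total REAL);
    INSERT INTO Customers VALUES (1), (2);
    INSERT INTO Orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 10.0);
""")

VIEW_QUERY = """
    SELECT c.customer_id, SUM(o.order_total) AS lifetime_value
    FROM Customers c
    JOIN Orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id
"""

def refresh_view(connection):
    """Emulate a materialized view: store the query result as a real table."""
    connection.execute("DROP TABLE IF EXISTS customer_lifetime_value")
    connection.execute("CREATE TABLE customer_lifetime_value AS " + VIEW_QUERY)

refresh_view(conn)
rows = conn.execute(
    "SELECT * FROM customer_lifetime_value ORDER BY customer_id").fetchall()

# The stored result is a snapshot: new orders are invisible until a refresh.
conn.execute("INSERT INTO Orders VALUES (13, 2, 5.0)")
stale = conn.execute(
    "SELECT lifetime_value FROM customer_lifetime_value WHERE customer_id = 2"
).fetchone()[0]
refresh_view(conn)
fresh = conn.execute(
    "SELECT lifetime_value FROM customer_lifetime_value WHERE customer_id = 2"
).fetchone()[0]
```

The `stale` and `fresh` values show the trade-off: reads are cheap because the join and aggregation were done ahead of time, but the stored result has to be refreshed when the base tables change.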
Distribution Models
• NoSQL: data is distributed over a large cluster
• Data distribution models – single server and
multiple servers
• Orthogonal data distribution techniques –
sharding and replication
• Advantages – handles larger quantities of data
and processes greater read and write traffic
• Disadvantages – cost and complexity
Single Server
• The first and the simplest distribution option
is — no distribution at all.
• Run the database on a single machine that
handles all the reads and writes to the data
store. We prefer this option because it
eliminates all the complexities.
• It’s easy for operations people to manage and
easy for application developers to reason about.
• When to use?
Sharding
Master-Slave Replication
Peer-to-Peer Replication
• There are two styles of distributing data:
Sharding –
Distributes different data across
multiple servers, so each server acts as
the single source for a subset of data.
Replication –
Copies data across multiple servers, so
each bit of data can be found in multiple
places.
• A system may use either or both techniques.
• Replication comes in two forms:
Master-slave replication –
Makes one node the authoritative copy that
handles writes while slaves synchronize with
the master and may handle reads.
Peer-to-peer replication –
Allows writes to any node; the nodes
coordinate to synchronize their copies of the
data.
• Master-slave replication reduces the chance
of update conflicts, but peer-to-peer
replication avoids loading all writes onto a
single point of failure.
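The two techniques can be combined, as the following Python sketch suggests. It is a toy under stated assumptions: shards are chosen with an MD5 hash modulo the shard count, replication to slaves is synchronous, and the class and function names (`MasterSlaveCluster`, `shard_for`, `put`, `get`) are hypothetical.

```python
import hashlib

def shard_for(key, shard_count):
    """Sharding: a stable hash of the key picks the one shard that owns it."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count

class MasterSlaveCluster:
    """Master-slave replication: writes go to the master, reads may use slaves."""

    def __init__(self, slave_count):
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]

    def write(self, key, value):
        self.master[key] = value
        for slave in self.slaves:  # synchronous copy, for simplicity
            slave[key] = value

    def read(self, key, replica=0):
        return self.slaves[replica].get(key)

# Sharding and replication combined: each shard is itself a replicated cluster.
shards = [MasterSlaveCluster(slave_count=2) for _ in range(3)]

def put(key, value):
    shards[shard_for(key, len(shards))].write(key, value)

def get(key, replica=0):
    return shards[shard_for(key, len(shards))].read(key, replica)

put("user:1", "alice")
put("user:2", "bob")
```

Each key lives on exactly one shard, and within that shard every slave holds a copy. A real system would also have to handle replication lag and master failover, which this sketch leaves out.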
CASSANDRA
• Column-oriented database
• Peer-to-peer architecture
• Distributed, high-performance, scalable,
fault-tolerant NoSQL database
• It was created at Facebook.
• Runs in the cloud as well as on on-premise systems
• Does not need a separate caching layer
• Reads and writes – tunable consistency level
• Deployed on commodity hardware
• Compression – Google's Snappy data compression
algorithm
• CQL (Cassandra Query Language)
Cassandra disadvantages
• Does not support ACID transactions
• High throughput of write operations
• Does not support transaction management (TM) or joins
• Data is distributed across all nodes; if there is
a failure, its effects are spread across all nodes
Cassandra Architecture
• Peer-to-peer architecture
• Does not have a single point of failure
• Nodes form a ring and communicate using the
gossip protocol
• Data is spread across nodes using a hash value
• Commit log
• Seeds & gossip
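A rough way to picture "data spread using hash value" on a ring is the Python sketch below. It is a simplification under stated assumptions: each node gets a single token derived from an MD5 hash of its name, and replicas are the next distinct nodes clockwise. The `Ring` class, node names, and key are hypothetical, and Cassandra's actual partitioners and token assignment differ in detail.

```python
import bisect
import hashlib

def token(value):
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class Ring:
    def __init__(self, node_names):
        # Each node owns one token; sorting the tokens forms the ring.
        self._tokens = sorted((token(name), name) for name in node_names)

    def replicas(self, partition_key, replication_factor):
        """Walk clockwise from the key's token, collecting distinct nodes."""
        positions = [t for t, _ in self._tokens]
        start = bisect.bisect_right(positions, token(partition_key))
        count = min(replication_factor, len(self._tokens))
        return [self._tokens[(start + i) % len(self._tokens)][1]
                for i in range(count)]

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas("emp:1001", replication_factor=3)
```

Because every node can compute the same placement from the key alone, no central coordinator is needed to find the data, which fits the peer-to-peer design described above.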
Rack and Data Center
Key Components
• Node
• Rack
• Data Center
• Cluster
• Commit log
• Memtable (In Memory Cache)
• SSTable(Sorted String Table)
• Bloom Filter - Probabilistic data structure
• CQL
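How the commit log, memtable, and SSTable components relate on the write path can be sketched as follows. This is a toy model, not Cassandra's implementation: the `ToyNode` class, the `memtable_limit` threshold, and the in-memory lists standing in for on-disk files are all assumptions made for illustration.

```python
class ToyNode:
    """Toy write path: commit log -> memtable -> SSTable (flush when full)."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in-memory, most recent values
        self.sstables = []     # immutable, key-sorted lists standing in for disk
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # durability first
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # An SSTable is written once, sorted by key, and never modified.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable wins
            for k, v in table:
                if k == key:
                    return v
        return None

node = ToyNode(memtable_limit=2)
node.write("b", 1)
node.write("a", 2)  # second write fills the memtable and triggers a flush
node.write("a", 3)  # newer value stays in the memtable
```

In this toy, `read` scans every SSTable; the Bloom filter listed above exists to avoid that, by cheaply ruling out SSTables that cannot contain the key.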
CASSANDRA DATA MODEL
• Query Driven Approach
• Fast read and write
• Tables – primary key, column family, names made
of letters and underscores (_)
• Columns – Define Data Structure within a
table
Column
• A column is the basic data structure of
Cassandra. It holds three values: a key (the
column name), a value, and a timestamp. It may
also carry a TTL (time to live).
The structure of a column is given below.
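One way to picture the column structure is as a small Python record. The `Column` dataclass, its helper methods, and the sample values are hypothetical, and for simplicity the timestamp and TTL are both treated as plain seconds here, which is an assumption of this sketch rather than Cassandra's storage format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Column:
    name: str                  # the key, i.e. the column name
    value: object
    timestamp: int             # write time, used to pick the newest value
    ttl: Optional[int] = None  # seconds to live; None means no expiry

    def is_expired(self, now):
        return self.ttl is not None and now >= self.timestamp + self.ttl

def latest(a, b):
    """When two writes target the same column, the newer timestamp wins."""
    return a if a.timestamp >= b.timestamp else b

old = Column("emp_city", "Chennai", timestamp=100)
new = Column("emp_city", "Delhi", timestamp=200, ttl=60)
```

The timestamp is what lets replicas agree on which write is current, and the TTL lets a value disappear on its own after a set period.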
Row
Keyspace
• Keyspace is the outermost container for data in
Cassandra.
• The basic attributes of a Keyspace in Cassandra are −
• Replication factor − It is the number of machines in
the cluster that will receive copies of the same data.
• Replica placement strategy − The strategy used to
place replicas in the ring. The available strategies
are simple strategy, old network topology strategy,
and network topology strategy.
• Column families − Keyspace is a container for a list of
one or more column families.
• The syntax for creating a keyspace is as follows:
CREATE KEYSPACE <keyspace_name> WITH
replication = {'class': 'SimpleStrategy',
'replication_factor' : 3};
Column Family
• A column family is a container for an ordered
collection of rows. Each row, in turn, is an
ordered collection of columns.
• A schema in a relational model is fixed. Once we
define certain columns for a table, while
inserting data, in every row all the columns must
be filled at least with a null value.
• In Cassandra, although the column families are
defined, the columns are not. You can freely add
any column to any column family at any time.
A Cassandra column family has the following
attributes −
• keys_cache − Holds column family location
keys (200,000)
• rows_cache − Caches data or sets of columns
that are used frequently and are considered
hot data.
SuperColumn
• A super column is a special column; it is
therefore also a key-value pair. Super columns
are not supported in CQL.
CQL Data Definition Commands
• CREATE KEYSPACE − Creates a KeySpace in Cassandra.
• USE − Connects to a created KeySpace.
• ALTER KEYSPACE − Changes the properties of a KeySpace.
• DROP KEYSPACE − Removes a KeySpace.
• CREATE TABLE − Creates a table in a KeySpace.
• ALTER TABLE − Modifies the column properties of a table.
• DROP TABLE − Removes a table.
• TRUNCATE − Removes all the data from a table.
• CREATE INDEX − Defines a new index on a single column of
a table.
• DROP INDEX − Deletes a named index.
CQL Data Manipulation Commands

• INSERT − Adds columns for a row in a table.
• UPDATE − Updates a column of a row.
• DELETE − Deletes data from a table.
• BATCH − Executes multiple DML statements
at once.
CQL Clauses
• SELECT − This clause reads data from a table.
• WHERE − The WHERE clause is used along with
SELECT to read specific data.
• ORDER BY − The ORDER BY clause is used along
with SELECT to read specific data in a specific
order.
• CREATE (TABLE | COLUMNFAMILY)
<tablename> ('<column-definition>' ,
'<column-definition>') (WITH <option> AND
<option>)
• ALTER (TABLE | COLUMNFAMILY)
<tablename> <instruction>
• Using the ALTER command, you can perform the
following operations −
Add a column
Drop a column
• cqlsh> USE ks;
• cqlsh:ks> CREATE TABLE emp(
emp_id int PRIMARY KEY,
emp_name text,
emp_city text,
emp_sal varint,
emp_phone varint );
cqlsh:ks> select * from emp;
emp_id | emp_city | emp_name | emp_phone | emp_sal
--------+----------+----------+-----------+---------
(0 rows)
