NoSQL and MongoDB
SQL Vs NoSQL
NoSQL
MongoDB
Basic CRUD operations
Schema Validation
Designing Schema
SQL vs NoSQL
Aspect            SQL                        NoSQL
Storage           Tables                     Documents, columns, key-value pairs
Schema            Structured schema          Schema-less, dynamic
Relationships     Foreign keys               Embedded documents, references
Security          Built in, mature           Varies by product
Transactions      ACID                       BASE
Data validation   Yes (enforced by schema)   Optional (off by default)
Scalability       Vertical                   Horizontal
Usage             ERP, e-commerce            Big data, real-time applications
NoSQL Definition
NoSQL is a type of database management system
(DBMS) that is designed to handle and store large
volumes of unstructured and semi-structured data.
Typical characteristics of NoSQL systems:
non-relational,
distributed,
open-source,
horizontally scalable,
schema-free,
easy replication support,
simple API,
eventually consistent / BASE (not ACID),
able to handle huge amounts of data, and more.
BASE: Basically Available, Soft State, Eventually Consistent
BASE is a model of database transactions designed to prioritize
availability, performance and partition tolerance over strong
consistency, often used in distributed systems like NoSQL
databases.
Characteristics
Weak consistency – stale data OK,
Availability first,
Best effort,
Approximate answers OK,
Aggressive (optimistic),
Simpler and faster
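The "stale data OK" and "availability first" points can be sketched with a toy model. This is only an illustration under simplifying assumptions (one primary, one replica, replication triggered by hand); the class name ToyReplicaSet and its methods are made up for this example and are not part of any database API.

```javascript
// Toy model of eventual consistency: writes land on the primary at once,
// the replica only catches up when replicate() runs.
class ToyReplicaSet {
  constructor() {
    this.primary = new Map(); // accepts writes immediately
    this.replica = new Map(); // serves reads, may lag behind
    this.pending = [];        // writes not yet copied to the replica
  }

  write(key, value) {
    this.primary.set(key, value);
    this.pending.push([key, value]);
  }

  readFromReplica(key) {
    // Always answers (availability first), even if the answer is stale.
    return this.replica.get(key);
  }

  replicate() {
    for (const [key, value] of this.pending) {
      this.replica.set(key, value);
    }
    this.pending = [];
  }
}

const store = new ToyReplicaSet();
store.write("stock", 10);
const staleRead = store.readFromReplica("stock"); // replica has not caught up yet
store.replicate();
const freshRead = store.readFromReplica("stock"); // replica now agrees with the primary
```

Between write() and replicate() the two copies disagree; once replication has run they converge, which is the "eventually consistent" part of BASE.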
NoSQL Database Types
Column Store – Each storage block contains data from only one column
Document Store – Stores documents made up of tagged elements
Key-Value Store – Hash table of keys
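As a rough illustration of the three types above, the same two records can be laid out in each shape using plain JavaScript structures. The sample values and the "user:" key prefix are invented for this sketch; real products differ in many details.

```javascript
// Two sample records (made-up values).
const rows = [
  { id: 1, name: "Jiya", city: "Nashik" },
  { id: 2, name: "Ramesh", city: "Pune" },
];

// Key-value store: a hash table from a key to an opaque value.
const keyValue = new Map(
  rows.map((r) => [`user:${r.id}`, JSON.stringify(r)])
);

// Document store: each record is a document made of tagged (named) fields.
const documents = rows.map((r) => ({ _id: r.id, name: r.name, city: r.city }));

// Column store: all values of one column are kept together.
const columns = {
  id: rows.map((r) => r.id),
  name: rows.map((r) => r.name),
  city: rows.map((r) => r.city),
};
```

Reading one whole record is a single lookup in the key-value and document shapes, while reading one attribute across all records touches a single array in the column shape.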
MongoDB
What is MongoDB ?
• Scalable, high-performance, open-source,
document-oriented database.
• Built for Speed
• Rich Document based queries for Easy readability.
• Full Index Support for High Performance.
• Replication and Failover for High Availability.
• Auto Sharding for Easy Scalability.
• Map / Reduce for Aggregation (deprecated since MongoDB 5.0 in favor of the aggregation pipeline).
Why use MongoDB?
• SQL was invented in the 70’s to store data.
• MongoDB stores documents (or) objects.
• Nowadays, everyone works with objects
(Python/Ruby/Java/etc.)
• And we need Databases to persist our objects.
Then why not store objects directly ?
• Embedded documents and arrays reduce the need for
joins and multi-document transactions. Both remain
available ($lookup, and transactions since version 4.0).
What is MongoDB great for?
• RDBMS replacement for Web Applications.
• Semi-structured Content Management.
• Real-time Analytics & High-Speed Logging.
• Caching and High Scalability
Web 2.0, Media, SaaS, Gaming
HealthCare, Finance, Telecom, Government
Not great for?
• Highly Transactional Applications.
• Problems requiring SQL.
Advantages of MongoDB
Schema-less: the number of fields, content, and size of a
document can differ from one document to another.
No complex joins
Data is stored in JSON-style documents
Index on any attribute
Replication and high availability
MongoDB Terminologies for RDBMS Concepts
RDBMS          MongoDB
Database       Database
Table, View    Collection
Row            Document (JSON, BSON)
Column         Field
Index          Index
Join           Embedded Document
Foreign Key    Reference
Partition      Shard
JSON
“JavaScript Object Notation”
Easy for humans to write/read, easy for
computers to parse/generate
Objects can be nested
Built on
• name/value pairs
• Ordered list of values
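A minimal sketch of these points, using the standard JSON.stringify and JSON.parse functions: a nested object with name/value pairs and an ordered list is turned into text and parsed back. The field names and values are sample data chosen for illustration.

```javascript
// An object built from name/value pairs, a nested object, and an ordered list.
const office = {
  city: "Nashik",
  pin: 423201,
  postman: { name: "Ramesh Jadhav" }, // nested object
  routes: ["north", "south"],         // ordered list of values
};

// Generate JSON text, then parse it back into an object.
const text = JSON.stringify(office);
const parsed = JSON.parse(text);
```

After the round trip, parsed has the same structure as office, including the nested object and the order of the list.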
BSON
“Binary JSON”
Binary-encoded serialization of JSON-like docs
Embedded structure reduces need for joins
Goals
• Lightweight
• Traversable
• Efficient (decoding and encoding)
BSON Example
{
  "_id": "37010",
  "City": "Nashik",
  "Pin": 423201,
  "state": "MH",
  "Postman": {
    "name": "Ramesh Jadhav",
    "address": "Panchavati"
  }
}
Data Types of MongoDB
Integer, Double, String, Boolean, Date, Null, Arrays, Object ID, Binary data
Data Types
String: The most commonly used data type. Strings in MongoDB must be
valid UTF-8.
Integer: Stores a numerical value. Integers can be 32-bit or 64-bit
depending on your server.
Boolean: Stores a boolean (true/false) value.
Double: Stores floating-point values.
Min/Max keys: Used to compare a value against the lowest and highest
BSON elements.
Arrays: Stores arrays, lists, or multiple values under one key.
Timestamp: Stores a timestamp. This can be handy for recording when a
document has been modified or added.
Object: Used for embedded documents.
Data Types (continued)
Null: Stores a null value.
Symbol: Used identically to a string; however, it is generally
reserved for languages that use a specific symbol type.
Date: Stores the current date or time in UNIX time format. You can
specify your own date and time by creating a Date object and
passing the day, month, and year into it.
Object ID: Stores the document’s ID.
Binary data: Stores binary data.
Code: Stores JavaScript code in a document.
Regular expression: Stores a regular expression.
Basic Database Operations - Database
use <database name>
• Switches to the database named in the command
db
• Shows the currently selected database
show dbs
• Displays the list of databases
db.dropDatabase()
• Drops the current database
Basic Database Operations - Collection
db.createCollection(name)
• Creates a collection. Example: db.createCollection("Stud")
show collections
• Lists the names of all collections in the current database
db.<collection>.insertOne({key: value})
• MongoDB creates the collection automatically when you insert a
document. Example: db.Stud.insertOne({Name: "Jiya"})
db.<collection>.drop()
• Drops a collection from the database. Example: db.Stud.drop()
CRUD Operations
Insert
Find
Update
Delete
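The four operations can be sketched as one small function. This is a hedged example, not the course's own code: crudDemo is a hypothetical helper, the document values are sample data, and the code assumes mongosh, where insertOne, findOne, updateOne, and deleteOne return their results directly (with the Node.js driver the same calls return promises and need await).

```javascript
// Sketch: Insert, Find, Update, Delete on one collection.
// `coll` is assumed to be a collection object such as db.students in mongosh.
function crudDemo(coll) {
  // Insert: add one document (the collection is created on first insert).
  coll.insertOne({ name: "Jiya", year: 2024 });

  // Find: fetch the first document matching the filter.
  const found = coll.findOne({ name: "Jiya" });

  // Update: change one field with the $set operator.
  coll.updateOne({ name: "Jiya" }, { $set: { year: 2025 } });
  const updated = coll.findOne({ name: "Jiya" });

  // Delete: remove the document again.
  coll.deleteOne({ name: "Jiya" });
  const afterDelete = coll.findOne({ name: "Jiya" });

  return { found, updated, afterDelete };
}
```

In mongosh this could be tried with crudDemo(db.students); afterDelete is expected to be null because findOne returns null when nothing matches.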
Schema validation
MongoDB can perform schema validation during
updates and insertions. Existing documents do not
undergo validation checks until modification.
db.createCollection(<name>, {
  validator: <document>,
  validationLevel: <string>,
  validationAction: <string>
})
validator: specifies validation rules or expressions for
the collection
validationLevel: determines how strictly MongoDB
applies validation rules to existing documents during
an update
strict, the default, applies to all changes to any document
of the collection
moderate applies only to existing documents that
already fulfill the validation criteria, or to inserts
validationAction: determines whether MongoDB should
raise an error and reject documents that violate the
validation rules (error),
or warn about the violations in the log but allow invalid
documents (warn)
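A possible concrete use of these three options is sketched below. The collection name, the rule on year, and the helper createValidated are assumptions made for this example; "moderate" and "warn" are chosen only to show non-default values.

```javascript
// Options for a collection whose documents should have year >= 2000.
const studentOptions = {
  validator: { year: { $gte: 2000 } }, // rule written with a query operator
  validationLevel: "moderate",         // skip existing documents that are already invalid
  validationAction: "warn",            // log violations instead of rejecting
};

// `db` is assumed to be the current database object, as in mongosh.
function createValidated(db, name) {
  return db.createCollection(name, studentOptions);
}
```

In mongosh this would be called as createValidated(db, "students"). With validationAction set to "error" instead, an insert that breaks the rule would be rejected.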
JSON Schema Validation
Starting in version 3.6, MongoDB supports JSON Schema validation:
a form of schema validation based on the JSON Schema standard (draft 4),
with MongoDB-specific extensions such as bsonType.
db.createCollection("students", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "name", "year" ],
      properties: {
        name: {
          bsonType: "string",
          description: "must be a string and is required"
        },
        year: {
          bsonType: "int",
          minimum: 2000,
          maximum: 2099,
          description: "must be an integer in [2000, 2099] and is required"
        }
      }
    }
  }
})
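To get a feel for which documents such a schema would accept, the rules can be approximated in plain JavaScript. This is only a rough stand-in written for illustration: matchesStudentRules is a hypothetical helper, it is not how MongoDB evaluates $jsonSchema, and Number.isInteger only approximates the bsonType "int" check.

```javascript
// Plain-JavaScript approximation of the "students" rules (not MongoDB's validator).
function matchesStudentRules(doc) {
  const nameOk = typeof doc.name === "string";
  const yearOk =
    Number.isInteger(doc.year) && doc.year >= 2000 && doc.year <= 2099;
  return nameOk && yearOk;
}

// Sample documents (made-up values).
const samples = [
  { name: "Jiya", year: 2024 },   // satisfies every rule
  { name: "Jiya", year: 1999 },   // year below the minimum
  { year: 2024 },                 // required field name is missing
  { name: "Jiya", year: "2024" }, // year is a string, not an integer
];

const results = samples.map(matchesStudentRules);
```

Only the first sample satisfies all the rules; extra fields would still be allowed, because the schema does not forbid additional properties.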
Designing Schema
Data can be stored using:
Embedded documents
Embedded documents store related data directly within the same
document. This denormalized approach keeps all related data
together in a single record.
References
References store related data in separate documents, linking
them through an identifier. This normalized approach is useful for
more complex relationships or when data grows independently.
Hybrid
Combine both approaches by embedding frequently accessed or
summary data and referencing the rest.
Designing Schema: Embedded
Use embedded documents in these cases:
One-to-One or One-to-Few Relationships: When the related data is
small and tightly coupled with the parent.
Data Is Frequently Accessed Together: If queries often retrieve both
the parent and child data, embedding avoids additional lookups.
Low Update Frequency: When the embedded data doesn’t change
independently of the parent.
Example: Blog with Comments
If each blog post has comments that are tightly coupled, embedding makes sense:
{
  _id: ObjectId("123"), // placeholder; a real ObjectId is 24 hex characters
  title: "Introduction to MongoDB",
  content: "MongoDB is a NoSQL database...",
  comments: [
    {
      commenter: "John Doe",
      message: "Great post!",
      postedAt: ISODate("2024-11-21T10:00:00Z")
    },
    {
      commenter: "Jane Smith",
      message: "Very informative, thanks!",
      postedAt: ISODate("2024-11-21T11:00:00Z")
    }
  ]
}
Designing Schema: References
Use references in these cases:
One-to-Many or Many-to-Many Relationships: When the related data is
large or needs independent management.
Data Access Patterns Vary: If child data is often queried
independently of the parent.
Frequent Updates: When the related data is updated more often than
the parent.
Example: Orders and Customers
A customers collection and an orders collection can reference each other:
Customers Collection:
{
  _id: ObjectId("customer123"),
  name: "Alice",
  email: "alice@[Link]",
  phone: "123-456-7890"
}
Orders Collection:
{
  _id: ObjectId("order456"),
  // Reference to customers collection
  customerId: ObjectId("customer123"),
  orderDate: ISODate("2024-11-20T15:00:00Z"),
  total: 250.00
}
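Following a reference means a second lookup keyed by the stored identifier. The sketch below imitates that with in-memory arrays instead of real collections; plain strings stand in for ObjectId values, and the second order and both helper functions are invented for the example.

```javascript
// In-memory stand-ins for the two collections.
const customers = [{ _id: "customer123", name: "Alice" }];
const orders = [
  { _id: "order456", customerId: "customer123", total: 250.0 },
  { _id: "order789", customerId: "customer123", total: 40.0 }, // hypothetical second order
];

// Resolve the reference stored on an order: one extra lookup by identifier.
function customerForOrder(order) {
  return customers.find((c) => c._id === order.customerId) || null;
}

// Query the child data independently of the parent.
function ordersForCustomer(customerId) {
  return orders.filter((o) => o.customerId === customerId);
}

const owner = customerForOrder(orders[0]);
const aliceOrders = ordersForCustomer("customer123");
```

The customer is stored once no matter how many orders point to it, so updating the customer touches a single document; the price is the extra lookup on reads.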
Designing Schema: Hybrid
It’s possible to combine both approaches by embedding frequently accessed
or summary data and referencing the rest:
High Read Performance for Common Data: Embed frequently accessed
fields for speed while referencing less commonly used data.
Handling Both Scalability and Performance: Allows flexible design
for large-scale systems.
Example: Social Media User Profiles
Users Collection:
{
  _id: ObjectId("user123"),
  name: "John Doe",
  email: "john@[Link]",
  posts: [
    {
      postId: ObjectId("post123"),
      title: "My First Post",
      preview: "This is a short summary of my first post..."
    },
    {
      postId: ObjectId("post456"),
      title: "A Day in the Life",
      preview: "Sharing some thoughts about today..."
    }
  ]
}
Posts Collection (Referenced for Full Content):
{
  _id: ObjectId("post123"),
  title: "My First Post",
  content: "This is the full content of the first post...",
  createdAt: ISODate("2024-11-21T10:00:00Z")
}
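The read path of this hybrid layout can be sketched in memory: a profile page is built from the embedded previews alone, and the full post is fetched by its identifier only when needed. Plain strings stand in for ObjectId values, and the Map and both helper functions are assumptions made for this illustration.

```javascript
// User document with embedded summaries of its posts.
const user = {
  _id: "user123",
  name: "John Doe",
  posts: [
    { postId: "post123", title: "My First Post", preview: "This is a short summary of my first post..." },
    { postId: "post456", title: "A Day in the Life", preview: "Sharing some thoughts about today..." },
  ],
};

// Stand-in for the posts collection, keyed by _id.
const postsById = new Map([
  ["post123", { _id: "post123", title: "My First Post", content: "This is the full content of the first post..." }],
]);

// Common case: everything needed is already embedded in the user document.
function listPreviews(u) {
  return u.posts.map((p) => `${p.title}: ${p.preview}`);
}

// Less common case: follow the reference to load the full content.
function loadFullPost(postId) {
  return postsById.get(postId) || null;
}

const previews = listPreviews(user);
const fullPost = loadFullPost(user.posts[0].postId);
```

The trade-off is that title and preview are stored twice, so a change to a post's title has to be written to both places.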