Network Technologies Course Overview
Course Description:
Course Outcomes:
Student will be able to
CO1: Understand the basic concepts of Computer Network, and principle of layering
(Understand)
CO2: Apply the error detection and correction techniques used in data transmission
(Apply)
CO3: Apply IP addressing schemes and subnetting (Apply)
CO4: Understand the concept of routing protocols, Application layer protocols and
Network Security (Understand)
CO5: Apply the socket programming basics to create a simple chat application
(Apply)
Course Structure:
1. Write the client and server programs for establishing and terminating a connection
between client and server using TCP. Assume the server can handle only one
client.
2. Write the client and server programs for simple data (hello) transfer between
client and server using UDP. Client will send hello server message to the server
program. In its reply the server will send hello client message. The server and
client programs should reside on different computers in a network.
3. Write the client and server programs for connectionless communication
between two different computers in the same TCP/IP network. The server
process receives a byte from the client process and should send back an
acknowledgement to the client process.
4. Write program for implementing the sliding window protocol of window size 5.
5. Write the client and server program for implementing the broadcasting in the
local network.
Course References:
Recommended Books:
Text Books:
1. [Link]
2. [Link]
ml
Recommended Certifications:
Data refers to the raw facts that are collected while information refers to
processed data that enables us to take decisions.
Ex. When result of a particular test is declared it contains data of all students, when
you find the marks you have scored you have the information that lets you know
whether you have passed or failed.
The word data refers to any information which is presented in a form that is agreed
upon and accepted by its creators and users.
1.3 DATA COMMUNICATION
1. Delivery: The data should be delivered to the correct destination and correct user.
2. Accuracy: The communication system should deliver the data accurately,
without introducing any errors. The data may get corrupted during transmission
affecting the accuracy of the delivered data.
3. Timeliness: Audio and video data have to be delivered in a timely manner without
significant delay; such data delivery is called real-time transmission of data.
4. Jitter: It is the variation in the packet arrival time. Uneven Jitter may affect the
timeliness of data being transmitted.
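As an illustration of the jitter characteristic above, the short sketch below measures how uneven packet arrivals are. Jitter can be quantified in several ways; taking the spread between the largest and smallest gap is only one simple choice, and the function name and the sample arrival times are made up for this example.

```python
def interarrival_jitter(arrival_times_ms):
    """Spread (max - min) of the gaps between consecutive packet arrivals."""
    gaps = [later - earlier
            for earlier, later in zip(arrival_times_ms, arrival_times_ms[1:])]
    return max(gaps) - min(gaps)


# Hypothetical arrival times in milliseconds.
steady = [0, 20, 40, 60, 80]   # every gap is 20 ms
uneven = [0, 20, 55, 60, 95]   # gaps are 20, 35, 5 and 35 ms

print(interarrival_jitter(steady))  # 0
print(interarrival_jitter(uneven))  # 30
```

A stream with zero spread arrives perfectly evenly; the larger the spread, the harder it is to play audio or video smoothly.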
1. Message
Message is the information to be communicated by the sender to the receiver.
2. Sender
The sender is any device that is capable of sending the data (message).
3. Receiver
The receiver is a device that the sender wants to communicate the data (message).
4. Transmission Medium
It is the path by which the message travels from sender to receiver.
It can be wired or wireless and many subtypes in both.
5. Protocol
It is an agreed upon set of rules used by the sender and receiver to
communicate data.
A protocol is a set of rules that governs data communication.
A protocol is a necessity in data communications; without it the
communicating entities are like two persons trying to talk to each other in
different languages without knowing the other's language.
1. Text
Text includes combination of alphabets in small case as well as upper case.
It is stored as a pattern of bits. Prevalent encoding system : ASCII, Unicode
2. Numbers
Numbers include combination of digits from 0 to 9.
It is stored as a pattern of bits. Prevalent encoding system : ASCII, Unicode
The pixels are represented in the form of bits. Depending upon the type of image
(black n white or color) each pixel would require different number of bits to
represent the value of a pixel.
The size of an image depends upon the number of pixels (also called resolution)
and the bit pattern used to indicate the value of each pixel.
Example: if an image is purely black and white (two color) each pixel can be
represented by a value either 0 or 1, so an image made up of 10 x 10 pixel
elements would require only 100 bits in memory to be stored.
On the other hand, an image that includes gray may require 2 bits to represent
every pixel value (00 - black, 01 – dark gray, 10 – light gray, 11 – white). So the
same 10 x 10 pixel image would now require 200 bits of memory to be stored.
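The arithmetic in the two image examples above can be sketched as a small calculation. The function name and parameters are illustrative; the sketch assumes every pixel uses the smallest whole number of bits that can distinguish the given number of colour levels.

```python
def image_size_bits(width, height, levels):
    """Bits needed to store an uncompressed image with `levels` distinct pixel values."""
    # (levels - 1).bit_length() is the number of bits needed to count from 0 to levels - 1.
    bits_per_pixel = max(1, (levels - 1).bit_length())
    return width * height * bits_per_pixel


print(image_size_bits(10, 10, 2))  # black and white: 100 bits
print(image_size_bits(10, 10, 4))  # four gray levels: 200 bits
```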
4. Audio
Data can also be in the form of sound which can be recorded and broadcasted.
Example: What we hear on the radio is a source of data or information.
Audio data is continuous, not discrete.
5. Video
Video refers to the broadcasting of data in the form of pictures or movies.
Two devices communicate with each other by sending and receiving data. The data
can flow between the two devices in the following ways.
1. Simplex
2. Half Duplex
3. Full Duplex
1.5.1 Simplex
In half duplex both the stations can transmit as well as receive but not at the same
time.
When one device is sending, the other can only receive, and vice versa (as shown in
figure above.) Example: A walkie-talkie.
In Full duplex mode, both stations can transmit and receive at the same time.
Example: mobile phones
Definition:
A computer network can be defined as a collection of nodes. A node can be
any device capable of transmitting or receiving data.
The communicating nodes have to be connected by communication links.
A computer network should ensure reliability of the data
communication process, security of the data, and good
performance by achieving higher throughput and smaller
delay times.
1.7 PROTOCOL
When the sender sends a message it may consist of text, number, images, etc.
which are converted into bits and grouped into blocks to be transmitted and
often certain additional information called control information is also added
to help the receiver interpret the data.
For successful communication to occur, the sender and receiver must agree
upon certain rules called protocol.
B. Semantics
It tells the meaning of each section of bits and indicates the
interpretation of each section.
It also tells what action/decision is to be taken based on the
interpretation.
C. Timing
It tells the sender about the readiness of the receiver to receive the data
It tells the sender at what rate the data should be sent to the receiver to
avoid overwhelming the receiver.
1. De facto Standard
o These are the standards that have been traditionally used; de facto
means by fact or by convention.
o These standards are not approved by any organized body but are
adopted by widespread use.
o Examples of Forums
1. ATM Forum
2. MPLS Forum
3. Frame Relay Forum
iii. In general, every task or job can be done by dividing it into sub task or
layers. Consider the example of sending a letter where the sender is in
City A and receiver is in city B.
vi. At the sender site, the activities take place in the following descending
order:
a. Higher Layer: The sender writes the letter, puts it in an envelope with the
sender's and receiver's addresses, and drops it in the mailbox.
b. Middle Layer: The letter is picked up by the post man and delivered
to the post office
c. Lower Layer: The letters at the post office are sorted and are ready to
be transported through a carrier.
vii. During transit the letter may be carried by truck, plane or ship or a
combination of transport modes before it reaches the destination post
office.
viii. At the Receiver site, the activities take place in the following ascending
order:
a. Lower Layer: The carrier delivers the letter to the destination post
office
b. Middle Layer: After sorting, the letter is delivered to the receivers
mail box
c. Higher Layer: The receiver picks up the letter, opens the envelope and
reads it.
ix. Hierarchy of layers: The activities in the entire task are organized into
three layers. Each activity at the sender or receiver side occurs in a
particular order at the hierarchy.
x. The important and complex activities are organized into the Higher
Layer and the simpler ones into middle and lower layer.
The OSI model has 7 layers each with its own dedicated task.
A message sent from Device A to Device B has to pass through all
layers at A from top to bottom then all layers at B from bottom to top as
shown in the figure below.
At Device A, the message is sent from the top layer, i.e. the Application Layer
of A, down through all the layers until it reaches the physical layer, and then
it is transmitted through the transmission medium.
The Data Link layer determines the next node where the message is
supposed to be forwarded and the network layer determines the final
recipient.
4.3.3 Communication & Interfaces
For communication to occur, each layer in the sending device adds its own
information to the message it receives from the layer just above it and
passes the whole package to the layer just below it. Each layer in the
receiving device removes the information added at the corresponding
layer and sends the obtained data to the layer above it.
Every Layer has its own dedicated function or services and is different
from the function of the other layers.
On every sending device, each layer calls upon the service offered by the
layer below it.
On every receiving device, each layer calls upon the service offered by the
layer above it.
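The add-and-remove behaviour described above can be sketched in a few lines. This is a toy model only: the bracketed tags stand in for real headers, the layer list stops at the data link layer because the physical layer only converts bits to signals, and trailers are ignored. All names are illustrative.

```python
LAYERS = ["application", "presentation", "session",
          "transport", "network", "data link"]


def encapsulate(message):
    """Sender side: each layer, top to bottom, adds its own header."""
    unit = message
    for layer in LAYERS:
        unit = f"[{layer}]" + unit
    return unit


def decapsulate(unit):
    """Receiver side: each layer, bottom to top, removes the matching header."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not unit.startswith(header):
            raise ValueError(f"missing {layer} header")
        unit = unit[len(header):]
    return unit


wire = encapsulate("hello")
print(wire)               # the data link header ends up outermost
print(decapsulate(wire))  # hello
```

Because the lowest layer adds its header last, that header is the first thing the receiver sees, which is why the receiving layers work in the opposite order.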
II. On the sender side, the physical layer receives the data from Data Link
Layer and encodes it into signals to be transmitted onto the medium. On
the receiver side, the physical layer receives the signals from the
transmission medium decodes it back into data and sends it to the Data
Link Layer as shown in the figure below:
III. Interface
The Physical Layer defines the characteristics of interfaces between the
devices & transmission medium.
IV. Representation of bits
The physical layer is concerned with transmission of signals from one
device to another which involves converting data (1's & 0's) into signals
and vice versa. It is not concerned with the meaning or interpretation of
bits.
V. Data rate
The physical layer defines the data transmission rate i.e. number of bits
sent per second. It is the responsibility of the physical layer to maintain
the defined data rate.
II. On the sender side, the Data Link layer receives the data from
Network Layer and divides the stream of bits into fixed size
manageable units called as Frames and sends it to the physical
layer. On the receiver side, the data link layer receives the stream
of bits from the physical layer and regroups them into frames and
sends them to the Network layer. This process is called Framing.
It is shown in the figure below:
Fig: Data Link Layer: The process of Framing
b. The data link layer imposes flow control mechanism over the
sender and receiver to avoid overwhelming of the receiver.
V. Error control
a. The data link layer imposes error control mechanism to identify
lost or damaged frames, duplicate frames and then retransmit
them.
b. Error control information is present in the trailer of a frame.
VI. Access Control
a. The data link layer imposes access control mechanism to
determine which device has the right to send data in a multipoint
connection scenario.
VII. Main Responsibility
i. The main responsibility of the data link layer is hop to hop
transmission of frames.
II. The network layer at the sending side accepts data from the transport
layer, divides it into packets, adds addressing information in the header
and passes it to the data link layer. At the receiving end the network layer
receives the frames sent by data link layer, converts them back into
packets, verifies the logical (IP) address (verifies if the receiver address
matches with its own address) and then sends the packets to the transport
layer.
At every hop the network layer of the intermediate node checks the IP
address in the header. If its own IP address does not match the
IP address of the receiver found in the header, the intermediate node
concludes that it is not the final node but an intermediate node and
passes the packet to the data link layer where the data is forwarded to
the next node.
V. Routing
VI. The network layer divides data into units of equal size called
packets, each bearing a sequence number for rearranging on the
receiving end.
Each packet is independent of the other and may travel using different
routes to reach the receiver hence may arrive out of turn at the
receiver.
VI. The Network layer does not perform any flow control or error control
II. At the sending side, the transport layer receives data from the
session layer, divides it into units called segments and sends it to
the network layer. At the receiving side, the transport layer
receives packets from the network layer, converts and arranges
into proper sequence of segments and sends it to the session layer.
Fig: Transport Layer
VI. Flow Control & Error control: the transport layer also carries
out flow control and error control functions; but unlike data link
layer these are end to end rather than node to node.
II. The presentation layer at sending side receives the data from the
application layer adds header which contains information related
to encryption and compression and sends it to the session layer. At
the receiving side, the presentation layer receives data from the
session layer decompresses and decrypts the data as required and
translates it back as per the encoding scheme used at the receiver.
Fig : Presentation Layer
III. Translation
The sending and receiving devices may run on different platforms
(hardware, software and operating system). Hence it is important that they
understand the messages that are used for communicating. Hence a
translation service may be required, which is provided by the Presentation
layer.
IV. Compression
Compression ensures faster data transfer. The data compressed at sender has to be
decompressed at the receiving end, both performed by the Presentation layer.
V. Encryption
It is the process of transforming the original message to change its
meaning before sending it. The reverse process called decryption has to
be performed at the receiving end to recover the original message from
the encrypted message.
4.3.5.7 Application Layer
I. The application layer enables the user to communicate its data to
the receiver by providing certain services. For ex. Email is sent
using X.400 service.
Fig : Application Layer
III. X.400 is a service that provides the basis for mail storage and
forwarding.
V. Main Responsibility
Main Responsibility of Application layer is to provide access to
network resources.
Topic 3 : Link Layer Communication
Data Link Layer is second layer of OSI Layered Model. This layer is one of the most
complicated layers and has complex functionalities and liabilities. Data link layer hides the
details of underlying hardware and represents itself to upper layer as the medium to
communicate.
Data link layer works between two hosts which are directly connected in some sense. This
direct connection could be point to point or broadcast. Systems on broadcast network are said
to be on same link. The work of data link layer tends to get more complex when it is dealing
with multiple hosts on single collision domain.
Data link layer is responsible for converting data stream to signals bit by bit and to send that
over the underlying hardware. At the receiving end, Data link layer picks up data from
hardware which are in the form of electrical signals, assembles them in a recognizable frame
format, and hands over to upper layer.
Data link layer has two sub-layers:
Logical Link Control: It deals with protocols, flow-control, and error control
Media Access Control: It deals with actual control of media
1) Single-Bit Error
Only 1 bit of a given data is changed
→ from 1 to 0 or
→ from 0 to 1 ( Figure 10.1a ).
2) Burst Error
Two or more bits of a given data unit are changed
→ from 1 to 0 or
→ from 0 to 1 (Figure 10.1b).
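The two error types can be told apart by comparing the sent and received bits position by position. The sketch below is a minimal illustration; it assumes the usual textbook convention that a change in two or more bits counts as a burst error, and the function names and bit strings are made up for the example.

```python
def error_pattern(sent, received):
    """XOR two equal-length bit strings; a 1 marks a corrupted position."""
    if len(sent) != len(received):
        raise ValueError("bit strings must have the same length")
    return "".join("1" if s != r else "0" for s, r in zip(sent, received))


def classify_error(sent, received):
    """Label the difference as no error, single-bit error or burst error."""
    flipped = error_pattern(sent, received).count("1")
    if flipped == 0:
        return "no error"
    return "single-bit error" if flipped == 1 else "burst error"


print(error_pattern("00000010", "00001010"))   # 00001000
print(classify_error("00000010", "00001010"))  # single-bit error
print(classify_error("10110010", "10001010"))  # burst error
```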
3.1.2 Redundancy
• The central concept in detecting/correcting errors is redundancy.
• Some extra bits have to be sent along with the data to detect/correct errors. These extra bits are
called redundant-bits.
• The redundant-bits are
→ added by the sender and
→ removed by the receiver.
• The presence of redundant-bits allows the receiver to detect/correct errors.
3.1.4 Coding
• Redundancy is achieved through various coding-schemes.
1) Sender adds redundant-bits to the data-bits. This process creates a relationship between
→ redundant-bits and
→ data-bits.
2) Receiver checks the relationship between redundant-bits & data-bits to
detect/correct errors.
• Two important factors to be considered:
1) Ratio of redundant-bits to the data-bits and
2) Robustness of the process.
• Two broad categories of coding schemes:
1) Block-coding and
2) Convolution coding.
2) At Receiver
i) a) If the received code-word is the same as one of the valid code-words, the
code-word is accepted; the corresponding data-word is extracted for use.
b) If the received code-word is invalid, the code-word is discarded.
ii) However, if the code-word is corrupted but the received code-word still
matches a valid code-word, the error remains undetected.
• An error-detecting code can detect only the types of errors for which it is designed; other types
of errors may remain undetected.
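The receiver behaviour described above can be tried out with a very small block code. The sketch below assumes a simple even-parity code that maps 2-bit data-words to 3-bit code-words; this particular code and the function names are chosen for illustration and are not taken from these notes.

```python
def encode(dataword):
    """Append one even-parity bit to the data-word."""
    parity = str(dataword.count("1") % 2)
    return dataword + parity


# Table of valid code-words and the data-words they carry.
VALID = {encode(d): d for d in ("00", "01", "10", "11")}


def receive(codeword):
    """Return the data-word for a valid code-word, or None if it is discarded."""
    return VALID.get(codeword)


print(encode("01"))    # 011
print(receive("011"))  # 01   (accepted)
print(receive("111"))  # None (one corrupted bit: invalid, so discarded)
print(receive("000"))  # 00   (011 with two corrupted bits: error goes undetected)
```

The last call shows the limitation stated above: this code is designed to catch an odd number of bit errors, so two flipped bits turn one valid code-word into another valid one.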
In the physical layer, data transmission involves synchronised transmission of bits from the
source to the destination. The data link layer packs these bits into frames.
Data-link layer takes the packets from the Network Layer and encapsulates them into frames.
If the frame size becomes too large, then the packet may be divided into small sized frames.
Smaller sized frames make flow control and error control more efficient.
Then, it sends each frame bit-by-bit on the hardware. At receiver’s end, data link layer picks
up signals from hardware and assembles them into frames.
Parts of a Frame
A frame has the following parts −
Frame Header − It contains the source and the destination addresses of the frame.
Payload field − It contains the message to be delivered.
Trailer − It contains the error detection and error correction bits.
Flag − It marks the beginning and end of the frame.
Types of Framing
Framing can be of two types, fixed sized framing and variable sized framing.
Fixed-sized Framing
Here the size of the frame is fixed and so the frame length acts as delimiter of the frame.
Consequently, it does not require additional boundary bits to identify the start and end of the
frame.
Example − ATM cells.
Variable – Sized Framing
Here, the size of each frame to be transmitted may be different. So additional mechanisms are
kept to mark the end of one frame and the beginning of the next frame.
It is used in local area networks.
Two ways to define frame delimiters in variable sized framing are −
Length Field − Here, a length field is used that determines the size of the frame. It is
used in Ethernet (IEEE 802.3).
End Delimiter − Here, a pattern is used as a delimiter to determine the size of frame.
It is used in Token Rings. If the pattern occurs in the message, then two approaches are
used to avoid the situation −
o Byte – Stuffing − A byte is stuffed in the message to differentiate from the
delimiter. This is also called character-oriented framing.
o Bit – Stuffing − A pattern of bits of arbitrary length is stuffed in the message to
differentiate from the delimiter. This is also called bit – oriented framing.
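The two stuffing approaches above can be sketched as follows. The concrete values are assumptions made for illustration: a flag byte of 0x7E, an escape byte of 0x7D, and the HDLC-style rule of inserting a 0 after five consecutive 1s so that the flag pattern 01111110 never appears in the payload. Real protocols differ in detail.

```python
FLAG = 0x7E  # assumed frame delimiter
ESC = 0x7D   # assumed escape byte


def byte_stuff(payload):
    """Escape every FLAG/ESC byte in the payload, then add flags at both ends."""
    out = bytearray()
    for b in payload:
        if b in (FLAG, ESC):
            out.append(ESC)
        out.append(b)
    return bytes([FLAG]) + bytes(out) + bytes([FLAG])


def byte_unstuff(frame):
    """Drop the two flags and remove the escape bytes."""
    out = bytearray()
    escaped = False
    for b in frame[1:-1]:
        if b == ESC and not escaped:
            escaped = True  # the next byte is data, whatever its value
            continue
        out.append(b)
        escaped = False
    return bytes(out)


def bit_stuff(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            out.append("0")
            run = 0
    return "".join(out)


def bit_unstuff(bits):
    """Remove the 0 that follows every run of five consecutive 1s."""
    out, run, skip = [], 0, False
    for bit in bits:
        if skip:  # this is the stuffed 0
            skip, run = False, 0
            continue
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            skip = True
    return "".join(out)


print(byte_stuff(b"A\x7eB").hex())  # 7e417d7e427e
print(bit_stuff("01111110"))        # 011111010
```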
The networking industry has used HDLC to derive several other standards used today, such as
frame relay protocols like the ISDN protocol stack known as Link Access Procedure Balanced
(LAPB). It is also the basis for the framing mechanism that utilizes Point-to-Point Protocol
(PPP) on synchronous lines used to connect multiple servers to a WAN (wide area network)
internet link, as well as Cisco HDLC framing techniques that add protocol fields to the HDLC
header.
In Normal Response Mode, a primary station -- usually at the mainframe computer -- sends
data to secondary stations that may be local or may be at remote locations on dedicated leased
lines in what is called a multidrop or multipoint network. (This is not the network we usually
think of; it's a nonpublic closed network. In this arrangement, communication is usually half-
duplex.)
Variations of HDLC are also used for the public networks that use the X.25 communications
protocol and for frame relay, a protocol used in both LANs and WANs -- public and private.
In the X.25 version of HDLC, the data frame contains a packet. (An X.25 network is one in
which packets of data are moved to their destination along routes determined by network
conditions as perceived by routers and reassembled in the right order at the ultimate
destination.)
P2P Protocol:
What is a Peer-to-Peer Network?
It's a network in which the computers are managed independently of one another and have
equal rights for initiating communication with each other, sharing resources, and validating
users.
Peer-to-peer networks are simple to set up and are often ideal for small businesses that have
fewer than 10 computers and that cannot afford a server-based solution. The disadvantages of
peer-to-peer networks are poor security and lack of centralized file storage and backup
facilities.
Peer-to-peer networks generally implement some form of virtual overlay network on top of
the physical network topology, where the nodes in the overlay form a subset of the nodes in
the physical network. Data is still exchanged directly over the underlying TCP/IP network,
but at the application layer peers are able to communicate with each other directly, via the
logical overlay links (each of which corresponds to a path through the underlying physical
network). Overlays are used for indexing and peer discovery and make the P2P system
independent from the physical network topology. Based on how the nodes are linked to each
other within the overlay network, and how resources are indexed and located, we can classify
networks as unstructured or structured (or as a hybrid between the two).
Unstructured networks
Unstructured peer-to-peer networks do not impose a particular structure on the overlay
network by design, but rather are formed by nodes that randomly form connections to each
other. (Gnutella, Gossip, and Kazaa are examples of unstructured P2P protocols).
Because there is no structure globally imposed upon them, unstructured networks are easy to
build and allow for localized optimizations to different regions of the overlay. Also, because
the role of all peers in the network is the same, unstructured networks are highly robust in the
face of high rates of “churn” – that is, when large numbers of peers are frequently joining and
leaving the network.
Structured networks
In structured peer-to-peer networks, the overlay is organized into a specific topology, and the
protocol ensures that any node can efficiently search the network for a file/resource, even if
the resource is extremely rare.
The most common type of structured P2P networks implements a distributed hash table
(DHT), in which a variant of consistent hashing is used to assign ownership of each file to a
particular peer. This enables peers to search for resources on the network using a hash table:
that is, (key, value) pairs are stored in the DHT, and any participating node can efficiently
retrieve the value associated with a given key.
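The idea of assigning each key to a peer through hashing can be sketched in a single process. This is a simplified model only: a real DHT spreads the table over the network and handles peers joining and leaving. The class name, peer names and the choice of SHA-1 truncated to 32 bits are assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a peer name or a key to a point on a 32-bit hash ring."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest()[:4], "big")


class TinyDHT:
    """Toy consistent-hashing table; all 'peers' live in this one object."""

    def __init__(self, peers):
        self.ring = sorted((ring_position(p), p) for p in peers)
        self.store = {p: {} for p in peers}

    def owner(self, key):
        """The first peer at or after the key's position, wrapping around the ring."""
        positions = [pos for pos, _ in self.ring]
        index = bisect.bisect_left(positions, ring_position(key)) % len(self.ring)
        return self.ring[index][1]

    def put(self, key, value):
        self.store[self.owner(key)][key] = value

    def get(self, key):
        return self.store[self.owner(key)].get(key)


dht = TinyDHT(["peer-a", "peer-b", "peer-c"])
dht.put("song.mp3", "held by 10.0.0.7")
print(dht.owner("song.mp3"))  # the peer responsible for this key
print(dht.get("song.mp3"))    # held by 10.0.0.7
```

Any node that knows the hash function can work out which peer owns a key without asking a central server, which is what makes lookups efficient even for rare resources.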
Topic 4: IP Addressing
What is an IP Address?
An IP (Internet Protocol) address is a numerical label assigned to the devices connected to a
computer network that uses the IP for communication.
An IP address acts as an identifier for a specific machine on a particular network. It also helps
to establish a virtual connection between a source and a destination. The IP address is also
called an IP number or internet address. The Internet Protocol specifies the technical format of the
addressing and packet scheme. Most networks combine TCP with IP.
An IPv4 address consists of four numbers, each containing one to three digits, with a
single dot (.) separating each number or set of digits.
Prefix: The prefix part of IP address identifies the physical network to which the
computer is attached. Prefix is also known as a network address.
Suffix: The suffix part identifies the individual computer on the network. The suffix is
also called the host address.
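The prefix/suffix split can be explored with Python's standard ipaddress module. The sketch below is illustrative: the function name is made up, the sample addresses are arbitrary private addresses, and the prefix length is supplied by the caller because the address alone does not say where the split lies.

```python
import ipaddress


def split_address(address, prefix_len):
    """Return (network address, host number) for an IPv4 address and prefix length."""
    iface = ipaddress.ip_interface(f"{address}/{prefix_len}")
    network = iface.network.network_address
    # Keep only the bits that fall outside the network prefix.
    host_number = int(iface.ip) & int(iface.network.hostmask)
    return str(network), host_number


print(split_address("192.168.1.10", 24))  # ('192.168.1.0', 10)
print(split_address("172.16.5.9", 16))    # ('172.16.0.0', 1289)
```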
IP Address Classes:
Class  Address range  Subnet masking  Example IP  Leading bits  Max number of networks  Application
A      1 to 126       [Link]          [Link]      8             128                     Used for large number of hosts.
B      128 to 191     [Link]          [Link]      16            16384                   Used for medium size network.
C      192 to 223     [Link]          192.1.11.   24            2097152                 Used for local area network.
D      224 to 239     NA              NA          NA            NA                      Reserved for multicasting.
E      240 to 254     NA              NA          NA            NA                      Reserved for research and development purposes.
The address of your area is a group address of all houses that belong to a specific area. The
house address is the unique address of your homes in that area. Here, your area is represented
by a PIN code number.
In this example, the network address comprises all hosts which belong to a specific network.
The host address is the unique address of a particular host in that network.
This addressing method divides the IP address space into five separate classes based on the
first four address bits.
Here, classes A, B and C offer addresses for networks of three distinct network sizes. Class D is
only used for multicast, and class E is reserved exclusively for experimental purposes.
An example of a Class A address is [Link]. Here, “102” helps you identify the
network and 168.212.226 identifies the host.
Class A addresses [Link] to [Link] cannot be used and are reserved for loopback
and diagnostic functions.
Class B Network
In a Class B IP address, the binary address starts with 10. In this class, the first octet is a
decimal number between 128 and 191. The number 127, which lies between the Class A and
Class B ranges, is reserved for loopback, which is used for internal testing on the local
machine. The first 16 bits (known as two octets) help you identify the network. The remaining
16 bits indicate the host within the network.
Class C Network
Class C is a type of IP address that is used for small networks. In this class, three octets are
used to identify the network. The first octet ranges between 192 and 223.
In this type of network addressing method, the first two bits are set to 1, and the third bit is
set to 0. The first 24 bits of the address form the network address and the remaining 8 bits
form the host address. Most local area networks use Class C IP addresses to connect with the
network.
Class D Network
Class D addresses are only used for multicasting applications. Class D is never used for
regular networking operations. In this class, the first three bits are set to “1” and the
fourth bit is set to “0”. Class D addresses are 32-bit network addresses. All the values
within the range are used to identify multicast groups uniquely.
Therefore, there is no requirement to extract the host address from the IP address, so Class D
does not have any subnet mask.
Class E Network
Class E IP addresses are defined by setting the first four address bits to 1, which
covers addresses from [Link] to [Link]. However, Class E
is reserved, and its usage is never defined. Therefore, many network implementations
discard these addresses as undefined or illegal.
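The leading-bit rules for the five classes can be checked with a short function. This is a minimal sketch of classful addressing only; the function name and sample addresses are illustrative, and reserved ranges such as 127 (loopback) are simply reported by the class their leading bits fall into.

```python
def address_class(address):
    """Return the classful category (A to E) of a dotted-decimal IPv4 address."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError("first octet must be between 0 and 255")
    if first_octet >> 7 == 0b0:      # leading bit  0
        return "A"
    if first_octet >> 6 == 0b10:     # leading bits 10
        return "B"
    if first_octet >> 5 == 0b110:    # leading bits 110
        return "C"
    if first_octet >> 4 == 0b1110:   # leading bits 1110
        return "D"
    return "E"                       # leading bits 1111


for sample in ("10.1.2.3", "172.16.0.1", "192.168.1.1", "224.0.0.5", "240.0.0.1"):
    print(sample, address_class(sample))
```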
The network ID cannot start with 127 because 127 belongs to class A address and is
reserved for internal loopback functions.
A network ID with all bits set to 1 is reserved for use as an IP broadcast address and
cannot be used.
A network ID with all bits set to 0 is used to denote a particular host on the
local network and should not be routed.
An IPv4 datagram consists of a header and a data field. The first 20 bytes in the header are
mandatory for all IPv4 datagrams. The Options field following the 20 bytes has a variable
length.
Following are various components/fields of IP packet header
Version: The first IP header field is a 4-bit version indicator. In IPv4, the value of its
four bits is set to 0100, which indicates 4 in binary. However, if the router does not
support the specified version, this packet will be dropped.
Internet Header Length: Internet header length, shortly known as IHL, is 4 bits in
size. It is also called HLEN (Header Length). This IP component is used to show
how many 32-bit words are present in the header.
Type of Service: This 8-bit field is now also known as Differentiated Services; its
upper six bits carry the Differentiated Services Code Point (DSCP). It provides
features related to the quality of service for data streaming or VoIP calls. In the
original definition the first 3 bits are the priority bits. The field specifies how
the datagram should be handled.
Total length: The total length is measured in bytes. The minimum size of an IP
datagram is 20 bytes and the maximum is 65535 bytes. HLEN and Total
length can be used to calculate the size of the payload. All hosts are required to
be able to accept 576-byte datagrams. However, if a datagram is too large for a
network along the path, it is fragmented.
Identification: Identification is a 16-bit field that is used to identify the fragments
of an IP datagram uniquely. Some have recommended using this field for other things
like adding information for packet tracing, etc.
IP Flags: Flag is a three-bit field that helps you to control and identify fragments. The
following can be their possible configuration:
Fragment Offset: Fragment Offset represents the number of data bytes ahead of the
particular fragment in the specific datagram. It is a 13-bit field specified in units of
8 bytes, which gives a maximum offset of 65,528 bytes.
Time to live: It is an 8-bit field that indicates the maximum time the datagram may
live in the internet system. The field was originally defined in seconds, but in
practice every router that processes a datagram decreases its TTL by one, so it acts
as a hop count. When the value of TTL reaches zero, the datagram is discarded. TTL is
used so that undeliverable datagrams are discarded automatically instead of
circulating indefinitely. The value of TTL can be 0 to 255.
Protocol: This 8-bit field denotes which protocol is carried in the
data portion of the datagram. For example, the number 6 indicates
TCP, and 17 denotes the UDP protocol.
Header Checksum: The next component is a 16-bit header checksum field, which is
used to check the header for any errors. The receiver recomputes the checksum over
the header and compares it with the value in this field. When they do not match, the
packet is discarded.
Source Address: The source address is a 32-bit address of the source used for the
IPv4 packet.
Destination address: The destination address is also 32 bits in size and stores the
address of the receiver.
IP Options: It is an optional field of IPv4 header used when the value of IHL
(Internet Header Length) is set to greater than 5. It contains values and settings related
with security, record route and time stamp, etc. You can see that list of options
component ends with an End of Options or EOL in most cases.
Data: This field stores the data from the protocol layer, which has handed over the
data to the IP layer.
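The fixed 20-byte part of the header described above can be unpacked with Python's struct module. The sketch below is illustrative rather than a complete parser: it ignores options, the function names are made up, and the sample header is built by hand from documentation addresses (192.0.2.1 and 198.51.100.7) instead of being captured from a network.

```python
import socket
import struct


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold any carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(packet):
    """Decode the mandatory 20 bytes of an IPv4 header into a dictionary."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    ihl = ver_ihl & 0x0F  # header length in 32-bit words
    return {
        "version": ver_ihl >> 4,
        "ihl_words": ihl,
        "total_length": total_len,
        "payload_length": total_len - ihl * 4,
        "flags": flags_frag >> 13,
        "fragment_offset_bytes": (flags_frag & 0x1FFF) * 8,
        "ttl": ttl,
        "protocol": proto,
        # A header with a correct checksum sums to zero when checked as a whole.
        "checksum_ok": internet_checksum(packet[:ihl * 4]) == 0,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


def build_sample_header():
    """Hand-made header: version 4, IHL 5, total length 40, TTL 64, protocol 6 (TCP)."""
    fields = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                         socket.inet_aton("192.0.2.1"),
                         socket.inet_aton("198.51.100.7"))
    checksum = internet_checksum(fields)
    return fields[:10] + struct.pack("!H", checksum) + fields[12:]


info = parse_ipv4_header(build_sample_header() + bytes(20))
print(info["version"], info["ihl_words"], info["payload_length"], info["checksum_ok"])
```

The payload length is obtained exactly as the Total length entry describes: total length minus four times the IHL value.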
Addressing
We already know that in any network transmission model, any two devices shall start
communication by the virtue of their unique address. In simple words, we can say that
if any two devices want to engage in any kind of dialogue (communication), then they
should first identify themselves in this complex network transmission arena. So first
they should know each other’s respective addresses (of source and destination) .
In the internet employing TCP/IP protocol, we have four levels of addresses being in
use for different layers. Now let us see the different addressing methods in detail.
Before we dig deep into each type of addressing, we should first understand the respective
mapping (implementation) of each address in each of the layers of TCP/IP model:
In other words
The IP address and the physical address are necessary for a quantity of data to travel from a
source to the destination host. However, arrival at the destination host is not the final objective
of data communications on the Internet. Computers are devices that can run multiple processes
at the same time. The end objective of Internet communication is a process communicating
with another process. For example, computer A can communicate with computer C by using
TELNET. At the same time, computer A communicates with computer B by using the File
Transfer Protocol (FTP). For these processes to receive data simultaneously, we need a method
to label the different processes. In other words, they need addresses. In the TCP/IP architecture,
the label assigned to a process is called a port address. A port address in TCP/IP is 16 bits in
length and is written as a single decimal number in the range 0 to 65535.
4. Specific Addresses:
Some applications use simple, easy-to-use addresses of their own. Examples of specific
addresses are e-mail addresses (e.g., electronicscrunch@[Link]) and Uniform Resource
Locators (URLs) (e.g., [Link]).
These addresses are designed for a specific application. However, the sending computer
translates them into the corresponding logical and port addresses.
What is Subnetting?
Subnetting is the practice of dividing a network into two or more smaller networks, called
subnets. Every device on the Internet needs a unique IP address; dividing an IP network into
subnets makes these addresses easier to manage and helps control network traffic.
A subnet is a smaller network inside a larger network.
IP Subnetting designates high-order bits from the host as part of the network prefix. This
method divides a network into smaller subnets.
It also helps you to reduce the size of the routing tables, which is stored in routers. This method
also helps you to extend the existing IP address base & restructures the IP address.
A subnet mask is a 32-bit number that separates an IP address into a network address and a
host address. It does so with bit arithmetic: the IP address is bitwise ANDed with the subnet
mask to reveal the underlying subnetwork. Like the IP address, a subnet mask is written using
"dotted-decimal" notation.
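The bitwise AND step can be sketched in Python with the standard `ipaddress` module. This is only an illustration: the function name `network_address` and the sample address 192.0.2.130 with mask 255.255.255.192 are assumptions chosen for the example.

```python
import ipaddress


def network_address(ip: str, mask: str) -> str:
    """AND an IPv4 address with its subnet mask to get the network address."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


# 130 is 10000010 and 192 is 11000000 in binary, so the AND keeps 10000000 (128).
print(network_address("192.0.2.130", "255.255.255.192"))  # 192.0.2.128
```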
Subnet masks are used to design subnetworks, or subnets, that connect local networks. They
determine both the number and the size of subnets, where the size of a subnet is the number of
hosts that can be addressed.
In simplistic terms, you can create a subnet mask by taking the 32-bit value of an existing IP
address, choosing how many subnets you want to create or alternatively, how many nodes
you need on each subnet, and then setting all subsequent network bits to "1" and host bits to
"0". The resulting 32-bit value is your subnet mask.
A subnet mask also pinpoints the endpoints of the range of IP addresses for a subnet. In any
given subnet, two addresses are always reserved for special purposes. The address with all host
bits set to 0 is the network address (network identification), and the address with all host
bits set to 1 is the broadcast address; with a 24-bit mask these are the ".0" and ".255"
addresses. These cannot be assigned to a host.
A subnet mask defines the range of IP addresses that can be used within a network or subnet.
It also separates an IP address into two parts: network bits and host bits.
Subnet masks are used when subnetting, which is when you break a network up into smaller
networks. By adjusting the subnet mask, you can set the number of available IP addresses
within a network.
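As a rough sketch of how the mask fixes the number of available addresses: a mask with h host bits leaves 2^h addresses, of which two (network and broadcast) are reserved. The helper name `usable_hosts` is hypothetical, and the code assumes a conventional contiguous mask.

```python
import ipaddress


def usable_hosts(mask: str) -> int:
    """Number of assignable host addresses for a contiguous IPv4 subnet mask."""
    prefix_len = bin(int(ipaddress.IPv4Address(mask))).count("1")
    host_bits = 32 - prefix_len
    # Subtract the network and broadcast addresses; never report a negative count.
    return max(2 ** host_bits - 2, 0)


print(usable_hosts("255.255.255.0"))    # 254
print(usable_hosts("255.255.255.192"))  # 62
```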
For example, a common subnet mask for simple home networks is [Link]. This
subnet mask allows up to 254 usable IP addresses within the home network. In other words,
up to 254 computers, phones, and other internet connected devices can connect to your
router/network and access the internet.
Subnet masks break an IP address up into network bits and host bits. When a device sees the
network and host bits of another device's IP address, it can figure out if the other device is
part of the same network (home, business, etc.), or is somewhere else online.
The default Subnet Mask is the number of bits which is reserved by the address class.
Using this default mask will accommodate a single network subnet in the relative
class.
A Custom Subnet Mask can be defined by an administrator to accommodate many
networks.
To separate network addresses from host addresses, IPv4 uses an additional component with
IP addresses. This component is known as a subnet mask. In other words, in an IP address,
how many bits are used in the network address and how many bits are left for the host address
is determined by the subnet mask. The subnet mask is also 32 bits in length and uses the same
notation that is used by the IP address.
The subnet mask assigns an individual bit for each bit of the IP address. If an IP bit belongs to
the network portion, the subnet mask will turn on the assigned bit. If an IP bit belongs to the
host portion, the subnet mask will turn off the assigned bit.
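In binary notation, 1 (one) represents an ON bit while 0 (zero) represents an OFF bit. In
dotted-decimal notation, an octet of 255 means that all eight bits are ON and an octet of 0
means that all eight bits are OFF; a value in between (such as 128, 192 or 224) means that only
the leading bits of that octet are ON.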
An IP address is always used with the subnet mask. Without the subnet mask, an IP address is
considered an ambiguous address.
Example: the first octet of an IP address is 11000001. Here the 1st and 2nd bits are 1, and the
3rd bit is 0; hence, it is a class C address.
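The leading-bit test used in this example can be written as a short Python function. This is a sketch of the classful rule only (leading bits 0 for class A, 10 for B, 110 for C, 1110 for D, 1111 for E); the name `address_class` is an assumption for illustration.

```python
def address_class(first_octet: int) -> str:
    """Classify an IPv4 address by the leading bits of its first octet."""
    if not 0 <= first_octet <= 255:
        raise ValueError("an octet must be between 0 and 255")
    if first_octet & 0b10000000 == 0:
        return "A"  # leading bit 0
    if first_octet & 0b11000000 == 0b10000000:
        return "B"  # leading bits 10
    if first_octet & 0b11100000 == 0b11000000:
        return "C"  # leading bits 110
    if first_octet & 0b11110000 == 0b11100000:
        return "D"  # leading bits 1110 (multicast)
    return "E"      # leading bits 1111 (reserved)


print(address_class(0b11000001))  # C
```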
Above example shows how IP addresses should be deconstructed, which makes it simple for
Internet routers to find the right Network to route data into. However, in a Class A network
there could be millions of connected devices, and it could take some time for the router to find
the right device.
Network Addressing
o Network Addressing is one of the major responsibilities of the network layer.
o Network addresses are always logical, i.e., software-based addresses.
o A host is also known as end system that has one link to the network. The boundary
between the host and link is known as an interface. Therefore, the host can have only
one interface.
o A router is different from the host in that it has two or more links that connect to it.
When a router forwards the datagram, then it forwards the packet to one of the links.
The boundary between the router and link is known as an interface, and the router can
have multiple interfaces, one for each of its links. Each interface is capable of sending
and receiving the IP packets, so IP requires each interface to have an address.
o Each IP address is 32 bits long, and they are represented in the form of "dot-decimal
notation" where each byte is written in the decimal form, and they are separated by the
period. An IP address would look like [Link] where 193 represents the decimal
notation of first 8 bits of an address, 32 represents the decimal notation of second 8 bits
of an address.
o In the above figure, a router has three interfaces labeled as 1, 2 & 3 and each router
interface contains its own IP address.
o Each host contains its own interface and IP address.
o All the interfaces attached to LAN 1 have an IP address in the form of
[Link], and the interfaces attached to LAN 2 and LAN 3 have an IP address in
the form of [Link] and [Link] respectively.
o Each IP address consists of two parts. The first part (first three bytes in IP address)
specifies the network and second part (last byte of an IP address) specifies the host in
the network.
o Just as a host address provides a unique identity to the interface in a subnet, a network
address provides a unique identity to the subnet in the network. A network address is
the common address of all interfaces that belong to a specific subnet.
o Let's take an example to understand how network addresses work.
o In a network, four subnets are connected. Network addresses of these subnets are 1.1.1,
2.2.2, 3.3.3, and 4.4.4. Each subnet contains 6 PCs. Host addresses of PC1, PC2, PC3,
PC4, PC5, and PC6 are .1, .2, .3, .4, .5, and .6, respectively.
o In IP addresses, network addresses are always written before host addresses. If we write
the network address before the host address of a PC, we will get the IP address of that
PC. The following image shows this process in our example network.
Hosts or PCs of different subnets cannot communicate or exchange data directly. To connect
different subnets, routers are used. Routers are networking devices that connect different
subnets or networks. Routers store the network addresses of all available subnets in their
routing tables.
If a computer wants to send a data packet to a computer that belongs to another subnet, it sends
the data packet to the gateway router. A gateway router is the router that connects the subnet
to other subnets of the network. The gateway router forwards the data packet to the router that
is connected to the destination subnet or knows how to reach the destination subnet. To forward
data packets, routers use only network addresses.
Let's understand it through our example. Our example network is divided into four subnets. To
connect these subnets, four routers: R1, R2, R3, and R4 are used. R1, R2, R3, and R4 are
connected to the first subnet ([Link]/8), second subnet ([Link]/8), third subnet ([Link]/8), and
the fourth subnet ([Link]/8), respectively.
Now suppose, PC1 of the first subnet sends a data packet to PC6 of the fourth subnet. PC1 sets
destination IP address in the packet to [Link]/8. In this IP address, 4.4.4 is the network address
and .6 is the host address. The packet reaches R1. R1 checks its routing table and forwards the
packet to R2. R2 follows the same procedure and forwards the packet to R3. R3 forwards the
packet to R4 and R4 forwards the packet to the local network of the fourth subnet. The local
network of the fourth subnet uses the host address of the packet to find the PC6.
When your computer first connects to a Local Area Network (LAN), it does not have an IP
address. It has to connect to the Dynamic Host Configuration Protocol (DHCP) server to get
an IP address. To do so, your computer has to perform a broadcast to a special Broadcast IP
address [Link] which essentially means every machine on the LAN will receive
your request for an IP address. The DHCP server will then respond with an IP address to be
assigned to your machine.
Broadcast routing
To calculate the directed broadcast address, set every bit in the host part of the IP address
to "1".
Example:
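As a worked illustration of this rule in Python: the host bits are exactly the bits that are 0 in the subnet mask, so ORing the address with the inverted mask sets them all to 1. The function name `directed_broadcast` and the sample address and mask are assumptions for this sketch.

```python
import ipaddress


def directed_broadcast(ip: str, mask: str) -> str:
    """Set every host bit of an IPv4 address to 1 to get its broadcast address."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    host_bits = ~mask_int & 0xFFFFFFFF  # 1s wherever the mask has 0s
    return str(ipaddress.IPv4Address(ip_int | host_bits))


# 130 is 10000010; with six host bits set to 1 it becomes 10111111 (191).
print(directed_broadcast("192.0.2.130", "255.255.255.192"))  # 192.0.2.191
```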
Benefits of Broadcast
Broadcast allows the machines on a network to auto discover services being offered by various
machines on the network. The example mentioned above about DHCP is vital as it reduces the
workload of network administrators since they don’t have to manually configure all computers
with a static IP address. Using broadcast, computers can also locate any network devices like
printers and scanners without knowing their IP addresses. Without Broadcast IP, you will find
daily life to be very tedious as everything needs to be manually configured.
Loopback Address
A loopback address belongs to a distinct reserved IP address range that starts at [Link] and
ends at [Link], though [Link] is the broadcast address for [Link]/8. The
loopback addresses are built into the IP protocol stack, enabling a device to transmit data
packets to itself and receive them. The loopback address [Link] is generally known as localhost.
The TCP/IP stack of the operating system manages all the loopback addresses. It lets a
TCP/IP server and a TCP/IP client run on the same system. These loopback addresses are always
accessible so that the user can use them anytime for troubleshooting TCP/IP.
Whenever a protocol or program sends any data from a computer with any loopback IP
address, that traffic is processed by a TCP/IP protocol stack within itself, i.e., without
transmitting it to the network. That is, if a user is pinging a loopback address, they’ll get the
reply from the same TCP/IP stack running on their computer. So, all the data transmitted to
any of the loopback addresses as the destination address will not pop up on the network.
[Link] is the most commonly used loopback address; generally, [Link] and localhost
are functionally equivalent, i.e., the loopback address [Link] and the hostname localhost are
internally mapped to each other. Other loopback addresses are also accessible and can be used.
TCP vs UDP
What is the TCP?
TCP stands for Transmission Control Protocol. It is used when the communication between
two computers must be reliable. For example, when we view
a web page, we expect that nothing is missing from the page, and when we download
a file, we require the complete file, i.e., nothing should be missing, whether it is text
or an image. This is possible because of TCP. It is one of the most widely used
protocols in TCP/IP networks.
Features of TCP
o Data delivery
TCP ensures that the data is received correctly, in order, and with nothing missing.
If TCP is not used, the data can be received incorrectly or out of order. For
example, if we try to view a web page or download a file without using TCP, then
some data or images could be missing.
o Protocol
TCP is a connection-oriented protocol. Connection-oriented means
that the computers first establish a connection and then
communicate. This is done by using a three-way handshake. In a three-way
handshake, the sender first sends a SYN message to the receiver, then the receiver
sends back a SYN ACK message to confirm that the message has been received. After
receiving the SYN ACK message, the sender sends an acknowledgment (ACK) message to
the receiver. In this way, the connection is established between the computers. Once
the connection is established, the data is delivered. TCP guarantees
data delivery, which means that if the data is not received, TCP resends it.
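The connection-oriented behaviour can be tried out with Python's standard `socket` module. In this minimal sketch, client and server both run on the loopback address of one machine, and the operating system performs the three-way handshake inside `create_connection()` and `accept()`. The function name `tcp_echo_once` and the single-client echo behaviour are assumptions made for illustration.

```python
import socket
import threading


def tcp_echo_once(message: bytes) -> bytes:
    """Start a one-client TCP echo server on loopback and send it one message."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _addr = server.accept()  # completes the handshake on the server side
        with conn:
            conn.sendall(conn.recv(1024))  # echo the data back to the client

    worker = threading.Thread(target=serve)
    worker.start()
    # create_connection() triggers SYN, SYN ACK, ACK before any data is sent.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(message)
        reply = client.recv(1024)
    worker.join()
    server.close()
    return reply


print(tcp_echo_once(b"hello server"))
```

Closing the sockets (here through the `with` blocks) is what terminates the connection.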
What is UDP?
UDP stands for User Datagram Protocol. Like TCP, it is
used for sending and receiving messages. The main difference is that UDP is a
connectionless protocol. Here, connectionless means that no connection is established prior to
communication. UDP also does not guarantee the delivery of data packets. It sends the data and
does not check whether the data has been received at the receiver's end, so it is also known as
the "fire-and-forget" protocol. UDP is faster than TCP as it does not
provide assurance for the delivery of the packets.
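A connectionless exchange can be sketched with Python's `socket` module as well. No connection is set up: each side simply sends a datagram to the other's address. Both sockets live on the loopback address in this example, and the function name `udp_hello_exchange`, the message texts and the 2-second timeout are assumptions for illustration.

```python
import socket


def udp_hello_exchange() -> tuple:
    """Exchange one datagram in each direction over UDP on loopback."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
        server.settimeout(2.0)         # UDP gives no delivery guarantee, so
        client.settimeout(2.0)         # never wait forever for a datagram
        client.sendto(b"hello server", server.getsockname())
        request, client_addr = server.recvfrom(1024)
        server.sendto(b"hello client", client_addr)
        reply, _server_addr = client.recvfrom(1024)
    return request, reply


print(udp_hello_exchange())
```

If a datagram were lost on a real network, `recvfrom()` would raise a timeout error instead of the data being resent; any retry logic has to be written by the application.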
Comparison of TCP and UDP:
Full form
   TCP: It stands for Transmission Control Protocol.
   UDP: It stands for User Datagram Protocol.
Speed
   TCP: TCP is slower than UDP as it performs error checking and flow control, and provides
   assurance for the delivery of data packets.
   UDP: UDP is faster than TCP as it does not guarantee the delivery of data packets.
Header size
   TCP: The size of the TCP header is 20 bytes (without options).
   UDP: The size of the UDP header is 8 bytes.
Acknowledgment
   TCP: TCP uses the three-way-handshake concept. In this concept, the sender sends the data
   only after it receives the ACK. TCP also has the ability to resend lost data.
   UDP: UDP does not wait for any acknowledgment; it just sends the data.
Flow control mechanism
   TCP: It follows a flow control mechanism in which too many packets cannot be sent to the
   receiver at the same time.
   UDP: This protocol follows no such mechanism.
Error checking
   TCP: TCP performs error checking by using a checksum. When the data is found to be
   corrupted, it is retransmitted to the receiver.
   UDP: UDP has only a basic checksum (optional over IPv4) and does not resend lost or
   corrupted data packets.
Applications
   TCP: This protocol is mainly used where a reliable communication process is required,
   like military services, web browsing, and e-mail.
   UDP: This protocol is used where fast communication is required and reliability is less
   important, like VoIP, game streaming, video and music streaming, etc.
IP Routing:
IP routing is the process of determining the path along which data travels
from source to destination. It selects the best (typically the lowest-cost) path for sending
data from one computer to another computer in the same or a different network. Routing uses
different protocols for different networks to find the path that the data follows across
multiple networks. Forwarding the packets
from source to destination via different routers is called routing. The routing decision is taken
by the routers.
Terminologies:
When data is sent from the source to the destination, TCP and the other protocols at the
source form an IP packet that is sent to the network. The IP packet
has to pass through multiple routers to reach the destination.
Each router reads the destination address from the packet and looks up its routing
table to identify the next router to which the packet has to be passed. The
routing table includes information about the next router, the cost of the route, and
other necessary details. The router takes the routing decision with the help of routing
protocols and the routing table, choosing the next router that lies on the best
route to the destination. Different packets can be sent through different paths, but all
the packets reach their intended destination. When the packets reach the destination through
different routers, the destination passes them to TCP for further processing.
Routing Protocols:
To take routing decisions, a router needs routing protocols and a routing table. The
routing protocols are divided into two domains:
1. Interdomain Routing protocols
2. Intradomain Routing protocols
Interdomain Routing Protocols:
This routing is used among the autonomous networks and it includes Path Vector
Routing(PVR).
Path Vector Routing:
Path vector routing uses path vectors for routing.
Border Gateway Protocol(BGP) is used in PVR for making routing decisions.
Intradomain Routing Protocols:
This routing is used within an autonomous network and it includes two types of
routing: Distance Vector Routing(DVR) and Link State Routing(LSR).
Distance Vector Routing:
Distance vector routing uses distance vectors for routing.
It uses the Bellman-Ford algorithm for the computation of various distances.
Routing Information Protocol(RIP) is used in DVR for making routing decisions.
DVR suffers from count to infinity problem which can be solved using split-horizon or
route poisoning.
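The Bellman-Ford computation used in distance vector routing can be sketched in Python. This is a centralized, simplified version (real routers exchange their distance vectors with neighbours instead of seeing the whole graph); the function name `bellman_ford`, the router names A, B, C and the link costs are assumptions for illustration.

```python
def bellman_ford(nodes, edges, source):
    """Return the lowest known cost from `source` to every node.

    `edges` is a list of (from_node, to_node, cost) tuples; a two-way link
    is listed once in each direction.
    """
    dist = {node: float("inf") for node in nodes}
    dist[source] = 0
    # At most len(nodes) - 1 rounds are needed when there are no negative cycles.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost  # relax: a cheaper route via u was found
                changed = True
        if not changed:
            break
    return dist


links = [("A", "B", 1), ("B", "A", 1), ("B", "C", 2),
         ("C", "B", 2), ("A", "C", 5), ("C", "A", 5)]
print(bellman_ford(["A", "B", "C"], links, "A"))  # {'A': 0, 'B': 1, 'C': 3}
```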
Link State Routing:
Link State routing uses the state of the link for routing.
It uses the Dijkstra algorithm for the computation of various distances.
Open Shortest Path First(OSPF) is used in LSR for making routing decisions.
LSR suffers from heavy traffic due to flooding which can be solved by the TTL field.
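The Dijkstra computation used in link state routing can be sketched as follows. The sketch assumes that the router already holds the full link-state map as a dictionary; the function name `dijkstra`, the router names and the link costs are assumptions for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Return the lowest cost from `source` to every node.

    `graph` maps each node to a dict of {neighbour: link_cost}; costs must
    not be negative.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    queue = [(0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry; a cheaper path to u was already found
        for v, cost in graph[u].items():
            if d + cost < dist[v]:
                dist[v] = d + cost
                heapq.heappush(queue, (dist[v], v))
    return dist


topology = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2},
    "C": {"A": 5, "B": 2},
}
print(dijkstra(topology, "A"))  # {'A': 0, 'B': 1, 'C': 3}
```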
BGP
Autonomous systems (ASes) are collections of connected routers managed by a single
organization, for example an ISP, or a large company like Amazon. The Internet is made up
of ASes that route traffic internally and externally between each other.
IP routers are responsible for the next hop an IP packet takes. In order to know where to
forward a request to, IP routers need to share route information with each other. ASes use
IGPs (Interior Gateway Protocols), like RIP and OSPF, to share route information within
their AS, and EGPs (Exterior Gateway Protocols) to share routing information with other
ASes.
BGP is an EGP. BGP1 was specified in 1989, and BGP4 was specified in 1993. BGP4
supports CIDR and subnetting [1, P. 101].
Note: Confusingly, the obsolete precursor to BGP is the Exterior Gateway Protocol (EGP),
but both BGP and EGP are examples of exterior gateway protocols. In this document, EGP
refers to exterior gateway protocols.
Exterior gateway protocols were developed primarily to allow ASes to set policies on their
routing traffic. For example, two ASes might have a commercial agreement which means
they can send traffic through each other, whereas another AS might want to disallow traffic
from an AS that doesn’t pay it for its services. These policies can be configured using
BGP [2, P. 301].
Each AS has a number (an ASN) assigned to it by either an internet registry or a provider.
Like IP, there are reserved private AS numbers [1].
ASes can be categorized based on how they interact with other ASes.
BGP4 protocol
BGP works by establishing connections to other BGP routers, known as BGP peers. BGP
peers initially send all their routing information to each other and then send UPDATE
messages periodically with any routing changes. A BGP router receives UPDATE messages
and uses them to rebuild its IP routing table.
If a route becomes unreachable or a better path becomes available, BGP routers resend
information.
Most of the complexity in BGP comes from the BGP decision process: the process used to
decide which routes should be used when a BGP router receives multiple possible routes for a
single prefix. BGP uses different metrics to determine which routes should be preferred, such
as the number of ASes that a route passes through [3].
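The AS-path metric mentioned above can be illustrated with a deliberately simplified Python sketch. It compares candidate routes for one prefix by AS-path length only; the full BGP decision process checks further attributes, which are left out here. The function name `prefer_shortest_as_path`, the dictionary layout and the AS numbers and next-hop addresses are all hypothetical.

```python
def prefer_shortest_as_path(routes):
    """Pick the candidate route whose AS path crosses the fewest ASes."""
    return min(routes, key=lambda route: len(route["as_path"]))


# Two hypothetical routes to the same prefix, learned from different peers.
candidates = [
    {"next_hop": "192.0.2.1", "as_path": [64496, 64497, 64499]},
    {"next_hop": "192.0.2.2", "as_path": [64498, 64499]},
]
best = prefer_shortest_as_path(candidates)
print(best["next_hop"])  # 192.0.2.2
```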
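BGP sessions between routers in the same AS are called internal BGP (iBGP), and sessions
between routers in different ASes are called external BGP (eBGP). Routers running iBGP are
called transit routers; routers performing eBGP are known as border routers.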
Neighbor negotiation
When a BGP peer is started it must first connect to its neighbors. Neighbors establish a TCP
connection on port 179, and then send an OPEN message containing information to create the
connection, such as the BGP identifier (the sender's ID) and the hold time.
The hold time is the maximum time that can elapse between keepalive messages before a
neighbor is considered dead. BGP peers use whichever is the lower value of the two peers. A
value of 0 means there’s no timeout and the connection is always considered up [1].
During neighbor session establishment, peer routers determine whether they are in the same
AS by inspecting the AS Number.
Errors
NOTIFICATION messages are sent when an error is detected. These are normally errors in
the format of a received message, like Bad Peer AS, Unacceptable Hold Time,
or Malformed Attribute List.
UPDATE message
The UPDATE message contains routing information updates for both updated routes and
withdrawn routes.
Each UPDATE message contains routes that share common BGP attributes.
Chapter 5 – Application Layer Protocol
DNS
• Although TCP/IP uses IP addresses to locate and connect to hosts (computers and
other TCP/IP network devices), users typically prefer to use friendly names.
• For example, users prefer the friendly name [Link], instead of its IP address,
[Link]. The Domain Name System (DNS), defined in RFCs 1034 and 1035, is
used on the Internet to provide a standard naming convention for locating IP-based
computers.
• On the Internet, before the implementation of DNS, the use of names to locate
resources on TCP/IP networks was supported by a file called Hosts. Network
administrators entered names and IP addresses into Hosts, and computers used the file
for name resolution.
• What is DNS (Domain Name System)?
• The Domain Name System (DNS) is a mechanism that assigns easy-to-remember names to
IP addresses. A domain is a large group of computers on the Internet. Under this scheme
each computer has an IP address and a domain name. Domains are organized on
the basis of organization type or geographical location, e.g., the domain name
[Link] (where com indicates that Google is a commercial organization).
• The Domain Name System (DNS) associates various information with domain
names; most importantly, it serves as the “phone book” for the Internet by translating
human-readable computer hostnames, e.g. [Link], into IP
addresses, e.g. [Link], which networking equipment needs to
deliver information.
• It also stores other information such as the list of mail servers that accept email for a
given domain. In providing a worldwide keyword-based redirection service, the
Domain Name System is an essential component of contemporary Internet use.
• DNS makes it possible to assign Internet names to organizations independent of the
physical routing hierarchy represented by the numerical IP address. Because of this,
hyperlinks and Internet contact information can remain the same, whatever the current
IP routing arrangements may be, and can take a human-readable form, which is easier
to remember than the IP address [Link].
• The Domain Name System distributes the responsibility for assigning domain names
and mapping them to IP networks by allowing an authoritative name server for each
domain to keep track of its own changes, avoiding the need for a central register to be
continually consulted and updated.
• At the request of Jon Postel, Paul Mockapetris invented the Domain Name system in
1983 and wrote the first implementation. The original specifications appear in RFC
882 and RFC 883. In November 1987, the publication of RFC 1034 and RFC 1035
updated the DNS specification and made RFC 882 and RFC 883 obsolete. Several
more-recent RFCs have proposed various extensions to the core DNS protocols.
The Domain Name System consists of a hierarchical set of DNS servers. Each domain or sub
domain has one or more authoritative DNS servers that publish information about that domain
and the name servers of any domains “beneath” it. The hierarchy of authoritative DNS servers
matches the hierarchy of domains. At the top of the hierarchy stand the root name servers: the
servers to query when looking up a top-level domain name.
Domain names, arranged in a tree, cut into zones, each served by a name server.
A domain name usually consists of two or more parts (labels), which are conventionally written
separated by dots, such as [Link]. The rightmost label conveys the top-level
domain; for example, the address [Link] has the top-level domain
[Link]. Each label to the left specifies a subdomain of the domain above it. For example:
[Link] comprises a subdomain of the com domain, and
[Link] comprises a subdomain of the domain [Link].
In theory, this subdivision can go down 127 levels. Each label can contain up to 63 characters.
The whole domain name may not exceed a total length of 253 characters.
A hostname refers to a domain name that has one or more associated IP addresses; e.g., the
‘www.[Link]’ and ‘[Link]’ domains are both hostnames,
whereas the ‘com’ domain is not.
(a) Non-geographical domains are those which indicate the type of organization, e.g.
[Link], in which com indicates a commercial type of organization.
(b) Geographical domains indicate the code for individual countries, e.g. [Link].
Here .in indicates that the domain is registered under the country code for India.
Non-geographical Domains:
Some of the standard non-geographical domains are:
Geographical Domains:
The geographical based top level domains use two-letter country designations.
Each domain name corresponds to a numeric IP address. Whenever we specify a DNS name
like [Link], this name is converted to its corresponding IP address, and this IP
address is used to locate the exact site on the Internet.
A domain helps in locating a computer on the Internet; in other words, DNS is the way that
Internet domain names are located and translated into Internet Protocol addresses. A domain
name is a meaningful and easy-to-remember name for an Internet address.
A domain name is an identification label that defines a realm of administrative autonomy,
authority, or control in the Internet, based on the Domain Name System (DNS).
Domain names are used in various networking contexts and application-specific naming and
addressing purposes. They are organized in subordinate levels (sub-domains) of the DNS root
domain, which is nameless. The first-level set of domain names are the top-level domains
(TLDs), including the generic top-level domains (gTLDs), such as the prominent domains
com, net and org, and the country code top-level domains (ccTLDs).
Below these top level domains in the DNS hierarchy are the second-level and third-level
domain names that are typically open for reservation by end-users that wish to connect local
area networks to the Internet, run web sites, or create other publicly accessible Internet
resources. The registration of these domain names is usually administered by domain name
registrars who sell their services to the public.
Individual Internet host computers use domain names as host identifiers, or hostnames.
Hostnames are the leaf labels in the domain name system usually without further subordinate
domain name space. Hostnames appear as a component in Uniform Resource Locators
(URLs) for Internet resources such as web sites (e.g., en. [Link]).
Domain names are also used as simple identification labels to indicate ownership or control
of a resource. Such examples are the realm identifiers used in the Session
Initiation Protocol (SIP), the Domain Keys used to verify DNS domains in e-mail systems,
and in many other Uniform Resource Identifiers (URIs).
An important purpose of domain names is to provide easily recognizable and memorizable
names to numerically addressed Internet resources. This abstraction allows any resource (e.g.,
website) to be moved to a different physical location in the address topology of the network,
globally or locally in an intranet. Such a move usually requires changing the IP address of
the resource and the corresponding translation of this IP address to and from its domain name.
The hierarchy of domains descends from the right to the left label in the name; each label to
the left specifies a subdivision, or sub-domain of the domain to the right. For example: the
label example specifies a sub-domain of the com domain, and www is a sub domain
of example.com. This tree of labels may consist of 127 levels. Each label may contain up to
63 ASCII characters. The full domain name may not exceed a total length of 253 characters.
In practice, some domain registries may have shorter limits.
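The three size limits stated above (at most 127 levels, at most 63 characters per label, at most 253 characters in total) can be checked with a few lines of Python. This sketch checks lengths only and does not validate which characters are allowed; the function name `is_valid_domain_length` is an assumption for illustration.

```python
def is_valid_domain_length(name: str) -> bool:
    """Check a domain name against the DNS label, depth and total length limits."""
    if name.endswith("."):
        name = name[:-1]  # ignore the optional trailing root dot
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) > 127:
        return False
    # Every label must be non-empty and at most 63 characters long.
    return all(1 <= len(label) <= 63 for label in labels)


print(is_valid_domain_length("www.example.com"))   # True
print(is_valid_domain_length("a" * 64 + ".com"))   # False
```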
A hostname is a domain name that has at least one associated IP address. For example, the
domain names [Link] and [Link] are also hostnames, whereas the com
domain is not.
Top-level domains-
The top-level domains (TLDs) are the highest level of domain names of the Internet. They
form the DNS root zone of the hierarchical Domain Name System. Every domain name ends
in a top-level or first-level domain label.
When the Domain Name System was created in the 1980s, the domain name space was
divided into two main groups of domains. The country code top-level domains (ccTLD) were
primarily based on the two-character territory codes of ISO 3166 country abbreviations. In
addition, a group of seven generic top-level domains (gTLD) was implemented which
represented a set of categories of names and multi-organizations. These were the domains
GOV, EDU, COM, MIL, ORG, NET, and INT.
Second-level and lower level domains
Below the top-level domains in the domain name hierarchy are the second-level domain (SLD)
names. These are the names directly to the left of .com, .net, and the other top-level domains.
As an example, in the domain [Link], wikipedia is the second-level domain.
Next are third-level domains, which are written immediately to the left of a second-level
domain. There can be fourth and fifth-level domains, and so on, with virtually no limitation.
An example of an operational domain name with four levels of domain labels
is [Link].
The www preceding the domains is the host name of the World-Wide Web server. Each label
is separated by a full stop (dot). ‘sos’ is said to be a sub-domain of ‘[Link]’, and ‘state’ a
sub-domain of ‘[Link]’, etc. In general, sub-domains are domains subordinate to their parent
domain. An example of very deep levels of sub-domain ordering is the IPv6 reverse resolution
DNS zones, e.g., [Link].[Link].[Link].[Link].[Link].[Link].[Link].[Link],
which is the reverse DNS resolution domain name for the IP address of a loop back interface,
or the local host name.
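As an illustration of how deep such a name is, Python's standard ipaddress module can derive the reverse-resolution name for an address. This is a small sketch; the choice of the IPv6 loopback address ::1 is an assumption made for the example.

```python
import ipaddress

loopback = ipaddress.ip_address("::1")    # IPv6 loopback address
reverse_name = loopback.reverse_pointer   # name used for the reverse (PTR) lookup
labels = reverse_name.split(".")

print(reverse_name)
print(len(labels), "labels")
```

Each hexadecimal digit of the 128-bit address becomes one label, written in reverse order, followed by the ip6 and arpa labels.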
Internationalized domain names
The character set allowed in the Domain Name System has prevented the representation of
names and words of many languages in their native scripts or alphabets. ICANN has approved
the Punycode-based Internationalizing Domain Names in Applications (IDNA) system, which maps Unicode
strings into the valid DNS character set. Some registries have adopted IDNA.
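The mapping can be tried with Python's built-in "idna" codec. This is a sketch only: the domain bücher.example is a made-up name, and the built-in codec follows the older IDNA 2003 rules, so registries using newer rules may treat some names differently.

```python
def to_ascii(domain: str) -> str:
    """Encode a Unicode domain name to its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Decode an ASCII (Punycode) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


ascii_name = to_ascii("bücher.example")   # hypothetical internationalized name
print(ascii_name)
print(to_unicode(ascii_name))
```

Labels that contain non-ASCII characters are rewritten with the xn-- prefix, while plain ASCII labels such as example pass through unchanged.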
Domain Name Registration
The right to use a domain name is delegated by domain name registrars who are accredited by
the Internet Corporation for Assigned Names and Numbers (ICANN), the organization
charged with overseeing the name and number systems of the Internet. In addition to ICANN,
each top-level domain (TLD) is maintained and serviced technically by an administrative
organization, operating a registry. A registry is responsible for maintaining the database of
names registered within the TLD it administers. The registry receives registration information
from each domain name registrar authorized to assign names in the corresponding TLD and
publishes the information using a special service, the whois protocol.
Registries and registrars usually charge an annual fee for the service of delegating a domain
name to a user and providing a default set of name servers. Often this transaction is termed a
sale or lease of the domain name and the registrant may sometimes be called an “owner”, but
no such legal relationship is actually associated with the transaction, only the exclusive right
to use the domain name. More correctly, authorized users are known as “registrants” or as
“domain holders”.
ICANN publishes the complete list of TLD registries and domain name registrars. Registrant
information associated with domain names is maintained in an online database accessible with
the WHOIS service. For most of the more than 240 country code top-level domains (ccTLDs),
the domain registries maintain the WHOIS (Registrant, name servers, expiration dates, etc.)
information.
Some domain name registries, often called network information centers (NIC), also function
as registrars to end-users. The major generic top-level domain registries, such as for the COM,
NET, ORG, INFO domains and others, use a registry-registrar model consisting of hundreds
of domain name registrars (see lists at ICANN or VeriSign). In this method of management,
the registry only manages the domain name database and the relationship with the registrars.
The registrants (users of a domain name) are customers of the registrar, in some cases through
additional layers of resellers.
In the process of registering a domain name and maintaining authority over the new name
space created, registrars use several key pieces of information connected with a domain:
Administrative contact: A registrant usually designates an administrative contact to manage
the domain name. The administrative contact usually has the highest level of control over a
domain. Management functions delegated to the administrative contacts may include
management of all business information, such as name of record, postal address, and contact
information of the official registrant of the domain and the obligation to conform to the
requirements of the domain registry in order to retain the right to use a domain name.
Furthermore the administrative contact installs additional contact information for technical
and billing functions.
Technical contact: The technical contact manages the name servers of a domain name. The
functions of a technical contact include assuring conformance of the configurations of the
domain name with the requirements of the domain registry, maintaining the domain zone
records, and providing continuous functionality of the name servers (that leads to the
accessibility of the domain name).
Billing contact: The party responsible for receiving billing invoices from the domain name
registrar and paying applicable fees.
Name servers: Most registrars provide two or more name servers as part of the registration
service. However, a registrant may specify its own authoritative name servers to host a
domain’s resource records. The registrar’s policies govern the number of servers and the type
of server information required. Some providers require a host name and the corresponding IP
address or just the hostname, which must be resolvable either in the new domain, or exist
elsewhere. Based on traditional requirements (RFC 1034), typically a minimum of two servers
is required.
The steps involved in DNS Resolution are-
Step-01:
A user program sends a name query to a library procedure called the resolver.
Step-02:
Step-03:
Step-04:
After receiving a response, the DNS client returns the resolution result to the
application.
HTTP
• HTTP stands for HyperText Transfer Protocol.
• It is a protocol used to access the data on the World Wide Web (www).
• The HTTP protocol can be used to transfer the data in the form of plain text,
hypertext, audio, video, and so on.
• It is called the HyperText Transfer Protocol because it works efficiently in a
hypertext environment, where there are rapid jumps from one document to
another.
• HTTP is similar to the FTP as it also transfers the files from one host to another host.
But, HTTP is simpler than FTP as HTTP uses only one connection, i.e., no control
connection to transfer the files.
• HTTP carries data in a MIME-like format.
• HTTP is similar to SMTP as the data is transferred between client and server. The
HTTP differs from the SMTP in the way the messages are sent from the client to the
server and from server to the client. SMTP messages are stored and forwarded while
HTTP messages are delivered immediately.
• Connectionless protocol: Although HTTP runs over TCP, it is described as connectionless
in the sense that the client initiates a request and waits for a response from the server.
When the server receives the request, it processes the request and sends back the response
to the HTTP client, after which the client closes the connection. In this basic model the
connection between client and server exists only for the duration of the current request and response.
• Media independent: HTTP is media independent: any type of data can be sent as
long as both the client and server know how to handle the content. Both the client and
the server are required to specify the content type using the appropriate MIME type header.
• Stateless: HTTP is a stateless protocol, as the client and server know each other
only during the current request. Because of this, neither the client nor the
server retains information between different requests for web pages.
HTTP Request
The HTTP Request is the call that the client/browser makes to the web server. It is composed
of the following elements:
Method
URL (possibly including a query string)
Request Headers
Request Body
Method-
Each request must use exactly one
HTTP method, which corresponds to the type of operation performed by the client/browser.
These methods are specified in RFC 7231. The main HTTP methods are as follows:
There are also other HTTP methods, such as OPTIONS, HEAD, etc, which we will not cover
for space reasons. The most important and used in the context of websites and web
applications are GET and POST.
Generally, the GET method is used when the client accesses a page in READING, while the
POST method is used when the client makes a MODIFICATION on the site data (create,
modify, and/or delete a user, an article, etc). Not surprisingly, the POST method is typically
used by forms and AJAX calls that have the effect of MODIFYING the server data; since it is
a method that makes MODIFICATIONS, the POST method is also typically subject to
authentication and/or prior authorization requirements in order to be used (in other words, it
is only available for registered users with adequate permissions).
[Link]
As you can see, the URL can also contain a query string, that is a sequence of parameters
starting with the "?" and then continues with one or more pairs of keys/values separated from
each other by the symbol "&". In the URL above, the query string is made up of a single
key/value pair, where the key is "q" and the value is "tennis".
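The structure of such a URL can be inspected with Python's standard urllib.parse module. This is a minimal sketch; the URL below is a hypothetical stand-in with the same single q=tennis pair described above.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.example.com/search?q=tennis"   # hypothetical URL

parts = urlsplit(url)
params = parse_qs(parts.query)   # maps each key to a list of values

print(parts.netloc)   # host part of the URL
print(parts.path)     # path on the server
print(params)         # parsed query string
```

parse_qs returns a list for every key because the same key may legally appear more than once in a query string.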
Request Headers-
Request headers are a sort of "metadata" that can be sent by the client/browser accompanying
an HTTP request to provide information on the context of the request: many of these headers
are added "automatically" by the browser and contain parameters relating to the language
and/or the settings in use on the system. These headers are also transmitted as a series of
key/value pairs. Some examples of the most common request headers:
Request Body
The body of the request is almost always present in POST requests, as it contains the
parameters that are sent to the server. It is typically not present ("empty") in GET
requests, because in those cases the parameters are sent to the server through the query string (see
above).
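To make the four elements concrete, the sketch below assembles the text of a GET and a POST request by hand. It is illustrative only: the host, paths, and form fields are assumptions, and real clients add more headers than shown here.

```python
from urllib.parse import urlencode


def build_request(method, path, host, headers=None, body=b""):
    """Assemble an HTTP/1.1 request message as bytes."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"   # a blank line ends the headers
    return head.encode("ascii") + body


# GET: parameters travel in the query string, and the body is empty.
get_req = build_request("GET", "/search?" + urlencode({"q": "tennis"}),
                        "www.example.com", {"Accept-Language": "en"})

# POST: parameters travel in the request body.
form = urlencode({"user": "alice", "comment": "hello"}).encode("ascii")
post_req = build_request("POST", "/comments", "www.example.com",
                         {"Content-Type": "application/x-www-form-urlencoded"},
                         form)

print(get_req.decode("ascii"))
print(post_req.decode("ascii"))
```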
HTTP Response-
Now that we have focused on the main components of an HTTP Request, it is time to
examine the HTTP Response, which is the response that the web server sends to the
client/browser following each request. It consists of the following elements:
Status Code (e.g. 200 - OK, or 404 - Page Not Found, etc)
Response Headers
Response Body
Status Code-
The Status Code (or HTTP Status Code) is a 3-digit numeric code that indicates the outcome
of the HTTP request made by the client. The Status Codes are also defined within RFC 7231
and are grouped into five main categories:
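The category of a status code is given by its first digit. The sketch below uses the standard class names (informational, success, redirection, client error, server error); the helper function is illustrative, while HTTPStatus comes from Python's standard library.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code: int) -> str:
    """Return the category of a 3-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (200, 301, 404, 500):
    print(code, HTTPStatus(code).phrase, "-", status_class(code))
```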
Response Headers-
The response headers are the equivalent of the request headers, that is a sort of "metadata"
that is sent by the server in support of an HTTP response to provide information on the
context of the response. These headers are also transmitted as a series of key/value pairs.
Some examples of the most common response headers:
Response Body
The response body is the content that is transmitted from the server to the client, which is
what the client typically displays on the screen. It can be in various formats: HTML, text,
GIF image, JPG (with EXIF data), WAV, MP4, etc., and is opened by the client/browser
using the media type information contained in the Content-Type response header (see above).
IMPORTANT: do not confuse the Response Body, which concerns the HTTP response
received from the server, with the <BODY> element of the HTML pages, which concerns the
internal structure of the content found within the Response Body.
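The three parts of a response can be separated with a few lines of code. This is a simplified sketch that assumes a small, complete response held in memory; the sample response is invented, and real responses may use features (such as chunked transfer) that this parser ignores.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into status code, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")       # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


page = b"<html><body><p>Hello, client</p></body></html>"
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html; charset=utf-8\r\n"
       + f"Content-Length: {len(page)}\r\n".encode("ascii")
       + b"\r\n" + page)

code, reason, headers, body = parse_response(raw)
print(code, reason)
print(headers["content-type"])
print(body.decode("utf-8"))
```

Note that the <body> element appears inside the Response Body, which is exactly the distinction drawn above.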
In order to facilitate the understanding of the HTTP Request / Response cycle, we propose -
with the inevitable simplifications, differences, and approximations of the case - to imagine
the exchange of information between client and server as the mechanism for sending and
receiving orders, shipments, and packages on an e-commerce website, such as Amazon.
In detail:
The HTTP Request corresponds to each of the activities we carry out on the
Amazon site from the selection of the item to the order, namely:
The method (GET when we view the details of the item we want to
order, POST when we send the order confirmation)
The URL (the product we order)
The Request Headers (any delivery details, eg "before 2 pm")
The Request Body (shipping data)
The HTTP Response corresponds to the package we receive from Amazon, that
is:
HTTP Status Code (200 if the package arrives, 404 if the non-delivery
notice arrives, 500 if the address is incorrect, etc)
HTTP Response Headers (the packing slip containing address, sender,
telephone, hours, shipper, the date on which it was stamped, etc)
HTTP Response Body (the contents of the package, or information on
why it did not arrive and/or what to do in case of non-delivery)
WWW Overview-
WWW stands for World Wide Web. A technical definition of the World Wide Web is: all
the resources and users on the Internet that are using the Hypertext Transfer Protocol (HTTP).
WWW can also be defined as the collection of different websites around the world, containing
different information shared via local servers (or computers).
A broader definition comes from the organization that Web inventor Tim Berners-Lee helped
found, the World Wide Web Consortium (W3C).
The World Wide Web is the universe of network-accessible information, an embodiment of
human knowledge.
In simple terms, The World Wide Web is a way of exchanging information between computers
on the Internet, tying them together into a vast collection of interactive multimedia resources.
The Internet and the Web are not the same thing: the Web uses the Internet to pass information.
The terms Internet and World Wide Web are often used as synonyms. However, the two terms
do not mean the same thing. The Internet is a global system of interconnected computer
networks. In contrast, the World Wide Web is a global collection of documents and other
resources, linked by hyperlinks and URLs. Web resources are accessed using HTTP or HTTPS,
which are application-level Internet protocols that use the Internet’s transport protocols.
Viewing a web page on the World Wide Web normally begins either by typing the URL of the
page into a web browser, or by following a hyperlink to that page or resource. The web browser
then initiates a series of background communication messages to fetch and display the
requested page. In the 1990s, using a browser to view web pages – and to move from one web
page to another through hyperlinks – came to be known as ‘browsing,’ ‘web surfing’ (after
channel surfing), or ‘navigating the Web’. Early studies of this new behaviour investigated user
patterns in using web browsers. One study, for example, found five user patterns: exploratory
surfing, window surfing, evolved surfing, bounded navigation and targeted navigation.
System Architecture:
From the user’s point of view, the web consists of a vast, worldwide connection of
documents or web pages. Each page may contain links to other pages anywhere in the
world. The pages can be retrieved and viewed by using browsers, of which Internet
Explorer, Netscape Navigator, Google Chrome, etc. are popular ones. The browser
fetches the requested page, interprets the text and formatting commands on it, and displays
the page, properly formatted, on the screen.
The basic model of how the web works is shown in the figure below. Here the browser is
displaying a web page on the client machine. When the user clicks on a line of text that is
linked to a page on the [Link] server, the browser follows the hyperlink by sending a
message to the [Link] server asking it for the page.
Working of WWW: -
The World Wide Web is based on several different technologies: Web browsers, Hypertext
Markup Language (HTML) and Hypertext Transfer Protocol (HTTP).
A Web browser is used to access web pages. Web browsers can be defined as programs which
display text, data, pictures, animation and video on the Internet. Hyperlinked resources on
the World Wide Web can be accessed using software interfaces provided by Web browsers.
Initially, Web browsers were used only for surfing the Web but now they have become more
universal. Web browsers can be used for several tasks including conducting searches,
mailing, transferring files, and much more. Some of the commonly used browsers are Internet
Explorer, Opera Mini, and Google Chrome.
Features of WWW:
Hypertext Information System
Cross-Platform
Distributed
Open Standards and Open Source
Uses Web Browsers to provide a single interface for many services
Dynamic, Interactive and Evolving.
“Web 2.0”
WWW Architecture-
Identifiers and Character Set
Uniform Resource Identifier (URI) is used to uniquely identify resources on the web,
and Unicode makes it possible to build web pages that can be read and written in human
languages.
Syntax
XML (Extensible Markup Language) helps to define common syntax in semantic web.
Data Interchange
Resource Description Framework (RDF) helps in defining the core representation of
data for the web. RDF represents data about resources in graph form.
Taxonomies
RDF Schema (RDFS) allows more standardized description of taxonomies and
other ontological constructs.
Ontologies
Web Ontology Language (OWL) offers more constructs than RDFS. It comes in the following
three versions:
OWL Lite for taxonomies and simple constraints.
OWL DL for full description logic support.
OWL Full for more syntactic freedom of RDF.
Rules
RIF and SWRL offer rules beyond the constructs that are available
from RDFS and OWL. SPARQL Protocol and RDF Query Language (SPARQL) is an SQL-like
language used for querying RDF data and OWL ontologies.
Proof
All semantics and rules that are executed at the layers below Proof, and their results, are used to
prove deductions.
Cryptography
Cryptographic means, such as digital signatures, are used to verify the origin of sources.
User Interface and Applications
On top of these layers, the User Interface and Applications layer is built for user interaction.
WWW Operation
WWW works on a client-server approach. The following steps explain how the web works:
1. The user enters the URL (say, [Link]) of the web page in the
address bar of the web browser.
2. The browser then asks the Domain Name Server for the IP address corresponding to
[Link].
3. After receiving the IP address, the browser sends the request for the web page to the web server
using the HTTP protocol, which specifies the way the browser and web server
communicate.
4. The web server receives the request using the HTTP protocol and searches for the
requested web page. If the page is found, the server returns it to the web browser and closes the HTTP
connection.
5. The web browser receives the web page, interprets it, and displays the contents of the
web page in the browser window.
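The five steps can be traced with a toy model that involves no real network. Everything in this sketch is a stand-in: the dictionaries play the roles of the Domain Name Server and the web server, and the host name, IP address, and page are invented for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical stand-ins for a DNS server and a web server's document store.
DNS_TABLE = {"www.example.com": "192.0.2.10"}
PAGES = {("192.0.2.10", "/index.html"): "<h1>Welcome</h1>"}


def fetch(url: str):
    """Walk through the steps a browser takes to obtain a page."""
    parts = urlsplit(url)              # step 1: the URL entered by the user
    ip = DNS_TABLE[parts.hostname]     # step 2: host name -> IP address
    path = parts.path or "/"           # step 3: request for this path goes to that IP
    page = PAGES.get((ip, path))       # step 4: server searches for the page
    if page is None:
        return 404, "Not Found"
    return 200, page                   # step 5: browser interprets and displays it


print(fetch("http://www.example.com/index.html"))
print(fetch("http://www.example.com/missing.html"))
```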
Future
There has been rapid development in the field of the web. It has had an impact in almost every area,
such as education, research, technology, commerce, and marketing, so the future of the web is almost
unpredictable.
Apart from the huge development in the field of WWW, there are also some technical issues that the W3
Consortium has to cope with.
User Interface
Work on higher quality presentation of 3-D information is under development. The W3
Consortium is also looking to enhance the web to fulfil the requirements of global
communities, which would include all regional languages and writing systems.
Technology
Work on privacy and security is under way. This would include hiding information, accounting,
access control, integrity and risk management.
Architecture
There has been huge growth in the field of the web, which may overload the Internet and
degrade its performance. Hence better protocols need to be developed.
Broadly speaking, the email system is federated and consists of the following parts:
Mail Transfer Agent (MTA) is a program that implements the SMTP protocol to transport
messages between hosts (e.g. Sendmail, qmail, Postfix).
Mail User Agent (MUA) is the email client application (e.g. mutt, [Link], Outlook,
Thunderbird).
There can also be a Mail Delivery Agent (MDA) - an intermediate piece of software
that bridges the gap between MTA and MUA on the receiving side (e.g. procmail).
This can be done for spam filtering purposes and to manage the email message
persistence on Unix/Linux systems.
Lastly, there can also be a Mail Submission Agent (MSA), which is the equivalent of the MDA,
but for sending email.
Because of the complexity involved in having many different systems send and receive
messages in many different formats, Email protocols have become necessary. Any number of
vendors may attempt to develop and market their idea of a superior messaging system, using
proprietary systems, for sending, receiving, and storing messages.
It would be impossible for all vendors to provide support for all other vendors' messaging
specifications. Instead, vendors have tried to support a limited set of standard specifications,
translating messages between their own message system and the accepted standards.
There are a few message formats that have evolved into de facto standards and most messaging
products include support for them.
What is an email protocol?
An email protocol is a group of rules which ensure that emails are properly transmitted over
the Internet. In fact, there is a list of email protocols that handle email transactions. Thanks to
them we are able to send and receive emails from different machines, networks, and
operating systems. Moreover, these mail protocols allow you to access and manage your
emails from various email programs and devices.
1. X.400
2. SMTP (Simple Mail Transfer Protocol)
3. UUCP (Unix-to-Unix copy)
4. POP-3 (Post Office Protocol version 3)
5. IMAP-4 (Interactive Mail Access Protocol)
6. MIME (Multipurpose Internet Mail Extension)
X.400
The messaging standard with the greatest international impact is called X.400. It attempts to
lay down standards for all electronic messaging systems in the world. Large-scale messaging
services (such as CompuServe and America Online) use the X.400 specification.
If you send a message to an electronic address over one of these services, the service converts
the message to comply with the X.400 specification, and then sends it on to its destination,
where it is converted again, into the message format the receiving program uses to make the
message understandable to the receiving party.
As experience was gained, more elaborate systems were proposed. In 1988, CCITT modified
X.400. However, after a decade of competition, email systems based on RFC 822 are widely
used, whereas those based on X.400 have become obsolete. The reason for RFC 822's success
is not that it is so good, but that X.400 is so poorly designed and so complex that nobody could
implement it well.
SMTP-
Simple Mail Transfer Protocol (SMTP) is an Internet standard for electronic mail (e-mail)
transmission.
First defined by RFC 821 in 1982, it was last updated in 2008 with the Extended SMTP
additions by RFC 5321 -which is the protocol in widespread use today.
SMTP by default uses TCP port 25. The protocol for mail submission is the same, but uses
port 587, and SMTP connections secured by SSL, known as SMTPS, default to port 465.
While electronic mail servers and other mail transfer agents use SMTP to send and receive
mail messages, user-level client mail applications typically use SMTP only for sending
messages to a mail server for relaying.
SMTP is a TCP/IP protocol that specifies how computers exchange electronic mail. It works
with the Post Office Protocol (POP).
SMTP is used to upload mail directly from the client to an intermediate host, but only
computers that are constantly connected to the Internet, such as those of Internet Service
Providers (ISPs), can use SMTP to receive mail. The ISP servers then offload the mail to the
users to whom they provide the Internet service.
Windows NT Option Pack 4 includes an SMTP mail client, as does the Windows NT Resource
Kit. Microsoft Exchange Server will route your LAN mail on and off the Internet.
Working of SMTP-
SMTP is a simple ASCII protocol. After establishing the TCP connection, the sending machine,
operating as the client, waits for the receiving machine, operating as the server, to talk first. The
server starts by sending a line of text giving its identity and telling whether or not it is prepared
to receive mail. If it is not, the client releases the connection and tries again later.
If the server is willing to accept email, the client announces whom the email is coming from
and whom it is going to. If such a recipient exists at the destination, the server gives the client
the go-ahead to send the message. Then the client sends the message and the server
acknowledges it.
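The dialogue described above can be imitated with a toy server object, with no network involved. This is a sketch under stated assumptions: the class, addresses, and host names are invented, only a handful of commands are handled, and the three-digit reply codes are the standard SMTP ones (220 ready, 250 OK, 354 start mail input, 550 mailbox unavailable, 221 closing).

```python
class ToySMTPServer:
    """In-memory imitation of the server side of an SMTP dialogue."""

    def __init__(self, mailboxes):
        self.mailboxes = set(mailboxes)
        self.sender = None
        self.recipients = []
        self.delivered = []

    def greeting(self):
        return "220 mail.example.com ready"     # the server talks first

    def command(self, line):
        verb, _, arg = line.partition(" ")
        verb = verb.upper()
        if verb == "HELO":
            return "250 Hello " + arg
        if verb == "MAIL":                      # MAIL FROM:<address>
            self.sender = arg.split(":", 1)[1].strip("<>")
            return "250 OK"
        if verb == "RCPT":                      # RCPT TO:<address>
            rcpt = arg.split(":", 1)[1].strip("<>")
            if rcpt not in self.mailboxes:
                return "550 No such user"
            self.recipients.append(rcpt)
            return "250 OK"
        if verb == "DATA":
            return "354 End data with <CRLF>.<CRLF>"
        if verb == "QUIT":
            return "221 Bye"
        return "500 Unrecognized command"

    def data(self, text):
        self.delivered.append((self.sender, list(self.recipients), text))
        return "250 Message accepted"


server = ToySMTPServer({"bob@example.com"})
transcript = [server.greeting()]
for line in ["HELO client.example.org",
             "MAIL FROM:<alice@example.org>",
             "RCPT TO:<bob@example.com>",
             "DATA"]:
    transcript.append(server.command(line))
transcript.append(server.data("Subject: Hi\r\n\r\nHello Bob"))
transcript.append(server.command("QUIT"))
print("\n".join(transcript))
```

For sending real mail, Python's standard smtplib module carries out this dialogue over an actual TCP connection.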
In short
SMTP stands for Simple Mail Transfer Protocol, and it is responsible for sending email messages. This
protocol is used by email clients and mail servers to exchange emails between computers. A
mail client and the SMTP server communicate with each other over a connection established
through a particular email port. Both entities are using SMTP commands and replies to process
your outgoing emails. Thanks to the Simple Mail Transfer Protocol, messages can be sent from
the same account on different email applications.
The problems that may arise with SMTP protocol are as follows:
1. Some older versions of SMTP implementations cannot handle messages exceeding 64
KB.
2. If the client and server have different timeouts, one of them may give up while the
other is still busy, unexpectedly terminating the connection.
To get around these problems extended SMTP (ESMTP) has been defined in RFC 1425.
What is POP3?
The POP3 abbreviation stands for Post Office Protocol version 3, which provides access to an
inbox stored in an email server. It executes the download and delete operations for messages.
Thus, when a POP3 client connects to the mail server, it retrieves all messages from the
mailbox. Then it stores them on your local computer and deletes them from the remote
server.
Thanks to this protocol, you are able to access the messages locally in offline mode as well.
Modern POP3 clients allow you to keep a copy of your messages on the server if you
explicitly select this option.
To retrieve a message from a POP3 server, a POP3 client establishes a Transmission Control
Protocol (TCP) session using TCP port 110, identifies itself to the server, and then issues a
series of POP3 commands:
1. STAT - Asks the server for the number of messages waiting to be retrieved.
2. LIST - Determines the size of each message to be retrieved.
3. RETR - Retrieves individual messages.
4. QUIT - Ends the POP3 session.
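The same command sequence is available through Python's standard poplib module. The function below is a sketch that is defined but not executed here, since it needs a real server; the host, user name, and password are placeholders supplied by the caller.

```python
import poplib


def fetch_all(host, user, password):
    """Download every message from a POP3 mailbox and return them as bytes."""
    conn = poplib.POP3(host, poplib.POP3_PORT)   # TCP port 110
    try:
        conn.user(user)                    # identify to the server
        conn.pass_(password)
        count, _ = conn.stat()             # STAT: number of messages, total size
        print(count, "messages waiting")
        _, listing, _ = conn.list()        # LIST: one "number size" entry per message
        messages = []
        for entry in listing:
            number = int(entry.split()[0])
            _, lines, _ = conn.retr(number)    # RETR: the message as a list of lines
            messages.append(b"\r\n".join(lines))
        return messages
    finally:
        conn.quit()                        # QUIT: end the session
```

This sketch does not issue a delete command, so the messages stay on the server. Plain POP3 on port 110 sends the password unencrypted; poplib also offers POP3_SSL for encrypted sessions.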
After a POP3 client reads a message in its mailbox on a POP3 server, the message is deleted by default.
Primarily because of this, POP3 is being supplanted by Internet Message Access Protocol version
4 (IMAP4), which offers better support for mobile users. POP3 is supported by Microsoft
Exchange Server.
Mail stored locally, i.e. always accessible, even without internet connection
Internet connection needed only for sending and receiving mail
Saves server storage space
Option to leave copy of mail on server
Consolidate multiple email accounts and servers into one inbox
IMAP-
IMAP originally stood for Interactive Mail Access Protocol, as defined in RFC 1064; the current
version, IMAP4, is named Internet Message Access Protocol. This protocol
is designed to help the user who needs to use different types of computers, say a laptop computer
while on the move and a workstation while in the office or at home. This protocol is basically
meant for an email server that can be accessed from any machine.
In this protocol, the user machine needs to access the server. IMAP also has the ability to
process the arrived mail from the remote mail server to fetch only the specified mail. For
example, the user can request the mail sent by a particular sender such as XYZ on a specific
date.
Examples of such applications are Outlook, Thunderbird, Eudora, GNUMail, or (Mac) Mail.
Connect to server
Fetch user requested content and cache it locally, e.g. list of new mail, message summaries,
or content of explicitly selected emails
Process user edits, e.g. marking email as read, deleting email etc.
Disconnect
As you can see, the IMAP workflow is a little more complex than POP. Essentially, folder
structures and emails are stored on the server and only copies are kept locally. Typically,
these local copies are stored temporarily. However, you can also store them permanently.
As mentioned in the introduction, IMAP was created to allow remote access to emails stored
on a remote server. The idea was to allow multiple clients or users to manage the same inbox.
So whether you log in from your home or your work computer, you will always see the same
emails and folder structure since they are stored on the server and all changes you make to
local copies are immediately synced to the server.
MIME
MIME stands for Multipurpose Internet Mail Extensions. It is a widely used Internet standard
for encoding binary files to send them as email attachments over the Internet. MIME allows an
email message to contain a non-ASCII file such as a video, an image, or a sound, and it provides a
mechanism to convert non-text data into text characters.
Traditional email sent over the Internet using Simple Mail Transfer Protocol (SMTP) follows
Request for Comments (RFC) 822, which defines messages as consisting of a header and
a body part, both of which are encoded using 7-bit ASCII text.
The header of an SMTP message consists of a series of field/value pairs that are structured so
that the message can be delivered to its intended recipient. The body is unstructured text and
contains the actual message.
Multipurpose Internet Mail Extensions (MIME) defines five additional extensions to SMTP
message headers, supports multipart messages with more than two parts, and allows the
encoding of 8-bit binary data such as image files so that they can be sent using SMTP.
The encoding method used by MIME for translating binary information, Base64 encoding,
essentially provides a mechanism for translating non-text information into text characters. The
MIME extensions are implemented as fields within the email message header. These fields
define the following:
1. Content-type
2. The content transfer encoding method
3. MIME version number
4. Content ID
5. Content description
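Several of these header fields can be seen by building a message with Python's standard email package. This is a minimal sketch: the addresses, subject, and the few placeholder bytes standing in for an image are assumptions made for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.com"
msg["Subject"] = "Picture"
msg.set_content("The image is attached.")

# Placeholder binary data; a real program would read the bytes of an image file.
fake_image = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
msg.add_attachment(fake_image, maintype="image", subtype="png",
                   filename="picture.png")

attachment = next(msg.iter_attachments())
print(msg["MIME-Version"])                       # MIME version number
print(msg.get_content_type())                    # content type of the whole message
print(attachment.get_content_type())             # content type of the attachment
print(attachment["Content-Transfer-Encoding"])   # content transfer encoding method
print(msg.as_string())
```

The printed message shows the binary attachment carried as Base64 text, so the whole message remains 7-bit ASCII and can be sent using SMTP.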
FTP
File Transfer Protocol (FTP) is an Internet tool provided by TCP/IP. The first FTP
specification was written by Abhay Bhushan in 1971. FTP helps to transfer files from one computer to
another by providing access to directories or folders on remote computers, and allows
software, data, and text files to be transferred between different kinds of computers. The end user's
machine in the connection is known as the local host, and the server which provides data is known as the
remote host.
o FTP stands for File transfer protocol.
o FTP is a standard internet protocol provided by TCP/IP used for transmitting the files
from one host to another.
o It is mainly used for transferring the web page files from their creator to the computer
that acts as a server for other computers on the internet.
o It is also used for downloading the files to computer from other servers.
Objectives of FTP
o It provides the sharing of files.
o It is used to encourage the use of remote computers.
o It transfers the data more reliably and efficiently.
Why FTP?
Although transferring files from one system to another seems simple and straightforward,
it can sometimes cause problems. For example, two systems may have different file
conventions, different ways to represent text and data, or
different directory structures. FTP overcomes these problems by
establishing two connections between hosts. One connection is used for data transfer, and
the other is used for control information (commands and responses).
The FTP connection is established between two systems that communicate with each
other over a network. To use the connection, the user either provides credentials
to the FTP server or uses anonymous FTP.
When an FTP connection is established, two communication channels are set up: the
command channel and the data channel. The command channel carries commands from the
client to the server and responses from the server to the client. FTP uses the same
approach as TELNET or SMTP to communicate across the control connection: it uses the
NVT ASCII character set, and the server listens on port number 21. The data channel,
in contrast, is used to actually transfer the data between client and server. In active
mode, the server side of the data channel uses port number 20.
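To make the command channel concrete, here is a small sketch of how a command line is framed before it is written to the control connection. The helper name `ftp_command` and the two port constants are illustrative, not part of any library; `USER`, `PASS`, and `QUIT` are standard FTP commands.

```python
CONTROL_PORT = 21  # command channel (server side)
DATA_PORT = 20     # data channel (server side, active mode)

def ftp_command(verb, argument=""):
    """Frame one FTP command as an NVT ASCII line terminated by CR LF."""
    line = f"{verb} {argument}".strip()
    return line.encode("ascii") + b"\r\n"

# The byte strings a client would write to the control connection
# for an anonymous login followed by a logout.
for cmd in (ftp_command("USER", "anonymous"),
            ftp_command("PASS", "guest@example.com"),  # hypothetical address
            ftp_command("QUIT")):
    print(cmd)
```

Only the framing is shown here; a real client would also read and check the numeric reply the server sends after each command.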
The user starts the FTP client by giving the FTP command, or a URL, together with the
address of the FTP server. As soon as the client and the server are connected, the user
logs in with a user ID and password. A user who is not registered with the server can
still access public files through anonymous login, where by convention the password is
the client’s email address. The server verifies the login and allows the client to access
the files. The client transfers the desired files and closes the connection. The figure
below shows the working of FTP.
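The login-and-transfer sequence described above can be sketched with `ftplib` from the Python standard library. This is an illustrative sketch, not a complete client: the function name `fetch_listing`, the host `ftp.example.com`, and the `ftp_factory` parameter (added only so the function can be exercised without a real server) are assumptions.

```python
from ftplib import FTP

def fetch_listing(host, user="anonymous", password="guest@example.com",
                  ftp_factory=FTP):
    """Log in to an FTP server and return its directory listing as a list of lines."""
    lines = []
    # Creating the FTP object opens the control connection (port 21 by default).
    with ftp_factory(host) as ftp:
        ftp.login(user, password)            # USER / PASS on the control connection
        ftp.retrlines("LIST", lines.append)  # the listing arrives on the data connection
    # Leaving the "with" block closes the control connection.
    return lines

if __name__ == "__main__":
    # Hypothetical server name; replace it with a reachable FTP server to try this.
    try:
        for entry in fetch_listing("ftp.example.com"):
            print(entry)
    except OSError as exc:
        print("FTP session failed:", exc)
```

The default arguments perform an anonymous login; passing a registered user ID and password instead exercises the ordinary login path.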
Transmission mode
Applications of FTP
Advantages
o Multiple transfers: FTP can transfer multiple large files between systems.
o Efficiency: FTP helps to organize files and transfers them efficiently over the
network.
o Security: FTP gives access to a user only through a user ID and password.
Moreover, the server can define multiple levels of access.
o Continuous transfer: If a file transfer is interrupted for any reason, the user
can resume it once the connection is re-established.
o Simple: FTP is simple to implement and use, which is why it is a widely used
protocol.
o Speed: FTP is a fast way to transfer files from one computer to another.
Disadvantages
o Less security: FTP does not encrypt files during transfer. Moreover, the
username and password are sent in plain text, which makes them easy for an
attacker to intercept.
o Old technology: FTP is one of the oldest protocols and uses multiple
TCP/IP connections to transfer files. These connections are often blocked by
firewalls.
o Virus: An FTP connection is difficult to scan for viruses, which further
increases the risk of vulnerability.
o Limited: FTP provides very limited user permissions and mobile device
access.
o Memory and programming: FTP requires more memory and programming
effort, as it is very difficult to find errors without the commands.
Telnet
The main task of the internet is to provide services to users. For example, users may want
to run different application programs at a remote site and transfer the results to the
local site. One approach is a dedicated client-server program for each service, such as
FTP or SMTP, but it is impractical to create a specific program for every demand.
The better solution is a general client-server program that lets the user access
any application program on a remote computer, that is, a program that allows a user
to log on to a remote computer. Telnet is a popular client-server program that meets
such demands. Telnet is an abbreviation for Terminal Network.
Telnet provides a connection to the remote computer in such a way that the local terminal
appears to be a terminal at the remote side.
Remote login
o When a user logs into a local computer, it is known as local login.
o When the workstation is running a terminal emulator, the keystrokes entered by the
user are accepted by the terminal driver. The terminal driver then passes these
characters to the operating system, which in turn invokes the desired application
program.
o However, the operating system assigns special meanings to some characters. For
example, in UNIX some combinations of characters have special meanings, such as
the control character with "z", which means suspend. Such characters do not create
any problem in local login, because the terminal driver knows their meaning, but
they can cause problems in remote login.
o When the user wants to access an application program on a remote computer,
the user must perform remote login.
In remote login, the user sends the keystrokes to the terminal driver, and the characters
are then passed to the TELNET client. The TELNET client transforms the characters into a
universal character set known as Network Virtual Terminal (NVT) characters and delivers
them to the local TCP/IP stack.
The commands in NVT form are transmitted to the TCP/IP stack at the remote machine.
There, the characters are delivered to the operating system and then passed to the TELNET
server, which transforms them into a form the remote computer can understand. However,
the characters cannot be passed directly to the operating system, because the remote
operating system is not designed to receive characters from a TELNET server. A piece of
software, commonly called a pseudoterminal driver, is therefore needed to accept the
characters from the TELNET server. The operating system then passes these characters to
the appropriate application program.
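The character translation performed by the TELNET client and server can be illustrated with a small sketch. It covers only the NVT ASCII conventions (7-bit characters, lines ending in CR LF) and ignores the rest of the Telnet protocol, such as option negotiation; the function names `to_nvt` and `from_nvt` are made up for this example.

```python
def to_nvt(text):
    """Client side: encode local text as NVT ASCII with CR LF line endings."""
    normalized = text.replace("\r\n", "\n")  # avoid doubling an existing CR
    return normalized.replace("\n", "\r\n").encode("ascii")

def from_nvt(data):
    """Server side: decode NVT ASCII back to the local newline convention."""
    return data.decode("ascii").replace("\r\n", "\n")

wire = to_nvt("ls -l\n")
print(wire)            # bytes as they would travel over the TCP connection
print(from_nvt(wire))  # text as the remote side would hand it on
```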
A socket is one endpoint of a two-way communication link between two programs running on
the network. The socket mechanism provides a means of inter-process communication (IPC) by
establishing named contact points between which the communication takes place.
Just as the ‘pipe’ system call creates pipes, the ‘socket’ system call creates sockets. A
socket provides a bidirectional FIFO communication facility over the network. A socket
connecting to the network is created at each end of the communication. Each socket has a
specific address, which is composed of an IP address and a port number.
Sockets are generally employed in client-server applications. The server creates
a socket, attaches it to a network port address, and then waits for the client to
contact it. The client creates a socket and then attempts to connect to the
server socket. When the connection is established, the transfer of data takes
place. With the combination of IP address and port number, the process knows the
address of the system and the address of the application to which data is to be sent.
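The client-server sequence described above (create, bind, listen, accept on the server; create and connect on the client) can be sketched in Python. This is a minimal single-machine sketch: both ends run in one process on the loopback address, the server handles exactly one client, and the names `serve_once` and `run_demo` are illustrative.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept one client, read its message, reply, and close the connection."""
    conn, _addr = server_sock.accept()  # waits for the client to contact the server
    with conn:
        conn.recv(1024)                 # message from the client
        conn.sendall(b"hello client")

def run_demo():
    # Server side: create a socket, attach it to an address, and listen.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()

    # Client side: create a socket and connect to the server's IP address and port.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"hello server")
        reply = client.recv(1024)

    worker.join()
    server_sock.close()
    return reply

if __name__ == "__main__":
    print(run_demo())
```

To run the two ends on different computers, the server would bind to an address reachable from the network and the client would connect to that address and a port number agreed in advance.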
No discussion of sockets is complete without mentioning Berkeley sockets. Berkeley sockets
is an industry-standard application programming interface (API) for creating and using
sockets. It was initially an API for the Unix operating system and was later adopted by
other operating systems for TCP/IP networking.
Berkeley sockets
Berkeley sockets are part of an application programming interface
(API) that specifies the data structures and function calls that interact
with the operating system's network subsystem. The name derives
from the origins of the API in release 4.2 of the Berkeley Software
Distribution (4.2BSD) of UNIX. Berkeley sockets act at the transport
layer: they help get the data where it is going, but have nothing to say
about the content of the data.
Berkeley sockets are part of an API, not a specific protocol; the API
defines how the programmer interacts with an idealized network.
The API is strongly associated with the TCP/IP protocol suite, for
which it was first designed.
The application must check the return value to determine how many bytes have been sent or
received, and it must resend any data not already processed. [7] When using blocking
sockets, special consideration should be given to accept(), as it may still block after
indicating readability if a client disconnects during the connection phase.
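Checking the return value of a send call and resending the unprocessed remainder can be sketched as follows. The helper name `send_all` is illustrative; Python's own `socket.sendall()` performs the same loop internally, so the hand-written version is shown only to make the logic visible.

```python
import socket

def send_all(sock, data):
    """Send every byte of data, retrying with whatever send() did not accept."""
    total = 0
    while total < len(data):
        sent = sock.send(data[total:])  # may accept fewer bytes than offered
        if sent == 0:
            raise ConnectionError("socket connection broken")
        total += sent
    return total

if __name__ == "__main__":
    left, right = socket.socketpair()  # two connected sockets in one process
    print(send_all(left, b"hello"), right.recv(5))
    left.close()
    right.close()
```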
On the other hand, a non-blocking socket returns whatever is in the receive buffer and
immediately continues. If not written correctly, programs using non-blocking sockets are
particularly susceptible to race conditions due to variances in network link speed.
A socket is typically set to blocking or non-blocking mode using the fcntl() or ioctl()
functions.
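In Python, the portable way to switch modes is the socket method `setblocking()`, which wraps the underlying system call. The sketch below assumes that interface; the helper name `try_recv` is made up. Note that when the receive buffer is empty, Python signals "nothing available" by raising `BlockingIOError` rather than returning an empty result.

```python
import socket

def try_recv(sock, max_bytes=1024):
    """Return buffered bytes from a non-blocking socket, or None if none are waiting."""
    try:
        return sock.recv(max_bytes)
    except BlockingIOError:
        return None  # nothing in the receive buffer; the call did not wait

if __name__ == "__main__":
    left, right = socket.socketpair()  # two connected sockets in one process
    right.setblocking(False)           # switch the receiving end to non-blocking mode
    print(try_recv(right))             # no data has been sent yet
    left.close()
    right.close()
```

A program built on `try_recv` has to decide what to do while no data is available, for example by waiting with `select` before retrying; this is where the race conditions mentioned above tend to arise.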