Genetic Algorithm Based Substitution Technique of Image Steganography
Genetic Algorithm Based Substitution Technique of Image Steganography
5, December 2010
Journal of Global Research in Computer Science
RESEARCH PAPER
Available Online at [Link]
Abstract Steganography is the act of hiding a message inside another message in such a way that can only be detected by its intended recipient.
Naturally, there are security agents who would like to fight these data hiding systems by steganalysis, i.e. discovering covered messages and
rendering them useless. There is currently no steganography system which can resist all steganalysis attacks. The most notable steganalysis
algorithm is the RS attack which detects the steg-message by the statistic analysis of pixel values. To ensure the security against the RS analysis,
we presents a new steganography based on Genetic Algorithm in this paper. In this paper, we present a novel approach to resolve the remained
problems of substitution technique of image steganography. Using the proposed Genetic Algorithm, message bits are embedded into different
bits of the pixel grey level values, resulting in increased robustness. The robustness would be increased against those attacks which try to
reveal the hidden message and also some unintentional attacks like noise addition as well.
message bits in different bit level of the pixel grey level Robustness measures the ability of embedded data or
values. The layers are selected in pseudo – random method watermark to withstand against intentional and unintentional
thereby making it more robust against steganolytic attack. The attacks. Unintentional attacks generally include common data
proposed Genetic approach minimises the effect of bit manipulations such as lossy compression, digital-to-analog
updation on image grey value thereby reducing the risk the conversion, re-sampling, re-quantization, etc. whereas
statistical stego attack. Moreover only the stego image is sent intentional attacks cover a broad range of media degradations
to the reciever end thereby reducing chances of suspicion. which include addition white and colored noise, rescaling,
rotation (for image and video steganography schemes),
REVIEW OF STEGANOGRAPHIC ALGORITHMS
resizing, cropping, random chopping, and filtering attacks .
Steganographic algorithms can be characterized by a number Also, the robustness of the algorithm is defined as an ability
of defining properties. Three of them, which are most of the data detector to extract the embedded message after
important for image steganographic algorithms, are defined common signal processing manipulations. Applications
below. usually require robustness in the presence of a predefined set
Transparency evaluates the image distortion due to signal of signal processing modifications, so that message can be
modifications like message embedding or attacking. In most reliably extracted at the detection side.
of the applications, the steganography algorithm has to insert Image Steganography
additional data without affecting the perceptual quality of the
Image steganography takes the advantage of limited power of
host image. The fidelity of the steganography algorithm is
human visual system (HVS). Here, unlike watermarks which
usually defined as a perceptual similarity between the original
embed added information in every part of an image, only the
and stego image. However, the quality of the stego image is
complex parts of the image holds added information [1-2].
usually degraded, either intentionally by an adversary or
Straight message insertion will simply encode every bit of
unintentionally in the transmission process, before a person
information in the image. More complex encoding can be
perceives it. In that case, it is more adequate to define the
done to embed the message only in "noisy'' areas of the image
fidelity of a steganography algorithm as a perceptual
that will attract less attention [3]. The least significant bit
similarity between the stego image and the original host image
(LSB) insertion method is probably the most well known
at the point at which they are presented to a consumer. In
image steganography technique. The main advantage of this
order to meet fidelity constraint of the embedded information,
method is that human eye is not able to notice the change;
the perceptual distortion introduced due to embedding should
however unfortunately, it is extremely vulnerable to attacks,
be below the masking threshold estimated based on the
such as image manipulation. Masking and filtering techniques
Human Visual System (HVS) and the host media.
hide information by marking an image in a manner similar to
Capacity of an information hiding scheme refers to the
paper watermarks. By covering a faint but perceptible signal
amount of information that a data hiding scheme can
with another to make the first non-perceptible, the fact that the
successfully embed without introducing perceptual distortion
HVS cannot detect slight changes in certain temporal domains
in the marked media. In the case of image, it evaluates the
of the image was exploited in [14]. Masking techniques are
amount of possible embedding information into the host
better choices for lossy JPEG images than LSB method
image. The embedding capacity is the all included embedding
because of their relative immunity to image operations such as
capacity (not the payload) and can be measured in percent
compression and cropping [2-7]. JPEG image format due to
(%), bits per image.
© JGRCS 2010, All Rights Reserved 63
Samir Kumar Bandyopadhyay et al, Journal of Global Research in Computer Science, 1(5),December 2010, 62-69
its good characteristics (having both reasonable quality and techniques is not comparable with the capacity of other more
small size) is the most common image format for web and robust techniques like spread spectrum technique or Discrete
local usages. JPEG uses discrete cosine transform (DCT) to Cosine Transformation (DCT) technique that are highly robust
transform successive 8×8 pixel blocks of the image into 64 but has a negligible embedding capacity.
DCT coefficients. Here, LSBs of the quantized DCT 2.3 Problems of Substitution Techniques of Image
coefficients are used as redundant bits. The modification of Steganography
even a single DCT coefficient affects all 64 image pixels. In Like all multimedia data hiding techniques, image
some image formats such as GIF, the visual structure of the steganography has to satisfy three basic requirements. They
image exists to some degree in all bit layers of the image. are perceptual transparency, capacity of hidden data and
Steganographic systems which modify these formats are robustness. Noticeably, the main problem of audio
mostly vulnerable to visual attacks [8, 9, 13]. However this is substitution steganography algorithm is considerably low
not true about the JPEG format. As the modifications happen robustness.
in the frequency domain rather than spatial domain, there is no There are two types of attacks to steganography and therefore
visual attack against it. there are two type of robustness. One type of attacks tries to
Recently, several steganographic techniques for data reveal the hidden message and another type tries to destroy
hiding in JPEGs have been developed: JSteg [10], JP the hidden message. Substitution techniques are vulnerable
Hide&Seek [10], F5 [11], and OutGuess [12]. All these against both types of attacks. The adversary who tries to
techniques manipulate the quantized DCT coefficients to reveal the hidden message must understand which bits are
embed the hidden message. modified. Since substitution techniques usually modify the
bits of lower layers in the samples -LSBs, it is easy to reveal
Substitution Techniques Based Image Steganography
the hidden message if the low transparency causes suspicious.
The subatitution based steganographic algorithms were
Also, these attacks can be categorized in another way:
primarily developed for digital images and video sequences.
Intentional attacks and unintentional attacks. Unintentional
In the past few years, several algorithms for the embedding
attacks like transition distortions could destroy the hidden
and extraction of message in images have been presented. All
message if is embedded in the bits of lower layers in the
of the developed algorithms take advantage of the perceptual
samples -LSBs.
properties of the HVS in order to add a message into a host
As a result, this paper briefly addresses following problems of
image in a perceptually transparent manner. Many attacks
substitution techniques of image steganography:
such as geometrical distortions, spatial scaling are malicious
1) Having low robustness against attacks which try to reveal
against image steganography algorithms.
the hidden message.
The theory of substitution technique is that simply replacing
2) Having low robustness against distortions with high
either a bit or a few bits in each sample will not be noticeable
average power.
to the human eye. This method has high embedding capacity
A. First Problem
but it is the least robust. It exploits the absolute threshold of
One type of robustness that is very critical for security is
vision but is susceptible to attacks.
withstanding against the attacks which try to reveal or extract
The obvious advantage of the substitution technique, the
the hidden message. This paper is to improve this type of
reason for choosing this technique, is a very high capacity for
robustness. With an intelligent algorithm we hope to reach a
hiding a message. Obviously, the capacity of substitution
more robust substitution technique, as such, extracting the Predictably, substitution techniques try to modify the bits of
hidden message become inaccessible to adversary. samples in accordance with a directive that is defined in
Certain way to withstand against these attacks is making more algorithm. The target bits are definite, and the amount of
difficult discovering which bits are modified. Thus, the resultant noise is not controlled. There may be some better
algorithm may not change some sample due to their situations. techniques that try to adjust the amount of resultant noise in
This selecting will improve the security of the method and substitution techniques. These improved algorithms alter other
robustness of the technique, because if somebody tries to bits else than target bit in sample to decrease the amount of
discover the embedded message, he has to apply a specific resultant noise. A key idea of the improved algorithm is
algorithm to read some bits of samples. But if modified message bit embedding that causes minimal embedding
samples are secret, nobody can discover the message. It is distortion of the host image.
remarkable that if we achieve float target bits, it will be novel. The basic idea of the proposed algorithm is embedding
As we know in samples LSBs are more suspicious, thus that cause minimal embedding distortion of the host image. In
embedding in the bits other than LSBs could be helpful to this approach the amount of resultant noise could be improved
increase the robustness. Furthermore, discovering which pixel since the total noise will be less, when we are able to alter and
samples are modified should be uncharted. To reach to the adjust more samples. This can achieve more transparency and
level of ambiguity, the algorithm will not use a predefined robustness.
procedure to modify the samples but will decide, according to
PROPOSED APPROACH
the environment, in this case the host file; as such it will
modify indistinct pixel samples of image files, depending on Accordingly, there are two following solutions for
their values and co – ordinate positions. Thus, some of the mentioned problems:
samples which algorithm determines they are suitable for 1) The solution for first problem: Making more difficult
modifying will modify and other samples may not change. discovering which bites are embedded by modifying the bits
This ambiguity in selecting pixel samples will thus increase else than LSBs in samples, and selecting the samples to
security and robustness of the proposed algorithm. modify privately-not all samples.
B. Second Problem 2) The solution for second problem: Embedding the message
A significant improvement in robustness against unintentional bits in deeper layers and other bits alteration to decrease the
attacks -for example signal processing manipulation- will be amount of the error.
obtained if an embedded message is able to resist distortions To integrate these two solutions, “embedding the message bits
with high average power. To achieve this robustness the in deeper layers” that is a part of second solution also can
message could embed in deeper layers. But, selecting the layer satisfy “modifying the bits else than LSBs in samples” of
and bits for hosting is critical because selecting higher layer second solution. In addition, when we try to satisfy “other bits
will introduce distortion in pixel grey value. Embedding the alteration to decrease the amount of the error” of second
message bits in deeper layers absolutely causes bigger error solution, if we ignore the samples which are not adjustable,
and it will decrease the quality of transparency. Thus, the also “selecting not all samples” of first solution will be
should modify other bits intelligently to decrease the amount Thus, intelligent algorithm will try to embed the message bits
of this error and reserve the transparency. in the deeper layers of samples and alter other bits to decrease
the error and if alteration is not possible for any samples it
will ignore them. It is clear that the main part of this scenario minimized. Store the new generated pixel
is bit alteration that it should be done by intelligent algorithms value in the corresponding position.
which use genetic algorithms. Step 7. Read the next pixel value of the Cover
The algorithm at sender end and reciever end is : Image (row major order) & next bit of the
Sender End generated bit stream. Loop Step 5 and 6
SENDER (Target Text Message, Cover Image) until the generated bit stream is exhausted.
This function will be used in the sender side to encrypt Step 8. Store the Target Text Message length in the
Input: This function will take Target Text Message and Cover Step 9. Store the Image as Stego-Image and send it
Step 5. Extract (pos+1)th bit(LSB is the 1st bit) of Fitness function : A chromosome’s fitness function is a
maximisation or minimisation function in which the optimal
the corresponding 8 bit binary value or a predefined cut – off value is attained through
representation of the last read pixel value several itteration and learning. Here the fitness fuction should
select the grey value with minimum deviation from the
and concatenate with ‘M’. original grey value of the host image i.e. with maximum
Step 6. Read the next pixel of the Cover Image (in transperancy.
The algorithm executes in four main steps as defined below :
row major order) .Loop Step 4 and 5 until
A. Alteration
S1 number of bits are extracted.
At the first step, message bits substitute with the target
Step 7. Regenerate the Target Text Message ‘msg’
bits of samples. Target bits are those bits which place at the
by taking 8 consecutive bits of ‘M’ at a time
layer that we want to alter. This is done by a simple
and obtaining the character with the ASCII
substitution that does not need adjustability of result be
code which is equivalent to the decimal
measured.
value of the 8 consecutive bits under
B. Modification
consideration.
In fact this step is the most important and essential part
Step 8. Display ‘msg’ as the decoded Target Text
of algorithm. All results and achievements that we expect are
Message.
depending on this step. Efficient and intelligent algorithms are
Step 9. Stop.
useful here. In this stage algorithm tries to decrease the
End
amount of error and improve the transparency. For doing this
Genetic Algorithm Approach stage, two different algorithms will be used.
Here we propose a new genetic algorithm approach to find the One of them that is more simple likes to ordinary
best position for data embedding and also optimize the quality techniques, but in aspect of perspicacity will be more efficient
of the steganographic image. to modify the bits of samples (pixel grey value) better. Since
Each substitution matrix S is represented as a chromosome G; transparency is simply the difference between original sample
G = g0 g1 . . . gN−1 , where N = 2k and gene gi means that
and modified sample, with a more intelligent algorithm, I will
the gray value i in C (host image) will be replaced by the gray
value gi . Note that there are N ! different chromosomes. In try to modify and adjust more bits and samples than some
the generic algorithm, some chromosomes are specified as previous algorithms. If we can decrease the difference of
forming an initial population of the first generation. Then, the
them, transparency will be improved. There are two example
population of the next generation is created by the following
operators, and sieved by a fitness function: of adjusting for expected intelligent algorithm below.
Sample bits are: 00101111 = 47
Reproduction: This randomly duplicates some chromosomes
Target layer is 5, and message bit is 1
for the next generation.
Without adjusting: 00111111 = 63 (difference is 16)
Crossover: This randomly combines the left-hand side of one
chromosome, with the right-hand side of another After adjusting: 00110000 = 48 (difference will be 1 for
chromosome, to form a new chromosome. The new 1 bit embedding)
chromosome must be modified by replacing the repetitive Sample bits are: 00100111 = 39
genes with other genes, so that all genes are different, within
Target layers are 4&5, and message bits are 11
each chromosome. In this operation the place of target bit
Without adjusting: 00111111 = 63 (difference is 24)
embedded is not changed [15 -16].
Mutation: This randomly chooses a chromosome and After adjusting: 00011111 = 31 (difference will be 8 for
exchanges any two genes to form a new chromosome. In this
2 bits embedding)
operation the place of target bit embedded is not changed.
© JGRCS 2010, All Rights Reserved 67
Samir Kumar Bandyopadhyay et al, Journal of Global Research in Computer Science, 1(5),December 2010, 62-69
Retrived Message : DELHI—COMMONWEALTH— [1] Provos N. and Honeyman, P., "Detecting Steganographic
GAMES—2010 Content on the Internet", Center for Information Technology
Integration, University of Michigan. Technical Report 01-11,
2001
COMPUTATIONAL COMPLEXITY ANALYSIS [2] Sellars D., "An Introduction to Steganography",
In the Sender end, if ‘n’ is the total no. of bits in the [Link]/courses/CS400W/NIS/papers99/
dsellars/[Link]
corresponding bit stream of the message, then for inserting [3] Johnson N. and Jajodia S., "Exploring steganography:
Seeing the unseen", Computer, 31, no 2:26-34, Feb. 1998.
each bit it takes time O(n).The intelligent genetic algorithm is [4] Fridrich J., Goljan M., and Hogea D., “Attacking the
OutGuess,” Proc. ACM Workshop Multimedia and
independent of ‘n’ but dependent only on the coordinate Security2002, ACM Press, 2002.
indexes of the Cover Image. So, for this genetic algorithm it is [5] Provos N., Honeyman P., "Hide and Seek: An
Introduction to Steganography", IEEE SECURITY &
also accessed for all the bits of the message, which takes time PRIVACY, MAY/JUNE 2003
[6] Westfeld A., and Pfitzmann A., "Attacks on
nearly n*e(e is time for single execution of the genetic Steganographic Systems". In: Pfitzmann A. (eds.): 3rd
International Workshop. Lecture Notes in Computer Science,
algorithm).As both are executed sequentially, it leads to a Vol.1768. Springer-Verlag, Berlin Heidelberg New York
(2000)
small linear complexity. [7] Westfeld, A. “Detecting Low Embedding Rates”. 5 th
In the Receiver end, the positions for the bits are calculated Information Hiding Workshop, Netherlands, Oct. 7 .9, 2002
[8] Pik-Wah C., "Digital Video Watermarking Techniques
from the same genetic function and then the message is for Secure Multimedia Creation and Delivery", A Thesis
Submitted in Partial Fulfillment of the Requirements for the
regenerated sequentially from the extracted bit string, leading Degree of Master of Philosophy in Computer Science and
Engineering, The Chinese University of Hong Kong, July,
to a small linear time complexity. 2004
[9] Huang C. and Wu J., "A watermark optimization
CONCLUSIONS technique based on genetic algorithms”, SPIE Electronic
Imaging 2000 San Jose, Jan.2000.
[10] Steganography software for Windows,
A new approach is proposed to resolve two problems of [Link]
[Link]
substitution technique of image steganography. First problem [11] Westfeld, A. "High Capacity Despite Better
is having low robustness against attacks which try to reveal Steganalysis (F5–A Steganographic Algorithm)".
Information Hiding. 4th International Workshop.
the hidden message and second one is having low robustness LectureNotes in Computer Science, Vol.2137. Springer-
Verlag,Berlin Heidelberg New York, 2001, pp. 289–302
against distortions with high average power. An intelligent [12] Provos N., "Defending Against Statistical Steganalysis",
Proc.10th USENIX Security Symposium, Washington, 2001
algorithm will try to embed the message bits in the deeper [13] Westfeld A., Pfitzmann A., "Attacks on Steganographic
Systems". In Proceedings of Information Hiding-Third
layers of samples and alter other bits to decrease the error and International [Link]- Verlag, September 1999.
if alteration is not possible for any samples it will ignore [14] Bassia P. and Pitas I., "Robust audio watermarking in
the time domain", Findings report, Dept. of Informatics,
them. Using the proposed genetic algorithm, message bits University of Thessaloniki 1998.
[15] J. H. Holland, "Adaptation in natural and artificial
could be embedded into multiple, vague and deeper layers to systems", Ann Arbor, MI University of Michigan Press
1975.
achieve higher capacity and robustness. [16] D. E. Goldberg, "The genetic algorithms in search,
optimization, and machine learning", New York
REFERENCES