VEER SURENDRA SAI UNIVERSITY OF TECHNOLOGY BURLA
Overview of Image Steganography
IMAGE STEGANOGRAPHY A PROJECT REPORT Submitted by
AJIT KUMAR SATAPATHY
in partial fulfillment for the award of the degree of
BACHELOR OF TECHNOLOGY IN
COMPUTER SCIENCE AND ENGINEERING
VEER SURENDRA SAI UNIVERSITY OF TECHNOLOGY
VEER SURENDRA SAI UNIVERSITY OF TECHNOLOGY : BURLA, ODISHA
OCTOBER 2011
VEER SURENDRA SAI UNIVERSITY OF TECHNOLOGY : BURLA
BONAFIDE CERTIFICATE
Certified that this project report
"STEGANOGRAPHY"
is the bonafide work of "AJIT KUMAR SATAPATHY", who carried out the project work under my supervision.
SIGNATURE
Prof. C.R. Tripathy
HEAD OF THE DEPARTMENT
COMPUTER SCIENCE AND ENGG., VSSUT, BURLA, ODISHA
SIGNATURE
Prof. R.K. Mohanty
SUPERVISOR
COMPUTER SCIENCE AND ENGG., VSSUT, BURLA, ODISHA
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
1. INTRODUCTION
1.1 Steganography vs cryptography
1.2 Types of steganography
1.2.1 Text steganography
1.2.1.1 Line-shift coding
1.2.1.2 Word-shift coding
1.2.1.3 Feature coding
1.2.1.4 Implementation
1.2.2 Image steganography
1.2.2.1 Least significant bits
1.2.2.2 Hiding the data
1.2.2.3 Recovering the data
1.2.2.4 Images detection
1.2.3 Audio steganography
1.2.3.1 LSB coding
1.2.3.2 Phase coding
1.2.4 Video steganography
1.2.5 Need and applications of steganography
2. OVERVIEW
3. Multi segment steganography techniques
4. Conclusion
ABSTRACT
In this paper I introduce steganography and the types of steganography mainly used in recent times. My main focus is image steganography. First I introduce some concepts behind the different types of steganography. Then I describe different techniques (classic as well as modern) of image steganography, and how steganography differs from cryptography. I also describe the complexity of the stego-key search and show the drawbacks of multi-segment steganography techniques that use a code table mechanism.
LIST OF FIGURES
FIG. NO.  TITLE
1  Overview of steganography process
2  Steganography type diagram
3  Text data
4  Input image
5  Output image
6  Output image by notepad
7  Covered data (1.jpg 36.1KB, stego.jpg 36.3KB, stegomessage 39.2KB)
1. INTRODUCTION
The word steganography is derived from the Greek words "stegos" meaning "cover" and "grafia" meaning "writing", defining it as "covered writing". Steganography is a security technique in which secret data is embedded in a cover. The notion of data hiding, or steganography, was first introduced by Simmons in 1983 with the example of the prisoners' secret message.
In ancient Greece, people wrote messages in invisible ink, or even on the messenger's body, and hid them under wax, which served as the stego medium. Even earlier than the Greek use, dated messages have been found embedded within the hieroglyphics of ancient Egyptian monuments. These methods used in earlier times are known as physical steganography. Nowadays we use different computer file formats to cover the message that the sender wants to hide. The medium used to carry the message is known as the cover medium. The medium after embedding or hiding the message is known as the stego medium. There are four basic cover media used nowadays for steganography purposes. The following formula (Equation 1.1) gives the idea behind the steganographic process:
Cover medium + Hidden data + Stego_key = Stego_medium
This process is shown in Figure 1.1. Assume three characters: Bob, Alice and Eve. Suppose Bob wants to send a secret message M to Alice. He can use one of the steganography techniques to hide the message. To increase complexity, he can also combine a cryptographic technique with steganography. Bob then sends the stego data file to Alice. When Alice receives the stego data file, she extracts the original message by applying the decoding technique. If an eavesdropper, Eve, receives the stego data file, she is unable to detect that a message is hidden in it. This is the main advantage of using steganography in communications: a third person cannot detect that the conversation is going on.
There are a few well-known text steganography methods. One is to insert the secret message into a webpage or inside a markup language; in this method the message is hidden using HTML tags. In line-shift and word-shift text steganography, the vertical space between lines and the horizontal space between words, respectively, is used to hide the message. The toughest text steganography technique is feature-specific encoding, in which the characteristics of the letters are used to hide the message. Feature-specific text steganography techniques are available for different languages, for example for Hindi characters.
Figure 1.1. Overview of steganography process
1.1 Steganography vs cryptography
Steganography and cryptography are closely related. Cryptography scrambles the message so that it cannot be understood. Steganography hides the message so that its very existence is concealed. When the two are combined and steganography fails, so that the message is detected, the message is still of no use to the attacker because it is encrypted using cryptographic techniques.
1.2 TYPES OF STEGANOGRAPHY
Four types of steganography techniques are used: text, image, audio and video steganography.
Figure 2. Steganography type diagram (branches: Text, Image, Audio, Video)
1.2.1 TEXT STEGANOGRAPHY
Text steganography can be achieved by altering the text formatting, or by altering certain characteristics of textual elements (e.g., characters). The goal in the design of coding methods is to develop alterations that are reliably decodable (even in the presence of noise) yet largely indiscernible to the reader. These criteria, reliable decoding and minimum visible change, are somewhat conflicting; herein lies the challenge in designing document marking techniques. The document format file is a computer file describing the document content and page layout (or formatting), using standard format description languages such as PostScript, TeX, troff, etc. It is from this format file that the image, that is, what the reader sees, is generated. The three coding techniques that we propose illustrate different approaches rather than form an exhaustive list, and they can be used either separately or jointly. Each technique enjoys certain advantages or applicability, as we discuss below.
1.2.1.1 Line-Shift Coding
This is a method of altering a document by vertically shifting the locations of text lines to encode the document uniquely. This encoding may be applied either to the format file or to the bitmap of a page image. The embedded codeword may be extracted from the format file or the bitmap. In certain cases this decoding can be accomplished without need of the original image, since the original is known to have uniform line spacing between adjacent lines within a paragraph.
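The line-shift idea can be illustrated with a minimal sketch. It is not the implementation used in this report: the function names, the choice of odd-numbered lines as data carriers, and the one-unit shift are all assumptions made for illustration. Lines are represented only by their vertical baseline positions, and decoding relies on the original spacing being uniform, so the original document is not needed.

```python
def encode_line_shift(baselines, bits, shift=1):
    """Shift every second line down (bit 1) or up (bit 0) by `shift` units.

    `baselines` holds the vertical position of each text line (hypothetical
    representation); even-indexed lines are left untouched as references.
    """
    if len(baselines) < 2 * len(bits) + 1:
        raise ValueError("not enough text lines to carry the codeword")
    marked = list(baselines)
    for k, bit in enumerate(bits):
        i = 2 * k + 1  # odd lines carry data
        marked[i] += shift if bit else -shift
    return marked


def decode_line_shift(baselines, nbits):
    """Recover bits by comparing the gap above and below each data line."""
    bits = []
    for k in range(nbits):
        i = 2 * k + 1
        gap_above = baselines[i] - baselines[i - 1]
        gap_below = baselines[i + 1] - baselines[i]
        bits.append(1 if gap_above > gap_below else 0)
    return bits


baselines = [0, 12, 24, 36, 48, 60, 72]  # uniform 12-unit line spacing
marked = encode_line_shift(baselines, [1, 0, 1])
print(marked)                        # [0, 13, 24, 35, 48, 61, 72]
print(decode_line_shift(marked, 3))  # [1, 0, 1]
```

Keeping the even-numbered lines fixed is what lets the decoder work from the marked page alone: a data line that sits closer to the line below than to the line above must have been moved down.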
1.2.1.2 Word-Shift Coding
This is a method of altering a document by horizontally shifting the locations of words within text lines to encode the document uniquely. This encoding can be applied either to the format file or to the bitmap of a page image. Decoding may be performed from the format file or the bitmap. The method is applicable only to documents with variable spacing between adjacent words. Variable spacing in text documents is commonly used to distribute white space when justifying text. Because of this variable spacing, decoding requires the original image or, more specifically, the spacing between words in the un-encoded document.
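A comparable sketch for word-shift coding is given below, again under assumed names and an assumed one-unit shift. Words are represented by their horizontal start positions within one line. Unlike the line-shift case, the decoder here takes the original positions as an argument, which reflects the point made above that variable word spacing makes the un-encoded document necessary.

```python
def encode_word_shift(word_starts, bits, shift=1):
    """Move word k+1 right (bit 1) or left (bit 0); the first word stays fixed."""
    if len(word_starts) < len(bits) + 1:
        raise ValueError("not enough words to carry the codeword")
    marked = list(word_starts)
    for k, bit in enumerate(bits, start=1):
        marked[k] += shift if bit else -shift
    return marked


def decode_word_shift(marked, original, nbits):
    """Compare each data word against its position in the original line."""
    return [1 if marked[k] > original[k] else 0 for k in range(1, nbits + 1)]


original = [0, 31, 58, 97, 120]  # irregular spacing, as in justified text
marked = encode_word_shift(original, [1, 1, 0])
print(marked)                                  # [0, 32, 59, 96, 120]
print(decode_word_shift(marked, original, 3))  # [1, 1, 0]
```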
1.2.1.3 Feature Coding
This is a coding method that is applied either to a format file or to a bitmap image of a document. The image is examined for chosen text features, and those features are altered, or not altered, depending on the codeword. Decoding requires the original image or, more specifically, a specification of the change in pixels at a feature. There are many possible choices of text features; here, we choose to alter upward, vertical endlines, that is, the tops of letters such as b, d, h, etc. These endlines are altered by extending or shortening their lengths by one (or more) pixels, but otherwise not changing the endline feature.
1.2.1.4 Implementation
In the midway of this our mortal life,
I found me in a gloomy wood, astray
Gone from the path direct: and e'en to tell
It were no easy task, how savage wild
That forest, how robust and rough its growth,
Which to remember only, my dismay
Renews, in bitterness not far from death.
Yet to discourse of what there good befell,
All else will I relate discover'd there.
How first I enter'd it I scarce can say

In the midway of this our mortal life,
I found me in a gloomy wood, astray
Gone from the path direct: and e'en to tell
It were no easy task, how savage wild
That forest, how robust and rough its growth,
Which to remember only, my dismay
Renews, in bitterness not far from death.
Yet to discourse of what there good befell,
All else will I relate discover'd there.
How first I enter'd it I scarce can say

06081913030629170827
meet at dawn
1.2.2 IMAGE STEGANOGRAPHY
Hiding information inside images is a popular technique nowadays. An image with a secret message inside can easily be spread over the World Wide Web or in newsgroups. The use of steganography in newsgroups has been researched by German steganographic expert Niels Provos, who created a scanning cluster which detects the presence of hidden messages inside images that were posted on the net. However, after checking one million images, no hidden messages were found, so the practical use of steganography still seems to be limited. To hide a message inside an image without changing its visible properties, the cover source can be altered in "noisy" areas with many color variations, so that less attention is drawn to the modifications. The most common methods of making these alterations involve the least significant bit (LSB), masking, filtering and transformations on the cover image. These techniques can be used with varying degrees of success on different types of image files.
1.2.2.1 Least Significant Bits
Many stego tools make use of the least significant bit (LSB). For example, 11111111 is an 8-bit binary number. The rightmost bit is called the LSB because changing it has the least effect on the value of the number. The idea is that the LSB of every byte can be replaced with little change to the overall file. The binary data of the secret message is broken up and then inserted into the LSB of each pixel in the image file.
1.2.2.2 Hiding the data
Using the Red, Green, Blue (RGB) model, a stego tool makes a copy of the image palette of, say, an 8-bit image. The copy is rearranged so that colors near each other in the RGB model are near each other in the palette. The LSB of each pixel's 8-bit binary number is replaced with one bit from the hidden message. The new RGB color is found in the copied palette. The 8-bit binary number of the new RGB color is found in the original palette. The pixel is changed to the 8-bit binary number of the new RGB color.
1.2.2.3 Recovering the data
The stego tool finds the 8-bit binary number of each pixel's RGB color. The LSB of each pixel's 8-bit binary number is one bit of the hidden data file. Each LSB is then written to an output file.
A simplified example with an 8-bit image:
Pixels:      (00     01     10     11)
              white  red    green  blue
Insert 0011: (00     00     11     11)
              white  white  blue   blue
As can be seen from the example, with an 8-bit image the cover image must be carefully selected, since LSB manipulation is not as forgiving because of the color limitations. When information is hidden in the LSB of each byte of a 24-bit image, it is possible to store 3 bits in each pixel.
A simplified example with a 24-bit image:
1 pixel:    (00100111 11101001 11001000)
Insert 101: (00100111 11101000 11001001)
             red      green    blue
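The 24-bit example above can be reproduced with a short sketch. The function names are illustrative assumptions, and the cover is treated as a plain sequence of color bytes rather than a real image file.

```python
def embed_lsb(cover, bits):
    """Replace the least significant bit of the first len(bits) bytes."""
    if len(bits) > len(cover):
        raise ValueError("message does not fit in the cover")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return bytes(stego)


def extract_lsb(stego, nbits):
    """Read the hidden bits back from the first nbits bytes."""
    return [byte & 1 for byte in stego[:nbits]]


pixel = bytes([0b00100111, 0b11101001, 0b11001000])  # red, green, blue
stego = embed_lsb(pixel, [1, 0, 1])
print([format(b, "08b") for b in stego])  # ['00100111', '11101000', '11001001']
print(extract_lsb(stego, 3))              # [1, 0, 1]
```

Each color byte changes by at most 1, which is why the modification is hard to see in a 24-bit image.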
LSB insertion works well with gray-scale images as well. It is possible to hide data in the least and second least significant bits and the human eye would still not be able to discern it. Unfortunately, LSB insertion is vulnerable to slight image manipulation such as cropping and compression. For example, converting a GIF or a BMP image, which reconstructs the original message exactly (lossless compression), to JPEG format, which does not (lossy compression), and then converting back can destroy the data in the LSBs. If the number of least significant bits used increases, the hiding capacity increases but the image degrades.
1.2.2.4 Images detection
• Examine the color palette
• Size of the image
• Differences: format, last modified date
LSB makes use of BMP images, since they use lossless compression. Unfortunately, to be able to hide a secret message inside a BMP file, one would require a very large cover image. Nowadays, BMP images of 800 × 600 pixels are not often used on the Internet and might arouse suspicion. For this reason, LSB steganography has also been developed for use with other image file formats.
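The trade-off between capacity and image quality can be made concrete with a small calculation. This is a sketch under simple assumptions: an uncompressed image with three color bytes per pixel, every byte used for hiding, and no header or length field subtracted. The function names are illustrative.

```python
def lsb_capacity_bytes(width, height, channels=3, bits_per_channel=1):
    """Payload size in bytes when bits_per_channel LSBs of every color byte are used."""
    return width * height * channels * bits_per_channel // 8


def max_channel_change(bits_per_channel):
    """Largest possible change to one color byte when that many LSBs are overwritten."""
    return 2 ** bits_per_channel - 1


print(lsb_capacity_bytes(800, 600))                      # 180000
print(lsb_capacity_bytes(800, 600, bits_per_channel=2))  # 360000
print(max_channel_change(1), max_channel_change(2))      # 1 3
```

Under these assumptions an 800 × 600 image carries about 180 KB with one bit per color byte. Using two bits doubles the payload, but each color byte can then move by up to 3 instead of 1.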
5. Text inside an image
In this technique we hide secret text data inside a JPEG image.
Figure 3. Text data
Figure 4. Input image
Figure 5. Output image
The input image and the output image look the same, but if you open the output image in Notepad you can see the secret data inside the image.
Figure 6. Output image opened in Notepad
Figure 7. Covered data (1.jpg 36.1KB, stego.jpg 36.3KB, stegomessage 39.2KB)
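The report does not state how the text was placed inside the JPEG file. One simple way to obtain the behaviour described above (an image that looks unchanged, with the text readable when the file is opened in a text editor) is to append the text after the image data, since image viewers typically stop reading at the JPEG end-of-image marker. The sketch below assumes that approach; the delimiter `MARKER` and the function names are hypothetical, and a short byte string stands in for the contents of a real JPEG file.

```python
MARKER = b"--STEGO--"  # hypothetical delimiter, not part of the JPEG format


def hide_after_image(cover_bytes, secret_text):
    """Append the delimiter and the secret text after the image data."""
    return cover_bytes + MARKER + secret_text.encode("utf-8")


def recover_text(stego_bytes):
    """Return whatever follows the last occurrence of the delimiter."""
    _, found, tail = stego_bytes.rpartition(MARKER)
    if not found:
        raise ValueError("no hidden text found")
    return tail.decode("utf-8")


# Stand-in for JPEG file contents: start-of-image marker, data, end-of-image marker.
cover = b"\xff\xd8" + b"\x00" * 16 + b"\xff\xd9"
stego = hide_after_image(cover, "meet at dawn")
print(len(stego) - len(cover))  # 21
print(recover_text(stego))      # meet at dawn
```

The file grows by exactly the size of the appended data, which is in line with the small size difference between 1.jpg and stego.jpg in Figure 7. The weakness is equally clear: anyone who opens the file in a text editor sees the message.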
1.2.3 AUDIO STEGANOGRAPHY
In audio steganography, the secret message is embedded into a digitized audio signal, which results in a slight alteration of the binary sequence of the corresponding audio file. Several methods are available for audio steganography. We give a brief introduction to some of them.
1.2.3.1 LSB Coding
Sampling followed by quantization converts an analog audio signal into a digital binary sequence. In this technique the LSB of the binary sequence of each sample of the digitized audio file is replaced with the binary equivalent of the secret message.
1.2.3.2 Phase Coding
The Human Auditory System (HAS) cannot recognize a phase change in an audio signal as easily as it can recognize noise in the signal. The phase coding method exploits this fact. This technique encodes the secret message bits as phase shifts in the phase spectrum of a digital signal, achieving an inaudible encoding in terms of signal-to-noise ratio.
1.2.4 VIDEO STEGANOGRAPHY
When information is hidden inside video, the program or person hiding the information will usually use the DCT (Discrete Cosine Transform) method. DCT works by slightly changing each of the images in the video, but only so much that the change is not noticeable to the human eye. To be more precise about how DCT works, DCT alters the values of certain parts of the images, usually by rounding them. For example, if part of an image has a value of 6.667, it will round it up to 7. Steganography in videos is similar to steganography in images, except that information is hidden in each frame of the video. When only a small amount of information is hidden inside a video it generally is not noticeable at all; however, the more information that is hidden, the more noticeable it becomes.
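The remark about rounding in the DCT domain can be illustrated with a one-dimensional sketch in pure Python. It is only an illustration of why rounded coefficients change the samples very little; real video codecs work on two-dimensional blocks with quantization tables, and nothing here hides a message. The sample values and function names are assumptions.

```python
import math


def dct(samples):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(samples[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(scale * total)
    return out


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


samples = [52, 55, 61, 66, 70, 61, 64, 73]  # eight brightness values from one row of a frame
rounded = [round(c) for c in dct(samples)]  # e.g. a coefficient of 6.667 would become 7
restored = idct(rounded)
max_error = max(abs(a - b) for a, b in zip(samples, restored))
print(max_error < 1.5)  # True: rounding every coefficient barely changes the samples
```

Because the transform is orthonormal, rounding each of the eight coefficients by at most 0.5 can move the samples by at most 0.5 × √8 ≈ 1.41 in total. This is the room that DCT-based hiding methods work within.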
1.2.5 Need and applications of steganography
There has been a rapid growth of interest in this subject over the last two years, and for two main reasons. Firstly, the publishing and broadcasting industries have become interested in techniques for hiding encrypted copyright marks and serial numbers in digital images, audio recordings, books and multimedia products; an appreciation of new market opportunities created by digital distribution is coupled with a fear that digital works could be too easy to copy. Secondly, moves by various governments to restrict the availability of encryption services have motivated people to study methods by which private messages can be embedded in seemingly innocuous cover messages. There are a number of other applications driving interest in the subject of information hiding:
• Military and intelligence agencies require unobtrusive communications. Even if the content is encrypted, the detection of a signal on a modern battlefield may lead rapidly to an attack on the signaler. For this reason, military communications use techniques such as spread spectrum modulation or meteor scatter transmission to make signals hard for the enemy to detect or jam.
• Criminals also place great value on unobtrusive communications. Their preferred technologies include prepaid mobile phones, mobile phones which have been modified to change their identity frequently, and hacked corporate switchboards through which calls can be rerouted.
• Law enforcement and counter intelligence agencies are interested in understanding these technologies and their weaknesses, so as to detect and trace hidden messages.
• Recent attempts by some governments to limit online free speech and the civilian use of cryptography have spurred people concerned about liberties to develop techniques for anonymous communications on the net, including anonymous remailers and Web proxies.
• Schemes for digital elections and digital cash make use of anonymous communication techniques.
• Marketers use email forgery techniques to send out huge numbers of unsolicited messages while avoiding responses from angry users.
Other applications for steganography include the automatic monitoring of radio advertisements, where it would be convenient to have an automated system to verify that adverts are played as contracted; indexing of video mail, where we may want to embed comments in the content; and medical safety, where current image formats such as DICOM separate image data from the text (such as the patient's name, date and physician),
with the result that the link between image and patient occasionally gets mangled by protocol converters.
2. OVERVIEW
Steganography comes from the Greek words Steganós (Covered) and Graptos (Writing). The origin of steganography is biological and physiological. The term "steganography" came into use in the 1500s after the appearance of Trithemius' book on the subject, "Steganographia". A short overview of this field can be divided into three parts: Past, Present and Future.
Past
The word "Steganography" technically means "covered or hidden writing". Its ancient origins can be traced back to 440 BC. Although the term steganography was only coined at the end of the 15th century, the use of steganography dates back several millennia. In ancient times, messages were hidden on the back of wax writing tables, written on the stomachs of rabbits, or tattooed on the scalps of slaves. Invisible ink has been in use for centuries, for fun by children and students and for serious espionage by spies and terrorists. Cryptography became very commonplace in the Middle Ages. Secret writing was employed by the Catholic Church in its various struggles down the ages and by the major governments of the time. Steganography was normally used in conjunction with cryptography to further hide secret information.
Present
The majority of today's steganographic systems use multimedia objects such as image, audio and video as cover media, because people often transmit digital pictures over email and other Internet communication. In the modern approach, depending on the nature of the cover object, steganography can be divided into five types:
• Text Steganography
• Image Steganography
• Audio Steganography
• Video Steganography
• Protocol Steganography
So, in the modern age, many steganographic techniques have been designed which work with the above objects.
In today's security practice, we sometimes come across cases in which a combination of cryptography and steganography is used to achieve data privacy over secrecy. Various software tools are also available.
Future
In today's world, we often hear the popular term "Hacking". Hacking is nothing but unauthorized access to data, which can be collected at the time of data transmission. With respect to steganography this problem is often referred to as steganalysis. Steganalysis is a process in which a steganalyzer cracks the cover object to get the hidden data. So, whatever technique is developed in future, the degree of security it provides has to be kept in mind. It is hoped that Dual Steganography,
Steganography along with Cryptography may be some of the future solutions to the above-mentioned problem.
3. Multi segment steganography techniques
Proposed by Fayik Alnawok and Basem Ahmed, Faculty of Applied Science, Al Aqsa University, Palestine
Code Table
The code table is built from random numbers in the range 0 to 221. The number 221 comes from the fact that we have 37 characters, each of which has 6 random codes according to the techniques above, so that each character is assigned six random numbers. Now let the sender and receiver share the password "1234QTR" and the code table, as shown in Table 1.

Table 1. Code table. Each row lists one code word per character, in the character order given in the first row.
Character:   Q W E R T Y U I O P A S D F G H J K L Z X C V B N M Space 0 1 2 3 4 5 6 7 8 9
Code word 0: 117 168 174 80 65 182 53 177 158 101 203 5 78 34 144 73 42 109 71 140 48 154 195 105 211 47 162 114 142 69 170 217 146 133 147 74 163
Code word 1: 128 180 82 115 137 218 23 62 72 200 95 120 59 104 167 28 100 37 52 90 209 39 81 165 14 88 79 213 187 132 43 161 148 7 159 55 85
Code word 2: 63 156 212 169 143 201 220 84 139 57 21 150 54 56 131 0 33 98 106 25 76 93 117 135 176 199 96 97 99 116 40 175 196 107 122 185 1
Code word 3: 66 10 192 11 58 50 149 210 45 173 123 111 13 138 4 118 155 60 75 204 121 94 49 172 26 164 215 110 206 198 197 125 134 214 189 126 221
Code word 4: 171 91 12 130 61 153 127 89 41 83 202 113 86 119 46 145 205 166 9 77 188 112 129 179 38 178 87 27 186 24 181 207 20 8 70 193 67
Code word 5: 3 190 209 103 183 216 22 35 17 64 184 102 108 208 16 18 19 6 191 32 160 151 194 44 124 92 30 36 157 51 2 68 136 15 152 29

Message Hiding
In this stage the sender wants to send the message "MESSAGE". First he should encode the message into the corresponding code for each character by choosing a random code word from that character's list of code words. Let the random numbers be 5, 0, 4, 3, 2, 1, 3. The message then becomes 92, 174, 113, 111, 21, 167, 192, and the coding of the secret password (the first code word of each character of the password) is 142, 69, 170, 217, 117, 65, 80. The next step is to search the image, starting from byte 142*10, i.e. from byte 1420, for the value 92. Suppose we find it at byte 1500. We then change the value of bytes 1497 and 1503 to 92, or 3/4*92, or 4/3*92; the factor is chosen so that the new value is as close as possible to the original value. Next we search from byte 690 for the value 174. Suppose we find it at byte 800; we now change the value of bytes 797 and 803 to 174, and so on for the remaining characters of the message we are going to hide, according to the following pseudo code.
1. Start message hiding.
2. Code the message by using a random code word corresponding to each character of the message.
3. Code the password with the first code word from the list of code words corresponding to each character.
4. For i = 1 to length(message), with c the code word of the ith character of the message:
   a. For j = (ith password code * 10) to size of the image: if byte(j) == c, then stop the loop.
   b. If |byte(j-3) - 3/4 * c| < |byte(j-3) - c| and |byte(j-3) - 3/4 * c| < |byte(j-3) - 4/3 * c|, then byte(j-3) = 3/4 * c.
   c. Else if |byte(j-3) - 4/3 * c| < |byte(j-3) - c| and |byte(j-3) - 4/3 * c| < |byte(j-3) - 3/4 * c|, then byte(j-3) = 4/3 * c.
   d. Else byte(j-3) = c.
   e. Do steps b, c and d for byte(j+3).
5. If i equals length(password), then i = i mod length(password).
6. End.
My question on this algorithm is that there is no use of a specific position, that is, marking a specific byte at a specific distance to find the specific position.
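The coding steps (steps 2 and 3) can be checked against the worked example with a short sketch. Only the code table columns needed for "MESSAGE" and the first code words needed for the password "1234QTR" are copied from Table 1; the dictionary layout and function names are assumptions made for illustration, and the byte search and marking steps are not implemented here.

```python
# Columns of Table 1 for the characters of "MESSAGE" (six code words each).
CODE_TABLE = {
    "M": [47, 88, 199, 164, 178, 92],
    "E": [174, 82, 212, 192, 12, 209],
    "S": [5, 120, 150, 111, 113, 102],
    "A": [203, 95, 21, 123, 202, 184],
    "G": [144, 167, 131, 4, 46, 16],
}

# First code word of each password character, also taken from Table 1.
FIRST_CODE = {"1": 142, "2": 69, "3": 170, "4": 217, "Q": 117, "T": 65, "R": 80}


def encode_message(message, picks):
    """Step 2: pick one of the six code words for every message character."""
    return [CODE_TABLE[ch][k] for ch, k in zip(message, picks)]


def encode_password(password):
    """Step 3: use the first code word of every password character."""
    return [FIRST_CODE[ch] for ch in password]


def search_starts(password):
    """Byte offsets at which the search for each message code begins."""
    return [code * 10 for code in encode_password(password)]


print(encode_message("MESSAGE", [5, 0, 4, 3, 2, 1, 3]))  # [92, 174, 113, 111, 21, 167, 192]
print(encode_password("1234QTR"))                        # [142, 69, 170, 217, 117, 65, 80]
print(search_starts("1234QTR")[:2])                      # [1420, 690]
```

In practice the picks would be drawn at random for each character; they are fixed here so that the output matches the example in the text.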
4. CONCLUSION
Although only some of the main steganographic techniques were discussed in this paper, one can see that there exists a large selection of approaches to hiding information in images. All the major image file formats have different methods of hiding messages, each with its own strong and weak points. Where one technique lacks payload capacity, another lacks robustness. For example, the patchwork approach has a very high level of robustness against most types of attack, but can hide only a very small amount of information. Least significant bit (LSB) insertion in both BMP and GIF makes up for this, but both approaches result in suspicious files that increase the probability of detection in the presence of a warden.
REFERENCES
[1] Moerland, T., "Steganography and Steganalysis", Leiden Institute of Advanced Computing Science, www.liacs.nl/home/tmoerl/privtech.pdf
[2] Silman, J., "Steganography and Steganalysis: An Overview", SANS Institute, 2001
[3] Jamil, T., "Steganography: The art of hiding information is plain sight", IEEE Potentials, 18:01, 1999
[4] Wang, H. & Wang, S., "Cyber warfare: Steganography vs. Steganalysis", Communications of the ACM, 47:10, October 2004
[5] Anderson, R.J. & Petitcolas, F.A.P., "On the limits of steganography", IEEE Journal of Selected Areas in Communications, May 1998
[6] Marvel, L.M., Boncelet Jr., C.G. & Retter, C., "Spread Spectrum Steganography", IEEE Transactions on Image Processing, 8:08, 1999
[7] Dunbar, B., "Steganographic techniques and their use in an Open-Systems environment", SANS Institute, January 2002
[8] Artz, D., "Digital Steganography: Hiding Data within Data", IEEE Internet Computing Journal, June 2001
[9] Simmons, G., "The prisoners problem and the subliminal channel", CRYPTO, 1983
[10] Chandramouli, R., Kharrazi, M. & Memon, N., "Image steganography and steganalysis: Concepts and Practice", Proceedings of the 2nd International Workshop on Digital Watermarking, October 2003
[11] Currie, D.L. & Irvine, C.E., "Surmounting the effects of lossy compression on Steganography", 19th National Information Systems Security Conference, 1996
[12] Handel, T. & Sandford, M., "Hiding data in the OSI network model", Proceedings of the 1st International Workshop on Information Hiding, June 1996
[13] Ahsan, K. & Kundur, D., "Practical Data hiding in TCP/IP", Proceedings of the Workshop on Multimedia Security at ACM Multimedia, 2002
[14] Johnson, N.F. & Jajodia, S., "Exploring Steganography: Seeing the Unseen", Computer Journal, February 1998
[15] "Reference guide: Graphics Technical Options and Decisions", http://www.devx.com/projectcool/Article/19997