8 Fundamentals Stenganography Dalal2021
https://doi.org/10.1007/s10462-021-09968-0
Abstract
In the last few decades, information security has gained huge importance owing to the massive
growth in digital communication, driving steganography to the forefront of
secure communication. Steganography is the practice of concealing information or a message
for covert communication, which involves hiding the information in a multimedia file such
as text, image, or video. Many contributions have been made in the domain of image
steganography; however, due to the low embedding capacity and robustness of images, videos
are gaining more attention from academic researchers. This paper aims to provide a qualitative
as well as quantitative analysis of various video steganography techniques by highlighting
their properties, challenges, pros, and cons. Moreover, different quality metrics for
the evaluation of distinct steganography techniques are discussed. The paper
also provides an overview of steganalysis attacks, which are commonly employed to test
the security of steganography techniques. Some of the prominent techniques are also
analyzed experimentally using different quality metrics. The paper further
presents a critical analysis derived from the literature and the experimental results. The
primary objective of this paper is to help beginners understand the basic concepts of
this research domain and initiate their research in this field. Further, the paper highlights
the real-life applications of video steganography and suggests some future directions
which require the attention of the research community.
1 Introduction
* Mamta Juneja
Mukesh Dalal
UIET, Panjab University, Chandigarh, India
M. Dalal, M. Juneja
confidential or non-confidential. Hence, there is a need to make this information safe and
secure. To make communication secure on the internet, the first thing one should consider
is cryptography, which is used for secret/hidden writing. Cryptography techniques scramble
the message to be hidden in such a way that it becomes meaningless to a third party. However,
cryptography has one downside: an encrypted message quickly draws attention to the fact
that something is secretly hidden. To overcome this drawback, another form of secret
communication is presently used in digital media, namely steganography. Steganography takes
cryptography one step further: the secret message is hidden in such a manner that no one
suspects its existence. Steganography hides the secret message such that only the sender and
the intended recipient are aware of it. Although both cryptography and steganography are
used for secret communication, some differences exist between them, as stated in Table 1.
Though both steganography and cryptography try to hide information, a single technique alone
is not perfect for secure communication. Sometimes both techniques are needed to make
communication more secure (Sadek et al. 2015).
Another field called digital watermarking is also used for InfoSec, where the information
is hidden in a digital signal known as a carrier. Watermarking is closely related to
steganography as both of them hide information in noisy channels (Sadek et al. 2015). In
watermarking, the existence of hidden information is known to the parties, and sometimes
they may try to destroy it. A digital watermark is either robust against transformations or
fragile, in which case it is used for tamper detection (Luo et al. 2016). In watermarking,
the embedded information is commonly about the object itself rather than secret information.
However, the fundamental difference between them is that watermarking can be
visible or invisible but steganography is always invisible; many other differences
exist as well, as highlighted in Table 2.
This paper begins by highlighting different steganography events that were reported in
the literature over the last few decades. The primary aim of the paper is to provide a survey
of different video steganography techniques presented in the literature by classifying them
based on their working principle. Although survey papers on video steganography exist
(Sadek et al. 2015; Mstafa and Elleithy 2017), they do not cover the contributions
made in the last few years and lack experimental analysis. Additionally, this paper
performs an analysis of some of the existing video steganography techniques on a common
video dataset. Moreover, the paper highlights some of the video steganalysis attacks and
different performance evaluation metrics for video steganography.
The remainder of the paper is organized as follows. In Sect. 2, the origin of steganography
is given along with developments and news related to steganography. An overview of steganography
is given in Sect. 3, Sect. 4 gives a brief discussion of video formats, and Sect. 5
A survey on information hiding using video steganography
presents the literature related to the video steganography techniques with the pros and cons
of each technique. Section 6 contains a brief introduction to steganalysis with recent
literature and attacks on steganography. Quality metrics are introduced in Sect. 7. Experimental
results and discussion are given in Sect. 8, with a critical analysis in Sect. 9.
Moreover, real-life applications of video steganography are discussed in Sect. 10.
Future directions are suggested in Sect. 11, and the conclusion
is drawn in Sect. 12.
2 Origin of Steganography
Hiding data in plain sight is known as steganography, the art of hidden and
covert communication. The term is derived from the Greek words steganos and graphia, which
mean ‘covered’ and ‘writing’ respectively (Krenn 2004). The earliest example of
steganography dates back to the fifth century B.C., when a Greek prisoner wanted to send some secret
information to his son-in-law, urging him to revolt. For that secret message, a slave’s
head was shaved, and the message was tattooed on the scalp of the slave. He was
dispatched to deliver the message only when his hair had grown long enough (Steganography
2004). There are many other examples in history; in ancient Greece, for instance, people used
wax to cover a secret message carved on wood. Further, during World War II, the French
Resistance sent messages written with invisible ink on the backs of couriers (Easttom
2017). Germans used tiny photo reductions called microdots in World War II (Steganography
2004).
The term steganography was first used in a book called Steganographia in 1499 by
Johannes Trithemius (Mollin 2000). In the 1960s, the Advanced
Research Projects Agency Network (ARPANET) project, a forerunner of the internet, made
networked storage and processing of data possible. After that, in the 1980s, the
increased use of PCs (Personal Computers) created a massive demand for InfoSec,
and in the 1990s operating systems and Graphical User Interfaces (GUIs) drew hackers
from all corners of the world (Amirtharajan and Rayappan 2013). Although a number of
steganography techniques have been used from ancient times to the present, much advancement is
still needed in InfoSec with respect to secret communication. Steganography has been
reborn with the development of the digital world. Nowadays, digital
files are used for hiding data, where it can be hidden easily, with many options and
with minimal or no detection (Steganography 2004). Digital steganography is the field of
steganography in which a program is used to hide secret data or a file within a carrier file.
After that, the carrier file is either sent to the recipient or posted on a site for downloading (Raggo
and Hosmer 2012); the details of this process are covered in Sect. 3. The fundamental
goal of steganography is imperceptibility, which means that no algorithm should be able to
detect that a secret message is present.
Digital steganography came into the limelight in 1994, when Andy Brown developed
S-Tools for Windows. S-Tools is described as one of the most versatile steganography tools; due
to the lack of resources at that time, it was tested only on images (S-tools
2016). In 1997, researchers started using network steganography for secret communication, where
the secret data is hidden in the TCP/IP (Transmission Control Protocol/Internet Protocol)
layer header. In 1999, researchers found that DNA (Deoxyribonucleic Acid) can be utilized
for steganography. DNA is a coding medium, and it contains four nucleotide bases that can
be used to encode binary data (Shimanovsky et al. 2002). Advances in technology
help society to improve and grow fast, but they have the downside
of being open to misuse by anyone. The chronological order of a few events where steganography was
employed is shown in Table 3 with its advancements.
With this advancement, the defense nowadays has to fight not only against terrorists
armed with AK-47s and RDX (Research Department Explosive) but also against computer-literate
users of advanced technology. Much evidence has been found that terrorists are
using steganography for malicious purposes, as shown in Table 3. United States (U.S.)
officials and experts made statements regarding the use of the internet and the latest
communication methods by Osama Bin Laden and his associates, who reportedly hid maps and
target photographs and posted instructions for terrorists on social websites (Kelley 2001).
Later, the FBI claimed that a Russian spy apprehended by the U.S. DOJ (Department of Justice)
in 2010 used steganography for communication. In 2011, according to the German newspaper
Die Zeit, the German police arrested an Al-Qaeda member carrying a memory card, and
after an investigation, computer forensics experts found that the memory card held
a stego video. They discovered separate text files in that video containing
future plans and operations; three of the files were entitled “Lessons Learned,”
“Report on Operations,” and “Future Work” (Gallagher 2012). All these examples show that
intelligence agencies need to be more active with regard to InfoSec and that steganography
is a growing field that requires more advanced techniques for secure communication.
Steganography can be defined as a technique that can embed data in a multimedia file without
noticeable distortion. At the sender’s side, the data to be hidden, known as
secret data, is embedded in a cover object with the help of an embedding algorithm/technique.
After embedding, the object is known as a stego-object, which contains the secret data
inside the cover object. The term stego-object was adopted as a standard at the first
international workshop on information hiding in 1996 (Pfitzmann 1996). At the receiver’s
side, the stego-object is decoded, and the secret data is extracted from it.
For additional security, a secret key can be used while embedding. The basic flow
chart of steganography is given in Fig. 1. The data hiding can be done in two domains: spa-
tial and transform. In the spatial domain, embedding is done directly on the values of the
pixel, and in the transform domain, transformed coefficients are utilized to hide data.
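To make the spatial/transform distinction concrete, the following is a minimal sketch of transform-domain embedding, not a method taken from the surveyed literature. It assumes a 1-D orthonormal DCT over a row of eight pixels (real codecs use 2-D integer transforms), a hypothetical quantization step of 16, and a hypothetical choice of coefficient index 3; one bit is stored in the parity of the quantized coefficient.

```python
import math

N = 8  # samples per block (one row of 8 pixels, an illustrative simplification)

def _alpha(u):
    # Normalization factors of the orthonormal DCT-II
    return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

def dct(x):
    """Orthonormal 1-D DCT-II of a length-N sequence."""
    return [_alpha(u) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * u / (2 * N))
                            for n in range(N))
            for u in range(N)]

def idct(coeffs):
    """Inverse of dct()."""
    return [sum(_alpha(u) * coeffs[u] * math.cos(math.pi * (2 * n + 1) * u / (2 * N))
                for u in range(N))
            for n in range(N)]

def embed_bit(pixels, bit, index=3, step=16):
    """Hide one bit in the parity of a quantized DCT coefficient (assumed scheme)."""
    coeffs = dct(pixels)
    level = round(coeffs[index] / step)
    if level % 2 != bit:
        level += 1  # move to the neighbouring quantization level with the right parity
    coeffs[index] = level * step
    # Back to integer pixel values, clipped to the valid 8-bit range
    return [min(255, max(0, round(v))) for v in idct(coeffs)]

def extract_bit(pixels, index=3, step=16):
    """Recover the bit from the parity of the same quantized coefficient."""
    return round(dct(pixels)[index] / step) % 2

row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel row
stego_row = embed_bit(row, 1)
recovered = extract_bit(stego_row)
```

Because the bit lives in a coefficient rather than in individual pixel LSBs, the change is spread over the whole row; a spatial-domain method would instead modify selected pixel values directly.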
Different video steganography techniques have been proposed in the literature. For a
technique to be successful, it must be secure and efficient. There are three basic requirements
or parameters of a steganography technique, i.e., capacity, imperceptibility, and robustness
(Sadek et al. 2015), as shown in Fig. 2. These basic parameters
are defined as follows:
• Capacity Capacity means how much data can be hidden in the multimedia file without
disturbing the complexity of the medium. It refers to the number of secret data bits
that can be embedded in the whole cover media.
• Imperceptibility Imperceptibility refers to how little the hidden information can be
perceived by the human eye after embedding. To make the technique imperceptible Human
[Fig. 1 Basic flow chart of steganography]
[Fig. 2 Basic requirements of a steganography technique: capacity, imperceptibility, and robustness]
Information should be hidden in such a way that the technique has an excellent
hiding capacity, the hidden data is not visible, and the result is robust against attacks. However,
the requirements are interlinked and affect each other, as shown in Fig. 2:
most of the time, an increase in embedding capacity decreases the visual quality, which
makes detection easier and ultimately leads to less robustness against
attacks. So, the researcher must develop an algorithm that makes a trade-off
between the three primary requirements to obtain a better steganography scheme.
Further, security is also one of the main concerns of a steganography technique; it is
defined as the ability of the stego object to resist steganalysis techniques. Security
can also be evaluated with the help of imperceptibility and robustness: a technique
that is highly imperceptible and robust against attacks and steganalysis is considered secure.
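The capacity side of this trade-off can be illustrated with simple arithmetic. The sketch below assumes plain k-bit LSB embedding in every color sample of every frame, which is an illustrative upper bound rather than a figure from the paper; the function names and the frame dimensions in the example are hypothetical.

```python
def lsb_capacity_bits(width, height, frames, channels=3, bits_per_sample=1):
    """Upper bound on payload (in bits) when every sample carries bits_per_sample bits."""
    return width * height * frames * channels * bits_per_sample

def max_sample_error(bits_per_sample):
    """Largest possible change to one 8-bit sample when its k low bits are overwritten."""
    return 2 ** bits_per_sample - 1

# Hypothetical 352 x 288 color video with 250 frames
for k in (1, 2, 3, 4):
    payload_bytes = lsb_capacity_bits(352, 288, 250, bits_per_sample=k) // 8
    print(k, payload_bytes, max_sample_error(k))
```

Capacity grows linearly with the number of bits used per sample, while the worst-case change to a sample grows exponentially, which is one way to see why higher capacity tends to cost imperceptibility.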
Information can be hidden almost everywhere on the internet. On a web page, there
are several places to hide data, such as text, non-text elements, links, comments, structure, and frames;
most widely, multimedia such as image, audio, or video is used to hide the data.
Many techniques exist to hide secret data in images and audio, but videos, due
to their frequent use on the internet, are becoming more popular these days for steganography.
There are many other advantages of videos over image and audio which make it more
suitable for steganography as illustrated below:
1. According to the survey of Facebook (Fu et al. 2017) in 2015, the organic reach of vid-
eos was 8.71% which was very high as compared to text (5.71%) and images (3.73%) in
the period of 2014 to 2015.
2. 400 h of videos are uploaded every minute to YouTube as per November, 17 YouTube
statistics (Braun et al. 2018).
3. According to a Cisco study (Sheri and Traoudas 2017), by 2020, 80% of the world’s
traffic on the internet will be videos.
4. Due to the advancement of digital media and existing compression techniques, videos
are frequently transferred on the internet.
5. Currently, portable cameras and video editing software are freely available, allowing people
to record, edit, and send videos over Yahoo, Facebook, and YouTube.
6. The very complicated statistics and abundant content of videos make them highly suitable
for steganography.
7. The primary need of steganography is to hide a sufficient amount of data, i.e., capacity,
and videos can carry significantly more secret data than an
image or audio.
Image and audio steganography techniques can sometimes also be used in video
steganography, as a video is a combination of both. However, there exist certain
differences between image and video steganography, as stated below:
• Size Images are very small compared to videos; since videos contain many more pixels,
they provide a higher embedding capacity.
• Perceptual Redundancy Videos have temporal features that provide perceptual
redundancy for embedding secret information without distortion (Gall 1991).
• Complex Structure Videos have a more complex structure than images, which
makes it difficult for intruders and attackers to detect the existence of secret information.
In addition, videos have more statistical features that can be used for embedding such as
motion vectors, macroblocks, and so forth. Generally, in videos, data can be hidden in three
ways (Al-Frajat et al. 2010):
The audio part of the video is mostly discarded while embedding because of its limited
capacity. Therefore, the primary focus of this paper is on the literature related to
video steganography techniques that use frames or the complete video for embedding.
The general structure of video steganography is shown in Fig. 3. At the sender’s side,
the frames are extracted from the original video, and the secret data is embedded in the
selected frames using a video steganography embedding algorithm. After the embedding
process, the frames are rebuilt and the stego video is generated for transmission to the receiver.
At the receiver’s side, the stego video frames are extracted and the extraction
algorithm is used to extract the secret data from the stego video frames and the original video
[Fig. 3 General structure of video steganography. Sender: original video, frames extraction, embedding algorithm on selected frames (with secret data), rebuild frames, stego video. The stego video passes through the transmission channel. Receiver: stego video, frames extraction, extraction algorithm on selected frames, secret data; rebuild frames, original video]
4 Video format
Video steganography can be done by using any video available over the internet or self-
made via phone. However, as per the researcher’s point of view, the implementation of
a successful video steganography technique sometimes requires knowledge of a specific
video format (coding standard). Therefore, this section gives a brief introduction to video
coding standards.
There exist different video coding standards such as H.120, H.261, MPEG-1, MPEG-2,
MPEG-4, H.264, H.265, etc. Among these video coding standards, H.120 was the first
video coding standard, introduced in 1984. This coding standard was not adopted in practice due
to its poor performance. After that, H.261 was the first practical video coding standard,
developed based on motion-compensated DCT compression. The next standard
was MPEG-1, developed by the Moving Picture Experts Group (MPEG) for VHS (Video Home
System) compression. It was superior to H.261 in terms of quality when operated at high
bit rates (Sikora 1997). The standard added half-pixel motion and bi-directional motion
prediction to H.261. Further, MPEG-1 was superseded by MPEG-2/H.262, which was used
for broadcast formats at high data rates and was widely used for the DVD standard. MPEG-2
was able to efficiently support interlaced scan pictures with a wide range of bit rates (Tudor
1995). The next standard was MPEG-4/H.263, developed in 1999 and also known as MPEG-4
Part 2, which made further advancements in video compression. This standard provided
features such as segmented coding of shapes, variable block size, spatial-predictive intra
coding, temporal and spatial scalability, overlapped block motion compensation, etc.
(Richardson 2004). Among these standards, MPEG-1, MPEG-2, and MPEG-4 have been utilized
by video steganography researchers, and currently, the most commonly used video coding
standard for video steganography is H.264/AVC.
The H.264/AVC standard, also known as MPEG-4 Part 10, is a motion-compensated
and block-oriented video compression standard. H.264/AVC is the most commonly used
video coding standard for recording, compression, and transfer of video content (H.264/AVC
2016). It was developed by the Joint Video Team (JVT), an alliance of the International
Telecommunication Union (ITU) Telecommunication Standardization Sector (ITU-T) Video Coding
Experts Group (VCEG) and the International Organization for Standardization and the
International Electrotechnical Commission (ISO/IEC) Joint Technical
Committee 1 (JTC1) Moving Picture Experts Group (MPEG). The first version
of this video standard was completed in May 2003, and its capabilities have been extended
in later editions (Richardson 2004; Video et al. 2014). Previous
video standards share the basic encoding steps, such as transformation, quantization,
motion-compensated prediction, and entropy coding. In addition, the H.264/AVC standard has
some further important features such as a 4 × 4 integer transform, intra-prediction modes, variable
block sizes (16 × 16, 8 × 16, 16 × 8, 8 × 8, 4 × 8, 8 × 4, 4 × 4), context-adaptive coding,
multiple reference frames, Flexible Macroblock Ordering (FMO), quarter-pixel precision
for motion compensation, and better entropy coding (Richardson 2004). These additional
features enhance the coding efficiency of the standard, and video steganography
researchers have utilized some of them for embedding secret data. The basic steps in
encoding H.264/AVC video are shown in Fig. 4.
[Fig. 4 Basic steps in encoding H.264/AVC video]
The video frames are partitioned into blocks of variable size, and each block is
predicted from its neighboring blocks or from past or future frames. The prediction
is subtracted from the actual pixel values to obtain the residual, and this residual
data is transformed using an integer DCT. The integer DCT coefficients are quantized,
and entropy coding then converts them into a bit-stream (Ostermann et al.
2004). H.264/AVC is the most widely used video coding standard on internet streaming sources
such as Netflix, YouTube, and other web sources. A more recent video coding
format of the ITU-T VCEG is High Efficiency Video Coding (HEVC), which was introduced in
2013. HEVC, also known as H.265/HEVC, aims to address essentially all existing
applications of H.264/AVC. Additionally, HEVC focuses on the increased use of
parallel processing architectures and increased video resolution. This video coding standard is
particularly designed to meet the need for high definition video resolution. H.265/HEVC
uses integer DCT and DST transforms with variable block sizes between 4 × 4 and 32 × 32.
Further details of the HEVC design and features can be found in Sullivan et al.
(2012).
5 Literature review
Video steganography techniques can be classified in different ways, such as reversible and
irreversible, compressed and uncompressed (raw), and based on domains: spatial and
transform domain (shown in Fig. 5a). Note that in this paper, the term spatial domain
denotes the raw pixel data format, and at times, for the purpose of distinction, the term
uncompressed domain is also used for raw data. Reversible techniques can recover the
original/cover video exactly, without any visual distortion, after secret data extraction; such
techniques have been implemented by some researchers (Wong et al. 2009; Gujjunoori and Amberker
2013; Song et al. 2015; Yao et al. 2016). Most of the techniques in the literature focus only on
extracting the secret data with little or no distortion, without recovering the original
video. The other possible classification is compressed and uncompressed (raw);
although uncompressed videos are not practical nowadays, a few
researchers have utilized uncompressed video formats for embedding (Xu and Ping 2007; Cetin
and Ozcerit 2009). Also, videos have specific features that were utilized by researchers for
embedding. Therefore, this paper discusses the literature based on domains (spatial and
transform) for raw videos and compressed videos, as in intra-prediction mode mostly DCT
coefficients are utilized for embedding. Also, video specific methods have been categorized
for compressed videos, as these include robust and reversible/irreversible techniques which
are further mentioned in the literature as well. Moreover, the authors classified video specific
methods into three sub-categories: motion vector (MV), variable length code (VLC), and
format specific techniques, as shown in Fig. 5b. Sometimes video specific techniques also
utilize spatial and transform domain based techniques for embedding. Furthermore, the
adopted classification is motivated by the existing literature and is based on the hiding
venues shown in Fig. 6, chosen to cover as much of the literature related to video
steganography as possible.
The literature based on the adopted classification schemes is given in the following
subsections with their pros and cons, and theoretical analysis.
The term spatial domain refers to working with the pixel values or, in other
words, working directly with the raw data (Darmstaedter et al. 1998; Sabeti et al. 2007;
Hussain et al. 2018). Video steganography in the spatial domain means that embedding is done
directly on the pixel intensity values of the frame, and the most basic technique for
embedding in the spatial domain is LSB (Least Significant Bit) substitution. In LSB substitution,
the secret data bits are embedded by replacing the least significant bits of the cover
video, and many researchers have used the LSB replacement (Bhattacharyya et al. 1996) and
LSB matching (Mielikainen 2006) methods for embedding. Various research studies
based on these techniques are discussed in this section.
LSB embedding is the simplest and most widely used technique, as the least significant bits
carry the least relevant information, so embedding can be done without perceptual distortion.
As an example, in a 24-bit color frame each pixel consists of three 8-bit color components.
Consider embedding the letter “M”, whose ASCII (American Standard Code for Information
Interchange) code is 77 in decimal and “1001101” in binary. Embedding these seven bits with
one bit per color byte requires three pixels; let us assume that the first eight bytes of
three consecutive pixels are:
10001111 00101010 01000101 00101000 10010010 11110000 10110110 01100101
After embedding “1001101” in the LSBs of the first seven bytes, some of the bytes change
(the LSBs of the third, fourth, fifth, and seventh bytes), and the resulting bytes are:
10001111 00101010 01000100 00101001 10010011 11110000 10110111 01100101
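The worked example above can be reproduced with a few lines of code. This is a minimal sketch of 1-bit LSB replacement on a list of bytes; the function names are illustrative and not taken from any of the cited works.

```python
def embed_lsb(cover, bits):
    """Replace the LSB of the first len(bits) cover bytes with the message bits."""
    stego = list(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0b11111110) | bit
    return stego

def extract_lsb(stego, count):
    """Read back the LSBs of the first `count` bytes."""
    return [b & 1 for b in stego[:count]]

# The eight cover bytes from the example
cover = [0b10001111, 0b00101010, 0b01000101, 0b00101000,
         0b10010010, 0b11110000, 0b10110110, 0b01100101]

# "M" is ASCII 77, written as the 7-bit string 1001101
message_bits = [int(b) for b in format(ord("M"), "07b")]

stego = embed_lsb(cover, message_bits)
changed = [i for i, (c, s) in enumerate(zip(cover, stego)) if c != s]

print([format(b, "08b") for b in stego])
print(changed)  # zero-based positions of the bytes whose LSB changed
```

Only bytes whose existing LSB differs from the message bit are altered, which is why fewer than seven bytes change here.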
[Fig. 7 Visual results of 1-bit, 2-bit, 3-bit, and 4-bit LSB substitution: (a) original frame, (b) secret image, (c) 1-bit LSB stego frame, (d) 2-bit LSB stego frame, (e) 3-bit LSB stego frame, (f) 4-bit LSB stego frame]
In this example, only 4 bits are changed to embed the letter “M” using 1-LSB (one-bit LSB),
and it can be concluded that with 1-LSB each cover LSB has, on average, only a 50% chance
of being changed. Generally, in LSB embedding, up to the last four bits of each pixel byte are
replaced with the secret message bits. The implementation results of simple LSB
substitution with 1, 2, 3, and 4 LSB bit embedding on a video frame are shown in Fig. 7, and the
values of the PSNR (Peak Signal to Noise Ratio) and SSIM (Structural Similarity) metrics,
which are used to measure perceptual quality, are shown in Table 4. The secret image of size
176 × 320 (source: https://in.pinterest.com/momb8kids/signs-christmas-motifs/?lp=true) is
embedded in an H.264/AVC baseline profile video of 250 frames with a frame rate of 25
frames/second. The results indicate that as the number of LSB bits increases, the values of
PSNR and SSIM decrease, resulting in a lower visual quality of the stego-video.
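The trend reported above (PSNR falling as more LSBs are used) can be checked on synthetic data. The sketch below is an illustration only: it uses a randomly generated "frame" and random payload bits rather than the video and secret image from the experiment, so the numbers will not match Table 4, and SSIM is not computed.

```python
import math
import random

def embed_k_lsb(samples, payload_bits, k):
    """Overwrite the k least significant bits of each sample with payload bits."""
    mask = (1 << k) - 1
    stego = list(samples)
    for i in range(min(len(samples), len(payload_bits) // k)):
        value = 0
        for bit in payload_bits[i * k:(i + 1) * k]:
            value = (value << 1) | bit
        stego[i] = (stego[i] & ~mask) | value
    return stego

def psnr(original, modified, peak=255):
    """PSNR in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

rng = random.Random(0)  # fixed seed so the run is repeatable
frame = [rng.randrange(256) for _ in range(10_000)]  # synthetic 8-bit samples

results = {}
for k in (1, 2, 3, 4):
    bits = [rng.getrandbits(1) for _ in range(len(frame) * k)]
    results[k] = psnr(frame, embed_k_lsb(frame, bits, k))

for k, value in results.items():
    print(f"{k}-bit LSB: PSNR = {value:.2f} dB")
```

Each additional embedded bit roughly quadruples the mean squared error on random data, so PSNR drops by several dB per bit.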
LSB in videos is utilized by researchers for embedding secret data in grayscale and
colored videos. With technical advancements, grayscale videos are not much in use
nowadays, as cameras and video recording and viewing devices for colored videos are readily
available. So, for real-world applications colored videos must be considered, but
a few researchers still utilized grayscale video for embedding. As an example, Gupta
and Chaturvedi (2013) used one, two, or three LSB replacement to
embed secret text and a grayscale image in grayscale video frames. Pros: simple and fast
embedding; and cons: less secure and could be detected by existing LSB tests. The main
advantage of the LSB technique is that it provides high embedding capacity. With this
aim, Hanafy et al. (2008) proposed a spatial domain technique to embed
text, image, and video as secret data using two bits of each of the red, green, and blue
channels. In this technique, before embedding, the secret data was partitioned into
non-overlapping blocks, and the blocks were randomized. A secret key was used to select
pseudo-random locations for embedding the secret data. Pros: embedding was done
randomly to improve security and cons: capacity could be improved further as only 2 bits
were used for embedding, and the technique is less robust against attacks. Similarly, to further
improve capacity, Bhattacharyya et al. (1996) utilized the LSB replacement
method for video steganography based on directed graph patterns. The proposed method
inserted data in the cover video according to the graph direction and utilized two graph
patterns for embedding. Pros: blind scheme with negligible effect on video quality and cons:
statistical analysis was not done to test robustness.
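The idea of key-dependent pseudo-random embedding locations, mentioned above for the two-bit technique, can be sketched as follows. This is not the cited authors' algorithm: block partitioning and block randomization are omitted, the function names are hypothetical, and Python's `random.Random` is used only for illustration since it is not a cryptographically secure generator.

```python
import random

def keyed_positions(num_samples, count, key):
    """Derive `count` distinct sample positions from a shared secret key."""
    return random.Random(key).sample(range(num_samples), count)

def embed_keyed(samples, bits, key, k=2):
    """Write k payload bits into the k LSBs of each key-selected sample."""
    if len(bits) % k != 0:
        raise ValueError("payload length must be a multiple of k")
    mask = (1 << k) - 1
    stego = list(samples)
    positions = keyed_positions(len(samples), len(bits) // k, key)
    for j, pos in enumerate(positions):
        value = 0
        for bit in bits[j * k:(j + 1) * k]:
            value = (value << 1) | bit
        stego[pos] = (stego[pos] & ~mask) | value
    return stego

def extract_keyed(stego, num_bits, key, k=2):
    """Regenerate the same positions from the key and read the bits back."""
    positions = keyed_positions(len(stego), num_bits // k, key)
    bits = []
    for pos in positions:
        value = stego[pos] & ((1 << k) - 1)
        bits.extend((value >> shift) & 1 for shift in range(k - 1, -1, -1))
    return bits

channel_bytes = list(range(100, 164))   # hypothetical R, G, B sample values
payload = [1, 0, 0, 1, 1, 0, 1, 1]
stego_bytes = embed_keyed(channel_bytes, payload, key="shared-secret")
recovered = extract_keyed(stego_bytes, len(payload), key="shared-secret")
```

A receiver without the key does not know which samples carry payload bits, which is the security benefit the text attributes to random embedding.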
Simple LSB technique alone is not secure enough, so to provide security Balaji and
Naveen (2011) proposed an efficient and highly secure method in which a frame of the
cover video was selected to create an index for the secret data. The index frame was used
to locate the secret data during extraction. According to this index, the frames were divided
into two categories, i.e., used and unused frames and to make it highly secure random data
was also placed in unused frames. Pros: the process of extraction was easy as the index
of embedded data was placed in the video frame and cons: the position of the frame and
secret data pixels were equidistant which makes the technique vulnerable to attacks. Fur-
ther to improve security, a hash-based 3-3-2 LSB video steganography was proposed by
Dasgupta et al. (2012). Here, 3-3-2 means 3-bits of red, 3-bits of green, and 2-bits of blue
LSBs were replaced with secret data bits. After that Dasgupta et al. (2013) enhanced their
technique using a genetic algorithm to get an optimal imperceptibility of hidden data. An
anti-steganalysis test was also used to check the innocence of the frame as compared to the
original frame. Pros: optimal in terms of space complexity and cons: not robust against
compression. Additionally, 3-3-2 LSB was utilized by Paul et al. (2013) as a base technique
to embed secret data using scene change detection. The algorithm checked each frame to
detect abrupt changes in the sequence of frames using histogram differences. Security was
enhanced using a random sequence generator called Indexed based Chaotic Sequence to
generate the pseudorandom sequence. They also calculated the time complexity as O(w^2),
where w denotes the chromosome encoding complexity, and the space complexity as
O(M × N), where M and N denote the total population and the length of a chromosome,
respectively. Pros: robust due to random pixel selection for embedding and cons: high
perceptual distortion was visible in the presented histograms. Chen and Qu (2018) presented a novel
authenticated quantum video steganography protocol for secure communication. They
utilized the RGB components of frames for embedding secret data bits using the LSB method.
Additionally, Kapoor et al. (2015) proposed a video steganography technique for
embedding text using LSB insertion by employing a frame diffusion process. The secret text was
compressed using ZIP and converted into chunks of 2 bytes before embedding. Embedding
was done in the RGB pixels of the video frames, and the experimental results showed high
imperceptibility with an average PSNR above 52 dB.
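The 3-3-2 bit allocation described above (three LSBs of red, three of green, two of blue) lets one secret byte fit in one RGB pixel. The following is a minimal sketch of that allocation only; the hash-based position selection, the genetic algorithm, and the scene change detection of the cited works are not reproduced, and the ordering of bits across channels is an assumption.

```python
def embed_byte_332(pixel, byte):
    """Hide one byte in an (R, G, B) pixel: 3 bits in R, 3 in G, 2 in B."""
    r, g, b = pixel
    r = (r & 0b11111000) | (byte >> 5)            # top 3 bits of the byte
    g = (g & 0b11111000) | ((byte >> 2) & 0b111)  # middle 3 bits
    b = (b & 0b11111100) | (byte & 0b11)          # lowest 2 bits
    return (r, g, b)

def extract_byte_332(pixel):
    """Reassemble the hidden byte from the low bits of R, G, and B."""
    r, g, b = pixel
    return ((r & 0b111) << 5) | ((g & 0b111) << 2) | (b & 0b11)

cover_pixel = (200, 100, 50)          # hypothetical pixel
stego_pixel = embed_byte_332(cover_pixel, ord("M"))
hidden = chr(extract_byte_332(stego_pixel))
```

Fewer bits are placed in the blue channel in this scheme; each channel changes by at most 7 (red, green) or 3 (blue) intensity levels.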
Security of the techniques can also be improved by embedding secret data based on
the human vision system (Balu et al. 2018; Luo et al. 2017), using encryption schemes to
encrypt secret data (Bhautmage et al. 2013; Ramalingam 2011; Yadav et al. 2013) and error
correcting codes (Mstafa and Elleithy 2015, 2015b, 2016b). Bhautmage
et al. (2013) encrypted the secret data with the help of the bit exchange method before
embedding it in cover videos. In this method, alternate bytes of the cover video were replaced
using LSB and LSB + 3 bits. The method also created an index in a frame so that the secret data
could be located efficiently, reducing the extraction time. Pros: the bit exchange method used
for embedding improved robustness and cons: no experimental results were presented. Besides, Moon
et al. (2013) used 4LSB method to improve capacity for hiding text and images inside an
AVI (Audio Video Interleaved) video file and also used computer forensics as an authenti-
cation tool. The proposed technique provided three layers of security using encryption and
computer forensics. Pros: high embedding capacity and cons: more prone to attacks as
data was hidden by using 4LSB. Additionally, Kaur et al. (2014) and Sudeepa et al. (2016)
also utilized encryption techniques to encrypt secret data. Kaur et al. (2014) used the RSA
encryption algorithm and Hash-LSB to hide secret data in the RGB (Red, Green, Blue)
pixels of the cover AVI video. In this scheme, the hash function was used to hide the data
by using a 3-2-3 LSB technique. In addition to encryption, the authors also used randomi-
zation and parallelization to improve the throughput and efficiency of the video steganogra-
phy technique. In the proposed technique, frames of the cover video were chosen randomly
with the Feed Back Shift Register (FSR) to embed the text, and four threads were used
for the parallelization process. The processes encryption/embedding and decryption/extrac-
tion were executed in parallel to make it efficient in terms of time and the required time
for embedding was less than 10 s. Pros: Secret data was encrypted before embedding to
enhance security and cons: not tamper resistant and single video was used for experiments.
Additionally, Manisha et al. (2019) encrypted the secret image and embedded the bytes
after segmenting in AVI video frames. Each byte of the secret image was segmented into 4 pairs of bits
which were embedded in 2 LSBs of the frame randomly. Pros: real-time hiding with high
embedding capacity and cons: complex in terms of computation and time as it consists
of multiple models. Yadav et al. (2013) embedded secret video frames using LSB video
steganography where the secret frames were encrypted with a secret key using XOR. Pros:
high embedding capacity as the video was embedded inside the cover video and cons:
more prone to attacks as embedding was done using LSB. In another work, Mstafa et al.
(2016b) utilized Hamming codes for secret data and a tracking algorithm, known as KLT
(Kanade-Lucas-Tomasi) for face detection and tracking. The embedding was done in the
detected face area, and because of error-correcting codes, they were able to achieve a high
level of security. Pros: robust against compression and signal processing attacks and cons:
embedding capacity was low as the data was hidden only at the portion of the frames.
Further, Balu et al. (2018) embedded the secret data in background objects and noninterest
areas other than face by using LSB substitution methods. Pros: data was embedded using
human vision ROI to improve security and cons: computationally complex.
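The segmentation of secret bytes into bit pairs for 2-LSB embedding, as described above for Manisha et al. (2019), can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, each secret byte is assumed to split into four 2-bit pairs, and the pairs are written to consecutive cover bytes rather than to randomly chosen positions.

```python
def split_into_pairs(byte):
    """Split one byte into four 2-bit values, most significant pair first."""
    return [(byte >> shift) & 0b11 for shift in (6, 4, 2, 0)]


def embed_pairs(cover_bytes, secret_byte):
    """Write the four bit pairs into the 2 LSBs of four cover bytes."""
    pairs = split_into_pairs(secret_byte)
    return [(c & 0b11111100) | p for c, p in zip(cover_bytes, pairs)]


def extract_byte(stego_bytes):
    """Reassemble the secret byte from the 2 LSBs of four stego bytes."""
    value = 0
    for s in stego_bytes:
        value = (value << 2) | (s & 0b11)
    return value


stego = embed_pairs([120, 33, 254, 7], 0b10011100)
secret = extract_byte(stego)
```

A keyed pseudorandom permutation of the cover positions, as the surveyed techniques use, would replace the sequential order here without changing the bit arithmetic.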
In the spatial domain, most of the researchers have used the fundamental LSB technique
for the embedding process, sometimes with encryption to enhance security (Manisha and
Sharmila 2019; Papadopoulos and Psannis 2018). However, some researchers have also
utilized RGB (Red, Green, and Blue) components, YUV components, and histogram-based
techniques for embedding.
The researchers utilized RGB components to hide secret data in all three components
or in any one or two of them. Embedding in fewer color components
enhances the visual quality. Ramalingam et al. (2015) utilized the RGBBGRRG order of
RGB components for embedding text and images in the cover video frames. Pros: easy
to implement and secure as the secret message was embedded randomly and cons: inde-
pendent frames embedding and not robust against compression. In addition to RGB com-
ponents, there are other color components such as YUV components utilized by research-
ers for video steganography. In YUV color space, Y represents the luminance component
(brightness) whereas U and V represent the chrominance components (color). The practical
implementation of RGB conversion to YUV of a video frame is shown in Fig. 8. Consid-
ering YUV color space Mstafa et al. (2014) embedded data in a video by changing the
positions of Y, U, and V pixel components. In this scheme, the authors used three keys
where key1 was used to reposition the Y, U, V component pixels, key2 and key3 were
used for embedding. The secret image was encoded with the help of Hamming code (7,
4) to provide more security and after that, the secret image was XORed with random val-
ues generated by using a key. The proposed method had high embedding efficiency with a
payload of 16 Kbits in each frame, and that can be extended up to 90 Kbits without notice-
able distortion. Pros: provided additional security by using Hamming code for the secret
message before embedding and cons: independent frames embedding. After that, Mstafa
et al. (2015) used BCH (Bose, Chaudhuri, and Hocquenghem) error correcting codes for
encoding secret data which was embedded randomly to each component of Y, U, V frame
using 3-2-2 LSB embedding. Pros: high embedding capacity and used two keys with BCH
codes to improve security and cons: independent frames embedding. Moreover, Cetin et al.
(2009) utilized color histograms to present a novel data hiding algorithm for videos. In
this method, the histogram value of each frame was calculated with the difference between
consecutive frames for each RGB color component. After that, by using color and motion
Fig. 8 RGB conversion to YUV color space: a original frame, b YUV frame, c Y component, d U component and e V component
transitions, an evaluating value was calculated, and this value was compared with a predefined
threshold. If the calculated value was higher than the threshold, the histograms were called
dissimilar, meaning that the frames had extensive color variation; if the color
variation was monotonous, they were called similar histograms. Both types of histograms were used
for embedding using two methods: frame based histogram and block-based histogram. The
average histogram value was calculated for each, frame based similar histogram (FBSH),
frame based dissimilar histogram (FBDH), block based similar histogram (BBSH), and
block based dissimilar histogram (BBDH) and compared with the threshold value. The
results demonstrated that for dissimilar histograms the block-based technique was better,
and in other cases the frame-based technique outperformed the block-based histogram
technique. Pros: high embedding capacity and imperceptibility and cons: independent
frames embedding and not robust against common attacks. Another reversible video steg-
anography technique based on histogram embedding was proposed by Yeh et al. (2014).
Further, Kelash et al. (2014) proposed a technique where secret data was hidden randomly
in frames using Histogram Constant Value (a threshold). They divided video frame pixels
into two parts, i.e., right and left where in the right part the number of bits was embedded,
and in the left part, pixels were counted. Pros: high embedding capacity and security due
to random embedding; and cons: not tamper-resistant and embedding was done in frames
independently.
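Several techniques in this discussion encode the secret message with a Hamming (7, 4) code before embedding. The following minimal sketch shows the textbook form of that code, in which 4 data bits gain 3 parity bits and any single flipped bit can be corrected on extraction. The bit ordering (parity bits at positions 1, 2, and 4) and the function names are assumptions of this example, not details taken from the surveyed papers.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 1-based; 0 means no error
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


codeword = hamming74_encode([1, 0, 1, 1])
corrupted = codeword[:]
corrupted[4] ^= 1  # simulate one bit damaged in the stego video
decoded = hamming74_decode(corrupted)
```

The parity bits reduce the net payload to 4 of every 7 embedded bits, which is the capacity cost paid for the added robustness.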
In addition to different color spaces and histograms, researchers also employed other techniques
such as QVD (Quotient Value Differencing) (Swain 2019) and PVD (Pixel-Value Differencing)
for steganography. One of the techniques, proposed by Sherly et al. (2010), utilized
TPVD (Tri-Way Pixel-Value Differencing) with pseudo-random dithering (TPVDD) for
embedding text in MPEG (Moving Picture Experts Group) videos. Secret data was embedded
with maximum scene change in the macroblocks of I-frames (Intra-coded), whereas the macroblocks of P
and B (Bi-directional) predicted frames were utilized for MVs (Motion Vectors)
based embedding with large magnitude. Pros: high capacity as the data was hidden
in the horizontal, vertical, and diagonal edges and cons: less secure and can be attacked by
existing steganalysis. In contrast to this, Hu et al. (2011) hid a secret video in an uncompressed/raw
video of approximately the same size. For embedding, the secret video was
divided into corresponding frames, and those frames were partitioned into non-uniform
rectangular codes which were hidden using 4 LSBs. In the process of non-uniform rectangular
partition, three main factors were used: the initial partition, the bivariate polynomial,
and the suitable control error. Pros: high capacity as the hidden data was the
secret video of almost the same size without causing any distortion and cons: less robust
against attacks. In a different work, to boost the speed of embedding and extracting hidden
message Ramalingam et al. (2015) proposed a novel technique using an Enhanced Hidden
Markov Model (EHMM). In this scenario, embedding and retrieving operations were done
by state transition dynamics and conditional states. EHMM for data retrieval was calculated
using three benchmark functions, namely state representation, state transition dynamics,
and Markov chain. The Markov chain was used to set the time interval in a frame for data
bits to be retrieved, which reduced the computational cost, and EHMM enhanced the
security. Pros: EHMM was used to enhance the speed of embedding and extraction and
cons: not robust against MPEG and H.264/AVC compression.
The spatial domain techniques discussed in this section are not robust against compression
and are not secure enough against steganalysis. Although some of the authors tried
to propose better techniques using optimization (Kamil et al. 2018), there is still
room for improvement. So, the embedding capacity, imperceptibility, and robustness could
be improved by using frequency domain techniques and video-specific methods. Table 5
summarizes the details of the above-discussed techniques for comparison. The average
capacity achieved by the researchers is reported wherever feasible, measured in bits per
pixel (bpp), kilobytes (KB), or as a capacity ratio (%). The table also records the fundamental
requirements of imperceptibility, robustness, and encryption using ‘+’ and ‘−’ symbols, where ‘+’
denotes ‘support’ and ‘−’ denotes ‘does not support.’ Additionally, the average PSNR value
from the literature is given in decibels (dB), and the security level is classified as low,
high, or untested. A low security level means the technique has been detected by the existing
steganalysis methods or tests referred to in the table. A high security level means the
technique has been tested against existing steganalysis and has not been detected to date, or
that it applies additional methods to improve security. The third level, ‘untested’, covers
techniques that have not been tested against existing steganalysis; also, to the best of the
authors’ knowledge, there exists no method or test which can detect the particular technique.
In Table 5, the columns ‘Im’, ‘Ro’, ‘En’, and ‘Rv’ denote imperceptibility, robustness,
encryption, and reversibility respectively.
Table 5 Comparative analysis of LSB, RGB, and other spatial domain techniques
Authors | Technique | Average capacity | Im | Ro | En | Rv | PSNR (dB) | Security
Sherly et al. (2010) | TPVDD | 3.32 KB | + | + | − | − | 62.28 | Untested
Hu et al. (2011) | 4LSB | 1.5 bpp | + | − | − | + | 29.03 | Untested
Ramalingam et al. (2015) | State Transition Dynamics, Conditional States and EHMM | N/A | − | + | − | − | N/A | High
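The PSNR values reported for these techniques follow the standard definition PSNR = 10 log10(MAX^2 / MSE), where MSE is the mean squared error between the cover and stego frames and MAX is the largest possible pixel value. A minimal sketch, assuming 8-bit frames stored as nested lists and a hypothetical function name `psnr`:

```python
import math


def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0
    count = 0
    for cover_row, stego_row in zip(cover, stego):
        for c, s in zip(cover_row, stego_row):
            squared_error += (c - s) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: no distortion at all
    return 10 * math.log10(max_value ** 2 / mse)


cover_frame = [[100, 100], [100, 100]]
stego_frame = [[101, 100], [100, 100]]  # one pixel changed by one LSB
quality = psnr(cover_frame, stego_frame)
```

For a video, the per-frame values are typically averaged, which is how a single average PSNR figure per technique can be reported.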
The spatial and transform domain based techniques have been utilized for both raw and
compressed videos. Generally, in the literature, transformed coefficients are manipulated for
compressed video formats after video coding; however, some researchers have also
applied transforms to raw videos before encoding (Ramalingam and Isa 2014; Chae and
Manjunath 1999), where the data is embedded after extracting the frames; these works
are discussed further in this section. In the transform domain, embedding is done in the
transformed coefficients rather than directly in the intensity values. Sometimes,
after applying a transform, LSB and other spatial domain techniques are utilized for
embedding in the transformed coefficients; in this paper, these are considered
transform domain techniques based on the hiding venue. There exist many transform
techniques, but for video steganography, only two types of transformation are
utilized: the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT).
In the next two sub-sections, the authors cover the literature to date on
DCT and DWT based video steganography techniques.
DCT techniques are the base for compressed video formats such as MPEG-1, MPEG-2,
and H.263; however, in H.264 an integer transform is applied which is a variant of DCT
(Richardson 2004). Each of the compressed videos consists of three types of frames,
namely ‘I’, ‘P’, and ‘B’ frames, where ‘I’ refers to intra-frames and ‘P’ and ‘B’ refer to predicted
and bi-directionally predicted frames respectively. For embedding with DCT coefficients
in compressed videos, these frames are often used in the literature. In DCT, embedding
is mainly done using coefficients of DCT block or quantization parameters. Generally, the
8 × 8 DCT block is considered, but any block size can be used (Xu et al. 2006). Researchers
have utilized different block sizes such as 4 × 4, 8 × 8, and 16 × 16 for embedding depending
upon the video format; for example, an 8 × 8 block size is generally used for MPEG videos,
and a 4 × 4 block size is usually used for H.264/AVC videos in the literature. After embedding
in the coefficients, the stego video is generated and transmitted through a channel. The
basic diagram of DCT embedding is shown in Fig. 9.
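The general flow described above (transform a block, modify a coefficient, inverse transform) can be sketched in plain Python. This is an illustrative sketch only: the orthonormal 2D DCT-II is standard, but the embedding rule (forcing the parity of one quantized mid-frequency coefficient), the coefficient position (3, 2), the step size of 16, and all function names are assumptions of this example rather than the method of any surveyed paper. Pixel values are also left unrounded after the inverse transform.

```python
import math


def _alpha(k, n):
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def _basis(x, u, n):
    return math.cos((2 * x + 1) * u * math.pi / (2 * n))


def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a list of lists."""
    n = len(block)
    return [[_alpha(u, n) * _alpha(v, n) * sum(
                block[x][y] * _basis(x, u, n) * _basis(y, v, n)
                for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def idct2(coeffs):
    """Inverse of dct2."""
    n = len(coeffs)
    return [[sum(_alpha(u, n) * _alpha(v, n) * coeffs[u][v]
                 * _basis(x, u, n) * _basis(y, v, n)
                 for u in range(n) for v in range(n))
             for y in range(n)] for x in range(n)]


def embed_bit(coeffs, pos, bit, step=16):
    """Force the parity of one quantized coefficient to equal the secret bit."""
    u, v = pos
    q = round(coeffs[u][v] / step)
    if q % 2 != bit:
        q += 1
    stego = [row[:] for row in coeffs]
    stego[u][v] = q * step
    return stego


def extract_bit(coeffs, pos, step=16):
    u, v = pos
    return round(coeffs[u][v] / step) % 2


block = [[(3 * x + 5 * y) % 256 for y in range(8)] for x in range(8)]
stego_block = idct2(embed_bit(dct2(block), (3, 2), 1))
recovered_bit = extract_bit(dct2(stego_block), (3, 2))
```

A larger step makes the hidden bit survive stronger quantization at the price of more visible distortion, which mirrors the robustness versus imperceptibility trade-off noted throughout this section.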
DCT has been employed as a base embedding technique for decades. One of the oldest algorithms
was proposed by Swanson et al. (1997) to hide high bit rate video in
a video file. The algorithm was based on perception-based linear projection, quantization,
and perturbation. A key was used to generate the pseudo-random number sequence for each
block which was used while retrieving the data. The embedding was done by modifying
the projection of 8 × 8 DCT blocks of the frame. The projection was first quantized with
a threshold value constructed using masking, and the data was embedded by modifying
the projection. The proposed algorithm was able to hide an MPEG video inside a broadcast
news video, and a speech signal was also hidden in the video. The experimental results
showed that both experiments provided good imperceptibility. Pros: robust against
noise attacks and compression, with high embedding capacity from embedding video in video,
and cons: visual distortion and blocking artifacts due to the DCT-based
embedding. In another method, Chae and Manjunath (1999) utilized video
as a cover to hide the signature image and video. In this method, the cover video, as well
as the secret image, was transformed using DCT 8 × 8 blocks. They utilized a multi-dimen-
sional lattice for encoding the signature, and after that JPEG (Joint Photographic Experts
Group) quantization matrix was used to renormalize the vectors code. The embedding was
done in the middle-frequency coefficients using texture masking. After embedding, the
stego video was compressed using the MPEG2 code, and the scheme was based on the
loss-less recovery of the signature image. Pros: Less distortion because of the embedding
in the Y component of the textured regions. Cons: Low embedding capacity and could
result in blocking artifacts.
Furthermore, Wong et al. (2009) presented a novel video steganography
technique for the MPEG-1 video format that preserved the exact quality of the video after
embedding. The proposed scheme was based on Mquant and DCT coefficients in which
embedding was done simultaneously by manipulating the quantized coefficients of DCT
and Mquant. The two parameters considered were payload and embedding efficiency. A
new representation scheme known as Reverse Zero-Run Length (RZL) was used to achieve high
efficiency, and results demonstrated that RZL was better in terms of performance than
matrix encoding. Among all macroblocks, the total usage for embedding in I, P, and B
frames were 55.9%, 24.0%, and 23.6% respectively. The increase in bitstream size of the
video after embedding was only 4 bits on an average for every embedded message bit and
this scheme was able to achieve the same PSNR after embedding as before embedding.
Pros: reversible method, as the original video was reconstructed bit by bit, and
cons: the technique was not tested against common transmission attacks.
Esen et al. (2011) described an adaptive block based steganography technique for videos
by utilizing forbidden zone hiding and selective embedding to determine host signals using
coefficient and block selection techniques. The de-synchronization caused
by adaptive block selection was handled by using RA (Repeat Accumulate)
codes to resist erasures. The embedding was done in the Y component of the frame and the
middle-frequency band was chosen among DCT coefficients. An energy threshold was used
to process the block and coefficients. If the average energy was higher than the threshold
value, the block and coefficients were processed; otherwise, they were skipped. The authors compared
the proposed scheme with the existing Quantization Index Modulation (QIM) technique
and found that the proposed scheme outperformed the QIM technique under MPEG-2 compression
attacks. Pros: robust against common video processing attacks such as compression
and scaling and cons: low hiding capacity due to selective embedding. Moreover,
Mstafa et al. (2016) presented a video steganography technique using DCT and BCH Error
Correcting Code (ECC) to enhance security. The secret data was hidden in DCT coeffi-
cients of video frames after the conversion of RGB into YUV components. Embedding was
done in each Y, U, and V components except the DC components of the quantized DCT
coefficients. The proposed technique was able to achieve a high embedding capacity with
approximately 27.53% hiding capacity ratio with minimal distortion. Pros: high security
because of pre-processing of the secret message before embedding using ECC and cons:
not robust against compression. Further, Rabie et al. (2019) exploited the temporal redundancy
of frame pixels by using the pixogram, which helps in converting the uncorrelated spatial
areas of the frames into highly correlated temporal sections. The pixogram is a temporal
vector of pixels of the video frames extracted from each row and column location, and it
is divided into homogeneous sections. For embedding, a 1D-DCT was applied to each pixogram,
and embedding was done in the magnitude of high frequency coefficients only. The
experimental results showed a high embedding capacity with low visual distortion. Also,
the proposed technique was robust against different types of attacks. Additionally, Gujju-
noori et al. (2013) utilized DCT quantization coefficients for embedding in reversible data
approaches where embedding was done during the MPEG-4 compression process of the
video. The authors proposed two schemes for this. They used integration of HVS (Human
Visual System) based measures, one was PSNR-HVS with average value 51 dB, and the
other was PSNR-HVS-M with average value 37 dB which was a useful measure in terms of
visual quality. The first scheme achieved a better quality of the video in terms of HVS and
the second scheme achieved a high embedding payload with reasonable visual quality. The
first scheme could be used for watermarking and the second scheme was for steganography
applications. Pros: high imperceptibility as HVS based parameters were considered and
cons: did not consider inter-frame distortion drift.
Currently, H.264/AVC is the most utilized compressed video format on mobile phones
and the internet, and it is also the format most employed in the video steganography
literature. H.264/AVC has two types of modes, intra prediction mode
and inter prediction mode, both of which can be deployed for embedding in video steganography.
H.264/AVC has additional features as compared to the previous formats; one of the
intraframe distortion drift, Ma et al. (2010) presented a scheme that utilized the I-frame
DCT quantized coefficients for embedding in the luminance (Y) 4 × 4 blocks. From each
coefficient pair, one was used for embedding the secret data, and another was used to fix
the distortion level which resulted in no intra-frame distortion drift to the H.264/AVC
cover video. It is a blind technique, which means that the original video is not needed
at the time of extraction, and it was also fast in computation. However, this scheme was able
to utilize only 46% of the 4 × 4 luminance blocks which was improved by Lin et al. (2013)
after fully utilizing the luminance blocks to improve capacity. In Ma et al.’s scheme, 3
bits were embedded into each 4 × 4 luminance block, whereas in Lin et al.’s scheme, the
authors utilized 4 bits to embed into each luminance block. As capacity increased, there
might be more visual distortion so to enhance the visual quality the proposed scheme used
a new shifted 4 × 4 luminance blocks set to embed data by perturbing the QDCT (quantized
DCT) coefficients. The results illustrated that the proposed scheme improved hiding capac-
ity by maintaining the quality of the video. Pros: fully utilized luminance component to
provide high embedding capacity and cons: did not prevent inter-frame distortion drift.
Further, Nie et al. (2018) presented a technique for embedding in an intra-prediction mode
using STC (Syndrome Trellis Codes) and also utilized SAD (Sum of Absolute Difference)
for minimizing the distortion. The mapping rule was introduced to magnify the range of
selected modes for each block. The security of the proposed scheme was demonstrated by testing against
an existing steganalysis technique (Zhao et al. 2015). Pros: high robustness and imperceptibility
and cons: computationally complex method. Similarly, Liu et al. (2015) also
presented a data hiding scheme for H.264/AVC to avert the intra-frame distortion drift.
Before embedding, the secret data was divided into sub-groups using Shamir’s (t,n) secret
sharing technique which was used to correct frame errors. After that, the secret data was
encoded by using the BCH code to correct error bits and also to improve robustness. To
get the DCT coefficients and intra-frame prediction modes the original video was entropy
decoded. Then for embedding, the 4 × 4 luma DCT blocks with appropriate coefficients
and large residuals were selected. The experimental results of the proposed scheme demonstrated
that it was more robust than Ma et al.’s scheme, as it achieved a 100% survival rate.
They further advanced the scheme in Liu et al. (2016) to achieve more
robustness and better visual quality. Further, Xue et al. (2019) presented a steganalysis-resistant
technique by considering the distribution of DCT coefficients and the statistical randomness
of the 4 × 4 block. However, all these techniques have limited embedding capacity
as the data was embedded in intra frames.
Accordingly, to improve capacity inter frames have been used but they can also cause
distortion after embedding. Therefore, to minimize inter-frame distortion Yao et al. (2016)
described a reversible steganography technique for encrypted video streams. In the pro-
posed technique three types of coding parameters were encrypted that include the MV
(motion vector) differences, the prediction modes, and the DCT coefficients using stream
ciphers. The encryption was done without affecting the size of the video bit rate. The
embedding was done using histogram shifting technique in the 4 × 4 luminance integer
DCT block coefficients of P-frame. This scheme was able to recover videos losslessly after
decryption and extraction with an average PSNR above 32 dB. Pros: the receiver can build
the video bit streams without extracting the secret data and cons: complex as coefficient
selection involves more calculations. For minimization of inter-frame and intra-frame dis-
tortion drift, Song et al. (2014) proposed a novel data hiding algorithm based on the multi-
view coding standard for compressed videos. The data was embedded in the B4 frame to
avert inter-distortion drift and after that QDCT-coefficients of the 4 × 4 luma blocks were
modified to avert intra-distortion drift. Pros: able to prevent inter-frame and intra-frame
distortion drift by embedding in B-4 frames and 4 × 4 luma components respectively. Also,
the technique was less complex because they did not need MB (Macro-Block)-type infor-
mation and it was the first 3D video embedding technique according to the authors. Cons:
not tamper resistant and not tested for robustness. In addition to these techniques, some
researchers used encryption (Idbeaa et al. 2016; Mumthas and Lijiya 2017) to provide
more security for video steganography techniques based on DCT.
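The histogram shifting idea used for reversible embedding in this discussion can be illustrated with the generic textbook form of the algorithm. This is a sketch under stated assumptions, not the exact scheme of any cited paper: it works on a plain list of integers standing in for quantized coefficients, uses only the single most frequent value (the peak) for embedding, ignores overflow handling, and uses hypothetical function names.

```python
from collections import Counter


def hs_embed(values, bits):
    """Shift values above the peak by +1, then hide bits at the peak value."""
    peak = Counter(values).most_common(1)[0][0]
    bit_iter = iter(bits)
    stego = []
    for v in values:
        if v > peak:
            stego.append(v + 1)                  # make room next to the peak
        elif v == peak:
            stego.append(v + next(bit_iter, 0))  # bit 1 moves to peak + 1
        else:
            stego.append(v)
    return stego, peak


def hs_extract(stego, peak, n_bits):
    """Recover the hidden bits and restore the original values exactly."""
    bits, restored = [], []
    for v in stego:
        if v == peak:
            bits.append(0)
            restored.append(peak)
        elif v == peak + 1:
            bits.append(1)
            restored.append(peak)
        elif v > peak + 1:
            restored.append(v - 1)               # undo the shift
        else:
            restored.append(v)
    return bits[:n_bits], restored


original = [0, 1, 0, 2, 0, -1, 3, 0]
stego_values, peak_value = hs_embed(original, [1, 0, 1])
hidden_bits, recovered = hs_extract(stego_values, peak_value, 3)
```

The capacity equals the number of occurrences of the peak value, which is why such schemes favour coefficient sets with a sharply peaked histogram.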
Another transform domain technique used for video steganography is DWT in which
embedding is done by decomposing the frames into different levels such as a single level
(Lu et al. 2010; Ramalingam and Isa 2014), two-level (Mstafa and Elleithy 2015a), three-level
(Ahmed et al. 2014; Kumar and Singh 2018) and so forth as shown in Fig. 11.
Each level is divided into four frequency parts, i.e., LL, LH, HL, and HH where only one
(LL) is the low-frequency component and the other three are high-frequency
components. Generally, the LL sub-band/component is iteratively decomposed to further
levels (Faragallah 2013); however, other sub-bands can also be decomposed to different
levels. The implementation results of one-level and two-level (LL sub-band) 2D DWT
decomposition on a frame of a CIF (Common Intermediate Format) video from the
widespread traces dataset (Reisslein 2012) with resolution 352 × 288 are shown in Fig. 12.
The decomposition of the components provides the real application of DWT based steganography
as it relates to the HVS characteristics. Embedding can be done in any of the four
components, preferably the high-frequency ones, because the low-frequency component
contains most of the intensity information, which is sufficient for a frame to be visually good.
DWT is recommended for steganography over DCT as it provides different frequency sub-
bands which can be processed independently for hiding secret data and other advantages
are stated as (Dalal and Juneja 2019):
1. After compression, DCT suffers from blocking artifacts that appear in the frame and
distort the visual quality of a video, whereas DWT is free of blocking artifacts.
2. DWT provides a multi-resolution facility to analyze the signal at different frequencies
and helps in the transmission of the video.
3. The main asset of DWT is a temporal resolution that captures frame location and fre-
quency information to provide additional hiding venues for steganography.
Of the two, DWT is the more recent, and researchers are still exploring
this domain for video steganography.
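The sub-band decomposition described above can be illustrated with the simplest wavelet, the Haar wavelet. The sketch below is a minimal example under stated assumptions: it uses the averaging form of the Haar transform on a frame stored as nested lists with even dimensions, the function name is hypothetical, and the labelling of the horizontal and vertical detail bands as HL and LH follows one common convention that varies between authors.

```python
def haar_dwt2(frame):
    """One-level 2D Haar DWT; returns the (LL, LH, HL, HH) sub-bands."""
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(frame), 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for j in range(0, len(frame[0]), 2):
            a, b = frame[i][j], frame[i][j + 1]
            c, d = frame[i + 1][j], frame[i + 1][j + 1]
            ll_row.append((a + b + c + d) / 4)  # local average (approximation)
            hl_row.append((a - b + c - d) / 4)  # left minus right columns
            lh_row.append((a + b - c - d) / 4)  # top minus bottom rows
            hh_row.append((a - b - c + d) / 4)  # diagonal detail
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


frame = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
ll1, lh1, hl1, hh1 = haar_dwt2(frame)  # one-level decomposition
ll2, lh2, hl2, hh2 = haar_dwt2(ll1)    # two-level: decompose LL again
```

For this smooth test frame all first-level detail bands are zero, which shows why most of the visually important information sits in LL and why the detail bands are the preferred hiding venue.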
Initially, Furuta et al. (2003) presented a high payload video steganography technique
for compressed videos. The proposed technique was based on an algorithm for compression
known as 3-D Set Partitioning In Hierarchical Trees (SPIHT). The authors used Bit-Plane
Complexity Segmentation (BPCS) algorithm for embedding. In SPIHT, the wavelet coeffi-
cients were quantized into a structure of bit-plane, and BPCS embedding was done to hide
secret data. Pros: high embedding capacity and cons: not robust against
lossy compression. Noda et al. (2004) proposed one more
technique in addition to Furuta et al.’s using Motion-JPEG2000 and BPCS. The authors
ran experiments with both embedding schemes: 3-D SPIHT and Motion-JPEG2000,
each with BPCS steganography. The results concluded that SPIHT was superior to
Motion-JPEG2000 in terms of embedding. Pros: the bit plane structure was beneficial for embedding
and cons: applicable only to wavelet compressed videos. Further, Lu et al. (2010) proposed
an efficient technique to hide biometric data in cover videos. The authors made
use of motion analysis for embedding a secret palm print set using a one-level DWT technique.
Before embedding, by using a watermark, each frame sequence number was embedded in
the frames themselves for exact data extraction. The dataset was embedded using transform
difference methods viz. frame based and block-based methods in fast motion frames and
fast motion blocks. Pros: high robustness against video processing attacks and compres-
sion; and cons: low hiding capacity. Moreover, Kolakalur et al. (2016) also used LSB for
embedding the video as secret data in HH sub-bands of the cover frames. The cover video
was first divided into RGB channels with each frame further divided into three blocks, and
the red channel from each block was chosen for hiding the secret video. Red channels of
each block were used for discrete wavelet transformation, and the red channels
of the secret video were embedded in the red channels of the first block, whereas the blue channels of the
secret video were embedded in the red channels of the second block and the green channels of the
secret video were hidden in the red channels of the third block. Pros: the grayscale video was
embedded in the cover video, and different wavelets and video formats were tested to gen-
erate stego-videos and cons: not tested for robustness against attacks.
Chantrapornchai et al. (2014) proposed a video steganography technique to hide a secret
image using Lifting Multiple Wavelet Transformation (LMWT). The proposed method
compared the coefficients of both the secret image and the cover video for embedding.
This scheme used two approaches: lifting multiple wavelet transformation-similarities
and lifting multiple wavelets transformation-random for performance comparison. If the
coefficients of the secret image and cover video were found similar then hiding was done
to reduce the errors; otherwise, the pixels of the secret image were hidden in randomly
selected coefficients. Pros: less visual distortion as similar value coefficients were used
for embedding and cons: limited embedding capacity and not tamper resistant. Further,
Patel et al. (2013) proposed a novel dual security algorithm for compressed video streams
by utilizing frames as well as the audio part of the videos. In this scheme, hiding was done
using LSB encoding, but before that the video was transformed using a 2-D lazy lifting wavelet
transform. The lifting scheme was used to obtain integer values that could store multimedia
data. Each frame of the video was transformed using the lazy lifting wavelet transform to
divide the frame into four sub-bands, and 3 data bits were hidden in each component of the
sub-bands using LSB. The audio part was used to hide the length of the data stream, and the
last frame was utilized to hide a number of bits using LSB. Pros: simple implementation with
high capacity; cons: less secure against steganalysis. To further improve security, Perumal
et al. (2018) performed entropy-based grouping before embedding secret data.
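The combination of a lazy wavelet split and 3-bit LSB substitution described for Patel et al. (2013) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the even/odd polyphase split used as the "lazy" transform, the band order, and the requirement of even frame dimensions are assumptions made for illustration only.

```python
def lazy_wavelet_2d(frame):
    """Split a frame into four sub-bands by even/odd row and column index.

    Assumption: the 'lazy' transform is the plain polyphase split, so all
    coefficients stay integers. Frame height and width must be even.
    """
    ee = [row[0::2] for row in frame[0::2]]
    eo = [row[1::2] for row in frame[0::2]]
    oe = [row[0::2] for row in frame[1::2]]
    oo = [row[1::2] for row in frame[1::2]]
    return [ee, eo, oe, oo]


def inverse_lazy_wavelet_2d(bands):
    """Interleave the four sub-bands back into a frame."""
    ee, eo, oe, oo = bands
    frame = []
    for r in range(len(ee)):
        even_row, odd_row = [], []
        for c in range(len(ee[r])):
            even_row += [ee[r][c], eo[r][c]]
            odd_row += [oe[r][c], oo[r][c]]
        frame += [even_row, odd_row]
    return frame


def embed_3lsb(bands, bits):
    """Replace the 3 least significant bits of successive coefficients."""
    padded = list(bits) + [0] * (-len(bits) % 3)
    groups = [padded[i:i + 3] for i in range(0, len(padded), 3)]
    out, k = [], 0
    for band in bands:
        new_band = []
        for row in band:
            new_row = []
            for value in row:
                if k < len(groups):
                    b2, b1, b0 = groups[k]
                    value = (value & ~0b111) | (b2 << 2) | (b1 << 1) | b0
                    k += 1
                new_row.append(value)
            new_band.append(new_row)
        out.append(new_band)
    if k < len(groups):
        raise ValueError("message too long for this frame")
    return out


def extract_3lsb(bands, n_bits):
    """Read back the 3 LSBs of each coefficient, in embedding order."""
    bits = []
    for band in bands:
        for row in band:
            for value in row:
                bits += [(value >> 2) & 1, (value >> 1) & 1, value & 1]
    return bits[:n_bits]
```

Because three bits are replaced per coefficient, each sample changes by at most 7 grey levels, which is consistent with the high capacity and weak steganalysis resistance noted above.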
In another work, Wahab et al. (2015) proposed a hybrid technique to hide the image in
video frames based on histogram shifting for lossless data hiding and DWT. The secret
data was embedded in sub-bands with high frequency, i.e., vertical, horizontal, and diago-
nal sub-bands. Before embedding the secret data, a histogram of the sub-bands with high
frequency was shifted to make free space for secret data. Pros: utilized only faded pixels
for embedding to improve capacity and imperceptibility, and cons: embedding was done
independently in frames. Focusing on security, Mstafa et al. (2015a) utilized the BCH
(15, 11) code to improve security while encoding a secret message. Embedding was then done
in the middle- and high-frequency coefficients (HL, LH, and HH) of the DWT, as they are less
sensitive. The algorithm used two keys, one for embedding and another for extraction, to
make it more secure. The first key was used to randomly change the positions of the secret
message bits before BCH encoding. The second key was used after encoding: the encoded
message was divided into 15-bit groups, and each group was XORed with 15-bit numbers.
Embedding was done by converting the RGB components into the YUV color space, and the
embedding in each of Y, U, and V was done according to Eqs. (1), (2), and (3), where ‘Em’ is
the embedding process and ‘Se’ represents the encoded secret data bits. Pros: highly robust
against various attacks such as Gaussian noise, impulse noise, and median filtering; cons:
embedding capacity could be improved.
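$$
Y = \begin{cases}
Em\left[\operatorname{floor}\left(Y(bit_{1,2,3})\right),\ Se\right] & \text{if } Y \ge 0 \\
Em\left[\operatorname{floor}\left(\left|Y(bit_{1,2,3})\right|\right),\ Se\right] & \text{if } Y < 0
\end{cases} \tag{1}
$$

$$
U = \begin{cases}
Em\left[\operatorname{floor}\left(U(bit_{1,2,3})\right),\ Se\right] & \text{if } U \ge 0 \\
Em\left[\operatorname{floor}\left(\left|U(bit_{1,2,3})\right|\right),\ Se\right] & \text{if } U < 0
\end{cases} \tag{2}
$$

$$
V = \begin{cases}
Em\left[\operatorname{floor}\left(V(bit_{1,2,3})\right),\ Se\right] & \text{if } V \ge 0 \\
Em\left[\operatorname{floor}\left(\left|V(bit_{1,2,3})\right|\right),\ Se\right] & \text{if } V < 0
\end{cases} \tag{3}
$$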
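The encoding stage described for Mstafa et al. (2015a) can be sketched as follows, under stated assumptions. The (15, 11) code is implemented here as the systematic cyclic code with generator polynomial x^4 + x + 1, which is the standard single-error-correcting BCH (15, 11) code. The function names, the integer bit-packing, and the way the 15-bit key value is supplied are hypothetical; the source does not specify them.

```python
G = 0b10011  # generator polynomial x^4 + x + 1 of the (15, 11) code


def poly_mod(value, gen=G):
    """Remainder of GF(2) polynomial division; ints hold coefficient bits."""
    gen_len = gen.bit_length()
    while value.bit_length() >= gen_len:
        value ^= gen << (value.bit_length() - gen_len)
    return value


def bch_15_11_encode(msg11):
    """Systematic encoding: 11 message bits followed by 4 parity bits."""
    if not 0 <= msg11 < 1 << 11:
        raise ValueError("message must fit in 11 bits")
    shifted = msg11 << 4
    return shifted | poly_mod(shifted)


def bch_15_11_decode(word15):
    """Correct at most one flipped bit, then return the 11 message bits."""
    syndrome = poly_mod(word15)
    if syndrome:
        for pos in range(15):
            if poly_mod(1 << pos) == syndrome:
                word15 ^= 1 << pos
                break
    return word15 >> 4


def xor_mask(word15, key15):
    """XOR a 15-bit group with a 15-bit key value (hypothetical key use)."""
    return word15 ^ key15
```

Applying `xor_mask` twice with the same value restores the codeword, so the extractor can undo the masking before decoding, and `bch_15_11_decode` then repairs any single bit error per 15-bit group.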
Further, to achieve more robustness Sadek et al. (2017) utilized human skin as a Region
of Interest (ROI) for embedding in videos. By utilizing an adaptive algorithm for skin
detection, the authors created a skin map for all the frames. A skin-map blocking step
was performed to discard skin pixels with errors. A three-level DWT was utilized for
embedding in red and blue channels to increase the robustness. The performance analysis
showed that the proposed approach improved the accuracy of skin detection and achieved 86%
accuracy between the embedded data and the extracted data. Pros: robustness against MPEG-4
compression; cons: the proposed scheme compromised on capacity and was computationally more
expensive. Moreover, the performance was improved by Kumar and Singh (2018) using 1-bit LSB
per 8-bit stack of the third-level DWT segmentation. Pros: secure technique, as human skin
ROI was utilized for embedding, and also robust against compression; cons: limited hiding
capacity, as there is a trade-off between robustness and capacity.
Some researchers have utilized both transform domain techniques (DCT and DWT) for video
steganography, and others have utilized both domains (spatial and transform) for embedding
in compressed videos. Ahmed et al. (2014) suggested a new steganography technique for videos
to embed secret data by using LSB in the frequency domain. Two methods were proposed for
embedding in the frequency domain: LSB parity and LSB XORing. Text and gray images were
embedded in the R channel (of RGB) of the frame. The embedding was done in the frequency
domain after applying a combined DWT and DCT algorithm. DWT was applied three times to
obtain the HH3 sub-band, and DCT was then applied to the HH3 sub-band. After quantization,
the embedding was done in the middle-frequency sub-band. Before embedding, the secret data
was encrypted with the RSA algorithm to make it more secure. Pros: robust against noise
attacks; cons: limited hiding capacity. Ramalingam et al. (2016) proposed an algorithm
to improve secret data security by deploying scene-change detection and the DCT quan-
tized coefficients of video-sequences. The scheme used DWT to minimize distortions and
enhance security. The proposed method improved security by adapting the data hiding pro-
cess in the DCT and DWT domains and was able to preserve good video quality as a result
of imperceptible distortions. Pros: utilized the localization of DWT, and embedding was done
in DCT and DWT coefficients for minimal distortion; cons: not robust against compression.
Later, Mstafa et al. (2017a) also used both DWT and DCT for embedding, with error correcting
codes for encoding the secret data, to achieve a trade-off between capacity,
imperceptibility, and robustness. Pros: highly secure and robust against attacks; cons:
limited hiding capacity, as the data was hidden in only a portion of the frames.
Additionally, a few researchers (Ramalingam and Isa 2014; Abbass et al. 2007; Pilania and
Gupta 2020; Narayanan et al. 2012) have utilized the Integer Wavelet Transform (IWT) for
embedding secret data in videos. To exploit spatial and temporal correlation, Abbass et al.
(2007) utilized 1D-IWT for embedding secret data in video frames to minimize distortion. The
experimental results showed high embedding capacity using blue frames with a zero bit error
rate. Pros: 1D-IWT is very time efficient; cons: low robustness and more prone to attacks.
Moreover, Narayanan et al. (2012) also emphasised spatial and temporal correlation to
minimize distortion after embedding an image in video frames using IWT. Before embedding,
the secret image was also divided into sub-bands using IWT, after which a fusion encoder was
created for embedding. The obtained results claimed high embedding capacity with firm
robustness. Pros: minimum distortion; cons: the method was applied to grayscale images only.
Further, Ramalingam and Isa
(2014) suggested a simple and secure steganography algorithm for AVI videos. The pro-
posed approach was based on Haar Integer Wavelet Transformation (IWT) where embed-
ding of the secret text was done in the RGB components of the AVI video files using LSB.
The RGB components of frames were normalized to avoid overflow/underflow. Experimental
results of the proposed approach demonstrated that the video file size did not change after
embedding and that statistical values of the histogram, such as the mean and median, were
unchanged. Pros: low complexity in terms of calculations and time; cons: more prone to
attacks, as the hiding was done using LSB. The discussed DCT and DWT literature is
summarized in Table 6, with the same column details as in Table 5.
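To make the idea of LSB hiding in a Haar integer wavelet domain concrete, the following minimal sketch applies one level of the integer Haar transform (the lifting-based S-transform) to a row of samples and writes message bits into the least significant bits of the detail coefficients. It is an illustration under assumptions, not the algorithm of Ramalingam and Isa (2014): the function names and the choice of the detail band are hypothetical, and the normalization that the authors use to avoid overflow/underflow is omitted.

```python
def haar_iwt_1d(samples):
    """One level of the integer Haar (S) transform; len(samples) must be even."""
    approx, detail = [], []
    for a, b in zip(samples[0::2], samples[1::2]):
        d = a - b
        s = b + (d >> 1)  # equals floor((a + b) / 2), integers only
        approx.append(s)
        detail.append(d)
    return approx, detail


def haar_iwt_1d_inverse(approx, detail):
    """Exact inverse of haar_iwt_1d."""
    samples = []
    for s, d in zip(approx, detail):
        b = s - (d >> 1)
        a = d + b
        samples += [a, b]
    return samples


def embed_lsb(coeffs, bits):
    """Write one bit into the LSB of each of the first len(bits) coefficients."""
    if len(bits) > len(coeffs):
        raise ValueError("message too long")
    out = list(coeffs)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out


def extract_lsb(coeffs, n_bits):
    """Read the LSBs back in the same order."""
    return [c & 1 for c in coeffs[:n_bits]]
```

Since the lifting steps use only integer additions and shifts, the transform is exactly invertible, so the embedded bits survive the round trip through the spatial domain as long as no sample leaves the valid pixel range.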
The secret data in videos can be embedded not only with spatial and transform domain
techniques but also with video-specific techniques that can utilize techniques from both
domains. The statistical properties of videos are complex and have more features as
compared to images and audio. Presently, due to the easy availability of compression
software, the majority of videos are stored in a compressed format, and researchers utilize
them for embedding secret data based on video codec features. In this section, literature
related to motion vector, VLC, and format-specific techniques is discussed.
Motion vectors are the basis of motion estimation which is used in video compression, and
motion estimation is utilized for the reduction of temporal redundancy. The motion estima-
tion aims to find the best matching macroblock by predicting the current frame employing
one or more previous frames due to the high correlation of the neighborhood frames. Usu-
ally, to find the best matching block, three matching criteria are used: Mean Square Error
(MSE) (Cao et al. 2011), Sum of Absolute Difference (SAD) (Su et al. 2013; Cao et al.
2015; Yao et al. 2015) and Mean Absolute Difference (MAD) (Ren et al. 2014). These
three can be calculated by using Eqs. (4), (5), and (6), respectively, and after finding the
best match, the motion vector values can be calculated. MV-based techniques can utilize both
domains, as neighboring MVs have a spatial and temporal correlation with each other
(Tasdemir et al. 2016). MVs can survive even after compression; therefore, they can be
utilized for video steganography to achieve more robustness. According to the literature,
data hiding using motion vectors in videos can be done by utilizing their vertical
components, horizontal components (Aly 2011), the magnitude of the motion vector, the
direction of movement, phase angle (Xu et al. 2006; Pan et al. 2010), and others (Cao et al.
2015; Yao et al. 2015). This subsection attempts to cover all the related literature on
motion vector based video steganography.
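$$MSE = \frac{1}{s^2} \sum_{i=0}^{s-1} \sum_{j=0}^{s-1} \left( Cu_{ij} - Rf_{ij} \right)^2 \tag{4}$$

$$SAD = \sum_{i=0}^{s-1} \sum_{j=0}^{s-1} \left| Cu_{ij} - Rf_{ij} \right| \tag{5}$$

$$MAD = \frac{1}{s^2} \sum_{i=0}^{s-1} \sum_{j=0}^{s-1} \left| Cu_{ij} - Rf_{ij} \right| \tag{6}$$

where s is the size of the macroblock, and Cu and Rf denote the current and reference
macroblocks, respectively.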
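The three matching criteria and their use in motion estimation can be sketched as below. This is a minimal full-search illustration, not a codec implementation: the function names, the list-of-lists frame layout, and the search radius parameter are assumptions. The returned displacement follows the reference-minus-current convention, i.e. horizontal offset first and vertical offset second.

```python
def sad(cu, rf):
    """Sum of absolute differences between two s x s blocks."""
    return sum(abs(c - r) for cr, rr in zip(cu, rf) for c, r in zip(cr, rr))


def mad(cu, rf):
    """Mean absolute difference: SAD divided by s^2."""
    s = len(cu)
    return sad(cu, rf) / (s * s)


def mse(cu, rf):
    """Mean square error between two s x s blocks."""
    s = len(cu)
    total = sum((c - r) ** 2 for cr, rr in zip(cu, rf) for c, r in zip(cr, rr))
    return total / (s * s)


def block(frame, top, left, s):
    """Cut an s x s block whose top-left corner is (top, left)."""
    return [row[left:left + s] for row in frame[top:top + s]]


def best_match(cur_frame, ref_frame, top, left, s, radius, cost=sad):
    """Full search in the reference frame; returns (cost, dx, dy)."""
    cu = block(cur_frame, top, left, s)
    height, width = len(ref_frame), len(ref_frame[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= height - s and 0 <= x <= width - s:
                c = cost(cu, block(ref_frame, y, x, s))
                if best is None or c < best[0]:
                    best = (c, dx, dy)
    return best
```

Passing `cost=mse` or `cost=mad` switches the matching criterion without changing the search itself.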
Table 6 Comparative analysis of DCT and DWT techniques

| Authors | Technique | Average Capacity | Im | Ro | En | Rv | PSNR | Security |
|---|---|---|---|---|---|---|---|---|
| Esen et al. (2011) | 8 × 8 DCT blocks, DCT (middle frequency) | Single bit is hidden in each 8 × 8 block | + | + | − | − | N/A | Untested |
| Zhang et al. (2015a) | 4 × 4 DCT | 0.042 KB/frame | + | + | − | − | 35.12 | High |
| Yang et al. (2005) | 4 × 4 DCT | 1 bit embedded per 4 × 4 DCT block | + | + | − | − | N/A | Untested |
| Shou-Dao et al. (2009) | 4 × 4 DCT macroblock | 1 bit embedded per 4 × 4 DCT block | + | + | − | − | 42.62 | Untested |
| Ma et al. (2010) | 4 × 4 DCT | 0.104 KB per intra frame | + | + | − | − | 40.74 | Untested |
| Lin et al. (2013) | 4 × 4 DCT | 54% of the luma 4 × 4 DCT block coefficients | + | + | − | − | 42.21 | Untested |
| Nie et al. (2018) | 4 × 4 macroblocks | 79.17 KB | + | − | − | − | 32.76 | High |
| Liu et al. (2015) | 4 × 4 Integer DCT | 0.001 KB in 20 I-frames | + | + | − | − | 35.85 | High |
| Liu et al. (2016) | 4 × 4 Integer DCT | 0.017 KB in 20 I-frames | + | + | − | − | 36.56 | Untested |
| Song et al. (2014) | 4 × 4 (Luma blocks) DCT | 1 bit hidden in each 4 × 4 luma block | + | + | − | − | 39.43 | Untested |
| Idbeaa et al. (2016) | Embedding-Based Byte Differencing (EBBD) | 2 bits embedded into each 2 × 2 QAC coefficients per frame | + | + | + | − | 38.08 (approximately) | Untested |
| Mumthas and Lijiya (2017) | DCT | 0.041 KB | + | + | + | − | 37.29 | Untested |
| Lu et al. (2010) | DWT and Watermark | 0.28 KB/frame | + | + | − | − | 45.59 | Untested |
| Mstafa et al. (2015a) | DWT (LL, LH, and HH components) | 28.12% capacity ratio | + | + | − | − | Y: 36.9, U: 41.2, V: 42.61 | High |
| Ahmed et al. (2014) | LSB, DCT, and DWT | 1.8 KB | + | + | + | − | 73.21 | Untested |
| Kumar and Singh (2018) | Three level DWT | 4.64 KB | + | + | − | − | 64.13 | Untested |
| Noda et al. (2004) | BPCS | 28% capacity ratio for 12 bit plane | + | + | − | − | 43.95 | Untested |
| Kolakalur et al. (2016) | DWT, LSB, and RGB | N/A | + | − | + | − | 35.2 | Untested |
| Chantrapornchai et al. (2014) | LMWT-similar and LMWT-random | N/A | + | − | − | − | LMWT-similar: 55.15, LMWT-random: 41.95 | Untested |
| Patel et al. (2013) | LSB and 2-D Lazy lifting wavelet transformation | 12.5% capacity ratio | + | − | + | − | 31.23 | Untested |
| Wahab et al. (2015) | Histogram and DWT | 67.39 KB | + | + | − | + | 47.45 | Untested |
| Sadek et al. (2017) | Three level DWT | 2.78 KB | + | + | − | − | 53.55 | Untested |
| Ramalingam et al. (2016) | DCT and DWT | 25.3% capacity ratio | + | + | − | − | 26.5 | Untested |
| Mstafa et al. (2017a) | DCT and DWT | Capacity ratio for DCT: 3.46, capacity ratio for DWT: 3.40 | + | + | + | − | DCT: 48.67, DWT: 49.01 | High |
| Narayanan et al. (2012) | Integer Wavelet Transform | N/A | + | + | − | − | 49.49 | Untested |
| Ramalingam and Isa (2014) | Haar Integer Wavelet Transform | 1.13 KB | + | + | − | − | N/A | Untested |

Xu et al. (2006) employed motion vectors for embedding by concealing the data in P- and
B-frames. They embedded the data only in motion vectors with a high magnitude and embedded
the control information into the I-frames. For embedding, they first calculated the phase
angle (θ) using Eq. (7), i.e., the arctangent of the ratio of the vertical (MVv) to the
horizontal (MVh) motion vector component, and embedding was done based on that angle. If the
obtained angle was acute, then embedding was done in the horizontal component and
vice-versa. Pros: high imperceptibility due to the selection of large-magnitude MVs for
embedding, and robust against video processing attacks; cons: limited capacity and less
secure, as many existing steganalysis algorithms have detected the existence of the secret
message in stego-videos. In another work, Xu and Ping (2007) employed an integer-to-integer
two-level wavelet transformation for data hiding in the motion components of low frequency
coefficients.
$$\theta = \arctan\left( \frac{MV_v}{MV_h} \right) \tag{7}$$
To overcome the drawback of Xu et al.’s scheme, Pan et al. (2010) proposed a video
steganography algorithm based on linear block codes and motion vectors. The proposed
algorithm utilized the phase angle of motion vectors for embedding, and (6, 2) linear block
codes were utilized to reduce the modification rate of the motion vectors after embedding.
In this scheme, a consistent monitoring matrix of the linear block codes was used as a secret
key. Pros: good embedding capacity, 2/3 of the total number of MVs; cons: not tested against
basic video processing attacks. Additionally, Jue et al. (2011) designed an algorithm with
the goal of a very high embedding capacity for the H.264/AVC format. This algorithm embedded
data in the differences of MV components and utilized the macroblocks of P- and B-slices for
embedding in the luminance component, aiming at high payload capacity and high embedding
efficiency. The payload capacity of the proposed algorithm was very high, as one bit of a
secret message could be hidden in each motion vector, with an average PSNR of 36 dB. Pros:
fast and simple implementation; cons: visual quality not satisfactory. In another work, Aly
(2011) concealed the data in the vertical and horizontal components of motion vectors with
high prediction error. For motion estimation, the author used fast three-step and exhaustive
searches, utilizing the I-frame as a reference frame, with data hidden in P- and B-frames.
This scheme used a prediction error threshold for each frame to help the decoder recognize
the motion vectors that carry secret message bits. Pros: highly robust, as a greedy search
was used to select the motion vectors for embedding; cons: low hiding capacity.
In a different work, Cao et al. (2011) presented an adaptive technique with optimized
perturbations to motion estimation for hiding secret data using a parity function. The
technique utilized the internal dynamics of video compression to resist many blind
steganalysis techniques. In this scenario, the Candidate Motion Vectors (CMVs) were selected
using MSE, and motion vectors were calculated using Eq. (8). The method preserved the
statistical characteristics of motion vectors, which made it less detectable by the existing
steganalysis techniques for motion vector based steganography. The experimental results
demonstrated that an increase in the average payload increased the scaling factor and
decreased the average values of PSNR and SSIM. K-L divergence was used as a parameter to
measure the closeness of histograms. This method can be further optimized by testing more
selection rules and parity functions.
$$MV(hc, vc) = \left( H_{Rf} - H_{Cu},\ V_{Rf} - V_{Cu} \right) \tag{8}$$

where hc and vc denote the horizontal and vertical components, and (H_Rf, V_Rf) and
(H_Cu, V_Cu) are the reference and current macroblock coordinates, respectively. Pros: high
security due
to the embedding done at the time of motion estimation and cons: complex method in
terms of calculation. Shanableh (2012a) and Su et al. (2013) examined coding features
for embedding, where Shanableh proposed a multi-layer coding and transcoding technique for
data hiding in MPEG-2 standard videos and Su et al. utilized H.264/AVC coding. The main
difference between the two is that Shanableh used two coding features, quantization scales
and coding parameters, whereas Su et al. also considered intra-prediction as a third
feature. Su et al. also presented three profiles for embedding, namely High, Medium, and
Low, to state the payload amount. These profiles were defined according to different coding
features: the High profile used enhanced residual embedding, motion vector (MV) embedding
with Rate Distortion Optimization, and 4 × 4 intra coding features; the Medium profile used
residual embedding, Most Probable Mode embedding, and the 4 × 4 intra coding mode; and the
Low profile used only inter- and intra-prediction modes for embedding. The experimental
results demonstrated that embedding was most suitable in the Medium profile, as it achieved
the best balance among requirements such as payload, PSNR, and bit-rate increase. Pros: high
hiding capacity; cons: did not consider intra-frame distortion.
Also, embedding using motion vectors can cause some distortion. Therefore, Yao et al. (2015)
aimed to design a distortion function that preserves the spatio-temporal correlation of
motion vectors and minimizes the distortion due to hiding. Two elements were used in the
distortion function: the Prediction Error Change (PEC) due to motion vector modification,
and the Statistical Distribution Change (SDC) of MVs in the spatio-temporal domain. SDC was
considered because the statistical distribution changes after modification of motion
vectors, and that change could be exploited by steganalysis; therefore, the co-occurrence
matrix was computed using the spatial and temporal correlation. PEC was considered because
the motion vector and block prediction error are directly correlated. For embedding, a
two-layered Syndrome-Trellis Code (STC) was used to reduce distortion to a minimum. The
obtained results stated that the proposed scheme surpassed the existing techniques in terms
of bit rate increment and maintained the visible quality of the video with an average PSNR
above 34 dB. Pros: high robustness and imperceptibility; cons: detected by an existing
steganalysis technique (Wang et al. 2017). Cao et al. (2015) also utilized STC to enhance
the security of an MV-based video steganography scheme. Data was embedded using optimized
perturbations to the motion estimation process. These perturbations were introduced using
the coding results of STC, which was used to minimize the embedding impact. The proposed
scheme reduced the detection probability under the best steganalytic technique available at
the time, Add-or-Subtract One (Wang et al. 2014a). The experimental results demonstrated
that it outperformed the existing MV techniques with a small impact on coding performance.
Pros: high security with very little impact on video coding; cons: complex, as it requires
more calculations.
In addition to these techniques, Zhang et al. (2015b) proposed a novel approach known as
Motion Vector Modification with Preserved Local Optimality (MVMPLO). The MVMPLO scheme had
three steps: first, a search area for candidate motion vectors was designated; second, local
optimality was evaluated for each motion vector in that area to locate all locally optimal
ones; finally, the motion vector among them that contributed least to the degradation of
compression efficiency was selected. The proposed method was combined with steganographic
codes, which resulted in a highly undetectable MV data hiding scheme. The experimental
results demonstrated that MVMPLO surpassed the existing techniques, as it withstood the best
steganalytic methods available at the time, such as Add or Subtract One (AoSO) (Wang et al.
2014a) and Subtractive Probability of Optimal Matching (SPOM) (Ren et al. 2014). Pros:
secure, as the technique resists the existing steganalysis attacks; cons: not tested for
imperceptibility and visual distortion. Further, Rezagholipour and Eshghi (2016) presented a
motion vector based
technique for embedding secret data in moving objects. Additionally, Song et al. (2015)
presented a reversible video steganography algorithm by utilizing 3D multi-view coding
(MVC) videos. The embedding was done in motion vectors with each containing 1 secret
message bit. The experimental results stated that the proposed scheme was able to avert
distortion drift as the secret data was embedded in b4 frames. Pros: Low complexity and
suitable for 3D videos without inter distortion drift and cons: less robust against attacks.
Variable length coding (VLC) is used for compression and maps source symbols to a variable
number of bits. It allows videos to be compressed and decompressed with zero error. Some
researchers have utilized this technique for hiding secret data in a video. Liu et al.
(2006a) presented a novel steganography scheme for the compressed domain using VLC for
MPEG-2 video streams. The embedding was done adaptively with A/S trees of the VLC domain
using LSB. A/S trees were predefined in a standard VLC table and were mapped to a code-tree.
A/S trees automatically embedded the secret data by generating a pseudo-random number code
sequence. It generated three sequences, one for embedding and two for decoding.
Subsequently, Liu et al. (2006b) advanced their work by using perturbed digital chaos
instead of A/S trees to generate Pseudo Random Number (PRN) sequences. The authors employed
a Piecewise Linear Chaotic Map (PWLCM) to generate two chaotic PRN sequences, one for the
pre-processor and another for the Variable-Length Decoder (VLD). This time, the authors used
DCT quantized coefficients for embedding the secret data. The average PSNR was almost equal
to that of the previous work, but the capacity was low. Pros: embedding and extraction were
done without decompression; cons: low embedding capacity. Similarly, to improve security,
Liu et al. (2008) used the VLD to parse the motion vectors, DCT coefficients, and intra
macroblocks. After parsing, a scene change detector used the DCT coefficients to determine
slow-speed, single-scene sub-sequences. The embedding was done in the I-frames of MPEG-2
video streams using the chaotic dynamic system and VLC mapping. To assess the security of
the proposed scheme, a steganalyzer based on the principle of collusion was used, which
determined the deviation of DCT coefficients and the correlation between frames in a scene.
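How a piecewise linear chaotic map can drive a pseudo-random bit sequence is sketched below. The map itself follows the commonly used PWLCM definition with control parameter 0 < p < 0.5. The rest is assumed for illustration: the function names, the use of the initial value and p as the secret key, and the thresholding of each state at 0.5 to obtain a bit. The perturbation step that Liu et al. (2006b) apply to counter finite-precision effects is not modelled here.

```python
def pwlcm(x, p):
    """One iteration of the piecewise linear chaotic map, 0 < p < 0.5."""
    if x >= 0.5:
        x = 1.0 - x  # the map is symmetric about x = 0.5
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)


def prn_bits(seed, p, n):
    """Generate n pseudo-random bits from the chaotic orbit of `seed`.

    Assumption: a bit is 1 when the current state is >= 0.5, else 0.
    """
    x, bits = seed, []
    for _ in range(n):
        x = pwlcm(x, p)
        bits.append(1 if x >= 0.5 else 0)
    return bits
```

The sender and receiver obtain the same sequence only if they share the same `seed` and `p`, which is what allows such a sequence to act as a key-dependent control signal for embedding and extraction.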
There exist embedding techniques that are based on specific features of the video codec and
video formats. Also, with the advancement of video formats (standards), more features have
been introduced; for example, H.264/AVC provides two types of entropy coding schemes, namely
CAVLC (Context-Adaptive Variable Length Coding) and CABAC (Context-based Adaptive Binary
Arithmetic Coding) (Wiegand et al. 2003). A few researchers have utilized specific features
of the video stream for embedding. Shanableh (2012b) proposed two new data hiding schemes
for MPEG videos. In the first scheme, secret data bits were hidden in compressed MPEG videos
of constant bit rate by modifying the quantization scale. Features were extracted from each
macroblock, and a second-order regression model was used to calculate the bits of the hidden
message. The decoder used this regression model to predict the hidden message bits. This
provided very high prediction accuracy, but the payload was limited, with only one message
bit hidden in each macroblock. To overcome this limitation, the second scheme used both bit
rates, viz. constant
and variable codes. This scheme used a feature of H.264/AVC videos called Flexible
Macro-block Ordering (FMO) for data hiding. The results of the proposed techniques
demonstrated that the average message prediction accuracy for the first scheme was 95.83%
using second-order regression. The maximum payload was 10 Kbits/s for the first proposed
scheme and 30 Kbits/s for the second scheme. Pros: high prediction accuracy for MPEG videos;
cons: restricted payload, and visual quality was affected due to interference with the bit
rate.
Moreover, Jiang et al. (2015) presented a video steganography technique for H.265/
HEVC video coding standard by introducing a novel concept of “Constant Bitrate Infor-
mation Bit (CBIB)” using CABAC in the entropy coding. For Motion Vector Difference
(MVD) they designed a codeword reservation and substitution rule for the encoding. The
experimental results proved the efficacy of the proposed work with desirable impercep-
tibility, capacity, and low computational cost. Pros: there was no bitrate increase as the
embedding was done on bit substitution with constant bitrate and cons: this technique was
specifically designed for HEVC videos only. In a distinct work, Xu et al. (2014) presented a
novel video steganography technique to hide data in H.264/AVC encrypted videos. Before
hiding the data in video streams, the video was first encrypted with a standard stream
cipher (e.g., RC4). The encryption was not applied to the whole video but to parts of the
video format, viz. the code words of motion vector differences, the code words of
intra-prediction modes, and the code words of residual coefficients. The encryption
algorithm was merged with Context-Adaptive Variable Length Coding (CAVLC) and Exp-Golomb
entropy encoding. The data was hidden in the compressed and encrypted video by code-word
substitution, without knowledge of the original video. The code-words of P-frames were used
for embedding, and the I-frame code-words remained unchanged to minimize the prediction
error. The results demonstrated that the proposed technique preserved the size of the file
even after video encryption, with less visual distortion. Pros: the size of the video
bitstream was not changed even after encryption and embedding; cons: low embedding capacity,
as the statistical characteristics of CAVLC were not exploited. To improve the embedding
capacity, Xu et al. (2016) utilized more of the redundancy of the CAVLC code words than the
previous work. The improvement was made in two ways: first, when suffixLength was 1,
embedding was performed by paired code-word substitution; second, when suffixLength was 2,
embedding was done through the multiple-based notational system. Pros: high capacity and
unchanged bit-rate while embedding; cons: visual quality degraded due to changes in code
words while embedding. The comparative summary of video format based techniques is given in
Table 7.
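Why substituting one code word for another can leave the bitstream size unchanged is easiest to see on Exp-Golomb codes, which H.264/AVC uses for syntax elements such as motion vector differences. The sketch below implements the unsigned and signed Exp-Golomb mappings of the standard; the function names are hypothetical, and the example is only an illustration of equal-length code words, not the substitution tables of Xu et al. (2014, 2016).

```python
def ue_encode(n):
    """Unsigned Exp-Golomb code word for n >= 0, returned as a bit string."""
    binary = bin(n + 1)[2:]
    return "0" * (len(binary) - 1) + binary


def ue_decode(bits, pos=0):
    """Decode one unsigned code word starting at pos; returns (value, next_pos)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end


def se_encode(v):
    """Signed Exp-Golomb code word: v > 0 maps to 2v - 1, v <= 0 maps to -2v."""
    return ue_encode(2 * v - 1 if v > 0 else -2 * v)


def se_decode(bits, pos=0):
    """Decode one signed code word; returns (value, next_pos)."""
    k, end = ue_decode(bits, pos)
    return ((k + 1) // 2 if k % 2 else -(k // 2)), end
```

In this code, the code words of +k and −k always have the same length and differ only in their last bit, so replacing one by the other changes the decoded value without changing the number of bits in the stream.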
H.265/HEVC (High-Efficiency Video Coding) is the next step in video encoding standards; it
compresses videos more efficiently and provides additional features as compared to its
predecessors. To meet the requirements of high definition and high compression ratio, some
researchers have recently utilized HEVC (Wang et al. 2018; Bo and Jie 2018; Liu et al. 2018)
for data hiding. Liu et al. (2018) presented a scheme for hiding data in H.265/HEVC where
data was embedded in intra-frame prediction mode without distortion drift. The scheme was
able to avert distortion drift by employing three directional mode conditions, namely: (1)
left-down and down mode, (2) top-right and right mode, and (3) the intersection of the
previous two modes with the right-down mode. The embedding was done in multi-coefficients of
luminance 4 × 4 blocks obtained by applying DST in H.265. The scheme was improved by Liu and
Xu (2020) with increased survival rates of 28.74%, 34.60%, 56.05%, and 41.40% as compared to
Liu et al. (2018). Also, the proposed scheme achieved a 100% survival rate when the loss
rate was less than 15%. Further, Konyar et al. (2020) presented a reversible video
steganography scheme based on matrix encoding for the H.265/HEVC video format without
distortion drift, and embedding was done using DCT
coefficients. Matrix encoding was used to provide high embedding capacity and high fidelity
without error propagation. The experimental results stated a high hiding capacity of
approximately 3543 bytes with acceptable PSNR and a low bitrate increase. Additionally,
Galiano et al. (2020) utilized HEVC high definition videos for embedding secret messages
using the luminance components of 4 × 4 intra-blocks. They utilized the HM-16.2 reference
software and the x265 real-time encoder for embedding in the 2nd Most Significant Bit (MSB)
without distorting the quality of the video. The proposed scheme was robust against attacks
and was also tested against steganalysis tools to ensure security.
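The principle behind matrix encoding, hiding several message bits while changing at most one cover element, can be shown with the classic (1, 3, 2) scheme: two message bits are carried by three cover bits, and at most one of them is flipped. The sketch below is a generic textbook instance with hypothetical function names. It is not the specific matrix or coefficient selection used by Konyar et al. (2020); the cover bits are assumed to be, for example, the LSBs of three selected coefficients.

```python
def matrix_embed(cover, msg):
    """Embed 2 message bits into 3 cover bits, flipping at most one of them."""
    a1, a2, a3 = cover
    x1, x2 = msg
    miss1 = x1 != (a1 ^ a3)
    miss2 = x2 != (a2 ^ a3)
    if miss1 and miss2:
        a3 ^= 1  # a3 takes part in both parities, so one flip fixes both
    elif miss1:
        a1 ^= 1
    elif miss2:
        a2 ^= 1
    return [a1, a2, a3]


def matrix_extract(stego):
    """Recover the 2 message bits as parities of the 3 stego bits."""
    a1, a2, a3 = stego
    return [a1 ^ a3, a2 ^ a3]
```

Compared with plain LSB substitution, which would change on average one cover bit to carry two message bits, this scheme changes on average 0.75 cover bits for the same payload, which is the source of the fidelity gain.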
With current technological advancements, needs are also changing, and the future is 3D
video. Luo et al. (2020) utilized the 3D-HEVC coding standard for embedding secret data for
real-time applications. Additionally, Wang et al. (2019) analyzed the probability
distribution of the 4 × 4 intra-prediction mode to propose a secure video steganography
technique for HEVC videos. They utilized the combined coding unit and prediction unit for
cover selection, and embedding was done using matrix coding on HEVC video streams. The
experimental results stated that the proposed video steganography scheme is easy to
implement and provides additional security with high video quality. Further, Yang et al.
(2019) utilized one of the most innovative features of HEVC videos, i.e., the PU partition
mode in P-frames, to propose a high-capacity, multilevel video steganography technique.
Moreover, researchers are also applying recent technologies such as artificial intelligence
in different fields. In video steganography, researchers now use artificial intelligence and
machine learning based techniques to optimize the hiding venues for secret data. Suresh and
Sam (2020) employed the Functional Grey Wolf Optimization technique to select the optimal
region for embedding secret data using a multi-objective cost function. The secret data was
pre-processed using an encryption scheme for additional security before embedding, and
embedding was done using the Lifting Wavelet Transform on the key frames. The obtained
results stated a high average PSNR of 72.9% with a high average embedding capacity of 68.7%.
A few researchers utilized neural networks such as Artificial Neural Networks (ANN) (Kaur
and Kaur 2018) and Full Counter Propagation Neural Networks (FCPNN) (Choubey and Bansal
2014) for video steganography. Further, Khare et al. (2014) used back propagation neural
networks to improve the performance of an LSB-based video steganography scheme. In another
work, Abdolmohammadi et al. (2019) utilized a 3D convolutional neural network to exploit
spatial and temporal features for embedding secret data using video steganography.
Video steganography techniques embed the secret data in a video with little or no quality
degradation, and sometimes with high degradation. The quality of the stego-video is measured
with the help of different quality and quantity metrics, which numerically approximate the
quality of the stego-video. PSNR and MSE are the primary and most widely used metrics for
quality evaluation. These metrics are simple and have low computational complexity; however,
they were initially designed for images only. So, evaluation of the stego-video on the basis
of these metrics alone may not be ideal. Video-specific metrics are needed to evaluate the
quality of stego-video, and very few video quality metrics are available, such as SSIM and
VQM (Video Quality Metric). SSIM is based on the characteristic that the human eye extracts
structural information from video/frames instead of extracting
13
Table 7 Comparative Analysis of Video Codec Based Techniques
Authors Technique Average Capacity Im Ro En Rv PSNR (dB) Security
13
Song et al. (2015) Motion vector embedding in b4 0.054 KB/b4frame + − + + 39.49 Untested
frames
Xu et al. (2006) Motion vector components 0.067 KB in P-frame 0.564 KB in + + − − I-frame 35.22 Low (Ren et al. 2014; Tasdemir
B-frame P-frame 34.61 et al. 2016; Wang et al. 2014a;
B-frame 33.31 Sur et al. 2015)
Su et al. (2013) Quantization, Intra-prediction, and High profile- 14.24% Medium + + − − High profile – 35.44 Untested
the Motion vector profile- 13.27% Low profile- Medium profile-
12.23% 37.62 Low profile-
38.87
Cao et al.(2015) Optimized Perturbation Motion At bit-rate 62.5 KB/sec capacity + + − − At bit rate 0.5 High
Estimation with STC – 0.438 KB/sec At 1Mbits/s Mbits/s- 32.81 At
capacity—0.262 KB/sec bit-rate 1 Mbits/s-
36.1
Yao et al. (2015) Motion Vector and Syndrome- 0.0003 KB per motion vector + + − − 35.47 Low (Wang et al. 2017)
trellis code (STC)
Aly (2011) MV horizontal and vertical Capacity ratio 25% in P-frame and + + − − N/A Low (Ren et al. 2014; Tasdemir
components based on associated 75% in B-frame et al. 2016; Wang et al. 2014a;
macroblock prediction error Sur et al. 2015)
Pan et al. (2010) Phase angle of Motion Vector 4bits embedded per 6 motion + + − − 37.56 Untested
vector bits
Jue et al. (2011) Motion vector components dif- 0.005 KB per P or B-frame with + + − − 36.46 Low (Sur et al. 2015)
ference the largest amplitude of MVs
Shanableh (2012a) Quantization Scale and Motion 2.05 KB/sec + + − − 34.5 Untested
vector using Matrix encoding
Zhang et al. (2015b) Motion vector modification with N/A + + - - 38.78 High*(Zhai et al. 2017)
preserved local optimality
Rezagholipour and Horizontal and vertical compo- 38.05 KB + − − − 43.26 Untested
Eshghi (2016) nents of motion vectors
Liu et al. (2006b) VLC 1 message bit per block + − − − 39.5 Low (Tasdemir et al. 2013)
M. Dalal, M. Juneja
Table 7 (continued)
Authors Technique Average Capacity Im Ro En Rv PSNR (dB) Security
Shanableh (2012b) Constant bit-rate and Variable For constant bit-rate – 1.25 KB/ + − − − 34.21 Untested
bit-rate sec$$For both constant and
variable bit-rates – 3.75 KB/sec
Jiang et al. (2015) MVD’s CABAC 924bpp + − + − N/A Untested
Xu et al. (2016) Code word substitution 0.608 KB/sec + + + + 35.49 Untested
*
means the proposed technique has been attacked by the referred steganalysis method, but the results are not good enough
A survey on information hiding using video steganography
13
M. Dalal, M. Juneja
errors. It is a step by step calculation process which makes it computationally complex. VQM
is another video specific metric that uses seven parameters value as a linear combination for
metric calculation.
Further, robustness is measured with the BER (Bit Error Rate) metric, which examines
whether the secret data is recovered from the stego-video successfully or not. Hiding/embedding
capacity is also one of the main parameters of a successful steganography technique; it
determines the amount of secret data that can be hidden in the cover object without distortion.
Also, to examine the amount of secret data transferred from the sender to the receiver, the
metric called entropy is calculated. The entropy metric is not used much for video steganography
nowadays; only a few researchers have utilized entropy as a quality metric
in their work. All the metrics discussed so far are for image or video quality, but none of them is
specific to steganography. So, for steganography, researchers use a benchmark known as MMD
(Maximum Mean Discrepancy). The definitions and formulas of the quality
metrics are discussed in this section.
Peak Signal to Noise Ratio (PSNR) is used to measure the quality of the video reconstructed by
a video codec. It is an estimate of the perceived quality of the reconstructed video,
measured in decibels (dB). Generally, a high PSNR indicates a reconstruction of high quality. It
is mostly calculated with the help of MSE (Mean Square Error). If a noise-free monochrome
frame X (cover) of a × b dimension (resolution) is given and its noisy approximation
is Y (stego), then the mean square error is as given in Eq. (9) and the PSNR as given in Eq. (10):

$$\mathrm{MSE} = \frac{1}{a \cdot b} \sum_{i=0}^{a-1} \sum_{j=0}^{b-1} \left[ X(i,j) - Y(i,j) \right]^2 \tag{9}$$

$$\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}_X^2}{\mathrm{MSE}} \tag{10}$$

where $\mathrm{MAX}_X$ is the maximum possible pixel value of the frame (255 for 8-bit frames).
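As an illustration of Eqs. (9) and (10), the following minimal Python sketch computes MSE and PSNR for two small monochrome frames stored as nested lists. The function names, the sample pixel values, and the default peak value of 255 (8-bit frames) are assumptions made for this example and are not taken from the surveyed works.

```python
import math

def mse(cover, stego):
    """Mean square error between two equally sized monochrome frames (Eq. 9)."""
    a, b = len(cover), len(cover[0])
    total = 0
    for i in range(a):
        for j in range(b):
            diff = cover[i][j] - stego[i][j]
            total += diff * diff
    return total / (a * b)

def psnr(cover, stego, max_value=255):
    """PSNR in dB (Eq. 10); max_value=255 assumes 8-bit frames."""
    err = mse(cover, stego)
    if err == 0:
        return math.inf  # identical frames: no noise at all
    return 10 * math.log10(max_value ** 2 / err)

# Hypothetical 2 x 3 cover frame and a stego frame with three pixels changed by 1.
cover = [[52, 55, 61], [59, 79, 61]]
stego = [[53, 55, 60], [59, 78, 61]]
print(mse(cover, stego))              # 0.5
print(round(psnr(cover, stego), 2))   # 51.14
```

For a video, one common choice is to average the per-frame PSNR over all frames; that averaging step is left out here.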
The bit error rate (BER) is the rate at which errors occur in a transmission medium. If the
transmission medium is sound and the signal to noise ratio (SNR) is high, then the value of BER
will be small, which indicates no noticeable difference between the video transmitted and
received. However, if noise is present, the bit error rate needs to be considered. It is a function
of the amount of distortion due to manipulation of the videos and is used to measure the
robustness of the embedding technique. The formula for bit error rate is as follows (He et al. 2012):

$$\mathrm{BER} = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} \left[ X_{(i,j)} \oplus Y_{(i,j)} \right]}{a \cdot b} \tag{11}$$
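Eq. (11) can be sketched in a few lines of Python. Here X and Y are assumed to be a × b arrays of bits (the embedded and the extracted secret data); the function name and the sample bits are hypothetical.

```python
def bit_error_rate(embedded, extracted):
    """Fraction of differing bits between two a x b bit arrays (Eq. 11)."""
    a, b = len(embedded), len(embedded[0])
    # XOR of two bits is 1 exactly when they differ.
    errors = sum(embedded[i][j] ^ extracted[i][j]
                 for i in range(a) for j in range(b))
    return errors / (a * b)

# Hypothetical 2 x 4 secret bit array and its recovered version with two flipped bits.
sent = [[1, 0, 1, 1], [0, 0, 1, 0]]
received = [[1, 0, 0, 1], [0, 1, 1, 0]]
print(bit_error_rate(sent, received))  # 0.25
```

A BER of 0 means the secret data was recovered without error.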
6.3 Entropy
Entropy is the information density of the contents of the video. It is a quantitative
measure of the amount of information carried by the video and also measures the randomness
of the video. When normalized by its maximum, $\log_2 I$, the value of entropy falls
between 0 and 1. It is measured by Shannon's entropy formula (Shannon 2001) as given in
Eq. (12):

$$\mathrm{Entropy} = -\sum_{i=1}^{I} P_i \log_2 (P_i) \tag{12}$$

where $P_i$ stands for the probability of getting a particular intensity, and $I$ is the total
number of intensity values.
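A minimal sketch of Eq. (12) for a single monochrome frame follows. The probabilities are estimated from the intensity histogram of the frame. The normalization by log2 of the number of intensity levels (256 for 8-bit frames) is an assumption added here so that the result lies between 0 and 1; the function names are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(frame):
    """Shannon entropy in bits of the intensity distribution of a frame (Eq. 12)."""
    pixels = [p for row in frame for p in row]
    n = len(pixels)
    counts = Counter(pixels)
    # p * log2(1/p) equals -p * log2(p); intensities that never occur contribute 0.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def normalized_entropy(frame, levels=256):
    """Entropy divided by its maximum log2(levels), giving a value in [0, 1]."""
    return shannon_entropy(frame) / math.log2(levels)

# Hypothetical frame with two equally likely intensities.
frame = [[0, 0, 255, 255], [0, 0, 255, 255]]
print(shannon_entropy(frame))     # 1.0
print(normalized_entropy(frame))  # 0.125
```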
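The SSIM is a structural content based metric. The similarity between two signals/frames is
estimated from the difference between the original and distorted frames. It uses the means,
variances, and covariance of the distorted and original frames. The range of SSIM lies between 0
and 1, and the video is of good quality if its value is near 1. It can be defined by the following
formula (Wang et al. 2004):

$$\mathrm{SSIM}(X, Y) = \frac{(2\mu_X \mu_Y + C_1)(2\sigma_{XY} + C_2)}{(\mu_X^2 + \mu_Y^2 + C_1)(\sigma_X^2 + \sigma_Y^2 + C_2)} \tag{13}$$

where $\mu_X$ and $\mu_Y$ are the means, $\sigma_X^2$ and $\sigma_Y^2$ the variances, and
$\sigma_{XY}$ the covariance of frames X and Y, and $C_1$ and $C_2$ are small constants that
stabilize the division.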
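The sketch below evaluates Eq. (13) once over a whole frame. This is a simplification: Wang et al. (2004) compute the index over local windows and average the results, which is not reproduced here. The constants C1 = (0.01 · 255)² and C2 = (0.03 · 255)² follow the usual choice for 8-bit data; the function name and sample frames are hypothetical.

```python
def ssim_global(x, y, max_value=255, k1=0.01, k2=0.03):
    """Single-window SSIM of two equally sized frames (Eq. 13), not the windowed mean."""
    xs = [p for row in x for p in row]
    ys = [p for row in y for p in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # Unbiased sample variances and covariance.
    var_x = sum((p - mu_x) ** 2 for p in xs) / (n - 1)
    var_y = sum((q - mu_y) ** 2 for q in ys) / (n - 1)
    cov_xy = sum((p - mu_x) * (q - mu_y) for p, q in zip(xs, ys)) / (n - 1)
    c1 = (k1 * max_value) ** 2
    c2 = (k2 * max_value) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical cover frame and a slightly distorted stego frame.
cover = [[52, 55, 61, 66], [70, 61, 64, 73]]
stego = [[54, 53, 62, 64], [71, 63, 62, 75]]
print(ssim_global(cover, cover))
print(ssim_global(cover, stego))
```

Identical frames give a value of 1, and any distortion lowers the value.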
6.5 Capacity ratio
The capacity of a steganography scheme is also an important parameter for judging the success
of the scheme, and it is calculated with the help of the hiding/capacity ratio. The capacity ratio
is defined as the amount of secret data embedded without visual deformation of the stego
video, relative to the size of the video. The hiding ratio is calculated by using the following
equation (Mstafa et al. 2017a):

$$\text{Capacity ratio} = \frac{\text{Secret data size}}{\text{Video size}} \times 100 \tag{14}$$
6.6 Bit‑rate increase
Bit rate increase is another quantitative measure used for video steganography, since embedding
increases the bit rate most of the time; for a successful video steganography technique
the bit rate increase ratio should ideally be zero. It is calculated by using the formula
(Jiang et al. 2015) given in Eq. (15):

$$\mu = \frac{s - c}{c} \tag{15}$$

where c and s are the bit-rates of the cover and stego videos respectively.
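Eqs. (14) and (15) are simple ratios, as the short sketch below shows. The function names and the sample sizes and bit-rates are hypothetical; both sizes must be given in the same unit, and so must both bit-rates.

```python
def capacity_ratio(secret_size, video_size):
    """Hiding capacity as a percentage of the video size (Eq. 14)."""
    return secret_size / video_size * 100

def bit_rate_increase(cover_bit_rate, stego_bit_rate):
    """Relative bit-rate increase caused by embedding (Eq. 15)."""
    return (stego_bit_rate - cover_bit_rate) / cover_bit_rate

# Hypothetical example: 2 KB of secret data in a 100 KB video,
# and a bit-rate that grows from 1000 to 1019 kbit/s after embedding.
print(capacity_ratio(2048, 102400))
print(bit_rate_increase(1000.0, 1019.0))
```

Multiplying the second result by 100 gives the bit-rate increase as a percentage.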
The General Video Quality Model (GVQM) was designed as a general purpose video quality
metric. ANSI (American National Standards Institute) has adopted it as a standard for
objective video quality. It spans a vast range of bit rates and quality levels. It also includes
perception based effects such as blurring, noise, color distortion, jerky motion, and block
distortion. These measurements are combined to form a single metric that gives an
overall quality prediction. The range of GVQM (Pinson and Wolf 2004) varies from 0 to 1,
and the metric used for this is stated in Eq. (16):

$$\begin{aligned}
\mathrm{VQM} = {} & -0.2097 \cdot \mathit{si\_loss} + 0.5969 \cdot \mathit{hv\_loss} + 0.2483 \cdot \mathit{hv\_gain} \\
& + 0.0192 \cdot \mathit{chroma\_spread} - 2.3416 \cdot \mathit{si\_gain} \\
& + 0.0431 \cdot \mathit{ct\_ati\_gain} + 0.0076 \cdot \mathit{chroma\_extreme}
\end{aligned} \tag{16}$$
Another method is MMD, a two-sample statistical method used for testing whether two
sets of samples are generated from the same distribution or not. This
method is numerically stable with low estimation error even in high-dimensional space and
is theoretically well understood. It has features such as low computational complexity
and a fast convergence rate, which make it a handy benchmark for steganography. It is
used for comparing steganalytic algorithms (Pevny and Fridrich 2008). Table 8 shows the
formulas and expected outcomes of the quality metrics for a good video steganography
technique.
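To make the two-sample idea behind MMD concrete, the sketch below computes the standard biased estimate of the squared MMD between two sets of feature vectors. The Gaussian kernel, its width parameter gamma, the function names, and the sample feature vectors are assumptions chosen for illustration; they are not the exact benchmark configuration of Pevny and Fridrich (2008).

```python
import math

def gaussian_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) kernel exp(-gamma * ||u - v||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys."""
    n, m = len(xs), len(ys)
    k_xx = sum(gaussian_kernel(a, b, gamma) for a in xs for b in xs) / (n * n)
    k_yy = sum(gaussian_kernel(a, b, gamma) for a in ys for b in ys) / (m * m)
    k_xy = sum(gaussian_kernel(a, b, gamma) for a in xs for b in ys) / (n * m)
    return k_xx + k_yy - 2 * k_xy

# Hypothetical 2-D feature vectors from cover videos and from stego videos.
cover_features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3]]
stego_features = [[1.0, 1.1], [1.2, 1.0], [1.1, 1.3]]
print(mmd_squared(cover_features, cover_features))  # close to 0: same distribution
print(mmd_squared(cover_features, stego_features))  # clearly positive
```

A value near zero suggests that cover and stego features are statistically hard to tell apart, which is the desired outcome for a secure embedding scheme.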
7 Steganalysis: an outline
the deformity. Usually, classifiers such as LDA (Linear Discriminant Analysis) (Sur et al. 2015),
Neural Network, and SVM (Support Vector Machine) (Xu et al. 2012; Wang et al. 2014b) are
utilized for the detection of stego-video. Generally, steganalysis is done by
extracting features or particular attributes from the video frames or motion vectors of both
the stego- and cover-video, after which a classifier is trained with machine learning methods.
The basic framework for video steganalysis is shown in Fig. 13, where for
training both the videos are given as input to extract the features. Among the extracted
features, the relevant ones are selected as a feature vector set, and the classifier is then
trained on this set using classification algorithms. In the testing phase, the
test video is given as an input for feature extraction, and by comparing its features with the
training features the classifier classifies it as a stego-video or original video. This section
reviews some of the literature related to the general and specific video steganalysis
algorithms without going into depth and without explaining the details of the classifiers used,
as that has been discussed in (Dalal and Juneja 2018a). Further, this section also explores
some of the common attacks on steganography.
7.1 Steganalysis techniques
Specific steganalysis techniques are those that attack a particular steganography technique
and may not work efficiently on others. On the other hand, a general steganalysis
technique can be used for different types of steganography techniques irrespective
of the embedding scheme. Very little work exists on general video steganalysis techniques,
as steganalysis of video steganography is difficult compared to that of other
multimedia files. One of the general video steganalysis techniques present in the literature,
with a universal feature set, was proposed by Xu et al. (2012) for the detection of intra-frame
and inter-frame embedding. They derived a spatial universal feature set of 20 features for
video steganalysis by deploying a PMF (Probability Mass Function) matrix. The proposed
scheme utilized the SVM classifier, and the experimental results asserted that the scheme
was efficient for detection in portions rather than the complete frame. Even this method
was not applicable to all types of video steganography techniques, as it was not able
to detect motion vector embedding algorithms. Another technique, introduced by Wang
et al. (2014a), focused on SAD (Sum of Absolute Differences) values for detection and also
extracted correlation features between motion vectors. The features were extracted after
adding or subtracting one from the motion vector values and finding the difference between
the actual and locally optimal SAD values. The proposed steganalysis technique was applicable
to different codecs and different methods.
Specific steganalysis techniques are comparatively easy to implement and are more
effective. Fan et al. (2016) introduced a steganalysis technique to detect the hash based 3-3-2
LSB technique (Dasgupta et al. 2012) by utilizing the cross-correlation feature of frames. The
proposed technique employed a hash function to locate the embedded
secret data and was also able to estimate the length of the secret message efficiently.
The advantage of the proposed technique was that the detection was performed without
any classifier. The steganalysis of spread spectrum steganography was done by Wang et al.
(2014b), who utilized the prediction error frame (PEF) and differential filtering to suppress
temporal and spatial redundancies respectively. The dependencies were modeled using
a first-order Markov chain between neighboring PEFs, and subsets of the empirical probability
transition matrices were used as a feature vector. The scheme utilized an SVM
classifier with a radial basis kernel, and the results showed that the performance was better
than the existing techniques for compressed videos such as MPEG-2 and H.264/AVC. To
detect motion vector based steganography for the MPEG-1 coding standard, a video
steganalysis technique was presented by Tasdemir et al. (2013) for LSB embedding without using
any classifier. The feature employed in the technique increased the detection accuracy by
grouping the motion vectors by reference frame distance. Each group of MVs was
examined individually, which enhanced the detection accuracy. They remarked that after
embedding in motion vectors the noise and the size of MVs increased with the reference
frame. Other techniques for the detection of motion vector based video steganography were
proposed by Ye et al. (2013) and Sur et al. (2015). Both groups of authors utilized the
spatial–temporal correlation of motion vectors for feature extraction and detection. Ye et al. (2013)
presented a video steganalysis scheme by employing Markov matrix features in fixed size
sliding windows. The classification was done with the help of an ensemble classifier with
324 features by considering spatial and temporal correlation. The scheme was experimentally
effective on typical video steganography algorithms based on motion vectors. Its
only disadvantage was that it was limited to adjacent frames in the
case of temporal redundancy. Additionally, Sur et al. (2015) used spatial and temporal
characteristics, including the flickering effect and statistical anomalies, of the MPEG-2 coding
standard. The scheme relied on the observation that embedding in a video using motion vectors
alters the statistical features of the video, and it concentrated on these features to
detect the embedding. The LDA classifier was used with 15 features for the detection, and
the obtained results indicated that the scheme was effective at low embedding rates as well.
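Several of the schemes above build their feature vectors from empirical Markov transition probabilities. The sketch below shows the general idea on plain pixel rows: horizontal neighbor differences are clipped to a small range, and the probability of moving from one difference value to the next is estimated by counting. It is a simplified illustration only; the clipping threshold, the use of raw pixel differences instead of prediction error frames, and all names are assumptions, so it does not reproduce the feature sets of Wang et al. (2014b) or Ye et al. (2013).

```python
def transition_matrix(frame, threshold=2):
    """Empirical first-order transition probabilities of clipped horizontal differences."""
    size = 2 * threshold + 1
    counts = [[0] * size for _ in range(size)]
    for row in frame:
        # Difference between horizontal neighbors, clipped to [-threshold, threshold].
        diffs = [max(-threshold, min(threshold, row[j] - row[j + 1]))
                 for j in range(len(row) - 1)]
        for current, following in zip(diffs, diffs[1:]):
            counts[current + threshold][following + threshold] += 1
    matrix = []
    for row_counts in counts:
        total = sum(row_counts)
        matrix.append([c / total if total else 0.0 for c in row_counts])
    return matrix

def feature_vector(frame, threshold=2):
    """Flatten the transition matrix into a (2T+1)^2 dimensional feature vector."""
    return [p for row in transition_matrix(frame, threshold) for p in row]

# Hypothetical 2 x 5 frame.
frame = [[10, 11, 13, 12, 12], [9, 9, 10, 12, 11]]
features = feature_vector(frame)
print(len(features))  # 25 features for threshold 2
```

Feature vectors of this kind, computed for both cover and stego videos, would then be passed to a classifier such as an SVM for training.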
Further, another method of extracting features is the calibration based method, which
has also been employed by some researchers. Zhao et al. (2003) introduced a calibration
based steganalysis method for the detection of intra-prediction mode (IPM) based
steganography. Two feature sets were used, IPM shift probability and SATD (Sum of Absolute
Transformed Difference), containing 9 and 4 features respectively. Currently, most of the
existing steganalysis methods split the videos into fixed-length frame sequences to extract the
features for detection. In contrast, Wang et al. (2017) proposed a steganalysis technique
to detect MV-based steganography methods by reducing the effect of statistical
characteristics originating from the videos themselves. The proposed technique fragmented the
frames into subsequences based on the motion of blocks in each frame. The features were
extracted from every subsequence, and the experiments categorized motion intensities into
three levels (low, middle, and high), each trained with an independent classifier.
The performance was evaluated on three existing video
steganography methods, and the results stated that steganalysis performance at high motion
intensity was better than at the other two levels. Another work, proposed by Sadat et al. (2018),
is based on entropy values to detect embedding done using motion vectors in H.264
compressed videos. The proposed scheme considered the statistical and intrinsic features of the
video for detection, using entropy to discern the texture of the motion vector in the blocks.
The blocks were clustered using fuzzy clustering, and the experimental results indicated the
high performance of the scheme. In addition to these techniques, there exist some common
attacks for steganalysis, which are discussed in the next sub-section.
7.2 Attacks on steganography
1. Visual attacks This is the simplest type of attack on a steganographic system. It relies
on the ability of the human visual system. Differences in a frame that are noticeable to
the naked eye indicate that it may carry hidden data. The basic idea of visual
attacks is shown in Fig. 14, given by Westfeld and Pfitzmann (2000). The visual attack is
a subjective test done by visually comparing the original frames and the
stego frames (Sadek et al. 2017; Xu and Ping 2007; Sudeepa et al. 2016; Ramalingam
and Isa 2015; Yang and Bourbakis 2005). The disadvantages of the visual attack are that
it cannot be automated and is not always reliable.
2. Structural attacks Here, the known properties of data hiding algorithms
are analyzed for detection. These algorithms leave a characteristic structure in the data, as
the format of the data file is different after embedding. The data file is analyzed further
if known properties are found. However, this attack gives a lot of false-positive outputs and
depends heavily on whether the carrier file is known (Ferreira 2015). In the case of frames, both
the original and stego-frame are scanned to analyze the structure of the frames, which may
differ in some properties after embedding (Fridrich et al. 2003).
3. Statistical attacks In the process of information hiding the statistical properties of the
cover medium change, which can be detected by steganalysts. Detection is done using
mathematical formulas, and it is more efficient than other attacks. It can be further classified
into three parts:
i. Chi-Square Attack It is the most straightforward statistical test for detecting randomness,
and it is calculated with the help of the frequency distribution. It is based on the differences
between the number of actual occurrences of an event and its expected number of occurrences.
The probability of embedding is calculated from the degree of similarity
between the observed and the theoretical frequency distribution.
The χ² (Chi-square) statistic (Westfeld and Pfitzmann 2000) is calculated using the
equation:

$$\chi^2_{k-1} = \sum_{i=1}^{k} \frac{(ms_i - ex_i^*)^2}{ex_i^*} \tag{17}$$

where $ex_i^*$ is the expected frequency after embedding, $ms_i$ is the measured frequency in
the random sample, and k−1 is the number of degrees of freedom.
A high score indicates a condition that is not random and is typical for any part of the
original frame. A low value indicates a high degree of randomness and is typical for
files with hidden data embedded in them. Kaur and Kaur (2016) utilized
a Chi-square attack for the testing of their proposed LSB embedding technique for
hiding 4 KB secret data in ten random frames of a video.
Fig. 14 Basic idea of visual attacks: attacked cover medium, message bits extraction, visual
demonstration of the bits on their source pixels position
Carlo value of Pi, hex dump, and so forth (S. A P and A. P P 2010; Gupta and Chatur-
vedi 2014).
Other standard attacks include transcoding (at a different bit-rate), resizing, random
video frame dropping, random tweaking attacks, and others (Xu et al. 2006; Su et al. 2013).
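The Chi-square statistic of Eq. (17) can be sketched for 8-bit pixel values using the pairs-of-values idea of Westfeld and Pfitzmann (2000): LSB embedding tends to equalize the frequencies of the values 2i and 2i+1, so the expected frequency after embedding is taken as the mean of the two, and the measured frequency is that of the even value. The function name and the sample data are hypothetical, and refinements such as merging sparsely populated categories or converting the statistic into a probability of embedding are omitted.

```python
from collections import Counter

def chi_square_pov(pixels):
    """Chi-square statistic over pairs of values (2i, 2i+1) and its degrees of freedom."""
    histogram = Counter(pixels)
    statistic = 0.0
    categories = 0
    for value in range(0, 256, 2):
        # Expected frequency after embedding: both members of a pair equally frequent.
        expected = (histogram[value] + histogram[value + 1]) / 2
        if expected == 0:
            continue  # pair does not occur in the sample
        measured = histogram[value]
        statistic += (measured - expected) ** 2 / expected
        categories += 1
    return statistic, categories - 1

# Hypothetical samples: only even values (far from the embedded distribution)
# versus perfectly balanced pairs (what full LSB embedding tends to produce).
print(chi_square_pov([10] * 8 + [20] * 8))  # (8.0, 1)
print(chi_square_pov([10] * 4 + [11] * 4))  # (0.0, 0)
```

In line with the description above, a low statistic means the sample matches the distribution expected after embedding, while a high statistic is typical of unmodified data.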
Video steganography has gained more attention from researchers over the last two decades
because of its popularity and ease of use, and this shift has brought steganography closer
to real-world applications. There are many ways to hide data in
a video, as a particular frame or several frames can serve as the cover
object to improve security. One can even use the whole video for embedding to
achieve large capacity (Dalal and Juneja 2016), and these advantages of videos over other
multimedia files make them more suitable for steganography. For hiding data in a particular
frame, spatial domain techniques such as LSB have been utilized, which directly replace the
least significant bits of the pixel values of the cover frame with the secret data. LSB is easy
and straightforward to implement and has high hiding capacity (Dalal and Juneja 2018b);
however, it is vulnerable to simple attacks and is not robust even against compression
(Johnson and Jajodia 1998). To address this issue, which is common to all the spatial domain
techniques, researchers have turned towards transform domain techniques such as DCT and DWT.
These techniques can be utilized for raw videos as well as for compressed video formats such as
MPEG-2, MPEG-4, H.264/AVC, and H.265/HEVC. In DCT techniques, data is mostly hidden in
the quantized coefficients of DCT blocks of different sizes, but blocking artifacts can sometimes
occur in these techniques. In DWT, there is no such problem of blocking artifacts,
and it gives temporal resolution, capturing frequency as well as location information
(Sadek et al. 2015; Dalal and Juneja 2019). It also allows accurate reconstruction
of the original signal with the help of an integer-to-integer wavelet transform. All these
advantages make DWT a better choice for video steganography in the case of the transform
domain.
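The spatial domain LSB idea discussed above can be sketched as follows: each secret bit overwrites the least significant bit of one pixel of the cover frame, and extraction simply reads those bits back. This is a minimal illustration with sequential pixel order and no encryption or key-based pixel selection; the function names and sample values are hypothetical and do not correspond to any specific surveyed scheme.

```python
def embed_lsb(frame, bits):
    """Return a stego frame whose first len(bits) pixels carry the bits in their LSBs."""
    flat = [p for row in frame for p in row]
    if len(bits) > len(flat):
        raise ValueError("secret data exceeds the capacity of the frame")
    for index, bit in enumerate(bits):
        flat[index] = (flat[index] & ~1) | bit  # clear the LSB, then set it to the bit
    width = len(frame[0])
    return [flat[i:i + width] for i in range(0, len(flat), width)]

def extract_lsb(frame, count):
    """Read back the first `count` embedded bits."""
    flat = [p for row in frame for p in row]
    return [p & 1 for p in flat[:count]]

# Hypothetical 2 x 4 cover frame and 5 secret bits.
cover = [[120, 121, 122, 123], [200, 201, 202, 203]]
secret = [1, 0, 1, 1, 0]
stego = embed_lsb(cover, secret)
print(stego)                            # [[121, 120, 123, 123], [200, 201, 202, 203]]
print(extract_lsb(stego, len(secret)))  # [1, 0, 1, 1, 0]
```

Each pixel changes by at most 1, which explains the good imperceptibility, but any lossy compression that alters the low-order bits destroys the message, which is the robustness weakness noted above.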
Videos have complex statistics and have additional places for embedding data as compared
to images. Motion vector techniques are specific to videos, and hiding data in motion vectors
makes it robust against compression, as motion vectors have more signal energy. Motion
vectors are the best choice nowadays for compressed videos, as most of them have adopted
motion compensation techniques, and they are lossless and produce little degradation of
the visual quality. Motion vector based methods embed data while performing motion
estimation, and they target the internal dynamics of video compression (Cao et al. 2012).
These advantages make motion vector embedding less detectable and more suitable for video
steganography as compared to other techniques. Some researchers have proposed techniques
that combine spatial and transform domains as well as specific features of videos for
embedding. Regarding the choice of video coding standard for hiding, H.264/AVC videos
are widely used nowadays because of their unique features and larger number of hiding sites
as compared to other video streams. It is a hybrid video coding standard that consists of
several processes, and embedding can be done at each processing level. However, there
are chances of information loss due to compression (Video et al. 2014). So, to avoid
information loss, it is preferable to embed a secret message in H.264 compressed
videos directly. Next in this section, standard video datasets for steganography are
introduced, and the implementation results of the state-of-the-art techniques are
discussed.
Video steganography can be done using different videos available over the internet; however,
there are standard datasets available with different specifications. This section
discusses some of the video datasets available for steganography. One of the most popular
datasets for videos is the Video Trace Library, which consists of many datasets of different
types (Reisslein 2012). Another popular dataset for videos is the YouTube Dataset, which
consists of millions of YouTube videos (Abu-El-Haija et al. 2016). Furthermore, there
exist other video datasets that have been used by researchers for steganography; some
of them are listed in Table 9, with the source from which each dataset can
be downloaded.
To demonstrate the effect of embedding in video steganography, an experimental test has
been conducted on different videos. Experiments are conducted for qualitative and
quantitative analysis of the test videos using the most prominent techniques. All the experiments
have been implemented in MATLAB 2017a on a system with an Intel(R)
Core(TM) i5-2450M CPU @ 2.50 GHz, 8 GB RAM, and a 64-bit operating system.
Videos with different resolutions from the traces (Reisslein
2012) and YouTube-8M (Abu-El-Haija et al. 2016) video datasets have been used for the
experiments. An image is used as the secret data hidden inside the videos. Detailed properties
of the video dataset used for testing are described in Table 10.
8.2 Experimental results
Experiments have been done for some of the techniques, namely LSB, DCT, DWT, and MV
for video steganography, where embedding is done using the fundamental procedure as
discussed in the respective sections of the considered techniques. Additionally, some
prominent works corresponding to these fundamental techniques have also been
implemented, as shown in Table 11, where LSB1 and LSB2 refer to Mstafa's work (Mstafa and
Elleithy 2015) and Manisha's work (Manisha and Sharmila 2019) respectively. The
techniques DCT1, DCT2, and DCT3 refer to the work of Gujjunoori (2013), Zhang et al. (2015a),
and Mstafa et al. (2017a) respectively; DWT1, DWT2, and DWT3 refer to the work of
Ramalingam (2014), Mstafa et al. (2015a), and Mstafa et al. (2017a) respectively; and
MV1 and MV2 refer to the work of Rezagholipour and Eshghi (2016) and Song et al.
(2015) respectively. Quantitative results in terms of PSNR for the tested videos are shown
in Table 11 with a graph plotted in Fig. 15.
The results of the tested video steganography techniques indicate that, on average,
DWT and its corresponding techniques outperform all other techniques in terms of PSNR
for all the examined videos. The qualitative results of a particular cover frame and the
stego-frame of the videos are shown in Fig. 16 for visual quality analysis.
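The averaging behind this comparison can be reproduced with a short script. The values below are the PSNR entries for videos V1 to V5 listed in Table 11; grouping the columns into families by stripping the trailing digit (for example DWT, DWT1, DWT2, DWT3 into DWT) is an assumption made for this illustration and may differ from how the plotted averages were obtained.

```python
# PSNR (dB) per technique for videos V1..V5, copied from Table 11.
psnr_db = {
    "LSB":  [40.7, 39.8, 37.4, 35.5, 37.9],
    "LSB1": [45.1, 46.2, 44.2, 42.5, 45.5],
    "LSB2": [44.2, 45.9, 44.1, 45.6, 44.2],
    "DCT":  [41.4, 43.1, 43.4, 41.5, 42.9],
    "DCT1": [46.2, 50.1, 45.2, 45.2, 46.2],
    "DCT2": [41.4, 42.2, 41.5, 40.5, 40.4],
    "DCT3": [43.8, 46.6, 43.6, 43.0, 43.7],
    "DWT":  [43.5, 47.3, 45.7, 46.9, 46.6],
    "DWT1": [42.1, 43.9, 43.8, 44.9, 45.5],
    "DWT2": [44.7, 46.7, 46.2, 45.7, 46.7],
    "DWT3": [49.2, 51.0, 45.8, 48.7, 47.05],
    "MV":   [41.8, 44.1, 42.4, 41.9, 43.0],
    "MV1":  [46.0, 47.7, 45.8, 42.9, 45.8],
    "MV2":  [40.3, 41.0, 40.8, 41.1, 40.8],
}

def mean(values):
    return sum(values) / len(values)

# Average PSNR per individual technique.
technique_avg = {name: mean(values) for name, values in psnr_db.items()}

# Average PSNR per family (LSB, DCT, DWT, MV), pooling the variants.
family_values = {}
for name, values in psnr_db.items():
    family = name.rstrip("0123456789")
    family_values.setdefault(family, []).extend(values)
family_avg = {family: mean(values) for family, values in family_values.items()}

for family, avg in sorted(family_avg.items(), key=lambda item: item[1], reverse=True):
    print(f"{family}: {avg:.2f} dB")
```

With these five videos the DWT family has the highest average PSNR, at about 46.1 dB, which is consistent with the statement above.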
Furthermore, the techniques have been inspected to calculate the value of VQM, which
indicates how people perceive video quality. VQM algorithms compare the original and
stego-videos and are mostly suitable for good quality videos. The calculation of VQM
requires more memory space and time than other quality metrics such as PSNR;
for example, for a 6-s video it generally required approximately 4 GB of free RAM
and took a few hours to give results. Therefore, for fair experimental comparison, the
considered videos were kept of equal length. The average value of the VQM for the considered
videos is plotted in Fig. 17, and a lower value of VQM indicates better video quality. The
average VQM values for DWT and DCT are comparable, at 5.9 and 6 respectively, as
compared to the LSB and MV techniques.
In addition to PSNR and VQM, the experimented techniques have also been examined
to calculate the average hiding capacity ratio and bit rate increase using the considered videos.
The average hiding capacity of the techniques is shown in Fig. 18 and the average
bit rate increase is shown in Fig. 19. It is evident from the graph that the average hiding
capacity of LSB techniques is higher than that of the others; however, the bit rate increase
is also higher for LSB than for DWT. The bit rate increase should be low for a successful video
steganography technique, and from Fig. 19 it can be concluded that the overall average bit
rate increase of DWT techniques is the minimum at 1.9%, as compared to LSB, DCT, and
MV techniques with 2.3%, 3%, and 3.9% respectively for the considered videos.
Further, the above-experimented techniques have been analyzed against steganalysis
based on the details available in the literature, although steganalysis is not the focus of this
survey. Fig. 20 illustrates the steganalysis resistance
level to the best of the authors' knowledge and according to the literature. The levels are
decided as low, high, and untested by the authors based on the data mentioned in Tables 5,
6, and 7. In this graph the techniques which are not tested by any researchers are kept at
the middle level as “Untested”, the techniques which have been tested against steganalysis
and withstood it are given “High” resistance, and “Low” level resistance is for the
techniques which have been detected by steganalysis researchers.
The experimental analysis done using different quality metrics shows the effectiveness
of different types of video steganography techniques. Different techniques performed well on
different quality metrics; for example, in terms of capacity, video steganography using
LSB outperforms the other experimented techniques with the highest average hiding
capacity. In terms of visual quality (imperceptibility) and bit rate increase, DWT
techniques performed well overall. Based on the experimental analysis the authors give some
recommendations for future researchers: (1) for a high hiding capacity requirement, researchers
should focus on LSB based video steganography techniques, which are easy and fast to
implement, (2) for imperceptibility, researchers should utilize transform domain
based techniques, and (3) for high robustness, transform domain based techniques
should also be used, especially DWT, which performed the best among the experimented
techniques. However, a successful video steganography technique needs a stable
trade-off between the basic requirements of steganography, so future research
should focus on techniques that provide a better trade-off between capacity,
imperceptibility, and robustness with high security. Also, new
hiding venues should be explored by researchers in the future, as new video formats are
becoming common, to provide better security. Some other future recommendations are
mentioned in Sect. 11 for the researchers working in this field and for new researchers
interested in this field.

Table 9 Video dataset available for steganography
S. No | Details | Reference using the same dataset | Sources
1 | Video Trace Library | (Kamil et al. 2018; Dalal and Juneja 2020; Rajalakshmi and Mahesh 2018; Hashemzadeh 2018) | http://trace.eas.asu.edu/
2 | Xiph.org | (Suttichaiya et al. 2017; Firmansyah and Ahmad 2016) | https://media.xiph.org/video/derf/
3 | UCI (University of California Irvine) Machine Learning Repository | (Ramalingam et al. 2015; Manikandan et al. 2017) | http://archive.ics.uci.edu/ml/index.php
4 | YFCC100M (Yahoo Flickr Creative Commons 100 Million) | (Dalal and Juneja 2019) | http://projects.dfki.uni-kl.de/yfcc100m/
5 | PETS2009 (Performance Evaluation of Tracking and Surveillance) | (Mstafa et al. 2017a) | http://www.cvg.reading.ac.uk/PETS2009/a.html
6 | VQEG (Video Quality Experts Group) | (Stütz et al. 2013) | https://www.its.bldrdoc.gov/vqeg/video-datasets-and-organizations.aspx
7 | YouTube Dataset | (Abdolmohammadi et al. 2019; Neuner et al. 2016) | https://netsg.cs.sfu.ca/youtubedata/
8 | YouTube-8M | – | https://research.google.com/youtube8m/
9 | Virat Video Dataset | (Babaguchi et al. 2013) | http://www.viratdata.org/
10 | YACVID (Yet Another Computer Vision Index To Datasets) | – | https://riemenschneider.hayko.at/vision/dataset/index.php?filter=+video
11 | Vision | – | https://lmb.informatik.uni-freiburg.de/resources/datasets/sequences.en.html
12 | Change Detection | – | http://changedetection.net/
13 | UMass Trace Repository | – | http://traces.cs.umass.edu/index.php/Mmsys/Mmsys
9 Critical analysis
Among all the video steganography techniques, LSB offers the simplest way of embedding
secret data inside the cover video. LSB-based techniques provide high embedding capacity
with acceptable imperceptibility of the stego-video. However, owing to their poor
robustness, they are vulnerable to simple attacks such as compression and noise
(Asikuzzaman and Pickering 2018). On the other hand, transform-domain techniques based on
DCT and DWT provide high robustness against attacks with good embedding capacity (Khosla
and Kaur 2014). DWT-based video steganography techniques provide high imperceptibility
with minimal or no visual distortion, whereas embedding in DCT-transformed coefficients
sometimes leads to blocking artifacts. Transform-domain techniques therefore provide more
security than LSB-based techniques.
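The LSB trade-off described above can be illustrated with a minimal sketch. It assumes a
single 8-bit grayscale frame stored as nested Python lists and one payload bit per pixel
in raster order; the names `embed_lsb` and `extract_lsb` are hypothetical and are not
taken from any of the surveyed techniques. Requantizing every pixel to an even value is
used here only as a crude stand-in for lossy compression.

```python
from typing import List

Frame = List[List[int]]  # one 8-bit grayscale frame, row-major (assumed representation)


def embed_lsb(frame: Frame, bits: List[int]) -> Frame:
    """Return a copy of `frame` with `bits` written into pixel LSBs in raster order."""
    flat = [p for row in frame for p in row]
    if len(bits) > len(flat):
        raise ValueError("payload exceeds capacity (one bit per pixel)")
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the payload bit.
        flat[i] = (flat[i] & 0xFE) | (bit & 1)
    width = len(frame[0])
    return [flat[r * width:(r + 1) * width] for r in range(len(frame))]


def extract_lsb(frame: Frame, n_bits: int) -> List[int]:
    """Read the first `n_bits` pixel LSBs in raster order."""
    flat = [p for row in frame for p in row]
    return [p & 1 for p in flat[:n_bits]]


cover = [[120, 121, 122, 123], [200, 201, 202, 203]]
secret = [1, 0, 1, 1, 0]

stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))

# Crude stand-in for lossy compression: requantize every pixel to an even value.
requantized = [[(p // 2) * 2 for p in row] for row in stego]
damaged = extract_lsb(requantized, len(secret))
```

In this toy case each pixel changes by at most one intensity level, which is why the
distortion is hard to see, while the requantization step wipes out the payload, which is
the robustness weakness noted for LSB techniques.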
In motion vector based techniques, the hiding capacity depends on the number of motion
vectors available in a video. However, embedding in these techniques increases the
complexity of the process, because motion vectors must be computed precisely: any error
degrades the visual quality of the stego-video. Among the existing video coding formats,
H.264/AVC is the most commonly used standard for video steganography, and several of its
features have been exploited for embedding secret data. For example, binary syntax
elements such as those of CABAC have been manipulated, because the codewords (bits) are
present in huge numbers and the video can easily be regenerated after extraction of the
secret data, which makes the process reversible (Video et al. 2014). Additionally, the
newer standard H.265/HEVC is now also used for embedding secret data, as its additional
features provide new hiding venues for researchers (Ohm et al. 2012). Table 12 summarizes
the critical analysis of the video steganography techniques with their pros and cons.
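The dependence of capacity on the number of usable motion vectors can be sketched as
follows. This is an illustrative parity-based scheme under stated assumptions, not the
method of any particular surveyed paper: motion vectors are assumed to be integer
`(dx, dy)` pairs, a vector is a candidate when `|dx| + |dy|` reaches a hypothetical
`threshold`, and one bit is carried in the parity of `dx`. A real codec would also have
to recompute the prediction residual for every modified vector, which is omitted here.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]  # (dx, dy); assumed integer representation


def is_candidate(mv: MotionVector, threshold: int) -> bool:
    """Only vectors with enough motion are used for hiding."""
    return abs(mv[0]) + abs(mv[1]) >= threshold


def capacity(mvs: List[MotionVector], threshold: int = 4) -> int:
    """Number of payload bits this set of motion vectors can carry."""
    return sum(1 for mv in mvs if is_candidate(mv, threshold))


def embed_mv(mvs: List[MotionVector], bits: List[int],
             threshold: int = 4) -> List[MotionVector]:
    """Force the parity of dx to match one payload bit per candidate vector."""
    if len(bits) > capacity(mvs, threshold):
        raise ValueError("payload exceeds the number of candidate motion vectors")
    out, used = [], 0
    for dx, dy in mvs:
        if used < len(bits) and is_candidate((dx, dy), threshold):
            if (dx & 1) != bits[used]:
                # Step away from zero so the vector stays above the threshold
                # and the extractor selects the same candidates.
                dx += 1 if dx >= 0 else -1
            used += 1
        out.append((dx, dy))
    return out


def extract_mv(mvs: List[MotionVector], n_bits: int,
               threshold: int = 4) -> List[int]:
    """Read the parity of dx from the first `n_bits` candidate vectors."""
    bits = [dx & 1 for dx, dy in mvs if is_candidate((dx, dy), threshold)]
    return bits[:n_bits]


vectors = [(0, 0), (5, 1), (-3, 2), (1, 1), (8, -6), (2, 2)]
payload = [0, 1, 1]

stego_vectors = embed_mv(vectors, payload)
recovered_bits = extract_mv(stego_vectors, len(payload))
```

Only four of the six vectors in this example are candidates, so a low-motion video with
few large vectors would offer correspondingly little capacity.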
Table 11 PSNR values obtained for different techniques for the considered videos
V LSB LSB1 LSB2 DCT DCT1 DCT2 DCT3 DWT DWT1 DWT2 DWT3 MV MV1 MV2
V1 40.7 45.1 44.2 41.4 46.2 41.4 43.8 43.5 42.1 44.7 49.2 41.8 46 40.3
V2 39.8 46.2 45.9 43.1 50.1 42.2 46.6 47.3 43.9 46.7 51 44.1 47.7 41
V3 37.4 44.2 44.1 43.4 45.2 41.5 43.6 45.7 43.8 46.2 45.8 42.4 45.8 40.8
V4 35.5 42.5 45.6 41.5 45.2 40.5 43 46.9 44.9 45.7 48.7 41.9 42.9 41.1
V5 37.9 45.5 44.2 42.9 46.2 40.4 43.7 46.6 45.5 46.7 47.05 43 45.8 40.8
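For readers who want to reproduce values of the kind listed in Table 11, the following
sketch computes PSNR with the standard definition, PSNR = 10 log10(peak^2 / MSE), for
8-bit frames stored as nested lists. Averaging the per-frame values to obtain one number
per video is an assumption made here for illustration; the function names are
hypothetical.

```python
import math
from typing import List

Frame = List[List[int]]  # one 8-bit grayscale frame, row-major (assumed representation)


def mse(cover: Frame, stego: Frame) -> float:
    """Mean squared error between two frames of identical size."""
    diffs = [(a - b) ** 2
             for row_c, row_s in zip(cover, stego)
             for a, b in zip(row_c, row_s)]
    return sum(diffs) / len(diffs)


def psnr(cover: Frame, stego: Frame, peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the frames are identical."""
    err = mse(cover, stego)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


def video_psnr(cover_frames: List[Frame], stego_frames: List[Frame]) -> float:
    """Average of per-frame PSNR values (assumed aggregation for a whole video)."""
    values = [psnr(c, s) for c, s in zip(cover_frames, stego_frames)]
    return sum(values) / len(values)


cover_frame = [[120, 121, 122, 123], [200, 201, 202, 203]]
stego_frame = [[121, 120, 123, 123], [200, 201, 202, 203]]  # three pixels differ by 1

frame_score = psnr(cover_frame, stego_frame)
clip_score = video_psnr([cover_frame, cover_frame], [stego_frame, stego_frame])
```

A higher PSNR means the stego-video is closer to the cover video, so in Table 11 larger
values indicate better imperceptibility.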
[Figure: PSNR (y-axis, 0 to 60) obtained by each technique (LSB, LSB1, LSB2, DCT, DCT1, DCT2, DCT3, DWT, DWT1, DWT2, DWT3, MV, MV1, MV2) for videos V1 to V5 (x-axis)]
On the internet, secure data transmission is a goal of almost every field of
communication, and it can be achieved by concealing the data inside common media. Video
steganography has applications in fields such as military, medical, corporate, and
multimedia, where covert communication is frequently required for internal and external
security purposes. Some of the applications of video steganography are as follows:
(1) Intelligence agencies Communication in intelligence agencies is mostly covert, and
video steganography has recently been widely used for camouflage (Sadek et al. 2015;
Petitcolas et al. 1999). National security agencies (NSA) have long used steganography
to transfer secret messages within and outside the agency.
(2) Military In the military and defence field, secure message transfer is the primary aim
of communication. Video steganography is now widely used by military personnel for
covert communication, as common transmission channels may be compromised (Petitcolas
et al. 1999). Authorized communication is the basis of military and defence
communication systems. Additional layers of security are also provided by applying
different encryption schemes before embedding the secret data.
(3) Medical Video steganography has also been used in the medical sciences (Mandal 2016;
Santhosh and Meghana 2016) to hide crucial private information about patients. Further,
DNA sequences have also been used to hide secret information (Bancroft and Clelland
2001; Torkaman et al. 2011). This is usually done to secure the patient's data and to
prevent leakage of private information to unauthorized persons or parties.
(4) Video surveillance Video steganography has been utilized to protect the privacy of
authorized persons by embedding their information in the videos captured by
surveillance systems (Zhang et al. 2005; Mstafa et al. 2017b).
(5) Video error correction Another application of video steganography is video error
correction during transmission (Robie and Mersereau 2001; Lie et al. 2006), or
transmitting supplementary data without the need for additional bandwidth (Stanescu
et al. 2007).
(6) Corporate In corporate and industry communication, data leakage is the most serious
threat to a business. Therefore, secure communication using steganography is valued for
security and authenticity, as unsafe communication may lead to serious data breaches
(Bandyopadhyay et al. 2008).
Fig. 16 Qualitative results of different video frames before and after embedding: (a) cover frame 3 of V1; (b) stego-frame 3 of V1 using LSB
Fig. 17 Average VQM values obtained for different techniques for the considered videos (y-axis: VQM, 0 to 25)
Fig. 18 Average hiding capacity ratio obtained for different techniques for the considered videos (y-axis: hiding capacity %, 0 to 14)
Fig. 19 Average bit rate increase values obtained for different techniques for the considered videos (y-axis: bit rate increase %, 0 to 5)
Fig. 20 Steganalysis resistance plot as low, high, and untested of experimented techniques (LSB, LSB1, LSB2, DCT, DCT1, DCT2, DCT3, DWT, DWT1, DWT2, DWT3, MV, MV1, MV2)
Besides, there are some other real-life applications of video steganography such as video
rights management for social networking (Mandal 2016), cloud computing applications
(Santhosh and Meghana 2016), and digital rights management for online TV (Mandal 2016).
Video steganography has attracted researchers for more than a decade, and some good
techniques have been proposed in the literature. Still, the basic requirements of
capacity, imperceptibility, and robustness have not yet been fully met together with high
security. Because the steganographic requirements are interdependent, enhancing one may
degrade performance in another. The concern is that no concrete solution yet satisfies
all the basic requirements simultaneously. Based on the literature survey, the following
suggestions are highlighted to give new researchers in this field useful insights for
future work.
Table 12 Analyses of the surveyed techniques with their pros and cons
Techniques | Hiding capacity | Visual quality | Robustness | Pros | Cons
LSB | High | Adequate | Poor | Simple and easy to implement | Poor robustness and easily detectable using simple steganalysis techniques
DCT | High | High | High | Have high embedding capacity with less computational cost | The visual quality is highly impacted after embedding
DWT | High | Little/No | High | The influence on visual quality is low | High auxiliary data is required for reversible techniques
MV | Adequate | High | Adequate | With high motion videos, embedding capacity could be improved with high imperceptibility | Detection of motion vectors for embedding is a complex task
VLC | Poor | High | Poor | Low computational complexity | Size overhead issues are there, after embedding with poor robustness against statistical measures
Format Specific | Adequate | High | Poor | Latest formats have different new features which could be explored for secure data hiding | The quality of the stego video is grievously distorted and robustness is poor
12 Conclusion
Video steganography is an emerging area for InfoSec because of the large hiding capacity
and complex structure of videos, which makes them more suitable for steganography, as
compared to other multimedia files. This paper presented a survey and analysis of video
References
Abbass AS, Soleit EA, Ghoniemy SA (2007) Blind video data hiding using integer wavelet transforms.
Ubiquitous Comput Commun J 2(1):11–25
Abdolmohammadi M, Toroghi RM, Bastanfard A (2019) Video steganography using 3D convolutional neu-
ral networks. Paper presented at Mediterranean conference on pattern recognition and artificial intel-
ligence. pp. 149–161
Abu-El-Haija S, Kothari N, Lee J, Natsev P, Toderici G, Varadarajan B, Vijayanarasimhan S (2016)
YouTube-8M: a large-scale video classification benchmark. arXiv preprint arXiv:1609.08675
Ahmed EAE, Soliman HH, Mostafa HE (2014) Information hiding in video files using frequency domain.
Int J Sci Res 3(6):2431–2437
Al-Frajat AK, Jalab HA, Kasirun ZM, Zaidan AA, Zaidan BB (2010) Hiding data in video file: an overview.
J Appl Sci 10:1644–1649
Aljahdali H, Townend P, Xu J (2013) Enhancing multi-tenancy security in the cloud IaaS model over public
deployment. Paper presented at IEEE 7th International Symposium on in Service Oriented System
Engineering (SOSE), 385–390
Aly HA (2011) Data hiding in motion vectors of compressed video based on their associated prediction
error. IEEE Trans Inf Forensics Secur 6(1):14–18
Amirtharajan R, Rayappan JBB (2013) Steganography-time to time: a review. Res J Inform Technol 5:53–66
Asikuzzaman M, Pickering MR (2018) An overview of digital video watermarking. IEEE Trans Circuits
Syst Video Technol 28(9):2131–2153
Babaguchi N, Cavallaro A, Chellappa R, Dufaux F, Wang L (2013) Guest editorial: special issue on intel-
ligent video surveillance for public security and personal privacy. IEEE Trans Inf Forensics Secur
8(10):1559–1561
Balaji R, Naveen G (2011) Secure data transmission using video Steganography. Paper presented at IEEE
International Conference in Electro/Information Technology (EIT). pp. 1–5
Balu S, Babu CNK, Amudha K (2018) Secure and efficient data transmission by video steganography in
medical imaging system. Cluster Comput. 22(2):1–7
Bancroft FC, Clelland C (2001) DNA-based steganography. Google Patents, 2001
Bandyopadhyay SK, Bhattacharyya D, Ganguly D, Mukherjee S, Das P (2008) A tutorial review on steg-
anography. Int Conf Contemp Comput 101:105–114
Bhattacharyya D, Bhaumik AK, Choi M, Kim TH (2010) Directed graph pattern synthesis in LSB technique
on video steganography. Lect Notes Comput Sci 6059(1996):61–69
Bhautmage P, Jeyakumar A, Dahatonde A (2013) Advanced video steganography algorithm. Int J Eng Res
Appl 3:1641–1644
Fridrich J, Goljan M (2002) Practical steganalysis of digital images: state of the art. Electronic Imaging
2002:1–13
Fridrich J, Goljan M, Soukal D (2003) Higher-order statistical steganalysis of palette images. Proc SPIE
5020:178–190
Fu P-W, Wu C-C, Cho Y-J (2017) What makes users share content on facebook? Compatibility among psy-
chological incentive, social capital focus, and content type. Comput Human Behav 67:23–32
Furuta T, Noda H, Niimi M, Kawaguchi E (2003) Bit-plane decomposition steganography using wavelet
compressed video. In: proceedings of the 2003 joint conference of the fourth international conference
on information, communications and signal processing, 2003 and fourth pacific Rim conference on
multimedia. Vol. 2. pp. 970–974
Galiano DR, Del Barrio AA, Botella G, Cuesta D (2020) Efficient embedding and retrieval of information
for high-resolution videos coded with HEVC. Comput Electr Eng 81:106541
Gallagher S (2012) Steganography: how al-Qaeda hid secret documents in a porn video. [Online].
https://arstechnica.com/business/2012/05/steganography-how-al-qaeda-hid-secret-documents-in-a-porn-video/.
Accessed 01 Jul 2016
Gross MJ (2011) Exclusive: operation shady RAT: unprecedented cyber-espionage campaign and intellec-
tual-property bonanza. Vanity Fair, vol. 2
Gujjunoori S, Amberker BB (2013) DCT based reversible data embedding for MPEG-4 video using HVS
characteristics. J Inf Secur Appl 18(4):157–166
Gupta H, Chaturvedi DS (2013) Video data hiding through LSB substitution technique. Res Inven Int J Eng
Sci 2(10):32–39
Gupta H, Chaturvedi S (2014) Video steganography through LSB based hybrid approach. Int J Comput Sci
Netw Secur 14(3):99–106
H.264/AVC. [Online]. https://en.wikipedia.org/wiki/H.264/MPEG-4_AVC. Accessed 21 Nov 2016
Hanafy AA, Salama GI, Mohasseb YZ (2005) A secure covert communication model based on video steg-
anography. In: Proceedings of the IEEE MILCOM Military Communication Conference. pp. 1–6,
2008
Hashemzadeh M (2018) Hiding information in videos using motion clues of feature points. Comput Electr
Eng 68:14–25
He Y, Yang G, Zhu N (2012) A real-time dual watermarking algorithm of H.264/AVC video stream for
video-on-demand service. AEU International J Electron Commun 66(4):305–312
Holmes B (2015) Steganography: how Antonopoulos hid a US$100m transaction in a picture of kittens.
[Online]. https://bravenewcoin.com/news/steganography-how-antonopoulos-hid-a-us12m-transaction-in-a-picture-of-kittens/.
Accessed 13 Dec 2017
Hu SD, U KT (2011) A novel video steganography based on non-uniform rectangular partition. In:
Proc. 14th IEEE Int. Conf. Comput. Sci. Eng. CSE 2011, 11th Int. Symp. Pervasive Syst. Algorithms,
Networks, I-SPA 2011, 10th IEEE Int. Conf. IUCC 2011. pp. 57–61
Hussain M, Wahab AWA, Bin Idris YI, Ho ATS, Jung K-H (2018) Image steganography in spatial domain: a
survey. Signal Process Image Commun 65:46–66
Idbeaa T, Samad SA, Husain H (2016) A secure and robust compressed domain video steganography
for intra-and inter-frames using embedding-based byte differencing (EBBD) scheme. PLoS ONE
11(3):e0150732
Jiang B, Yang G, Chen W (2015) A CABAC based HEVC video steganography algorithm without bitrate
increase. J Comput Inf Syst 11(6):2121–2130
Johnson NF, Jajodia S (1998) Exploring steganography: seeing the unseen. IEEE Comput 31(2):26–34
Johnson NF, Duric Z, Jajodia S (2001) Information hiding: steganography and watermarking-attacks and
countermeasures: steganography and watermarking: attacks and countermeasures. Springer, Berlin
Jue W, Min-qing Z, Juan-li S (2011) Video steganography using motion vector components. In:
proceedings of the IEEE 3rd international conference on communication software and networks
(ICCSN), pp. 500–503
Kamil S, Ayob M, Abdullah SNHS, Ahmad Z (2018) Optimized data hiding in complemented or non-com-
plemented form in video steganography. Cyber Resilience Conf 2018:1–4
Kapoor V, Mirza A (2015) An enhanced LSB based video steganographic system for secure and efficient
data transmission. Int J Comput Appl 121(10):38–42
Kaur R, Kaur S (2016) XOR-EDGE based video steganography and testing against Chi-square
steganalysis. Int J Image Graph Signal Process 8(9):31–39
Kaur M, Kaur A (2014) Improved security mechanism of text in video using steganographic technique. Int J
Adv Res Comput Sci Softw Eng. vol. 2, no. 10
Kaur K, Kaur B (2018) DWT-LSB approach for video steganography using artificial neural network. Int.
Adv. Res. J. Sci. Eng. Technol, IARJSET
Kelash HM, Wahab OFA, Elshakankiry OA, El-sayed HS (2014) Utilization of steganographic techniques
in video sequences. Int J Comput Netw Technol 24(1):17–24
Kelley J (2001) Terrorist instructions hidden online. [Online].
http://usatoday30.usatoday.com/tech/news/2001-02-05-binladen-side.htm. Accessed 05 Jul 2016
Ker AD, Bas P, Böhme R, Cogranne R, Craver S, Filler T, Fridrich J, Pevný T (2013) Moving
steganography and steganalysis from the laboratory into the real world. Proceedings of the first
ACM workshop on Information hiding and multimedia security. pp. 45–58
Kessler GC (2004) Steganography: implications for the prosecutor and computer forensics examiner.
American Prosecutors Research Institute
Khare R, Mishra R, Arya I (2014) Video steganography using LSB technique by neural network. Int
Conf Comput Intell Commun Netw 2014:898–902
Khosla S, Kaur P (2014) Secure data hiding technique using video steganography and watermarking. Int
J Comput Appl 95(20):7–12
Kolakalur A, Kagalidis I, Vuksanovic B (2016) Wavelet based color video steganography. Int J
Eng Technol 8(3):165
Konyar MZ, Akbulut O, Öztürk S (2020) Matrix encoding-based high-capacity and high-fidelity revers-
ible data hiding in HEVC. Signal Image Video Process 14:1–9
Kopeytsev V (2020) Steganography in attacks on industrial enterprises, Kaspersky ICS CERT, [Online].
https://ics-cert.kaspersky.com/reports/2020/06/17/steganography-in-attacks-on-industrial-enterprises/.
Accessed 07 Jul 2020
Krenn R (2004) Steganography and steganalysis. Retrieved Sept 8:1–9
Kulkarni A, Goldman J, Nabholz B, Eyre W (2009) Detection of steganography-producing software arti-
facts on crime-related seized computers. J Digit Forensics Secur Law 4(2):5–26
Kumar P, Singh K (2018) An improved data-hiding approach using skin-tone detection for video steg-
anography. Multimed Tools Appl 77(18):24247–24268
Le Gall D (1991) MPEG: a video compression standard for multimedia applications. Commun ACM
34(4):46–58
Lie W-N, Lin T-I, Lin C-W (2006) Enhancing video error resilience by using data-embedding tech-
niques. IEEE Trans Circuits Syst Video Technol 16(2):300–308
Lin TJ, Chung KL, Chang PC, Huang YH, Liao HYM, Fang CY (2013) An improved DCT-based
perturbation scheme for high capacity data hiding in H.264/AVC intra frames. J Syst Softw
86(3):604–614
Liu S, Xu D (2020) A robust steganography method for HEVC based on secret sharing. Cogn Syst Res
59:207–220
Liu B, Liu F, Lu B, Luo X (2006) Real-time steganography in compressed video. Paper presented at interna-
tional workshop on multimedia content representation, classification and security, pp. 43–48
Liu B, Liu F, Ni D (2006) Adaptive compressed video steganography in the VLC-domain. Paper pre-
sented at 2006 IET international conference on wireless, mobile and multimedia networks. pp. 1–4
Liu B, Liu F, Yang C, Sun Y (2008) Secure steganography in compressed video bitstreams. Paper presented
at 3rd international conference on ARES Availability, Reliability and Security. pp. 1382–1387
Liu Y, Hu M, Ma X, Zhao H (2015) A new robust data hiding method for H.264/AVC without
intra-frame distortion drift. Neurocomputing 151:1076–1085
Liu Y, Ju L, Hu M, Zhao H, Jia S, Jia Z (2016) A new data hiding method for H.264 based on secret
sharing. Neurocomputing 188:113–119
Liu Y, Liu S, Zhao H, Liu S (2019) A new data hiding method for H.265/HEVC video streams without
intra-frame distortion drift. Multimed Tools Appl 78(6):6459–6486
Lu Y, Lu C, Qi M (2010) An effective video steganography method for biometric identification. Lect
Notes Comput Sci 6059:469–479
Luo W, Huang F, Huang J (2010) Edge adaptive image steganography based on LSB matching revisited.
IEEE Trans Inf Forensics Secur 5(2):201–214
Luo T, Jiang G, Yu M, Xu H (2016) Asymmetric self-recovery oriented stereo image watermarking
method for three dimensional video system. Multimed Syst 22(5):641–655
Luo T, Jiang G, Yu M, Xu H, Gao W (2017) Sparse recovery based reversible data hiding method using
the human visual system. Multimed Tools Appl 77(15):1–24
Luo T, Zuo L, Jiang G, Gao W, Xu H, Jiang Q (2020) Security of MVD-based 3D video in 3D-HEVC
using data hiding and encryption. J Real-Time Image Process 17(4):773–785
Ma X, Li Z, Tu H, Zhang B (2010) A data hiding algorithm for H.264/AVC video streams without
intra-frame distortion drift. IEEE Trans Circuits Syst Video Technol 20(10):1320–1330
Mandal JK (2016) Handbook of research on natural computing for optimization problems. IGI Global,
Pennsylvania
Tasdemir K, Kurugollu F, Sezer S (2013) Video steganalysis of LSB based motion vector steganogra-
phy. Paper presented at 4th European workshop on visual information processing (EUVIP). pp.
260–264
Tasdemir K, Kurugollu F, Sezer S (2016) Spatio-temporal rich model-based video steganalysis on cross
sections of motion vector planes. IEEE Trans Image Process 25(7):3316–3328
Thomas TL (2003) Al Qaeda and the internet: the danger of "cyberplanning". Parameters 33(1):112–123
Torkaman MRN, Nikfard P, Kazazi NS, Abbasy MR, Tabatabaiee SF (2011) Improving hybrid crypto-
systems with DNA steganography, Paper presented at international conference on digital enter-
prise and information systems. pp. 42–52
Tudor PN (1995) MPEG-2 video compression. Electron Commun Eng J 7(6):257–264
Video HAVCC, Tew Y, Wong K (2014) An overview of information hiding in H.264/AVC compressed
video. IEEE Trans Circuits Syst Video Technol 24(2):305–319
Wahab OFA, Badawy MB, Elshakankiry OA, El-sayed HS (2015) Utilizations of reversible lossless data
hiding techniques in video sequences. Int J Comput Netw Technol 3(1)
Wang Z, Lu L, Bovik AC (2004) Video quality assessment based on structural distortion measurement.
Signal Process Image Commun 19(2):121–132
Wang K, Zhao H, Wang H (2014) Video steganalysis against motion vector-based steganography by add-
ing or subtracting one motion vector value. IEEE Trans Inf Forensics Secur 9(5):741–751
Wang K, Han J, Wang H (2014) Digital video steganalysis by subtractive prediction error adjacency
matrix. Multimed Tools Appl 72(1):313–330
Wang P, Cao Y, Zhao X (2017) Segmentation based video steganalysis to detect motion vector modifica-
tion. Secur Commun Netw 2017:1–12
Wang Y, Cao Y, Zhao X, Xu Z, Zhu M (2018) Maintaining rate-distortion optimization for IPM-based
video steganography by constructing isolated channels in HEVC. In: Proceedings of the 6th ACM
workshop on information hiding and multimedia security. pp. 97–107
Wang J, Jia X, Kang X, Shi Y-Q (2019) A cover selection HEVC video steganography based on intra
prediction mode. IEEE Access 7:119393–119402
Warfare C, Wang H, Wang S (2004) Cyber warfare: steganography vs. steganalysis. Commun ACM
47(10):76–82
Westfeld A, Pfitzmann A (2000) Attacks on steganographic Systems. Inf Hiding 1768:1–16
Wiegand T, Sullivan GJ, Bjontegaard G, Luthra A (2003) Overview of the H.264/AVC video coding
standard. IEEE Trans Circuits Syst Video Technol 13(7):560–576
Wong K, Tanaka K, Takagi K, Nakajima Y (2009) Complete video quality-preserving data hiding. IEEE
Trans Circuits Syst Video Technol 19(10):1499–1512
Xu C, Ping X (2007) A steganographic algorithm in uncompressed video sequence based on difference
between adjacent frames. In Proceedings of the 4th International Conference on Image Graph
ICIG 2007. pp. 297–302
Xu C, Ping X, Zhang T (2006) Steganography in compressed video stream. Paper presented at First
International Conference on innovative computing, information and control, ICICIC. vol. 1, pp.
269–272
Xu X, Dong J, Tan T (2012) Universal spatial feature set for video steganalysis. In: Proceedings of the
19th IEEE international conference on image processing. pp. 245–248
Xu D, Wang R, Shi YQ (2014) Data hiding in encrypted H.264/AVC video streams by codeword substi-
tution. IEEE Trans Inf Forensics Secur 9(4):596–606
Xu D, Wang R, Shi YQ (2016) An improved scheme for data hiding in encrypted H.264/AVC videos. J
Vis Commun Image Represent 36:229–242
Xue Y, Zhou J, Zeng H, Zhong P, Wen J (2019) An adaptive steganographic scheme for H.264/AVC
video with distortion optimization. Signal Process Image Commun 76:22–30
Yadav P, Mishra N, Sharma S (2013) A secure video steganography with encryption based on LSB tech-
nique. Paper presented at IEEE International Conference on Computational Intelligence and Com-
puting Research (ICCIC) pp. 1–5
Yang M, Bourbakis N (2005) A high bitrate information hiding algorithm for digital video content under
H.264/AVC compression. Midwest Symp Circuits Syst 2005:935–938
Yang Y, Li Z, Xie W, Zhang Z (2019) High capacity and multilevel information hiding algorithm based
on pu partition modes for HEVC videos. Multimed Tools Appl 78(7):8423–8446
Yao Y, Zhang W, Yu N, Zhao X (2015) Defining embedding distortion for motion vector-based video
steganography. Multimed Tools Appl 74(24):11163–11186
Yao Y, Zhang W, Yu N (2016) Inter-frame distortion drift analysis for reversible data hiding in encrypted
H.264/AVC video bitstreams. Signal Processing 128:531–545
Ye H, Zhang W, Yao Y, Kong C, Huang H, Yu N. Motion vector-based video steganalysis using
spatial-temporal correlation. In: Proceedings of the 6th international congress on image and
signal processing (CISP). Vol. 1. pp. 148–153
Yeh H-L, Gue S-T, Tsai P, Shih W-K (2014) Reversible video data hiding using neighbouring similarity.
IET Signal Process 8(6):579–587
Zhai L, Wang L, Ren Y (2017) Combined and calibrated features for steganalysis of motion vector-based
steganography in H.264/AVC. In: proceedings of the 5th ACM workshop on information hiding
and multimedia security. pp. 135–146
Zhang W, Cheung S-CS, Chen M (2005) Hiding privacy information in video surveillance system.
Paper presented at IEEE international conference on image processing. Vol. 3. p II–868
Zhang H, Cao Y, Zhao X (2015) Motion vector-based video steganography with preserved local optimal-
ity. Multimed Tools Appl 89:1–17
Zhang Y, Zhang M, Niu K, Liu J (2015) Video steganography algorithm based on trailing coefficients. Paper
presented at international conference on intelligent networking and collaborative systems (INCOS),
pp. 360–364
Zhao Y, Zhang H, Cao Y, Wang P, Zhao X (2015) Video steganalysis based on intra prediction mode cali-
bration. International Workshop on Digital Watermarking. pp. 119–133
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.