2020, International Journal of Scientific Research in Computer Science, Engineering and Information Technology
https://doi.org/10.32628/CSEIT2062109…
Video is one of the most interesting data modalities. In terms of dimensionality and size, videos are among the richest and most intuitive data types, enabling fast and easy object recognition and learning. Video classification is an important task for video service providers that archive digital content. Video uploading platforms such as YouTube are collecting enormous datasets, empowering Deep Learning research. Because videos are an important source for recognizing human activity, video classification has become a critical job for these providers. This survey paper studies various deep learning, transfer learning and hybrid model approaches. Video data normally occurs as continuous, analog signals. For a computer to process this video data, the analog signals must be converted to a non-continuous, digital format. In a digital format, the video data can be stored as a series of bits on a hard disk or in computer memory. A video sequence is displayed as a series of frames. Each frame is a snapshot of a moment in time of the motion-video data, and is very similar to a still image. When the frames are played back in sequence on a display device, a rendering of the original video data is created. In real-time video the playback rate is typically 30 frames per second, a rate high enough for the human eye to blend successive frames into a continuous, smoothly moving image. A single frame of video data can be quite large. A video frame with a resolution of 512 x 482 contains 246,784 pixels. If each pixel carries 24 bits of color information, the frame requires 740,352 bytes of memory or disk space. At 30 frames per second, a 10-second video sequence would be more than 222 megabytes in size. Clearly, computer video is impractical without at least one efficient method of video data compression.
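The storage figures quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and it assumes decimal megabytes (1 MB = 1,000,000 bytes), which is the convention under which the abstract's "more than 222 megabytes" holds.

```python
# Raw (uncompressed) video size, using the figures quoted in the abstract.
# All names here are illustrative, not taken from the paper.
WIDTH, HEIGHT = 512, 482      # frame resolution in pixels
BITS_PER_PIXEL = 24           # 8 bits each for three color channels
FPS = 30                      # frames per second
DURATION_S = 10               # clip length in seconds

pixels_per_frame = WIDTH * HEIGHT
bytes_per_frame = pixels_per_frame * BITS_PER_PIXEL // 8
total_bytes = bytes_per_frame * FPS * DURATION_S

print(f"{pixels_per_frame:,} pixels per frame")
print(f"{bytes_per_frame:,} bytes per frame")
# Assumption: decimal megabytes (10**6 bytes).
print(f"{total_bytes / 10**6:.1f} MB for {DURATION_S} s of video")
```

Under binary units (1 MiB = 1,048,576 bytes) the same clip would be somewhat under 222, so the unit convention matters when comparing such figures.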
Remote Sensing
The smart city concept has attracted considerable research attention in recent years across diverse application domains, such as crime suspect identification, border security, transportation, and aerospace. A specific focus has been increased automation using data-driven approaches that leverage remote sensing and real-time streaming of heterogeneous data from various sources, including unmanned aerial vehicles, surveillance cameras, and low-earth-orbit satellites. One of the core challenges in exploiting such high-temporal-resolution data streams, specifically videos, is the trade-off between the quality of video streaming and limited transmission bandwidth. An optimal compromise is needed between video quality, and hence recognition and understanding, and the efficient processing of large amounts of video data. This research proposes a novel unified approach to lossy and lossless video frame compression, which is beneficial for the autonomous processing and enhanced representatio...
Advances in Science, Technology and Engineering Systems Journal
Video and its processing have become an area of growing interest as the use of internet video, online streaming, and CCTV has increased, along with the internet's impact on the general public. Understanding video and its processing has become a prominent research area in the current era. The paper covers traditional video processing and the advancement of video codecs from their earliest years: their origins, features, and drawbacks, and the advances that led to each next stage. It provides insight into the need for video compression and the steps involved in it, followed by an overall review of video compression in various areas. The reasons for the emergence, the origin, and the characteristics of each codec are explained in detail. This information adds knowledge about the past, which helps to focus on the advancements and transitions that can be made to video codecs. The paper also summarizes recent advances in video processing using CNNs, NNs, and deep learning.
Studies in Fuzziness and Soft Computing, 2005
ArXiv, 2021
Video content classification is an important research topic in computer vision, with wide use in fields such as image and video retrieval. This paper presents a model that combines a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN); it develops, trains, and optimizes a deep learning network that can identify the type of video content and classify it into categories such as “Animation, Gaming, natural content, flat content, etc”. To enhance the performance of the model, a novel keyframe extraction method is included so that only the keyframes are classified, thereby reducing the overall processing time without sacrificing any significant performance.
June 2020
We present a new approach to video compression for video surveillance that addresses the shortcomings of the conventional approach by substituting each traditional component with its neural network counterpart. Our proposed work consists of motion estimation, compression and compensation, and residue compression, learned end-to-end to minimize the rate-distortion trade-off. The whole model is jointly optimized using a single loss function. Our work is based on a standard method that exploits the spatio-temporal redundancy in video frames to reduce the bit rate while minimizing distortion in the decoded frames. We implement a neural network version of the conventional video compression approach and encode the redundant frames with a lower number of bits. Although our approach is aimed mainly at surveillance, it can easily be extended to general-purpose videos too. Experiments show that our technique is efficient and outperforms standard MPEG encoding at comparable bitrates while pr...
cscjournals.org
Video surveillance has been a popular security tool for years. Video surveillance systems produce huge amounts of data for storage and display, and long-term human monitoring of the acquired video is impractical and ineffective. This paper presents a novel real-time solution that, in addition to traditional methods for compressing individual video images, identifies and records only "interesting" video frames, such as those with significant amounts of motion in the field of view. The model would be built in Simulink, one of the tools in MATLAB, and incorporated with a DaVinci code processor, a video processor. This could significantly reduce the data rates for surveillance-specific applications.
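The idea of recording only frames that contain motion can be illustrated with simple frame differencing. The following is a minimal sketch, not the authors' Simulink model: the function names, the grayscale list-of-rows frame representation, and both thresholds are assumptions chosen for illustration.

```python
from typing import List, Sequence

# Assumption: a frame is a grid of grayscale intensities in 0..255.
Frame = Sequence[Sequence[int]]


def motion_fraction(prev: Frame, curr: Frame, pixel_threshold: int = 25) -> float:
    """Fraction of pixels whose intensity changed by more than pixel_threshold."""
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def select_interesting_frames(
    frames: List[Frame],
    pixel_threshold: int = 25,
    motion_threshold: float = 0.01,
) -> List[int]:
    """Indices of frames that differ enough from the frame before them."""
    kept = []
    for i in range(1, len(frames)):
        if motion_fraction(frames[i - 1], frames[i], pixel_threshold) > motion_threshold:
            kept.append(i)
    return kept


# Toy sequence: a static 8x8 background, then a bright 3x3 object appears
# and later disappears.
static = [[10] * 8 for _ in range(8)]
moved = [row[:] for row in static]
for r in range(2, 5):
    for c in range(2, 5):
        moved[r][c] = 200

frames = [static, static, moved, moved, static]
print(select_interesting_frames(frames))  # prints [2, 4]
```

Only the frames where the object appears and disappears are kept; the unchanged frames in between are skipped, which is where the data-rate saving would come from. A practical system would also need noise filtering and a background model, which this sketch omits.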
With the rise of digital computing and visual data processing, the need to store and transmit video data became widespread. Storing and transmitting uncompressed raw visual data is impractical because it requires large storage space and high bandwidth. Video compression algorithms can compress this raw visual data into smaller files with a small sacrifice in quality. This paper presents an overview and comparison of standardization efforts on video compression algorithms: MPEG-1,
Video compression is now a mature technology, as shown by the large number of applications that make use of DWT and DCT. Many video compression techniques have been proposed in recent years. With efficient compression techniques, a significant reduction in file size can be achieved with little or no adverse effect on visual quality. Existing techniques, however, are not well suited to real-time video compression, and those based on DCT and DWT have the demerit of being lossy. Here we present a novel technique that uses an object position change finding algorithm to process video in real time with lossless decompression. Compression is done in real time while keeping all of the information of the source and retaining the benefits of compression during the production process [1]. "Lossless" means that the output from the decompressor is bit-for-bit identical to the original input to the compressor: the decompressed video stream should be completely identical to the original. In addition to providing improved coding efficiency in real time, the technique provides the ability to selectively encode, decode, and manipulate individual objects in a video stream. The resulting video coding achieves a high compression ratio without any loss of data in real time.
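The bit-for-bit definition of "lossless" given above can be checked with a round trip through any lossless compressor. The sketch below uses Python's standard `zlib` module (DEFLATE) purely as a stand-in; it is not the object-position technique the abstract describes, and the synthetic byte pattern is an assumption standing in for raw frame data.

```python
import zlib

# Assumption: a highly redundant byte pattern stands in for raw frame data.
original = bytes(range(256)) * 64

# zlib (DEFLATE) is a general-purpose lossless compressor, used here only
# to demonstrate the round-trip property.
compressed = zlib.compress(original, 9)
restored = zlib.decompress(compressed)

is_lossless = restored == original
print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"bit-for-bit identical after decompression: {is_lossless}")
```

A lossy codec would fail the equality check while typically producing a smaller output, which is the trade-off the abstract contrasts against.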
IEEE MultiMedia, 2020
Feature coding has been recently considered to facilitate intelligent video analysis for urban computing. Instead of raw videos, extracted features in the front-end are encoded and transmitted to the back-end for further processing. In this article, we present a lossless key-point sequence compression approach for efficient feature coding. The essence of this predict-and-encode strategy is to eliminate the spatial and temporal redundancies of key points in videos. Multiple prediction modes with an adaptive mode selection method are proposed to handle key-point sequences with various structures and motion. Experimental results validate the effectiveness of the proposed scheme on four types of widely used key-point sequences in video analysis. Intelligent video analysis, involving applications such as activity recognition, face recognition, and vehicle reidentification, has become part and parcel of smart cities and urban computing. Recently, deep learning techniques have been adopted to improve the capabilities of urban video analysis and understanding by leveraging large amounts of video data. With widespread deployment of surveillance systems in urban areas, massive amounts of video data are captured daily from front-end cameras.
Advances in Intelligent Systems and Computing, 2021
State-of-the-art video compression techniques play a vital role in the coding and decoding of video images, which consume enormous memory space. In general, video compression is widely adopted for future mobile applications, based on recently developed technologies built into architectures that establish communication links with minimum consumption of power and memory for audio-visual and video information. High-quality video consumes more memory on the server and user side, which is a major drawback for video compression. We present a conceptually generic, flexible and widely applicable approach to overcome this issue with minimum memory space. The need to store and back up these large volumes of data efficiently has given researchers a roadmap for developing novel algorithms and techniques. Video compression plays a vital role in the storage and transmission of data over limited bandwidth. In this paper, the authors discuss advanced video compression technologies that provide identical quality while recognizing more video content. The authors focus on existing compressors and decompressors (CODECs) such as high-efficiency video coding (HEVC/H.265), which can compress video of any resolution up to 8192 × 4320, including 8K ultra-high definition (UHD) (John Singh et al. in Int J Pure Appl Math 119:3709–3724, [1]). The authors highlight an algorithmic approach that describes the prerequisites for HEVC playback in the web browser and the compression of high-definition video for live video streaming, and the key developments in dividing video frames into a few subsections to achieve a high compression ratio and to retrieve the same video frames with high quality and accuracy (Garcia-Pineda et al. in Comput Commun, [2]). Keywords: Video compression · CODEC · HEVC/H.265 · UHD
International Journal of Research in Advent Technology, 2019
Image recognition problems require powerful models such as convolutional neural networks (CNNs). The analysis in this paper uses results obtained from an extensive evaluation and classification of a large collection of videos obtained from YouTube having more than one million views. CNN connectivity is extended to take advantage of temporal information in the time domain, and a multi-resolution architecture is presented as a result. The significant improvement in performance from 50.3% to 65.3% demonstrates the value of the neural network. When a single-frame model is used, the improvement is very small (59% to 60%), but even this small improvement is significant. The generalization of the selected model to the actions of UCF-101 is studied further in the paper. The UCF-101 baseline (44%) is used as the reference for measuring improvement in recognition and for comparison with large-scale videos. The classification is easily done with a CNN.
IAEME PUBLICATION, 2020
Video content on the internet is increasing day by day with the growing popularity of live video streaming services. People capture, share and save various moments of their lives using videos. The main challenge that emerged for video compression was dealing with high-quality video content. This led to the emergence of highly efficient and powerful video compression techniques, which are used to disseminate videos over the internet. Existing video compression techniques are designed and optimized manually. Recent research has shown that deep learning based video compression techniques give comparable or better results than the existing traditional techniques. These results have pointed researchers toward applying deep learning concepts to video compression for practical use. This paper gives an insight into various recent deep learning based video compression techniques and a comparative analysis based on parameters pertaining to their architectures, compression results, training sets, data sets, VQMs etc. The comparative and performance analysis outlines the scope for further enhancements and optimizations.
Indonesian Journal of Electrical Engineering and Computer Science, 2022
The sudden surge in video transmission over the internet has motivated the exploration of more promising and potent video compression architectures. Although hand-designed techniques based on frame prediction perform well and are widely used, recent deep learning based research in this domain has opened further directions toward purely deep learning based next-generation codecs. Because bandwidth over the internet varies, an adaptive bit rate representation is more suitable for adjusting video quality in tune with bandwidth variation. The proposed architecture comprises an end-to-end trainable video compression network consisting of three major modules: a motion extension network, a flow autoencoder and a frame autoencoder. The frame autoencoder generates the individual compressed frames, the flow autoencoder is used for optical flow based motion compensation, and the next frame is predicted by the motion extension network. The network is designed and evaluated incrementally. Analysis of the outcomes demonstrates the promising performance of the network, both quantitatively and qualitatively. Moreover, the results reveal that adding the optical flow based motion compensation network to the MotionNet architecture enhances performance.
Social Science Research Network, 2021
In this digital world, video and audio (unstructured) data have increased exponentially over the past decade. CCTV surveillance is used as a method of maintaining records for audit, to provide safety and security as well as for legal, insurance, financial, and health purposes. Stored video data helps keep society and organizations safe and secure. However, storing the different formats of video data is a challenge: storage space is a constraint, there is a time limit on data storage as well, and it is not easy to retrieve the data based on an event. Currently, in the conventional approach, the video data transmitted by CCTV is stored in the database server without any changes. Our hypothesis is that data can be stored efficiently for an extended period for further processing. We propose a hybrid solution with three stages to store the video data in the specific system: 1. Removing the redundant frames by using an efficient background subtraction algorithm 2. Threshold/Entropy approach on Keyframe Extraction and ...
2018
1 Professor, Dept. of Computer Engineering, ABMSP’s APCOER Pune, Maharashtra, India; 2,3,4,5,6 Student, Dept. of Computer Engineering, ABMSP’s APCOER Pune, Maharashtra, India. Abstract: In the last few decades, digital video compression technologies have become a necessity for the design, communication and utilization of visual information. Users today have become used to taking and posting photos and videos to record daily life and share experiences. According to recent reports, Instagram users have been posting an average of 55 million photos and videos every day. How to store, back up and maintain this volume of photos and videos efficiently has become an urgent problem. Video compression techniques play a vital role in the storage and transmission of data over limited bandwidth. This review paper introduces advance technology ...
Recent years have shown exponential growth in video processing and transfer through the Internet and other applications. With restrictions on bandwidth, processing and storage, there is extensive demand for end-to-end video compression. Many conventional methods have been developed to compress video. However, with the extensive use of Artificial Intelligence (AI), techniques such as Deep Learning (DL), which have emerged as a best-of-breed alternative for performing different tasks, have also been applied to improving video compression in recent years, with the primary objective of improving the compression ratio while preserving the same video quality. Evolving video compression research based on Neural Networks (NNs) focuses on two distinct directions: first, enhancing current video codecs through better predictions integrated within the same codec framework, and second, holistic end-to-end VC system approaches. Although some of the outcomes are optimistic and the results are good, no brea...
2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019
We present a new algorithm for video coding, learned end-to-end for the low-latency mode. In this setting, our approach outperforms all existing video codecs across nearly the entire bitrate range. To our knowledge, this is the first ML-based method to do so. We evaluate our approach on standard video compression test sets of varying resolutions, and benchmark against all mainstream commercial codecs, in the low-latency mode. On standard-definition videos, relative to our algorithm, HEVC/H.265, AVC/H.264 and VP9 typically produce codes up to 60% larger. On high-definition 1080p videos, H.265 and VP9 typically produce codes up to 20% larger, and H.264 up to 35% larger. Furthermore, our approach does not suffer from blocking artifacts and pixelation, and thus produces videos that are more visually pleasing. We propose two main contributions. The first is a novel architecture for video compression, which (1) generalizes motion estimation to perform any learned compensation beyond simple translations, (2) rather than strictly relying on previously transmitted reference frames, maintains a state of arbitrary information learned by the model, and (3) enables jointly compressing all transmitted signals (such as optical flow and residual). Secondly, we present a framework for ML-based spatial rate control: a mechanism for assigning variable bitrates across space for each frame. This is a critical component for video coding, which to our knowledge had not been developed within a machine learning setting.
2005
This paper presents a new CNN-based architecture for real-time video coding applications. The proposed approach, by exploiting object-oriented CNN algorithms and MPEG encoding capabilities, enables a low bit-rate encoder/decoder to be designed. Simulation results using the Claire video sequence show the effectiveness of the proposed scheme. Copyright © 2005 John Wiley & Sons, Ltd.
Motion Estimation Algorithms for Video Compression, 1997
The introduction of new, more powerful personal computers and workstations has ushered in a new era of computing. New machines must be capable of supporting numerous media data types. These data types include text, graphics, animation, audio, images, and full motion video. Application packages have been used in all aspects of the business and scientific communities for many years. As the users of these programs became more sophisticated, their demands increased. Software manufacturers were forced to develop packages that would supply the users with the flexibility they required. Initially users were content with little more than simple text-based programs. Eventually, with the development of graphics-based systems, users saw the advantages of graphical interfaces. The trend of increased application complexity would not stop with windowing systems. Now, audio and video are considered necessary ingredients. These new data types create new demands.
2013 21st Iranian Conference on Electrical Engineering (ICEE), 2013
The widespread use of the internet, the limited bandwidth of networks, and the different types of media across the net have caused vast growth in data compression methods with different capabilities and qualities. Nowadays, video is a popular medium for everyday use. In various research areas, there is a need to record events at high frame rates. Because of the constraints of high frame rate video, complex methods are not suitable for real-time coding of these videos and would increase the cost of the system. There are different lossless, lossy and near-lossless methods for compressing video sequences. Existing lossy methods cannot limit the subjective or objective loss to a certain upper bound. There has been work on lossless compression of these sequences; however, it offers modest compression ratios, which in some cases are not enough given the large size of these sequences. In this paper we propose a near-lossless method that is comparable with successful existing methods of video compression and yet is simple enough for real-time applications. It includes the major conventional parts for this goal: prediction, quantization and entropy coding. A simple rate control is embedded through different approaches to quantization. The experimental results demonstrate good compression ratios while ensuring reliability through control of the maximum pixel error.
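The combination of prediction and quantization with a bounded maximum pixel error can be sketched as follows. This is a generic illustration and not the authors' method: it assumes a previous-pixel predictor, a uniform quantizer with step 2*delta + 1 (the construction commonly used for near-lossless coding), hypothetical function names, and it leaves out the entropy coding stage entirely.

```python
from typing import List


def quantize(residual: int, delta: int) -> int:
    """Uniform quantizer with step 2*delta + 1; delta = 0 is lossless."""
    step = 2 * delta + 1
    if residual >= 0:
        return (residual + delta) // step
    return -((-residual + delta) // step)


def encode_row(pixels: List[int], delta: int) -> List[int]:
    """Previous-pixel prediction followed by residual quantization."""
    step = 2 * delta + 1
    symbols = []
    prediction = 0  # assumption: both encoder and decoder start from 0
    for p in pixels:
        q = quantize(p - prediction, delta)
        symbols.append(q)
        # Predict from the *reconstructed* value, as the decoder will,
        # so quantization errors do not accumulate along the row.
        prediction += q * step
    return symbols


def decode_row(symbols: List[int], delta: int) -> List[int]:
    """Invert encode_row; each output is within delta of the original pixel."""
    step = 2 * delta + 1
    out = []
    prediction = 0
    for q in symbols:
        prediction += q * step
        out.append(prediction)
    return out


row = [52, 55, 61, 66, 70, 61, 64, 73]
delta = 2  # maximum allowed absolute error per pixel
decoded = decode_row(encode_row(row, delta), delta)
max_error = max(abs(a - b) for a, b in zip(row, decoded))
print(f"max pixel error with delta={delta}: {max_error}")
```

Raising `delta` shrinks the range of symbols to be entropy coded at the cost of a larger, but still bounded, per-pixel error, which is one way a simple rate control could be tied to the quantizer. Clamping reconstructed values to the valid pixel range is omitted here for brevity.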