A Facial Expression Recognition Method Using Deep Convolutional Neural Networks Based on Edge Computing
Digital Object Identifier 10.1109/ACCESS.2020.2980060
ABSTRACT The imbalanced class sizes and high similarity of samples in facial expression databases can lead to overfitting in expression recognition neural networks. To address this problem, a facial expression recognition method using deep convolutional neural networks, based on edge computing, is proposed. To overcome the limitation that the cycle-consistent adversarial network model can only learn one-to-one mappings, we construct a constrained cycle-consistent generative adversarial network by adding class-constraint information; the discriminator and classifier in this network share network parameters. In addition, to address the unstable training and frequent mode collapse of the original GAN, this paper introduces a gradient penalty rule into the discriminator's loss function to impose a normative constraint on gradient changes. The network not only generates sample data for minority classes in the training set of an expression database, but also performs effective expression classification. Compared with other methods, the improved discriminative-classifier network structure enhances sample diversity and achieves a higher expression recognition rate. Even when other expression feature extraction methods are used, a higher recognition rate can still be obtained after applying the proposed data augmentation framework.
INDEX TERMS facial expression recognition; generative adversarial network; deep learning; edge
computing; class constraint information; gradient penalty rule
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see [Link]
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2020.2980060, IEEE Access
information of the original image and are not robust to image scale and lighting conditions.
Compared to manual feature extraction methods, deep neural networks can learn features automatically and achieve a high recognition rate in facial expression recognition. To extract more facial expression features, the number of layers in neural networks has also been increasing gradually. However, these networks tend to overfit as they deepen and as the number of parameters grows; the smaller the data set, the more severe the overfitting. Most facial expression data sets suffer from insufficient data and high sample similarity. Besides, imbalanced samples can also lead to unsatisfactory neural network recognition [4,5].
Data augmentation is an important means of resolving sample shortages and imbalances. Reference [6] applied traditional rotation and crop data augmentation to expand the training samples. Most of the resulting images carry duplicate information and are close to simple copies of a sample. Moreover, in terms of information content this is still far from having the same number of genuinely different samples, and such transformations do not change the identity information of the images. Therefore, the problem of high sample similarity remains unsolved. Different from simple geometric transformations and cropping, GANs introduce an adversarial loss function and learn facial expression images with the same distribution as the target dataset, which can solve the high-similarity problem of generated samples. However, GAN networks map random vectors into the target dataset, and the lack of constraints often results in uneven quality.
For the uneven number of facial expression samples, such as the relatively small amount of disgust and sad expression data, this paper introduces Cycle GAN into facial expression data augmentation, enabling the mapping of neutral expressions to multi-category expressions. At the same time, Cycle GAN learns only a one-to-one mapping relationship. Therefore, when a one-to-many mapping is required (such as a neutral expression to a variety of expressions such as happy, sad and surprised), the model needs to be trained multiple times, which brings a huge time cost. To address this problem, this paper improves Cycle GAN and proposes a constrained cycle-consistent generative adversarial network for facial expression recognition. The network introduces class-constraint conditions and gradient penalty rules, and implements one-to-many mapping transformation in a single model. It reduces the model training overhead while obtaining higher-quality generated images.
Compared with the cycle-consistent generative adversarial network, this network has three major improvements:
1) An auxiliary expression classifier is added on top of the discriminator. The newly added discriminative classifier replaces the two discriminators of the cycle-consistent generative adversarial network, and it can judge the authenticity of input images and classify expressions.
2) Aiming at the problems of unstable training and frequent mode collapse in the original GAN, this paper introduces gradient penalty rules into the loss function of the discriminator, which achieves a normative constraint on gradient changes.

II. GANS-BASED EXPRESSION RECOGNITION APPLICATIONS
Facial expression image editing is a special and important research topic. Because human vision is sensitive to facial irregularities and deformation, it is not easy to edit realistic facial expression images. In this regard, GANs can edit facial expression images with high-quality detailed textures. Moreover, expression recognition on the edited expression images can still achieve a good effect.
Facial expression editing is a challenging task because it requires advanced semantic understanding of the input facial images. Traditional methods either need paired training data or synthesize face images at very low resolution. Reference [7] proposed the Conditional Adversarial Auto-Encoder (CAAE) to learn face manifolds, and then realized smooth age-expression image regression on the face manifold. In CAAE, the face image is first mapped to a latent vector by a convolutional encoder, and the vector is projected onto an age-based face manifold by a deconvolution generator. Latent vectors retain the subject's facial features, and age conditions control the regression. Using adversarial learning on the encoder and generator makes the generated images more realistic. Experimental results show that the framework has good performance and flexibility, and the quality of generated images is high. Reference [8] proposed an Expression Generative Adversarial Network (ExprGAN) based on CAAE, which can edit the facial expression intensity of real images. In addition to the encoder and decoder networks, ExprGAN also designed an expression intensity control network specifically for learning the expression intensity of generated images. This novel network structure allows the intensity of generated expression images to be adjusted from low to high.
Reference [9] proposed a Conditional Difference Adversarial Autoencoder (CDAAE), which synthesizes facial expressions based on AU labels. Given an unseen face image, it uses the target expression label or facial action unit (AU) label to generate that person's facial expression image. CDAAE adds a feedforward path to the autoencoder structure, connecting the low-level features of the encoder with the corresponding-level features of the decoder. By learning to distinguish the differences between low-level image features of different facial expressions for the same person, the problem of separating changes due to identity from changes due to facial expression can be solved. Experimental results show that CDAAE preserves the facial expression information of unknown subjects more accurately than the latest methods. However, the resolution of facial
images generated by CDAAE is only 32 × 32, and facial images with AU labels cannot be well evaluated quantitatively. Reference [10] combined the geometric model of 3DMM [11] with a generative model. The former can separate expression attributes from other facial attributes and generate 3DMM expression attributes based on the target AU label. In this way, high-resolution facial expression images can be generated, with the target expression determined by the AU label. Reference [12] proposed a method for augmenting facial expression data based on Cycle GAN. They converted neutral expressions into other expressions, expanding expression categories with less data, such as disgust and sad expressions. Consequently, the classification accuracy improved by 5%-10% after using this data augmentation technology.
Image restoration is a traditional graphics problem. It refers to restoring the missing part of an image based on the existing information in the image, so that human eyes cannot distinguish which part was restored. Image restoration involves statistics, probabilistic models, etc. [13-15]. The application of image restoration in facial expression recognition is very common. During facial expression recognition, key parts of facial images may be occluded; for example, some faces wear sunglasses and some wear scarves, which block the eyes or mouth, and the effect of these occlusions on expression recognition is significant. Therefore, the occluded part can be restored by a dedicated algorithm before recognition is performed. Traditional methods restore an image by copying pixels from the original image or by copying patches from an image library, while GANs provide a new approach to image restoration.
Reference [16] proposed the context encoder, the first image restoration method based on GANs. The network is based on an encoder-decoder architecture, and the network input is a 128 × 128 image with missing blocks. The output is the 64 × 64 missing content (when the missing block is in the middle of the original image) or the 128 × 128 fully restored image (when the missing block may be anywhere in the original image). The objective function includes an adversarial loss and a content loss. Experimental results show that the restoration effect is better when the missing block is in the middle of the original image. Reference [17] proposed a semantic image restoration method based on GAN iteration, which pre-trains a GAN whose generator maps a hidden variable z into an image. Given an image x_0 with some missing information, x_0 is encoded into z* by minimizing the objective function, which includes an adversarial loss function and a content loss function. The content loss function calculates the weighted L1 pixel distance between the generated image G(z*) and the undamaged area of x_0; pixel values near the missing area have a higher weight. Finally, the objective function is optimized by back-propagation iteration.
Reference [18] considered that GAN-based image restoration models are susceptible to the initial solution of non-convex optimization criteria and built an end-to-end trainable parametric network. Starting from a good initial solution, they obtained more realistic image reconstruction with a significant optimization speedup, and learned to use a recurrent neural network to optimize the time window of the initial solution. In the iterative optimization process, a temporal smoothness loss is applied to respect the redundancy of the sequence time dimension. Experimental results show that this method is significantly better than other methods in image reconstruction quality. Reference [19] designed a facial image generation network based on Wasserstein GANs, which can generate context-complete complementary images for the occluded areas of images, together with an expression recognition network. It extracts expression features, infers expression categories, and achieves a high recognition effect on the CK+ database. All of these mechanisms have achieved good recognition results, but the quality of images generated by GAN networks is uneven due to the lack of constraints. Moreover, they cannot realize flexible expression mapping when the numbers of facial expression samples are unbalanced.
Reference [20] proposed a facial restoration algorithm based on a deep generative model. Different from well-studied background completion, the facial restoration task is more challenging because it usually requires generating semantically new pixels for key missing parts with a large range of appearance changes, such as eyes and mouth. Reference [21] proposed a novel cascaded backbone-branches fully convolutional neural network (BB-FCN) to locate facial landmarks quickly and accurately in unconstrained and cluttered environments. BB-FCN does not need any preprocessing and generates facial landmark response maps directly from the original image. BB-FCN follows a coarse-to-fine cascade pipeline composed of a backbone network, which roughly detects the locations of all facial landmark points, and a branch network for each type of detected landmark point, which further refines its location. These mechanisms have achieved good recognition results. However, they need multiple training models, which brings a huge time cost.
In addition to the above two classes of GANs-based expression recognition applications, there are other application methods. However, GANs in expression recognition are mostly used for data augmentation.

III. THE PRINCIPLE OF GAN
The GAN model contains two networks: a generator G and a discriminator D. The basic structure and calculation flow are shown in Figure 1.
[Figure 1. The basic structure of GAN: a random variable z is fed into the generator G to produce G(z); the discriminator receives the real sample data x and the generated data G(z).]
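The two-player objective behind Figure 1 can be sketched numerically. Below is a minimal Python illustration of the classical value function V(D, G) = E[log D(x)] + E[log(1 − D(G(z)))]; the discriminator logit values are made up for illustration and are not from the paper's implementation:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def gan_value(d_real_logits: np.ndarray, d_fake_logits: np.ndarray) -> float:
    """V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].
    The discriminator maximizes V; the generator minimizes it."""
    return float(np.mean(np.log(sigmoid(d_real_logits))) +
                 np.mean(np.log(1.0 - sigmoid(d_fake_logits))))

# A discriminator that scores real data high and fakes low drives V toward 0
# (its maximum); a fooled discriminator drives V strongly negative.
v = gan_value(np.array([5.0, 6.0]), np.array([-5.0, -6.0]))
```

At the (theoretical) equilibrium, D outputs 0.5 everywhere and V = −2 log 2, which is why training is framed as a minimax game rather than a single loss.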
small part of labeled data to train discriminators for traditional classification and regression tasks.

IV. PROPOSED IMPROVED CYCLEGAN
The proposed improved CycleGAN achieves image style conversion from a source domain to a target domain. For the expression recognition task, data augmentation mainly focuses on the conversion of neutral expressions into multiple expressions (anger, disgust, fear, sadness, surprise and joy). A cycle-consistent generative adversarial network would have to train multiple models for this transformation. This section therefore constructs a constrained cycle-consistent generative adversarial network structure, as shown in Fig. 2: 1) class-constraint information is added to CycleGAN to achieve multi-category style conversion in a single model; 2) an auxiliary expression classifier C is added to the discriminator. The newly added discriminative classifier replaces the two discriminators of the cycle-consistent generative adversarial network, and it can judge the authenticity of input images and classify expressions.
[Figure 2. Structure of the constrained cycle-consistent generative adversarial network: the generator G (encoder G_enc and decoder G_dec) maps input A to a generated B_i and back to a cyclic A, and likewise maps input B_i to a generated A and a cyclic B_i; the discriminative classifier D_cs with auxiliary classifier C outputs a fake/real score and an expression label e_i, trained with a cross-entropy(e_i, ê_i) loss.]
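The class-constraint information e_i can be pictured as a one-hot target-expression label attached to the generator's latent feature, which is what lets a single model cover the one-to-many neutral-to-expression mapping. A minimal sketch; the latent size (128) and the helper name `condition` are illustrative assumptions, not the paper's code:

```python
import numpy as np

NUM_CLASSES = 6  # anger, disgust, fear, sadness, surprise, joy

def condition(latent: np.ndarray, target_class: int) -> np.ndarray:
    """Append a one-hot class-constraint vector e_i to an encoder feature,
    so one generator can be steered toward any target expression."""
    one_hot = np.zeros(NUM_CLASSES)
    one_hot[target_class] = 1.0
    return np.concatenate([latent, one_hot])

z = np.ones(128)                       # hypothetical G_enc feature
z_cond = condition(z, target_class=3)  # request the "sadness" mapping
```

With this conditioning, switching the target expression is just switching the label, instead of training a separate CycleGAN per expression pair.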
Equation (1) represents the loss function to be optimized by the mapping A → {B_i}, i = 1, …, S, where e_i represents the category information:

L_LSGAN(G, D_cs, A, B_i) = E_{b~Pdata(B_i)}[(D_cs(b) − 1)^2] + E_{a~Pdata(A)}[(D_cs(G(a)))^2]   (1)

3) Classification loss
As mentioned for the original GAN network, training is unstable and the model collapses easily. Most researchers studying this problem have proposed that it is caused by attempting to minimize a strong divergence during network training. To solve this problem, we introduce gradient penalty rules into the discriminator's loss function, which regulate the gradient change. The gradient penalty used is shown in formula (5):

E_{x̂~P_x̂}[(||∇_x̂ D(x̂)||_2 − 1)^2]   (5)

The introduced gradient penalty is not applied over the entire input space; it applies to the real-sample region, the generated-sample region, and the region between them. Thus, first randomly sample a pair of real and generated samples and draw a random number ε in the range [0,1]:

x_r ~ P_r,  x_g ~ P_g,  ε ~ Uniform[0,1]   (6)

In formula (6), x_r ~ P_r represents sampling from the real sample distribution and x_g ~ P_g represents sampling from the generated sample distribution; ε is a random number in the interval [0,1]. Then perform random interpolation sampling on the line between x_r and x_g:

x̂ = ε·x_r + (1 − ε)·x_g   (7)

The distribution satisfied by x̂ obtained through this sampling process is P_x̂, and the discriminator's gradient penalty expectation in formula (5) is taken over it.

In the generation path, the generator takes noise z and a given view-angle label v as inputs. The purpose of generator G is to generate a new image G(v, z) in view v. The role of discriminator D is to distinguish, for that view, real data x from generated data G(v, z); its loss includes the view-classification term

−λ_2 E_{x~P_x}[P(D_v(x) = v)]   (9)

In the reconstruction path, encoder E and decoder D are mainly trained; encoder E attempts to reconstruct the training samples. The cross-reconstruction method is used with encoder E to separate angle information from identity information, so that the images of multiple views keep the same identity information. Specifically, for samples (x_i, x_j) with the same identity but different angles, x_j is reconstructed from x_i. x_i is used as the input of encoder E, which outputs a view estimate v̄ and an identity-preserving representation z̄, that is, (v̄, z̄) = (E_v(x_i), E_z(x_i)) = E(x_i). The resulting z̄ and the target angle v_j are input into generator G together. Guided by angle v_j, G generates the corresponding x̂_j; at this point x̂_j has been reconstructed from x_i. Finally, the discriminator D tries to distinguish the real x_j from the generated x̂_j and outputs the corresponding score and angle information. In this network, the loss function of encoder E is shown in formula (11):

L_E = E_{x_i, x_j ~ P_x}[ D_s(x̂_j) + λ_3·P(D_v(x̂_j) = v_j) − λ_4·L_1(x̂_j, x_j) ]   (11)
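The interpolation-and-penalty recipe of formulas (5)-(7) can be sketched end to end for a toy linear critic, where the gradient ∇_x D(x) is simply the weight vector, so no automatic differentiation is needed. The linear critic and λ = 10 are illustrative assumptions, not the paper's network:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=4)  # toy linear critic D(x) = w @ x, so grad_x D(x) = w

def gradient_penalty(x_r: np.ndarray, x_g: np.ndarray, lam: float = 10.0) -> float:
    """Sample x_hat on the line between a real and a generated point
    (formulas (6)-(7)), then penalize (||grad_x D(x_hat)||_2 - 1)^2 (formula (5))."""
    eps = rng.uniform(0.0, 1.0)            # epsilon ~ Uniform[0, 1]
    x_hat = eps * x_r + (1.0 - eps) * x_g  # interpolated sample x_hat
    grad = w                               # gradient of the linear critic at x_hat
    return lam * (np.linalg.norm(grad, 2) - 1.0) ** 2

x_real = rng.normal(size=4)  # stand-in sample from P_r
x_fake = rng.normal(size=4)  # stand-in sample from P_g
gp = gradient_penalty(x_real, x_fake)
```

In a real network the gradient at x̂ would come from autodiff rather than a closed form; the point here is that the penalty is evaluated on interpolated samples, not only on real or generated ones.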
Limiting x̂_j with the L_1 loss ensures that x_j is reconstructed from x_i, and the L_v loss is the cross-entropy loss between the estimated view and the real view. The loss function of the discriminator D is:

L_Dxvs = E_{x_i, x_j ~ P_x}[ D_s(x̂_j) − D_s(x_j) ] + λ_1 E_{x̂~P_x̂}[ (||∇_x̂ D_s(x̂)||_2 − 1)^2 ] − λ_2 E_{x_i ~ P_x}[ P(D_v(x_i) = v_i) ]   (12)

Combining the above three parts of the loss function, for the generators G_cs and G, the loss function to be optimized is:

min_{G_cs, G} L_{G_cs, G} = λ_1·L_cyc + λ_2·L_Gvzx − L_adv   (13)

For the discriminative classifier, the loss function to be optimized is:

min_{D_cs, C} L_{D_cs, C} = λ_3·L_Dxvs + L_adv   (14)

In the experiment, λ_1, λ_2 and λ_3 in formula (13) and formula (14) are set to 100, 10, and 1, respectively.

V. EXPRESSION RECOGNITION DIAGRAM BASED ON IMPROVED CYCLE GAN
On the basis of the improved Cycle GAN, this paper proposes an efficient and secure facial expression recognition method based on an edge cloud framework combined with the improved Cycle GAN. The edge cloud computing framework is shown in Figure 3. In this system, the Internet of Things obtains facial expression signals from users through multi-secret-sharing technology and then distributes them to different edge clouds to ensure user privacy.
[Figure 3. The edge cloud computing framework and its edge computing data flow.]
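The optimization targets in formulas (13) and (14) reduce to weighted sums of the individual loss terms with λ_1 = 100, λ_2 = 10, λ_3 = 1. A sketch with illustrative scalar inputs standing in for the computed cycle-consistency, classification and adversarial terms:

```python
def generator_loss(l_cyc: float, l_cls: float, l_adv: float,
                   lam1: float = 100.0, lam2: float = 10.0) -> float:
    """Formula (13): lam1 * L_cyc + lam2 * L_cls - L_adv."""
    return lam1 * l_cyc + lam2 * l_cls - l_adv

def discriminator_loss(l_d: float, l_adv: float, lam3: float = 1.0) -> float:
    """Formula (14): lam3 * L_Dxvs + L_adv."""
    return lam3 * l_d + l_adv

g_loss = generator_loss(l_cyc=0.5, l_cls=0.2, l_adv=1.0)  # 100*0.5 + 10*0.2 - 1.0
d_loss = discriminator_loss(l_d=0.3, l_adv=1.0)
```

The large λ_1 means the cycle-consistency term dominates, which is what keeps the generated expression image anchored to the input identity.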
[Figure: the generator network structure. Input A passes through the encoder G_enc — Conv(64,4,2), Conv(128,4,2), Conv(256,4,2), Conv(512,4,2), Conv(1024,4,2) — is combined with the expression label e_h, and the decoder G_dec — DeConv(512,4,2), DeConv(256,4,2), DeConv(128,4,2), DeConv(6,4,2) — outputs the generated image B.]
[Figure: the discriminator network structure. The input (A or a generated B) passes through Conv(64,4,2), Conv(128,4,2), Conv(256,4,2), Conv(512,4,2), Conv(1024,4,2), FC(1024) and FC(1), outputting a fake/real score.]
[Figure: the auxiliary classifier network structure. The input (A or a generated B) passes through Conv(64,4,2), Conv(128,4,2), Conv(256,4,2), Conv(512,4,2), Conv(1024,4,2), FC(1024) and FC(7) with Sigmoid, outputting the expression class (Angry/Disgust/Fear/Happy/Sad/Surprise).]
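The Conv(c, 4, 2) / DeConv(c, 4, 2) notation (channels, kernel 4, stride 2) implies each encoder or discriminator stage halves the spatial resolution and each transposed-convolution stage doubles it. A small sketch of that arithmetic; the 128 × 128 input size and padding of 1 are assumptions, since the text does not state them:

```python
def conv_out(size: int, kernel: int = 4, stride: int = 2, pad: int = 1) -> int:
    """Spatial output size of a stride-2 convolution."""
    return (size + 2 * pad - kernel) // stride + 1

def deconv_out(size: int, kernel: int = 4, stride: int = 2, pad: int = 1) -> int:
    """Spatial output size of a stride-2 transposed convolution."""
    return (size - 1) * stride - 2 * pad + kernel

size = 128                 # assumed input resolution
trace = [size]
for _ in range(5):         # five Conv(c,4,2) stages: 64, 128, ..., 1024 channels
    size = conv_out(size)
    trace.append(size)     # each stage halves the spatial size
bottleneck = size
restored = deconv_out(bottleneck)  # one DeConv(c,4,2) stage doubles it again
```

This halving/doubling symmetry is why the kernel-4 / stride-2 / padding-1 combination is the conventional choice: a deconvolution stage exactly inverts the spatial shrinkage of a convolution stage.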
domain, and the remaining expression classes are used as the target domain, with samples generated by the constrained cycle-consistent generative adversarial network. Since the number of disgust expressions in the JAFFE dataset is relatively small, more disgust expressions are generated to enhance the original data set, and a small number of samples are added to the surprise expression category. Consequently, the final sample distribution is basically balanced. Samples in the test set were not augmented.
Fig. 7 shows the experimental results of expression classification on the JAFFE dataset in this section. It can be seen from Fig. 7 that after data augmentation of the original data set, not only is the recognition rate of disgust expressions improved, but the recognition rates of the other expressions also improve. This is because, as the number of training images increases, the differences between expressions become better represented: the more features obtained during training, the lower the false recognition rate. The average recognition rate improves accordingly.
[Figure 7. Recognition rates (%) on JAFFE before and after data augmentation:
Angry 75.23 → 78.21 (+2.98); Disgust 54.26 → 68.48 (+14.22); Fear 65.34 → 67.87 (+2.53); Neutral 75.67 → 77.88 (+2.21); Happy 87.43 → 88.98 (+2.55); Sad 70.43 → 73.56 (+3.13); Surprise 82.32 → 83.76 (+1.44); Average 72.95 → 76.96 (+4.01).]
suppresses identity information in the loss function, while the other facial expression samples mapped from neutral expressions retain the original identity information. For reference [25], since it is enhanced with geometric transformations such as rotation and cropping, the gain brought by augmentation is obvious.
TABLE 6. RECOGNITION RESULTS OF DIFFERENT METHODS ON CK+ DATABASE

Method          | Feature                                                      | Recognition rate before enhancement (%) | Recognition rate after enhancement (%) | Gain (%)
Reference [19]  | Wasserstein GANs                                             | 94.35 | 95.56 | 1.21
Reference [25]  | LeNet-5 cross connected                                      | 84.34 | 86.45 | 2.11
Reference [26]  | Combined IEw and ATRLBP operator                             | 97.34 | 98.35 | 1.01
Proposed method | Cycle GAN + class constraint condition + gradient penalty rule | 97.23 | 98.46 | 1.23
VII. CONCLUSION
Facial expression recognition is an important research topic in computer vision and artificial intelligence, and it is widely used in security, automatic driving, business and other fields. The facial expression database is the data foundation of facial expression recognition and plays an important role in the development of facial expression recognition technology. In this paper, the shortcomings of traditional data augmentation methods are analyzed and summarized. Aiming at the problem of class imbalance in existing facial expression databases, the paper improves Cycle GAN, proposes a facial expression recognition method based on a constrained cycle-consistent generative adversarial network, and introduces the class-constraint condition and gradient penalty rule. The experimental results show that the improved generative model can better learn the detailed texture information of face images, and the quality of the generated images is high. The improved discriminator network achieves better classification and recognition on the augmented facial expression images.
This paper studies facial expression recognition and expression image data augmentation. Although some achievements have been made, there are still deficiencies that need further research and improvement. First, the expression recognition and data augmentation in this paper are based on static images, while emotional changes in real life have a temporal dimension, and a static image can only reflect a person's expression state at one moment. Future work will focus on data augmentation for video sequences. Second, in the data augmentation process, the neutral expression image is used as the source domain and the other expression images as the target domain, but the expression state of a person in a real scene can change arbitrarily. How to augment the data without restricting the expression state of the input image is also a direction for future improvement.

REFERENCES
[1] M. Zhang et al., "Emotional context modulates micro-expression processing as reflected in event-related potentials," PsyCh Journal, vol. 7, no. 1, pp. 13-24, 2018.
[2] J. Yu et al., "Hierarchical deep click feature prediction for fine-grained image recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, to be published. DOI: 10.1109/TPAMI.2019.2932058.
[3] H. Gao et al., "Transformation-based processing of typed resources for multimedia sources in the IoT environment," Wireless Networks, 2019, to be published.
[4] X. Ma et al., "An IoT-based task scheduling optimization scheme considering the deadline and cost-aware scientific workflow for cloud computing," EURASIP Journal on Wireless Communications and Networking, 2019: 249.
[5] H. Gao et al., "Context-aware QoS prediction with neural collaborative filtering for Internet-of-Things services," IEEE Internet of Things Journal, 2019, to be published.
[6] A. T. Lopes, E. D. Aguiar, and T. O. D. Santos, "A facial expression recognition system using convolutional networks," in Proc. SIBGRAPI, Salvador, Bahia, Brazil, 2015, pp. 273-280.
[7] Z. Zhang, Y. Song, and H. Qi, "Age progression/regression by conditional adversarial autoencoder," in Proc. CVPR, 2017, pp. 5810-5818.
[8] D. Hui, S. Kumar, and C. Rama, "ExprGAN: Facial expression editing with controllable expression intensity," in Proc. Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, Louisiana, USA, 2018, pp. 6781-6788.
[9] Y. Zhou and B. E. Shi, "Photorealistic facial expression synthesis by the conditional difference adversarial autoencoder," in Proc. ACII, San Antonio, TX, USA, 2017, pp. 370-376.
[10] Z. Liu et al., "Conditional adversarial synthesis of 3D facial action units," Neurocomputing, to be published.
[11] V. Blanz and T. Vetter, "A morphable model for the synthesis of 3D faces," in Proc. SIGGRAPH, Los Angeles, CA, USA, 1999, pp. 187-194.
[12] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Neural Information Processing Systems, vol. 25, no. 2, pp. 1097-1105, Jan. 2012.
[13] H. Gao et al., "Applying probabilistic model checking to path planning in an intelligent transportation system using mobility trajectories and their statistical data," Autosoft, vol. 25, no. 3, pp. 547-559, 2019.
[14] K. Xia et al., "Liver semantic segmentation algorithm based on improved deep adversarial networks in combination of weighted loss function on abdominal CT images," IEEE Access, vol. 5, pp. 96349-96358, Dec. 2019.
[15] H. Gao et al., "Research on cost-driven services composition in an uncertain environment," JIT, vol. 20, no. 3, pp. 755-769, 2019.
[16] D. Pathak et al., "Context encoders: Feature learning by inpainting," in Proc. CVPR, Las Vegas, NV, USA, 2016, pp. 2536-2544.
[17] R. A. Yeh et al., "Semantic image inpainting with deep generative models," in Proc. CVPR, Las Vegas, NV, USA, 2017, pp. 5485-5493.
[18] D. A. Pitaloka et al., "Enhancing CNN with preprocessing stage in automatic emotion recognition," Procedia Computer Science, vol. 116, pp. 523-529, Oct. 2017.
[19] N. M. Yao et al., "Robust facial expression recognition with generative adversarial networks," Acta Automatica Sinica, vol. 44, no. 5, pp. 865-877, 2018.
[20] Y. Li et al., "Generative face completion," in Proc. CVPR,