Papers by Sheng-Luen Chung

2018 IEEE 23rd International Conference on Emerging Technologies and Factory Automation (ETFA)
Precise identification of blister packages is of utmost importance at dispensing stations, where numerous prescriptions are to be efficiently dispensed by pharmacists. However, the usual presence of several hundred similar-looking but completely different types of blister packages in a crowded dispensing station makes the process prone to human error, posing serious safety and health risks to patients. In this work, we propose a highlighted deep learning (HDL) based approach for accurate identification of blister packages. HDL allows smart manipulation of raw data in order to better segment the identified targets and to expose inherent descriptive features, thus facilitating an accurate deep learning based classification process. Specifically, HDL automatically detects, segments, and processes the raw blister pack images irrespective of position, lighting variations, etc., making them suitable for a CNN to classify the correct blister pack types. A ResNet CNN classifier has been trained using the processed images, and the resultant CNN model is finally deployed for classification. We have conducted an extensive experiment at the adult lozenge dispensing station of MacKay Memorial Hospital. The database, which will be released along with this paper, consists of 272 types of blister packages, with 65 images of each type for training and an additional 7 images for validation. In a real testing scenario, the proposed approach yielded almost 100% accuracy, consistently over the entire testing set. The proposed solution can also be extended to any dispensing station for automatic detection and classification of blister packages, thereby offering a highly effective and efficient delivery system.

Personalized network marketing refers to the technique that allows network-marketing decisions to be made based on the results of the collection and analysis of user profiles. In this paper, we propose an interactive questionnaire mechanism whose design objective is to produce timely and reliable user profile attributes in an incremental fashion through an intelligent interface accessible from a web environment. Subsequently, a profile analyzer that utilizes data mining techniques is employed to select fitted users from the user profile database thus collected, and then to feed the result back to the intelligent interface mechanism to improve the quality of the collected profiles. In particular, questions used to reveal attributes can be prioritized based on an adjustable mechanism geared to the particular marketing objectives. To implement the proposed interactive questionnaire mechanism, we have made use of graphical questionnaire design tools, such as question editor, profile...

2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2017
License Plate Detection (LPD) is the pivotal step for License Plate Recognition. In this work, we explore and customize state-of-the-art detection approaches for exclusively handling LPD in the wild. In-the-wild LPD considers license plates captured in challenging conditions caused by bad weather, lighting, traffic, and other factors. As conventional methods fail to handle these inevitable conditions, we explore the latest deep learning based detectors, namely YOLO (You-Only-Look-Once) and its variant YOLO-9000 (referred to here as YOLO-2), and customize them for effectively handling LPD. The prime customizations include modification of the grid size and of the bounding box parameter estimation, and the composition of a more challenging AOLPE (Application-Oriented License Plate Extended) database for performance evaluation. The AOLPE database is an extended version of the AOLP database [1] with additional images taken under extreme but frequently encountered conditions. As the original YOLO and YOLO-2 are not designed for LPD, they fail to handle LPD on the AOLPE database without the customizations. This study can be one of the pioneering works that revise state-of-the-art real-time deep networks for handling LPD. It also serves as a case study for those who wish to customize existing deep networks for detecting specific objects. In addition to a pioneering exploration of deep networks for handling in-the-wild LPD, our contribution also includes the release of the AOLPE database and an evaluation protocol as a novel benchmark for LPD.

Journal of Visual Communication and Image Representation, 2018
A novel approach exploiting facial landmarks and depth warping is proposed for robust cross-pose face recognition. Unlike existing 3-D reconstruction based cross-pose recognition algorithms, the proposed algorithm uses automatically identified, extensive facial landmarks and depth warping to replace the computationally expensive 3-D reconstruction procedure. The given face is thereby registered to the most similar 3-D reference model. When matching to a probe face image, the registered depth-warped faces in the gallery are rotated to align with the orientation of the probe image, and sparse regression is then used to identify the correct person. Further, to handle the more challenging cases with eyeglasses, we devise and employ an enhanced Regressive Tree Structured Model (RTSM) combined with an inpainting procedure, prior to depth warping. The proposed robust cross-pose recognition (RCPR) algorithm

IFAC Proceedings Volumes, 2000
This paper presents a deadlock prevention method for a class of flexible manufacturing systems (FMS) where deadlocks are caused by unmarked siphons in their Petri net models. This method is an iterative approach consisting of two main stages. At each iteration, a fast deadlock detection technique based on mixed integer programming (MIP) is used to find an unmarked maximal siphon. The first stage of the proposed method, called siphons control, adds, for each unmarked minimal siphon, a control place to the original net to prevent that minimal siphon from being unmarked. The second stage, called augmented siphons control, adds a control place to the modified net. The second stage is required since adding control places in the first stage may create new unmarked siphons. In addition, the second stage assures that no new unmarked siphons are generated. We have established the relation between the proposed method and the liveness and reversibility of the controlled net. Finally, manufacturing examples are presented to illustrate the method and to allow comparison with prior methods. Copyright © 2000 IFAC
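
The siphon notion used in this abstract can be sketched in a few lines of Python. This is only an illustration of the standard textbook definition (a non-empty set of places S is a siphon when every transition that outputs into S also takes an input from S), not the MIP-based detection technique of the paper. The names `is_siphon`, `is_unmarked`, `PRE`, and `POST`, and the two-place example net, are assumptions made for this sketch.

```python
def is_siphon(places, pre, post):
    """Check the standard siphon condition for a candidate set of places.

    pre[t]  = set of input places of transition t (hypothetical encoding)
    post[t] = set of output places of transition t (hypothetical encoding)
    """
    s = set(places)
    if not s:
        return False  # siphons are non-empty by convention
    for t, outputs in post.items():
        # A transition feeding tokens into S must also consume from S.
        if outputs & s and not (pre.get(t, set()) & s):
            return False
    return True


def is_unmarked(places, marking):
    """A set of places is unmarked when none of its places holds a token."""
    return all(marking.get(p, 0) == 0 for p in places)


# Hypothetical two-place cycle: p1 -t1-> p2 -t2-> p1
PRE = {"t1": {"p1"}, "t2": {"p2"}}
POST = {"t1": {"p2"}, "t2": {"p1"}}
```

In this toy net, `{"p1", "p2"}` satisfies the siphon condition while `{"p2"}` alone does not, because `t1` puts tokens into `p2` without consuming from it. Once a siphon is unmarked it stays unmarked, which is why the abstract's control places aim to keep siphons marked.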

Discrete Event Systems: Modeling and Control, 1993
We use the Limited Lookahead Policy (LLP) supervisory control scheme of [2, 3, 4, 1] to solve the Cat-and-Mouse problem of [5]. The solution employs the Variable Lookahead Policy (VLP) algorithms of [4]. These algorithms provide an efficient implementation technique of the LLP scheme.

2011 Chinese Control and Decision Conference (CCDC), 2011
This paper investigates a control strategy for the automobile climate control (ACC) problem; in particular, the heating mode in winter is considered, when the initial car compartment temperature and the outside temperature are colder than comfortable. By analyzing, on a psychrometric chart, the air-conditioning processes of the automobile air-handling unit operating under various weather conditions, a control strategy is developed based upon the concept of conservation of sensible heat and latent heat. Irrespective of the car compartment's initial condition, the strategy is to control air flow and mixed air temperature such that the compartment temperature and humidity reach the set condition. The proposed strategy is composed of two parts: air mass flow rate and the percentage of heating air flow. In short, balancing the sensible heat transfer among the air-conditioning processes is essential to achieving temperature control, as variation in the net sensible heat leads to changes in temperature. The proposed control strategy is evaluated in a Matlab/Simulink simulation environment, which shows that it performs better than traditional fuzzy control.
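
The two control quantities named in this abstract, air mass flow rate and the percentage of heated air, can be related to sensible heat with textbook formulas. The sketch below is not the paper's controller; it assumes adiabatic mixing of two air streams with constant specific heat, and the function names and the value of `CP_AIR` are assumptions chosen for illustration.

```python
CP_AIR = 1006.0  # J/(kg*K), approximate specific heat of dry air (assumed constant)


def mixed_air_temperature(heated_fraction, t_heated, t_unheated):
    """Temperature of the mixed stream when a fraction of the air passes the heater.

    Assumes adiabatic mixing and equal specific heat for both streams.
    """
    return heated_fraction * t_heated + (1.0 - heated_fraction) * t_unheated


def sensible_heat_rate(mass_flow, t_supply, t_cabin, cp=CP_AIR):
    """Sensible heat delivered to the cabin in watts: m_dot * cp * (T_supply - T_cabin)."""
    return mass_flow * cp * (t_supply - t_cabin)
```

For example, sending half of the air through a 60 °C heater and mixing it with 0 °C air gives a 30 °C supply; at 0.1 kg/s into a 10 °C cabin that supply delivers roughly 2 kW of sensible heat. A positive net value raises the cabin temperature, which matches the abstract's remark that variation in net sensible heat changes temperature.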

IEEE Transactions on Information Forensics and Security, 2014
Most RGB-D-based research focuses on scene reconstruction, gesture analysis, and simultaneous localization and mapping, but only a few studies examine its impact on face recognition. A common yet challenging scenario considered in face recognition takes a single 2D face of frontal pose as the gallery and other poses as the probe set. We consider a similar scenario but with an RGB-D image pair taken at frontal pose for each subject in the gallery and only 2D images with a large range of pose variations in the probe set, and study the advantage of the additional depth map on top of the regular RGB image. To tackle the cases with depth maps corrupted by quantization noise, which are often encountered when the face is not close enough to the RGB-D camera, we propose a resurfacing approach as a preprocessing phase. We formulate the 3D face reconstruction using the RGB-D image as a constrained optimization and compare the results with different reconstruction settings. The reconstructed 3D face allows the generation of 2D faces with specific poses, which can be matched against the probes. To deal with occlusion and expression variations, an automatic landmark detection algorithm is exploited to identify the parts of a given probe that are good for recognition. Experiments on benchmark databases show that the additional depth map substantially improves the cross-pose recognition performance, and the landmark-based component selection also improves the recognition under occlusion and expression variation. The performance comparison with other contemporary approaches also shows the effectiveness of the proposed approach.

International Journal of Computer Integrated Manufacturing, 2005
Failure diagnosis back-traces failures based on the observed system behaviour when a failure occurs. Ushio et al. (1998) first studied the diagnosability of Petri net (PN) models, a common mathematical modelling structure for CIM systems. With all transitions assumed to be unobservable, the diagnosis process of Ushio et al. relies solely on marking changes at observable places, rendering most non-trivial PNs nondiagnosable. However, in the context of manufacturing processes, the initiation of a command and the response of a process are readily available to the supervisory controller. This paper, in contrast, assumes that part of the transitions of the PN model are observable. Under this assumption, the first contribution of this paper is to investigate how both the label propagation function and the range function, used to construct a diagnoser, are to be revised in order to take advantage of the newly available information provided by observable transitions. The second contribution is to present a procedure to construct, for PN models, the associated verifier, first proposed by Yoo and Lafortune (2001) as a polynomial check mechanism on diagnosability but for finite state automata models. As shown by examples, the additional information from observed transitions in general adds diagnosability to the analysed system.

2015 IIAI 4th International Congress on Advanced Applied Informatics, 2015
Karada OK, a term coined to bear similarity to the popular Karaoke, is a body gesture matching game. Over the time span of a song, video of the body movement performed by an exemplary coach is recorded as a template; later, players are to match the recorded body gestures along with the music, with a score given at the end of the game. Due to the unavoidable difference in camera view between template recording and game play, direct comparison for match scoring can be difficult. This paper presents a Kinect-based solution which utilizes Skeleton-based Invariant Transformation (SVIT) to circumvent the view difference problem.

2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2019
Person Re-IDentification (Re-ID) is to recognize a person who has been seen before by different cameras, possibly in different scenes. Re-ID is one of the most difficult computer vision problems owing to the enormous number of identities involved in a large-scale image pool, many with similar appearance, captured in low-resolution images, in possibly occluded scenes, etc. Global features geared for general object recognition and face recognition are far from adequate to re-identify the same person across cameras. As such, more discriminating features are needed to identify people. In particular, part-based feature extraction methods, which learn local fine-grained features of different human body parts from detected persons, have proved effective for person Re-ID. To further improve the part-aligned spatial feature approach, this paper proposes an improved part-aligned feature (IPAF) deep learning framework to better characterize a person's complete information, with the following three highlights: part alignment, finer part segmentation, and a better learning network backbone. Our proposed solution has been trained and tested on the two most comprehensive Re-ID datasets, with performance comparable to reported state-of-the-art solutions: on Market1501 (DukeMTMC-reID), our proposed solution achieves competitive results with mAP of 85.96% (84.70%) and CMC 1 of 94.30% (89.84%), respectively.
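
The two metrics reported in this abstract, mAP and CMC rank-1, can be sketched as follows. This is a simplified illustration and not the evaluation code of the paper: it assumes a precomputed query-by-gallery distance matrix and omits the same-camera filtering that the usual Market1501 and DukeMTMC-reID protocols apply. All function and variable names are assumptions.

```python
def ranked_gallery(row):
    """Gallery indices sorted from the smallest to the largest distance."""
    return sorted(range(len(row)), key=row.__getitem__)


def cmc_rank1(dist, query_ids, gallery_ids):
    """Fraction of queries whose nearest gallery item has the same identity."""
    hits = 0
    for qid, row in zip(query_ids, dist):
        nearest = ranked_gallery(row)[0]
        hits += gallery_ids[nearest] == qid
    return hits / len(query_ids)


def average_precision(row, qid, gallery_ids):
    """Mean of the precision values at each rank where a correct match appears."""
    correct, precisions = 0, []
    for rank, g in enumerate(ranked_gallery(row), start=1):
        if gallery_ids[g] == qid:
            correct += 1
            precisions.append(correct / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(dist, query_ids, gallery_ids):
    """Average of the per-query average precision."""
    aps = [average_precision(row, qid, gallery_ids)
           for qid, row in zip(query_ids, dist)]
    return sum(aps) / len(aps)
```

Rank-1 only looks at the single nearest gallery image, while mAP rewards placing all images of the same identity near the top of the ranking, which is why both numbers are usually reported together.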

Concerning the development of Chinese medical speech recognition technology, this study re-addresses earlier encountered issues in accordance with the process of Machine Learning Engineering for Production (MLOps) from a data-centric perspective. First is the new segmentation of speech utterances to ensure sentence completeness for all utterances in the collected Chinese Medical Speech Corpus (ChiMeS). Second is the optimization of the Joint CTC/Attention model through data augmentation to boost recognition performance from a very limited speech corpus. Overall, to facilitate the development of Chinese medical speech recognition, this paper contributes: (1) The ChiMeS corpus, the first Chinese Medicine Speech corpus of its kind, which spans 14.4 hours, with a total of 7,225 sentences. (2) A Joint CTC/Attention ASR model trained on ChiMeS-14, yielding a Character Error Rate (CER) of 13.65% and a Keyword Error Rate (KER) of 20.82%, respectively, when tested on the ChiMeS-14 testing set. And (3...

IEEE Robotics & Automation Magazine, 2004

Proceedings of the 33rd Chinese Control Conference, 2014
Image-based 3D reconstruction modeling technology, which reconstructs a 3D model out of several 2D images taken from different viewpoints of the same object, has been developed for decades and applied in various fields. However, creating a 3D model from one single 2D image to emulate a depth portrait of the given picture is less addressed. With the application of automating the making of a commemorative coin from a picture in mind, this paper presents an approach that adds depth onto a 2D picture by extracting edge features in the picture, thus rendering a 3D model reminiscent of low relief. In doing so, the 3D low relief model is constructed after edge detection and coin-shape cropping of the highlighted figure. This model is then segmented with grids to produce a format ready for 3D printing. In contrast to the traditional approach of manual operation involving several steps, the proposed approach streamlines the whole process by automating the making of a 3D model reminiscent of low relief in a format ready for subsequent 3D printing.

2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2015
License plate detection and recognition are mostly studied on automobiles but only rarely on motorcycles. As motorcycles are becoming popular for local transportation and environmental friendliness, the demand for license plate recognition has been increasing in recent years. The primary difference between license plate recognition for automobiles and for motorcycles lies in the detection of license plates, which is the topic of this study. For automobiles, the license plates are mostly installed on the front or on the back of the vehicle with relatively less complicated backgrounds; however, for motorcycles, the backgrounds can be far more complicated. To better handle complicated backgrounds, we study the case with motorcycle detection as preprocessing so that the search area for the license plate can be better constrained, and compare its performance with the case without the preprocessing. A few detection methods are configured and studied for both motorcycle detection and license plate detection, including the state-of-the-art part-based model. Considering processing speed and accuracy, the histogram of oriented gradients (HOG) with support vector machines (SVMs) is found to be the best detector for motorcycle license plates.

Journal of the Chinese Institute of Engineers, 2004
In this paper, we introduce an adaptive two‐mode diagnosis scheme for discrete event systems. When the adaptive diagnoser is initially in the passive mode, it keeps track of the latest state estimates based on the occurrence of observable events. If the system is diagnosable as defined by Sampath et al. (1995), the adaptive diagnoser will detect the occurrence of failures within a finite delay. However, when the adaptive diagnoser gets into an uncertain state, it is at the users’ discretion to switch the operational mode to the active mode. In the active mode, we propose an algorithm by which the active diagnoser uses a testing mechanism to generate a resolution sequence that resolves the uncertain state, provided the diagnosed system at the uncertain state is actively diagnosable.

Proceedings of the 33rd Chinese Control Conference, 2014
Lane detection, critical for alerting drivers to avert departure from driving lanes, is an important issue in Intelligent Vehicle Safety Systems. Traditional lane detection uses straight line detection approaches, such as Canny and Hough Line Transforms, to detect driving lanes, and thus fails to detect curvy lanes. To solve curvy lane detection and to speed up real-time performance, we propose an Android-based solution for lane detection and departure warning. To detect both straight and curved lanes, we design a color-based algorithm that uses an adaptive threshold algorithm, the frequency of lane appearance, and a mathematical function. With the camera aligned to the driving direction of the car, the middle line of the onboard image is used to check for lane departure. To speed up real-time performance, the image is down-sampled before it is split in half for multi-thread processing by the multi-core CPU commonly available on Android platforms. In contrast to traditional approaches, our solution, which handles curvy lane detection with a roughly doubled fps performance, shows much improvement over existing lane detection techniques.

Journal of the Chinese Institute of Engineers, 2008
Process control in a brewery plant deals with the open/close decisions of valves for the plant's pipelines. Interleaved service requests for filtration and CIP (Clean in pipe) operations require exclusive usage of the associated sub‐routes for correct and safe operation. To maximize pipe utilization, it is desired that non‐conflicting service requests be realizable simultaneously. Conventional approaches rely on experience and trial‐and‐error simulation to derive a concurrency control map that dictates which service requests can be processed simultaneously. In contrast, this paper addresses the problem of concurrency control of the requests in the framework of EVALPSN. Pipeline utilization is divided into the phases of service request, permission, and execution. Safety constraints regarding mutually exclusive use of sub‐routes of the pipeline system are formulated in EVALPSN statements. With state variables reflecting the current status of the brewery pipe, non‐conflicting service requests can be derived by the associated EVALPSN rules.
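
The mutual-exclusion constraint described in this abstract can be illustrated with plain Python sets. This sketch deliberately does not use EVALPSN; it swaps in a simple greedy check, granted in request order, that permits a request only when none of its sub-routes is already in use. The function `permit`, the request names, and the sub-route labels are hypothetical.

```python
def permit(requests, in_use=()):
    """Grant requests whose sub-routes do not overlap with routes already locked.

    requests: list of (name, set_of_sub_routes) in arrival order (hypothetical format)
    in_use:   sub-routes occupied by requests that are already executing
    """
    locked = set(in_use)
    granted = []
    for name, routes in requests:
        if locked.isdisjoint(routes):
            granted.append(name)   # permission phase: no conflict found
            locked |= set(routes)  # execution phase: sub-routes become exclusive
    return granted


# Hypothetical requests over sub-routes r1..r4
REQUESTS = [
    ("filtration_A", {"r1", "r2"}),
    ("cip_B", {"r2", "r3"}),
    ("cip_C", {"r4"}),
]
```

Here `permit(REQUESTS)` grants `filtration_A` and `cip_C` together, while `cip_B` must wait because it shares sub-route `r2` with `filtration_A`. A greedy pass is only one possible policy; the paper derives the permitted combinations from EVALPSN rules instead.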