Alotaibi 2020

This document summarizes a research paper that proposes an ensemble learning method for transportation mode detection using data from smartphone sensors. The proposed method uses a stacked learning model consisting of three machine learning classifiers that vote independently and the majority vote is used for prediction. The method was tested on three datasets and achieved accuracies of 89-95% when 90% of data was used for training. It outperformed existing human activity recognition methods and provides an improved approach for transportation mode detection.

Uploaded by

Santhana Krishnan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

93 views12 pages

Alotaibi 2020

Uploaded by

Santhana Krishnan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 12

Received July 1, 2020, accepted July 30, 2020, date of publication August 7, 2020, date of current version August

19, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.3014901

Transportation Mode Detection by Embedded

Sensors Based on Ensemble Learning
BANDAR ALOTAIBI , (Member, IEEE)
Department of Information Technology, University of Tabuk, Tabuk 71491, Saudi Arabia
e-mail: [email protected]

ABSTRACT Context-aware computing has become a certainty due to the widespread use of smartphone
devices equipped with sensors. A wide range of services, such as vehicular traffic monitoring and smart
parking, can be accomplished with the help of awareness of user mobility. Transportation mode detection
(TMD) using machine learning algorithms and the data captured from smartphone embedded sensors
have attracted research community attention. In this research, ensemble learning is utilized to differentiate
between transportation modes, including walking, standing, riding a train, driving a car, and riding a bus.
The ensemble learning consists of three classifiers; each classifier votes independently on the instances,
and the majority vote is applied for robust generalization. The proposed method was validated using
three datasets; the samples included in these datasets were gathered by smartphone sensors (belonging
to heterogeneous users), such as rotation vector sensors, accelerometers, uncalibrated gyroscopes, linear
acceleration, orientation, speed, game rotation vector, sound, and gyroscopes. The proposed ensemble
learning method achieves an accuracy of 89%, 93%, and 95% on the first, second, and third datasets,
respectively, when 10% and 90% of the data are used for testing and training, respectively. On the other set of
experiments, in which 30% and 70% of the data are used for testing and training, respectively, the proposed
method yields accuracies of 86.8%, 92.1%, and 94.9% on the first, second, and third datasets, respectively.
The proposed method shows promising results compared to existing human activity recognition (HAR)
methods.

INDEX TERMS Context-aware computing, embedded sensors, ensemble learning, human activity recogni-
tion, Internet of Things (IoT), transportation mode detection.

I. INTRODUCTION ters and gyroscopes (a.k.a., microelectromechanical systems

Context-aware computing was introduced by Schilit and oth- (MEMS)) can be used to deduce what activity the phone
ers in 1994. The terminology is defined as ‘‘the possibility carrier is performing.
to exploit the changing environment with a new class of The awareness of a person’s transportation mode facilitates
applications that are aware of the context in which they are planning for urban transportation [7], which was performed in
run’’ [1]. Recently, context-aware computing has become a the past via telephone interviews, questionnaires, and travel
reality due to the ease of sensor deployment through IoT diaries [8], [9]. With these manual approaches, the traveler
devices (e.g., smart wearables such as smartwatches and could utilize several modes of transportation and might not
fitness trackers) or smartphone devices [2]. For instance, remember the certain times of the performed transportation
embedded sensors are integrated with smartphones to realize modes. Thus, context-aware computing replaces the tradi-
the surrounding areas, recognize user activity, and locate tional method, which is considered limited, not up-to-date,
user location [3]. A popular research topic (i.e., a type of erroneous, and expensive [10]. Moreover, human wellness
human activity recognition (HAR)) known as transportation and physical activities can be observed using smartphones
mode detection (TMD) has gained noticeable attention [4]– [11], smartwatches, or fitness trackers. Such applications
[6]. TMD is considered a HAR because smartphones carried can also provide daily activities and the number of calories
by users and embedded with sensors such as accelerome- burned. Additionally, the activities that affect the environment
can also be estimated, and environmental hazard exposure
The associate editor coordinating the review of this manuscript and can be traced. Knowing the transportation mode of travelers
approving it for publication was Giovanni Pau . helps shops send real-time information to customers in the

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
145552 VOLUME 8, 2020
B. Alotaibi: TMD by Embedded Sensors Based on Ensemble Learning

form of advertisements. For instance, detecting a consumer extreme gradient boosting (XGBoost), have been investigated
driving a car allows for gas stations to send vehicle ser- to differentiate the mode of transportation. However, these
vices or gas coupons as advertisements. Unlike life critical ensemble methods can only combine machine learning algo-
systems that cause death or injuries to people, environmental rithms with identical types, such as decision trees (DTs).
hazards, or expensive equipment damage, such as heart-lung However, ensemble methods that combine machine learning
machines, ventilation systems, and fire alarms, where an ideal algorithms with different types, such as stacked learning,
100% precision is essential, transportation mode detection have not been utilized by the TMD research community.
can tolerate a margin of error ranging from 5% to 10%. Stacked learning bundles lower level models of different
Currently, three distinguished types of methods are uti- types by a higher level generalized model to maximize accu-
lized to detect transportation modes: sensor-based techniques racy.
[12], global positioning system (GPS)-based techniques [13], Motivated by the released public datasets and the room
and cellular-network-based attempts. Furthermore, several for improvement, a stacked learning model consisting of
research papers [14]–[16] have attempted to build patterns of various machine learning algorithms with different types (i.e.,
transportation modes and improve the accuracy of these tech- the investigated machine learning algorithms in this study
niques. Moreover, some industries, through their applications that generate the most accurate final prediction when bundled
and operating systems (e.g., Android operating system), have together in the stack) is presented to detect the transportation
implemented their own TMD in which users are informed mode using available smartphone sensors that remarkably
of their mode of transportation. However, some of these improve the detection accuracy compared to TMD methods.
applications or systems limit their capabilities of detecting The contributions of this research can be summarized as
activities to four categories: riding a bicycle, riding in a follows:
vehicle, standing, and walking [3]. All of the techniques share
the following identical general principle for the detection of 1) A comprehensive study of different machine learning
the transportation mode: collecting raw data from sensors algorithms is presented to investigate which machine
integrated with mobile phones, preprocessing the raw data learning algorithm best fits the problem of TMD.
to be fed into the classification model, feeding part of the 2) Based on the observations from this study, a new
preprocessed data into the classifier for training purposes, and ensemble method is proposed to differentiate trans-
passing the rest of the preprocessed data to the predictor for portation modes. This method achieves promising
generalization purposes. results compared to existing techniques.
Although TMD methods have achieved considerable suc-
The remainder of this article is organized as follows.
cess in many applications, there are still some limitations
Section II summarizes the related methods, the proposed
regarding the studies that have been proposed to help improve
method is introduced in Section III, and IV presents the
the usefulness of these methods. First, due to the lack of a
experimental settings and the utilized dataset that help to eval-
benchmark dataset, the published results are hard to compare.
uate the proposed method. Section V presents the parameter
Additionally, the proposed techniques are not well defined
optimization, Section VI validates the proposed method and
because of either the limited number of smartphone users
analyzes the results, Section VII discusses the results, and
involved in the testbed or the heterogeneity of sensors that
Section VIII concludes the research paper.
have been utilized (i.e., it is hard for industries to adopt
one of these methods). Fortunately, two recent datasets [3],
[17] have been published in which researchers can compare II. RELATED WORK
their results. Moreover, a large number of proposals have Due to the widespread use of smartphones and the Internet
used binary classification in which the classifier needs to of things (IoT) devices that are equipped with embedded sen-
distinguish between only two modes of transportation. For sors, the context-aware computing topic has regained inter-
instance, smart parking techniques only considered modes est in the research community. TMD is an important field
of transportation: riding a vehicle and walking. The notifi- in which embedded sensors play an integral role in fulfill-
cation is sent once a busy or free area is found [18]. Finally, ing its purpose. TMDs can be categorized into three main
various techniques have been proposed to detect the mode types of techniques: sensor-based, GPS-based techniques,
of transportation using a single machine learning algorithm and cellular-network-based attempts. GPS-based techniques
to improve the accuracy. Certain sets of samples might be are the most famous category because of the ease of acces-
fit well by a single machine learning algorithm; however, sibility to GPS in smartphones and the high achieved perfor-
overfitting or underfitting might become visible to the rest mance in terms of accuracy in differentiating between pedes-
of the samples. Thus, even with parameter optimization, trian movements and riding vehicles. The GPS-based tech-
an upper limit of accuracy might be reached when utilizing niques share two shortcomings: the accuracy degrades once
one machine learning algorithm. Combining various machine the smartphone carrier enters an indoor environment or passes
learning algorithms (ensemble learning) is used to overcome by an urban canyon due to multipath fading and smart-
this shortcoming [19]. Recently, several ensemble methods, phones equipped with battery-constrained capability (i.e.,
such as random forests (RFs), gradient boosting (GB), and these methods consume batteries) [15].