MATLAB Programming Fundamentals - MathWorks

MATLAB®

Programming Fundamentals

R2021b
How to Contact MathWorks

Latest news: www.mathworks.com

Sales and services: www.mathworks.com/sales_and_services

User community: www.mathworks.com/matlabcentral

Technical support: www.mathworks.com/support/contact_us

Phone: 508-647-7000

The MathWorks, Inc.
1 Apple Hill Drive
Natick, MA 01760-2098
MATLAB Programming Fundamentals
© COPYRIGHT 1984–2021 by The MathWorks, Inc.
The software described in this document is furnished under a license agreement. The software may be used or copied
only under the terms of the license agreement. No part of this manual may be photocopied or reproduced in any form
without prior written consent from The MathWorks, Inc.
FEDERAL ACQUISITION: This provision applies to all acquisitions of the Program and Documentation by, for, or through
the federal government of the United States. By accepting delivery of the Program or Documentation, the government
hereby agrees that this software or documentation qualifies as commercial computer software or commercial computer
software documentation as such terms are used or defined in FAR 12.212, DFARS Part 227.72, and DFARS 252.227-7014.
Accordingly, the terms and conditions of this Agreement and only those rights specified in this Agreement, shall pertain
to and govern the use, modification, reproduction, release, performance, display, and disclosure of the Program and
Documentation by the federal government (or other entity acquiring for or through the federal government) and shall
supersede any conflicting contractual terms or conditions. If this License fails to meet the government's needs or is
inconsistent in any respect with federal procurement law, the government agrees to return the Program and
Documentation, unused, to The MathWorks, Inc.
Trademarks
MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See
www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be
trademarks or registered trademarks of their respective holders.
Patents
MathWorks products are protected by one or more U.S. patents. Please see www.mathworks.com/patents for
more information.
Revision History
June 2004       First printing   New for MATLAB 7.0 (Release 14)
October 2004    Online only      Revised for MATLAB 7.0.1 (Release 14SP1)
March 2005      Online only      Revised for MATLAB 7.0.4 (Release 14SP2)
June 2005       Second printing  Minor revision for MATLAB 7.0.4
September 2005  Online only      Revised for MATLAB 7.1 (Release 14SP3)
March 2006      Online only      Revised for MATLAB 7.2 (Release 2006a)
September 2006  Online only      Revised for MATLAB 7.3 (Release 2006b)
March 2007      Online only      Revised for MATLAB 7.4 (Release 2007a)
September 2007  Online only      Revised for Version 7.5 (Release 2007b)
March 2008      Online only      Revised for Version 7.6 (Release 2008a)
October 2008    Online only      Revised for Version 7.7 (Release 2008b)
March 2009      Online only      Revised for Version 7.8 (Release 2009a)
September 2009  Online only      Revised for Version 7.9 (Release 2009b)
March 2010      Online only      Revised for Version 7.10 (Release 2010a)
September 2010  Online only      Revised for Version 7.11 (Release 2010b)
April 2011      Online only      Revised for Version 7.12 (Release 2011a)
September 2011  Online only      Revised for Version 7.13 (Release 2011b)
March 2012      Online only      Revised for Version 7.14 (Release 2012a)
September 2012  Online only      Revised for Version 8.0 (Release 2012b)
March 2013      Online only      Revised for Version 8.1 (Release 2013a)
September 2013  Online only      Revised for Version 8.2 (Release 2013b)
March 2014      Online only      Revised for Version 8.3 (Release 2014a)
October 2014    Online only      Revised for Version 8.4 (Release 2014b)
March 2015      Online only      Revised for Version 8.5 (Release 2015a)
September 2015  Online only      Revised for Version 8.6 (Release 2015b)
October 2015    Online only      Rereleased for Version 8.5.1 (Release 2015aSP1)
March 2016      Online only      Revised for Version 9.0 (Release 2016a)
September 2016  Online only      Revised for Version 9.1 (Release 2016b)
March 2017      Online only      Revised for Version 9.2 (Release 2017a)
September 2017  Online only      Revised for Version 9.3 (Release 2017b)
March 2018      Online only      Revised for Version 9.4 (Release 2018a)
September 2018  Online only      Revised for Version 9.5 (Release 2018b)
March 2019      Online only      Revised for MATLAB 9.6 (Release 2019a)
September 2019  Online only      Revised for MATLAB 9.7 (Release 2019b)
March 2020      Online only      Revised for MATLAB 9.8 (Release 2020a)
September 2020  Online only      Revised for MATLAB 9.9 (Release 2020b)
March 2021      Online only      Revised for MATLAB 9.10 (Release 2021a)
September 2021  Online only      Revised for MATLAB 9.11 (Release 2021b)
Contents

Language

1 Syntax Basics
Continue Long Statements on Multiple Lines . . . . . . . . . . . . . . . . . . . 1-2

Name=Value in Function Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3

Ignore Function Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4

Variable Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5


Valid Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
Conflicts with Function Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5

Case and Space Sensitivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6

Choose Command Syntax or Function Syntax . . . . . . . . . . . . . . . . . . . 1-7


Command Syntax and Function Syntax . . . . . . . . . . . . . . . . . . . . . . . 1-7
Avoid Common Syntax Mistakes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8
How MATLAB Recognizes Command Syntax . . . . . . . . . . . . . . . . . . . 1-8

Resolve Error: Undefined Function or Variable . . . . . . . . . . . . . . . . . 1-10


Issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
Possible Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10

2 Program Components
MATLAB Operators and Special Characters . . . . . . . . . . . . . . . . . . . . 2-2
Arithmetic Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Relational Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Logical Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Special Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
String and Character Formatting . . . . . . . . . . . . . . . . . . . . . . . . . . 2-16

Array vs. Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-20


Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-20
Array Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-20
Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-22

Compatible Array Sizes for Basic Operations . . . . . . . . . . . . . . . . . . 2-25
Inputs with Compatible Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-25
Inputs with Incompatible Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-27
Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-27

Array Comparison with Relational Operators . . . . . . . . . . . . . . . . . . 2-29


Array Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-29
Logic Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-31

Operator Precedence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-32


Precedence of AND and OR Operators . . . . . . . . . . . . . . . . . . . . . . 2-32
Overriding Default Precedence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-32

Average Similar Data Points Using a Tolerance . . . . . . . . . . . . . . . . 2-34

Group Scattered Data Using a Tolerance . . . . . . . . . . . . . . . . . . . . . . 2-36

Bit-Wise Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-38

Perform Cyclic Redundancy Check . . . . . . . . . . . . . . . . . . . . . . . . . . 2-44

Conditional Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-47

Loop Control Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-49

Regular Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-51


What Is a Regular Expression? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-51
Steps for Building Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-52
Operators and Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-55

Lookahead Assertions in Regular Expressions . . . . . . . . . . . . . . . . . 2-63


Lookahead Assertions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-63
Overlapping Matches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-63
Logical AND Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-64

Tokens in Regular Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-66


Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-66
Multiple Tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-68
Unmatched Tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-69
Tokens in Replacement Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-69
Named Capture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-70

Dynamic Regular Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-72


Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-72
Dynamic Match Expressions — (??expr) . . . . . . . . . . . . . . . . . . . . . 2-73
Commands That Modify the Match Expression — (??@cmd) . . . . . . 2-73
Commands That Serve a Functional Purpose — (?@cmd) . . . . . . . . 2-74
Commands in Replacement Expressions — ${cmd} . . . . . . . . . . . . 2-76

Comma-Separated Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-79


What Is a Comma-Separated List? . . . . . . . . . . . . . . . . . . . . . . . . . 2-79
Generating a Comma-Separated List . . . . . . . . . . . . . . . . . . . . . . . . 2-79
Assigning Output from a Comma-Separated List . . . . . . . . . . . . . . . 2-81
Assigning to a Comma-Separated List . . . . . . . . . . . . . . . . . . . . . . . 2-81
How to Use the Comma-Separated Lists . . . . . . . . . . . . . . . . . . . . . 2-82

Fast Fourier Transform Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-84

Alternatives to the eval Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-86


Why Avoid the eval Function? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-86
Variables with Sequential Names . . . . . . . . . . . . . . . . . . . . . . . . . . 2-86
Files with Sequential Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-87
Function Names in Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-87
Field Names in Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-88
Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-88

Classes (Data Types)

3 Overview of MATLAB Classes
Fundamental MATLAB Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2

4 Numeric Classes
Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
Integer Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
Creating Integer Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
Arithmetic Operations on Integer Classes . . . . . . . . . . . . . . . . . . . . . 4-4
Largest and Smallest Values for Integer Classes . . . . . . . . . . . . . . . . 4-4

Floating-Point Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6


Double-Precision Floating Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Single-Precision Floating Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Creating Floating-Point Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Arithmetic Operations on Floating-Point Numbers . . . . . . . . . . . . . . 4-8
Largest and Smallest Values for Floating-Point Classes . . . . . . . . . . . 4-9
Accuracy of Floating-Point Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-10
Avoiding Common Problems with Floating-Point Arithmetic . . . . . . 4-11

Create Complex Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-13

Infinity and NaN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-14


Infinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-14
NaN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-14

Identifying Numeric Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-16

Display Format for Numeric Values . . . . . . . . . . . . . . . . . . . . . . . . . . 4-17

Integer Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-19

Single Precision Math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-26

5 The Logical Class
Find Array Elements That Meet a Condition . . . . . . . . . . . . . . . . . . . . 5-2

Reduce Logical Arrays to Single Value . . . . . . . . . . . . . . . . . . . . . . . . 5-6

6 Characters and Strings
Text in String and Character Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2

Create String Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5

Cell Arrays of Character Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12


Create Cell Array of Character Vectors . . . . . . . . . . . . . . . . . . . . . . 6-12
Access Character Vectors in Cell Array . . . . . . . . . . . . . . . . . . . . . . 6-12
Convert Cell Arrays to String Arrays . . . . . . . . . . . . . . . . . . . . . . . . 6-13

Analyze Text Data with String Arrays . . . . . . . . . . . . . . . . . . . . . . . . . 6-15

Test for Empty Strings and Missing Values . . . . . . . . . . . . . . . . . . . . 6-20

Formatting Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-24


Fields of the Formatting Operator . . . . . . . . . . . . . . . . . . . . . . . . . . 6-24
Setting Field Width and Precision . . . . . . . . . . . . . . . . . . . . . . . . . . 6-28
Restrictions on Using Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-30

Compare Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-32

Search and Replace Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-37

Build Pattern Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-40

Convert Numeric Values to Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-45

Convert Text to Numeric Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-49

Unicode and ASCII Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-53

Hexadecimal and Binary Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-55

Frequently Asked Questions About String Arrays . . . . . . . . . . . . . . . 6-59


Why Does Using Command Form With Strings Return An Error? . . 6-59
Why Do Strings in Cell Arrays Return an Error? . . . . . . . . . . . . . . . 6-60
Why Does length() of String Return 1? . . . . . . . . . . . . . . . . . . . . . . 6-60

Why Does isempty("") Return 0? . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-61
Why Does Appending Strings Using Square Brackets Return Multiple Strings? . . . . . . . . . . . 6-62

Update Your Code to Accept Strings . . . . . . . . . . . . . . . . . . . . . . . . . 6-64


What Are String Arrays? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-64
Recommended Approaches for String Adoption in Old APIs . . . . . . 6-64
How to Adopt String Arrays in Old APIs . . . . . . . . . . . . . . . . . . . . . 6-66
Recommended Approaches for String Adoption in New Code . . . . . 6-66
How to Maintain Compatibility in New Code . . . . . . . . . . . . . . . . . . 6-67
How to Manually Convert Input Arguments . . . . . . . . . . . . . . . . . . 6-68
How to Check Argument Data Types . . . . . . . . . . . . . . . . . . . . . . . . 6-68
Terminology for Character and String Arrays . . . . . . . . . . . . . . . . . 6-70

7 Dates and Time
Represent Dates and Times in MATLAB . . . . . . . . . . . . . . . . . . . . . . . 7-2

Specify Time Zones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5

Convert Date and Time to Julian Date or POSIX Time . . . . . . . . . . . . 7-7

Set Date and Time Display Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10


Formats for Individual Date and Duration Arrays . . . . . . . . . . . . . . 7-10
datetime Display Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
duration Display Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
calendarDuration Display Format . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
Default datetime Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12

Generate Sequence of Dates and Time . . . . . . . . . . . . . . . . . . . . . . . 7-14


Sequence of Datetime or Duration Values Between Endpoints with Step Size . . . . . . . . . . . 7-14
Add Duration or Calendar Duration to Create Sequence of Dates . . 7-16
Specify Length and Endpoints of Date or Duration Sequence . . . . . 7-17
Sequence of Datetime Values Using Calendar Rules . . . . . . . . . . . . 7-17

Share Code and Data Across Locales . . . . . . . . . . . . . . . . . . . . . . . . . 7-20


Write Locale-Independent Date and Time Code . . . . . . . . . . . . . . . . 7-20
Write Dates in Other Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
Read Dates in Other Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21

Extract or Assign Date and Time Components of Datetime Array . . 7-23

Combine Date and Time from Separate Variables . . . . . . . . . . . . . . 7-26

Date and Time Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-28

Compare Dates and Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-33

Plot Dates and Durations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-36


Line Plot with Dates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-36

Line Plot with Durations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-37
Scatter Plot with Dates and Durations . . . . . . . . . . . . . . . . . . . . . . 7-39
Plots that Support Dates and Durations . . . . . . . . . . . . . . . . . . . . . 7-40

Core Functions Supporting Date and Time Arrays . . . . . . . . . . . . . . 7-41

Convert Between Datetime Arrays, Numbers, and Text . . . . . . . . . . 7-42


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-42
Convert Between Datetime and Character Vectors . . . . . . . . . . . . . 7-42
Convert Between Datetime and String Arrays . . . . . . . . . . . . . . . . . 7-44
Convert Between Datetime and Date Vectors . . . . . . . . . . . . . . . . . 7-44
Convert Serial Date Numbers to Datetime . . . . . . . . . . . . . . . . . . . 7-45
Convert Datetime Arrays to Numeric Values . . . . . . . . . . . . . . . . . . 7-45

Carryover in Date Vectors and Strings . . . . . . . . . . . . . . . . . . . . . . . . 7-47

Converting Date Vector Returns Unexpected Output . . . . . . . . . . . . 7-48

8 Categorical Arrays
Create Categorical Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2

Convert Text in Table Variables to Categorical . . . . . . . . . . . . . . . . . . 8-6

Plot Categorical Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-10

Compare Categorical Array Elements . . . . . . . . . . . . . . . . . . . . . . . . 8-16

Combine Categorical Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-19

Combine Categorical Arrays Using Multiplication . . . . . . . . . . . . . . 8-22

Access Data Using Categorical Arrays . . . . . . . . . . . . . . . . . . . . . . . . 8-24


Select Data By Category . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
Common Ways to Access Data Using Categorical Arrays . . . . . . . . . 8-24

Work with Protected Categorical Arrays . . . . . . . . . . . . . . . . . . . . . . 8-30

Advantages of Using Categorical Arrays . . . . . . . . . . . . . . . . . . . . . . 8-34


Natural Representation of Categorical Data . . . . . . . . . . . . . . . . . . 8-34
Mathematical Ordering for Character Vectors . . . . . . . . . . . . . . . . . 8-34
Reduce Memory Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-34

Ordinal Categorical Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36


Order of Categories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36
How to Create Ordinal Categorical Arrays . . . . . . . . . . . . . . . . . . . 8-36
Working with Ordinal Categorical Arrays . . . . . . . . . . . . . . . . . . . . 8-38

Core Functions Supporting Categorical Arrays . . . . . . . . . . . . . . . . 8-39

9 Tables
Create Tables and Assign Data to Them . . . . . . . . . . . . . . . . . . . . . . . 9-2

Add and Delete Table Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-9

Add, Delete, and Rearrange Table Variables . . . . . . . . . . . . . . . . . . . 9-12

Clean Messy and Missing Data in Tables . . . . . . . . . . . . . . . . . . . . . . 9-19

Modify Units, Descriptions, and Table Variable Names . . . . . . . . . . 9-24

Add Custom Properties to Tables and Timetables . . . . . . . . . . . . . . 9-27

Access Data in Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-32


Summary of Table Indexing Syntaxes . . . . . . . . . . . . . . . . . . . . . . . 9-32
Tables Containing Specified Rows and Variables . . . . . . . . . . . . . . . 9-35
Extract Data Using Dot Notation and Logical Values . . . . . . . . . . . . 9-38
Dot Notation with Any Variable Name or Expression . . . . . . . . . . . . 9-40
Extract Data from Specified Rows and Variables . . . . . . . . . . . . . . . 9-42

Calculations on Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-45

Split Data into Groups and Calculate Statistics . . . . . . . . . . . . . . . . 9-49

Split Table Data Variables and Apply Functions . . . . . . . . . . . . . . . . 9-52

Advantages of Using Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-56

Grouping Variables To Split Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-61


Grouping Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-61
Group Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-61
The Split-Apply-Combine Workflow . . . . . . . . . . . . . . . . . . . . . . . . . 9-62
Missing Group Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-62

Changes to DimensionNames Property in R2016b . . . . . . . . . . . . . . 9-64

Data Cleaning and Calculations in Tables . . . . . . . . . . . . . . . . . . . . . 9-66

Grouped Calculations in Tables and Timetables . . . . . . . . . . . . . . . . 9-86

10 Timetables
Create Timetables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2

Resample and Aggregate Data in Timetable . . . . . . . . . . . . . . . . . . . 10-5

Combine Timetables and Synchronize Their Data . . . . . . . . . . . . . . 10-8

Retime and Synchronize Timetable Variables Using Different Methods . . . . . . . . . . . . . . 10-14

Select Times in Timetable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-19

Clean Timetable with Missing, Duplicate, or Nonuniform Times . . . . . . . . . . . . . . . . . 10-27

Using Row Labels in Table and Timetable Operations . . . . . . . . . . 10-36

Loma Prieta Earthquake Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 10-41

Preprocess and Explore Time-stamped Data Using timetable . . . . 10-51

11 Structures
Structure Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
Create Scalar Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
Access Values in Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
Index into Nonscalar Structure Array . . . . . . . . . . . . . . . . . . . . . . . 11-4

Concatenate Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-8

Generate Field Names from Variables . . . . . . . . . . . . . . . . . . . . . . . 11-10

Access Data in Nested Structures . . . . . . . . . . . . . . . . . . . . . . . . . . 11-11

Access Elements of a Nonscalar Structure Array . . . . . . . . . . . . . . 11-13

Ways to Organize Data in Structure Arrays . . . . . . . . . . . . . . . . . . . 11-15


Plane Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-15
Element-by-Element Organization . . . . . . . . . . . . . . . . . . . . . . . . 11-16

Memory Requirements for Structure Array . . . . . . . . . . . . . . . . . . 11-18

12 Cell Arrays
What Is a Cell Array? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2

Create Cell Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3

Access Data in Cell Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5

Add Cells to Cell Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8

Delete Data from Cell Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9

Combine Cell Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-10

Pass Contents of Cell Arrays to Functions . . . . . . . . . . . . . . . . . . . . 12-11

Preallocate Memory for Cell Array . . . . . . . . . . . . . . . . . . . . . . . . . . 12-15

Cell vs. Structure Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-16

Multilevel Indexing to Access Parts of Cells . . . . . . . . . . . . . . . . . . 12-20

13 Function Handles
Create Function Handle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
What Is a Function Handle? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
Creating Function Handles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
Anonymous Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-3
Arrays of Function Handles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-4
Saving and Loading Function Handles . . . . . . . . . . . . . . . . . . . . . . 13-4

Pass Function to Another Function . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5

Call Local Functions Using Function Handles . . . . . . . . . . . . . . . . . 13-6

Compare Function Handles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-8

14 Map Containers
Overview of Map Data Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-2

Description of Map Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-4


Properties of Map Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-4
Methods of Map Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-4

Create Map Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6


Construct Empty Map Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6
Construct Initialized Map Object . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6
Combine Map Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-7

Examine Contents of Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-8

Read and Write Using Key Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-9


Read From Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-9
Add Key/Value Pairs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-10
Build Map with Concatenation . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-10

Modify Keys and Values in Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-13
Remove Keys and Values from Map . . . . . . . . . . . . . . . . . . . . . . . . 14-13
Modify Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-13
Modify Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-14
Modify Copy of Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-14

Map to Different Value Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15


Map to Structure Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15
Map to Cell Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-16

15 Combining Unlike Classes
Valid Combinations of Unlike Classes . . . . . . . . . . . . . . . . . . . . . . . . 15-2

Combining Unlike Integer Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3
Example of Combining Unlike Integer Sizes . . . . . . . . . . . . . . . . . . 15-3
Example of Combining Signed with Unsigned . . . . . . . . . . . . . . . . . 15-4

Combining Integer and Noninteger Data . . . . . . . . . . . . . . . . . . . . . 15-5

Combining Cell Arrays with Non-Cell Arrays . . . . . . . . . . . . . . . . . . 15-6

Empty Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7

Concatenation Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8


Combining Single and Double Types . . . . . . . . . . . . . . . . . . . . . . . . 15-8
Combining Integer and Double Types . . . . . . . . . . . . . . . . . . . . . . . 15-8
Combining Character and Double Types . . . . . . . . . . . . . . . . . . . . . 15-8
Combining Logical and Double Types . . . . . . . . . . . . . . . . . . . . . . . 15-8

Using Objects
16
Object Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2
Two Copy Behaviors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2
Handle Object Copy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2
Value Object Copy Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2
Handle Object Copy Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-3
Testing for Handle or Value Class . . . . . . . . . . . . . . . . . . . . . . . . . . 16-5

Defining Your Own Classes
17

Scripts and Functions

Scripts
18
Create Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-2

Add Comments to Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3

Create and Run Sections in Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5


Divide Your File into Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5
Run Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-6
Navigate Between Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-7
Behavior of Sections in Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 18-8
Behavior of Sections in Loops and Conditional Statements . . . . . . . 18-8

Scripts vs. Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-10

Add Functions to Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-12


Create a Script with Local Functions . . . . . . . . . . . . . . . . . . . . . . 18-12
Run Scripts with Local Functions . . . . . . . . . . . . . . . . . . . . . . . . . 18-12
Restrictions for Local Functions and Variables . . . . . . . . . . . . . . . 18-13
Access Help for Local Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 18-13

Live Scripts and Functions


19
What Is a Live Script or Function? . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-2
Differences with Plain Code Scripts and Functions . . . . . . . . . . . . . 19-3
Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-4
Unsupported Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-5

Create Live Scripts in the Live Editor . . . . . . . . . . . . . . . . . . . . . . . . 19-6


Create Live Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-6
Add Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-6
Run Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-7
Display Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-7
Change View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-8
Format Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-10
Save Live Scripts as Plain Code . . . . . . . . . . . . . . . . . . . . . . . . . . 19-11

Modify Figures in Live Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-12
Explore Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-12
Update Code with Figure Changes . . . . . . . . . . . . . . . . . . . . . . . . 19-14
Add Formatting and Annotations . . . . . . . . . . . . . . . . . . . . . . . . . 19-14
Add and Modify Multiple Subplots . . . . . . . . . . . . . . . . . . . . . . . . 19-16
Save and Print Figure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-20

Format Text in the Live Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-21


Change Fonts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-23
Autoformatting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-24

Insert Equations into the Live Editor . . . . . . . . . . . . . . . . . . . . . . . 19-27


Insert Equation Interactively . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-27
Insert LaTeX Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-29

Add Interactive Controls to a Live Script . . . . . . . . . . . . . . . . . . . . 19-36


Insert Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-36
Modify Control Labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-37
Link Variables to Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-37
Specify Default Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-39
Modify Control Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-39
Create Live Script with Multiple Interactive Controls . . . . . . . . . . 19-40
Share Live Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-42

Add Interactive Tasks to a Live Script . . . . . . . . . . . . . . . . . . . . . . . 19-44


What Are Live Editor Tasks? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-44
Insert Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-44
Run Tasks and Surrounding Code . . . . . . . . . . . . . . . . . . . . . . . . . 19-47
Modify Output Argument Name . . . . . . . . . . . . . . . . . . . . . . . . . . 19-48
View and Edit Generated Code . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-48

Create Live Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-50


Create Live Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-50
Add Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-50
Add Help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-51
Run Live Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-51
Save Live Functions as Plain Code . . . . . . . . . . . . . . . . . . . . . . . . 19-52

Add Help for Live Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-53

Share Live Scripts and Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-57


Hide Code Before Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-58

Live Code File Format (.mlx) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-59


Benefits of Live Code File Format . . . . . . . . . . . . . . . . . . . . . . . . . 19-59
Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-59

Introduction to the Live Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-60

Accelerate Exploratory Programming Using the Live Editor . . . . . 19-65

Create an Interactive Narrative with the Live Editor . . . . . . . . . . . 19-70

Create Interactive Course Materials Using the Live Editor . . . . . . 19-78

Create Examples Using the Live Editor . . . . . . . . . . . . . . . . . . . . . . 19-84

Create an Interactive Form Using the Live Editor . . . . . . . . . . . . . 19-85

Create a Real-time Dashboard Using the Live Editor . . . . . . . . . . . 19-88

Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-89

Function Basics
20
Create Functions in Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-2
Syntax for Function Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-2
Contents of Functions and Files . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-3
End Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-4

Add Help for Your Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-5

Configure the Run Button for Functions . . . . . . . . . . . . . . . . . . . . . . 20-7

Base and Function Workspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-9

Share Data Between Workspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-10


Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-10
Best Practice: Passing Arguments . . . . . . . . . . . . . . . . . . . . . . . . . 20-10
Nested Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-10
Persistent Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-11
Global Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-12
Evaluating in Another Workspace . . . . . . . . . . . . . . . . . . . . . . . . . 20-12

Check Variable Scope in Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-14


Use Automatic Function and Variable Highlighting . . . . . . . . . . . . 20-14
Example of Using Automatic Function and Variable Highlighting . 20-14

Types of Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-17


Local and Nested Functions in a File . . . . . . . . . . . . . . . . . . . . . . 20-17
Private Functions in a Subfolder . . . . . . . . . . . . . . . . . . . . . . . . . . 20-18
Anonymous Functions Without a File . . . . . . . . . . . . . . . . . . . . . . 20-18

Anonymous Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-20


What Are Anonymous Functions? . . . . . . . . . . . . . . . . . . . . . . . . . 20-20
Variables in the Expression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-21
Multiple Anonymous Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-21
Functions with No Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-22
Functions with Multiple Inputs or Outputs . . . . . . . . . . . . . . . . . . 20-22
Arrays of Anonymous Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 20-23

Local Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-25

Nested Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-27


What Are Nested Functions? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-27
Requirements for Nested Functions . . . . . . . . . . . . . . . . . . . . . . . 20-27

Sharing Variables Between Parent and Nested Functions . . . . . . . 20-28
Using Handles to Store Function Parameters . . . . . . . . . . . . . . . . 20-29
Visibility of Nested Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-31

Resolve Error: Attempt to Add Variable to a Static Workspace . . . 20-33


Issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-33
Possible Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-33

Private Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-36

Function Precedence Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-37


Change in Rules For Function Precedence Order . . . . . . . . . . . . . 20-38

Update Code for R2019b Changes to Function Precedence Order . . . . 20-40
Identifiers cannot be used for two purposes inside a function . . . . 20-40
Identifiers without explicit declarations might not be treated as variables . . . . 20-40
Variables cannot be implicitly shared between parent and nested functions . . . . 20-41
Change in precedence of wildcard-based imports . . . . . . . . . . . . . 20-42
Fully qualified import functions cannot have the same name as nested functions . . . . 20-42
Fully qualified imports shadow outer scope definitions of the same name . . . . 20-43
Error handling when import not found . . . . . . . . . . . . . . . . . . . . . 20-43
Nested functions inherit import statements from parent functions . . . . 20-44
Change in precedence of compound name resolution . . . . . . . . . . 20-44
Anonymous functions can include resolved and unresolved identifiers . . . . 20-45

Indexing into Function Call Results . . . . . . . . . . . . . . . . . . . . . . . . 20-46


Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-46
Supported Syntaxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-46

Function Arguments
21
Find Number of Function Arguments . . . . . . . . . . . . . . . . . . . . . . . . 21-2

Support Variable Number of Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . 21-4

Support Variable Number of Outputs . . . . . . . . . . . . . . . . . . . . . . . . 21-5

Validate Number of Function Arguments . . . . . . . . . . . . . . . . . . . . . 21-6

Checking Number of Arguments in Nested Functions . . . . . . . . . . . 21-8

Ignore Inputs in Function Definitions . . . . . . . . . . . . . . . . . . . . . . . 21-10

Check Function Inputs with validateattributes . . . . . . . . . . . . . . . 21-11

Parse Function Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-13

Input Parser Validation Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 21-17

Debugging MATLAB Code


22
Debug MATLAB Code Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-2
Display Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-2
Debug by Using Run to Here Button . . . . . . . . . . . . . . . . . . . . . . . . 22-3
View Variable Value While Debugging . . . . . . . . . . . . . . . . . . . . . . . 22-5
Pause a Running File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-5
Step Into Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-5
Add Breakpoints and Run Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6
End Debugging Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-7
Debug by Using Keyboard Shortcuts or Functions . . . . . . . . . . . . . . 22-8

Set Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-9


Standard Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-9
Conditional Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-10
Error Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-11
Breakpoints in Anonymous Functions . . . . . . . . . . . . . . . . . . . . . . 22-11
Invalid Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-12
Disable Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-12
Clear Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-12

Examine Values While Debugging . . . . . . . . . . . . . . . . . . . . . . . . . . 22-14


View Variable Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-14
View Variable Value Outside Current Workspace . . . . . . . . . . . . . . 22-15

Presenting MATLAB Code


23
Publish and Share MATLAB Code . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-2
Create and Share Live Scripts in the Live Editor . . . . . . . . . . . . . . . 23-2
Publish MATLAB Code Files (.m) . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-2
Add Help and Create Documentation . . . . . . . . . . . . . . . . . . . . . . . 23-4

Publishing Markup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-6


Markup Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-6
Sections and Section Titles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-8
Text Formatting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-9
Bulleted and Numbered Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-10
Text and Code Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-10
External File Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-11
External Graphics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-12
Image Snapshot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-14
LaTeX Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-14
Hyperlinks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-16

HTML Markup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-18
LaTeX Markup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-19

Output Preferences for Publishing . . . . . . . . . . . . . . . . . . . . . . . . . 23-21


How to Edit Publishing Options . . . . . . . . . . . . . . . . . . . . . . . . . . 23-21
Specify Output File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-22
Run Code During Publishing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-22
Manipulate Graphics in Publishing Output . . . . . . . . . . . . . . . . . . 23-24
Save a Publish Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-27
Manage a Publish Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . 23-28

Coding and Productivity Tips


24
Save and Back Up Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-2
Save Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-2
Back Up Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-2
Recommendations on Saving Files . . . . . . . . . . . . . . . . . . . . . . . . . 24-3
File Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-3

Check Code for Errors and Warnings Using the Code Analyzer . . . 24-5
Enable Continuous Code Checking . . . . . . . . . . . . . . . . . . . . . . . . . 24-5
View Code Analyzer Status for File . . . . . . . . . . . . . . . . . . . . . . . . . 24-5
View Code Analyzer Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-6
Fix Problems in Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-7
Create a Code Analyzer Message Report . . . . . . . . . . . . . . . . . . . . . 24-8
Adjust Code Analyzer Message Indicators and Messages . . . . . . . . 24-9
Understand Code Containing Suppressed Messages . . . . . . . . . . . 24-11
Understand the Limitations of Code Analysis . . . . . . . . . . . . . . . . 24-12
Enable MATLAB Compiler Deployment Messages . . . . . . . . . . . . . 24-14

Edit and Format Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-16


Column Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-16
Change Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-16
Automatically Complete Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-16
Refactor Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-17
Indent Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-17
Fold Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-18
Change the Right-Side Text Limit Indicator . . . . . . . . . . . . . . . . . . 24-19

Find and Replace Text in Files and Go to Location . . . . . . . . . . . . . 24-21


Find and Replace Any Text in Current File . . . . . . . . . . . . . . . . . . 24-21
Find and Replace Functions or Variables in Current File . . . . . . . . 24-21
Automatically Rename All Variables or Functions in a File . . . . . . 24-22
Find Text in Multiple File Names or Files . . . . . . . . . . . . . . . . . . . 24-23
Go To Location in File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-23

Add Reminders to Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-27


Working with TODO/FIXME Reports . . . . . . . . . . . . . . . . . . . . . . . 24-27

MATLAB Code Analyzer Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-29


Run the Code Analyzer Report . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-29

Change Code Based on Code Analyzer Messages . . . . . . . . . . . . . 24-30
Other Ways to Access Code Analyzer Messages . . . . . . . . . . . . . . 24-30

MATLAB Code Compatibility Report . . . . . . . . . . . . . . . . . . . . . . . . 24-32


Generate the Code Compatibility Report . . . . . . . . . . . . . . . . . . . . 24-32
Programmatic Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-34
Unsupported Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-34

Programming Utilities
25
Identify Program Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-2
Simple Display of Program File Dependencies . . . . . . . . . . . . . . . . 25-2
Detailed Display of Program File Dependencies . . . . . . . . . . . . . . . 25-2
Dependencies Within a Folder . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-2

Protect Your Source Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-6


Building a Content Obscured Format with P-Code . . . . . . . . . . . . . . 25-6
Building a Standalone Executable . . . . . . . . . . . . . . . . . . . . . . . . . . 25-7

Create Hyperlinks that Run Functions . . . . . . . . . . . . . . . . . . . . . . . 25-8


Run a Single Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-8
Run Multiple Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-9
Provide Command Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-9
Include Special Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-9

Create and Share Toolboxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-11


Create Toolbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-11
Share Toolbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-15

Run Parallel Language in MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . 25-17


Run Parallel Language in Serial . . . . . . . . . . . . . . . . . . . . . . . . . . 25-17
Use Parallel Language Without a Pool . . . . . . . . . . . . . . . . . . . . . . 25-18

Function Argument Validation


26
Function Argument Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2
Introduction to Argument Validation . . . . . . . . . . . . . . . . . . . . . . . . 26-2
Where to Use Argument Validation . . . . . . . . . . . . . . . . . . . . . . . . . 26-2
arguments Block Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2
Examples of Argument Validation . . . . . . . . . . . . . . . . . . . . . . . . . . 26-5
Kinds of Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-6
Required and Optional Positional Arguments . . . . . . . . . . . . . . . . . 26-6
Repeating Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-8
Name-Value Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-10
Robust Handling of Name-Value Arguments . . . . . . . . . . . . . . . . . 26-13
Name-Value Arguments from Class Properties . . . . . . . . . . . . . . . 26-14
Argument Validation in Class Methods . . . . . . . . . . . . . . . . . . . . . 26-16

Order of Argument Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-16
Avoiding Class and Size Conversions . . . . . . . . . . . . . . . . . . . . . . 26-17
nargin in Argument Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-19
Restrictions on Variable and Function Access . . . . . . . . . . . . . . . . 26-20
Debugging Arguments Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-21

Argument Validation Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-22


Numeric Value Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-22
Comparison with Other Values . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-23
Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-23
Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-23
Membership and Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-24
Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-24
Define Validation Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-24

Ways to Parse Function Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-26


Function Argument Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-26
validateattributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-26
inputParser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-26

Transparency in MATLAB Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-27


Writing Transparent Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-27

Software Development

Error Handling
27
Exception Handling in a MATLAB Application . . . . . . . . . . . . . . . . . 27-2
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-2
Getting an Exception at the Command Line . . . . . . . . . . . . . . . . . . 27-2
Getting an Exception in Your Program Code . . . . . . . . . . . . . . . . . . 27-3
Generating a New Exception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-3

Throw an Exception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-4


Suggestions on How to Throw an Exception . . . . . . . . . . . . . . . . . . 27-4

Respond to an Exception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-6


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-6
The try/catch Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-6
Suggestions on How to Handle an Exception . . . . . . . . . . . . . . . . . 27-7

Clean Up When Functions Complete . . . . . . . . . . . . . . . . . . . . . . . . . 27-9


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-9
Examples of Cleaning Up a Program Upon Exit . . . . . . . . . . . . . . . 27-10
Retrieving Information About the Cleanup Routine . . . . . . . . . . . . 27-11
Using onCleanup Versus try/catch . . . . . . . . . . . . . . . . . . . . . . . . 27-12
onCleanup in Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-12

Issue Warnings and Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-13
Issue Warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-13
Throw Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-13
Add Run-Time Parameters to Your Warnings and Errors . . . . . . . . 27-14
Add Identifiers to Warnings and Errors . . . . . . . . . . . . . . . . . . . . . 27-14

Suppress Warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-16


Turn Warnings On and Off . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-16

Restore Warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-18


Disable and Restore a Particular Warning . . . . . . . . . . . . . . . . . . . 27-18
Disable and Restore Multiple Warnings . . . . . . . . . . . . . . . . . . . . . 27-19

Change How Warnings Display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-20


Enable Verbose Warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-20
Display a Stack Trace on a Specific Warning . . . . . . . . . . . . . . . . . 27-20

Use try/catch to Handle Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-21

Program Scheduling
28
Schedule Command Execution Using Timer . . . . . . . . . . . . . . . . . . . 28-2
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-2
Example: Displaying a Message . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-2

Timer Callback Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-4


Associating Commands with Timer Object Events . . . . . . . . . . . . . . 28-4
Creating Callback Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-5
Specifying the Value of Callback Function Properties . . . . . . . . . . . 28-6

Handling Timer Queuing Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . 28-8


Drop Mode (Default) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-8
Error Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-9
Queue Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-10

Performance
29
Measure the Performance of Your Code . . . . . . . . . . . . . . . . . . . . . . 29-2
Overview of Performance Timing Functions . . . . . . . . . . . . . . . . . . 29-2
Time Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-2
Time Portions of Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-2
The cputime Function vs. tic/toc and timeit . . . . . . . . . . . . . . . . . . . 29-2
Tips for Measuring Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-3

Profile Your Code to Improve Performance . . . . . . . . . . . . . . . . . . . . 29-4


What Is Profiling? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-4
Profile Your Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-4

Profile Multiple Statements in Command Window . . . . . . . . . . . . . 29-10
Profile an App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-11

Determine Code Coverage Using the Profiler . . . . . . . . . . . . . . . . . 29-12

Techniques to Improve Performance . . . . . . . . . . . . . . . . . . . . . . . . 29-14


Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-14
Code Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-14
Programming Practices for Performance . . . . . . . . . . . . . . . . . . . . 29-14
Tips on Specific MATLAB Functions . . . . . . . . . . . . . . . . . . . . . . . 29-15

Preallocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-16
Preallocating a Nondouble Matrix . . . . . . . . . . . . . . . . . . . . . . . . 29-16

Vectorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-18
Using Vectorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-18
Array Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-19
Logical Array Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-20
Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29-21
Ordering, Setting, and Counting Operations . . . . . . . . . . . . . . . . . 29-22
Functions Commonly Used in Vectorization . . . . . . . . . . . . . . . . . 29-23

Background Processing
30
Asynchronous Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30-2
Asynchronous Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30-2
Background Workers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30-4

Run MATLAB Functions in Thread-Based Environment . . . . . . . . . 30-5


Run Functions in the Background . . . . . . . . . . . . . . . . . . . . . . . . . . 30-5
Run Functions on a Thread Pool . . . . . . . . . . . . . . . . . . . . . . . . . . . 30-5
Automatically Scale Up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30-5
Check Thread Supported Functions . . . . . . . . . . . . . . . . . . . . . . . . 30-5

Use the Background to Make Your Apps More Responsive . . . . . . . 30-7


Open App Designer App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30-7
Add a Future Array to the Properties . . . . . . . . . . . . . . . . . . . . . . . 30-7
Create y-axis Data in the Background . . . . . . . . . . . . . . . . . . . . . . . 30-8
Automatically Update Plot After Data Is Calculated in the Background . . . . 30-8
Make Your App More Responsive by Canceling the Future Array . . 30-9
Responsive App That Calculates and Plots Simple Curves . . . . . . . 30-10

Run Functions in Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30-11

Update Wait Bar While Functions Run in the Background . . . . . . 30-12

Memory Usage
31
Strategies for Efficient Use of Memory . . . . . . . . . . . . . . . . . . . . . . . 31-2
Use Appropriate Data Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-2
Avoid Temporary Copies of Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-3
Reclaim Used Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-4

Resolve “Out of Memory” Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-6


Issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-6
Possible Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-6

How MATLAB Allocates Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-12

Avoid Unnecessary Copies of Data . . . . . . . . . . . . . . . . . . . . . . . . . . 31-16


Passing Values to Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-16
Why Pass-by-Value Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-19
Handle Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31-19

Custom Help and Documentation


32
Create Help for Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-2
Help Text from the doc Command . . . . . . . . . . . . . . . . . . . . . . . . . . 32-2
Custom Help Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-3

Check Which Programs Have Help . . . . . . . . . . . . . . . . . . . . . . . . . . 32-8

Create Help Summary Files — Contents.m . . . . . . . . . . . . . . . . . . . 32-10


What Is a Contents.m File? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-10
Create a Contents.m File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-10
Check an Existing Contents.m File . . . . . . . . . . . . . . . . . . . . . . . . 32-11

Customize Code Suggestions and Completions . . . . . . . . . . . . . . . 32-12


Function Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-13
Signature Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-13
Argument Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-14
Create Function Signature File . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-17
How Function Signature Information is Used . . . . . . . . . . . . . . . . 32-18
Multiple Signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-19

Display Custom Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-21


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-21
Create HTML Help Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-22
Create info.xml File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-23
Create helptoc.xml File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-24
Build a Search Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-26
Address Validation Errors for info.xml Files . . . . . . . . . . . . . . . . . 32-27

Display Custom Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-29


How to Display Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32-29

Elements of the demos.xml File . . . . . . . . . . . . . . . . . . . . . . . . . . 32-30

Projects
33
Create Projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-2
What Are Projects? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-2
Create Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-2
Open Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-2
Set up Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-3
Add Files to Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-5
Other Ways to Create Projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-6

Automate Startup and Shutdown Tasks . . . . . . . . . . . . . . . . . . . . . . . 33-8


Specify Project Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-8
Set Startup Folder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-8
Specify Startup and Shutdown Files . . . . . . . . . . . . . . . . . . . . . . . . 33-8

Set MATLAB Projects Preferences . . . . . . . . . . . . . . . . . . . . . . . . . . 33-10

Manage Project Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-13


Automatic Updates When Renaming, Deleting, or Removing Files . . . . . . 33-14

Find Project Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-15


Group and Sort Project Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-15
Search for and Filter Project Files . . . . . . . . . . . . . . . . . . . . . . . . 33-15
Search the Content in Project Files . . . . . . . . . . . . . . . . . . . . . . . . 33-15

Create Shortcuts to Frequent Tasks . . . . . . . . . . . . . . . . . . . . . . . . 33-17


Run Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-17
Create Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-17
Organize Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-17

Add Labels to Project Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-19


Add Labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-19
View and Edit Label Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-19
Create Labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-20

Create Custom Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-21


Create a Custom Task Function . . . . . . . . . . . . . . . . . . . . . . . . . . 33-21
Run a Custom Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-21
Save Custom Task Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-22

Componentize Large Projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-23


Add or Remove Reference to a Project . . . . . . . . . . . . . . . . . . . . . 33-23
View, Edit, or Run Referenced Project Files . . . . . . . . . . . . . . . . . 33-23
Extract Folder to Create a Referenced Project . . . . . . . . . . . . . . . 33-24
Manage Changes in Referenced Project Using Checkpoints . . . . . 33-24

Share Projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-26


Create an Export Profile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-29

Upgrade Projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-30
Run Upgrade Project Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-30
Examine Upgrade Project Report . . . . . . . . . . . . . . . . . . . . . . . . . 33-31

Analyze Project Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-33


Run a Dependency Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-33
Explore the Dependency Graph, Views, and Filters . . . . . . . . . . . . 33-35
Investigate and Resolve Problems . . . . . . . . . . . . . . . . . . . . . . . . . 33-40
Find Required Products and Add-Ons . . . . . . . . . . . . . . . . . . . . . . 33-42
Find File Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-43
Export Dependency Analysis Results . . . . . . . . . . . . . . . . . . . . . . 33-45

Clone from Git Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-47

Use Source Control with Projects . . . . . . . . . . . . . . . . . . . . . . . . . . 33-48


Set Up Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-48
Perform Source Control Operations . . . . . . . . . . . . . . . . . . . . . . . 33-50
Work with Derived Files in Projects . . . . . . . . . . . . . . . . . . . . . . . 33-57
Find Project Files With Unsaved Changes . . . . . . . . . . . . . . . . . . . 33-58
Manage Open Files When Closing a Project . . . . . . . . . . . . . . . . . 33-58

Create and Edit Projects Programmatically . . . . . . . . . . . . . . . . . . 33-59

Explore an Example Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33-66

Source Control Interface


34
About MathWorks Source Control Integration . . . . . . . . . . . . . . . . . 34-2
Classic and Distributed Source Control . . . . . . . . . . . . . . . . . . . . . . 34-2

Select or Disable Source Control System . . . . . . . . . . . . . . . . . . . . . 34-4


Select Source Control System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-4
Disable Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-4

Create Git Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-5

Review Changes in Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . 34-7

Mark Files for Addition to Source Control . . . . . . . . . . . . . . . . . . . . 34-8

Resolve Source Control Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-9


Examining and Resolving Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . 34-9
Resolve Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-9
Merge Text Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-10
Extract Conflict Markers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-10

Commit Modified Files to Source Control . . . . . . . . . . . . . . . . . . . . 34-12

Revert Changes in Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . 34-13


Revert Local Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-13
Revert a File to a Specified Revision . . . . . . . . . . . . . . . . . . . . . . . 34-13

Set Up SVN Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-14
SVN Source Control Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-14
Register Binary Files with SVN . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-14
Standard Repository Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-17
Tag Versions of Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-17
Enforce Locking Files Before Editing . . . . . . . . . . . . . . . . . . . . . . 34-17
Share a Subversion Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-18

Check Out from SVN Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-19


Retrieve Tagged Version of Repository . . . . . . . . . . . . . . . . . . . . . 34-19

Update SVN File Status and Revision . . . . . . . . . . . . . . . . . . . . . . . 34-21


Refresh Status of Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-21
Update Revisions of Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-21

Get SVN File Locks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-22


Manage SVN Repository Locks . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-22

Set Up Git Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-23


Configure MATLAB on Windows . . . . . . . . . . . . . . . . . . . . . . . . . . 34-23
Use SSH Authentication with MATLAB . . . . . . . . . . . . . . . . . . . . . 34-23
Configure Git Credential Helper . . . . . . . . . . . . . . . . . . . . . . . . . . 34-25
Register Binary Files with Git . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-25
Use Git LFS with MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-26

Add Git Submodules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-27


Update Submodules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-27
Use Fetch and Merge with Submodules . . . . . . . . . . . . . . . . . . . . 34-27
Use Push to Send Changes to the Submodule Repository . . . . . . . 34-27

Retrieve Files from Git Repository . . . . . . . . . . . . . . . . . . . . . . . . . . 34-29

Update Git File Status and Revision . . . . . . . . . . . . . . . . . . . . . . . . 34-30


Refresh Status of Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-30
Update Revisions of Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-30

Branch and Merge with Git . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-31


Create Branch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-31
Switch Branch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-32
Compare Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-33
Merge Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-33
Revert to Head . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-34
Delete Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-34

Pull, Push and Fetch Files with Git . . . . . . . . . . . . . . . . . . . . . . . . . 34-35


Pull and Push . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-35
Fetch and Merge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-36
Use Git Stashes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-36

Move, Rename, or Delete Files Under Source Control . . . . . . . . . . 34-38

Customize External Source Control to Use MATLAB for Diff and Merge . . . . 34-39
Finding the Full Paths for MATLAB Diff, Merge, and AutoMerge . . 34-39
Integration with Git . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-40

Integration with SVN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-41
Integration with Other Source Control Tools . . . . . . . . . . . . . . . . . 34-42

MSSCCI Source Control Interface . . . . . . . . . . . . . . . . . . . . . . . . . . 34-44

Set Up MSSCCI Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-45


Create Projects in Source Control System . . . . . . . . . . . . . . . . . . . 34-45
Specify Source Control System with MATLAB Software . . . . . . . . 34-46
Register Source Control Project with MATLAB Software . . . . . . . . 34-47
Add Files to Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-49

Check Files In and Out from MSSCCI Source Control . . . . . . . . . . 34-50


Check Files Into Source Control . . . . . . . . . . . . . . . . . . . . . . . . . . 34-50
Check Files Out of Source Control . . . . . . . . . . . . . . . . . . . . . . . . 34-50
Undoing the Checkout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-51

Additional MSSCCI Source Control Actions . . . . . . . . . . . . . . . . . . 34-52


Getting the Latest Version of Files for Viewing or Compiling . . . . . 34-52
Removing Files from the Source Control System . . . . . . . . . . . . . . 34-53
Showing File History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-53
Comparing the Working Copy of a File to the Latest Version in Source Control . . . . 34-54
Viewing Source Control Properties of a File . . . . . . . . . . . . . . . . . 34-55
Starting the Source Control System . . . . . . . . . . . . . . . . . . . . . . . 34-56

Access MSSCCI Source Control from Editors . . . . . . . . . . . . . . . . . 34-58

Troubleshoot MSSCCI Source Control Problems . . . . . . . . . . . . . . 34-59


Source Control Error: Provider Not Present or Not Installed Properly . . . . 34-59
Restriction Against @ Character . . . . . . . . . . . . . . . . . . . . . . . . . . 34-60
Add to Source Control Is the Only Action Available . . . . . . . . . . . . 34-60
More Solutions for Source Control Problems . . . . . . . . . . . . . . . . 34-60

Unit Testing
35
Write Test Using Live Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-3

Write Script-Based Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-6

Write Script-Based Test Using Local Functions . . . . . . . . . . . . . . . 35-11

Extend Script-Based Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-14


Test Suite Creation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-14
Test Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-14
Programmatic Access of Test Diagnostics . . . . . . . . . . . . . . . . . . . 35-15
Test Runner Customization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-15

Run Tests in Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-17

Write Function-Based Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-20
Create Test Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-20
Run the Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-22
Analyze the Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-23

Write Simple Test Case Using Functions . . . . . . . . . . . . . . . . . . . . . 35-24

Write Test Using Setup and Teardown Functions . . . . . . . . . . . . . . 35-27

Extend Function-Based Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-31


Fixtures for Setup and Teardown Code . . . . . . . . . . . . . . . . . . . . . 35-31
Test Logging and Verbosity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-32
Test Suite Creation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-32
Test Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-32
Test Running . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-33
Programmatic Access of Test Diagnostics . . . . . . . . . . . . . . . . . . . 35-33
Test Runner Customization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-34

Author Class-Based Unit Tests in MATLAB . . . . . . . . . . . . . . . . . . . 35-35


The Test Class Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-35
The Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-35
Additional Features for Advanced Test Classes . . . . . . . . . . . . . . . 35-36

Write Simple Test Case Using Classes . . . . . . . . . . . . . . . . . . . . . . . 35-38

Write Setup and Teardown Code Using Classes . . . . . . . . . . . . . . . 35-41


Test Fixtures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-41
Test Case with Method-Level Setup Code . . . . . . . . . . . . . . . . . . . 35-41
Test Case with Class-Level Setup Code . . . . . . . . . . . . . . . . . . . . . 35-42

Table of Verifications, Assertions, and Other Qualifications . . . . . 35-44

Tag Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-47


Tag Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-47
Select and Run Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-48

Write Tests Using Shared Fixtures . . . . . . . . . . . . . . . . . . . . . . . . . . 35-51

Create Basic Custom Fixture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-54

Create Advanced Custom Fixture . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-56

Use Parameters in Class-Based Tests . . . . . . . . . . . . . . . . . . . . . . . 35-61


How to Write Parameterized Tests . . . . . . . . . . . . . . . . . . . . . . . . 35-61
How to Initialize Parameterization Properties . . . . . . . . . . . . . . . . 35-62
Specify Parameterization Level . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-63
Specify How Parameters Are Combined . . . . . . . . . . . . . . . . . . . . 35-64
Use External Parameters in Tests . . . . . . . . . . . . . . . . . . . . . . . . . 35-64

Create Basic Parameterized Test . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-66

Create Advanced Parameterized Test . . . . . . . . . . . . . . . . . . . . . . . . 35-71

Use External Parameters in Parameterized Test . . . . . . . . . . . . . . . 35-78

Define Parameters at Suite Creation Time . . . . . . . . . . . . . . . . . . . 35-82

Create Simple Test Suites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-89

Run Tests for Various Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-91


Set Up Example Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-91
Run All Tests in Class or Function . . . . . . . . . . . . . . . . . . . . . . . . . 35-91
Run Single Test in Class or Function . . . . . . . . . . . . . . . . . . . . . . . 35-91
Run Test Suites by Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-92
Run Test Suites from Test Array . . . . . . . . . . . . . . . . . . . . . . . . . . 35-92
Run Tests with Customized Test Runner . . . . . . . . . . . . . . . . . . . . 35-93

Programmatically Access Test Diagnostics . . . . . . . . . . . . . . . . . . . 35-94

Add Plugin to Test Runner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-95

Write Plugins to Extend TestRunner . . . . . . . . . . . . . . . . . . . . . . . . 35-97


Custom Plugins Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-97
Extending Test Session Level Plugin Methods . . . . . . . . . . . . . . . . 35-97
Extending Test Suite Level Plugin Methods . . . . . . . . . . . . . . . . . . 35-98
Extending Test Class Level Plugin Methods . . . . . . . . . . . . . . . . . 35-98
Extending Test Level Plugin Methods . . . . . . . . . . . . . . . . . . . . . . 35-99

Create Custom Plugin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-100

Run Tests in Parallel with Custom Plugin . . . . . . . . . . . . . . . . . . . 35-105

Write Plugin to Add Data to Test Results . . . . . . . . . . . . . . . . . . . 35-113

Write Plugin to Save Diagnostic Details . . . . . . . . . . . . . . . . . . . . 35-118

Plugin to Generate Custom Test Output Format . . . . . . . . . . . . . . 35-122

Analyze Test Case Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-125

Analyze Failed Test Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-128

Rerun Failed Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-130

Dynamically Filtered Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-133


Test Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-133
Method Setup and Teardown Code . . . . . . . . . . . . . . . . . . . . . . . 35-135
Class Setup and Teardown Code . . . . . . . . . . . . . . . . . . . . . . . . . 35-136

Create Custom Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-139

Create Custom Boolean Constraint . . . . . . . . . . . . . . . . . . . . . . . . 35-142

Create Custom Tolerance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-146

Overview of App Testing Framework . . . . . . . . . . . . . . . . . . . . . . . 35-150


App Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-150
Gesture Support of UI Components . . . . . . . . . . . . . . . . . . . . . . 35-150
Write a Test for an App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-152

Write Test for App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-155

Write Test That Uses App Testing and Mocking Frameworks . . . 35-159
Create App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-159
Test App With Manual Intervention . . . . . . . . . . . . . . . . . . . . . . . 35-160
Create Fully Automated Test . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-161

Overview of Performance Testing Framework . . . . . . . . . . . . . . . . 35-164


Determine Bounds of Measured Code . . . . . . . . . . . . . . . . . . . . . 35-164
Types of Time Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-165
Write Performance Tests with Measurement Boundaries . . . . . . . 35-165
Run Performance Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-166
Understand Invalid Test Results . . . . . . . . . . . . . . . . . . . . . . . . . 35-166

Test Performance Using Scripts or Functions . . . . . . . . . . . . . . . . 35-168

Test Performance Using Classes . . . . . . . . . . . . . . . . . . . . . . . . . . 35-172

Measure Fast Executing Test Code . . . . . . . . . . . . . . . . . . . . . . . . 35-178

Create Mock Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-181

Specify Mock Object Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-188


Define Mock Method Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . 35-188
Define Mock Property Behavior . . . . . . . . . . . . . . . . . . . . . . . . . 35-189
Define Repeating and Subsequent Behavior . . . . . . . . . . . . . . . . 35-190
Summary of Behaviors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-192

Qualify Mock Object Interaction . . . . . . . . . . . . . . . . . . . . . . . . . . 35-193


Qualify Mock Method Interaction . . . . . . . . . . . . . . . . . . . . . . . . 35-193
Qualify Mock Property Interaction . . . . . . . . . . . . . . . . . . . . . . . 35-194
Use Mock Object Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-195
Summary of Qualifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-197

Ways to Write Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-199


Script-Based Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-199
Function-Based Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-200
Class-Based Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-200
Extend Unit Testing Framework . . . . . . . . . . . . . . . . . . . . . . . . . 35-201

Compile MATLAB Unit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-202


Run Tests with Standalone Applications . . . . . . . . . . . . . . . . . . . 35-202
Run Tests in Parallel with Standalone Applications . . . . . . . . . . . 35-203
TestRand Class Definition Summary . . . . . . . . . . . . . . . . . . . . . . 35-203

Develop and Integrate Software with Continuous Integration . . 35-205


Continuous Integration Workflow . . . . . . . . . . . . . . . . . . . . . . . . 35-205
Continuous Integration with MathWorks Products . . . . . . . . . . . 35-207

Generate Artifacts Using MATLAB Unit Test Plugins . . . . . . . . . . 35-209

Continuous Integration with MATLAB on CI Platforms . . . . . . . . 35-213


Azure DevOps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-213
Bamboo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-213
CircleCI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-213

GitHub Actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-213
Jenkins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-214
Travis CI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-214
Other Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-214

System object Usage and Authoring


36
What Are System Objects? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-2
Running a System Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-3
System Object Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-3

System Objects vs. MATLAB Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-5


System Objects vs. MATLAB Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 36-5
Process Audio Data Using Only MATLAB Functions Code . . . . . . . . . . . . 36-5
Process Audio Data Using System Objects . . . . . . . . . . . . . . . . . . . . . . . 36-6

System Design in MATLAB Using System Objects . . . . . . . . . . . . . . . . . . 36-7


System Design and Simulation in MATLAB . . . . . . . . . . . . . . . . . . . . . . . 36-7
Create Individual Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-7
Configure Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-8
Create and Configure Components at the Same Time . . . . . . . . . . . . . . . 36-8
Assemble Components Into System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-9
Run Your System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-9
Reconfiguring Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-10

Define Basic System Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-11


Create System Object Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-11
Define Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-11

Change the Number of Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-13

Validate Property and Input Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-16


Validate a Single Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-16
Validate Interdependent Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-16
Validate Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-16
Complete Class Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-16

Initialize Properties and Setup One-Time Calculations . . . . . . . . . . . . . 36-18

Set Property Values at Construction Time . . . . . . . . . . . . . . . . . . . . . . . 36-20

Reset Algorithm and Release Resources . . . . . . . . . . . . . . . . . . . . . . . . . 36-22


Reset Algorithm State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-22
Release System Object Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-22

Define Property Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-24


Specify Property as Nontunable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-24
Specify Property as DiscreteState . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-24
Example Class with Various Property Attributes . . . . . . . . . . . . . . . . . . 36-24

Hide Inactive Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-26
Specify Inactive Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-26
Complete Class Definition File with Inactive Properties Method . . . . . . 36-26

Limit Property Values to Finite List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-28


Property Validation with mustBeMember . . . . . . . . . . . . . . . . . . . . . . . 36-28
Enumeration Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-28
Create a Whiteboard System object . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-29

Process Tuned Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-32

Define Composite System Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-34

Define Finite Source Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-36


Use the FiniteSource Class and Specify End of the Source . . . . . . . . . . 36-36
Complete Class Definition File with Finite Source . . . . . . . . . . . . . . . . . 36-36

Save and Load System Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-38


Save System Object and Child Object . . . . . . . . . . . . . . . . . . . . . . . . . . 36-38
Load System Object and Child Object . . . . . . . . . . . . . . . . . . . . . . . . . . 36-38
Complete Class Definition Files with Save and Load . . . . . . . . . . . . . . . 36-38

Define System Object Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-41

Handle Input Specification Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-43


React to Input Specification Changes . . . . . . . . . . . . . . . . . . . . . . . . . . 36-43
Restrict Input Specification Changes . . . . . . . . . . . . . . . . . . . . . . . . . . 36-43

Summary of Call Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-45


Setup Call Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-45
Running the Object or Step Call Sequence . . . . . . . . . . . . . . . . . . . . . . 36-45
Reset Method Call Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-46
Release Method Call Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-47

Detailed Call Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-48


setup Call Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-48
Running the Object or step Call Sequence . . . . . . . . . . . . . . . . . . . . . . 36-48
reset Call Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-49
release Call Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-49

Tips for Defining System Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-50


General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-50
Inputs and Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-50
Using ~ as an Input Argument in Method Definitions . . . . . . . . . . . . . . 36-50
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-50
Text Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-51
Simulink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-51
Code Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-52

Insert System Object Code Using MATLAB Editor . . . . . . . . . . . . . . . . . 36-53


Define System Objects with Code Insertion . . . . . . . . . . . . . . . . . . . . . . 36-53
Create a Temperature Enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-55
Create Custom Property for Freezing Point . . . . . . . . . . . . . . . . . . . . . . 36-56
Add Method to Validate Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-57

Analyze System Object Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-58
View and Navigate System object Code . . . . . . . . . . . . . . . . . . . . . . . . 36-58
Example: Go to StepImpl Method Using Analyzer . . . . . . . . . . . . . . . . . 36-58

Use Global Variables in System Objects . . . . . . . . . . . . . . . . . . . . . . . . . 36-60


System Object Global Variables in MATLAB . . . . . . . . . . . . . . . . . . . . . 36-60
System Object Global Variables in Simulink . . . . . . . . . . . . . . . . . . . . . 36-60

Create Moving Average System object . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-64

Create New System Objects for File Input and Output . . . . . . . . . . . . . 36-69

Create Composite System object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36-75

Language

1

Syntax Basics

• “Continue Long Statements on Multiple Lines” on page 1-2


• “Name=Value in Function Calls” on page 1-3
• “Ignore Function Outputs” on page 1-4
• “Variable Names” on page 1-5
• “Case and Space Sensitivity” on page 1-6
• “Choose Command Syntax or Function Syntax” on page 1-7
• “Resolve Error: Undefined Function or Variable” on page 1-10

Continue Long Statements on Multiple Lines


This example shows how to continue a statement to the next line using ellipsis (...).

s = 1 - 1/2 + 1/3 - 1/4 + 1/5 ...
      - 1/6 + 1/7 - 1/8 + 1/9;

Build a long character vector by concatenating shorter vectors together:

mytext = ['Accelerating the pace of ' ...
          'engineering and science'];

The start and end quotation marks for a character vector must appear on the same line. For example,
this code returns an error, because each line contains only one quotation mark:

mytext = 'Accelerating the pace of ...
          engineering and science'

An ellipsis outside a quoted text is equivalent to a space. For example,

x = [1.23...
4.56];

is the same as

x = [1.23 4.56];


Name=Value in Function Calls


Since R2021a

MATLAB supports two syntaxes for passing name-value arguments.

plot(x,y,LineWidth=2)        name=value syntax
plot(x,y,"LineWidth",2)      comma-separated syntax

Use the name=value syntax to help identify name-value arguments for functions and to clearly
distinguish names from values in lists of name-value arguments.

Most functions and methods support both syntaxes, but there are some limitations on where and how
the name=value syntax can be used:

• Mixing name,value and name=value syntaxes: The recommended practice is to use only one
syntax in any given function call. However, if you do mix name=value and name,value syntaxes
in a single call, all name=value arguments must appear after the name,value arguments. For
example, plot(x,y,"Color","red",LineWidth=2) is a valid combination, but
plot(x,y,Color="red","LineWidth",2) errors.
• Using positional arguments after name-value arguments: Some functions have positional
arguments that appear after name-value arguments. For example, this call to the verifyEqual
method uses the RelTol name-value argument, followed by a string input:

verifyEqual(testCase,1.5,2,"RelTol",0.1,...
"Difference exceeds relative tolerance.")

Using the name=value syntax (RelTol=0.1) causes the statement to error. In cases where a
positional argument follows name-value arguments, use the name,value syntax.
• Names that are invalid variable names: Name-value arguments with names that are invalid
MATLAB variable names cannot be used with the name=value syntax. See “Variable Names” on
page 1-5 for more information. For example, a name-value argument like "allow-empty",true
errors if passed as allow-empty=true. Use the name,value syntax in these cases.

Function authors do not need to code differently to support both the name,value and name=value
syntaxes. For information on authoring functions that accept name-value arguments, see “Name-Value
Arguments” on page 26-10.
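
As a minimal sketch of the two syntaxes and of the ordering rule for mixed calls, the following uses the built-in sort function, whose ComparisonMethod and MissingPlacement name-value arguments are assumed here only as convenient examples. The name=value form requires R2021a or later.

```matlab
v = [3 -5 1];

% Same call written with each syntax.
a = sort(v, "ComparisonMethod", "abs");   % comma-separated syntax
b = sort(v, ComparisonMethod="abs");      % name=value syntax

% Mixed call: the name,value pair comes first, name=value comes last.
w = [3 NaN -5 1];
m = sort(w, "MissingPlacement", "first", ComparisonMethod="abs");

disp(isequal(a, b))   % both syntaxes pass the same arguments
```

Reversing the order in the mixed call (name=value first, then a quoted name) is the combination that the text above describes as an error.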


Ignore Function Outputs


This example shows how to ignore specific outputs from a function using the tilde (~) operator.

Request all three possible outputs from the fileparts function.

helpFile = which('help');
[helpPath,name,ext] = fileparts(helpFile);

The current workspace now contains three variables from fileparts: helpPath, name, and ext. In
this case, the variables are small. However, some functions return results that use much more
memory. If you do not need those variables, they waste space on your system.

If you do not use the tilde operator, you can request only the first N outputs of a function (where N is
less than or equal to the number of possible outputs) and ignore any remaining outputs. For example,
request only the first output, ignoring the second and third.

helpPath = fileparts(helpFile);

If you request more than one output, enclose the variable names in square brackets, []. The
following code ignores the output argument ext.

[helpPath,name] = fileparts(helpFile);

To ignore function outputs in any position in the argument list, use the tilde operator. For example,
ignore the first output using a tilde.

[~,name,ext] = fileparts(helpFile);

You can ignore any number of function outputs using the tilde operator. Separate consecutive tildes
with a comma. For example, this code ignores the first two output arguments.

[~,~,ext] = fileparts(helpFile);
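
The same pattern applies to any function with several outputs. The short sketch below uses a made-up file path (no file needs to exist, because fileparts only parses text) and the max function to keep just the outputs of interest.

```matlab
% Hypothetical path, used only to illustrate parsing.
p = '/tmp/data/report.txt';

% Keep the name and extension, skip the folder.
[~, name, ext] = fileparts(p);

% Keep only the index of the largest element, skip the value itself.
[~, idx] = max([3 9 4]);
```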

See Also

More About
• “Ignore Inputs in Function Definitions” on page 21-10


Variable Names
In this section...
“Valid Names” on page 1-5
“Conflicts with Function Names” on page 1-5

Valid Names
A valid variable name starts with a letter, followed by letters, digits, or underscores. MATLAB is case
sensitive, so A and a are not the same variable. The maximum length of a variable name is the value
that the namelengthmax command returns.

You cannot define variables with the same names as MATLAB keywords, such as if or end. For a
complete list, run the iskeyword command.

Examples of valid names:      Examples of invalid names:
x6                            6x
lastValue                     end
n_factorial                   n!
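
One way to check these rules programmatically is with isvarname, iskeyword, and namelengthmax. The sketch below tests the six example names from the table; the variable names candidates, isValid, isKey, and maxLen are illustrative only.

```matlab
candidates = {'x6', 'lastValue', 'n_factorial', '6x', 'end', 'n!'};

% isvarname returns true only for names that follow the rules above
% and are not MATLAB keywords.
isValid = cellfun(@isvarname, candidates);

isKey  = iskeyword('end');   % true: end is a reserved keyword
maxLen = namelengthmax;      % maximum allowed name length

for k = 1:numel(candidates)
    fprintf('%-12s valid: %d\n', candidates{k}, isValid(k));
end
```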

Conflicts with Function Names


Avoid creating variables with the same name as a function (such as i, j, mode, char, size, and
path). In general, variable names take precedence over function names. If you create a variable that
uses the name of a function, you sometimes get unexpected results.

Check whether a proposed name is already in use with the exist or which function. exist returns
0 if there are no existing variables, functions, or other artifacts with the proposed name. For example:

exist checkname

ans =
0

If you inadvertently create a variable with a name conflict, remove the variable from memory with the
clear function.
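
A small sketch of detecting and removing such a conflict follows. It deliberately creates a variable named mode (one of the function names listed above) and uses the 'var' option of exist to ask only about variables.

```matlab
mode = 3;                              % variable that shadows the mode function

isShadowing = exist('mode', 'var');    % 1 while the variable exists

clear mode                             % remove the variable from memory

stillShadowing = exist('mode', 'var'); % 0 after clearing
```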

Another potential source of name conflicts occurs when you define a function that calls load or eval
(or similar functions) to add variables to the workspace. In some cases, load or eval add variables
that have the same names as functions. Unless these variables are in the function workspace before
the call to load or eval, the MATLAB parser interprets the variable names as function names. For
more information, see:

• “Unexpected Results When Loading Variables Within a Function”
• “Alternatives to the eval Function” on page 2-86

See Also
clear | exist | iskeyword | namelengthmax | which | isvarname


Case and Space Sensitivity


MATLAB code is sensitive to casing, and insensitive to blank spaces except when defining arrays.

Uppercase and Lowercase

In MATLAB code, use an exact match with regard to case for variables, files, and functions. For
example, if you have a variable, a, you cannot refer to that variable as A. It is a best practice to use
lowercase only when naming functions. This is especially useful when you use both Microsoft®
Windows® and UNIX®¹ platforms because their file systems behave differently with regard to case.

When you use the help function, the help displays some function names in all uppercase, for
example, PLOT, solely to distinguish the function name from the rest of the text. Some functions for
interfacing to Oracle® Java® software do use mixed case and the command-line help and the
documentation accurately reflect that.

Spaces

Blank spaces around operators such as -, :, and ( ) are optional, but they can improve readability.
For example, MATLAB interprets the following statements the same way.

y = sin (3 * pi) / 2
y=sin(3*pi)/2

However, blank spaces act as delimiters in horizontal concatenation. When defining row vectors, you
can use spaces and commas interchangeably to separate elements:

A = [1, 0 2, 3 3]

A =

1 0 2 3 3

Because of this flexibility, check to ensure that MATLAB stores the correct values. For example, the
statement [1 sin (pi) 3] produces a much different result than [1 sin(pi) 3] does.

[1 sin (pi) 3]

Error using sin
Not enough input arguments.

[1 sin(pi) 3]

ans =

1.0000 0.0000 3.0000
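
The same sensitivity applies to plus and minus signs inside square brackets. The following sketch shows how spacing decides whether a minus sign is read as subtraction or as the sign of a new element.

```matlab
a = [1 - 2];    % spaces on both sides: subtraction, one element
b = [1 -2];     % space before only: unary minus, two elements
c = [1-2];      % no spaces: subtraction, one element
d = [1, -2];    % a comma removes any doubt: two elements

fprintf('%d %d %d %d\n', numel(a), numel(b), numel(c), numel(d));
```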

1. UNIX is a registered trademark of The Open Group in the United States and other countries.


Choose Command Syntax or Function Syntax


MATLAB has two ways of calling functions, called function syntax and command syntax. This page
discusses the differences between these syntax formats and how to avoid common mistakes
associated with command syntax.

For introductory information on calling functions, see “Calling Functions”. For information related to
defining functions, see “Create Functions in Files” on page 20-2.

Command Syntax and Function Syntax


In MATLAB, these statements are equivalent:
load durer.mat % Command syntax
load('durer.mat') % Function syntax

This equivalence is sometimes referred to as command-function duality.

All functions support this standard function syntax:


[output1, ..., outputM] = functionName(input1, ..., inputN)

In function syntax, inputs can be data, variables, and even MATLAB expressions. If an input is data,
such as the numeric value 2 or the string array ["a" "b" "c"], MATLAB passes it to the function
as-is. If an input is a variable, MATLAB passes the value assigned to it. If an input is an expression,
like 2+2 or sin(2*pi), MATLAB evaluates it first, and passes the result to the function. If the
function has outputs, you can assign them to variables as shown in the example syntax above.

Command syntax is simpler but more limited. To use it, separate inputs with spaces rather than
commas, and do not enclose them in parentheses.
functionName input1 ... inputN

With command syntax, MATLAB passes all inputs as character vectors (that is, as if they were
enclosed in single quotation marks) and does not assign outputs to variables. To pass a data type
other than a character vector, use the function syntax. To pass a value that contains a space, you have
two options. One is to use function syntax. The other is to put single quotes around the value.
Otherwise, MATLAB treats the space as splitting your value into multiple inputs.

If a value is assigned to a variable, you must use function syntax to pass the value to the function.
Command syntax always passes inputs as character vectors and cannot pass variable values. For
example, create a variable and call the disp function with function syntax to pass the value of the
variable:
A = 123;
disp(A)

This code returns the expected result,

   123

You cannot use command syntax to pass the value of A, because this call
disp A

is equivalent to

disp('A')

and returns

A

Avoid Common Syntax Mistakes


Suppose that your workspace contains these variables:

filename = 'accounts.txt';
A = int8(1:8);
B = A;

The following table illustrates common misapplications of command syntax.

This Command...         Is Equivalent to...              Correct Syntax for Passing Value
open filename           open('filename')                 open(filename)
isequal A B             isequal('A','B')                 isequal(A,B)
strcmp class(A) int8    strcmp('class(A)','int8')        strcmp(class(A),'int8')
cd tempdir              cd('tempdir')                    cd(tempdir)
isnumeric 500           isnumeric('500')                 isnumeric(500)
round 3.499             round('3.499'), which is         round(3.499)
                        equivalent to
                        round([51 46 52 57 57])
disp hello world        disp('hello','world')            disp('hello world')
                                                         or
                                                         disp 'hello world'
disp "string"           disp('"string"')                 disp("string")

Passing Variable Names

Some functions expect character vectors for variable names, such as save, load, clear, and whos.
For example,

whos -file durer.mat X

requests information about variable X in the example file durer.mat. This command is equivalent to

whos('-file','durer.mat','X')
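
Two rows of the table above can be reproduced in a short sketch. Because command syntax does not assign outputs, the example reads the result from ans, which is done here only for illustration and is not a recommended practice in real code.

```matlab
% Command syntax: MATLAB passes the character vector '500'.
isnumeric 500
viaCommand = ans;                 % logical 0

% Function syntax: MATLAB passes the number 500.
viaFunction = isnumeric(500);     % logical 1

% The character codes that round 3.499 would actually receive.
codes = double('3.499');
```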

How MATLAB Recognizes Command Syntax


Consider the potentially ambiguous statement

ls ./d

This could be a call to the ls function with './d' as its argument. It also could represent element-
wise division on the array ls, using the variable d as the divisor.


If you issue this statement at the command line, MATLAB can access the current workspace and path
to determine whether ls and d are functions or variables. However, some components, such as the
Code Analyzer and the Editor/Debugger, operate without reference to the path or workspace. When
you are using those components, MATLAB uses syntactic rules to determine whether an expression is
a function call using command syntax.

In general, when MATLAB recognizes an identifier (which might name a function or a variable), it
analyzes the characters that follow the identifier to determine the type of expression, as follows:

• An equal sign (=) implies assignment. For example:

ls =d
• An open parenthesis after an identifier implies a function call. For example:

ls('./d')
• Space after an identifier, but not after a potential operator, implies a function call using command
syntax. For example:

ls ./d
• Spaces on both sides of a potential operator, or no spaces on either side of the operator, imply an
operation on variables. For example, these statements are equivalent:

ls ./ d

ls./d

Therefore, MATLAB treats the potentially ambiguous statement ls ./d as a call to the ls function
using command syntax.

The best practice is to avoid defining variable names that conflict with common functions, to prevent
any ambiguity.
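
To make the spacing rules concrete, the sketch below deliberately defines variables named ls and d (against the best practice just stated, and only for illustration) and writes the division in the two forms that MATLAB reads as an operation on variables.

```matlab
ls = [2 4 6];    % variable that shadows the ls function, for illustration only
d  = 2;

q1 = ls ./ d;    % spaces on both sides of the operator: element-wise division
q2 = ls./d;      % no spaces on either side: element-wise division

clear ls d       % remove the conflicting names again
```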

See Also
“Calling Functions” | “Create Functions in Files” on page 20-2


Resolve Error: Undefined Function or Variable

Issue
You may encounter the following error message, or something similar, while working with functions
or variables in MATLAB:

Undefined function or variable 'x'.

These errors usually indicate that MATLAB cannot find a particular variable or MATLAB program file
in the current directory or on the search path.

Possible Solutions
Verify Spelling of Function or Variable Name

One of the most common causes is misspelling the function or variable name. Especially with longer
names or names containing similar characters (such as the letter l and numeral one), it is easy to
make mistakes and hard to detect them.

Often, when you misspell a MATLAB function, a suggested function name appears in the Command
Window. For example, this command fails because it includes an uppercase letter in the function
name:

accumArray

Undefined function or variable 'accumArray'.

Did you mean:
>> accumarray

When this happens, press Enter to execute the suggested command or Esc to dismiss it.

Verify Inputs Correspond to the Function Syntax

Object methods are typically called using function syntax: for instance method(object,inputs).
Alternatively, they can be called using dot notation: for instance object.method(inputs). One
common error is to mix these syntaxes. For instance, you might call the method using function syntax
but provide the inputs as you would with dot notation, leaving out the object: for instance,
method(inputs). To avoid this, when calling an object method, make sure you specify the object
first, either through the first input of function syntax or through the first identifier of dot notation.
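
As an illustration, the sketch below uses a containers.Map object, chosen here only because it ships with MATLAB and has simple methods. It calls the same methods with both syntaxes, then shows that leaving out the object raises an error.

```matlab
m = containers.Map({'a', 'b'}, [1 2]);

k1   = keys(m);          % function syntax: object is the first input
k2   = m.keys();         % dot notation: object is the first identifier
hasA = isKey(m, 'a');    % function syntax with an additional input

% Mixed-up call: function syntax, but the object is missing.
callFailed = false;
try
    isKey('a');
catch err
    callFailed = true;
    disp(err.message)
end
```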

Make Sure Function Name Matches File Name

When you write a function, you establish its name when you write its function definition line. This
name should always match the name of the file you save it to. For example, if you create a function
named curveplot,

function curveplot(xVal, yVal)
   - program code -

then you should name the file containing that function curveplot.m. If you create a pcode file for
the function, then name that file curveplot.p. In the case of conflicting function and file names, the
file name overrides the name given to the function. In this example, if you save the curveplot
function to a file named curveplotfunction.m, then attempts to invoke the function using the
function name will fail:
curveplot
Undefined function or variable 'curveplot'.

If you encounter this problem, change either the function name or file name so that they are the
same.

To locate the file that defines this function, use the MATLAB Find Files utility as follows:
1 On the Home tab, in the File section, click Find Files.
2 Under Find files named, enter *.m
3 Under Find files containing text, enter the function name.
4 Click the Find button.

Make Sure Necessary Toolbox Is Installed and Correct Version

If you are unable to use a built-in function from MATLAB or its toolboxes, make sure that the function
is installed and is the correct version.

If you do not know which toolbox contains the function you need, search for the function
documentation at https://www.mathworks.com/help. The toolbox name appears at the top of the
function reference page. Alternatively, for steps to identify toolboxes that a function depends on, see
“Identify Program Dependencies” on page 25-2.

Once you know which toolbox the function belongs to, use the ver function to see which toolboxes
are installed on the system from which you run MATLAB. The ver function displays a list of all
currently installed MathWorks® products. If you can locate the toolbox you need in the output
displayed by ver, then the toolbox is installed. If you cannot, you need to install it in order to use it.
For help with installing MathWorks products, see “Install License Manager on License Server”.
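
The check against ver can also be done in code. In this sketch the toolbox name is only an example and the variable names are illustrative; ver returns a structure array whose Name field holds each installed product name.

```matlab
v = ver;                      % one structure element per installed product
installed = {v.Name};         % cell array of product names

% Example product name; replace it with the toolbox you need.
toolbox = 'Signal Processing Toolbox';

isInstalled = any(strcmp(installed, toolbox));
fprintf('%s installed: %d\n', toolbox, isInstalled);
```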


Verify Path Used to Access Function Toolbox

Tip If you have a custom file path, this step will delete it.

The MATLAB search path is a subset of all the folders in the file system. MATLAB uses the search
path to locate files used with MathWorks products efficiently. For more information, see “What Is the
MATLAB Search Path?”.

If the function you are attempting to use is part of a toolbox, then verify that the toolbox is available
using ver.

Because MATLAB stores the toolbox information in a cache file, you need to first update this cache
and then reset the path.

1 On the Home tab, in the Environment section, click Preferences.

  The Preferences dialog box appears.
2 On the MATLAB > General page, select Update Toolbox Path Cache.
3 On the Home tab, in the Environment section, select Set Path.

  The Set Path dialog box opens.
4 Select Default.

  A small dialog box opens warning that you will lose your current path settings if you proceed.
  Select Yes if you decide to proceed.

Run ver to see if the toolbox is installed. If not, you may need to reinstall this toolbox to use this
function. For more information about installing a toolbox, see “How do I install additional toolboxes
into an existing installation of MATLAB”.

Once ver shows your toolbox, run the following command to see if you can find the function:

which -all <functionname>

replacing <functionname> with the name of the function. If MATLAB finds your function file, it
presents you with the path to it. You can add the folder that contains that file to the path using the
addpath function. If MATLAB does not find the file, make sure the necessary toolbox is installed and
that it is the correct version.

Confirm The License Is Active

If you are unable to use a built-in function from a MATLAB toolbox and have confirmed that the
toolbox is installed, make sure that you have an active license for that toolbox. Use license to
display currently active licenses. For additional support for managing licenses, see “Manage Your
Licenses”.
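
The last two checks can be scripted as a rough sketch. It uses accumarray as the function to look for and the base product feature name 'MATLAB' for the license test; feature names for individual toolboxes differ and are not shown here.

```matlab
fname = 'accumarray';

% which returns an empty character vector when nothing is found.
location = which(fname);
isFound  = ~isempty(location);

% license('test', feature) returns 1 if a license exists for the feature.
hasLicense = license('test', 'MATLAB');

fprintf('%s found: %d, license available: %d\n', fname, isFound, hasLicense);
```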

2

Program Components

• “MATLAB Operators and Special Characters” on page 2-2


• “Array vs. Matrix Operations” on page 2-20
• “Compatible Array Sizes for Basic Operations” on page 2-25
• “Array Comparison with Relational Operators” on page 2-29
• “Operator Precedence” on page 2-32
• “Average Similar Data Points Using a Tolerance” on page 2-34
• “Group Scattered Data Using a Tolerance” on page 2-36
• “Bit-Wise Operations” on page 2-38
• “Perform Cyclic Redundancy Check” on page 2-44
• “Conditional Statements” on page 2-47
• “Loop Control Statements” on page 2-49
• “Regular Expressions” on page 2-51
• “Lookahead Assertions in Regular Expressions” on page 2-63
• “Tokens in Regular Expressions” on page 2-66
• “Dynamic Regular Expressions” on page 2-72
• “Comma-Separated Lists” on page 2-79
• “Alternatives to the eval Function” on page 2-86

MATLAB Operators and Special Characters


This page contains a comprehensive listing of all MATLAB operators, symbols, and special characters.

Arithmetic Operators
Symbol   Role                           More Information
+        Addition                       plus
+        Unary plus                     uplus
-        Subtraction                    minus
-        Unary minus                    uminus
.*       Element-wise multiplication    times
*        Matrix multiplication          mtimes
./       Element-wise right division    rdivide
/        Matrix right division          mrdivide
.\       Element-wise left division     ldivide
\        Matrix left division           mldivide
         (also known as backslash)
.^       Element-wise power             power
^        Matrix power                   mpower
.'       Transpose                      transpose
'        Complex conjugate transpose    ctranspose
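
The difference between the element-wise and matrix forms in this table is easiest to see on a small example. The following sketch, with illustrative variable names, contrasts a few of the paired operators.

```matlab
A = [1 2; 3 4];

elementwise = A .* A;     % multiply matching elements
matrixProd  = A * A;      % linear-algebra matrix product

z  = [1+2i, 3];
zt = z.';                 % transpose only
zh = z';                  % transpose and complex conjugate

b = [5; 11];
x = A \ b;                % solve A*x = b with matrix left division
```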

Relational Operators
Symbol   Role                        More Information
==       Equal to                    eq
~=       Not equal to                ne
>        Greater than                gt
>=       Greater than or equal to    ge
<        Less than                   lt
<=       Less than or equal to       le

Logical Operators
Symbol   Role                                        More Information
&        Find logical AND                            and
|        Find logical OR                             or
&&       Find logical AND (with short-circuiting)    Logical Operators: Short-Circuit && ||
||       Find logical OR (with short-circuiting)     Logical Operators: Short-Circuit && ||
~        Find logical NOT                            not
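
A brief sketch of how the element-wise and short-circuit forms differ in practice: & and | operate on whole arrays, while && and || work on scalar conditions and skip the second operand when the first already decides the result.

```matlab
% Element-wise operators work on arrays.
mask   = [1 0 1] & [1 1 0];     % [1 0 0]
either = [1 0 1] | [0 0 1];     % [1 0 1]
flip   = ~mask;                 % [0 1 1]

% Short-circuit AND: x(1) is never evaluated because x is empty,
% so no indexing error occurs.
x = [];
isPositive = ~isempty(x) && x(1) > 0;
```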

Special Characters
@ Name: At symbol

Uses:

• Function handle construction and reference


• Calling superclass methods

Description: The @ symbol forms a handle to either the named function that follows the @
sign, or to the anonymous function that follows the @ sign. You can also use @ to call
superclass methods from subclasses.

Examples

Create a function handle to a named function:

fhandle = @myfun

Create a function handle to an anonymous function:

fhandle = @(x,y) x.^2 + y.^2;

Call the disp method of MySuper from a subclass:

disp@MySuper(obj)

Call the superclass constructor from a subclass using the object being constructed:

obj = obj@MySuper(arg1,arg2,...)

More Information:

• “Create Function Handle” on page 13-2


• “Call Superclass Methods on Subclass Objects”
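
The function handle uses of @ can be tried without defining any files. This sketch uses the built-in sin and numel functions in place of the placeholder myfun; the variable names are illustrative.

```matlab
% Handle to an anonymous function, then call it like a function.
sumSquares = @(x,y) x.^2 + y.^2;
r = sumSquares(3, 4);

% Handle to a named function.
h = @sin;
s = h(0);
name = func2str(h);             % text form of the handle

% Pass a handle as an argument to another function.
lengths = cellfun(@numel, {'ab', 'cde'});
```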


. Name: Period or dot

Uses:

• Decimal point
• Element-wise operations
• Structure field access
• Object property or method specifier

Description: The period character separates the integral and fractional parts of a
number, such as 3.1415. MATLAB operators that contain a period always work element-
wise. The period character also enables you to access the fields in a structure, as well as
the properties and methods of an object.

Examples

Decimal point:

102.5543

Element-wise operations:

A.*B
A.^2

Structure field access:

myStruct.f1

Object property specifier:

myObj.PropertyName

More Information

• “Array vs. Matrix Operations” on page 2-20


• “Structures”
• “Access Property Values”
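
A compact sketch of the period in its structure and element-wise roles follows. The field names are made up for illustration, and the last line uses dynamic field access with a name stored in a variable, which is an addition beyond the examples above.

```matlab
% Structure field access.
myStruct.f1 = 5;
myStruct.f2 = 'text';
value = myStruct.f1;

% Element-wise operations.
A = [1 2 3];
B = [4 5 6];
products = A .* B;
squares  = A .^ 2;

% Field access using a name held in a variable (dynamic field name).
fieldName = 'f2';
dynamicValue = myStruct.(fieldName);
```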


... Name: Dot dot dot or ellipsis

Uses: Line continuation

Description: Three or more periods at the end of a line continue the current command
on the next line. If three or more periods occur before the end of a line, then MATLAB
ignores the rest of the line and continues to the next line. This effectively makes a
comment out of anything on the current line that follows the three periods.

Note MATLAB interprets the ellipsis as a space character. Therefore, multi-line
commands must be valid as a single line with the ellipsis replaced by a space character.

Examples

Continue a function call on the next line:

sprintf(['The current value '...
'of %s is %d'],vname,value)

Break a character vector up on multiple lines and concatenate the lines together:

S = ['If three or more periods occur before the '...
'end of a line, then the rest of that line is ' ...
'ignored and MATLAB continues to the next line']

To comment out one line in a multiline command, use ... at the beginning of the line to
ensure that the command remains complete. If you use % to comment out a line, it
produces an error:

y = 1 +...
2 +...
% 3 +...
4;

However, this code runs properly since the third line does not produce a gap in the
command:

y = 1 +...
2 +...
... 3 +...
4;

More Information

• “Continue Long Statements on Multiple Lines” on page 1-2


, Name: Comma

Uses: Separator

Description: Use commas to separate row elements in an array, array subscripts,
function input and output arguments, and commands entered on the same line.

Examples

Separate row elements to create an array:

A = [12,13; 14,15]

Separate subscripts:

A(1,2)

Separate input and output arguments in function calls:

[Y,I] = max(A,[],2)

Separate multiple commands on the same line (showing output):

figure, plot(sin(-pi:0.1:pi)), grid on

More Information

• horzcat


: Name: Colon

Uses:

• Vector creation
• Indexing
• For-loop iteration

Description: Use the colon operator to create regularly spaced vectors, index into
arrays, and define the bounds of a for loop.

Examples

Create a vector:

x = 1:10

Create a vector that increments by 3:

x = 1:3:19

Reshape a matrix into a column vector:

A(:)

Assign new elements without changing the shape of an array:

A = rand(3,4);
A(:) = 1:12;

Index a range of elements in a particular dimension:

A(2:5,3)

Index all elements in a particular dimension:

A(:,3)

Specify the bounds of a for loop:

x = 1;
for k = 1:25
    x = x + x^2;
end

More Information

• colon
• “Creating, Concatenating, and Expanding Matrices”


; Name: Semicolon

Uses:

• Signify end of row
• Suppress output of code line

Description: Use semicolons to separate rows in an array creation command, or to
suppress the output display of a line of code.

Examples

Separate rows to create an array:

A = [12,13; 14,15]

Suppress code output:

Y = max(A);

Separate multiple commands on a single line (suppressing output):

A = 12.5; B = 42.7, C = 1.25;


B =
42.7000

More Information

• vertcat


( ) Name: Parentheses

Uses:

• Operator precedence
• Function argument enclosure
• Indexing

Description: Use parentheses to specify precedence of operations, enclose function input
arguments, and index into an array.

Examples

Precedence of operations:

(A.*(B./C)) - D

Function argument enclosure:

plot(X,Y,'r*')
C = union(A,B)

Indexing:

A(3,:)
A(1,2)
A(1:5,1)

More Information

• “Operator Precedence” on page 2-32
• “Array Indexing”


[ ] Name: Square brackets

Uses:

• Array construction
• Array concatenation
• Empty matrix and array element deletion
• Multiple output argument assignment

Description: Square brackets enable array construction and concatenation, creation of
empty matrices, deletion of array elements, and capturing values returned by a function.

Examples

Construct a three-element vector:

X = [10 12 -3]

Add a new bottom row to a matrix:

A = rand(3);
A = [A; 10 20 30]

Create an empty matrix:

A = []

Delete a matrix column:

A(:,1) = []

Capture three output arguments from a function:

[C,iA,iB] = union(A,B)

More Information

• “Creating, Concatenating, and Expanding Matrices”
• horzcat
• vertcat


{ } Name: Curly brackets

Uses: Cell array assignment and contents

Description: Use curly braces to construct a cell array, or to access the contents of a
particular cell in a cell array.

Examples

To construct a cell array, enclose all elements of the array in curly braces:

C = {[2.6 4.7 3.9], rand(8)*6, 'C. Coolidge'}

Index to a specific cell array element by enclosing all indices in curly braces:

A = C{4,7,2}
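
As a minimal sketch of the difference between the two kinds of indexing (the variable names name, sub, and firstVal are illustrative and not part of the original example), curly braces return the contents of a cell, while parentheses return a smaller cell array:

```matlab
% Build a 1-by-3 cell array holding a numeric vector, a matrix, and text.
C = {[2.6 4.7 3.9], rand(8)*6, 'C. Coolidge'};

name = C{3};        % curly braces: contents of the third cell (a char vector)
sub  = C(3);        % parentheses: a 1-by-1 cell array that wraps that text
firstVal = C{1}(2)  % index into the contents of the first cell
```

Here firstVal is the second element of the numeric vector stored in the first cell.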

More Information

• “Cell Arrays”
% Name: Percent

Uses:

• Comment
• Conversion specifier

Description: The percent sign is most commonly used to indicate nonexecutable text
within the body of a program. This text is normally used to include comments in your
code.

Some functions also interpret the percent sign as a conversion specifier.

Two percent signs, %%, serve as a cell delimiter as described in “Create and Run Sections
in Code” on page 18-5.

Examples

Add a comment to a block of code:

% The purpose of this loop is to compute
% the value of ...

Use conversion specifier with sprintf:

sprintf('%s = %d', name, value)

More Information

• “Add Comments to Code” on page 18-3


%{ %} Name: Percent curly bracket

Uses: Block comments

Description: The %{ and %} symbols enclose a block of comments that extend beyond
one line.

Note With the exception of whitespace characters, the %{ and %} operators must appear
alone on the lines that immediately precede and follow the block of help text. Do not
include any other text on these lines.

Examples

Enclose any multiline comments with percent followed by an opening or closing brace:

%{
The purpose of this routine is to compute
the value of ...
%}

More Information

• “Add Comments to Code” on page 18-3


! Name: Exclamation point

Uses: Operating system command

Description: The exclamation point precedes operating system commands that you want
to execute from within MATLAB.

Not available in MATLAB Online™.

Examples

The exclamation point initiates a shell escape function. Such a function is to be performed
directly by the operating system:

!rmdir oldtests

More Information

• “Shell Escape Function Example”


? Name: Question mark

Uses: Metaclass for MATLAB class

Description: The question mark retrieves the meta.class object for a particular class
name. The ? operator works only with a class name, not an object.

Examples

Retrieve the meta.class object for class inputParser:

?inputParser
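
A short sketch of how the result might be inspected, and of the metaclass function as the counterpart for an existing object (the variable names mc, p, and mc2 are illustrative):

```matlab
mc = ?inputParser;      % meta.class object obtained from the class name
class(mc)               % the returned object has class meta.class
mc.Name                 % name of the class that the object describes

p = inputParser;        % ? does not accept an object, so use metaclass here
mc2 = metaclass(p);
strcmp(mc.Name, mc2.Name)
```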

More Information

• metaclass
'' Name: Single quotes

Uses: Character array constructor

Description: Use single quotes to create character vectors that have class char.

Examples

Create a character vector:

chr = 'Hello, world'

More Information

• “Text in String and Character Arrays” on page 6-2


"" Name: Double quotes

Uses: String constructor

Description: Use double quotes to create string scalars that have class string.

Examples

Create a string scalar:

S = "Hello, world"

More Information

• “Text in String and Character Arrays” on page 6-2


N/A Name: Space character

Uses: Separator

Description: Use the space character to separate row elements in an array constructor,
or the values returned by a function. In these contexts, the space character and comma
are equivalent.

Examples

Separate row elements to create an array:

% These statements are equivalent
A = [12 13; 14 15]
A = [12,13; 14,15]

Separate output arguments in function calls:

% These statements are equivalent
[Y I] = max(A)
[Y,I] = max(A)
N/A Name: Newline character

Uses: Separator

Description: Use the newline character to separate rows in an array construction
statement. In that context, the newline character and semicolon are equivalent.

Examples

Separate rows in an array creation command:

% These statements are equivalent
A = [12 13
     14 15]
A = [12 13; 14 15]


~ Name: Tilde

Uses:

• Logical NOT
• Argument placeholder

Description: Use the tilde symbol to represent logical NOT or to suppress specific input
or output arguments.

Examples

Calculate the logical NOT of a matrix:

A = eye(3);
~A

Determine where the elements of A are not equal to those of B:

A = [1 -1; 0 1]
B = [1 -2; 3 2]
A~=B

Return only the third output value of union:

[~,~,iB] = union(A,B)

More Information

• not
• “Ignore Inputs in Function Definitions” on page 21-10
• “Ignore Function Outputs” on page 1-4
= Name: Equal sign

Uses: Assignment

Description: Use the equal sign to assign values to a variable. The syntax B = A stores
the elements of A in variable B.

Note The = character is for assignment, whereas the == character is for comparing the
elements in two arrays. See eq for more information.

Examples

Create a matrix A. Assign the values in A to a new variable, B. Lastly, assign a new value
to the first element in B.

A = [1 0; -1 0];
B = A;
B(1) = 200;


< & Name: Left angle bracket and ampersand

Uses: Specify superclasses

Description: Specify one or more superclasses in a class definition

Examples

Define a class that derives from one superclass:

classdef MyClass < MySuperclass



end

Define a class that derives from multiple superclasses:

classdef MyClass < Superclass1 & Superclass2 & …



end

More Information:

• “Subclass Syntax”
.? Name: Dot question mark

Uses: Specify fields of name-value structure

Description:

When using function argument validation, you can define the fields of the name-value
structure as the names of all writeable properties of the class.

Examples

Specify the field names of the propArgs structure as the writeable properties of the
matlab.graphics.primitive.Line class.

function f(propArgs)
arguments
propArgs.?matlab.graphics.primitive.Line
end
% Function code
...
end

More Information:

• “Name-Value Arguments from Class Properties” on page 26-14

String and Character Formatting


Some special characters can only be used in the text of a character vector or string. You can use
these special characters to insert new lines or carriage returns, specify folder paths, and more.

Use the special characters in this table to specify a folder path using a character vector or string.


/ \ Name: Slash and Backslash

Uses: File or folder path separation

Description: In addition to their use as mathematical operators, the slash and backslash
characters separate the elements of a path or folder. On Microsoft Windows based
systems, both slash and backslash have the same effect. On The Open Group UNIX based
systems, you must use slash only.

Examples

On a Windows system, you can use either backslash or slash:

dir([matlabroot '\toolbox\matlab\elmat\shiftdim.m'])
dir([matlabroot '/toolbox/matlab/elmat/shiftdim.m'])

On a UNIX system, use only the forward slash:

dir([matlabroot '/toolbox/matlab/elmat/shiftdim.m'])
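
As a hedged alternative to hard-coding either separator, the following sketch builds the same path with fullfile, which inserts the separator for the current platform. The variable names p and sep are illustrative.

```matlab
% fullfile joins path elements using the platform-specific separator.
p = fullfile(matlabroot, 'toolbox', 'matlab', 'elmat', 'shiftdim.m');
sep = filesep;      % the separator character used on the current platform
dir(p)
```

Written this way, the same line can be used unchanged on Windows and UNIX systems.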
.. Name: Dot dot

Uses: Parent folder

Description: Two dots in succession refer to the parent of the current folder. Use this
notation to specify folder paths relative to the current folder.

Examples

To go up two levels in the folder tree and down into the test folder, use:

cd ..\..\test

More Information

• cd
* Name: Asterisk

Uses: Wildcard character

Description: In addition to being the symbol for matrix multiplication, the asterisk * is
used as a wildcard character.

Wildcards are generally used in file operations that act on multiple files or folders.
MATLAB matches all characters in the name exactly except for the wildcard character *,
which can match any one or more characters.

Examples

Locate all files with names that start with january_ and have a .mat file extension:

dir('january_*.mat')


@ Name: At symbol

Uses: Class folder indicator

Description: An @ sign indicates the name of a class folder.

Examples

Refer to a class folder:

\@myClass\get.m

More Information

• “Class and Path Folders”


+ Name: Plus

Uses: Package folder indicator

Description: A + sign indicates the name of a package folder.

Examples

Package folders always begin with the + character:

+mypack
+mypack/pkfcn.m % a package function
+mypack/@myClass % class folder in a package

More Information

• “Packages Create Namespaces”

There are certain special characters that you cannot enter as ordinary text. Instead, you must use
unique character sequences to represent them. Use the symbols in this table to format strings and
character vectors on their own or in conjunction with formatting functions like compose, sprintf,
and error. For more information, see “Formatting Text” on page 6-24.

Symbol Effect on Text


'' Single quotation mark
%% Single percent sign
\\ Single backslash
\a Alarm
\b Backspace
\f Form feed
\n New line
\r Carriage return
\t Horizontal tab
\v Vertical tab
\xN Hexadecimal number, N

\N Octal number, N
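
The sequences in this table can be tried with sprintf, as in the sketch below. The variable names and sample text are illustrative only.

```matlab
% Two single quotes inside a character vector yield one quotation mark.
q = 'It''s done';

% sprintf interprets %% and the backslash sequences in the format text.
pct  = sprintf('100%% complete');   % one percent sign
cols = sprintf('x\ty\n1\t2');       % tab-separated values on two lines
bs   = sprintf('C:\\temp');         % one backslash
hexA = sprintf('\x41');             % character with hexadecimal code 41
octA = sprintf('\101');             % character with octal code 101
```

Hexadecimal 41 and octal 101 both equal decimal 65, so hexA and octA each contain the letter A.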

See Also

More About
• “Array vs. Matrix Operations” on page 2-20
• “Array Comparison with Relational Operators” on page 2-29
• “Compatible Array Sizes for Basic Operations” on page 2-25
• “Operator Precedence” on page 2-32
• “Find Array Elements That Meet a Condition” on page 5-2
• “Greek Letters and Special Characters in Chart Text”


Array vs. Matrix Operations


In this section...
“Introduction” on page 2-20
“Array Operations” on page 2-20
“Matrix Operations” on page 2-22

Introduction
MATLAB has two different types of arithmetic operations: array operations and matrix operations.
You can use these arithmetic operations to perform numeric computations, for example, adding two
numbers, raising the elements of an array to a given power, or multiplying two matrices.

Matrix operations follow the rules of linear algebra. By contrast, array operations execute element by
element operations and support multidimensional arrays. The period character (.) distinguishes the
array operations from the matrix operations. However, since the matrix and array operations are the
same for addition and subtraction, the character pairs .+ and .- are unnecessary.

Array Operations
Array operations execute element by element operations on corresponding elements of vectors,
matrices, and multidimensional arrays. If the operands have the same size, then each element in the
first operand gets matched up with the element in the same location in the second operand. If the
operands have compatible sizes, then each input is implicitly expanded as needed to match the size of
the other. For more information, see “Compatible Array Sizes for Basic Operations” on page 2-25.

As a simple example, you can add two vectors with the same size.
A = [1 1 1]

A =

1 1 1

B = [1 2 3]

B =

1 2 3

A+B

ans =

2 3 4

If one operand is a scalar and the other is not, then MATLAB implicitly expands the scalar to be the
same size as the other operand. For example, you can compute the element-wise product of a scalar
and a matrix.
A = [1 2 3; 1 2 3]

A =


1 2 3
1 2 3

3.*A

ans =

3 6 9
3 6 9

Implicit expansion also works if you subtract a 1-by-3 vector from a 3-by-3 matrix because the two
sizes are compatible. When you perform the subtraction, the vector is implicitly expanded to become
a 3-by-3 matrix.

A = [1 1 1; 2 2 2; 3 3 3]

A =

1 1 1
2 2 2
3 3 3

m = [2 4 6]

m =

2 4 6

A - m

ans =

-1 -3 -5
0 -2 -4
1 -1 -3

A row vector and a column vector have compatible sizes. If you add a 1-by-3 vector to a 2-by-1 vector,
then each vector implicitly expands into a 2-by-3 matrix before MATLAB executes the element-wise
addition.

x = [1 2 3]

x =

1 2 3

y = [10; 15]

y =

10
15

x + y

ans =

11 12 13
16 17 18


If the sizes of the two operands are incompatible, then you get an error.
A = [8 1 6; 3 5 7; 4 9 2]

A =

8 1 6
3 5 7
4 9 2

m = [2 4]

m =

2 4

A - m

Matrix dimensions must agree.

The following table provides a summary of arithmetic array operators in MATLAB. For function-
specific information, click the link to the function reference page in the last column.

Operator  Purpose                      Description                                                   Reference Page
+         Addition                     A+B adds A and B.                                             plus
+         Unary plus                   +A returns A.                                                 uplus
-         Subtraction                  A-B subtracts B from A.                                       minus
-         Unary minus                  -A negates the elements of A.                                 uminus
.*        Element-wise multiplication  A.*B is the element-by-element product of A and B.            times
.^        Element-wise power           A.^B is the matrix with elements A(i,j) to the B(i,j) power.  power
./        Right array division         A./B is the matrix with elements A(i,j)/B(i,j).               rdivide
.\        Left array division          A.\B is the matrix with elements B(i,j)/A(i,j).               ldivide
.'        Array transpose              A.' is the array transpose of A. For complex matrices,        transpose
                                       this does not involve conjugation.
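
To illustrate the note on conjugation, this sketch compares the array transpose with the complex conjugate transpose of a small complex matrix. The matrix Z and the variable names are illustrative.

```matlab
Z = [1+2i 3-1i; 4 5i];
T  = Z.';      % array transpose: rows and columns swap, values are unchanged
CT = Z';       % complex conjugate transpose: also conjugates each element
isequal(CT, conj(T))
```

For real matrices, the two operators return the same result.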

Matrix Operations
Matrix operations follow the rules of linear algebra and are not compatible with multidimensional
arrays. The required size and shape of the inputs in relation to one another depends on the operation.
For nonscalar inputs, the matrix operators generally calculate different answers than their array
operator counterparts.

For example, if you use the matrix right division operator, /, to divide two matrices, the matrices
must have the same number of columns. But if you use the matrix multiplication operator, *, to
multiply two matrices, then the matrices must have a common inner dimension. That is, the number
of columns in the first input must be equal to the number of rows in the second input. The matrix
multiplication operator calculates the product of two matrices with the formula,

$$C(i,j) = \sum_{k=1}^{n} A(i,k)\,B(k,j).$$

To see this, you can calculate the product of two matrices.

A = [1 3;2 4]

A =

1 3
2 4

B = [3 0;1 5]

B =

3 0
1 5

A*B

ans =

6 15
10 20

The previous matrix product is not equal to the following element-wise product.

A.*B

ans =

3 0
2 20
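
The summation formula for the matrix product can also be written out with explicit loops. This is a sketch for illustration only, not how MATLAB computes A*B internally; the loop variables and size names are assumptions.

```matlab
A = [1 3; 2 4];
B = [3 0; 1 5];
[m, n] = size(A);           % n is the common inner dimension
p = size(B, 2);
C = zeros(m, p);
for i = 1:m
    for j = 1:p
        for k = 1:n
            C(i,j) = C(i,j) + A(i,k)*B(k,j);   % sum over the inner dimension
        end
    end
end
isequal(C, A*B)             % the loops reproduce the matrix product
```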

The following table provides a summary of matrix arithmetic operators in MATLAB. For function-
specific information, click the link to the function reference page in the last column.

Operator  Purpose                      Description                                         Reference Page
*         Matrix multiplication        C = A*B is the linear algebraic product of the      mtimes
                                       matrices A and B. The number of columns of A
                                       must equal the number of rows of B.
\         Matrix left division         x = A\B is the solution to the equation Ax = B.     mldivide
                                       Matrices A and B must have the same number of
                                       rows.
/         Matrix right division        x = B/A is the solution to the equation xA = B.     mrdivide
                                       Matrices A and B must have the same number of
                                       columns. In terms of the left division operator,
                                       B/A = (A'\B')'.
^         Matrix power                 A^B is A to the power B, if B is a scalar. For      mpower
                                       other values of B, the calculation involves
                                       eigenvalues and eigenvectors.
'         Complex conjugate transpose  A' is the linear algebraic transpose of A. For      ctranspose
                                       complex matrices, this is the complex conjugate
                                       transpose.
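
The two division operators and the identity B/A = (A'\B')' can be checked numerically. The matrices below are reused from the multiplication example; the names X, Y, and Y2 are illustrative.

```matlab
A = [1 3; 2 4];
B = [3 0; 1 5];

X  = A\B;           % solves A*X = B
Y  = B/A;           % solves Y*A = B
Y2 = (A'\B')';      % the same right division written with left division

norm(A*X - B)       % each residual should be close to zero
norm(Y*A - B)
norm(Y - Y2)
```

Because of floating-point round-off, the residuals might not be exactly zero.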

See Also

More About
• “Compatible Array Sizes for Basic Operations” on page 2-25
• “MATLAB Operators and Special Characters” on page 2-2
• “Operator Precedence” on page 2-32


Compatible Array Sizes for Basic Operations


Most binary (two-input) operators and functions in MATLAB support numeric arrays that have
compatible sizes. Two inputs have compatible sizes if, for every dimension, the dimension sizes of the
inputs are either the same or one of them is 1. In the simplest cases, two array sizes are compatible if
they are exactly the same or if one is a scalar. MATLAB implicitly expands arrays with compatible
sizes to be the same size during the execution of the element-wise operation or function.

Inputs with Compatible Sizes


2-D Inputs

These are some combinations of scalars, vectors, and matrices that have compatible sizes:

• Two inputs which are exactly the same size.

• One input is a scalar.

• One input is a matrix, and the other is a column vector with the same number of rows.

• One input is a column vector, and the other is a row vector.
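
Each of these combinations can be tried directly. In this sketch (variable names are illustrative), every sum produces a 3-by-3 result:

```matlab
M = magic(3);        % 3-by-3 matrix
c = [10; 20; 30];    % 3-by-1 column vector
r = [1 2 3];         % 1-by-3 row vector

size(M + M)          % two inputs of exactly the same size
size(M + 5)          % one input is a scalar
size(M + c)          % matrix and column vector with the same number of rows
size(c + r)          % column vector and row vector
```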


Multidimensional Arrays

Every array in MATLAB has trailing dimensions of size 1. For multidimensional arrays, this means
that a 3-by-4 matrix is the same as a matrix of size 3-by-4-by-1-by-1-by-1. Examples of
multidimensional arrays with compatible sizes are:

• One input is a matrix, and the other is a 3-D array with the same number of rows and columns.

• One input is a matrix, and the other is a 3-D array. The dimensions are all either the same or one
of them is 1.
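
A small sketch of both cases, using illustrative array sizes:

```matlab
A = ones(3,4);
size(A,3)            % 1, because trailing dimensions have size 1

T = rand(3,4,5);     % same number of rows and columns as A
B = rand(3,1,5);     % each dimension either matches A or has size 1
size(A + T)          % 3-by-4-by-5
size(A + B)          % 3-by-4-by-5
```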

Empty Arrays

The rules are the same for empty arrays or arrays that have a dimension size of zero. The size of the
dimension that is not equal to 1 determines the size of the output. This means that dimensions with a
size of zero must be paired with a dimension of size 1 or 0 in the other array, and that the output has
a dimension size of 0.

A: 1-by-0
B: 3-by-1
Result: 3-by-0
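
The same sizes can be reproduced with a short sketch (the choice of zeros and ones is arbitrary):

```matlab
A = zeros(1,0);      % 1-by-0 empty array
B = ones(3,1);       % 3-by-1 column vector
R = A + B;
size(R)              % 3-by-0
```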

Inputs with Incompatible Sizes


Incompatible inputs have sizes that cannot be implicitly expanded to be the same size. For example:

• The sizes of one dimension are not equal, and neither size is 1.

A: 3-by-2
B: 4-by-2
• Two nonscalar row vectors with lengths that are not the same.

A: 1-by-3
B: 1-by-4
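
The compatibility rule can be expressed as a check on the two size vectors. This is an illustrative sketch of the rule as stated in this topic, not a MATLAB built-in; the names sa, sb, and compatible are assumptions.

```matlab
sa = [3 2];                      % size of A
sb = [4 2];                      % size of B

% Pad the shorter size vector with trailing ones so both have equal length.
n = max(numel(sa), numel(sb));
sa(end+1:n) = 1;
sb(end+1:n) = 1;

% Sizes are compatible only if every dimension pair is equal or contains a 1.
compatible = all(sa == sb | sa == 1 | sb == 1)
```

For the sizes 3-by-2 and 4-by-2 the check returns false, because the first dimension sizes differ and neither is 1.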

Examples
Subtract Vector from Matrix

To simplify vector-matrix operations, use implicit expansion with dimensional functions such as sum,
mean, min, and others.

For example, calculate the mean value of each column in a matrix, then subtract the mean value from
each element.

A = magic(3)

A =

8 1 6
3 5 7
4 9 2

C = mean(A)

C =

5 5 5

A - C

ans =

3 -4 1
-2 0 2
-1 4 -3

Add Row and Column Vector

Row and column vectors have compatible sizes, and when you perform an operation on them the
result is a matrix.


For example, add a row and column vector. The result is the same as bsxfun(@plus,a,b).

a = [1 2 3 4]

a =

1 2 3 4

b = [5; 6; 7]

b =

5
6
7

a + b

ans =

6 7 8 9
7 8 9 10
8 9 10 11

See Also
bsxfun

More About
• “Array vs. Matrix Operations” on page 2-20
• “MATLAB Operators and Special Characters” on page 2-2


Array Comparison with Relational Operators


In this section...
“Array Comparison” on page 2-29
“Logic Statements” on page 2-31

Relational operators compare operands quantitatively, using operators like “less than”, “greater
than”, and “not equal to.” The result of a relational comparison is a logical array indicating the
locations where the relation is true.

These are the relational operators in MATLAB.

Symbol  Function Equivalent  Description
<       lt                   Less than
<=      le                   Less than or equal to
>       gt                   Greater than
>=      ge                   Greater than or equal to
==      eq                   Equal to
~=      ne                   Not equal to

Array Comparison
Numeric Arrays

The relational operators perform element-wise comparisons between two arrays. The arrays must
have compatible sizes to facilitate the operation. Arrays with compatible sizes are implicitly expanded
to be the same size during execution of the calculation. In the simplest cases, the two operands are
arrays of the same size, or one is a scalar. For more information, see “Compatible Array Sizes for
Basic Operations” on page 2-25.

For example, if you compare two matrices of the same size, then the result is a logical matrix of the
same size with elements indicating where the relation is true.

A = [2 4 6; 8 10 12]

A =

2 4 6
8 10 12

B = [5 5 5; 9 9 9]

B =

5 5 5
9 9 9

A < B

ans =


1 1 0
1 0 0

Similarly, you can compare one of the arrays to a scalar.

A > 7

ans =

0 0 0
1 1 1

If you compare a 1-by-N row vector to an M-by-1 column vector, then MATLAB expands each vector
into an M-by-N matrix before performing the comparison. The resulting matrix contains the
comparison result for each combination of elements in the vectors.

A = 1:3

A =

1 2 3

B = [2; 3]

B =

2
3

A >= B

ans =

0 1 1
0 0 1

Empty Arrays

The relational operators work with arrays for which any dimension has size zero, as long as both
arrays have compatible sizes. This means that if one array has a dimension size of zero, then the size
of the corresponding dimension in the other array must be 1 or zero, and the size of that dimension in
the output is zero.

A = ones(3,0);
B = ones(3,1);
A == B

ans =

Empty matrix: 3-by-0

However, expressions such as

A == []

return an error if A is not 0-by-0 or 1-by-1. This behavior is consistent with that of all other binary
operators, such as +, -, >, <, &, |, and so on.

To test for empty arrays, use isempty(A).
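
A brief sketch of the difference, assuming a 3-by-0 array. The try/catch block is only there so that the sketch runs to completion; the variable names are illustrative.

```matlab
A = zeros(3,0);
tf = isempty(A)             % true, because one dimension has size zero

try
    result = (A == []);     % 3-by-0 and 0-by-0 are not compatible sizes
catch err
    disp(err.message)       % the comparison errors instead of returning a value
end
```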


Complex Numbers

• The operators >, <, >=, and <= use only the real part of the operands in performing comparisons.
• The operators == and ~= test both real and imaginary parts of the operands.
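
For instance, with illustrative values:

```matlab
z1 = 2 + 1i;
z2 = 1 + 100i;
gtReal = z1 > z2                    % true: only the real parts are compared (2 > 1)
sameVal = (3 + 4i) == (3 - 4i)      % false: the imaginary parts differ
```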

Inf, NaN, NaT, and undefined Element Comparisons

• Inf values are equal to other Inf values.
• NaN values are not equal to any other numeric value, including other NaN values.
• NaT values are not equal to any other datetime value, including other NaT values.
• Undefined categorical elements are not equal to any other categorical value, including other
undefined elements.
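
The following sketch shows these comparisons along with the functions typically used to test for such values (isnan, isnat, isundefined, and isequaln). The variable names and the categorical construction are illustrative.

```matlab
infEq = (Inf == Inf)                  % true
nanEq = (NaN == NaN)                  % false
nanTest = isnan(NaN)                  % true: test for NaN with isnan
nanSame = isequaln([1 NaN], [1 NaN])  % true: isequaln treats NaN values as equal

t = NaT;
natEq = (t == t)                      % false
natTest = isnat(t)                    % true

% The value 2 is not in the value set, so the second element is undefined.
c = categorical([1 2], 1, {'one'});
undefEq = (c(2) == c(2))              % false
undefTest = isundefined(c(2))         % true
```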

Logic Statements
Use relational operators in conjunction with the logical operators A & B (AND), A | B (OR),
xor(A,B) (XOR), and ~A (NOT), to string together more complex logical statements.

For example, you can locate where negative elements occur in two arrays.

A = [2 -1; -3 10]

A =

2 -1
-3 10

B = [0 -2; -3 -1]

B =

0 -2
-3 -1

A<0 & B<0

ans =

0 1
1 0

For more examples, see “Find Array Elements That Meet a Condition” on page 5-2.

See Also
gt | lt | ge | le | eq | ne

More About
• “Array vs. Matrix Operations” on page 2-20
• “Compatible Array Sizes for Basic Operations” on page 2-25
• “MATLAB Operators and Special Characters” on page 2-2


Operator Precedence
You can build expressions that use any combination of arithmetic, relational, and logical operators.
Precedence levels determine the order in which MATLAB evaluates an expression. Within each
precedence level, operators have equal precedence and are evaluated from left to right. The
precedence rules for MATLAB operators are shown in this list, ordered from highest precedence level
to lowest precedence level:

1 Parentheses ()
2 Transpose (.'), power (.^), complex conjugate transpose ('), matrix power (^)
3 Power with unary minus (.^-), unary plus (.^+), or logical negation (.^~) as well as matrix
power with unary minus (^-), unary plus (^+), or logical negation (^~).

Note Although most operators work from left to right, the operators (^-), (.^-), (^+), (.^+),
(^~), and (.^~) work from second from the right to left. It is recommended that you use
parentheses to explicitly specify the intended precedence of statements containing these
operator combinations.
4 Unary plus (+), unary minus (-), logical negation (~)
5 Multiplication (.*), right division (./), left division (.\), matrix multiplication (*), matrix
right division (/), matrix left division (\)
6 Addition (+), subtraction (-)
7 Colon operator (:)
8 Less than (<), less than or equal to (<=), greater than (>), greater than or equal to (>=),
equal to (==), not equal to (~=)
9 Element-wise AND (&)
10 Element-wise OR (|)
11 Short-circuit AND (&&)
12 Short-circuit OR (||)
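
A few one-line expressions illustrate how these levels interact. The variable names are illustrative.

```matlab
a = -2^2         % power binds tighter than unary minus: -(2^2) = -4
b = (-2)^2       % parentheses force the negation first: 4
c = 2^-2         % unary minus applies to the exponent: 2^(-2) = 0.25
v = 1:3+1        % addition precedes the colon operator: 1:(3+1)
t = 1+2 > 2*1    % arithmetic precedes relational operators: 3 > 2
```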

Precedence of AND and OR Operators


MATLAB always gives the & operator precedence over the | operator. Although MATLAB typically
evaluates expressions from left to right, the expression a|b&c is evaluated as a|(b&c). It is a good
idea to use parentheses to explicitly specify the intended precedence of statements containing
combinations of & and |.

The same precedence rule holds true for the && and || operators.
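
For example, with illustrative logical values, the grouping changes the result:

```matlab
a = true; b = false; c = false;
r1 = a | b & c         % evaluated as a | (b & c), which is true
r2 = (a | b) & c       % parentheses force the OR first, which gives false
r3 = a || b && c       % evaluated as a || (b && c), which is true
```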

Overriding Default Precedence


The default precedence can be overridden using parentheses, as shown in this example:
A = [3 9 5];
B = [2 1 5];
C = A./B.^2
C =
0.7500 9.0000 0.2000


C = (A./B).^2
C =
2.2500 81.0000 1.0000

See Also

More About
• “Array vs. Matrix Operations” on page 2-20
• “Compatible Array Sizes for Basic Operations” on page 2-25
• “Array Comparison with Relational Operators” on page 2-29
• “MATLAB Operators and Special Characters” on page 2-2


Average Similar Data Points Using a Tolerance


This example shows how to use uniquetol to find the average z-coordinate of 3-D points that have
similar (within tolerance) x and y coordinates.

Use random points picked from the peaks function in the domain [−3, 3] × [−3, 3] as the data set.
Add a small amount of noise to the data.

xy = rand(10000,2)*6-3;
z = peaks(xy(:,1),xy(:,2)) + 0.5-rand(10000,1);
A = [xy z];
plot3(A(:,1), A(:,2), A(:,3), '.')
view(-28,32)

Find points that have similar x and y coordinates using uniquetol with these options:

• Specify ByRows as true, since the rows of A contain the point coordinates.
• Specify OutputAllIndices as true to return the indices for all points that are within tolerance
of each other.
• Specify DataScale as [1 1 Inf] to use an absolute tolerance for the x and y coordinates, while
ignoring the z-coordinate.

DS = [1 1 Inf];
[C,ia] = uniquetol(A, 0.3, 'ByRows', true, ...
'OutputAllIndices', true, 'DataScale', DS);


Average each group of points that are within tolerance (including the z-coordinates), producing a
reduced data set that still holds the general shape of the original data.

aveA = zeros(length(ia), size(A,2));   % preallocate one row per group
for k = 1:length(ia)
    aveA(k,:) = mean(A(ia{k},:),1);    % average the rows in group k
end

Plot the resulting averaged-out points on top of the original data.

hold on
plot3(aveA(:,1), aveA(:,2), aveA(:,3), '.r', 'MarkerSize', 15)

See Also
uniquetol

More About
• “Group Scattered Data Using a Tolerance” on page 2-36


Group Scattered Data Using a Tolerance


This example shows how to group scattered data points based on their proximity to points of interest.

Create a set of random 2-D points. Then create and plot a grid of equally spaced points on top of the
random data.

x = rand(10000,2);
[a,b] = meshgrid(0:0.1:1);
gridPoints = [a(:), b(:)];
plot(x(:,1), x(:,2), '.')
hold on
plot(gridPoints(:,1), gridPoints(:,2), 'xr', 'Markersize', 6)

Use ismembertol to locate the data points in x that are within tolerance of the grid points in
gridPoints. Use these options with ismembertol:

• Specify ByRows as true, since the point coordinates are in the rows of x.
• Specify OutputAllIndices as true to return all of the indices for rows in x that are within
tolerance of the corresponding row in gridPoints.

[LIA,LocB] = ismembertol(gridPoints, x, 0.05, ...
    'ByRows', true, 'OutputAllIndices', true);

For each grid point, plot the points in x that are within tolerance of that grid point.


figure
hold on
for k = 1:length(LocB)
    plot(x(LocB{k},1), x(LocB{k},2), '.')
end
plot(gridPoints(:,1), gridPoints(:,2), 'xr', 'Markersize', 6)

See Also
ismembertol

More About
• “Average Similar Data Points Using a Tolerance” on page 2-34


Bit-Wise Operations
This topic shows how to use bit-wise operations in MATLAB® to manipulate the bits of numbers.
Operating on bits is directly supported by most modern CPUs. In many cases, manipulating the bits of
a number in this way is quicker than performing arithmetic operations like division or multiplication.

Number Representations

Any number can be represented with bits (also known as binary digits). The binary, or base 2, form of
a number contains 1s and 0s to indicate which powers of 2 are present in the number. For example,
the 8-bit binary form of 7 is

00000111

A collection of 8 bits is also called 1 byte. In binary representations, the bits are counted from the
right to the left, so the first bit in this representation is a 1. This number represents 7 because

$$2^2 + 2^1 + 2^0 = 7.$$

When you type numbers into MATLAB, it assumes the numbers are double precision (a 64-bit binary
representation). However, you can also specify single-precision numbers (32-bit binary
representation) and integers (signed or unsigned, from 8 to 64 bits). For example, the most
memory-efficient way to store the number 7 is with an 8-bit unsigned integer:

a = uint8(7)

a = uint8
7

You can even specify the binary form directly using the prefix 0b followed by the binary digits (for
more information, see “Hexadecimal and Binary Values” on page 6-55). MATLAB stores the number
in an integer format with the fewest number of bits. Instead of specifying all the bits, you need to
specify only the left-most 1 and all the digits to the right of it. The bits to the left of that bit are
trivially zero. So the number 7 is:

b = 0b111

b = uint8
7

MATLAB stores negative integers using two's complement. For example, consider the 8-bit signed
integer -8. To find the two's complement bit pattern for this number:

1 Start with the bit pattern of the positive version of the number, 8: 00001000.
2 Next, flip all of the bits: 11110111.
3 Finally, add 1 to the result: 11111000.

The result, 11111000, is the bit pattern for -8:

n = 0b11111000s8

n = int8
-8
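
The three steps can also be carried out in code. This sketch uses an unsigned 8-bit integer to hold the bit pattern and then reinterprets it as signed; the variable names are illustrative.

```matlab
p = uint8(8);                     % step 1: bit pattern 00001000
flipped = bitcmp(p);              % step 2: flip all bits, 11110111
pattern = flipped + 1;            % step 3: add 1, 11111000
bits = dec2bin(pattern)           % show the resulting bit pattern
n = typecast(pattern, 'int8')     % reinterpret the same bits as a signed integer
```

The final value of n is -8, matching the binary literal above.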


MATLAB does not natively display the binary format of numbers. For that, you can use the dec2bin
function, which returns a character vector of binary digits for positive integers. Again, this function
returns only the digits that are not trivially zero.
dec2bin(b)

ans =
'111'

You can use bin2dec to switch between the two formats. For example, you can convert the binary
digits 10110101 to decimal format with the commands
data = [1 0 1 1 0 1 0 1];
dec = bin2dec(num2str(data))

dec = 181

The cast and typecast functions are also useful to switch among different data types. These
functions are similar, but they differ in how they treat the underlying storage of the number:

• cast — Changes the underlying data type of a variable.


• typecast — Converts data types without changing the underlying bits.
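
The difference is easiest to see on a value that the target type cannot represent. This is a minimal sketch; the variable names are assumptions made for illustration.

```matlab
x = int8(-8);                    % bit pattern 11111000

% cast converts the value; -8 is out of range for uint8, so it saturates to 0.
byValue = cast(x,'uint8')        % 0

% typecast keeps the bits and reinterprets them as uint8.
byBits = typecast(x,'uint8')     % 248
bitsText = dec2bin(byBits)       % '11111000'
```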

Because MATLAB does not display the digits of a binary number directly, you must pay attention to
data types when you work with bit-wise operations. Some functions return binary digits as a
character vector (dec2bin), some return the decimal number (bitand), and others return a vector of
the bits themselves (bitget).

Bit Masking with Logical Operators

MATLAB has several functions that enable you to perform logical operations on the bits of two equal-
length binary representations of numbers, known as bit masking:

• bitand — If both digits are 1, then the resulting digit is also a 1. Otherwise, the resulting digit is
0.
• bitor — If either digit is 1, then the resulting digit is also a 1. Otherwise, the resulting digit is 0.
• bitxor — If the digits are different, then the resulting digit is a 1. Otherwise, the resulting digit
is 0.

In addition to these functions, the bit-wise complement is available with bitcmp, but this is a unary
operation that flips the bits in only one number at a time.
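
As a quick illustration of the four functions, the sketch below applies them to two arbitrarily chosen 4-bit patterns. The second argument of dec2bin pads the result to a fixed number of digits so the columns line up.

```matlab
a = 0b1100;                          % uint8 value 12
b = 0b1010;                          % uint8 value 10

andBits = dec2bin(bitand(a,b),4)     % '1000'
orBits  = dec2bin(bitor(a,b),4)      % '1110'
xorBits = dec2bin(bitxor(a,b),4)     % '0110'

% bitcmp acts on all 8 bits of the uint8 value, not just the low four.
cmpBits = dec2bin(bitcmp(a),8)       % '11110011'
```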

One use of bit masking is to query the status of a particular bit. For example, if you use a bit-wise
AND operation with the binary number 00001000, you can query the status of the fourth bit. You can
then shift that bit to the first position so that MATLAB returns a 0 or 1 (the next section describes bit
shifting in more detail).
n = 0b10111001;
n4 = bitand(n,0b1000);
n4 = bitshift(n4,-3)

n4 = uint8
1

Bit-wise operations can have surprising applications. For example, consider the 8-bit binary
representation of the number n = 8:


00001000

8 is a power of 2, so its binary representation contains a single 1. Now consider the number
n − 1 = 7:

00000111

By subtracting 1, all of the bits starting at the right-most 1 are flipped. As a result, when n is a power
of 2, corresponding digits of n and n − 1 are always different, and the bit-wise AND returns zero.

n = 0b1000;
bitand(n,n-1)

ans = uint8
0

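However, when n is not a power of 2, it contains more than one 1. Subtracting 1 changes only the
right-most 1 and the bits to its right, so n and n − 1 share every 1 to the left of that position. In the
next example the right-most 1 is the 2^0 bit, so n and n − 1 have all the same bits except for the 2^0
bit. For this case, the bit-wise AND returns a nonzero number.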

n = 0b101;
bitand(n,n-1)

ans = uint8
4

This operation suggests a simple function that operates on the bits of a given input number to check
whether the number is a power of 2:

function tf = isPowerOfTwo(n)
    tf = n && ~bitand(n,n-1);
end

The use of the short-circuit AND operator && checks to make sure that n is not zero. If it is, then the
function does not need to calculate bitand(n,n-1) to know that the correct answer is false.
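
A possible way to try the check on several values is sketched below. The same expression is written as an anonymous function so that the snippet runs as a script, and the test values are arbitrary. The values are checked one at a time because && accepts only scalar operands.

```matlab
isPowerOfTwo = @(n) n && ~bitand(n,n-1);

vals = uint8([0 1 2 3 8 12 64]);
tf = false(size(vals));
for k = 1:numel(vals)
    tf(k) = isPowerOfTwo(vals(k));
end
tf                                   % 0 1 1 0 1 0 1
```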

Shifting Bits

Because bit-wise logical operations compare corresponding bits in two numbers, it is useful to be able
to move the bits around to change which bits are compared. You can use bitshift to perform this
operation:

• bitshift(A,N) shifts the bits of A to the left by N digits. This is equivalent to multiplying A by
2^N, provided that no 1 bits are shifted off the left end.
• bitshift(A,-N) shifts the bits of A to the right by N digits. This is equivalent to dividing A by
2^N and rounding down to an integer.

These operations are sometimes written A<<N (left shift) and A>>N (right shift), but MATLAB does not
use << and >> operators for this purpose.

When the bits of a number are shifted, some bits fall off the end of the number, and 0s or 1s are
introduced to fill in the newly created space. When you shift bits to the left, the bits are filled in on
the right; when you shift bits to the right, the bits are filled in on the left.

For example, if you shift the bits of the number 8 (binary: 1000) to the right by one digit, you get 4
(binary: 100).


n = 0b1000;
bitshift(n,-1)

ans = uint8
4

Similarly, if you shift the number 15 (binary: 1111) to the left by two digits, you get 60 (binary:
111100).

n = 0b1111;
bitshift(n,2)

ans = uint8
60

When you shift the bits of a negative number, bitshift preserves the sign bit. For example, if you
shift the signed integer -3 (binary: 11111101) to the right by 2 digits, you get -1 (binary: 11111111).
In these cases, bitshift fills in on the left with 1s rather than 0s.

n = 0b11111101s8;
bitshift(n,-2)

ans = int8
-1
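
To see the effect of the sign bit, the sketch below shifts the same bit pattern once as a signed integer and once as an unsigned integer, and also shows bits being lost on a left shift. The variable names are illustrative.

```matlab
s = 0b11111101s8;                    % int8 value -3
u = typecast(s,'uint8');             % same bits as uint8: 253

signedShift = bitshift(s,-2)         % -1 (11111111): filled with 1s
unsignedShift = bitshift(u,-2)       % 63 (00111111): filled with 0s

% Shifting left drops bits that move past the most significant bit.
overflow = bitshift(0b11110000,1)    % 224 (11100000)
```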

Writing Bits

You can use the bitset function to change the bits in a number. For example, change the first bit of
the number 8 to a 1 (which adds 1 to the number):

bitset(8,1)

ans = 9

By default, bitset sets the specified bit to 1 (on). You can optionally use the third input argument to
specify the bit value (0 or 1).

bitset changes one bit of a number per call, so you need to use a for loop to change multiple bits.
Because each call names its bit explicitly, the bits you change can be either consecutive or
nonconsecutive. For example, change the first two bits of the binary number 1000:

bits = [1 2];
c = 0b1000;
for k = 1:numel(bits)
    c = bitset(c,bits(k));
end
dec2bin(c)

ans =
'1011'
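
When the bits to change are known in advance, a mask is a possible alternative to the loop. The sketch below assumes the same starting value and uses bitor to turn bits on and bitand with the complemented mask to turn them off again.

```matlab
c = 0b1000;
mask = 0b0011;                       % 1s mark the bits to change (bits 1 and 2)

c = bitor(c,mask);                   % turn the marked bits on
onBits = dec2bin(c)                  % '1011'

c = bitand(c,bitcmp(mask));          % turn the marked bits off
offBits = dec2bin(c)                 % '1000'
```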

Another common use of bitset is to convert a vector of binary digits into decimal format. For
example, use a loop to set the individual bits of the integer 11001101.

data = [1 1 0 0 1 1 0 1];
n = length(data);
dec = 0b0u8;
for k = 1:n
    dec = bitset(dec,n+1-k,data(k));
end
dec

dec = uint8
205

dec2bin(dec)

ans =
'11001101'
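
The same conversion can be cross-checked arithmetically, and bitget can recover the digits from the result. This is only a sketch of an equivalent calculation, not a replacement for the loop above.

```matlab
data = [1 1 0 0 1 1 0 1];
n = length(data);

% Weight each digit by its power of 2 (most significant digit first).
dec = sum(data .* 2.^(n-1:-1:0))         % 205

% bitget recovers the digits, again most significant first.
digits = bitget(uint8(dec),n:-1:1)       % 1 1 0 0 1 1 0 1
```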

Reading Consecutive Bits

Another use of bit shifting is to isolate consecutive sections of bits. For example, read the last four
bits in the 16-bit number 0110000010100000. Recall that the last four bits are on the left of the
binary representation.

n = 0b0110000010100000;
dec2bin(bitshift(n,-12))

ans =
'110'

To isolate consecutive bits in the middle of the number, you can combine the use of bit shifting with
logical masking. For example, to extract the 13th and 14th bits, you can shift the bits to the right by
12 and then mask the resulting four bits with 0011. Because the inputs to bitand must be the same
integer data type, you can specify 0011 as an unsigned 16-bit integer with 0b11u16. Without the
u16 suffix, MATLAB stores the number as an unsigned 8-bit integer.

m = 0b11u16;
dec2bin(bitand(bitshift(n,-12),m))

ans =
'10'
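
The shift-and-mask steps generalize to any run of consecutive bits. The helper below is hypothetical (extractBits is not a MATLAB function); it builds the mask from the field width and casts it to the type of the input.

```matlab
% Hypothetical helper: return bits lo through hi of x as a number.
% The mask 2^(hi-lo+1)-1 has one 1 for each bit in the field.
extractBits = @(x,lo,hi) bitand(bitshift(x,-(lo-1)), ...
    cast(2^(hi-lo+1)-1,'like',x));

n = 0b0110000010100000;                  % stored as uint16
upper = dec2bin(extractBits(n,13,14))    % '10'
middle = dec2bin(extractBits(n,5,8))     % '1010'
```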

Another way to read consecutive bits is with bitget, which reads specified bits from a number. You
can use colon notation to specify several consecutive bits to read. For example, read bits 16 down
to 8 of n (the last nine bits).

bitget(n,16:-1:8)

ans = 1x9 uint16 row vector

0 1 1 0 0 0 0 0 1

Reading Nonconsecutive Bits

You can also use bitget to read bits from a number when the bits are not next to each other. For
example, read the 14th, 8th, and 5th bits from n.

bits = [14 8 5];
bitget(n,bits)

ans = 1x3 uint16 row vector

1 1 0

See Also
bitand | bitor | bitxor | bitget | bitset | bitshift | bitcmp

More About
• “Integers” on page 4-2
• “Perform Cyclic Redundancy Check” on page 2-44
• “Hexadecimal and Binary Values” on page 6-55


Perform Cyclic Redundancy Check


This example shows how to perform a cyclic redundancy check (CRC) on the bits of a number. CRCs
are used to detect errors in the transmission of data in digital systems. When a piece of data is sent, a
short check value is attached to it. The check value is obtained by polynomial division with the bits in
the data. When the data is received, the polynomial division is repeated, and the result is compared
with the check value. If the results differ, then the data was corrupted during transmission.

Calculate Check Value by Hand

Start with a 16-bit binary number, which is the message to be transmitted:

1101100111011010

To obtain the check value, divide this number by the polynomial x^3 + x^2 + x + 1. You can represent
this polynomial with its coefficients: 1111.

The division is performed in steps, and after each step the polynomial divisor is aligned with the
left-most 1 in the number. Because the remainder from dividing by the four-term polynomial has three
bits (in general, dividing by a polynomial of length n + 1 produces a check value of length n), append
000 to the number to hold the remainder. At each step, the result is the bit-wise XOR of the divisor
with the four bits it is aligned with; all other bits are unchanged.

The first division is

1101100111011010 000
1111
----------------
0010100111011010 000

Each successive division operates on the result of the previous step, so the second division is

0010100111011010 000
  1111
----------------
0001010111011010 000

The division is completed once the dividend is all zeros. The complete division, including the above
two steps, is

1101100111011010 000
1111
0010100111011010 000
  1111
0001010111011010 000
   1111
0000101111011010 000
    1111
0000010011011010 000
     1111
0000001101011010 000
      1111
0000000010011010 000
        1111
0000000001101010 000
         1111
0000000000010010 000
           1111
0000000000001100 000
            1111
0000000000000011 000
              11 11
0000000000000000 110

The remainder bits, 110, are the check value for this message.

Calculate Check Value Programmatically

In MATLAB®, you can perform this same operation to obtain the check value using bit-wise
operations. First, define variables for the message and polynomial divisor. Use unsigned 32-bit
integers so that extra bits are available for the remainder.
message = 0b1101100111011010u32;
messageLength = 16;
divisor = 0b1111u32;
divisorDegree = 3;

Next, initialize the polynomial divisor. Use dec2bin to display the bits of the result.
divisor = bitshift(divisor,messageLength-divisorDegree-1);
dec2bin(divisor)

ans =
'1111000000000000'

Now, shift the divisor and message so that they have the correct number of bits (16 bits for the
message and 3 bits for the remainder).
divisor = bitshift(divisor,divisorDegree);
remainder = bitshift(message,divisorDegree);
dec2bin(divisor)

ans =
'1111000000000000000'

dec2bin(remainder)

ans =
'1101100111011010000'

Perform the division steps of the CRC using a for loop. The for loop always advances a single bit
each step, so include a check to see if the current digit is a 1. If the current digit is a 1, then the
division step is performed; otherwise, the loop advances a bit and continues.
for k = 1:messageLength
    if bitget(remainder,messageLength+divisorDegree)
        remainder = bitxor(remainder,divisor);
    end
    remainder = bitshift(remainder,1);
end

Shift the bits of the remainder to the right to get the check value for the operation.
CRC_check_value = bitshift(remainder,-messageLength);
dec2bin(CRC_check_value)

ans =
'110'
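
One way to cross-check this result without bit-wise functions is ordinary polynomial division with deconv, followed by a reduction modulo 2. This sketch relies on the leading coefficient of the divisor being 1, in which case the integer remainder reduced modulo 2 equals the remainder of the XOR-based division. The variable names are illustrative.

```matlab
% Message digits followed by three 0s, most significant digit first.
msg = [1 1 0 1 1 0 0 1 1 1 0 1 1 0 1 0];
dividend = [msg 0 0 0];
divisorCoeffs = [1 1 1 1];               % x^3 + x^2 + x + 1

% Ordinary polynomial division, then reduce the remainder modulo 2.
[~,r] = deconv(dividend,divisorCoeffs);
checkBits = mod(round(r(end-2:end)),2)   % 1 1 0
```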

Check Message Integrity

You can use the check value to verify the integrity of a message by repeating the same division
operation. However, instead of using a remainder of 000 to start, use the check value 110. If the
message is error free, then the result of the division will be zero.

Reset the remainder variable, and add the CRC check value to the remainder bits using a bit-wise OR.
Introduce an error into the message by flipping one of the bit values with bitset.

remainder = bitshift(message,divisorDegree);
remainder = bitor(remainder,CRC_check_value);
remainder = bitset(remainder,6);
dec2bin(remainder)

ans =
'1101100111011110110'

Perform the CRC division operation and then check if the result is zero.

for k = 1:messageLength
    if bitget(remainder,messageLength+divisorDegree)
        remainder = bitxor(remainder,divisor);
    end
    remainder = bitshift(remainder,1);
end
if remainder == 0
    disp('Message is error free.')
else
    disp('Message contains errors.')
end

Message contains errors.
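
For comparison, the sketch below repeats the check on the unmodified message. It is self-contained, so it redefines the variables from this example, and it assumes the check value 110 calculated earlier. With no bit flipped, the division leaves a zero remainder.

```matlab
message = 0b1101100111011010u32;
messageLength = 16;
divisorDegree = 3;
checkValue = 0b110u32;                          % check value from this example
divisor = bitshift(0b1111u32,messageLength-1);  % align with the top of the 19-bit word

% Append the check value to the unmodified message.
remainder = bitor(bitshift(message,divisorDegree),checkValue);

for k = 1:messageLength
    if bitget(remainder,messageLength+divisorDegree)
        remainder = bitxor(remainder,divisor);
    end
    remainder = bitshift(remainder,1);
end

isErrorFree = (remainder == 0)                  % true
```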


See Also
bitshift | bitxor

More About
• “Bit-Wise Operations” on page 2-38
• “Hexadecimal and Binary Values” on page 6-55


Conditional Statements
Conditional statements enable you to select at run time which block of code to execute. The simplest
conditional statement is an if statement. For example:

% Generate a random number
a = randi(100, 1);

% If it is even, divide by 2
if rem(a, 2) == 0
    disp('a is even')
    b = a/2;
end

if statements can include alternate choices, using the optional keywords elseif or else. For
example:

a = randi(100, 1);

if a < 30
    disp('small')
elseif a < 80
    disp('medium')
else
    disp('large')
end

Alternatively, when you want to test for equality against a set of known values, use a switch
statement. For example:

[dayNum, dayString] = weekday(date, 'long', 'en_US');

switch dayString
    case 'Monday'
        disp('Start of the work week')
    case 'Tuesday'
        disp('Day 2')
    case 'Wednesday'
        disp('Day 3')
    case 'Thursday'
        disp('Day 4')
    case 'Friday'
        disp('Last day of the work week')
    otherwise
        disp('Weekend!')
end

For both if and switch, MATLAB executes the code corresponding to the first true condition, and
then exits the code block. Each conditional statement requires the end keyword.
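
A single case can also list several values in a cell array, which keeps a switch short when many values share one outcome. The sketch below assumes a fixed value for dayString so that it runs on its own.

```matlab
dayString = 'Sunday';        % assumed value, for illustration

switch dayString
    case {'Saturday','Sunday'}
        dayType = 'weekend';
    case {'Monday','Tuesday','Wednesday','Thursday','Friday'}
        dayType = 'weekday';
    otherwise
        dayType = 'unknown';
end
dayType                      % 'weekend'
```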

In general, when you have many possible discrete, known values, switch statements are easier to
read than if statements. However, you cannot test for inequality between switch and case values.
For example, you cannot implement this type of condition with a switch:

yourNumber = input('Enter a number: ');

if yourNumber < 0
    disp('Negative')
elseif yourNumber > 0
    disp('Positive')
else
    disp('Zero')
end

See Also
if | switch | end | return


Loop Control Statements


With loop control statements, you can repeatedly execute a block of code. There are two types of
loops:

• for statements loop a specific number of times, and keep track of each iteration with an
incrementing index variable.

For example, preallocate a 10-element vector, and calculate five values:

x = ones(1,10);
for n = 2:6
    x(n) = 2 * x(n - 1);
end

• while statements loop as long as a condition remains true.

For example, find the first integer n for which factorial(n) is at least 1e100 (a number with
more than 100 digits):

n = 1;
nFactorial = 1;
while nFactorial < 1e100
    n = n + 1;
    nFactorial = nFactorial * n;
end
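
Running both loops and displaying the results shows what each one leaves behind. The snippet repeats the two examples so that it is self-contained.

```matlab
x = ones(1,10);
for n = 2:6
    x(n) = 2 * x(n - 1);
end
x                            % 1 2 4 8 16 32 1 1 1 1

n = 1;
nFactorial = 1;
while nFactorial < 1e100
    n = n + 1;
    nFactorial = nFactorial * n;
end
n                            % 70
```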

Each loop requires the end keyword.

It is a good idea to indent the loops for readability, especially when they are nested (that is, when one
loop contains another loop):

A = zeros(5,100);
for m = 1:5
    for n = 1:100
        A(m, n) = 1/(m + n - 1);
    end
end

You can programmatically exit a loop using a break statement, or skip to the next iteration of a loop
using a continue statement. For example, count the number of lines in the help for the magic
function (that is, all comment lines until a blank line):

fid = fopen('magic.m','r');
count = 0;
while ~feof(fid)
    line = fgetl(fid);
    if isempty(line)
        break
    elseif ~strncmp(line,'%',1)
        continue
    end
    count = count + 1;
end
fprintf('%d lines in MAGIC help\n',count);
fclose(fid);
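
The same break and continue logic can be tried without reading a file. In this sketch the cell array of lines is invented for illustration; it stands in for the text that fgetl would return.

```matlab
lines = {'% First help line', ...
         '% Second help line', ...
         'x = 1;', ...
         '% Third help line', ...
         '', ...
         '% Comment after the blank line'};
count = 0;
for k = 1:numel(lines)
    line = lines{k};
    if isempty(line)
        break               % stop at the first blank line
    elseif ~strncmp(line,'%',1)
        continue            % skip lines that are not comments
    end
    count = count + 1;
end
count                       % 3
```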


Tip If you inadvertently create an infinite loop (a loop that never ends on its own), stop execution of
the loop by pressing Ctrl+C.

See Also
for | while | break | continue | end


Regular Expressions
In this section...
“What Is a Regular Expression?” on page 2-51
“Steps for Building Expressions” on page 2-52
“Operators and Characters” on page 2-55

This topic describes what regular expressions are and how to use them to search text. Regular
expressions are flexible and powerful, though they use complex syntax. An alternative to regular
expressions is a pattern (since R2020b), which is simpler to define and results in code that is easier
to read. For more information, see “Build Pattern Expressions” on page 6-40.

What Is a Regular Expression?


A regular expression is a sequence of characters that defines a certain pattern. You normally use a
regular expression to search text for a group of words that matches the pattern, for example, while
parsing program input or while processing a block of text.

The character vector 'Joh?n\w*' is an example of a regular expression. It defines a pattern that
starts with the letters Jo, is optionally followed by the letter h (indicated by 'h?'), is then followed
by the letter n, and ends with any number of word characters, that is, characters that are alphabetic,
numeric, or underscore (indicated by '\w*'). This pattern matches any of the following:

Jon, John, Jonathan, Johnny
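
As a brief check, the sketch below runs this expression against a list of names. The name Joan is added as an assumed counterexample: it does not match because no n follows Jo or Joh.

```matlab
names = 'Jon, John, Jonathan, Johnny, Joan';
matches = regexp(names,'Joh?n\w*','match')
% matches = {'Jon','John','Jonathan','Johnny'}
```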

Regular expressions provide a unique way to search a volume of text for a particular subset of
characters within that text. Instead of looking for an exact character match as you would do with a
function like strfind, regular expressions give you the ability to look for a particular pattern of
characters.

For example, several ways of expressing a metric rate of speed are:

km/h
km/hr
km/hour
kilometers/hour
kilometers per hour

You could locate any of the above terms in your text by issuing five separate search commands:

strfind(text, 'km/h');
strfind(text, 'km/hour');
% etc.

To be more efficient, however, you can build a single phrase that applies to all of these search terms:


Translate this phrase into a regular expression (to be explained later in this section) and you have:

pattern = 'k(ilo)?m(eters)?(/|\sper\s)h(r|our)?';

Now locate one or more of the terms using just a single command:

text = ['The high-speed train traveled at 250 ', ...
    'kilometers per hour alongside the automobile ', ...
    'travelling at 120 km/h.'];
regexp(text, pattern, 'match')

ans =

1×2 cell array

{'kilometers per hour'} {'km/h'}

There are four MATLAB functions that support searching and replacing characters using regular
expressions. The first three are similar in the input values they accept and the output values they
return. For details, see the function reference pages.

Function          Description
regexp            Match regular expression.
regexpi           Match regular expression, ignoring case.
regexprep         Replace part of text using regular expression.
regexptranslate   Translate text into regular expression.

When calling any of the first three functions, pass the text to be parsed and the regular expression in
the first two input arguments. When calling regexprep, pass an additional input that is an
expression that specifies a pattern for the replacement.
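
The calling pattern for regexpi and regexprep can be sketched with the speed expression from this section. The input sentences are made up for illustration.

```matlab
pattern = 'k(ilo)?m(eters)?(/|\sper\s)h(r|our)?';

% regexpi ignores case, so capitalized units still match.
found = regexpi('Top speed: 250 Kilometers per Hour.',pattern,'match')
% found = {'Kilometers per Hour'}

% regexprep takes the replacement as a third input.
short = regexprep('120 km/hr and 250 kilometers per hour',pattern,'km/h')
% short = '120 km/h and 250 km/h'
```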

Steps for Building Expressions


There are three steps involved in using regular expressions to search text for a particular term:

1 Identify unique patterns in the string on page 2-53

This entails breaking up the text you want to search for into groups of like character types. These
character types could be a series of lowercase letters, a dollar sign followed by three numbers
and then a decimal point, etc.

2 Express each pattern as a regular expression on page 2-53

Use the metacharacters and operators described in this documentation to express each segment
of your search pattern as a regular expression. Then combine these expression segments into the
single expression to use in the search.

3 Call the appropriate search function on page 2-54

Pass the text you want to parse to one of the search functions, such as regexp or regexpi, or to
the text replacement function, regexprep.

The example shown in this section searches a record containing contact information belonging to a
group of five friends. This information includes each person's name, telephone number, place of
residence, and email address. The goal is to extract specific information from the text.

contacts = { ...
'Harry 287-625-7315 Columbus, OH [email protected]'; ...
'Janice 529-882-1759 Fresno, CA [email protected]'; ...
'Mike 793-136-0975 Richmond, VA [email protected]'; ...
'Nadine 648-427-9947 Tampa, FL [email protected]'; ...
'Jason 697-336-7728 Montrose, CO [email protected]'};

The first part of the example builds a regular expression that represents the format of a standard
email address. Using that expression, the example then searches the information for the email
address of one of the group of friends. Contact information for Janice is in row 2 of the contacts cell
array:
contacts{2}

ans =

'Janice 529-882-1759 Fresno, CA [email protected]'

Step 1 — Identify Unique Patterns in the Text

A typical email address is made up of standard components: the user's account name, followed by an
@ sign, the name of the user's internet service provider (ISP), a dot (period), and the domain to which
the ISP belongs. The table below lists these components in the left column, and generalizes the
format of each component in the right column.

Unique patterns of an email address     General description of each pattern

Start with the account name             One or more lowercase letters and underscores
  jan_stephens . . .
Add '@'                                 @ sign
  jan_stephens@ . . .
Add the ISP                             One or more lowercase letters, no underscores
  jan_stephens@horizon . . .
Add a dot (period)                      Dot (period) character
  jan_stephens@horizon. . . .
Finish with the domain                  com or net
  [email protected]

Step 2 — Express Each Pattern as a Regular Expression

In this step, you translate the general formats derived in Step 1 into segments of a regular
expression. You then add these segments together to form the entire expression.


The table below shows the generalized format descriptions of each character pattern in the left-most
column. (This was carried forward from the right column of the table in Step 1.) The second column
shows the operators or metacharacters that represent the character pattern.

Description of each segment                       Pattern

One or more lowercase letters and underscores     [a-z_]+
@ sign                                            @
One or more lowercase letters, no underscores     [a-z]+
Dot (period) character                            \.
com or net                                        (com|net)

Assembling these patterns into one character vector gives you the complete expression:

email = '[a-z_]+@[a-z]+\.(com|net)';

Step 3 — Call the Appropriate Search Function

In this step, you use the regular expression derived in Step 2 to match an email address for one of the
friends in the group. Use the regexp function to perform the search.

Here is the list of contact information shown earlier in this section. Each person's record occupies a
row of the contacts cell array:

contacts = { ...
'Harry 287-625-7315 Columbus, OH [email protected]'; ...
'Janice 529-882-1759 Fresno, CA [email protected]'; ...
'Mike 793-136-0975 Richmond, VA [email protected]'; ...
'Nadine 648-427-9947 Tampa, FL [email protected]'; ...
'Jason 697-336-7728 Montrose, CO [email protected]'};

This is the regular expression that represents an email address, as derived in Step 2:

email = '[a-z_]+@[a-z]+\.(com|net)';

Call the regexp function, passing row 2 of the contacts cell array and the email regular
expression. This returns the email address for Janice.

regexp(contacts{2}, email, 'match')

ans =

1×1 cell array

{'[email protected]'}

MATLAB parses a character vector from left to right, “consuming” the vector as it goes. If matching
characters are found, regexp records the location and resumes parsing the character vector, starting
just after the end of the most recent match.

Make the same call, but this time for the fifth person in the list:

regexp(contacts{5}, email, 'match')

ans =

1×1 cell array

{'[email protected]'}

You can also search for the email address of everyone in the list by using the entire cell array for the
input argument:

regexp(contacts, email, 'match');
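
With a cell array as input, adding the 'once' option returns the first match in each record as a character vector, which is often easier to work with. The records below are hypothetical placeholders, not the contact list from this example.

```matlab
% Hypothetical records; the names and addresses are placeholders.
sample = { ...
    'Ann 555-0100 Springfield, IL ann_lee@example.com'; ...
    'Bob 555-0101 Dover, DE bob@example.net'};
email = '[a-z_]+@[a-z]+\.(com|net)';

addresses = regexp(sample,email,'match','once')
% addresses = {'ann_lee@example.com'; 'bob@example.net'}
```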

Operators and Characters


Regular expressions can contain characters, metacharacters, operators, tokens, and flags that specify
patterns to match, as described in these sections:

• “Metacharacters” on page 2-55
• “Character Representation” on page 2-56
• “Quantifiers” on page 2-56
• “Grouping Operators” on page 2-57
• “Anchors” on page 2-58
• “Lookaround Assertions” on page 2-58
• “Logical and Conditional Operators” on page 2-59
• “Token Operators” on page 2-60
• “Dynamic Expressions” on page 2-60
• “Comments” on page 2-61
• “Search Flags” on page 2-61

Metacharacters

Metacharacters represent letters, letter ranges, digits, and space characters. Use them to construct a
generalized pattern of characters.

Metacharacter: .
Description: Any single character, including white space.
Example: '..ain' matches sequences of five consecutive characters that end with 'ain'.

Metacharacter: [c1c2c3]
Description: Any character contained within the square brackets. The following characters are
treated literally: $ | . * + ? and - when not used to indicate a range.
Example: '[rp.]ain' matches 'rain' or 'pain' or '.ain'.

Metacharacter: [^c1c2c3]
Description: Any character not contained within the square brackets. The following characters are
treated literally: $ | . * + ? and - when not used to indicate a range.
Example: '[^*rp]ain' matches all four-letter sequences that end in 'ain', except 'rain' and 'pain'
and '*ain'. For example, it matches 'gain', 'lain', or 'vain'.

Metacharacter: [c1-c2]
Description: Any character in the range of c1 through c2.
Example: '[A-G]' matches a single character in the range of A through G.

Metacharacter: \w
Description: Any alphabetic, numeric, or underscore character. For English character sets, \w is
equivalent to [a-zA-Z_0-9].
Example: '\w*' identifies a word comprised of any grouping of alphabetic, numeric, or underscore
characters.

Metacharacter: \W
Description: Any character that is not alphabetic, numeric, or underscore. For English character
sets, \W is equivalent to [^a-zA-Z_0-9].
Example: '\W*' identifies a term that is not a word comprised of any grouping of alphabetic,
numeric, or underscore characters.

Metacharacter: \s
Description: Any white-space character; equivalent to [ \f\n\r\t\v].
Example: '\w*n\s' matches words that end with the letter n, followed by a white-space character.

Metacharacter: \S
Description: Any non-white-space character; equivalent to [^ \f\n\r\t\v].
Example: '\d\S' matches a numeric digit followed by any non-white-space character.

Metacharacter: \d
Description: Any numeric digit; equivalent to [0-9].
Example: '\d*' matches any number of consecutive digits.

Metacharacter: \D
Description: Any nondigit character; equivalent to [^0-9].
Example: '\w*\D\>' matches words that do not end with a numeric digit.

Metacharacter: \oN or \o{N}
Description: Character of octal value N.
Example: '\o{40}' matches the space character, defined by octal 40.

Metacharacter: \xN or \x{N}
Description: Character of hexadecimal value N.
Example: '\x2C' matches the comma character, defined by hex 2C.
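
A few of these metacharacters can be tried directly with regexp. The input text in this sketch is invented, and the second call uses the + quantifier (described under Quantifiers) so that each run of digits is returned as one match.

```matlab
m1 = regexp('rain pain .ain gain','[rp.]ain','match')
% m1 = {'rain','pain','.ain'}

m2 = regexp('Order 66 shipped, 7 pending','\d+','match')
% m2 = {'66','7'}

m3 = regexp('A,B','\x2C','match')
% m3 = {','}
```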

Character Representation

Operator   Description
\a         Alarm (beep)
\b         Backspace
\f         Form feed
\n         New line
\r         Carriage return
\t         Horizontal tab
\v         Vertical tab
\char      Any character with special meaning in regular expressions that you want to match
           literally (for example, use \\ to match a single backslash)

Quantifiers

Quantifiers specify the number of times a pattern must occur in the matching text.

Quantifier: expr*
Number of times expression occurs: 0 or more times consecutively.
Example: '\w*' matches a word of any length.

Quantifier: expr?
Number of times expression occurs: 0 times or 1 time.
Example: '\w*(\.m)?' matches words that optionally end with the extension .m.

Quantifier: expr+
Number of times expression occurs: 1 or more times consecutively.
Example: '<img src="\w+\.gif">' matches an <img> HTML tag when the file name contains one or
more characters.

Quantifier: expr{m,n}
Number of times expression occurs: At least m times, but no more than n times consecutively.
{0,1} is equivalent to ?.
Example: '\S{4,8}' matches between four and eight non-white-space characters.

Quantifier: expr{m,}
Number of times expression occurs: At least m times consecutively. {0,} and {1,} are equivalent
to * and +, respectively.
Example: '<a href="\w{1,}\.html">' matches an <a> HTML tag when the file name contains one or
more characters.

Quantifier: expr{n}
Number of times expression occurs: Exactly n times consecutively. Equivalent to {n,n}.
Example: '\d{4}' matches four consecutive digits.

Quantifiers can appear in three modes, described in the following table. q represents any of the
quantifiers in the previous table.

Mode: exprq
Description: Greedy expression: match as many characters as possible.
Example: Given the text '<tr><td><p>text</p></td>', the expression '</?t.*>' matches all
characters between <tr and /td>:

'<tr><td><p>text</p></td>'

Mode: exprq?
Description: Lazy expression: match as few characters as necessary.
Example: Given the text '<tr><td><p>text</p></td>', the expression '</?t.*?>' ends each match at
the first occurrence of the closing angle bracket (>):

'<tr>' '<td>' '</td>'

Mode: exprq+
Description: Possessive expression: match as much as possible, but do not rescan any portions of
the text.
Example: Given the text '<tr><td><p>text</p></td>', the expression '</?t.*+>' does not return any
matches, because the closing angle bracket is captured using .*, and is not rescanned.
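
The three modes can be compared side by side on the text used in the table. This sketch only restates those examples as runnable calls; the variable names are illustrative.

```matlab
str = '<tr><td><p>text</p></td>';

greedy = regexp(str,'</?t.*>','match')        % {'<tr><td><p>text</p></td>'}
lazy = regexp(str,'</?t.*?>','match')         % {'<tr>','<td>','</td>'}
possessive = regexp(str,'</?t.*+>','match')   % empty: no match
```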

Grouping Operators

Grouping operators allow you to capture tokens, apply one operator to multiple elements, or disable
backtracking in a specific group.

Grouping operator: (expr)
Description: Group elements of the expression and capture tokens.
Example: 'Joh?n\s(\w*)' captures a token that contains the last name of any person with the first
name John or Jon.

Grouping operator: (?:expr)
Description: Group, but do not capture tokens.
Example: '(?:[aeiou][^aeiou]){2}' matches two consecutive patterns of a vowel followed by a
nonvowel, such as 'anon'. Without grouping, '[aeiou][^aeiou]{2}' matches a vowel followed by two
nonvowels.

Grouping operator: (?>expr)
Description: Group atomically. Do not backtrack within the group to complete the match, and do
not capture tokens.
Example: 'A(?>.*)Z' does not match 'AtoZ', although 'A(?:.*)Z' does. Using the atomic group, Z is
captured using .* and is not rescanned.

Grouping operator: (expr1|expr2)
Description: Match expression expr1 or expression expr2. If there is a match with expr1, then
expr2 is ignored. You can include ?: or ?> after the opening parenthesis to suppress tokens or
group atomically.
Example: '(let|tel)\w+' matches let or tel followed by one or more word characters, that is, text
in words that contain, but do not end with, let or tel.
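
The grouping examples can be run as written. In this sketch the alternation input is invented; note that the match in 'hotels' starts at tel, not at the beginning of the word.

```matlab
grouped = regexp('anon','(?:[aeiou][^aeiou]){2}','match')    % {'anon'}
backtracking = regexp('AtoZ','A(?:.*)Z','match')             % {'AtoZ'}
atomic = regexp('AtoZ','A(?>.*)Z','match')                   % empty: no match

alternation = regexp('letter outlet hotels','(let|tel)\w+','match')
% alternation = {'letter','tels'}
```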

Anchors

Anchors in the expression match the beginning or end of a character vector or word.

Anchor: ^expr
Matches the: Beginning of the input text.
Example: '^M\w*' matches a word starting with M at the beginning of the text.

Anchor: expr$
Matches the: End of the input text.
Example: '\w*m$' matches words ending with m at the end of the text.

Anchor: \<expr
Matches the: Beginning of a word.
Example: '\<n\w*' matches any words starting with n.

Anchor: expr\>
Matches the: End of a word.
Example: '\w*e\>' matches any words ending with e.

Lookaround Assertions

Lookaround assertions look for patterns that immediately precede or follow the intended match, but
are not part of the match.

The pointer remains at the current location, and characters that correspond to the test expression
are not captured or discarded. Therefore, lookahead assertions can match overlapping character
groups.


Lookaround assertion: expr(?=test)
Description: Look ahead for characters that match test.
Example: '\w*(?=ing)' matches terms that are followed by ing, such as 'Fly' and 'fall' in the input
text 'Flying, not falling.'

Lookaround assertion: expr(?!test)
Description: Look ahead for characters that do not match test.
Example: 'i(?!ng)' matches instances of the letter i that are not followed by ng.

Lookaround assertion: (?<=test)expr
Description: Look behind for characters that match test.
Example: '(?<=re)\w*' matches terms that follow 're', such as 'new', 'use', and 'cycle' in the
input text 'renew, reuse, recycle'

Lookaround assertion: (?<!test)expr
Description: Look behind for characters that do not match test.
Example: '(?<!\d)(\d)(?!\d)' matches single-digit numbers (digits that do not precede or follow
other digits).

If you specify a lookahead assertion before an expression, the operation is equivalent to a logical AND.

Operation: (?=test)expr
Description: Match both test and expr.
Example: '(?=[a-z])[^aeiou]' matches consonants.

Operation: (?!test)expr
Description: Match expr and do not match test.
Example: '(?![aeiou])[a-z]' matches consonants.

For more information, see “Lookahead Assertions in Regular Expressions” on page 2-63.
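
The lookaround examples from the tables above can be run as follows. The inputs for the last two calls are invented for illustration, and the listed results assume the default behavior in which regexp omits zero-length matches.

```matlab
ahead = regexp('Flying, not falling.','\w*(?=ing)','match')     % {'Fly','fall'}
behind = regexp('renew, reuse, recycle','(?<=re)\w*','match')   % {'new','use','cycle'}
single = regexp('a1 b22 c3','(?<!\d)(\d)(?!\d)','match')        % {'1','3'}
consonants = regexp('regexp','(?![aeiou])[a-z]','match')        % {'r','g','x','p'}
```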

Logical and Conditional Operators

Logical and conditional operators allow you to test the state of a given condition, and then use the
outcome to determine which pattern, if any, to match next. These operators support logical OR and if
or if/else conditions. (For AND conditions, see “Lookaround Assertions” on page 2-58.)

Conditions can be tokens on page 2-60, lookaround assertions on page 2-58, or dynamic expressions
on page 2-60 of the form (?@cmd). Dynamic expressions must return a logical or numeric value.

Conditional operator: expr1|expr2
Description: Match expression expr1 or expression expr2. If there is a match with expr1, then
expr2 is ignored.
Example: '(let|tel)\w+' matches let or tel followed by one or more word characters, that is, text
in words that contain, but do not end with, let or tel.

Conditional operator: (?(cond)expr)
Description: If condition cond is true, then match expr.
Example: '(?(?@ispc)[A-Z]:\\)' matches a drive name, such as C:\, when run on a Windows system.

Conditional operator: (?(cond)expr1|expr2)
Description: If condition cond is true, then match expr1. Otherwise, match expr2.
Example: 'Mr(s?)\..*?(?(1)her|his) \w*' matches text that includes her when the text begins with
Mrs, or that includes his when the text begins with Mr.


Token Operators

Tokens are portions of the matched text that you define by enclosing part of the regular expression in
parentheses. You can refer to a token by its sequence in the text (an ordinal token), or assign names
to tokens for easier code maintenance and readable output.

Ordinal Token Operator: (expr)
Description: Capture in a token the characters that match the enclosed expression.
Example: 'Joh?n\s(\w*)' captures a token that contains the last name of any person with the first name John or Jon.

Ordinal Token Operator: \N
Description: Match the Nth token.
Example: '<(\w+).*>.*</\1>' captures tokens for HTML tags, such as 'title' from the text '<title>Some text</title>'.

Ordinal Token Operator: (?(N)expr1|expr2)
Description: If the Nth token is found, then match expr1. Otherwise, match expr2.
Example: 'Mr(s?)\..*?(?(1)her|his) \w*' matches text that includes her when the text begins with Mrs, or that includes his when the text begins with Mr.

Named Token Operator: (?<name>expr)
Description: Capture in a named token the characters that match the enclosed expression.
Example: '(?<month>\d+)-(?<day>\d+)-(?<yr>\d+)' creates named tokens for the month, day, and year in an input date of the form mm-dd-yy.

Named Token Operator: \k<name>
Description: Match the token referred to by name.
Example: '<(?<tag>\w+).*>.*</\k<tag>>' captures tokens for HTML tags, such as 'title' from the text '<title>Some text</title>'.

Named Token Operator: (?(name)expr1|expr2)
Description: If the named token is found, then match expr1. Otherwise, match expr2.
Example: 'Mr(?<sex>s?)\..*?(?(sex)her|his) \w*' matches text that includes her when the text begins with Mrs, or that includes his when the text begins with Mr.

Note If an expression has nested parentheses, MATLAB captures tokens that correspond to the
outermost set of parentheses. For example, given the search pattern '(and(y|rew))', MATLAB
creates a token for 'andrew' but not for 'y' or 'rew'.

For more information, see “Tokens in Regular Expressions” on page 2-66.
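The token operators above can be exercised as follows. This is a sketch under stated assumptions: the input texts 'Jon Smith and John Doe' and '01-11-2000' and all variable names are made up for illustration.

```matlab
% Ordinal token: capture the last name that follows John or Jon.
tok = regexp('Jon Smith and John Doe', 'Joh?n\s(\w*)', 'tokens');
lastNames = cellfun(@(t) t{1}, tok, 'UniformOutput', false);

% Named tokens: return a structure with one field per token name.
d = regexp('01-11-2000', '(?<month>\d+)-(?<day>\d+)-(?<yr>\d+)', 'names');

% Back reference: \1 must repeat whatever the first token captured.
tag = regexp('<title>Some text</title>', '<(\w+).*>.*</\1>', 'tokens', 'once');

disp(lastNames)   % 'Smith' and 'Doe'
disp(d)           % month: '01', day: '11', yr: '2000'
disp(tag)         % 'title'
```

The 'names' output is usually the easier one to maintain, because later code refers to d.month rather than to a token position.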

Dynamic Expressions

Dynamic expressions allow you to execute a MATLAB command or a regular expression to determine
the text to match.

The parentheses that enclose dynamic expressions do not create a capturing group.


Operator: (??expr)
Description: Parse expr and include the resulting term in the match expression. When parsed, expr must correspond to a complete, valid regular expression. Dynamic expressions that use the backslash escape character (\) require two backslashes: one for the initial parsing of expr, and one for the complete match.
Example: '^(\d+)((??\\w{$1}))' determines how many characters to match by reading a digit at the beginning of the match. The dynamic expression is enclosed in a second set of parentheses so that the resulting match is captured in a token. For instance, matching '5XXXXX' captures tokens for '5' and 'XXXXX'.

Operator: (??@cmd)
Description: Execute the MATLAB command represented by cmd, and include the output returned by the command in the match expression.
Example: '(.{2,}).?(??@fliplr($1))' finds palindromes that are at least four characters long, such as 'abba'.

Operator: (?@cmd)
Description: Execute the MATLAB command represented by cmd, but discard any output the command returns. (Helpful for diagnosing regular expressions.)
Example: '\w*?(\w)(?@disp($1))\1\w*' matches words that include double letters (such as pp), and displays intermediate results.

Within dynamic expressions, use the following operators to define replacement terms.

Replacement Operator Description


$& or $0 Portion of the input text that is currently a match
$` Portion of the input text that precedes the current match
$' Portion of the input text that follows the current match (use $'' to represent $')
$N Nth token
$<name> Named token
${cmd} Output returned when MATLAB executes the command, cmd

For more information, see “Dynamic Regular Expressions” on page 2-72.
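The two dynamic match operators can be tried on short inputs. In this sketch, the input 'xabbay' and the variable names are assumptions chosen for illustration; the patterns are the ones given in the operator descriptions above.

```matlab
% (??expr): the leading number sets how many word characters must follow.
% The extra parentheses around the dynamic expression create the second token.
tok = regexp('5XXXXX', '^(\d+)((??\\w{$1}))', 'tokens', 'once');

% (??@cmd): the output of fliplr($1) becomes part of the pattern, so the
% text after the token must be the token reversed.
pal = regexp('xabbay', '(.{2,}).?(??@fliplr($1))', 'match');

disp(tok)   % '5' and 'XXXXX'
disp(pal)   % 'abba'
```

Both patterns depend on text captured earlier in the same match, which is why a static expression cannot express them.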

Comments

The comment operator enables you to insert comments into your code to make it more maintainable.
The text of the comment is ignored by MATLAB when matching against the input text.

Characters: (?#comment)
Description: Insert a comment in the regular expression. The comment text is ignored when matching the input.
Example: '(?# Initial digit)\<\d\w+' includes a comment, and matches words that begin with a number.

Search Flags

Search flags modify the behavior for matching expressions.

Flag Description
(?-i) Match letter case (default for regexp and regexprep).
(?i) Do not match letter case (default for regexpi).
(?s) Match dot (.) in the pattern with any character (default).
(?-s) Match dot in the pattern with any character that is not a newline character.
(?-m) Match the ^ and $ metacharacters at the beginning and end of text (default).
(?m) Match the ^ and $ metacharacters at the beginning and end of a line.
(?-x) Include space characters and comments when matching (default).
(?x) Ignore space characters and comments when matching. Use '\ ' and '\#' to
match space and # characters.

The expression that the flag modifies can appear either after the parentheses, such as

(?i)\w*

or inside the parentheses and separated from the flag with a colon (:), such as

(?i:\w*)

The latter syntax allows you to change the behavior for part of a larger expression.
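The difference between the two flag placements can be seen in a short sketch. The sample texts and variable names here are assumptions chosen for illustration.

```matlab
% (?i) placed first applies to everything that follows it.
allCase = regexp('MATLAB matlab Matlab', '(?i)matlab', 'match');

% (?i:...) limits case-insensitive matching to the enclosed part,
% so only the first letter may vary in case here.
firstOnly = regexp('Hello hello HELLO', '(?i:h)ello', 'match');

% (?m) lets ^ match at the start of each line, not only at the start of the text.
txt = sprintf('one\ntwo\nthree');
lineStarts = regexp(txt, '(?m)^\w', 'match');

disp(allCase)      % 'MATLAB', 'matlab', and 'Matlab'
disp(firstOnly)    % 'Hello' and 'hello'
disp(lineStarts)   % 'o', 't', and 't'
```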

See Also
regexp | regexpi | regexprep | regexptranslate | pattern

More About
• “Lookahead Assertions in Regular Expressions” on page 2-63
• “Tokens in Regular Expressions” on page 2-66
• “Dynamic Regular Expressions” on page 2-72
• “Search and Replace Text” on page 6-37


Lookahead Assertions in Regular Expressions


In this section...
“Lookahead Assertions” on page 2-63
“Overlapping Matches” on page 2-63
“Logical AND Conditions” on page 2-64

Lookahead Assertions
There are two types of lookaround assertions for regular expressions: lookahead and lookbehind. In
both cases, the assertion is a condition that must be satisfied to return a match to the expression.

A lookahead assertion has the form (?=test) and can appear anywhere in a regular expression.
MATLAB looks ahead of the current location in the text for the test condition. If MATLAB matches the
test condition, it continues processing the rest of the expression to find a match.

For example, look ahead in a character vector specifying a path to find the name of the folder that
contains a program file (in this case, fileread.m).

chr = which('fileread')

chr =

'matlabroot\toolbox\matlab\iofun\fileread.m'

regexp(chr,'\w+(?=\\\w+\.[mp])','match')

ans =

1×1 cell array

{'iofun'}

The match expression, \w+, searches for one or more alphanumeric or underscore characters. Each
time regexp finds a term that matches this condition, it looks ahead for a backslash (specified with
two backslashes, \\), followed by a file name (\w+) with an .m or .p extension (\.[mp]). The
regexp function returns the match that satisfies the lookahead condition, which is the folder name
iofun.
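Because the output of which depends on the installation, the same pattern can also be tried on a fixed path. This is a minimal sketch; the path below is hypothetical and used only for illustration.

```matlab
% Hypothetical Windows-style path (not a real file).
p = 'C:\work\project\utils\helper.m';

% Match a folder name only if a backslash, a file name, and an
% .m or .p extension follow it.
folder = regexp(p, '\w+(?=\\\w+\.[mp])', 'match', 'once');

disp(folder)   % utils
```

The names work and project are rejected because the text that follows them does not end in a file extension right after the next name.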

Overlapping Matches
Lookahead assertions do not consume any characters in the text. As a result, you can use them to find
overlapping character sequences.

For example, use lookahead to find every sequence of six nonwhitespace characters in a character
vector by matching initial characters that precede five additional characters:

chr = 'Locate several 6-char. phrases';


startIndex = regexpi(chr,'\S(?=\S{5})')

startIndex =

1 8 9 16 17 24 25


The starting indices correspond to these phrases:


Locate severa everal 6-char -char. phrase hrases

Without the lookahead operator, MATLAB parses a character vector from left to right, consuming the
vector as it goes. If matching characters are found, regexp records the location and resumes parsing
the character vector from the location of the most recent match. There is no overlapping of
characters in this process.
chr = 'Locate several 6-char. phrases';
startIndex = regexpi(chr,'\S{6}')

startIndex =

1 8 16 24

The starting indices correspond to these phrases:


Locate severa 6-char phrase
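To see the overlapping substrings themselves rather than only their starting indices, you can index into the text with each start position. A minimal sketch, using the same input text; the arrayfun call and the variable name phrases are additions for illustration:

```matlab
chr = 'Locate several 6-char. phrases';

% Each index marks a nonwhitespace character followed by five more.
startIndex = regexp(chr, '\S(?=\S{5})');

% Extract the six characters that begin at each index.
phrases = arrayfun(@(k) chr(k:k+5), startIndex, 'UniformOutput', false);

disp(strjoin(phrases, ' '))   % Locate severa everal 6-char -char. phrase hrases
```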

Logical AND Conditions


Another way to use a lookahead operation is to perform a logical AND between two conditions. This
example initially attempts to locate all lowercase consonants in a character array consisting of the
first 50 characters of the help for the normest function:
helptext = help('normest');
chr = helptext(1:50)

chr =

' NORMEST Estimate the matrix 2-norm.


NORMEST(S'

Merely searching for non-vowels ([^aeiou]) does not return the expected answer, as the output
includes capital letters, space characters, and punctuation:
c = regexp(chr,'[^aeiou]','match')

c =

1×43 cell array

Columns 1 through 14

{' '} {'N'} {'O'} {'R'} {'M'} {'E'} {'S'} {'T'} {' '} {'E'} {'s

Columns 15 through 28

{' '} {'t'} {'h'} {' '} {'m'} {'t'} {'r'} {'x'} {' '} {'2'} {'-

Columns 29 through 42

{'.'} {'↵'} {' '} {' '} {' '} {' '} {'N'} {'O'} {'R'} {'M'} {'E

Column 43

{'S'}


Try this again, using a lookahead operator to create the following AND condition:

(lowercase letter) AND (not a vowel)

This time, the result is correct:

c = regexp(chr,'(?=[a-z])[^aeiou]','match')

c =

1×13 cell array

{'s'} {'t'} {'m'} {'t'} {'t'} {'h'} {'m'} {'t'} {'r'} {'x'} {'n

Note that when using a lookahead operator to perform an AND, you need to place the match
expression expr after the test expression test:

(?=test)expr or (?!test)expr
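Both forms can express the same AND condition. The sketch below uses a fixed input text (an assumption, so that the result does not depend on the installed help text) and checks that the positive and negative forms agree:

```matlab
chr = 'Estimate the matrix 2-norm.';

% (lowercase letter) AND (not a vowel), written with a positive lookahead.
c1 = regexp(chr, '(?=[a-z])[^aeiou]', 'match');

% (not a vowel) AND (lowercase letter), written with a negative lookahead.
c2 = regexp(chr, '(?![aeiou])[a-z]', 'match');

disp(strjoin(c1, ''))   % stmtthmtrxnrm
disp(isequal(c1, c2))   % 1
```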

See Also
regexp | regexpi | regexprep

More About
• “Regular Expressions” on page 2-51


Tokens in Regular Expressions


In this section...
“Introduction” on page 2-66
“Multiple Tokens” on page 2-68
“Unmatched Tokens” on page 2-69
“Tokens in Replacement Text” on page 2-69
“Named Capture” on page 2-70

Introduction
Parentheses used in a regular expression not only group elements of that expression together, but
also designate any matches found for that group as tokens. You can use tokens to match other parts
of the same text. One advantage of using tokens is that they remember what they matched, so you
can recall and reuse matched text in the process of searching or replacing.

Each token in the expression is assigned a number, starting from 1, going from left to right. To make
a reference to a token later in the expression, refer to it using a backslash followed by the token
number. For example, when referencing a token generated by the third set of parentheses in the
expression, use \3.

As a simple example, if you wanted to search for identical sequential letters in a character array, you
could capture the first letter as a token and then search for a matching character immediately
afterwards. In the expression shown below, the (\S) phrase creates a token whenever regexp
matches any nonwhitespace character in the character array. The second part of the expression,
'\1', looks for a second instance of the same character immediately following the first.
poe = ['While I nodded, nearly napping, ' ...
'suddenly there came a tapping,'];

[mat,tok,ext] = regexp(poe, '(\S)\1', 'match', ...
    'tokens', 'tokenExtents');
mat

mat =

1×4 cell array

{'dd'} {'pp'} {'dd'} {'pp'}

The cell array tok contains cell arrays that each contain a token.
tok{:}

ans =

1×1 cell array

{'d'}

ans =


1×1 cell array

{'p'}

ans =

1×1 cell array

{'d'}

ans =

1×1 cell array

{'p'}

The cell array ext contains numeric arrays that each contain starting and ending indices for a token.

ext{:}

ans =

11 11

ans =

26 26

ans =

35 35

ans =

57 57

For another example, capture pairs of matching HTML tags (e.g., <a> and </a>) and the text
between them. The expression used for this example is

expr = '<(\w+).*?>.*?</\1>';

The first part of the expression, '<(\w+)', matches an opening angle bracket (<) followed by one or
more alphabetic, numeric, or underscore characters. The enclosing parentheses capture token
characters following the opening angle bracket.

The second part of the expression, '.*?>.*?', matches the remainder of this HTML tag (characters
up to the >), and any characters that may precede the next opening angle bracket.

The last part, '</\1>', matches all characters in the ending HTML tag. This tag is composed of the
sequence </tag>, where tag is whatever characters were captured as a token.

hstr = '<!comment><a name="752507"></a><b>Default</b><br>';


expr = '<(\w+).*?>.*?</\1>';


[mat,tok] = regexp(hstr, expr, 'match', 'tokens');


mat{:}

ans =

'<a name="752507"></a>'

ans =

'<b>Default</b>'

tok{:}

ans =

1×1 cell array

{'a'}

ans =

1×1 cell array

{'b'}

Multiple Tokens
Here is an example of how tokens are assigned values. Suppose that you are going to search the
following text:

andy ted bob jim andrew andy ted mark

You choose to search the above text with the following search pattern:

and(y|rew)|(t)e(d)

This pattern has three parenthetical expressions that generate tokens. When you finally perform the
search, the following tokens are generated for each match.

Match     Token 1    Token 2
andy      y
ted       t          d
andrew    rew
andy      y
ted       t          d

Only the highest level parentheses are used. For example, if the search pattern and(y|rew) finds the
text andrew, token 1 is assigned the value rew. However, if the search pattern (and(y|rew)) is
used, token 1 is assigned the value andrew.
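To inspect the tokens for this search yourself, you can run it and print each match next to its tokens. This is a sketch only; the loop and variable names are additions for illustration, and no particular display of the tokens is assumed.

```matlab
str = 'andy ted bob jim andrew andy ted mark';
[mat, tok] = regexp(str, 'and(y|rew)|(t)e(d)', 'match', 'tokens');

% Print each match followed by the tokens captured for it.
for k = 1:numel(mat)
    fprintf('%-8s -> %s\n', mat{k}, strjoin(tok{k}, ', '));
end
```

The cell array tok has one entry per match, and each entry holds the tokens for that match.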


Unmatched Tokens
For those tokens specified in the regular expression that have no match in the text being evaluated,
regexp and regexpi return an empty character vector ('') as the token output, and an extent that
marks the position in the string where the token was expected.

The example shown here executes regexp on a character vector specifying the path returned from
the MATLAB tempdir function. The regular expression expr includes six token specifiers, one for
each piece of the path. The third specifier [a-z]+ has no match in the character vector because this
part of the path, Profiles, begins with an uppercase letter:
chr = tempdir

chr =

'C:\WINNT\Profiles\bpascal\LOCALS~1\Temp\'

expr = ['([A-Z]:)\\(WINNT)\\([a-z]+)?.*\\' ...
    '([a-z]+)\\([A-Z]+~\d)\\(Temp)\\'];

[tok, ext] = regexp(chr, expr, 'tokens', 'tokenExtents');

When a token is not found in the text, regexp returns an empty character vector ('') as the token
and a numeric array with the token extent. The first number of the extent is the string index that
marks where the token was expected, and the second number of the extent is equal to one less than
the first.

In the case of this example, the empty token is the third specified in the expression, so the third token
returned is empty:
tok{:}

ans =

1×6 cell array

{'C:'} {'WINNT'} {0×0 char} {'bpascal'} {'LOCALS~1'} {'Temp'}

The third token extent returned in the variable ext has the starting index set to 10, which is where
the nonmatching term, Profiles, begins in the path. The ending extent index is set to one less than
the starting index, or 9:
ext{:}

ans =

1 2
4 8
10 9
19 25
27 34
36 39
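The output of tempdir differs between systems, so the following sketch shows the same behavior on a fixed input. The text 'ab', the pattern, and the variable names are assumptions chosen for illustration; the comments restate the behavior described in this section rather than a recorded run.

```matlab
% The middle token (x)? is optional and has nothing to match in 'ab'.
[tok, ext] = regexp('ab', '(a)(x)?(b)', 'tokens', 'tokenExtents');

middleToken  = tok{1}{2};     % empty character vector for the unmatched token
middleExtent = ext{1}(2,:);   % end index is one less than the start index

fprintf('Unmatched token is empty: %d\n', isempty(middleToken));
fprintf('Extent of unmatched token: [%d %d]\n', middleExtent);
```

Checking isempty on a token is a simple way to detect which optional parts of a pattern took part in a match.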

Tokens in Replacement Text


When using tokens in replacement text, reference them using $1, $2, etc. instead of \1, \2, etc. This
example captures two tokens and reverses their order. The first, $1, is 'Norma Jean' and the
second, $2, is 'Baker'. Note that regexprep returns the modified text, not a vector of starting
indices.
regexprep('Norma Jean Baker', '(\w+\s\w+)\s(\w+)', '$2, $1')

ans =

'Baker, Norma Jean'
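The same technique reorders any captured pieces. As a further sketch (the date text and variable names are assumptions chosen for illustration), three tokens are rearranged from year-month-day to month/day/year:

```matlab
iso = '2021-09-15';

% $1 is the year, $2 the month, and $3 the day.
us = regexprep(iso, '(\d+)-(\d+)-(\d+)', '$2/$3/$1');

disp(us)   % 09/15/2021
```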

Named Capture
If you use a lot of tokens in your expressions, it may be helpful to assign them names rather than
having to keep track of which token number is assigned to which token.

When referencing a named token within the expression, use the syntax \k<name> instead of the
numeric \1, \2, etc.:
poe = ['While I nodded, nearly napping, ' ...
'suddenly there came a tapping,'];

regexp(poe, '(?<anychar>.)\k<anychar>', 'match')

ans =

1×4 cell array

{'dd'} {'pp'} {'dd'} {'pp'}

Named tokens can also be useful in labeling the output from the MATLAB regular expression
functions. This is especially true when you are processing many pieces of text.

For example, parse different parts of street addresses from several character vectors. A short name is
assigned to each token in the expression:
chr1 = '134 Main Street, Boulder, CO, 14923';
chr2 = '26 Walnut Road, Topeka, KA, 25384';
chr3 = '847 Industrial Drive, Elizabeth, NJ, 73548';

p1 = '(?<adrs>\d+\s\S+\s(Road|Street|Avenue|Drive))';
p2 = '(?<city>[A-Z][a-z]+)';
p3 = '(?<state>[A-Z]{2})';
p4 = '(?<zip>\d{5})';

expr = [p1 ', ' p2 ', ' p3 ', ' p4];

As the following results demonstrate, you can make your output easier to work with by using named
tokens:
loc1 = regexp(chr1, expr, 'names')

loc1 =

struct with fields:

adrs: '134 Main Street'


city: 'Boulder'
state: 'CO'
zip: '14923'


loc2 = regexp(chr2, expr, 'names')

loc2 =

struct with fields:

adrs: '26 Walnut Road'


city: 'Topeka'
state: 'KA'
zip: '25384'

loc3 = regexp(chr3, expr, 'names')

loc3 =

struct with fields:

adrs: '847 Industrial Drive'


city: 'Elizabeth'
state: 'NJ'
zip: '73548'
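When there are many pieces of text, you can pass them to regexp together as a cell array. The following sketch repeats the addresses and expression from above so that it runs on its own; the variable names addresses, locs, and cities are additions for illustration.

```matlab
addresses = {'134 Main Street, Boulder, CO, 14923', ...
             '26 Walnut Road, Topeka, KA, 25384', ...
             '847 Industrial Drive, Elizabeth, NJ, 73548'};

expr = ['(?<adrs>\d+\s\S+\s(Road|Street|Avenue|Drive)), ' ...
        '(?<city>[A-Z][a-z]+), (?<state>[A-Z]{2}), (?<zip>\d{5})'];

% With a cell array input, regexp returns one structure per address.
locs = regexp(addresses, expr, 'names');

% Collect one named field from every result.
cities = cellfun(@(s) s.city, locs, 'UniformOutput', false);

disp(cities)   % 'Boulder', 'Topeka', and 'Elizabeth'
```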

See Also
regexp | regexpi | regexprep

More About
• “Regular Expressions” on page 2-51


Dynamic Regular Expressions

In this section...
“Introduction” on page 2-72
“Dynamic Match Expressions — (??expr)” on page 2-73
“Commands That Modify the Match Expression — (??@cmd)” on page 2-73
“Commands That Serve a Functional Purpose — (?@cmd)” on page 2-74
“Commands in Replacement Expressions — ${cmd}” on page 2-76

Introduction
In a dynamic expression, you can make the pattern that you want regexp to match dependent on the
content of the input text. In this way, you can more closely match varying input patterns in the text
being parsed. You can also use dynamic expressions in replacement terms for use with the
regexprep function. This gives you the ability to adapt the replacement text to the parsed input.

You can include any number of dynamic expressions in the match_expr or replace_expr
arguments of these commands:

regexp(text, match_expr)
regexpi(text, match_expr)
regexprep(text, match_expr, replace_expr)

As an example of a dynamic expression, the following regexprep command correctly replaces the
term internationalization with its abbreviated form, i18n. However, to use it on a different
term such as globalization, you have to use a different replacement expression:

match_expr = '(^\w)(\w*)(\w$)';

replace_expr1 = '$118$3';
regexprep('internationalization', match_expr, replace_expr1)

ans =

'i18n'

replace_expr2 = '$111$3';
regexprep('globalization', match_expr, replace_expr2)

ans =

'g11n'

Using a dynamic expression ${num2str(length($2))} enables you to base the replacement
expression on the input text so that you do not have to change the expression each time. This
example uses the dynamic replacement syntax ${cmd}.

match_expr = '(^\w)(\w*)(\w$)';
replace_expr = '$1${num2str(length($2))}$3';

regexprep('internationalization', match_expr, replace_expr)


ans =

'i18n'

regexprep('globalization', match_expr, replace_expr)

ans =

'g11n'

When parsed, a dynamic expression must correspond to a complete, valid regular expression. In
addition, dynamic match expressions that use the backslash escape character (\) require two
backslashes: one for the initial parsing of the expression, and one for the complete match. The
parentheses that enclose dynamic expressions do not create a capturing group.

There are three forms of dynamic expressions that you can use in match expressions, and one form
for replacement expressions, as described in the following sections.

Dynamic Match Expressions — (??expr)


The (??expr) operator parses expression expr, and inserts the results back into the match
expression. MATLAB then evaluates the modified match expression.

Here is an example of the type of expression that you can use with this operator:
chr = {'5XXXXX', '8XXXXXXXX', '1X'};
regexp(chr, '^(\d+)(??X{$1})$', 'match', 'once');

The purpose of this particular command is to locate a series of X characters in each of the character
vectors stored in the input cell array. Note however that the number of Xs varies in each character
vector. If the count did not vary, you could use the expression X{n} to indicate that you want to match
n of these characters. But, a constant value of n does not work in this case.

The solution used here is to capture the leading count number (e.g., the 5 in the first character vector
of the cell array) in a token, and then to use that count in a dynamic expression. The dynamic
expression in this example is (??X{$1}), where $1 is the value captured by the token \d+. The
operator {$1} makes a quantifier of that token value. Because the expression is dynamic, the same
pattern works on all three of the input vectors in the cell array. With the first input character vector,
regexp looks for five X characters; with the second, it looks for eight, and with the third, it looks for
just one:
regexp(chr, '^(\d+)(??X{$1})$', 'match', 'once')

ans =

1×3 cell array

{'5XXXXX'} {'8XXXXXXXX'} {'1X'}

Commands That Modify the Match Expression — (??@cmd)


MATLAB uses the (??@cmd) operator to include the results of a MATLAB command in the match
expression. This command must return a term that can be used within the match expression.

For example, use the dynamic expression (??@fliplr($1)) to locate a palindrome, “Never Odd or
Even”, that has been embedded into a larger character vector.


First, create the input string. Make sure that all letters are lowercase, and remove all nonword
characters.

chr = lower(...
'Find the palindrome Never Odd or Even in this string');

chr = regexprep(chr, '\W*', '')

chr =

'findthepalindromeneveroddoreveninthisstring'

Locate the palindrome within the character vector using the dynamic expression:

palindrome = regexp(chr, '(.{3,}).?(??@fliplr($1))', 'match')

palindrome =

1×1 cell array

{'neveroddoreven'}

The dynamic expression reverses the order of the letters captured in the token $1, and then
attempts to match that reversed text immediately after the token. This requires a dynamic
expression because the value for $1 relies on the value of the token (.{3,}).

Dynamic expressions in MATLAB have access to the currently active workspace. This means that you
can change any of the functions or variables used in a dynamic expression just by changing variables
in the workspace. Repeat the last command of the example above, but this time define the function to
be called within the expression using a function handle stored in the base workspace:

fun = @fliplr;

palindrome = regexp(chr, '(.{3,}).?(??@fun($1))', 'match')

palindrome =

1×1 cell array

{'neveroddoreven'}

Commands That Serve a Functional Purpose — (?@cmd)


The (?@cmd) operator specifies a MATLAB command that regexp or regexprep is to run while
parsing the overall match expression. Unlike the other dynamic expressions in MATLAB, this operator
does not alter the contents of the expression it is used in. Instead, you can use this functionality to
get MATLAB to report just what steps it is taking as it parses the contents of one of your regular
expressions. This functionality can be useful in diagnosing your regular expressions.

The following example parses a word for zero or more characters followed by two identical
characters followed again by zero or more characters:

regexp('mississippi', '\w*(\w)\1\w*', 'match')

ans =

1×1 cell array


{'mississippi'}

To track the exact steps that MATLAB takes in determining the match, the example inserts a short
script (?@disp($1)) in the expression to display the characters that finally constitute the match.
Because the example uses greedy quantifiers, MATLAB attempts to match as much of the character
vector as possible. So, even though MATLAB finds a match toward the beginning of the string, it
continues to look for more matches until it arrives at the very end of the string. From there, it backs
up through the letters i then p and the next p, stopping at that point because the match is finally
satisfied:
regexp('mississippi', '\w*(\w)(?@disp($1))\1\w*', 'match')

i
p
p

ans =

1×1 cell array

{'mississippi'}

Now try the same example again, this time making the first quantifier lazy (*?). Again, MATLAB
makes the same match:
regexp('mississippi', '\w*?(\w)\1\w*', 'match')

ans =

1×1 cell array

{'mississippi'}

But by inserting a dynamic script, you can see that this time, MATLAB has matched the text quite
differently. In this case, MATLAB uses the very first match it can find, and does not even consider the
rest of the text:
regexp('mississippi', '\w*?(\w)(?@disp($1))\1\w*', 'match')

m
i
s

ans =

1×1 cell array

{'mississippi'}

To demonstrate how versatile this type of dynamic expression can be, consider the next example that
progressively assembles a cell array as MATLAB iteratively parses the input text. The (?!) operator
found at the end of the expression is actually an empty lookahead operator, and forces a failure at
each iteration. This forced failure is necessary if you want to trace the steps that MATLAB is taking to
resolve the expression.

MATLAB makes a number of passes through the input text, each time trying another combination of
letters to see if a better fit than the last match can be found. On any passes in which no matches are
found, the test results in an empty character vector. The dynamic script (?@if(~isempty($&)))
serves to omit the empty character vectors from the matches cell array:

matches = {};
expr = ['(Euler\s)?(Cauchy\s)?(Boole)?(?@if(~isempty($&)),' ...
'matches{end+1}=$&;end)(?!)'];

regexp('Euler Cauchy Boole', expr);

matches

matches =

1×6 cell array

{'Euler Cauchy Bo…'} {'Euler Cauchy '} {'Euler '} {'Cauchy Boole'} {'Cauchy '}

The operators $& (or the equivalent $0), $`, and $' refer to that part of the input text that is
currently a match, all characters that precede the current match, and all characters to follow the
current match, respectively. These operators are sometimes useful when working with dynamic
expressions, particularly those that employ the (?@cmd) operator.

This example parses the input text looking for the letter g. At each iteration through the text, regexp
compares the current character with g, and not finding it, advances to the next character. The
example tracks the progress of the scan through the text by marking the current location being parsed
with a ^ character.

(The $` and $' operators capture the parts of the text that precede and follow the current parsing
location. You need two single-quotation marks ($'') to express the sequence $' when it appears
within text.)

chr = 'abcdefghij';
expr = '(?@disp(sprintf(''starting match: [%s^%s]'',$`,$'')))g';

regexp(chr, expr, 'once');

starting match: [^abcdefghij]


starting match: [a^bcdefghij]
starting match: [ab^cdefghij]
starting match: [abc^defghij]
starting match: [abcd^efghij]
starting match: [abcde^fghij]
starting match: [abcdef^ghij]

Commands in Replacement Expressions — ${cmd}


The ${cmd} operator modifies the contents of a regular expression replacement pattern, making this
pattern adaptable to parameters in the input text that might vary from one use to the next. As with
the other dynamic expressions used in MATLAB, you can include any number of these expressions
within the overall replacement expression.

In the regexprep call shown here, the replacement pattern is '${convertMe($1,$2)}'. In this
case, the entire replacement pattern is a dynamic expression:

regexprep('This highway is 125 miles long', ...
    '(\d+\.?\d*)\W(\w+)', '${convertMe($1,$2)}');


The dynamic expression tells MATLAB to execute a function named convertMe using the two tokens
(\d+\.?\d*) and (\w+), derived from the text being matched, as input arguments in the call to
convertMe. The replacement pattern requires a dynamic expression because the values of $1 and $2
are generated at runtime.

The following example defines the file named convertMe that converts measurements from imperial
units to metric.
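function valout = convertMe(valin, units)
% Convert the numeric text valin from the imperial unit named in units
% to its metric equivalent, and return the result as text.
switch(units)
    case 'inches'
        fun = @(in)in .* 2.54;
        uout = 'centimeters';
    case 'miles'
        fun = @(mi)mi .* 1.6093;
        uout = 'kilometers';
    case 'pounds'
        fun = @(lb)lb .* 0.4536;
        uout = 'kilograms';
    case 'pints'
        fun = @(pt)pt .* 0.4731;
        uout = 'litres';
    case 'ounces'
        fun = @(oz)oz .* 28.35;
        uout = 'grams';
    otherwise
        % Without this branch, an unknown unit leaves fun undefined.
        error('convertMe:unknownUnit', 'Unsupported unit: %s', units);
end
% str2double converts the text without evaluating it as code.
val = fun(str2double(valin));
valout = [num2str(val) ' ' uout];
end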

At the command line, call the convertMe function from regexprep, passing in values for the
quantity to be converted and name of the imperial unit:
regexprep('This highway is 125 miles long', ...
'(\d+\.?\d*)\W(\w+)', '${convertMe($1,$2)}')

ans =

'This highway is 201.1625 kilometers long'

regexprep('This pitcher holds 2.5 pints of water', ...
    '(\d+\.?\d*)\W(\w+)', '${convertMe($1,$2)}')

ans =

'This pitcher holds 1.1828 litres of water'

regexprep('This stone weighs about 10 pounds', ...
    '(\d+\.?\d*)\W(\w+)', '${convertMe($1,$2)}')

ans =

'This stone weighs about 4.536 kilograms'

As with the (??@ ) operator discussed in an earlier section, the ${ } operator has access to
variables in the currently active workspace. The following regexprep command uses the array A
defined in the base workspace:
A = magic(3)


A =

8 1 6
3 5 7
4 9 2

regexprep('The columns of matrix _nam are _val', ...
    {'_nam', '_val'}, ...
    {'A', '${sprintf(''%d%d%d '', A)}'})

ans =

'The columns of matrix A are 834 159 672'
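Two further short uses of ${cmd} are sketched below. The input texts and the workspace variable rate are hypothetical and chosen only for illustration.

```matlab
% Capitalize the first letter of each word; $0 is the current match.
titled = regexprep('hello world', '\<[a-z]', '${upper($0)}');

% Use a workspace variable inside ${...}: multiply each number by rate.
rate = 2;
scaled = regexprep('Cost: 20 units', '(\d+)', ...
    '${num2str(str2double($1)*rate)}');

disp(titled)   % Hello World
disp(scaled)   % Cost: 40 units
```

Tokens arrive in the command as text, so numeric work needs a conversion such as str2double on the way in and num2str on the way out.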

See Also
regexp | regexpi | regexprep

More About
• “Regular Expressions” on page 2-51


Comma-Separated Lists
In this section...
“What Is a Comma-Separated List?” on page 2-79
“Generating a Comma-Separated List” on page 2-79
“Assigning Output from a Comma-Separated List” on page 2-81
“Assigning to a Comma-Separated List” on page 2-81
“How to Use the Comma-Separated Lists” on page 2-82
“Fast Fourier Transform Example” on page 2-84

What Is a Comma-Separated List?


Typing in a series of numbers separated by commas gives you what is called a comma-separated list.
The MATLAB software returns each value individually:
1,2,3

ans =

1

ans =

2

ans =

3

Such a list, by itself, is not very useful. But when used with large and more complex data structures
like MATLAB structures and cell arrays, the comma-separated list can enable you to simplify your
MATLAB code.

Generating a Comma-Separated List


This section describes how to generate a comma-separated list from either a cell array or a MATLAB
structure.

Generating a List from a Cell Array

Extracting multiple elements from a cell array yields a comma-separated list. Given a 4-by-6 cell array
as shown here
C = cell(4,6);
for k = 1:24
C{k} = k*2;
end
C

C =

[2] [10] [18] [26] [34] [42]
[4] [12] [20] [28] [36] [44]
[6] [14] [22] [30] [38] [46]
[8] [16] [24] [32] [40] [48]

extracting the fifth column generates the following comma-separated list:

C{:,5}

ans =

34

ans =

36

ans =

38

ans =

40

This is the same as explicitly typing

C{1,5},C{2,5},C{3,5},C{4,5}

Generating a List from a Structure

For structures, extracting a field of the structure that exists across one of its dimensions yields a
comma-separated list.

Start by converting the cell array used above into a 4-by-1 MATLAB structure with six fields: f1
through f6. Read field f5 for all rows and MATLAB returns a comma-separated list:

S = cell2struct(C,{'f1','f2','f3','f4','f5','f6'},2);
S.f5

ans =

34

ans =

36

ans =

38

ans =

40

This is the same as explicitly typing

S(1).f5,S(2).f5,S(3).f5,S(4).f5

Assigning Output from a Comma-Separated List


You can assign any or all consecutive elements of a comma-separated list to variables with a simple
assignment statement. Using the cell array C from the previous section, assign the first row to
variables c1 through c6:

C = cell(4,6);
for k = 1:24
C{k} = k*2;
end
[c1,c2,c3,c4,c5,c6] = C{1,1:6};
c5

c5 =

34

If you specify fewer output variables than the number of outputs returned by the expression, MATLAB
assigns the first N outputs to those N variables, and then discards any remaining outputs. In this next
example, MATLAB assigns C{1,1:3} to the variables c1, c2, and c3, and then discards C{1,4:6}:

[c1,c2,c3] = C{1,1:6};

You can assign structure outputs in the same manner:

S = cell2struct(C,{'f1','f2','f3','f4','f5','f6'},2);
[sf1,sf2,sf3] = S.f5;
sf3

sf3 =

38

You also can use the deal function for this purpose.
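
As a minimal sketch of the deal alternative, the following code assigns the first three elements of row 1 of C to separate variables. It rebuilds C as in the examples above; the variable names d1 through d3 are illustrative only.

```matlab
% Rebuild the 4-by-6 cell array used in this section.
C = cell(4,6);
for k = 1:24
    C{k} = k*2;
end

% deal copies each input to the corresponding output, so the number of
% inputs must equal the number of outputs (or there must be one input).
[d1,d2,d3] = deal(C{1,1:3});
```

Unlike the direct assignment [c1,c2,c3] = C{1,1:6}, deal does not discard extra values. Passing six inputs while requesting three outputs results in an error.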

Assigning to a Comma-Separated List


The simplest way to assign multiple values to a comma-separated list is to use the deal function. This
function distributes all of its input arguments to the elements of a comma-separated list.

This example uses deal to overwrite each element in a comma-separated list. First create a list.

c{1} = [31 07];
c{2} = [03 78];
c{:}

ans =

31 7

ans =

3 78

Use deal to overwrite each element in the list.

[c{:}] = deal([10 20],[14 12]);
c{:}

ans =

10 20

ans =

14 12

This example does the same as the one above, but with a comma-separated list of vectors in a
structure field:

s(1).field1 = [31 07];
s(2).field1 = [03 78];
s.field1

ans =

31 7

ans =

3 78

Use deal to overwrite the structure fields.

[s.field1] = deal([10 20],[14 12]);
s.field1

ans =

10 20

ans =

14 12
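
The deal function also accepts a single input, which it copies to every element of the list. The following sketch resets the same field in every element of a structure array; the value [0 0] is an arbitrary choice for illustration.

```matlab
% Create a 1-by-2 structure array with one field.
s = struct('field1',{[31 07],[03 78]});

% With one input, deal copies that input to every output.
[s.field1] = deal([0 0]);
```

After this call, both s(1).field1 and s(2).field1 contain [0 0].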

How to Use the Comma-Separated Lists


Common uses for comma-separated lists are

• “Constructing Arrays” on page 2-83
• “Displaying Arrays” on page 2-83
• “Concatenation” on page 2-83
• “Function Call Arguments” on page 2-84
• “Function Return Values” on page 2-84

The following sections provide examples of using comma-separated lists with cell arrays. Each of
these examples applies to MATLAB structures as well.

Constructing Arrays

You can use a comma-separated list to enter a series of elements when constructing a matrix or array.
Note what happens when you insert a list of elements as opposed to adding the cell itself.

When you specify a list of elements with C{:, 5}, MATLAB inserts the four individual elements:
A = {'Hello',C{:,5},magic(4)}

A =

'Hello' [34] [36] [38] [40] [4x4 double]

When you specify the C cell itself, MATLAB inserts the entire cell array:
A = {'Hello',C,magic(4)}

A =

'Hello' {4x6 cell} [4x4 double]

Displaying Arrays

Use a list to display all or part of a structure or cell array:


A{:}

ans =

Hello

ans =

[2] [10] [18] [26] [34] [42]
[4] [12] [20] [28] [36] [44]
[6] [14] [22] [30] [38] [46]
[8] [16] [24] [32] [40] [48]

ans =

16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1

Concatenation

Putting a comma-separated list inside square brackets extracts the specified elements from the list
and concatenates them:

A = [C{:,5:6}]

A =

34 36 38 40 42 44 46 48
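
The same technique applies to a structure array. As a sketch, the following code rebuilds the structure S from the earlier section and collects the values of field f5 into a numeric row vector; the variable name v is illustrative.

```matlab
C = cell(4,6);
for k = 1:24
    C{k} = k*2;
end
S = cell2struct(C,{'f1','f2','f3','f4','f5','f6'},2);

% S.f5 expands to S(1).f5,S(2).f5,S(3).f5,S(4).f5, so the brackets
% concatenate the four scalar values into a row vector.
v = [S.f5]
```

This works only when the field values have sizes that are compatible for horizontal concatenation. Otherwise, collect them in a cell array with {S.f5}.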

Function Call Arguments

When writing the code for a function call, you enter the input arguments as a list with each argument
separated by a comma. If you have these arguments stored in a structure or cell array, then you can
generate all or part of the argument list from the structure or cell array instead. This can be
especially useful when passing in variable numbers of arguments.

This example passes several attribute-value arguments to the plot function:

X = -pi:pi/10:pi;
Y = tan(sin(X)) - sin(tan(X));
C = cell(2,3);
C{1,1} = 'LineWidth';
C{2,1} = 2;
C{1,2} = 'MarkerEdgeColor';
C{2,2} = 'k';
C{1,3} = 'MarkerFaceColor';
C{2,3} = 'g';
figure
plot(X,Y,'--rs',C{:})
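
The technique is not specific to graphics functions. As a smaller sketch that runs without opening a figure, the following code passes a run-time list of arguments to sprintf. The format and the values in args are made up for illustration.

```matlab
% Arguments collected at run time (illustrative values).
args = {'temperature',21.5,'C'};

% args{:} expands to 'temperature',21.5,'C' in the call.
msg = sprintf('%s = %.1f %s',args{:});
disp(msg)
```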

Function Return Values

MATLAB functions can also return more than one value to the caller. These values are returned in a
list with each value separated by a comma. Instead of listing each return value, you can use a comma-
separated list with a structure or cell array. This becomes more useful for those functions that have
variable numbers of return values.

This example returns three values to a cell array:

C = cell(1,3);
[C{:}] = fileparts('work/mytests/strArrays.mat')

C =

'work/mytests' 'strArrays' '.mat'
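
A comma-separated list on the left side is most helpful when the number of outputs is known only at run time. As a sketch, the following code requests one output from size for each dimension of an array; the array x and the name sz are illustrative.

```matlab
x = zeros(3,4,5);

% Request one output from size for each dimension of x.
sz = cell(1,ndims(x));
[sz{:}] = size(x);
```

Here sz is a 1-by-3 cell array that contains 3, 4, and 5.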

Fast Fourier Transform Example


The fftshift function swaps the left and right halves of each dimension of an array. For a simple
vector such as [0 2 4 6 8 10] the output would be [6 8 10 0 2 4]. For a multidimensional
array, fftshift performs this swap along each dimension.

fftshift uses vectors of indices to perform the swap. For the vector shown above, the index [1 2
3 4 5 6] is rearranged to form a new index [4 5 6 1 2 3]. The function then uses this index
vector to reposition the elements. For a multidimensional array, fftshift must construct an index
vector for each dimension. A comma-separated list makes this task much simpler.

Here is the fftshift function:

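function y = fftshift(x)
    numDims = ndims(x);
    idx = cell(1,numDims);         % one index vector per dimension
    for k = 1:numDims
        m = size(x,k);             % length of dimension k
        p = ceil(m/2);             % last index of the first half
        idx{k} = [p+1:m 1:p];      % second half first, then first half
    end
    y = x(idx{:});                 % same as x(idx{1},idx{2},...)
end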

The function stores the index vectors in cell array idx. Building this cell array is relatively simple.
For each of the N dimensions, determine the size of that dimension and find the integer index nearest
the midpoint. Then, construct a vector that swaps the two halves of that dimension.

By using a cell array to store the index vectors and a comma-separated list for the indexing operation,
fftshift shifts arrays of any dimension using just a single operation: y = x(idx{:}). If you were
to use explicit indexing, you would need to write one if statement for each dimension you want the
function to handle:

if ndims(x) == 1
    y = x(index1);
elseif ndims(x) == 2
    y = x(index1,index2);
end

Another way to handle this without a comma-separated list would be to loop over each dimension,
converting one dimension at a time and moving data each time. With a comma-separated list, you
move the data just once. A comma-separated list makes it very easy to generalize the swapping
operation to an arbitrary number of dimensions.
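
To see the indexing step in isolation, the following sketch applies the same index construction to a small matrix at the command line and compares the result with the fftshift function that ships with MATLAB. The test matrix is arbitrary.

```matlab
x = reshape(1:12,3,4);        % 3-by-4 test matrix

% Build one index vector per dimension, as in the function above.
idx = cell(1,ndims(x));
for k = 1:ndims(x)
    m = size(x,k);
    p = ceil(m/2);
    idx{k} = [p+1:m 1:p];
end

% idx{:} expands to idx{1},idx{2}, so this is x(idx{1},idx{2}).
y = x(idx{:});

% Compare with the shipped function.
tf = isequal(y,fftshift(x))
```

Because the list expands to as many subscripts as there are cells in idx, the same code works unchanged for arrays with more dimensions.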


Alternatives to the eval Function


In this section...
“Why Avoid the eval Function?” on page 2-86
“Variables with Sequential Names” on page 2-86
“Files with Sequential Names” on page 2-87
“Function Names in Variables” on page 2-87
“Field Names in Variables” on page 2-88
“Error Handling” on page 2-88

Why Avoid the eval Function?


Although the eval function is very powerful and flexible, it is not always the best solution to a
programming problem. Code that calls eval is often less efficient and more difficult to read and
debug than code that uses other functions or language constructs. For example:

• MATLAB compiles code the first time you run it to enhance performance for future runs. However,
because code in an eval statement can change at run time, it is not compiled.
• Code within an eval statement can unexpectedly create or assign to a variable already in the
current workspace, overwriting existing data.
• Concatenated character vectors within an eval statement are often difficult to read. Other
language constructs can simplify the syntax in your code.

For many common uses of eval, there are preferred alternate approaches, as shown in the following
examples.

Variables with Sequential Names


A frequent use of the eval function is to create sets of variables such as A1, A2, ..., An, but this
approach does not use the array processing power of MATLAB and is not recommended. The
preferred method is to store related data in a single array. If the data sets are of different types or
sizes, use a structure or cell array.

For example, create a cell array that contains 10 elements, where each element is a numeric array:

numArrays = 10;
A = cell(numArrays,1);
for n = 1:numArrays
A{n} = magic(n);
end

Access the data in the cell array by indexing with curly braces. For example, display the fifth element
of A:

A{5}

ans =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9

The assignment statement A{n} = magic(n) is more elegant and efficient than this call to eval:
eval(['A', int2str(n),' = magic(n)']) % Not recommended
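
Storing the arrays in one container also makes it possible to process all of them in a single call. As a sketch, the following code counts the elements in each stored array with cellfun; the names counts and total are illustrative.

```matlab
numArrays = 10;
A = cell(numArrays,1);
for n = 1:numArrays
    A{n} = magic(n);
end

% Apply a function to every stored array without constructing variable names.
counts = cellfun(@numel,A);     % number of elements in each array
total = sum(counts)
```

With separately named variables A1 through A10, the same computation would require another eval call for each variable.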

For more information, see:

• “Create Cell Array” on page 12-3
• “Structure Arrays” on page 11-2

Files with Sequential Names


Related data files often have a common root name with an integer index, such as myfile1.mat
through myfileN.mat. A common (but not recommended) use of the eval function is to construct
and pass each file name to a function using command syntax, such as
eval(['save myfile',int2str(n),'.mat']) % Not recommended

The best practice is to use function syntax, which allows you to pass variables as inputs. For example:
currentFile = 'myfile1.mat';
save(currentFile)

You can construct file names within a loop using the sprintf function (which is usually more
efficient than int2str), and then call the save function without eval. This code creates 10 files in
the current folder:
numFiles = 10;
for n = 1:numFiles
randomData = rand(n);
currentFile = sprintf('myfile%d.mat',n);
save(currentFile,'randomData')
end
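
The same pattern reads the files back without eval. The following sketch writes three files to a scratch folder and then loads each one into a cell array. The scratch folder (created with tempname) and the name allData are assumptions made for this illustration, chosen so that the example does not write to the current folder.

```matlab
numFiles = 3;
folder = tempname;              % hypothetical scratch folder for this sketch
mkdir(folder)

% Write the files, as in the example above.
for n = 1:numFiles
    randomData = rand(n);
    currentFile = fullfile(folder,sprintf('myfile%d.mat',n));
    save(currentFile,'randomData')
end

% Read them back into a cell array instead of numbered variables.
allData = cell(numFiles,1);
for n = 1:numFiles
    currentFile = fullfile(folder,sprintf('myfile%d.mat',n));
    contents = load(currentFile,'randomData');
    allData{n} = contents.randomData;
end
```

Calling load with an output argument returns the file contents in a structure, so loading a file cannot overwrite variables in the workspace.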

For more information, see:

• “Choose Command Syntax or Function Syntax” on page 1-7
• “Import or Export a Sequence of Files”

Function Names in Variables


A common use of eval is to execute a function when the name of the function is in a variable
character vector. There are two ways to evaluate functions from variables that are more efficient than
using eval:

• Create function handles with the @ symbol or with the str2func function. For example, run a
function from a list stored in a cell array:
examples = {@odedemo,@sunspots,@fitdemo};
n = input('Select an example (1, 2, or 3): ');
examples{n}()
• Use the feval function. For example, call a plot function (such as plot, bar, or pie) with data
that you specify at run time:
plotFunction = input('Specify a plotting function: ','s');
data = input('Enter data to plot: ');
feval(plotFunction,data)
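
The two examples above require interactive input. As a noninteractive sketch of the str2func approach, the following code converts a function name held in a character vector to a function handle and calls it. The name 'cos' stands in for any name determined at run time.

```matlab
functionName = 'cos';               % name known only as text
fh = str2func(functionName);        % convert the name to a function handle
y = fh(0)
```

Once the handle exists, you can call it repeatedly or pass it to other functions without converting the name again.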

Field Names in Variables


Access data in a structure with a variable field name by enclosing the expression for the field in
parentheses. For example:

myData.height = [67, 72, 58];
myData.weight = [140, 205, 90];
fieldName = input('Select data (height or weight): ','s');
dataToUse = myData.(fieldName);

If you enter weight at the input prompt, then you can find the minimum weight value with the
following command.

min(dataToUse)

ans =
90

For an additional example, see “Generate Field Names from Variables” on page 11-10.
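
Dynamic field names also make it possible to loop over every field of a structure. The following sketch, which needs no interactive input, prints the minimum value stored in each field of myData; the loop variable names are illustrative.

```matlab
myData.height = [67, 72, 58];
myData.weight = [140, 205, 90];

names = fieldnames(myData);
for k = 1:numel(names)
    fieldName = names{k};
    % myData.(fieldName) reads the field whose name is stored in fieldName.
    fprintf('%s: min = %d\n',fieldName,min(myData.(fieldName)));
end
```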

Error Handling
The preferred method for error handling in MATLAB is to use a try, catch statement. For example:

try
B = A;
catch exception
disp('A is undefined')
end

If your workspace does not contain variable A, then this code returns:

A is undefined

Previous versions of the documentation for the eval function include the syntax
eval(expression,catch_expr). If evaluating the expression input returns an error, then eval
evaluates catch_expr. However, an explicit try/catch is significantly clearer than an implicit
catch in an eval statement. Using the implicit catch is not recommended.
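
The variable named after catch holds an object that describes the error, which allows a more specific response than a fixed message. As a sketch, the following code prints the identifier and message of the caught error. The variable name undefinedVariableForDemo is hypothetical and is assumed not to exist in the workspace or on the path.

```matlab
try
    B = undefinedVariableForDemo;   % hypothetical name, assumed not to exist
catch exception
    % The caught object describes what went wrong.
    fprintf('Identifier: %s\n',exception.identifier);
    fprintf('Message: %s\n',exception.message);
end
```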

Classes (Data Types)

3

Overview of MATLAB Classes



Fundamental MATLAB Classes


There are many different data types, or classes, that you can work with in MATLAB. You can build
matrices and arrays of floating-point and integer data, characters and strings, logical true and
false values, and so on. Function handles connect your code with any MATLAB function regardless
of the current scope. Tables, timetables, structures, and cell arrays provide a way to store dissimilar
types of data in the same container.

There are 16 fundamental classes in MATLAB. Each of these classes is in the form of a matrix or
array. With the exception of function handles, this matrix or array is a minimum of 0-by-0 in size and
can grow to an n-dimensional array of any size. A function handle is always scalar (1-by-1).

All of the fundamental MATLAB classes are shown in the diagram below:

Numeric classes in the MATLAB software include signed and unsigned integers, and single- and
double-precision floating-point numbers. By default, MATLAB stores all numeric values as double-
precision floating point. (You cannot change the default type and precision.) You can choose to store
any number, or array of numbers, as integers or as single-precision. Integer and single-precision
arrays offer more memory-efficient storage than double-precision.

All numeric types support basic array operations, such as subscripting, reshaping, and mathematical
operations.

You can create two-dimensional double and logical matrices using one of two storage formats:
full or sparse. For matrices with mostly zero-valued elements, a sparse matrix requires a fraction
of the storage space required for an equivalent full matrix. Sparse matrices invoke methods
especially tailored to solve sparse problems.
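
As a rough illustration of the storage difference, the following sketch stores the same identity matrix in full and sparse form and reports the bytes that each uses. The matrix size is arbitrary, and the exact byte counts depend on the platform, so none are shown here.

```matlab
n = 1000;
F = eye(n);         % full identity matrix: stores all n*n elements
S = sparse(F);      % sparse copy: stores only the n nonzero elements

infoF = whos('F');
infoS = whos('S');
fprintf('full: %d bytes, sparse: %d bytes\n',infoF.bytes,infoS.bytes);
```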

These classes require different amounts of storage, the smallest being a logical value or 8-bit
integer which requires only 1 byte. It is important to keep this minimum size in mind if you work on
data in files that were written using a precision smaller than 8 bits.

The following table describes the fundamental classes in more detail.

Class Name: double, single
Documentation: Floating-Point Numbers on page 4-6
Intended Use:
• Required for fractional numeric data.
• Double on page 4-6 and Single on page 4-6 precision.
• Use realmin and realmax to show range of values on page 4-9.
• Two-dimensional arrays can be sparse.
• Default numeric type in MATLAB.

Class Name: int8, uint8, int16, uint16, int32, uint32, int64, uint64
Documentation: Integers on page 4-2
Intended Use:
• Use for signed and unsigned whole numbers.
• More efficient use of memory on page 31-2.
• Use intmin and intmax to show range of values on page 4-4.
• Choose from 4 sizes (8, 16, 32, and 64 bits).

Class Name: char, string
Documentation: “Characters and Strings”
Intended Use:
• Data type for text.
• Native or Unicode®.
• Converts to/from numeric.
• Use with regular expressions on page 2-51.
• For multiple character arrays, use cell arrays.
• Starting in R2016b, you also can store text in string arrays. For more information, see string.

Class Name: logical
Documentation: “Logical (Boolean) Operations”
Intended Use:
• Use in relational conditions or to test state.
• Can have one of two values: true or false.
• Also useful in array indexing.
• Two-dimensional arrays can be sparse.

Class Name: function_handle
Documentation: “Function Handles”
Intended Use:
• Pointer to a function.
• Enables passing a function to another function.
• Can also call functions outside usual scope.
• Use to specify graphics callback functions.
• Save to MAT-file and restore later.

Class Name: table, timetable
Documentation: “Tables”, “Timetables”
Intended Use:
• Tables are rectangular containers for mixed-type, column-oriented data.
• Tables have row and variable names that identify contents.
• Timetables also provide storage for data in a table with rows labeled by time. Timetable functions can synchronize, resample, or aggregate timestamped data.
• Use the properties of a table or timetable to store metadata such as variable units.
• Manipulation of elements similar to numeric or logical arrays.
• Access data by numeric or named index.
• Can select a subset of data and preserve the table container or can extract the data from a table.

Class Name: struct
Documentation: “Structures”
Intended Use:
• Fields store arrays of varying classes and sizes.
• Access one or all fields/indices in single operation.
• Field names identify contents.
• Method of passing function arguments.
• Use in comma-separated lists on page 2-79.
• More memory required for overhead.

Class Name: cell
Documentation: “Cell Arrays”
Intended Use:
• Cells store arrays of varying classes and sizes.
• Allows freedom to package data as you want.
• Manipulation of elements is similar to numeric or logical arrays.
• Method of passing function arguments.
• Use in comma-separated lists.
• More memory required for overhead.

See Also

More About
• “Valid Combinations of Unlike Classes” on page 15-2

4

Numeric Classes

• “Integers” on page 4-2
• “Floating-Point Numbers” on page 4-6
• “Create Complex Numbers” on page 4-13
• “Infinity and NaN” on page 4-14
• “Identifying Numeric Classes” on page 4-16
• “Display Format for Numeric Values” on page 4-17
• “Integer Arithmetic” on page 4-19
• “Single Precision Math” on page 4-26

Integers
In this section...
“Integer Classes” on page 4-2
“Creating Integer Data” on page 4-2
“Arithmetic Operations on Integer Classes” on page 4-4
“Largest and Smallest Values for Integer Classes” on page 4-4

Integer Classes
MATLAB has four signed and four unsigned integer classes. Signed types enable you to work with
negative integers as well as positive, but their largest positive value is smaller than that of the
unsigned type of the same size because one bit is used to designate the sign of the number.
Unsigned types give you a wider range of positive numbers, but these numbers can only be zero or positive.

MATLAB supports 1-, 2-, 4-, and 8-byte storage for integer data. You can save memory and execution
time for your programs if you use the smallest integer type that accommodates your data. For
example, you do not need a 32-bit integer to store the value 100.

Here are the eight integer classes, the range of values you can store with each type, and the MATLAB
conversion function required to create that type:

Class                     Range of Values    Conversion Function
Signed 8-bit integer      -2^7 to 2^7-1      int8
Signed 16-bit integer     -2^15 to 2^15-1    int16
Signed 32-bit integer     -2^31 to 2^31-1    int32
Signed 64-bit integer     -2^63 to 2^63-1    int64
Unsigned 8-bit integer    0 to 2^8-1         uint8
Unsigned 16-bit integer   0 to 2^16-1        uint16
Unsigned 32-bit integer   0 to 2^32-1        uint32
Unsigned 64-bit integer   0 to 2^64-1        uint64

Creating Integer Data


MATLAB stores numeric data as double-precision floating point (double) by default. To store data as
an integer, you need to convert from double to the desired integer type. Use one of the conversion
functions shown in the table above.

For example, to store 325 as a 16-bit signed integer assigned to variable x, type
x = int16(325);

If the number being converted to an integer has a fractional part, MATLAB rounds to the nearest
integer. If the fractional part is exactly 0.5, then from the two equally near integers, MATLAB
chooses the one with the larger absolute value:
x = 325.499;
int16(x)

ans =

int16

325

x = x + .001;
int16(x)
ans =

int16

326

If you need to round a number using a rounding scheme other than the default, MATLAB provides
four rounding functions: round, fix, floor, and ceil. The fix function enables you to override the
default and round towards zero when there is a nonzero fractional part:
x = 325.9;

int16(fix(x))
ans =

int16

325
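
The four rounding functions differ in how they treat the fractional part, which is easiest to see on a positive and a negative value that both lie halfway between two integers. The following sketch compares them; the input values are arbitrary.

```matlab
x = [-2.5 2.5];

r1 = round(x)    % nearest integer, ties away from zero: -3  3
r2 = fix(x)      % toward zero:                          -2  2
r3 = floor(x)    % toward negative infinity:             -3  2
r4 = ceil(x)     % toward positive infinity:             -2  3
```

Apply the rounding function before the integer conversion function, as in int16(fix(x)), so that the conversion receives a whole number.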

Arithmetic operations that involve both integers and floating-point numbers always result in an integer
data type. MATLAB rounds the result, when necessary, according to the default rounding algorithm. The
example below yields an exact answer of 1426.75, which MATLAB then rounds to the nearest
integer:
int16(325) * 4.39
ans =

int16

1427

The integer conversion functions are also useful when converting other classes, such as character
vectors, to integers:
str = 'Hello World';
int8(str)
ans =

1×11 int8 row vector

72 101 108 108 111 32 87 111 114 108 100

If you convert a NaN value into an integer class, the result is a value of 0 in that integer class. For
example,
int32(NaN)
ans =

int32

0

Arithmetic Operations on Integer Classes


MATLAB can perform integer arithmetic on the following types of data:

• Integers or integer arrays of the same integer data type. This yields a result that has the same
data type as the operands:

x = uint32([132 347 528]) .* uint32(75);
class(x)
ans =
uint32
• Integers or integer arrays and scalar double-precision floating-point numbers. This yields a result
that has the same data type as the integer operands:

x = uint32([132 347 528]) .* 75.49;
class(x)
ans =
uint32

For all binary operations in which one operand is an array of integer data type (except 64-bit
integers) and the other is a scalar double, MATLAB computes the operation using element-wise
double-precision arithmetic, and then converts the result back to the original integer data type. For
binary operations involving a 64-bit integer array and a scalar double, MATLAB computes the
operation as if 80-bit extended-precision arithmetic were used, to prevent loss of precision.

Operations involving complex numbers with integer types are not supported.
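
Both cases listed above require the integer operands to share one class. As a sketch of what happens otherwise, the following code attempts to add an int8 value to an int16 value, displays the resulting error message, and then converts one operand so that the classes match. The variable names are illustrative.

```matlab
a = int8(10);
b = int16(20);

try
    c = a + b;              % operands have different integer classes
catch exception
    disp(exception.message)
end

% Convert one operand so that both have the same class.
c = int16(a) + b;
class(c)
```

Converting to the larger of the two classes, as here, avoids saturating a value that fits in only one of them.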

Largest and Smallest Values for Integer Classes


For each integer data type, there is a largest and smallest number that you can represent with that
type. The table shown under “Integers” on page 4-2 lists the largest and smallest values for each
integer data type in the “Range of Values” column.

You can also obtain these values with the intmax and intmin functions:

intmax('int8')
ans =

int8

127

intmin('int8')
ans =

int8

-128

If you convert a number that is larger than the maximum value of an integer data type to that type,
MATLAB sets it to the maximum value. Similarly, if you convert a number that is smaller than the
minimum value of the integer data type, MATLAB sets it to the minimum value. For example,

x = int8(300)
x =

int8

127

x = int8(-300)
x =

int8

-128

Also, when the result of an arithmetic operation involving integers exceeds the maximum (or
minimum) value of the data type, MATLAB sets it to the maximum (or minimum) value:

x = int8(100) * 3
x =

int8

127

x = int8(-100) * 3
x =

int8

-128
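
Because saturation happens silently, it can be useful to check for out-of-range values before converting. The following is a minimal sketch of such a check for int8; the input values and the name willSaturate are illustrative.

```matlab
values = [100 300 -300];

% Compare in double precision against the limits of the target class.
lo = double(intmin('int8'));
hi = double(intmax('int8'));
willSaturate = values > hi | values < lo;

converted = int8(values);   % out-of-range elements become 127 or -128
```

Here willSaturate is true for the second and third elements, which convert to 127 and -128.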


Floating-Point Numbers
In this section...
“Double-Precision Floating Point” on page 4-6
“Single-Precision Floating Point” on page 4-6
“Creating Floating-Point Data” on page 4-6
“Arithmetic Operations on Floating-Point Numbers” on page 4-8
“Largest and Smallest Values for Floating-Point Classes” on page 4-9
“Accuracy of Floating-Point Data” on page 4-10
“Avoiding Common Problems with Floating-Point Arithmetic” on page 4-11

MATLAB represents floating-point numbers in either double-precision or single-precision format. The
default is double precision, but you can make any number single precision with a simple conversion
function.

Double-Precision Floating Point


MATLAB constructs the double-precision (or double) data type according to IEEE® Standard 754 for
double precision. Any value stored as a double requires 64 bits, formatted as shown in the table
below:

Bits Usage
63 Sign (0 = positive, 1 = negative)
62 to 52 Exponent, biased by 1023
51 to 0 Fraction f of the number 1.f

Single-Precision Floating Point


MATLAB constructs the single-precision (or single) data type according to IEEE Standard 754 for
single precision. Any value stored as a single requires 32 bits, formatted as shown in the table
below:

Bits Usage
31 Sign (0 = positive, 1 = negative)
30 to 23 Exponent, biased by 127
22 to 0 Fraction f of the number 1.f

Because MATLAB stores numbers of type single using 32 bits, they require less memory than
numbers of type double, which use 64 bits. However, because they are stored with fewer bits,
numbers of type single are represented to less precision than numbers of type double.

Creating Floating-Point Data


Use double-precision to store values greater than approximately 3.4 x 10^38 or less than approximately
-3.4 x 10^38. For numbers that lie between these two limits, you can use either double- or single-
precision, but single requires less memory.


Creating Double-Precision Data

Because the default numeric type for MATLAB is double, you can create a double with a simple
assignment statement:
x = 25.783;

The whos function shows that MATLAB has created a 1-by-1 array of type double for the value you
just stored in x:
whos x
Name Size Bytes Class

x 1x1 8 double

Use isfloat if you just want to verify that x is a floating-point number. This function returns logical
1 (true) if the input is a floating-point number, and logical 0 (false) otherwise:
isfloat(x)
ans =

logical

1

You can convert other numeric data, characters or strings, and logical data to double precision using
the MATLAB function, double. This example converts a signed integer to double-precision floating
point:
y = int64(-589324077574); % Create a 64-bit integer
x = double(y) % Convert to double

x =
-5.8932e+11

Creating Single-Precision Data

Because MATLAB stores numeric data as a double by default, you need to use the single
conversion function to create a single-precision number:
x = single(25.783);

The whos function returns the attributes of variable x in a structure. The bytes field of this structure
shows that when x is stored as a single, it requires just 4 bytes compared with the 8 bytes to store it
as a double:
xAttrib = whos('x');
xAttrib.bytes
ans =
4

You can convert other numeric data, characters or strings, and logical data to single precision using
the single function. This example converts a signed integer to single-precision floating point:
y = int64(-589324077574); % Create a 64-bit integer
x = single(y) % Convert to single

x =

single

-5.8932e+11
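
The short display format hides the fact that this integer is too large to be stored exactly in single precision. As a sketch, the following code converts the single value back to int64 to measure the rounding, and uses eps to show the spacing of single-precision values at this magnitude. The names xs, roundingError, and spacing are illustrative.

```matlab
y = int64(-589324077574);
xs = single(y);

% Convert back to int64 to measure the rounding introduced by single.
roundingError = int64(xs) - y
spacing = eps(xs)           % gap between adjacent single values near xs
```

The magnitude of the rounding error is at most half of the spacing.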

Arithmetic Operations on Floating-Point Numbers


This section describes which classes you can use in arithmetic operations with floating-point
numbers.

Double-Precision Operations

You can perform basic arithmetic operations with double and any of the following other classes.
When one or more operands is an integer (scalar or array), the double operand must be a scalar. The
result is of type double, except where noted otherwise:

• single — The result is of type single
• double
• int* or uint* — The result has the same data type as the integer operand
• char
• logical

This example performs arithmetic on data of types char and double. The result is of type double:

c = 'uppercase' - 32;

class(c)
ans =
double

char(c)
ans =
UPPERCASE

Single-Precision Operations

You can perform basic arithmetic operations with single and any of the following other classes. The
result is always single:

• single
• double
• char
• logical

In this example, 7.5 defaults to type double, and the result is of type single:

x = single([1.32 3.47 5.28]) .* 7.5;

class(x)
ans =
single


Largest and Smallest Values for Floating-Point Classes


For the double and single classes, there is a largest and smallest number that you can represent
with that type.

Largest and Smallest Double-Precision Values

The MATLAB functions realmax and realmin return the maximum and minimum values that you
can represent with the double data type:

str = 'The range for double is:\n\t%g to %g and\n\t %g to %g';
sprintf(str, -realmax, -realmin, realmin, realmax)

ans =
The range for double is:
-1.79769e+308 to -2.22507e-308 and
2.22507e-308 to 1.79769e+308

Numbers larger than realmax or smaller than -realmax are assigned the values of positive and
negative infinity, respectively:

realmax + .0001e+308
ans =
Inf

-realmax - .0001e+308
ans =
-Inf

Largest and Smallest Single-Precision Values

The MATLAB functions realmax and realmin, when called with the argument 'single', return the
maximum and minimum values that you can represent with the single data type:

str = 'The range for single is:\n\t%g to %g and\n\t %g to %g';
sprintf(str, -realmax('single'), -realmin('single'), ...
realmin('single'), realmax('single'))

ans =
The range for single is:
-3.40282e+38 to -1.17549e-38 and
1.17549e-38 to 3.40282e+38

Numbers larger than realmax('single') or smaller than -realmax('single') are assigned the
values of positive and negative infinity, respectively:

realmax('single') + .0001e+038
ans =

single

Inf

-realmax('single') - .0001e+038
ans =

single

-Inf

Accuracy of Floating-Point Data


If the result of a floating-point arithmetic computation is not as precise as you had expected, the
likely cause is the finite precision of your computer's hardware. The hardware did not have enough
bits to represent the result exactly, so it rounded the resulting value to a number that it can
represent.

Double-Precision Accuracy

Because there are only a finite number of double-precision numbers, you cannot represent all
numbers in double-precision storage. On any computer, there is a small gap between each double-
precision number and the next larger double-precision number. You can determine the size of this
gap, which limits the precision of your results, using the eps function. For example, to find the
distance between 5 and the next larger double-precision number, enter

format long

eps(5)
ans =
8.881784197001252e-16

This tells you that there are no double-precision numbers between 5 and 5 + eps(5). If a double-
precision computation returns the answer 5, the result is only accurate to within eps(5).

The value of eps(x) depends on x. This example shows that, as x gets larger, so does eps(x):

eps(50)
ans =
7.105427357601002e-15

If you enter eps with no input argument, MATLAB returns the value of eps(1), the distance from 1
to the next larger double-precision number.
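
A short sketch makes the meaning of the gap concrete: adding the full gap to 5 produces a different number, while adding a quarter of the gap produces a sum that rounds back to 5. The names gap, isLarger, and isSame are illustrative.

```matlab
x = 5;
gap = eps(x);

isLarger = x + gap > x      % true: x + eps(x) is the next larger double
isSame = x + gap/4 == x     % true: the sum rounds back to x
```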

Single-Precision Accuracy

Similarly, there are gaps between any two single-precision numbers. If x has type single, eps(x)
returns the distance between x and the next larger single-precision number. For example,

x = single(5);
eps(x)

returns

ans =

single

4.7684e-07

Note that this result is larger than eps(5). Because there are fewer single-precision numbers than
double-precision numbers, the gaps between the single-precision numbers are larger than the gaps
between double-precision numbers. This means that results in single-precision arithmetic are less
precise than in double-precision arithmetic.


For a number x of type double, eps(single(x)) gives you an upper bound for the amount that x is
rounded when you convert it from double to single. For example, when you convert the double-
precision number 3.14 to single, it is rounded by
double(single(3.14) - 3.14)
ans =
1.0490e-07

The amount that 3.14 is rounded is less than
eps(single(3.14))
ans =

single

2.3842e-07

Avoiding Common Problems with Floating-Point Arithmetic


Almost all operations in MATLAB are performed in double-precision arithmetic conforming to the
IEEE standard 754. Because computers only represent numbers to a finite precision (double precision
calls for 52 mantissa bits), computations sometimes yield mathematically nonintuitive results. It is
important to note that these results are not bugs in MATLAB.

Use the following examples to help you identify these cases:

Example 1 — Round-Off or What You Get Is Not What You Expect

The decimal number 4/3 is not exactly representable as a binary fraction. For this reason, the
following calculation does not give zero, but rather reveals the quantity eps.
e = 1 - 3*(4/3 - 1)

e =
2.2204e-16

Similarly, 0.1 is not exactly representable as a binary number. Thus, you get the following
nonintuitive behavior:
a = 0.0;
for i = 1:10
a = a + 0.1;
end
a == 1
ans =

logical

0
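
A common way to handle this behavior is to compare floating-point values to within a tolerance instead of testing for exact equality. The following is a minimal sketch; the tolerance of 4*eps(1) is an arbitrary choice for illustration, and a suitable value depends on the magnitude of your data and the number of operations performed.

```matlab
a = 0.0;
for i = 1:10
    a = a + 0.1;
end

tol = 4*eps(1);                 % illustrative tolerance
isClose = abs(a - 1) < tol      % true: a equals 1 to within round-off
```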

Note that the order of operations can matter in the computation:
b = 1e-16 + 1 - 1e-16;
c = 1e-16 - 1e-16 + 1;
b == c
ans =

logical

0

There are gaps between floating-point numbers. As the numbers get larger, so do the gaps, as
evidenced by:

(2^53 + 1) - 2^53

ans =
0

Since pi is not really π, it is not surprising that sin(pi) is not exactly zero:

sin(pi)

ans =
1.224646799147353e-16

Example 2 — Catastrophic Cancellation

When subtractions are performed with nearly equal operands, sometimes cancellation can occur
unexpectedly. The following is an example of a cancellation caused by swamping (loss of precision
that makes the addition insignificant).

sqrt(1e-16 + 1) - 1

ans =
0

Some functions in MATLAB, such as expm1 and log1p, may be used to compensate for the effects of
catastrophic cancellation.
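
The following sketch contrasts the naive expressions with expm1 and log1p for a small argument. The value of x and the variable names are assumptions made for illustration only.

```matlab
x = 1e-16;                  % small argument, below eps(1)/2

naiveExp = exp(x) - 1;      % x is lost to round-off in exp(x)
accurateExp = expm1(x);     % computes exp(x)-1 without forming exp(x)

naiveLog = log(1 + x);      % 1 + x rounds to exactly 1, so this is 0
accurateLog = log1p(x);     % computes log(1+x) without forming 1+x
```

For arguments this small, both expm1(x) and log1p(x) are approximately equal to x itself, whereas the naive forms lose essentially all significant digits.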

Example 3 — Floating-Point Operations and Linear Algebra

Round-off, cancellation, and other traits of floating-point arithmetic combine to produce startling
computations when solving the problems of linear algebra. MATLAB warns that the following matrix A
is ill-conditioned, and therefore the system Ax = b may be sensitive to small perturbations:

A = diag([2 eps]);
b = [2; eps];
y = A\b;
Warning: Matrix is close to singular or badly scaled.
Results may be inaccurate. RCOND = 1.110223e-16.

These are only a few of the examples showing how IEEE floating-point arithmetic affects
computations in MATLAB. Note that all computations performed in IEEE 754 arithmetic are affected;
this includes applications written in C or FORTRAN, as well as MATLAB.

References
[1] Moler, Cleve. “Floating Points.” MATLAB News and Notes. Fall, 1996.

[2] Moler, Cleve. Numerical Computing with MATLAB. Natick, MA: The MathWorks, Inc., 2004.

Create Complex Numbers


Complex numbers consist of two separate parts: a real part and an imaginary part. The basic
imaginary unit is equal to the square root of -1. This is represented in MATLAB by either of two
letters: i or j.

The following statement shows one way of creating a complex value in MATLAB. The variable x is
assigned a complex number with a real part of 2 and an imaginary part of 3:

x = 2 + 3i;

Another way to create a complex number is using the complex function. This function combines two
numeric inputs into a complex output, making the first input real and the second imaginary:

x = rand(3) * 5;
y = rand(3) * -8;

z = complex(x, y)
z =
4.7842 -1.0921i 0.8648 -1.5931i 1.2616 -2.2753i
2.6130 -0.0941i 4.8987 -2.3898i 4.3787 -3.7538i
4.4007 -7.1512i 1.3572 -5.2915i 3.6865 -0.5182i

You can separate a complex number into its real and imaginary parts using the real and imag
functions:

zr = real(z)
zr =
4.7842 0.8648 1.2616
2.6130 4.8987 4.3787
4.4007 1.3572 3.6865

zi = imag(z)
zi =
-1.0921 -1.5931 -2.2753
-0.0941 -2.3898 -3.7538
-7.1512 -5.2915 -0.5182
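
As a small illustration of how these functions fit together, the sketch below splits a complex scalar and rebuilds it, and also shows one difference between the complex function and ordinary addition when the imaginary part is zero. The variable names are illustrative.

```matlab
z = complex(3, 4);

re = real(z);                    % 3
im = imag(z);                    % 4
rebuilt = complex(re, im);       % same value as z
roundTrip = isequal(rebuilt, z)  % true

% complex keeps the complex attribute even for a zero imaginary part,
% whereas addition of 0i returns a real value
keptComplex = ~isreal(complex(3, 0))   % true
becameReal = isreal(3 + 0i)            % true
```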

Infinity and NaN


In this section...
“Infinity” on page 4-14
“NaN” on page 4-14

Infinity
MATLAB represents infinity by the special value Inf. Infinity results from operations like division by
zero and overflow, which lead to results too large to represent as conventional floating-point values.
MATLAB also provides a function called Inf that returns the IEEE arithmetic representation for
positive infinity as a double scalar value.

Several examples of statements that return positive or negative infinity in MATLAB are shown here.

x = 1/0
x =
Inf

x = 1.e1000
x =
Inf

x = exp(1000)
x =
Inf

x = log(0)
x =
-Inf

Use the isinf function to verify that x is positive or negative infinity:

x = log(0);

isinf(x)
ans =
1
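
Since isinf does not distinguish the sign of the infinity, a comparison with Inf or -Inf can be used when the sign matters. This is a minimal sketch; the array contents and variable names are illustrative assumptions.

```matlab
% Positive infinity, negative infinity, a finite value, and an overflow
x = [1/0, log(0), 5, realmax*2];

anyInf = isinf(x)        % [1 1 0 1]
posInf = (x == Inf)      % [1 0 0 1]
negInf = (x == -Inf)     % [0 1 0 0]
```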

NaN
MATLAB represents values that are not real or complex numbers with a special value called NaN,
which stands for “Not a Number”. Expressions like 0/0 and inf/inf result in NaN, as do any
arithmetic operations involving a NaN:

x = 0/0
x =

NaN

You can also create NaNs by:

x = NaN;

whos x
Name Size Bytes Class

x 1x1 8 double

The NaN function returns one of the IEEE arithmetic representations for NaN as a double scalar
value. The exact bit-wise hexadecimal representation of this NaN value is,

format hex
x = NaN

x =

fff8000000000000

Always use the isnan function to verify that the elements in an array are NaN:

isnan(x)
ans =
1

MATLAB preserves the “Not a Number” status of alternate NaN representations and treats all of the
different representations of NaN equivalently. However, in some special cases (perhaps due to
hardware limitations), MATLAB does not preserve the exact bit pattern of alternate NaN
representations throughout an entire calculation, and instead uses the canonical NaN bit pattern
defined above.

Logical Operations on NaN

Because two NaNs are not equal to each other, relational comparisons involving NaN always return
false, except for a test for inequality (NaN ~= NaN):

NaN > NaN


ans =
0

NaN ~= NaN
ans =
1
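
One practical consequence is that == cannot be used to locate NaN values, and that isequal reports arrays containing NaN as different even when they are copies of each other. The sketch below uses illustrative variable names.

```matlab
x = [1 NaN 3];

viaEquals = (x == NaN)        % [0 0 0], the comparison never matches
viaIsnan = isnan(x)           % [0 1 0]

sameStrict = isequal(x, x)    % false, because NaN is not equal to NaN
sameNaN = isequaln(x, x)      % true, isequaln treats NaN values as equal
```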

Identifying Numeric Classes


You can check the data type of a variable x using any of these commands.

Command               Operation
whos x                Display the data type of x.
xType = class(x);     Assign the data type of x to a variable.
isnumeric(x)          Determine if x is a numeric type.
isa(x, 'integer')     Determine if x is the specified numeric type. (Examples for any
isa(x, 'uint64')      integer, unsigned 64-bit integer, any floating point, double precision,
isa(x, 'float')       and single precision are shown here.)
isa(x, 'double')
isa(x, 'single')
isreal(x)             Determine if x is real or complex.
isnan(x)              Determine if x is Not a Number (NaN).
isinf(x)              Determine if x is infinite.
isfinite(x)           Determine if x is finite.
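
For instance, several of the commands in the table could be applied to an unsigned 64-bit integer and to a single-precision value as follows. The variables x and y and the result names are illustrative.

```matlab
x = uint64(5);
y = single(2.5);

xType = class(x)               % 'uint64'
isInt = isa(x, 'integer')      % true, uint64 is an integer type
isFlt = isa(x, 'float')        % false
isNum = isnumeric(x)           % true

yIsFloat = isa(y, 'float')     % true, single is a floating-point type
yIsDouble = isa(y, 'double')   % false
```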

Display Format for Numeric Values


By default, MATLAB uses a 5-digit short format to display numbers. For example,

x = 4/3

x =

1.3333

You can change the display in the Command Window or Editor using the format function.

format long
x

x =

1.333333333333333

Using the format function only sets the format for the current MATLAB session. To set the format for
subsequent sessions, click Preferences on the Home tab in the Environment section. Select
MATLAB > Command Window, and then choose a Numeric format option.

The following table summarizes the numeric output format options.

Style             Result                                                    Example

short (default)   Short, fixed-decimal format with 4 digits after the       3.1416
                  decimal point.
long              Long, fixed-decimal format with 15 digits after the       3.141592653589793
                  decimal point for double values, and 7 digits after
                  the decimal point for single values.
shortE            Short scientific notation with 4 digits after the         3.1416e+00
                  decimal point.
longE             Long scientific notation with 15 digits after the         3.141592653589793e+00
                  decimal point for double values, and 7 digits after
                  the decimal point for single values.
shortG            Short, fixed-decimal format or scientific notation,       3.1416
                  whichever is more compact, with a total of 5 digits.
longG             Long, fixed-decimal format or scientific notation,        3.14159265358979
                  whichever is more compact, with a total of 15 digits
                  for double values, and 7 digits for single values.
shortEng          Short engineering notation (exponent is a multiple        3.1416e+000
                  of 3) with 4 digits after the decimal point.
longEng           Long engineering notation (exponent is a multiple         3.14159265358979e+000
                  of 3) with 15 significant digits.
+                 Positive/Negative format with +, -, and blank             +
                  characters displayed for positive, negative, and
                  zero elements.
bank              Currency format with 2 digits after the decimal point.    3.14
hex               Hexadecimal representation of a binary double-precision   400921fb54442d18
                  number.
rat               Ratio of small integers.                                  355/113

The display format only affects how numbers are displayed, not how they are stored in MATLAB.

See Also
format

Related Examples
• “Format Output”

Integer Arithmetic
This example shows how to perform arithmetic on integer data representing signals and images.

Load Integer Signal Data

Load measurement datasets comprising signals from four instruments that use 8-bit and 16-bit
analog-to-digital converters, resulting in data saved as int8, int16, and uint16. Time is stored as
uint16.

load integersignal

% Look at variables
whos Signal1 Signal2 Signal3 Signal4 Time1

Name           Size            Bytes  Class     Attributes

Signal1        7550x1           7550  int8
Signal2        7550x1           7550  int8
Signal3        7550x1          15100  int16
Signal4        7550x1          15100  uint16
Time1          7550x1          15100  uint16

Plot Data

First we will plot two of the signals to see the signal ranges.

plot(Time1, Signal1, Time1, Signal2);
grid;
legend('Signal1','Signal2');

It is likely that these values would need to be scaled to calculate the actual physical value that the
signal represents, for example, volts.

Process Data

We can perform standard arithmetic on integers such as +, -, *, and /. Let's say we wished to find the
sum of Signal1 and Signal2.

SumSig = Signal1 + Signal2; % Here we sum the integer signals.

Now let's plot the sum signal and see where it saturates.

cla;
plot(Time1, SumSig);
hold on
Saturated = (SumSig == intmin('int8')) | (SumSig == intmax('int8')); % Find where it has saturated
plot(Time1(Saturated),SumSig(Saturated),'rd')
grid
hold off

The markers show where the signal has saturated.
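
Integer arithmetic in MATLAB saturates at the limits of the type rather than wrapping around. The sketch below shows this on a few int8 values and flags saturated results by comparing against the same sum computed in double precision. The sample values and variable names are assumptions for illustration and are unrelated to the integersignal data.

```matlab
a = int8([100 -100 50]);
b = int8([100 -100 20]);

s = a + b                          % int8 result: [127 -128 70]

% Reference result computed in double precision, where no saturation occurs
exact = double(a) + double(b);     % [200 -200 70]

% Elements whose int8 result differs from the exact sum have saturated
saturated = (double(s) ~= exact)   % [1 1 0]
```

Unlike testing for equality with intmin and intmax, this comparison does not flag a sum that legitimately equals one of the limits.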

Load Integer Image Data

Next we will look at arithmetic on some image data.

street1 = imread('street1.jpg'); % Load image data
street2 = imread('street2.jpg');
whos street1 street2

Name           Size                Bytes  Class    Attributes

street1      480x640x3            921600  uint8
street2      480x640x3            921600  uint8

Here we see the images are 24-bit color, stored as three planes of uint8 data.

Display Images

Display first image.

cla;
image(street1); % Display image
axis equal
axis off

Display second image.

image(street2); % Display image
axis equal
axis off

Scale an Image

We can scale the image by a double precision constant but keep the image stored as integers. For
example,

duller = 0.5 * street2; % Scale image with a double constant but create an integer
whos duller

Name Size Bytes Class Attributes

duller 480x640x3 921600 uint8

subplot(1,2,1);
image(street2);
axis off equal tight
title('Original'); % Display image

subplot(1,2,2);
image(duller);
axis off equal tight
title('Duller'); % Display image
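
The same behavior can be seen on a handful of pixel values: multiplying uint8 data by a double scalar yields uint8 data, with results rounded to the nearest integer and saturated at the limits of the type. The values and names below are illustrative and do not come from the street images.

```matlab
px = uint8([10 201 255]);

halved = 0.5 * px          % uint8: [5 101 128], halves round away from zero
doubled = 2 * px           % uint8: [20 255 255], saturates at intmax('uint8')

resultClass = class(halved)   % 'uint8'
```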

Add the Images

We can add the two street images together and plot the ghostly result.

combined = street1 + duller; % Add |uint8| images
subplot(1,1,1)
cla;
image(combined); % Display image
title('Combined');
axis equal
axis off

Single Precision Math


This example shows how to perform arithmetic and linear algebra with single precision data. It also
shows how the results are computed appropriately in single-precision or double-precision, depending
on the input.

Create Double Precision Data

Let's first create some data, which is double precision by default.

Ad = [1 2 0; 2 5 -1; 4 10 -1]

Ad = 3×3

1 2 0
2 5 -1
4 10 -1

Convert to Single Precision

We can convert data to single precision with the single function.

A = single(Ad); % or A = cast(Ad,'single');

Create Single Precision Zeros and Ones

We can also create single precision zeros and ones with their respective functions.

n = 1000;
Z = zeros(n,1,'single');
O = ones(n,1,'single');

Let's look at the variables in the workspace.

whos A Ad O Z n

Name Size Bytes Class Attributes

A 3x3 36 single
Ad 3x3 72 double
O 1000x1 4000 single
Z 1000x1 4000 single
n 1x1 8 double

We can see that some of the variables are of type single and that the variable A (the single precision
version of Ad) takes half the number of bytes of memory to store, because singles require just 4
bytes (32 bits), whereas doubles require 8 bytes (64 bits).
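
The storage sizes can also be queried programmatically from the structure that whos returns. This is a minimal sketch; the use of magic(4) as sample data and the names infoD, infoS, and ratio are illustrative assumptions.

```matlab
Ad = magic(4);               % 4-by-4 double matrix (illustrative data)
A = single(Ad);              % single-precision copy

infoD = whos('Ad');          % structure describing the double variable
infoS = whos('A');           % structure describing the single variable

ratio = infoD.bytes / infoS.bytes   % 2: 8 bytes versus 4 bytes per element
```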

Arithmetic and Linear Algebra

We can perform standard arithmetic and linear algebra on singles.

B = A' % Matrix Transpose

B = 3x3 single matrix

1 2 4
2 5 10
0 -1 -1

whos B

Name Size Bytes Class Attributes

B 3x3 36 single

We see the result of this operation, B, is a single.
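
The class of a result also follows the operands when single and double values are mixed: an operation that combines a single with a double returns a single. The sketch below uses illustrative variable names.

```matlab
s = single(1) + 1;             % single combined with a double scalar
p = single(2) * pi;            % pi is double, the product is single
d = double(single(2)) * pi;    % both operands double, result is double

classes = {class(s), class(p), class(d)}   % {'single','single','double'}
```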


C = A * B % Matrix multiplication

C = 3x3 single matrix

5 12 24
12 30 59
24 59 117

C = A .* B % Elementwise arithmetic

C = 3x3 single matrix

1 4 0
4 25 -10
0 -10 1

X = inv(A) % Matrix inverse

X = 3x3 single matrix

5 2 -2
-2 -1 1
0 -2 1

I = inv(A) * A % Confirm result is identity matrix

I = 3x3 single matrix

1 0 0
0 1 0
0 0 1

I = A \ A % Better way to do matrix division than inv

I = 3x3 single matrix

1 0 0
0 1 0
0 0 1

E = eig(A) % Eigenvalues

E = 3x1 single column vector

3.7321
0.2679
1.0000

F = fft(A(:,1)) % FFT

F = 3x1 single column vector

7.0000 + 0.0000i
-2.0000 + 1.7321i
-2.0000 - 1.7321i

S = svd(A) % Singular value decomposition

S = 3x1 single column vector

12.3171
0.5149
0.1577

P = round(poly(A)) % The characteristic polynomial of a matrix

P = 1x4 single row vector

1 -5 5 -1

R = roots(P) % Roots of a polynomial

R = 3x1 single column vector

3.7321
1.0000
0.2679

Q = conv(P,P) % Convolve two vectors

Q = 1x7 single row vector

1 -10 35 -52 35 -10 1

R = conv(P,Q)

R = 1x10 single row vector

1 -15 90 -278 480 -480 278 -90 15 -1

stem(R); % Plot the result

A Program that Works for Either Single or Double Precision

Now let's look at a function that computes enough terms of the Fibonacci sequence that the ratio of
consecutive terms differs from the golden mean by less than the correct machine epsilon (eps) for
datatype single or double.

% How many terms needed to get single precision results?
fibodemo('single')

ans = 19

% How many terms needed to get double precision results?
fibodemo('double')

ans = 41

% Now let's look at the working code.
type fibodemo

function nterms = fibodemo(dtype)
%FIBODEMO Used by SINGLEMATH demo.
% Calculate number of terms in Fibonacci sequence.

% Copyright 1984-2014 The MathWorks, Inc.

fcurrent = ones(dtype);
fnext = fcurrent;
goldenMean = (ones(dtype)+sqrt(5))/2;
tol = eps(goldenMean);
nterms = 2;
while abs(fnext/fcurrent - goldenMean) >= tol
    nterms = nterms + 1;
    temp = fnext;
    fnext = fnext + fcurrent;
    fcurrent = temp;
end

Notice that we initialize several of our variables, fcurrent, fnext, and goldenMean, with values
that are dependent on the input datatype, and the tolerance tol depends on that type as well. Single
precision requires that we calculate fewer terms than the equivalent double precision calculation.

5

The Logical Class

• “Find Array Elements That Meet a Condition” on page 5-2
• “Reduce Logical Arrays to Single Value” on page 5-6

Find Array Elements That Meet a Condition


This example shows how to filter the elements of an array by applying conditions to the array. For
instance, you can examine the even elements in a matrix, find the location of all 0s in a
multidimensional array, or replace NaN values in data. You can perform these tasks using a
combination of the relational and logical operators. The relational operators (>, <, >=, <=, ==, ~=)
impose conditions on the array, and you can apply multiple conditions by connecting them with the
logical operators and, or, and not, respectively denoted by the symbols &, |, and ~.

Apply a Single Condition

To apply a single condition, start by creating a 5-by-5 matrix that contains random integers between 1
and 15. Reset the random number generator to the default state for reproducibility.

rng default
A = randi(15,5)

A = 5×5

13 2 3 3 10
14 5 15 7 1
2 9 15 14 13
14 15 8 12 15
10 15 13 15 11

Use the relational less than operator, <, to determine which elements of A are less than 9. Store the
result in B.

B = A < 9

B = 5x5 logical array

0 1 1 1 0
0 1 0 1 1
1 0 0 0 0
0 0 1 0 0
0 0 0 0 0

The result is a logical matrix. Each value in B represents a logical 1 (true) or logical 0 (false) state
to indicate whether the corresponding element of A fulfills the condition A < 9. For example, A(1,1)
is 13, so B(1,1) is logical 0 (false). However, A(1,2) is 2, so B(1,2) is logical 1 (true).

Although B contains information about which elements in A are less than 9, it doesn’t tell you what
their values are. Rather than comparing the two matrices element by element, you can use B to index
into A.

A(B)

ans = 8×1

2
2
5
3
8
3
7
1

The result is a column vector of the elements in A that are less than 9. Since B is a logical matrix, this
operation is called logical indexing. In this case, the logical array being used as an index is the
same size as the other array, but this is not a requirement. For more information, see “Array
Indexing”.

Some problems require information about the locations of the array elements that meet a condition
rather than their actual values. In this example, you can use the find function to locate all of the
elements in A less than 9.

I = find(A < 9)

I = 8×1

3
6
7
11
14
16
17
22

The result is a column vector of linear indices. Each index describes the location of an element in A
that is less than 9, so in practice A(I) returns the same result as A(B). The difference is that A(B)
uses logical indexing, whereas A(I) uses linear indexing.
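
The equivalence of the two indexing styles can be checked directly, and find can also return row and column subscripts instead of linear indices. The sketch below uses a small 3-by-3 matrix chosen for illustration, not the matrix A from the example above.

```matlab
A = [13 2 3; 14 5 15; 2 9 15];   % illustrative matrix
B = A < 9;                       % logical mask

I = find(B)                      % linear indices: [3; 4; 5; 7]
[row, col] = find(B);            % the same locations as subscripts

same = isequal(A(B), A(I))       % true, both return [2; 2; 5; 3]
```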

Apply Multiple Conditions

You can use the logical and, or, and not operators to apply any number of conditions to an array; the
number of conditions is not limited to one or two.

First, use the logical and operator, denoted &, to specify two conditions: the elements must be less
than 9 and greater than 2. Specify the conditions as a logical index to view the elements that
satisfy both conditions.

A(A<9 & A>2)

ans = 5×1

5
3
8
3
7

The result is a list of the elements in A that satisfy both conditions. Be sure to specify each condition
as a separate expression connected by a logical operator. For example, you cannot specify the
conditions above by A(2<A<9). MATLAB evaluates that expression from left to right as A((2<A)<9),
and because the logical result of 2<A is always less than 9, the index selects every element of A.
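
The pitfall with chained comparisons is easy to reproduce on a short vector. In this sketch, the vector v and the result names are illustrative; relational operators of equal precedence are evaluated from left to right, so the chained form compares a logical array with 9.

```matlab
v = [1 5 10];

% Chained form: (2 < v) is [0 1 1], and every one of those values is < 9
chained = v(2 < v < 9)        % returns all elements: [1 5 10]

% Intended form: two conditions joined by the logical AND operator
intended = v(2 < v & v < 9)   % returns 5
```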

Next, find the elements in A that are less than 9 and even numbered.

A(A<9 & ~mod(A,2))

ans = 3×1

2
2
8

The result is a list of all even elements in A that are less than 9. The use of the logical NOT operator,
~, converts the matrix mod(A,2) into a logical matrix, with a value of logical 1 (true) located where
an element is evenly divisible by 2.

Finally, find the elements in A that are less than 9 and even numbered and not equal to 2.
A(A<9 & ~mod(A,2) & A~=2)

ans = 8

The result, 8, is even, less than 9, and not equal to 2. It is the only element in A that satisfies all three
conditions.

Use the find function to get the index of the element equal to 8 that satisfies the conditions.
find(A<9 & ~mod(A,2) & A~=2)

ans = 14

The result indicates that A(14) = 8.

Replace Values That Meet a Condition

Sometimes it is useful to simultaneously change the values of several existing array elements. Use
logical indexing with a simple assignment statement to replace the values in an array that meet a
condition.

Replace all values in A that are greater than 10 with the number 10.
A(A>10) = 10

A = 5×5

10 2 3 3 10
10 5 10 7 1
2 9 10 10 10
10 10 8 10 10
10 10 10 10 10

Next, replace all values in A that are not equal to 10 with a NaN value.
A(A~=10) = NaN

A = 5×5

 10   NaN   NaN   NaN    10
 10   NaN    10   NaN   NaN
NaN   NaN    10    10    10
 10    10   NaN    10    10
 10    10    10    10    10

Lastly, replace all of the NaN values in A with zeros and apply the logical NOT operator, ~A.

A(isnan(A)) = 0;
C = ~A

C = 5x5 logical array

0 1 1 1 0
0 1 0 1 1
1 1 0 0 0
0 0 1 0 0
0 0 0 0 0

The resulting matrix has values of logical 1 (true) in place of the NaN values, and logical 0 (false)
in place of the 10s. The logical NOT operation, ~A, converts the numeric array into a logical array
such that A&C returns a matrix of logical 0 (false) values and A|C returns a matrix of logical 1
(true) values.

See Also
nan | Logical Operators: Short Circuit | isnan | find | and | or | xor | not

Reduce Logical Arrays to Single Value


This example shows how to use the any and all functions to reduce an entire array to a single
logical value.

The any and all functions are natural extensions of the logical | (OR) and & (AND) operators,
respectively. However, rather than comparing just two elements, the any and all functions compare
all of the elements in a particular dimension of an array. It is as if all of those elements are connected
by & or | operators and the any or all functions evaluate the resulting long logical expressions.
Therefore, unlike the core logical operators, the any and all functions reduce the size of the array
dimension that they operate on so that it has size 1. This enables the reduction of many logical values
into a single logical condition.

First, create a matrix A that contains random integers between 1 and 25. Reset the random number
generator to the default state for reproducibility.

rng default
A = randi(25,5)

A = 5×5

21 3 4 4 17
23 7 25 11 1
4 14 24 23 22
23 24 13 20 24
16 25 21 24 17

Next, use the mod function along with the logical NOT operator, ~, to determine which elements in A
are even.

A = ~mod(A,2)

A = 5x5 logical array

0 0 1 1 0
0 0 0 0 0
1 1 1 0 1
0 1 0 1 1
1 0 0 1 0

The resulting matrix has values of logical 1 (true) where an element is even, and logical 0
(false) where an element is odd.

Since the any and all functions reduce the dimension that they operate on to size 1, it normally
takes two applications of one of the functions to reduce a 2–D matrix into a single logical condition,
such as any(any(A)). However, if you use the notation A(:) to regard all of the elements of A as a
single column vector, you can use any(A(:)) to get the same logical information without nesting the
function calls.
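
The different reductions can be compared side by side on a small logical matrix. The sketch below also uses the 'all' option of any, which is available in the release this guide covers (it was introduced in R2018b); the matrix M and the result names are illustrative.

```matlab
M = logical([0 0 1; 0 0 0]);   % illustrative 2-by-3 logical matrix

byColumn = any(M)              % 1-by-3: [0 0 1]
byRow = any(M, 2)              % 2-by-1: [1; 0]

nested = any(any(M));          % two applications
viaColon = any(M(:));          % one application on a column vector
viaAll = any(M, 'all');        % one application over all elements

agree = (nested == viaColon) && (viaColon == viaAll)   % true
```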

Determine if any elements in A are even.

any(A(:))

ans = logical
1

You can perform logical and relational comparisons within the function call to any or all. This makes
it easy to quickly test an array for a variety of properties.

Determine if all elements in A are odd.

all(~A(:))

ans = logical
0

Determine whether any main or super diagonal elements in A are even. Since the vectors returned by
diag(A) and diag(A,1) are not the same size, you first need to reduce each diagonal to a single
scalar logical condition before comparing them. You can use the short-circuit OR operator || to
perform the comparison, since if any elements in the first diagonal are even then the entire
expression evaluates to true regardless of what appears on the right-hand side of the operator.

any(diag(A)) || any(diag(A,1))

ans = logical
1

See Also
any | all | and | or | xor | Logical Operators: Short Circuit

6

Characters and Strings

• “Text in String and Character Arrays” on page 6-2
• “Create String Arrays” on page 6-5
• “Cell Arrays of Character Vectors” on page 6-12
• “Analyze Text Data with String Arrays” on page 6-15
• “Test for Empty Strings and Missing Values” on page 6-20
• “Formatting Text” on page 6-24
• “Compare Text” on page 6-32
• “Search and Replace Text” on page 6-37
• “Build Pattern Expressions” on page 6-40
• “Convert Numeric Values to Text” on page 6-45
• “Convert Text to Numeric Values” on page 6-49
• “Unicode and ASCII Values” on page 6-53
• “Hexadecimal and Binary Values” on page 6-55
• “Frequently Asked Questions About String Arrays” on page 6-59
• “Update Your Code to Accept Strings” on page 6-64

Text in String and Character Arrays


There are two ways to represent text in MATLAB®. Starting in R2016b, you can store text in string
arrays. And in any version of MATLAB, you can store text in character arrays. A typical use for
character arrays is to store pieces of text as character vectors. MATLAB displays strings with double
quotes and character vectors with single quotes.

Represent Text with String Arrays

You can store any 1-by-n sequence of characters as a string, using the string data type. Starting in
R2017a, enclose text in double quotes to create a string.

str = "Hello, world"

str =
"Hello, world"

Though the text "Hello, world" is 12 characters long, str itself is a 1-by-1 string, or string scalar.
You can use a string scalar to specify a file name, plot label, or any other piece of textual information.

To find the number of characters in a string, use the strlength function.

n = strlength(str)

n = 12

If the text includes double quotes, use two double quotes within the definition.

str = "They said, ""Welcome!"" and waved."

str =
"They said, "Welcome!" and waved."

To add text to the end of a string, use the plus operator, +. If a variable can be converted to a string,
then plus converts it and appends it.

fahrenheit = 71;
celsius = (fahrenheit-32)/1.8;
tempText = "temperature is " + celsius + "C"

tempText =
"temperature is 21.6667C"

Starting in R2019a, you can also concatenate text using the append function.

tempText2 = append("Today's ",tempText)

tempText2 =
"Today's temperature is 21.6667C"

The string function can convert different types of inputs, such as numeric, datetime, duration, and
categorical values. For example, convert the output of pi to a string.

ps = string(pi)

ps =
"3.1416"

You can store multiple pieces of text in a string array. Each element of the array can contain a string
having a different number of characters, without padding.

str = ["Mercury","Gemini","Apollo";...
"Skylab","Skylab B","ISS"]

str = 2x3 string


"Mercury" "Gemini" "Apollo"
"Skylab" "Skylab B" "ISS"

str is a 2-by-3 string array. You can find the lengths of the strings with the strlength function.

N = strlength(str)

N = 2×3

7 6 6
6 8 3

As of R2018b, string arrays are supported throughout MATLAB and MathWorks® products. Functions
that accept character arrays (and cell arrays of character vectors) as inputs also accept string arrays.

Represent Text with Character Vectors

To store a 1-by-n sequence of characters as a character vector, using the char data type, enclose it in
single quotes.

chr = 'Hello, world'

chr =
'Hello, world'

The text 'Hello, world' is 12 characters long, and chr stores it as a 1-by-12 character vector.

whos chr

Name Size Bytes Class Attributes

chr 1x12 24 char

If the text includes single quotes, use two single quotes within the definition.

chr = 'They said, ''Welcome!'' and waved.'

chr =
'They said, 'Welcome!' and waved.'

Character vectors have two principal uses:

• To specify single pieces of text, such as file names and plot labels.
• To represent data that is encoded using characters. In such cases, you might need easy access to
individual characters.

For example, you can store a DNA sequence as a character vector.

seq = 'GCTAGAATCC';

You can access individual characters or subsets of characters by indexing, just as you would index
into a numeric array.

seq(4:6)

ans =
'AGA'

Concatenate character vectors with square brackets, just as you concatenate other types of arrays.

seq2 = [seq 'ATTAGAAACC']

seq2 =
'GCTAGAATCCATTAGAAACC'

Starting in R2019a, you also can concatenate text using append. The append function is
recommended because it treats string arrays, character vectors, and cell arrays of character vectors
consistently.

seq2 = append(seq,'ATTAGAAACC')

seq2 =
'GCTAGAATCCATTAGAAACC'

MATLAB functions that accept string arrays as inputs also accept character vectors and cell arrays of
character vectors.

See Also
string | char | cellstr | strlength | plus | horzcat | append

Related Examples
• “Create String Arrays” on page 6-5
• “Analyze Text Data with String Arrays” on page 6-15
• “Frequently Asked Questions About String Arrays” on page 6-59
• “Update Your Code to Accept Strings” on page 6-64
• “Cell Arrays of Character Vectors” on page 6-12

Create String Arrays


String arrays were introduced in R2016b. String arrays store pieces of text and provide a set of
functions for working with text as data. You can index into, reshape, and concatenate string arrays
just as you can with arrays of any other type. You also can access the characters in a string and
append text to strings using the plus operator. To rearrange strings within a string array, use
functions such as split, join, and sort.

Create String Arrays from Variables

MATLAB® provides string arrays to store pieces of text. Each element of a string array contains a 1-
by-n sequence of characters.

Starting in R2017a, you can create a string using double quotes.


str = "Hello, world"

str =
"Hello, world"

As an alternative, you can convert a character vector to a string using the string function. chr is a
1-by-17 character vector. str is a 1-by-1 string that has the same text as the character vector.
chr = 'Greetings, friend'

chr =
'Greetings, friend'

str = string(chr)

str =
"Greetings, friend"

Create a string array containing multiple strings using the [] operator. str is a 2-by-3 string array
that contains six strings.
str = ["Mercury","Gemini","Apollo";
"Skylab","Skylab B","ISS"]

str = 2x3 string


"Mercury" "Gemini" "Apollo"
"Skylab" "Skylab B" "ISS"

Find the length of each string in str with the strlength function. Use strlength, not length, to
determine the number of characters in strings.
L = strlength(str)

L = 2×3

7 6 6
6 8 3
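
The reason for preferring strlength is that length counts array elements, and a string scalar is a single element regardless of how much text it holds. The following sketch contrasts the two on a string and on a character vector; the variable names are illustrative.

```matlab
str = "Hello, world";      % 1-by-1 string
chr = 'Hello, world';      % 1-by-12 character vector

nElements = length(str)    % 1, the number of strings in the array
nChars = strlength(str)    % 12, the number of characters in the string
nChr = length(chr)         % 12, a char vector has one element per character
```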

As an alternative, you can convert a cell array of character vectors to a string array using the string
function. MATLAB displays strings in string arrays with double quotes, and displays character
vectors in cell arrays with single quotes.

C = {'Mercury','Venus','Earth'}

C = 1x3 cell
{'Mercury'} {'Venus'} {'Earth'}

str = string(C)

str = 1x3 string


"Mercury" "Venus" "Earth"

In addition to character vectors, you can convert numeric, datetime, duration, and categorical values
to strings using the string function.

Convert a numeric array to a string array.


X = [5 10 20 3.1416];
string(X)

ans = 1x4 string


"5" "10" "20" "3.1416"

Convert a datetime value to a string.


d = datetime('now');
string(d)

ans =
"01-Sep-2021 16:04:06"

Also, you can read text from files into string arrays using the readtable, textscan, and fscanf
functions.

Create Empty and Missing Strings

String arrays can contain both empty and missing values. An empty string contains zero characters.
When you display an empty string, the result is a pair of double quotes with nothing between them
(""). The missing string is the string equivalent to NaN for numeric arrays. It indicates where a string
array has missing values. When you display a missing string, the result is <missing>, with no
quotation marks.

Create an empty string array using the strings function. When you call strings with no
arguments, it returns an empty string. Note that the size of str is 1-by-1, not 0-by-0. However, str
contains zero characters.
str = strings

str =
""

Create an empty character vector using single quotes. Note that the size of chr is 0-by-0.
chr = ''

chr =

0x0 empty char array

Create a string array where every element is an empty string. You can preallocate a string array with
the strings function.

str = strings(2,3)

str = 2x3 string


"" "" ""
"" "" ""

To create a missing string, convert a missing value using the string function. The missing string
displays as <missing>.

str = string(missing)

str =
<missing>

You can create a string array with both empty and missing strings. Use the ismissing function to
determine which elements are strings with missing values. Note that the empty string is not a
missing string.

str(1) = "";
str(2) = "Gemini";
str(3) = string(missing)

str = 1x3 string


"" "Gemini" <missing>

ismissing(str)

ans = 1x3 logical array

0 0 1

Compare a missing string to another string. The result is always 0 (false), even when you compare a
missing string to another missing string.

str = string(missing);
str == "Gemini"

ans = logical
0

str == string(missing)

ans = logical
0
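
Because equality comparisons never match a missing string, ismissing is the way to detect one, while a comparison with "" detects empty strings. The sketch below combines the two tests to flag elements that carry no text; the array contents and the names isMiss, isEmptyText, and hasNoText are illustrative.

```matlab
str = ["", "Gemini", string(missing)];

isMiss = ismissing(str)              % [0 0 1]
isEmptyText = (str == "")            % [1 0 0], missing does not equal ""

% Elements that are either empty or missing
hasNoText = isMiss | isEmptyText     % [1 0 1]
```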

Access Elements of String Array

String arrays support array operations such as indexing and reshaping. Use array indexing to access
the first row of str and all the columns.

str = ["Mercury","Gemini","Apollo";
"Skylab","Skylab B","ISS"];
str(1,:)

ans = 1x3 string


"Mercury" "Gemini" "Apollo"

Access the second element in the second row of str.

str(2,2)

ans =
"Skylab B"

Assign a new string outside the bounds of str. MATLAB expands the array and fills unallocated
elements with missing values.

str(3,4) = "Mir"

str = 3x4 string


"Mercury" "Gemini" "Apollo" <missing>
"Skylab" "Skylab B" "ISS" <missing>
<missing> <missing> <missing> "Mir"

Access Characters Within Strings

You can index into a string array using curly braces, {}, to access characters directly. Use curly
braces when you need to access and modify characters within a string element. Indexing with curly
braces provides compatibility for code that could work with either string arrays or cell arrays of
character vectors. But whenever possible, use string functions to work with the characters in strings.

Access the second element in the second row with curly braces. chr is a character vector, not a
string.

str = ["Mercury","Gemini","Apollo";
"Skylab","Skylab B","ISS"];
chr = str{2,2}

chr =
'Skylab B'

Access the character vector and return the first three characters.

str{2,2}(1:3)

ans =
'Sky'

Find the space characters in a string and replace them with dashes. Use the isspace function to
inspect individual characters within the string. isspace returns a logical vector that contains a true
value wherever there is a space character. Finally, display the modified string element, str(2,2).

TF = isspace(str{2,2})

TF = 1x8 logical array

0 0 0 0 0 0 1 0

str{2,2}(TF) = "-";
str(2,2)

ans =
"Skylab-B"

Note that in this case, you can also replace spaces using the replace function, without resorting to
curly brace indexing.
replace(str(2,2)," ","-")

ans =
"Skylab-B"

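As a sketch of the string-function approach recommended above, the following lines get the same results as the curly-brace examples without indexing into characters. The use of extractBefore here is one possible choice, not the only one.

```matlab
str = ["Mercury","Gemini","Apollo";
       "Skylab","Skylab B","ISS"];
prefix = extractBefore(str(2,2),4);   % "Sky", the text before position 4
dashed = replace(str," ","-");        % replaces spaces in every element
```

Both outputs are strings rather than character vectors, and replace operates on the whole array in one call.
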
Concatenate Strings into String Array

Concatenate strings into a string array just as you would concatenate arrays of any other kind.

Concatenate two string arrays using square brackets, [].


str1 = ["Mercury","Gemini","Apollo"];
str2 = ["Skylab","Skylab B","ISS"];
str = [str1 str2]

str = 1x6 string


"Mercury" "Gemini" "Apollo" "Skylab" "Skylab B" "ISS"

Transpose str1 and str2. Concatenate them and then vertically concatenate column headings onto
the string array. When you concatenate character vectors into a string array, the character vectors
are automatically converted to strings.
str1 = str1';
str2 = str2';
str = [str1 str2];
str = [["Mission:","Station:"] ; str]

str = 4x2 string


"Mission:" "Station:"
"Mercury" "Skylab"
"Gemini" "Skylab B"
"Apollo" "ISS"

Append Text to Strings

To append text to strings, use the plus operator, +. The plus operator appends text to strings but
does not change the size of a string array.

Append a last name to an array of names. If you append a character vector to strings, then the
character vector is automatically converted to a string.
names = ["Mary";"John";"Elizabeth";"Paul";"Ann"];
names = names + ' Smith'

names = 5x1 string


"Mary Smith"
"John Smith"
"Elizabeth Smith"
"Paul Smith"
"Ann Smith"

Append different last names. You can append text to a string array from a string array or from a cell
array of character vectors. When you add nonscalar arrays, they must be the same size.
names = ["Mary";"John";"Elizabeth";"Paul";"Ann"];
lastnames = ["Jones";"Adams";"Young";"Burns";"Spencer"];
names = names + " " + lastnames

names = 5x1 string


"Mary Jones"
"John Adams"
"Elizabeth Young"
"Paul Burns"
"Ann Spencer"

Append a missing string. When you append a missing string with the plus operator, the output is a
missing string.
str1 = "Jones";
str2 = string(missing);
str1 + str2

ans =
<missing>

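The plus operator also converts numeric values to text when one operand is a string. The sketch below assumes two short arrays of illustrative data, and also shows the append function as an alternative (assumed available, R2019a or later).

```matlab
names = ["Mary";"John"];
ages = [34;41];
labeled = names + " (" + ages + ")";   % numbers are converted to text
joined = append(names," Smith");       % same result as names + " Smith"
```

Unlike plus, append does not convert numeric inputs, so it is the safer choice when an operand might accidentally be a number.
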
Split, Join, and Sort String Array

MATLAB provides a rich set of functions to work with string arrays. For example, you can use the
split, join, and sort functions to rearrange the string array names so that the names are in
alphabetical order by last name.

Split names on the space characters. Splitting changes names from a 5-by-1 string array to a 5-by-2
array.
names = ["Mary Jones";"John Adams";"Elizabeth Young";"Paul Burns";"Ann Spencer"];
names = split(names)

names = 5x2 string


"Mary" "Jones"
"John" "Adams"
"Elizabeth" "Young"
"Paul" "Burns"
"Ann" "Spencer"

Switch the columns of names so that the last names are in the first column. Add a comma after each
last name.
names = [names(:,2) names(:,1)];
names(:,1) = names(:,1) + ','

names = 5x2 string


"Jones," "Mary"
"Adams," "John"
"Young," "Elizabeth"
"Burns," "Paul"
"Spencer," "Ann"

Join the last and first names. The join function places a space character between the strings it joins.
After the join, names is a 5-by-1 string array.

names = join(names)

names = 5x1 string


"Jones, Mary"
"Adams, John"
"Young, Elizabeth"
"Burns, Paul"
"Spencer, Ann"

Sort the elements of names so that they are in alphabetical order.

names = sort(names)

names = 5x1 string


"Adams, John"
"Burns, Paul"
"Jones, Mary"
"Spencer, Ann"
"Young, Elizabeth"

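The join function also accepts a delimiter, which can replace the separate step of adding a comma. A minimal sketch with a shortened, illustrative list of names:

```matlab
names = ["Mary" "Jones"; "John" "Adams"; "Ann" "Spencer"];
full = join(names(:,[2 1]),", ");   % 3-by-1, each element "Last, First"
sorted = sort(full);                % ascending alphabetical order
reversed = sort(full,'descend');    % descending alphabetical order
```
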
See Also
string | strings | strlength | ismissing | isspace | plus | split | join | sort

Related Examples
• “Analyze Text Data with String Arrays” on page 6-15
• “Search and Replace Text” on page 6-37
• “Compare Text” on page 6-32
• “Test for Empty Strings and Missing Values” on page 6-20
• “Frequently Asked Questions About String Arrays” on page 6-59
• “Update Your Code to Accept Strings” on page 6-64

Cell Arrays of Character Vectors


To store text as a character vector, enclose it in single quotes. Typically, a character vector has text that
you consider to be a single piece of information, such as a file name or a label for a plot. If you have
many pieces of text, such as a list of file names, then you can store them in a cell array. A cell array
whose elements are all character vectors is a cell array of character vectors.

Note

• As of R2018b, the recommended way to store text is to use string arrays. If you create variables
that have the string data type, store them in string arrays, not cell arrays. For more information,
see “Text in String and Character Arrays” on page 6-2 and “Update Your Code to Accept Strings”
on page 6-64.
• While the phrase cell array of strings frequently has been used to describe such cell arrays, the
phrase is no longer accurate because such a cell array holds character vectors, not strings.

Create Cell Array of Character Vectors


To create a cell array of character vectors, use curly braces, {}, just as you would to create any cell
array. For example, use a cell array of character vectors to store a list of names.
C = {'Li','Sanchez','Jones','Yang','Larson'}

C = 1x5 cell
{'Li'} {'Sanchez'} {'Jones'} {'Yang'} {'Larson'}

The character vectors in C can have different lengths because a cell array does not require that its
contents have the same size. To determine the lengths of the character vectors in C, use the
strlength function.
L = strlength(C)

L = 1×5

2 7 5 4 6
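
The lengths can then drive further processing. As a small sketch, the following lines find the longest name in C and compute the same lengths with cellfun for comparison.

```matlab
C = {'Li','Sanchez','Jones','Yang','Larson'};
L = strlength(C);           % [2 7 5 4 6]
[maxLen,idx] = max(L);      % length and position of the first longest element
longest = C{idx};           % 'Sanchez'
L2 = cellfun(@numel,C);     % same lengths, computed cell by cell
```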

Access Character Vectors in Cell Array


To access character vectors in a cell array, index into it using curly braces, {}. Extract the contents of
the first cell and store it as a character vector.
C = {'Li','Sanchez','Jones','Yang','Larson'};
chr = C{1}

chr =
'Li'

Assign a different character vector to the first cell.


C{1} = 'Yang'

C = 1x5 cell
{'Yang'} {'Sanchez'} {'Jones'} {'Yang'} {'Larson'}

To refer to a subset of cells, instead of their contents, index using smooth parentheses.

C(1:3)

ans = 1x3 cell


{'Yang'} {'Sanchez'} {'Jones'}

While you can access the contents of cells by indexing, most functions that accept cell arrays as
inputs operate on the entire cell array. For example, you can use the strcmp function to compare the
contents of C to a character vector. strcmp returns 1 where there is a match and 0 otherwise.

TF = strcmp(C,'Yang')

TF = 1x5 logical array

1 0 0 1 0

You can sum over TF to find the number of matches.

num = sum(TF)

num = 2

Use TF as logical indices to return the matches in C. If you index using smooth parentheses, then the
output is a cell array containing only the matches.

M = C(TF)

M = 1x2 cell
{'Yang'} {'Yang'}
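
Related functions follow the same whole-array pattern. This sketch, using the modified cell array from above, returns match positions with find, ignores case with strcmpi, and tests membership in a set of names with ismember.

```matlab
C = {'Yang','Sanchez','Jones','Yang','Larson'};
pos = find(strcmp(C,'Yang'));             % [1 4]
tfAnyCase = strcmpi(C,'yang');            % [1 0 0 1 0]
tfSet = ismember(C,{'Jones','Larson'});   % [0 0 1 0 1]
```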

Convert Cell Arrays to String Arrays


As of R2018b, string arrays are supported throughout MATLAB® and MathWorks® products.
Therefore it is recommended that you use string arrays instead of cell arrays of character vectors.
(However, MATLAB functions that accept string arrays as inputs do accept character vectors and cell
arrays of character vectors as well.)

You can convert cell arrays of character vectors to string arrays. To convert a cell array of character
vectors, use the string function.

C = {'Li','Sanchez','Jones','Yang','Larson'}

C = 1x5 cell
{'Li'} {'Sanchez'} {'Jones'} {'Yang'} {'Larson'}

str = string(C)

str = 1x5 string


"Li" "Sanchez" "Jones" "Yang" "Larson"

In fact, the string function converts any cell array, so long as all of the contents can be converted to
strings.

C2 = {5, 10, 'some text', datetime('today')}

C2=1×4 cell array


{[5]} {[10]} {'some text'} {[01-Sep-2021]}

str2 = string(C2)

str2 = 1x4 string


"5" "10" "some text" "01-Sep-2021"

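To go in the other direction, for example when older code expects a cell array, the cellstr function converts a string array back. A minimal round-trip sketch:

```matlab
C = {'Li','Sanchez','Jones','Yang','Larson'};
str = string(C);           % 1-by-5 string array
C2 = cellstr(str);         % back to a cell array of character vectors
tfCell = iscellstr(C2);    % true
tfSame = isequal(C,C2);    % true, the round trip preserves the text
```
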
See Also
cellstr | char | iscellstr | strcmp | string

More About
• “Text in String and Character Arrays” on page 6-2
• “Access Data in Cell Array” on page 12-5
• “Create String Arrays” on page 6-5
• “Update Your Code to Accept Strings” on page 6-64
• “Frequently Asked Questions About String Arrays” on page 6-59

Analyze Text Data with String Arrays


This example shows how to store text from a file as a string array, sort the words by their frequency,
plot the result, and collect basic statistics for the words found in the file.

Import Text File to String Array

Read text from Shakespeare's Sonnets with the fileread function. fileread returns the text as a
1-by-100266 character vector.

sonnets = fileread('sonnets.txt');
sonnets(1:35)

ans =
'THE SONNETS

by William Shakespeare'

Convert the text to a string using the string function. Then, split it on newline characters using the
splitlines function. sonnets becomes a 2625-by-1 string array, where each string contains one
line from the poems. Display the first five lines of sonnets.

sonnets = string(sonnets);
sonnets = splitlines(sonnets);
sonnets(1:5)

ans = 5x1 string


"THE SONNETS"
""
"by William Shakespeare"
""
""

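The same steps can be tried without the sonnets.txt file. This sketch builds a short piece of text with embedded newline characters, splits it, and removes the empty line; the text itself is only a stand-in for the file contents.

```matlab
txt = "THE SONNETS" + newline + newline + "by William Shakespeare";
lines = splitlines(txt);     % 3-by-1 string array, middle element is ""
lines(lines == "") = [];     % drop the empty line
```
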
Clean String Array

To calculate the frequency of the words in sonnets, first clean it by removing empty strings and
punctuation marks. Then reshape it into a string array that contains individual words as elements.

Remove the strings with zero characters ("") from the string array. Compare each element of
sonnets to "", the empty string. Starting in R2017a, you can create strings, including an empty
string, using double quotes. TF is a logical vector that contains a true value wherever sonnets
contains a string with zero characters. Index into sonnets with TF and delete all strings with zero
characters.

TF = (sonnets == "");
sonnets(TF) = [];
sonnets(1:10)

ans = 10x1 string


"THE SONNETS"
"by William Shakespeare"
" I"
" From fairest creatures we desire increase,"
" That thereby beauty's rose might never die,"
" But as the riper should by time decease,"
" His tender heir might bear his memory:"
" But thou, contracted to thine own bright eyes,"
" Feed'st thy light's flame with self-substantial fuel,"
" Making a famine where abundance lies,"

Replace some punctuation marks with space characters. For example, replace periods, commas, and
semi-colons. Keep apostrophes because they can be part of some words in the Sonnets, such as
light's.
p = [".","?","!",",",";",":"];
sonnets = replace(sonnets,p," ");
sonnets(1:10)

ans = 10x1 string


"THE SONNETS"
"by William Shakespeare"
" I"
" From fairest creatures we desire increase "
" That thereby beauty's rose might never die "
" But as the riper should by time decease "
" His tender heir might bear his memory "
" But thou contracted to thine own bright eyes "
" Feed'st thy light's flame with self-substantial fuel "
" Making a famine where abundance lies "

Strip leading and trailing space characters from each element of sonnets.
sonnets = strip(sonnets);
sonnets(1:10)

ans = 10x1 string


"THE SONNETS"
"by William Shakespeare"
"I"
"From fairest creatures we desire increase"
"That thereby beauty's rose might never die"
"But as the riper should by time decease"
"His tender heir might bear his memory"
"But thou contracted to thine own bright eyes"
"Feed'st thy light's flame with self-substantial fuel"
"Making a famine where abundance lies"

Split sonnets into a string array whose elements are individual words. You can use the split
function to split elements of a string array on whitespace characters, or on delimiters that you
specify. However, split requires that every element of a string array must be divisible into an equal
number of new strings. The elements of sonnets have different numbers of spaces, and therefore are
not divisible into equal numbers of strings. To use the split function on sonnets, write a for-loop
that calls split on one element at a time.

Create the empty string array sonnetWords using the strings function. Write a for-loop that splits
each element of sonnets using the split function. Concatenate the output from split onto
sonnetWords. Each element of sonnetWords is an individual word from sonnets.
sonnetWords = strings(0);
for i = 1:length(sonnets)
    sonnetWords = [sonnetWords ; split(sonnets(i))];
end
sonnetWords(1:10)

ans = 10x1 string


"THE"
"SONNETS"
"by"
"William"
"Shakespeare"
"I"
"From"
"fairest"
"creatures"
"we"

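Growing an array inside a loop can be slow for large inputs. One possible alternative, shown here as a sketch and not as the method this example uses, is to join all lines into a single string first and split once. It assumes the lines have already been stripped and that words are separated by single spaces.

```matlab
sonnets = ["THE SONNETS"; "by William Shakespeare"; "I"];
allText = join(sonnets);         % one string, elements separated by spaces
sonnetWords = split(allText);    % 6-by-1 string array of words
```
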
Sort Words Based on Frequency

Find the unique words in sonnetWords. Count them and sort them based on their frequency.

To count words that differ only by case as the same word, convert sonnetWords to lowercase. For
example, The and the count as the same word. Find the unique words using the unique function.
Then, count the number of times each unique word occurs using the histcounts function.

sonnetWords = lower(sonnetWords);
[words,~,idx] = unique(sonnetWords);
numOccurrences = histcounts(idx,numel(words));

Sort the words in sonnetWords by number of occurrences, from most to least common.

[rankOfOccurrences,rankIndex] = sort(numOccurrences,'descend');
wordsByFrequency = words(rankIndex);
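
To see the counting and ranking steps on data small enough to check by hand, consider this sketch. It uses accumarray in place of histcounts to count the indices, which is a substitution made here for brevity; the word list is illustrative.

```matlab
w = ["the";"rose";"the";"and";"the";"and"];
[words,~,idx] = unique(w);            % words = ["and";"rose";"the"]
counts = accumarray(idx,1);           % [2;1;3], one count per unique word
[sortedCounts,order] = sort(counts,'descend');
ranked = words(order);                % ["the";"and";"rose"]
```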

Plot Word Frequency

Plot the occurrences of words in the Sonnets from the most to least common words. Zipf's Law states
that the distribution of occurrences of words in a large body of text follows a power-law distribution.

loglog(rankOfOccurrences);
xlabel('Rank of word (most to least common)');
ylabel('Number of Occurrences');

Display the ten most common words in the Sonnets.


wordsByFrequency(1:10)

ans = 10x1 string


"and"
"the"
"to"
"my"
"of"
"i"
"in"
"that"
"thy"
"thou"

Collect Basic Statistics in Table

Calculate the total number of occurrences of each word in sonnetWords. Calculate the number of
occurrences as a percentage of the total number of words, and calculate the cumulative percentage
from most to least common. Write the words and the basic statistics for them to a table.
numOccurrences = numOccurrences(rankIndex);
numOccurrences = numOccurrences';
numWords = length(sonnetWords);
T = table;
T.Words = wordsByFrequency;
T.NumOccurrences = numOccurrences;
T.PercentOfText = numOccurrences / numWords * 100.0;
T.CumulativePercentOfText = cumsum(numOccurrences) / numWords * 100.0;

Display the statistics for the ten most common words.

T(1:10,:)

ans=10×4 table
    Words     NumOccurrences    PercentOfText    CumulativePercentOfText
    ______    ______________    _____________    _______________________

    "and"          490             2.7666                2.7666
    "the"          436             2.4617                5.2284
    "to"           409             2.3093                7.5377
    "my"           371             2.0947                9.6324
    "of"           370             2.0891                11.722
    "i"            341             1.9254                13.647
    "in"           321             1.8124                15.459
    "that"         320             1.8068                17.266
    "thy"          280             1.5809                18.847
    "thou"         233             1.3156                20.163

The most common word in the Sonnets, and, occurs 490 times. Together, the ten most common words
account for 20.163% of the text.
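
The percentage columns can be checked on a small case. In this sketch the three words and their counts are made up for illustration; the cumulative percentage should reach 100 in the last row.

```matlab
words = ["the";"and";"rose"];
counts = [3;2;1];
T = table(words,counts,'VariableNames',{'Words','NumOccurrences'});
T.PercentOfText = T.NumOccurrences / sum(T.NumOccurrences) * 100;
T.CumulativePercentOfText = cumsum(T.PercentOfText);
```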

See Also
string | split | join | unique | replace | lower | splitlines | histcounts | strip | sort |
table

Related Examples
• “Create String Arrays” on page 6-5
• “Search and Replace Text” on page 6-37
• “Compare Text” on page 6-32
• “Test for Empty Strings and Missing Values” on page 6-20

Test for Empty Strings and Missing Values


String arrays can contain both empty strings and missing values. Empty strings contain zero
characters and display as double quotes with nothing between them (""). You can determine if a
string is an empty string using the == operator. The empty string is a substring of every other string.
Therefore, functions such as contains always find the empty string within other strings. String
arrays also can contain missing values. Missing values in string arrays display as <missing>. To find
missing values in a string array, use the ismissing function instead of the == operator.

Test for Empty Strings

You can test a string array for empty strings using the == operator.

Starting in R2017a, you can create an empty string using double quotes with nothing between them
(""). Note that the size of str is 1-by-1, not 0-by-0. However, str contains zero characters.

str = ""

str =
""

Create an empty character vector using single quotes. Note that the size of chr is 0-by-0. The
character array chr actually is an empty array, and not just an array with zero characters.

chr = ''

chr =

0x0 empty char array

Create an array of empty strings using the strings function. Each element of the array is a string
with no characters.

str2 = strings(1,3)

str2 = 1x3 string


"" "" ""

Test if str is an empty string by comparing it to an empty string.

if (str == "")
    disp 'str has zero characters'
end

str has zero characters

Do not use the isempty function to test for empty strings. A string with zero characters still has a
size of 1-by-1. However, you can test if a string array has at least one dimension with a size of zero
using the isempty function.

Create an empty string array using the strings function. To be an empty array, at least one
dimension must have a size of zero.

str = strings(0,3)

str =

0x3 empty string array

Test str using the isempty function.

isempty(str)

ans = logical
1

Test a string array for empty strings. The == operator returns a logical array that is the same size as
the string array.

str = ["Mercury","","Apollo"]

str = 1x3 string


"Mercury" "" "Apollo"

str == ''

ans = 1x3 logical array

0 1 0
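
The three tests discussed in this section answer different questions. As a brief sketch, compare them on the same array:

```matlab
str = ["Mercury","","Apollo"];
tfEq = (str == "");               % [0 1 0], which elements are empty strings
tfLen = (strlength(str) == 0);    % [0 1 0], same answer by counting characters
tfArr = isempty(str);             % false, the array itself has three elements
```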

Find Empty Strings Within Other Strings

Strings always contain the empty string as a substring. In fact, the empty string is always at both the
start and the end of every string. Also, the empty string is always found between any two consecutive
characters in a string.

Create a string. Then test if it contains the empty string.

str = "Hello, world";


TF = contains(str,"")

TF = logical
1

Test if str starts with the empty string.

TF = startsWith(str,"")

TF = logical
1

Count the number of characters in str. Then count the number of empty strings in str. The count
function counts empty strings at the beginning and end of str, and between each pair of characters.
Therefore if str has N characters, it also has N+1 empty strings.

str

str =
"Hello, world"

strlength(str)

ans = 12

count(str,"")

ans = 13

Replace a substring with the empty string. When you call replace with an empty string, it removes
the substring and replaces it with a string that has zero characters.

replace(str,"world","")

ans =
"Hello, "

Insert a substring after empty strings using the insertAfter function. Because there are empty
strings between each pair of characters, insertAfter inserts substrings between each pair.

insertAfter(str,"","-")

ans =
"-H-e-l-l-o-,- -w-o-r-l-d-"

In general, string functions that replace, erase, extract, or insert substrings allow you to specify
empty strings as the starts and ends of the substrings to modify. When you do so, these functions
operate on the start and end of the string, and between every pair of characters.
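
One practical consequence is that a search with an empty pattern matches everything. The sketch below assumes a hypothetical variable, query, that might be empty (for example, text from a blank search box) and guards the search with a length check.

```matlab
names = ["Mercury","Gemini","Apollo"];
query = "";                          % hypothetical user input
tfNaive = contains(names,query);     % [1 1 1], every string contains ""
tfGuarded = strlength(query) > 0 & contains(names,query);   % [0 0 0]
```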

Test for Missing Values

You can test a string array for missing values using the ismissing function. The missing string is the
string equivalent to NaN for numeric arrays. It indicates where a string array has missing values. The
missing string displays as <missing>.

To create a missing string, convert a missing value using the string function.

str = string(missing)

str =
<missing>

You can create a string array with both empty and missing strings. Use the ismissing function to
determine which elements are strings with missing values. Note that the empty string is not a
missing string.

str(1) = "";
str(2) = "Gemini";
str(3) = string(missing)

str = 1x3 string


"" "Gemini" <missing>

ismissing(str)

ans = 1x3 logical array

0 0 1

Compare str to a missing string. The comparison is always 0 (false), even when you compare a
missing string to another missing string.

str == string(missing)

ans = 1x3 logical array

0 0 0

To find missing strings, use the ismissing function. Do not use the == operator.
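
When code should treat both empty and missing elements as having no usable text, the two tests can be combined. A minimal sketch with illustrative variable names:

```matlab
str = ["", "Gemini", string(missing)];
tfMissing = ismissing(str);        % [0 0 1]
tfEmpty = (str == "");             % [1 0 0], comparison with missing is false
tfBlank = tfMissing | tfEmpty;     % [1 0 1]
usable = str(~tfBlank);            % "Gemini"
```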

See Also
string | strings | strlength | ismissing | contains | startsWith | endsWith | erase |
extractBetween | extractBefore | extractAfter | insertAfter | insertBefore | replace |
replaceBetween | eraseBetween | eq | all | any

Related Examples
• “Create String Arrays” on page 6-5
• “Analyze Text Data with String Arrays” on page 6-15
• “Search and Replace Text” on page 6-37
• “Compare Text” on page 6-32

Formatting Text
To convert data to text and control its format, you can use formatting operators with common
conversion functions, such as num2str and sprintf. These operators control notation, alignment,
significant digits, and so on. They are similar to those used by the printf function in the C
programming language. Typical uses for formatted text include text for display and output files.

For example, %f converts floating-point values to text using fixed-point notation. Adjust the format by
adding information to the operator, such as %.2f to represent two digits after the decimal mark, or
%12f to represent 12 characters in the output, padding with spaces as needed.

A = pi*ones(1,3);
txt = sprintf('%f | %.2f | %12f', A)

txt =
'3.141593 | 3.14 |     3.141593'

You can combine operators with ordinary text and special characters in a format specifier. For
instance, \n inserts a newline character.

txt = sprintf('Displaying pi: \n %f \n %.2f \n %12f', A)

txt =
'Displaying pi:
 3.141593
 3.14
     3.141593'

Functions that support formatting operators are compose, num2str, sprintf, fprintf, and the
error handling functions assert, error, warning, and MException.
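
As a brief sketch of how two of these functions differ, compose returns one piece of text per row of its input, while sprintf reuses the format for each value and returns a single piece of text. The input values are arbitrary.

```matlab
values = [pi; exp(1)];
txt = compose("%.2f",values);      % 2-by-1 string array: "3.14" and "2.72"
list = sprintf('%d,',[1 2 3]);     % format reused for each value: '1,2,3,'
```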

Fields of the Formatting Operator


A formatting operator can have six fields, as shown in the figure. From right to left, the fields are the
conversion character, subtype, precision, field width, flags, and numeric identifier. (Space characters
are not allowed in the operator. They are shown here only to improve readability of the figure.) The
conversion character is the only required field, along with the leading % character.

Conversion Character

The conversion character specifies the notation of the output. It consists of a single character and
appears last in the format specifier.

Specifier   Description
c           Single character.
d           Decimal notation (signed).
e           Exponential notation (using a lowercase e, as in 3.1415e+00).
E           Exponential notation (using an uppercase E, as in 3.1415E+00).
f           Fixed-point notation.
g           The more compact of %e or %f. (Insignificant zeroes do not print.)
G           Same as %g, but using an uppercase E.
o           Octal notation (unsigned).
s           Character vector or string array.
u           Decimal notation (unsigned).
x           Hexadecimal notation (unsigned, using lowercase letters a–f).
X           Hexadecimal notation (unsigned, using uppercase letters A–F).

For example, format the number 46 using different conversion characters to display the number in
decimal, fixed-point, exponential, and hexadecimal formats.

A = 46*ones(1,4);
txt = sprintf('%d %f %e %X', A)

txt =
'46 46.000000 4.600000e+01 2E'

Subtype

The subtype field is a single alphabetic character that immediately precedes the conversion
character. Without the subtype field, the conversion characters %o, %x, %X, and %u treat input data as
integers. To treat input data as floating-point values instead and convert them to octal, decimal, or
hexadecimal representations, use one of the following subtype specifiers.

Subtype   Description
b         The input data are double-precision floating-point values rather than unsigned integers.
          For example, to print a double-precision value in hexadecimal, use a format like %bx.
t         The input data are single-precision floating-point values rather than unsigned integers.
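
For instance, the following sketch prints the bit pattern of pi as a double-precision and as a single-precision value. The results should match the output of num2hex for the same values.

```matlab
hexDouble = sprintf('%bx',pi);   % 16 hexadecimal digits for a double
hexSingle = sprintf('%tx',pi);   % 8 hexadecimal digits for a single
```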

Precision

The precision field in a formatting operator is a nonnegative integer that immediately follows a
period. For example, in the operator %7.3f, the precision is 3. For the %g operator, the precision
indicates the number of significant digits to display. For the %f, %e, and %E operators, the precision
indicates how many digits to display to the right of the decimal point.

Display numbers to different precisions using the precision field.

txt = sprintf('%g %.2g %f %.2f', pi*50*ones(1,4))

txt =
'157.08 1.6e+02 157.079633 157.08'

While you can specify the precision in a formatting operator for input text (for example, in the %s
operator), there is usually no reason to do so. If you specify the precision as p, and p is less than the
number of characters in the input text, then the output contains only the first p characters.
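
A short sketch of this truncation behavior:

```matlab
short = sprintf('%.3s','Skylab');    % 'Sky', only the first 3 characters
same = sprintf('%.10s','Skylab');    % 'Skylab', precision exceeds the length
```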

Field Width

The field width in a formatting operator is a nonnegative integer that specifies the number of digits or
characters in the output when formatting input values. For example, in the operator %7.3f, the field
width is 7.

Specify different field widths. To show the width for each output, use the | character. By default, the
output text is padded with space characters when the field width is greater than the number of
characters.
txt = sprintf('|%e|%15e|%f|%15f|', pi*50*ones(1,4))

txt =
'|1.570796e+02|   1.570796e+02|157.079633|     157.079633|'

When used on text input, the field width can determine whether to pad the output text with spaces. If
the field width is less than or equal to the number of characters in the input text, then it has no effect.
txt = sprintf('%30s', 'Pad left with spaces')

txt =
'          Pad left with spaces'
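
Field widths are useful for lining up columns of output. As a sketch with made-up names and values, the same format produces rows of equal length:

```matlab
row1 = sprintf('%10s|%6.2f','Li',3.5);          % '        Li|  3.50'
row2 = sprintf('%10s|%6.2f','Sanchez',12.25);   % '   Sanchez| 12.25'
```

Each row is 17 characters long: 10 for the name, 1 for the separator, and 6 for the number.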

Flags

Optional flags control additional formatting of the output text. The table describes the characters you
can use as flags.

Character        Description                                             Example
Minus sign (-)   Left-justify the converted argument in its field.       %-5.2d
Plus sign (+)    For numeric values, always print a leading sign         %+5.2d
                 character (+ or -).
                 For text values, right-justify the converted            %+5s
                 argument in its field.
Space            Insert a space before the value.                        % 5.2f
Zero (0)         Pad with zeroes rather than spaces.                     %05.2f
Pound sign (#)   Modify selected numeric conversions:                    %#5.0f
                 • For %o, %x, or %X, print 0, 0x, or 0X prefix.
                 • For %f, %e, or %E, print decimal point even when
                   precision is 0.
                 • For %g or %G, do not remove trailing zeroes or
                   decimal point.

Right- and left-justify the output. The default behavior is to right-justify the output text.
txt = sprintf('right-justify: %12.2f\nleft-justify: %-12.2f',...
    12.3, 12.3)

txt =
'right-justify:        12.30
left-justify: 12.30       '

Display a + sign for positive numbers. The default behavior is to omit the leading + sign for positive
numbers.

txt = sprintf('no sign: %12.2f\nsign: %+12.2f',...
    12.3, 12.3)

txt =
'no sign:        12.30
sign:       +12.30'

Pad to the left with spaces and zeroes. The default behavior is to pad with spaces.

txt = sprintf('Pad with spaces: %12.2f\nPad with zeroes: %012.2f',...
    5.2, 5.2)

txt =
'Pad with spaces:         5.20
Pad with zeroes: 000000005.20'

Note You can specify more than one flag in a formatting operator.
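
As a sketch of combining flags, the following calls pair the plus sign with zero padding and with left-justification, and use the pound sign with a hexadecimal conversion. The expected results in the comments assume the flags behave as described in the table above.

```matlab
a = sprintf('%+08.2f',12.3);    % '+0012.30', sign plus zero padding
b = sprintf('%-+8.2f|',12.3);   % '+12.30  |', sign plus left-justification
c = sprintf('%#x',255);         % '0xff', hexadecimal with 0x prefix
```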

Value Identifiers

By default, functions such as sprintf insert values from input arguments into the output text in
sequential order. To process the input arguments in a nonsequential order, specify the order using
numeric identifiers in the format specifier. Specify nonsequential arguments with an integer
immediately following the % sign, followed by a $ sign.

Ordered Sequentially

sprintf('%s %s %s',...
    '1st','2nd','3rd')

ans =
'1st 2nd 3rd'

Ordered By Identifier

sprintf('%3$s %2$s %1$s',...
    '1st','2nd','3rd')

ans =
'3rd 2nd 1st'
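
Identifiers are convenient when the order of the output differs from the order in which values are available. This sketch rearranges day, month, and year text (illustrative values) into year-month-day order.

```matlab
iso = sprintf('%3$s-%2$s-%1$s','01','09','2021');   % '2021-09-01'
```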

Special Characters

Special characters can be part of the output text. But because they cannot be entered as ordinary
text, they require specific character sequences to represent them. To insert special characters into
output text, use any of the character sequences in the table.

Special Character                              Representation in Format Specifier
Single quotation mark                          ''
Percent character                              %%
Backslash                                      \\
Alarm                                          \a
Backspace                                      \b
Form feed                                      \f
New line                                       \n
Carriage return                                \r
Horizontal tab                                 \t
Vertical tab                                   \v
Character whose Unicode numeric value can be   \xN
represented by the hexadecimal number, N       Example: sprintf('\x5A') returns 'Z'
Character whose Unicode numeric value can be   \N
represented by the octal number, N             Example: sprintf('\132') returns 'Z'
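
A few of these sequences in use, as a minimal sketch with arbitrary text:

```matlab
pct = sprintf('%d%%',50);          % '50%'
quote = sprintf('It''s done');     % the text It's done
winPath = sprintf('C:\\temp');     % the text C:\temp
cols = sprintf('a\tb');            % 'a', a tab character, then 'b'
```

Note that the doubled single quotation mark is needed only because the format specifier itself is written as a character vector.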

Setting Field Width and Precision


The formatting operator follows a set of rules for formatting output text to the specified field width
and precision. You also can specify values for the field width and precision outside the format
specifier, and use numbered identifiers with the field width and precision.

Rules for Formatting Precision and Field Width

The figure illustrates how the field width and precision settings affect the output of the formatting
functions. In this figure, the zero following the % sign in the formatting operator means to add leading
zeroes to the output text rather than space characters.

• If the precision is not specified, then it defaults to six.


• If the precision p is less than the number of digits in the fractional part of the input, then only p
digits are shown after the decimal point. The fractional value is rounded in the output.
• If the precision p is greater than the number of digits f in the fractional part of the input, then p
digits are shown after the decimal point. The fractional part is extended to the right with p-f
zeroes in the output.
• If the field width is not specified, then it defaults to p+1+n, where n is the number of digits in the
whole part of the input value.

• If the field width w is greater than p+1+n, then the whole part of the output value is padded to the
left with w-(p+1+n) additional characters. The additional characters are space characters unless
the formatting operator includes the 0 flag. In that case, the additional characters are zeroes.
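
The rules above can be checked with a few calls. In this sketch the input 12.5 has n = 2 digits in its whole part.

```matlab
a = sprintf('%f',12.5);        % '12.500000', default precision of six
b = sprintf('%.3f',12.5);      % '12.500', default width p+1+n = 3+1+2 = 6
c = sprintf('%.1f',12.56);     % '12.6', fraction rounded to p = 1 digit
d = sprintf('%010.3f',12.5);   % '000012.500', padded with w-(p+1+n) = 4 zeroes
```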

Specify Field Width and Precision Outside Format Specifier

You can specify the field width and precision using values from a sequential argument list. Use an
asterisk (*) in place of the field width or precision fields of the formatting operator.

For example, format and display three numbers. In each case, use an asterisk to specify that the field
width or precision come from input arguments that follow the format specifier.

txt = sprintf('%*f %.*f %*.*f',...
    15,123.45678,...
    3,16.42837,...
    6,4,pi)

txt =
'     123.456780 16.428 3.1416'

The table describes the effects of each formatting operator in the example.

Formatting Operator   Description
%*f                   Specify width as the following input argument, 15.
%.*f                  Specify precision as the following input argument, 3.
%*.*f                 Specify width and precision as the following input arguments, 6 and 4.

You can mix the two styles. For example, get the field width from the following input argument and
the precision from the format specifier.

txt = sprintf('%*.2f', 5, 123.45678)

txt =
'123.46'
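
One use for the asterisk syntax is to choose the field width and precision at run time. The sketch below assumes two hypothetical variables, w and p, that hold those settings.

```matlab
w = 10;  % field width, decided at run time (hypothetical value)
p = 2;   % precision, decided at run time (hypothetical value)
% Both settings are read from the argument list, ahead of the value itself.
txt = sprintf('%*.*f', w, p, 2/3);  % six spaces followed by 0.67
```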

Specify Numbered Identifiers in Width and Precision Fields

You also can specify field width and precision as values from a nonsequential argument list, using an
alternate syntax shown in the figure. Within the formatting operator, specify the field width and
precision with asterisks that follow numbered identifiers and $ signs. Specify the values of the field
width and precision with input arguments that follow the format specifier.

For example, format and display three numbers. In each case, use a numbered identifier to specify
that the field width or precision come from input arguments that follow the format specifier.

txt = sprintf('%1$*4$f %2$.*5$f %3$*6$.*7$f',...
    123.45678, 16.42837, pi, 15, 3, 6, 4)

txt =
'     123.456780 16.428 3.1416'

The table describes the effect of each formatting operator in the example.

Formatting Operator    Description

%1$*4$f                1$ specifies the first input argument, 123.45678, as the value
                       *4$ specifies the fourth input argument, 15, as the field width
%2$.*5$f               2$ specifies the second input argument, 16.42837, as the value
                       .*5$ specifies the fifth input argument, 3, as the precision
%3$*6$.*7$f            3$ specifies the third input argument, pi, as the value
                       *6$ specifies the sixth input argument, 6, as the field width
                       .*7$ specifies the seventh input argument, 4, as the precision

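Numbered identifiers also let a format specifier use input arguments in a different order than they are listed, and use the same argument more than once. As a small illustration with placeholder inputs 'A', 'B', and 'C':

```matlab
% Print the third argument, then the second, the first, and the second again.
txt = sprintf('%3$s %2$s %1$s %2$s', 'A', 'B', 'C');  % 'C B A B'
```
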
Restrictions on Using Identifiers


If any of the formatting operators include an identifier field, then all the operators in the format
specifier must include identifier fields. If you use both sequential and nonsequential ordering in the
same function call, then the output is truncated at the first switch between sequential and
nonsequential identifiers.

Valid Syntax                     Invalid Syntax

sprintf('%d %d %d %d',...        sprintf('%d %3$d %d %d',...
    1,2,3,4)                         1,2,3,4)

ans =                            ans =

    '1 2 3 4'                        '1 '

If your function call provides more input arguments than there are formatting operators in the format
specifier, then the operators are reused. However, only function calls that use sequential ordering
reuse formatting operators. You cannot reuse formatting operators when you use numbered
identifiers.

Valid Syntax                     Invalid Syntax

sprintf('%d',1,2,3,4)            sprintf('%1$d',1,2,3,4)

ans =                            ans =

    '1234'                           '1'

If you use numbered identifiers when the input data is a vector or array, then the output does not
contain formatted data.

Valid Syntax                     Invalid Syntax

v = [1.4 2.7 3.1];               v = [1.4 2.7 3.1];
sprintf('%.4f %.4f %.4f',v)      sprintf('%3$.4f %1$.4f %2$.4f',v)

ans =                            ans =

    '1.4000 2.7000 3.1000'           1×0 empty char array

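One possible workaround, which is not part of the original example, is to pass the elements of the vector as separate scalar arguments. The sketch below assumes that converting the vector with num2cell and expanding it into a comma-separated list is acceptable for your data; the variable name c is hypothetical.

```matlab
v = [1.4 2.7 3.1];
c = num2cell(v);  % 1-by-3 cell array holding one scalar per cell
% c{:} expands to three separate scalar inputs, so identifiers can be used.
txt = sprintf('%3$.4f %1$.4f %2$.4f', c{:});  % '3.1000 1.4000 2.7000'
```
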
See Also
compose | sprintf | fprintf | num2str

Related Examples
• “Convert Text to Numeric Values” on page 6-49
• “Convert Numeric Values to Text” on page 6-45

Compare Text
Compare text in character arrays and string arrays in different ways. String arrays were introduced
in R2016b. You can compare string arrays and character vectors with relational operators and with
the strcmp function. You can sort string arrays using the sort function, just as you would sort
arrays of any other type. MATLAB® also provides functions to inspect characters in pieces of text.
For example, you can determine which characters in a character vector or string array are letters or
space characters.

Compare String Arrays for Equality

You can compare string arrays for equality with the relational operators == and ~=. When you
compare string arrays, the output is a logical array that has 1 where the relation is true, and 0 where
it is not true.

Create two string scalars. Starting in R2017a, you can create strings using double quotes.

str1 = "Hello";
str2 = "World";
str1,str2

str1 =
"Hello"

str2 =
"World"

Compare str1 and str2 for equality.

str1 == str2

ans = logical
0

Compare a string array with multiple elements to a string scalar.

str1 = ["Mercury","Gemini","Apollo";...
"Skylab","Skylab B","International Space Station"];
str2 = "Apollo";
str1 == str2

ans = 2x3 logical array

0 0 1
0 0 0

Compare a string array to a character vector. As long as one of the variables is a string array, you can
make the comparison.

chr = 'Gemini';
TF = (str1 == chr)

TF = 2x3 logical array

0 1 0

0 0 0

Index into str1 with TF to extract the string elements that matched Gemini. You can use logical
arrays to index into an array.

str1(TF)

ans =
"Gemini"

Compare for inequality using the ~= operator. Index into str1 to extract the elements that do not
match 'Gemini'.

TF = (str1 ~= chr)

TF = 2x3 logical array

1 0 1
1 1 1

str1(TF)

ans = 5x1 string


"Mercury"
"Skylab"
"Skylab B"
"Apollo"
"International Space Station"

Compare two nonscalar string arrays. When you compare two nonscalar arrays, they must be the
same size.

str2 = ["Mercury","Mars","Apollo";...
"Jupiter","Saturn","Neptune"];
TF = (str1 == str2)

TF = 2x3 logical array

1 0 1
0 0 0

Index into str1 to extract the matches.

str1(TF)

ans = 2x1 string


"Mercury"
"Apollo"

Compare String Arrays with Other Relational Operators

You can also compare strings with the relational operators >, >=, <, and <=. Strings that start with
uppercase letters come before strings that start with lowercase letters. For example, the string
"ABC" is less than "abc". Digits and some punctuation marks also come before letters.

"ABC" < "abc"

ans = logical
1

Compare a string array that contains names to another name with the > operator. The names
Sanchez, de Ponte, and Nash come after Matthews, because S, d, and N all are greater than M.

str = ["Sanchez","Jones","de Ponte","Crosby","Nash"];


TF = (str > "Matthews")

TF = 1x5 logical array

1 0 1 0 1

str(TF)

ans = 1x3 string


"Sanchez" "de Ponte" "Nash"

Sort String Arrays

You can sort string arrays. MATLAB® stores characters as Unicode® using the UTF-16 character
encoding scheme. Character and string arrays are sorted according to the UTF-16 code point order.
For the characters that are also the ASCII characters, this order means that uppercase letters come
before lowercase letters. Digits and some punctuation also come before letters.

Sort the string array str.

sort(str)

ans = 1x5 string


"Crosby" "Jones" "Nash" "Sanchez" "de Ponte"

Sort a 2-by-3 string array. The sort function sorts the elements in each column separately.

sort(str2)

ans = 2x3 string


"Jupiter" "Mars" "Apollo"
"Mercury" "Saturn" "Neptune"

To sort the elements in each row, sort str2 along the second dimension.

sort(str2,2)

ans = 2x3 string


"Apollo" "Mars" "Mercury"
"Jupiter" "Neptune" "Saturn"

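Because uppercase letters sort before lowercase letters, "de Ponte" lands at the end of the sorted names above. If you want an order that ignores case, one approach (a sketch, not the only option) is to sort a lowercase copy and reuse the resulting permutation. The variable names idx and sorted are illustrative.

```matlab
str = ["Sanchez","Jones","de Ponte","Crosby","Nash"];
% Sort a lowercase copy, but keep only the permutation indices.
[~,idx] = sort(lower(str));
% Apply the permutation to the original, mixed-case names.
sorted = str(idx);  % "Crosby" "de Ponte" "Jones" "Nash" "Sanchez"
```
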
Compare Character Vectors

You can compare character vectors and cell arrays of character vectors to each other. Use the
strcmp function to compare two character vectors, or strncmp to compare the first N characters.
You also can use strcmpi and strncmpi for case-insensitive comparisons.

Compare two character vectors with the strcmp function. chr1 and chr2 are not equal.

chr1 = 'hello';
chr2 = 'help';
TF = strcmp(chr1,chr2)

TF = logical
0

Note that the MATLAB strcmp differs from the C version of strcmp. The C version of strcmp
returns 0 when two character arrays are the same, not when they are different.

Compare the first two characters with the strncmp function. TF is 1 because both character vectors
start with the characters he.

TF = strncmp(chr1,chr2,2)

TF = logical
1

Compare two cell arrays of character vectors. strcmp returns a logical array that is the same size as
the cell arrays.

C1 = {'pizza'; 'chips'; 'candy'};


C2 = {'pizza'; 'chocolate'; 'pretzels'};
strcmp(C1,C2)

ans = 3x1 logical array

1
0
0
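
The case-insensitive variants mentioned above behave the same way, except that they ignore letter case. The following sketch uses made-up inputs and illustrative variable names.

```matlab
tf1 = strcmp('Hello','hello');     % 0 because the case differs
tf2 = strcmpi('Hello','hello');    % 1 because case is ignored
tf3 = strncmpi('HELP','hello',3);  % 1 because the first three characters match
% Compare one character vector against every cell of a cell array.
tf4 = strcmpi('PIZZA', {'pizza'; 'chips'; 'candy'});  % [1; 0; 0]
```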

Inspect Characters in String and Character Arrays

You can inspect the characters in string arrays or character arrays with the isstrprop, isletter,
and isspace functions.

• The isstrprop function inspects characters in either string arrays or character arrays.
• The isletter and isspace functions inspect characters in character arrays only.

Determine which characters in a character vector are space characters. isspace returns a logical
vector that is the same size as chr.

chr = 'Four score and seven years ago';


TF = isspace(chr)

TF = 1x30 logical array

0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0

The isstrprop function can query characters for many different traits. isstrprop can determine
whether characters in a string or character vector are letters, alphanumeric characters, decimal or
hexadecimal digits, or punctuation characters.

Determine which characters in a string are punctuation marks. isstrprop returns a logical vector
whose length is equal to the number of characters in str.

str = "A horse! A horse! My kingdom for a horse!"

str =
"A horse! A horse! My kingdom for a horse!"

isstrprop(str,"punct")

ans = 1x41 logical array

0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0

Determine which characters in the character vector chr are letters.

isstrprop(chr,"alpha")

ans = 1x30 logical array

1 1 1 1 0 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 0 1 1 1
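
The logical vectors returned by these functions can be counted or used as indices. As a brief sketch, with nSpaces and lettersOnly as illustrative variable names:

```matlab
chr = 'Four score and seven years ago';
% Count the space characters.
nSpaces = nnz(isspace(chr));                 % 5
% Keep only the letters by using the logical vector as an index.
lettersOnly = chr(isstrprop(chr,'alpha'));   % 'Fourscoreandsevenyearsago'
```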

See Also
strcmp | sort | isstrprop | isletter | isspace | eq | ne | gt | ge | le | lt

Related Examples
• “Text in String and Character Arrays” on page 6-2
• “Create String Arrays” on page 6-5
• “Analyze Text Data with String Arrays” on page 6-15
• “Search and Replace Text” on page 6-37
• “Test for Empty Strings and Missing Values” on page 6-20

Search and Replace Text


Since R2016b. Replaces Searching and Replacing (R2016a).

Processing text data often involves finding and replacing substrings. There are several functions that
find text and return different information: some functions confirm that the text exists, while others
count occurrences, find starting indices, or extract substrings. These functions work on character
vectors and string scalars, such as "yes", as well as character and string arrays, such as
["yes","no";"abc","xyz"]. In addition, you can use patterns to define rules for searching, such as
one or more letter or digit characters.

Search for Text

To determine if text is present, use a function that returns logical values, like contains,
startsWith, or endsWith. A logical value of 1 corresponds to true, and 0 corresponds to false.

txt = "she sells seashells by the seashore";


TF = contains(txt,"sea")

TF = logical
1

Calculate how many times the text occurs using the count function.

n = count(txt,"sea")

n = 2

To locate where the text occurs, use the strfind function, which returns starting indices.

idx = strfind(txt,"sea")

idx = 1×2

11 28

Find and extract text using extraction functions, such as extract, extractBetween,
extractBefore, or extractAfter.

mid = extractBetween(txt,"sea","shore")

mid =
"shells by the sea"

Optionally, include the boundary text.

mid = extractBetween(txt,"sea","shore","Boundaries","inclusive")

mid =
"seashells by the seashore"

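The other functions named in this section follow the same calling pattern. This sketch applies startsWith, endsWith, extractBefore, and extractAfter to the same text; the search terms and variable names are chosen for illustration.

```matlab
txt = "she sells seashells by the seashore";
tfStart = startsWith(txt,"she");       % 1
tfEnd = endsWith(txt,"shells");        % 0, the text ends with "seashore"
first = extractBefore(txt," sells");   % "she"
last = extractAfter(txt,"by the ");    % "seashore"
```
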
Find Text in Arrays

The search and replacement functions can also find text in multi-element arrays. For example, look
for color names in several song titles.

songs = ["Yellow Submarine";
         "Penny Lane";
         "Blackbird"];

colors = ["Red","Yellow","Blue","Black","White"];

TF = contains(songs,colors)

TF = 3x1 logical array

1
0
1

To list the songs that contain color names, use the logical TF array as indices into the original songs
array. This technique is called logical indexing.

colorful = songs(TF)

colorful = 2x1 string


"Yellow Submarine"
"Blackbird"

Use the function replace to replace text in songs that matches elements of colors with the string
"Orange".

replace(songs,colors,"Orange")

ans = 3x1 string


"Orange Submarine"
"Penny Lane"
"Orangebird"

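These searches are case sensitive by default. If the case of the search text might differ from the case in the data, the contains function accepts an IgnoreCase option, as this sketch shows. The variable names tfExact and tfAnyCase are illustrative.

```matlab
songs = ["Yellow Submarine"; "Penny Lane"; "Blackbird"];
% Lowercase "yellow" does not match "Yellow" in a case-sensitive search.
tfExact = contains(songs,"yellow");                       % [0; 0; 0]
% With IgnoreCase set to true, the first title matches.
tfAnyCase = contains(songs,"yellow","IgnoreCase",true);   % [1; 0; 0]
```
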
Match Patterns

Since R2020b

In addition to searching for literal text, like “sea” or “yellow”, you can search for text that matches a
pattern. There are many predefined patterns, such as digitsPattern to find numeric digits.

address = "123a Sesame Street, New York, NY 10128";
nums = extract(address,digitsPattern)

nums = 2x1 string


"123"
"10128"

For additional precision in searches, you can combine patterns. For example, locate words that start
with the character “S”. Use a string to specify the “S” character, and lettersPattern to find
additional letters after that character.

pat = "S" + lettersPattern;
StartWithS = extract(address,pat)

StartWithS = 2x1 string


"Sesame"

"Street"

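Many pattern functions also accept a count. As a sketch, assuming that the postal code in the address is the only run of exactly five digits, digitsPattern(5) extracts it and skips the shorter house number.

```matlab
address = "123a Sesame Street, New York, NY 10128";
% digitsPattern(5) matches exactly five digits, so "123" is not a match.
zip = extract(address, digitsPattern(5));  % "10128"
```
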
For more information, see “Build Pattern Expressions” on page 6-40.

See Also
contains | extract | count | pattern | replace | strfind

Related Examples
• “Text in String and Character Arrays” on page 6-2
• “Build Pattern Expressions” on page 6-40
• “Test for Empty Strings and Missing Values” on page 6-20
• “Regular Expressions” on page 2-51

Build Pattern Expressions


Since R2020b

Patterns are a tool to aid in searching for and modifying text. Similar to regular expressions, a
pattern defines rules for matching text. Patterns can be used with text-searching functions like
contains, matches, and extract to specify which portions of text these functions act on. You can
build a pattern expression in a way similar to how you would build a mathematical expression, using
pattern functions, operators, and literal text. Because building pattern expressions is open ended,
patterns can become quite complicated. Building patterns in steps and using functions like
maskedPattern and namedPattern can help organize complicated patterns.

Building Simple Patterns

The simplest pattern is built from a single pattern function. For example, lettersPattern matches
any letter characters. There are many pattern functions for matching different types of characters
and other features of text. A list of these functions can be found on the pattern reference page.

txt = "abc123def";
pat = lettersPattern;
extract(txt,pat)

ans = 2x1 string


"abc"
"def"

Patterns combine with other patterns and literal text by using the plus (+) operator. This operator
appends patterns and text together in the order they are defined in the pattern expression. The
combined pattern matches text only in that same order. In this example, "YYYY/MM/DD" is not a match
because the pattern requires the four-letter group to come last.

txt = "Dates can be expressed as MM/DD/YYYY, DD/MM/YYYY, or YYYY/MM/DD";


pat = lettersPattern(2) + "/" + lettersPattern(2) + "/" + lettersPattern(4);
extract(txt,pat)

ans = 2x1 string


"MM/DD/YYYY"
"DD/MM/YYYY"

Patterns used with the or (|) operator specify that only one of the two specified patterns needs to
match a section of text. If neither pattern is able to match, then the pattern expression fails to match.

txt = "123abc";
pat = lettersPattern|digitsPattern;
extract(txt,pat)

ans = 2x1 string


"123"
"abc"

Some pattern functions take patterns as their input and modify them in some way. For example,
optionalPattern makes a specified pattern match if possible, but the pattern is not required for a
successful match.

txt = ["123abc" "abc"];


pat = optionalPattern(digitsPattern) + lettersPattern;
extract(txt,pat)

ans = 1x2 string


"123abc" "abc"

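The plus operator and optionalPattern can be combined in one expression. As a sketch, the following pattern (an illustration, not a complete number parser) matches integers and decimals with an optional leading minus sign. The sample text and variable names are made up.

```matlab
txt = "x = -12.5, y = 7";
% Optional sign, required digits, optional fractional part.
pat = optionalPattern("-") + digitsPattern + optionalPattern("." + digitsPattern);
nums = extract(txt, pat);  % ["-12.5"; "7"]
```
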
Boundary Patterns

Boundary patterns are a special type of pattern that do not match characters but rather match the
boundaries between a designated character type and other characters or the start or end of that
piece of text. For example, digitBoundary matches the boundaries between digit characters and
nondigit characters and between digit characters and the start or end of the text. It does not match
digit characters themselves. Boundary patterns are useful as delimiters for functions like split.

txt = "123abc";
pat = digitBoundary;
split(txt,pat)

ans = 3x1 string


""
"123"
"abc"

Boundary patterns are special among patterns because they can be negated using the not (~)
operator. When negated in this way, boundary patterns match before or after characters that do not
satisfy the requirements above. For example, ~digitBoundary matches the boundary between:

• characters that are both digits


• characters that are both nondigits
• a nondigit character and the start or end of a piece of text

Use replace to mark the locations matched by ~digitBoundary with a "|" character.

txt = "123abc";
pat = ~digitBoundary;
replace(txt,pat,"|")

ans =
"1|2|3a|b|c|"

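For comparison, here is a sketch of split with the non-negated digitBoundary on text that starts and ends with letters. Because no digit touches the start or end of the text, no empty string appears in the result. The input text is made up for illustration.

```matlab
% Boundaries exist only between "c" and "1" and between "3" and "d".
parts = split("abc123def", digitBoundary);  % ["abc"; "123"; "def"]
```
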
Building Complicated Patterns in Steps

Sometimes a simple pattern is not sufficient to solve a problem and a more complicated pattern is
needed. As a pattern expression grows it can become difficult to understand what it is matching. One
way to simplify building a complicated pattern is building each part of the pattern separately and
then combining the parts together into a single pattern expression.

For instance, email addresses use the form local_part@domain.TLD. Each of the three identifiers
(local_part, domain, and TLD) must be a combination of digits, letters, and underscore characters. To
build the full pattern, start by defining a pattern for the identifiers. Build a pattern that matches one
letter or digit character or one underscore character.

identCharacters = alphanumericsPattern(1) | "_";

Now, use asManyOfPattern to match one or more consecutive instances of identCharacters.

identifier = asManyOfPattern(identCharacters,1);

Next, build a pattern that matches an email containing multiple identifiers.

emailPattern = identifier + "@" + identifier + "." + identifier;

Test the pattern by seeing how well it matches the following example emails.

exampleEmails = ["[email protected]"
"[email protected]"
"[email protected]"];
matches(exampleEmails,emailPattern)

ans = 3x1 logical array

1
0
0

The pattern fails to match several of the example emails even though all the emails are valid. Both the
local_part and domain can be made of a series of identifiers that are separated by periods. Use the
identifier pattern to build a pattern that is capable of matching a series of identifiers.
asManyOfPattern matches as many consecutive appearances of the specified pattern as possible,
but if there are none, the rest of the pattern is still able to match successfully.

identifierSeries = asManyOfPattern(identifier + ".") + identifier;

Use this pattern to build a new emailPattern that can match all of the example emails.

emailPattern = identifierSeries + "@" + identifierSeries + "." + identifier;


matches(exampleEmails,emailPattern)

ans = 3x1 logical array

1
1
1
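
The same step-by-step approach applies to other text that is made of repeated parts. As a sketch, this hypothetical pattern matches version labels such as "v1" or "v1.2.3". The names number and versionPattern, and the sample labels, are assumptions for illustration.

```matlab
number = digitsPattern;
% A "v", one number, then zero or more groups of a period and a number.
versionPattern = "v" + number + asManyOfPattern("." + number);
tf = matches(["v1" "v1.2.3" "1.2"], versionPattern);  % [1 1 0]
```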

Organizing Pattern Display

Complex patterns can sometimes be difficult to read and interpret, especially for people you share
them with who are unfamiliar with the pattern's structure. For example, when displayed,
emailPattern is long and difficult to read.

emailPattern

emailPattern = pattern
Matching:

asManyOfPattern(asManyOfPattern(alphanumericsPattern(1) | "_",1) + ".") + asManyOfPattern(alp

Part of the issue with the display is that there are many repetitions of the identifier pattern. If the
exact details of this pattern are not important to users of the pattern, then the display of the
identifier pattern can be concealed using maskedPattern. This function creates a new pattern
where the display of identifier is masked and the variable name, "identifier", is displayed
instead. Alternatively, you can specify a different name to be displayed. The details of patterns that
are masked in this way can be accessed by clicking "Show all details" in the displayed pattern.

identifier = maskedPattern(identifier);
identifierSeries = asManyOfPattern(identifier + ".") + identifier

identifierSeries = pattern
Matching:

asManyOfPattern(identifier + ".") + identifier

Show all details

Patterns can be further organized using the namedPattern function. namedPattern designates a
pattern as a named pattern that changes how the pattern is displayed when combined with other
patterns. Email addresses have several important portions, local_part@domain.TLD, each of which
has its own matching rules. Create a named pattern for each section.

localPart = namedPattern(identifierSeries,"local_part");

Named patterns can be nested to further delineate parts of a pattern. To nest a named pattern, build
a pattern using named patterns and then designate that pattern as a named pattern. For example,
domain.TLD can be divided into the domain, subdomains, and the top level domain (TLD). Create
named patterns for each part of domain.TLD.

subdomain = namedPattern(identifierSeries,"subdomain");
domainName = namedPattern(identifier,"domainName");
tld = namedPattern(identifier,"TLD");

Nest the named patterns for the components of domain underneath a single named pattern domain.

domain = optionalPattern(subdomain + ".") + ...
    domainName + "." + ...
    tld;
domain = namedPattern(domain);

Combine the patterns together into a single named pattern, emailPattern. In the display of
emailPattern, you can see each named pattern and what it matches, as well as information on
any nested named patterns.

emailPattern = localPart + "@" + domain

emailPattern = pattern
Matching:

local_part + "@" + domain

Using named patterns:

local_part : asManyOfPattern(identifier + ".") + identifier


domain : optionalPattern(subdomain + ".") + domainName + "." + TLD
subdomain : asManyOfPattern(identifier + ".") + identifier
domainName: identifier
TLD : identifier

Show all details

You can access named patterns and nested named patterns by dot-indexing into a pattern. For
example, you can access the nested named pattern subdomain by dot-indexing from emailPattern
into domain and then dot-indexing again into subdomain.

emailPattern.domain.subdomain

ans = pattern
Matching:

asManyOfPattern(identifier + ".") + identifier

Show all details

Dot-assignment can be used to change named patterns without needing to rewrite the rest of the
pattern expression.

emailPattern.domain = "mathworks.com"

emailPattern = pattern
Matching:

local_part + "@" + domain

Using named patterns:

local_part: asManyOfPattern(identifier + ".") + identifier


domain : "mathworks.com"

Show all details
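
As a smaller, self-contained sketch of the same idea, the following code builds a hypothetical date pattern from two named patterns and then narrows one of them with dot-assignment. The names yearPat, monthPat, and datePat, and the sample text, are assumptions for illustration.

```matlab
yearPat = namedPattern(digitsPattern(4),"year");
monthPat = namedPattern(digitsPattern(2),"month");
datePat = yearPat + "-" + monthPat;
% Any four-digit year matches at first.
tfAny = matches(["2021-09" "2019-09"], datePat);
% Replace only the named pattern "year"; the rest of the expression is unchanged.
datePat.year = "2021";
tf2021 = matches(["2021-09" "2019-09"], datePat);
```

After the assignment, only text that begins with 2021 is expected to match.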

See Also
pattern | string | regexp | contains | replace | extract

More About
• “Search and Replace Text” on page 6-37
• “Regular Expressions” on page 2-51

Convert Numeric Values to Text


This example shows how to convert numeric values to text and append them to larger pieces of text.
For example, you might want to add a label or title to a plot, where the label includes a number that
describes a characteristic of the plot.

Convert to Strings

Before R2016b, convert to character vectors using num2str.

To convert a number to a string that represents it, use the string function.

str = string(pi)

str =
"3.1416"

The string function converts a numeric array to a string array having the same size.

A = [256 pi 8.9e-3];
str = string(A)

str = 1x3 string


"256" "3.141593" "0.0089"

You can specify the format of the output text using the compose function, which accepts format
specifiers for precision, field width, and exponential notation.

str = compose("%9.7f",pi)

str =
"3.1415927"

If the input is a numeric array, then compose returns a string array. Return a string array that
represents numbers using exponential notation.

A = [256 pi 8.9e-3];
str = compose("%5.2e",A)

str = 1x3 string


"2.56e+02" "3.14e+00" "8.90e-03"

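A format specifier passed to compose can also contain literal text around the formatting operator. In this sketch, the label text and the input values are made up for illustration.

```matlab
% Each row of the numeric input produces one element of the string array.
str = compose("Value: %.3f", [1.5; 2.25]);  % ["Value: 1.500"; "Value: 2.250"]
```
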
Add Numbers to Strings

Before R2016b, convert numbers to character vectors and concatenate characters in brackets, [].

The simplest way to combine text and numbers is to use the plus operator (+). This operator
automatically converts numeric values to strings when the other operands are strings.

For example, plot a sine wave. Calculate the frequency of the wave and add a string representing that
value in the title of the plot.

X = linspace(0,2*pi);
Y = sin(X);
plot(X,Y)

freq = 1/(2*pi);
str = "Sine Wave, Frequency = " + freq + " Hz"

str =
"Sine Wave, Frequency = 0.15915 Hz"

title(str)

Sometimes existing text is stored in character vectors or cell arrays of character vectors. The plus
operator also automatically converts those types of data to strings when another operand is
a string. To combine numeric values with those types of data, first convert the numeric values to
strings, and then use plus to combine the text.

str = 'Sine Wave, Frequency = ' + string(freq) + {' Hz'}

str =
"Sine Wave, Frequency = 0.15915 Hz"

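The plus operator also works element by element when the numeric operand is an array. As a sketch, the following code builds a set of labels; the label text is made up for illustration.

```matlab
% The string scalar is combined with each element of the numeric array.
labels = "Trial " + (1:3);  % ["Trial 1" "Trial 2" "Trial 3"]
```
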
Character Codes

If your data contains integers that represent Unicode® values, use the char function to convert the
values to the corresponding characters. The output is a character vector or array.

u = [77 65 84 76 65 66];
c = char(u)

c =
'MATLAB'

Converting Unicode values also allows you to include special characters in text. For instance, the
Unicode value for the degree symbol is 176. To add char(176) to a string, use plus.

deg = char(176);
temp = 21;
str = "Temperature: " + temp + deg + "C"

str =
"Temperature: 21°C"

Before R2016b, use num2str to convert the numeric value to a character vector, and then
concatenate.

str = ['Temperature: ' num2str(temp) deg 'C']

str =
'Temperature: 21°C'

Hexadecimal and Binary Values

Since R2019b

You can represent hexadecimal and binary values in your code either using text or using literals. The
recommended way to represent them is to write them as literals. You can write hexadecimal and
binary literals using the 0x and 0b prefixes respectively. However, it can sometimes be useful to
represent such values as text, using the dec2hex or dec2bin functions.

For example, change a bit in a binary value. If you specify the binary value using a literal, then it is stored
as an integer. After setting one of the bits to 0, display the new binary value as text using the dec2bin
function.

register = 0b10010110

register = uint8
150

register = bitset(register,5,0)

register = uint8
134

binStr = dec2bin(register)

binStr =
'10000110'
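
As a further sketch, dec2hex produces hexadecimal text, and a second input to dec2bin requests a minimum number of digits. The input values are chosen for illustration.

```matlab
hexStr = dec2hex(255);    % 'FF'
% Pad the binary text with leading zeros to at least eight digits.
binStr = dec2bin(5,8);    % '00000101'
```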

See Also
dec2bin | dec2hex | char | string | compose | plus

More About
• “Convert Text to Numeric Values” on page 6-49
• “Hexadecimal and Binary Values” on page 6-55
• “Convert Between Datetime Arrays, Numbers, and Text” on page 7-42
• “Formatting Text” on page 6-24

• “Unicode and ASCII Values” on page 6-53

Convert Text to Numeric Values


This example shows how to convert text to the numeric values that it represents. Typically, you need
to perform such conversions when you have text that represents numbers to be plotted or used in
calculations. For example, the text might come from a text file or spreadsheet. If you did not already
convert it to numeric values when importing it into MATLAB®, you can use the functions shown in
this example.

You can convert string arrays, character vectors, and cell arrays of character vectors to numeric
values. Text can represent hexadecimal or binary values, though when you convert them to numbers
they are stored as decimal values. You can also convert text representing dates and time to
datetime or duration values, which can be treated like numeric values.

Double-Precision Values

The recommended way to convert text to double-precision values is to use the str2double function.
It can convert character vectors, string arrays, and cell arrays of character vectors.

For example, create a character vector using single quotes and convert it to the number it represents.

X = str2double('3.1416')

X = 3.1416

If the input argument is a string array or cell array of character vectors, then str2double converts
it to a numeric array having the same size. You can create strings using double quotes. (Strings have
the string data type, while character vectors have the char data type.)

str = ["2.718","3.1416";
"137","0.015"]

str = 2x2 string


"2.718" "3.1416"
"137" "0.015"

X = str2double(str)

X = 2×2

2.7180 3.1416
137.0000 0.0150

The str2double function can convert text that includes commas (as thousands separators) and
decimal points. For example, you can use str2double to convert the Balance variable in the table
below. Balance represents numbers as strings, using a comma as the thousands separator.

load balances
balances

balances=3×2 table
    Customer       Balance
    _________    ___________

    "Diaz"       "13,790.00"
    "Johnson"    "2,456.10"
    "Wu"         "923.71"

balances.Balance = str2double(balances.Balance)

balances=3×2 table
    Customer     Balance
    _________    _______

    "Diaz"         13790
    "Johnson"     2456.1
    "Wu"          923.71

If str2double cannot convert text to a number, then it returns a NaN value.
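
For example, the following sketch converts a string array in which one element is not a number, and then uses isnan to keep only the values that converted. The input text and variable names are made up for illustration.

```matlab
X = str2double(["12.5" "abc" "1e3"]);  % [12.5 NaN 1000]
% Keep only the elements that were converted successfully.
valid = X(~isnan(X));                   % [12.5 1000]
```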

While the str2num function can also convert text to numbers, it is not recommended. str2num uses
the eval function, which can cause unintended side effects when the text input includes a function
name. To avoid these issues, use str2double.

As an alternative, you can convert strings to double-precision values using the double function. If the
input is a string array, then double returns a numeric array that has the same size, just as
str2double does. However, if the input is a character vector, then double converts the individual
characters to numbers representing their Unicode values.

X = double("3.1416")

X = 3.1416

X = double('3.1416')

X = 1×6

51 46 49 52 49 54

This list summarizes the best practices for converting text to numeric values.

• To convert text to numeric values, use the str2double function. It treats string arrays, character
vectors, and cell arrays of character vectors consistently.
• You can also use the double function for string arrays. However, it treats character vectors
differently.
• Avoid str2num. It calls the eval function which can have unintended consequences.

Hexadecimal and Binary Values

You can represent hexadecimal and binary numbers as text or as literals. When you write them as
literals, you must use the 0x and 0b prefixes. When you represent them as text and then convert
them, you can use the prefixes, but they are not required.

For example, write a hexadecimal number as a literal. The prefix is required.

D = 0x3FF

D = uint16
1023

Then convert text representing the same value by using the hex2dec function. It recognizes the
prefix but does not require it.

D = hex2dec('3FF')

D = 1023

D = hex2dec('0x3FF')

D = 1023

Convert text representing binary values using the bin2dec function.

D = bin2dec('101010')

D = 42

D = bin2dec('0b101010')

D = 42

Dates and Times

MATLAB provides the datetime and duration data types to store dates and times, and to treat
them as numeric values. To convert text representing dates and times, use the datetime and
duration functions.

Convert text representing a date to a datetime value. The datetime function recognizes many
common formats for dates and times.

C = '2019-09-20'

C =
'2019-09-20'

D = datetime(C)

D = datetime
20-Sep-2019

You can convert arrays representing dates and times.

str = ["2019-01-31","2019-02-28","2019-03-31"]

str = 1x3 string


"2019-01-31" "2019-02-28" "2019-03-31"

D = datetime(str)

D = 1x3 datetime
31-Jan-2019 28-Feb-2019 31-Mar-2019

If you convert text to duration values, then use the hh:mm:ss or dd:hh:mm:ss formats.

D = duration('12:34:56')

D = duration
12:34:56
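
If the text uses a date format that might be ambiguous, you can state the format with the InputFormat option of datetime. The sketch below assumes day/month/year text, and then shows that the converted values behave like numbers. The variable names are illustrative.

```matlab
% Interpret the text as day/month/year rather than month/day/year.
d = datetime('20/09/2019','InputFormat','dd/MM/yyyy');
[y,m,dayOfMonth] = ymd(d);   % 2019, 9, and 20
% A duration can be converted to a plain number of minutes.
t = duration('01:30:00');
mins = minutes(t);           % 90
```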

See Also
bin2dec | hex2dec | str2double | datetime | duration | double | table

More About
• “Convert Numeric Values to Text” on page 6-45
• “Convert Between Datetime Arrays, Numbers, and Text” on page 7-42
• “Hexadecimal and Binary Values” on page 6-55
• “Formatting Text” on page 6-24
• “Unicode and ASCII Values” on page 6-53

Unicode and ASCII Values


MATLAB® stores all characters as Unicode® characters using the UTF-16 encoding, where every
character is represented by a numeric code value. (Unicode incorporates the ASCII character set as
the first 128 symbols, so ASCII characters have the same numeric codes in Unicode and ASCII.) Both
character arrays and string arrays use this encoding. You can convert characters to their numeric
code values by using various numeric conversion functions. You can convert numbers to characters
using the char function.

Convert Characters to Numeric Code Values

You can convert characters to integers that represent their Unicode code values. To convert a single
character or a character array, use any of these functions:

• double
• uint16, uint32, or uint64

The best practice is to use the double function. However, if you need to store the numeric values as
integers, use unsigned integers having at least 16 bits because MATLAB uses the UTF-16 encoding.

Convert a character vector to Unicode code values using the double function.

C = 'MATLAB'

C =
'MATLAB'

unicodeValues = double(C)

unicodeValues = 1×6

77 65 84 76 65 66
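
To illustrate why integer types narrower than 16 bits are unsuitable, here is a minimal sketch that uses the euro sign, whose code value (8364) lies outside the 8-bit range. The variable names are illustrative only.

```matlab
euro = char(8364);           % the euro sign, code value 8364 (hex 20AC)
asDouble = double(euro);     % 8364
asUint16 = uint16(euro);     % 8364, fits in 16 bits
asUint8  = uint8(euro);      % saturates at 255, so the code value is lost
```

Because integer conversion saturates rather than raising an error, a type that is too narrow silently corrupts the code values.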

You cannot convert characters in a string array directly to Unicode code values. In particular, the
double function converts strings to the numbers they represent, just as the str2double function
does. If double cannot convert a string to a number, then it returns a NaN value.

str = "MATLAB";
double(str)

ans = NaN

To convert characters in a string, first convert the string to a character vector, or use curly braces to
extract the characters. Then convert the characters using a function such as double.

C = char(str);
unicodeValues = double(C)

unicodeValues = 1×6

77 65 84 76 65 66
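
The curly-brace alternative mentioned above can be sketched as follows. The second part, which handles a string array with several elements by way of cellstr and cellfun, is one possible approach rather than a prescribed one.

```matlab
str = "MATLAB";
fromBraces = double(str{1});     % str{1} is the character vector 'MATLAB'

% For a string array, convert each element separately. The results have
% different lengths, so collect them in a cell array.
names = ["Hi","MATLAB"];
codes = cellfun(@double,cellstr(names),'UniformOutput',false);
```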

Convert Numeric Code Values to Characters

You can convert Unicode values to characters using the char function.

D = [77 65 84 76 65 66]

D = 1×6

77 65 84 76 65 66

C = char(D)

C =
'MATLAB'

A typical use for char is to create characters you cannot type and append them to strings. For
example, create the character for the degree symbol and append it to a string. The Unicode code
value for the degree symbol is 176.

deg = char(176)

deg =
'°'

myLabel = append("Current temperature is 21",deg,"C")

myLabel =
"Current temperature is 21°C"

For more information on Unicode, including mappings between characters and code values, see
Unicode.

See Also
char | double | single | string | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 |
uint64

More About
• “Convert Text to Numeric Values” on page 6-49
• “Convert Numeric Values to Text” on page 6-45

External Websites
• Unicode

Hexadecimal and Binary Values


You can represent numbers as hexadecimal or binary values. In some contexts, these representations
of numbers are more convenient. For example, you can represent the bits of a hardware register
using binary values. In MATLAB®, there are two ways to represent hexadecimal and binary values:

• As literals. Starting in R2019b, you can write hexadecimal and binary values as literals using an
appropriate prefix as notation. For example, 0x2A is a literal that specifies 42—and MATLAB
stores it as a number, not as text.
• As strings or character vectors. For example, the character vector '2A' represents the number 42
as a hexadecimal value. When you represent a hexadecimal or binary value using text, enclose it
in quotation marks. MATLAB stores this representation as text, not a number.

MATLAB provides several functions for converting numbers to and from their hexadecimal and binary
representations.

Write Integers Using Hexadecimal and Binary Notation

Hexadecimal literals start with a 0x or 0X prefix, while binary literals start with a 0b or 0B prefix.
MATLAB stores the number written with this notation as an integer. For example, these two literals
both represent the integer 42.

A = 0x2A

A = uint8
42

B = 0b101010

B = uint8
42

Do not use quotation marks when you write a number using this notation. Use 0-9, A-F, and a-f to
represent hexadecimal digits. Use 0 and 1 to represent binary digits.

By default, MATLAB stores the number as the smallest unsigned integer type that can accommodate
it. However, you can use an optional suffix to specify the type of integer that stores the value.

• To specify unsigned 8-, 16-, 32-, and 64-bit integer types, use the suffixes u8, u16, u32, and u64.
• To specify signed 8-, 16-, 32-, and 64-bit integer types, use the suffixes s8, s16, s32, and s64.

For example, write a hexadecimal literal to be stored as a signed 32-bit integer.

A = 0x2As32

A = int32
42

When you specify signed integer types, you can write literals that represent negative numbers.
Represent negative numbers in two's complement form. For example, specify a negative number with
a literal using the s8 suffix.

A = 0xFFs8

A = int8
-1

Since MATLAB stores these literals as numbers, you can use them in any context or function where
you use numeric arrays.
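
As a further sketch of the default type selection and the optional suffixes described in this section, the following literals show how the stored class changes with the value and the suffix. The variable names are illustrative.

```matlab
a = 0x2A;            % uint8, the smallest unsigned type that holds 42
b = 0x100;           % uint16, because 256 does not fit in 8 bits
c = 0x2Au32;         % uint32, forced by the suffix
d = 0x80s8;          % int8 -128, bit pattern 1000 0000 in two's complement
e = 0b11111110s8;    % int8 -2
classes = {class(a),class(b),class(c),class(d),class(e)};
```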

Represent Hexadecimal and Binary Values as Text

You can also convert integers to character vectors that represent them as hexadecimal or binary
values using the dec2hex and dec2bin functions. Convert an integer to hexadecimal.

hexStr = dec2hex(255)

hexStr =
'FF'

Convert an integer to binary.

binStr = dec2bin(16)

binStr =
'10000'

Since these functions produce text, use them when you need text that represents numeric values. For
example, you can append these values to a title or a plot label, or write them to a file that stores
numbers as their hexadecimal or binary representations.

Represent Arrays of Hexadecimal Values as Text

The recommended way to convert an array of numbers to text is to use the compose function. This
function returns a string array having the same size as the input numeric array. To produce
hexadecimal format, use %X as the format specifier.

A = [255 16 12 1024 137]

A = 1×5

255 16 12 1024 137

hexStr = compose("%X",A)

hexStr = 1x5 string


"FF" "10" "C" "400" "89"

The dec2hex and dec2bin functions also convert arrays of numbers to text representing them as
hexadecimal or binary values. However, these functions return character arrays, where each row
represents a number from the input numeric array, padded with zeros as necessary.
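
To make the difference concrete, this sketch applies dec2hex to the same array and inspects the padded character array. It also shows that a width in the format specifier gives compose fixed-width output; the width of 4 is an arbitrary choice for illustration.

```matlab
A = [255 16 12 1024 137];
H = dec2hex(A);                  % 5-by-3 char array, one row per number
thirdRow = H(3,:);               % '00C', padded to the width of the largest value
padded = compose("%04X",A);      % string array with a fixed width of 4 digits
```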

Convert Binary Representations to Hexadecimal

To convert a binary value to hexadecimal, start with a binary literal, and convert it to text
representing its hexadecimal value. Since a literal is interpreted as a number, you can specify it
directly as the input argument to dec2hex.

D = 0b1111;
hexStr = dec2hex(D)

hexStr =
'F'

If you start with a hexadecimal literal, then you can convert it to text representing its binary value
using dec2bin.

D = 0x8F;
binStr = dec2bin(D)

binStr =
'10001111'
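
If the value starts out as text rather than as a literal, one way to perform the same conversion is to go through a number with hex2dec or bin2dec. This is a sketch with illustrative variable names.

```matlab
hexText = '8F';
binText = dec2bin(hex2dec(hexText));      % text -> number -> binary text
backToHex = dec2hex(bin2dec(binText));    % binary text -> number -> hex text
```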

Bitwise Operations with Binary Values

One typical use of binary numbers is to represent bits. For example, many devices have registers that
provide access to a collection of bits representing data in memory or the status of the device. When
working with such hardware you can use numbers in MATLAB to represent the value in a register.
Use binary values and bitwise operations to represent and access particular bits.

Create a number that represents an 8-bit register. It is convenient to start with binary representation,
but the number is stored as an integer.

register = 0b10010110

register = uint8
150

To get or set the values of particular bits, use bitwise operations. For example, use the bitand and
bitshift functions to get the value of the fifth bit. (Shift that bit to the first position so that
MATLAB returns a 0 or 1. In this example, the fifth bit is a 1.)

b5 = bitand(register,0b10000);
b5 = bitshift(b5,-4)

b5 = uint8
1

To flip the fifth bit to 0, use the bitset function.

register = bitset(register,5,0)

register = uint8
134

Since register is an integer, use the dec2bin function to display all the bits in binary format.
binStr is a character vector, and represents the binary value without a leading 0b prefix.

binStr = dec2bin(register)

binStr =
'10000110'
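
The mask-and-shift steps above can also be written with bitget, and other bits can be set or toggled with bitor and bitxor. The following sketch starts again from the original register value; the variable names are illustrative.

```matlab
register = 0b10010110;                     % uint8 150
b5 = bitget(register,5);                   % 1, same result as the mask and shift
allBits = bitget(register,8:-1:1);         % every bit, most significant first
withBit1 = bitor(register,0b00000001);     % set bit 1, giving 151
toggled = bitxor(register,0b10000000);     % flip bit 8, giving 22
```

Using bitget avoids having to build a mask and shift count by hand when you only need to read a bit.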

See Also
bin2dec | bitand | bitshift | bitset | dec2bin | dec2hex | hex2dec | sprintf | sscanf

More About
• “Convert Text to Numeric Values” on page 6-49
• “Convert Numeric Values to Text” on page 6-45
• “Formatting Text” on page 6-24
• “Bit-Wise Operations” on page 2-38
• “Perform Cyclic Redundancy Check” on page 2-44

External Websites
• Two's Complement

Frequently Asked Questions About String Arrays


MATLAB introduced the string data type in R2016b. Starting in R2018b, you can use string arrays
to work with text throughout MathWorks products. String arrays store pieces of text and provide a
set of functions for working with text as data. You can index into, reshape, and concatenate string
arrays just as you can with arrays of any other type. For more information, see “Create String Arrays”
on page 6-5.

In most respects, string arrays behave like character vectors and cell arrays of character vectors.
However, there are a few key differences between string arrays and character arrays that can lead to
results you might not expect. For each of these differences, there is a recommended way to use
strings that leads to the expected result.

Why Does Using Command Form With Strings Return An Error?


When you use functions such as cd, dir, copyfile, or load in command form, avoid
using double quotes. In command form, arguments enclosed in double quotes can result in errors. To
specify arguments as strings, use functional form.

With command syntax, you separate inputs with spaces rather than commas, and you do not enclose
input arguments in parentheses. For example, you can use the cd function with command syntax to
change folders.
cd C:\Temp

The text C:\Temp is a character vector. In command form, all arguments are always character
vectors. If you have an argument, such as a folder name, that contains spaces, then specify it as one
input argument by enclosing it in single quotes.
cd 'C:\Program Files'

But if you specify the argument using double quotes, then cd throws an error.
cd "C:\Program Files"

Error using cd
Too many input arguments.

The error message can vary depending on the function that you use and the arguments that you
specify. For example, if you use the load function with command syntax and specify the argument
using double quotes, then load throws a different error.
load "myVariables.mat"

Error using load
Unable to read file '"myVariables.mat"': Invalid argument.

In command form, double quotes are treated as part of the literal text rather than as the string
construction operator. If you wrote the equivalent of cd "C:\Program Files" in functional form,
then it would look like a call to cd with two arguments.
cd('"C:\Program','Files"')

When specifying arguments as strings, use function syntax. All functions that support command
syntax also support function syntax. For example, you can use cd with function syntax and input
arguments that are double quoted strings.

cd("C:\Program Files")
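
Function syntax is also what you need when the folder name is held in a variable, because command syntax would treat the variable name itself as literal text. The sketch below uses tempdir only so that it can run on any machine; the variable names are illustrative.

```matlab
folder = string(tempdir);     % a folder that exists on any installation
previous = cd(folder);        % function syntax accepts a string scalar
here = string(pwd);           % the new current folder
cd(previous)                  % restore the original folder
```

By contrast, the command form cd folder would look for a folder literally named folder.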

Why Do Strings in Cell Arrays Return an Error?


When you have multiple strings, store them in a string array, not a cell array. Create a string array
using square brackets, not curly braces. String arrays are more efficient than cell arrays for storing
and manipulating text.

str = ["Venus","Earth","Mars"]

str = 1×3 string array


"Venus" "Earth" "Mars"

Avoid using cell arrays of strings. When you use cell arrays, you give up the performance advantages
that come from using string arrays. And in fact, most functions do not accept cell arrays of strings as
input arguments, options, or values of name-value pairs. For example, if you specify a cell array of
strings as an input argument, then the contains function throws an error.

C = {"Venus","Earth","Mars"}

C = 1×3 cell array


{["Venus"]} {["Earth"]} {["Mars"]}

TF = contains(C,"Earth")

Error using contains
First argument must be a string array, character vector, or cell array of character vectors.

Instead, specify the argument as a string array.

str = ["Venus","Earth","Mars"];
TF = contains(str,"Earth");
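
If existing code already produces a cell array of strings, one way to recover a string array is to concatenate the cell contents. The sketch below also converts a cell array of character vectors by using the string function. Variable names are illustrative.

```matlab
C = {"Venus","Earth","Mars"};          % cell array of strings (not recommended)
str = [C{:}];                          % concatenate the contents into a 1-by-3 string array
TF = contains(str,"Earth");            % logical array, true for the second element

legacy = {'Venus','Earth','Mars'};     % cell array of character vectors
str2 = string(legacy);                 % also a 1-by-3 string array
```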

Before R2016b, the term "cell array of strings" meant a cell array whose elements all contain
character vectors. But it is more precise to refer to such cell arrays as "cell arrays of character
vectors," to distinguish them from string arrays.

Cell arrays can contain variables having any data types, including strings. It is still possible to create
a cell array whose elements all contain strings. And if you already have specified cell arrays of
character vectors in your code, then replacing single quotes with double quotes might seem like a
simple update. However, it is not recommended that you create or use cell arrays of strings.

Why Does length() of String Return 1?


It is common to use the length function to determine the number of characters in a character vector.
But to determine the number of characters in a string, use the strlength function, not length.

Create a character vector using single quotes. To determine its length, use the length function.
Because C is a vector, its length is equal to the number of characters. C is a 1-by-11 vector.

C = 'Hello world';
L = length(C)

L = 11

Create a string with the same characters, using double quotes. Though it stores 11 characters, str is
a 1-by-1 string array, or string scalar. If you call length on a string scalar, then the output argument is
1, no matter how many characters it stores.

str = "Hello world";
L = length(str)

L = 1

To determine the number of characters in a string, use the strlength function, introduced in
R2016b. For compatibility, strlength operates on character vectors as well. In both cases
strlength returns the number of characters.

L = strlength(C)

L = 11

L = strlength(str)

L = 11

You also can use strlength on string arrays containing multiple strings and on cell arrays of
character vectors.

The length function returns the size of the longest dimension of an array. For a string array, length
returns the number of strings along the longest dimension of the array. It does not return the number
of characters within strings.
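
The distinction is easiest to see on a string array with more than one element. This sketch compares numel, length, and strlength on a small array; the sample text is arbitrary.

```matlab
str = ["Mercury","Venus";"Earth",""];          % 2-by-2 string array
n = numel(str);                                % 4 strings in total
len = length(str);                             % 2, the size of the longest dimension
chars = strlength(str);                        % characters per element, same size as str
cellChars = strlength({'Mars','Jupiter'});     % also works on cell arrays of character vectors
```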

Why Does isempty("") Return 0?


A string can have no characters at all. Such a string is an empty string. You can specify an empty
string using an empty pair of double quotes.

L = strlength("")

L = 0

However, an empty string is not an empty array. An empty string is a string scalar that happens to
have no characters.

sz = size("")

sz = 1×2
1 1

If you call isempty on an empty string, then it returns 0 (false) because the string is not an empty
array.

tf = isempty("")

tf = logical
0

However, if you call isempty on an empty character array, then it returns 1 (true). A character
array specified as an empty pair of single quotes, '', is a 0-by-0 character array.

tf = isempty('')

tf = logical
1

To test whether a piece of text has no characters, the best practice is to use the strlength function.
You can use the same call whether the input is a string scalar or a character vector.

str = "";
if strlength(str) == 0
disp('String has no text')
end

String has no text

chr = '';
if strlength(chr) == 0
disp('Character vector has no text')
end

Character vector has no text
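
Because strlength operates on every element of a string array, the same test can locate empty strings in an array. The following sketch replaces them with placeholder text; the placeholder "unknown" is an arbitrary choice.

```matlab
names = ["Mercury","","Venus",""];
noText = strlength(names) == 0;     % true where an element has no characters
names(noText) = "unknown";          % fill in the empty strings

emptyArray = strings(0);            % an empty string array, not an empty string
```

Here isempty(emptyArray) is true because the array has no elements, while isempty("") is false because "" is a 1-by-1 array.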

Why Does Appending Strings Using Square Brackets Return Multiple Strings?

You can append text to a character vector using square brackets. But if you add text to a string array
using square brackets, then the new text is concatenated as new elements of the string array. To
append text to strings, use the plus operator or the strcat function.

For example, if you concatenate two strings, then the result is a 1-by-2 string array.

str = ["Hello" "World"]

str = 1×2 string array


"Hello" "World"

However, if you concatenate two character vectors, then the result is a longer character vector.

chr = ['Hello' 'World']

chr = 'HelloWorld'

To append text to a string (or to the elements of a string array), use the plus operator instead of
square brackets.

str = "Hello" + "World"

str = "HelloWorld"

As an alternative, you can use the strcat function. strcat appends text whether the input
arguments are strings or character vectors.

str = strcat("Hello","World")

str = "HelloWorld"

Whether you use square brackets, plus, or strcat, you can specify an arbitrary number of
arguments. Append a space character between Hello and World.

str = "Hello" + " " + "World"

str = "Hello World"
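
A few related behaviors are worth keeping in mind when you choose among these options. The sketch below shows that the plus operator converts numbers to text and expands over arrays, that strcat removes trailing whitespace from character vector inputs but not from string inputs, and that join is another way to combine the elements of a string array. The sample text is arbitrary.

```matlab
files = "run" + (1:3) + ".txt";           % numbers are converted to text
s1 = strcat("Hello ","World");            % strings keep the trailing space
c1 = strcat('Hello ','World');            % character vectors lose trailing whitespace
joined = join(["Hello","World"]," ");     % combine elements with a delimiter
```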

See Also
string | strlength | contains | plus | strcat | sprintf | dir | cd | copyfile | load | length
| size | isempty

Related Examples
• “Create String Arrays” on page 6-5
• “Test for Empty Strings and Missing Values” on page 6-20
• “Compare Text” on page 6-32
• “Update Your Code to Accept Strings” on page 6-64

Update Your Code to Accept Strings


In R2016b, MATLAB introduced string arrays as a data type for text. As of R2018b, all MathWorks
products are compatible with string arrays. Compatible means that if you can specify text as a
character vector or a cell array of character vectors, then you also can specify it as a string array.
Now you can adopt string arrays as a text data type in your own code.

If you write code for other MATLAB users, then it is to your advantage to update your API to accept
string arrays, while maintaining backward compatibility with other text data types. String adoption
makes your code consistent with MathWorks products.

If your code has few dependencies, or if you are developing new code, then consider using string
arrays as your primary text data type for better performance. In that case, best practice is to write or
update your API to accept input arguments that are character vectors, cell arrays of character
vectors, or string arrays.

For the definitions of string array and other terms, see “Terminology for Character and String Arrays”
on page 6-70.

What Are String Arrays?


In MATLAB, you can store text data in two ways. One way is to use a character array, which is a
sequence of characters, just as a numeric array is a sequence of numbers. The other way, available
starting in R2016b, is to store a sequence of characters in a string. You can store multiple strings in
a string array. For more information, see “Characters and Strings”.

Recommended Approaches for String Adoption in Old APIs


When your code has many dependencies, and you must maintain backward compatibility, follow these
approaches for updating functions and classes to present a compatible API.

Functions

• Accept string arrays as input arguments.

• If an input argument can be either a character vector or a cell array of character vectors, then
update your code so that the argument also can be a string array. For example, consider a
function that has an input argument you can specify as a character vector (using single
quotes). Best practice is to update the function so that the argument can be specified as either
a character vector or a string scalar (using double quotes).
• Accept strings as both names and values in name-value pair arguments.

• In name-value pair arguments, allow names to be specified as either character vectors or
strings, that is, with either single or double quotes around the name. If a value can be a
character vector or cell array of character vectors, then update your code so that it also can be
a string array.
• Do not accept cell arrays of string arrays for text input arguments.

• A cell array of string arrays has a string array in each cell. For example, {"hello","world"}
is a cell array of string arrays. While you can create such a cell array, it is not recommended
for storing text. The elements of a string array have the same data type and are stored
efficiently. If you store strings in a cell array, then you lose the advantages of using a string
array.

However, if your code accepts heterogeneous cell arrays as inputs, then consider accepting cell
arrays that contain strings. You can convert any strings in such a cell array to character
vectors.
• In general, do not change the output type.

• If your function returns a character vector or cell array of character vectors, then do not
change the output type, even if the function accepts string arrays as inputs. For example, the
fileread function accepts an input file name specified as either a character vector or a
string, but the function returns the file contents as a character vector. By keeping the output
type the same, you can maintain backward compatibility.
• Return the same data type when the function modifies input text.

• If your function modifies input text and returns the modified text as the output argument, then
the input and output arguments should have the same data type. For example, the lower
function accepts text as the input argument, converts it to all lowercase letters, and returns it.
If the input argument is a character vector, then lower returns a character vector. If the input
is a string array, then lower returns a string array.
• Consider adding a 'TextType' argument to import functions.

• If your function imports data from files, and at least some of that data can be text, then
consider adding an input argument that specifies whether to return text as a character array or
a string array. For example, the readtable function provides the 'TextType' name-value
pair argument. This argument specifies whether readtable returns a table with text in cell
arrays of character vectors or string arrays.
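
To see the effect of the 'TextType' argument that readtable provides, the following sketch writes a small table to a temporary file and reads it back twice. The file location, table contents, and variable names are assumptions made for illustration.

```matlab
T = table(["Venus";"Earth"],[1;2],'VariableNames',{'Name','Order'});
csvFile = [tempname '.csv'];                            % temporary file for the example
writetable(T,csvFile);

asCell = readtable(csvFile);                            % default: text as cell array of character vectors
asString = readtable(csvFile,'TextType','string');      % text as a string array
delete(csvFile);                                        % clean up the temporary file
```

A function of your own could follow the same pattern: keep the old output type as the default and return strings only when the caller asks for them.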

Classes

• Treat methods as functions.

• For string adoption, treat methods as though they are functions. Accept string arrays as input
arguments, and in general, do not change the data type of the output arguments, as described
in the previous section.
• Do not change the data types of properties.

• If a property is a character vector or a cell array of character vectors, then do not change its
type. When you access such a property, the value that is returned is still a character vector or a
cell array of character vectors.

As an alternative, you can add a new property that is a string, and make it dependent on the
old property to maintain compatibility.
• Set properties using string arrays.

• If you can set a property using a character vector or cell array of character vectors, then
update your class to set that property using a string array too. However, do not change the
data type of the property. Instead, convert the input string array to the data type of the
property, and then set the property.
• Add a string method.

• If your class already has a char and/or a cellstr method, then add a string method. If you
can represent an object of your class as a character vector or cell array of character vectors,
then represent it as a string array too.

How to Adopt String Arrays in Old APIs


You can adopt strings in old APIs by accepting string arrays as input arguments, and then converting
them to character vectors or cell arrays of character vectors. If you perform such a conversion at the
start of a function, then you do not need to update the rest of it.

The convertStringsToChars function provides a way to process all input arguments, converting
only those arguments that are string arrays. To enable your existing code to accept string arrays as
inputs, add a call to convertStringsToChars at the beginnings of your functions and methods.

For example, if you have defined a function myFunc that accepts three input arguments, process all
three inputs using convertStringsToChars. Leave the rest of your code unaltered.

function y = myFunc(a,b,c)
[a,b,c] = convertStringsToChars(a,b,c);
<line 1 of original code>
<line 2 of original code>
...

In this example, the output arguments [a,b,c] overwrite the input arguments in place. If any input
argument is not a string array, then it is unaltered.

If myFunc accepts a variable number of input arguments, then process all the arguments specified by
varargin.

function y = myFunc(varargin)
[varargin{:}] = convertStringsToChars(varargin{:});
...
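
To see what the conversion does to each kind of input, you can call convertStringsToChars directly. In this sketch the three sample inputs are arbitrary: a string scalar, a number, and a string array.

```matlab
[a,b,c] = convertStringsToChars("hello",42,["x","y"]);
% a is the character vector 'hello'
% b is still the number 42, because it was not a string array
% c is the cell array of character vectors {'x','y'}
```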

Performance Considerations

The convertStringsToChars function is more efficient when converting one input argument. If
your function is performance sensitive, then you can convert input arguments one at a time, while
still leaving the rest of your code unaltered.

function y = myFunc(a,b,c)
a = convertStringsToChars(a);
b = convertStringsToChars(b);
c = convertStringsToChars(c);
...

Recommended Approaches for String Adoption in New Code


When your code has few dependencies, or you are developing entirely new code, consider using
string arrays as the primary text data type. String arrays provide good performance and efficient
memory usage when working with large amounts of text. Unlike cell arrays of character vectors,
string arrays have a homogeneous data type. String arrays make it easier to write maintainable code.
To use string arrays while maintaining backward compatibility to other text data types, follow these
approaches.

Functions

• Accept any text data types as input arguments.

• If an input argument can be a string array, then also allow it to be a character vector or cell
array of character vectors.
• Accept character arrays as both names and values in name-value pair arguments.

• In name-value pair arguments, allow names to be specified as either character vectors or
strings, that is, with either single or double quotes around the name. If a value can be a string
array, then also allow it to be a character vector or cell array of character vectors.
• Do not accept cell arrays of string arrays for text input arguments.

• A cell array of string arrays has a string array in each cell. While you can create such a cell
array, it is not recommended for storing text. If your code uses strings as the primary text data
type, store multiple pieces of text in a string array, not a cell array of string arrays.

However, if your code accepts heterogeneous cell arrays as inputs, then consider accepting cell
arrays that contain strings.
• In general, return strings.

• If your function returns output arguments that are text, then return them as string arrays.
• Return the same data type when the function modifies input text.

• If your function modifies input text and returns the modified text as the output argument, then
the input and output arguments should have the same data type.

Classes

• Treat methods as functions.

• Accept character vectors and cell arrays of character vectors as input arguments, as described
in the previous section. In general, return strings as outputs.
• Specify properties as string arrays.

• If a property contains text, then set the property using a string array. When you access the
property, return the value as a string array.

How to Maintain Compatibility in New Code


When you write new code, or modify code to use string arrays as the primary text data type, maintain
backward compatibility with other text data types. You can accept character vectors or cell arrays of
character vectors as input arguments, and then immediately convert them to string arrays. If you
perform such a conversion at the start of a function, then the rest of your code can use string arrays
only.

The convertCharsToStrings function provides a way to process all input arguments, converting
only those arguments that are character vectors or cell arrays of character vectors. To enable your
new code to accept these text data types as inputs, add a call to convertCharsToStrings at the
beginnings of your functions and methods.

For example, if you have defined a function myFunc that accepts three input arguments, process all
three inputs using convertCharsToStrings.

function y = myFunc(a,b,c)
[a,b,c] = convertCharsToStrings(a,b,c);
<line 1 of original code>
<line 2 of original code>
...

In this example, the output arguments [a,b,c] overwrite the input arguments in place. If any input
argument is not a character vector or cell array of character vectors, then it is unaltered.

If myFunc accepts a variable number of input arguments, then process all the arguments specified by
varargin.

function y = myFunc(varargin)
[varargin{:}] = convertCharsToStrings(varargin{:});
...
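
As with the opposite conversion, you can call convertCharsToStrings directly to see how each kind of input is treated. The sample inputs in this sketch are arbitrary.

```matlab
[a,b,c] = convertCharsToStrings('hello',{'x','y'},42);
% a is the string scalar "hello"
% b is the 1-by-2 string array ["x","y"]
% c is still the number 42, because it was not text
```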

Performance Considerations

The convertCharsToStrings function is more efficient when converting one input argument. If
your function is performance sensitive, then you can convert input arguments one at a time, while
still leaving the rest of your code unaltered.

function y = myFunc(a,b,c)
a = convertCharsToStrings(a);
b = convertCharsToStrings(b);
c = convertCharsToStrings(c);
...

How to Manually Convert Input Arguments


If it is at all possible, avoid manual conversion of input arguments that contain text, and instead use
the convertStringsToChars or convertCharsToStrings functions. Checking the data types of
input arguments and converting them yourself is a tedious approach, prone to errors.

If you must convert input arguments, then use the functions in this table.

Conversion                                        Function
String scalar to character vector                 char
String array to cell array of character vectors   cellstr
Character vector to string scalar                 string
Cell array of character vectors to string array   string
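
Each row of the table corresponds to a single call, sketched here with arbitrary sample text.

```matlab
chr = char("hello");              % string scalar -> character vector
cellArr = cellstr(["x","y"]);     % string array -> cell array of character vectors
str1 = string('hello');           % character vector -> string scalar
strArr = string({'x','y'});       % cell array of character vectors -> string array
```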

How to Check Argument Data Types


To check the data type of an input argument that could contain text, consider using the patterns
shown in this table.

Required Input Argument Type     Old Check                               New Check

Character vector or string       ischar(X)                               ischar(X) || isStringScalar(X)
scalar                                                                   validateattributes(X,{'char','string'},{'scalartext'})

Character vector or string       validateattributes(X,{'char'},{'row'})  validateattributes(X,{'char','string'},{'scalartext'})
scalar

Nonempty character vector or     ischar(X) && ~isempty(X)                (ischar(X) || isStringScalar(X)) && strlength(X) ~= 0
string scalar                                                            (ischar(X) || isStringScalar(X)) && X ~= ""

Cell array of character vectors  iscellstr(X)                            iscellstr(X) || isstring(X)
or string array

Any text data type               ischar(X) || iscellstr(X)               ischar(X) || iscellstr(X) || isstring(X)

Check for Empty Strings

An empty string is a string with no characters. MATLAB displays an empty string as a pair of double
quotes with nothing between them (""). However, an empty string is still a 1-by-1 string array. It is
not an empty array.

The recommended way to check whether a string is empty is to use the strlength function.

str = "";
tf = (strlength(str) ~= 0)

Note Do not use the isempty function to check for an empty string. An empty string has no
characters but is still a 1-by-1 string array.

The strlength function returns the length of each string in a string array. If the string must be a
string scalar, and also not empty, then check for both conditions.

tf = (isStringScalar(str) && strlength(str) ~= 0)

If str could be either a character vector or string scalar, then you still can use strlength to
determine its length. strlength returns 0 if the input argument is an empty character vector ('').

tf = ((ischar(str) || isStringScalar(str)) && strlength(str) ~= 0)
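
One way to reuse this combined check is to wrap it in an anonymous function. The name isNonemptyText is hypothetical, chosen for this sketch, and the inputs are arbitrary test values.

```matlab
% Hypothetical helper: true only for a nonempty character vector or string scalar.
isNonemptyText = @(X) (ischar(X) || isStringScalar(X)) && strlength(X) ~= 0;

results = [isNonemptyText('abc'), isNonemptyText("abc"), ...
           isNonemptyText(''),    isNonemptyText(""), ...
           isNonemptyText(["a","b"]), isNonemptyText(42)];
% Only the first two inputs pass the check.
```

Because && short-circuits, strlength is never called on inputs that are not text, such as the number 42.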

Check for Empty String Arrays

An empty string array is, in fact, an empty array—that is, an array that has at least one dimension
whose length is 0.

The recommended way to create an empty string array is to use the strings function, specifying 0
as at least one of the input arguments. The isempty function returns 1 when the input is an empty
string array.

str = strings(0);
tf = isempty(str)

The strlength function returns a numeric array that is the same size as the input string array. If the
input is an empty string array, then strlength returns an empty array.

str = strings(0);
L = strlength(str)

Check for Missing Strings

String arrays also can contain missing strings. The missing string is the string equivalent of NaN for
numeric arrays. It indicates where a string array has missing values. The missing string displays as
<missing>, with no quotation marks.

You can create missing strings using the missing function. The recommended way to check for
missing strings is to use the ismissing function.

str = string(missing);
tf = ismissing(str)

Note Do not check for missing strings by comparing a string to the missing string.

The missing string is not equal to itself, just as NaN is not equal to itself.

str = string(missing);
tf = (str == missing)
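
In a string array that mixes ordinary, empty, and missing strings, the three cases need different tests. The following sketch, with illustrative variable names and an arbitrary "n/a" placeholder, shows one way to separate them and fill in the missing elements.

```matlab
str = ["alpha", string(missing), "", "delta"];

isMiss  = ismissing(str)     % true only for the second element
eqMiss  = (str == missing)   % all false: comparison never matches a missing string
isBlank = (str == "")        % true only for the third element (the empty string)

% Replace the missing elements with placeholder text chosen for this sketch.
str(isMiss) = "n/a"
```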

Terminology for Character and String Arrays


MathWorks documentation uses these terms to describe character and string arrays. For consistency,
use these terms in your own documentation, error messages, and warnings.

• Character vector — 1-by-n array of characters, of data type char.


• Character array — m-by-n array of characters, of data type char.
• Cell array of character vectors — Cell array in which each cell contains a character vector.
• String or string scalar — 1-by-1 string array. A string scalar can contain a 1-by-n sequence of
characters, but is itself one object. Use the terms "string scalar" and "character vector" alongside
each other when you need to be precise about size and data type. Otherwise, you can use the term "string"
in descriptions.
• String vector — 1-by-n or n-by-1 string array. If only one size is possible, then use it in your
description. For example, use "1-by-n string array" to describe an array of that size.
• String array — m-by-n string array.
• Empty string — String scalar that has no characters.
• Empty string array — String array with at least one dimension whose size is 0.
• Missing string — String scalar that is the missing value (displays as <missing>).

See Also
char | cellstr | string | strings | convertStringsToChars | convertCharsToStrings |
isstring | isStringScalar | ischar | iscellstr | strlength | validateattributes |
convertContainedStringsToChars

More About
• “Create String Arrays” on page 6-5
• “Test for Empty Strings and Missing Values” on page 6-20
• “Compare Text” on page 6-32
• “Search and Replace Text” on page 6-37
• “Frequently Asked Questions About String Arrays” on page 6-59

7

Dates and Time

• “Represent Dates and Times in MATLAB” on page 7-2
• “Specify Time Zones” on page 7-5
• “Convert Date and Time to Julian Date or POSIX Time” on page 7-7
• “Set Date and Time Display Format” on page 7-10
• “Generate Sequence of Dates and Time” on page 7-14
• “Share Code and Data Across Locales” on page 7-20
• “Extract or Assign Date and Time Components of Datetime Array” on page 7-23
• “Combine Date and Time from Separate Variables” on page 7-26
• “Date and Time Arithmetic” on page 7-28
• “Compare Dates and Time” on page 7-33
• “Plot Dates and Durations” on page 7-36
• “Core Functions Supporting Date and Time Arrays” on page 7-41
• “Convert Between Datetime Arrays, Numbers, and Text” on page 7-42
• “Carryover in Date Vectors and Strings” on page 7-47
• “Converting Date Vector Returns Unexpected Output” on page 7-48

Represent Dates and Times in MATLAB


The primary way to store date and time information is in datetime arrays, which support arithmetic,
sorting, comparisons, plotting, and formatted display. The results of arithmetic differences are
returned in duration arrays or, when you use calendar-based functions, in calendarDuration
arrays.

For example, create a MATLAB datetime array that represents two dates: June 28, 2014 at 6 a.m. and
June 28, 2014 at 7 a.m. Specify numeric values for the year, month, day, hour, minute, and second
components for the datetime.

t = datetime(2014,6,28,6:7,0,0)

t =
28-Jun-2014 06:00:00 28-Jun-2014 07:00:00

Change the value of a date or time component by assigning new values to the properties of the
datetime array. For example, change the day number of each datetime by assigning new values to the
Day property.

t.Day = 27:28

t =

27-Jun-2014 06:00:00 28-Jun-2014 07:00:00

Change the display format of the array by changing its Format property. The following format does
not display any time components. However, the values in the datetime array do not change.

t.Format = 'MMM dd, yyyy'

t =
Jun 27, 2014 Jun 28, 2014

If you subtract one datetime array from another, the result is a duration array in units of fixed
length.

t2 = datetime(2014,6,29,6,30,45)

t2 =

29-Jun-2014 06:30:45

d = t2 - t

d =

48:30:45 23:30:45

By default, a duration array displays in the format hours:minutes:seconds. Change the display
format of the duration by changing its Format property. You can display the duration value with a
single unit, such as hours.

d.Format = 'h'

d =

48.512 hrs 23.512 hrs
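
Changing Format affects only the display. When you need the elapsed time as an ordinary number, you can pass the duration to a function such as hours or minutes. The sketch below rebuilds the arrays from this example; the variable names are illustrative.

```matlab
t  = datetime(2014,6,[27 28],[6 7],0,0);   % 27-Jun 06:00 and 28-Jun 07:00
t2 = datetime(2014,6,29,6,30,45);
d  = t2 - t;                               % duration array

h = hours(d)        % numeric hours: 48.5125 and 23.5125
m = minutes(d)      % numeric minutes: 2910.75 and 1410.75

d.Format = 'h';     % changes how d displays, not what it stores
unchanged = isequal(hours(d), h)
```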

You can create a duration in a single unit using the seconds, minutes, hours, days, or years
functions. For example, create a duration of 2 days, where each day is exactly 24 hours.

d = days(2)

d =
2 days

You can create a calendar duration in a single unit of variable length. For example, one month can be
28, 29, 30, or 31 days long. Specify a calendar duration of 2 months.

L = calmonths(2)

L =
2mo

Use the caldays, calweeks, calquarters, and calyears functions to specify calendar durations
in other units.

Add a number of calendar months and calendar days. The number of days remains separate from the
number of months because the number of days in a month is not fixed, and cannot be determined
until you add the calendar duration to a specific datetime.

L = calmonths(2) + caldays(35)

L =
2mo 35d

Add calendar durations to a datetime to compute a new date.

t2 = t + calmonths(2) + caldays(35)

t2 =

Oct 01, 2014 Oct 02, 2014

t2 is also a datetime array.

whos t2

Name Size Bytes Class Attributes

t2 1x2 161 datetime

In summary, there are several ways to represent dates and times, and MATLAB has a data type for
each approach:

• Represent a point in time, using the datetime data type.
Example: Wednesday, June 18, 2014 10:00:00
• Represent a length of time, or a duration in units of fixed length, using the duration data type.
When using the duration data type, 1 day is always equal to 24 hours, and 1 year is always equal
to 365.2425 days.
Example: 72 hours and 10 minutes
• Represent a length of time, or a duration in units of variable length, using the
calendarDuration data type.
Example: 1 month, which can be 28, 29, 30, or 31 days long.

The calendarDuration data type also accounts for daylight saving time changes and leap years,
so that 1 day might be more or less than 24 hours, and 1 year can have 365 or 366 days.
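
The practical difference between fixed-length and variable-length units shows up as soon as you add them to the same date. This is a small sketch with illustrative variable names.

```matlab
t = datetime(2014,1,31);

tCal   = t + calmonths(1)    % 28-Feb-2014: one calendar month later
tFixed = t + days(30)        % 02-Mar-2014: exactly 30 * 24 hours later

yearInDays = days(years(1))  % 365.2425: length of a fixed-length year
```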

See Also
datetime | duration | calendarDuration

Specify Time Zones


In MATLAB, a time zone includes the time offset from Coordinated Universal Time (UTC), the daylight
saving time offset, and a set of historical changes to those values. The time zone setting is stored in
the TimeZone property of each datetime array. When you create a datetime, it is unzoned by
default. That is, the TimeZone property of the datetime is empty (''). If you do not work with
datetime values from multiple time zones and do not need to account for daylight saving time, you
might not need to specify this property.

You can specify a time zone when you create a datetime, using the 'TimeZone' name-value pair
argument. The time zone value 'local' specifies the system time zone. To display the time zone
offset for each datetime, include a time zone offset specifier such as 'Z' in the value for the
'Format' argument.

t = datetime(2014,3,8:9,6,0,0,'TimeZone','local',...
'Format','d-MMM-y HH:mm:ss Z')

t =

8-Mar-2014 06:00:00 -0500 9-Mar-2014 06:00:00 -0400

A different time zone offset is displayed depending on whether the datetime occurs during daylight
saving time.

You can modify the time zone of an existing datetime. For example, change the TimeZone property of
t using dot notation. You can specify the time zone value as the name of a time zone region in the
IANA Time Zone Database. A time zone region accounts for the current and historical rules for
standard and daylight offsets from UTC that are observed in that geographic region.

t.TimeZone = 'Asia/Shanghai'

t =

8-Mar-2014 19:00:00 +0800 9-Mar-2014 18:00:00 +0800

You also can specify the time zone value as a character vector of the form +HH:mm or -HH:mm, which
represents a time zone with a fixed offset from UTC that does not observe daylight saving time.

t.TimeZone = '+08:00'

t =

8-Mar-2014 19:00:00 +0800 9-Mar-2014 18:00:00 +0800

Operations on datetime arrays with time zones automatically account for time zone differences. For
example, create a datetime in a different time zone.

u = datetime(2014,3,9,6,0,0,'TimeZone','Europe/London',...
'Format','d-MMM-y HH:mm:ss Z')

u =

9-Mar-2014 06:00:00 +0000

View the time difference between the two datetime arrays.

dt = t - u

dt =

-19:00:00 04:00:00

When you perform operations involving datetime arrays, the arrays either must all have a time zone
associated with them, or they must all have no time zone.
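
If one array is unzoned, you can assign it a time zone before combining it with a zoned array. The following sketch assumes, purely for illustration, that the unzoned clock time was recorded in New York; that assumption is yours to make and is not something MATLAB can infer.

```matlab
a = datetime(2014,3,9,6,0,0);                             % unzoned
b = datetime(2014,3,9,6,0,0,'TimeZone','Europe/London');  % zoned

% b - a is not allowed at this point because only b has a time zone.
if isempty(a.TimeZone)
    % Assumption for this sketch: the clock time in a is New York time.
    a.TimeZone = 'America/New_York';
end

dt = b - a    % -04:00:00
```

Assigning a time zone to an unzoned datetime keeps the clock time and attaches the zone, whereas changing the zone of an already zoned datetime keeps the instant and changes the clock time.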

See Also
datetime | timezones

Related Examples
• “Represent Dates and Times in MATLAB” on page 7-2
• “Convert Date and Time to Julian Date or POSIX Time” on page 7-7

Convert Date and Time to Julian Date or POSIX Time


You can convert datetime arrays to represent points in time in specialized numeric formats. In
general, these formats represent a point in time as the number of seconds or days that have elapsed
since a specified starting point. For example, the Julian date is the number of days and fractional days
that have elapsed since the beginning of the Julian period. The POSIX® time is the number of
seconds that have elapsed since 00:00:00 1-Jan-1970 UTC (Coordinated Universal Time). MATLAB®
provides the juliandate and posixtime functions to convert datetime arrays to Julian dates and
POSIX times.

While datetime arrays are not required to have a time zone, converting "unzoned" datetime values
to Julian dates or POSIX times can lead to unexpected results. To ensure the expected result, specify
the time zone before conversion.

Specify Time Zone Before Conversion

You can specify a time zone for a datetime array, but you are not required to do so. In fact, by
default the datetime function creates an "unzoned" datetime array.

Create a datetime value for the current date and time.

d = datetime('now')

d = datetime
01-Sep-2021 16:01:28

d is constructed from the local time on your machine and has no time zone associated with it. In many
contexts, you might assume that you can treat the times in an unzoned datetime array as local
times. However, the juliandate and posixtime functions treat the times in unzoned datetime
arrays as UTC times, not local times. To avoid any ambiguity, it is recommended that you avoid using
juliandate and posixtime on unzoned datetime arrays. For example, avoid using
posixtime(datetime('now')) in your code.

If your datetime array has values that do not represent UTC times, specify the time zone using the
TimeZone name-value pair argument so that juliandate and posixtime interpret the datetime
values correctly.

d = datetime('now','TimeZone','America/New_York')

d = datetime
01-Sep-2021 16:01:28

As an alternative, you can specify the TimeZone property after you create the array.

d.TimeZone = 'America/Los_Angeles'

d = datetime
01-Sep-2021 13:01:28

To see a complete list of time zones, use the timezones function.

Convert Zoned and Unzoned Datetime Values to Julian Dates

A Julian date is the number of days (including fractional days) since noon on November 24, 4714
BCE, in the proleptic Gregorian calendar, or January 1, 4713 BCE, in the proleptic Julian calendar. To
convert datetime arrays to Julian dates, use the juliandate function.

Create a datetime array and specify its time zone.

DZ = datetime('2016-07-29 10:05:24') + calmonths(1:3);


DZ.TimeZone = 'America/New_York'

DZ = 1x3 datetime
29-Aug-2016 10:05:24 29-Sep-2016 10:05:24 29-Oct-2016 10:05:24

Convert DZ to the equivalent Julian dates.

format longG
JDZ = juliandate(DZ)

JDZ = 1×3
1.0e+06 *

2.4576 2.4577 2.4577

Create an unzoned copy of DZ. Convert D to the equivalent Julian dates. As D has no time zone,
juliandate treats the times as UTC times.

D = DZ;
D.TimeZone = '';
JD = juliandate(D)

JD = 1×3
1.0e+06 *

2.4576 2.4577 2.4577

Compare JDZ and JD. The differences are equal to the time zone offset between UTC and the
America/New_York time zone in fractional days.

JDZ - JD

ans = 1×3

0.1667 0.1667 0.1667
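
A date with a well-known Julian date makes a convenient check. Noon UTC on January 1, 2000 corresponds to Julian date 2451545. The sketch below, with illustrative variable names, also shows how the same clock time in another zone shifts the result by the zone offset.

```matlab
tUTC = datetime(2000,1,1,12,0,0,'TimeZone','UTC');
jd   = juliandate(tUTC)               % 2451545

% The same clock time in New York is 5 hours later in UTC (standard time).
tNY = datetime(2000,1,1,12,0,0,'TimeZone','America/New_York');
offsetDays = juliandate(tNY) - jd     % 5/24 of a day
```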

Convert Zoned and Unzoned Datetime Values to POSIX Times

The POSIX time is the number of seconds (including fractional seconds) elapsed since 00:00:00
1-Jan-1970 UTC (Coordinated Universal Time), ignoring leap seconds. To convert datetime arrays to
POSIX times, use the posixtime function.

Create a datetime array and specify its time zone.

DZ = datetime('2016-07-29 10:05:24') + calmonths(1:3);


DZ.TimeZone = 'America/New_York'

DZ = 1x3 datetime
29-Aug-2016 10:05:24 29-Sep-2016 10:05:24 29-Oct-2016 10:05:24

Convert DZ to the equivalent POSIX times.

PTZ = posixtime(DZ)

PTZ = 1×3
1.0e+09 *

1.4725 1.4752 1.4777

Create an unzoned copy of DZ. Convert D to the equivalent POSIX times. As D has no time zone,
posixtime treats the times as UTC times.

D = DZ;
D.TimeZone = '';
PT = posixtime(D)

PT = 1×3
1.0e+09 *

1.4725 1.4751 1.4777

Compare PTZ and PT. The differences are equal to the time zone offset between UTC and the
America/New_York time zone in seconds.

PTZ - PT

ans = 1×3

14400 14400 14400
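
A round trip is a simple way to confirm that a conversion preserved the instant you intended. This sketch converts a zoned datetime to POSIX time and back by using the 'ConvertFrom' option of datetime; the variable names are illustrative.

```matlab
dz = datetime(2016,8,29,10,5,24,'TimeZone','America/New_York');
p  = posixtime(dz);      % seconds since 00:00:00 1-Jan-1970 UTC

% Convert back, asking for the result in the original time zone.
back = datetime(p,'ConvertFrom','posixtime','TimeZone','America/New_York');
roundTripOK = (back == dz)

% An unzoned copy is treated as UTC, so it differs by the 4-hour offset.
d = dz;
d.TimeZone = '';
offsetSeconds = p - posixtime(d)    % 14400
```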

See Also
datetime | timezones | posixtime | juliandate

Related Examples
• “Represent Dates and Times in MATLAB” on page 7-2
• “Specify Time Zones” on page 7-5

Set Date and Time Display Format


In this section...
“Formats for Individual Date and Duration Arrays” on page 7-10
“datetime Display Format” on page 7-10
“duration Display Format” on page 7-11
“calendarDuration Display Format” on page 7-11
“Default datetime Format” on page 7-12

Formats for Individual Date and Duration Arrays


datetime, duration, and calendarDuration arrays have a Format property that controls the
display of values in each array. When you create a datetime array, it uses the MATLAB global default
datetime display format unless you explicitly provide a format. Use dot notation to access the Format
property to view or change its value. For example, to set the display format for the datetime array,
t, to the default format, type:

t.Format = 'default'

Changing the Format property does not change the values in the array, only their display. For
example, the following can be representations of the same datetime value (the latter two do not
display any time components):

Thursday, August 23, 2012 12:35:00


August 23, 2012
23-Aug-2012

The Format property of the datetime, duration, and calendarDuration data types accepts
different formats as inputs.

datetime Display Format


You can set the Format property to one of these character vectors.

Value of Format Description


'default' Use the default display format.
'defaultdate' Use the default date display format that does not
show time components.

To change the default formats, see “Default datetime Format” on page 7-12.

Alternatively, you can use the letters A-Z and a-z to specify a custom date format. You can include
nonletter characters such as a hyphen, space, or colon to separate the fields. This table shows several
common display formats and examples of the formatted output for the date, Saturday, April 19, 2014
at 9:41:06 PM in New York City.

Value of Format Example


'yyyy-MM-dd' 2014-04-19

'dd/MM/yyyy' 19/04/2014
'dd.MM.yyyy' 19.04.2014
'yyyy 年 MM 月 dd 日' 2014 年 04 月 19 日
'MMMM d, yyyy' April 19, 2014
'eeee, MMMM d, yyyy h:mm a' Saturday, April 19, 2014 9:41 PM
'MMMM d, yyyy HH:mm:ss Z' April 19, 2014 21:41:06 -0400
'yyyy-MM-dd''T''HH:mmXXX' 2014-04-19T21:41-04:00

For a complete list of valid symbolic identifiers, see the Format property for datetime arrays.

Note The letter identifiers that datetime accepts are different from those used by the datestr,
datenum, and datevec functions.
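
To try several formats on one value, you can loop over them and convert the datetime to text with char, which uses the current Format. This sketch uses the date from the table above and only the numeric formats, so the output does not depend on your locale; the variable names are illustrative.

```matlab
t = datetime(2014,4,19,21,41,6,'TimeZone','America/New_York');

fmts = {'yyyy-MM-dd', 'dd/MM/yyyy', 'yyyy-MM-dd''T''HH:mmXXX'};
for k = 1:numel(fmts)
    t.Format = fmts{k};                       % display changes, value does not
    fprintf('%-26s %s\n', fmts{k}, char(t));
end

isoText = char(t)    % '2014-04-19T21:41-04:00' (last format in the list)
```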

duration Display Format


To display a duration as a single number that includes a fractional part (for example, 1.234 hours),
specify one of these character vectors:

Value of Format Description


'y' Number of exact fixed-length years. A fixed-length year is
equal to 365.2425 days.
'd' Number of exact fixed-length days. A fixed-length day is equal
to 24 hours.
'h' Number of hours
'm' Number of minutes
's' Number of seconds

To specify the number of fractional digits displayed, use the format function.

To display a duration in the form of a digital timer, specify one of the following character vectors.

• 'dd:hh:mm:ss'
• 'hh:mm:ss'
• 'mm:ss'
• 'hh:mm'

You also can display up to nine fractional second digits by appending up to nine S characters. For
example, 'hh:mm:ss.SSS' displays the milliseconds of a duration value to 3 digits.

Changing the Format property does not change the values in the array, only their display.
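
As a brief sketch of both styles, the following code displays one duration as a digital timer with milliseconds and then as a single unit. The variable names are illustrative.

```matlab
d = hours(1) + minutes(30) + seconds(15.25);

d.Format = 'hh:mm:ss.SSS';
timerText = char(d)           % '01:30:15.250'

d.Format = 'm';               % single unit: fractional minutes
minutesValue = minutes(d)     % about 90.2542, unaffected by Format
```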

calendarDuration Display Format


Specify the Format property of a calendarDuration array as a character vector that can include
the characters y, q, m, w, d, and t, in this order. The format must include m, d, and t.

This table describes the date and time components that the characters represent.

Character Unit Required?


y Years no
q Quarters (multiples of 3 no
months)
m Months yes
w Weeks no
d Days yes
t Time (hours, minutes, and yes
seconds)

To specify the number of digits displayed for fractional seconds, use the format function.

If the value of a date or time component is zero, it is not displayed.

Changing the Format property does not change the values in the array, only their display.
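
The following sketch applies two formats to the same calendar duration and then uses the split function to confirm that the stored components are the same under either format. The variable names are illustrative.

```matlab
L = calmonths(14) + caldays(10);

L.Format = 'mdt';
asMonths = char(L)        % months are not carried into years

L.Format = 'ymwdt';
asYearsWeeks = char(L)    % months shown as years and months, days as weeks and days

% The stored components do not depend on the display format.
[mo, dy] = split(L, {'months','days'})    % 14 and 10
```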

Default datetime Format


You can set default formats to control the display of datetime arrays created without an explicit
display format. These formats also apply when you set the Format property of a datetime array to
'default' or 'defaultdate'. When you change the default setting, datetime arrays set to use
the default formats are displayed automatically using the new setting.

Changes to the default formats persist across MATLAB sessions.

To specify a default format, type


datetime.setDefaultFormats('default',fmt)

where fmt is a character vector composed of the letters A-Z and a-z described for the Format
property of datetime arrays, above. For example,
datetime.setDefaultFormats('default','yyyy-MM-dd HH:mm:ss')

sets the default datetime format to include a 4-digit year, 2-digit month number, 2-digit day number,
and hour, minute, and second values.

In addition, you can specify a default format for datetimes created without time components. For
example,
datetime.setDefaultFormats('defaultdate','yyyy-MM-dd')

sets the default date format to include a 4-digit year, 2-digit month number, and 2-digit day number.

To reset both the default format and the default date-only format to the factory defaults, type
datetime.setDefaultFormats('reset')

The factory default formats depend on your system locale.

You also can set the default formats in the Preferences dialog box. For more information, see “Set
Command Window Preferences”.

See Also
datetime | duration | calendarDuration | format

Generate Sequence of Dates and Time


In this section...
“Sequence of Datetime or Duration Values Between Endpoints with Step Size” on page 7-14
“Add Duration or Calendar Duration to Create Sequence of Dates” on page 7-16
“Specify Length and Endpoints of Date or Duration Sequence” on page 7-17
“Sequence of Datetime Values Using Calendar Rules” on page 7-17

Sequence of Datetime or Duration Values Between Endpoints with Step Size

This example shows how to use the colon (:) operator to generate sequences of datetime or
duration values in the same way that you create regularly spaced numeric vectors.

Use Default Step Size

Create a sequence of datetime values starting from November 1, 2013 and ending on November 5,
2013. The default step size is one calendar day.

t1 = datetime(2013,11,1,8,0,0);
t2 = datetime(2013,11,5,8,0,0);
t = t1:t2

t = 1x5 datetime
Columns 1 through 3

01-Nov-2013 08:00:00 02-Nov-2013 08:00:00 03-Nov-2013 08:00:00

Columns 4 through 5

04-Nov-2013 08:00:00 05-Nov-2013 08:00:00

Specify Step Size

Specify a step size of 2 calendar days using the caldays function.

t = t1:caldays(2):t2

t = 1x3 datetime
01-Nov-2013 08:00:00 03-Nov-2013 08:00:00 05-Nov-2013 08:00:00

Specify a step size in units other than days. Create a sequence of datetime values spaced 18 hours
apart.

t = t1:hours(18):t2

t = 1x6 datetime
Columns 1 through 3

01-Nov-2013 08:00:00 02-Nov-2013 02:00:00 02-Nov-2013 20:00:00

Columns 4 through 6

03-Nov-2013 14:00:00 04-Nov-2013 08:00:00 05-Nov-2013 02:00:00

Use the years, days, minutes, and seconds functions to create datetime and duration sequences
using other fixed-length date and time units. Create a sequence of duration values between 0 and 3
minutes, incremented by 30 seconds.

d = 0:seconds(30):minutes(3)

d = 1x7 duration
0 sec 30 sec 60 sec 90 sec 120 sec 150 sec 180 sec

Compare Fixed-Length Duration and Calendar Duration Step Sizes

Assign a time zone to t1 and t2. In the America/New_York time zone, t1 now occurs just before a
daylight saving time change.

t1.TimeZone = 'America/New_York';
t2.TimeZone = 'America/New_York';

If you create the sequence using a step size of one calendar day, then the difference between
successive datetime values is not always 24 hours.

t = t1:t2;
dt = diff(t)

dt = 1x4 duration
24:00:00 25:00:00 24:00:00 24:00:00

Create a sequence of datetime values spaced one fixed-length day apart.

t = t1:days(1):t2

t = 1x5 datetime
Columns 1 through 3

01-Nov-2013 08:00:00 02-Nov-2013 08:00:00 03-Nov-2013 07:00:00

Columns 4 through 5

04-Nov-2013 07:00:00 05-Nov-2013 07:00:00

Verify that the difference between successive datetime values is 24 hours.

dt = diff(t)

dt = 1x4 duration
24:00:00 24:00:00 24:00:00 24:00:00

Integer Step Size

If you specify a step size in terms of an integer, it is interpreted as a number of 24-hour days.

t = t1:1:t2

t = 1x5 datetime
Columns 1 through 3

01-Nov-2013 08:00:00 02-Nov-2013 08:00:00 03-Nov-2013 07:00:00

Columns 4 through 5

04-Nov-2013 07:00:00 05-Nov-2013 07:00:00
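
The three kinds of step size can be compared directly. In this sketch, which rebuilds the zoned endpoints used above, an integer step matches the fixed-length days step and differs from the calendar-day step across the daylight saving time change. The variable names are illustrative.

```matlab
t1 = datetime(2013,11,1,8,0,0,'TimeZone','America/New_York');
t2 = datetime(2013,11,5,8,0,0,'TimeZone','America/New_York');

byInteger  = t1:1:t2;
byDuration = t1:days(1):t2;
byCalendar = t1:caldays(1):t2;

sameAsFixed    = isequal(byInteger, byDuration)   % true
sameAsCalendar = isequal(byInteger, byCalendar)   % false: clocks change on 3-Nov
hoursBetween   = hours(diff(byCalendar))          % 24 25 24 24
```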

Add Duration or Calendar Duration to Create Sequence of Dates


This example shows how to add a duration or calendar duration to a datetime to create a sequence of
datetime values.

Create a datetime scalar representing November 1, 2013 at 8:00 AM.

t1 = datetime(2013,11,1,8,0,0);

Add a sequence of fixed-length hours to the datetime.

t = t1 + hours(0:2)

t = 1x3 datetime
01-Nov-2013 08:00:00 01-Nov-2013 09:00:00 01-Nov-2013 10:00:00

Add a sequence of calendar months to the datetime.

t = t1 + calmonths(1:5)

t = 1x5 datetime
Columns 1 through 3

01-Dec-2013 08:00:00 01-Jan-2014 08:00:00 01-Feb-2014 08:00:00

Columns 4 through 5

01-Mar-2014 08:00:00 01-Apr-2014 08:00:00

Each datetime in t occurs on the first day of each month.

Verify that the dates in t are spaced 1 month apart.

dt = caldiff(t)

dt = 1x4 calendarDuration
1mo 1mo 1mo 1mo

Determine the number of days between each date.

dt = caldiff(t,'days')

dt = 1x4 calendarDuration
31d 31d 28d 31d

Add a number of calendar months to the date, January 31, 2014, to create a sequence of dates that
fall on the last day of each month.
t = datetime(2014,1,31) + calmonths(0:11)

t = 1x12 datetime
Columns 1 through 5

31-Jan-2014 28-Feb-2014 31-Mar-2014 30-Apr-2014 31-May-2014

Columns 6 through 10

30-Jun-2014 31-Jul-2014 31-Aug-2014 30-Sep-2014 31-Oct-2014

Columns 11 through 12

30-Nov-2014 31-Dec-2014
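
Adding all the calendar months to the starting date, as above, is not the same as adding one month at a time. The following sketch, with illustrative variable names, shows that stepping from the previous result loses the end-of-month position after the first short month.

```matlab
start = datetime(2014,1,31);

direct = start + calmonths(0:3)    % 31-Jan, 28-Feb, 31-Mar, 30-Apr

stepped = start;
for k = 1:3
    % Each step starts from the previous (possibly truncated) date.
    stepped(k+1) = stepped(k) + calmonths(1);
end
stepped                            % 31-Jan, 28-Feb, 28-Mar, 28-Apr
```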

Specify Length and Endpoints of Date or Duration Sequence


This example shows how to use the linspace function to create equally spaced datetime or duration
values between two specified endpoints.

Create a sequence of five equally spaced dates between April 14, 2014 and August 4, 2014. First,
define the endpoints.
A = datetime(2014,04,14);
B = datetime(2014,08,04);

The third input to linspace specifies the number of linearly spaced points to generate between the
endpoints.
C = linspace(A,B,5)

C = 1x5 datetime
14-Apr-2014 12-May-2014 09-Jun-2014 07-Jul-2014 04-Aug-2014

Create a sequence of six equally spaced durations between 1 and 5.5 hours.
A = duration(1,0,0);
B = duration(5,30,0);
C = linspace(A,B,6)

C = 1x6 duration
01:00:00 01:54:00 02:48:00 03:42:00 04:36:00 05:30:00
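
To confirm the spacing that linspace chose, take the differences between successive values. This sketch repeats the datetime endpoints from above; the variable names are illustrative.

```matlab
A = datetime(2014,4,14);
B = datetime(2014,8,4);
C = linspace(A,B,5);

stepDays = days(diff(C))                    % 28 28 28 28
endpointsKept = (C(1) == A) && (C(end) == B)
```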

Sequence of Datetime Values Using Calendar Rules


This example shows how to use the dateshift function to generate sequences of dates and time
where each instance obeys a rule relating to a calendar unit or a unit of time. For instance, each
datetime must occur at the beginning of a month, on a particular day of the week, or at the end of a
minute. The resulting datetime values in the sequence are not necessarily equally spaced.

Dates on Specific Day of Week

Generate a sequence of dates consisting of the next three occurrences of Monday. First, define
today's date.

t1 = datetime('today','Format','dd-MMM-yyyy eee')

t1 = datetime
01-Sep-2021 Wed

The first input to dateshift is always the datetime array from which you want to generate a
sequence. Specify 'dayofweek' as the second input to indicate that the datetime values in the
output sequence must fall on a specific day of the week. You can specify the day of the week either by
number or by name. For example, you can specify Monday either as 2 or 'Monday'.

t = dateshift(t1,'dayofweek',2,1:3)

t = 1x3 datetime
06-Sep-2021 Mon 13-Sep-2021 Mon 20-Sep-2021 Mon

Dates at Start of Month

Generate a sequence of start-of-month dates beginning with April 1, 2014. Specify 'start' as the
second input to dateshift to indicate that all datetime values in the output sequence should fall at
the start of a particular unit of time. The third input argument defines the unit of time, in this case,
month. The last input to dateshift can be an array of integer values that specifies how t1 should be
shifted. In this case, 0 corresponds to the start of the current month, and 4 corresponds to the start
of the fourth month from t1.

t1 = datetime(2014,04,01);
t = dateshift(t1,'start','month',0:4)

t = 1x5 datetime
01-Apr-2014 01-May-2014 01-Jun-2014 01-Jul-2014 01-Aug-2014

Dates at End of Month

Generate a sequence of end-of-month dates beginning with April 1, 2014.

t1 = datetime(2014,04,01);
t = dateshift(t1,'end','month',0:2)

t = 1x3 datetime
30-Apr-2014 31-May-2014 30-Jun-2014

Determine the number of days between each date.

dt = caldiff(t,'days')

dt = 1x2 calendarDuration
31d 30d

The dates are not equally spaced.

Other Units of Dates and Time

You can specify other units of time such as week, day, and hour.

t1 = datetime('now')

t1 = datetime
01-Sep-2021 13:42:35

t = dateshift(t1,'start','hour',0:4)

t = 1x5 datetime
Columns 1 through 3

01-Sep-2021 13:00:00 01-Sep-2021 14:00:00 01-Sep-2021 15:00:00

Columns 4 through 5

01-Sep-2021 16:00:00 01-Sep-2021 17:00:00

Previous Occurrences of Dates and Time

Generate a sequence of datetime values beginning with the previous hour. Negative integers in the
last input to dateshift correspond to datetime values earlier than t1.

t = dateshift(t1,'start','hour',-1:1)

t = 1x3 datetime
01-Sep-2021 12:00:00 01-Sep-2021 13:00:00 01-Sep-2021 14:00:00
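
The rules can be combined by passing the output of one dateshift call to another. As a sketch, the following code finds the first Monday of four consecutive months by shifting to the start of each month and then to a Monday. It relies on the default 'dayofweek' behavior of leaving a date unchanged when it already falls on the requested day; the variable names are illustrative.

```matlab
t0 = datetime(2021,9,1,'Format','dd-MMM-yyyy eee');

monthStarts  = dateshift(t0,'start','month',0:3);    % 1st of Sep through Dec
firstMondays = dateshift(monthStarts,'dayofweek','Monday')
% 06-Sep-2021, 04-Oct-2021, 01-Nov-2021, 06-Dec-2021
```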

See Also
dateshift | linspace

Share Code and Data Across Locales


In this section...
“Write Locale-Independent Date and Time Code” on page 7-20
“Write Dates in Other Languages” on page 7-21
“Read Dates in Other Languages” on page 7-21

Write Locale-Independent Date and Time Code


Follow these best practices when sharing code that handles dates and time with MATLAB® users in
other locales. These practices ensure that the same code produces the same output display and that
output files containing dates and time are read correctly on systems in different countries or with
different language settings.

Create language-independent datetime values. That is, create datetime values that use month
numbers rather than month names, such as 01 instead of January. Avoid using day of week names.

For example, do this:


t = datetime('today','Format','yyyy-MM-dd')

t = datetime
2021-09-01

instead of this:
t = datetime('today','Format','eeee, dd-MMM-yyyy')

t = datetime
Wednesday, 01-Sep-2021

Display the hour using 24-hour clock notation rather than 12-hour clock notation. Use the 'HH'
identifier when specifying the display format for datetime values.

For example, do this:


t = datetime('now','Format','HH:mm')

t = datetime
15:32

instead of this:
t = datetime('now','Format','hh:mm a')

t = datetime
03:32 PM

When specifying the display format for time zone information, use the Z or X identifiers instead of the
lowercase z to avoid the creation of time zone names that might not be recognized in other languages
or regions.

Assign a time zone to t.

t.TimeZone = 'America/New_York';

Specify a language-independent display format that includes a time zone.

t.Format = 'dd-MM-yyyy Z'

t = datetime
01-09-2021 -0400

If you share files but not code, you do not need to write locale-independent code while you work in
MATLAB. However, when you write to a file, ensure that any text representing dates and times is
language-independent. Then, other MATLAB users can read the files easily without having to specify
a locale in which to interpret date and time data.
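
One language-independent choice for text in files is a numeric, ISO 8601 style format with a UTC offset. The sketch below writes a datetime as such text and reads it back; the format and variable names are illustrative, not a required convention.

```matlab
t = datetime(2021,9,1,15,32,0,'TimeZone','America/New_York');

fmt = 'yyyy-MM-dd''T''HH:mm:ssXXX';
txt = char(t,fmt)      % '2021-09-01T15:32:00-04:00'

% Read the text back. The offset in the text identifies the instant,
% and 'TimeZone' selects the zone of the resulting datetime.
t2 = datetime(txt,'InputFormat',fmt,'TimeZone','UTC');
sameInstant = (t == t2)
```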

Write Dates in Other Languages


Specify an appropriate format for text representing dates and times when you use the char or
cellstr functions. For example, convert two datetime values to a cell array of character vectors
using cellstr. Specify the format and the locale to represent the day, month, and year of each
datetime value as text.

t = [datetime('today');datetime('tomorrow')]

t = 2x1 datetime
01-Sep-2021
02-Sep-2021

S = cellstr(t,'dd. MMMM yyyy','de_DE')

S = 2x1 cell
{'01. September 2021'}
{'02. September 2021'}

S is a cell array of character vectors representing dates in German. You can export S to a text file to
use with systems in the de_DE locale.

Read Dates in Other Languages


You can read text files containing dates and time in a language other than the language that MATLAB
uses, which depends on your system locale. Use the textscan or readtable functions with the
DateLocale name-value pair argument to specify the locale in which the function interprets the
dates in the file. In addition, you might need to specify the character encoding of a file that contains
characters that are not recognized by your computer's default encoding.

• When reading text files using the textscan function, specify the file encoding when opening the
file with fopen. The encoding is the fourth input argument to fopen.
• When reading text files using the readtable function, use the FileEncoding name-value pair
argument to specify the character encoding associated with the file.
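
For text that is already in the workspace, the datetime function accepts a 'Locale' name-value argument that plays the same role as DateLocale does for the file-reading functions. This sketch parses two German dates and writes them back out; the sample text and variable names are illustrative.

```matlab
S = {'01. Oktober 2021'; '15. Dezember 2021'};

% Interpret the month names using the German (Germany) locale.
t = datetime(S,'InputFormat','dd. MMMM yyyy','Locale','de_DE')

back = cellstr(t,'dd. MMMM yyyy','de_DE');
roundTripOK = isequal(back, S)
```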

See Also
datetime | char | cellstr | readtable | textscan

Extract or Assign Date and Time Components of Datetime Array

This example shows two ways to extract date and time components from existing datetime arrays:
accessing the array properties or calling a function. Then, the example shows how to modify the date
and time components by modifying the array properties.

Access Properties to Retrieve Date and Time Component

Create a datetime array.

t = datetime('now') + calyears(0:2) + calmonths(0:2) + hours(20:20:60)

t = 1x3 datetime
02-Sep-2021 10:44:38 03-Oct-2022 06:44:38 04-Nov-2023 02:44:38

Get the year values of each datetime in the array. Use dot notation to access the Year property of t.

t_years = t.Year

t_years = 1×3

2021 2022 2023

The output, t_years, is a numeric array.

Get the month values of each datetime in t by accessing the Month property.

t_months = t.Month

t_months = 1×3

9 10 11

You can retrieve the day, hour, minute, and second components of each datetime in t by accessing the
Day, Hour, Minute, and Second properties, respectively.
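
As a brief sketch of those remaining properties, the following uses a fixed datetime array (an assumption, chosen so that the values do not depend on the current time) that mirrors the dates shown above.

```matlab
% Fixed values stand in for the 'now'-based array used in this example.
t = datetime([2021 2022 2023],[9 10 11],[2 3 4],[10 6 2],44,38);

d  = t.Day;      % day of month for each element
h  = t.Hour;     % hour of day
mi = t.Minute;   % minute of hour
s  = t.Second;   % seconds, including any fractional part
```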

Use Functions to Retrieve Date and Time Component

Use the month function to get the month number for each datetime in t. Using functions is an
alternate way to retrieve specific date or time components of t.

m = month(t)

m = 1×3

9 10 11

Use the month function rather than the Month property to get the full month names of each datetime
in t.

m = month(t,'name')


m = 1x3 cell
{'September'} {'October'} {'November'}

You can retrieve the year, quarter, week, day, hour, minute, and second components of each datetime
in t using the year, quarter, week, day, hour, minute, and second functions, respectively.
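
A short sketch of two of these functions follows, again with a fixed datetime array as an assumption so that the values are reproducible. The 'dayofyear' option of the day function is used here as one example of a component that has no corresponding property.

```matlab
% Fixed values stand in for the 'now'-based array used in this example.
t = datetime([2021 2022 2023],[9 10 11],[2 3 4],[10 6 2],44,38);

q   = quarter(t);            % quarter of the year, 1 through 4
d   = day(t);                % day of month
doy = day(t,'dayofyear');    % day number within the year
```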

Get the week of year numbers for each datetime in t.

w = week(t)

w = 1×3

36 41 44

Get Multiple Date and Time Components

Use the ymd function to get the year, month, and day values of t as three separate numeric arrays.

[y,m,d] = ymd(t)

y = 1×3

2021 2022 2023

m = 1×3

9 10 11

d = 1×3

2 3 4

Use the hms function to get the hour, minute, and second values of t as three separate numeric
arrays.

[h,m,s] = hms(t)

h = 1×3

10 6 2

m = 1×3

44 44 44

s = 1×3

38.3920 38.3920 38.3920


Modify Date and Time Components

Assign new values to components in an existing datetime array by modifying the properties of the
array. Use dot notation to access a specific property.

Change the year number of all datetime values in t to 2014. Use dot notation to modify the Year
property.

t.Year = 2014

t = 1x3 datetime
02-Sep-2014 10:44:38 03-Oct-2014 06:44:38 04-Nov-2014 02:44:38

Change the months of the three datetime values in t to January, February, and March, respectively.
You must specify the new value as a numeric array.

t.Month = [1,2,3]

t = 1x3 datetime
02-Jan-2014 10:44:38 03-Feb-2014 06:44:38 04-Mar-2014 02:44:38

Set the time zone of t by assigning a value to the TimeZone property.

t.TimeZone = 'Europe/Berlin';

Change the display format of t to display only the date, and not the time information.

t.Format = 'dd-MMM-yyyy'

t = 1x3 datetime
02-Jan-2014 03-Feb-2014 04-Mar-2014

If you assign values to a datetime component that are outside the conventional range, MATLAB®
normalizes the components. The conventional range for day of month numbers is from 1 to 31. Assign
day values that exceed this range.

t.Day = [-1 1 32]

t = 1x3 datetime
30-Dec-2013 01-Feb-2014 01-Apr-2014

The month and year numbers adjust so that all values remain within the conventional range for each
date component. In this case, January -1, 2014 converts to December 30, 2013.

See Also
datetime | ymd | hms | week


Combine Date and Time from Separate Variables


This example shows how to read date and time data from a text file. Then, it shows how to combine
date and time information stored in separate variables into a single datetime variable.

Create a space-delimited text file named schedule.txt that contains the following (to create the
file, use any text editor, and copy and paste):
Date Name Time
10.03.2015 Joe 14:31
10.03.2015 Bob 15:33
11.03.2015 Bob 11:29
12.03.2015 Kim 12:09
12.03.2015 Joe 13:05

Read the file using the readtable function. Use the %D conversion specifier to read the first and
third columns of data as datetime values.
T = readtable('schedule.txt','Format','%{dd.MM.uuuu}D %s %{HH:mm}D','Delimiter',' ')

T =
Date Name Time
__________ _____ _____
10.03.2015 'Joe' 14:31
10.03.2015 'Bob' 15:33
11.03.2015 'Bob' 11:29
12.03.2015 'Kim' 12:09
12.03.2015 'Joe' 13:05

readtable returns a table containing three variables.

Change the display format for the T.Date and T.Time variables to view both date and time
information. Since the data in the first column of the file ("Date") have no time information, the time
of the resulting datetime values in T.Date defaults to midnight. Since the data in the third column of
the file ("Time") have no associated date, the date of the datetime values in T.Time defaults to the
current date.
T.Date.Format = 'dd.MM.uuuu HH:mm';
T.Time.Format = 'dd.MM.uuuu HH:mm';
T

T =
Date Name Time
________________ _____ ________________
10.03.2015 00:00 'Joe' 12.12.2014 14:31
10.03.2015 00:00 'Bob' 12.12.2014 15:33
11.03.2015 00:00 'Bob' 12.12.2014 11:29
12.03.2015 00:00 'Kim' 12.12.2014 12:09
12.03.2015 00:00 'Joe' 12.12.2014 13:05

Combine the date and time information from two different table variables by adding T.Date and the
time values in T.Time. Extract the time information from T.Time using the timeofday function.
myDatetime = T.Date + timeofday(T.Time)

myDatetime =
10.03.2015 14:31
10.03.2015 15:33

11.03.2015 11:29
12.03.2015 12:09
12.03.2015 13:05

See Also
readtable | timeofday


Date and Time Arithmetic


This example shows how to add and subtract date and time values to calculate future and past dates
and elapsed durations in exact units or calendar units. You can add, subtract, multiply, and divide
date and time arrays in the same way that you use these operators with other MATLAB® data types.
However, there is some behavior that is specific to dates and time.

Add and Subtract Durations to Datetime Array

Create a datetime scalar. By default, datetime arrays are not associated with a time zone.
t1 = datetime('now')

t1 = datetime
01-Sep-2021 13:47:20

Find future points in time by adding a sequence of hours.


t2 = t1 + hours(1:3)

t2 = 1x3 datetime
01-Sep-2021 14:47:20 01-Sep-2021 15:47:20 01-Sep-2021 16:47:20

Verify that the difference between each pair of datetime values in t2 is 1 hour.
dt = diff(t2)

dt = 1x2 duration
01:00:00 01:00:00

diff returns durations in terms of exact numbers of hours, minutes, and seconds.

Subtract a sequence of minutes from a datetime to find past points in time.


t2 = t1 - minutes(20:10:40)

t2 = 1x3 datetime
01-Sep-2021 13:27:20 01-Sep-2021 13:17:20 01-Sep-2021 13:07:20

Add a numeric array to a datetime array. MATLAB® treats each value in the numeric array as a
number of exact, 24-hour days.
t2 = t1 + [1:3]

t2 = 1x3 datetime
02-Sep-2021 13:47:20 03-Sep-2021 13:47:20 04-Sep-2021 13:47:20

Add to Datetime with Time Zone

If you work with datetime values in different time zones, or if you want to account for daylight saving
time changes, work with datetime arrays that are associated with time zones. Create a datetime
scalar representing March 8, 2014 in New York.
t1 = datetime(2014,3,8,0,0,0,'TimeZone','America/New_York')


t1 = datetime
08-Mar-2014

Find future points in time by adding a sequence of fixed-length (24-hour) days.

t2 = t1 + days(0:2)

t2 = 1x3 datetime
08-Mar-2014 00:00:00 09-Mar-2014 00:00:00 10-Mar-2014 01:00:00

Because a daylight saving time shift occurred on March 9, 2014, the third datetime in t2 does not
occur at midnight.

Verify that the difference between each pair of datetime values in t2 is 24 hours.

dt = diff(t2)

dt = 1x2 duration
24:00:00 24:00:00

You can add fixed-length durations in other units such as years, hours, minutes, and seconds by
adding the outputs of the years, hours, minutes, and seconds functions, respectively.
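
For instance, the years function creates fixed-length years of 365.2425 days each, so adding years(1) generally does not land on the same clock time one calendar year later. The sketch below uses an unzoned datetime (an assumption made to keep daylight saving time out of the picture).

```matlab
t = datetime(2014,3,8);        % unzoned, midnight

tFixedYear = t + years(1);     % exactly 365.2425 days later
tCalYear   = t + calyears(1);  % same calendar date one year later
tHours     = t + hours(36);    % exactly 36 hours later

offset = tFixedYear - tCalYear;   % a duration of 5 hours, 49 minutes, 12 seconds
```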

To account for daylight saving time changes, you should work with calendar durations instead of
durations. Calendar durations account for daylight saving time shifts when they are added to or
subtracted from datetime values.

Add a number of calendar days to t1.

t3 = t1 + caldays(0:2)

t3 = 1x3 datetime
08-Mar-2014 09-Mar-2014 10-Mar-2014

Notice that the difference between each pair of datetime values in t3 is not always 24 hours due to the
daylight saving time shift that occurred on March 9.

dt = diff(t3)

dt = 1x2 duration
24:00:00 23:00:00

Add Calendar Durations to Datetime Array

Add a number of calendar months to January 31, 2014.

t1 = datetime(2014,1,31)

t1 = datetime
31-Jan-2014

t2 = t1 + calmonths(1:4)


t2 = 1x4 datetime
28-Feb-2014 31-Mar-2014 30-Apr-2014 31-May-2014

Each datetime in t2 occurs on the last day of each month.

Calculate the difference between each pair of datetime values in t2 in terms of a number of calendar
days using the caldiff function.

dt = caldiff(t2,'days')

dt = 1x3 calendarDuration
31d 30d 31d

The number of days between successive pairs of datetime values in t2 is not always the same
because different months consist of a different number of days.

Add a number of calendar years to January 31, 2014.

t2 = t1 + calyears(0:4)

t2 = 1x5 datetime
31-Jan-2014 31-Jan-2015 31-Jan-2016 31-Jan-2017 31-Jan-2018

Calculate the difference between each pair of datetime values in t2 in terms of a number of calendar
days using the caldiff function.

dt = caldiff(t2,'days')

dt = 1x4 calendarDuration
365d 365d 366d 365d

The number of days between successive pairs of datetime values in t2 is not always the same
because 2016 is a leap year and has 366 days.

You can use the calquarters, calweeks, and caldays functions to create arrays of calendar
quarters, calendar weeks, or calendar days that you add to or subtract from datetime arrays.
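
A minimal sketch of these functions, starting from the same January 31, 2014 date used above:

```matlab
t1 = datetime(2014,1,31);

tq = t1 + calquarters(1);   % one calendar quarter (three months) later
tw = t1 + calweeks(2);      % two calendar weeks later
td = t1 - caldays(31);      % 31 calendar days earlier
```

As with calmonths, adding a calendar quarter to January 31 yields the last day of April, because April has only 30 days.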

Adding calendar durations is not commutative. When you add more than one calendarDuration
array to a datetime, MATLAB® adds them in the order in which they appear in the command.

Add 3 calendar months followed by 30 calendar days to January 31, 2014.

t2 = datetime(2014,1,31) + calmonths(3) + caldays(30)

t2 = datetime
30-May-2014

First add 30 calendar days to the same date, and then add 3 calendar months. The result is not the
same because when you add a calendar duration to a datetime, the number of days added depends on
the original date.

t2 = datetime(2014,1,31) + caldays(30) + calmonths(3)


t2 = datetime
02-Jun-2014

Calendar Duration Arithmetic

Create two calendar durations and then find their sum.


d1 = calyears(1) + calmonths(2) + caldays(20)

d1 = calendarDuration
1y 2mo 20d

d2 = calmonths(11) + caldays(23)

d2 = calendarDuration
11mo 23d

d = d1 + d2

d = calendarDuration
2y 1mo 43d

When you sum two or more calendar durations, a number of months greater than 12 rolls over to a
number of years. However, a large number of days does not roll over to a number of months, because
different months consist of different numbers of days.

Increase d by multiplying it by a factor of 2. Calendar duration values must be integers, so you can
multiply them only by integer values.
2*d

ans = calendarDuration
4y 2mo 86d

Calculate Elapsed Time in Exact Units

Subtract one datetime array from another to calculate elapsed time in terms of an exact number of
hours, minutes, and seconds.

Find the exact length of time between a sequence of datetime values and the start of the previous
day.
t2 = datetime('now') + caldays(1:3)

t2 = 1x3 datetime
02-Sep-2021 13:47:22 03-Sep-2021 13:47:22 04-Sep-2021 13:47:22

t1 = datetime('yesterday')

t1 = datetime
31-Aug-2021

dt = t2 - t1


dt = 1x3 duration
61:47:22 85:47:22 109:47:22

whos dt

Name Size Bytes Class Attributes

dt 1x3 40 duration

dt contains durations in the format, hours:minutes:seconds.

View the elapsed durations in units of days by changing the Format property of dt.

dt.Format = 'd'

dt = 1x3 duration
2.5746 days 3.5746 days 4.5746 days

Scale the duration values by multiplying dt by a factor of 1.2. Because durations have an exact
length, you can multiply and divide them by fractional values.

dt2 = 1.2*dt

dt2 = 1x3 duration


3.0895 days 4.2895 days 5.4895 days

Calculate Elapsed Time in Calendar Units

Use the between function to find the number of calendar years, months, and days elapsed between
two dates.

t1 = datetime('today')

t1 = datetime
01-Sep-2021

t2 = t1 + calmonths(0:2) + caldays(4)

t2 = 1x3 datetime
05-Sep-2021 05-Oct-2021 05-Nov-2021

dt = between(t1,t2)

dt = 1x3 calendarDuration
4d 1mo 4d 2mo 4d

See Also
between | diff | caldiff


Compare Dates and Time


This example shows how to compare datetime and duration arrays. You can perform an element-
by-element comparison of values in two datetime arrays or two duration arrays using relational
operators, such as > and <.

Compare Datetime Arrays

Compare two datetime arrays. The arrays must be the same size or one can be a scalar.

A = datetime(2013,07,26) + calyears(0:2:6)

A = 1x4 datetime
26-Jul-2013 26-Jul-2015 26-Jul-2017 26-Jul-2019

B = datetime(2014,06,01)

B = datetime
01-Jun-2014

A < B

ans = 1x4 logical array

1 0 0 0

The < operator returns logical 1 (true) where a datetime in A occurs before a datetime in B.

Compare a datetime array to text representing a date.

A >= '26-Sep-2014'

ans = 1x4 logical array

0 1 1 1

Comparisons of datetime arrays account for the time zone information of each array.

Compare September 1, 2014 at 4:00 p.m. in Los Angeles with 5:00 p.m. on the same day in New York.

A = datetime(2014,09,01,16,0,0,'TimeZone','America/Los_Angeles',...
'Format','dd-MMM-yyyy HH:mm:ss Z')

A = datetime
01-Sep-2014 16:00:00 -0700

B = datetime(2014,09,01,17,0,0,'TimeZone','America/New_York',...
'Format','dd-MMM-yyyy HH:mm:ss Z')

B = datetime
01-Sep-2014 17:00:00 -0400

A < B


ans = logical
0

4:00 p.m. in Los Angeles occurs after 5:00 p.m. on the same day in New York.

Compare Durations

Compare two duration arrays.

A = duration([2,30,30;3,15,0])

A = 2x1 duration
02:30:30
03:15:00

B = duration([2,40,0;2,50,0])

B = 2x1 duration
02:40:00
02:50:00

A >= B

ans = 2x1 logical array

0
1

Compare a duration array to a numeric array. Elements in the numeric array are treated as a number
of fixed-length (24-hour) days.

A < [1; 1/24]

ans = 2x1 logical array

1
0

Determine if Dates and Time Are Contained Within an Interval

Use the isbetween function to determine whether values in a datetime array lie within a closed
interval.

Define endpoints of an interval.

tlower = datetime(2014,08,01)

tlower = datetime
01-Aug-2014

tupper = datetime(2014,09,01)


tupper = datetime
01-Sep-2014

Create a datetime array and determine whether the values lie within the interval bounded by tlower
and tupper.

A = datetime(2014,08,21) + calweeks(0:2)

A = 1x3 datetime
21-Aug-2014 28-Aug-2014 04-Sep-2014

tf = isbetween(A,tlower,tupper)

tf = 1x3 logical array

1 1 0
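
As a cross-check, the same closed-interval test can be sketched with relational operators. The variable names tf1 and tf2 are introduced here for illustration only.

```matlab
tlower = datetime(2014,8,1);
tupper = datetime(2014,9,1);
A = datetime(2014,8,21) + calweeks(0:2);

tf1 = isbetween(A,tlower,tupper);
tf2 = (A >= tlower) & (A <= tupper);   % closed interval: both endpoints included

sameResult = isequal(tf1,tf2);
endpointIncluded = isbetween(tupper,tlower,tupper);
```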

See Also
isbetween

More About
• “Array Comparison with Relational Operators” on page 2-29


Plot Dates and Durations


You can create plots of datetime and duration values with a variety of graphics functions. You also can
customize the axes, such as changing the format of the tick labels or changing the axis limits.

Line Plot with Dates


Create a line plot with datetime values on the x-axis. Then, change the format of the tick labels and
the x-axis limits.

Create t as a sequence of dates and create y as random data. Plot the vectors using the plot
function.

t = datetime(2014,6,28) + calweeks(0:9);
y = rand(1,10);
plot(t,y);

By default, plot chooses tick mark locations based on the range of data. When you zoom in and out
of a plot, the tick labels automatically adjust to the new axis limits.

Change the x-axis limits. Also, change the format for the tick labels along the x-axis. For a list of
formatting options, see the xtickformat function.

xlim(datetime(2014,[7 8],[12 23]))


xtickformat('dd-MMM-yyyy')


Line Plot with Durations


Create a line plot with duration values on the x-axis. Then, change the format of the tick labels and
the x-axis limits.

Create t as seven linearly spaced duration values between 0 and 3 minutes. Create y as a vector of
random data. Plot the data.

t = 0:seconds(30):minutes(3);
y = rand(1,7);
plot(t,y);


View the x-axis limits. Since the duration tick labels are in terms of a single unit (seconds), the limits
are stored in terms of that unit.

xl = xlim

xl = 1x2 duration
-4.5 sec 184.5 sec

Change the format for the duration tick labels to display in the form of a digital timer that includes
more than one unit. For a list of formatting options, see the xtickformat function.

xtickformat('mm:ss')


View the x-axis limits again. Since the duration tick labels are now in terms of multiple units, the
limits are stored in units of 24-hour days.

xl = xlim

xl = 1x2 duration
-00:04 03:04

Scatter Plot with Dates and Durations


Create a scatter plot with datetime or duration inputs using the scatter or scatter3 functions. For
example, create a scatter plot with dates along the x-axis.

t = datetime('today') + caldays(1:100);
y = linspace(10,40,100) + 10*rand(1,100);
scatter(t,y)


Plots that Support Dates and Durations


You can create other types of plots with datetime or duration values. These graphics functions
support datetime and duration values.

bar barh
plot plot3
semilogx (x values must be numeric) semilogy (y values must be numeric)
stem stairs
scatter scatter3
area mesh
surf surface
fill fill3
line text
histogram

See Also
plot | datetime | xtickformat


Core Functions Supporting Date and Time Arrays


Many functions in MATLAB operate on date and time arrays in much the same way that they operate
on other arrays.

This table lists notable MATLAB functions that operate on datetime, duration, and
calendarDuration arrays in addition to other arrays.

size  length  ndims  numel  isrow  iscolumn  cat  horzcat  vertcat  permute  reshape  transpose  ctranspose  linspace
isequal  isequaln  eq  ne  lt  le  ge  gt  sort  sortrows  issorted
intersect  ismember  setdiff  setxor  unique  union  abs  floor  ceil  round  min  max  mean  median  mode
plus  minus  uminus  times  rdivide  ldivide  mtimes  mrdivide  mldivide  diff  sum  char  string  cellstr
plot  plot3  scatter  scatter3  bar  barh  histogram  stem  stairs  area  mesh  surf  surface  semilogx  semilogy  fill  fill3  line  text
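
A few of the listed functions applied to datetime and duration arrays are sketched below. The data values are arbitrary and chosen only for illustration.

```matlab
t = [datetime(2014,6,28); datetime(2013,1,15); datetime(2014,6,28); datetime(2015,3,2)];

ts   = sort(t);       % chronological order
tu   = unique(t);     % duplicates removed, sorted
tmin = min(t);        % earliest date
tmax = max(t);        % latest date
n    = numel(t);      % number of elements

d = hours([3 1 2]);
total   = sum(d);     % duration of 6 hours
average = mean(d);    % duration of 2 hours
```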


Convert Between Datetime Arrays, Numbers, and Text


In this section...
“Overview” on page 7-42
“Convert Between Datetime and Character Vectors” on page 7-42
“Convert Between Datetime and String Arrays” on page 7-44
“Convert Between Datetime and Date Vectors” on page 7-44
“Convert Serial Date Numbers to Datetime” on page 7-45
“Convert Datetime Arrays to Numeric Values” on page 7-45

Overview
datetime is the best data type for representing points in time. datetime values have flexible
display formats and up to nanosecond precision, and can account for time zones, daylight saving
time, and leap seconds. However, if you work with code authored in MATLAB R2014a or earlier, or if
you share code with others who use such a version, you might need to work with dates and time
stored in one of these three formats:

• Date String on page 7-42 — A character vector.

Example: Thursday, August 23, 2012 9:45:44.946 AM


• Date Vector on page 7-44 — A 1-by-6 numeric vector containing the year, month, day, hour,
minute, and second.

Example: [2012 8 23 9 45 44.946]


• Serial Date Number on page 7-45 — A single number equal to the number of days since January
0, 0000 in the proleptic ISO calendar (specifying use of the Gregorian calendar). Serial date
numbers are useful as inputs to some MATLAB functions that do not accept the datetime or
duration data types.

Example: 7.3510e+005

Date strings, vectors, and numbers can be stored as arrays of values. Store multiple date strings in a
cell array of character vectors, multiple date vectors in an m-by-6 matrix, and multiple serial date
numbers in a matrix.

You can convert any of these formats to a datetime array using the datetime function. If your
existing MATLAB code expects a serial date number or date vector, use the datenum or datevec
functions, respectively, to convert a datetime array to the expected data format. To convert a
datetime array to character vectors, use the char or cellstr functions.

Starting in R2016b, you also can convert a datetime array to a string array with the string
function.
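
The conversions named above can be sketched as a round trip. The variable names are illustrative, and the date is the one used in the examples in this overview.

```matlab
t = datetime(2012,8,23,9,45,44.946);

dn = datenum(t);    % serial date number (a double)
dv = datevec(t);    % 1-by-6 date vector
c  = cellstr(t);    % cell array containing one character vector

% Convert the older formats back to datetime values.
tFromVec = datetime(dv);
tFromNum = datetime(dn,'ConvertFrom','datenum');
```

Because a serial date number stores the date and time in a single double, the value converted back from dn can differ from t by a small rounding error.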

Convert Between Datetime and Character Vectors


A date string can be a character vector composed of fields related to a specific date and/or time.
There are several ways to represent dates and times in text format. For example, all of the following
are character vectors representing August 23, 2010 at 04:35:42 PM:


'23-Aug-2010 04:35:42 PM'
'Monday, August 23'
'08/23/10 16:35'
'Aug 23 16:35:42.946'

A date string includes characters that separate the fields, such as the hyphen, space, and colon used
here:

d = '23-Aug-2010 16:35:42'

Convert one or more date strings to a datetime array using the datetime function. For best
performance, specify the format of the input date strings as an input to datetime.

Note The specifiers that datetime uses to describe date and time formats differ from the specifiers
that the datestr, datevec, and datenum functions accept.

For a complete list of date and time format specifiers, see the Format property of the datetime data
type.
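
To illustrate the difference in specifiers, the sketch below formats the same point in time with datestr and with datetime. In datestr formats, mm means month and MM means minutes; in datetime formats, MM means month and mm means minutes.

```matlab
% datestr specifiers: mm = month, MM = minute, SS = second
dn = datenum(2010,8,23,16,35,42);
s1 = datestr(dn,'yyyy-mm-dd HH:MM:SS');

% datetime specifiers: MM = month, mm = minute, ss = second
t = datetime(2010,8,23,16,35,42,'Format','yyyy-MM-dd HH:mm:ss');
s2 = char(t);

sameText = strcmp(s1,s2);
```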

t = datetime(d,'InputFormat','dd-MMM-yyyy HH:mm:ss')

t =

datetime

23-Aug-2010 16:35:42

Although the date string, d, and the datetime scalar, t, look similar, they are not equal. View the
size and data type of each variable.

whos d t

Name Size Bytes Class Attributes

d 1x20 40 char
t 1x1 17 datetime

Convert a datetime array to a character vector using char or cellstr. For example, convert the
current date and time to a timestamp to append to a file name.

t = datetime('now','Format','yyyy-MM-dd''T''HHmmss')

t =

datetime

2017-01-03T151105

S = char(t);
filename = ['myTest_',S]

filename =

'myTest_2017-01-03T151105'


Convert Between Datetime and String Arrays


Starting in R2016b, you can use the string function to create a string array. If a string array
contains date strings, then you can convert the string array to a datetime array with the datetime
function. Similarly, you can convert a datetime array to a string array with the string function.

Convert a string array. MATLAB displays strings in double quotes. For best performance, specify the
format of the input date strings as an input to datetime.

str = ["24-Oct-2016 11:58:17";
       "19-Nov-2016 09:36:29";
       "12-Dec-2016 10:09:06"]

str =

3×1 string array

"24-Oct-2016 11:58:17"
"19-Nov-2016 09:36:29"
"12-Dec-2016 10:09:06"

t = datetime(str,'InputFormat','dd-MMM-yyyy HH:mm:ss')

t =

3×1 datetime array

24-Oct-2016 11:58:17
19-Nov-2016 09:36:29
12-Dec-2016 10:09:06

Convert a datetime value to a string.

t = datetime('25-Dec-2016 06:12:34');
str = string(t)

str =

"25-Dec-2016 06:12:34"

Convert Between Datetime and Date Vectors


A date vector is a 1-by-6 vector of double-precision numbers. Elements of a date vector are integer-
valued, except for the seconds element, which can be fractional. Time values are expressed in 24-
hour notation. There is no AM or PM setting.

A date vector is arranged in the following order:

year month day hour minute second

The following date vector represents 10:45:07 AM on October 24, 2012:

[2012 10 24 10 45 07]

Convert one or more date vectors to a datetime array using the datetime function:

t = datetime([2012 10 24 10 45 07])


t =

datetime

24-Oct-2012 10:45:07

Instead of using datevec to extract components of datetime values, use functions such as year,
month, and day:

y = year(t)

y =

2012

Alternatively, access the corresponding property, such as t.Year for year values:

y = t.Year

y =

2012
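
If existing code does expect date vectors, the conversion in the other direction can be sketched with datevec. Here a 2-by-1 datetime array (illustrative values) becomes a 2-by-6 matrix with one date vector per row.

```matlab
t = datetime(2012,10,24,10,45,7) + caldays(0:1)';   % 2-by-1 datetime array

dv = datevec(t);      % 2-by-6 matrix, one date vector per row
t2 = datetime(dv);    % back to a 2-by-1 datetime array
```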

Convert Serial Date Numbers to Datetime


A serial date number represents a calendar date as the number of days that has passed since a fixed
base date. In MATLAB, serial date number 1 is January 1, 0000.

Serial time can represent fractions of days beginning at midnight; for example, 6 p.m. equals 0.75
serial days. So the character vector '31-Oct-2003, 6:00 PM' in MATLAB is date number
731885.75.

Convert one or more serial date numbers to a datetime array using the datetime function. Specify
the type of date number that is being converted:

t = datetime(731885.75,'ConvertFrom','datenum')

t =

datetime

31-Oct-2003 18:00:00
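
The 'ConvertFrom' option accepts other numeric representations as well. As one example, the sketch below converts POSIX time (seconds since 1 January 1970 UTC) in both directions; the value 86400 is chosen only for illustration.

```matlab
% 86400 seconds after the POSIX epoch is one day later.
t = datetime(86400,'ConvertFrom','posixtime');

% The posixtime function converts in the other direction.
p = posixtime(datetime(1970,1,2));
```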

Convert Datetime Arrays to Numeric Values


Some MATLAB functions accept numeric data types but not datetime values as inputs. To apply these
functions to your date and time data, convert datetime values to meaningful numeric values. Then,
call the function. For example, the log function accepts double inputs, but not datetime inputs.
Suppose that you have a datetime array of dates spanning the course of a research study or
experiment.

t = datetime(2014,6,18) + calmonths(1:4)

t =

1×4 datetime array


18-Jul-2014 18-Aug-2014 18-Sep-2014 18-Oct-2014

Subtract the origin value. For example, the origin value might be the starting day of an experiment.

dt = t - datetime(2014,7,1)

dt =

1×4 duration array

408:00:00 1152:00:00 1896:00:00 2616:00:00

dt is a duration array. Convert dt to a double array of values in units of years, days, hours,
minutes, or seconds using the years, days, hours, minutes, or seconds function, respectively.

x = hours(dt)

x =

408 1152 1896 2616

Pass the double array as the input to the log function.

y = log(x)

y =

6.0113 7.0493 7.5475 7.8694

See Also
datetime | datenum | datevec | cellstr | char | string | duration

More About
• “Represent Dates and Times in MATLAB” on page 7-2
• “Extract or Assign Date and Time Components of Datetime Array” on page 7-23
• Proleptic Gregorian Calendar


Carryover in Date Vectors and Strings


If an element falls outside the conventional range, MATLAB adjusts both that date vector element and
the previous element. For example, if the minutes element is 70, MATLAB adjusts the hours element
by 1 and sets the minutes element to 10. If the minutes element is -15, then MATLAB decreases the
hours element by 1 and sets the minutes element to 45. Month values are an exception. MATLAB sets
month values less than 1 to 1.

In the following example, the month element has a value of 22. MATLAB increments the year value to
2010 and sets the month to October.

datestr([2009 22 03 00 00 00])

ans =
03-Oct-2010

The carrying forward of values also applies to time and day values in text representing dates and
times. For example, October 3, 2010 and September 33, 2010 are interpreted to be the same date,
and correspond to the same serial date number.

datenum('03-Oct-2010')

ans =
734414

datenum('33-Sep-2010')

ans =
734414

The following example takes the input month (07, or July), finds the last day of the previous month
(June 30), and subtracts the number of days in the field specifier (5 days) from that date to yield a
return date of June 25, 2010.

datestr([2010 07 -05 00 00 00])

ans =
25-Jun-2010
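
The minutes and month cases described at the start of this section can be sketched the same way. The multiple-input form of datenum is used here so that the inputs are unambiguously date components; the variable names are illustrative.

```matlab
% 70 minutes carries one hour forward: 09:70 becomes 10:10.
dnOver = datenum(2010,10,3,9,70,0);
sOver  = datestr(dnOver,'dd-mmm-yyyy HH:MM:SS');

% -15 minutes borrows one hour: 09:-15 becomes 08:45.
dnUnder = datenum(2010,10,3,9,-15,0);
sUnder  = datestr(dnUnder,'dd-mmm-yyyy HH:MM:SS');

% Month values less than 1 are set to 1 rather than carried backward.
monthClamped = datenum(2010,0,3) == datenum(2010,1,3);
```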


Converting Date Vector Returns Unexpected Output

Note The best practice is to use datetime values to represent points in time rather than date
vectors. Unlike date vectors, datetime values display in a human-readable format, often avoiding
the need for conversion to text. If you need to convert a date vector to text, the best practice is to
first convert it to a datetime value, and then to convert the datetime value to text by using the
string or char functions. While you can convert date vectors to text directly by using the datestr
function, you might get unexpected results, as described in this section.

Because a date vector is a 1-by-6 row vector of numbers, the datestr function might interpret input
date vectors as vectors of serial date numbers and return unexpected output. Or it might interpret
vectors of serial date numbers as date vectors. This ambiguity exists because datestr has a
heuristic rule for interpreting a 1-by-6 row vector as either a date vector or a vector of six serial date
numbers. The same ambiguity applies to inputs that are m-by-6 numeric matrices, where each row
can be interpreted either as a date vector or as six serial date numbers.

For example, consider a date vector that includes the year 3000. This year is outside the range of
years that datestr interprets as elements of date vectors. Therefore, the input is interpreted as a 1-
by-6 vector of serial date numbers.
d = datestr([3000 11 05 10 32 56])

d =

6×11 char array

'18-Mar-0008'
'11-Jan-0000'
'05-Jan-0000'
'10-Jan-0000'
'01-Feb-0000'
'25-Feb-0000'

Here datestr interprets 3000 as a serial date number, and converts it to the text '18-Mar-0008'
(the date that is 3000 days after 0-Jan-0000). Also, datestr converts the next five elements as
though they also were serial date numbers.

There are two methods for converting such a date vector to text.

• The recommended method is to convert the date vector to a datetime value. Then convert it
using the char, cellstr, or string function. The datetime function always treats 1-by-6
numeric vectors as date vectors.
dt = datetime([3000 11 05 10 32 56]);
ds = string(dt)

ds =

"05-Nov-3000 10:32:56"
• As an alternative, convert it to a serial date number using the datenum function. Then, convert
the date number to a character vector using datestr.
dn = datenum([3000 11 05 10 32 56]);
ds = datestr(dn)


ds =

'05-Nov-3000 10:32:56'

When converting dates to text, datestr interprets input as either date vectors or serial date
numbers using a heuristic rule. Consider an m-by-6 matrix. The datestr function interprets the
matrix as m date vectors when:

• The first five columns contain integers.


• The absolute value of the sum of each row is in the range 1500–2500.

If either condition is false, for any row, then datestr interprets the m-by-6 matrix as an m-by-6 matrix
of serial date numbers.
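
The two conditions can be restated as a small check. This is a hypothetical helper that mirrors the rule as described here, not the code that datestr actually runs, and the function names are invented for illustration.

```matlab
% Hypothetical restatement of the stated rule; not datestr's own code.
firstFiveAreIntegers = @(M) all(all(M(:,1:5) == round(M(:,1:5))));
rowSumsInRange = @(M) all(abs(sum(M,2)) >= 1500 & abs(sum(M,2)) <= 2500);
looksLikeDateVectors = @(M) size(M,2) == 6 && ...
    firstFiveAreIntegers(M) && rowSumsInRange(M);

a = looksLikeDateVectors([2020 06 21 10 51 00]);     % row sum is 2108
b = looksLikeDateVectors([3000 11 05 10 32 56]);     % row sum is 3114
c = looksLikeDateVectors([2020 06 2110 10 51 00]);   % row sum is 4197
```

Under this reading of the rule, only the first input falls within the 1500–2500 range, which is consistent with the datestr results shown in this section.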

Usually, dates with years in the range 1700–2300 are interpreted as date vectors. However, datestr
might interpret rows with month, day, hour, minute, or second values outside their normal ranges as
serial date numbers. For example, datestr correctly interprets the following date vector for the year
2020:
d = datestr([2020 06 21 10 51 00])

d =

'21-Jun-2020 10:51:00'

But given a day value outside the typical range (1–31), datestr returns a date for each element of
the vector.
d = datestr([2020 06 2110 10 51 00])

d =

6×11 char array

'12-Jul-0005'
'06-Jan-0000'
'10-Oct-0005'
'10-Jan-0000'
'20-Feb-0000'
'00-Jan-0000'

Again, the datetime function always treats 1-by-6 numeric vectors as date vectors. In this case, it calculates
an appropriate date, interpreting 2110 as the 2110th day since the beginning of June 2020.
d = datetime([2020 06 2110 10 51 00])

d =

datetime

11-Mar-2026 10:51:00

• When you have a matrix of date vectors that datestr might interpret incorrectly as serial date
numbers, convert the matrix by using either the datetime or datenum functions. Then convert
those values to text.
• When you have a matrix of serial date numbers that datestr might interpret as date vectors, first
convert the matrix to a column vector. Then, use datestr to convert the column vector.
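
The second recommendation can be sketched as follows. The six values are assumed, for the sake of illustration, to be serial date numbers that happen to look like a date vector.

```matlab
dn = [2020 6 21 10 51 0];      % intended as six serial date numbers

asOneDate = datestr(dn);       % row vector: interpreted as a single date vector
asNumbers = datestr(dn(:));    % column vector: six serial date numbers

nRowsRow = size(asOneDate,1);
nRowsCol = size(asNumbers,1);
```

Reshaping to a column removes the ambiguity because a column vector cannot be mistaken for an m-by-6 matrix of date vectors.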


See Also
datetime | datenum | datevec | char | string | datestr

More About
• “Represent Dates and Times in MATLAB” on page 7-2
• “Convert Between Datetime Arrays, Numbers, and Text” on page 7-42

8

Categorical Arrays

• “Create Categorical Arrays” on page 8-2


• “Convert Text in Table Variables to Categorical” on page 8-6
• “Plot Categorical Data” on page 8-10
• “Compare Categorical Array Elements” on page 8-16
• “Combine Categorical Arrays” on page 8-19
• “Combine Categorical Arrays Using Multiplication” on page 8-22
• “Access Data Using Categorical Arrays” on page 8-24
• “Work with Protected Categorical Arrays” on page 8-30
• “Advantages of Using Categorical Arrays” on page 8-34
• “Ordinal Categorical Arrays” on page 8-36
• “Core Functions Supporting Categorical Arrays” on page 8-39

Create Categorical Arrays


This example shows how to create a categorical array. categorical is a data type for storing data
with values from a finite set of discrete categories. These categories can have a natural order, but it is
not required. A categorical array provides efficient storage and convenient manipulation of data,
while also maintaining meaningful names for the values. Categorical arrays are often used in a table
to define groups of rows.

By default, categorical arrays contain categories that have no mathematical ordering. For example,
the discrete set of pet categories {'dog' 'cat' 'bird'} has no meaningful mathematical
ordering, so MATLAB® uses the alphabetical ordering {'bird' 'cat' 'dog'}. Ordinal categorical
arrays contain categories that have a meaningful mathematical ordering. For example, the discrete
set of size categories {'small', 'medium', 'large'} has the mathematical ordering small <
medium < large.

When you create categorical arrays from cell arrays of character vectors or string arrays, leading and
trailing spaces are removed. For example, if you specify the text {' cat' 'dog '} as categories, then
when you convert them to categories they become {'cat' 'dog'}.
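
As a minimal sketch of the trimming behavior, the following code uses a hypothetical list of pet names in which two entries differ only by surrounding spaces. After conversion, those entries should fall into the same category.

```matlab
% ' cat' and 'cat' differ only by a leading space.
pets = categorical({' cat','dog ','cat'});

% List the category names after conversion.
names = categories(pets);
nCats = numel(names);
```

Here nCats is 2, because the leading and trailing spaces are removed before the categories are determined.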

Create Categorical Array from Cell Array of Character Vectors

You can use the categorical function to create a categorical array from a numeric array, logical
array, string array, cell array of character vectors, or an existing categorical array.

Create a 1-by-11 cell array of character vectors containing state names from New England.

state = {'MA','ME','CT','VT','ME','NH','VT','MA','NH','CT','RI'};

Convert the cell array, state, to a categorical array that has no mathematical order.

state = categorical(state)

state = 1x11 categorical


Columns 1 through 9

MA ME CT VT ME NH VT MA NH

Columns 10 through 11

CT RI

class(state)

ans =
'categorical'

List the discrete categories in the variable state.

categories(state)

ans = 6x1 cell


{'CT'}
{'MA'}
{'ME'}
{'NH'}
{'RI'}
{'VT'}

The categories are listed in alphabetical order.

Create Ordinal Categorical Array from Cell Array of Character Vectors

Create a 1-by-8 cell array of character vectors containing the sizes of eight objects.
AllSizes = {'medium','large','small','small','medium',...
'large','medium','small'};

The cell array, AllSizes, has three distinct values: 'large', 'medium', and 'small'. With the cell
array of character vectors, there is no convenient way to indicate that small < medium < large.

Convert the cell array, AllSizes, to an ordinal categorical array. Use valueset to specify the values
small, medium, and large, which define the categories. For an ordinal categorical array, the first
category specified is the smallest and the last category is the largest.
valueset = ["small","medium","large"];
sizeOrd = categorical(AllSizes,valueset,'Ordinal',true)

sizeOrd = 1x8 categorical


Columns 1 through 6

medium large small small medium large

Columns 7 through 8

medium small

class(sizeOrd)

ans =
'categorical'

The order of the values in the categorical array, sizeOrd, remains unchanged.

List the discrete categories in the categorical variable, sizeOrd.


categories(sizeOrd)

ans = 3x1 cell


{'small' }
{'medium'}
{'large' }

The categories are listed in the specified order to match the mathematical ordering small <
medium < large.
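
One reason to specify the ordering is that relational operations and functions such as max then become meaningful. The following sketch repeats the size data from this example; the variable names isBig, nBig, and largest are introduced here only for illustration.

```matlab
AllSizes = {'medium','large','small','small','medium',...
    'large','medium','small'};
valueset = {'small','medium','large'};
sizeOrd = categorical(AllSizes,valueset,'Ordinal',true);

% Elements in any category above small.
isBig = sizeOrd > 'small';
nBig = nnz(isBig);

% Largest category that occurs in the data.
largest = max(sizeOrd);
```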

Create Ordinal Categorical Array by Binning Numeric Data

Create a vector of 100 random numbers between zero and 50.


x = rand(100,1)*50;


Use the discretize function to create a categorical array by binning the values of x. Put all values
between zero and 15 in the first bin, all the values between 15 and 35 in the second bin, and all the
values between 35 and 50 in the third bin. Each bin includes the left endpoint, but does not include
the right endpoint.

catnames = ["small","medium","large"];
binnedData = discretize(x,[0 15 35 50],'categorical',catnames);

binnedData is a 100-by-1 ordinal categorical array with three categories, such that small <
medium < large.
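
To see how values that fall exactly on a bin edge are assigned, consider a small deterministic input instead of random data. The values below are hypothetical. Note that, as an exception to the left-endpoint rule, the last bin also includes its right endpoint, so the value 50 is placed in the third bin.

```matlab
% Values chosen to sit on or next to the bin edges.
x = [0; 14.9; 15; 34.9; 35; 50];
catnames = {'small','medium','large'};
b = discretize(x,[0 15 35 50],'categorical',catnames);

% Number of elements assigned to each category.
counts = countcats(b);
```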

Use the summary function to print the number of elements in each category.

summary(binnedData)

small 30
medium 35
large 35

Create Categorical Array from String Array

Starting in R2016b, you can create string arrays with the string function (or, starting in R2017a, with double quotes) and convert them to
categorical arrays.

Create a string array that contains names of planets.

str = ["Earth","Jupiter","Neptune","Jupiter","Mars","Earth"]

str = 1x6 string


"Earth" "Jupiter" "Neptune" "Jupiter" "Mars" "Earth"

Convert str to a categorical array.

planets = categorical(str)

planets = 1x6 categorical


Earth Jupiter Neptune Jupiter Mars Earth

Add missing elements to str and convert it to a categorical array. Where str has missing values,
planets has undefined values.

str(8) = "Mars"

str = 1x8 string


Columns 1 through 6

"Earth" "Jupiter" "Neptune" "Jupiter" "Mars" "Earth"

Columns 7 through 8

<missing> "Mars"

planets = categorical(str)

planets = 1x8 categorical


Columns 1 through 6

Earth Jupiter Neptune Jupiter Mars Earth

Columns 7 through 8

<undefined> Mars
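
If you need to locate the undefined elements programmatically, one option is the isundefined function. The following is a minimal sketch with a shorter, hypothetical string array.

```matlab
str = ["Earth","Jupiter"];
str(4) = "Mars";              % str(3) is filled in as <missing>
planets = categorical(str);

% Logical index of the elements that belong to no category.
tf = isundefined(planets);

% Missing strings do not create a category.
names = categories(planets);
```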

See Also
categorical | categories | summary | discretize

Related Examples
• “Convert Text in Table Variables to Categorical” on page 8-6
• “Access Data Using Categorical Arrays” on page 8-24
• “Compare Categorical Array Elements” on page 8-16

More About
• “Advantages of Using Categorical Arrays” on page 8-34
• “Ordinal Categorical Arrays” on page 8-36


Convert Text in Table Variables to Categorical


This example shows how to convert a variable in a table from a cell array of character vectors to a
categorical array. The same workflow applies for table variables that are string arrays.

Load Sample Data and Create a Table

Load sample data gathered from 100 patients.

load patients

whos

Name Size Bytes Class Attributes

Age 100x1 800 double
Diastolic 100x1 800 double
Gender 100x1 11412 cell
Height 100x1 800 double
LastName 100x1 11616 cell
Location 100x1 14208 cell
SelfAssessedHealthStatus 100x1 11540 cell
Smoker 100x1 100 logical
Systolic 100x1 800 double
Weight 100x1 800 double

Store the patient data from Age, Gender, Height, Weight, SelfAssessedHealthStatus, and
Location in a table. Use the unique identifiers in the variable LastName as row names.

T = table(Age,Gender,Height,Weight,...
SelfAssessedHealthStatus,Location,...
'RowNames',LastName);

Convert Table Variables from Cell Arrays of Character Vectors to Categorical Arrays

The cell arrays of character vectors, Gender and Location, contain discrete sets of unique values.

Convert Gender and Location to categorical arrays.

T.Gender = categorical(T.Gender);
T.Location = categorical(T.Location);

The variable, SelfAssessedHealthStatus, contains four unique values: Excellent, Fair, Good,
and Poor.

Convert SelfAssessedHealthStatus to an ordinal categorical array, such that the categories have
the mathematical ordering Poor < Fair < Good < Excellent.

T.SelfAssessedHealthStatus = categorical(T.SelfAssessedHealthStatus,...
{'Poor','Fair','Good','Excellent'},'Ordinal',true);

Print a Summary

View the data type, description, units, and other descriptive statistics for each variable by using
summary to summarize the table.


format compact

summary(T)

Variables:
Age: 100x1 double
Values:
Min 25
Median 39
Max 50
Gender: 100x1 categorical
Values:
Female 53
Male 47
Height: 100x1 double
Values:
Min 60
Median 67
Max 72
Weight: 100x1 double
Values:
Min 111
Median 142.5
Max 202
SelfAssessedHealthStatus: 100x1 ordinal categorical
Values:
Poor 11
Fair 15
Good 40
Excellent 34
Location: 100x1 categorical
Values:
County General Hospital 39
St. Mary's Medical Center 24
VA Hospital 37

The table variables Gender, SelfAssessedHealthStatus, and Location are categorical arrays.
The summary contains the counts of the number of elements in each category. For example, the
summary indicates that 53 of the 100 patients are female and 47 are male.

Select Data Based on Categories

Create a subtable, T1, containing the age, height, and weight of all female patients who were
observed at County General Hospital. You can easily create a logical vector based on the values in the
categorical arrays Gender and Location.

rows = T.Location=='County General Hospital' & T.Gender=='Female';

rows is a 100-by-1 logical vector with logical true (1) for the table rows where the gender is female
and the location is County General Hospital.

Define the subset of variables.

vars = {'Age','Height','Weight'};

Use parentheses to create the subtable, T1.

T1 = T(rows,vars)


T1=19×3 table
Age Height Weight
___ ______ ______
Brown 49 64 119
Taylor 31 66 132
Anderson 45 68 128
Lee 44 66 146
Walker 28 65 123
Young 25 63 114
Campbell 37 65 135
Evans 39 62 121
Morris 43 64 135
Rivera 29 63 130
Richardson 30 67 141
Cox 28 66 111
Torres 45 70 137
Peterson 32 60 136
Ramirez 48 64 137
Bennett 35 64 131

T1 is a 19-by-3 table.
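
The same selection pattern can be tried on a small table that does not depend on the patients data set. The four rows below are hypothetical values made up for this sketch.

```matlab
% Hypothetical data for four patients.
Age = [38;43;40;49];
Gender = categorical({'Female';'Male';'Female';'Male'});
Location = categorical({'County General Hospital';'VA Hospital';...
    'VA Hospital';'County General Hospital'});
S = table(Age,Gender,Location,...
    'RowNames',{'Smith','Johnson','Williams','Jones'});

% Combine two categorical comparisons into one logical row index.
rows = S.Location=='County General Hospital' & S.Gender=='Female';
S1 = S(rows,{'Age'});
```

Only the row named Smith satisfies both conditions, so S1 has a single row.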

Since ordinal categorical arrays have a mathematical ordering for their categories, you can perform
element-wise comparisons of them with relational operations, such as greater than and less than.

Create a subtable, T2, of the gender, age, height, and weight of all patients who assessed their health
status as poor or fair.

First, define the subset of rows to include in table T2.

rows = T.SelfAssessedHealthStatus<='Fair';

Then, define the subset of variables to include in table T2.

vars = {'Gender','Age','Height','Weight'};

Use parentheses to create the subtable T2.

T2 = T(rows,vars)

T2=26×4 table
Gender Age Height Weight
______ ___ ______ ______
Johnson Male 43 69 163
Jones Female 40 67 133
Thomas Female 42 66 137
Jackson Male 25 71 174
Garcia Female 27 69 131
Rodriguez Female 39 64 117
Lewis Female 41 62 137
Lee Female 44 66 146
Hall Male 25 70 189
Hernandez Male 36 68 166
Lopez Female 40 66 137
Gonzalez Female 35 66 118
Mitchell Male 39 71 164
Campbell Female 37 65 135
Parker Male 30 68 182
Stewart Male 49 68 170

T2 is a 26-by-4 table.
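
The comparison against 'Fair' relies on the category order, not on alphabetical order. A minimal sketch with four hypothetical values shows which elements the index selects.

```matlab
% Hypothetical health-status values with the ordering used above.
status = categorical({'Good';'Poor';'Excellent';'Fair'},...
    {'Poor','Fair','Good','Excellent'},'Ordinal',true);

% True where the status is Poor or Fair.
rows = status <= 'Fair';
```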

See Also

Related Examples
• “Create Tables and Assign Data to Them” on page 9-2
• “Create Categorical Arrays” on page 8-2
• “Access Data in Tables” on page 9-32
• “Access Data Using Categorical Arrays” on page 8-24

More About
• “Advantages of Using Categorical Arrays” on page 8-34
• “Ordinal Categorical Arrays” on page 8-36


Plot Categorical Data


This example shows how to plot data from a categorical array.

Load Sample Data

Load sample data gathered from 100 patients.

load patients

whos

Name Size Bytes Class Attributes

Age 100x1 800 double
Diastolic 100x1 800 double
Gender 100x1 11412 cell
Height 100x1 800 double
LastName 100x1 11616 cell
Location 100x1 14208 cell
SelfAssessedHealthStatus 100x1 11540 cell
Smoker 100x1 100 logical
Systolic 100x1 800 double
Weight 100x1 800 double

Create Categorical Arrays from Cell Arrays of Character Vectors

The workspace variable, Location, is a cell array of character vectors that contains the three unique
medical facilities where patients were observed.

To access and compare data more easily, convert Location to a categorical array.

Location = categorical(Location);

Summarize the categorical array.

summary(Location)

County General Hospital 39
St. Mary's Medical Center 24
VA Hospital 37

39 patients were observed at County General Hospital, 24 at St. Mary's Medical Center, and 37 at the
VA Hospital.

The workspace variable, SelfAssessedHealthStatus, contains four unique values: Excellent,
Fair, Good, and Poor.

Convert SelfAssessedHealthStatus to an ordinal categorical array, such that the categories have
the mathematical ordering Poor < Fair < Good < Excellent.

SelfAssessedHealthStatus = categorical(SelfAssessedHealthStatus,...
{'Poor' 'Fair' 'Good' 'Excellent'},'Ordinal',true);

Summarize the categorical array, SelfAssessedHealthStatus.

summary(SelfAssessedHealthStatus)

Poor 11
Fair 15
Good 40
Excellent 34

Plot Histogram

Create a histogram bar plot directly from a categorical array.

figure
histogram(SelfAssessedHealthStatus)
title('Self Assessed Health Status From 100 Patients')

The function histogram accepts the categorical array, SelfAssessedHealthStatus, and plots the
category counts for each of the four categories.

Create a histogram of the hospital location for only the patients who assessed their health as Fair or
Poor.

figure
histogram(Location(SelfAssessedHealthStatus<='Fair'))
title('Location of Patients in Fair or Poor Health')


Create Pie Chart

Create a pie chart directly from a categorical array.

figure
pie(SelfAssessedHealthStatus);
title('Self Assessed Health Status From 100 Patients')


The function pie accepts the categorical array, SelfAssessedHealthStatus, and plots a pie chart
of the four categories.

Create Pareto Chart

Create a Pareto chart from the category counts for each of the four categories of
SelfAssessedHealthStatus.

figure
A = countcats(SelfAssessedHealthStatus);
C = categories(SelfAssessedHealthStatus);
pareto(A,C);
title('Self Assessed Health Status From 100 Patients')


The first input argument to pareto must be a vector. If a categorical array is a matrix or
multidimensional array, reshape it into a vector before calling countcats and pareto.
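
The reshaping step might look like the following sketch. The 3-by-2 categorical matrix M is hypothetical. Calling countcats on the matrix directly would count each column separately, so the sketch first converts M to a column vector with M(:).

```matlab
% Hypothetical 3-by-2 categorical matrix.
M = categorical({'Poor','Fair';'Good','Fair';'Good','Good'});

% Reshape into a column vector before counting.
v = M(:);
counts = countcats(v);     % one count per category
names = categories(v);     % category names, in category order

figure
pareto(counts,names)
```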

Create Scatter Plot

Convert the cell array of character vectors to a categorical array.

Gender = categorical(Gender);

Summarize the categorical array, Gender.

summary(Gender)

Female 53
Male 47

Gender is a 100-by-1 categorical array with two categories, Female and Male.

Use the categorical array, Gender, to access Weight and Height data for each gender separately.

X1 = Weight(Gender=='Female');
Y1 = Height(Gender=='Female');

X2 = Weight(Gender=='Male');
Y2 = Height(Gender=='Male');

X1 and Y1 are 53-by-1 numeric arrays containing data from the female patients.

X2 and Y2 are 47-by-1 numeric arrays containing data from the male patients.

Create a scatter plot of height vs. weight. Indicate data from the female patients with a circle and
data from the male patients with a cross.

figure
h1 = scatter(X1,Y1,'o');
hold on
h2 = scatter(X2,Y2,'x');

title('Height vs. Weight')
xlabel('Weight (lbs)')
ylabel('Height (in)')

See Also
categorical | summary | countcats | histogram | pie | bar | rose | scatter

Related Examples
• “Access Data Using Categorical Arrays” on page 8-24


Compare Categorical Array Elements


This example shows how to use relational operations with a categorical array.

Create Categorical Array from Cell Array of Character Vectors

Create a 2-by-4 cell array of character vectors.

C = {'blue' 'red' 'green' 'blue';...
'blue' 'green' 'green' 'blue'};

colors = categorical(C)

colors = 2x4 categorical


blue red green blue
blue green green blue

colors is a 2-by-4 categorical array.

List the categories of the categorical array.

categories(colors)

ans = 3x1 cell


{'blue' }
{'green'}
{'red' }

Determine If Elements Are Equal

Use the relational operator, eq (==), to compare the first and second rows of colors.

colors(1,:) == colors(2,:)

ans = 1x4 logical array

1 0 1 1

Only the values in the second column differ between the rows.

Compare Entire Array to Character Vector

Compare the entire categorical array, colors, to the character vector 'blue' to find the location of
all blue values.

colors == 'blue'

ans = 2x4 logical array

1 0 0 1
1 0 0 1

There are four blue entries in colors, one in each corner of the array.


Convert to an Ordinal Categorical Array

Add a mathematical ordering to the categories in colors. Specify the category order that represents
the ordering of the color spectrum, red < green < blue.

colors = categorical(colors,{'red','green','blue'},'Ordinal',true)

colors = 2x4 categorical


blue red green blue
blue green green blue

The elements in the categorical array remain the same.

List the discrete categories in colors.

categories(colors)

ans = 3x1 cell


{'red' }
{'green'}
{'blue' }

Compare Elements Based on Order

Determine if elements in the first column of colors are greater than the elements in the second
column.

colors(:,1) > colors(:,2)

ans = 2x1 logical array

1
1

Both values in the first column, blue, are greater than the corresponding values in the second
column, red and green.

Find all the elements in colors that are less than 'blue'.

colors < 'blue'

ans = 2x4 logical array

0 1 1 0
0 1 1 0

The function lt (<) indicates the location of all green and red values with 1.
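
Order-based comparisons such as < are defined only for ordinal categorical arrays. The following sketch, using hypothetical variable names, illustrates the difference: the comparison is expected to raise an error for a nonordinal array and to succeed once an ordering is specified.

```matlab
% Nonordinal array: categories have no mathematical ordering.
c = categorical({'red','green','blue'});
try
    tf = c < 'blue';
    errored = false;
catch
    errored = true;       % inequality is not defined without an order
end

% Ordinal array with the ordering red < green < blue.
cOrd = categorical(c,{'red','green','blue'},'Ordinal',true);
tfOrd = cOrd < 'blue';
```

Equality tests with == and ~= work for both kinds of arrays.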

See Also
categorical | categories


Related Examples
• “Access Data Using Categorical Arrays” on page 8-24

More About
• “Relational Operations”
• “Advantages of Using Categorical Arrays” on page 8-34
• “Ordinal Categorical Arrays” on page 8-36


Combine Categorical Arrays


This example shows how to combine two categorical arrays.

Create Categorical Arrays

Create a categorical array, A, containing the preferred lunchtime beverage of 25 students in
classroom A.

rng('default')
A = randi(3,[25,1]);
A = categorical(A,1:3,{'milk' 'water' 'juice'});

A is a 25-by-1 categorical array with three distinct categories: milk, water, and juice.

Summarize the categorical array, A.

summary(A)

milk 6
water 5
juice 14

Six students in classroom A prefer milk, five prefer water, and fourteen prefer juice.

Create another categorical array, B, containing the preferences of 28 students in classroom B.

B = randi(3,[28,1]);
B = categorical(B,1:3,{'milk' 'water' 'juice'});

B is a 28-by-1 categorical array containing the same categories as A.

Summarize the categorical array, B.

summary(B)

milk 9
water 8
juice 11

Nine students in classroom B prefer milk, eight prefer water, and eleven prefer juice.

Concatenate Categorical Arrays

Concatenate the data from classrooms A and B into a single categorical array, Group1.

Group1 = [A;B];

Summarize the categorical array, Group1.

summary(Group1)

milk 15
water 13
juice 25

Group1 is a 53-by-1 categorical array with three categories: milk, water, and juice.


Create Categorical Array with Different Categories

Create a categorical array, Group2, containing data from 50 students who were given the additional
beverage option of soda.

Group2 = randi(4,[50,1]);
Group2 = categorical(Group2,1:4,{'juice' 'milk' 'soda' 'water'});

Summarize the categorical array, Group2.

summary(Group2)

juice 12
milk 14
soda 10
water 14

Group2 is a 50-by-1 categorical array with four categories: juice, milk, soda, and water.

Concatenate Arrays with Different Categories

Concatenate the data from Group1 and Group2.

students = [Group1;Group2];

Summarize the resulting categorical array, students.

summary(students)

milk 29
water 27
juice 37
soda 10

Concatenation appends the categories exclusive to the second input, soda, to the end of the list of
categories from the first input, milk, water, juice, soda.

Use reordercats to change the order of the categories in the categorical array, students.

students = reordercats(students,{'juice','milk','water','soda'});

categories(students)

ans = 4x1 cell


{'juice'}
{'milk' }
{'water'}
{'soda' }
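
The category order that results from concatenation, and the effect of reordercats, can be checked on a small deterministic case. The arrays g1 and g2 below are hypothetical stand-ins for Group1 and Group2.

```matlab
% First input defines the leading category order.
g1 = categorical({'milk';'water';'juice'},{'milk','water','juice'});

% Second input has one category, soda, that g1 lacks.
g2 = categorical({'soda';'juice'},{'juice','milk','soda','water'});

s = [g1;g2];
catsBefore = categories(s);

% Reordering categories changes their order, not the data.
s = reordercats(s,{'juice','milk','water','soda'});
catsAfter = categories(s);
```

The elements of s are the same before and after the call to reordercats; only the order of the category list changes.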

Union of Categorical Arrays

Use the function union to find the unique responses from Group1 and Group2.

C = union(Group1,Group2)

C = 4x1 categorical
milk
water
juice
soda

union returns the combined values from Group1 and Group2 with no repetitions. In this case, C is
equivalent to the categories of the concatenation, students.

All of the categorical arrays in this example were nonordinal. To combine ordinal categorical arrays,
they must have the same sets of categories including their order.
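
As a sketch of this restriction, the following code attempts to concatenate ordinal arrays whose category sets differ, and then arrays whose category sets match. The arrays are hypothetical, and the code assumes that the mismatched concatenation raises an error rather than merging the categories.

```matlab
a = categorical({'small','large'},{'small','large'},'Ordinal',true);
b = categorical({'medium'},{'small','medium','large'},'Ordinal',true);

try
    ab = [a b];           % category sets differ
    errored = false;
catch
    errored = true;
end

% Same categories in the same order: concatenation succeeds.
b2 = categorical({'large'},{'small','large'},'Ordinal',true);
c = [a b2];
```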

See Also
categorical | categories | summary | union | cat | horzcat | vertcat

Related Examples
• “Create Categorical Arrays” on page 8-2
• “Combine Categorical Arrays Using Multiplication” on page 8-22
• “Convert Text in Table Variables to Categorical” on page 8-6
• “Access Data Using Categorical Arrays” on page 8-24

More About
• “Ordinal Categorical Arrays” on page 8-36


Combine Categorical Arrays Using Multiplication


This example shows how to use the times function to combine categorical arrays, including ordinal
categorical arrays and arrays with undefined elements. When you call times on two categorical
arrays, the output is a categorical array with new categories. The set of new categories is the set of
all the ordered pairs created from the categories of the input arrays, or the Cartesian product. times
forms each element of the output array as the ordered pair of the corresponding elements of the
input arrays. The output array has the same size as the input arrays.

Combine Two Categorical Arrays

Combine two categorical arrays using times. The input arrays must have the same number of
elements, but can have different numbers of categories.

A = categorical({'blue','red','green'});
B = categorical({'+','-','+'});
C = A.*B

C = 1x3 categorical
blue + red - green +

Cartesian Product of Categories

Show the categories of C. The categories are all the ordered pairs that can be created from the
categories of A and B, also known as the Cartesian product.

categories(C)

ans = 6x1 cell


{'blue +' }
{'blue -' }
{'green +'}
{'green -'}
{'red +' }
{'red -' }

As a consequence, A.*B does not equal B.*A.

D = B.*A

D = 1x3 categorical
+ blue - red + green

categories(D)

ans = 6x1 cell


{'+ blue' }
{'+ green'}
{'+ red' }
{'- blue' }
{'- green'}
{'- red' }


Multiplication with Undefined Elements

Combine two categorical arrays. If either A or B has an undefined element, the corresponding
element of C is undefined.

A = categorical({'blue','red','green','black'});
B = categorical({'+','-','+','-'});
A = removecats(A,{'black'});
C = A.*B

C = 1x4 categorical
blue + red - green + <undefined>

Cartesian Product of Ordinal Categorical Arrays

Combine two ordinal categorical arrays. C is an ordinal categorical array only if A and B are both
ordinal. The ordering of the categories of C follows from the orderings of the input categorical arrays.

A = categorical({'blue','red','green'},{'green','red','blue'},'Ordinal',true);
B = categorical({'+','-','+'},'Ordinal',true);
C = A.*B;
categories(C)

ans = 6x1 cell


{'green +'}
{'green -'}
{'red +' }
{'red -' }
{'blue +' }
{'blue -' }

See Also
categorical | categories | summary | times

Related Examples
• “Create Categorical Arrays” on page 8-2
• “Combine Categorical Arrays” on page 8-19
• “Access Data Using Categorical Arrays” on page 8-24

More About
• “Ordinal Categorical Arrays” on page 8-36


Access Data Using Categorical Arrays

In this section...
“Select Data By Category” on page 8-24
“Common Ways to Access Data Using Categorical Arrays” on page 8-24

Select Data By Category


Selecting data based on its values is often useful. This type of data selection can involve creating a
logical vector based on values in one variable, and then using that logical vector to select a subset of
values in other variables. You can create a logical vector for selecting data by finding values in a
numeric array that fall within a certain range. Additionally, you can create the logical vector by
finding specific discrete values. When using categorical arrays, you can easily:

• Select elements from particular categories. For categorical arrays, use the relational operators
== or ~= to select data that is in, or not in, a particular category. To select data in a particular
group of categories, use the ismember function.

For ordinal categorical arrays, use inequalities >, >=, <, or <= to find data in categories above or
below a particular category.
• Delete data that is in a particular category. Use logical operators to include or exclude data
from particular categories.
• Find elements that are not in a defined category. Categorical arrays indicate which elements
do not belong to a defined category by <undefined>. Use the isundefined function to find
observations without a defined value.
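
The operations in this list can be sketched on a small array. The location names below are hypothetical abbreviations used only for this illustration.

```matlab
loc = categorical({'VA';'County';'StMarys';'VA'});

% Elements in a group of categories.
inGroup = ismember(loc,{'VA','County'});

% Elements not in a particular category.
notVA = loc ~= 'VA';

% After a category is removed, its elements become undefined.
loc = removecats(loc,'StMarys');
undef = isundefined(loc);
```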

Common Ways to Access Data Using Categorical Arrays


This example shows how to index and search using categorical arrays. You can access data using
categorical arrays stored within a table in a similar manner.

Load Sample Data

Load sample data gathered from 100 patients.

load patients
whos

Name Size Bytes Class Attributes

Age 100x1 800 double
Diastolic 100x1 800 double
Gender 100x1 11412 cell
Height 100x1 800 double
LastName 100x1 11616 cell
Location 100x1 14208 cell
SelfAssessedHealthStatus 100x1 11540 cell
Smoker 100x1 100 logical
Systolic 100x1 800 double
Weight 100x1 800 double


Create Categorical Arrays from Cell Arrays of Character Vectors

Gender and Location contain data that belong in categories. Each cell array contains character
vectors taken from a small set of unique values (indicating two genders and three locations
respectively). Convert Gender and Location to categorical arrays.

Gender = categorical(Gender);
Location = categorical(Location);

Search for Members of a Single Category

For categorical arrays, you can use the relational operators == and ~= to find the data that is in, or not
in, a particular category.

Determine if there are any patients observed at the location, 'Rampart General Hospital'.

any(Location=='Rampart General Hospital')

ans = logical
0

There are no patients observed at Rampart General Hospital.

Search for Members of a Group of Categories

You can use ismember to find data in a particular group of categories. Create a logical vector for the
patients observed at County General Hospital or VA Hospital.

VA_CountyGenIndex = ...
ismember(Location,{'County General Hospital','VA Hospital'});

VA_CountyGenIndex is a 100-by-1 logical array containing logical true (1) for each element in the
categorical array Location that is a member of the category County General Hospital or VA
Hospital. The output, VA_CountyGenIndex, contains 76 nonzero elements.

Use the logical vector, VA_CountyGenIndex, to select the LastName of the patients observed at
either County General Hospital or VA Hospital.

VA_CountyGenPatients = LastName(VA_CountyGenIndex);

VA_CountyGenPatients is a 76-by-1 cell array of character vectors.

Select Elements in a Particular Category to Plot

Use the summary function to print a summary containing the category names and the number of
elements in each category.

summary(Location)

County General Hospital 39
St. Mary's Medical Center 24
VA Hospital 37

Location is a 100-by-1 categorical array with three categories. County General Hospital
occurs in 39 elements, St. Mary's Medical Center in 24 elements, and VA Hospital in 37
elements.


Use the summary function to print a summary of Gender.


summary(Gender)

Female 53
Male 47

Gender is a 100-by-1 categorical array with two categories. Female occurs in 53 elements and Male
occurs in 47 elements.

Use the relational operator == to access the age of only the female patients. Then plot a histogram of this
data.
figure()
histogram(Age(Gender=='Female'))
title('Age of Female Patients')

histogram(Age(Gender=='Female')) plots the age data for the 53 female patients.

Delete Data from a Particular Category

You can use logical operators to include or exclude data from particular categories. Delete all patients
observed at VA Hospital from the workspace variables, Age and Location.
Age = Age(Location~='VA Hospital');
Location = Location(Location~='VA Hospital');

Now, Age is a 63-by-1 numeric array, and Location is a 63-by-1 categorical array.


List the categories of Location, as well as the number of elements in each category.

summary(Location)

County General Hospital 39
St. Mary's Medical Center 24
VA Hospital 0

The patients observed at VA Hospital are deleted from Location, but VA Hospital is still a
category.

Use the removecats function to remove VA Hospital from the categories of Location.

Location = removecats(Location,'VA Hospital');

Verify that the category, VA Hospital, was removed.

categories(Location)

ans = 2x1 cell


{'County General Hospital' }
{'St. Mary's Medical Center'}

Location is a 63-by-1 categorical array that has two categories.
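
When several categories might be left without elements, you can also call removecats with no category names, which removes all unused categories at once. The following is a minimal sketch with hypothetical category names.

```matlab
loc = categorical({'A';'B';'C';'A'});

% Deleting the elements does not delete the category.
loc = loc(loc ~= 'C');
nBefore = numel(categories(loc));

% Remove every category that no element belongs to.
loc = removecats(loc);
nAfter = numel(categories(loc));
```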

Delete Element

You can delete elements by indexing. For example, you can remove the first element of Location by
selecting the rest of the elements with Location(2:end). However, an easier way to delete
elements is to use [].

Location(1) = [];
summary(Location)

County General Hospital 38
St. Mary's Medical Center 24

Location is a 62-by-1 categorical array that has two categories. Deleting the first element has no
effect on other elements from the same category and does not delete the category itself.

Check for Undefined Data

Remove the category County General Hospital from Location.

Location = removecats(Location,'County General Hospital');

Display the first eight elements of the categorical array, Location.

Location(1:8)

ans = 8x1 categorical


St. Mary's Medical Center
<undefined>
St. Mary's Medical Center
St. Mary's Medical Center
<undefined>
<undefined>
St. Mary's Medical Center
St. Mary's Medical Center

After removing the category, County General Hospital, elements that previously belonged to
that category no longer belong to any category defined for Location. Categorical arrays denote
these elements as undefined.

Use the function isundefined to find observations that do not belong to any category.

undefinedIndex = isundefined(Location);

undefinedIndex is a 62-by-1 logical array containing logical true (1) for all undefined
elements in Location.

Set Undefined Elements

Use the summary function to print the number of undefined elements in Location.

summary(Location)

St. Mary's Medical Center 24
<undefined> 38

The first element of Location belongs to the category, St. Mary's Medical Center. Set the first
element to be undefined so that it no longer belongs to any category.

Location(1) = '<undefined>';
summary(Location)

St. Mary's Medical Center 23
<undefined> 39

You can make selected elements undefined without removing a category or changing the categories
of other elements. Set elements to be undefined to indicate elements with values that are unknown.

Preallocate Categorical Arrays with Undefined Elements

You can use undefined elements to preallocate the size of a categorical array for better performance.
Create a categorical array that has elements with known locations only.

definedIndex = ~isundefined(Location);
newLocation = Location(definedIndex);
summary(newLocation)

St. Mary's Medical Center 23

Expand the size of newLocation so that it is a 200-by-1 categorical array. Set the last new element
to be undefined. All of the other new elements also are set to be undefined. The 23 original
elements keep the values they had.

newLocation(200) = '<undefined>';
summary(newLocation)

St. Mary's Medical Center 23


<undefined> 177

newLocation has room for values you plan to store in the array later.
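The following sketch shows one way the preallocated elements might be filled in later. It uses a small hypothetical array, readings, rather than newLocation, and assumes the values you assign belong to existing categories.

```matlab
% Hypothetical data for illustration.
readings = categorical({'low';'high'});   % 2-by-1 categorical array
readings(5) = '<undefined>';              % grow to 5-by-1; elements 3:5 are undefined

% Fill in values later without changing the size of the array.
readings(3) = 'high';
readings(4) = 'low';

remaining = nnz(isundefined(readings));   % elements still waiting for a value
```

The array keeps its preallocated size while the count of undefined elements drops as values arrive.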

See Also
categorical | categories | summary | any | histogram | removecats | isundefined

Related Examples
• “Create Categorical Arrays” on page 8-2
• “Convert Text in Table Variables to Categorical” on page 8-6
• “Plot Categorical Data” on page 8-10
• “Compare Categorical Array Elements” on page 8-16
• “Work with Protected Categorical Arrays” on page 8-30

More About
• “Advantages of Using Categorical Arrays” on page 8-34
• “Ordinal Categorical Arrays” on page 8-36

Work with Protected Categorical Arrays


This example shows how to work with a categorical array with protected categories.

When you create a categorical array with the categorical function, you have the option of
specifying whether or not the categories are protected. Ordinal categorical arrays always have
protected categories, but you also can create a nonordinal categorical array that is protected using
the 'Protected',true name-value pair argument.

When you assign values that are not in the list of categories of an unprotected array, the array
updates automatically so that its list of categories includes the new values. Similarly, you can
combine (nonordinal) categorical arrays that have different categories. The categories in the result
include the categories from both arrays.

When you assign new values to a protected categorical array, the values must belong to one of the
existing categories. Similarly, you can only combine protected arrays that have the same categories.

• If you want to combine two nonordinal categorical arrays that have protected categories, they
must have the same categories, but the order does not matter. The resulting categorical array
uses the category order from the first array.
• If you want to combine two ordinal categorical arrays (which always have protected categories),
they must have the same categories, including their order.

To add new categories to the array, you must use the function addcats.
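The behavior described above can be sketched for a nonordinal protected array as follows. The color data and variable names are hypothetical; the sketch catches the error so that the remaining statements still run.

```matlab
% Hypothetical color data. 'Protected',true prevents implicit category creation.
colors = categorical({'red';'blue';'red'},{'red','blue'},'Protected',true);

try
    colors(2) = 'green';           % 'green' is not a category, so this errors
catch err
    msg = err.message;             % keep the message instead of stopping
end

colors = addcats(colors,'green');  % add the category explicitly
colors(2) = 'green';               % the assignment now succeeds
```

Catching the error is only for illustration. In practice you would typically call addcats before assigning the new value.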

Create Ordinal Categorical Array

Create a categorical array containing the sizes of 10 objects. Use the names small, medium, and
large for the values 'S', 'M', and 'L'.

A = categorical({'M';'L';'S';'S';'M';'L';'M';'L';'M';'S'},...
    {'S','M','L'},{'small','medium','large'},'Ordinal',true)

A = 10x1 categorical
medium
large
small
small
medium
large
medium
large
medium
small

A is a 10-by-1 categorical array.

Display the categories of A.


categories(A)

ans = 3x1 cell


{'small' }
{'medium'}
{'large' }

Verify That Categories Are Protected

When you create an ordinal categorical array, the categories are always protected.

Use the isprotected function to verify that the categories of A are protected.

tf = isprotected(A)

tf = logical
1

The categories of A are protected.

Assign Value in New Category

If you try to assign a new value that does not belong to one of the existing categories, then MATLAB®
returns an error. For example, you cannot assign the value 'xlarge' to the categorical array, as in
the expression A(2) = 'xlarge', because xlarge is not a category of A. Instead, MATLAB®
returns the error:

Error using categorical/subsasgn (line 68)
Cannot add a new category 'xlarge' to this categorical array
because its categories are protected. Use ADDCATS to
add the new category.

To add a new category for xlarge, use the addcats function. Because A is ordinal, you must specify
the position of the new category in the order.

A = addcats(A,'xlarge','After','large');

Now assign the value 'xlarge'. The assignment succeeds because xlarge is an existing category.

A(2) = 'xlarge'

A = 10x1 categorical
medium
xlarge
small
small
medium
large
medium
large
medium
small

A is now a 10-by-1 categorical array with four categories, such that small < medium < large <
xlarge.

Combine Two Ordinal Categorical Arrays

Create another ordinal categorical array, B, containing the sizes of five items.

B = categorical([2;1;1;2;2],1:2,{'xsmall','small'},'Ordinal',true)

B = 5x1 categorical
small
xsmall
xsmall
small
small

B is a 5-by-1 categorical array with two categories such that xsmall < small.

To combine two ordinal categorical arrays (which always have protected categories), they must have
the same categories and the categories must be in the same order.

Add the category 'xsmall' to A before the category 'small'.

A = addcats(A,'xsmall','Before','small');

categories(A)

ans = 5x1 cell


{'xsmall'}
{'small' }
{'medium'}
{'large' }
{'xlarge'}

Add the categories {'medium','large','xlarge'} to B after the category 'small'.

B = addcats(B,{'medium','large','xlarge'},'After','small');

categories(B)

ans = 5x1 cell


{'xsmall'}
{'small' }
{'medium'}
{'large' }
{'xlarge'}

The categories of A and B are now the same including their order.

Vertically concatenate A and B.

C = [A;B]

C = 15x1 categorical
medium
xlarge
small
small
medium
large
medium
large
medium
small
small
xsmall
xsmall
small
small

The values from B are appended to the values from A.

List the categories of C.

categories(C)

ans = 5x1 cell


{'xsmall'}
{'small' }
{'medium'}
{'large' }
{'xlarge'}

C is a 15-by-1 ordinal categorical array with five categories, such that xsmall < small < medium
< large < xlarge.
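The same pattern can be condensed into a short sketch that checks the category lists before concatenating. The rating data and variable names here are hypothetical.

```matlab
% Hypothetical ratings data.
A = categorical({'low';'high'},{'low','high'},'Ordinal',true);
B = categorical({'mid';'low'},{'low','mid'},'Ordinal',true);

A = addcats(A,'mid','After','low');    % A: low < mid < high
B = addcats(B,'high','After','mid');   % B: low < mid < high

% Confirm that the category lists match, including order, before concatenating.
sameCats = isequal(categories(A),categories(B));
if sameCats
    C = [A;B];
end
```

Checking isequal(categories(A),categories(B)) first is one way to avoid a concatenation error when the inputs come from different sources.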

See Also
categorical | categories | summary | isprotected | isordinal | addcats

Related Examples
• “Create Categorical Arrays” on page 8-2
• “Convert Text in Table Variables to Categorical” on page 8-6
• “Access Data Using Categorical Arrays” on page 8-24
• “Combine Categorical Arrays” on page 8-19
• “Combine Categorical Arrays Using Multiplication” on page 8-22

More About
• “Ordinal Categorical Arrays” on page 8-36

Advantages of Using Categorical Arrays

In this section...
“Natural Representation of Categorical Data” on page 8-34
“Mathematical Ordering for Character Vectors” on page 8-34
“Reduce Memory Requirements” on page 8-34

Natural Representation of Categorical Data


categorical is a data type to store data with values from a finite set of discrete categories. One
common alternative to using categorical arrays is to use character arrays or cell arrays of character
vectors. To compare values in character arrays and cell arrays of character vectors, you must use
strcmp which can be cumbersome. With categorical arrays, you can use the logical operator eq (==)
to compare elements in the same way that you compare numeric arrays. The other common
alternative to using categorical arrays is to store categorical data using integers in numeric arrays.
Using numeric arrays loses all the useful descriptive information from the category names, and also
tends to suggest that the integer values have their usual numeric meaning, which, for categorical
data, they do not.
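As a brief illustration of the comparison described above, this sketch runs the same test on a cell array of character vectors and on the equivalent categorical array. The pet names are hypothetical sample data.

```matlab
names = {'dog';'cat';'dog'};        % hypothetical pet data
isDogCell = strcmp(names,'dog');    % comparison on the cell array

pets = categorical(names);
isDogCat = (pets == 'dog');         % same comparison on the categorical array
```

Both comparisons return the same logical array, but the categorical form reads like a numeric comparison.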

Mathematical Ordering for Character Vectors


Categorical arrays are convenient and memory efficient containers for nonnumeric data with values
from a finite set of discrete categories. They are especially useful when the categories have a
meaningful mathematical ordering, such as an array with entries from the discrete set of categories
{'small','medium','large'} where small < medium < large.

An ordering other than alphabetical order is not possible with character arrays or cell arrays of
character vectors. Thus, inequality comparisons, such as greater and less than, are not possible. With
categorical arrays, you can use relational operations to test for equality and perform element-wise
comparisons that have a meaningful mathematical ordering.
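A minimal sketch of such comparisons, assuming an ordinal array with the categories small < medium < large (the data values are hypothetical):

```matlab
sizes = categorical({'medium';'small';'large';'small'}, ...
    {'small','medium','large'},'Ordinal',true);

belowLarge = sizes < 'large';        % true for small and medium
atLeastMedium = sizes >= 'medium';   % true for medium and large
```

The comparisons follow the category order given when the array was created, not alphabetical order.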

Reduce Memory Requirements


This example shows how to compare the memory required to store data as a cell array of character
vectors versus a categorical array. Categorical arrays have categories that are defined as character
vectors, which can be costly to store and manipulate in a cell array of character vectors or char
array. Categorical arrays store only one copy of each category name, often reducing the amount of
memory required to store the array.

Create a sample cell array of character vectors.

state = [repmat({'MA'},25,1);repmat({'NY'},25,1);...
repmat({'CA'},50,1);...
repmat({'MA'},25,1);repmat({'NY'},25,1)];

Display information about the variable state.

whos state

Name Size Bytes Class Attributes

state 150x1 16200 cell

The variable state is a cell array of character vectors requiring 16,200 bytes of memory.

Convert state to a categorical array.

state = categorical(state);

Display the discrete categories in the variable state.

categories(state)

ans = 3x1 cell


{'CA'}
{'MA'}
{'NY'}

state contains 150 elements, but only three distinct categories.

Display information about the variable state.

whos state

Name Size Bytes Class Attributes

state 150x1 476 categorical

There is a significant reduction in the memory required to store the variable.
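If you want to make this comparison in code rather than by reading the whos display, one possible approach is to capture the output of whos as a structure. The variable names below are hypothetical, and the exact byte counts can vary between MATLAB releases.

```matlab
% Hypothetical data: 150 two-letter codes drawn from three values.
codesCell = repmat({'MA';'NY';'CA'},50,1);
codesCat = categorical(codesCell);

infoCell = whos('codesCell');        % structure with a bytes field
infoCat = whos('codesCat');
bytesSaved = infoCell.bytes - infoCat.bytes;
```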

See Also
categorical | categories

Related Examples
• “Create Categorical Arrays” on page 8-2
• “Convert Text in Table Variables to Categorical” on page 8-6
• “Compare Categorical Array Elements” on page 8-16
• “Access Data Using Categorical Arrays” on page 8-24

More About
• “Ordinal Categorical Arrays” on page 8-36

Ordinal Categorical Arrays


In this section...
“Order of Categories” on page 8-36
“How to Create Ordinal Categorical Arrays” on page 8-36
“Working with Ordinal Categorical Arrays” on page 8-38

Order of Categories
categorical is a data type to store data with values from a finite set of discrete categories, which
can have a natural order. You can specify and rearrange the order of categories in all categorical
arrays. However, you only can treat ordinal categorical arrays as having a mathematical ordering to
their categories. Use an ordinal categorical array if you want to use the functions min, max, or
relational operations, such as greater than and less than.

The discrete set of pet categories {'dog' 'cat' 'bird'} has no meaningful mathematical
ordering. You are free to use any category order and the meaning of the associated data does not
change. For example, pets = categorical({'bird','cat','dog','dog','cat'}) creates a
categorical array and the categories are listed in alphabetical order, {'bird' 'cat' 'dog'}. You
can choose to specify or change the order of the categories to {'dog' 'cat' 'bird'} and the
meaning of the data does not change.

Ordinal categorical arrays contain categories that have a meaningful mathematical ordering. For
example, the discrete set of size categories {'small', 'medium', 'large'} has the
mathematical ordering small < medium < large. The first category listed is the smallest and the
last category is the largest. The order of the categories in an ordinal categorical array affects the
result from relational comparisons of ordinal categorical arrays.
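For the nonordinal pets example, a sketch of changing the category order might look like this. It uses the reordercats function, and the helper variable names are hypothetical.

```matlab
pets = categorical({'bird','cat','dog','dog','cat'});
before = categories(pets);                        % {'bird';'cat';'dog'}
valuesBefore = cellstr(pets);

pets = reordercats(pets,{'dog','cat','bird'});
after = categories(pets);                         % {'dog';'cat';'bird'}

unchanged = isequal(valuesBefore,cellstr(pets));  % element values are the same
```

Only the order of the category list changes. The value of each element stays the same.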

How to Create Ordinal Categorical Arrays


This example shows how to create an ordinal categorical array using the categorical function with
the 'Ordinal',true name-value pair argument.

Ordinal Categorical Array from a Cell Array of Character Vectors

Create an ordinal categorical array, sizes, from a cell array of character vectors, A. Use valueset,
specified as a vector of unique values, to define the categories for sizes.

A = {'medium' 'large';'small' 'medium'; 'large' 'small'};


valueset = {'small', 'medium', 'large'};

sizes = categorical(A,valueset,'Ordinal',true)

sizes = 3x2 categorical


medium large
small medium
large small

sizes is a 3-by-2 ordinal categorical array with three categories such that small < medium <
large. The order of the values in valueset becomes the order of the categories of sizes.

Ordinal Categorical Array from Integers

Create an equivalent categorical array from an array of integers. Use the values 1, 2, and 3 to define
the categories small, medium, and large, respectively.

A2 = [2 3; 1 2; 3 1];
valueset = 1:3;
catnames = {'small','medium','large'};

sizes2 = categorical(A2,valueset,catnames,'Ordinal',true)

sizes2 = 3x2 categorical


medium large
small medium
large small

Compare sizes and sizes2.

isequal(sizes,sizes2)

ans = logical
1

sizes and sizes2 are equivalent categorical arrays with the same ordering of categories.

Convert a Categorical Array from Nonordinal to Ordinal

Create a nonordinal categorical array from the cell array of character vectors, A.

sizes3 = categorical(A)

sizes3 = 3x2 categorical


medium large
small medium
large small

Determine if the categorical array is ordinal.

isordinal(sizes3)

ans = logical
0

sizes3 is a nonordinal categorical array with three categories, {'large','medium','small'}.


The categories of sizes3 are the sorted unique values from A. You must use the input argument,
valueset, to specify a different category order.

Convert sizes3 to an ordinal categorical array, such that small < medium < large.

sizes3 = categorical(sizes3,{'small','medium','large'},'Ordinal',true);

sizes3 is now a 3-by-2 ordinal categorical array equivalent to sizes and sizes2.
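Once an array is ordinal, functions that depend on order become available. The following sketch uses a small hypothetical row vector, not the sizes3 array above.

```matlab
sizes = categorical({'medium','large','small','medium'}, ...
    {'small','medium','large'},'Ordinal',true);

smallest = min(sizes);
largest = max(sizes);
ordered = sort(sizes);     % small, medium, medium, large
```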

Working with Ordinal Categorical Arrays


In order to combine or compare two ordinal categorical arrays, the sets of categories for both input
arrays must be identical, including their order. Furthermore, ordinal categorical arrays are always
protected. Therefore, when you assign values to an ordinal categorical array, the values must belong
to one of the existing categories. For more information, see “Work with Protected Categorical Arrays”
on page 8-30.

See Also
categorical | categories | isordinal | isequal

Related Examples
• “Create Categorical Arrays” on page 8-2
• “Convert Text in Table Variables to Categorical” on page 8-6
• “Compare Categorical Array Elements” on page 8-16
• “Access Data Using Categorical Arrays” on page 8-24

More About
• “Advantages of Using Categorical Arrays” on page 8-34

Core Functions Supporting Categorical Arrays


Many functions in MATLAB operate on categorical arrays in much the same way that they operate on
other arrays. A few of these functions might exhibit special behavior when operating on a categorical
array. If multiple input arguments are ordinal categorical arrays, the function often requires that they
have the same set of categories, including order. Furthermore, a few functions, such as max and gt,
require that the input categorical arrays are ordinal.

The following table lists notable MATLAB functions that operate on categorical arrays in addition to
other arrays.

size      isequal   intersect   plot       double
length    isequaln  ismember    plot3      single
ndims               setdiff     scatter    int8
numel     eq        setxor      scatter3   int16
          ne        unique      bar        int32
isrow     lt        union       barh       int64
iscolumn  le                    histogram  uint8
          ge        times                  uint16
cat       gt                    pie        uint32
horzcat             sort        rose       uint64
vertcat   min       sortrows    stem       char
          max       issorted    stairs     string
          median                area       cellstr
          mode      permute     mesh
                    reshape     surf
                    transpose   surface
                    ctranspose
                                semilogx
                                semilogy
                                fill
                                fill3
                                line
                                text
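As a small sketch of a few of these functions applied to a nonordinal categorical array (the data is hypothetical):

```matlab
c = categorical({'b','a','b','c'});        % hypothetical data

u = unique(c);                             % a, b, c (in category order)
tf = ismember(c,categorical({'a','c'}));   % membership test by category name
n = numel(c);                              % number of elements
names = cellstr(c);                        % convert back to text
```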

9

Tables

• “Create Tables and Assign Data to Them” on page 9-2


• “Add and Delete Table Rows” on page 9-9
• “Add, Delete, and Rearrange Table Variables” on page 9-12
• “Clean Messy and Missing Data in Tables” on page 9-19
• “Modify Units, Descriptions, and Table Variable Names” on page 9-24
• “Add Custom Properties to Tables and Timetables” on page 9-27
• “Access Data in Tables” on page 9-32
• “Calculations on Tables” on page 9-45
• “Split Data into Groups and Calculate Statistics” on page 9-49
• “Split Table Data Variables and Apply Functions” on page 9-52
• “Advantages of Using Tables” on page 9-56
• “Grouping Variables To Split Data” on page 9-61
• “Changes to DimensionNames Property in R2016b” on page 9-64
• “Data Cleaning and Calculations in Tables” on page 9-66
• “Grouped Calculations in Tables and Timetables” on page 9-86

Create Tables and Assign Data to Them


Tables are suitable for column-oriented data such as tabular data from text files or spreadsheets.
Tables store columns of data in variables. The variables in a table can have different data types,
though all of the variables must have the same number of rows. However, table variables are not
restricted to storing only column vectors. For example, a table variable can contain a matrix with
multiple columns as long as it has the same number of rows as the other table variables.

In MATLAB®, you can create tables and assign data to them in several ways.

• Create a table from input arrays by using the table function.


• Add variables to an existing table by using dot notation.
• Assign variables to an empty table.
• Preallocate a table and fill in its data later.
• Convert variables to tables by using the array2table, cell2table, or struct2table
functions.
• Read a table from file by using the readtable function.
• Import a table using the Import Tool.

The way you choose depends on the nature of your data and how you plan to use tables in your code.

Create Tables from Input Arrays

You can create a table from arrays by using the table function. For example, create a small table
with data for five patients.

First, create six column-oriented arrays of data. These arrays have five rows because there are five
patients. (Most of these arrays are 5-by-1 column vectors, while BloodPressure is a 5-by-2 matrix.)

LastName = ["Sanchez";"Johnson";"Zhang";"Diaz";"Brown"];
Age = [38;43;38;40;49];
Smoker = [true;false;true;false;true];
Height = [71;69;64;67;64];
Weight = [176;163;131;133;119];
BloodPressure = [124 93; 109 77; 125 83; 117 75; 122 80];

Now create a table, patients, as a container for the data. In this call to the table function, the
input arguments use the workspace variable names for the names of the variables in patients.

patients = table(LastName,Age,Smoker,Height,Weight,BloodPressure)

patients=5×6 table
    LastName     Age    Smoker    Height    Weight    BloodPressure
    _________    ___    ______    ______    ______    _____________

    "Sanchez"    38     true      71        176       124     93
    "Johnson"    43     false     69        163       109     77
    "Zhang"      38     true      64        131       125     83
    "Diaz"       40     false     67        133       117     75
    "Brown"      49     true      64        119       122     80

The table is a 5-by-6 table because it has six variables. As the BloodPressure variable shows, a
table variable itself can have multiple columns. This example shows why tables have rows and
variables, not rows and columns.
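The distinction between variables and columns can be checked with a short sketch. The table below is hypothetical and smaller than patients.

```matlab
% Hypothetical data: one single-column variable and one two-column variable.
Name = ["A";"B";"C"];
Reading = [1 2; 3 4; 5 6];
T = table(Name,Reading);

sz = size(T);                  % [3 2]: three rows, two variables
numCols = size(T.Reading,2);   % 2: columns inside the Reading variable
```

size counts Reading as one variable even though it holds two columns of data.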

Add Variable to Table Using Dot Notation

Once you have created a table, you can add a new variable at any time by using dot notation. Dot
notation refers to table variables by name, T.varname, where T is the table and varname is the
variable name. This notation is similar to the notation you use to access and assign data to the fields
of a structure.

For example, add a BMI variable to patients. Calculate body mass index, or BMI, using the values in
patients.Weight and patients.Height. Assign the BMI values to a new table variable.

patients.BMI = (patients.Weight*0.453592)./(patients.Height*0.0254).^2

patients=5×7 table
    LastName     Age    Smoker    Height    Weight    BloodPressure     BMI
    _________    ___    ______    ______    ______    _____________    ______

    "Sanchez"    38     true      71        176       124     93       24.547
    "Johnson"    43     false     69        163       109     77       24.071
    "Zhang"      38     true      64        131       125     83       22.486
    "Diaz"       40     false     67        133       117     75       20.831
    "Brown"      49     true      64        119       122     80       20.426

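The constants in the BMI expression appear to be unit conversions: 0.453592 converts pounds to kilograms and 0.0254 converts inches to meters. That reading assumes Weight is in pounds and Height is in inches, which this section does not state. Under that assumption, a sketch with named constants and a hypothetical one-row table reproduces the first BMI value:

```matlab
LB_TO_KG = 0.453592;   % pounds to kilograms (assumed unit of Weight)
IN_TO_M = 0.0254;      % inches to meters (assumed unit of Height)

% First patient from the example: Weight 176, Height 71.
T = table(176,71,'VariableNames',{'Weight','Height'});
T.BMI = (T.Weight*LB_TO_KG)./(T.Height*IN_TO_M).^2;   % kg/m^2
```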
Assign Variables to Empty Table

Another way to create a table is to start with an empty table and assign variables to it. For example,
re-create the table of patient data, but this time assign variables using dot notation.

First, create an empty table, patients2, by calling table without arguments.

patients2 = table

patients2 =

0x0 empty table

Next, create a copy of the patient data by assigning variables. Table variable names do not have to
match array names, as shown by the Name and BP table variables.

patients2.Name = LastName;
patients2.Age = Age;
patients2.Smoker = Smoker;
patients2.Height = Height;
patients2.Weight = Weight;
patients2.BP = BloodPressure

patients2=5×6 table
    Name         Age    Smoker    Height    Weight        BP
    _________    ___    ______    ______    ______    __________

    "Sanchez"    38     true      71        176       124     93
    "Johnson"    43     false     69        163       109     77
    "Zhang"      38     true      64        131       125     83
    "Diaz"       40     false     67        133       117     75
    "Brown"      49     true      64        119       122     80

Preallocate Table and Fill Rows

Sometimes you know the sizes and data types of the data that you want to store in a table, but you
plan to assign the data later. Perhaps you plan to add only a few rows at a time. In that case,
preallocating space in the table and then assigning values to empty rows can be more efficient.

For example, to preallocate space for a table to contain time and temperature readings at different
stations, use the table function. Instead of supplying input arrays, specify the sizes and data types of
the table variables. To give them names, specify the 'VariableNames' argument. Preallocation fills
table variables with default values that are appropriate for their data types.

sz = [4 3];
varTypes = ["double","datetime","string"];
varNames = ["Temperature","Time","Station"];
temps = table('Size',sz,'VariableTypes',varTypes,'VariableNames',varNames)

temps=4×3 table
Temperature Time Station
___________ ____ _________

0 NaT <missing>
0 NaT <missing>
0 NaT <missing>
0 NaT <missing>

One way to assign or add a row to a table is to assign a cell array to a row. If the cell array is a row
vector and its elements match the data types of their respective variables, then the assignment
converts the cell array to a table row. However, you can assign only one row at a time using cell
arrays. Assign values to the first two rows.

temps(1,:) = {75,datetime('now'),"S1"};
temps(2,:) = {68,datetime('now')+1,"S2"}

temps=4×3 table
Temperature Time Station
___________ ____________________ _________

75 01-Sep-2021 16:04:11 "S1"


68 02-Sep-2021 16:04:12 "S2"
0 NaT <missing>
0 NaT <missing>

As an alternative, you can assign rows from a smaller table into a larger table. With this method, you
can assign one or more rows at a time.

temps(3:4,:) = table([63;72],[datetime('now')+2;datetime('now')+3],["S3";"S4"])

temps=4×3 table
    Temperature            Time            Station
    ___________    ____________________    _______

        75         01-Sep-2021 16:04:11     "S1"
        68         02-Sep-2021 16:04:12     "S2"
        63         03-Sep-2021 16:04:12     "S3"
        72         04-Sep-2021 16:04:12     "S4"

You can use either syntax to increase the size of a table by assigning rows beyond the end of the
table. If necessary, missing rows are filled in with default values.

temps(6,:) = {62,datetime('now')+6,"S6"}

temps=6×3 table
Temperature Time Station
___________ ____________________ _________

75 01-Sep-2021 16:04:11 "S1"


68 02-Sep-2021 16:04:12 "S2"
63 03-Sep-2021 16:04:12 "S3"
72 04-Sep-2021 16:04:12 "S4"
0 NaT <missing>
62 07-Sep-2021 16:04:12 "S6"
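When rows arrive one at a time, a loop over a preallocated table is a common pattern. This sketch uses hypothetical variable names and values rather than the temps table.

```matlab
n = 3;
T = table('Size',[n 2], ...
    'VariableTypes',["double","string"], ...
    'VariableNames',["Value","Label"]);

for k = 1:n
    T(k,:) = {k^2,"item" + k};   % assign one row per iteration from a cell array
end
```

Because the table already has n rows, each assignment overwrites default values instead of growing the table.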

Convert Variables to Tables

You can convert variables that have other data types to tables. Cell arrays and structures are other
types of containers that can store arrays that have different data types. So you can convert cell arrays
and structures to tables. You can also convert an array to a table where the table variables contain
columns of values from the array. To convert these kinds of variables, use the array2table,
cell2table, or struct2table functions.

For example, convert an array to a table by using array2table. Arrays do not have column names,
so the table has default variable names.

A = randi(3,3)

A = 3×3

3 3 1
3 2 2
1 1 3

a2t = array2table(A)

a2t=3×3 table
A1 A2 A3
__ __ __

3 3 1
3 2 2
1 1 3

You can provide your own table variable names by using the "VariableNames" name-value
argument.

a2t = array2table(A,"VariableNames",["First","Second","Third"])

a2t=3×3 table
    First    Second    Third
    _____    ______    _____

      3        3         1
      3        2         2
      1        1         3
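The cell2table and struct2table functions follow the same pattern. The sketch below uses hypothetical variables C and S that hold a few names and ages.

```matlab
% Cell array: each row becomes a table row, each column a table variable.
C = {'Sanchez',38; 'Johnson',43};
c2t = cell2table(C,'VariableNames',{'LastName','Age'});

% Scalar structure: each field becomes a table variable.
S.LastName = {'Zhang';'Diaz'};
S.Age = [38;40];
s2t = struct2table(S);

combined = [c2t;s2t];   % both tables have the same variable names
```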

Read Table from File

It is common to have a large quantity of tabular data in a file such as a CSV (comma-separated value)
file or an Excel® spreadsheet. To read such data into a table, use the readtable function.

For example, the CSV file outages.csv is a sample file that is distributed with MATLAB. The file
contains data for a set of electrical power outages. The first line of outages.csv has column names.
The rest of the file has comma-separated data values for each outage. The first few lines are shown
here.

Region,OutageTime,Loss,Customers,RestorationTime,Cause
SouthWest,2002-02-01 12:18,458.9772218,1820159.482,2002-02-07 16:50,winter storm
SouthEast,2003-01-23 00:49,530.1399497,212035.3001,,winter storm
SouthEast,2003-02-07 21:15,289.4035493,142938.6282,2003-02-17 08:14,winter storm
West,2004-04-06 05:44,434.8053524,340371.0338,2004-04-06 06:10,equipment fault
MidWest,2002-03-16 06:18,186.4367788,212754.055,2002-03-18 23:23,severe storm
...

To read outages.csv and store the data in a table, you can use readtable. It reads numeric
values, dates and times, and strings into table variables that have appropriate data types. Here, Loss
and Customers are numeric arrays. The OutageTime and RestorationTime variables are
datetime arrays because readtable recognizes the date and time formats of the text in those
columns of the input file. To read the rest of the text data into string arrays, specify the "TextType"
name-value argument.

outages = readtable("outages.csv","TextType","string")

outages=1468×6 table
Region OutageTime Loss Customers RestorationTime Cause
___________ ________________ ______ __________ ________________ ______________

"SouthWest" 2002-02-01 12:18 458.98 1.8202e+06 2002-02-07 16:50 "winter storm"


"SouthEast" 2003-01-23 00:49 530.14 2.1204e+05 NaT "winter storm"
"SouthEast" 2003-02-07 21:15 289.4 1.4294e+05 2003-02-17 08:14 "winter storm"
"West" 2004-04-06 05:44 434.81 3.4037e+05 2004-04-06 06:10 "equipment fau
"MidWest" 2002-03-16 06:18 186.44 2.1275e+05 2002-03-18 23:23 "severe storm"
"West" 2003-06-18 02:49 0 0 2003-06-18 10:54 "attack"
"West" 2004-06-20 14:39 231.29 NaN 2004-06-20 19:16 "equipment fau
"West" 2002-06-06 19:28 311.86 NaN 2002-06-07 00:51 "equipment fau
"NorthEast" 2003-07-16 16:23 239.93 49434 2003-07-17 01:12 "fire"
"MidWest" 2004-09-27 11:09 286.72 66104 2004-09-27 16:37 "equipment fau
"SouthEast" 2004-09-05 17:48 73.387 36073 2004-09-05 20:46 "equipment fau
"West" 2004-05-21 21:45 159.99 NaN 2004-05-22 04:23 "equipment fau
"SouthEast" 2002-09-01 18:22 95.917 36759 2002-09-01 19:12 "severe storm"
"SouthEast" 2003-09-27 07:32 NaN 3.5517e+05 2003-10-04 07:02 "severe storm"
"West" 2003-11-12 06:12 254.09 9.2429e+05 2003-11-17 02:04 "winter storm"
"NorthEast" 2004-09-18 05:54 0 0 NaT "equipment fau

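To experiment with readtable without the sample file, one option is to write a small table to a temporary file and read it back. The file name, variable names, and values in this sketch are hypothetical.

```matlab
fname = [tempname '.csv'];     % temporary file for hypothetical data
T = table(["east";"west"],[1.5;2.5],'VariableNames',["Region","Loss"]);
writetable(T,fname);

R = readtable(fname,"TextType","string");
delete(fname);                 % clean up the temporary file

regionIsString = isstring(R.Region);   % text is read as a string array
lossIsNumeric = isnumeric(R.Loss);     % numbers are read as a numeric array
```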
Import Table Using Import Tool

Finally, you can interactively preview and import data from spreadsheets or delimited text files by
using the Import Tool. There are two ways to open the Import Tool.

• MATLAB Toolstrip: On the Home tab, in the Variable section, click Import Data.
• MATLAB command prompt: Enter uiimport(filename), where filename is the name of a text
or spreadsheet file.

For example, open the outages.csv sample file by using uiimport and which to get the path to
the file.

uiimport(which("outages.csv"))

The Import Tool shows you a preview of the six columns from outages.csv. To import the data as a
table, follow these steps.

1 In the Imported Data section, select Table as the output type.


2 Click Import Selection (near the upper-right corner). The new table, named outages, appears
in your workspace.

See Also
readtable | table | array2table | cell2table | struct2table | Import Tool

Related Examples
• “Access Data in Tables” on page 9-32
• “Add and Delete Table Rows” on page 9-9
• “Add, Delete, and Rearrange Table Variables” on page 9-12
• “Clean Messy and Missing Data in Tables” on page 9-19
• “Modify Units, Descriptions, and Table Variable Names” on page 9-24
• “Advantages of Using Tables” on page 9-56

Add and Delete Table Rows


This example shows how to add and delete rows in a table. You can also edit tables using the
Variables Editor.

Load Sample Data

Load the sample patients data and create a table, T.

load patients
T = table(LastName,Gender,Age,Height,Weight,Smoker,Systolic,Diastolic);
size(T)

ans = 1×2

100 8

The table, T, has 100 rows and eight variables (columns).

Add Rows by Concatenation

Read data on more patients from a comma-delimited file, morePatients.csv, into a table, T2. Then,
append the rows from T2 to the end of the table, T.

T2 = readtable('morePatients.csv');
Tnew = [T;T2];
size(Tnew)

ans = 1×2

104 8

The table Tnew has 104 rows. In order to vertically concatenate two tables, both tables must have the
same number of variables, with the same variable names. If the variable names are different, you can
directly assign new rows in a table to rows from another table. For example, T(end+1:end+4,:) =
T2.
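A minimal sketch of the assignment form, using two hypothetical tables whose variable names differ:

```matlab
T = table([1;2],["a";"b"],'VariableNames',["Id","Name"]);
T2 = table([3;4],["c";"d"],'VariableNames',["Key","Label"]);

T(end+1:end+2,:) = T2;   % append the rows of T2; T keeps its variable names
```

The assignment still requires that each variable in T2 can be stored in the corresponding variable of T.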

Add Rows from Cell Array

To append new rows stored in a cell array, vertically concatenate the cell array onto the end of the
table. You can concatenate directly from a cell array when it has the right number of columns and the
contents of its cells can be concatenated onto the corresponding table variables.

cellPatients = {'Edwards','Male',42,70,158,0,116,83;
'Falk','Female',28,62,125,1,120,71};
Tnew = [Tnew;cellPatients];
size(Tnew)

ans = 1×2

106 8

You also can convert a cell array to a table using the cell2table function.

Add Rows from Structure

You also can append new rows stored in a structure. Convert the structure to a table, and then
concatenate the tables.

structPatients(1,1).LastName = 'George';
structPatients(1,1).Gender = 'Male';
structPatients(1,1).Age = 45;
structPatients(1,1).Height = 76;
structPatients(1,1).Weight = 182;
structPatients(1,1).Smoker = 1;
structPatients(1,1).Systolic = 132;
structPatients(1,1).Diastolic = 85;

structPatients(2,1).LastName = 'Hadley';
structPatients(2,1).Gender = 'Female';
structPatients(2,1).Age = 29;
structPatients(2,1).Height = 58;
structPatients(2,1).Weight = 120;
structPatients(2,1).Smoker = 0;
structPatients(2,1).Systolic = 112;
structPatients(2,1).Diastolic = 70;

Tnew = [Tnew;struct2table(structPatients)];
size(Tnew)

ans = 1×2

108 8

Omit Duplicate Rows

To omit any rows in a table that are duplicated, use the unique function.

Tnew = unique(Tnew);
size(Tnew)

ans = 1×2

106 8

unique deleted two duplicate rows.
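Note that unique also returns the remaining rows in sorted order. If the original row order matters, the 'stable' option is one way to keep it, as in this sketch with a hypothetical table:

```matlab
T = table(["b";"a";"b"],[2;1;2],'VariableNames',["Key","Value"]);

U = unique(T);             % duplicates removed, rows sorted
Us = unique(T,'stable');   % duplicates removed, original order kept
```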

Delete Rows by Row Number

Delete rows 18, 20, and 21 from the table.

Tnew([18,20,21],:) = [];
size(Tnew)

ans = 1×2

103 8

The table contains information on 103 patients now.

Delete Rows by Row Name

First, specify the variable of identifiers, LastName, as row names. Then, delete the variable,
LastName, from Tnew. Finally, use the row name to index and delete rows.

Tnew.Properties.RowNames = Tnew.LastName;
Tnew.LastName = [];
Tnew('Smith',:) = [];
size(Tnew)

ans = 1×2

102 7

The table now has one less row and one less variable.

Search for Rows to Delete

You also can search for observations in the table. For example, delete rows for any patients under the
age of 30.

toDelete = Tnew.Age < 30;


Tnew(toDelete,:) = [];
size(Tnew)

ans = 1×2

85 7

The table now has 17 fewer rows.

See Also
table | readtable | array2table | cell2table | struct2table

Related Examples
• “Add, Delete, and Rearrange Table Variables” on page 9-12
• “Clean Messy and Missing Data in Tables” on page 9-19

Add, Delete, and Rearrange Table Variables


This example shows how to add, delete, and rearrange column-oriented variables in a table. You can
add, move, and delete table variables using the addvars, movevars, and removevars functions. As
alternatives, you also can modify table variables using dot syntax or by indexing into the table. Use
the splitvars and mergevars functions to split multicolumn variables and combine multiple
variables into one. Finally, you can reorient a table so that the rows of the table become variables of
an output table, using the rows2vars function.

You also can modify table variables using the Variables Editor.

Load Sample Data and Create Tables

Load arrays of sample data from the patients MAT-file. Display the names and sizes of the variables
loaded into the workspace.

load patients
whos -file patients

Name Size Bytes Class Attributes

Age 100x1 800 double
Diastolic 100x1 800 double
Gender 100x1 11412 cell
Height 100x1 800 double
LastName 100x1 11616 cell
Location 100x1 14208 cell
SelfAssessedHealthStatus 100x1 11540 cell
Smoker 100x1 100 logical
Systolic 100x1 800 double
Weight 100x1 800 double

Create two tables. Create one table, T, with information collected from a patient questionnaire and
create another table, T2, with data measured from patients. Each table has 100 rows.

T = table(Age,Gender,Smoker);
T2 = table(Height,Weight,Systolic,Diastolic);

Display the first five rows of each table.

head(T,5)

ans=5×3 table
Age Gender Smoker
___ __________ ______

38 {'Male' } true
43 {'Male' } false
38 {'Female'} false
40 {'Female'} false
49 {'Female'} false

head(T2,5)

ans=5×4 table
Height Weight Systolic Diastolic
______ ______ ________ _________

71 176 124 93
69 163 109 77
64 131 125 83
67 133 117 75
64 119 122 80

Add Variables Concatenated from Another Table

Add variables to the table T by horizontally concatenating it with T2.


T = [T T2];

Display the first five rows of T.


head(T,5)

ans=5×7 table
Age Gender Smoker Height Weight Systolic Diastolic
___ __________ ______ ______ ______ ________ _________

38 {'Male' } true 71 176 124 93
43 {'Male' } false 69 163 109 77
38 {'Female'} false 64 131 125 83
40 {'Female'} false 67 133 117 75
49 {'Female'} false 64 119 122 80

The table T now has 7 variables and 100 rows.

If the tables that you are horizontally concatenating have row names, horzcat concatenates the
tables by matching the row names. Therefore, the tables must use the same row names, but the row
order does not matter.
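
A minimal sketch of row-name matching, using two hypothetical tables whose rows are stored in different orders:

```matlab
% Two hypothetical tables with the same row names in different orders.
A = table([1;2;3],'VariableNames',{'X'},'RowNames',{'a';'b';'c'});
B = table([30;10;20],'VariableNames',{'Y'},'RowNames',{'c';'a';'b'});

% Horizontal concatenation pairs up rows by row name, not by position.
C = [A B];

% Each value of Y ends up in the row that has the matching name.
yForA = C{'a','Y'};
yForC = C{'c','Y'};
```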

Add Variable from Workspace to Table

Add the names of patients from the workspace variable LastName before the first table variable in T.
You can specify any location in the table using the name of a variable near the new location. Use
quotation marks to refer to the names of table variables. However, do not use quotation marks for
input arguments that are workspace variables.
T = addvars(T,LastName,'Before','Age');
head(T,5)

ans=5×8 table
LastName Age Gender Smoker Height Weight Systolic Diastolic
____________ ___ __________ ______ ______ ______ ________ _________

{'Smith' } 38 {'Male' } true 71 176 124 93
{'Johnson' } 43 {'Male' } false 69 163 109 77
{'Williams'} 38 {'Female'} false 64 131 125 83
{'Jones' } 40 {'Female'} false 67 133 117 75
{'Brown' } 49 {'Female'} false 64 119 122 80

You also can specify locations in a table using numbers. For example, the equivalent syntax using a
number to specify location is T = addvars(T,LastName,'Before',1).
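
The sketch below applies both ways of specifying a location to a small hypothetical table. It also uses the 'NewVariableNames' option of addvars to give the added variable a name that differs from the workspace variable; the name HeightIn is an assumption made for illustration.

```matlab
% Hypothetical two-row table and workspace variables.
T = table([38;43],{'Male';'Female'},'VariableNames',{'Age','Gender'});
LastName = {'Smith';'Johnson'};
Height = [71;69];

% Location specified by variable name.
T = addvars(T,LastName,'Before','Age');

% Location specified by number; the new variable gets an explicit name.
T = addvars(T,Height,'After',2,'NewVariableNames','HeightIn');

T.Properties.VariableNames
```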


Add Variables Using Dot Syntax

An alternative way to add new table variables is to use dot syntax. When you use dot syntax, you
always add the new variable as the last table variable. You can add a variable that has any data type,
as long as it has the same number of rows as the table.

Create a new variable for blood pressure as a horizontal concatenation of the two variables
Systolic and Diastolic. Add it to T.

T.BloodPressure = [Systolic Diastolic];
head(T,5)

ans=5×9 table
LastName Age Gender Smoker Height Weight Systolic Diastolic B
____________ ___ __________ ______ ______ ______ ________ _________ _

{'Smith' } 38 {'Male' } true 71 176 124 93
{'Johnson' } 43 {'Male' } false 69 163 109 77
{'Williams'} 38 {'Female'} false 64 131 125 83
{'Jones' } 40 {'Female'} false 67 133 117 75
{'Brown' } 49 {'Female'} false 64 119 122 80

T now has 9 variables and 100 rows. A table variable can have multiple columns. So although
BloodPressure has two columns, it is one table variable.

Add a new variable, BMI, in the table T, that contains the body mass index for each patient. BMI is a
function of height and weight. When you calculate BMI, you can refer to the Weight and Height
variables that are in T.

T.BMI = (T.Weight*0.453592)./(T.Height*0.0254).^2;

The operators ./ and .^ in the calculation of BMI indicate element-wise division and exponentiation,
respectively.
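
To make the unit conversions explicit, this sketch works through the calculation for one patient, and then for a short vector of patients. It assumes, as the factors in the formula suggest, that weight is in pounds and height is in inches, so that 0.453592 converts pounds to kilograms and 0.0254 converts inches to meters. The values are those of the first two patients in the table.

```matlab
% One patient: 176 lb and 71 in.
weightKg = 176*0.453592;       % pounds to kilograms
heightM = 71*0.0254;           % inches to meters
bmi = weightKg/heightM^2;      % kilograms per square meter

% Several patients: the element-wise operators apply the formula to each row.
Weight = [176;163];
Height = [71;69];
bmiAll = (Weight*0.453592)./(Height*0.0254).^2;
```

The scalar result agrees with the first element of the vector result, because the element-wise operators repeat the scalar calculation for each row.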

Display the first five rows of the table T.

head(T,5)

ans=5×10 table
LastName Age Gender Smoker Height Weight Systolic Diastolic B
____________ ___ __________ ______ ______ ______ ________ _________ _

{'Smith' } 38 {'Male' } true 71 176 124 93
{'Johnson' } 43 {'Male' } false 69 163 109 77
{'Williams'} 38 {'Female'} false 64 131 125 83
{'Jones' } 40 {'Female'} false 67 133 117 75
{'Brown' } 49 {'Female'} false 64 119 122 80

Move Variable in Table

Move the table variable BMI using the movevars function, so that it is after the variable Weight.
When you specify table variables by name, use quotation marks.

T = movevars(T,'BMI','After','Weight');
head(T,5)


ans=5×10 table
LastName Age Gender Smoker Height Weight BMI Systolic Dias
____________ ___ __________ ______ ______ ______ ______ ________ ____

{'Smith' } 38 {'Male' } true 71 176 24.547 124 9
{'Johnson' } 43 {'Male' } false 69 163 24.071 109 7
{'Williams'} 38 {'Female'} false 64 131 22.486 125 8
{'Jones' } 40 {'Female'} false 67 133 20.831 117 7
{'Brown' } 49 {'Female'} false 64 119 20.426 122 8

You also can specify locations in a table using numbers. For example, the equivalent syntax using a
number to specify location is T = movevars(T,'BMI','After',6). It is often more convenient to
refer to variables by name.

Move Table Variable Using Indexing

As an alternative, you can move table variables by indexing. You can index into a table using the same
syntax you use for indexing into a matrix.

Move BloodPressure so that it is next to BMI.

T = T(:,[1:7 10 8 9]);
head(T,5)

ans=5×10 table
LastName Age Gender Smoker Height Weight BMI BloodPressure
____________ ___ __________ ______ ______ ______ ______ _____________

{'Smith' } 38 {'Male' } true 71 176 24.547 124 93
{'Johnson' } 43 {'Male' } false 69 163 24.071 109 77
{'Williams'} 38 {'Female'} false 64 131 22.486 125 83
{'Jones' } 40 {'Female'} false 67 133 20.831 117 75
{'Brown' } 49 {'Female'} false 64 119 20.426 122 80

In a table with many variables, it is often more convenient to use the movevars function.

Delete Variables

To delete table variables, use the removevars function. Delete the Systolic and Diastolic table
variables.

T = removevars(T,{'Systolic','Diastolic'});
head(T,5)

ans=5×8 table
LastName Age Gender Smoker Height Weight BMI BloodPressure
____________ ___ __________ ______ ______ ______ ______ _____________

{'Smith' } 38 {'Male' } true 71 176 24.547 124 93
{'Johnson' } 43 {'Male' } false 69 163 24.071 109 77
{'Williams'} 38 {'Female'} false 64 131 22.486 125 83
{'Jones' } 40 {'Female'} false 67 133 20.831 117 75
{'Brown' } 49 {'Female'} false 64 119 20.426 122 80


Delete Variable Using Dot Syntax

As an alternative, you can delete variables using dot syntax and the empty matrix, []. Remove the
Age variable from the table.
T.Age = [];
head(T,5)

ans=5×7 table
LastName Gender Smoker Height Weight BMI BloodPressure
____________ __________ ______ ______ ______ ______ _____________

{'Smith' } {'Male' } true 71 176 24.547 124 93
{'Johnson' } {'Male' } false 69 163 24.071 109 77
{'Williams'} {'Female'} false 64 131 22.486 125 83
{'Jones' } {'Female'} false 67 133 20.831 117 75
{'Brown' } {'Female'} false 64 119 20.426 122 80

Delete Variable Using Indexing

You also can delete variables using indexing and the empty matrix, []. Remove the Gender variable
from the table.
T(:,'Gender') = [];
head(T,5)

ans=5×6 table
LastName Smoker Height Weight BMI BloodPressure
____________ ______ ______ ______ ______ _____________

{'Smith' } true 71 176 24.547 124 93
{'Johnson' } false 69 163 24.071 109 77
{'Williams'} false 64 131 22.486 125 83
{'Jones' } false 67 133 20.831 117 75
{'Brown' } false 64 119 20.426 122 80

Split and Merge Table Variables

To split multicolumn table variables into variables that each have one column, use the splitvars
function. Split the variable BloodPressure into two variables.

T = splitvars(T,'BloodPressure','NewVariableNames',{'Systolic','Diastolic'});
head(T,5)

ans=5×7 table
LastName Smoker Height Weight BMI Systolic Diastolic
____________ ______ ______ ______ ______ ________ _________

{'Smith' } true 71 176 24.547 124 93
{'Johnson' } false 69 163 24.071 109 77
{'Williams'} false 64 131 22.486 125 83
{'Jones' } false 67 133 20.831 117 75
{'Brown' } false 64 119 20.426 122 80

Similarly, you can group related table variables together in one variable, using the mergevars
function. Combine Systolic and Diastolic back into one variable, and name it BP.


T = mergevars(T,{'Systolic','Diastolic'},'NewVariableName','BP');
head(T,5)

ans=5×6 table
LastName Smoker Height Weight BMI BP
____________ ______ ______ ______ ______ __________

{'Smith' } true 71 176 24.547 124 93
{'Johnson' } false 69 163 24.071 109 77
{'Williams'} false 64 131 22.486 125 83
{'Jones' } false 67 133 20.831 117 75
{'Brown' } false 64 119 20.426 122 80

Reorient Rows To Become Variables

You can reorient the rows of a table or timetable, so that they become the variables in the output
table, using the rows2vars function. However, if the table has multicolumn variables, then you must
split them before you can call rows2vars.

Reorient the rows of T. Specify that the names of the patients in T are the names of table variables in
the output table. The first variable of T3 contains the names of the variables of T. Each remaining
variable of T3 contains the data from the corresponding row of T.
T = splitvars(T,'BP','NewVariableNames',{'Systolic','Diastolic'});
T3 = rows2vars(T,'VariableNamesSource','LastName');
T3(:,1:5)

ans=6×5 table
OriginalVariableNames Smith Johnson Williams Jones
_____________________ ______ _______ ________ ______

{'Smoker' } 1 0 0 0
{'Height' } 71 69 64 67
{'Weight' } 176 163 131 133
{'BMI' } 24.547 24.071 22.486 20.831
{'Systolic' } 124 109 125 117
{'Diastolic'} 93 77 83 75

You can use dot syntax with T3 to access patient data as an array. However, if the row values of an
input table cannot be concatenated, then the variables of the output table are cell arrays.
T3.Smith

ans = 6×1

1.0000
71.0000
176.0000
24.5467
124.0000
93.0000
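
The following sketch contrasts the two cases described above, using small hypothetical tables. When every row holds values that can be concatenated, the output variables are ordinary arrays. When a row mixes numbers and text, the output variables are cell arrays. The code indexes output variables by position, because it does not rely on the default names that rows2vars assigns.

```matlab
% All-numeric table: the values in each row can be concatenated.
Tnum = table([1;2],[3;4],'VariableNames',{'P','Q'});
Rnum = rows2vars(Tnum);
firstNum = Rnum.(2);        % data from the first row of Tnum

% Mixed table: a row holds a number and text, which cannot be concatenated.
Tmix = table([1;2],{'a';'b'},'VariableNames',{'Num','Txt'});
Rmix = rows2vars(Tmix);
firstMix = Rmix.(2);        % data from the first row of Tmix

class(firstNum)
class(firstMix)
```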

See Also
table | addvars | movevars | removevars | splitvars | mergevars | inner2outer |
rows2vars


Related Examples
• “Add and Delete Table Rows” on page 9-9
• “Clean Messy and Missing Data in Tables” on page 9-19
• “Modify Units, Descriptions, and Table Variable Names” on page 9-24


Clean Messy and Missing Data in Tables


This example shows how to find, clean, and delete table rows with missing data.

Load Sample Data

Load sample data from a comma-separated text file, messy.csv. The file contains many different
missing data indicators:

• Empty character vector ('')
• Period (.)
• NA
• NaN
• -99

To specify the character vectors to treat as empty values, use the 'TreatAsEmpty' name-value pair
argument with the readtable function. (Use the disp function to display all 21 rows, even when
running this example as a live script.)
T = readtable('messy.csv','TreatAsEmpty',{'.','NA'});
disp(T)

A B C D E
________ ____ __________ ____ ____

{'afe1'} 3 {'yes' } 3 3
{'egh3'} NaN {'no' } 7 7
{'wth4'} 3 {'yes' } 3 3
{'atn2'} 23 {'no' } 23 23
{'arg1'} 5 {'yes' } 5 5
{'jre3'} 34.6 {'yes' } 34.6 34.6
{'wen9'} 234 {'yes' } 234 234
{'ple2'} 2 {'no' } 2 2
{'dbo8'} 5 {'no' } 5 5
{'oii4'} 5 {'yes' } 5 5
{'wnk3'} 245 {'yes' } 245 245
{'abk6'} 563 {0x0 char} 563 563
{'pnj5'} 463 {'no' } 463 463
{'wnn3'} 6 {'no' } 6 6
{'oks9'} 23 {'yes' } 23 23
{'wba3'} NaN {'yes' } NaN 14
{'pkn4'} 2 {'no' } 2 2
{'adw3'} 22 {'no' } 22 22
{'poj2'} -99 {'yes' } -99 -99
{'bas8'} 23 {'no' } 23 23
{'gry5'} NaN {'yes' } NaN 21

T is a table with 21 rows and five variables. 'TreatAsEmpty' only applies to numeric columns in the
file and cannot handle numeric values specified as text, such as '-99'.

Summarize Table

View the data type, description, units, and other descriptive statistics for each variable by creating a
table summary using the summary function.
summary(T)


Variables:

A: 21x1 cell array of character vectors

B: 21x1 double

Values:

Min -99
Median 14
Max 563
NumMissing 3

C: 21x1 cell array of character vectors

D: 21x1 double

Values:

Min -99
Median 7
Max 563
NumMissing 2

E: 21x1 double

Values:

Min -99
Median 14
Max 563

When you import data from a file, the default is for readtable to read any variables with
nonnumeric elements as a cell array of character vectors.

Find Rows with Missing Values

Display the subset of rows from the table, T, that have at least one missing value.

TF = ismissing(T,{'' '.' 'NA' NaN -99});
rowsWithMissing = T(any(TF,2),:);
disp(rowsWithMissing)

A B C D E
________ ___ __________ ___ ___

{'egh3'} NaN {'no' } 7 7
{'abk6'} 563 {0x0 char} 563 563
{'wba3'} NaN {'yes' } NaN 14
{'poj2'} -99 {'yes' } -99 -99
{'gry5'} NaN {'yes' } NaN 21

readtable replaced '.' and 'NA' with NaN in the numeric variables, B, D, and E.
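
Because messy.csv might not be at hand, here is a minimal sketch of the same search on a small in-memory table. The table contents are hypothetical. When you supply a list of indicators, ismissing treats only the listed values as missing, which is why NaN appears in the list explicitly.

```matlab
% Hypothetical table with three kinds of missing-value indicators.
ID = {'a1';'b2';'c3'};
Score = [3;-99;NaN];
Flag = {'yes';'';'no'};
T = table(ID,Score,Flag);

% One logical value per table element; true marks a missing value.
TF = ismissing(T,{'' -99 NaN});

% Keep the rows that have at least one missing value.
rowsWithMissing = T(any(TF,2),:);
```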

Replace Missing Value Indicators

Clean the data so that the missing values indicated by code -99 have the standard MATLAB®
numeric missing value indicator, NaN.


T = standardizeMissing(T,-99);
disp(T)

A B C D E
________ ____ __________ ____ ____

{'afe1'} 3 {'yes' } 3 3
{'egh3'} NaN {'no' } 7 7
{'wth4'} 3 {'yes' } 3 3
{'atn2'} 23 {'no' } 23 23
{'arg1'} 5 {'yes' } 5 5
{'jre3'} 34.6 {'yes' } 34.6 34.6
{'wen9'} 234 {'yes' } 234 234
{'ple2'} 2 {'no' } 2 2
{'dbo8'} 5 {'no' } 5 5
{'oii4'} 5 {'yes' } 5 5
{'wnk3'} 245 {'yes' } 245 245
{'abk6'} 563 {0x0 char} 563 563
{'pnj5'} 463 {'no' } 463 463
{'wnn3'} 6 {'no' } 6 6
{'oks9'} 23 {'yes' } 23 23
{'wba3'} NaN {'yes' } NaN 14
{'pkn4'} 2 {'no' } 2 2
{'adw3'} 22 {'no' } 22 22
{'poj2'} NaN {'yes' } NaN NaN
{'bas8'} 23 {'no' } 23 23
{'gry5'} NaN {'yes' } NaN 21

standardizeMissing replaces three instances of -99 with NaN.

Create a new table, T2, and replace missing values with values from previous rows of the table.
fillmissing provides a number of ways to fill in missing values.

T2 = fillmissing(T,'previous');
disp(T2)

A B C D E
________ ____ _______ ____ ____

{'afe1'} 3 {'yes'} 3 3
{'egh3'} 3 {'no' } 7 7
{'wth4'} 3 {'yes'} 3 3
{'atn2'} 23 {'no' } 23 23
{'arg1'} 5 {'yes'} 5 5
{'jre3'} 34.6 {'yes'} 34.6 34.6
{'wen9'} 234 {'yes'} 234 234
{'ple2'} 2 {'no' } 2 2
{'dbo8'} 5 {'no' } 5 5
{'oii4'} 5 {'yes'} 5 5
{'wnk3'} 245 {'yes'} 245 245
{'abk6'} 563 {'yes'} 563 563
{'pnj5'} 463 {'no' } 463 463
{'wnn3'} 6 {'no' } 6 6
{'oks9'} 23 {'yes'} 23 23
{'wba3'} 23 {'yes'} 23 14
{'pkn4'} 2 {'no' } 2 2
{'adw3'} 22 {'no' } 22 22
{'poj2'} 22 {'yes'} 22 22
{'bas8'} 23 {'no' } 23 23
{'gry5'} 23 {'yes'} 23 21
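
As a brief illustration of some other fill methods, this sketch applies fillmissing to a short numeric vector rather than to the table. The vector is hypothetical, and which method is appropriate depends on the data.

```matlab
% Hypothetical vector with two consecutive missing values.
x = [1 NaN NaN 4];

xPrev = fillmissing(x,'previous');     % copy the previous nonmissing value
xNext = fillmissing(x,'next');         % copy the next nonmissing value
xLin = fillmissing(x,'linear');        % interpolate between neighboring values
xZero = fillmissing(x,'constant',0);   % substitute a fixed value
```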

Remove Rows with Missing Values

Create a new table, T3, that contains only the rows from T without missing values. T3 has only 16
rows.
T3 = rmmissing(T);
disp(T3)

A B C D E
________ ____ _______ ____ ____

{'afe1'} 3 {'yes'} 3 3
{'wth4'} 3 {'yes'} 3 3
{'atn2'} 23 {'no' } 23 23
{'arg1'} 5 {'yes'} 5 5
{'jre3'} 34.6 {'yes'} 34.6 34.6
{'wen9'} 234 {'yes'} 234 234
{'ple2'} 2 {'no' } 2 2
{'dbo8'} 5 {'no' } 5 5
{'oii4'} 5 {'yes'} 5 5
{'wnk3'} 245 {'yes'} 245 245
{'pnj5'} 463 {'no' } 463 463
{'wnn3'} 6 {'no' } 6 6
{'oks9'} 23 {'yes'} 23 23
{'pkn4'} 2 {'no' } 2 2
{'adw3'} 22 {'no' } 22 22
{'bas8'} 23 {'no' } 23 23

T3 contains 16 rows and five variables.
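
By default, rmmissing removes a row if any of its values is missing. The sketch below, on a small hypothetical table, shows two options that narrow that behavior: 'DataVariables' limits the check to selected variables, and 'MinNumMissing' removes a row only when it has at least the specified number of missing values.

```matlab
% Hypothetical table: row 2 and row 3 each have one NaN, and row 4 has two.
A = [1;NaN;3;NaN];
B = [10;20;NaN;NaN];
T = table(A,B);

Rany = rmmissing(T);                      % remove rows with any missing value
RA = rmmissing(T,'DataVariables','A');    % check only variable A
Rtwo = rmmissing(T,'MinNumMissing',2);    % remove rows with two or more missing values
```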

Organize Data

Sort the rows of T2 in descending order by C, and then in ascending order by A. Assign the sorted
table to T3.

T3 = sortrows(T2,{'C','A'},{'descend','ascend'});
disp(T3)

A B C D E
________ ____ _______ ____ ____

{'abk6'} 563 {'yes'} 563 563
{'afe1'} 3 {'yes'} 3 3
{'arg1'} 5 {'yes'} 5 5
{'gry5'} 23 {'yes'} 23 21
{'jre3'} 34.6 {'yes'} 34.6 34.6
{'oii4'} 5 {'yes'} 5 5
{'oks9'} 23 {'yes'} 23 23
{'poj2'} 22 {'yes'} 22 22
{'wba3'} 23 {'yes'} 23 14
{'wen9'} 234 {'yes'} 234 234
{'wnk3'} 245 {'yes'} 245 245
{'wth4'} 3 {'yes'} 3 3
{'adw3'} 22 {'no' } 22 22
{'atn2'} 23 {'no' } 23 23
{'bas8'} 23 {'no' } 23 23
{'dbo8'} 5 {'no' } 5 5
{'egh3'} 3 {'no' } 7 7
{'pkn4'} 2 {'no' } 2 2
{'ple2'} 2 {'no' } 2 2
{'pnj5'} 463 {'no' } 463 463
{'wnn3'} 6 {'no' } 6 6

In C, the rows are grouped first by 'yes', followed by 'no'. Then in A, the rows are listed
alphabetically.
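
A minimal sketch of the same two-key sort on a small hypothetical table. The second key matters only for rows that tie on the first key.

```matlab
% Hypothetical table with an identifier A and a yes/no answer C.
A = {'b';'a';'c'};
C = {'no';'yes';'yes'};
T = table(A,C);

% Sort by C in descending order, and break ties by A in ascending order.
S = sortrows(T,{'C','A'},{'descend','ascend'});
```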

Reorder the table so that A and C are next to each other.

T3 = T3(:,{'A','C','B','D','E'});
disp(T3)

A C B D E
________ _______ ____ ____ ____

{'abk6'} {'yes'} 563 563 563
{'afe1'} {'yes'} 3 3 3
{'arg1'} {'yes'} 5 5 5
{'gry5'} {'yes'} 23 23 21
{'jre3'} {'yes'} 34.6 34.6 34.6
{'oii4'} {'yes'} 5 5 5
{'oks9'} {'yes'} 23 23 23
{'poj2'} {'yes'} 22 22 22
{'wba3'} {'yes'} 23 23 14
{'wen9'} {'yes'} 234 234 234
{'wnk3'} {'yes'} 245 245 245
{'wth4'} {'yes'} 3 3 3
{'adw3'} {'no' } 22 22 22
{'atn2'} {'no' } 23 23 23
{'bas8'} {'no' } 23 23 23
{'dbo8'} {'no' } 5 5 5
{'egh3'} {'no' } 3 7 7
{'pkn4'} {'no' } 2 2 2
{'ple2'} {'no' } 2 2 2
{'pnj5'} {'no' } 463 463 463
{'wnn3'} {'no' } 6 6 6

See Also
readtable | summary | ismissing | sortrows | standardizeMissing | rmmissing |
fillmissing

Related Examples
• “Add and Delete Table Rows” on page 9-9
• “Add, Delete, and Rearrange Table Variables” on page 9-12
• “Modify Units, Descriptions, and Table Variable Names” on page 9-24
• “Access Data in Tables” on page 9-32
• “Missing Data in MATLAB”
• “Data Cleaning and Calculations in Tables” on page 9-66
• “Grouped Calculations in Tables and Timetables” on page 9-86


Modify Units, Descriptions, and Table Variable Names


This example shows how to access and modify table properties for variable units, descriptions and
names. You also can edit these property values using the Variables Editor.

Load Sample Data

Load the sample patients data and create a table.

load patients
BloodPressure = [Systolic Diastolic];

T = table(Gender,Age,Height,Weight,Smoker,BloodPressure);

Display the first five rows of the table, T.

T(1:5,:)

ans=5×6 table
Gender Age Height Weight Smoker BloodPressure
__________ ___ ______ ______ ______ _____________

{'Male' } 38 71 176 true 124 93
{'Male' } 43 69 163 false 109 77
{'Female'} 38 64 131 false 125 83
{'Female'} 40 67 133 false 117 75
{'Female'} 49 64 119 false 122 80

T has 100 rows and 6 variables.

Add Variable Units

Specify units for each variable in the table by modifying the table property, VariableUnits. Specify
the variable units as a cell array of character vectors.

T.Properties.VariableUnits = {'' 'Yrs' 'In' 'Lbs' '' ''};

An individual empty character vector within the cell array indicates that the corresponding variable
does not have units.

Add a Variable Description for a Single Variable

Add a variable description for the variable, BloodPressure. Assign a single character vector to the
element of the cell array containing the description for BloodPressure.

T.Properties.VariableDescriptions{'BloodPressure'} = 'Systolic/Diastolic';

You can use the variable name, 'BloodPressure', or the numeric index of the variable, 6, to index
into the cell array of character vectors containing the variable descriptions.
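
The following sketch shows both forms of indexing on a small hypothetical table. The description text is made up for illustration.

```matlab
% Hypothetical two-variable table.
Age = [38;43];
Height = [71;69];
T = table(Age,Height);

T.Properties.VariableUnits = {'Yrs','In'};

% Index into the descriptions by variable name ...
T.Properties.VariableDescriptions{'Age'} = 'Age in years';

% ... or by the numeric index of the variable.
T.Properties.VariableDescriptions{2} = 'Height in inches';

T.Properties.VariableDescriptions
```

Indexing by name is likely to be more robust, because the code keeps working if the variables are later reordered.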

Summarize the Table

View the data type, description, units, and other descriptive statistics for each variable by using
summary to summarize the table.

summary(T)


Variables:

Gender: 100x1 cell array of character vectors

Age: 100x1 double

Properties:
Units: Yrs
Values:

Min 25
Median 39
Max 50

Height: 100x1 double

Properties:
Units: In
Values:

Min 60
Median 67
Max 72

Weight: 100x1 double

Properties:
Units: Lbs
Values:

Min 111
Median 142.5
Max 202

Smoker: 100x1 logical

Values:

True 34
False 66

BloodPressure: 100x2 double

Properties:
Description: Systolic/Diastolic
Values:
Column 1 Column 2
________ ________

Min 109 68
Median 122 81.5
Max 138 99

The BloodPressure variable has a description, and the Age, Height, and Weight variables have
units.


Change a Variable Name

Change the variable name for the first variable from Gender to Sex.

T.Properties.VariableNames{'Gender'} = 'Sex';

Display the first five rows of the table, T.

T(1:5,:)

ans=5×6 table
Sex Age Height Weight Smoker BloodPressure
__________ ___ ______ ______ ______ _____________

{'Male' } 38 71 176 true 124 93
{'Male' } 43 69 163 false 109 77
{'Female'} 38 64 131 false 125 83
{'Female'} 40 67 133 false 117 75
{'Female'} 49 64 119 false 122 80

In addition to properties for variable units, descriptions and names, there are table properties for row
and dimension names, a table description, and user data.
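
As a sketch of those other properties, the code below sets each of them on a small hypothetical table. The description, dimension names, and user data are made-up values. After you set the dimension names, you can use them with dot syntax to get the row names or the concatenated data.

```matlab
% Hypothetical two-row table.
T = table([38;43],[71;69],'VariableNames',{'Age','Height'});

T.Properties.Description = 'Hypothetical patient sample';
T.Properties.RowNames = {'Smith';'Johnson'};
T.Properties.DimensionNames = {'Patient','Data'};
T.Properties.UserData = struct('Source','example');

names = T.Patient;   % row names, accessed through the first dimension name
data = T.Data;       % all variables concatenated, as with T{:,:}
```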

See Also
readtable | table | array2table | cell2table | struct2table | summary

Related Examples
• “Add, Delete, and Rearrange Table Variables” on page 9-12
• “Access Data in Tables” on page 9-32


Add Custom Properties to Tables and Timetables


This example shows how to add custom properties to tables and timetables, set and access their
values, and remove them.

All tables and timetables have properties that contain metadata about them or their variables. You
can access these properties through the T.Properties object, where T is the name of the table or
timetable. For example, T.Properties.VariableNames returns a cell array containing the names
of the variables of T.

The properties you access through T.Properties are part of the definitions of the table and
timetable data types. You cannot add or remove these predefined properties. But starting in
R2018b, you can add and remove your own custom properties, by modifying the
T.Properties.CustomProperties object of a table or timetable.

Add Properties

Read power outage data into a table. Sort it using the first variable that contains dates and times,
OutageTime. Then display the first three rows.
T = readtable('outages.csv');
T = sortrows(T,'OutageTime');
head(T,3)

ans=3×6 table
Region OutageTime Loss Customers RestorationTime Cause
_____________ ________________ ______ __________ ________________ ____________

{'SouthWest'} 2002-02-01 12:18 458.98 1.8202e+06 2002-02-07 16:50 {'winter sto
{'MidWest' } 2002-03-05 17:53 96.563 2.8666e+05 2002-03-10 14:41 {'wind'
{'MidWest' } 2002-03-16 06:18 186.44 2.1275e+05 2002-03-18 23:23 {'severe sto

Display its properties. These are the properties that all tables have in common. Note that there is also
a CustomProperties object, but that by default it has no properties.
T.Properties

ans =
TableProperties with properties:

Description: ''
UserData: []
DimensionNames: {'Row' 'Variables'}
VariableNames: {1x6 cell}
VariableDescriptions: {}
VariableUnits: {}
VariableContinuity: []
RowNames: {}
CustomProperties: No custom properties are set.
Use addprop and rmprop to modify CustomProperties.

To add custom properties, use the addprop function. Specify the names of the properties. For each
property, also specify whether it has metadata for the whole table (similar to the Description
property) or for its variables (similar to the VariableNames property). If the property has variable
metadata, then its value must be a vector whose length is equal to the number of variables.


Add custom properties that contain an output file name, file type, and indicators of which variables to
plot. Best practice is to assign the input table as the output argument of addprop, so that the custom
properties are part of the same table. Specify that the output file name and file type are table
metadata using the 'table' option. Specify that the plot indicators are variable metadata using the
'variable' option.
T = addprop(T,{'OutputFileName','OutputFileType','ToPlot'}, ...
{'table','table','variable'});
T.Properties

ans =
TableProperties with properties:

Description: ''
UserData: []
DimensionNames: {'Row' 'Variables'}
VariableNames: {1x6 cell}
VariableDescriptions: {}
VariableUnits: {}
VariableContinuity: []
RowNames: {}

Custom Properties (access using t.Properties.CustomProperties.<name>):
OutputFileName: []
OutputFileType: []
ToPlot: []

Set and Access Values of Custom Properties

When you add custom properties using addprop, their values are empty arrays by default. You can
set and access the values of the custom properties using dot syntax.

Set the output file name and type. These properties contain metadata for the table. Then assign a
logical array to the ToPlot property. This property contains metadata for the variables. In this
example, the elements of the value of the ToPlot property are true for each variable to be included
in a plot, and false for each variable to be excluded.
T.Properties.CustomProperties.OutputFileName = 'outageResults';
T.Properties.CustomProperties.OutputFileType = '.mat';
T.Properties.CustomProperties.ToPlot = [false false true true true false];
T.Properties

ans =
TableProperties with properties:

Description: ''
UserData: []
DimensionNames: {'Row' 'Variables'}
VariableNames: {1x6 cell}
VariableDescriptions: {}
VariableUnits: {}
VariableContinuity: []
RowNames: {}

Custom Properties (access using t.Properties.CustomProperties.<name>):
OutputFileName: 'outageResults'
OutputFileType: '.mat'
ToPlot: [0 0 1 1 1 0]

Plot variables from T in a stacked plot using the stackedplot function. To plot only the Loss,
Customers, and RestorationTime values, use the ToPlot custom property as the second input
argument.

stackedplot(T,T.Properties.CustomProperties.ToPlot);

When you move or delete table variables, both the predefined and custom properties are reordered so
that their values correspond to the same variables. In this example, the values of the ToPlot custom
property stay aligned with the variables marked for plotting, just as the values of the
VariableNames predefined property stay aligned.
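
This realignment can be sketched without the outages data. The example below uses a small hypothetical table with one per-variable custom property, and then deletes and moves variables.

```matlab
% Hypothetical three-variable table with a per-variable custom property.
T = table([1;2],[3;4],[5;6],'VariableNames',{'A','B','C'});
T = addprop(T,{'ToPlot'},{'variable'});
T.Properties.CustomProperties.ToPlot = [true false false];

% Deleting B also deletes the ToPlot value that belonged to B.
T.B = [];
afterDelete = T.Properties.CustomProperties.ToPlot;

% Moving C in front of A reorders the ToPlot values in the same way.
T = movevars(T,'C','Before','A');
afterMove = T.Properties.CustomProperties.ToPlot;
```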

Remove the Customers variable and display the properties.

T.Customers = [];
T.Properties

ans =
TableProperties with properties:

Description: ''
UserData: []
DimensionNames: {'Row' 'Variables'}
VariableNames: {1x5 cell}
VariableDescriptions: {}
VariableUnits: {}
VariableContinuity: []
RowNames: {}

Custom Properties (access using t.Properties.CustomProperties.<name>):
OutputFileName: 'outageResults'
OutputFileType: '.mat'
ToPlot: [0 0 1 1 0]

Convert the table to a timetable, using the outage times as row times. Move Region to the end of the
table, and RestorationTime before the first variable, using the movevars function. Note that the
properties are reordered appropriately. The RestorationTime and Loss variables still have
indicators for inclusion in a plot.

T = table2timetable(T);
T = movevars(T,'Region','After','Cause');
T = movevars(T,'RestorationTime','Before',1);
T.Properties

ans =
TimetableProperties with properties:

Description: ''
UserData: []
DimensionNames: {'OutageTime' 'Variables'}
VariableNames: {'RestorationTime' 'Loss' 'Cause' 'Region'}
VariableDescriptions: {}
VariableUnits: {}
VariableContinuity: []
RowTimes: [1468x1 datetime]
StartTime: 2002-02-01 12:18
SampleRate: NaN
TimeStep: NaN

Custom Properties (access using t.Properties.CustomProperties.<name>):
OutputFileName: 'outageResults'
OutputFileType: '.mat'
ToPlot: [1 1 0 0]

Remove Properties

You can remove any or all of the custom properties of a table using the rmprop function. However,
you cannot use it to remove predefined properties from T.Properties, because those properties are
part of the definition of the table data type.

Remove the OutputFileName and OutputFileType custom properties. Display the remaining table
properties.

T = rmprop(T,{'OutputFileName','OutputFileType'});
T.Properties

ans =
TimetableProperties with properties:

Description: ''
UserData: []
DimensionNames: {'OutageTime' 'Variables'}
VariableNames: {'RestorationTime' 'Loss' 'Cause' 'Region'}
VariableDescriptions: {}
VariableUnits: {}
VariableContinuity: []
RowTimes: [1468x1 datetime]
StartTime: 2002-02-01 12:18
SampleRate: NaN
TimeStep: NaN

Custom Properties (access using t.Properties.CustomProperties.<name>):
ToPlot: [1 1 0 0]

See Also
readtable | table | head | addprop | table2timetable | movevars | rmprop | sortrows |
stackedplot

Related Examples
• “Modify Units, Descriptions, and Table Variable Names” on page 9-24
• “Access Data in Tables” on page 9-32
• “Add, Delete, and Rearrange Table Variables” on page 9-12


Access Data in Tables


In this section...
“Summary of Table Indexing Syntaxes” on page 9-32
“Tables Containing Specified Rows and Variables” on page 9-35
“Extract Data Using Dot Notation and Logical Values” on page 9-38
“Dot Notation with Any Variable Name or Expression” on page 9-40
“Extract Data from Specified Rows and Variables” on page 9-42

A table is a container that stores column-oriented data in variables. Table variables can have different
data types and sizes as long as all variables have the same number of rows. Table variables have
names, just as the fields of a structure have names. The rows of a table can have names, but row
names are not required. To access table data, index into the rows and variables using either their
names or numeric indices.

Typical reasons for indexing into tables include:

• Reordering or removing rows and variables.
• Adding arrays as new rows or variables.
• Extracting arrays of data to use as input arguments to functions.

Summary of Table Indexing Syntaxes


Depending on the type of indexing you use, you can access either a subtable or an array extracted
from the table. Indexing with:

• Smooth parentheses, (), returns a table that has selected rows and variables.
• Dot notation returns the contents of a variable as an array.
• Curly braces, {}, returns an array concatenated from the contents of selected rows and
variables.

You can specify rows and variables by name, numeric index, or data type. Starting in R2019b,
variable names and row names can include any characters, including spaces and non-ASCII
characters. Also, they can start with any characters, not just letters. Variable and row names do not
have to be valid MATLAB identifiers (as determined by the isvarname function).
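
As a short sketch of names that are not valid identifiers, the following code creates a table whose variable names contain slashes and a space. The names and values are hypothetical, and the code assumes R2019b or later. You cannot write such names directly after the dot, so the sketch uses dot notation with parentheses, or indexes with the name in quotation marks.

```matlab
% Hypothetical table whose variable names are not valid MATLAB identifiers.
T = table([1;2;3],[4;5;6],'VariableNames',{'2019/06/30','Mean Value'});

tf = isvarname('2019/06/30');   % false, because the name is not a valid identifier

x = T.('2019/06/30');           % dot notation with the name inside parentheses
y = T{:,'Mean Value'};          % curly braces with the name as a subscript
```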


For each type of output, this summary lists the syntax, how to specify rows and variables, and
examples.

Type of output: Table, containing specified rows and variables
Syntax: T(rows,vars)
Rows, specified as:
• Row numbers (between 1 and m)
• Names, if T has row names
• Times, if T is a timetable
• Colon (:), meaning all rows
Variables, specified as:
• Variable numbers (between 1 and n)
• Names
• Colon (:), meaning all variables
Examples:
• T(1:5,[1 4 5])
  Table having the first five rows and the first, fourth, and fifth variables of T
• T(:,{'A','B'})
  Table having all rows and the variables named 'A' and 'B' from T

Type of output: Table, containing variables that have specified data type
Syntax: S = vartype(type); T(rows,S)
Rows, specified as:
• Row numbers (between 1 and m)
• Names, if T has row names
• Times, if T is a timetable
• Colon (:), meaning all rows
Variables, specified as a data type, such as 'numeric', 'categorical', or 'datetime'
Examples:
• S = vartype('numeric'); T(1:5,S)
  Table having the first five rows and the numeric variables of T

Type of output: Array, extracting data from one variable
Syntax: T.var or T.(expression)
Rows: Not specified
Variables, specified as:
• A variable name (without quotation marks)
• An expression inside parentheses that returns a variable name or number
Examples:
• T.Date
  Array extracted from table variable named 'Date'
• T.('2019/06/30')
  Array extracted from table variable named '2019/06/30'
• T.(1)
  Array extracted from the first table variable

Type of output: Array, extracting data from one variable and specified rows
Syntax: T.var(rows) or T.(expression)(rows)
Rows, specified as numeric or logical indices of the array
Variables, specified as:
• A variable name (without quotation marks)
• An expression inside parentheses that returns a variable name or number
Examples:
• T.Date(1:5)
  First five rows of array extracted from table variable named 'Date'
• T.('2019/06/30')(1:5)
  First five rows of array extracted from table variable named '2019/06/30'
• T.(1)(1:5)
  First five rows of array extracted from the first table variable

Type of output: Array, concatenating data from specified rows and variables
Syntax: T{rows,vars}
Rows, specified as:
• Row numbers (between 1 and m)
• Names, if T has row names
• Times, if T is a timetable
• Colon (:), meaning all rows
Variables, specified as:
• Variable numbers (between 1 and n)
• Names
• Colon (:), meaning all variables
Examples:
• T{1:5,[1 4 5]}
  Array concatenated from the first five rows and the first, fourth, and fifth variables of T
• T{:,{'A','B'}}
  Array concatenated from all rows and the variables named 'A' and 'B' from T

Type of output: Array, concatenating data from specified rows and variables with specified data type
Syntax: S = vartype(type); T{rows,S}
Rows, specified as:
• Row numbers (between 1 and m)
• Names, if T has row names
• Times, if T is a timetable
• Colon (:), meaning all rows
Variables, specified as a data type, such as 'numeric', 'categorical', or 'datetime'
Examples:
• S = vartype('numeric'); T{1:5,S}
  Array concatenated from the first five rows and the numeric variables of T

Type of output: Array, concatenating data from all rows and variables
Syntax: T.Variables
Rows: Not specified
Variables: Not specified
Examples:
• T.Variables
  Identical to array returned by T{:,:}

Tables Containing Specified Rows and Variables


Load sample data for 100 patients from the patients MAT-file to workspace variables.

load patients
whos

Name Size Bytes Class Attributes

Age 100x1 800 double
Diastolic 100x1 800 double
Gender 100x1 11412 cell
Height 100x1 800 double
LastName 100x1 11616 cell
Location 100x1 14208 cell
SelfAssessedHealthStatus 100x1 11540 cell
Smoker 100x1 100 logical
Systolic 100x1 800 double
Weight 100x1 800 double

Create a table and populate it with the Age, Gender, Height, Weight, and Smoker workspace
variables. Use the unique identifiers in LastName as row names. T is a 100-by-5 table. (Row names
do not count as a table variable.)

T = table(Age,Gender,Height,Weight,Smoker,...
'RowNames',LastName)

T=100×5 table
Age Gender Height Weight Smoker
___ __________ ______ ______ ______

Smith 38 {'Male' } 71 176 true
Johnson 43 {'Male' } 69 163 false
Williams 38 {'Female'} 64 131 false
Jones 40 {'Female'} 67 133 false
Brown 49 {'Female'} 64 119 false
Davis 46 {'Female'} 68 142 false
Miller 33 {'Female'} 64 142 true
Wilson 40 {'Male' } 68 180 false
Moore 28 {'Male' } 68 183 false
Taylor 31 {'Female'} 66 132 false
Anderson 45 {'Female'} 68 128 false
Thomas 42 {'Female'} 66 137 false
Jackson 25 {'Male' } 71 174 false
White 39 {'Male' } 72 202 true
Harris 36 {'Female'} 65 129 false
Martin 48 {'Male' } 71 181 true

Index Using Numeric Indices

Create a subtable containing the first five rows and all the variables from T. To specify the desired
rows and variables, use numeric indices within parentheses. This type of indexing is similar to
indexing into numeric arrays.
T1 = T(1:5,:)

T1=5×5 table
Age Gender Height Weight Smoker
___ __________ ______ ______ ______

Smith 38 {'Male' } 71 176 true
Johnson 43 {'Male' } 69 163 false
Williams 38 {'Female'} 64 131 false
Jones 40 {'Female'} 67 133 false
Brown 49 {'Female'} 64 119 false

T1 is a 5-by-5 table.

In addition to numeric indices, you can use row or variable names inside the parentheses. (In this
case, using row indices and a colon is more compact than using row or variable names.)

Index Using Names

Select all the data for the patients with the last names 'Williams' and 'Brown'. Since T has row
names that are the last names of patients, index into T using row names.
T2 = T({'Williams','Brown'},:)

T2=2×5 table
Age Gender Height Weight Smoker
___ __________ ______ ______ ______

Williams 38 {'Female'} 64 131 false
Brown 49 {'Female'} 64 119 false

T2 is a 2-by-5 table.

You also can select variables by name. Create a table that has only the first five rows of T and the
Height and Weight variables. Display it.

T3 = T(1:5,{'Height','Weight'})

T3=5×2 table
Height Weight
______ ______

Smith 71 176
Johnson 69 163
Williams 64 131
Jones 67 133
Brown 64 119

Table variable names do not have to be valid MATLAB identifiers. They can include spaces and
non-ASCII characters, and can start with any character.

Add a variable whose name includes spaces and a hyphen to T. Then index into T using variable names.
T = addvars(T,SelfAssessedHealthStatus,'NewVariableNames','Self-Assessed Health Status');
T(1:5,{'Age','Smoker','Self-Assessed Health Status'})

ans=5×3 table
Age Smoker Self-Assessed Health Status
___ ______ ___________________________

Smith 38 true {'Excellent'}
Johnson 43 false {'Fair' }
Williams 38 false {'Good' }
Jones 40 false {'Fair' }
Brown 49 false {'Good' }
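
To check whether a name needs quotation marks or special handling, you can test it with isvarname. The following sketch assumes R2019b or later and uses a hypothetical variable name, 'Start Date', chosen only for illustration.

```matlab
% Illustrative table; 'Start Date' is a hypothetical variable name.
T = table([38;43],[1;2],'VariableNames',{'Age','Start Date'});

validAge   = isvarname('Age');          % true: a valid MATLAB identifier
validStart = isvarname('Start Date');   % false: the name contains a space

% Quoted names work inside parentheses and curly braces either way.
sub = T(:,{'Age','Start Date'});
arr = T{:,'Start Date'};
```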

Specify Data Type Subscript

Instead of specifying variables using names or numbers, you can create a data type subscript that
matches all variables having the same data type.

First, create a data type subscript to match numeric table variables.


S = vartype('numeric')

S =
table vartype subscript:

Select table variables matching the type 'numeric'

See Access Data in a Table.

Create a table that has only the numeric variables, and only the first five rows, from T.
T4 = T(1:5,S)

T4=5×3 table
Age Height Weight
___ ______ ______

Smith 38 71 176
Johnson 43 69 163
Williams 38 64 131
Jones 40 67 133
Brown 49 64 119
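
The same data type subscript works with curly braces. This sketch uses a small illustrative table (the values are assumptions) to show that vartype('numeric') selects the double variables and skips the cell array and logical variables, consistent with the T4 result above.

```matlab
% Illustrative table with double, cell, and logical variables.
T = table([38;43],{'Male';'Female'},[true;false],[176;131], ...
    'VariableNames',{'Age','Gender','Smoker','Weight'});

S = vartype('numeric');

Tnum  = T(:,S);                            % subtable of numeric variables
names = Tnum.Properties.VariableNames;     % names of the selected variables
A     = T{:,S};                            % same data as a numeric array
```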

Extract Data Using Dot Notation and Logical Values


Create a table from the patients MAT-file. Then use dot notation to extract data from table
variables. You can also index using logical indices generated from values in a table variable that meet
a condition.

load patients

T = table(Age,Gender,Height,Weight,Smoker,...
'RowNames',LastName);

Extract Data from Variable

To extract data from one variable, use dot notation. Extract numeric values from the variable Weight.
Then plot a histogram of those values.

histogram(T.Weight)
title('Patient Weight')

T.Weight is a double-precision column vector with 100 rows.

Select Rows with Logical Indexing

You can index into an array or a table using an array of logical indices. Typically, you use a logical
expression that determines which values in a table variable meet a condition. The result of the
expression is an array of logical indices.

For example, create logical indices matching patients whose age is less than 40.

rows = T.Age < 40

rows = 100x1 logical array

1
0
1
0
0
0
1
0
1
1

To extract heights for patients whose age is less than 40, index into the Height variable using rows.
There are 56 patients younger than 40.

T.Height(rows)

ans = 56×1

71
64
64
68
66
71
72
65
69
69

You can index into a table with logical indices. Display the rows of T for the patients who are younger
than 40.

T(rows,:)

ans=56×5 table
Age Gender Height Weight Smoker
___ __________ ______ ______ ______

Smith 38 {'Male' } 71 176 true
Williams 38 {'Female'} 64 131 false
Miller 33 {'Female'} 64 142 true
Moore 28 {'Male' } 68 183 false
Taylor 31 {'Female'} 66 132 false
Jackson 25 {'Male' } 71 174 false
White 39 {'Male' } 72 202 true
Harris 36 {'Female'} 65 129 false
Thompson 32 {'Male' } 69 191 true
Garcia 27 {'Female'} 69 131 true
Martinez 37 {'Male' } 70 179 false
Rodriguez 39 {'Female'} 64 117 false
Walker 28 {'Female'} 65 123 true
Hall 25 {'Male' } 70 189 false
Allen 39 {'Female'} 63 143 false
Young 25 {'Female'} 63 114 false

You can match multiple conditions with one logical expression. Display the rows for smoking patients
younger than 40.

rows = (T.Smoker==true & T.Age<40);


T(rows,:)

ans=18×5 table
Age Gender Height Weight Smoker
___ __________ ______ ______ ______

Smith 38 {'Male' } 71 176 true
Miller 33 {'Female'} 64 142 true
White 39 {'Male' } 72 202 true
Thompson 32 {'Male' } 69 191 true
Garcia 27 {'Female'} 69 131 true
Walker 28 {'Female'} 65 123 true
King 30 {'Male' } 67 186 true
Nelson 33 {'Male' } 66 180 true
Mitchell 39 {'Male' } 71 164 true
Turner 37 {'Male' } 70 194 true
Sanders 33 {'Female'} 67 115 true
Price 31 {'Male' } 72 178 true
Jenkins 28 {'Male' } 69 189 true
Long 39 {'Male' } 68 182 true
Patterson 37 {'Female'} 65 120 true
Flores 31 {'Female'} 66 141 true
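
A logical index can also be counted or used to look up row names. The following sketch uses the first five patients shown earlier as an inline table. Because Smoker is already logical, the comparison with true is optional.

```matlab
% First five patients from the example, entered inline for this sketch.
Age    = [38;43;38;40;49];
Smoker = [true;false;false;false;false];
T = table(Age,Smoker, ...
    'RowNames',{'Smith','Johnson','Williams','Jones','Brown'});

rows  = T.Smoker & T.Age < 40;          % smokers younger than 40
n     = nnz(rows);                      % number of matching rows
names = T.Properties.RowNames(rows);    % row names of the matches
```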

Dot Notation with Any Variable Name or Expression


When you index using dot notation, there are two ways to specify a variable.

• By name, without quotation marks. For example, T.Date specifies a variable named 'Date'.
• By an expression, where the expression is enclosed by parentheses after the dot. For example,
T.('Start Date') specifies a variable named 'Start Date'.

Use the first syntax when a table variable name also happens to be a valid MATLAB® identifier. (A
valid identifier starts with a letter and includes only letters, digits, and underscores.)

Use the second syntax when you specify:

• A number that indicates the position of the variable in the table.
• A variable name that isn't a valid MATLAB identifier.
• A function whose output is the name of a variable in the table, or a variable you add to the table.
The output of the function must be a character vector or a string scalar.

For example, create a table from the patients MAT-file. Then use dot notation to access the
contents of table variables.

load patients

T = table(Age,Gender,Height,Weight,Smoker,...
'RowNames',LastName);

To specify a variable by position in the table, use a number. Age is the first variable in T, so use the
number 1 to specify its position.

T.(1)

ans = 100×1

38
43
38
40
49
46
33
40
28
31

To specify a variable by name, you can enclose it in quotation marks. Since 'Age' is a valid identifier,
you can specify it using either T.Age or T.('Age').

T.('Age')

ans = 100×1

38
43
38
40
49
46
33
40
28
31

You can specify table variable names that are not valid MATLAB identifiers. Variable names can
include spaces and non-ASCII characters, and can start with any character. However, when you use
dot notation to access a table variable with such a name, you must specify it using parentheses.

Add a variable whose name includes spaces and a hyphen to T.

T = addvars(T,SelfAssessedHealthStatus,'NewVariableNames','Self-Assessed Health Status');


T(1:5,:)

ans=5×6 table
Age Gender Height Weight Smoker Self-Assessed Health Status
___ __________ ______ ______ ______ ___________________________

Smith 38 {'Male' } 71 176 true {'Excellent'}
Johnson 43 {'Male' } 69 163 false {'Fair' }
Williams 38 {'Female'} 64 131 false {'Good' }
Jones 40 {'Female'} 67 133 false {'Fair' }
Brown 49 {'Female'} 64 119 false {'Good' }

Access the new table variable using dot notation. Display the first five elements.

C = T.('Self-Assessed Health Status');


C(1:5)

ans = 5x1 cell


{'Excellent'}
{'Fair' }
{'Good' }
{'Fair' }
{'Good' }

You also can use the output of a function as a variable name. Delete the
T.('Self-Assessed Health Status') variable. Then replace it with a variable whose name includes
today's date.

T.('Self-Assessed Health Status') = [];


T.(string(datetime('today')) + ' Self Report') = SelfAssessedHealthStatus;
T(1:5,:)

ans=5×6 table
Age Gender Height Weight Smoker 01-Sep-2021 Self Report
___ __________ ______ ______ ______ _______________________

Smith 38 {'Male' } 71 176 true {'Excellent'}
Johnson 43 {'Male' } 69 163 false {'Fair' }
Williams 38 {'Female'} 64 131 false {'Good' }
Jones 40 {'Female'} 67 133 false {'Fair' }
Brown 49 {'Female'} 64 119 false {'Good' }
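
One common use of the parenthesized form is looping over variables whose names are stored in another variable. This is a minimal sketch on an illustrative table. The loop variable k and the output array m are names introduced here for the example.

```matlab
% Illustrative table; the values are assumptions for this sketch.
T = table([38;43;38],[71;69;64],[176;163;131], ...
    'VariableNames',{'Age','Height','Weight'});

names = T.Properties.VariableNames;
m = zeros(1,numel(names));
for k = 1:numel(names)
    m(k) = mean(T.(names{k}));   % variable name supplied by an expression
    fprintf('%s: mean = %.4g\n',names{k},m(k));
end

samePos = isequal(T.(2),T.Height);   % position 2 and the name Height match
```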

Extract Data from Specified Rows and Variables


Indexing with curly braces extracts data from a table and returns an array, not a subtable.
Otherwise, you specify rows and variables using numbers, names, and data type subscripts, just as
you do when you index using smooth parentheses. If you extract values from multiple table
variables, then the variables must have data types that allow them to be concatenated together.

Specify Rows and Variables

Create a table from numeric and logical arrays from the patients file.

load patients

T = table(Age,Height,Weight,Smoker,...
'RowNames',LastName);

Extract data from multiple variables in T. Unlike dot notation, indexing with curly braces can extract
values from multiple table variables and concatenate them into one array.

Extract the height and weight for the first five patients. Use numeric indices to select the first five
rows, and variable names to select the variables Height and Weight.

A = T{1:5,{'Height','Weight'}}

A = 5×2

71 176
69 163
64 131
67 133
64 119

A is a 5-by-2 numeric array, not a table.

If you specify one variable name, then curly brace indexing results in the same array you can get with
dot notation. However, you must specify both rows and variables when you use curly brace indexing.
For example, the syntaxes T.Height and T{:,'Height'} return the same array.

Extract Data from All Rows and Variables

If all the table variables have data types that allow them to be concatenated together, then you can
use the T.Variables syntax to put all the table data into an array. This syntax is equivalent to
T{:,:} where the colons indicate all rows and all variables.

A2 = T.Variables

A2 = 100×4

38 71 176 1
43 69 163 0
38 64 131 0
40 67 133 0
49 64 119 0
46 68 142 0
33 64 142 1
40 68 180 0
28 68 183 0
31 66 132 0
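
The equivalences described in this section can be checked directly. The following sketch uses a small illustrative table with two double variables and one logical variable. When they are concatenated, the logical values are converted to double, as in the A2 output above.

```matlab
% Illustrative table with double and logical variables.
T = table([38;43],[71;69],[true;false], ...
    'VariableNames',{'Age','Height','Smoker'});

h1 = T.Height;        % dot notation
h2 = T{:,'Height'};   % curly braces with one variable name

A = T.Variables;      % all data as one array
B = T{:,:};           % equivalent curly brace syntax

disp(class(A))        % class of the concatenated array
```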

See Also
table | histogram | addvars | vartype

Related Examples
• “Advantages of Using Tables” on page 9-56
• “Create Tables and Assign Data to Them” on page 9-2
• “Modify Units, Descriptions, and Table Variable Names” on page 9-24
• “Calculations on Tables” on page 9-45
• “Data Cleaning and Calculations in Tables” on page 9-66
• “Grouped Calculations in Tables and Timetables” on page 9-86
• “Find Array Elements That Meet a Condition” on page 5-2

Calculations on Tables
This example shows how to perform calculations on tables.

The functions rowfun and varfun each apply a specified function to a table, yet many other
functions require numeric or homogeneous arrays as input arguments. You can extract data from
individual variables using dot indexing or from one or more variables using curly braces. The
extracted data is then an array that you can use as input to other functions. Starting in R2018a, you
also can use the groupsummary function for calculations on groups of data in a table.

Read Sample Data into Table

Read data from a comma-separated text file, testScores.csv, into a table using the readtable
function. testScores.csv contains test scores for several students. Use the student names in the
first column of the text file as row names in the table.

T = readtable('testScores.csv','ReadRowNames',true)

T=10×4 table
Gender Test1 Test2 Test3
__________ _____ _____ _____

HOWARD {'male' } 90 87 93
WARD {'male' } 87 85 83
TORRES {'male' } 86 85 88
PETERSON {'female'} 75 80 72
GRAY {'female'} 89 86 87
RAMIREZ {'female'} 96 92 98
JAMES {'male' } 78 75 77
WATSON {'female'} 91 94 92
BROOKS {'female'} 86 83 85
KELLY {'male' } 79 76 82

T is a table with 10 rows and four variables.

Summarize the Table

View the data type, description, units, and other descriptive statistics for each variable by using the
summary function to summarize the table.

summary(T)

Variables:

Gender: 10x1 cell array of character vectors

Test1: 10x1 double

Values:

Min 75
Median 86.5
Max 96

Test2: 10x1 double

Values:

Min 75
Median 85
Max 94

Test3: 10x1 double

Values:

Min 72
Median 86
Max 98

The summary contains the minimum, median, and maximum score for each test.

Find the Average Across Each Row

Extract the data from the second, third, and fourth variables using curly braces, {}, find the average
of each row, and store it in a new variable, TestAvg.

T.TestAvg = mean(T{:,2:end},2)

T=10×5 table
Gender Test1 Test2 Test3 TestAvg
__________ _____ _____ _____ _______

HOWARD {'male' } 90 87 93 90
WARD {'male' } 87 85 83 85
TORRES {'male' } 86 85 88 86.333
PETERSON {'female'} 75 80 72 75.667
GRAY {'female'} 89 86 87 87.333
RAMIREZ {'female'} 96 92 98 95.333
JAMES {'male' } 78 75 77 76.667
WATSON {'female'} 91 94 92 92.333
BROOKS {'female'} 86 83 85 84.667
KELLY {'male' } 79 76 82 79

Alternatively, you can use the variable names, T{:,{'Test1','Test2','Test3'}} or the variable
indices, T{:,2:4} to select the subset of data.
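
The following sketch compares these selection styles on the first two rows of the example data, entered inline. It also includes a vartype subscript, which is an additional option not used in the example above.

```matlab
% First two students from the example, entered inline for this sketch.
T = table({'male';'male'},[90;87],[87;85],[93;83], ...
    'VariableNames',{'Gender','Test1','Test2','Test3'}, ...
    'RowNames',{'HOWARD','WARD'});

byPos  = mean(T{:,2:end},2);                       % by position
byName = mean(T{:,{'Test1','Test2','Test3'}},2);   % by variable name
byType = mean(T{:,vartype('numeric')},2);          % by data type
```

Selecting by name or by type keeps working if another non-test variable is later added to the table, whereas 2:end would then include it.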

Compute Statistics Using Grouping Variable

Compute the mean and maximum of TestAvg by gender of the students. First, compute the means by
using the varfun function.

varfun(@mean,T,'InputVariables','TestAvg',...
'GroupingVariables','Gender')

ans=2×3 table
Gender GroupCount mean_TestAvg
__________ __________ ____________

{'female'} 5 87.067
{'male' } 5 83.4

Starting in R2018a, you also can use the groupsummary function to perform computations on groups
of data in a table. Compute the maximum values of TestAvg for each group of students using
groupsummary.

groupsummary(T,'Gender','max','TestAvg')

ans=2×3 table
Gender GroupCount max_TestAvg
__________ __________ ___________

{'female'} 5 95.333
{'male' } 5 90
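
For the same statistic, varfun and groupsummary return matching group values. This sketch uses a small illustrative table (the scores are assumptions) and requires R2018a or later for groupsummary.

```matlab
% Illustrative data: two students per group.
Gender  = {'male';'male';'female';'female'};
TestAvg = [90;85;75;95];
T = table(Gender,TestAvg);

A = varfun(@mean,T,'InputVariables','TestAvg', ...
    'GroupingVariables','Gender');
B = groupsummary(T,'Gender','mean','TestAvg');

disp(A.mean_TestAvg)   % group means from varfun
disp(B.mean_TestAvg)   % group means from groupsummary
```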

Replace Data Values

The maximum score for each test is 100. Use curly braces to extract the data from the table and
convert the test scores to a 25-point scale.

T{:,2:end} = T{:,2:end}*25/100

T=10×5 table
Gender Test1 Test2 Test3 TestAvg
__________ _____ _____ _____ _______

HOWARD {'male' } 22.5 21.75 23.25 22.5
WARD {'male' } 21.75 21.25 20.75 21.25
TORRES {'male' } 21.5 21.25 22 21.583
PETERSON {'female'} 18.75 20 18 18.917
GRAY {'female'} 22.25 21.5 21.75 21.833
RAMIREZ {'female'} 24 23 24.5 23.833
JAMES {'male' } 19.5 18.75 19.25 19.167
WATSON {'female'} 22.75 23.5 23 23.083
BROOKS {'female'} 21.5 20.75 21.25 21.167
KELLY {'male' } 19.75 19 20.5 19.75

Change Variable Name

Change the variable name from TestAvg to Final.

T.Properties.VariableNames{end} = 'Final'

T=10×5 table
Gender Test1 Test2 Test3 Final
__________ _____ _____ _____ ______

HOWARD {'male' } 22.5 21.75 23.25 22.5
WARD {'male' } 21.75 21.25 20.75 21.25
TORRES {'male' } 21.5 21.25 22 21.583
PETERSON {'female'} 18.75 20 18 18.917
GRAY {'female'} 22.25 21.5 21.75 21.833
RAMIREZ {'female'} 24 23 24.5 23.833
JAMES {'male' } 19.5 18.75 19.25 19.167
WATSON {'female'} 22.75 23.5 23 23.083
BROOKS {'female'} 21.5 20.75 21.25 21.167
KELLY {'male' } 19.75 19 20.5 19.75
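
As an alternative sketch, the renamevars function (available starting in R2020a) renames a variable by its current name instead of by its position. The one-variable table below is illustrative only.

```matlab
% Illustrative table with a single variable.
T = table([22.5;21.25],'VariableNames',{'TestAvg'});

% Rename by position through the table properties, as in the example above.
T1 = T;
T1.Properties.VariableNames{end} = 'Final';

% Rename by name with renamevars.
T2 = renamevars(T,'TestAvg','Final');
```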

See Also
table | summary | rowfun | varfun | findgroups | splitapply | groupsummary

Related Examples
• “Access Data in Tables” on page 9-32
• “Split Table Data Variables and Apply Functions” on page 9-52
• “Data Cleaning and Calculations in Tables” on page 9-66
• “Grouped Calculations in Tables and Timetables” on page 9-86

Split Data into Groups and Calculate Statistics


This example shows how to split data from the patients.mat data file into groups. Then it shows
how to calculate mean weights and body mass indices, and standard deviations of blood pressure
readings, for the groups of patients. It also shows how to summarize the results in a table.

Load Patient Data

Load sample data gathered from 100 patients.


load patients

Convert Gender and SelfAssessedHealthStatus to categorical arrays.


Gender = categorical(Gender);
SelfAssessedHealthStatus = categorical(SelfAssessedHealthStatus);
whos

Name Size Bytes Class Attributes

Age 100x1 800 double
Diastolic 100x1 800 double
Gender 100x1 330 categorical
Height 100x1 800 double
LastName 100x1 11616 cell
Location 100x1 14208 cell
SelfAssessedHealthStatus 100x1 560 categorical
Smoker 100x1 100 logical
Systolic 100x1 800 double
Weight 100x1 800 double

Calculate Mean Weights

Split the patients into nonsmokers and smokers using the Smoker variable. Calculate the mean
weight for each group.
[G,smoker] = findgroups(Smoker);
meanWeight = splitapply(@mean,Weight,G)

meanWeight = 2×1

149.9091
161.9412

The findgroups function returns G, a vector of group numbers created from Smoker. The
splitapply function uses G to split Weight into two groups. splitapply applies the mean function
to each group and concatenates the mean weights into a vector.

findgroups returns a vector of group identifiers as the second output argument. The group
identifiers are logical values because Smoker contains logical values. The patients in the first group
are nonsmokers, and the patients in the second group are smokers.
smoker

smoker = 2x1 logical array

   0
   1

Split the patient weights by both gender and status as a smoker and calculate the mean weights.

G = findgroups(Gender,Smoker);
meanWeight = splitapply(@mean,Weight,G)

meanWeight = 4×1

130.3250
130.9231
180.0385
181.1429

The unique combinations across Gender and Smoker identify four groups of patients: female
nonsmokers, female smokers, male nonsmokers, and male smokers. Summarize the four groups and
their mean weights in a table.

[G,gender,smoker] = findgroups(Gender,Smoker);
T = table(gender,smoker,meanWeight)

T=4×3 table
gender smoker meanWeight
______ ______ __________

Female false 130.32
Female true 130.92
Male false 180.04
Male true 181.14

T.gender contains categorical values, and T.smoker contains logical values. The data types of these
table variables match the data types of Gender and Smoker respectively.

Calculate body mass index (BMI) for the four groups of patients. Define a function that takes Height
and Weight as its two input arguments, and that calculates BMI.

meanBMIfcn = @(h,w)mean((w ./ (h.^2)) * 703);


BMI = splitapply(meanBMIfcn,Height,Weight,G)

BMI = 4×1

21.6721
21.6686
26.5775
26.4584
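
The following sketch repeats the findgroups and splitapply pattern on five rows entered inline, so that each step can be checked by hand. The name bmiFcn is introduced here for illustration. The factor 703 is the same conversion factor used in meanBMIfcn above.

```matlab
% Five illustrative patients entered inline.
Smoker = [true;false;false;true;false];
Height = [71;69;64;72;64];
Weight = [176;163;131;202;119];

[G,smoker] = findgroups(Smoker);            % G is 1 for false, 2 for true
meanWeight = splitapply(@mean,Weight,G);    % one mean per group

% A function with two inputs receives both variables, split by group.
bmiFcn  = @(h,w) mean((w ./ h.^2) * 703);
meanBMI = splitapply(bmiFcn,Height,Weight,G);
```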

Group Patients Based on Self-Reports

Calculate the fraction of patients who report their health as either Poor or Fair. First, use
splitapply to count the number of patients in each group: female nonsmokers, female smokers,
male nonsmokers, and male smokers. Then, count only those patients who report their health as
either Poor or Fair, using logical indexing on S and G. From these two sets of counts, calculate the
fraction for each group.

[G,gender,smoker] = findgroups(Gender,Smoker);
S = SelfAssessedHealthStatus;
I = ismember(S,{'Poor','Fair'});
numPatients = splitapply(@numel,S,G);
numPF = splitapply(@numel,S(I),G(I));
numPF./numPatients

ans = 4×1

0.2500
0.3846
0.3077
0.1429
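
The counting approach above relies on every group having at least one patient who reports Poor or Fair health. An alternative sketch, not the method used in this example, takes the mean of the logical index within each group, which gives the fraction directly. The data below is illustrative.

```matlab
% Illustrative data: two nonsmokers and three smokers.
Smoker = [false;false;true;true;true];
Status = categorical({'Poor';'Good';'Fair';'Excellent';'Fair'});

G = findgroups(Smoker);
I = ismember(Status,{'Poor','Fair'});   % true where health is Poor or Fair

frac = splitapply(@mean,I,G);           % fraction of true values per group
```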

Compare the standard deviation in Diastolic readings of those patients who report Poor or Fair
health, and those patients who report Good or Excellent health.

stdDiastolicPF = splitapply(@std,Diastolic(I),G(I));
stdDiastolicGE = splitapply(@std,Diastolic(~I),G(~I));

Collect results in a table. For these patients, the female nonsmokers who report Poor or Fair health
show the widest variation in blood pressure readings.

T = table(gender,smoker,numPatients,numPF,stdDiastolicPF,stdDiastolicGE,BMI)

T=4×7 table
gender smoker numPatients numPF stdDiastolicPF stdDiastolicGE BMI
______ ______ ___________ _____ ______________ ______________ ______

Female false 40 10 6.8872 3.9012 21.672
Female true 13 5 5.4129 5.0409 21.669
Male false 26 8 4.2678 4.8159 26.578
Male true 21 3 5.6862 5.258 26.458

See Also
findgroups | splitapply

Related Examples
• “Grouping Variables To Split Data” on page 9-61
• “Split Table Data Variables and Apply Functions” on page 9-52
• “Data Cleaning and Calculations in Tables” on page 9-66

Split Table Data Variables and Apply Functions


This example shows how to split power outage data from a table into groups by region and cause of
the power outages. Then it shows how to apply functions to calculate statistics for each group and
collect the results in a table.

Load Power Outage Data

The sample file, outages.csv, contains data representing electric utility outages in the United
States. The file contains six columns: Region, OutageTime, Loss, Customers, RestorationTime,
and Cause. Read outages.csv into a table.

T = readtable('outages.csv');

Convert Region and Cause to categorical arrays, and OutageTime and RestorationTime to
datetime arrays. Display the first five rows.

T.Region = categorical(T.Region);
T.Cause = categorical(T.Cause);
T.OutageTime = datetime(T.OutageTime);
T.RestorationTime = datetime(T.RestorationTime);
T(1:5,:)

ans=5×6 table
Region OutageTime Loss Customers RestorationTime Cause
_________ ________________ ______ __________ ________________ _______________

SouthWest 2002-02-01 12:18 458.98 1.8202e+06 2002-02-07 16:50 winter storm
SouthEast 2003-01-23 00:49 530.14 2.1204e+05 NaT winter storm
SouthEast 2003-02-07 21:15 289.4 1.4294e+05 2003-02-17 08:14 winter storm
West 2004-04-06 05:44 434.81 3.4037e+05 2004-04-06 06:10 equipment fault
MidWest 2002-03-16 06:18 186.44 2.1275e+05 2002-03-18 23:23 severe storm

Calculate Maximum Power Loss

Determine the greatest power loss due to a power outage in each region. The findgroups function
returns G, a vector of group numbers created from T.Region. The splitapply function uses G to
split T.Loss into five groups, corresponding to the five regions. splitapply applies the max
function to each group and concatenates the maximum power losses into a vector.

G = findgroups(T.Region);
maxLoss = splitapply(@max,T.Loss,G)

maxLoss = 5×1
10^4 ×

2.3141
2.3418
0.8767
0.2796
1.6659

Calculate the maximum power loss due to a power outage by cause. To specify that Cause is the
grouping variable, use table indexing. Create a table that contains the maximum power losses and
their causes.

T1 = T(:,'Cause');
[G,powerLosses] = findgroups(T1);
powerLosses.maxLoss = splitapply(@max,T.Loss,G)

powerLosses=10×2 table
Cause maxLoss
________________ _______

attack 582.63
earthquake 258.18
energy emergency 11638
equipment fault 16659
fire 872.96
severe storm 8767.3
thunder storm 23418
unknown 23141
wind 2796
winter storm 2883.7

powerLosses is a table because T1 is a table. You can append the maximum losses as another table
variable.

Calculate the maximum power loss by cause in each region. To specify that Region and Cause are
the grouping variables, use table indexing. Create a table that contains the maximum power losses
and display the first 15 rows.

T1 = T(:,{'Region','Cause'});
[G,powerLosses] = findgroups(T1);
powerLosses.maxLoss = splitapply(@max,T.Loss,G);
powerLosses(1:15,:)

ans=15×3 table
Region Cause maxLoss
_________ ________________ _______

MidWest attack 0
MidWest energy emergency 2378.7
MidWest equipment fault 903.28
MidWest severe storm 6808.7
MidWest thunder storm 15128
MidWest unknown 23141
MidWest wind 2053.8
MidWest winter storm 669.25
NorthEast attack 405.62
NorthEast earthquake 0
NorthEast energy emergency 11638
NorthEast equipment fault 794.36
NorthEast fire 872.96
NorthEast severe storm 6002.4
NorthEast thunder storm 23418
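
To see how table input to findgroups produces one row per unique combination, consider this minimal sketch on four illustrative outages. The values are assumptions, and res is a name introduced here.

```matlab
% Four illustrative outages.
Region = categorical({'West';'West';'MidWest';'MidWest'});
Cause  = categorical({'wind';'fire';'wind';'wind'});
Loss   = [10;20;30;5];
T = table(Region,Cause,Loss);

% Table input: the second output is a table of unique combinations.
[G,res] = findgroups(T(:,{'Region','Cause'}));
res.maxLoss = splitapply(@max,T.Loss,G);

disp(res)
```

Only combinations that occur in the data appear in the result, so res has three rows here rather than four.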

Calculate Number of Customers Impacted

Determine power-outage impact on customers by cause and region. Because T.Customers contains NaN
values, wrap sum in an anonymous function to use the 'omitnan' input argument.

osumFcn = @(x)(sum(x,'omitnan'));
powerLosses.totalCustomers = splitapply(osumFcn,T.Customers,G);
powerLosses(1:15,:)

ans=15×4 table
Region Cause maxLoss totalCustomers
_________ ________________ _______ ______________

MidWest attack 0 0
MidWest energy emergency 2378.7 6.3363e+05
MidWest equipment fault 903.28 1.7822e+05
MidWest severe storm 6808.7 1.3511e+07
MidWest thunder storm 15128 4.2563e+06
MidWest unknown 23141 3.9505e+06
MidWest wind 2053.8 1.8796e+06
MidWest winter storm 669.25 4.8887e+06
NorthEast attack 405.62 2181.8
NorthEast earthquake 0 0
NorthEast energy emergency 11638 1.4391e+05
NorthEast equipment fault 794.36 3.9961e+05
NorthEast fire 872.96 6.1292e+05
NorthEast severe storm 6002.4 2.7905e+07
NorthEast thunder storm 23418 2.1885e+07
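
The effect of the 'omitnan' option can be seen on a short vector. The values are illustrative.

```matlab
x = [100;NaN;250];

plainSum = sum(x);               % NaN: one missing value spoils the sum
cleanSum = sum(x,'omitnan');     % 350: NaN values are skipped

% When every value in a group is NaN, the sum with 'omitnan' is 0.
emptySum = sum([NaN;NaN],'omitnan');
```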

Calculate Mean Durations of Power Outages

Determine the mean durations of all U.S. power outages in hours. Add the mean durations of power
outages to powerLosses. Because T.RestorationTime has NaT values, omit the resulting NaN
values when calculating the mean durations.

D = T.RestorationTime - T.OutageTime;
H = hours(D);
omeanFcn = @(x)(mean(x,'omitnan'));
powerLosses.meanOutage = splitapply(omeanFcn,H,G);
powerLosses(1:15,:)

ans=15×5 table
Region Cause maxLoss totalCustomers meanOutage
_________ ________________ _______ ______________ __________

MidWest attack 0 0 335.02
MidWest energy emergency 2378.7 6.3363e+05 5339.3
MidWest equipment fault 903.28 1.7822e+05 17.863
MidWest severe storm 6808.7 1.3511e+07 78.906
MidWest thunder storm 15128 4.2563e+06 51.245
MidWest unknown 23141 3.9505e+06 30.892
MidWest wind 2053.8 1.8796e+06 73.761
MidWest winter storm 669.25 4.8887e+06 127.58
NorthEast attack 405.62 2181.8 5.5117
NorthEast earthquake 0 0 0
NorthEast energy emergency 11638 1.4391e+05 77.345
NorthEast equipment fault 794.36 3.9961e+05 87.204
NorthEast fire 872.96 6.1292e+05 4.0267
NorthEast severe storm 6002.4 2.7905e+07 2163.5
NorthEast thunder storm 23418 2.1885e+07 46.098
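
This sketch traces how a NaT restoration time becomes a NaN duration. It uses the first outage from the table above plus one missing restoration time. The names t0, t1, and m are introduced here for illustration.

```matlab
t0 = datetime(2002,2,1,12,18,0);            % outage time
t1 = [datetime(2002,2,7,16,50,0); NaT];     % one restoration time is missing

D = t1 - t0;                 % duration array; NaT yields a NaN duration
H = hours(D);                % convert durations to numeric hours
m = mean(H,'omitnan');       % mean that ignores the NaN entry
```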

See Also
findgroups | splitapply | rowfun | varfun

Related Examples
• “Access Data in Tables” on page 9-32
• “Calculations on Tables” on page 9-45
• “Grouping Variables To Split Data” on page 9-61
• “Split Data into Groups and Calculate Statistics” on page 9-49
• “Data Cleaning and Calculations in Tables” on page 9-66
• “Grouped Calculations in Tables and Timetables” on page 9-86

Advantages of Using Tables


Conveniently Store Mixed-Type Data in Single Container

You can use the table data type to collect mixed-type data and metadata properties, such as variable
names, row names, descriptions, and variable units, in a single container. Tables are suitable for
column-oriented or tabular data that is often stored as columns in a text file or in a spreadsheet. For
example, you can use a table to store experimental data, with rows representing different
observations and columns representing different measured variables.

Tables consist of rows and column-oriented variables. Each variable in a table can have a different
data type and a different size, but each variable must have the same number of rows.

For example, load sample patients data.


load patients

Then, combine the workspace variables, Systolic and Diastolic into a single BloodPressure
variable and convert the workspace variable, Gender, from a cell array of character vectors to a
categorical array.
BloodPressure = [Systolic Diastolic];
Gender = categorical(Gender);

whos('Gender','Age','Smoker','BloodPressure')

Name Size Bytes Class Attributes

Age 100x1 800 double
BloodPressure 100x2 1600 double
Gender 100x1 330 categorical
Smoker 100x1 100 logical

The variables Age, BloodPressure, Gender, and Smoker have varying data types and are
candidates to store in a table since they all have the same number of rows, 100.

Now, create a table from the variables and display the first five rows.
T = table(Gender,Age,Smoker,BloodPressure);
T(1:5,:)

ans=5×4 table
Gender Age Smoker BloodPressure
______ ___ ______ _____________

Male 38 true 124 93
Male 43 false 109 77
Female 38 false 125 83
Female 40 false 117 75
Female 49 false 122 80

The table displays in a tabular format with the variable names at the top.

Each variable in a table has a single data type. If you add a new row to the table, MATLAB® enforces
consistency of the data type between the new data and the corresponding table variables. For
example, if you try to add information for a new patient where the first column contains the patient's
age instead of gender, as in the expression T(end+1,:) = {37,{'Female'},true,[130 84]},
then you receive the error:

Invalid RHS for assignment to a categorical array.

The error occurs because MATLAB® cannot assign numeric data, 37, to the categorical array,
Gender.
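The type check can be tried on a small stand-in table. The following is a minimal sketch, assuming made-up patient values instead of the patients data set; the variable rejected is a hypothetical flag introduced only for illustration.

```matlab
% Small stand-in for the patients table (values are made up).
Gender = categorical({'Male'; 'Female'});
Age = [38; 43];
Smoker = [true; false];
BloodPressure = [124 93; 109 77];
T = table(Gender,Age,Smoker,BloodPressure);

% A row whose values are in the wrong order is rejected, and T is unchanged.
try
    T(end+1,:) = {37,{'Female'},true,[130 84]};
    rejected = false;
catch err
    rejected = true;
    disp(err.message)
end

% A row whose values match the types of the table variables is accepted.
T(end+1,:) = {categorical({'Female'}),37,true,[130 84]};
height(T)
```

Because the failed assignment leaves T untouched, the table has three rows only after the second, well-typed assignment.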

For comparison of tables with structures, consider the structure array, StructArray, that is
equivalent to the table, T.
StructArray = table2struct(T)

StructArray=100×1 struct array with fields:
    Gender
    Age
    Smoker
    BloodPressure

Structure arrays organize records using named fields. Each field's value can have a different data
type or size. Now, display the named fields for the first element of StructArray.
StructArray(1)

ans = struct with fields:
           Gender: Male
              Age: 38
           Smoker: 1
    BloodPressure: [124 93]

Fields in a structure array are analogous to variables in a table. However, unlike with tables, you
cannot enforce homogeneity within a field. For example, some elements of StructArray can have a
Gender value that is a categorical array element, Male or Female, others a character vector, 'Male' or
'Female', and others an integer, 0 or 1.

Now consider the same data stored in a scalar structure, with four fields each containing one variable
from the table.
ScalarStruct = struct(...
'Gender',{Gender},...
'Age',Age,...
'Smoker',Smoker,...
'BloodPressure',BloodPressure)

ScalarStruct = struct with fields:
           Gender: [100x1 categorical]
              Age: [100x1 double]
            Smoker: [100x1 logical]
    BloodPressure: [100x2 double]

Unlike with tables, you cannot enforce that the data is rectangular. For example, the field
ScalarStruct.Age can be a different length than the other fields.
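Both limitations can be reproduced in a few lines. This is a minimal sketch with made-up values; the names S, P, fieldClasses, sameLength, and tableRejected are hypothetical and used only for illustration.

```matlab
% Structure array: each element's Gender field can hold a different type.
S = struct('Gender',{categorical({'Male'}),'Female',1});
fieldClasses = arrayfun(@(s) class(s.Gender),S,'UniformOutput',false)

% Scalar structure: fields can have different lengths.
P = struct('Age',[38; 43; 38],'Smoker',[true; false]);
sameLength = numel(P.Age) == numel(P.Smoker)

% A table refuses variables that have different numbers of rows.
try
    table([38; 43; 38],[true; false]);
    tableRejected = false;
catch
    tableRejected = true;
end
tableRejected
```

Neither structure raises an error for the inconsistent data, while the table constructor does.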

A table allows you to maintain the rectangular structure (like a structure array) and enforce
homogeneity of variables (like fields in a scalar structure). Although cell arrays do not have named
fields, they have many of the same disadvantages as structure arrays and scalar structures. If you
have rectangular data that is homogeneous in each variable, consider using a table. Then you can use
numeric or named indexing, and you can use table properties to store metadata.

Access Data Using Numeric or Named Indexing

You can index into a table using parentheses, curly braces, or dot indexing. Parentheses allow you to
select a subset of the data in a table and preserve the table container. Curly braces and dot indexing
allow you to extract data from a table. Within each table indexing method, you can specify the rows
or variables to access by name or by numeric index.
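The difference between the three indexing styles shows up in the class of the result. A minimal sketch, assuming a three-row table with made-up values:

```matlab
Age = [38; 43; 38];
Smoker = [true; false; false];
T = table(Age,Smoker,'RowNames',{'Smith';'Johnson';'Williams'});

sub  = T({'Smith','Williams'},'Age');   % parentheses: result is still a table
vals = T{{'Smith','Williams'},'Age'};   % curly braces: result is a double array
col  = T.Age;                           % dot indexing: the whole variable

class(sub)
class(vals)
class(col)
```

Use parentheses when the next step expects a table, and curly braces or dot indexing when it expects the underlying array.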

Consider the sample table from above. Each row in the table, T, represents a different patient. The
workspace variable, LastName, contains unique identifiers for the 100 rows. Add row names to the
table by setting the RowNames property to LastName and display the first five rows of the updated
table.
T.Properties.RowNames = LastName;
T(1:5,:)

ans=5×4 table
                Gender    Age    Smoker    BloodPressure
                ______    ___    ______    _____________

    Smith       Male      38     true       124     93
    Johnson     Male      43     false      109     77
    Williams    Female    38     false      125     83
    Jones       Female    40     false      117     75
    Brown       Female    49     false      122     80

In addition to labeling the data, you can use row and variable names to access data in the table. For
example, use named indexing to display the age and blood pressure of the patients Williams and
Brown.
T({'Williams','Brown'},{'Age','BloodPressure'})

ans=2×2 table
                Age    BloodPressure
                ___    _____________

    Williams    38      125     83
    Brown       49      122     80

Now, use numeric indexing to return an equivalent subtable. Return the third and fifth rows from the
second and fourth variables.
T(3:2:5,2:2:4)

ans=2×2 table
                Age    BloodPressure
                ___    _____________

    Williams    38      125     83
    Brown       49      122     80

With cell arrays or structures, you do not have the same flexibility to use named or numeric indexing.


• With a cell array, you must use strcmp to find desired named data, and then you can index into
the array.
• With a scalar structure or structure array, it is not possible to refer to a field by number.
Furthermore, with a scalar structure, you cannot easily select a subset of variables or a subset of
observations. With a structure array, you can select a subset of observations, but you cannot select
a subset of variables.
• With a table, you can access data by named index or by numeric index. Furthermore, you can
easily select a subset of variables and a subset of rows.
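The comparison in the list above can be sketched with a cell array and a table that hold the same made-up data. The names C, row, ageFromCell, ageFromTable, and ageByNumber are hypothetical.

```matlab
names = {'Smith'; 'Johnson'; 'Williams'};
ages = [38; 43; 38];

% Cell array: locate the row with strcmp, then index into the array.
C = [names num2cell(ages)];
row = strcmp(C(:,1),'Williams');
ageFromCell = C{row,2};

% Table: index by name or by number, with no lookup step.
T = table(ages,'VariableNames',{'Age'},'RowNames',names);
ageFromTable = T{'Williams','Age'};
ageByNumber  = T{3,1};
```

All three expressions return the same age, but only the table supports both kinds of index directly.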

For more information on table indexing, see “Access Data in Tables” on page 9-32.

Use Table Properties to Store Metadata

In addition to storing data, tables have properties to store metadata, such as variable names, row
names, descriptions, and variable units. You can access a property using T.Properties.PropName,
where T is the name of the table and PropName is one of the table properties.

For example, add a table description, variable descriptions, and variable units for Age.

T.Properties.Description = 'Simulated Patient Data';

T.Properties.VariableDescriptions = ...
{'Male or Female' ...
'' ...
'true or false' ...
'Systolic/Diastolic'};

T.Properties.VariableUnits{'Age'} = 'Yrs';

Individual empty character vectors within the cell array for VariableDescriptions indicate that
the corresponding variable does not have a description. For more information, see the Properties
section of table.
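Metadata travels with the table when you take a subset of it. A minimal sketch, assuming a two-variable table with made-up values; sub is a hypothetical name.

```matlab
Age = [38; 43; 38];
Smoker = [true; false; false];
T = table(Age,Smoker);

% Store a description and the units; '' means no units for Smoker.
T.Properties.Description = 'Simulated Patient Data';
T.Properties.VariableUnits = {'Yrs',''};

% Selecting a subtable with parentheses keeps the units of the
% variables that are selected.
sub = T(1:2,'Age');
sub.Properties.VariableUnits
```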

To print a table summary, use the summary function.

summary(T)

Description: Simulated Patient Data

Variables:

Gender: 100x1 categorical

Properties:
Description: Male or Female
Values:

Female 53
Male 47

Age: 100x1 double

Properties:
Units: Yrs
Values:


Min 25
Median 39
Max 50

Smoker: 100x1 logical

Properties:
Description: true or false
Values:

True 34
False 66

BloodPressure: 100x2 double

Properties:
Description: Systolic/Diastolic
Values:
Column 1 Column 2
________ ________

Min 109 68
Median 122 81.5
Max 138 99

Structures and cell arrays do not have properties for storing metadata.

See Also
table | summary

Related Examples
• “Create Tables and Assign Data to Them” on page 9-2
• “Modify Units, Descriptions, and Table Variable Names” on page 9-24
• “Access Data in Tables” on page 9-32
• “Data Cleaning and Calculations in Tables” on page 9-66
• “Grouped Calculations in Tables and Timetables” on page 9-86


Grouping Variables To Split Data


You can use grouping variables to split data variables into groups. Typically, selecting grouping
variables is the first step in the Split-Apply-Combine workflow. You can split data into groups, apply a
function to each group, and combine the results. You also can denote missing values in grouping
variables, so that corresponding values in data variables are ignored.

Grouping Variables
Grouping variables are variables used to group, or categorize, observations—that is, data values in
other variables. A grouping variable can be any of these data types:

• Numeric, logical, categorical, datetime, or duration vector
• Cell array of character vectors
• Table, with table variables of any data type in this list

Data variables are the variables that contain observations. A grouping variable must have a value
corresponding to each value in the data variables. Data values belong to the same group when the
corresponding values in the grouping variable are the same.

This table shows examples of data variables, grouping variables, and the groups that you can create
when you split the data variables using the grouping variables.

Data Variable          Grouping Variable            Groups of Data

[5 10 15 20 25 30]     [0 0 0 0 1 1]                [5 10 15 20] [25 30]
[10 20 30 40 50 60]    [1 3 3 1 2 1]                [10 40 60] [50] [20 30]
[64 72 67 69 64 68]    {'F','M','F','M','F','F'}    [64 67 64 68] [72 69]

You can give groups of data meaningful names when you use cell arrays of character vectors or
categorical arrays as grouping variables. A categorical array is an efficient and flexible choice of
grouping variable.
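The second row of the table above can be reproduced with findgroups and splitapply. This is a minimal sketch that uses column vectors; the anonymous function wraps each group in a cell so that groups of different sizes can be returned together.

```matlab
data = [10; 20; 30; 40; 50; 60];
groupVar = [1; 3; 3; 1; 2; 1];

% G holds one group number per data value; ID holds the unique group values.
[G,ID] = findgroups(groupVar);

% Collect the members of each group as a row vector inside a cell.
groups = splitapply(@(x) {x.'},data,G)
```

The groups come back in the sorted order of the grouping values 1, 2, 3, that is [10 40 60], [50], and [20 30].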

Group Definition
Typically, there are as many groups as there are unique values in the grouping variable. (A
categorical array also can include categories that are not represented in the data.) The groups and
the order of the groups depend on the data type of the grouping variable.

• For numeric, logical, datetime, or duration vectors, or cell arrays of character vectors, the
groups correspond to the unique values sorted in ascending order.
• For categorical arrays, the groups correspond to the unique values observed in the array, sorted in
the order returned by the categories function.

The findgroups function can accept multiple grouping variables, for example G =
findgroups(A1,A2). You also can include multiple grouping variables in a table, for example T =
table(A1,A2); G = findgroups(T). The findgroups function defines groups by the unique
combinations of values across corresponding elements of the grouping variables. findgroups
decides the order by the order of the first grouping variable, and then by the order of the second
grouping variable, and so on. For example, if A1 = {'a','a','b','b'} and A2 = [0 1 0 0],
then the unique values across the grouping variables are 'a' 0, 'a' 1, and 'b' 0, defining three
groups.
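The same example can be run directly. A minimal sketch, using column vectors for A1 and A2; ID1, ID2, and Gtable are hypothetical output names.

```matlab
A1 = {'a'; 'a'; 'b'; 'b'};
A2 = [0; 1; 0; 0];

% One group number per row, plus the value of each grouping variable
% that identifies each group.
[G,ID1,ID2] = findgroups(A1,A2)

% Passing the grouping variables in a table defines the same groups.
Gtable = findgroups(table(A1,A2));
isequal(G,Gtable)
```

The last two rows share the combination 'b' 0, so they receive the same group number.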

The Split-Apply-Combine Workflow


After you select grouping variables and split data variables into groups, you can apply functions to
the groups and combine the results. This workflow is called the Split-Apply-Combine workflow. You
can use the findgroups and splitapply functions together to analyze groups of data in this
workflow. This diagram shows a simple example using the grouping variable Gender and the data
variable Height to calculate the mean height by gender.

The findgroups function returns a vector of group numbers that define groups based on the unique
values in the grouping variables. splitapply uses the group numbers to split the data into groups
efficiently before applying a function.
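The mean-height-by-gender calculation can be sketched as follows, assuming six made-up observations; genders and meanHeight are hypothetical names.

```matlab
Gender = {'F'; 'M'; 'F'; 'M'; 'F'; 'F'};
Height = [64; 72; 67; 69; 64; 68];

% Split: assign a group number to each observation.
[G,genders] = findgroups(Gender);

% Apply and combine: compute the mean of each group and stack the results.
meanHeight = splitapply(@mean,Height,G);

table(genders,meanHeight)
```

Here the groups are 'F' and 'M', with mean heights 65.75 and 70.5.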

Missing Group Values


Grouping variables can have missing values. This table shows the missing value indicator for each
data type. If a grouping variable has missing values, then findgroups assigns NaN as the group
number, and splitapply ignores the corresponding values in the data variables.

Grouping Variable Data Type        Missing Value Indicator

Numeric                            NaN
Logical                            (Cannot be missing)
Categorical                        <undefined>
datetime                           NaT
duration                           NaN
Cell array of character vectors    ''
String                             <missing>
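The effect of a missing group value can be seen with a numeric grouping variable. A minimal sketch with made-up values; sums is a hypothetical name.

```matlab
groupVar = [1; NaN; 2; 1];
x = [10; 20; 30; 40];

% The second observation has no group, so its group number is NaN.
G = findgroups(groupVar)

% splitapply skips the data value that belongs to the NaN group number.
sums = splitapply(@sum,x,G)
```

The value 20 does not contribute to either sum.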

See Also
findgroups | splitapply | rowfun | varfun


Related Examples
• “Access Data in Tables” on page 9-32
• “Split Table Data Variables and Apply Functions” on page 9-52
• “Split Data into Groups and Calculate Statistics” on page 9-49
• “Data Cleaning and Calculations in Tables” on page 9-66
• “Grouped Calculations in Tables and Timetables” on page 9-86


Changes to DimensionNames Property in R2016b


The table data type is suitable for collecting column-oriented, heterogeneous data in a single
container. Tables also contain metadata properties such as variable names, row names, dimension
names, descriptions, and variable units. Starting in R2016b, you can use the dimension names to
access table data and metadata using dot subscripting. To support that, the dimension names must
satisfy the same requirements as the variable names. For backwards compatibility, tables enforce
those restrictions by automatically modifying dimension names when needed.

Create a table that has row names and variable names.


Number = [8; 21; 13; 20; 11];
Name = {'Van Buren'; 'Arthur'; 'Fillmore'; 'Garfield'; 'Polk'};
Party = categorical({'Democratic'; 'Republican'; 'Whig'; 'Republican'; 'Republican'});
T = table(Number,Party,'RowNames',Name)

T =

                 Number      Party
                 ______    __________

    Van Buren       8      Democratic
    Arthur         21      Republican
    Fillmore       13      Whig
    Garfield       20      Republican
    Polk           11      Republican

Display its properties, including the dimension names. The default values of the dimension names are
'Row' and 'Variables'.
T.Properties

ans =

struct with fields:

Description: ''
UserData: []
DimensionNames: {'Row' 'Variables'}
VariableNames: {'Number' 'Party'}
VariableDescriptions: {}
VariableUnits: {}
RowNames: {5×1 cell}

Starting in R2016b, you can assign new names to the dimension names, and use them to access table
data. Dimension names must be valid MATLAB identifiers, and must not be one of the reserved
names, 'Properties', 'RowNames', or 'VariableNames'.

Assign a new name to the first dimension name, and use it to access the row names of the table.
T.Properties.DimensionNames{1} = 'Name';
T.Name

ans =

5×1 cell array

'Van Buren'
'Arthur'
'Fillmore'
'Garfield'
'Polk'

Create a new table variable called Name. When you create the variable, the table modifies its first
dimension name to prevent a conflict. The updated dimension name becomes Name_1.

T{:,'Name'} = {'Martin'; 'Chester'; 'Millard'; 'James'; 'James'}


Warning: DimensionNames property was modified to avoid conflicting dimension and variable names:
'Name'. See Compatibility Considerations for Using Tables for more details. This will become an
error in a future release.

T =

                 Number      Party         Name
                 ______    __________    _________

    Van Buren       8      Democratic    'Martin'
    Arthur         21      Republican    'Chester'
    Fillmore       13      Whig          'Millard'
    Garfield       20      Republican    'James'
    Polk           11      Republican    'James'

T.Properties.DimensionNames

ans =

1×2 cell array

'Name_1' 'Data'

Similarly, if you assign a dimension name that is not a valid MATLAB identifier, the name is modified.

T.Properties.DimensionNames{1} = 'Last Name';
T.Properties.DimensionNames
Warning: DimensionNames property was modified to make the name 'Last Name' a valid MATLAB
identifier. See Compatibility Considerations for Using Tables for more details. This will
become an error in a future release.

ans =

1×2 cell array

'LastName' 'Data'

In R2016b, tables raise warnings when dimension names are not valid identifiers, or conflict with
variable names or reserved names, so that you can continue to work with code and tables created
with previous releases. If you encounter these warnings, it is recommended that you update your
code to avoid them.
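One way to avoid the warnings is to check a candidate dimension name before assigning it. The following is a minimal sketch, not a prescribed procedure: it uses the general-purpose functions isvarname, matlab.lang.makeValidName, and matlab.lang.makeUniqueStrings, and the names candidate, validName, and uniqueName are hypothetical.

```matlab
% Names that a dimension name must not collide with in this example.
varNames = {'Number','Party','Name'};
reserved = {'Properties','RowNames','VariableNames'};

% A candidate that is not a valid MATLAB identifier.
candidate = 'Last Name';
isValid = isvarname(candidate)
validName = matlab.lang.makeValidName(candidate)

% A candidate that conflicts with an existing variable name.
uniqueName = matlab.lang.makeUniqueStrings('Name',[varNames reserved])
```

In this sketch the results match the names that the table chose automatically in the examples above, 'LastName' and 'Name_1'.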


Data Cleaning and Calculations in Tables


This example shows how to clean data stored in a MATLAB® table. It also shows how to perform
calculations by using the numeric and categorical data that the table contains.

Because tables and timetables are containers, working with them is somewhat different than working
with ordinary numeric arrays. The example shows how to use different tabular subscripting modes,
how these modes differ, and the advantages and disadvantages of each mode for different situations.
It also shows how to access and assign data, apply transformation and summary functions, convert
table variables to different data types, and plot results.

The Ames Housing Data used in this example comes from residential real estate data for the town of
Ames, Iowa, in the United States. You can download the original data from an XLS (Excel®
Workbook) spreadsheet. The data description is available as a text file. (Used with permission of the
copyright holder. Please contact the copyright holder if you wish to publish or redistribute this data.)

Import Spreadsheet Data to Table

The best way to import a spreadsheet into MATLAB is to use the readtable function, or for data that
include timestamps, the readtimetable function. While the Ames Housing Data includes the sale
month and year for each house, the month and year are stored in separate columns. In this case, it is
simpler to use readtable.

Read the housing data. With readtable you can read data directly from a URL. Store all text data
from the spreadsheet as string arrays in the output table. Also, when readtable reads column
headers from a file, it uses them as table variable names and transforms them into valid MATLAB
identifiers. To preserve the original names, use the 'VariableNamingRule' name-value argument.

housing = readtable("http://jse.amstat.org/v19n3/decock/AmesHousing.xls","TextType","string");

Warning: Column headers from the file were modified to make them valid MATLAB identifiers before
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names

Display housing. The table has one variable for each of the 82 columns in the spreadsheet.

housing

housing=2930×82 table
Order PID MSSubClass MSZoning LotFrontage LotArea Street Alley
_____ ____________ __________ ________ ___________ _______ ______ _____

1 "0526301100" "020" "RL" 141 31770 "Pave" "NA"
2 "0526350040" "020" "RH" 80 11622 "Pave" "NA"
3 "0526351010" "020" "RL" 81 14267 "Pave" "NA"
4 "0526353030" "020" "RL" 93 11160 "Pave" "NA"
5 "0527105010" "060" "RL" 74 13830 "Pave" "NA"
6 "0527105030" "060" "RL" 78 9978 "Pave" "NA"
7 "0527127150" "120" "RL" 41 4920 "Pave" "NA"
8 "0527145080" "120" "RL" 43 5005 "Pave" "NA"
9 "0527146030" "120" "RL" 39 5389 "Pave" "NA"
10 "0527162130" "060" "RL" 60 7500 "Pave" "NA"
11 "0527163010" "060" "RL" 75 10000 "Pave" "NA"
12 "0527165230" "020" "RL" NaN 7980 "Pave" "NA"
13 "0527166040" "060" "RL" 63 8402 "Pave" "NA"
14 "0527180040" "020" "RL" 85 10176 "Pave" "NA"
15 "0527182190" "120" "RL" NaN 6820 "Pave" "NA"
16 "0527216070" "060" "RL" 47 53504 "Pave" "NA"


The spreadsheet has some column headers with spaces and other column headers that start with
numbers. Column headers become variable names in the output table. By default, readtable
standardizes names with spaces by using camel case, and standardizes names beginning with
numbers by prepending them with 'x'. Although a table can have variable names with spaces and
other non-alphanumeric characters in them, the standardization makes working with table variable
names more natural. Before standardizing names, readtable saves the original column headers in
housing.Properties.VariableDescriptions.

housing.Properties.VariableDescriptions

ans = 1×82 cell


{'Order'} {'PID'} {'MS SubClass'} {'MS Zoning'} {'Lot Frontage'} {'Lot Area'}

In this example, the original variable names are not needed. To delete them, assign an empty cell
array to the VariableDescriptions property.

housing.Properties.VariableDescriptions = {};
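The effect of the naming rule can be checked on a small file without downloading the housing data. This is a minimal sketch that writes a temporary CSV file with two made-up columns; csvFile, modified, and preserved are hypothetical names.

```matlab
% Write a tiny CSV file whose headers contain spaces and a leading digit.
csvFile = [tempname '.csv'];
fid = fopen(csvFile,'w');
fprintf(fid,'Lot Area,1st Flr SF\n31770,1656\n11622,896\n');
fclose(fid);

% Default rule: headers are converted to valid MATLAB identifiers, and
% the original headers are kept in VariableDescriptions.
modified = readtable(csvFile);
modified.Properties.VariableNames
modified.Properties.VariableDescriptions

% 'preserve' keeps the headers exactly as they appear in the file.
preserved = readtable(csvFile,'VariableNamingRule','preserve');
preserved.Properties.VariableNames

delete(csvFile);
```

With preserved names you cannot use plain dot notation for names that contain spaces, so use a form such as preserved.("Lot Area") instead.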

Clean Data Before Analysis

You can remove the Order variable because it is a row index and not needed. To remove one variable
from the table, assign an empty array, [], to the variable, just as you delete rows or columns from a
matrix.

housing.Order = [];

There are 81 variables left in the table. For a complete analysis of the housing prices, most of the
variables are probably important. But for this example, only a much smaller subset is needed.
Deleting the unwanted variables one by one is tedious. The removevars function can delete them all at
once, but in this case there is an easier way. First list the variables that you want to keep. Then use
subscripting to select them and delete the others. Selecting variables by name is often much easier
than figuring out their numeric indices.

keep = ["PID" "MSSubClass" "LotFrontage" "LotArea" "Neighborhood" "BldgType" ...
        "OverallCond" "YearBuilt" "YearRemod_Add" "Foundation" "Heating" ...
        "CentralAir" "x1stFlrSF" "x2ndFlrSF" "LowQualFinSF" "GrLivArea" ...
        "BsmtFullBath" "BsmtHalfBath" "FullBath" "HalfBath" "BedroomAbvGr" ...
        "GarageType" "MoSold" "YrSold" "SalePrice"];
housing = housing(:,keep)

housing=2930×25 table
PID MSSubClass LotFrontage LotArea Neighborhood BldgType OverallCo
____________ __________ ___________ _______ ____________ ________ _________

"0526301100" "020" 141 31770 "NAmes" "1Fam" 5
"0526350040" "020" 80 11622 "NAmes" "1Fam" 6
"0526351010" "020" 81 14267 "NAmes" "1Fam" 6
"0526353030" "020" 93 11160 "NAmes" "1Fam" 5
"0527105010" "060" 74 13830 "Gilbert" "1Fam" 5
"0527105030" "060" 78 9978 "Gilbert" "1Fam" 6
"0527127150" "120" 41 4920 "StoneBr" "TwnhsE" 5
"0527145080" "120" 43 5005 "StoneBr" "TwnhsE" 5
"0527146030" "120" 39 5389 "StoneBr" "TwnhsE" 5
"0527162130" "060" 60 7500 "Gilbert" "1Fam" 5
"0527163010" "060" 75 10000 "Gilbert" "1Fam" 5
"0527165230" "020" NaN 7980 "Gilbert" "1Fam" 7
"0527166040" "060" 63 8402 "Gilbert" "1Fam" 5
"0527180040" "020" 85 10176 "Gilbert" "1Fam" 5
"0527182190" "120" NaN 6820 "StoneBr" "TwnhsE" 5
"0527216070" "060" 47 53504 "StoneBr" "1Fam" 5
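The two ways of dropping variables give the same result. A minimal sketch on a three-variable table with made-up values; byKeep and byRemove are hypothetical names.

```matlab
T = table([1; 2],[10; 20],[100; 200], ...
    'VariableNames',{'Order','LotArea','SalePrice'});

% Keep by name: list what you want and subscript with it.
byKeep = T(:,["LotArea" "SalePrice"]);

% Remove by name: list what you do not want.
byRemove = removevars(T,"Order");

isequal(byKeep,byRemove)
```

Listing the variables to keep is shorter when most variables are dropped, and removevars is shorter when most are kept.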

Two of the variable names are not very clear. Rename those variables with better names by using the
VariableNames property.
housing.Properties.VariableNames(["GrLivArea" "LowQualFinSF"]) = ["TotalAboveGroundLivingArea" "L

There are two other variable names, starting with 'x', that look odd. Another way to rename them is
to use the renamevars function. If you use renamevars, assign the output to the original table.
Otherwise the update is lost.
housing = renamevars(housing,["x1stFlrSF" "x2ndFlrSF"],["FirstFlrArea" "SecondFlrArea"]);

Convert and Clean Up Data Types

Six of the variables are string arrays. Conceptually they all contain categorical data: discrete,
nonnumeric values drawn from a small fixed set of possible values or categories. It is almost always a
good idea to convert that kind of data to categorical arrays. You can use the
detectImportOptions function to control the data types of the data you read with readtable. But
instead of starting over, you can convert these table variables to have the categorical data type.
For example, convert the Neighborhood variable to a categorical array.
housing.Neighborhood = categorical(housing.Neighborhood);

This assignment overwrites, or replaces, the existing text variable Neighborhood in the table with a
new categorical variable. Replacement is what enables the assignment to change the data type. In
contrast, this assignment, using indexing:

housing.Neighborhood(:) = categorical(housing.Neighborhood)

assigns values into the existing text variable, element by element, rather than replacing the variable.
In that case housing.Neighborhood remains a string array. This behavior is consistent with the
behavior of ordinary workspace variables. Assignment by indexing into an array does not change the
type of the array. For example, if you index into an array of integers and assign a floating-point value
to an element, the value is rounded to the nearest integer and stored as an integer.
x = uint32([1 2 3]);
x(2) = 2.2 % converted to 2, as a uint32

x = 1×3 uint32 row vector

1 2 3
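A value with a larger fractional part shows that the conversion rounds to the nearest integer, and that only whole-variable assignment changes the type. A minimal sketch; y is a hypothetical name.

```matlab
x = uint32([1 2 3]);

% Indexed assignment keeps the uint32 type; 2.7 is rounded to 3.
x(2) = 2.7

% Assigning to the whole variable replaces it, so the type changes.
y = x;
y = 2.7;
class(x)
class(y)
```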

Assignment with dot notation is one way to convert the type of a variable in a table. The
convertvars function is another way and has two benefits. First, it avoids any confusion about
overwriting as opposed to assignment into a variable. The convertvars function always overwrites
existing variables and converts their type. Second, convertvars can operate on more than one
variable at a time. There are several more text variables in housing to be converted to the
categorical data type. Changing them one at a time would get tedious, but convertvars can
convert more than one variable in one command.
housing = convertvars(housing,["BldgType" "Foundation"],"categorical");

It is not necessary to explicitly list the variables by name or position in the table. You can find all the
table variables that are string arrays and convert them to categorical variables. To specify table
variables that are string arrays, use the function handle @isstring when calling convertvars.
housing = convertvars(housing,@isstring,"categorical");

In both cases, assign the output of convertvars back to the original table. Otherwise, the update is
lost.
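The need to assign the output can be demonstrated on a small table. A minimal sketch with made-up values; stillString and classes are hypothetical names.

```matlab
T = table(["NAmes"; "Gilbert"],["1Fam"; "TwnhsE"],[5; 6], ...
    'VariableNames',{'Neighborhood','BldgType','OverallCond'});

% The result is discarded here, so T is unchanged.
convertvars(T,@isstring,"categorical");
stillString = isstring(T.Neighborhood)

% Assigning the output back to T applies the conversion.
T = convertvars(T,@isstring,"categorical");
classes = varfun(@class,T,'OutputFormat','cell')
```

Only the two string variables are converted. The numeric variable OverallCond keeps its type because @isstring returns false for it.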

Sometimes, converting all text variables to categorical is too much. For example, if the current
homeowners' names were present in the data, then it would not make sense to store them in a
categorical variable. Homeowners' names do not define housing categories. You might keep their
names in a string array instead.

As another example, the CentralAir variable is one of the variables that was converted to
categorical. But because its categories are just Y and N, it might make more sense to consider it a
logical variable.
summary(housing.CentralAir)

N 196
Y 2734

The logical data type (like all the integer types) does not allow missing values (analogous to NaN),
while categorical does. The CentralAir variable happens to have no missing data values. You
can use either logical or categorical as the data type for CentralAir.
any(ismissing(housing.CentralAir))

ans = logical
0

Convert the data type to logical, with true corresponding to Y, using dot notation to overwrite the
existing categorical variable with the new logical one.
housing.CentralAir = (housing.CentralAir == "Y");

Display the converted data in housing.


housing

housing=2930×25 table
PID MSSubClass LotFrontage LotArea Neighborhood BldgType OverallCond
__________ __________ ___________ _______ ____________ ________ ___________

0526301100 020 141 31770 NAmes 1Fam 5
0526350040 020 80 11622 NAmes 1Fam 6
0526351010 020 81 14267 NAmes 1Fam 6
0526353030 020 93 11160 NAmes 1Fam 5
0527105010 060 74 13830 Gilbert 1Fam 5
0527105030 060 78 9978 Gilbert 1Fam 6
0527127150 120 41 4920 StoneBr TwnhsE 5
0527145080 120 43 5005 StoneBr TwnhsE 5
0527146030 120 39 5389 StoneBr TwnhsE 5
0527162130 060 60 7500 Gilbert 1Fam 5
0527163010 060 75 10000 Gilbert 1Fam 5
0527165230 020 NaN 7980 Gilbert 1Fam 7
0527166040 060 63 8402 Gilbert 1Fam 5
0527180040 020 85 10176 Gilbert 1Fam 5
0527182190 120 NaN 6820 StoneBr TwnhsE 5
0527216070 060 47 53504 StoneBr 1Fam 5

All the text data has been converted to categorical variables. But there are still a few things to
clean up.

The OverallCond variable was read in as a numeric array, but its values are all drawn from the
integers 1-10. You can leave these values as numeric data, but you can also think of them as ordinal
categorical data. When a categorical array is ordinal, its categories have a specified order. For
example, the categories 10 and 5 can be compared (10 > 5, because a house whose condition is
rated as a 10 is theoretically nicer than one rated 5), but for these comparisons, there is no numeric
meaning to 10 - 5. To avoid unintentionally treating OverallCond as numeric data, convert it to an
ordinal categorical array, which still enables relational comparisons but prevents arithmetic
operations. The category names 1, 2, and so on are easy to interpret and are acceptable.
housing.OverallCond = categorical(housing.OverallCond,1:10,"Ordinal",true);
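The behavior of an ordinal categorical array can be checked on a few made-up ratings. A minimal sketch; cond, isNicer, atLeastFive, and arithmeticAllowed are hypothetical names.

```matlab
cond = categorical([5 10 3],1:10,"Ordinal",true);

% Relational comparisons follow the category order 1 < 2 < ... < 10.
isNicer = cond(2) > cond(1)
atLeastFive = cond >= '5'

% Arithmetic is not defined for categorical arrays, so this errors.
try
    d = cond(2) - cond(1);
    arithmeticAllowed = true;
catch
    arithmeticAllowed = false;
end
arithmeticAllowed
```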

Similarly, the MSSubClass variable consisted of numeric codes in the original spreadsheet. You can
think of those values as being categorical data. Because there is no mathematical order to these
particular codes, the categories are nonordinal (or nominal). In this case, readtable read those
values in as text to preserve leading zeroes in the codes. MSSubClass was then converted to
categorical data.

While MSSubClass has the data type that you want, you might find it difficult to interpret the codes
as categories of houses. The file that describes the Ames Housing Data contains the definitions of the
numeric codes. Giving these categories readable names can help you understand the data. To make it
clear which names go with which numbers, specify both the categories (code) and their names
(subclass) in another call to the categorical function.
code = ["020" "030" "040" "045" "050" "060" "070" "075" "080" "085" "090" "120" "150" "160" "180"
subclass = ["1-STORY 1946 & NEWER ALL STYLES" ...
"1-STORY 1945 & OLDER" ...
"1-STORY W/FINISHED ATTIC ALL AGES" ...
"1-1/2 STORY - UNFINISHED ALL AGES" ...
"1-1/2 STORY FINISHED ALL AGES" ...
"2-STORY 1946 & NEWER" ...
"2-STORY 1945 & OLDER" ...
"2-1/2 STORY ALL AGES" ...
"SPLIT OR MULTI-LEVEL" ...
"SPLIT FOYER" ...
"DUPLEX - ALL STYLES AND AGES" ...
"1-STORY PUD (Planned Unit Development) - 1946 & NEWER" ...
"1-1/2 STORY PUD - ALL AGES" ...
"2-STORY PUD - 1946 & NEWER" ...
"PUD - MULTILEVEL - INCL SPLIT LEV/FOYER" ...
"2 FAMILY CONVERSION - ALL STYLES AND AGES"];
housing.MSSubClass = categorical(housing.MSSubClass,code,subclass);
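The three-argument form of categorical pairs each code with a name by position. A minimal sketch with three of the codes and shortened, made-up names; codes, valueset, catnames, and c are hypothetical.

```matlab
codes = ["020"; "060"; "020"; "120"];
valueset = ["020" "060" "120"];
catnames = ["1-STORY" "2-STORY" "1-STORY PUD"];

% Each element of valueset is mapped to the name at the same position.
c = categorical(codes,valueset,catnames)
categories(c)
```

The categories keep the order of valueset, not alphabetical order, so the name that belongs to each code is unambiguous.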


The category names for the BldgType variable are not obvious. As with MSSubClass, more
descriptive names can help you understand the building categories. To display the number of houses
in each building category, use the summary function.

summary(housing.BldgType)

1Fam 2425
2fmCon 62
Duplex 109
Twnhs 101
TwnhsE 233

With only five categories, you can safely list the new category names in the right order without
specifying the old names. To rename categories, use the renamecats function.

types = ["Single-family Detached" "Two-family Conversion" "Duplex" "Townhouse End Unit" "Townhous
housing.BldgType = renamecats(housing.BldgType,types);

The GarageType variable includes the category NA, standing for Not Applicable. In GarageType, NA
means that the house does not have a garage. But it is too easy to confuse NA with a missing value. A
true missing value means it cannot be determined if a house has a garage. But in this housing data, it
is always known if a house has a garage. Change that one category name to make its meaning clearer.

housing.GarageType = renamecats(housing.GarageType,"NA","None");
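Renaming one category changes its label everywhere without touching the data or the other categories. A minimal sketch with made-up garage types; g is a hypothetical name.

```matlab
g = categorical(["Attchd"; "NA"; "Detchd"; "NA"]);
categories(g)

% Rename only the NA category; the elements keep their category membership.
g = renamecats(g,"NA","None");
categories(g)
countcats(g)
any(isundefined(g))
```

The renamed category still counts two elements, and no element becomes undefined.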

Finally, the PID variable was read in as a string array. While its values were numeric, some of them
had leading zeroes. The readtable function preserved this information by storing the values as
strings. Then the call to convertvars converted the PID variable to a categorical array. PID
stores identification numbers that are unique. Identification numbers are assigned as needed and do
not come from a fixed set of values. There is no particular advantage in storing them in a
categorical variable. If every identification number is a category, then adding a new identification
number means adding a new category to PID. It might be more convenient to convert PID back to a
string array. To convert values to strings, use the string function.

housing.PID = string(housing.PID);

Display the results of the preliminary data cleaning.

housing

housing=2930×25 table
PID MSSubClass LotFrontage LotAr
____________ _____________________________________________________ ___________ _____

"0526301100" 1-STORY 1946 & NEWER ALL STYLES 141 3177
"0526350040" 1-STORY 1946 & NEWER ALL STYLES 80 1162
"0526351010" 1-STORY 1946 & NEWER ALL STYLES 81 1426
"0526353030" 1-STORY 1946 & NEWER ALL STYLES 93 1116
"0527105010" 2-STORY 1946 & NEWER 74 1383
"0527105030" 2-STORY 1946 & NEWER 78 997
"0527127150" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 41 492
"0527145080" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 43 500
"0527146030" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 39 538
"0527162130" 2-STORY 1946 & NEWER 60 750
"0527163010" 2-STORY 1946 & NEWER 75 1000
"0527165230" 1-STORY 1946 & NEWER ALL STYLES NaN 798
"0527166040" 2-STORY 1946 & NEWER 63 840
"0527180040" 1-STORY 1946 & NEWER ALL STYLES 85 1017
"0527182190" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER NaN 682
"0527216070" 2-STORY 1946 & NEWER 47 5350

Create Variable for Date of Sale

The table has separate variables for the month and year of sale. It is more convenient if those
variables are combined in one datetime variable. Assignment by using dot notation is a good way to
add a new variable at the right edge of a table. Add the date of sale as a new variable.

housing.LastSoldDate = datetime(housing.YrSold,housing.MoSold,1,"Format","MMM yyyy");

Now delete the two original variables. It is easier to list the variables by name and use removevars.

housing = removevars(housing,["YrSold" "MoSold"])

housing=2930×24 table
PID MSSubClass LotFrontage LotAr
____________ _____________________________________________________ ___________ _____

"0526301100" 1-STORY 1946 & NEWER ALL STYLES 141 3177
"0526350040" 1-STORY 1946 & NEWER ALL STYLES 80 1162
"0526351010" 1-STORY 1946 & NEWER ALL STYLES 81 1426
"0526353030" 1-STORY 1946 & NEWER ALL STYLES 93 1116
"0527105010" 2-STORY 1946 & NEWER 74 1383
"0527105030" 2-STORY 1946 & NEWER 78 997
"0527127150" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 41 492
"0527145080" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 43 500
"0527146030" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 39 538
"0527162130" 2-STORY 1946 & NEWER 60 750
"0527163010" 2-STORY 1946 & NEWER 75 1000
"0527165230" 1-STORY 1946 & NEWER ALL STYLES NaN 798
"0527166040" 2-STORY 1946 & NEWER 63 840
"0527180040" 1-STORY 1946 & NEWER ALL STYLES 85 1017
"0527182190" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER NaN 682
"0527216070" 2-STORY 1946 & NEWER 47 5350

Explore Data with Plots

Explore the data by making some simple plots. Many basic plotting commands do not accept tables as
input arguments. But you can use dot notation to pass one or more table variables into a plotting
function. You are taking arrays out of the table and passing them as input arguments to a plotting
function.

For example, make a scatter plot of the sale prices of houses in the table as a function of the years in
which they were built.

scatter(housing.YearBuilt,housing.SalePrice,20,"filled");

A log transformation of the prices might show a simpler relationship between year and price. Also,
you can show more information in the scatter plot by using the living area of the houses to color the
markers. The living areas have a long right tail, so it is also useful to show a log transformation of the
areas. To transform the two table variables, wrap them in calls to the log function. Then make
another scatter plot.

logSalePrice = log(housing.SalePrice);
logLivingArea = log(housing.TotalAboveGroundLivingArea);
scatter(housing.YearBuilt,logSalePrice,20,logLivingArea,"filled");

Clean Errors in Data

Any large, complex data set collected over a long period of time might contain some errors. Check for
errors in the housing data. Dates in the data are a good place to start. First compare YearBuilt to
YearRemod_Add.
checkRows = housing.YearBuilt > housing.YearRemod_Add;
housing(checkRows,:)

ans=1×24 table
PID MSSubClass LotFrontage LotArea Neighborhood
____________ _______________________________ ___________ _______ ____________

"0907194160" 1-STORY 1946 & NEWER ALL STYLES 65 10739 CollgCr

It is not possible for remodeling to have been done in 2001 if the house itself was built in 2002. If you
assume that the YearBuilt value is the one in error (an assumption that needs to be
confirmed), you can use dot notation to assign 2001 as the year in which this house was built.
housing.YearBuilt(checkRows) = 2001;
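The same check-and-correct pattern can be sketched on a small, made-up table. The variable name YearRemod and the values are hypothetical. As a variation on the assignment above, this sketch copies the remodel year into the flagged rows instead of assigning a constant.

```matlab
% Hypothetical table in which the second row is inconsistent.
T = table([1995; 2002; 1980],[2001; 2001; 1990], ...
    'VariableNames',["YearBuilt" "YearRemod"]);

% Logical vector that flags rows built after they were remodeled.
checkRows = T.YearBuilt > T.YearRemod;
disp(T(checkRows,:))

% Assign into only the flagged rows by using dot notation.
T.YearBuilt(checkRows) = T.YearRemod(checkRows);
```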

As another check, compare the new LastSoldDate variable to YearBuilt.


checkRows = housing.YearBuilt > year(housing.LastSoldDate);
housing(checkRows,:)

ans=2×24 table
PID MSSubClass LotFrontage LotArea Neighborhood
____________ _______________________________ ___________ _______ ____________

"0908154235" 2-STORY 1946 & NEWER 313 63887 Edwards


"0908154195" 1-STORY 1946 & NEWER ALL STYLES 128 39290 Edwards

There is another issue. These two houses were sold in late 2007, as shown in the LastSoldDate
variable. But the corresponding value in YearBuilt is 2008. It might be that for these houses, the
years in YearBuilt were recorded in early 2008 (another assumption needing confirmation). Update
the YearBuilt variable, this time by using dot notation to assign to two rows.

housing.YearBuilt(checkRows) = 2007;

Clean Up Missing Data

The next step in cleaning the data is to check for missing data in the numeric and categorical
variables. The one logical variable in housing does not support missing values. The ismissing
function indicates which elements of the table have missing values.

missingElements = ismissing(housing)

missingElements = 2930×24 logical array

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The ismissing function returns a logical matrix that is the same size as the table. Summing the
columns of that matrix gives the number of missing values in each of the variables of the table.

numMissing = sum(missingElements,1)

numMissing = 1×24

0 0 490 0 0 0 0 0 0 0 0 0 0 0 0 0

Only three of the variables have missing data, but without the variable names it is not easy to tell
which variables they are. One way to tell is to index into the VariableNames property of the table to
find the names that correspond to the variables with missing values.

housing.Properties.VariableNames(numMissing > 0)

ans = 1×3 cell


{'LotFrontage'} {'BsmtFullBath'} {'BsmtHalfBath'}
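A minimal sketch of these steps on a made-up table follows. The values and the Label variable are assumptions for illustration.

```matlab
% Hypothetical table with missing values in two of its three variables.
T = table([65; NaN; 80],[1; 0; NaN],["a"; "b"; "c"], ...
    'VariableNames',["LotFrontage" "BsmtFullBath" "Label"]);

% Logical matrix with the same size as the table.
missingElements = ismissing(T);

% Column sums give the number of missing values per variable.
numMissing = sum(missingElements,1);

% Names of the variables that have at least one missing value.
missingNames = string(T.Properties.VariableNames(numMissing > 0));
disp(missingNames)
```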

Deciding what to do about missing data is a challenge. If the data is missing at random, and there are
only a few missing values, one strategy is to remove those rows from the table. The four missing
basement bath values (NaNs, in this case) occur in only two rows. You can remove those two rows by
using the rmmissing function.

missingBsmtBath = ismissing(housing.BsmtFullBath) | ismissing(housing.BsmtHalfBath);


housing(missingBsmtBath,:)

ans=2×24 table
PID MSSubClass LotFrontage LotArea Neighborhood
____________ _______________________________ ___________ _______ ____________

"0903230120" 1-STORY 1946 & NEWER ALL STYLES 99 5940 BrkSide


"0908154080" 1-STORY 1946 & NEWER ALL STYLES 123 47007 Edwards

housing = rmmissing(housing,"DataVariables",["BsmtFullBath" "BsmtHalfBath"]);


whos housing

Name Size Bytes Class Attributes

housing 2928x24 595935 table

This call to rmmissing removes only the rows that have missing values in BsmtFullBath and
BsmtHalfBath. The 490 rows with missing LotFrontage values are still in the table. You can
remove these 490 rows but doing so deletes more than 16% of the data. You also can fill these
missing values with the mean frontage value by using the fillmissing function, but that is not
practical for this data. For variables that form a time series, fillmissing also supports filling
variables with interpolated values or moving-window smoothed values. LotFrontage is not a time
series. The data in this variable is a cross-sectional data set.
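To make the two options concrete, here is a small sketch on made-up values. It removes rows by using rmmissing with the "DataVariables" option and then fills the remaining gaps with the mean by using fillmissing. As noted above, mean filling is shown only to illustrate the call and is not recommended for the housing data.

```matlab
% Hypothetical table with missing values in both variables.
T = table([65; NaN; 80; 70],[1; 0; NaN; 2], ...
    'VariableNames',["LotFrontage" "BsmtFullBath"]);

% Remove only the rows where BsmtFullBath is missing.
T = rmmissing(T,"DataVariables","BsmtFullBath");

% Fill the remaining missing LotFrontage values with the mean of the
% observed values.
fillValue = mean(T.LotFrontage,"omitnan");
T.LotFrontage = fillmissing(T.LotFrontage,"constant",fillValue);
disp(T)
```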

One commonly used strategy for filling in missing values in cross-sectional data is to create a
regression model to predict the missing values in a row from the non-missing data in that row. A
simple scatter plot indicates that there is a log-log relationship between the area of a lot and its
frontage. That relationship suggests a model.

loglog(housing.LotArea,housing.LotFrontage,'o')

You can use that log-log relationship to fill in the missing LotFrontage values by regressing the
values on LotArea.

missingValues = ismissing(housing.LotFrontage);
beta = polyfit(log(housing.LotArea(~missingValues)),log(housing.LotFrontage(~missingValues)),1);
housing.LotFrontage(missingValues) = exp(polyval(beta,log(housing.LotArea(missingValues))));

You can use dot notation to work on data in a table when you use functions such as polyfit and
polyval that accept numeric vectors but not tables. You can think of a table as a container that is
designed to hold data having different types. Functions such as polyfit that are specifically for
numeric inputs do not work on a table because a table often contains nonnumeric data. Even when a
table contains only numeric data, it is still a container. The functions must be applied to the contents
of the table. Use dot notation to access table variables.
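The regression-based fill can be checked on synthetic data whose answer is known. This sketch assumes an exact power law, frontage = 2*sqrt(area), so the fitted slope should be 0.5 and the imputed value should be 60. The data is made up for illustration.

```matlab
% Synthetic data that follows an exact power law, with one value missing.
area = [100; 400; 900; 1600; 2500];
frontage = 2*sqrt(area);
frontage(3) = NaN;
T = table(area,frontage,'VariableNames',["LotArea" "LotFrontage"]);

% Fit a line to the logs of the observed values.
missingValues = ismissing(T.LotFrontage);
beta = polyfit(log(T.LotArea(~missingValues)), ...
    log(T.LotFrontage(~missingValues)),1);

% Predict the missing values and transform back from the log scale.
T.LotFrontage(missingValues) = exp(polyval(beta,log(T.LotArea(missingValues))));
disp(T)
```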

Add the imputed missing values that you calculated with polyfit and polyval to the scatter plot. A
simple imputation scheme might not be sufficient in a real analysis of this data, but it illustrates how
to visualize and make computations on numeric data in a table.

hold on
loglog(housing.LotArea(missingValues),housing.LotFrontage(missingValues),'rx')
hold off

Arithmetic on Table Variables

Dot notation has been convenient for operations such as converting an existing table variable, adding
a new variable, assigning values, plotting, and applying functions like polyval to a table variable.
Dot notation is also convenient for arithmetic operations on table variables. For example, convert the
LotFrontage variable from feet to meters.

housing.LotFrontage = 0.3048 * housing.LotFrontage;


housing.Properties.VariableUnits("LotFrontage") = "m"

housing=2928×24 table
PID MSSubClass LotFrontage LotAr
____________ _____________________________________________________ ___________ _____

"0526301100" 1-STORY 1946 & NEWER ALL STYLES 42.977 3177


"0526350040" 1-STORY 1946 & NEWER ALL STYLES 24.384 1162
"0526351010" 1-STORY 1946 & NEWER ALL STYLES 24.689 1426
"0526353030" 1-STORY 1946 & NEWER ALL STYLES 28.346 1116
"0527105010" 2-STORY 1946 & NEWER 22.555 1383
"0527105030" 2-STORY 1946 & NEWER 23.774 997
"0527127150" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 12.497 492
"0527145080" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 13.106 500
"0527146030" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 11.887 538
"0527162130" 2-STORY 1946 & NEWER 18.288 750
"0527163010" 2-STORY 1946 & NEWER 22.86 1000
"0527165230" 1-STORY 1946 & NEWER ALL STYLES 19.049 798
"0527166040" 2-STORY 1946 & NEWER 19.202 840

"0527180040" 1-STORY 1946 & NEWER ALL STYLES 25.908 1017


"0527182190" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 17.465 682
"0527216070" 2-STORY 1946 & NEWER 14.326 5350

Using dot notation means that the multiplication is applied not to the housing table, which cannot
be done because tables are containers, but rather to its LotFrontage variable, which is a numeric
vector. With dot notation, you extracted LotFrontage from the table and put the modified version
back in.
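The extract, modify, and reassign cycle can be sketched on a two-row table. The values are made up, and the units are set by assigning the whole VariableUnits property, which is one of several ways to set it.

```matlab
% Hypothetical table with lengths in feet and areas in square feet.
T = table([141; 80],[31770; 11622],'VariableNames',["LotFrontage" "LotArea"]);
T.Properties.VariableUnits = ["ft" "ft^2"];

% The right side extracts a numeric vector; the left side puts it back.
T.LotFrontage = 0.3048 * T.LotFrontage;
T.Properties.VariableUnits = ["m" "ft^2"];
disp(T)
```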

Another way to access the contents of a table is to subscript into it by using curly braces, just as you
use curly braces to extract the contents of a cell array. You can use curly brace subscripting to refer
to and operate on data in a table by extracting and reinserting contents. For example, convert
LotFrontage back to feet by using curly brace subscripting.

housing{:,"LotFrontage"} = housing{:,"LotFrontage"} / 0.3048;


housing.Properties.VariableUnits("LotFrontage") = "ft";

Dot notation and brace subscripting are different syntaxes for the same kinds of operations. They
both work on the contents of a table. Also, they both enable you to specify a table variable and a
subset of its rows.

housing.LotArea(1:2)

ans = 2×1

31770
11622

housing{1:2,"LotArea"}

ans = 2×1

31770
11622

While both syntaxes work on the contents of a table, there are two subtle differences to consider.

First, a limitation of curly brace subscripting is that it assigns into the contents of a table rather than
replacing a variable. For example, this assignment does not change the data type of the
LotFrontage variable in the way that an assignment using dot notation does. The call to the single
function on the right side of the assignment creates an array having the single data type. But by
subscripting into housing with curly braces, you assign values from that array into the existing table
variable. And the data type of LotFrontage is double. The values from the right side are converted
back to double by this assignment.

housing{:,"LotFrontage"} = single(housing{:,"LotFrontage"});
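The difference can be observed directly by checking the class of the variable after each kind of assignment. This sketch uses a made-up one-variable table and relies on the behavior described above.

```matlab
% Hypothetical table whose variable starts out as double.
T = table([141; 80],'VariableNames',"LotFrontage");

% Brace assignment writes values into the existing double variable.
T{:,"LotFrontage"} = single(T{:,"LotFrontage"});
classAfterBrace = class(T.LotFrontage);

% Dot assignment replaces the variable, so the type changes.
T.LotFrontage = single(T.LotFrontage);
classAfterDot = class(T.LotFrontage);

disp([string(classAfterBrace) string(classAfterDot)])
```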

Second, a benefit of curly brace subscripting is that, unlike dot notation, it uses the familiar two-
dimensional subscripting syntax. This syntax enables you to refer to more than one variable at a time
and also to a subset of rows. For example, there are five variables whose units are square feet.
Converting these variables to square meters one at a time is tedious. To apply the multiplication to all
five variables at once, use curly brace subscripting.

areaVars = ["LotArea" "FirstFlrArea" "SecondFlrArea" "LowQualFinishedArea" "TotalAboveGroundLivin


housing{:,areaVars} = 0.3048^2 * housing{:,areaVars};
housing.Properties.VariableUnits(areaVars) = "m^2";

A common mistake is to use parenthesis subscripting instead of braces to operate on the contents of a
table. While some functions, such as ismissing or varfun, do accept a table as their input, many
numeric operations, including arithmetic, do not. For example, this assignment using parentheses
results in an error. The try-catch block catches the error and displays it.

try
housing(:,areaVars) = 0.3048^2 * housing(:,areaVars);
catch ME
disp(ME.message)
end

Undefined function 'mtimes' for input arguments of type 'table'.

This assignment results in an error because housing(:,areaVars) is a 2928-by-5 table, not a
numeric matrix. If you used curly brace subscripting, such as housing{:,areaVars}, then the
result would be a 2928-by-5 numeric matrix. Because tables are designed to hold data of different
types, including nonnumeric data, many functions that make sense for only numeric data do not work
on a table. Dot notation and curly brace subscripting exist to give you access to data in a table.
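A short sketch makes the distinction visible by inspecting what each kind of subscripting returns. The table and its Label variable are made up for illustration.

```matlab
% Hypothetical table with two numeric variables and one string variable.
T = table([31770; 11622],[1656; 896],["a"; "b"], ...
    'VariableNames',["LotArea" "FirstFlrArea" "Label"]);
areaVars = ["LotArea" "FirstFlrArea"];

sub = T(:,areaVars);    % parentheses return a smaller table
vals = T{:,areaVars};   % braces return the contents as a numeric matrix

% Arithmetic applies to the extracted contents, which are assigned back.
T{:,areaVars} = 0.3048^2 * T{:,areaVars};
disp(T)
```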

A third way to do calculations on numeric variables in a table is to use the varfun function. Like
curly brace subscripting, varfun can operate on all or only some of the variables in a table. Unlike
curly braces, varfun operates on each table variable separately. By default, varfun returns another
table containing a variable for each separate result.

Sometimes the operation that you want to apply is an existing function. To pass the function as an
argument to varfun, use a function handle. For example, use the round function to round data in the
variables specified by areaVars.

roundedAreaTable = varfun(@round,housing,"InputVariables",areaVars)

roundedAreaTable=2928×5 table
round_LotArea round_FirstFlrArea round_SecondFlrArea round_LowQualFinishedArea ro
_____________ __________________ ___________________ _________________________ __

2952 154 0 0
1080 83 0 0
1325 123 0 0
1037 196 0 0
1285 86 65 0
927 86 63 0
457 124 0 0
465 119 0 0
501 150 0 0
697 96 72 0
929 71 83 0
741 110 0 0
781 73 63 0
945 125 0 0
634 140 0 0
4971 157 148 0

If there is no function that does exactly what you want, you can also write an anonymous function to
do it.
sqMeters2sqFeet = @(x) x / 0.3048^2;
areaTable = varfun(sqMeters2sqFeet,housing,"InputVariables",areaVars)

areaTable=2928×5 table
Fun_LotArea Fun_FirstFlrArea Fun_SecondFlrArea Fun_LowQualFinishedArea Fun_TotalA
___________ ________________ _________________ _______________________ __________

31770 1656 0 0
11622 896 0 0
14267 1329 0 0
11160 2110 0 0
13830 928 701 0
9978 926 678 0
4920 1338 0 0
5005 1280 0 0
5389 1616 0 0
7500 1028 776 0
10000 763 892 0
7980 1187 0 0
8402 789 676 0
10176 1341 0 0
6820 1502 0 0
53504 1690 1589 0

Because that result is a table, it can be assigned back into the original table with parenthesis
subscripting.
housing(:,areaVars) = areaTable;
housing.Properties.VariableUnits(areaVars) = "ft^2";
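The following sketch repeats both varfun calls on a made-up two-row table so that the naming of the output variables is easy to see. The values and the handle name toSqFeet are assumptions for illustration.

```matlab
% Hypothetical table of areas in square meters.
T = table([2951.5; 1079.7],[153.8; 83.2], ...
    'VariableNames',["LotArea" "FirstFlrArea"]);
areaVars = ["LotArea" "FirstFlrArea"];

% A named function adds its name as a prefix to each output variable.
rounded = varfun(@round,T,"InputVariables",areaVars);

% An anonymous function adds the prefix Fun_ instead.
toSqFeet = @(x) x / 0.3048^2;
converted = varfun(toSqFeet,T,"InputVariables",areaVars);

% Table-to-table assignment with parentheses stores the results.
T(:,areaVars) = converted;
disp(T)
```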

It is important to understand the difference between the parentheses in

housing(:,areaVars) = areaTable;

and the braces in

housing{:,areaVars} = 0.3048^2 * housing{:,areaVars};

The two assignments update the same variables in different ways. The assignment with parentheses assigns one table to
another. The assignment with curly braces explicitly assigns values to the content of the table. The
left and right sides of that assignment are numeric matrices. Because curly brace subscripting
extracts and reinserts data, it is a convenient way to modify data in place. Contents-to-contents
assignment can operate on only one data type at a time, while table-to-table assignment can move
data of different types. For example, this assignment results in an error because it involves mixed
numeric and categorical data in brace subscripting.
try
housing{:,["LotFrontage" "OverallCond"]} = normalize(housing{:,["LotFrontage" "OverallCond"]}
catch ME
disp(ME.message)
end

Unable to concatenate the specified table variables.
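A reduced version of this situation, on a made-up table, shows that brace extraction fails for mixed types while parenthesis subscripting still returns a table. The flag name braceFailed is introduced here for illustration.

```matlab
% Hypothetical table with one numeric and one ordinal categorical variable.
cond = categorical(["Good"; "Poor"],["Poor" "Good"],'Ordinal',true);
T = table([141; 80],cond,'VariableNames',["LotFrontage" "OverallCond"]);

% Braces must concatenate the two variables into one array.
try
    vals = T{:,["LotFrontage" "OverallCond"]};
    braceFailed = false;
catch ME
    braceFailed = true;
    disp(ME.message)
end

% Parentheses keep the variables separate inside a table.
sub = T(:,["LotFrontage" "OverallCond"]);
```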

Because varfun returns a table, assignment using parenthesis subscripting cannot change the type
of any table variables. For example, this assignment does not convert any variables from the double
to single data type.
housing(:,areaVars) = varfun(@single,housing,"InputVariables",areaVars);

To convert the data types of table variables, use convertvars, as previously shown.

Row Operations on Data in Table

Because curly brace subscripting extracts the variables from a table as one matrix having one data
type, you can use it to perform row operations across numeric variables in a table. For example, a
check on the data is to compare the individual square footage variables against
TotalAboveGroundLivingArea. Extract the former by using curly braces. Then compare their row
sums to TotalAboveGroundLivingArea, extracted by using dot notation.
area = housing{:,["FirstFlrArea" "SecondFlrArea" "LowQualFinishedArea"]}

area = 2928×3

1656 0 0
896 0 0
1329 0 0
2110 0 0
928 701 0
926 678 0
1338 0 0
1280 0 0
1616 0 0
1028 776 0

isequal(sum(area,2), housing.TotalAboveGroundLivingArea)

ans = logical
1

The square footage data is consistent. Another example is to compute the total number of bathrooms
in each house by extracting the four different bathroom counts and adding them up across each row.
bathCountVars = ["BsmtHalfBath" "HalfBath" "BsmtFullBath" "FullBath"];
bathCounts = housing{:,bathCountVars}

bathCounts = 2928×4

0 0 1 1
0 0 0 1
0 1 0 1
0 1 1 2
0 1 0 2
0 1 0 2
0 0 1 2
0 0 0 2
0 0 1 2
0 1 0 2

You might think to sum the rows of that matrix as:

sum(housing{:,bathCountVars},2);

but that sum is not correct, because it counts a half-bath the same as a full bathroom. A trend in real
estate listings is to account for multiple half-baths by counting them after the decimal point, so that
each half-bath adds 0.1 to the total. Matrix multiplication makes that operation one line.

TotalBaths = housing{:,bathCountVars} * [.1; .1; 1; 1];
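To see how the weights act on each row, consider this sketch with three made-up houses. The plain row sum and the weighted sum are computed side by side.

```matlab
% Hypothetical bathroom counts for three houses.
bathCountVars = ["BsmtHalfBath" "HalfBath" "BsmtFullBath" "FullBath"];
T = table([0; 0; 2],[0; 1; 2],[1; 0; 0],[1; 2; 0], ...
    'VariableNames',bathCountVars);

% Plain row sum: every bathroom counts as 1.
naiveSum = sum(T{:,bathCountVars},2);

% Weighted row sum: half-baths add 0.1, full baths add 1.
TotalBaths = T{:,bathCountVars} * [.1; .1; 1; 1];
disp([naiveSum TotalBaths])
```

The order of the weights must match the order of the names in bathCountVars.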

Replace those four variables with TotalBaths, rather than adding a new variable at the end of the
table. Begin this replacement by using addvars to add TotalBaths next to the existing variables.

housing = addvars(housing,TotalBaths, 'After',"HalfBath");

There is a mistake in one row of the data. A townhouse built in 2007 probably does not have four half
baths and no full baths.

groupcounts(housing,"TotalBaths")

ans=17×3 table
TotalBaths GroupCount Percent
__________ __________ ________

0.4 1 0.034153
1 442 15.096
1.1 293 10.007
1.2 20 0.68306
1.3 2 0.068306
2 890 30.396
2.1 558 19.057
2.2 29 0.99044
3 349 11.919
3.1 288 9.8361
3.2 6 0.20492
3.3 1 0.034153
4 25 0.85383
4.1 16 0.54645
4.2 3 0.10246
6 2 0.068306

housing(housing.TotalBaths < 1,:)

ans=1×25 table
PID MSSubClass LotFrontage LotAr
____________ _____________________________________________________ ___________ _____

"0528228275" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 53 3922

The value in BsmtHalfBath should probably have been recorded as two full bathrooms. The bathroom counts are all numeric, so one
assignment with braces updates all three values across that row.

housing{housing.TotalBaths < 1,["BsmtHalfBath" "FullBath" "TotalBaths"]} = [0 2 2.2];
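The same two steps, counting groups and then correcting one row across several variables, can be sketched on a made-up three-row table. The name suspect is hypothetical.

```matlab
% Hypothetical table in which the first row has only half baths.
T = table([2; 0; 0],[2; 0; 1],[0; 2; 1],[0.4; 2; 1.1], ...
    'VariableNames',["BsmtHalfBath" "HalfBath" "FullBath" "TotalBaths"]);

% Count the rows for each distinct value of TotalBaths.
G = groupcounts(T,"TotalBaths");
disp(G)

% Update three numeric variables in the suspect row with one assignment.
suspect = T.TotalBaths < 1;
T{suspect,["BsmtHalfBath" "FullBath" "TotalBaths"]} = [0 2 2.2];
disp(T)
```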

Next use removevars to delete the redundant original variables.

housing = removevars(housing,bathCountVars)

housing=2928×21 table
PID MSSubClass LotFrontage LotAr
____________ _____________________________________________________ ___________ _____

"0526301100" 1-STORY 1946 & NEWER ALL STYLES 141 3177


"0526350040" 1-STORY 1946 & NEWER ALL STYLES 80 1162
"0526351010" 1-STORY 1946 & NEWER ALL STYLES 81 1426
"0526353030" 1-STORY 1946 & NEWER ALL STYLES 93 1116
"0527105010" 2-STORY 1946 & NEWER 74 1383
"0527105030" 2-STORY 1946 & NEWER 78 997
"0527127150" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 41 492
"0527145080" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 43 500
"0527146030" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 39 538
"0527162130" 2-STORY 1946 & NEWER 60 750
"0527163010" 2-STORY 1946 & NEWER 75 1000
"0527165230" 1-STORY 1946 & NEWER ALL STYLES 62.496 798
"0527166040" 2-STORY 1946 & NEWER 63 840
"0527180040" 1-STORY 1946 & NEWER ALL STYLES 85 1017
"0527182190" 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 57.299 682
"0527216070" 2-STORY 1946 & NEWER 47 5350

Unlike curly braces, varfun operates on each variable in a table separately. For that reason, varfun
cannot do row operations. The related function rowfun can do row operations. It is often simpler and
faster to use curly brace subscripting for row operations.
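For comparison, here is a sketch of the same row sum computed both ways on made-up values. The output name TotalArea is an assumption for illustration.

```matlab
% Hypothetical floor areas for two houses.
T = table([1656; 928],[0; 701],[0; 0], ...
    'VariableNames',["FirstFlrArea" "SecondFlrArea" "LowQualFinishedArea"]);

% rowfun passes the variables of each row as separate input arguments.
R = rowfun(@(a,b,c) a + b + c,T,"OutputVariableNames","TotalArea");

% Brace subscripting extracts one matrix, so sum works along rows.
braceTotal = sum(T{:,:},2);
disp([R.TotalArea braceTotal])
```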

Reductions of Data in Table

In previous sections, the operations on numeric data in the table were transformations that replace
the original values. Many other important operations are reductions whose results are scalars. For
example, calculate the median price of the values in SalePrice.
median(housing.SalePrice)

ans = 160000

The median function works column-wise on matrices. You can use curly brace subscripting to extract
four numeric variables as one numeric matrix. Then you can calculate the medians of the columns of the
matrix.
median(housing{:,["LotFrontage", "LotArea" "TotalAboveGroundLivingArea" "SalePrice"]})

ans = 1×4
10^5 ×

0.0007 0.0944 0.0144 1.6000

This operation does not attach variable names or any other table metadata to the result. As an
alternative, you can use varfun to apply median to each variable in the table. With varfun, the
result is another table that contains separate numeric results and preserves the names.
varfun(@median,housing,"InputVariables",["LotFrontage", "LotArea" "TotalAboveGroundLivingArea" "S

ans=1×4 table
median_LotFrontage median_LotArea median_TotalAboveGroundLivingArea median_SalePrice
__________________ ______________ _________________________________ ________________

69.183 9436.5 1442 1.6e+05

These two ways to get the medians return the same values. The trade-off is between having the variable
names preserved in another table and having the results in one numeric row vector. The way you pick
depends on what you plan to do with the result.
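A small sketch on made-up values confirms that the two approaches return the same numbers in different containers.

```matlab
% Hypothetical table with two numeric variables.
T = table([60; 70; 80],[9000; 9500; 12000], ...
    'VariableNames',["LotFrontage" "LotArea"]);
vars = ["LotFrontage" "LotArea"];

% Braces: the result is a numeric row vector without names.
m = median(T{:,vars});

% varfun: the result is a one-row table with prefixed variable names.
M = varfun(@median,T,"InputVariables",vars);
disp(m)
disp(M)
```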

Operations on Mixed Data Types

Using curly braces when calculating the medians has another drawback. Curly braces require
compatible data types for all the variables. That is, the data you extract from the variables must have
data types that allow them to be concatenated into one matrix. Ordinal categorical data can also
have median values. But because categorical and numeric arrays cannot be concatenated, this
operation results in an error.

median(housing{:,["LotFrontage", "LotArea" "OverallCond" "TotalAboveGroundLivingArea" "SalePrice"]})

But because varfun operates on each variable in the table separately, there is no requirement that
the variables have the same data type or compatible types allowing concatenation. The only
requirement is that all the variables must support the function that is applied. To calculate the
medians of ordinal categorical variables and numeric variables in one function call, use varfun.

varfun(@median,housing,"InputVariables",["LotFrontage", "LotArea" "OverallCond" "TotalAboveGround

ans=1×5 table
median_LotFrontage median_LotArea median_OverallCond median_TotalAboveGroundLivingAr
__________________ ______________ __________________ _______________________________

69.183 9436.5 5 1442
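As a sketch of the mixed-type case, this made-up table pairs a numeric variable with an ordinal categorical variable. The category names are hypothetical.

```matlab
% Hypothetical ordinal categories, ordered from worst to best.
cond = categorical(["Poor"; "Excellent"; "Good"], ...
    ["Poor" "Good" "Excellent"],'Ordinal',true);
T = table([60; 70; 80],cond,'VariableNames',["LotFrontage" "OverallCond"]);

% varfun applies median to each variable separately, so the two
% variables never need to be concatenated.
M = varfun(@median,T);
disp(M)
```

The median of the categorical variable is itself a categorical value, and the result keeps the numeric median in its own variable.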

See Also
categorical | table | readtable | varfun | renamevars | convertvars | summary |
ismissing | rmmissing | datetime | removevars | addvars | groupcounts

Related Examples
• “Access Data in Tables” on page 9-32
• “Clean Messy and Missing Data in Tables” on page 9-19
• “Add and Delete Table Rows” on page 9-9
• “Add, Delete, and Rearrange Table Variables” on page 9-12
• “Modify Units, Descriptions, and Table Variable Names” on page 9-24
• “Create Timetables” on page 10-2
• “Resample and Aggregate Data in Timetable” on page 10-5
• “Grouped Calculations in Tables and Timetables” on page 9-86

Grouped Calculations in Tables and Timetables


Grouped calculations can help you interpret large datasets such as time-series data. In such
calculations, you use a grouping variable to split a dataset into groups and apply a function to each
group. A grouping variable contains values, such as time periods or station locations, that you can use
to group other data values, such as temperature readings or atmospheric concentrations of a gas. In
MATLAB®, you can store such data in tables or timetables. With grouped calculations in a table you
can often calculate results in-place, in one table, instead of breaking data out into separate tables and
merging results later.

This example shows how to import nitrogen dioxide (NO2) data from the US Environmental
Protection Agency (EPA) into a table and do grouped calculations on this data. NO2 is one of the
Criteria Air Pollutants regulated under the US Clean Air Act. It is toxic by itself and is also a key
component of photochemical smog that results in ground-level ozone production. NO2 is produced
through high-temperature processes that can split nitrogen and oxygen gases and enable them to
recombine. Natural processes such as lightning contribute NO2 to the atmosphere, but so do human activities such as
combustion in automobile engines and power plants, and biomass burning. The
concentration of NO2 in the atmosphere is also influenced by the photochemical cycling between NO
and NO2, atmospheric transport, and ultimately oxidation to nitric acid, causing acid rain. Different
processes contribute NO2 to the atmosphere on different timescales, leading to daily (diurnal),
weekly, and annual cycles in its atmospheric concentration. Time-series analysis of such data relies
heavily on grouped calculations to examine different periodic behavior or to average the data over
time to smooth out high-frequency variability and reveal long-term trends.

The example first shows how to do preliminary data cleaning, including conversion of the table to a
timetable. Then it shows simple ways to group the data by one grouping variable and calculate annual
mean NO2 concentrations. It also shows how to group the NO2 data by two grouping variables
together, time and location, enabling calculations that find locations exceeding EPA standards at
various times. You can also group the NO2 data by time period to look for daily or yearly cycles.
Finally it shows how to apply a function that requires inputs from multiple table variables to find the
times at which the maximum NO2 concentrations occurred at each site.

Import NO2 Data to Table

First, import NO2 data from the Air Quality System (AQS) database maintained by the EPA. This data
consists of hourly measurements of NO2 concentrations from outdoor monitors across the United
States, Puerto Rico, and the U.S. Virgin Islands. It is stored as a set of zipped spreadsheets, one for
each year starting with 1980.

Download hourly NO2 measurements for the years 1985–1989. You can download and unzip the
compressed spreadsheets by using the unzip function. The result is a set of files in your current folder
with names such as hourly_42602_1985.csv. Here, 42602 is an EPA code for NO2. (Data from the
US Environmental Protection Agency. Air Quality System Data Mart available via
https://www.epa.gov/airdata. Accessed July 15, 2021.)
yrs = string(1985:1989);
urls = "https://aqs.epa.gov/aqsweb/airdata/hourly_42602_" + yrs + ".zip";
fnames = strings(numel(yrs),1);
for ii = 1:numel(yrs)
fnames(ii) = unzip(urls(ii));
end
fnames

fnames = 5×1 string


"hourly_42602_1985.csv"

"hourly_42602_1986.csv"
"hourly_42602_1987.csv"
"hourly_42602_1988.csv"
"hourly_42602_1989.csv"

Import data from the spreadsheets into a table. Start by creating an empty table. Then import data
from the spreadsheets, one by one, by using the readtable function and adding it to the table.

Create import options that help specify how readtable imports tabular data. To create import
options based on the contents of the spreadsheets, use the detectImportOptions function. Read
all the text data into table variables that store strings. You can also specify that particular table
variables have certain data types. To specify that the TimeGMT and TimeLocal table variables
store times as duration arrays, use the setvaropts function.

NO2data = table;
opts = detectImportOptions(fnames(1),"TextType","string");
opts = setvaropts(opts,["TimeGMT","TimeLocal"],"Type","duration","InputFormat","hh:mm");

Import data from the spreadsheets by using the readtable function. You can vertically concatenate
the tables you read in so that all the data is in one large table.

The spreadsheets have column names, such as "Time GMT", that you cannot use as MATLAB
identifiers. As the warning messages indicate, readtable converts these names into table variable
names that are valid MATLAB identifiers, such as TimeGMT. When a table variable name is also a
valid MATLAB identifier, it is easier to access the variable by using dot notation, as in
NO2data.TimeGMT.

for ii = 1:numel(yrs)
NO2data = [NO2data; readtable(fnames(ii),opts)];
end

Warning: Column headers from the file were modified to make them valid MATLAB identifiers before
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names

Warning: Column headers from the file were modified to make them valid MATLAB identifiers before
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names

Warning: Column headers from the file were modified to make them valid MATLAB identifiers before
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names

Warning: Column headers from the file were modified to make them valid MATLAB identifiers before
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names

Warning: Column headers from the file were modified to make them valid MATLAB identifiers before
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names
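The import workflow can be tried without downloading the EPA files. This sketch writes a tiny, made-up CSV file to a temporary location, builds import options from it, and reads it twice to mimic appending several files. The file contents and the name data are assumptions for illustration.

```matlab
% Write a small hypothetical CSV file whose headers contain spaces.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'Time GMT,State Name,Sample Measurement\n');
fprintf(fid,'01:00,Arizona,0.5\n');
fprintf(fid,'02:00,Arizona,0.7\n');
fclose(fid);

% Detect import options, read text as strings, and read times as durations.
opts = detectImportOptions(fname,"TextType","string");
opts = setvaropts(opts,"TimeGMT","Type","duration","InputFormat","hh:mm");

% Start from an empty table and append the rows from each read.
data = table;
for ii = 1:2
    data = [data; readtable(fname,opts)];
end
delete(fname)
disp(data)
```

As in the example above, readtable might warn that it modified the column headers to make valid variable names.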

Display NO2data. It has 24 variables storing NO2 sample measurements, site locations, state names,
times, and many other pieces of information.

NO2data

NO2data=11294497×24 table
StateCode CountyCode SiteNum ParameterCode POC Latitude Longitude Datum
_________ __________ _______ _____________ ___ ________ _________ ______

4 1 7 42602 1 34.128 -109.31 "WGS84


4 1 7 42602 1 34.128 -109.31 "WGS84

4 1 7 42602 1 34.128 -109.31 "WGS84


4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84
4 1 7 42602 1 34.128 -109.31 "WGS84

Clean NO2 Table and Convert to Timetable

Next, prepare NO2data for analysis by cleaning the data. Data cleaning is the process of detecting
and correcting (or removing) parts of the data set that are corrupt, inaccurate, or irrelevant.
You can also convert table variables so that they have data types that can be more convenient for
analysis, such as categorical or datetime arrays.

For example, the table variable SampleMeasurement has measurements of NO2 concentration.
Concentrations below the method detection limit (MDL) are unreliable. To exclude them from
analysis, find the rows where SampleMeasurement is below the MDL. Set those elements to NaN.
NO2data.SampleMeasurement(NO2data.SampleMeasurement < NO2data.MDL) = NaN;
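
The same masking step can be tried on a small table. This is a minimal sketch with made-up values. Only the variable names SampleMeasurement and MDL come from this example.

```matlab
% Hypothetical measurements and detection limits (not from the EPA files).
SampleMeasurement = [0.5; 12; 3];
MDL = [1; 1; 5];
T = table(SampleMeasurement,MDL);

% Logical vector that is true where a measurement is below its MDL.
belowMDL = T.SampleMeasurement < T.MDL;

% Assigning through the logical vector changes only the flagged rows.
T.SampleMeasurement(belowMDL) = NaN;
disp(T)
```

Functions such as mean and max do not skip NaN values by default, but the "mean" and "max" methods of groupsummary and retime used later in this example omit them.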

Create a table that contains only the subset of variables that are relevant to this example. You can use
table subscripting to create a table that has all rows (specified by a colon) and only those variables
that you name.
NO2data = NO2data(:,["DateLocal","TimeLocal","SampleMeasurement","StateName","CountyName","SiteNu

NO2data=11294497×8 table
    DateLocal     TimeLocal    SampleMeasurement    StateName    CountyName    SiteNum    Latitud
    __________    _________    _________________    _________    __________    _______    _______

    1985-01-02      01:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      02:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      03:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      04:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      05:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      06:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      08:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      09:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      10:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      16:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      17:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      18:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      19:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      20:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      21:00             NaN           "Arizona"     "Apache"        7       34.128
    1985-01-02      22:00             NaN           "Arizona"     "Apache"        7       34.128

Combine the local date and time into a single timestamp. The new Timestamp table variable is a
datetime array. Delete the DateLocal and TimeLocal variables because they are now redundant.

NO2data.Timestamp = NO2data.DateLocal + NO2data.TimeLocal;
NO2data.Timestamp.Format = "default";
NO2data.DateLocal = [];
NO2data.TimeLocal = [];
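
The addition works because a datetime array plus a duration array is a datetime array. The following sketch uses made-up values and assumes, as the table display above suggests, that the times of day are stored as durations.

```matlab
% Hypothetical dates and times of day, stored separately as in the raw data.
DateLocal = datetime(1985,1,[2;3;4]);   % datetime array
TimeLocal = hours([1;2;3]);             % duration array

% Adding a duration array to a datetime array returns a datetime array.
Timestamp = DateLocal + TimeLocal;
Timestamp.Format = "default";
disp(Timestamp)
```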

To categorize the data later, convert the StateName and CountyName variables to categorical
arrays, first erasing space characters from the names. There are fixed sets of state and county names
in the data, which makes it convenient to create categories based on them.

NO2data.StateName = categorical(erase(NO2data.StateName," "));
NO2data.CountyName = categorical(erase(NO2data.CountyName," "));
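
To see what this conversion produces, consider a short string array. The names here are illustrative only.

```matlab
% Hypothetical state names, with one name repeated.
names = ["New York"; "North Dakota"; "New York"];

% Erase the spaces, then convert. Each unique name becomes one category.
states = categorical(erase(names," "));

categories(states)            % lists the category names
nnz(states == "NewYork")      % counts the elements in one category
```

Because the set of categories is fixed, comparisons such as states == "NewYork" return a logical array that you can sum or use as a subscript.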

Rename the SampleMeasurement variable to MeasuredNO2. One way to rename table variables is
by using the VariableNames property of the table.

NO2data.Properties.VariableNames("SampleMeasurement") = "MeasuredNO2";
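
Another way to rename table variables, assuming release R2020a or later, is the renamevars function. A minimal sketch on a hypothetical one-variable table:

```matlab
% Hypothetical table with a single variable.
SampleMeasurement = [1; 2];
T = table(SampleMeasurement);

% renamevars returns a new table, so assign the output back to T.
T = renamevars(T,"SampleMeasurement","MeasuredNO2");
disp(T.Properties.VariableNames)
```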

Convert NO2data to a timetable. The datetime values in Timestamp become the row times that label
the rows of the timetable. The dates and times of the original table were in separate variables. For
data like this, it is more convenient to import the data as a table, and then combine the separate
date and time variables into one datetime variable. Then convert the modified table by using the
table2timetable function.

NO2data = table2timetable(NO2data)

NO2data=11294497×6 timetable
         Timestamp          MeasuredNO2    StateName    CountyName    SiteNum    Latitude    Long
    ____________________    ___________    _________    __________    _______    ________    ____

    02-Jan-1985 01:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 02:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 03:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 04:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 05:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 06:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 08:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 09:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 10:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 16:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 17:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 18:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 19:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 20:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 21:00:00        NaN         Arizona       Apache         7        34.128     -10
    02-Jan-1985 22:00:00        NaN         Arizona       Apache         7        34.128     -10
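
The conversion does not require Timestamp to be the first table variable, because table2timetable uses the first datetime or duration variable that it finds as the row times. A minimal sketch with made-up values:

```matlab
% Hypothetical table in which the datetime variable comes last.
SiteNum = [7; 7];
MeasuredNO2 = [5.2; 6.1];
Timestamp = datetime(1985,1,2,[1;2],0,0);
T = table(SiteNum,MeasuredNO2,Timestamp);

% Timestamp becomes the row times. The other variables stay as data.
TT = table2timetable(T);
disp(TT)
```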

Simple Grouped Calculations by State

Given the size of the timetable, it is obvious that there are many thousands of hourly measurements
in every state. One way to calculate the number of measurements for each state is to sum the number
of rows that have a particular state as a category. For example, calculate the number of
measurements for Alaska, and then for Arizona.

numAlaska = sum(NO2data.StateName=="Alaska")

numAlaska = 7071

numArizona = sum(NO2data.StateName=="Arizona")

numArizona = 142793

It is tedious to perform this calculation multiple times or to store intermediate results in many
variables or subtables. Instead, MATLAB provides functions that group data in tables and apply
functions to each group in-place. For example, use the groupcounts function to group the data in
NO2data by the states in StateName and count the rows in each group. Instead of calling sum many
times, call groupcounts once.
NO2counts = groupcounts(NO2data,"StateName")

NO2counts=42×3 table
        StateName         GroupCount    Percent
    __________________    __________    ________

    Alaska                      7071    0.062606
    Arizona               1.4279e+05      1.2643
    Arkansas                   40723     0.36056
    California            3.4015e+06      30.116
    Colorado              2.2394e+05      1.9827
    Connecticut           1.2615e+05      1.1169
    Delaware                   75185     0.66568
    DistrictOfColumbia         74988     0.66393
    Florida               2.1172e+05      1.8746
    Georgia                    74971     0.66378
    Illinois               4.407e+05      3.9019
    Indiana               3.3058e+05       2.927
    Kansas                     10625    0.094072
    Kentucky              2.8789e+05       2.549
    Louisiana             1.6914e+05      1.4976
    Maryland              1.2565e+05      1.1125
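
The relationship between the two approaches is easier to check on a small table. This sketch uses five made-up rows and compares a manual count with the output of groupcounts.

```matlab
% Hypothetical table with five measurements in two states.
State = categorical(["Alaska";"Arizona";"Arizona";"Alaska";"Arizona"]);
NO2 = [9; 7; 8; 10; 6];
T = table(State,NO2);

% Manual count for one state.
numArizona = sum(T.State == "Arizona");

% One call counts the rows in every group and reports percentages.
counts = groupcounts(T,"State");
disp(counts)
```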

To sort the results in a table or timetable, use the sortrows function. Sort NO2counts on its
GroupCount variable from highest to lowest value.
sortedNO2counts = sortrows(NO2counts,"GroupCount","descend")

sortedNO2counts=42×3 table
      StateName      GroupCount    Percent
    _____________    __________    _______

    California       3.4015e+06     30.116
    Pennsylvania     8.2796e+05     7.3307
    Missouri         4.8609e+05     4.3038
    Texas            4.7163e+05     4.1758
    Illinois          4.407e+05     3.9019
    Virginia         3.4464e+05     3.0514
    Massachusetts    3.3833e+05     2.9956
    Indiana          3.3058e+05      2.927
    NewJersey        3.2862e+05     2.9095
    Montana          2.8886e+05     2.5576
    Kentucky         2.8789e+05      2.549
    NorthDakota      2.7822e+05     2.4633
    Ohio             2.7478e+05     2.4329
    Oklahoma         2.4477e+05     2.1672
    NewYork          2.3126e+05     2.0475
    Wisconsin        2.3079e+05     2.0434

To calculate other statistics, use the groupsummary function. For example, find the maximum NO2
concentration measured in each state.
NO2max = groupsummary(NO2data,"StateName","max","MeasuredNO2");
sortedNO2max = sortrows(NO2max,"max_MeasuredNO2","descend")

sortedNO2max=42×3 table
     StateName      GroupCount    max_MeasuredNO2
    ____________    __________    _______________

    Nevada               64821         743.5
    California      3.4015e+06         540
    Indiana         3.3058e+05         500
    Colorado        2.2394e+05         462
    NewYork         2.3126e+05         451
    Tennessee       1.9592e+05         410
    Ohio            2.7478e+05         403
    Kentucky        2.8789e+05         368
    Pennsylvania    8.2796e+05         357
    Minnesota            90293         328
    Missouri        4.8609e+05         326
    Connecticut     1.2615e+05         319
    Oklahoma        2.4477e+05         318
    NewHampshire         25598         312
    Delaware             75185         300
    Louisiana       1.6914e+05         286

As an alternative, you can use the varfun function with the "GroupingVariables" name-value
argument for grouped calculations. But the groupsummary function is simpler and performs most of
the same grouped calculations as varfun.
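
For comparison, this sketch runs the same grouped maximum both ways on a small made-up table. The table and its values are illustrative only.

```matlab
% Hypothetical table with four measurements in two states.
State = categorical(["Alaska";"Arizona";"Arizona";"Alaska"]);
MeasuredNO2 = [9; 7; 12; 10];
T = table(State,MeasuredNO2);

% Grouped maximum by using varfun with a function handle.
byVarfun = varfun(@max,T,"GroupingVariables","State", ...
    "InputVariables","MeasuredNO2");

% The same calculation by using groupsummary with a method name.
bySummary = groupsummary(T,"State","max","MeasuredNO2");

disp(byVarfun)
disp(bySummary)
```

Both outputs contain the grouping variable, a GroupCount variable, and a max_MeasuredNO2 variable.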

Simple Grouped Calculations by Time

Functions such as groupcounts, groupsummary, and varfun work equally well on tables and
timetables. But timetables also provide the retime and synchronize functions, which can perform
time-based calculations by using their row times. You can group timetable data by time and perform
calculations on data within the time periods. The retime function is the best option for such cases.

For example, group the data in NO2data into yearly time periods. Find the maximum NO2
concentration for each year.
yearlyMaxNO2 = retime(NO2data(:,"MeasuredNO2"),"yearly","max")

yearlyMaxNO2=5×1 timetable
     Timestamp     MeasuredNO2
    ___________    ___________

    01-Jan-1985       407.3
    01-Jan-1986       500
    01-Jan-1987       497
    01-Jan-1988       743.5
    01-Jan-1989       462

This calculation is useful if you have one time series. In this case, the data in the MeasuredNO2
variable come from multiple sites. A more useful analysis is to group by both year and site.
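
A small timetable shows how retime assigns rows to yearly periods. The values in this sketch are made up. Each output row time is the start of the year that its rows fall in.

```matlab
% Hypothetical timetable with two measurements in each of two years.
Timestamp = datetime([1985;1985;1986;1986],[3;9;2;11],1);
MeasuredNO2 = [40; 55; 30; 35];
TT = timetable(Timestamp,MeasuredNO2);

% Group the rows into yearly periods and keep the maximum of each period.
yearlyMax = retime(TT,"yearly","max");
disp(yearlyMax)
```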

Calculate Annual Means by Site

The US EPA has two National Ambient Air Quality Standards (NAAQS) for NO2. A location is not in
compliance with the NAAQS if either:

• The annual mean exceeds 53 parts per billion (ppb)
• The 98th percentile of 1-hour daily maximum concentrations, averaged over 3 years, exceeds 100
ppb

Analyze data in NO2data to find locations that are not in compliance with the first standard, where
the annual mean exceeded 53 ppb. There are three different ways to approach this analysis. What the
three approaches have in common is that you can group the data by both time and site to calculate
annual means by site.

Group by Multiple Grouping Variables

To find sites that do not comply with the NAAQS, calculate the mean value for each site for each year.
While NO2data does not include unique identifiers for the sites, you can use state names, county
names, and site numbers together to uniquely identify air quality sites.

The row times of NO2data are datetime values. Extract their year components and add a new
variable to NO2data named Year. Calculate the annual means for each site by using groupsummary
with StateName, CountyName, SiteNum, and Year as grouping variables.

NO2data.Year = year(NO2data.Timestamp);
meanNO2bySite = groupsummary(NO2data,["StateName","CountyName","SiteNum","Year"],"mean","Measured

meanNO2bySite=1585×6 table
    StateName      CountyName      SiteNum    Year    GroupCount    mean_MeasuredNO2
    _________    ______________    _______    ____    __________    ________________

    Alaska       KenaiPeninsula     1004      1989       7071            9.7986
    Arizona      Apache                7      1985       5920            7.75
    Arizona      Apache                7      1986       2059            6.7857
    Arizona      Apache                7      1988       1981            7.1391
    Arizona      Apache                7      1989       3861            6.9146
    Arizona      Apache                8      1985       6007            5.9138
    Arizona      Apache                8      1986       1999            6.1875
    Arizona      Apache                8      1988       1924            6.3333
    Arizona      Apache                8      1989       3771            7.2619
    Arizona      Apache                9      1985       5852            6.7021
    Arizona      Apache                9      1986       1942            7.6579
    Arizona      Apache                9      1988       2068            8.5333
    Arizona      Apache                9      1989       3813            6.9604
    Arizona      Apache               10      1985       5905            8.4406
    Arizona      Apache               10      1986       2009            7.3333
    Arizona      Apache               10      1988       2117            7.2381
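
The same pattern on a four-row timetable makes the grouping easier to follow. This is a sketch with made-up values that uses only two grouping variables.

```matlab
% Hypothetical timetable with measurements from two sites.
Timestamp = datetime([1985;1985;1986;1985],[1;6;1;3],1);
SiteNum = [7; 7; 7; 8];
MeasuredNO2 = [6; 8; 10; 4];
TT = timetable(Timestamp,SiteNum,MeasuredNO2);

% Extract the year of each row time into a new variable.
TT.Year = year(TT.Timestamp);

% One output row for each unique combination of SiteNum and Year.
means = groupsummary(TT,["SiteNum","Year"],"mean","MeasuredNO2");
disp(means)
```

Here the output has three rows, for site 7 in 1985, site 7 in 1986, and site 8 in 1985. Combinations that do not occur in the data do not appear in the output.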

To find the sites that have the highest mean NO2, sort the table.
sortedMeanNO2bySite = sortrows(meanNO2bySite,"mean_MeasuredNO2","descend")

sortedMeanNO2bySite=1585×6 table
    StateName     CountyName    SiteNum    Year    GroupCount    mean_MeasuredNO2
    __________    __________    _______    ____    __________    ________________

    California    LosAngeles     1103      1988       8272            61.526
    California    LosAngeles     1103      1986       8083            61.266
    California    LosAngeles     1103      1985       8217            59.965
    California    LosAngeles     1105      1985       1194            58.399
    California    LosAngeles     1002      1986       8084            57.422
    California    LosAngeles     1002      1985       8159            57.401
    California    LosAngeles     1701      1989       8299            57.118
    California    LosAngeles     1701      1986       8229            55.924
    California    LosAngeles     1103      1989       8135            55.335
    California    LosAngeles     1701      1987       8284            54.864
    California    LosAngeles     1601      1989       8201            54.685
    California    LosAngeles     1701      1985       8341            54.147
    California    LosAngeles     1103      1987       8150            54.092
    California    LosAngeles     1601      1988       7546            53.828
    California    LosAngeles     1601      1985       8307            53.377
    California    LosAngeles     2005      1989       8225            53.174

You can create a table that includes only those sites exceeding 53 ppb by using logical indexing.
Create a logical vector that indicates the rows where mean_MeasuredNO2 is greater than 53. Use
that vector as a subscript to get matching rows from meanNO2bySite.
exceeded53ppb = meanNO2bySite.mean_MeasuredNO2 > 53;
sitesExceed53ppb = meanNO2bySite(exceeded53ppb,:)

sitesExceed53ppb=19×6 table
    StateName     CountyName    SiteNum    Year    GroupCount    mean_MeasuredNO2
    __________    __________    _______    ____    __________    ________________

    California    LosAngeles        2      1988       8278            53.17
    California    LosAngeles     1002      1985       8159            57.401
    California    LosAngeles     1002      1986       8084            57.422
    California    LosAngeles     1002      1988       8176            53.004
    California    LosAngeles     1103      1985       8217            59.965
    California    LosAngeles     1103      1986       8083            61.266
    California    LosAngeles     1103      1987       8150            54.092
    California    LosAngeles     1103      1988       8272            61.526
    California    LosAngeles     1103      1989       8135            55.335
    California    LosAngeles     1105      1985       1194            58.399
    California    LosAngeles     1601      1985       8307            53.377
    California    LosAngeles     1601      1988       7546            53.828
    California    LosAngeles     1601      1989       8201            54.685
    California    LosAngeles     1701      1985       8341            54.147
    California    LosAngeles     1701      1986       8229            55.924
    California    LosAngeles     1701      1987       8284            54.864

Pivot to Find Relationships Between Grouping Variables

Sometimes pivoting, or rearranging statistics calculated from tabular data, makes it easier to see and
analyze results, particularly when you look at the relationship between two grouping variables. For
example, you can create a pivot table for the annual mean NO2 by site. By pivoting, you can create a
table where every site lists annual mean NO2 in its own table variable, showing the relationship
between year and site. In MATLAB, you can create pivot tables by using the stack and unstack
functions, which stack and unstack table variables into taller or wider formats.

A complication in this case is that NO2data has three grouping variables that together uniquely
identify sites: state name, county name, and site number. To create a pivot table, first combine these
three table variables into one variable. Convert StateName, CountyName, and SiteNum into strings
and concatenate them. Replace spaces and dashes with underscores, and erase periods and
parentheses. The resulting strings in siteID are unique site identifiers.

siteID = string(NO2data.StateName) + "_" + string(NO2data.CountyName) + "_" + string(NO2data.Site
siteID = replace(siteID,[" ","-"],"_");
siteID = erase(siteID,[".","(",")"]);
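
The effect of these string operations is easier to see on a few names. In this sketch the state, county, and site values are made up, and the last term of the concatenation is assumed to be the site number converted to a string.

```matlab
% Hypothetical names that contain a period, a dash, and parentheses.
StateName = ["NorthDakota"; "Alaska"];
CountyName = ["St.Louis-Park"; "Kenai(Peninsula)"];
SiteNum = [7; 1004];

% Join the three parts with underscores. string converts the numbers to text.
siteID = StateName + "_" + CountyName + "_" + string(SiteNum);

% Replace spaces and dashes, then erase periods and parentheses.
siteID = replace(siteID,[" ","-"],"_");
siteID = erase(siteID,[".","(",")"]);
disp(siteID)

% Check that every identifier is a valid MATLAB variable name.
allValid = all(arrayfun(@(s) isvarname(s),siteID));
```

Identifiers that are valid variable names are convenient in the next step, where each site becomes the name of a table variable.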

Add SiteID to NO2data as a new table variable. Calculate annual means by using groupsummary,
but this time use SiteID as a grouping variable.

NO2data.SiteID = categorical(siteID);
meanNO2bySiteID = groupsummary(NO2data,["SiteID","Year"],"mean","MeasuredNO2")

meanNO2bySiteID=1585×4 table
              SiteID              Year    GroupCount    mean_MeasuredNO2
    __________________________    ____    __________    ________________

    Alaska_KenaiPeninsula_1004    1989       7071            9.7986
    Arizona_Apache_10             1985       5905            8.4406
    Arizona_Apache_10             1986       2009            7.3333
    Arizona_Apache_10             1988       2117            7.2381
    Arizona_Apache_10             1989       4282            6.9929
    Arizona_Apache_11             1985       5262            8.4264
    Arizona_Apache_11             1986       1960            7.587
    Arizona_Apache_11             1988       2063            8.0471
    Arizona_Apache_11             1989       4266            7.5714
    Arizona_Apache_7              1985       5920            7.75
    Arizona_Apache_7              1986       2059            6.7857
    Arizona_Apache_7              1988       1981            7.1391
    Arizona_Apache_7              1989       3861            6.9146
    Arizona_Apache_8              1985       6007            5.9138
    Arizona_Apache_8              1986       1999            6.1875
    Arizona_Apache_8              1988       1924            6.3333

To create a pivot table, use the unstack function. Each unique site in the SiteID variable of
meanNO2bySiteID becomes the name of a separate table variable in the output,
pivotedMeanNO2bySiteID, and has the annual means associated with that site. This unstacking
operation is how you can create a pivot table in MATLAB.

pivotedMeanNO2bySiteID = unstack(meanNO2bySiteID,"mean_MeasuredNO2","SiteID","GroupingVariable","

pivotedMeanNO2bySiteID=5×443 table
    Year    Alaska_KenaiPeninsula_1004    Arizona_Apache_10    Arizona_Apache_11    Arizona_Apach
    ____    __________________________    _________________    _________________    _____________

    1989              9.7986                   6.9929               7.5714              6.9146
    1985                 NaN                   8.4406               8.4264              7.75
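
A three-row table is enough to show the pivot. In this sketch the site names and means are made up, and the grouping variable is specified with the full name of the "GroupingVariables" name-value argument.

```matlab
% Hypothetical stacked table of annual means for two sites.
SiteID = categorical(["A_1";"A_1";"B_2"]);
Year = [1985; 1986; 1986];
mean_MeasuredNO2 = [7.5; 6.5; 9];
T = table(SiteID,Year,mean_MeasuredNO2);

% Each category of SiteID becomes a variable. Each Year becomes a row.
P = unstack(T,"mean_MeasuredNO2","SiteID","GroupingVariables","Year");
disp(P)
```

Site B_2 has no row for 1985 in the stacked table, so unstack fills that element of the pivoted table with NaN.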

This representation of the annual means by site has an advantage and a disadvantage.

• It is easier to look at the short five-year time series for each site. After unstacking, each site has
its own variable in pivotedMeanNO2bySiteID. You can easily compare sites to each other.
• It is harder to sort and pick out the largest values across the whole pivoted table. After
unstacking, pivotedMeanNO2bySiteID has 443 variables. The stacked version,
meanNO2bySite, has only six variables.

Group by Time and Another Grouping Variable

To group data in NO2data by year and another grouping variable, it was necessary to add Year as an
additional variable. Also, the output from groupsummary is a table even when the input is a
timetable. But suppose you want to keep the results in a timetable instead. The retime function can
also produce annual summaries. But it can group data only by time. To group data by site and by year,
rearrange NO2data so that you can call retime on a timetable where the NO2 concentrations are
already grouped by site.

Group the raw data in NO2data by site by using the unstack function. The output timetable has a
separate variable for each site. This timetable looks similar to a pivot table. But instead of having
means or some other statistic, NO2bySite has all the raw data. It is just reorganized. For further
convenience, sort the rows of the timetable by their row times so that the earliest timestamps come
first.

NO2bySite = unstack(NO2data,"MeasuredNO2","SiteID","GroupingVariable","Timestamp");
NO2bySite = sortrows(NO2bySite)

NO2bySite=43824×442 timetable
         Timestamp          Alaska_KenaiPeninsula_1004    Arizona_Apache_10    Arizona_Apache_11
    ____________________    __________________________    _________________    _________________

    01-Jan-1985 00:00:00               NaN                       NaN                  NaN
    01-Jan-1985 01:00:00               NaN                       NaN                  NaN

In this format you can easily plot the raw data by using the stackedplot function. This plot shows
NO2 concentrations for each site as a function of time.

stackedplot(NO2bySite)

To create a timetable that is also a pivot table, use retime to calculate annual means.

meanNO2bySiteTT = retime(NO2bySite,"yearly","mean")

meanNO2bySiteTT=5×442 timetable
     Timestamp     Alaska_KenaiPeninsula_1004    Arizona_Apache_10    Arizona_Apache_11    Arizon
    ___________    __________________________    _________________    _________________    ______

    01-Jan-1985               NaN                     8.4406               8.4264
    01-Jan-1986               NaN                     7.3333               7.587           6

Create a stacked plot of the annual means.

stackedplot(meanNO2bySiteTT)

You might want to preserve information from the original timetable NO2data in this timetable of
results. For example, you might want to add the latitudes and longitudes of the sites to NO2bySite.
They were stored for each timestamp in NO2data. But to store them more compactly in this
timetable, add them as per-variable custom properties to NO2bySite.
LatLon = groupsummary(NO2data,"SiteID","mode",["Latitude","Longitude"]);
NO2bySite = addprop(NO2bySite,["Latitude","Longitude"],["variable","variable"]);
NO2bySite.Properties.CustomProperties.Latitude(string(LatLon.SiteID)) = LatLon.mode_Latitude';
NO2bySite.Properties.CustomProperties.Longitude(string(LatLon.SiteID)) = LatLon.mode_Longitude';
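
A per-variable custom property holds one value for each variable of the timetable. The following sketch adds such properties to a two-variable timetable. The site names and coordinates are made up, and the values are assigned as whole row vectors instead of by variable name.

```matlab
% Hypothetical timetable with one variable per site.
Timestamp = datetime(1985,1,[1;2]);
TT = timetable(Timestamp,[1;2],[3;4],'VariableNames',{'SiteA','SiteB'});

% Add two custom properties that store one value per variable.
TT = addprop(TT,["Latitude","Longitude"],["variable","variable"]);

% Assign one element for each variable, in variable order.
TT.Properties.CustomProperties.Latitude = [34.128 61.2];
TT.Properties.CustomProperties.Longitude = [-109.31 -149.9];

disp(TT.Properties.CustomProperties)
```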

Moving Means for Data Grouped by Site

Determining compliance with the second NAAQS for NO2 requires a sequence of grouped
calculations. By the second standard, a location is out of compliance if the 98th percentile of the
1-hour daily maximum concentrations of NO2, averaged over 3 years, exceeds 100 ppb.

Start with the hourly concentrations of NO2 by site. To find the daily maximum for each site, use the
retime function, specifying "max" as the method to find the maximum concentration for each day's
worth of data. Then find the 98th percentiles of the daily maximums in each year's worth of data,
calling retime a second time. To calculate percentiles, use the findPrctile supporting function
referred to in this example.
dailyMax = retime(NO2bySite,"daily","max");
yearlyP98 = retime(dailyMax,"yearly",@(x)findPrctile(x,98))

yearlyP98=5×442 timetable
     Timestamp     Alaska_KenaiPeninsula_1004    Arizona_Apache_10    Arizona_Apache_11    Arizon
    ___________    __________________________    _________________    _________________    ______

    01-Jan-1985               NaN                       27                   29
    01-Jan-1986               NaN                       13                   14
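
The definition of findPrctile is not part of this section. As a rough illustration of how a function handle works with retime, the sketch below uses a hypothetical stand-in named nearestRankPrctile. It assumes a nearest-rank definition of the percentile and ignores NaN values, and the actual supporting function might calculate percentiles differently. The data are made up.

```matlab
% Hypothetical daily maximums for the first ten days of 1985.
Timestamp = datetime(1985,1,1) + days(0:9)';
DailyMax = (10:10:100)';
dailyMax = timetable(Timestamp,DailyMax);

% retime calls the function handle once per year and per variable.
% The function must reduce a vector to a scalar.
yearlyP98 = retime(dailyMax,"yearly",@(x) nearestRankPrctile(x,98));
disp(yearlyP98)

function y = nearestRankPrctile(x,p)
% Hypothetical stand-in for findPrctile: nearest-rank percentile of the
% non-NaN values in x. Returns NaN if x has no non-NaN values.
    x = sort(x(~isnan(x)));
    if isempty(x)
        y = NaN;
    else
        k = max(1,ceil(p/100*numel(x)));
        y = x(k);
    end
end
```

With ten values, the 98th percentile by nearest rank is the tenth sorted value, so the single output row contains 100.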

Next, calculate a moving mean for each site, specifying a three-year window for the moving mean. The
smoothdata function enables you to apply the "movmean" smoothing method to each variable in yearlyP98.
moving3yearAvg = smoothdata(yearlyP98,"movmean",[years(3) 0])

moving3yearAvg=5×442 timetable
     Timestamp     Alaska_KenaiPeninsula_1004    Arizona_Apache_10    Arizona_Apache_11    Arizon
    ___________    __________________________    _________________    _________________    ______

    01-Jan-1985               NaN                       27                   29
    01-Jan-1986               NaN                       20                   21.5
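
To see how a trailing window behaves near the start of a series, this sketch applies movmean directly to a numeric vector. It is a simplification, not the call used above: the window is specified as a count of samples, [2 0], instead of as a duration. The first two values match the first two rows displayed for Arizona_Apache_10, and the rest are made up.

```matlab
% Hypothetical yearly 98th percentiles for one site, 1985 through 1989.
yearlyP98 = [27; 13; 20; 40; 30];

% Trailing window: the current sample plus the two samples before it.
% Near the start, the window shrinks to the samples that exist.
moving3 = movmean(yearlyP98,[2 0]);
disp(moving3)
```

The first output equals the first input, and the second output is the mean of the first two inputs. Only from the third element onward does the window contain three samples.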

Display sites that are out of compliance. First specify a time range starting in 1987, the first year for
which the moving three-year window has three full years of data.
full3years = timerange("1987-01-01","1989-01-01","closed")

full3years =
  timetable timerange subscript:

    Select timetable rows with times in the closed interval:
      [01-Jan-1987 00:00:00, 01-Jan-1989 00:00:00]
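
A timerange object is used as a row subscript. This sketch applies the same closed interval to a small made-up timetable of yearly values.

```matlab
% Hypothetical yearly timetable, 1985 through 1989.
Timestamp = datetime(1985:1989,1,1)';
Value = (1:5)';
TT = timetable(Timestamp,Value);

% A closed interval includes both of its end points.
tr = timerange("1987-01-01","1989-01-01","closed");

% Subscript the rows with the timerange and keep all variables.
sub = TT(tr,:);
disp(sub)
```

Because the interval is closed, the rows for 01-Jan-1987 and 01-Jan-1989 are both selected, along with the row between them.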