Linux 101
Processing Text Streams
Regular expressions
language for expressing patterns in text
special strings that define search patterns
[diagram] text + regex -> REGEX ENGINE -> all patterns in the text matching the regex
example: find all email addresses in a document
regex matches string = the string has the structure defined by the regexp
REGEX = normal characters + metacharacters (metacharacters represent patterns)
the escape character \ makes a metacharacter be interpreted as a normal one
Regular expressions
metacharacters
. any character
\ escape character
repetitions
* zero or more times
? zero or one time
+ one or more times
{n,m} minimum n and maximum m times
| or
groups and ranges
[aeiou] character set matches any vowel
[^aeiou] negated set matches any character that is not a vowel
[a-z] character range matches entire lowercase alphabet
() grouping
anchors
^ start of line
$ end of line
example: ^[0-9]{3}$ matches 000...999 on a single line
\b word boundaries
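Example: a quick sketch of anchors and repetition, tested with grep -E (covered below) on sample input supplied by echo:
$ echo "room 101" | grep -E "^room [0-9]{3}$"
room 101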
Using grep
grep Global Regular Expression Print. Print lines matching a pattern
-E, --extended-regexp (same as egrep)
-c, --count count matching lines
-f <file>, --file=<file> take patterns from file
-i, --ignore-case
-r, --recursive search directories recursively (same as rgrep)
grep [options] regexp [files]
TIP: quote the regexp to avoid shell expansion: "regexp"
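Example: counting lines that start with "error", case-insensitively (sample input supplied inline with printf, not a real log file):
$ printf "Error: disk full\nok\nerror: timeout\n" | grep -i -c "^error"
2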
Using sed
sed stream editor
-n, --quiet, --silent don’t print lines automatically
-e <script>, --expression=<script> add script to the commands
-f <script_file>, --file=<script_file> read commands from <script_file>
sed [options] script [file]
line restriction
3 apply command to line 3
2,15 all lines between 2 and 15
/pattern/ all lines matching pattern
/pattern1/,/pattern2/ from a line matching pattern1 to the next line matching pattern2
! negate restriction
command
{ } group commands
s/pattern/replacement/flags substitute
p print line
d delete line
w file write to file
q quit
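Example: restricting a substitution to lines 2-3 and printing only those lines (sample input supplied with printf):
$ printf "one\ntwo\nthree\n" | sed -n '2,3s/t/T/p'
Two
Three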
Using Filters
[overview: filters grouped by what they process]
CHARACTER processing: tr, expand, unexpand
COLUMN / FIELD processing: cut, paste, join
LINE processing: head, tail, nl, cat, tac, sort, uniq, split, sed
FILE statistics: wc
PRINT formatting: od, pr, fmt
general synopsis: command [opts] [file] …
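Filters are typically chained with pipes. A sketch: count how many accounts use each login shell (output depends on the system's /etc/passwd):
$ cut -d: -f7 /etc/passwd | sort | uniq -c | sort -rn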
Using Filters
head output the beginning (default 10 lines) of the file
-c <num>, --bytes=<num>
-n <num>, --lines=<num>
tail output the end (default 10 lines) of the file
-f, --follow
--pid=<pid> terminate following when <pid> terminates
sort order lines lexicographically (or by a field)
-f, --ignore-case
-n, --numeric-sort sort numerically
-r, --reverse
-k <field>, --key=<field> field to sort by (default first)
uniq discard adjacent duplicate lines
uniq [opts] [in [out]]
-u show only unique lines
-d show only duplicate lines
-c count occurrences
nl number lines in the output
-b <style>, -h <style>, -f <style> numbering style for body, header, footer
styles: a – all lines, t – non-blank lines only, n – no numbering
-n <format>, --number-format=<format>
formats: ln – left justified, rn – right justified, rz – right justified with leading zeros
-i <num> line increment
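Example: sort sample input and count the occurrences of each line (the exact column spacing of uniq -c may vary):
$ printf "b\na\nb\na\n" | sort | uniq -c
      2 a
      2 b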
Using Filters
cut extract sections (columns) from each line
-b <list>, --bytes=<list>
-c <list>, --characters=<list>
-f <list>, --fields=<list>
-d <char>, --delimiter=<char> (default tab)
-s, --only-delimited
paste merge files line by line
-d <list>, --delimiters=<list>
-s, --serial put each file on a single line
default delimiter is TAB
join combines two files by matching fields
-t <char> field separator (default delimiter is space)
-i ignore case
-1 <n>, -2 <n> specify the join field number
join [opts] file1 file2
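Example: extract the first and third colon-separated fields (cut keeps the input delimiter in the output; sample input from printf):
$ printf "alice:x:1001\nbob:x:1002\n" | cut -d: -f1,3
alice:1001
bob:1002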
Using Filters
expand convert tabs to spaces
-t <num>, --tabs=<num> set tab spacing (default 8)
unexpand convert spaces to tabs
tr translate characters
-t, --truncate-set1 truncate set1 to the length of set2
-d delete characters from set1
sets may contain ranges: A-C = ABC, 1-9 = 123456789
tr [opts] set1 [set2]
$ echo "lower to upper case" | tr "a-z" "A-Z"
LOWER TO UPPER CASE
wc word count – counts lines, words and bytes
-l, --lines -w, --words
-c, --bytes -m, --chars
-L, --max-line-length
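Example: counting lines and words of sample input supplied with printf:
$ printf "one two\nthree\n" | wc -l
2
$ printf "one two\nthree\n" | wc -w
3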
Using Filters
cat concatenate files to the output
-E, --show-ends put a $ at the end of each line
-n, --number add line numbers
-b, --number-nonblank numbers only nonblank lines
-s, --squeeze-blank squeeze multiple blank lines into a single one
-T, --show-tabs display tab chars as ^I
-v, --show-nonprinting display control chars using ^ notation (e.g. ^M)
tac concatenate and reverse order of lines in each file
split break a single file into multiple parts
-b <size>, --bytes=<size>
-C <size>, --line-bytes=<size>
-l <lines>, --lines=<lines>
-d, --numeric-suffixes
default prefix: x, default suffixes: aa, ab, ac …
split [opts] [file [prefix]]
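Example: reverse the line order of sample input with tac:
$ printf "first\nsecond\n" | tac
second
first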
Using Filters
pr prepare a file for printing
-l <lines>, --length=<lines> set page length
-h <text>, --header=<text> set header text
-o <chars>, --indent=<chars> set left margin
-w <chars>, --width=<chars> set page width
fmt format paragraphs
-<width>, -w <width>, --width=<width> (default 75)
-t, --tagged-paragraph first line indented differently from the rest
od (octal dump) display files in octal or other formats
-t <type>, --format=<type>
-w <width>, --width=<width> output <width> bytes per line
TYPE
d2 – decimal shorts, d4 – decimal longs
x2 – hexadecimal shorts, x4 – hexadecimal longs
o2 – octal shorts (default), o4 – octal longs
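Example: dump the bytes of sample input as hexadecimal; the first column is the byte offset in octal (exact spacing may differ slightly between od versions):
$ echo "AB" | od -t x1
0000000 41 42 0a
0000003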
Vi editor
Operation modes
Command mode (default), Ex mode (colon commands), Insert mode
Command mode
h, j, k, l = left, down, up, right
w, b = forward, backward one word
^, $ = start, end of line
precede a command with a number to repeat it
d delete
dw delete word
dd delete line
y, yw, yy yank (copy)
c, cw, cc change
p paste after cursor
P paste before cursor
Commands that enter insert mode
i insert before the cursor
I insert at line start
a append after the cursor
A append at the end of line
o open a line below the current one
O open a line above the current one
r replace a single character
R replace (overwrite) characters until Esc
Ex mode and search
:w save
:q quit
:wq, ZZ save & quit
/ forward search
? reverse search
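A short worked session using only the commands above (notes.txt is a hypothetical file name; the text after each keystroke sequence is a description, not part of the input):
$ vi notes.txt open the file (command mode)
5j move down five lines
dd delete the current line
/error search forward for "error"
:wq save changes and quit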