Commit 09c0c9c

add 2. Unicode-block-filter
@getreu getreu tagged this 12 minutes ago

- "always pass" characters introduced with version 1.2.0 removed:
  ` !"#$%&'()*+,-./0123456789:;<=>?`
- instead use the new 2. block-filter, e.g.:
  `-e UTF-16be,10,U+20..U+2f,U+400..U+07ff`
- speed improvements
1 parent e05e071 commit 09c0c9c

8 files changed: +299 -122 lines changed

Cargo.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "stringsext"
-version = "1.2.2"
+version = "1.3.1"
 authors = ["Jens Getreu <[email protected]>"]
 
 [dependencies]

doc/src/stringsext--man.rst

Lines changed: 31 additions & 12 deletions
@@ -15,9 +15,18 @@ search for multi-byte encoded strings in binary data.
 :Date: 2017-01-03
 :Version: 1.2.0
 
+:Date: 2017-01-04
+:Version: 1.2.1
+
+:Date: 2017-01-05
+:Version: 1.2.2
+
+:Date: 2017-01-07
+:Version: 1.3.0
+
 :Author: Jens Getreu
-:Date: 2017-01-04
-:Version: 1.2.1
+:Date: 2017-01-08
+:Version: 1.3.1
 :Copyright: Apache License, Version 2.0 (for details see COPYING section)
 :Manual section: 1
 :Manual group: Forensic Tools
@@ -104,7 +113,7 @@ OPTIONS
 **-e** *ENC*, **--encoding**\ =\ *ENC*
 Set (multiple) input search encodings.
 
-*ENC*\ ==\ *ENCNAME*\ [,\ *MIN*\ [,\ *UNICODEBLOCK*\ ]]
+*ENC*\ ==\ *ENCNAME*\ [,\ *MIN*\ [,\ *UNICODEBLOCK*\ [,\ *UNICODEBLOCK*\ ]]]
 
 *ENCNAME*
 Search for strings encoded in ENCNAME. Encoding names
@@ -119,7 +128,7 @@ OPTIONS
 *UNICODEBLOCK*
 Restrict the search to characters within *UNICODEBLOCK*. This
 can be used to search for a certain script or to reduce false
-positives when searching for UTF-16 encoded strings. See
+positives, especially when searching for UTF-16 encoded strings. See
 ``https://en.wikipedia.org/wiki/Unicode_block`` for a list of
 scripts and their corresponding Unicode-block-ranges.
 *UNICODEBLOCK* has the following syntax:
@@ -136,10 +145,8 @@ OPTIONS
 this case a warning specifying the enlarged *UNICODEBLOCK* is
 emitted.
 
-The following characters do not observe *UNICODEBLOCK*
-restrictions and are always printed even if they are out of range:
-``\t !"#$%&'()*+,-./0123456789:;<=>?``
-(U+0009, U+0020..U+003F).
+When a second optional *UNICODEBLOCK* is given, the total
+Unicode-point search range is the union of the first and the second.
 
 See the output of **--help** for the default value of *ENC*.
 
@@ -218,17 +225,19 @@ When used with pipes ``-c r`` is required:
 stringsext -e iso-8859-7 -c r -t x someimage.raw | grep "Ιστορία"
 
 Reduce the number of false positives, when scanning an image file for
-UTF-16:
+UTF-16. In the following example we search for Cyrillic, Arabic and Syriac
+strings, which may contain these additional symbols:
+``\t !"#$%&'()*+,-./0123456789:;<=>?``
 
 ::
 
-stringsext -e UTF-16le,20,U+0..U+3FF -e UTF-16le,20,U+400..U+7FF someimage.raw
+stringsext -e UTF-16le,30,U+20..U+3f,U+400..U+07ff someimage.raw
 
 The same but shorter:
 
 ::
 
-stringsext -e UTF-16le,20,0..3FF -e UTF-16le,20,400..7FF someimage.raw
+stringsext -e UTF-16le,30,20..3f,400..07ff someimage.raw
 
 Combine Little-Endian and Big-Endian scanning:
 
@@ -246,6 +255,13 @@ The following settings are designed to produce bit-identical output with
 stringsext -e ascii -c i -t x # equals `strings -t x`
 stringsext -e ascii -c i -t o # equals `strings -t o`
 
+The following examples perform the same search, but the output format is
+slightly different:
+
+::
+
+stringsext -e UTF-16LE,10,0..7f # equals `strings -n 10 -e l`
+stringsext -e UTF-16BE,10,0..7f # equals `strings -n 10 -e b`
 
 
 LIMITATIONS
@@ -289,7 +305,10 @@ will most likely never exceed the WIN\_LEN buffer and therefore will never be
 split. In such a scenario it is a good practice to run Unicode and ASCII
 scanners in parallel.
 
-
+When a graphic string has to be cut at the WIN_LEN buffer boundary, *stringsext*
+cannot in all cases determine the length of the first piece. In these rare
+cases *stringsext* always prints the second piece, even when it is shorter than
+**--bytes** would require.
 
 
 
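The *ENC* syntax documented in the man page diff above (`ENCNAME[,MIN[,UNICODEBLOCK[,UNICODEBLOCK]]]`) and the union semantics of the second block can be illustrated with a minimal Rust sketch. This is not the code from this commit: `EncSpec`, `parse_hex`, `parse_block`, `parse_enc` and `passes` are hypothetical names chosen for illustration. The sketch assumes that an argument without any *UNICODEBLOCK* means "no restriction", and it does not model the range enlargement (with warning) that the man page mentions.

```rust
use std::ops::RangeInclusive;

/// Hypothetical holder for one parsed `-e` argument (not stringsext's own type).
#[derive(Debug, PartialEq)]
struct EncSpec {
    name: String,
    min: Option<usize>,
    blocks: Vec<RangeInclusive<u32>>,
}

/// Parse one hexadecimal code point, with or without the `U+` prefix.
fn parse_hex(s: &str) -> Result<u32, String> {
    let s = s.trim();
    let digits = s.strip_prefix("U+").unwrap_or(s);
    u32::from_str_radix(digits, 16).map_err(|e| format!("bad code point `{}`: {}", s, e))
}

/// Parse `U+20..U+3f` or the short form `20..3f` into an inclusive range.
fn parse_block(s: &str) -> Result<RangeInclusive<u32>, String> {
    let (lo, hi) = s
        .split_once("..")
        .ok_or_else(|| format!("expected `LO..HI`, got `{}`", s))?;
    let (lo, hi) = (parse_hex(lo)?, parse_hex(hi)?);
    if lo > hi {
        return Err(format!("empty range `{}`", s));
    }
    Ok(lo..=hi)
}

/// Split `ENCNAME[,MIN[,UNICODEBLOCK[,UNICODEBLOCK]]]` into its fields.
fn parse_enc(arg: &str) -> Result<EncSpec, String> {
    let mut parts = arg.split(',');
    let name = parts.next().unwrap_or("").trim().to_string();
    if name.is_empty() {
        return Err("missing ENCNAME".to_string());
    }
    let min = parts
        .next()
        .map(|m| m.trim().parse::<usize>().map_err(|e| format!("bad MIN `{}`: {}", m, e)))
        .transpose()?;
    let blocks = parts.map(parse_block).collect::<Result<Vec<_>, String>>()?;
    if blocks.len() > 2 {
        return Err("at most two UNICODEBLOCK ranges are documented".to_string());
    }
    Ok(EncSpec { name, min, blocks })
}

/// A character passes when it lies in the union of the given ranges.
/// Assumption: no range at all means no restriction.
fn passes(spec: &EncSpec, c: char) -> bool {
    let cp = c as u32;
    spec.blocks.is_empty() || spec.blocks.iter().any(|r| r.contains(&cp))
}

fn main() {
    let spec = parse_enc("UTF-16le,30,U+20..U+3f,U+400..U+07ff").expect("valid ENC");
    // '7' is U+0037 (first range), 'Ж' is U+0416 (second range),
    // 'A' is U+0041 (in neither range).
    for c in "7ЖA".chars() {
        println!("{:?} passes: {}", c, passes(&spec, c));
    }
}
```

With this reading, one `-e` argument carrying two ranges covers what the removed example needed two `-e` arguments for, while the Latin letters in between (such as `A`) stay filtered out.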
make-targets

Lines changed: 0 additions & 4 deletions
@@ -4,12 +4,8 @@ rustup target add x86_64-unknown-linux-musl
 rustup target add i686-unknown-linux-musl
 #apt-get install libc6-dev-i386
 rustup target add i686-unknown-linux-gnu
-#rustup target add i686-pc-windows-gnu
-#rustup target add x86_64-pc-windows-gnu
 
 cargo build --target x86_64-unknown-linux-gnu --release
 cargo build --target x86_64-unknown-linux-musl --release
 cargo build --target i686-unknown-linux-musl --release
 cargo build --target i686-unknown-linux-gnu --release
-#cargo build --target i686-pc-windows-gnu --release
-#cargo build --target x86_64-pc-windows-gnu --release

make-targets.bat

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+rustup default stable
+rustup target add i686-pc-windows-gnu
+rustup target add x86_64-pc-windows-gnu
+
+rustup default stable-i686-pc-windows-gnu
+rustup set default-host i686-pc-windows-gnu
+cargo build --target i686-pc-windows-gnu --release
+
+rustup default stable-x86_64-pc-windows-gnu
+rustup set default-host x86_64-pc-windows-gnu
+cargo build --target x86_64-pc-windows-gnu --release
