document new command-line-option

getreu · getreu · commit 08a67af5712e · 2020-03-17T21:55:47.000+01:00
diff --git a/doc/source/stringsext--man.md b/doc/source/stringsext--man.md
@@ -1,4 +1,4 @@
-% STRINGSEXT(1) Version 2.1.1 | Stringsext Documentation
+% STRINGSEXT(1) Version 2.2.0 | Stringsext Documentation
 
 <!--
 previous versions
@@ -52,6 +52,9 @@ Version: 2.1.0
 
 Date: 2020-02-01
 Version: 2.1.0
+
+Date: 2020-03-17
+Version: 2.2.0
 -->
 
 # NAME
@@ -77,7 +80,7 @@ binary data: It prints all graphic character sequences in *FILE* or
 
 Unlike *GNU strings* **stringsext** can be configured to search for
 valid characters not only in ASCII but also in many other input
-encodings, e.g.: utf-8, utf-16be, utf-16le, big5, euc-jp, koi8-r
+encodings, e.g.: *utf-8, utf-16be, utf-16le, big5, euc-jp, koi8-r*
 and many others. **\--list-encodings** shows a list of valid encoding
 names based on the WHATWG Encoding Standard. When more than one encoding
 is specified, the scan is performed in different threads simultaneously.
@@ -199,6 +202,18 @@ as *GNU strings* replacement.
     next line. The downside with long output lines is, that the scanner loses
     precision in locating the findings.
 
+**-r**, **\--same-unicode-block**
+
+:   Require all characters in a finding to originate from the same Unicode
+    block. This option helps to reduce false positives, especially when
+    scanning for UTF-16. When set, "`stringsext`" prints only strings whose
+    characters share one Unicode block. For example: "`-u All -n 10 -r`" finds
+    a run of at least 10 Cyrillic characters or a run of at least 10 Greek
+    characters, whereas it ignores strings that randomly mix Cyrillic and Greek
+    characters. Technically, this option guarantees that all multibyte
+    characters of a finding, when encoded as UTF-8, start with the same leading
+    byte.
+
 **-s** *NUM*, **\--counter-offset**=*NUM*
 
  :  Start offset NUM for the input-stream-byte-counter given as decimal or
diff --git a/src/options.rs b/src/options.rs
@@ -84,7 +84,7 @@ Options:
     chars_min_default!(),
     ").
  -p FILE, --output=FILE         Print not to stdout but in file.
- -q NUM, --output-line-len=NUM  Output line length in UTF-8 characters (default: ",
+ -q NUM, --output-line-len=NUM  Output line length in Unicode code points (default: ",
     output_line_char_nb_max_default!(),
     ").
  -r, --same-unicode-block       Require finding to be Unicode-block homogen.
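
The man page text for **\--same-unicode-block** states that all multibyte characters of a finding, encoded as UTF-8, start with the same leading byte. The following is a minimal sketch of that check taken literally. It is not the code from this commit: the function name `same_leading_byte` is hypothetical, and the way `stringsext` actually applies the filter inside its scanner may differ.

```rust
/// Hypothetical helper (not part of stringsext): returns `true` if every
/// multibyte character of `s`, encoded as UTF-8, starts with the same
/// leading byte. Single-byte (ASCII) characters are ignored.
fn same_leading_byte(s: &str) -> bool {
    // Leading byte of the first multibyte character seen so far.
    let mut first: Option<u8> = None;

    for c in s.chars() {
        // ASCII characters occupy one byte and are not constrained here.
        if c.len_utf8() < 2 {
            continue;
        }

        // Encode the character and take the first byte of its sequence.
        let mut buf = [0u8; 4];
        let lead = c.encode_utf8(&mut buf).as_bytes()[0];

        match first {
            None => first = Some(lead),
            Some(b) if b != lead => return false,
            Some(_) => {}
        }
    }
    true
}
```

Under this literal reading, `same_leading_byte("αβγ")` holds because all three Greek letters start with the byte `0xCE`, while `same_leading_byte("αб")` fails because the Cyrillic letter starts with `0xD0`. For two-byte sequences one leading byte covers only 64 code points, so the check is stricter than membership in an official Unicode block: `п` (U+043F, `0xD0 0xBF`) and `р` (U+0440, `0xD1 0x80`) are both Cyrillic but have different leading bytes.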