Commit ab669cb

Documented why changing Base64 sometimes does not invalidate signature checks. Resolves #518
1 parent c38f4af commit ab669cb

1 file changed

+105
-2
lines changed

README.md

Lines changed: 105 additions & 2 deletions
@@ -75,7 +75,10 @@ enforcement.
7575
* [JSON Processor](#json)
7676
* [Custom JSON Processor](#json-custom)
7777
* [Jackson ObjectMapper](#json-jackson)
78-
* [Base64 Codec](#base64)
78+
* [Base64 Support](#base64)
79+
* [Base64 in Security Contexts](#base64-security)
80+
* [Base64 is not Encryption](#base64-not-encryption)
81+
* [Changing Base64 Characters](#base64-changing-characters)
7982
* [Custom Base64 Codec](#base64-custom)
8083

8184
<a name="features"></a>
@@ -1327,7 +1330,107 @@ utility classes.
13271330
`io.jsonwebtoken.io.Decoders`:
13281331

13291332
* `BASE64` is an RFC 4648 [Base64](https://tools.ietf.org/html/rfc4648#section-4) decoder
1330-
* `BASE64URL` is an RFC 4648 [Base64URL](https://tools.ietf.org/html/rfc4648#section-5) decoder
1333+
* `BASE64URL` is an RFC 4648 [Base64URL](https://tools.ietf.org/html/rfc4648#section-5) decoder
1334+
1335+
<a name="base64-security"></a>
1336+
### Understanding Base64 in Security Contexts
1337+
1338+
All cryptographic operations, like encryption and message digest calculations, result in binary data - raw byte arrays.
1339+
1340+
Because raw byte arrays cannot be represented natively in JSON, the JWT
1341+
specifications employ the Base64URL encoding scheme to represent these raw byte values in JSON documents or compound
1342+
structures like a JWT.
1343+
1344+
This means that the Base64 and Base64URL algorithms take a raw byte array and convert the bytes into a string suitable
1345+
to use in text documents and protocols like HTTP. These algorithms can also convert these strings back
1346+
into the original raw byte arrays for decryption or signature verification as necessary.
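
The round trip described above can be sketched with the JDK's own `java.util.Base64` class. This is only an illustration: the class and method names (`Base64RoundTrip`, `encode`, `decode`) are hypothetical, and the JDK codec is used here as a stand-in rather than JJWT's own `Encoders`/`Decoders`.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class; not part of JJWT.
final class Base64RoundTrip {

    // Encodes raw bytes as an unpadded Base64URL string, the form used inside JWTs.
    static String encode(byte[] raw) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    // Decodes a Base64URL string back into the raw bytes it represents.
    static byte[] decode(String text) {
        return Base64.getUrlDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Arbitrary binary data, including values that are not printable text.
        byte[] raw = {(byte) 0xfb, (byte) 0xff, 0x00, 0x7f};

        String text = encode(raw);          // safe to embed in JSON or an HTTP header
        byte[] roundTripped = decode(text); // the original bytes again

        System.out.println(text);
        System.out.println(Arrays.equals(raw, roundTripped));
    }
}
```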
1347+
1348+
That's nice and convenient, but there are two very important properties of Base64 (and Base64URL) text strings that
1349+
are critical to remember when they are used in security scenarios like with JWTs:
1350+
1351+
* [Base64 is not encryption](#base64-not-encryption)
1352+
* [Changing Base64 characters](#base64-changing-characters) **does not automatically invalidate data**.
1353+
1354+
<a name="base64-not-encryption"></a>
1355+
#### Base64 is not encryption
1356+
1357+
Base64-encoded text is _not_ encrypted.
1358+
1359+
While a byte array representation can be converted to text with the Base64 algorithms,
1360+
anyone in the world can take Base64-encoded text, decode it with any standard Base64 decoder, and obtain the
1361+
underlying raw byte array data. No key or secret is required to decode Base64 text - anyone can do it.
1362+
1363+
Based on this, when encoding sensitive byte data with Base64 - like a shared or private key - **the resulting
1364+
string is NOT safe to expose publicly**.
1365+
1366+
A Base64-encoded key is still sensitive information and must
1367+
be kept as secret and as safe as the original thing you got the bytes from (e.g. a Java `PrivateKey` or `SecretKey`
1368+
instance).
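
As a minimal sketch of why an encoded key stays sensitive, the example below encodes some random key bytes and then recovers them with nothing but a standard decoder. The class and method names are hypothetical, the random bytes stand in for real key material, and `java.util.Base64` is assumed in place of JJWT's codec.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class; not part of JJWT.
final class EncodedKeyIsStillSecret {

    // Stand-in for real key material, e.g. the encoded form of a SecretKey.
    static byte[] randomKeyBytes(int length) {
        byte[] key = new byte[length];
        new SecureRandom().nextBytes(key);
        return key;
    }

    // Converts the key bytes to text. This is encoding only; nothing is protected.
    static String encode(byte[] keyBytes) {
        return Base64.getEncoder().encodeToString(keyBytes);
    }

    // Recovers the key bytes from the text. No key, password, or secret is
    // required: any standard Base64 decoder can do this.
    static byte[] recover(String encodedKey) {
        return Base64.getDecoder().decode(encodedKey);
    }

    public static void main(String[] args) {
        byte[] keyBytes = randomKeyBytes(32);
        String encodedKey = encode(keyBytes);

        byte[] recovered = recover(encodedKey);
        System.out.println("recovered the exact key bytes: "
                + Arrays.equals(keyBytes, recovered));
    }
}
```

Anyone who sees `encodedKey` can run the equivalent of `recover`, which is why the string needs the same protection as the key itself.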
1369+
1370+
After Base64-encoding data into a string, it is possible to then encrypt the string to keep it safe from prying
1371+
eyes if desired, but this is different. Encryption is not encoding. They are separate concepts.
1372+
1373+
<a name="base64-changing-characters"></a>
1374+
#### Changing Base64 Characters
1375+
1376+
In an effort to see if signatures or encryption are truly validated correctly, some people try to edit a JWT
1377+
string - particularly the Base64-encoded signature part - to see if the edited string fails security validations.
1378+
1379+
This conceptually makes sense: if you change the signature string, you would assume that signature validation would fail.
1380+
1381+
_But this doesn't always work. Changing Base64 characters is an invalid test_.
1382+
1383+
Why?
1384+
1385+
Because of the way the Base64 algorithm works, there are multiple Base64 strings that can represent the same raw byte
1386+
array.
1387+
1388+
Going into the details of the Base64 algorithm is out of scope for this documentation, but there are many good
1389+
Stack Overflow [answers](https://stackoverflow.com/questions/33663113/multiple-strings-base64-decoded-to-same-byte-array?noredirect=1&lq=1)
1390+
and [JJWT issue comments](https://github.com/jwtk/jjwt/issues/211#issuecomment-283076269) that explain this in detail.
1391+
Here's one [good answer](https://stackoverflow.com/questions/29941270/why-do-base64-decode-produce-same-byte-array-for-different-strings):
1392+
1393+
> Remember that Base64 encodes each 8 bit entity into 6 bit chars. The resulting string then needs exactly
1394+
> 11 * 8 / 6 bytes, or 14 2/3 chars. But you can't write partial characters. Only the first 4 bits (or 2/3 of the
1395+
> last char) are significant. The last two bits are not decoded. Thus all of:
1396+
>
1397+
> dGVzdCBzdHJpbmo
1398+
> dGVzdCBzdHJpbmp
1399+
> dGVzdCBzdHJpbmq
1400+
> dGVzdCBzdHJpbmr
1401+
> All decode to the same 11 bytes (116, 101, 115, 116, 32, 115, 116, 114, 105, 110, 106).
1402+
1403+
As you can see from the four examples above, they all decode to the same exact 11 bytes. So just changing one or two
1404+
characters at the end of a Base64 string may not work and can often result in an invalid test.
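
The four strings quoted above can be checked directly. The sketch below uses the JDK's `java.util.Base64` decoder, which accepts unpadded input and does not inspect the unused trailing bits; that choice is an assumption made for illustration, and a stricter decoder could reject some of these variants. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class; not part of JJWT.
final class SameBytesDifferentText {

    // The four variants from the quoted answer. They differ only in the last
    // character, whose final two bits are not part of the decoded data.
    static final String[] VARIANTS = {
        "dGVzdCBzdHJpbmo",
        "dGVzdCBzdHJpbmp",
        "dGVzdCBzdHJpbmq",
        "dGVzdCBzdHJpbmr"
    };

    // Returns true if every variant decodes to the same byte array as the first.
    static boolean allDecodeToSameBytes() {
        byte[] first = Base64.getDecoder().decode(VARIANTS[0]);
        for (String variant : VARIANTS) {
            if (!Arrays.equals(first, Base64.getDecoder().decode(variant))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String variant : VARIANTS) {
            byte[] decoded = Base64.getDecoder().decode(variant);
            System.out.println(variant + " -> " + Arrays.toString(decoded));
        }
        System.out.println("all identical: " + allDecodeToSameBytes());
    }
}
```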
1405+
1406+
<a name="base64-invalid-characters"></a>
1407+
##### Adding Invalid Characters
1408+
1409+
JJWT's default Base64/Base64URL decoders automatically ignore illegal Base64 characters located in the beginning and
1410+
end of an encoded string. Therefore prepending or appending invalid characters like `{` or `]` or similar will also
1411+
not fail JJWT's signature checks. Why?
1412+
1413+
Because such edits - whether changing a trailing character or two, or appending invalid characters - do not actually
1414+
change the _real_ signature, which, in cryptographic contexts, is always a byte array. Instead, tests like these
1415+
change a text encoding of the byte array, and as we covered above, they are different things.
1416+
1417+
So JJWT 'cares' more about the real byte array and less about its text encoding because that is what actually matters
1418+
in cryptographic operations. In this sense, JJWT follows the [Robustness Principle](https://en.wikipedia.org/wiki/Robustness_principle)
1419+
in being _slightly_ lenient on what is accepted per the rules of Base64, but if anything in the real underlying
1420+
byte array is changed, then yes, JJWT's cryptographic assertions will definitely fail.
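
The difference between editing the text encoding and editing the real signature bytes can be illustrated with a small HMAC-SHA256 check written against the JDK only. This is a sketch under stated assumptions, not JJWT's implementation: the class, its method names, and the sample key and message are hypothetical, and `java.util.Base64` stands in for JJWT's decoder.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class; not part of JJWT.
final class SignatureBytesDemo {

    static final String URL_ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

    // Computes the raw 32-byte HMAC-SHA256 signature of the message.
    static byte[] hmacSha256(byte[] key, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Verification compares decoded bytes, never the Base64URL text itself.
    static boolean verify(byte[] key, String message, String encodedSignature) {
        byte[] expected = hmacSha256(key, message);
        byte[] provided = Base64.getUrlDecoder().decode(encodedSignature);
        return MessageDigest.isEqual(expected, provided);
    }

    // Swaps the last character for its alphabet neighbour. For a 32-byte
    // signature (43 unpadded characters) this touches only the two unused
    // trailing bits, so the decoded bytes stay the same.
    static String changeTrailingBits(String encoded) {
        int last = encoded.length() - 1;
        int index = URL_ALPHABET.indexOf(encoded.charAt(last));
        return encoded.substring(0, last) + URL_ALPHABET.charAt(index + 1);
    }

    // Swaps the first character, which changes bits of the first signature byte.
    static String changeRealBits(String encoded) {
        char replacement = encoded.charAt(0) == 'A' ? 'B' : 'A';
        return replacement + encoded.substring(1);
    }

    public static void main(String[] args) {
        byte[] key = "0123456789abcdef0123456789abcdef".getBytes(StandardCharsets.UTF_8);
        String message = "header.payload";
        String signature = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(hmacSha256(key, message));

        System.out.println("original:       " + verify(key, message, signature));
        System.out.println("trailing edit:  " + verify(key, message, changeTrailingBits(signature)));
        System.out.println("real-byte edit: " + verify(key, message, changeRealBits(signature)));
    }
}
```

With a lenient decoder like the JDK's, the trailing-character edit still verifies because the decoded bytes are unchanged, while the edit to the first character alters the real signature bytes and verification fails.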
1421+
1422+
To help understand JJWT's approach, we have to remember why signatures exist. From our documentation above on
1423+
[signing JWTs](#jws):
1424+
1425+
> * guarantees it was created by someone we know (it is authentic), as well as
1426+
> * guarantees that no-one has manipulated or changed it after it was created (its integrity is maintained).
1427+
1428+
Just prepending or appending invalid text to try to 'trick' the algorithm doesn't change the integrity of the
1429+
underlying claims or signature byte arrays, nor the authenticity of the claims byte array, because those byte
1430+
arrays are still obtained intact.
1431+
1432+
Please see [JJWT Issue #518](https://github.com/jwtk/jjwt/issues/518) and its referenced issues and links for more
1433+
information.
13311434

13321435
<a name="base64-custom"></a>
13331436
### Custom Base64