
Uni Code Basic

The document discusses Unicode and encoding, focusing on UTF-8 patterns and comparisons with UTF-16 and UTF-32. It also covers image formats, differentiating between vector and bitmap images, and includes practice exercises for encoding and image size calculations. Additionally, it provides exam reminders regarding units and assumptions for encoding-related questions.

Uploaded by

senbeth11
Copyright
© All Rights Reserved

AS Computer Science: Unicode (UTF-8 focus) and Images

Unicode vs Encoding

• Unicode: a catalogue of characters; each has a unique code point, e.g. ’A’ = U+0041, ’€’ (euro sign) = U+20AC.

• Encoding: rules to store code points as bytes. We compare UTF-8, UTF-16, UTF-32.

UTF-8 patterns
Range               Pattern                                Bytes
U+0000–U+007F       0xxxxxxx                               1
U+0080–U+07FF       110xxxxx 10xxxxxx                      2
U+0800–U+FFFF       1110xxxx 10xxxxxx 10xxxxxx             3
U+10000–U+10FFFF    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx    4
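
The table can be turned into code almost line by line. The sketch below is a minimal illustration, not a production encoder: the function name `utf8_encode` is made up for this example, and it does not reject the surrogate range U+D800–U+DFFF, which a real encoder would.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point using the bit patterns from the table."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if code_point <= 0x7F:
        # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


print(utf8_encode(0x20AC).hex(" ").upper())  # E2 82 AC
```

Each continuation byte carries 6 payload bits, which is why the code shifts by multiples of 6 and masks with 0x3F.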

Examples (hex bytes)

• ’A’ U+0041 → 41; ’é’ U+00E9 → C3 A9

• ’£’ U+00A3 → C2 A3; ’€’ U+20AC → E2 82 AC

• ’你’ U+4F60 → E4 BD A0; ’ش’ U+0634 → D8 B4

• ’😀’ U+1F600 → F0 9F 98 80
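
One way to check examples like these is Python's built-in codec. The sketch below assumes Python 3.8 or later (for the separator argument of `bytes.hex`); the helper names `utf8_hex` and `utf8_decode_hex` are invented for this illustration.

```python
def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 bytes of a code point as spaced uppercase hex."""
    return chr(code_point).encode("utf-8").hex(" ").upper()


def utf8_decode_hex(hex_bytes: str) -> str:
    """Decode a string of hex bytes (spaces allowed) as UTF-8 text."""
    return bytes.fromhex(hex_bytes).decode("utf-8")


# Code points taken from the examples in this section.
for cp in (0x0041, 0x00E9, 0x00A3, 0x20AC, 0x4F60, 0x0634, 0x1F600):
    encoded = utf8_hex(cp)
    # Round trip: decoding the bytes must give back the same character.
    assert utf8_decode_hex(encoded) == chr(cp)
    print(f"U+{cp:04X} -> {encoded}")
```

The same two helpers can be used to check answers to encode and decode questions after working them by hand.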

Compare encodings

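• UTF-8: 1–4 bytes per code point; ASCII characters stay 1 byte; the standard encoding on the web.

• UTF-16: 2 or 4 bytes per code point; uses surrogate pairs for U+10000 and above; files often start with a byte order mark (BOM) showing little-endian (LE) or big-endian (BE) order.

• UTF-32: always 4 bytes per code point; simplest indexing; produces large files.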
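
The comparison can be seen by measuring the same text in each encoding. This is a small sketch using Python's standard codecs; the function name `encoded_sizes` is an assumption of the example, and the `-le` codec variants are chosen so that no BOM is counted.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    codecs = {"UTF-8": "utf-8", "UTF-16": "utf-16-le", "UTF-32": "utf-32-le"}
    return {name: len(text.encode(codec)) for name, codec in codecs.items()}


for sample in ("A", "\u20ac", "\U0001F600"):  # 'A', euro sign, U+1F600
    print(f"U+{ord(sample):04X}", encoded_sizes(sample))

# The plain "utf-16" codec prepends a 2-byte BOM, so 'A' takes 4 bytes.
print(len("A".encode("utf-16")))  # 4
```

For these three characters the sizes are 1/2/4, 3/2/4 and 4/4/4 bytes respectively, which shows that UTF-8 is smallest for ASCII but not for every character.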

Practice (Unicode)

1. Encode to UTF-8: (a) ’£’ U+00A3; (b) ’ह’ U+0939; (c) ’🙂’ U+1F642.

2. Decode: (a) C3 A7; (b) E6 97 A5; (c) F0 9F 8E 89.

3. A file stores the word Café (where é is U+00E9) in UTF-8. How many bytes does it occupy?

Images: vector vs bitmap

• Vector: paths, strokes, fills (SVG/PDF). Scales perfectly.

• Bitmap: pixels in a grid. Resolution = width × height, bit depth in bits per pixel.

• Uncompressed size ≈ w × h × bpp / 8 bytes, where w and h are the width and height in pixels and bpp is the bit depth.

• Lossless (PNG/GIF) vs Lossy (JPEG). Metadata (EXIF) adds bytes.
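
The size formula above can be sketched as a short function. The name `uncompressed_size_bytes` and the sample dimensions are chosen for illustration only; the sketch ignores compression and metadata, and takes 1 KiB = 1024 bytes, an assumption that should be stated in an answer.

```python
def uncompressed_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Return width x height x bpp / 8, rounded up to a whole byte."""
    total_bits = width * height * bits_per_pixel
    return -(-total_bits // 8)  # ceiling division for depths below 8 bpp


size = uncompressed_size_bytes(800, 600, 24)
print(size, "bytes")        # 1440000 bytes
print(size / 1024, "KiB")   # 1406.25 KiB
```

Working in bits first and dividing by 8 at the end keeps the calculation correct for bit depths that are not a multiple of 8.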

Practice (Images)

1. Choose a format and justify: (a) school logo; (b) holiday photo; (c) UI icons.

2. Calculate uncompressed size: (a) 1024 × 768 at 24bpp; (b) 3840 × 2160 at 24bpp.

3. A banner 2560 × 720 at 24bpp is saved as JPEG (lossy). Explain why the file on disk is usually
much smaller than the uncompressed size.

Exam reminders

Show units, state assumptions (“ignore compression”), and use correct encoding for size questions
(UTF-8 is variable length).
