Sorted Notes
Alright — let’s go through Python’s `sorted()` in detail, but in a clear, no-dusty-
documents way.
---
## **1. What it does**
`sorted(iterable, *, key=None, reverse=False)`
* Takes **any iterable** (list, tuple, string, set, dict, generator, etc.)
* Returns a **new list** containing the elements in sorted order.
* **Does not modify** the original iterable (unlike `.sort()` which works in-place
on lists).
---
## **2. Parameters**
### **iterable**
Anything you can loop over:
```python
sorted([3, 1, 2]) # [1, 2, 3]
sorted("banana") # ['a', 'a', 'a', 'b', 'n', 'n']
sorted({5, 2, 9}) # [2, 5, 9]
```
---
### **key**
A function that decides **what to sort by**.
For example, sort words by length:
```python
words = ["apple", "kiwi", "banana"]
sorted(words, key=len)
# ['kiwi', 'apple', 'banana']
```
You can use **any function or lambda**:
```python
nums = [-5, 1, -10, 3]
sorted(nums, key=abs) # sort by absolute value
# [1, 3, -5, -10]
```
---
### **reverse**
If `True`, reverses the order:
```python
sorted([3, 1, 2], reverse=True)
# [3, 2, 1]
```
---
## **3. Important points**
1. **Doesn’t mutate** the original object:
```python
l = [3, 1, 2]
sorted(l) # [1, 2, 3]
print(l) # [3, 1, 2]
```
2. **Stable sort** — if two elements are equal according to the `key`, their
original order is kept:
```python
data = [('a', 2), ('b', 2), ('c', 1)]
sorted(data, key=lambda x: x[1])
# [('c', 1), ('a', 2), ('b', 2)] ← 'a' stays before 'b'
```
3. Works with **mixed types** only if they can be compared.
In Python 3, you **can’t** sort a list with both numbers and strings:
```python
sorted([1, "apple"]) # TypeError
```
---
✅ **Summary**:
`sorted()` is like a polite guest — it takes your data, arranges it nicely into a
**new list**, and leaves the original untouched. You can tell it **how** to arrange
things using `key` and `reverse`.
---
If you want, I can also explain **why `sorted()` is usually preferred over
`.sort()` in functions** — that’s a subtle but important design choice.
The short answer: **`sorted()` is safer and more flexible than `.sort()` when
you’re writing functions, because it doesn’t mutate the input and works on any
iterable, not just lists.**
Let’s break it down:
---
## **1. `.sort()` changes the list in-place**
```python
data = [3, 1, 2]
data.sort()
print(data) # [1, 2, 3]
```
* In-place changes mean you **lose the original order** unless you made a copy.
* If your function receives a list from outside code, modifying it can cause
**unexpected side effects**.
Example problem:
```python
def sort_and_do_something(numbers):
numbers.sort()
# do stuff
return numbers
nums = [5, 3, 1]
result = sort_and_do_something(nums)
print(nums) # [1, 3, 5] ← original list changed (oops)
```
---
## **2. `sorted()` creates a new list**
```python
data = [3, 1, 2]
sorted_data = sorted(data)
print(data) # [3, 1, 2] (unchanged)
print(sorted_data) # [1, 2, 3]
```
* No risk of accidental mutation.
* Makes functions **pure** (same input → same output, no hidden changes).
---
## **3. `sorted()` works on *any* iterable**
`.sort()` only works on lists:
```python
[3, 1, 2].sort() # ✅ works
(3, 1, 2).sort() # ❌ AttributeError
```
But `sorted()` works on:
```python
sorted((3, 1, 2)) # tuple
sorted({3, 1, 2}) # set
sorted("cba") # string
```
---
## **4. `sorted()` can be chained easily**
Because it returns a new list, you can do:
```python
result = sorted(data, key=len, reverse=True)[:5]
```
With `.sort()`, you’d need extra steps because it returns `None`:
```python
data.sort(key=len, reverse=True)
result = data[:5]
```
---
✅ **Rule of thumb**:
* Use `.sort()` **only** when:
* You have a list,
* You *intend* to modify it,
* And you don’t care about the original order.
* Use `sorted()` for **everything else**, especially inside functions.
---
If you want, I can also explain **why Python made `.sort()` return `None`** — it’s
actually a deliberate design to prevent bugs.
Sure — I can walk you through a simplified version of how Python’s `sorted()` works
internally.
The real implementation is in **C** (in `Objects/listobject.c`), but here’s a
**Python-like pseudocode** version so you can understand the logic.
---
## **Simplified `sorted()`**
```python
def sorted(iterable, *, key=None, reverse=False):
# 1. Convert the iterable into a list (so we can sort it)
result = list(iterable)
# 2. If a key function is given, use it in sorting
if key is None:
result.sort(reverse=reverse) # In-place TimSort
else:
result.sort(key=key, reverse=reverse)
# 3. Return the sorted copy
return result
```
---
## **What really happens under the hood**
1. **Convert to list**
* `sorted()` works on *any* iterable.
* First step: turn it into a list so it can be sorted in-place.
```python
result = list(iterable)
```
2. **Use TimSort**
* Python’s `list.sort()` uses **TimSort**, a hybrid of merge sort and insertion
sort.
* **Why TimSort?**
* Very fast for partially sorted data
* Stable (keeps order of equal elements)
* O(n log n) worst case, O(n) best case
3. **Handle `key`**
* If `key` is provided, Python actually transforms each element into a
`(key_value, original_value)` tuple internally (Schwartzian transform).
* This avoids calling `key` repeatedly for the same item.
4. **Reverse if needed**
* If `reverse=True`, just reverse the result after sorting (efficiently).
---
## **Real-world example**
```python
nums = [3, 1, 2]
print(sorted(nums)) # [1, 2, 3]
print(sorted(nums, reverse=True)) # [3, 2, 1]
print(sorted(nums, key=lambda x: -x)) # [3, 2, 1]
```
---
✅ **In summary**:
`sorted()` → **copy iterable → run TimSort → return new list**.
That’s why it’s safe, works for any iterable, and doesn’t touch your original data.
---
If you want, I can also show you **a minimal Python implementation of TimSort** so
you can see exactly how the sorting logic works. It’s surprisingly elegant.
Alright — let’s break down **Timsort** step-by-step:
It’s Python’s built-in sorting algorithm for lists and tuples (`list.sort()` and
`sorted()`), introduced in Python 2.3 (2002).
---
## **1. What is Timsort?**
Timsort is a **hybrid sorting algorithm** that combines:
* **Insertion Sort** (good for small or nearly sorted data)
* **Merge Sort** (good for large, arbitrary data)
It was designed by **Tim Peters** specifically for real-world data patterns.
The key ideas:
* Real-world data often contains **runs** (consecutive increasing or decreasing
sequences).
* Sorting can be accelerated by **identifying these runs** and merging them
smartly.
---
## **2. Step-by-Step How It Works**
### Step 1 — Find natural runs
* The list is scanned to find ascending or descending sequences.
* Descending runs are reversed to make them ascending.
* These runs are of length ≥ `minrun` (a small chunk size, usually **32–64**,
chosen based on list size).
```python
def find_runs(arr, minrun):
runs = []
i = 0
n = len(arr)
while i < n:
run_start = i
# detect ascending or descending
if i + 1 < n and arr[i] <= arr[i+1]:
while i + 1 < n and arr[i] <= arr[i+1]:
i += 1
else:
while i + 1 < n and arr[i] > arr[i+1]:
i += 1
arr[run_start:i+1] = reversed(arr[run_start:i+1])
run_end = i + 1
run_len = run_end - run_start
# extend run to minrun length if needed
if run_len < minrun:
extend_to = min(run_start + minrun, n)
insertion_sort(arr, run_start, extend_to)
run_end = extend_to
runs.append((run_start, run_end))
i = run_end
return runs
```
---
### Step 2 — Sort small runs using Insertion Sort
* Since small lists are faster to sort with Insertion Sort than Merge Sort, each
run is sorted individually.
```python
def insertion_sort(arr, left, right):
for i in range(left+1, right):
key = arr[i]
j = i - 1
while j >= left and arr[j] > key:
arr[j+1] = arr[j]
j -= 1
arr[j+1] = key
```
---
### Step 3 — Merge runs (like Merge Sort, but smart)
* Runs are merged using a **merge strategy** that:
* Keeps run sizes balanced (avoid huge merge cost)
* Uses **galloping mode** when one run wins repeatedly (optimization for
partially sorted data)
```python
def merge(left, right):
result = []
i = j = 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
result.extend(left[i:])
result.extend(right[j:])
return result
```
---
### Step 4 — Final sorted array
* After merging all runs, you get a fully sorted array.
---
## **3. Complexity**
| Case | Time Complexity
|
| ---------- |
----------------------------------------------------------------------------------
|
| Best Case | **O(n)** (when data is already sorted or nearly sorted — insertion
sort dominates) |
| Average | **O(n log n)**
|
| Worst Case | **O(n log n)**
|
---
## **4. Space Complexity**
* **O(n)** auxiliary space (because of merging)
* But merges are done **in-place** where possible, and Python’s implementation
minimizes extra memory.
---
## **5. Python’s C Implementation**
The actual Python source is in:
* **CPython repo:** `Objects/listobject.c` and `Objects/timsort.h`
* Written in C, highly optimized
* Uses a **stack of runs** and merging rules
You can see the official C version here:
[https://github.com/python/cpython/blob/main/Objects/listobject.c](https://
github.com/python/cpython/blob/main/Objects/listobject.c)
---
## **6. Minimal Python Implementation of Timsort**
Here’s a small Python version (simplified, no galloping):
```python
def timsort(arr):
minrun = 32
n = len(arr)
# Sort small chunks with insertion sort
for start in range(0, n, minrun):
end = min(start + minrun, n)
insertion_sort(arr, start, end)
# Merge chunks
size = minrun
while size < n:
for start in range(0, n, size * 2):
mid = start + size
end = min(start + size * 2, n)
if mid < end:
arr[start:end] = merge(arr[start:mid], arr[mid:end])
size *= 2
return arr
```
---
If you want, I can also **show you the exact merging stack logic from CPython** so
you can see how it enforces run size rules — that’s the part that makes Timsort
better than a plain merge sort.
Do you want me to walk through that part next?
Got it — you want to use `sorted()` to sort a dictionary **by keys** and **by
values**.
Let’s go step-by-step with examples.
---
## Sample dictionary
```python
data = {"b": 3, "a": 1, "c": 2}
```
---
## **1. Sort by key**
```python
sorted_by_key = dict(sorted(data.items()))
print(sorted_by_key)
# {'a': 1, 'b': 3, 'c': 2}
```
**Explanation:**
* `data.items()` → returns list-like `dict_items` of `(key, value)` pairs.
* `sorted()` sorts by key by default (since `(key, value)` tuples are compared by
the first element).
* `dict(...)` turns it back into a dictionary.
---
## **2. Sort by value**
```python
sorted_by_value = dict(sorted(data.items(), key=lambda item: item[1]))
print(sorted_by_value)
# {'a': 1, 'c': 2, 'b': 3}
```
**Explanation:**
* `key=lambda item: item[1]` → sort by value (the second element of the tuple).
---
## **3. Sort by value (descending)**
```python
sorted_by_value_desc = dict(sorted(data.items(), key=lambda item: item[1],
reverse=True))
print(sorted_by_value_desc)
# {'b': 3, 'c': 2, 'a': 1}
```
---
## **4. Sort by key (descending)**
```python
sorted_by_key_desc = dict(sorted(data.items(), key=lambda item: item[0],
reverse=True))
print(sorted_by_key_desc)
# {'c': 2, 'b': 3, 'a': 1}
```
---
📌 **Note**:
From Python 3.7+, dictionaries **remember insertion order**, so sorting and then
converting back to a dict keeps that order.
---
If you want, I can show you **a combined sort** — for example, sort by value first,
then by key if values are equal. That’s a common interview-style question.