0% found this document useful (0 votes)
26 views8 pages

Advanced Data Structures in Python

This document provides an overview of Python's collections module, detailing specialized data structures such as Counter, deque, namedtuple, OrderedDict, and defaultdict. Each structure is explained with its key features, initialization methods, common methods, and code examples to illustrate their usage. Mastering these data structures can enhance code efficiency and clarity.

Uploaded by

raghuveera97n
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
26 views8 pages

Advanced Data Structures in Python

This document provides an overview of Python's collections module, detailing specialized data structures such as Counter, deque, namedtuple, OrderedDict, and defaultdict. Each structure is explained with its key features, initialization methods, common methods, and code examples to illustrate their usage. Mastering these data structures can enhance code efficiency and clarity.

Uploaded by

raghuveera97n
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

Introduction

While Python's standard data structures (list, tuple, dict, set) are powerful, they are
designed for general-purpose use. The collections module offers more specialized
tools designed to solve specific problems with better performance and more
readable, convenient APIs. Mastering these data structures can significantly improve
the efficiency and clarity of your code.

This guide provides a detailed look at the following key data structures from the
module:
●​ Counter
●​ deque
●​ namedtuple
●​ OrderedDict
●​ defaultdict
1. collections.Counter
A Counter is a dictionary subclass specifically designed for counting hashable
objects. It's a collection where elements are stored as dictionary keys and their counts
are stored as dictionary values.

Key Features
●​ Counting Made Easy: It simplifies the process of tallying frequencies of items in
an iterable.
●​ Dictionary-like: It behaves much like a regular dictionary, but returns a zero
count for missing items instead of raising a KeyError.
●​ Arithmetic Operations: It supports arithmetic operations for combining
counters.
Initialization
You can create a Counter in several ways:

from collections import Counter​



# From an iterable (like a list or string)​
c1 = Counter(['apple', 'orange', 'apple', 'banana', 'orange', 'apple'])​
print(c1) # Output: Counter({'apple': 3, 'orange': 2, 'banana': 1})​

c2 = Counter('programming')​
print(c2) # Output: Counter({'r': 2, 'g': 2, 'm': 2, 'p': 1, 'o': 1, 'a': 1, 'i': 1, 'n': 1})​

# From a dictionary​
c3 = Counter({'red': 4, 'blue': 2})​
print(c3) # Output: Counter({'red': 4, 'blue': 2})​

# Using keyword arguments​
c4 = Counter(cats=4, dogs=8)​
print(c4) # Output: Counter({'dogs': 8, 'cats': 4})​

Common Methods

Method Description

elements() Returns an iterator over elements, repeating


each as many times as its count.

most_common([n]) Returns a list of the n most common elements


and their counts.

subtract(iterable/mapping) Subtracts counts of elements from another


iterable or mapping.

update(iterable/mapping) Adds counts of elements from another iterable


or mapping.

Code Example: Word Frequency


from collections import Counter​

sentence = "the quick brown fox jumps over the lazy dog"​
words = sentence.split()​

word_counts = Counter(words)​

# Find the 3 most common words​
print(f"Most common words: {word_counts.most_common(3)}")​

# Using elements()​
print(f"Elements iterator: {sorted(list(word_counts.elements()))}")​

# Arithmetic with another counter​
inventory = Counter(apple=5, orange=3)​
sold = Counter(apple=2, orange=3, banana=1)​
inventory.subtract(sold)​
print(f"Remaining inventory: {inventory}")​
# Output:​
# Most common words: [('the', 2), ('quick', 1), ('brown', 1)]​
# Elements iterator: ['brown', 'dog', 'fox', 'jumps', 'lazy', 'over', 'quick', 'the', 'the']​
# Remaining inventory: Counter({'apple': 3, 'orange': 0, 'banana': -1})​

2. collections.deque
A deque (pronounced "deck") object is a double-ended queue. It's a list-like container
optimized for fast appends and pops from both its ends.

Key Features
●​ High Performance: Appending or popping from either end of a deque takes
constant time, O(1). In contrast, inserting or removing from the beginning of a
standard list is slow, O(n).
●​ Thread-Safe: Deques support thread-safe, memory-efficient appends and pops
from either side.
●​ Bounded Size: A deque can be created with a maximum length (maxlen). When a
bounded deque is full, it automatically discards items from the opposite end when
new items are added.
Common Methods

Method Description Time Complexity

append(x) Add x to the right side. O(1)

appendleft(x) Add x to the left side. O(1)

pop() Remove and return an O(1)


element from the right side.

popleft() Remove and return an O(1)


element from the left side.

extend(iterable) Extend the right side by O(k)


appending from an iterable.

extendleft(iterable) Extend the left side (note: the O(k)


iterable is reversed).

rotate(n) Rotate the deque n steps to O(k)


the right. n can be negative.

Code Example: Processing a Queue and Keeping History


from collections import deque​

# Basic queue operations​
tasks = deque()​
tasks.append("Task 1")​
tasks.append("Task 2")​
tasks.append("Task 3")​
print(f"Initial tasks: {tasks}")​

# Process a task from the left​
processed_task = tasks.popleft()​
print(f"Processed: {processed_task}, Remaining tasks: {tasks}")​

# Keeping a history of the last N items​
last_five_searches = deque(maxlen=5)​
last_five_searches.append("python collections")​
last_five_searches.append("python deque")​
last_five_searches.append("python list performance")​
last_five_searches.append("python generators")​
last_five_searches.append("python decorators")​
print(f"\nLast 5 searches: {last_five_searches}")​

# Adding a new item pushes the oldest one out​
last_five_searches.append("python async")​
print(f"After one more search: {last_five_searches}")​

# Rotating elements​
d = deque([1, 2, 3, 4, 5])​
d.rotate(2) # rotate right​
print(f"\nRotated deque: {d}")​
# Output:​
# Initial tasks: deque(['Task 1', 'Task 2', 'Task 3'])​
# Processed: Task 1, Remaining tasks: deque(['Task 2', 'Task 3'])​
#​
# Last 5 searches: deque(['python collections', 'python deque', 'python list
performance', 'python generators', 'python decorators'], maxlen=5)​
# After one more search: deque(['python deque', 'python list performance', 'python
generators', 'python decorators', 'python async'], maxlen=5)​
#​
# Rotated deque: deque([4, 5, 1, 2, 3])​

3. collections.namedtuple
A namedtuple is a factory function for creating tuple subclasses with named fields. It
allows you to create lightweight, immutable object types that are more readable than
regular tuples.

Key Features
●​ Readability: Accessing fields by name (e.g., point.x) is more descriptive than by
index (point[0]).
●​ Backward Compatibility: It can be used anywhere a regular tuple can, including
being unpacked.
●​ Lightweight: It's memory-efficient, just like a regular tuple.

Defining and Using a namedtuple


from collections import namedtuple​

# Define a namedtuple type 'Point' with fields 'x' and 'y'​
Point = namedtuple('Point', ['x', 'y'])​

# Create an instance​
p1 = Point(10, 20)​

# Access fields by name​
print(f"Access by name: x={p1.x}, y={p1.y}")​

# Access fields by index (like a regular tuple)​
print(f"Access by index: x={p1[0]}, y={p1[1]}")​

# Unpack it like a regular tuple​
x_val, y_val = p1​
print(f"Unpacked values: x={x_val}, y={y_val}")​

# namedtuples are immutable​
try:​
p1.x = 30​
except AttributeError as e:​
print(f"\nError on trying to modify: {e}")​

Helper Methods
●​ _make(iterable): Class method to create a new instance from an existing iterable.
●​ _asdict(): Returns a new OrderedDict which maps field names to their values.
●​ _fields: A tuple of strings listing the field names.

# Using helper methods​


data = [100, 200]​
p2 = Point._make(data)​
print(f"\nCreated with _make(): {p2}")​

print(f"As dictionary: {p2._asdict()}")​
print(f"Field names: {p2._fields}")​

4. collections.OrderedDict
An OrderedDict is a dictionary subclass that remembers the order in which items were
inserted.

Important Note: Since Python 3.7, the standard dict type also preserves insertion
order. OrderedDict is now less critical but still has some unique features that make it
useful.

OrderedDict vs. Standard dict (Python 3.7+)

Feature OrderedDict dict

Order Preservation Yes, by design. Yes, as an implementation


detail (guaranteed in 3.7+).

Equality Test (==) Order-sensitive. Order-insensitive. dict(a=1,


OrderedDict(a=1, b=2) != b=2) == dict(b=2, a=1).
OrderedDict(b=2, a=1).
Reordering Provides move_to_end(key, No direct reordering method.
last=True).

Popping Items popitem(last=True) can pop popitem() only pops the last
from either end (LIFO/FIFO). inserted item (LIFO).

Intent Explicitly signals that order is Order is a feature, but may


critical to logic. not be critical.

Code Example
from collections import OrderedDict​

# Create an OrderedDict​
od = OrderedDict()​
od['apple'] = 4​
od['banana'] = 2​
od['orange'] = 5​
print(f"OrderedDict: {od}")​

# Move an item​
od.move_to_end('apple', last=False) # move 'apple' to the beginning​
print(f"After moving 'apple' to front: {od}")​

# Pop an item from the beginning​
first_item = od.popitem(last=False)​
print(f"Popped first item: {first_item}")​
print(f"Remaining dict: {od}")​

5. collections.defaultdict
A defaultdict is a dictionary subclass that calls a factory function to supply missing
values. It's a convenient way to handle missing keys without writing extra logic.

Key Feature
When a key is accessed for the first time, if it's not already in the dictionary, it is
automatically created with a default value provided by the default_factory. This avoids
KeyError.

Initialization and Usage


The defaultdict constructor takes a default_factory as its first argument. Common
factories are int, list, set, or any other callable that returns a value.

from collections import defaultdict​



# Grouping items into lists​
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]​
d = defaultdict(list)​
for k, v in s:​
d[k].append(v)​

print(f"Grouped items: {sorted(d.items())}")​

# Counting items​
# default_factory is 'int', which returns 0​
word = "mississippi"​
d = defaultdict(int)​
for k in word:​
d[k] += 1​

print(f"Letter counts: {sorted(d.items())}")​

Without defaultdict, the same grouping logic would be more verbose:

# Equivalent logic with a standard dict​


d_standard = {}​
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]​
for k, v in s:​
if k not in d_standard:​
d_standard[k] = []​
d_standard[k].append(v)​

You might also like