The task is to extract the words that appear in only one of two given strings, ignoring the words the strings share. For example:
A = "Geeks for Geeks"
B = "Learning from Geeks for Geeks"
Output: ['Learning', 'from'].
Let’s explore multiple methods to find uncommon words in Python.
Using collections.Counter
Counter from the collections module counts word occurrences efficiently. It processes both strings and stores the word frequencies in a dictionary-like structure. The uncommon words are then extracted by keeping only the words whose combined count is exactly one.
from collections import Counter
s1 = "Geeks for Geeks"
s2 = "Learning from Geeks for Geeks"
count = Counter(s1.split()) + Counter(s2.split())
res = [word for word in count if count[word] == 1]
print(res)
Output
['Learning', 'from']
Explanation:
- s1.split() and s2.split() divide the strings into words.
- Counter(...) + Counter(...) merges word counts from both strings.
- [word for word in count if count[word] == 1] selects words that appear exactly once.
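To make the merging step concrete, the following minimal sketch prints the combined counts before filtering. The names merged and uncommon are chosen here for illustration only.

```python
from collections import Counter

s1 = "Geeks for Geeks"
s2 = "Learning from Geeks for Geeks"

# Adding two Counters sums the counts of keys they share
# and keeps the keys that occur in only one of them.
merged = Counter(s1.split()) + Counter(s2.split())
print(dict(merged))  # {'Geeks': 4, 'for': 2, 'Learning': 1, 'from': 1}

# Only words whose combined count is exactly 1 are uncommon.
uncommon = [word for word, n in merged.items() if n == 1]
print(uncommon)  # ['Learning', 'from']
```

Words such as 'Geeks' end up with a combined count above one, so the filter drops them.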
Using get()
This approach manually builds a dictionary of word counts from both strings. The get() method returns a word's current count, or a default of 0 if the word has not been seen yet, so each update needs no separate membership check. After the dictionary is built, uncommon words are extracted by checking their frequency.
s1 = "Geeks for Geeks"
s2 = "Learning from Geeks for Geeks"
d = {}
for word in (s1 + " " + s2).split():
    d[word] = d.get(word, 0) + 1
res = [word for word in d if d[word] == 1]
print(res)
Output
['Learning', 'from']
Explanation:
- (s1 + " " + s2).split() merges and splits both strings into words.
- d.get(word, 0) + 1 increments the word count or initializes it to 1.
- List comprehension selects words that appear only once.
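The role of get() is easier to see in isolation. This small sketch, which uses a throwaway dictionary named counts (an illustrative name), shows that get() returns the default for a missing key instead of raising KeyError.

```python
counts = {}

# The key is missing, so get() falls back to the default value 0.
first_lookup = counts.get("Geeks", 0)
print(first_lookup)  # 0

# The same expression works for the first and every later occurrence.
counts["Geeks"] = counts.get("Geeks", 0) + 1
counts["Geeks"] = counts.get("Geeks", 0) + 1
print(counts)  # {'Geeks': 2}
```

Writing counts["Geeks"] += 1 on the empty dictionary would raise KeyError, which is why the default is needed.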
Using set
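Set operations identify words that are unique to each string. The symmetric difference (^) extracts words that are present in one string but not the other. Because a set discards duplicates, this method ignores how often a word occurs within a single string, and because sets are unordered, the order of the output may vary between runs.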
s1 = "Geeks for Geeks"
s2 = "Learning from Geeks for Geeks"
set1 = set(s1.split())
set2 = set(s2.split())
res = list(set1 ^ set2)
print(res)
Output
['from', 'Learning']
Explanation:
- set(s1.split()) and set(s2.split()) create sets of unique words.
- set1 ^ set2 gives words present in only one set.
- list(...) converts the set back to a list.
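The set method and the counting methods can disagree when a word repeats inside a single string. The sketch below uses two made-up strings, a and b, to show the difference. The results are sorted only to make the printed order deterministic.

```python
from collections import Counter

# Hypothetical inputs: 'apple' repeats in a and never appears in b.
a = "apple apple banana"
b = "banana cherry"

# Symmetric difference: words found in exactly one of the two sets.
by_set = sorted(set(a.split()) ^ set(b.split()))

# Counting: words whose combined count across both strings is 1.
count = Counter(a.split()) + Counter(b.split())
by_count = sorted(w for w in count if count[w] == 1)

print(by_set)    # ['apple', 'cherry']
print(by_count)  # ['cherry']
```

Which result is wanted depends on whether a word repeated within one string should still count as uncommon.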
Using for loop
This approach loops over every word and calls words.count(word), which itself scans the entire list, so the work is effectively a nested loop. It needs no dictionary or set, but its running time grows quadratically with the number of words, which makes it inefficient for long strings.
s1 = "Geeks for Geeks"
s2 = "Learning from Geeks for Geeks"
words = (s1 + " " + s2).split()
res = []
for word in words:
    if words.count(word) == 1:
        res.append(word)
print(res)
Output
['Learning', 'from']
Explanation:
- words.count(word) scans the whole list and returns how many times word occurs in it.
- Only words appearing once are appended to res.
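For comparison, here is a sketch of the same logic with the inner scan written out as an explicit loop in place of words.count(). The variable occurrences is introduced here for illustration.

```python
s1 = "Geeks for Geeks"
s2 = "Learning from Geeks for Geeks"

words = (s1 + " " + s2).split()
res = []
for word in words:
    # Inner loop: compare the current word against every word in the list.
    occurrences = 0
    for other in words:
        if other == word:
            occurrences += 1
    if occurrences == 1:
        res.append(word)

print(res)  # ['Learning', 'from']
```

Each of the n words triggers a pass over all n words, which is where the quadratic cost comes from.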