CT Algorithm Project

Uploaded by

23560056

Problem: Counting the number of occurrences of a word and its synonyms in a corpus of text documents.

1. Decomposition

The problem can be broken into two primary sub-problems:

a. Synonym Expansion:

- Expand the keyword to include all its synonyms based on the thesaurus.

- Parse the thesaurus to retrieve synonyms for the given keyword.

b. Word Count in Corpus:

- Iterate through each document in the corpus.

- For each document, count occurrences of the keyword and its synonyms.

2. Pattern Recognition

Two primary patterns emerge in the solution:

a. Iterating over collections:

- The corpus contains multiple documents, and the same process of searching for words is applied to each document.

- The thesaurus contains a list of synonyms, and we need to process all synonyms associated with the keyword.

b. Searching and counting:

- Within each document, the process of counting occurrences of the keyword and its synonyms is repeated.
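
The two patterns can be illustrated with a short Python sketch. The helper name count_terms, the regular-expression tokenizer, and the use of a set for the search terms are assumptions made for illustration; the sample documents are the ones this document uses as its corpus example.

```python
import re


def count_terms(document, terms):
    """Searching and counting: count the words of one document found in `terms`."""
    # Assumed tokenizer: lowercase the text and keep runs of ASCII letters/apostrophes.
    words = re.findall(r"[a-z']+", document.lower())
    return sum(1 for word in words if word in terms)


corpus = [
    "I am very happy and joyful today.",
    "This content is about being happy.",
    "Feeling sad and sorrowful now.",
]
terms = {"happy", "joyful", "content", "pleased"}

# Iterating over a collection: the same counting step is applied to every document.
per_document = [count_terms(doc, terms) for doc in corpus]
print(per_document)  # [2, 2, 0]
```

Keeping the per-document step in its own function makes the repetition explicit: the outer loop never needs to know how a single document is searched.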

3. Data Abstraction and Representation

The data can be represented as follows:

a. Thesaurus: A dictionary where the key is a word, and the value is a list of synonyms.

Example:

thesaurus = {
    "happy": ["joyful", "content", "pleased"],
    "sad": ["unhappy", "sorrowful", "downcast"],
}

b. Corpus: A list of strings, where each string is a document.

Example:

corpus = [
    "I am very happy and joyful today.",
    "This content is about being happy.",
    "Feeling sad and sorrowful now.",
]

c. Keyword: A single string, e.g., "happy".
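
As a minimal sketch, the three inputs can be written down together in Python with type hints. The alias names Thesaurus and Corpus are hypothetical, and the lookup of "angry" is only there to show what happens for a word the thesaurus does not contain.

```python
from typing import Dict, List

# Hypothetical type aliases for the representations described above.
Thesaurus = Dict[str, List[str]]
Corpus = List[str]

thesaurus: Thesaurus = {
    "happy": ["joyful", "content", "pleased"],
    "sad": ["unhappy", "sorrowful", "downcast"],
}
corpus: Corpus = [
    "I am very happy and joyful today.",
    "This content is about being happy.",
    "Feeling sad and sorrowful now.",
]
keyword: str = "happy"

# dict.get with a default returns an empty list instead of raising KeyError
# when the word has no thesaurus entry.
print(thesaurus.get(keyword, []))  # ['joyful', 'content', 'pleased']
print(thesaurus.get("angry", []))  # []
```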

4. Algorithm

The algorithm for solving the problem is as follows:

a. Input:

- Keyword (string)

- Thesaurus (dictionary mapping each word to a list of its synonyms)

- Corpus (list of text documents)

b. Synonym Expansion:

- Retrieve the list of synonyms for the keyword from the thesaurus.

- Combine the keyword and its synonyms into a single list of "search terms."
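
The expansion step might look like the following sketch. The function name expand_keyword is hypothetical, and two choices here are assumptions rather than part of the algorithm as stated: the terms are lowercased, and they are stored in a set instead of a list so that membership checks are fast and duplicates disappear. Thesaurus keys are assumed to be lowercase.

```python
def expand_keyword(keyword, thesaurus):
    """Return the keyword together with its synonyms as a set of lowercase terms."""
    key = keyword.lower()
    # A keyword with no thesaurus entry simply expands to itself.
    synonyms = thesaurus.get(key, [])
    return {key, *(synonym.lower() for synonym in synonyms)}


thesaurus = {
    "happy": ["joyful", "content", "pleased"],
    "sad": ["unhappy", "sorrowful", "downcast"],
}

print(sorted(expand_keyword("happy", thesaurus)))
# ['content', 'happy', 'joyful', 'pleased']
```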

c. Word Count in Corpus:

- Initialize a counter to 0.

- For each document in the corpus:

  - Split the document into words, lowercasing them and stripping punctuation so that "happy." matches "happy".

  - For each word in the document:

    - Check if the word is in the list of search terms (keyword + synonyms).

    - If yes, increment the counter.

d. Output:

- Return the counter as the total number of occurrences of the keyword and its synonyms.
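
Steps a to d can be put together as one function. This is a sketch under stated assumptions, not a definitive implementation: the name count_occurrences is hypothetical, words are extracted with a simple regular expression that only handles lowercase ASCII letters and apostrophes, and matching is case-insensitive.

```python
import re


def count_occurrences(keyword, thesaurus, corpus):
    """Count how often the keyword or any of its synonyms appears in the corpus."""
    # b. Synonym expansion: keyword plus its synonyms form the search terms.
    search_terms = {keyword.lower()}
    search_terms.update(s.lower() for s in thesaurus.get(keyword.lower(), []))

    # c. Word count in corpus.
    counter = 0
    for document in corpus:
        # Assumed tokenizer: lowercase, then keep runs of letters/apostrophes.
        for word in re.findall(r"[a-z']+", document.lower()):
            if word in search_terms:
                counter += 1

    # d. Output.
    return counter


thesaurus = {
    "happy": ["joyful", "content", "pleased"],
    "sad": ["unhappy", "sorrowful", "downcast"],
}
corpus = [
    "I am very happy and joyful today.",
    "This content is about being happy.",
    "Feeling sad and sorrowful now.",
]

print(count_occurrences("happy", thesaurus, corpus))  # 4
print(count_occurrences("sad", thesaurus, corpus))    # 2
```

Note that the count of 4 for "happy" includes "content" in the second document, where it is a noun rather than a synonym of "happy". Exact word matching cannot tell the two senses apart.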

5. Real-World Problem Example

A company analyzing customer feedback to assess sentiment might use this algorithm to count positive words.

This step could form the basis for determining a sentiment score for products.
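
One way such a score could be built on top of the word counts is sketched below. The scoring formula (positive minus negative matches, divided by all matches), the function names, and the feedback texts are all hypothetical and chosen only for illustration.

```python
import re


def count_terms(documents, terms):
    """Count words across all documents that appear in `terms`."""
    return sum(
        1
        for doc in documents
        for word in re.findall(r"[a-z']+", doc.lower())
        if word in terms
    )


def sentiment_score(documents, positive_terms, negative_terms):
    """Hypothetical score in [-1, 1]: (positive - negative) / (positive + negative)."""
    positive = count_terms(documents, positive_terms)
    negative = count_terms(documents, negative_terms)
    total = positive + negative
    # With no matching words at all, treat the feedback as neutral.
    return 0.0 if total == 0 else (positive - negative) / total


# Made-up feedback, for illustration only.
feedback = [
    "Very happy with this product, really pleased.",
    "Sad to say it broke after a week.",
]
positive_terms = {"happy", "joyful", "content", "pleased"}
negative_terms = {"sad", "unhappy", "sorrowful", "downcast"}

score = sentiment_score(feedback, positive_terms, negative_terms)
print(round(score, 2))  # 0.33
```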
