Program 4

The document outlines a program that uses pre-trained GloVe word embeddings to enrich prompts for a Generative AI model by adding semantically similar words. It details the process of loading the GloVe model, defining a function to enrich prompts, and generating stories based on both original and enriched prompts. The program compares the outputs in terms of detail and relevance, demonstrating the effectiveness of enriched prompts in enhancing AI-generated content.

Uploaded by

Akash Y
Copyright
© All Rights Reserved

Program 4:

Use word embeddings to improve prompts for a Generative AI model. Retrieve similar words
using word embeddings. Use the similar words to enrich a GenAI prompt. Use the AI model to
generate responses for the original and enriched prompts. Compare the outputs in terms of
detail and relevance.

1. Importing Pre-trained Word Embeddings


import gensim.downloader as api

• gensim.downloader lets us download pre-trained word embeddings from gensim's model
repository.
• These models provide pre-trained word vectors, so we don't have to train our own
embedding model from scratch.

2. Loading the GloVe Model


model = api.load("glove-wiki-gigaword-50")

• This downloads (on first use) and loads the GloVe (Global Vectors for Word Representation)
model glove-wiki-gigaword-50, which was trained on Wikipedia 2014 and Gigaword 5.
• The model contains word vectors of size 50 (50-dimensional vector representations
of words).
• Each word is mapped to a dense numerical vector. Words with related meanings have nearby
vectors, which is what makes it possible to find semantic similarities.

3. Function Definition: enrich_prompt


def enrich_prompt(prompt, num_similar=3):

• This function takes a prompt (text input) and enriches it by adding similar words.
• num_similar=3: Specifies the number of similar words to add for each word in the
prompt.

4. Splitting the Prompt into Words


words = prompt.split()

• Splits the prompt (input sentence) into individual words.

5. Initialize an Empty List


enriched_words = []

• Creates an empty list enriched_words to store words along with their similar words.

6. Loop Through Each Word in the Prompt


for word in words:

• Iterates through each word in the prompt.

7. Finding Similar Words Using GloVe


try:
    similar_words = [w for w, _ in model.most_similar(word, topn=num_similar)]

• model.most_similar(word, topn=num_similar): Finds the num_similar (default 3) most
similar words for the given word.
• It returns a list of tuples: (similar_word, similarity_score), but we only extract
similar_word.
• Example:

model.most_similar("cat", topn=3)

May return:

[('dog', 0.91), ('kitten', 0.85), ('feline', 0.83)]

Meaning "dog", "kitten", and "feline" are most similar to "cat".
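
To make the ranking less of a black box, here is a minimal sketch of the idea behind most_similar: score every other word by cosine similarity and keep the top results. The 3-dimensional vectors and the helper name toy_most_similar are invented for illustration; the real GloVe model uses 50-dimensional vectors and gensim's own implementation.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real vectors from glove-wiki-gigaword-50 have 50 dimensions.
toy_vectors = {
    "cat":    [0.9, 0.8, 0.1],
    "dog":    [0.8, 0.9, 0.2],
    "kitten": [0.9, 0.5, 0.0],
    "car":    [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def toy_most_similar(word, topn=3):
    """Hypothetical helper: rank all other words by similarity to `word`."""
    if word not in toy_vectors:
        raise KeyError(word)  # mirrors the KeyError raised for unknown words
    target = toy_vectors[word]
    scored = [(other, cosine(target, vec))
              for other, vec in toy_vectors.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]

for neighbour, score in toy_most_similar("cat", topn=2):
    print(neighbour, round(score, 2))
```

With these toy vectors, "dog" and "kitten" rank above "car" because their vectors point in nearly the same direction as the vector for "cat".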

8. Formatting the Word with Its Similar Words


enriched_words.append(word + " (" + ", ".join(similar_words) + ")")

• Formats the word by appending its similar words in parentheses.
• Example:

"cat" → "cat (dog, kitten, feline)"

• Appends this to the enriched_words list.

9. Handling Words Not Found in GloVe


except KeyError:
    enriched_words.append(word)

• If the word is not found in the GloVe vocabulary, it remains unchanged.
• This avoids errors for uncommon words or typos.
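
The same fallback also fires for tokens that are only superficially unknown, such as a capitalized word or a word with a period attached. The sketch below is one possible variant, not part of the original program: it lowercases the lookup key and sets trailing punctuation aside before the lookup. The names enrich_prompt_normalized, lookup_similar and TOY_NEIGHBOURS are hypothetical, and the neighbour lists are made up so that the example runs without downloading GloVe.

```python
import string

# Stand-in for the GloVe lookup so the sketch runs offline.
# These neighbour lists are invented for illustration.
TOY_NEIGHBOURS = {
    "write": ["writing", "read", "wrote"],
    "story": ["stories", "book", "tale"],
    "cat": ["dog", "kitten", "feline"],
}

def lookup_similar(word, num_similar):
    """Hypothetical lookup; raises KeyError for unknown words, like the model."""
    return TOY_NEIGHBOURS[word][:num_similar]

def enrich_prompt_normalized(prompt, num_similar=3, lookup=lookup_similar):
    enriched_words = []
    for token in prompt.split():
        # Split "cat." into the core word "cat" and the trailing "."
        core = token.rstrip(string.punctuation)
        trailing = token[len(core):]
        try:
            # Look up the lowercase form, but keep the original spelling in the output.
            similar_words = lookup(core.lower(), num_similar)
            enriched_words.append(f"{core} ({', '.join(similar_words)}){trailing}")
        except KeyError:
            enriched_words.append(token)
    return " ".join(enriched_words)

print(enrich_prompt_normalized("Write a story about a cat."))
```

To use the real embeddings, a caller could pass lookup=lambda w, n: [x for x, _ in model.most_similar(w, topn=n)], assuming model has been loaded as in step 2.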

10. Join Words Back Into a Sentence


return " ".join(enriched_words)

• Converts the list of enriched words back into a sentence.

11. Define an Original Prompt


original_prompt = "Write a story about a cat."

• This is the original input sentence.

12. Generate an Enriched Prompt


enriched_prompt = enrich_prompt(original_prompt)

• Calls the function enrich_prompt() with the input "Write a story about a cat."
• Returns an enriched version of the prompt with similar words added.

13. Print the Results


print("Original Prompt:", original_prompt)
print("Enriched Prompt:", enriched_prompt)

• Prints both the original and enriched prompts.

Example Output

Original Prompt: Write a story about a cat.
Enriched Prompt: Write a (another, an, one) story (stories, book, tale) about (than, there, more) a (another, an, one) cat.

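• The function is meant to add context to the prompt by suggesting words that are
semantically related.
• In this run, "Write" and "cat." are left unchanged: the GloVe vocabulary is lowercase and
prompt.split() keeps the period attached to "cat", so both lookups fail and fall through to
the except branch. Function words such as "a" and "about" do receive neighbours, but these
add little meaning.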
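
In the example output above, "a" and "about" pick up neighbours such as "another" and "than", which lengthen the prompt without adding much. One possible refinement, not part of the original program, is to skip function words before looking up neighbours. The stop-word list and the helper name should_enrich below are assumptions made for this sketch.

```python
# Small hand-picked stop-word list, an assumption for this sketch;
# a fuller list from an NLP library could be substituted.
STOP_WORDS = {"a", "an", "the", "about", "of", "to", "in", "on", "and", "or"}

def should_enrich(word):
    """Return True for words worth expanding, False for function words."""
    return word.lower() not in STOP_WORDS

prompt_words = "Write a story about a cat".split()
print([w for w in prompt_words if should_enrich(w)])  # ['Write', 'story', 'cat']
```

Inside enrich_prompt, such a check could guard the most_similar call so that only content words are expanded.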

Summary

• Loads pre-trained GloVe embeddings from gensim.
• Finds similar words for each word in the input prompt.
• Formats the enriched prompt by adding similar words in parentheses.
• Handles missing words gracefully.
• This technique can be used for prompt expansion, NLP applications, and creative
writing.

4a:

pip install gensim

import gensim.downloader as api

# Load 50-dimensional GloVe vectors (downloaded on first use, then cached).
model = api.load("glove-wiki-gigaword-50")

def enrich_prompt(prompt, num_similar=3):
    words = prompt.split()
    enriched_words = []
    for word in words:
        try:
            # Keep only the neighbour words; drop the similarity scores.
            similar_words = [w for w, _ in model.most_similar(word, topn=num_similar)]
            enriched_words.append(word + " (" + ", ".join(similar_words) + ")")
        except KeyError:
            # Token not in the GloVe vocabulary: keep it unchanged.
            enriched_words.append(word)
    return " ".join(enriched_words)

original_prompt = "Write a story about a Dog."
enriched_prompt = enrich_prompt(original_prompt)

print("Original Prompt:", original_prompt)
print("Enriched Prompt:", enriched_prompt)

4b:

import gensim.downloader as api
import random

# Load the GloVe model. api.load downloads the model on first use and
# reuses the locally cached copy on later runs, so no retry logic is needed.
model = api.load("glove-wiki-gigaword-50")

def enrich_prompt(prompt, num_similar=3):
    """
    Enriches a prompt by adding similar words to each word in the prompt.

    Args:
        prompt (str): The original prompt.
        num_similar (int): The number of similar words to add.

    Returns:
        str: The enriched prompt.
    """
    words = prompt.split()
    enriched_words = []
    for word in words:
        try:
            similar_words = [w for w, _ in model.most_similar(word, topn=num_similar)]
            enriched_words.append(word + " (" + ", ".join(similar_words) + ")")
        except KeyError:
            enriched_words.append(word)
    return " ".join(enriched_words)

def generate_story(prompt, length=100):
    """
    Generates a simple story based on a prompt. This is a VERY basic
    example and does not use any advanced language models. It's just
    to illustrate the difference in story content.

    Args:
        prompt (str): The prompt to base the story on.
        length (int): The approximate length of the story in words.

    Returns:
        str: A generated story.
    """
    # Filler words mixed in so the output is not only prompt words.
    meaningful_words = ["happily", "suddenly", "quietly", "jumped", "ran", "slept", "ate",
                        "thought", "dreamed"]

    story = ""
    words = prompt.split()
    possible_next_words = words[:]  # start with the words from the prompt
    current_word = random.choice(words)
    story += current_word + " "

    for _ in range(length - 1):
        next_word = random.choice(possible_next_words)
        story += next_word + " "
        possible_next_words.append(next_word)  # make the chosen word more likely again
        # Add some simple logic to make the story slightly more coherent.
        if next_word in ["a", "an", "the"]:
            possible_next_words.extend(words)  # boost words from the prompt
        # Tokens come from split(), so punctuation is attached ("cat.");
        # check the ending rather than comparing against a bare ".".
        if next_word.endswith((".", "?", "!")):
            possible_next_words.extend(words)  # restart with key words from prompt
        # Add a random filler word to the pool.
        possible_next_words.append(random.choice(meaningful_words))

    return story.rstrip() + "."

# Example Usage:
original_prompt = "Write a story about a cat."
enriched_prompt = enrich_prompt(original_prompt)

print("Original Prompt:", original_prompt)
print("Enriched Prompt:", enriched_prompt)

original_story = generate_story(original_prompt)
enriched_story = generate_story(enriched_prompt)

print("\nOriginal Story:\n", original_story)
print("\nEnriched Story:\n", enriched_story)

# Compare the results.
# Both stories are generated with the same target length, so the word
# counts mainly confirm the setting; the content is what differs.
print("\nStory Lengths (words):")
print("Original Story:", len(original_story.split()))
print("Enriched Story:", len(enriched_story.split()))
# These two values are prompt lengths in characters, not response lengths.
print("\nOriginal Prompt Length (characters):", len(original_prompt))
print("Enriched Prompt Length (characters):", len(enriched_prompt))
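
The task statement asks for a comparison in terms of detail and relevance, which raw lengths do not capture. As a rough sketch only, the function below uses two crude proxies: the share of distinct words (for detail) and the share of words that match a set of topic keywords (for relevance). The function name detail_and_relevance and both metrics are assumptions made for illustration, not established measures.

```python
def detail_and_relevance(text, keywords):
    """Hypothetical metrics: crude proxies for detail and relevance."""
    # Lowercase each token and strip surrounding punctuation and parentheses.
    tokens = [t.strip(".,!?()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return {"words": 0, "unique_ratio": 0.0, "relevance": 0.0}
    unique_ratio = len(set(tokens)) / len(tokens)   # lexical variety
    keyword_hits = sum(1 for t in tokens if t in keywords)
    relevance = keyword_hits / len(tokens)          # share of on-topic words
    return {"words": len(tokens), "unique_ratio": unique_ratio, "relevance": relevance}

sample = "The cat jumped. The cat slept."
print(detail_and_relevance(sample, {"cat", "story"}))
```

Applied to original_story and enriched_story with the same keyword set, the two dictionaries give a slightly more informative comparison than word counts alone.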
