CLASS MANUAL ACTIVITY
Topic: Computational Annotation
Text for Annotation:
On Sunday, the iconic Louvre Museum in the French capital played host to a speedy heist in
which eight items of precious jewellery dating from the Napoleonic era were spirited away from
its second floor. The stolen items included a tiara pertaining to the jewellery set of Queen
Marie-Amelie and Queen Hortense, an emerald necklace utilised by Empress Marie-Louise, a
large necklace belonging to Empress Eugenie, and other similar goodies.
Activity 1: POS Annotation
Objective: Identify grammatical categories for each word using the Penn Treebank POS Tagset.
(Tagset given at the end.)
Word     POS Tag   Explanation
On       IN        Preposition
Sunday   NNP       Proper noun
the      DT        Determiner
iconic   JJ        Adjective
Task:
1. Complete POS tagging for the entire paragraph.
2. Discuss tag disagreements in pairs.
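Optional: if you type up both partners' tags, a few lines of Python can list the disagreements for you. This is only a sketch. The two tag lists below are invented for illustration (annotator B's NN for "Museum" is a made-up disagreement), and the function names are our own, not part of any library.

```python
# Hypothetical tag sequences from two annotators for the first six words.
words = ["On", "Sunday", "the", "iconic", "Louvre", "Museum"]
annotator_a = ["IN", "NNP", "DT", "JJ", "NNP", "NNP"]
annotator_b = ["IN", "NNP", "DT", "JJ", "NNP", "NN"]


def find_disagreements(words, tags_a, tags_b):
    """Return (position, word, tag_a, tag_b) for every token tagged differently."""
    if not (len(words) == len(tags_a) == len(tags_b)):
        raise ValueError("All three sequences must have the same length")
    return [
        (i, w, a, b)
        for i, (w, a, b) in enumerate(zip(words, tags_a, tags_b), start=1)
        if a != b
    ]


def observed_agreement(tags_a, tags_b):
    """Fraction of tokens on which both annotators chose the same tag."""
    matches = sum(a == b for a, b in zip(tags_a, tags_b))
    return matches / len(tags_a)


disagreements = find_disagreements(words, annotator_a, annotator_b)
agreement = observed_agreement(annotator_a, annotator_b)

for position, word, tag_a, tag_b in disagreements:
    print(f"Token {position} ({word}): A={tag_a}, B={tag_b}")
print(f"Observed agreement: {agreement:.2f}")
```

The observed agreement here is a simple proportion of matching tags. It does not correct for agreement that would occur by chance.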
Activity 2: Semantic Annotation
Objective: Tag entities and roles that express who, what, where, when, etc.
Semantic Role / Entity Tags Used:
• TIME – Temporal expressions
• LOCATION – Place
• EVENT – Event or action
• OBJECT – Physical object
• PERSON – Human entity
• ERA – Historical period
Example Annotation:
[TIME On Sunday], the iconic [LOCATION Louvre Museum] in the [LOCATION French capital]
played host to a speedy [EVENT heist] in which eight [OBJECT items of precious jewellery] dating from
the [ERA Napoleonic era] were spirited away from its [LOCATION second floor].
Task:
1. Use brackets [ ] and labels (TIME, PERSON, OBJECT, etc.).
2. Compare your semantic labelling with your partner’s.
3. Discuss: How do semantic labels enrich the meaning compared to POS tags?
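Optional: the bracket notation is regular enough for a program to read. The sketch below uses a regular expression to pull out each (label, text) pair from the first part of the annotated sentence. It assumes labels are written in capital letters and that brackets are never nested. The pattern and function names are our own.

```python
import re

# Matches one bracketed span such as "[TIME On Sunday]":
# an uppercase label, whitespace, then the annotated text up to the closing bracket.
SPAN_PATTERN = re.compile(r"\[([A-Z]+)\s+([^\[\]]+)\]")

annotated = (
    "[TIME On Sunday], the iconic [LOCATION Louvre Museum] in the "
    "[LOCATION French capital] played host to a speedy [EVENT heist]"
)


def extract_spans(text):
    """Return a list of (label, span_text) pairs in order of appearance."""
    return [(m.group(1), m.group(2).strip()) for m in SPAN_PATTERN.finditer(text)]


def strip_annotation(text):
    """Remove the brackets and labels, leaving the plain sentence."""
    return SPAN_PATTERN.sub(lambda m: m.group(2), text)


spans = extract_spans(annotated)
plain = strip_annotation(annotated)

for label, span_text in spans:
    print(f"{label}: {span_text}")
print(plain)
```

Running both functions on your own annotation is a quick way to check that every bracket is closed and every span has a label.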
Activity 3: XML Annotation
Objective: Encode annotated data in XML format to simulate computational corpora.
Example Output:
<text>
<sentence id="1">
<word id="1" pos="IN">On</word>
<word id="2" pos="NNP" sem="TIME">Sunday</word>
<word id="3" pos="DT">the</word>
<word id="4" pos="JJ">iconic</word>
<word id="5" pos="NNP" sem="LOCATION">Louvre</word>
<word id="6" pos="NNP" sem="LOCATION">Museum</word>
<word id="7" pos="IN">in</word>
<word id="8" pos="DT">the</word>
<word id="9" pos="JJ" sem="LOCATION">French</word>
<word id="10" pos="NN" sem="LOCATION">capital</word>
<word id="11" pos="VBD">played</word>
<word id="12" pos="NN">host</word>
<word id="13" pos="TO">to</word>
<word id="14" pos="DT">a</word>
<word id="15" pos="JJ">speedy</word>
<word id="16" pos="NN" sem="EVENT">heist</word>
</sentence>
</text>
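Optional: to see why this format suits computational processing, the sketch below reads a shortened copy of the example (the first six words only) with Python's standard xml.etree.ElementTree module and pulls out each word with its tags. The variable names are our own.

```python
import xml.etree.ElementTree as ET
from collections import Counter

xml_data = """<text>
<sentence id="1">
<word id="1" pos="IN">On</word>
<word id="2" pos="NNP" sem="TIME">Sunday</word>
<word id="3" pos="DT">the</word>
<word id="4" pos="JJ">iconic</word>
<word id="5" pos="NNP" sem="LOCATION">Louvre</word>
<word id="6" pos="NNP" sem="LOCATION">Museum</word>
</sentence>
</text>"""

root = ET.fromstring(xml_data)

# Collect (word, POS tag, semantic label) for each <word>; sem is None when absent.
tokens = [(w.text, w.get("pos"), w.get("sem")) for w in root.iter("word")]

# Keep only the words that carry a semantic label.
labelled = [(text, sem) for text, _, sem in tokens if sem is not None]

# Count how often each POS tag occurs.
pos_counts = Counter(pos for _, pos, _ in tokens)

print(labelled)
print(pos_counts.most_common())
```

Because every word carries its tags as attributes, the same few lines work on a corpus of any size.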
Task:
1. Encode your annotated text using similar XML syntax.
2. Validate your XML (use an online validator or the XML Tools plugin for Notepad++).
3. Save as annotation_activity.xml.
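Optional: instead of typing the tags by hand, you can generate the XML from a list of tokens and check it in the same script. The sketch below uses Python's standard xml.etree.ElementTree module. The helper names build_annotation and is_well_formed are our own, and the check only confirms that the XML is well-formed (every tag properly opened and closed), not that it follows any particular schema.

```python
import xml.etree.ElementTree as ET

# (word, POS tag, semantic label or None) for the first few tokens of the text.
tokens = [
    ("On", "IN", None),
    ("Sunday", "NNP", "TIME"),
    ("the", "DT", None),
    ("iconic", "JJ", None),
]


def build_annotation(tokens, sentence_id="1"):
    """Build a <text><sentence><word/>...</sentence></text> tree from token tuples."""
    text_el = ET.Element("text")
    sentence_el = ET.SubElement(text_el, "sentence", id=sentence_id)
    for index, (word, pos, sem) in enumerate(tokens, start=1):
        attributes = {"id": str(index), "pos": pos}
        if sem is not None:
            attributes["sem"] = sem  # only labelled words get a sem attribute
        word_el = ET.SubElement(sentence_el, "word", attributes)
        word_el.text = word
    return text_el


def is_well_formed(xml_string):
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_string)
        return True
    except ET.ParseError:
        return False


xml_string = ET.tostring(build_annotation(tokens), encoding="unicode")
print(xml_string)
print("Well-formed:", is_well_formed(xml_string))

# Uncomment to write the file named in the task:
# ET.ElementTree(build_annotation(tokens)).write(
#     "annotation_activity.xml", encoding="utf-8", xml_declaration=True
# )
```

Try passing is_well_formed a string with a missing closing tag to see what a failed check looks like.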
Reflection Questions
1. What differences did you notice between syntactic (POS) and semantic annotation?
2. Why is XML useful for computational processing?
3. How can such annotations help in training AI or NLP systems?
Basic Penn Treebank POS Tagset (Simplified for Classroom Use)
Tag   Meaning / Category                        Example
NN    Noun, singular or mass                    book, car, jewellery
NNS   Noun, plural                              books, cars, items
NNP   Proper noun, singular                     Lahore, France, Sunday
NNPS  Proper noun, plural                       Americans, Muslims
PRP   Personal pronoun                          he, she, it, they
PRP$  Possessive pronoun                        his, her, its, their
DT    Determiner                                the, a, an, this
JJ    Adjective                                 beautiful, iconic, precious
JJR   Comparative adjective                     bigger, faster
JJS   Superlative adjective                     biggest, fastest
RB    Adverb                                    quickly, slowly, away
RBR   Comparative adverb                        faster, earlier
RBS   Superlative adverb                        fastest, earliest
VB    Verb, base form                           go, play, eat
VBD   Verb, past tense                          went, played, ate
VBG   Verb, gerund/present participle           going, playing
VBN   Verb, past participle                     gone, eaten, spirited
VBP   Verb, non-3rd person present              play, run (I/you/we/they play)
VBZ   Verb, 3rd person singular present         plays, runs (he/she/it plays)
IN    Preposition or subordinating conjunction  in, on, from, of
CC    Coordinating conjunction                  and, but, or
TO    “to” (infinitive marker or preposition)   to go, to Paris
CD    Cardinal number                           one, two, 8
MD    Modal verb                                will, can, may, should
EX    Existential “there”                       there is, there are
UH    Interjection                              oh, wow, hey
WDT   Wh-determiner                             which, that
WP    Wh-pronoun                                who, whom, what
WRB   Wh-adverb                                 when, where, why
POS   Possessive ending                         ’s, ’ (as in John’s)
FW    Foreign word                              déjà, café
SYM   Symbol                                    $, %, +
PUNCT Punctuation mark                          . , ? ! “ ”