Crowdsourcing Handwritten Text

The document discusses optical character recognition (OCR) including its history, how it works, and its impact. It then announces a campaign to crowdsource images of handwritten text in various languages and formats to help improve OCR models.

Uploaded by

Nikhil Chaitanya

Proprietary + Confidential

Agenda

● Understanding OCR (10m)

● Share Your Script: Participation Guide (10m)

● Fun Activity! (40m)


What is Optical Character Recognition (OCR)?

“It is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text.”

Image → Text
Then and Now
[Example] OCR in Action

Image Capture → Text Recognition → Understanding Structure → Translations in Context → Reading the Text Out Loud
Impact of OCR

● Translating restaurant menus and signboards when visiting a new country

● Quickly scanning and storing business cards

● Reading text from printed or handwritten docs and translating (especially relevant for virtual learning)

● Helping users with low literacy or learning difficulties to understand text

● Enabling accessibility for people who are blind or visually impaired

● (and a lot more...)

Urmila’s Story

For OCR models to recognize handwritten text, they need to understand diverse languages, scripts and writing styles, across formats.
Campaign Announcement: Share Your Script!

Crowdsourcing images of text, handwritten in your native language, to create an impact (while having fun!)

How to participate?

Step 1: Fill the participation form ([Link]) to submit your email address & select the languages that you’ll be contributing in

Step 2: Click pictures of handwritten text across formats like passages (3-6 sentences), lists (eg. grocery list) and notes (eg. study notes)

Step 3: Upload images using the Image Capture task on your Crowdsource app ([Link]), ensuring quality & privacy

Step 4: Add relevant labels to the images:
• Passage: script, Marathi, passage
• List: script, Marathi, list
• Notes: script, Marathi, notes

Submit as many contributions as you can by December 19, 2020.


Participation Guide

Step 1

● Fill the participation form ([Link])

● Enter the email address associated with your Crowdsource account

● Select the language(s) that you will be contributing in

● Enter the name of your referral source (We will have a surprise for people who can get the maximum number of new participants to the campaign)


Participation Guide

Step 2

● Click pictures of handwritten passages in your language

● eg. paragraph (3-6 sentences minimum)

Format: Passage

Examples
Participation Guide

Step 2

● Click pictures of handwritten lists in your language

● eg. grocery list, to do list, etc.

Format: List

Examples
Participation Guide

Step 2

● Click pictures of handwritten notes in your language

● eg. study notes, rough notes, etc.

Format: Notes

Examples
Participation Guide

Step 3

● Upload images on Crowdsource using the Image Capture task

● You will need the Crowdsource app for this (Android)

○ Installation link: [Link]

● Quality: Make sure that the images are clear (not blurred) and have a significant amount of meaningful text

● Privacy: Make sure that no sensitive or personally identifiable information is captured in the image


Participation Guide

Step 4

● Label the images with:

○ script

○ Marathi

○ Passage / List / Notes

■ eg. passage, list OR notes

● Feel free to add more labels to describe the image, in addition to the 3 listed above

● You can add the labels in English and/or your native language
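
For anyone preparing many images at once, the labeling rule in Step 4 can be sketched in a few lines of Python. This is only an illustration, not part of the Crowdsource app: the function name `build_labels` and the constant `FORMATS` are hypothetical, and the sketch assumes the three required labels are the word "script", the language name, and one of the three formats.

```python
# Hypothetical helper (not a Crowdsource API): builds the label list
# described in Step 4 for a single image.
FORMATS = {"passage", "list", "notes"}  # the three formats named in this guide


def build_labels(language, fmt, extra=()):
    """Return the required labels, followed by any optional extra labels."""
    fmt_clean = fmt.strip().lower()
    if fmt_clean not in FORMATS:
        raise ValueError(f"format must be one of {sorted(FORMATS)}, got {fmt!r}")

    language_clean = language.strip()
    if not language_clean:
        raise ValueError("language must not be empty")

    # Required labels: "script", the language, and the format.
    labels = ["script", language_clean, fmt_clean]

    # Optional labels are appended, skipping blanks and duplicates.
    for label in extra:
        label = label.strip()
        if label and label not in labels:
            labels.append(label)
    return labels


print(build_labels("Marathi", "Passage"))
print(build_labels("Marathi", "list", extra=["grocery", "list"]))
```

Extra labels are kept separate from the three required ones so that a missing or misspelled format is caught before the image is labeled.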
Fun Activity - Let’s get started!

● Passage: Country Pride

● List: Bucket List

● Notes: How to prepare Tea?


[Passage] Country Pride
● Describe a place that you would like your friends to visit. (in your language)
● Write 3-6 sentences about your favorite section - something that you’re proud of
● Click a picture → Upload on Image Capture → Add labels: script, Marathi, passage

Note: Please don’t include any personally identifiable information (eg. phone number, home address, email address, credit card information, etc.)
[List] Bucket List
● Which 5 countries do you want to visit?
● Make a list of these countries, in your native language
● Click a picture → Upload on Image Capture → Add labels: script, Marathi, list

Note: Please don’t include any personally identifiable information (eg. phone number, home address, email address, credit card information, etc.)
[List] Cinematic
● Which 5 movies do you recommend others to watch?
● Make a list of these movies, in your native language
● Click a picture → Upload on Image Capture → Add labels: script, Marathi, list

Note: Please don’t include any personally identifiable information (eg. phone number, home address, email address, credit card information, etc.)
[Notes] How to prepare Tea?
● Please write the method to prepare TEA step by step in Marathi.
● Click a picture → Upload on Image Capture → Add labels: script, Marathi, notes

Note: Please don’t include any personally identifiable information (eg. phone number, home address, email address, credit card information, etc.)
Next Steps

1 Keep contributing images of handwritten text - You can upload pictures of handwritten text that you already
have, or can write new content by following the broad guidelines that we shared for each format (passage,
list, notes). The more diverse styles that you can get, the better it is for OCR models to learn!

2 Spread the word and ask others to contribute - You can conduct a similar event in your community, or tell
your friends/family about it. Feel free to plan other fun activities (Let your creative juices flow!). We will
share a copy of this presentation with our influencers to facilitate such community events, and would love
to join whenever possible.

3 Make sure to sign up - Don’t forget to fill the participation form ([Link]) and ask anyone else
who is participating to fill it too! Quick recap on the next slide...
Share Your Script!

Crowdsourcing images of text, handwritten in your native language, to create an impact (while having fun!)

How to participate?

Step 1: Fill the participation form ([Link]) to submit your email address & select the languages that you’ll be contributing in

Step 2: Click pictures of handwritten text across formats like passages (3-6 sentences), lists (eg. grocery list) and notes (eg. study notes)

Step 3: Upload images using the Image Capture task on your Crowdsource app ([Link]), ensuring quality & privacy*

Step 4: Add relevant labels to the images:
• Passage: script, Marathi, passage
• List: script, Marathi, list
• Notes: script, Marathi, notes

Submit as many contributions as you can by December 19, 2020.

*Note: Please contribute images that are clear (not blurred) and have a significant amount of meaningful text.
Make sure that no sensitive or personally identifiable information is captured in the images.