What does BERT stand for?
A) Basic Encoder for Robust Transformers
B) Bidirectional Encoder Representations from Transformers
C) Binary Encoded Recursive Transformers
D) Balanced Embedding Representation Technology
Which of the following is NOT a component of the transformer architecture?
A) Multi-head attention
B) Feed-forward neural networks
C) Positional encoding
D) Convolutional layers
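For context on options A-C, here is a minimal sketch of a single transformer encoder block in PyTorch (the dimensions and layer sizes are illustrative assumptions); note that it contains multi-head attention and a feed-forward network, but no convolutional layers:

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    # Illustrative only: multi-head self-attention + position-wise feed-forward,
    # each wrapped with a residual connection and layer normalization.
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)   # multi-head self-attention
        x = self.norm1(x + attn_out)       # residual + layer norm
        x = self.norm2(x + self.ff(x))     # feed-forward sublayer
        return x

x = torch.randn(2, 10, 64)                 # (batch, seq_len, d_model)
print(EncoderBlock()(x).shape)             # torch.Size([2, 10, 64])
```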
What is the primary advantage of the transformer architecture over RNNs?
A) Lower computational complexity
B) Ability to handle variable-length sequences
C) Parallel processing of input sequences
D) Smaller model size
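A small sketch contrasting the two computation patterns behind option C, using assumed toy shapes: an RNN must step through the sequence one position at a time, while self-attention processes all positions in a single matrix multiplication:

```python
import torch

x = torch.randn(8, 128, 64)             # (batch, seq_len, d_model)

# RNN-style processing: an explicit loop over time steps (inherently sequential).
rnn = torch.nn.RNN(64, 64, batch_first=True)
h = torch.zeros(1, 8, 64)
for t in range(x.size(1)):
    _, h = rnn(x[:, t:t+1, :], h)       # each step depends on the previous hidden state

# Attention-style processing: every position attends to every other in one matmul (parallel).
scores = x @ x.transpose(1, 2) / 64 ** 0.5
out = torch.softmax(scores, dim=-1) @ x
print(out.shape)                         # torch.Size([8, 128, 64])
```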
What pre-training task does BERT use to learn bidirectional context?
A) Next Sentence Prediction
B) Masked Language Modeling
C) Machine Translation
D) Both A and B
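A toy sketch of masked language modeling, the idea behind option B; the 15% rate and the [MASK] token id are illustrative assumptions, and BERT's full recipe also sometimes keeps or randomly replaces the selected tokens:

```python
import torch

MASK_ID, VOCAB = 103, 30522                        # assumed ids for illustration
token_ids = torch.randint(1000, VOCAB, (4, 16))    # pretend batch of token ids
labels = token_ids.clone()

mask = torch.rand(token_ids.shape) < 0.15          # select ~15% of positions
token_ids[mask] = MASK_ID                          # replace them with [MASK]
labels[~mask] = -100                               # loss is computed only on masked positions

print(mask.float().mean())                         # fraction of masked tokens
```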
What is the purpose of the [CLS] token in BERT?
A) To mark the end of a sentence
B) To represent the entire sequence for classification tasks
C) To separate two sentences in the input
D) To mask random words in the input
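A minimal sketch of option B with assumed shapes: the final hidden state at position 0 (the [CLS] token) is taken as a summary of the whole sequence and fed to a small classification head:

```python
import torch
import torch.nn as nn

hidden_states = torch.randn(8, 128, 768)   # (batch, seq_len, hidden) from a BERT encoder
cls_vector = hidden_states[:, 0, :]        # [CLS] sits at position 0
classifier = nn.Linear(768, 2)             # e.g. a binary sentiment head
logits = classifier(cls_vector)
print(logits.shape)                        # torch.Size([8, 2])
```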
How do transformer models handle out-of-vocabulary words?
A) Ignore them
B) Use sub-word tokenization
C) Assign them a random embedding
D) Assign the embedding of the closest in-vocabulary word
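A quick illustration of option B using the Hugging Face transformers tokenizer for bert-base-uncased (assuming the package and model files are available): a word missing from the vocabulary is split into known sub-word pieces rather than dropped.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# A rare or made-up word is decomposed into sub-word pieces ("##" marks continuations),
# so the model never sees a truly out-of-vocabulary token. Exact splits depend on the vocab.
print(tokenizer.tokenize("transformers handle quokkaification"))
```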
What is the key difference between BERT and GPT models?
A) BERT uses encoders only while GPT uses decoders only
B) BERT is bidirectional while GPT is unidirectional
C) BERT is for classification tasks only, while GPT is for generation
D) Both A and B
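A toy contrast behind option B: GPT-style decoders apply a causal mask so each position attends only to earlier positions, while BERT-style encoders let every position attend to the full sequence (mask values below use the usual 1 = "may attend" convention):

```python
import torch

seq_len = 5
causal = torch.tril(torch.ones(seq_len, seq_len))   # GPT: lower-triangular, unidirectional
bidirectional = torch.ones(seq_len, seq_len)        # BERT: full matrix, bidirectional
print(causal)
print(bidirectional)
```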
What is the primary purpose of self-attention in transformer models?
A) To reduce the model size
B) To speed up training
C) To eliminate the need for positional encoding
D) To capture dependencies between different positions in a sequence
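A minimal single-head self-attention sketch, with assumed toy shapes and no learned projections, showing how each position's output mixes information from every other position (option D):

```python
import torch

x = torch.randn(1, 6, 32)                                  # (batch, seq_len, d_model)
q, k, v = x, x, x                                          # single head, identity projections
weights = torch.softmax(q @ k.transpose(1, 2) / 32 ** 0.5, dim=-1)
out = weights @ v                                          # each row mixes all positions
print(weights[0])                                          # 6x6 position-to-position attention
```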
What is the purpose of the scaling factor in the scaled dot-product attention?
A) To normalize the input
B) To prevent vanishing gradients
C) To stabilize the gradients, especially for inputs with large dimension
D) To increase the model's capacity
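A small numeric check of option C, with an assumed dimension of 512: unscaled dot products grow with the key dimension, so the softmax saturates into a near-one-hot distribution and the gradients through it vanish; dividing by sqrt(d_k) keeps the logits in a stable range.

```python
import torch

torch.manual_seed(0)
d_k = 512
q = torch.randn(1, d_k)
k = torch.randn(10, d_k)

raw = q @ k.t()                              # dot products have std ~sqrt(d_k)
scaled = raw / d_k ** 0.5                    # scaled dot-product attention logits

print(torch.softmax(raw, dim=-1).max())      # close to 1.0: saturated, tiny gradients
print(torch.softmax(scaled, dim=-1).max())   # noticeably softer distribution
```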
What is the purpose of positional encoding in transformer models?
A) To add information about the order of the sequence
B) To increase the model's vocabulary
C) To reduce computational complexity
D) To enable multi-head attention
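For option A, a sketch of the sinusoidal positional encoding from the original transformer paper (BERT itself learns its position embeddings, so this is one concrete example rather than BERT's exact mechanism):

```python
import torch

# PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
def positional_encoding(seq_len, d_model):
    pos = torch.arange(seq_len).unsqueeze(1).float()
    i = torch.arange(0, d_model, 2).float()
    angles = pos / (10000 ** (i / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(angles)
    pe[:, 1::2] = torch.cos(angles)
    return pe

pe = positional_encoding(50, 64)
print(pe.shape)        # torch.Size([50, 64]); added to token embeddings to inject order
```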