07.
Link Analysis:
The Web
Lecturer: Dr. Reem Essameldin Ebrahim
Social Network Computing
Based on CS224W Analysis of Networks Mining and Learning with Graphs: Stanford University
Copyright © Dr. Reem Essameldin 2023-2024
In this Lecture
Topics to be covered are:
Link Analysis in Web Mining
Searching the Web
The Web as a Directed Graph
The Structure of the Web
Link Analysis
Link analysis is essentially a kind of knowledge discovery that can be used to
visualize data and allow for better analysis, especially in the context of links,
whether web links or relationship links between people or between different
entities. It is often used in search engine optimization, security analysis, market research, and medical
research.
Link analysis has three primary purposes:
Find matches for known patterns of interest between linked objects.
Find anomalies by detecting violations of known patterns.
Find new patterns of interest (e.g., in social networking, marketing, and business intelligence).
Link Analysis in Web Mining
Hyperlink analysis is itself part of a larger research domain: web mining.
Web Mining is the process of applying data mining techniques to extract useful
information from web data.
Example
Consider a website whose links and backlinks must all be
analyzed. A tool has to sift through the HTML code and
the various scripts on each page, then follow every link it
finds in order to determine what sorts of links are
present and whether they are active or dead. This
information can be very important for search engine
optimization because it allows the analyst to
determine whether the search engine is actually able
to find and index the website.
Link Analysis in Web Mining
Web mining can be divided into three distinct categories, according to the kinds of
data to be mined:
1 Web Content Mining
Web content mining is the process of extracting useful information from the
contents of web documents. Content data is the collection of facts a web page is
designed to contain. It may consist of text, images, audio, video, or structured
records such as lists and tables.
Figure: Web Mining (Web Content Mining, Web Structure Mining, Web Usage Mining)
Link Analysis in Web Mining
Web mining can be divided into three distinct categories, according to the kinds of
data to be mined:
2 Web Structure Mining
The structure of a typical web graph consists of web pages as nodes and
hyperlinks as edges connecting related pages. WSM is used to find the
relation between different web pages by processing the structure of the web.
WSM can be performed at two levels:
Document structure analysis: deals with the structure of a document, such as the Document Object Model.
Link type analysis: deals with links that may be inter-document or intra-document.
Figure: Web Mining (Web Content Mining, Web Structure Mining, Web Usage Mining)
Link Analysis in Web Mining
Web mining can be divided into three distinct categories, according to the kinds of
data to be mined:
3 Web Usage Mining
Web usage mining is the application of data mining techniques to discover
interesting usage patterns from web usage data, in order to understand and
better serve the needs of web-based applications. Usage data captures the
identity or origin of web users along with their browsing behavior at a website.
Web usage mining can be classified further depending on the kind of usage data
considered: web server data, application server data, and application-level data.
Figure: Web Mining (Web Content Mining, Web Structure Mining, Web Usage Mining)
Searching the Web: The Problem of Ranking
When you go to Google and type “Cornell,” the first result it shows you is
www.cornell.edu, the home page of Cornell University. It’s certainly hard to argue
with this as a first choice, but how did Google “know” that this was the best
answer?
Search engines determine how to rank pages
using automated methods that look at the
Web itself, not some external source of
knowledge, so the conclusion is that there
must be enough information intrinsic to the
Web and its structure to figure this out. Before
discussing some of the ideas behind the
ranking of pages, let’s begin by considering a
few of the basic reasons why it’s a hard
problem.
Searching the Web: The Problem of Ranking
Search is a hard problem for computers to solve in any setting, not just on the
Web. Indeed, the field of information retrieval has dealt with this problem for
decades before the creation of the Web.
Keywords are a very limited way to express a complex
information need.
They may suffer from the problems of synonymy (multiple
ways to say the same thing) and polysemy (multiple meanings for the same term).
The diversity in authoring styles (everyone is an author and a searcher).
The dynamic and constantly-changing nature of Web
content.
The Web has shifted much of the information
retrieval question from a problem of
scarcity to a problem of abundance.
Searching the Web: The Problem of Ranking
Search is a hard problem for computers to solve in any setting, not just on the
Web. Indeed, the field of information retrieval has dealt with this problem for
decades before the creation of the Web.
From scarcity to abundance
To filter, from among an enormous number of relevant
documents, the few that are most important. In other
words, a search engine has no problem finding and
indexing literally millions of documents that are
genuinely relevant to the one-word query “Cornell”;
the problem is that the human being performing the
search is going to want to look at only a few of these.
Which few should the search engine recommend?
An understanding of the network structure of Web pages will
be crucial for addressing these questions, as we now discuss.
Ranking Algorithms
Web page ranking algorithms order the search results by their
relevance to the search query, listing them in descending
order of relevance to the query string being searched.
A web page’s ranking for a specific query depends on
factors such as: its relevance to the words and
concepts in the query, its overall link popularity. There
are two categories of these algorithms:
Text-Based Ranking: the pages are ranked based on their textual content.
Factors that influence the rank of a page are: number of matched terms,
location of terms in the document, and frequency of terms in the document.
Link-Based Ranking: the pages are ranked based on the link structure of the web.
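As a rough illustration, the text-based factors above can be sketched as a toy scoring function. The page contents, the (matched-terms, frequency) tie-breaking order, and all names here are hypothetical, not any real search engine's formula:

```python
def text_rank(pages, query):
    """Rank pages by number of matched query terms, then by total term frequency."""
    terms = query.lower().split()
    scores = {}
    for name, text in pages.items():
        words = text.lower().split()
        matched = sum(1 for t in terms if t in words)   # distinct query terms present
        freq = sum(words.count(t) for t in terms)       # total occurrences of query terms
        scores[name] = (matched, freq)
    return sorted(pages, key=lambda n: scores[n], reverse=True)

# Hypothetical pages: "A" matches both terms, "B" one, "C" none
pages = {
    "A": "cornell university cornell ithaca",
    "B": "university rankings guide",
    "C": "weather report",
}
print(text_rank(pages, "cornell university"))  # → ['A', 'B', 'C']
```

A real text-based ranker would also weight the location of terms (title vs. body), which this sketch omits.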
Ranking Algorithms
Link-Based Ranking Algorithm:
This algorithm views the web as a directed graph where the web pages form the
nodes and the hyperlinks between the web pages form the directed edges
between these nodes.
Link-based ranking algorithms propagate page
importance through links. Two most influential
hyperlink-based search algorithms are:
Hyperlink-Induced Topic Search (HITS)
PageRank algorithm
The Web as a Graph
In the early days of the Web, links were navigational:
nodes represent static HTML pages and hyperlinks represent directed edges.
The Web as a Directed Graph
In the early days of the Web, links were navigational.
Today many links are transactional
(used not to navigate from page to page, but to post,
comment, like, buy, …).
The Web as a Directed Graph
Given node 𝑣, what can 𝑣 reach? What other nodes can reach 𝑣?
Out Component In Component
Any Directed Graph can be
Any directed graph (the Web) can be expressed in terms of these two types!
Weakly Connected: set of pages each of which is reachable from any other if
hyperlinks may be followed either forwards or backwards.
Directed Acyclic Graph (DAG): has no cycles; if 𝑢 can reach 𝑣, then 𝑣 cannot reach 𝑢.
Strongly Connected: set of pages such that for all pairs of pages (𝑢, 𝑣) in the set,
there exists a directed path from 𝑢 to 𝑣. Any node can reach any node via a
directed path. 𝐼𝑛(𝐴) = 𝑂𝑢𝑡(𝐴) = {𝐴, 𝐵, 𝐶, 𝐷, 𝐸}.
Is the Web a big strongly connected graph or a DAG?
Strongly Connected Component
A Strongly Connected Component (SCC) is a set of nodes 𝑆 so that:
Every pair of nodes in 𝑆 can reach each other.
There is no larger set containing 𝑆 with this property.
Strongly Connected Component
Fact: Every directed graph is a DAG on its SCCs
SCCs partition the nodes of G, that is, each node is in exactly one SCC.
If we build a graph G’ whose nodes are SCCs, and with an edge between nodes of G’ if there is
an edge between corresponding SCCs in G, then G’ is a DAG
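This "DAG on its SCCs" fact can be checked concretely on a toy graph. The node names and the hand-labeled SCC assignment below are hypothetical; a topological sort of the condensed graph succeeds exactly when it is acyclic:

```python
from graphlib import TopologicalSorter  # stdlib since Python 3.9

# Toy graph with two SCCs: {A, B} (a 2-cycle) and {C, D} (a 2-cycle),
# joined by the one-way edge B -> C.
edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D"), ("D", "C")]
scc_of = {"A": "S1", "B": "S1", "C": "S2", "D": "S2"}  # hand-labeled SCCs

# Condensation G': one node per SCC, with an edge S -> T whenever some
# u in S links to some v in T. TopologicalSorter wants predecessor sets.
preds = {}
for u, v in edges:
    if scc_of[u] != scc_of[v]:
        preds.setdefault(scc_of[v], set()).add(scc_of[u])

# static_order() raises CycleError if G' contained a cycle;
# succeeding here confirms the condensation is a DAG.
order = list(TopologicalSorter(preds).static_order())
print(order)  # → ['S1', 'S2']
```

Collapsing each SCC to a single node is what removes all cycles: any cycle in the original graph lies entirely inside one SCC by definition.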
Structure of The Web: Broder et al. 2000, AltaVista web crawl (Oct ’99)
Test of Time Award for the classical paper: “Graph Structure in the Web”
Goal: Take a large snapshot of the Web and try to
understand how its SCCs “fit together” as a DAG.
The web crawl is based on a large set of starting
points accumulated over time from various
sources, including voluntary submissions.
203 million URLs and 1.5 billion links.
The paper proved to be highly influential, accumulating over 3,500 citations in 17 years.
In Photo: Tomkins, Broder, and Kumar, 2017. The researchers whose work revealed the
“bow-tie” structure of the web.
Graph Structure of the Web
Computational Issue: Want to find the SCC containing node 𝑣?
SCC containing 𝑣 is: 𝑂𝑢𝑡(𝑣) ∩ 𝐼𝑛(𝑣).
𝑂𝑢𝑡(𝑣) → nodes that can be reached from 𝑣 (with BFS);
𝐼𝑛(𝑣) → nodes that can reach 𝑣 (BFS following edges backwards).
Example
For the given graph, find the SCC that contains A
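Since the graph from the slide's figure is not reproduced here, a minimal sketch on a hypothetical toy graph shows the 𝑂𝑢𝑡(𝑣) ∩ 𝐼𝑛(𝑣) computation with two BFS passes:

```python
from collections import defaultdict, deque

def reachable(adj, s):
    """Return the set of nodes reachable from s by BFS (including s)."""
    seen = {s}
    q = deque([s])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                q.append(w)
    return seen

def scc_containing(edges, v):
    """SCC of v = Out(v) ∩ In(v): BFS forwards, then BFS on reversed edges."""
    adj, radj = defaultdict(list), defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        radj[b].append(a)   # reversed edge for computing In(v)
    return reachable(adj, v) & reachable(radj, v)

# Toy graph: A, B, C form a directed cycle; D hangs off C
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(sorted(scc_containing(edges, "A")))  # → ['A', 'B', 'C']
```

D is in 𝑂𝑢𝑡(𝐴) but not in 𝐼𝑛(𝐴), so it correctly falls outside the SCC of A.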
Structure of the Web
The results are drawn from an Altavista crawl from May 1999,
which contains over 200 million pages and 1.5 billion links.
By performing breadth-first searches from a number of
random starting nodes following hyperlinks forwards,
and then separately backwards.
Computation: Compute 𝑰𝑵(𝒗) and 𝑶𝑼𝑻(𝒗) by
starting at random nodes.
Observation: the BFS either visits
many nodes or very few.
Result: Based on 𝐼𝑁 and 𝑂𝑈𝑇 of
a random node 𝑣,
𝑂𝑢𝑡(𝑣) ≈ 100 million (50% nodes).
𝐼𝑛(𝑣) ≈ 100 million (50% nodes).
Largest SCC: 56 million (28% nodes).
Structure of the Web
The results are drawn from an Altavista crawl from May 1999,
which contains over 200 million pages and 1.5 billion links.
By performing breadth-first searches from a number of
random starting nodes following hyperlinks forwards,
and then separately backwards.
Computation: Compute 𝑰𝑵(𝒗) and 𝑶𝑼𝑻(𝒗) by
starting at random nodes.
Observation: the BFS either visits
many nodes or very few.
What does this tell us about the
conceptual picture of the Web graph?
Result: Based on 𝐼𝑁 and 𝑂𝑈𝑇 of
a random node 𝑣,
𝑂𝑢𝑡(𝑣) ≈ 100 million (50% nodes).
𝐼𝑛(𝑣) ≈ 100 million (50% nodes).
Largest SCC: 56 million (28% nodes).
Bowtie Structure
Broder et al. were able to elicit the map of the web depicted in the figure.
203 million pages, 1.5 billion links [Broder et al. 2000]
Bowtie Structure
Broder et al. were able to elicit the map of the web depicted in the figure.
1
The “knot” of the bowtie, called the SCC, represents
the single giant strongly-connected component, of size
around 56 million.
2
The “left side” of the bowtie represents about 44
million pages called IN, defined to be all pages not in
the SCC, but from which a path exists to some node of
the SCC (and hence, since the SCC is strongly connected, to every node of the SCC).
The set IN can be thought of as “new pages” that link to interesting
destinations on the web, but which have not yet been
discovered by the core of the web and are therefore
not reachable from the SCC. If a page of IN became
reachable from the SCC, it would become part of the
SCC.
Bowtie Structure
Broder et al. were able to elicit the map of the web depicted in the figure.
3
Another large set of approximately 44 million pages
makes up the “right side” of the bowtie. This set is
called OUT and has the property that any page of
OUT can be reached from any page of the SCC by
following hyperlinks, but no page of the SCC can be
reached from a page of OUT by following hyperlinks.
A surfer beginning at one of these pages will quickly
get stuck and be unable to explore further. One may
think of these pages as corporate intranets which are
well-known, but whose links point only internally.
Bowtie Structure
Broder et al. were able to elicit the map of the web depicted in the figure.
4
There is a fourth region called the TENDRILS,
consisting of pages that do not link to the knot, and
which are not reachable from the knot. These pages
may be thought of as possessing the disadvantages
of both IN and OUT: the web has not yet discovered these
pages, and these pages do not contain interesting
links back to better-known regions of the web.
Although these pages do not fit the image of
traditional web pages, they nonetheless make up a
significant fraction of the web; once again, there are
approximately 44 million such pages.
Bowtie Structure
Broder et al. were able to elicit the map of the web depicted in the figure.
Note that
The bowtie in the figure, in conjunction with a deeper
analysis of the pages outside the SCC, reveals an
unexpected property of web connectivity: for most
pairs of pages 𝑢 and 𝑣, there does not exist a path from 𝑢 to
𝑣. More precisely, if 𝑢 lies in IN ∪ SCC and 𝑣 lies in
SCC ∪ OUT, then a path exists, but if not, then a path
almost certainly does not exist. The probability that 𝑢
lies in IN ∪ SCC is about 1/2, and the probability that 𝑣
lies in SCC ∪ OUT is likewise about 1/2, so the
probability that these two independent events hold
simultaneously is about 1/4. Thus, for around 75% of
pairs of pages 𝑢 and 𝑣, no path exists.
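The arithmetic behind this argument can be spelled out with the approximate region sizes reported by Broder et al. (in millions of pages; the variable names are just for this sketch):

```python
# Approximate bow-tie region sizes from the Broder et al. crawl, in millions
total, scc, in_, out = 203, 56, 44, 44

p_u = (in_ + scc) / total   # P(u in IN ∪ SCC)  ≈ 0.49
p_v = (scc + out) / total   # P(v in SCC ∪ OUT) ≈ 0.49
p_path = p_u * p_v          # treating the two events as independent

print(round(p_path, 2))     # → 0.24, so no path exists for ~75% of pairs
```

Note the slide's 1/2 figures are rounded; the crawl numbers give roughly 0.49 each, hence about a quarter of pairs connected.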
Link Analysis Algorithms
How to organize the web? How to rank nodes on the graph?
So we’re back to our question from the beginning of the lecture: in response to
the one-word query “Cornell,” what are the clues that suggest Cornell’s home
page, www.cornell.edu, is a good answer?
Not all web pages are equally “important”. There
is large diversity in web-graph node
connectivity. We will cover the following link
analysis approaches to computing the importance of
nodes in a graph:
Voting by In-Links.
Hubs and Authorities.
PageRank (Google Algorithm).