IR Unit I Notes

The document outlines the characteristics of the World Wide Web, highlighting its global accessibility, interconnectedness, and support for multimedia and user interactivity. It discusses the transformative impact of the web on information retrieval, emphasizing global availability, search engines, and personalized retrieval while also addressing challenges like information overload and credibility concerns. Additionally, it differentiates between web search and information retrieval (IR), noting their distinct scopes, technologies, and user interactions.

Uploaded by

mohamedfarookali
Copyright © All Rights Reserved

Characteristics of the Web

The web (World Wide Web) is a vast, interconnected system for accessing and sharing
information over the internet. Here are some key characteristics:
1. Global Accessibility
• The web is accessible to anyone with an internet connection, enabling worldwide communication and information exchange.
2. Interconnectedness
• Hyperlinks allow users to navigate seamlessly between related documents or resources across different domains.
3. Dynamism
• The web includes both static content (unchanging) and dynamic content (customized and updated based on user interaction or time).
4. Multimedia Support
• Supports various types of content, including text, images, videos, audio, and interactive elements.
5. User Interactivity
• Modern web applications support complex user interactions, such as filling out forms, chatting, and online shopping.
6. Scalability
• The web is designed to scale from small personal sites to large platforms serving millions of users.
7. Decentralization
• No single entity controls the entire web; it is distributed and relies on various servers and systems across the globe.
8. Cross-Platform Compatibility
• The web can be accessed from various devices, including computers, smartphones, tablets, and IoT devices, through web browsers.
9. Open Standards
• It is built on open standards such as HTML, CSS, and JavaScript (standardized as ECMAScript), maintained by bodies such as the W3C (World Wide Web Consortium), WHATWG, and Ecma International.
10. Searchability
• Web content is indexed by search engines, making it easy to locate information through keywords.
11. Evolutionary Nature
• The web continuously evolves, incorporating new technologies (e.g., Web 2.0, Web 3.0) and practices.
12. Security
• Encryption protocols such as TLS (the layer behind HTTPS) and authentication methods help protect sensitive data and user privacy.
13. Ubiquity
• The web integrates with various aspects of daily life, including education, business, entertainment, and social networking.
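Item 10 (Searchability) says that search engines index web content so it can be found by keyword. A minimal sketch of that idea is an inverted index, which maps each term to the pages that contain it. The page names and texts below are hypothetical, and real search engines add crawling, ranking, and much more careful text processing on top of this.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each lowercase term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def keyword_search(index, query):
    """Return the ids of pages that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical pages, invented purely for illustration.
pages = {
    "a.html": "open standards such as HTML and CSS",
    "b.html": "search engines index web content",
    "c.html": "web content uses open standards",
}
index = build_inverted_index(pages)
print(sorted(keyword_search(index, "web content")))     # ['b.html', 'c.html']
print(sorted(keyword_search(index, "open standards")))  # ['a.html', 'c.html']
```

The lookup touches only the entries for the query terms, which is why an index built once can answer many keyword queries without rescanning every page.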
Impact of the Web on Information Retrieval
The impact of the web on information retrieval has been transformative, fundamentally changing
how individuals and organizations access, process, and utilize information. Here are the key
ways the web has influenced information retrieval:
1. Global Availability of Information
• The web provides instant access to a vast repository of information on virtually any topic, breaking down geographical and institutional barriers.
2. Search Engines as Gateways
• Search engines like Google, Bing, and DuckDuckGo have revolutionized information retrieval by indexing web content and enabling keyword-based searches, making it quick and easy to find relevant data.
3. Improved Accessibility
• When pages follow accessibility guidelines, diverse audiences, including people with disabilities, can retrieve information using assistive technologies like screen readers and voice search.
4. Real-Time Updates
• Dynamic web platforms deliver real-time information, such as breaking news, stock market updates, or weather forecasts, enabling timely decision-making.
5. Cost Reduction
• Free access to vast amounts of information reduces costs associated with traditional information retrieval methods, such as purchasing books or journal subscriptions.
6. Personalized Information Retrieval
• Advanced algorithms analyze user behavior and preferences to deliver personalized search results and recommendations, improving relevance and efficiency.
7. Interactivity and Collaboration
• Web 2.0 platforms enable collaborative information retrieval through forums, wikis, and social media, allowing users to share insights and build knowledge collectively.
8. Multimedia Integration
• The web facilitates retrieval of various formats (text, audio, video, and images), accommodating diverse learning and consumption preferences.
9. Scalability
• The web supports retrieval from small, niche databases to massive global datasets, catering to both specific and general needs.
10. Democratization of Information
• The web empowers individuals by making high-quality information freely accessible, leveling the playing field for education and innovation.
11. Challenges in Information Overload
• The sheer volume of web content can overwhelm users, making tools like advanced search filters, metadata, and artificial intelligence crucial for effective retrieval.
12. Credibility Concerns
• While the web makes information widely available, it also requires users to critically evaluate sources due to the prevalence of misinformation and biased content.
13. Data-Driven Decision Making
• The web has enabled organizations to mine data for trends and insights, enhancing business intelligence and research methodologies.
Web Search vs. Information Retrieval (IR)

While "web search" and "information retrieval" (IR) are closely related concepts, they differ in
scope, focus, and application. Here's a comparative breakdown:

1. Definition

• Web Search:
  o The process of using a search engine to find information on the web.
  o Focuses on retrieving relevant web pages or resources based on user queries.
• Information Retrieval (IR):
  o A broader field encompassing techniques and systems for finding relevant information in any repository (e.g., databases, documents, or web content).
  o Deals with indexing, storing, and retrieving information from structured and unstructured data sources.

2. Scope

• Web Search:
  o Limited to the web and uses search engines like Google or Bing.
  o Mainly concerned with retrieving URLs, multimedia, or other web resources.
• IR:
  o Can occur across various mediums, including digital libraries, file systems, databases, or offline repositories.
  o Includes advanced methods for structured retrieval (e.g., SQL queries) and unstructured data analysis (e.g., semantic search).
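The difference between structured and unstructured retrieval can be seen in a small sketch. The records below are hypothetical, and the free-text part uses plain keyword matching as a stand-in; true semantic search would also need to match on meaning rather than exact words.

```python
import sqlite3

# Hypothetical records, invented for illustration.
papers = [
    (1, "Neural ranking models", 2021, "Ranking documents with neural networks"),
    (2, "Classic inverted indexes", 1998, "How keyword indexes are built"),
    (3, "Evaluating retrieval systems", 2021, "Precision and recall in practice"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE papers (id INTEGER, title TEXT, year INTEGER, abstract TEXT)"
)
conn.executemany("INSERT INTO papers VALUES (?, ?, ?, ?)", papers)

# Structured retrieval: an exact condition on a typed field.
structured = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM papers WHERE year = ? ORDER BY id", (2021,)
    )
]
conn.close()


def text_match(term):
    """Unstructured retrieval: find a free-text term anywhere in an abstract."""
    term = term.lower()
    return [
        pid
        for pid, _title, _year, abstract in papers
        if term in abstract.lower().split()
    ]


unstructured = text_match("ranking")
print(structured)    # [1, 3]
print(unstructured)  # [1]
```

The SQL query depends on a known schema (a `year` column), while the text match makes no assumption about fields and works on whatever words the abstract happens to contain.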

3. Underlying Technology

• Web Search:
  o Heavily relies on search engine algorithms (e.g., PageRank).
  o Incorporates crawling, indexing, ranking, and user personalization based on browsing behavior.
• IR:
  o Utilizes broader methodologies, such as:
    ▪ Vector Space Models
    ▪ Natural Language Processing (NLP)
    ▪ Latent Semantic Analysis (LSA)
  o IR systems may not necessarily use a web-based infrastructure.
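Two of the techniques named above can be sketched briefly: PageRank as a power iteration over a link graph, and the vector space model as cosine similarity between term-frequency vectors. The graph, documents, damping factor of 0.85, and iteration count are illustrative assumptions, and the sketch assumes every linked page also appears as a key in the graph. Production systems use far larger graphs and weighted vectors such as TF-IDF.

```python
import math
from collections import Counter


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a)
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical link graph: C is linked to by A, B, and D.
web_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web_graph)
print(sorted(ranks, key=ranks.get, reverse=True))  # ['C', 'A', 'B', 'D']

# Hypothetical documents ranked against a query in the vector space model.
docs = ["web search engines rank pages", "digital libraries store documents"]
query = "rank web pages"
best = max(docs, key=lambda d: cosine_similarity(query, d))
print(best)  # web search engines rank pages
```

PageRank scores a page by who links to it, independent of any query, while the vector space model scores a document by how closely its terms match a particular query. Web search engines typically combine both kinds of signal.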

4. User Interaction

• Web Search:
  o User-focused and designed for simplicity and speed.
  o Prioritizes ease of use, often employing simple keyword searches and autocomplete suggestions.
• IR:
  o In some cases designed for researchers or technical users.
  o May involve advanced query formulations, filters, or technical knowledge to retrieve highly specific results.

5. Data Characteristics

• Web Search:
  o Primarily targets unstructured and semi-structured data (HTML pages, blogs, social media posts, etc.).
  o Deals with data inconsistency, redundancy, and credibility challenges.
• IR:
  o Works with both structured (e.g., databases, spreadsheets) and unstructured (e.g., text documents, multimedia) data.
  o Focuses on metadata, tagging, and context-based analysis.

6. Examples

• Web Search:
  o Typing "best smartphones 2024" into Google to retrieve top-ranked articles, reviews, or online stores.
• IR:
  o Retrieving relevant research papers on "machine learning models" from an academic database like PubMed or IEEE Xplore.

7. Evaluation Metrics

• Web Search:
  o Evaluated by user engagement metrics:
    ▪ Click-through rate (CTR)
    ▪ Bounce rate
    ▪ Time on page
  o Success is based on the relevance and ranking of web pages.
• IR:
  o Evaluated using precision, recall, F1-score, and Mean Average Precision (MAP).
  o Focuses on accuracy and efficiency of retrieval.
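The IR metrics listed above can be computed directly from a ranked result list and a set of documents judged relevant. The sketch below uses the standard definitions; the document ids and relevance judgments are hypothetical, and each ranked list is assumed to contain no duplicates.

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall, and F1 for one query."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k that holds a relevant document."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits, total = 0, 0.0
    for position, doc in enumerate(ranked, start=1):
        if doc in relevant_set:
            hits += 1
            total += hits / position
    return total / len(relevant_set)


def mean_average_precision(runs):
    """runs is a list of (ranked_results, relevant_docs) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical results and relevance judgments for two queries.
query_1 = (["d1", "d2", "d3", "d4"], {"d1", "d3", "d5"})
query_2 = (["d2", "d1"], {"d1"})

p, r, f1 = precision_recall_f1(*query_1)
print(round(p, 3), round(r, 3), round(f1, 3))                 # 0.5 0.667 0.571
print(round(average_precision(*query_1), 3))                  # 0.556
print(round(mean_average_precision([query_1, query_2]), 3))   # 0.528
```

Precision and recall ignore the order of results, whereas average precision rewards placing relevant documents near the top, which is why MAP is commonly reported for ranked retrieval.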

8. Applications

• Web Search:
  o Designed for the general public to find web-based resources.
  o Commonly used in marketing, e-commerce, and day-to-day information gathering.
• IR:
  o Used in specialized domains like enterprise search (internal company data), digital libraries, or medical diagnosis systems.

Summary Table

Aspect      | Web Search                               | Information Retrieval (IR)
------------|------------------------------------------|---------------------------------------
Scope       | Limited to the web                       | Broader, covers all data repositories
User        | General public                           | Researchers, technical users
Data Types  | Primarily unstructured web content       | Structured and unstructured data
Technology  | Search engine-specific (e.g., PageRank)  | Broad IR techniques (e.g., NLP, LSA)
Examples    | Google, Bing                             | PubMed, enterprise search systems
