UNIT-6: Information Search and Visualization
• Information overload and anxiety are common
• Developing more powerful search and visualization methods, integration
of technology with task
• Terms:
– Information gathering
– Seeking
– Filtering
– Visualization
• Huge volumes of available data:
– Data mining
– Data warehouses and data marts
– Knowledge networks or semantic webs
– A known-item search versus making sense and discovering
• Traditional interfaces have been difficult for novice users
– Complex commands
– Boolean operators
– Unwieldy concepts
• Traditional interfaces have been inadequate for expert users
– Difficulty in repeating searches across multiple databases
– Weak methods for discovering where to narrow broad searches
– Poor integration with other tools
• Designers are just learning how to present large amounts of data in
orderly and user-controlled ways
• OAI (Objects / Actions Interface)
• Customizable search options and displays using control panels
• Structured relational database
– contains relations and a schema to describe the relations
– relations have records
– records have fields, and fields have values
• Textual document libraries
– set of items (10 to 100,000)
• Multimedia document libraries
– Contains images, sound, video, animations, etc
– Digital archives are more loosely organized
– Directories contain metadata
• Task Actions (fact-finding)
– Browsing and Searching
• Scrolling
• Zooming
• Joining
• Linking
– Specific fact finding
– Extended fact finding
– Open-ended browsing
– Exploration of availability
• Where to Search
– Finding aids
• Table of contents, Indexes, Descriptive introductions,
Subject classification, Key-Word-In-Context (KWIC)
– Preview and overview surrogates
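As a rough illustration of one finding aid from the list above, a Key-Word-In-Context index can be sketched in a few lines of Python. The function name `kwic`, the `width` parameter, and the sample titles are assumptions made for this sketch; they do not come from any particular system.

```python
def kwic(titles, keyword, width=20):
    """Return one line per occurrence of keyword, with surrounding context."""
    lines = []
    key = keyword.lower()
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            # Compare case-insensitively, ignoring trailing punctuation.
            if word.lower().strip(".,;:") == key:
                left = " ".join(words[:i])[-width:]
                right = " ".join(words[i + 1:])[:width]
                # Right-align the left context so the keyword lines up in a column.
                lines.append(f"{left:>{width}} [{word}] {right}")
    return lines

# Hypothetical document titles.
titles = [
    "Visual search in large document collections",
    "Search interfaces for novice users",
]
for line in kwic(titles, "search"):
    print(line)
```

Aligning the keyword in one column lets a reader scan down the index and judge each hit by its context.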
Searching in textual documents and database querying
• World Wide Web search engines have greatly improved their
performance by using statistical ranks and the information in the web’s
hyperlink structure
• Searching in structured relational database systems is a well-established
task using SQL
• Users write queries that specify matches on attribute values
• Example of SQL command
– SELECT DOCUMENT#
– FROM JOURNAL-DB
– WHERE (Date >= and Date<= 1998)
– and (Language = English or French)
– and (publisher = ASIST or HFES or ACM).
• SQL has powerful features, but it requires 2 to 20 hours training
• While SQL is a standard, form-fill-in queries have simplified query
formulation
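The SQL example above is written informally. A runnable version might look like the sketch below, which uses Python's built-in `sqlite3` module. The table name `journal_db`, the column names, the sample rows, and the lower date bound `START_YEAR` are all assumptions: the example above does not state the lower bound, so 1994 is only a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journal_db (document_no INTEGER, year INTEGER, "
    "language TEXT, publisher TEXT)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO journal_db VALUES (?, ?, ?, ?)",
    [
        (1, 1996, "English", "ACM"),
        (2, 1997, "German", "ACM"),
        (3, 1999, "French", "HFES"),
        (4, 1995, "French", "ASIST"),
    ],
)

START_YEAR = 1994  # placeholder: the lower bound is not given in the example above

rows = conn.execute(
    """
    SELECT document_no
    FROM journal_db
    WHERE year BETWEEN ? AND 1998
      AND language IN ('English', 'French')
      AND publisher IN ('ASIST', 'HFES', 'ACM')
    ORDER BY document_no
    """,
    (START_YEAR,),
).fetchall()
matching = [row[0] for row in rows]
print(matching)
```

Note that the informal `Language = English or French` has to be written as `IN (...)` (or as two comparisons joined by `OR`) in real SQL, which is one reason novices find the language hard.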
• Finding a way not to overwhelm novice users is a challenge
• Evidence shows that users perform better and have higher satisfaction
when they can view and control the search
• Framework to coordinate design practice:
– Formulation
• source of the information
• fields for limiting the source
• phrases
• variants
• size of results
– Initiation of action
• explicit or implicit
• most systems have a search button for explicit initiation, or
for delayed or regularly scheduled initiation
• implicit actions are initiated by changes to a parameter
– Review of results
• view overviews and previews
• manipulate visualizations
• examine selected items
– Refinement
• should provide meaningful messages to explain search
outcomes
• should support progressive refinement
– Use
• allow queries, parameters, or results to be saved, used, or
shared
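The five phases above can be sketched as a small Python session object. This is only an illustration under stated assumptions: the names `Query`, `SearchSession`, `run`, and `save` are hypothetical, and simple substring matching stands in for a real search engine.

```python
from dataclasses import dataclass, field

@dataclass
class Query:
    source: list            # formulation: where to search
    phrase: str             # formulation: what to look for
    max_results: int = 10   # formulation: size of results

@dataclass
class SearchSession:
    saved: list = field(default_factory=list)

    def run(self, query):
        """Explicit initiation; returns results plus a message for review."""
        hits = [doc for doc in query.source
                if query.phrase.lower() in doc.lower()]
        shown = hits[:query.max_results]
        message = (f"{len(hits)} of {len(query.source)} items match "
                   f"'{query.phrase}'; showing {len(shown)}")
        return shown, message

    def save(self, query):
        """Use: keep the query so it can be rerun or shared."""
        self.saved.append(query)

docs = ["Dynamic queries", "Query previews", "Treemaps"]
session = SearchSession()
first = Query(source=docs, phrase="quer")
hits, message = session.run(first)
print(message)

# Refinement: narrow the phrase and run again, then save the better query.
refined = Query(source=docs, phrase="query preview")
hits2, message2 = session.run(refined)
session.save(refined)
```

The message returned by `run` is the kind of explanation the refinement phase asks for: it tells users how many items matched and how many are shown.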
Multimedia document searches
• Searches for databases and textual documents are good, but multimedia
searches are in a primitive stage
• Current multimedia searches require descriptive documents or metadata
searches
• Search by date, text captions, or media is possible
• Useful to have computers perform some filtering
• New systems will incorporate powerful annotation and indexing, with
better search algorithms and browsing
• Image Search:
• Finding photos with images such as the Statue of Liberty is a
challenge
• Query-by-Image-Content (QBIC) is difficult
• Search by profile (shape of lady), distinctive features (torch),
colors (green copper)
• Use simple drawing tools to build templates or profiles to search
with
• More success is attainable by searching restricted collections
• Search a vase collection
• Find a vase with a long neck by drawing a profile of it
• Critical searches such as fingerprint matching require a minimum
of 20 distinct features
• For small collections of personal photos effective browsing and
lightweight annotation are important
• Map Search
• On-line maps are plentiful
• Search by latitude/longitude is the structured-database solution
• Today's maps allow the use of structured aspects and multiple
layers
• City, state, and site searches
• Flight information searches
• Weather information searches
• Example: www.mapquest.com
• Mobile devices can allow “here” as a point of reference
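A latitude/longitude search with "here" as the point of reference might be sketched as follows. The haversine formula is a standard way to compute great-circle distance; the function names, the approximate city coordinates, and the 6371 km Earth radius are assumptions chosen for illustration.

```python
import math

def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest(sites, here, limit=2):
    """Return the names of the closest sites, nearest first."""
    return sorted(sites, key=lambda name: distance_km(sites[name], here))[:limit]

# Approximate coordinates, for illustration only.
sites = {
    "New York": (40.7, -74.0),
    "Washington": (38.9, -77.0),
    "Baltimore": (39.3, -76.6),
}
here = (39.0, -76.9)  # hypothetical position reported by a mobile device
print(nearest(sites, here))
```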
• Design/Diagram Searches
• Some computer-assisted design packages support search of
designs
• Allows searches of diagrams, blueprints, newspapers, etc.
• E.g. search for a red circle in a blue square or a piston in an
engine
• Document-structure recognition for searching newspapers
• Sound Search
• Music-information retrieval (MIR) supports audio input
• Searching phone conversations may become possible in the future on
a speaker-independent basis
• Video Search
• Provide an overview
• Segmentation into scenes and frames
• Support multiple search methods
• Informedia project
• Animation Search
• Prevalence increased with the popularity of Flash
• Possible to search for specific animations like a spinning globe
• Search for moving text on a black background
Advanced filtering and search interfaces
For advanced uses there are alternatives to form fill-in query interfaces:
• Filtering with complex Boolean queries
– Problem with informal English, e.g. use of ‘and’ and ‘or’
– Venn diagrams, decision tables, and metaphor of water flowing
have not worked for complex queries
• Dynamic Queries - Adjusting sliders, buttons, etc and getting immediate
feedback
– “direct-manipulation” queries
– Use sliders and other related controls to adjust the query
– Get immediate (less than 100 msec) feedback with data
– Dynamic HomeFinder and Blue Nile
– Hard to update fast with large databases
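The core of a dynamic query is that every slider movement recomputes the result set. A minimal sketch, assuming a tiny in-memory list of homes and a hypothetical `apply_sliders` function, could look like this:

```python
# Hypothetical data set; a real home finder would hold many more records.
homes = [
    {"id": 1, "price": 250_000, "bedrooms": 3},
    {"id": 2, "price": 410_000, "bedrooms": 4},
    {"id": 3, "price": 180_000, "bedrooms": 2},
]

def apply_sliders(items, max_price, min_bedrooms):
    """Recompute the visible set from the current slider positions."""
    return [h["id"] for h in items
            if h["price"] <= max_price and h["bedrooms"] >= min_bedrooms]

# Simulate the user dragging the price slider; each position yields a new result set.
for max_price in (200_000, 300_000, 450_000):
    print(max_price, apply_sliders(homes, max_price, min_bedrooms=2))
```

A linear scan like this is fast enough for small collections. For large databases the same call would need indexes or precomputed structures to stay within the feedback time mentioned above.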
• Query previews present an overview to give users information about the
distribution of data and thereby help eliminate undesired items
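A query preview can be approximated by counting how many items fall under each attribute value before any query is run. The sketch below uses `collections.Counter` from the Python standard library; the records and the `preview` helper are hypothetical.

```python
from collections import Counter

# Hypothetical records.
records = [
    {"year": 1996, "language": "English"},
    {"year": 1996, "language": "French"},
    {"year": 1997, "language": "English"},
    {"year": 1998, "language": "English"},
]

def preview(items, attribute):
    """Count how many items fall under each value of one attribute."""
    return Counter(item[attribute] for item in items)

by_language = preview(records, "language")
by_year = preview(records, "year")
print(by_language)
print(by_year)
```

Showing these counts next to each choice lets users see in advance that, for example, a search restricted to a language with zero items would return nothing.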
• Faceted metadata search
– Integrates category browsing with keyword searching
– Flamenco
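Combining category browsing with keyword search can be sketched as a filter that applies both at once. The item structure, the facet names, and the `faceted_search` function are assumptions for illustration and do not describe any particular system's implementation.

```python
# Hypothetical collection with two facets per item.
items = [
    {"title": "Blue glass vase",
     "facets": {"material": "glass", "period": "modern"}},
    {"title": "Bronze vase with long neck",
     "facets": {"material": "bronze", "period": "ancient"}},
    {"title": "Glass bowl",
     "facets": {"material": "glass", "period": "ancient"}},
]

def faceted_search(collection, keyword="", **selected):
    """Keep items that contain the keyword and match every selected facet value."""
    return [
        item["title"] for item in collection
        if keyword.lower() in item["title"].lower()
        and all(item["facets"].get(name) == value
                for name, value in selected.items())
    ]

print(faceted_search(items, "vase"))
print(faceted_search(items, "vase", material="glass"))
print(faceted_search(items, material="glass", period="ancient"))
```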
• Collaborative Filtering
– Groups of users combine evaluations to help in finding items in a
large database
– Each user "votes", and that information is used to rate the items of
interest
– E.g. a user rating six restaurants highly is given a list of restaurants
also rated highly by those who agree the six are good
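The restaurant example above can be sketched as a very simple user-based recommender: find other users who liked some of the same items, then suggest what else they liked. This is one of many possible schemes; the `recommend` function, the threshold `high`, and the ratings are all hypothetical.

```python
# Hypothetical ratings on a 1-5 scale; letters stand for restaurants.
ratings = {
    "ana":  {"A": 5, "B": 5, "C": 4},
    "ben":  {"A": 5, "B": 4, "D": 5},
    "cara": {"A": 1, "B": 2, "E": 5},
    "dan":  {"C": 5, "D": 4, "F": 5},
}

def recommend(ratings, user, high=4):
    """Suggest items liked by users who share at least one liked item with `user`."""
    liked = {item for item, score in ratings[user].items() if score >= high}
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        their_liked = {item for item, score in theirs.items() if score >= high}
        agreement = len(liked & their_liked)  # number of liked items in common
        if agreement == 0:
            continue
        # Only suggest items the user has not rated yet.
        for item in their_liked - set(ratings[user]):
            scores[item] = scores.get(item, 0) + agreement
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))

print(recommend(ratings, "ana"))
```

Weighting each suggestion by the amount of agreement means that users with more tastes in common have more influence on the list.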
• Multilingual searches
– Current systems provide rudimentary translation searches
– Prototypes of systems with specific dictionaries and more
sophisticated translation
• Visual searches
– Specialized visual representations of the possible values
– E.g. dates on a calendar or seats on a plane
– On a map the location may be more important than the name
– Implicit initiation and immediate feedback
Information visualization
• "A picture is worth a thousand words!"
• Large amounts of information in compact and user-controlled ways
– example: USA map, click a city to see more info
• Information visualization can be defined as the use of interactive visual
representations of abstract data to amplify cognition
• Scientific visualization
– continuous variables, volumes and surfaces
• Information visualization
– categorical variables and the discovery of patterns, trends,
clusters, outliers, and gaps
• Visual data mining
• Answer questions users didn’t know they had
• Tufte offers advice for static information, but dynamic displays present a
challenge
• Must be more than cool
• The Visual Information Seeking Mantra
– Overview first
– zoom and filter
– then details-on-demand
• Basic data types
– 1 - Dimensional
• Linear data types include textual documents, program
source code, lists of names in sequential order
• E.g. highlight lines of code that have changed
– 2 - Dimensional
• Planar or map data includes geographic maps, floor plans,
newspaper layouts
• E.g. Geographic Information Systems, spatial displays of
document collections
• Example tasks: find regions containing items
– 3 - Dimensional
• Real-world objects such as molecules, the human body,
buildings
• Users must cope with understanding their position and
orientation when viewing the objects
• E.g. overviews, landmarks, stereo displays, transparency,
color coding
• Virtual Reality displays
• Users’ tasks typically deal with continuous variables
• National Library of Medicine's Visible Human Project
• Controversial
– Multi-Dimensional
• Most relational and statistical databases
• Items with n attributes become points in an n-dimensional space
• Interface representation could be a 2-D scattergram with
each additional dimension controlled by a slider
• Parallel coordinate plots
• Table Lens
• Hierarchical or k-means clustering
– Temporal
• Time Lines are widely used and accepted
• Items have a start and finish time and items may overlap
• Tasks include finding all events before, after, or during some
time period
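The temporal tasks above reduce to simple comparisons on start and finish times. A minimal sketch, assuming a hypothetical `Event` record with integer times:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    name: str
    start: int   # e.g. a year or a timestamp
    finish: int

def before(events, t):
    """Events that finished before time t."""
    return [e.name for e in events if e.finish < t]

def after(events, t):
    """Events that started after time t."""
    return [e.name for e in events if e.start > t]

def during(events, start, finish):
    """Events that overlap the period [start, finish] at any point."""
    return [e.name for e in events if e.start <= finish and e.finish >= start]

# Hypothetical project timeline; "design" and "build" overlap.
events = [Event("design", 1, 4), Event("build", 3, 8), Event("test", 9, 12)]
print(before(events, 9))
print(after(events, 8))
print(during(events, 5, 8))
```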
– Tree
• Collections of items with each item having a link to one
parent item (except root)
• Outline style of indented labels or node-and-link diagram
• Space-filling approach
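The outline style of indented labels can be produced with a short recursive walk. In this sketch the tree is stored as a dictionary from each parent to its list of children; that representation and the `outline` function are assumptions for illustration.

```python
def outline(tree, root, depth=0):
    """Return indented labels for root and all of its descendants, depth first."""
    lines = ["  " * depth + root]
    for child in tree.get(root, []):
        lines.extend(outline(tree, child, depth + 1))
    return lines

# Hypothetical file hierarchy: every item except the root has exactly one parent.
tree = {"docs": ["papers", "slides"], "papers": ["chi.pdf"]}
print("\n".join(outline(tree, "docs")))
```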
– Networks
• Sometimes data needs to be linked to an arbitrary number
of other items
• Example: A graphical representation of the World Wide Web
• Node-and-link diagrams, matrices
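The matrix representation of a network can be built directly from a list of links. The sketch below assumes directed links between a few hypothetical web pages; `adjacency_matrix` is an illustrative name.

```python
def adjacency_matrix(nodes, links):
    """Build a square 0/1 matrix; row i, column j is 1 when nodes[i] links to nodes[j]."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in links:
        matrix[index[src]][index[dst]] = 1
    return matrix

# Hypothetical pages and hyperlinks.
pages = ["home", "about", "blog"]
links = [("home", "about"), ("home", "blog"), ("blog", "home")]
for name, row in zip(pages, adjacency_matrix(pages, links)):
    print(f"{name:>6} {row}")
```

Unlike a drawn diagram, a matrix never has crossing lines, which helps when the network is dense.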
• Basic tasks
– Overview
• Gain an overview of the entire collection
• Adjoining detail view
• The overview might contain a movable field-of-view box to
control the contents of the detail view
• allowing zoom factors of 3 to 30
• Fisheye view
– Zoom
• Zoom in on items of interest
• Allows a more detailed view
• Need to maintain context
• Particularly important for small displays
– Filter
• Filter out uninteresting items
• Allows user to reduce size of search
– Details-on-Demand
• Select an item or group and get details when needed
• Useful to pinpoint a good item
• Usually click on an item and review details in a separate or
pop-up window
– Relate
• View relationships among items
• Use human perceptual ability – proximity, containment,
connected line, color coding
• Example: Set a director's name, and view all movies with that
director
– History
• Keep a history to allow undo, replay, and progressive
refinement
• Allows a mistake to be undone, or a series of steps to be
replayed
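One simple way to support undo and replay is to keep every intermediate state. The `History` class below is a hypothetical sketch of that idea, with each filtering step expressed as a function from the old state to the new one.

```python
class History:
    """Keep every state so steps can be undone or replayed."""

    def __init__(self, initial):
        self.states = [initial]

    def apply(self, action):
        """Run one step on the current state and record the result."""
        self.states.append(action(self.states[-1]))
        return self.states[-1]

    def undo(self):
        """Drop the latest state (never the initial one) and return the previous state."""
        if len(self.states) > 1:
            self.states.pop()
        return self.states[-1]

    def replay(self):
        """Return all recorded states in order."""
        return list(self.states)

history = History(list(range(1, 11)))
history.apply(lambda xs: [x for x in xs if x % 2 == 0])  # keep even items
history.apply(lambda xs: [x for x in xs if x > 5])       # then keep items above 5
current = history.undo()                                 # the second step was a mistake
print(current)
```

Storing whole states is the simplest design. For large data sets a system would more likely store the actions and recompute states when needed.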
– Extract
• Extract the items or data
• Save to file, print, or drag to another application
Challenges for information visualization
– Import data
– Combine visual representations and textual labels
– See related information
– View large volumes of data
– Integrate data mining
– Collaborate with others
– Achieve universal usability